Review of ordinary regression and preview of time series
Posted on: 21 October, 2015
As mentioned in my article “Ultimate essence in common for ALL econometric methods,” ALL linear estimation models are basically coordinate transformations. They assume that the covariance matrix of the independent variables (the coordinate system) can be mapped onto the vector of covariances between the dependent variable and each independent variable. That is why the absence of multicollinearity plays such an important role in constructing linear estimation models. If there is multicollinearity, i.e. correlation between independent variables, it becomes inappropriate to turn the covariance matrix (the coordinates) into the covariance vector between the independent and dependent variables, because we can’t explain all the variation across the independent variables as a combination of their covariances with the dependent variable. The matrix equation below shows this:
$$\begin{pmatrix} \sigma_{y1} \\ \sigma_{y2} \\ \vdots \\ \sigma_{yp} \end{pmatrix} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}$$
Written out row by row, this gives the following equations:
$$\begin{array}{rcl} \sigma_{y1} & = & \beta_1 \sigma_{11} + \beta_2 \sigma_{12} + \cdots + \beta_p \sigma_{1p} \\ \sigma_{y2} & = & \beta_1 \sigma_{21} + \beta_2 \sigma_{22} + \cdots + \beta_p \sigma_{2p} \\ \vdots & = & \vdots \\ \sigma_{yp} & = & \beta_1 \sigma_{p1} + \beta_2 \sigma_{p2} + \cdots + \beta_p \sigma_{pp} \end{array}$$
where $\sigma_{ii}$ is the variance of the i-th independent variable, $\sigma_{ij}$ is the covariance between the i-th and j-th independent variables, and $\sigma_{yi}$ is the covariance between y and the i-th independent variable (i, j = 1, 2, …, p).
Note that in each row the β’s and σ’s are multiplied and summed to give the covariance between y and the i-th x. In other words, linear regression assumes that all variances and covariances across the p independent variables can be expressed through the covariances between y and each x. High correlations between the x’s therefore weaken the model: they are not captured by the covariances between the x’s and y, and they push the covariance matrix toward singularity, which makes the solution for the β’s unstable.
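The matrix relation above can be checked numerically. The following is only a minimal sketch with made-up data: the sample size, the coefficients in `true_beta`, and the helper name `solve_from_covariances` are all assumptions for illustration. It solves the covariance system for β, then makes one regressor almost a copy of another and looks at the condition number of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: three independent regressors with known coefficients.
X = rng.normal(size=(n, 3))
true_beta = np.array([1.0, -2.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=n)


def solve_from_covariances(X, y):
    """Solve Sigma_xx @ beta = sigma_xy, returning beta and cond(Sigma_xx)."""
    p = X.shape[1]
    full = np.cov(np.column_stack([X, y]), rowvar=False)
    sigma_xx = full[:p, :p]   # covariances among the x's
    sigma_xy = full[:p, p]    # covariances between each x and y
    return np.linalg.solve(sigma_xx, sigma_xy), np.linalg.cond(sigma_xx)


beta_hat, cond_ok = solve_from_covariances(X, y)

# Make the third regressor almost a copy of the first (multicollinearity).
X_collinear = X.copy()
X_collinear[:, 2] = X[:, 0] + rng.normal(scale=1e-3, size=n)
_, cond_bad = solve_from_covariances(X_collinear, y)

print("beta_hat:", beta_hat)
print("condition numbers:", cond_ok, cond_bad)
```

With independent regressors the recovered β’s sit close to the assumed ones. In the collinear case the condition number becomes very large, which is the numerical face of the instability described above.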
However, time series analysis is a very different story, because of one special property: a time series consists of an initial value plus the shifts accumulated step by step. Consider an example: we have daily stock price data for Google over the most recent 5 years. That is roughly 252 * 5 = 1260 days; in finance, 1 year is usually considered to have 252 trading days, excluding weekends and holidays. If the initial stock price (day 0) is $30, then the price on day 1 is $30 + (change from day 0 to 1). The price on day 2 is $30 + (change from day 0 to 1) + (change from day 1 to 2), and so on.
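As a small illustration of "initial value plus accumulated changes", here is a sketch that builds a price path from a starting price of $30. The daily changes are random numbers invented for the example, not real Google data.

```python
import numpy as np

rng = np.random.default_rng(1)

n_days = 252 * 5   # roughly five years of trading days
y0 = 30.0          # initial price on day 0, as in the example

# Hypothetical daily changes; real data would replace this line.
changes = rng.normal(scale=0.5, size=n_days)

# prices[t] = y0 + (sum of the first t changes); prices[0] is y0 itself.
prices = y0 + np.concatenate([[0.0], np.cumsum(changes)])

print(prices[:3])
```

Taking `np.diff(prices)` gives back `changes`, so the level series and the series of changes carry the same information once the starting value is known.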
When we regress the population of all major cities around the world on various economic variables, we don’t have to consider this property, because the data are not a series. However, when we want to build a model for, say, population growth over time, this special property always comes into play. That’s why ‘time series’ is an independent field in econometrics.
In this article, let’s take a detailed look at the why and how of time series estimation.
(1) Model structure
As mentioned above, every time series consists of an initial value and the shift at each step. Therefore, we can expect an appropriate model for estimating time series data to look like this:
$$(\text{estimation of } y_t) = y_0 + (\text{estimation of changes accumulation}) + (\text{residuals})$$
Now, what if we don’t know the initial value? Then we have to infer an appropriate estimate of the starting point, taking the estimated changes into account. Our model above should therefore be rewritten as follows:
$$(\text{estimation of } y_t) = (\text{estimation of starting point}) + (\text{estimation of changes accumulation}) + (\text{residuals})$$
Now, let’s think about how to modify the model above to reach the final model. Except in some special circumstances, we can expect newer data to matter more than older data. We can also simplify the problem by using the data itself as the explanatory variable (you can see that it’s obviously wrong to speak of “independent / dependent” variables in the case of time series). Consider the 2 equations below, where the change per step is denoted δ.
$$\begin{array}{rcl} y_t & = & y_0 + \delta_0 + \delta_1 + \cdots + \delta_{t-1} \\ y_{t-1} & = & y_0 + \delta_0 + \delta_1 + \cdots + \delta_{t-2} \end{array}$$
We can see immediately that $y_t - y_{t-1} = \delta_{t-1}$. Therefore, we can simplify the estimation of today’s price by using yesterday’s price. However, using only yesterday’s price may make it too simple. A stock price is the result of the demand-supply balance in the financial market, and there are many kinds of investors: while Warren Buffett comes in during bear markets and goes out during bull markets, momentum traders make their profits on the way up and get out before the way down. Arbitrageurs don’t care which way the market is heading, because they trade stocks based on spreads, not direction.
In any case, the most convincing approach may be to build the model on several past data points and then choose how many past steps to include. Thus, we can construct a model for time series estimation as below:
$$y_t = \alpha + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \cdots + \rho_p y_{t-p} + \epsilon_t$$
and we have to determine p, α and the ρ’s.
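For a given p, one simple way to obtain α and the ρ’s is ordinary least squares on the lagged values. The sketch below is only one possible approach, not necessarily the estimation method intended in this article; the function name `fit_ar` and the simulated coefficients are assumptions made for the example.

```python
import numpy as np


def fit_ar(y, p):
    """Least-squares estimate of alpha and (rho_1, ..., rho_p)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Column k holds y_{t-(k+1)} for t = p, ..., n-1.
    lags = np.column_stack([y[p - k - 1 : n - k - 1] for k in range(p)])
    design = np.column_stack([np.ones(n - p), lags])
    coef, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return coef[0], coef[1:]


# Simulate a series with known (hypothetical) parameters and p = 2.
rng = np.random.default_rng(2)
alpha, rhos = 0.5, np.array([0.5, 0.3])
n = 20000
y = np.zeros(n)
for t in range(2, n):
    y[t] = alpha + rhos[0] * y[t - 1] + rhos[1] * y[t - 2] + rng.normal()

alpha_hat, rho_hat = fit_ar(y, p=2)
print(alpha_hat, rho_hat)
```

Choosing p itself is a separate question. In practice one would compare fits for several values of p rather than fix it in advance as done here.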
(2) Stationarity
Before going further, we have to ask whether this model is appropriate. Consider the model above once again: we are using p past prices as explanatory variables for today’s price. So we are implicitly assuming that the stock price depends only on past prices and that no external variables have any effect. But is that true? Even though we don’t know the true model of stock prices, we know they are affected by companies’ financial health, at the very least. This kind of problem often arises (and is ignored) in time series analysis. The point is whether we need external explanatory variables.
Markov process: If the current value of a process depends only on its p past steps, it is a (p-th order) Markov process.
In fact, most time series analyses assume a Markov process, because they don’t employ external variables to explain the target variable. Frankly, I don’t think the Markov assumption holds well in the real world, although there are methods for approximating real-world variables with a Markov process. Anyway, it’s worth remembering that there are 2 well-known processes among all possible Markov processes.
White noise: If p = α = 0, it is white noise. $y_t = \epsilon_t$
Random walk: If p = ρ = 1 and α = 0, it is a random walk. $y_t = y_{t-1} + \epsilon_t$
So, white noise is just purely random noise with an expected value of 0. A random walk, meanwhile, is determined by the previous step plus white noise. While we can expect white noise to be 0, we can’t predict a random walk, because it doesn’t converge. In fact, the random walk points to an important property of time series called stationarity: writing the coefficients on the past states as ρ, all predictable processes have $0 \leq | \rho_i | < 1$. (For a single lag this condition is exact. With several lags, the precise requirement is that the roots of the characteristic polynomial lie outside the unit circle, so treat the bound on each ρ as a rule of thumb.)
You may see why intuitively. Remember: when we say “make an estimation,” we are computing an expected value, that is, the value a random variable converges to. If the data diverge, we can’t predict them, because we can’t locate the point where the probability is maximized. If any of the ρ’s is larger than 1 or less than –1, the time series will NEVER converge. Consider the process $y_t = \rho y_{t-1} + \epsilon_t$. If ρ is, say, 1.5, then y(t) will be around 1.5^t y(0), and obviously $\lim \limits_{t \to \infty} 1.5^t = \infty$, so it won’t converge. If ρ = –1.5, it diverges as well, flipping sign at every step. But if ρ is, say, 0.5, we can predict the process, because the influence of the starting value dies out over time: $\lim \limits_{t \to \infty} 0.5^t = 0$. This leads directly to an important property of all time series.
Stationarity: $0 \leq | \rho_i | < 1$. Stationary Markov processes are predictable.
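To see the difference between |ρ| < 1, ρ = 1 and |ρ| > 1 in numbers, the following sketch simulates many paths of the one-lag process with zero intercept and measures how widely the paths are spread after 50 steps. The helper `simulate_ar1`, the path count, and the step count are arbitrary choices for illustration.

```python
import numpy as np


def simulate_ar1(rho, n_steps, n_paths, rng):
    """Simulate y_t = rho * y_{t-1} + eps_t for many paths, with y_0 = 0."""
    y = np.zeros((n_paths, n_steps + 1))
    shocks = rng.normal(size=(n_paths, n_steps))
    for t in range(n_steps):
        y[:, t + 1] = rho * y[:, t] + shocks[:, t]
    return y


rng = np.random.default_rng(3)

# Standard deviation across paths at the final step, for each rho.
spread = {
    rho: simulate_ar1(rho, n_steps=50, n_paths=2000, rng=rng)[:, -1].std()
    for rho in (0.5, 1.0, 1.5)
}

for rho, s in spread.items():
    print(f"rho = {rho}: spread after 50 steps = {s:.3g}")
```

For ρ = 0.5 the spread settles near a fixed level, for ρ = 1 (the random walk) it keeps growing with the number of steps, and for ρ = 1.5 it explodes.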
(3) Integration
As discussed above, if a stock price follows a random walk, we can’t predict it directly. However, it is known that some unpredictable economic variables become predictable once we take their differences to some order. That is called integration. Consider the Google stock price again.
$$y_t = y_{t-1} + \epsilon_t$$
Now, let’s take the first-order difference. Then we get $y_t - y_{t-1} = \epsilon_t$
So, you can see that the first-order difference is white noise. And as you know, we can predict it, because its expected value is zero (and it is usually assumed to follow a normal distribution). In this case, y is said to be integrated of order 1, written $y \sim I(1)$. Stationary data are then the special case I(0).
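The effect of differencing can be seen in a quick simulation. This sketch generates a random walk from normally distributed shocks (an assumption of the example) and compares the lag-1 autocorrelation of the level with that of the first difference. The helper `lag1_autocorr` is defined here only for the illustration.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample correlation between x_t and x_{t-1}."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


rng = np.random.default_rng(4)
eps = rng.normal(size=1000)                       # white noise shocks
y = 30.0 + np.concatenate([[0.0], np.cumsum(eps)])  # random walk from 30

diff = np.diff(y)                                 # first-order difference

print("level:     ", lag1_autocorr(y))
print("difference:", lag1_autocorr(diff))
```

The level series is almost perfectly correlated with its own past, while the differenced series recovers the original shocks and shows essentially no autocorrelation.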
Of course, there are several ways to take differences. We can do it by: subtraction, as above; logarithms, for continuous variables; and proportions, for discrete variables. If the data are integrated of some order, any of these will work. But for practical applications, here are some simple guides.
Rules of thumb for types of data
(1) Economic data in percent form, such as the unemployment rate and interest rates, are often differenced by subtraction. $y_t - y_{t-1}$
(2) Frequently occurring economic data in numerical form, such as stock prices, are often differenced by taking logarithms. $\ln{( \frac{ y_t }{ y_{t-1} })}$
(3) Discretely occurring economic data in numerical form, such as GDP, saving, and investment, are often differenced by taking proportions. $\frac{ y_t - y_{t-1} }{ y_{t-1} }$
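The three transformations in these rules of thumb are one-liners in NumPy. The function names and the tiny sample arrays below are made up for illustration.

```python
import numpy as np


def first_difference(y):
    """Rule (1): y_t - y_{t-1}."""
    return y[1:] - y[:-1]


def log_return(y):
    """Rule (2): ln(y_t / y_{t-1}); requires strictly positive values."""
    return np.log(y[1:] / y[:-1])


def proportional_change(y):
    """Rule (3): (y_t - y_{t-1}) / y_{t-1}."""
    return (y[1:] - y[:-1]) / y[:-1]


rate = np.array([5.0, 5.2, 5.1])         # e.g. a rate in percent
price = np.array([100.0, 110.0, 99.0])   # e.g. a price level

print(first_difference(rate))
print(log_return(price))
print(proportional_change(price))
```

For small changes the log difference and the proportional change are close to each other, which is why either is commonly used for price-like data.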
And here is an important property worth noting.
When x and y are I(d), their linear combination is in general also I(d). $x \sim I(d), \; y \sim I(d) \to ax + by + c \sim I(d)$
And finally, here is a collection of simple rules for determining the order of integration.
Rules of thumb for number of orders