Error Correction Model (ECM): An Intuitive Guide for Time Series Modelling

Prasun Biswas
5 min read · Aug 24, 2020


Written by Prasun Biswas and Chandan Durgia

Photo by Chris Liverani on Unsplash

Take a moment to think about our own behavior. We might push ourselves to be someone we are not for a day or two, but soon enough we are back to our natural behavior (the equilibrium). This phenomenon is called reversion to equilibrium, and it is ubiquitous in nature.

Since trading is primarily behavior-driven, it is no surprise that the concept applies to a variety of trading areas. (Do read how many trading strategies are related to the concept of mean reversion.) In fact, some of the biggest bets ever made in the currency trading world rested on the assumption of equilibrium reversion (if interested, read about the best three trades ever in the currency trading world, where traders like Mr. George Soros made a fortune).

There are many more examples from the world of stochastic models as well, such as the Hull-White model and the CIR model, which are built on equilibrium reversion.

So, the point is — equilibrium reversion works.

Having already beaten the dead horse way too much, let’s shift gears!

When one talks about modeling a time series regression, the first thought that comes to one’s “programmed” mind is:

- Let’s ensure that stationarity exists for the dependent and independent variables and develop a regression model to capture the short-term trend between the variables.

- If stationarity doesn’t exist, let’s check whether the data is co-integrated, and develop a regression model to capture the long-term trend.

The question is — “is there a middle play?” — can we capture both the short-term and the long-term trends somehow? This is where the “Error Correction Model” (ECM) finds its value in the econometrics world.

In simple words, an ECM describes how the dependent variable (y) and the independent variable (x) behave in the short run, consistent with a long-run cointegrating relationship. To elaborate, the long-term relationship between the co-integrated variables is first captured by regressing y on x in levels. The error terms of this regression, together with other short-term drivers, are then used to correct the short-term dynamics so that they align with the long-term equilibrium (equilibrium reversion).

Continuing to elaborate more on this.

Let’s consider a dependent variable y(t) and an independent variable x(t). If y and x are non-stationary and we fit a regression model on them, the model can be spurious and its estimates unreliable. If we instead stationarize the variables (for example by differencing) and run the regression, the estimates are valid, but the model only captures the short-term relationship. On its own, it tells us nothing about the long-term relationship between the dependent and independent variables.
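To make the spurious-regression point concrete, here is a minimal simulation sketch, assuming numpy is available. The sample size, number of simulations and seed are arbitrary illustrative choices. It regresses pairs of independent random walks on each other, first in levels and then in first differences, and compares the average R².

```python
import numpy as np

def r_squared(y, x):
    """R^2 of an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n_obs, n_sims = 200, 300         # illustrative sizes
r2_levels, r2_diffs = [], []

for _ in range(n_sims):
    # Two random walks built from independent shocks: no true relationship.
    x = np.cumsum(rng.normal(size=n_obs))
    y = np.cumsum(rng.normal(size=n_obs))
    r2_levels.append(r_squared(y, x))                    # non-stationary levels
    r2_diffs.append(r_squared(np.diff(y), np.diff(x)))   # stationary differences

mean_r2_levels = float(np.mean(r2_levels))
mean_r2_diffs = float(np.mean(r2_diffs))
print(f"mean R^2, levels:      {mean_r2_levels:.3f}")
print(f"mean R^2, differences: {mean_r2_diffs:.3f}")
```

Even though the two series are unrelated by construction, the levels regression tends to report a sizeable R², while the differenced regression correctly shows almost none.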

Now consider a case where y and x are co-integrated — which means that there is a long-term relationship between these variables. Mathematically it can be expressed as:

y(t) = a + b x(t) + u(t)

which means:

u(t) = y(t) - a - b x(t)

Great! So we have a long-term relationship, and the error between the actual and predicted values in this relationship is given by the equation above. As you have probably guessed by now, this is the error we are going to correct for in the short-term equation.
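The long-run step can be sketched on simulated data. This is a minimal illustration, assuming numpy is available. The coefficients (a = 1, b = 2), the AR(1) parameter 0.7 and the seed are hypothetical choices, not values from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(42)   # arbitrary seed
n_obs = 500
a_true, b_true = 1.0, 2.0         # hypothetical long-run coefficients

# x is a random walk, so it is non-stationary.
x = np.cumsum(rng.normal(size=n_obs))

# u is a stationary AR(1) deviation from the equilibrium.
u = np.zeros(n_obs)
shocks = rng.normal(size=n_obs)
for t in range(1, n_obs):
    u[t] = 0.7 * u[t - 1] + shocks[t]

# y and x are cointegrated by construction.
y = a_true + b_true * x + u

# Long-run regression y(t) = a + b x(t) + u(t), estimated by OLS in levels.
X = np.column_stack([np.ones(n_obs), x])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

# Estimated equilibrium error: u(t) = y(t) - a - b x(t).
u_hat = y - a_hat - b_hat * x

print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
print(f"std of estimated error = {u_hat.std():.3f}")
```

The residual series `u_hat` is what gets carried into the short-term equation.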

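Next, we take the errors in stationary form and, together with the stationary forms of y and x, regress as follows:

∆y(t) = a1 + a2*∆x(t) + a3*∆x1(t) - a4*us(t) + v(t)

Breaking down the above equation, the change in y is a function of the changes in x and x1 (a new variable that captures short-term trends). Here y, x and x1 have all been made stationary. Additionally, there is a us(t) component, which is the stationary error from the long-term relationship equation. Note that if y and x are truly cointegrated, u(t) is already stationary by definition, and in the standard formulation the value that enters this regression is the previous period's error, u(t-1).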
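A two-step estimation of this kind (the Engle-Granger style procedure) can be sketched as follows, assuming numpy is available. To keep it short, the sketch drops the extra short-term driver x1 and uses the previous period's residual as the error-correction term. All "true" coefficients and the seed are hypothetical values chosen so the estimates can be checked against them.

```python
import numpy as np

rng = np.random.default_rng(7)    # arbitrary seed
n_obs = 1000
a_true, b_true = 1.0, 2.0         # hypothetical long-run relationship y = a + b x
a2_true, a4_true = 0.5, 0.3       # hypothetical short-run effect and adjustment speed

# Simulate data from an ECM so that the true coefficients are known.
x = np.cumsum(rng.normal(size=n_obs))
v = rng.normal(scale=0.5, size=n_obs)
y = np.empty(n_obs)
y[0] = a_true + b_true * x[0]
for t in range(1, n_obs):
    gap = y[t - 1] - a_true - b_true * x[t - 1]   # last period's error
    y[t] = y[t - 1] + a2_true * (x[t] - x[t - 1]) - a4_true * gap + v[t]

# Step 1: long-run regression in levels, keep the residuals.
X_long = np.column_stack([np.ones(n_obs), x])
coef_long, *_ = np.linalg.lstsq(X_long, y, rcond=None)
u_hat = y - X_long @ coef_long

# Step 2: regress dy(t) on a constant, dx(t) and the lagged residual u(t-1).
dy, dx, u_lag = np.diff(y), np.diff(x), u_hat[:-1]
X_short = np.column_stack([np.ones(n_obs - 1), dx, u_lag])
coef_short, *_ = np.linalg.lstsq(X_short, dy, rcond=None)
a1_hat, a2_hat = coef_short[0], coef_short[1]
a4_hat = -coef_short[2]           # the equation carries a minus sign on a4

print(f"a2_hat = {a2_hat:.3f} (true {a2_true})")
print(f"a4_hat = {a4_hat:.3f} (true {a4_true})")
```

In practice one would also check the standard errors and the stationarity of the residuals before trusting the estimates; this sketch only shows the mechanics.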

Image by Author

The key thing to notice is that the sign on the a4 term (the coefficient of us(t)) is negative, and the rationale is quite intuitive. If y(t) > a + b x(t), y is above its equilibrium value, and the negative sign on a4 pulls y back down towards equilibrium; if y is below equilibrium, the same term pushes it up. For a gradual, non-oscillating return, a4 should lie between 0 and 1, so that only a fraction of the deviation is corrected in each period. The term error correction reflects the fact that the last period's deviation from the long-run equilibrium, called the error, influences the short-run dynamics. Thus, through a4, the ECM captures the speed at which the dependent variable returns to equilibrium.
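One way to read the speed of adjustment is as a half-life. If a fraction a4 of the deviation is corrected each period and no new shocks arrive, the remaining deviation shrinks by a factor (1 - a4) per period. The helper names below are hypothetical, and the sketch assumes 0 < a4 < 1.

```python
import math

def half_life(a4: float) -> float:
    """Periods needed for a deviation to halve when a fraction a4 is corrected each period."""
    if not 0.0 < a4 < 1.0:
        raise ValueError("this sketch assumes 0 < a4 < 1")
    return math.log(0.5) / math.log(1.0 - a4)

def remaining_gap(initial_gap: float, a4: float, periods: int) -> float:
    """Deviation left after `periods` steps, ignoring any new shocks."""
    return initial_gap * (1.0 - a4) ** periods

for a4 in (0.1, 0.3, 0.5):
    print(f"a4 = {a4}: half-life = {half_life(a4):.2f} periods")
```

A larger a4 means a shorter half-life, that is, a faster return to equilibrium.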

There are multiple key use cases for ECM, some common ones being:

· M2 (money supply) demand as a function of GNP and the short-term interest rate

· US 10-year yield as a function of the US 3-year interest rate

· Stock prices as a function of market variables

· Deposit balances as a function of equity and interest rates variables

· Consumption expenditure as a function of disposable income, and so on.

An important point to note: unlike a plain linear regression, where in practice mild violations of the assumptions often do not change the model materially, for an ECM each assumption matters and should not be ignored. In particular, x and y must be co-integrated in the long term, and all variables in the short-term equation, including the error term from the long-term equation, must be stationary.

Note that for cointegration there are primarily two tests used in the industry: the Johansen test and the Engle-Granger test. Both tests have their own pros and cons; however, Johansen’s test is generally considered an improvement over the Engle-Granger test, as it avoids the issue of choosing a dependent variable and can detect multiple cointegrating vectors.
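The idea behind the Engle-Granger test can be sketched in a few lines: fit the long-run regression, then run a unit-root test on its residuals. The version below is a simplified illustration assuming numpy is available. It uses a plain Dickey-Fuller regression with no lag augmentation, simulated data and an approximate 5% critical value of about -3.34 (the commonly tabulated asymptotic value for two variables with a constant). For real work, a library implementation with proper critical values is the safer choice.

```python
import numpy as np

def ols_residuals(y, x):
    """Residuals from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def df_t_stat(u):
    """t-statistic on rho in du(t) = rho * u(t-1) + e(t), with no lag augmentation."""
    du, u_lag = np.diff(u), u[:-1]
    rho = (u_lag @ du) / (u_lag @ u_lag)
    resid = du - rho * u_lag
    sigma2 = (resid @ resid) / (len(du) - 1)
    return rho / np.sqrt(sigma2 / (u_lag @ u_lag))

APPROX_CRIT_5PCT = -3.34   # assumption: approximate value, check published tables

rng = np.random.default_rng(1)   # arbitrary seed
n_obs = 500
x = np.cumsum(rng.normal(size=n_obs))

# Cointegrated pair: y tracks 1 + 2x with a stationary AR(1) gap (hypothetical values).
gap = np.zeros(n_obs)
eps = rng.normal(size=n_obs)
for t in range(1, n_obs):
    gap[t] = 0.7 * gap[t - 1] + eps[t]
y_coint = 1.0 + 2.0 * x + gap

# Unrelated pair: an independent random walk.
y_indep = np.cumsum(rng.normal(size=n_obs))

stat_coint = df_t_stat(ols_residuals(y_coint, x))
stat_indep = df_t_stat(ols_residuals(y_indep, x))

for name, stat in (("cointegrated pair", stat_coint), ("independent pair", stat_indep)):
    verdict = "reject no-cointegration" if stat < APPROX_CRIT_5PCT else "cannot reject"
    print(f"{name}: statistic = {stat:.2f} -> {verdict}")
```

Note that the result depends on which variable is placed on the left-hand side, which is the limitation of Engle-Granger mentioned above.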

For stationarity, a combination of the ADF, KPSS and Phillips-Perron tests is typically run, and if the tests give conflicting results, a judicious call is taken based on the data distribution.
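Part of the reason these tests can conflict is that they have opposite null hypotheses: ADF (like Phillips-Perron) takes a unit root as the null, while KPSS takes stationarity as the null. A small, hypothetical helper for reconciling two p-values might look like this. The function name and the 5% threshold are illustrative assumptions, and the p-values themselves would come from whichever statistical library you use.

```python
def stationarity_verdict(adf_pvalue: float, kpss_pvalue: float, alpha: float = 0.05) -> str:
    """Combine ADF (null: unit root) and KPSS (null: stationary) p-values into a verdict."""
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"        # both tests point the same way
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"    # both tests point the same way
    return "inconclusive"          # the tests disagree: needs a judgment call

print(stationarity_verdict(adf_pvalue=0.01, kpss_pvalue=0.10))
print(stationarity_verdict(adf_pvalue=0.50, kpss_pvalue=0.01))
print(stationarity_verdict(adf_pvalue=0.01, kpss_pvalue=0.01))
```

The "inconclusive" branch is where the judicious call comes in, for example by inspecting plots or checking for trends and structural breaks.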

Concluding remarks: the next time you model a relationship between time series variables, do not consider only a long-term relationship or only a short-term one. Try fitting an ECM (assuming all statistical assumptions pass) to get the best of both worlds: long- and short-term relationships.

Ending on a related philosophical note, Tim Minchin, a famous Australian comedian (and much more), in one of his famous speeches said:

“….. you should be careful of long-term dreams. If you focus too far in front of you, you won’t see the shiny thing out of the corner of your eye.”

The pursuit of a great model that performs well over the longer term is something we all aim for as data scientists, but in the process we forget how important it is to understand and capture short-term fluctuations, “those little shiny things out of the corner of your eye”. ECM, overall, is a great methodology to accomplish that.

Happy Learning!


Prasun Biswas

Writes about Data Science and Machine Learning | 10 years of industry experience | Get in touch: https://www.linkedin.com/in/prasun-biswas-254a5b78/