All you need to know about the basics of ARIMA

Praseeda Saripalle
Jul 3, 2021

Time series refers to a series of data points indexed in time order.
Time series analysis is the use of statistical methods to analyze time-series data and extract key inferences from it.

Time series forecasting is the building of a predictive model that estimates future values from past time-series data.

The graph above shows a time series model that also predicts future values: the blue lines indicate the trends in the existing data, while the orange lines denote the predicted future values.

The important properties to check in a time series are:
1) Stationarity: the statistical properties of the series stay roughly the same over time. Concretely,
* the mean is constant over time,
* the variance is constant over time,
* there is no seasonality.

2) Seasonality: patterns that repeat at fixed intervals. Seasonality must be removed (or modeled explicitly) before fitting a plain ARIMA model.

So, it is important to check whether a time series is stationary before we attempt prediction.

We can check whether a time series is stationary by:
* Visual inspection
* Comparing global and local statistics (e.g., the mean and variance of different slices of the series)
* The (Augmented) Dickey-Fuller test

How do we make a non-stationary series stationary?
1) Differencing:
Differencing can stabilise the mean of a time series by removing changes in its level, thereby eliminating (or reducing) trend and seasonality.

2) Log transform: to dampen the effect of exponential growth. [log(exp(x)) = x]

3) Seasonal differencing: subtract from each value the value one seasonal cycle earlier.

Fig 3: First-order differencing. Fig 4: Second-order differencing.
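The three transformations above can be sketched with pandas on a toy trending series (the values and the seasonal period of 2 are made up for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, 12.0, 15.0, 19.0, 24.0, 30.0])  # toy series with a growing trend

first_diff = s.diff()              # first-order differencing: s[t] - s[t-1]
second_diff = s.diff().diff()      # second-order differencing
log_s = np.log(s)                  # log transform to dampen exponential growth
seasonal_diff = s.diff(periods=2)  # seasonal differencing with a toy period of 2

print(first_diff.tolist())   # [nan, 2.0, 3.0, 4.0, 5.0, 6.0]
print(second_diff.tolist())  # [nan, nan, 1.0, 1.0, 1.0, 1.0]
```

Note how the second difference is constant here: one round of differencing removed the trend's level changes, and the second flattened what remained.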

Auto Regressive Model

What is meant by the Auto-Regressive approach?

A statistical model is autoregressive if it predicts future values based on past values.

  • They predict the future based on past values.
  • These models assume that the future will resemble the past.
  • Autoregressive models operate under the premise that past values affect current values, which makes the technique popular for analyzing processes that vary over time.
  • An AR(1) process is one in which the current value depends on the immediately preceding value.
  • An AR(2) process is one in which the current value depends on the previous two values.
  • An AR(0) process models white noise and has no dependence between the terms.

However, since autoregressive models base their predictions only on past information, they implicitly assume that the fundamental forces that shaped past values will not change over time. This can lead to surprising and inaccurate predictions if the underlying forces are in fact changing, such as when an industry is undergoing a rapid and unprecedented technological transformation.

Autocorrelation and Partial Autocorrelation:

1) Autocorrelation: the relationship of a variable with its own values at previous time periods. These previous time-period values are known as lags.

2) Partial autocorrelation summarizes the relationship between an observation in a time series and observations at previous time steps, with the effects of the intervening observations removed.

  • An autocorrelation of +1 represents a perfect positive correlation, while an autocorrelation of -1 represents a perfect negative correlation.
  • Autocorrelation captures both the direct and the indirect effects of previous lags.
    Eg: if you compare marks in 8th std and 12th std, there are indirect effects too: 12th std depends on 11th, 11th on 10th, and so on down to 8th.
  • Partial autocorrelation keeps only the direct effect of a given lag. Eg: here we only check whether 12th-std marks correlate directly with 8th-std marks.

For choosing the order of an autoregressive model, we look at the partial autocorrelation: the lag beyond which the PACF drops to zero gives the AR order p.

Moving Average

A moving-average (MA) model predicts future values of a time series from past forecast errors.

ARIMA Model

Autoregressive Integrated Moving Average

  • ARIMA models predict future values based on past values.
  • ARIMA makes use of lagged moving averages to smooth time-series data.
  • ARIMA is a form of regression analysis that gauges the strength of one dependent variable relative to other changing variables.
  • There are 3 components in ARIMA:
  1. Autoregression (AR): refers to a model in which a changing variable regresses on its own lagged, or prior, values.
  2. Integrated (I): represents the differencing of raw observations to make the time series stationary (i.e., data values are replaced by the differences between consecutive values).
  3. Moving average (MA): incorporates the dependency between an observation and the residual errors from a moving-average model applied to lagged observations.

Important terms in ARIMA:

ARIMA — p, d, q
p — order of the AR model (number of lagged values)
d — order of differencing needed to make the series stationary
q — order of the MA model (number of lagged forecast errors)
