Wednesday, November 9, 2011

Proc ARIMA to Fit MA(q) model

Moving average models are a class of models that estimate the process underlying a time series, where the current time series value is related to the random errors from previous time periods. In contrast, auto-regressive models estimate a process where the current time series value is related to the actual time series values from previous time periods. Like auto-regressive processes, time series with moving average processes also exhibit nonzero auto-correlation.
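In symbols, the two model classes can be sketched as follows. This uses the common textbook convention with plus signs on the coefficients, which is an assumption made here for illustration. PROC ARIMA writes the MA polynomial with minus signs, so the MA estimates it reports have the opposite sign from the theta values below. The errors epsilon_t are assumed to be white noise.

```latex
\begin{align*}
\text{MA}(q):\quad & Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \\
\text{AR}(p):\quad & Y_t = c + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t
\end{align*}
```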

The Box-Jenkins approach provides a framework for identifying the moving average process underlying a time series and for estimating model parameters. For an MA(q) process, the auto-correlation plots should display the following patterns:
1. The ACF drops to 0 after lag q.
2. The IACF and PACF tail off exponentially.
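One way to see these patterns is to simulate a series with a known MA order and inspect its identification output. The sketch below is not part of the severity analysis: the data set name ma2_sim, the seed, the sample size, and the coefficients 0.6 and 0.3 are all hypothetical choices for illustration.

```sas
* Simulate a hypothetical MA(2) series: y = e + 0.6*e(t-1) + 0.3*e(t-2);
data ma2_sim;
   call streaminit(2011);      * arbitrary seed, for reproducibility only;
   e1 = 0; e2 = 0;             * previous two errors, started at zero;
   do t = 1 to 500;
      e = rand('normal');      * white-noise error for this period;
      y = e + 0.6*e1 + 0.3*e2;
      output;
      e2 = e1;                 * shift the error history by one period;
      e1 = e;
   end;
   keep t y;
run;

* Inspect the ACF, IACF, and PACF of the simulated series;
proc arima data=ma2_sim;
   identify var=y nlag=12;
run;
quit;
```

For these coefficients the theoretical autocorrelations are about 0.54 at lag 1 and 0.21 at lag 2, and zero at every later lag. The sample ACF should roughly follow that shape, with sampling noise.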

A plot of the data shows an upward trend. With this type of trend, the time series is non-stationary. A time series is said to be stationary if its mean, variance, and lag covariances are constant over time. One common way of achieving stationarity is to difference the series, that is, to compute Y_t - Y_{t-1} for every observation after the first.
*take first differences to remove non-stationary trend;
data predicted_series;
set predicted_series;
severity_diff=dif(severity);
run;

The first phase is the identification phase, which you perform by issuing the IDENTIFY statement.
proc arima data=predicted_series ;
identify var=severity_diff nlag=12 scan;
run;
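As an alternative to differencing in a DATA step, PROC ARIMA can difference the series itself when the differencing lag is given in parentheses after the variable name. This is a minimal sketch of that variant, assuming the same predicted_series data set and severity variable used above. It is not the approach this post follows.

```sas
* Sketch: let PROC ARIMA take the first difference of severity;
proc arima data=predicted_series;
   identify var=severity(1) nlag=12 scan;
run;
quit;
```

With this form, a later FORECAST statement produces forecasts for severity itself rather than for its differences.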

The second phase of the ARIMA procedure is the estimation phase.
proc arima data=predicted_series ;
identify var=severity_diff scan;
estimate q=2 method=ml;
run;

The ACF drops to 0 after the sixth lag, indicating MA(6).
The IACF drops to 0 after the fourth lag, indicating AR(4).
The PACF has a spike at the sixth lag, indicating AR(6).

The third phase is to forecast the time series.
proc arima data=predicted_series ;
identify var=severity_diff;
estimate q=4 method=ml;
forecast lead=60 interval=month id=failure_date out=outlead;
run;

axis1 width=1 offset=(1 pct) label=(a=90 r=0 'Severity');
axis2 width=1 offset=(1 pct) label=('Failure Date') value=(h=1.25);
symbol1 v=star ci=red height=1 cells interpol=join l=1 w=2;
symbol2 v=dot ci=green height=1 cells interpol=join l=1 w=2;
legend1 label=('') value=('Actual Claim Severity' 'Predicted Claim Severity') across=2 mode=protect position=(top center inside);
title 'Severity Prediction';
proc gplot data=outlead;
format failure_date monyy. severity_diff forecast dollar.;
plot (severity_diff forecast)*failure_date/ overlay
caxis = BLACK
ctext = BLACK
vaxis = axis1
haxis = axis2
legend= legend1
grid
hminor=0
;
run;
quit; 
