# IBM stock forcast.R
# In this exercise, I analyzed the IBM stock adjusted close price over the period 2017-01-01 to 2021-12-10 and built several time series forecasting models (naïve, seasonal naïve, ETS, ARIMA, neural network), then checked which one returned the best forecasts over a test period covering the last 7 months of the data. The data come from Yahoo Finance.
# Libraries
library(readr)
library(forecast)
## Loading the Data
IBM <- read_csv("C:/Users/baaba/Desktop/MS. Applied Economic and Data analysis/ADEC7460.02 Fall 2021 Predictive AnalyticsForecasting [Fulton]/Module 6 VAR, Panel, HTS, Machine Learning 25 Nov -5 Dec/Assignment Homework Time Series/IBM.csv")
head(IBM)
summary(IBM)
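# Optional alternative (a sketch, not part of the original workflow; it assumes
# the quantmod package is installed): the same adjusted close series can be
# pulled straight from Yahoo Finance instead of the machine-specific CSV path
# above. It is not used further below; the analysis continues with the CSV data.
if (requireNamespace("quantmod", quietly = TRUE)) {
  ibm.xts <- quantmod::getSymbols("IBM", src = "yahoo",
                                  from = "2017-01-01", to = "2021-12-10",
                                  auto.assign = FALSE)
  ibm.adj <- quantmod::Ad(ibm.xts)   # daily adjusted close as an xts series
}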
# Plotting the data
IBM.ts <- ts(IBM$`Adj Close`,
             start = c(2016, 1),
             end = c(2021, 12),
             frequency = 12)
autoplot(IBM.ts)
## Training/Test Split
# After loading the data from Yahoo Finance, I split the series into a training set from January 2016 through May 2021 and a test set from June 2021 to the end of the data period.
# Training set
train = window(IBM.ts, end = c(2021,5))
# Test set
test = window(IBM.ts, start = c(2021,6)) # 7 months test set
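# Quick sanity check on the split (an addition, not in the original script):
# the test window should hold the last 7 monthly observations of the series.
length(test)                # expected: 7
end(train); start(test)     # training ends 2021-05, test starts 2021-06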
## Naïve Model
# I generated a naïve model and checked its forecasting performance on the test set. The naïve model produced a MASE of 1.2720367 and an RMSE of 17.736403; next we compare it with the other models.
naive.fit = naive(train)
naive.forecast = forecast(naive.fit, h = 7)
summary(naive.fit)
checkresiduals(naive.forecast)
accuracy(naive.forecast, test)
autoplot(IBM.ts) +
autolayer(naive.forecast, series = "naive Forecast") +
autolayer(test , series = "Actual price")
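# Aside (a sketch): naive() already returns a forecast object, so the 7-month
# horizon can also be supplied in a single call instead of re-running the fit
# through forecast().
naive.forecast.direct <- naive(train, h = 7)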
## Seasonal naïve Model
# Next I generated a seasonal naïve model and checked its forecasting performance on the test set. The seasonal naïve model produced a MASE of 1.009509 and an RMSE of 14.93724. By these test-set figures the seasonal naïve model scores a lower RMSE and MASE than the naïve model; both serve as baselines for judging how good the subsequent models are.
snaive.fit = snaive(train)
snaive.forecast = forecast(snaive.fit, h = 7)
summary(snaive.fit)
checkresiduals(snaive.forecast)
accuracy(snaive.forecast, test)
autoplot(IBM.ts) +
autolayer(snaive.forecast, series = "snaive Forecast") +
autolayer(test , series = "Actual price" )
## ETS Model
# For the next model, I fit an error, trend, seasonal (ETS) model and checked its forecasting performance on the test set. The ETS model produced a MASE of 0.7658349 and an RMSE of 10.320349. Although the seasonal naïve forecast may look closer to the actual prices in the plot, the ETS model has a better MASE and RMSE than the seasonal naïve model.
ETS.fit = ets(train)
ETS.forecast = forecast(ETS.fit, h = 7)
summary(ETS.fit)
checkresiduals(ETS.forecast)
accuracy(ETS.forecast, test)
autoplot(IBM.ts) +
autolayer(ETS.forecast, series = "ETS Forecast") +
autolayer(test , series = "Actual price")
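# An addition for readability: the error/trend/seasonal specification that
# ets() selected is stored on the fitted object (it also appears in the
# summary() output above).
ETS.fit$method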
## ARIMA Model
# Next I fit an ARIMA model and checked its forecasting performance on the test set; it returned a MASE of 0.3730515 and an RMSE of 5.452839, so the ARIMA model beats the ETS model on this data.
ARIMA.fit = auto.arima(train)
ARIMA.forecast = forecast(ARIMA.fit, h = 7)
summary(ARIMA.fit)
checkresiduals(ARIMA.forecast)
accuracy(ARIMA.forecast, test)
autoplot(IBM.ts) +
autolayer(ARIMA.forecast, series = "ARIMA Forecast") +
autolayer(test , series = "Actual price")
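# An addition for readability: the (p,d,q)(P,D,Q)[m] order chosen by
# auto.arima() can be read off the fitted object with forecast::arimaorder()
# (the same information appears in the summary() output above).
arimaorder(ARIMA.fit)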
## Neural Network Model
# Next I tried a neural network model to see whether it could outperform the other models. The neural network model produced a MASE of 0.4623200 and an RMSE of 7.424988, so its performance on this data falls short of the ARIMA model's.
nn.fit = nnetar(train, lambda=0)
# Simulate 5 sample paths of 20 future months from the fitted network
sim <- ts(matrix(0, nrow=20, ncol=5),
          start = end(train) + c(0, 1), frequency = frequency(train))
for(i in seq(5))
  sim[,i] <- simulate(nn.fit, nsim=20)
library(ggplot2)
ggplot2::autoplot(train) + forecast::autolayer(sim)
nn.forecast <- forecast(nn.fit, PI=TRUE, h=7)
checkresiduals(nn.forecast)
accuracy(nn.forecast, test)
autoplot(IBM.ts) +
autolayer(nn.forecast, series = "Neural Network Forecast") +
autolayer(test , series = "Actual price")
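# Note (a sketch, not part of the original run): nnetar() averages several
# networks with random initial weights, so the neural network metrics above
# can change from run to run. Fixing the RNG seed before refitting makes them
# reproducible; the seed value here is arbitrary.
set.seed(2021)
nn.fit.repro <- nnetar(train, lambda = 0)
nn.forecast.repro <- forecast(nn.fit.repro, PI = TRUE, h = 7)
accuracy(nn.forecast.repro, test)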
## Conclusion
data.frame(Model = c("1. naive model", "2. snaive model", "3. ETS model",
                     "4. ARIMA model", "5. Neural Network"),
           RMSE  = c(8.048907, 14.93724, 10.320349, 5.452839, 9.559687),
           MASE  = c(0.5629862, 1.009509, 0.7658349, 0.3730515, 0.6892889))
# Ultimately, the ARIMA model returned the best results against the actual test data (MASE = 0.3730515 and RMSE = 5.452839), so among these models we choose ARIMA to forecast the future IBM adjusted close price.
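# A sketch (an addition, not in the original script): the comparison table
# above can also be rebuilt directly from the test-set rows of each accuracy()
# call, which avoids hard-coding the metric values.
results <- rbind(
  naive  = accuracy(naive.forecast,  test)["Test set", c("RMSE", "MASE")],
  snaive = accuracy(snaive.forecast, test)["Test set", c("RMSE", "MASE")],
  ETS    = accuracy(ETS.forecast,    test)["Test set", c("RMSE", "MASE")],
  ARIMA  = accuracy(ARIMA.forecast,  test)["Test set", c("RMSE", "MASE")],
  NNAR   = accuracy(nn.forecast,     test)["Test set", c("RMSE", "MASE")]
)
results[order(results[, "RMSE"]), ]   # lowest RMSE first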