A classification model that predicts whether the EURUSD exchange rate will go up or down the next day, based on historical data.
I used a linear regression model to evaluate how well the features predict future values of the time series.
I tried multiple feature sets and got the following results:
- using the High-Low average as the only feature
- using
as features
- using
as features
- using 116 features that can be found here
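The High-Low average feature and the next-day up/down target can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names `High` and `Low`, the helper `add_hl_average_and_label`, and the tiny price table are all assumptions made for the example.

```python
import pandas as pd


def add_hl_average_and_label(df: pd.DataFrame) -> pd.DataFrame:
    """Add the High-Low average and a next-day direction label.

    Assumes `df` is sorted by date and has `High` and `Low` columns
    (hypothetical column names).
    """
    out = df.copy()
    # Feature: midpoint between the daily high and low.
    out["hl_avg"] = (out["High"] + out["Low"]) / 2.0
    # Target: 1.0 if the next day's average is higher, otherwise 0.0.
    next_avg = out["hl_avg"].shift(-1)
    out["target"] = (next_avg > out["hl_avg"]).astype(float)
    # The last row has no next day to compare with, so drop it.
    return out.iloc[:-1]


# Made-up prices, for illustration only.
prices = pd.DataFrame(
    {"High": [1.10, 1.12, 1.11, 1.13], "Low": [1.08, 1.10, 1.09, 1.12]}
)
labeled = add_hl_average_and_label(prices)
print(labeled)
```

Dropping the final row keeps a label that was never observed out of the training data.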
I used multiple classifiers to get the best results. The main idea is the same: the average for each day is compared with the average for the next day (the objective of the task), and the rate can be higher (1) or lower (0). The results can be found below.
- Adaboost classifier
training time
| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0.0 | 0.55 | 0.60 | 0.57 | 230 |
| 1.0 | 0.51 | 0.46 | 0.48 | 208 |
| avg / total | 0.53 | 0.53 | 0.53 | 438 |
- Random Forest
training time
| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0.0 | 0.61 | 0.61 | 0.61 | 230 |
| 1.0 | 0.57 | 0.57 | 0.57 | 208 |
| avg / total | 0.59 | 0.59 | 0.59 | 438 |
- SGD Classifier
training time
| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0.0 | 0.64 | 0.56 | 0.60 | 230 |
| 1.0 | 0.57 | 0.65 | 0.61 | 208 |
| avg / total | 0.61 | 0.60 | 0.60 | 438 |
training time
| class | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0.0 | 0.63 | 0.60 | 0.61 | 230 |
| 1.0 | 0.58 | 0.61 | 0.59 | 208 |
| avg / total | 0.60 | 0.60 | 0.60 | 438 |
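A comparison like the one above could be produced with a loop along these lines. This is only a sketch under assumptions: the data is synthetic (it does not reproduce the numbers reported here), the hyperparameters are mostly library defaults chosen for the example, and the 80/20 time-ordered split is an assumption rather than the split actually used.

```python
import time

import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import classification_report

# Synthetic stand-in data (assumption): 5 features, binary up/down target.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(float)

# Time-ordered split: train on the first 80%, test on the last 20%.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

classifiers = {
    "Adaboost": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "SGD": SGDClassifier(random_state=0),
}

reports = {}
for name, clf in classifiers.items():
    # Time only the fit call, matching the "training time" entries.
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    reports[name] = classification_report(y_test, clf.predict(X_test))
    print(f"{name}: training time {elapsed:.3f}s")
    print(reports[name])
```

Keeping the test rows strictly after the training rows avoids evaluating on days that precede the training period.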
This time I used multiple techniques to get the best accuracy and to make sure the model really achieves the accuracy it reports.
- Polynomial Features
I experimented with 2nd- and 3rd-degree polynomial features and got better results with the 2nd degree.
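As a rough illustration of what the polynomial expansion does and how the two degrees might be compared, here is a small sketch using scikit-learn's `PolynomialFeatures`. The synthetic data, the scaling step, the logistic regression model, and the 5-fold cross-validation are assumptions for the example, not the original setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Two input features a, b expand to [a, b, a^2, a*b, b^2] at degree 2.
X_small = np.array([[1.0, 2.0], [3.0, 4.0]])
X2 = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X_small)
X3 = PolynomialFeatures(degree=3, include_bias=False).fit_transform(X_small)
print(X2.shape, X3.shape)

# Synthetic data (assumption) where the target depends on an interaction.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

# Compare mean cross-validation accuracy for each degree.
scores = {}
for degree in (2, 3):
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        LogisticRegression(max_iter=1000),
    )
    scores[degree] = cross_val_score(model, X, y, cv=5).mean()
print(scores)
```

The feature count grows quickly with the degree, which is one plausible reason a lower degree can generalize better on a small dataset.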
- using correlation as an indicator for the features
I tried using only features that have a high correlation with the output variable. I tested correlation thresholds of 0.5, 0.7, 0.8, and 0.9 (keeping features at or above each threshold) and got the best CV score on the
correlation
- Grid search for best params
I used grid search to find the best parameters for my model. After trying multiple feature sets, I got the best result using logistic regression with a penalty score of 100 and l1 as the penalty function.
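The correlation filter and the grid search described above might be combined roughly like this. It is a sketch under several assumptions: the data is synthetic, `select_by_correlation` is a hypothetical helper, absolute Pearson correlation is assumed, and the "penalty score" is assumed to correspond to scikit-learn's `C` parameter (which is the inverse of the regularization strength).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV


def select_by_correlation(X: pd.DataFrame, y: pd.Series, threshold: float) -> list:
    """Return columns whose absolute correlation with y is >= threshold."""
    corr = X.corrwith(y).abs()
    return corr[corr >= threshold].index.tolist()


# Synthetic data (assumption): one strong, one weak, one irrelevant feature.
rng = np.random.default_rng(0)
n = 300
signal = rng.normal(size=n)
y = pd.Series((signal > 0).astype(float))
X = pd.DataFrame(
    {
        "strong": signal + 0.1 * rng.normal(size=n),
        "weak": signal + 3.0 * rng.normal(size=n),
        "noise": rng.normal(size=n),
    }
)

# Candidate settings; liblinear supports both l1 and l2 penalties.
param_grid = {"penalty": ["l1", "l2"], "C": [0.01, 1, 100]}

best = None
for threshold in [0.5, 0.7, 0.8, 0.9]:
    cols = select_by_correlation(X, y, threshold)
    if not cols:
        continue  # no feature passes this threshold
    search = GridSearchCV(LogisticRegression(solver="liblinear"), param_grid, cv=5)
    search.fit(X[cols], y)
    # Keep the threshold whose best cross-validation score is highest.
    if best is None or search.best_score_ > best[1]:
        best = (threshold, search.best_score_, search.best_params_)
print(best)
```

`best` holds the winning threshold, its CV score, and the selected parameters, so the same loop covers both the feature filter and the parameter search.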
- We can see clearly that the set of features captured more of the time series and resulted in a better model in both the short and long term.
- I chose logistic regression because it had the best CV score and test score for prediction.