ML Development Guide
Wesley Jones edited this page Sep 25, 2021 · 3 revisions
For time series, the common advice is to avoid a random train/test split: train on earlier data and test on later data, so the model is never evaluated on data older than what it was trained on. I'm still not sure I completely agree with this.
Add a validation set, so you have train/validation/test splits in chronological order.
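A minimal sketch of a chronological train/validation/test split, using only the standard library. The function name `chronological_split`, the `timestamp` key, and the 70/15/15 proportions are assumptions made for illustration, not something this guide prescribes.

```python
from datetime import date, timedelta


def chronological_split(rows, train_frac=0.7, val_frac=0.15):
    """Split rows into train/validation/test by time, without shuffling."""
    # Sort so that earlier observations come first.
    ordered = sorted(rows, key=lambda row: row["timestamp"])
    n = len(ordered)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    # Oldest rows train the model, the next block validates, the newest block tests.
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


# Hypothetical data: 100 daily observations.
start = date(2021, 1, 1)
rows = [{"timestamp": start + timedelta(days=i), "value": i} for i in range(100)]

train, val, test = chronological_split(rows)
```

Every training row is older than every validation row, and every validation row is older than every test row, which is the property a random split does not guarantee.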
Always save the model to a file
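One way to do this is with `pickle` from the standard library. The sketch below is an assumption-laden example: `NormalDist` stands in for a trained model, a temporary directory stands in for the repository root, and the `models/model.pkl` path is hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import NormalDist

# Stand-in for a trained model (hypothetical); other picklable objects are saved the same way.
model = NormalDist.from_samples([2.1, 2.5, 1.9, 2.3, 2.2])

# A temporary directory is used here in place of the real repository root.
repo_path = Path(tempfile.mkdtemp())
model_path = repo_path / "models" / "model.pkl"
model_path.parent.mkdir(parents=True, exist_ok=True)

# Serialize the model to disk.
with model_path.open("wb") as f:
    pickle.dump(model, f)

# Load it back later, for example in an evaluation script.
with model_path.open("rb") as f:
    restored = pickle.load(f)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.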
Always save the model accuracy to a file (JSON):
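import json
from pathlib import Path
from sklearn.metrics import accuracy_score
# Assumed to exist already: labels and predictions from evaluating the model,
# and repo_path, a pathlib.Path pointing at the repository root.
accuracy = accuracy_score(labels, predictions)
metrics = {"accuracy": accuracy}
accuracy_path = repo_path / "metrics/accuracy.json"
accuracy_path.parent.mkdir(parents=True, exist_ok=True)  # create metrics/ if it is missing
accuracy_path.write_text(json.dumps(metrics))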
Use git tags to manage ready-to-go models
Create a new git branch for each new feature