[Paper Page] [Chinese-language explainer by 时序人]
pip install -r requirements.txt
Download the datasets from Qingren/TSFM-ScalingLaws-Dataset. The directories are organized as follows:
- dataset_train
|- Lotsa16B
|- Lotsa1B
|- Lotsa100M
|- Lotsa10M
- dataset_test
|- Lotsa16B
|- Lotsa1B
|- Lotsa100M
|- Lotsa10M
|- LSF
|- Monash
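A quick way to confirm that a download matches the layout above is to check for each expected sub-directory. The following is a minimal sketch, assuming the datasets sit under a single root directory; the helper name `missing_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Sub-directories taken from the layout listed above.
EXPECTED_LAYOUT = {
    "dataset_train": ["Lotsa16B", "Lotsa1B", "Lotsa100M", "Lotsa10M"],
    "dataset_test": ["Lotsa16B", "Lotsa1B", "Lotsa100M", "Lotsa10M", "LSF", "Monash"],
}


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    missing = []
    for split, names in EXPECTED_LAYOUT.items():
        for name in names:
            path = root / split / name
            if not path.is_dir():
                missing.append(path.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Assumption: the datasets were downloaded into the current directory.
    for path in missing_dirs("."):
        print(f"missing: {path}")
```

An empty result means every directory in the listing is present; it says nothing about whether the files inside are complete.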
Create a .env file to indicate the pretraining dataset paths.
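The variable names that the .env file must define are not listed here, so the sketch below only illustrates the `KEY=path` format. The names `LOTSA_16B_PATH` and `LOTSA_1B_PATH` and the helper `render_env` are hypothetical placeholders; check the data configs in the repository for the names they actually read.

```python
from pathlib import Path


def render_env(paths):
    """Render a mapping of variable names to dataset paths as .env lines."""
    return "".join(f"{key}={Path(value).as_posix()}\n" for key, value in paths.items())


# Hypothetical variable names: replace them with the ones the repo's
# data configs actually read.
example = {
    "LOTSA_16B_PATH": "dataset_train/Lotsa16B",
    "LOTSA_1B_PATH": "dataset_train/Lotsa1B",
}

if __name__ == "__main__":
    # Print the lines; redirect the output to .env once the names are confirmed.
    print(render_env(example), end="")
```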
The test data consists of three parts: in-distribution data in dataset_test/Lotsa[DataSize], and out-of-distribution data in dataset_test/LSF and dataset_test/Monash.
Taking the Lotsa16B test data as an example, the storage_path fields in the config file cli/conf/pretrain/val_data/Lotsa16B_multi.yaml indicate the test data paths. The default paths are as follows:
- _target_: tsfm.data.builder.ConcatDatasetBuilder
- _target_: tsfm.data.builder.simple.SimpleEvalDatasetBuilder
storage_path: dataset_test/Monash
- _target_: tsfm.data.builder.ConcatDatasetBuilder
- _target_: tsfm.data.builder.simple.SimpleEvalDatasetBuilder
storage_path: dataset_test/LSF
- _target_: tsfm.data.builder.ConcatDatasetBuilder
- _target_: tsfm.data.builder.simple.SimpleEvalDatasetBuilder
storage_path: dataset_test/Lotsa16B
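If the test data lives somewhere else, each storage_path entry needs to point at the new location. The sketch below lists the storage_path values in a config file and reports those that do not exist on disk. It uses a plain regular expression rather than a YAML parser, so it assumes each value is unquoted and sits alone on its line; the helper names are hypothetical.

```python
import re
from pathlib import Path

# Matches lines of the form "storage_path: some/dir" (unquoted, no trailing comment).
STORAGE_PATH = re.compile(r"^[ \t]*storage_path:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def storage_paths(config_text):
    """Return every storage_path value found in the config text, in order."""
    return STORAGE_PATH.findall(config_text)


def missing_storage_paths(config_text, root="."):
    """Return the storage_path values that are not directories under `root`."""
    return [p for p in storage_paths(config_text) if not (Path(root) / p).is_dir()]


if __name__ == "__main__":
    config = Path("cli/conf/pretrain/val_data/Lotsa16B_multi.yaml")
    if config.is_file():
        for path in missing_storage_paths(config.read_text()):
            print(f"not found: {path}")
```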
The model hyperparameters are defined in cli/conf/pretrain/model/[Model]_[ModelSize].yaml.
The general training config is defined in cli/conf/pretrain/default_[ddp/fsdp]_val.yaml.
# train an encoder
python -m cli.train_val -cp conf/pretrain -cn default_ddp_val_enc \
model=encoder_10M \
data=lotsa16B_weighted \
val_data=lotsa16B_lsf_monash \
trainer.logger.project=demo_scalinglaws
# train a decoder
python -m cli.train_val -cp conf/pretrain -cn default_ddp_val_dec \
model=decoder_10M \
data=lotsa16B_weighted \
val_data=lotsa16B_lsf_monash \
trainer.logger.project=demo_scalinglaws
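When running many configurations, it can help to assemble the same command programmatically. The following is a minimal sketch that reproduces the two commands above as argument lists. The helpers `build_train_command` and `launch` are hypothetical, and `sys.executable` stands in for `python`. Only the config names shown above are used; names for other model or data sizes would need to be checked against cli/conf/pretrain.

```python
import subprocess
import sys


def build_train_command(arch, model, data, val_data, project):
    """Build the argv list for one pretraining run.

    `arch` is "enc" or "dec" and selects default_ddp_val_enc or default_ddp_val_dec.
    """
    if arch not in ("enc", "dec"):
        raise ValueError(f"arch must be 'enc' or 'dec', got {arch!r}")
    return [
        sys.executable, "-m", "cli.train_val",
        "-cp", "conf/pretrain",
        "-cn", f"default_ddp_val_{arch}",
        f"model={model}",
        f"data={data}",
        f"val_data={val_data}",
        f"trainer.logger.project={project}",
    ]


def launch(cmd):
    """Run the command from the repository root and fail loudly on error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_train_command("enc", "encoder_10M", "lotsa16B_weighted",
                              "lotsa16B_lsf_monash", "demo_scalinglaws")
    # Print instead of launching, so the command can be inspected first.
    print(" ".join(cmd))
```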
When training models with different numbers of parameters and different pretraining data sizes, the loss and metrics are recorded via wandb. Rename each experiment in wandb following the format [encoder/decoder]_[ModelSize]_[DataSize], such as encoder_10M_16B.
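The naming format can be generated and checked with a small helper, which avoids typos when renaming many runs. This is a sketch only: the function names are hypothetical, and it assumes the model size and data size contain no underscores.

```python
import re

# [encoder/decoder]_[ModelSize]_[DataSize], e.g. encoder_10M_16B
RUN_NAME = re.compile(r"^(encoder|decoder)_([^_]+)_([^_]+)$")


def make_run_name(arch, model_size, data_size):
    """Build a run name and verify that it follows the expected format."""
    name = f"{arch}_{model_size}_{data_size}"
    if RUN_NAME.match(name) is None:
        raise ValueError(f"invalid run name: {name!r}")
    return name


def parse_run_name(name):
    """Split a run name into (architecture, model size, data size)."""
    match = RUN_NAME.match(name)
    if match is None:
        raise ValueError(f"invalid run name: {name!r}")
    return match.groups()


if __name__ == "__main__":
    print(make_run_name("encoder", "10M", "16B"))
    print(parse_run_name("encoder_10M_16B"))
```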
After collecting a series of experiments, download the wandb logs and use the Jupyter notebooks under analysis to fit and visualize the scaling laws.
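To give a sense of what fitting a scaling law involves, the sketch below fits a plain power law y = a * x^(-alpha) by least squares in log-log space. This is an assumption made for illustration: the functional form and fitting procedure used in the analysis notebooks may differ, and the numbers in the example are synthetic points generated from a known power law, not experimental results.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-alpha) in log-log space.

    Returns (a, alpha). All inputs must be positive.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic data: x could be parameter count, y a validation loss.
    sizes = [1e7, 1e8, 1e9]
    losses = [2.0 * s ** -0.1 for s in sizes]
    a, alpha = fit_power_law(sizes, losses)
    print(f"a={a:.3f}, alpha={alpha:.3f}")
```

On a log-log plot a power law appears as a straight line, which is why a linear fit on the logarithms recovers the exponent.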
The trained models are available at PeacefulData/TSFM-ScalingLaws-Checkpoints. You can try them out with the Jupyter notebooks under demo.
🙋 Please let us know if you find a mistake or have any suggestions!
🌟 If you find the codebase helpful in your research, please consider starring this repository and citing the corresponding paper:
title={Towards Neural Scaling Laws for Time Series Foundation Models},
author={Yao, Qingren and Yang, Chao-Han Huck and Jiang, Renhe and Liang, Yuxuan and Jin, Ming and Pan, Shirui},
- TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis, in arXiv 2024. [paper] [GitHub Repo]
- Foundation Models for Time Series Analysis: A Tutorial and Survey, in KDD 2024. [paper] [Tutorial]
- What Can Large Language Models Tell Us about Time Series Analysis, in ICML 2024. [paper]
- Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects, in TPAMI 2024. [paper] [Website]
- Transformers in Time Series: A Survey, in IJCAI 2023. [paper] [GitHub Repo]
- A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection, in TPAMI 2024. [paper] [Website]
Our implementation builds upon the Uni2ts codebase, which we have extensively modified to suit our specific requirements. We thank its authors for sharing their code and providing related resources, which have been invaluable to this work.