-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Ptnn4both datatypes and alignment tests (#1827)
* Init model for both dataset * Remove some deprecated code * Add model template; * We must align with previous results * We choose another mode as the initial version * Almost success to run GRU * Successfully run training * Passed general_nn test * gru test * Alignment test passed * comment * fix readme & minor errors * general nn updates & benchmarks * Update examples/benchmarks/GeneralPtNN/workflow_config_gru2mlp.yaml --------- Co-authored-by: Young <[email protected]> Co-authored-by: you-n-g <[email protected]>
- Loading branch information
1 parent
2c33332
commit c9ed050
Showing
7 changed files
with
739 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
|
||
|
||
# Introduction | ||
|
||
What is GeneralPtNN | ||
- Fix previous design that fail to support both Time-series and tabular data | ||
- Now you can just replace the Pytorch model structure to run a NN model. | ||
|
||
We provide an example to demonstrate the effectiveness of the current design. | ||
- `workflow_config_gru.yaml` align with previous results [GRU(Kyunghyun Cho, et al.)](../README.md#Alpha158-dataset) | ||
- `workflow_config_gru2mlp.yaml` to demonstrate we can convert config from time-series to tabular data with minimal changes | ||
- You only have to change the net & dataset class to make the conversion. | ||
- `workflow_config_mlp.yaml` achieved similar functionality with [MLP](../README.md#Alpha158-dataset) | ||
|
||
# TODO | ||
|
||
- We will align existing models to current design. | ||
|
||
- The result of `workflow_config_mlp.yaml` is different with the result of [MLP](../README.md#Alpha158-dataset) since GeneralPtNN has a different stopping method compared to previous implementations. Specificly, GeneralPtNN controls training according to epoches, whereas previous methods controlled by max_steps. |
100 changes: 100 additions & 0 deletions
100
examples/benchmarks/GeneralPtNN/workflow_config_gru.yaml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
qlib_init: | ||
provider_uri: "~/.qlib/qlib_data/cn_data" | ||
region: cn | ||
market: &market csi300 | ||
benchmark: &benchmark SH000300 | ||
data_handler_config: &data_handler_config | ||
start_time: 2008-01-01 | ||
end_time: 2020-08-01 | ||
fit_start_time: 2008-01-01 | ||
fit_end_time: 2014-12-31 | ||
instruments: *market | ||
infer_processors: | ||
- class: FilterCol | ||
kwargs: | ||
fields_group: feature | ||
col_list: ["RESI5", "WVMA5", "RSQR5", "KLEN", "RSQR10", "CORR5", "CORD5", "CORR10", | ||
"ROC60", "RESI10", "VSTD5", "RSQR60", "CORR60", "WVMA60", "STD5", | ||
"RSQR20", "CORD60", "CORD10", "CORR20", "KLOW" | ||
] | ||
- class: RobustZScoreNorm | ||
kwargs: | ||
fields_group: feature | ||
clip_outlier: true | ||
- class: Fillna | ||
kwargs: | ||
fields_group: feature | ||
learn_processors: | ||
- class: DropnaLabel | ||
- class: CSRankNorm | ||
kwargs: | ||
fields_group: label | ||
label: ["Ref($close, -2) / Ref($close, -1) - 1"] | ||
|
||
port_analysis_config: &port_analysis_config | ||
strategy: | ||
class: TopkDropoutStrategy | ||
module_path: qlib.contrib.strategy | ||
kwargs: | ||
signal: <PRED> | ||
topk: 50 | ||
n_drop: 5 | ||
backtest: | ||
start_time: 2017-01-01 | ||
end_time: 2020-08-01 | ||
account: 100000000 | ||
benchmark: *benchmark | ||
exchange_kwargs: | ||
limit_threshold: 0.095 | ||
deal_price: close | ||
open_cost: 0.0005 | ||
close_cost: 0.0015 | ||
min_cost: 5 | ||
task: | ||
model: | ||
class: GeneralPTNN | ||
module_path: qlib.contrib.model.pytorch_general_nn | ||
kwargs: | ||
n_epochs: 200 | ||
lr: 2e-4 | ||
early_stop: 10 | ||
batch_size: 800 | ||
metric: loss | ||
loss: mse | ||
n_jobs: 20 | ||
GPU: 0 | ||
pt_model_uri: "qlib.contrib.model.pytorch_gru_ts.GRUModel" | ||
pt_model_kwargs: { | ||
"d_feat": 20, | ||
"hidden_size": 64, | ||
"num_layers": 2, | ||
"dropout": 0., | ||
} | ||
dataset: | ||
class: TSDatasetH | ||
module_path: qlib.data.dataset | ||
kwargs: | ||
handler: | ||
class: Alpha158 | ||
module_path: qlib.contrib.data.handler | ||
kwargs: *data_handler_config | ||
segments: | ||
train: [2008-01-01, 2014-12-31] | ||
valid: [2015-01-01, 2016-12-31] | ||
test: [2017-01-01, 2020-08-01] | ||
step_len: 20 | ||
record: | ||
- class: SignalRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
model: <MODEL> | ||
dataset: <DATASET> | ||
- class: SigAnaRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
ana_long_short: False | ||
ann_scaler: 252 | ||
- class: PortAnaRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
config: *port_analysis_config |
93 changes: 93 additions & 0 deletions
93
examples/benchmarks/GeneralPtNN/workflow_config_gru2mlp.yaml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
qlib_init: | ||
provider_uri: "~/.qlib/qlib_data/cn_data" | ||
region: cn | ||
market: &market csi300 | ||
benchmark: &benchmark SH000300 | ||
data_handler_config: &data_handler_config | ||
start_time: 2008-01-01 | ||
end_time: 2020-08-01 | ||
fit_start_time: 2008-01-01 | ||
fit_end_time: 2014-12-31 | ||
instruments: *market | ||
infer_processors: | ||
- class: FilterCol | ||
kwargs: | ||
fields_group: feature | ||
col_list: ["RESI5", "WVMA5", "RSQR5", "KLEN", "RSQR10", "CORR5", "CORD5", "CORR10", | ||
"ROC60", "RESI10", "VSTD5", "RSQR60", "CORR60", "WVMA60", "STD5", | ||
"RSQR20", "CORD60", "CORD10", "CORR20", "KLOW" | ||
] | ||
- class: RobustZScoreNorm | ||
kwargs: | ||
fields_group: feature | ||
clip_outlier: true | ||
- class: Fillna | ||
kwargs: | ||
fields_group: feature | ||
learn_processors: | ||
- class: DropnaLabel | ||
- class: CSRankNorm | ||
kwargs: | ||
fields_group: label | ||
label: ["Ref($close, -2) / Ref($close, -1) - 1"] | ||
|
||
port_analysis_config: &port_analysis_config | ||
strategy: | ||
class: TopkDropoutStrategy | ||
module_path: qlib.contrib.strategy | ||
kwargs: | ||
signal: <PRED> | ||
topk: 50 | ||
n_drop: 5 | ||
backtest: | ||
start_time: 2017-01-01 | ||
end_time: 2020-08-01 | ||
account: 100000000 | ||
benchmark: *benchmark | ||
exchange_kwargs: | ||
limit_threshold: 0.095 | ||
deal_price: close | ||
open_cost: 0.0005 | ||
close_cost: 0.0015 | ||
min_cost: 5 | ||
task: | ||
model: | ||
class: GeneralPTNN | ||
module_path: qlib.contrib.model.pytorch_general_nn | ||
kwargs: | ||
lr: 1e-3 | ||
n_epochs: 1 | ||
batch_size: 800 | ||
loss: mse | ||
optimizer: adam | ||
pt_model_uri: "qlib.contrib.model.pytorch_nn.Net" | ||
pt_model_kwargs: | ||
input_dim: 20 | ||
layers: [20,] | ||
dataset: | ||
class: DatasetH | ||
module_path: qlib.data.dataset | ||
kwargs: | ||
handler: | ||
class: Alpha158 | ||
module_path: qlib.contrib.data.handler | ||
kwargs: *data_handler_config | ||
segments: | ||
train: [2008-01-01, 2014-12-31] | ||
valid: [2015-01-01, 2016-12-31] | ||
test: [2017-01-01, 2020-08-01] | ||
record: | ||
- class: SignalRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
model: <MODEL> | ||
dataset: <DATASET> | ||
- class: SigAnaRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
ana_long_short: False | ||
ann_scaler: 252 | ||
- class: PortAnaRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
config: *port_analysis_config |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
qlib_init: | ||
provider_uri: "~/.qlib/qlib_data/cn_data" | ||
region: cn | ||
market: &market csi300 | ||
benchmark: &benchmark SH000300 | ||
data_handler_config: &data_handler_config | ||
start_time: 2008-01-01 | ||
end_time: 2020-08-01 | ||
fit_start_time: 2008-01-01 | ||
fit_end_time: 2014-12-31 | ||
instruments: *market | ||
infer_processors: [ | ||
{ | ||
"class" : "DropCol", | ||
"kwargs":{"col_list": ["VWAP0"]} | ||
}, | ||
{ | ||
"class" : "CSZFillna", | ||
"kwargs":{"fields_group": "feature"} | ||
} | ||
] | ||
learn_processors: [ | ||
{ | ||
"class" : "DropCol", | ||
"kwargs":{"col_list": ["VWAP0"]} | ||
}, | ||
{ | ||
"class" : "DropnaProcessor", | ||
"kwargs":{"fields_group": "feature"} | ||
}, | ||
"DropnaLabel", | ||
{ | ||
"class": "CSZScoreNorm", | ||
"kwargs": {"fields_group": "label"} | ||
} | ||
] | ||
process_type: "independent" | ||
|
||
port_analysis_config: &port_analysis_config | ||
strategy: | ||
class: TopkDropoutStrategy | ||
module_path: qlib.contrib.strategy | ||
kwargs: | ||
signal: <PRED> | ||
topk: 50 | ||
n_drop: 5 | ||
backtest: | ||
start_time: 2017-01-01 | ||
end_time: 2020-08-01 | ||
account: 100000000 | ||
benchmark: *benchmark | ||
exchange_kwargs: | ||
limit_threshold: 0.095 | ||
deal_price: close | ||
open_cost: 0.0005 | ||
close_cost: 0.0015 | ||
min_cost: 5 | ||
task: | ||
model: | ||
class: GeneralPTNN | ||
module_path: qlib.contrib.model.pytorch_general_nn | ||
kwargs: | ||
# FIXME: wrong parameters. | ||
lr: 2e-3 | ||
batch_size: 8192 | ||
loss: mse | ||
weight_decay: 0.0002 | ||
optimizer: adam | ||
pt_model_uri: "qlib.contrib.model.pytorch_nn.Net" | ||
pt_model_kwargs: | ||
input_dim: 157 | ||
dataset: | ||
class: DatasetH | ||
module_path: qlib.data.dataset | ||
kwargs: | ||
handler: | ||
class: Alpha158 | ||
module_path: qlib.contrib.data.handler | ||
kwargs: *data_handler_config | ||
segments: | ||
train: [2008-01-01, 2014-12-31] | ||
valid: [2015-01-01, 2016-12-31] | ||
test: [2017-01-01, 2020-08-01] | ||
record: | ||
- class: SignalRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
model: <MODEL> | ||
dataset: <DATASET> | ||
- class: SigAnaRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
ana_long_short: False | ||
ann_scaler: 252 | ||
- class: PortAnaRecord | ||
module_path: qlib.workflow.record_temp | ||
kwargs: | ||
config: *port_analysis_config |
Oops, something went wrong.