forecasts = list(forecast_it) is very slow #63
Comments
Hi, it's probably because Lag-Llama takes more time on average to forward-pass a single batch, as it's a bigger model than the DeepAR model you are using. Maybe you can try benchmarking the time for 1 series, with the same context and prediction length; then you'd know if this is the case. What's the batch size you're using for Lag-Llama?
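The single-series benchmark suggested above can be sketched by pulling just one forecast out of the lazy generator and timing it. This is a minimal, stdlib-only sketch: `fake_forecast_it` is a hypothetical stand-in for the real `forecast_it` returned by GluonTS's `make_evaluation_predictions`, and with a real model you would pass that generator instead.

```python
import time
from itertools import islice

def time_first_forecast(forecast_it):
    """Time how long the first forward pass takes by pulling
    a single forecast from the lazy generator."""
    start = time.perf_counter()
    first = next(islice(forecast_it, 1))
    elapsed = time.perf_counter() - start
    return first, elapsed

# Hypothetical stand-in for the real forecast generator: each item
# simulates one (expensive) forward pass. With a real model, pass
# the forecast_it returned by make_evaluation_predictions instead.
def fake_forecast_it(n_series, cost_s=0.01):
    for i in range(n_series):
        time.sleep(cost_s)  # placeholder for the model forward pass
        yield f"forecast-{i}"

first, per_series = time_first_forecast(fake_forecast_it(100))
print(f"first series took {per_series:.3f}s; "
      f"~{per_series * 100:.1f}s expected for 100 series")
```

Multiplying the single-series time by the number of series gives a rough lower bound on how long `list(forecast_it)` should take, which makes it easy to tell whether the slowness is just model size or something pathological.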
Hi @xuyilin0121, I came across the same problem when I tried to fine-tune Lag-Llama using a training dataset of ~700 observations (each with ~500 timestamps). The batch size I used is 64. The list() conversion took hours. Have you found a way to deal with this issue? Thanks!
Same here: even with num_samples=5 it takes at least 30 seconds to convert, with a batch size of 4 and each sequence of length 1024 with prediction length 512. If I set it to 100 it would take hours, so is there any new fix for this? The model won't be usable if I have to wait hours for one batch to finish. The interesting thing is that the inference is very fast, but the conversion to a list is super long.
@CoCoNuTeK Thanks for describing the problem. Can you explain what you mean by that?
@xuyilin0121 @simona-0 I am not sure as well. I'll be happy to take a look at this next week myself and fix it. I'll keep this thread updated.
So this code is almost instant; however, this part takes way too long. With 5 num_samples it took around 30 seconds, but I am pretty sure it didn't scale linearly but worse, as with 100 it didn't even finish. My setup is an NVIDIA Tesla T4 GPU with CUDA set up.
Yes, the first block of code is supposed to just "create" the generators. The second part is what actually runs the inference. We recently added support for deterministic point-forecasting, where only the mean of the forecast is returned, and the forward pass is much faster since it uses just one sample as the previous prediction. This is supported by enabling
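The create-then-materialize behavior described above can be illustrated with a minimal, stdlib-only sketch. `make_predictions` is a hypothetical stand-in that mimics how GluonTS's `make_evaluation_predictions` returns a lazy generator: building the generator does no work, and the forward passes only run when `list()` iterates it.

```python
import time

def make_predictions(num_series, forward_pass_s=0.02):
    """Mimic make_evaluation_predictions: building the generator
    does no work; each forward pass happens only on iteration."""
    def forecast_gen():
        for i in range(num_series):
            time.sleep(forward_pass_s)  # stands in for the model forward pass
            yield f"forecast-{i}"
    return forecast_gen()

t0 = time.perf_counter()
forecast_it = make_predictions(10)   # "almost instant": nothing runs yet
created = time.perf_counter() - t0

t0 = time.perf_counter()
forecasts = list(forecast_it)        # inference actually happens here
materialized = time.perf_counter() - t0

print(f"create: {created:.4f}s, list(): {materialized:.2f}s")
```

This is why timing the first block says nothing about inference cost: all of the model's work is deferred to the `list()` call.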
Hi there,
I have a very interesting problem here when I want to test the model using my data.
The original dataset has 5835 rows, i.e., 5835 time series, and each includes 39 timesteps. I understand it is backtested, so I set the context length to 32 and the prediction length to 7.
Everything goes well until the last step, forecasts = list(forecast_it). Based on my observation, if I input 100 time series, it takes at least 1 minute for the conversion, so I suppose 5835 time series will need hours. The interesting thing is that I have a DeepAR model from before, which uses the GluonTS package as well, and it only needs 10 minutes at most to convert the same dataset.
I tried to research this but found no helpful information, so I am raising this issue to ask whether there is any difference between the DeepAR result and the Lag-Llama result that makes the conversion to a list so slow.
Thanks for your help!