
InferenceData (.nc file) not generated for HDDMRegression with stimulus coding #18

shabnamhossein opened this issue Nov 18, 2024 · 5 comments

@shabnamhossein
I am using Docker version 4.35.1 (173168) for Mac (Sequoia 15.1). The Kabuki version is 0.6.5RC4 and the HDDM version is 1.0.1RC.
I am trying to run your tutorial "HDDM_Regression_Stimcoding" from the "OfficialTutorials" folder in a Jupyter notebook, with the addition of saving the InferenceData so that I can run posterior predictive checks later. Line 16 of the tutorial is changed to:

save_name = "model_fitted/hddmregressor_example"
model_reg_infdata = m_reg.sample(500, return_infdata=True, save_name=save_name, sample_prior=True, loglike=True, ppc=True)

However, the .nc file cannot be generated due to this error:

Start converting to InferenceData...
Start to calculate pointwise log likelihood...
The time of calculation of loglikelihood took 99.754 seconds
Start generating posterior prediction...
fail to convert posterior predictive check (self.ppc) to xarray: could not broadcast input array from shape (900,1) into shape (900,)
@panwanke
Collaborator

panwanke commented Nov 20, 2024


Thank you for your feedback. After testing, we successfully replicated the issue you reported. Upon investigation, we found that the problem stems from an update in HDDM. Specifically, this issue does not occur when using HDDM version 0.8.0.

Let me first share a solution, followed by an explanation of the issue's origin.

Solution:

You can pull a Docker image with HDDM version 0.8.0 and fit models that use stimcoding as a regressor there. Use the command:
docker pull hcp4715/hddm:0.8.0
This keeps the environment stable, unaffected by the later HDDM changes.

Here are my test results with version 0.8.0, showing that it works:
[screenshot: successful model fit and InferenceData export under HDDM 0.8.0]


Source of the Issue:

The issue arises from the following line in the HDDM repository:
https://github.com/hddm-devs/hddm/blob/6e766ef315629c20cd0be7267555c90c39cc0446/hddm/models/hddm_regression.py#L130.

Here, the wfpt_reg_like function is defined with the sampling_method="cssm" parameter, which is hard-coded and cannot be adjusted when defining a model. This sampling method causes PPC errors, whereas the default method (sampling_method="drift") does not.
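The broadcast error from the log can be reproduced in isolation: the PPC step produces a column vector of shape (900, 1) where the xarray conversion expects a flat (900,) array. A minimal NumPy sketch of the mismatch and the obvious fix (this only illustrates the error message; it is not HDDM's actual conversion code):

```python
import numpy as np

# Mock PPC output arriving as a column vector, as in the reported error
ppc_values = np.zeros((900, 1))
target = np.empty(900)  # the flat (900,) slot the conversion tries to fill

try:
    target[:] = ppc_values  # raises: could not broadcast (900,1) into (900,)
except ValueError as e:
    print(e)

# Dropping the trailing singleton axis resolves the mismatch
target[:] = ppc_values.squeeze(axis=1)
print(target.shape)  # (900,)
```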

We will report this issue to the official HDDM maintainers, although a fix is not guaranteed, as they appear to be focusing on resolving these problems in HSSM.

@shabnamhossein
Author

Thanks for your reply. I changed the HDDM version to 0.8.0, and the .nc file is now generated for the tutorial data. However, I still have issues generating the .nc file for my own data when using HDDMStimCoding. Am I doing something wrong in defining my model or in sampling? This is my model:

model = hddm.HDDMStimCoding(data, include=['v', 'a', 't'], stim_col='stim', split_param='v', p_outlier=0.05)
model_infdata = model.sample(10000, burn=500, chains=1, save_name='model', return_infdata=True, ppc=True)

And this is the error I get:

Start converting to InferenceData...
/opt/conda/lib/python3.8/site-packages/kabuki/hierarchical.py:1157: UserWarning: n_ppc is not given, set to default 500
  warnings.warn("n_ppc is not given, set to default 500")
Start generating posterior prediction...
fail to convert posterior predictive check (self.ppc) to xarray: Supply a grouping so that at most 1 observed node codes for each group.

I am using these versions of the packages:
The current HDDM version is: 0.8.0
The current kabuki version is: 0.6.5RC4
The current PyMC version is: 2.3.8
The current ArviZ version is: 0.15.1

@panwanke
Collaborator


I could not reproduce the error; everything works fine when I use real data. I suspect it has something to do with your data?

[screenshot: successful run of HDDMStimCoding with real data]

@shabnamhossein
Author

Can you tell me what the "grouping" in the error is referring to?
fail to convert posterior predictive check (self.ppc) to xarray: Supply a **grouping** so that at most 1 observed node codes for each group.

@panwanke
Collaborator


We have not traced the exact source of this error. To locate it, you can run the PPC step on its own: after fitting the model, call model.gen_ppc(n_ppc=500) or model.gen_ppc(n_ppc=500, parallel=False), then inspect the generated posterior predictive data via model.ppc. If that looks fine, try model.to_infdata(ppc=True) again and see whether the problem persists.
If model.gen_ppc doesn't work, you can try:

from kabuki.analyze import post_pred_gen

ppc = post_pred_gen(model, samples=500, parallel=False)
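If inspecting model.ppc reveals arrays with a stray trailing dimension (like the (900, 1) shape from the original report), flattening them before retrying the conversion may help. A generic sketch, assuming the PPC result can be treated as a mapping from node names to NumPy arrays (the helper and the mock data below are hypothetical, not kabuki's actual structure):

```python
import numpy as np

def flatten_ppc_arrays(ppc_dict):
    """Drop singleton axes so each array is 1-D, which is the shape
    the xarray conversion expects. `ppc_dict` is assumed to map node
    names to array-like PPC draws (a hypothetical structure)."""
    return {name: np.asarray(values).squeeze() for name, values in ppc_dict.items()}

# Mock PPC result shaped like the failing case from the first report
mock_ppc = {"wfpt": np.zeros((900, 1))}
fixed = flatten_ppc_arrays(mock_ppc)
print(fixed["wfpt"].shape)  # (900,)
```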
