Dear Collin,
I want to run 2020plus on my own pan-cancer data, which contains no silent mutations (more than 130,000 mutations in total), to predict oncogenes and TSGs for pan-cancer and for specific cancer types. Should I train a new model on my data with --config drop_silent="yes" and then run predict, or just run pretrained_predict using your pre-trained 20/20+ classifiers with the same config above?
Thanks.
Ideally, one would train an entirely new model on data without silent mutations and then apply it to additional data that also excludes them. In general, scores will skew higher when your data contains no silent mutations but is scored using a model that was trained on data that did contain them. However, as you noticed from the option, a reasonable workaround is to adjust what counts as a significant score by accounting for the fact that silent mutations are not included in the Monte Carlo simulations. This should help reduce potential biases, but you should still check the p-values and see whether there is an artificially large number of significant results for your data. If that is the case, then you may need to train a new model.
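As a rough illustration of the p-value check suggested above, here is a minimal sketch in Python. It is not part of 20/20+: the function names, the 0.05 and 0.1 thresholds, and the idea of passing in a plain list of per-gene p-values are all assumptions made for the example. How you load the p-values from your own results file is left out, since the output column names are not given in this thread. The idea is that under a well-calibrated null roughly 5% of genes should have p < 0.05, so a much larger fraction, or an implausibly long list of genes passing the q-value cutoff, would hint at inflated scores.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    n = len(pvals)
    if n == 0:
        raise ValueError("need at least one p-value")
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvals[i])
    qvals = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        qvals[i] = running_min
    return qvals


def summarize_pvalues(pvals, alpha=0.05, q_cutoff=0.1):
    """Summarize how many genes look significant (thresholds are assumptions)."""
    qvals = benjamini_hochberg(pvals)
    n = len(pvals)
    return {
        "n_genes": n,
        # Expected to be near alpha if almost all genes follow the null.
        "frac_p_below_alpha": sum(p < alpha for p in pvals) / n,
        "n_significant": sum(q <= q_cutoff for q in qvals),
    }


if __name__ == "__main__":
    # Toy p-values for illustration only; replace with your per-gene p-values.
    example = [0.01, 0.04, 0.03, 0.5]
    print(summarize_pvalues(example))
```

Real driver genes will push the fraction somewhat above the nominal level, so treat this only as a sanity check: a result where a large share of all genes is significant is the pattern that would suggest training a new model.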