simulate_experiment discrepancy in basemean values for simulated data #80

Open
vavouri-lab opened this issue Jun 8, 2022 · 0 comments

Hi there,

I have a question, and potentially a bug report, about the simulate_experiment function.

I am using polyester to simulate RNA-seq data from a set of features (GTF file), a genome (FASTA file), a vector of expression values (in the same order as the features in the GTF file) and a matrix of fold changes (in the same order as the GTF file and the expression vector), for an experiment with 2 conditions and 2 replicates per condition:

    simulate_experiment(
        seqpath = "../data/mygenomedir/",
        reads_per_transcript = mymeanexpressionvalues,
        fold_changes = myfoldchangematrix,
        feature = "exon",
        gtf = "../data/mytest.gtf",
        transcriptid = myfeatureIDs,
        num_reps = c(2,2)
    )

Polyester seems to run fine, but the features whose expression changes between conditions are not the ones I set to change. This looks like a bug to me, but I am also slightly unsure whether I am using polyester correctly. Specifically, other than providing the expression values and fold_changes in the same order as the features in the GTF file, I don't see how to specify which features should be differentially expressed in my simulated experiment.

Looking through the R code, I see that simulate_experiment() calls seq_gtf(), which extracts the sequences of the features in the GTF file. On line 114 of seq_gtf, R's split function reorders the list of features alphanumerically. I think this reordering causes the expression and fold change values to be assigned to features other than the ones I intended. Is this a bug, or have I misunderstood how to assign expression values to specific features?
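
A minimal base R sketch of the reordering described above. The feature IDs, sequences and expression values are made up for illustration, and this is not polyester's actual code; it only shows how split() orders its output and how a positional assignment would then go wrong.

```r
# Hypothetical feature IDs, in the order they appear in a GTF file
feature_ids <- c("geneC", "geneA", "geneB", "geneC", "geneA")
# Hypothetical exon sequences, one per GTF line
exon_seqs <- c("AAA", "CCC", "GGG", "TTT", "ACG")

# split() coerces the grouping vector to a factor, and factor levels
# are sorted, so the groups no longer follow the file order
by_feature <- split(exon_seqs, feature_ids)
print(names(by_feature))          # geneA, geneB, geneC

# Hypothetical expression values supplied in GTF (file) order
expr_in_gtf_order <- c(geneC = 100, geneA = 10, geneB = 1)

# Assigning them by position to the sorted features mislabels them:
# geneA receives the value intended for geneC, and so on
misassigned <- setNames(unname(expr_in_gtf_order), names(by_feature))
print(misassigned)

# Fixing the factor levels explicitly preserves the file order
ordered_levels <- factor(feature_ids, levels = unique(feature_ids))
by_feature_ordered <- split(exon_seqs, ordered_levels)
print(names(by_feature_ordered))  # geneC, geneA, geneB
```

If this is what happens inside seq_gtf, supplying the expression values and fold changes in the sorted order of the feature IDs might work around it, but I have not confirmed this.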

Thanks
Tanya
