Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trouble changing size parameter #37

Open
rramaker opened this issue Sep 4, 2016 · 5 comments
Open

Trouble changing size parameter #37

rramaker opened this issue Sep 4, 2016 · 5 comments

Comments

@rramaker
Copy link

rramaker commented Sep 4, 2016

I'm attempting to run the "simulate_experiment" command function by following the associated help file except changing the size parameter as follows:

fastapath = system.file("extdata", "chr22.fa", package="polyester")
numtx = count_transcripts(fastapath)
set.seed(4)
fold_changes = sample(c(0.5, 1, 2), size=numtx, prob=c(0.05, 0.9, 0.05), replace=TRUE)
simulate_experiment(fastapath, reads_per_transcript=10,fold_changes=fold_changes, outdir='simulated_reads', seed=12, size=10)

This results in the following error:

Error in matrix(size, nrow = nrow(basemeans), ncol = ncol(basemeans)) : non-numeric matrix extent

This error happens when providing a vector of sizes for each transcript like so:

simulate_experiment(fastapath, reads_per_transcript=10,fold_changes=fold_changes, outdir='simulated_reads', seed=12, size=(10*fold_changes/3))

Am I formatting the size parameter correctly?

@JMF47
Copy link
Collaborator

JMF47 commented Sep 4, 2016

Hey there, the documentation for that function describes the following for size

the negative binomial size parameter (see NegBinomial) for the number of reads drawn per transcript. If left blank, defaults to reads_per_transcript / 3. Negative binomial variance is mean + mean^2 / size. Can either be left at default, a vector of the same length as number of transcripts in fasta, if the two groups should have the same size parameters, or a list with 2 elements, each of which is a vector with length equal to the number of transcripts in fasta, which represent the size parameters for each transcript in groups 1 and 2, respectively.

Does that clear up what you might need?

@rramaker
Copy link
Author

rramaker commented Sep 5, 2016

Thanks for your quick reply. I really appreciate your help.

It's possible I'm mis-reading something in the documentation, but my interpretation is you can provide one of two inputs for the size argument:

  1. A numeric vector of size values with the same length as the number of transcripts in the fasta file.
    I've tried this by modifying the example code provided in the documentation as follows:

fastapath = system.file("extdata", "chr22.fa", package="polyester")
numtx = count_transcripts(fastapath)
fold_changes = sample(c(0.5, 1, 2), size=numtx, prob=c(0.05, 0.9, 0.05), replace=TRUE)
simulate_experiment(fastapath, reads_per_transcript=10,fold_changes=fold_changes, outdir='simulated_reads', seed=12, size=rep(3,numtx))

  1. A list containing 2 numeric vectors of size values, each with the same length as the number of transcripts in the fasta file.
    I've also tried this by modifying the example code provided in the documentation as follows:

fastapath = system.file("extdata", "chr22.fa", package="polyester")
numtx = count_transcripts(fastapath)
fold_changes = sample(c(0.5, 1, 2), size=numtx, prob=c(0.05, 0.9, 0.05), replace=TRUE)
simulate_experiment(fastapath, reads_per_transcript=10,fold_changes=fold_changes, outdir='simulated_reads', seed=12, size=list(rep(3,numtx),rep(6,numtx)))

Both routes result in the same error:

Error in simulate_experiment(fastapath, reads_per_transcript = 10, fold_changes = fold_changes, :
size must be a number, numeric vector, or matrix.

It seems providing a numeric vector with length equal to the number of transcripts in the fasta is not working. Please correct me if I'm incorrectly interpreting the documentation. It may be helpful if you could provide some example code where you've modified the size parameter.

Thanks again!

@alyssafrazee
Copy link
Owner

It does look like you're adjusting the parameter correctly according to the documentation. I'll take a look at this when I get a chance (likely sometime in the next week, unless @JMF47 can get to it sooner).

@alyssafrazee
Copy link
Owner

alyssafrazee commented Sep 20, 2016

Working on fixing this now! One update I have is that, per the documentation, fold_changes needs to be a matrix specifying fold changes between groups. It needs one column per group. The default is two groups (and the num_reps arg is left at the default in your example, so you have 2 groups), but you've provided a fold_changes argument that's a vector, i.e., it maybe has one column but R doesn't think it has any columns at all, so that's a dimension mismatch. The following code does work:

fastapath = system.file("extdata", "chr22.fa", package="polyester")
numtx = count_transcripts(fastapath)
set.seed(4)
fold_change_values = sample(c(0.5, 1, 2), size=2*numtx, prob=c(0.05, 0.9, 0.05), replace=TRUE)
fold_changes = matrix(fold_change_values, nrow=numtx)

simulate_experiment(fastapath, reads_per_transcript=10,fold_changes=fold_changes, outdir='simulated_reads', seed=12, size=10)

(compare to your code in the original comment; note the class of fold_changes). I took that code right from the example page of simulate_experiment.

That said, there should be better error-handling there (the message should have told you that your fold change argument's dimensions didn't match your group dimensions). Also, you should be able to have just one group and change the size parameter. I'm adding those two features now (the latter is a bug).

edit: I realize you are actually allowed to provide a vector of fold changes with two groups -- I forgot I hadn't removed that functionality! Looking into it.

@alyssafrazee
Copy link
Owner

OK, I've fixed this (rather uglily) in #39. Will test more thoroughly and deploy to BioC shortly. Sorry for the confusion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants