Change how `get_gene_expression` handle duplications #59

vladimirsouza · 2024-04-24T20:26:20Z

In this PR, I added the parameter default_priority for default row prioritization. The default I'm using is:

list(
  protocol = c("Strand_Specific_Transcriptome_2",
               "Strand_Specific_Transcriptome_3"),
  ffpe_or_frozen = "frozen"
)

Please let me know whether there is a more appropriate default prioritization.

This commit also fixes an error in the creation of the `multi_exp` column (`sample_id` replaced by `sample_seqType` in the `split` call) and changes how duplications are handled.
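To make the proposed default easier to discuss, here is a minimal base R sketch of how a priority list of this shape could be used to keep one row per sample. It is not the code in this PR: the toy metadata, the `Other_Protocol` value, the helper `pick_preferred()`, and the choice to treat the order of the list elements as their precedence are all assumptions made for illustration.

```r
# Hypothetical metadata: sample "A" has two RNA-seq rows, sample "B" has one.
meta <- data.frame(
  sample_id      = c("A", "A", "B"),
  mrna_sample_id = c("A_1", "A_2", "B_1"),
  protocol       = c("Other_Protocol",              # made-up value, not in the priority list
                     "Strand_Specific_Transcriptome_3",
                     "Other_Protocol"),
  ffpe_or_frozen = c("frozen", "frozen", "ffpe"),
  stringsAsFactors = FALSE
)

# The default proposed in the PR description.
default_priority <- list(
  protocol = c("Strand_Specific_Transcriptome_2",
               "Strand_Specific_Transcriptome_3"),
  ffpe_or_frozen = "frozen"
)

# Hypothetical helper: keep the most preferred row for each sample_id.
pick_preferred <- function(df, priority) {
  # For each prioritized column, rank rows by the position of their value in
  # the preferred vector; values that are not listed rank last.
  ranks <- lapply(names(priority), function(col) {
    r <- match(df[[col]], priority[[col]])
    r[is.na(r)] <- length(priority[[col]]) + 1L
    r
  })
  # Sort by the first column's rank, breaking ties with later columns
  # (assumption: list order defines precedence).
  ord <- do.call(order, unname(ranks))
  sorted <- df[ord, , drop = FALSE]
  # After sorting, the first row seen for each sample_id is the preferred one.
  sorted[!duplicated(sorted$sample_id), , drop = FALSE]
}

picked <- pick_preferred(meta, default_priority)
```

In this toy case, sample "A" keeps the `Strand_Specific_Transcriptome_3` row, and sample "B" keeps its only row even though it matches none of the preferred values.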

rdmorin · 2024-05-02T00:11:12Z

R/get_gene_expression.R

  stopifnot("You did not specify a valid engine. Please use one of \"read_tsv\", \"grep\", \"vroom\", or \"fread\"." = 
              engine %in% c("read_tsv", "grep", "vroom", "fread"))
-
+  if(default_priority & !missing(prioritize_rows_by)){


These three lines need to be removed. I think these were left over from the earlier implementation that was recently scrapped.

rdmorin · 2024-05-02T01:33:22Z

R/get_gene_expression.R

@@ -231,52 +221,39 @@ get_gene_expression = function(these_samples_metadata,

      # add column `multi_exp` to inform whether there are more than one 
      # `mrna_sample_id` associated to a `sample_id`


It seems like the whole expression data set is loaded in before anything is done with the metadata. This seems wasteful but perhaps it's a limitation imposed by having some of the required information exist only in the expression table?
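For reference, the `multi_exp` flag described in the hunk comment depends only on the `sample_id` to `mrna_sample_id` mapping. The base R sketch below shows one way to derive such a flag from a table holding those two columns. The toy data is hypothetical, and the sketch does not reproduce the `split` call in the function or assume where that mapping actually lives.

```r
# Hypothetical mapping: sample "S1" has two mRNA libraries, "S2" has one.
meta <- data.frame(
  sample_id      = c("S1", "S1", "S2"),
  mrna_sample_id = c("S1_a", "S1_b", "S2_a"),
  stringsAsFactors = FALSE
)

# Reduce to distinct (sample_id, mrna_sample_id) pairs so that repeated rows
# for the same library are not counted twice.
pairs <- unique(meta[, c("sample_id", "mrna_sample_id")])

# A sample_id that still appears more than once has several mrna_sample_ids.
multi_ids <- pairs$sample_id[duplicated(pairs$sample_id)]

# Flag every row that belongs to such a sample.
meta$multi_exp <- meta$sample_id %in% multi_ids
```

If the mapping is available in the metadata, a flag like this could in principle be computed before the expression table is read. Whether that holds here is exactly the open question raised above.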

vladimirsouza added 2 commits April 24, 2024 12:02

Add default_priority param to get_gene_expression

6f385e3

Add warning message in get_gene_expression

c9714cb

vladimirsouza requested review from lkhilton, rdmorin and Kdreval April 24, 2024 20:26

vladimirsouza added 3 commits April 29, 2024 23:37

Add collapse_duplicates to get_gene_expression

6131001

This commit also fixes an error in the creation of multi_exp column (sample_id replaced by sample_seqType in the split call) and changes how duplications are handled.

Fix calc_mutation_frequency_bin_region examples

bf51d69

Fix error in calc_mutation_frequency_bin_region

09b5fb0

rdmorin reviewed May 2, 2024
