Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Account for altExps when merging SCEs #150

Closed
sjspielman opened this issue Nov 23, 2022 · 4 comments · Fixed by #251
Closed

Account for altExps when merging SCEs #150

sjspielman opened this issue Nov 23, 2022 · 4 comments · Fixed by #251
Assignees

Comments

@sjspielman
Copy link
Member

sjspielman commented Nov 23, 2022

From #148 (comment) (and other comments in this PR):

While implementing updates for #146, there were some challenges encountered with altExps when merging SCEs. Therefore, we'll need to circle back and include the ability for this function to handle altExps, in particular any CITEseq data which we expect in some ScPCA libraries.

@allyhawkins
Copy link
Member

We would like to go ahead and make these changes to account for merging objects with altExp objects. When we are working on this we should be sure to include the following:

  • A flag for whether or not we want to include the altExp in the returned merged object (keep/ don't keep)
  • For any features that are not found in all libraries, set the value to NA for libraries in which that feature is missing. We do not want to use 0.

@sjspielman
Copy link
Member Author

We have discussed an updated plan for merging altExps, which will involve some associated changes with the main experiment merging:

  • If include_altexps is TRUE, add a check for altExps: All altExps of the same name must have the same features. If they do not, then error out that merging has failed.
    • Later, we will circle back to adding functionality to handle feature mis-match across altExps
  • Let the current do.call(cbind, sce_list) code merge everything all at once, including altExps
    • This means that before merging, we need to make sure all altExps are compatible. Given that we are, for now, enforcing the same features, that should "just work."
  • In main experiment side, rather than subsetting to the intersection of shared features, we should expand the SCE to include the union of features. This will involve some code changes to prepare_sce_for merge - we'll want to create a new SCE entirely with all features as expected.
    • That said, we should probably still keep the check that the vector of intersecting features is not length 0. If there are no features are in common, that's a 🚩.

@jashapiro, please weigh in with any details I missed!

One detail I know I'm missing is - for the altExps check, do we want to enforce that all altExps are present at this stage as well, or fill in with NA-matrices any that are missing? If we go with enforcing for now, we'll circle back and add dummies when we return to handle altexp feature mismatches.

@jashapiro
Copy link
Member

  • If include_altexps is TRUE, add a check for altExps: All altExps of the same name must have the same features. If they do not, then error out that merging has failed.

    • Later, we will circle back to adding functionality to handle feature mis-match across altExps
  • Let the current do.call(cbind, sce_list) code merge everything all at once, including altExps

    • This means that before merging, we need to make sure all altExps are compatible. Given that we are, for now, enforcing the same features, that should "just work."

Making compatible altExps includes adding an NA-filled altExp to any SCE that does not have the given altExp. This should have rows for the features in the non-missing SCEs and columns for their samples, but may not need to have any other rowData or colData (it may be that we need to add NA values to these tables as well)

Note that we will also still need to run a prepare_sce_for_merge() step on the altExps or the equivalent to rename columns that we will want to add library-specific prefixes to.

  • In main experiment side, rather than subsetting to the intersection of shared features, we should expand the SCE to include the union of features. This will involve some code changes to prepare_sce_for merge - we'll want to create a new SCE entirely with all features as expected.

    • That said, we should probably still keep the check that the vector of intersecting features is not length 0. If there are no features are in common, that's a 🚩.

I do not think we need to modify this as part of this PR. We should be able to leave subsetting to shared features as it stands.

@sjspielman
Copy link
Member Author

For the altExps check, do we want to enforce that all altExps are present at this stage as well, or fill in with NA-matrices any that are missing? If we go with enforcing for now, we'll circle back and add dummies when we return to handle altexp feature mismatches.

So, no we will not enforce presence. For any given altexp name, all the present altexps must have the same features. If they do not have the same features, error out. But if an altexp is not present for the given name, make a dummy version to pop in:

Making compatible altExps includes adding an NA-filled altExp to any SCE that does not have the given altExp. This should have rows for the features in the non-missing SCEs and columns for their samples, but may not need to have any other rowData or colData (it may be that we need to add NA values to these tables as well)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants