fix: isa-tab add-ped now only modifies values on process nodes of ped-supplied samples #207 #233

Merged
11 changes: 11 additions & 0 deletions cubi_tk/isa_tab/add_ped.py
@@ -178,6 +178,13 @@ def on_visit_material(self, material, node_path, study=None, assay=None):

    def on_visit_process(self, process, node_path, study=None, assay=None):
        super().on_visit_process(process, node_path, study, assay)
        donor_name = (
            node_path[0].name
            if _is_source(node_path[0])
            else _sample_to_donor_name(node_path[0].name)
        )
        if donor_name not in self.donor_map:
            return None
        proc_config_pairs = {
            "library construction ": {
                "library type": "library_type",
@@ -479,6 +486,10 @@ def _append_assay_line_protocol(
    return counter, curr


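def _sample_to_donor_name(sample_name):
    # Remove the "-N1" suffix only; rstrip("-N1") would strip a character set.
    return sample_name[:-3] if sample_name.endswith("-N1") else sample_name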


def _donor_to_sample_name(donor_name):
    return "%s-N1" % donor_name
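
Note on the sample/donor name mapping: `str.rstrip` takes a set of characters rather than a suffix, so a conversion from sample name to donor name is worth checking on donor names that themselves end in `-`, `N`, or `1`. The test data in this PR (`index`, `father`, `mother`) would not expose the difference. A minimal sketch follows; the function names and the `suffix` parameter are illustrative and not part of cubi-tk.

```python
def sample_to_donor_name(sample_name, suffix="-N1"):
    """Drop one trailing ``suffix`` from ``sample_name``, if present (illustrative helper)."""
    if sample_name.endswith(suffix):
        return sample_name[: -len(suffix)]
    return sample_name


def donor_to_sample_name(donor_name, suffix="-N1"):
    """Inverse of sample_to_donor_name (illustrative helper)."""
    return donor_name + suffix


# rstrip("-N1") removes every trailing character that is "-", "N" or "1",
# so it keeps stripping past the suffix when the donor name ends in such a character.
naive = "donor1-N1".rstrip("-N1")
safe = sample_to_donor_name("donor1-N1")
round_trip = sample_to_donor_name(donor_to_sample_name("P1"))
```

With suffix-based removal, the two conversions are inverses of each other for any donor name.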

@@ -0,0 +1,4 @@
Sample Name Protocol REF Parameter Value[Concentration measurement] Performer Date Extract Name Characteristics[Concentration] Unit Term Source REF Term Accession Number Protocol REF Parameter Value[Provider name] Parameter Value[Provider contact] Parameter Value[Provider project ID] Parameter Value[Provider sample ID] Parameter Value[Provider QC status] Parameter Value[Requestor contact] Parameter Value[Requestor project] Parameter Value[Requestor sample ID] Parameter Value[Concentration measurement] Parameter Value[Library source] Parameter Value[Library strategy] Parameter Value[Library selection] Parameter Value[Library layout] Parameter Value[Library kit] Comment[Library kit catalogue ID] Parameter Value[Target insert size] Parameter Value[Wet-lab insert size] Parameter Value[Barcode kit] Parameter Value[Barcode kit catalogue ID] Parameter Value[Barcode name] Parameter Value[Barcode sequence] Performer Date Library Name Characteristics[Folder name] Characteristics[Concentration] Unit Term Source REF Term Accession Number Protocol REF Parameter Value[Platform] Parameter Value[Instrument model] Parameter Value[Base quality encoding] Parameter Value[Center name] Parameter Value[Center contact] Performer Date Raw Data File
index-N1 Nucleic acid extraction WES index-N1-DNA1 Library construction WES GENOMIC WXS Hybrid Selection PAIRED Agilent SureSelect Human All Exon V6r2 S00000XYZ index-N1-DNA1-WES1 index Nucleic acid sequencing WES ILLUMINA Illumina NextSeq 500 Phred+33
father-N1 Nucleic acid extraction WES father-N1-DNA1 Library construction WES GENOMIC WXS Hybrid Selection PAIRED father-N1-DNA1-WES1 father Nucleic acid sequencing WES Phred+33
mother-N1 Nucleic acid extraction WES mother-N1-DNA1 Library construction WES GENOMIC WXS Hybrid Selection PAIRED mother-N1-DNA1-WES1 mother Nucleic acid sequencing WES Phred+33
95 changes: 95 additions & 0 deletions tests/data/isa_tab/expected_output2/i_Investigation.txt
@@ -0,0 +1,95 @@
ONTOLOGY SOURCE REFERENCE
Term Source Name UO OBI NCBITAXON OMIM HP ORPHA
Term Source File http://data.bioontology.org/ontologies/UO http://data.bioontology.org/ontologies/OBI http://data.bioontology.org/ontologies/NCBITAXON http://data.bioontology.org/ontologies/OMIM http://data.bioontology.org/ontologies/HP http://data.bioontology.org/ontologies/ORDO
Term Source Version 48 35 10 12 570 2.8
Term Source Description Units of Measurement Ontology Ontology for Biomedical Investigations National Center for Biotechnology Information (NCBI) Organismal Classification Online Mendelian Inheritance in Man Human Phenotype Ontology Orphanet Rare Disease Ontology
INVESTIGATION
Investigation Identifier
Investigation Title Test ISA-Tab add-ped
Investigation Description
Investigation Submission Date
Investigation Public Release Date
Comment[Created With Configuration] /path/to/isa-configurations/bih_studies/bih_germline
Comment[Last Opened With Configuration] bih_germline
INVESTIGATION PUBLICATIONS
Investigation PubMed ID
Investigation Publication DOI
Investigation Publication Author List
Investigation Publication Title
Investigation Publication Status
Investigation Publication Status Term Accession Number
Investigation Publication Status Term Source REF
INVESTIGATION CONTACTS
Investigation Person Last Name
Investigation Person First Name
Investigation Person Mid Initials
Investigation Person Email
Investigation Person Phone
Investigation Person Fax
Investigation Person Address
Investigation Person Affiliation
Investigation Person Roles
Investigation Person Roles Term Accession Number
Investigation Person Roles Term Source REF
STUDY
Study Identifier test_isa_tpl_add_ped
Study Title Test ISA-Tab add-ped
Study Description
Comment[Study Grant Number]
Comment[Study Funding Agency]
Study Submission Date
Study Public Release Date
Study File Name s_test_isa_tpl_add_ped.txt
STUDY DESIGN DESCRIPTORS
Study Design Type
Study Design Type Term Accession Number
Study Design Type Term Source REF
STUDY PUBLICATIONS
Study PubMed ID
Study Publication DOI
Study Publication Author List
Study Publication Title
Study Publication Status
Study Publication Status Term Accession Number
Study Publication Status Term Source REF
STUDY FACTORS
Study Factor Name
Study Factor Type
Study Factor Type Term Accession Number
Study Factor Type Term Source REF
STUDY ASSAYS
Study Assay File Name a_test_isa_tpl_add_ped_exome_sequencing_nucleotide_sequencing.txt
Study Assay Measurement Type exome sequencing
Study Assay Measurement Type Term Accession Number
Study Assay Measurement Type Term Source REF
Study Assay Technology Type nucleotide sequencing
Study Assay Technology Type Term Accession Number http://purl.obolibrary.org/obo/OBI_0000626
Study Assay Technology Type Term Source REF OBI
Study Assay Technology Platform Illumina
STUDY PROTOCOLS
Study Protocol Name Sample collection Nucleic acid extraction WES Library construction WES Nucleic acid sequencing WES
Study Protocol Type Sample collection Nucleic acid extraction WES Library construction WES Nucleic acid sequencing WES
Study Protocol Type Term Accession Number
Study Protocol Type Term Source REF
Study Protocol Description
Study Protocol URI
Study Protocol Version
Study Protocol Parameters Name Concentration measurement Provider project ID;Library source;Library selection;Library layout;Barcode sequence;Wet-lab insert size;Requestor contact;Library kit;Barcode name;Provider contact;Concentration measurement;Provider sample ID;Target insert size;Provider name;Requestor project;Requestor sample ID;Barcode kit;Barcode kit catalogue ID;Provider QC status;Library strategy Base quality encoding;Platform;Center contact;Instrument model;Center name
Study Protocol Parameters Name Term Accession Number ;;;;;;;;;;;;;;;;;;; ;;;;
Study Protocol Parameters Name Term Source REF ;;;;;;;;;;;;;;;;;;; ;;;;
Study Protocol Components Name
Study Protocol Components Type
Study Protocol Components Type Term Accession Number
Study Protocol Components Type Term Source REF
STUDY CONTACTS
Study Person Last Name
Study Person First Name
Study Person Mid Initials
Study Person Email
Study Person Phone
Study Person Fax
Study Person Address
Study Person Affiliation
Study Person Roles
Study Person Roles Term Accession Number
Study Person Roles Term Source REF
@@ -0,0 +1,4 @@
Source Name Characteristics[UUID] Characteristics[External links] Characteristics[Batch] Characteristics[Family] Characteristics[Organism] Term Source REF Term Accession Number Characteristics[Mother] Characteristics[Father] Comment[Family notes] Characteristics[Sex] Characteristics[Disease status] Characteristics[OMIM disease] Term Source REF Term Accession Number Characteristics[Orphanet disease] Term Source REF Term Accession Number Characteristics[HPO terms] Term Source REF Term Accession Number Comment[Disease notes] Protocol REF Performer Sample Name Characteristics[External links] Characteristics[Cell origin] Term Source REF Term Accession Number Characteristics[Cell type] Term Source REF Term Accession Number
index x-charite-medgen-blood-book-id:index 3 FAM_index Homo sapiens NCBITAXON http://purl.bioontology.org/ontology/NCBITAXON/9606 mother father female affected Sample collection index-N1
father x-charite-medgen-blood-book-id:father . FAM_index Homo sapiens NCBITAXON http://purl.bioontology.org/ontology/NCBITAXON/9606 0 0 male unaffected Sample collection father-N1
mother x-charite-medgen-blood-book-id:mother . FAM_index Homo sapiens NCBITAXON http://purl.bioontology.org/ontology/NCBITAXON/9606 0 0 female unaffected Sample collection mother-N1
1 change: 1 addition & 0 deletions tests/data/isa_tab/in_sheet_subset/input.ped
@@ -0,0 +1 @@
FAM index father mother 2 2
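
This single PED line lists only the `index` individual as a record; `father` and `mother` appear as parent references, not as records of their own. A minimal sketch of how such a line could be read into a donor map, assuming the standard six-column PED layout (family, individual, father, mother, sex, phenotype). `PedRecord`, `parse_ped_line`, and `build_donor_map` are hypothetical names, not cubi-tk's API.

```python
from collections import namedtuple

# Hypothetical record type for one PED line.
PedRecord = namedtuple("PedRecord", "family name father mother sex disease")

# Standard PED codes: sex 1 = male, 2 = female; phenotype 1 = unaffected, 2 = affected.
SEX = {"1": "male", "2": "female"}
DISEASE = {"1": "unaffected", "2": "affected"}


def parse_ped_line(line):
    """Split one whitespace-separated PED line into a PedRecord."""
    family, name, father, mother, sex, disease = line.split()
    return PedRecord(
        family, name, father, mother,
        SEX.get(sex, "unknown"), DISEASE.get(disease, "unknown"),
    )


def build_donor_map(lines):
    """Map individual name to record, skipping blank and comment lines."""
    records = (
        parse_ped_line(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    )
    return {record.name: record for record in records}


donor_map = build_donor_map(["FAM index father mother 2 2"])
```

Under this reading, only `index` is a key of the donor map, which is why the `father` and `mother` rows in the expected output keep the values they already had.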
25 changes: 25 additions & 0 deletions tests/test_isa_tab_add_ped.py
@@ -100,3 +100,28 @@ def test_add_ped_just_update(tmpdir):
        str(scratch_dir),
        str(pathlib.Path(__file__).parent / "data" / "isa_tab" / "expected_output"),
    )


def test_add_ped_sheet_subset_update(tmpdir):
    """Test updating study and assay, but only a subset of the sheet."""
    scratch_dir = tmpdir / "scratch"
    path_ped = pathlib.Path(__file__).parent / "data" / "isa_tab" / "in_sheet_subset" / "input.ped"
    shutil.copytree(
        str(pathlib.Path(__file__).parent / "data" / "isa_tab" / "in_just_update"), str(scratch_dir)
    )

    # Update metadata
    args = BASE_ARGS[:]
    args[12] = "S00000XYZ"

    argv = args + [str(scratch_dir / "i_Investigation.txt"), str(path_ped)]

    # Actually exercise code and perform test.
    res = main(argv)

    assert not res

    compare_input_output(
        str(scratch_dir),
        str(pathlib.Path(__file__).parent / "data" / "isa_tab" / "expected_output2"),
    )
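
The subset behaviour this test exercises can be sketched in isolation: a process node is updated only when the first node of its path resolves to a donor listed in the PED file. The sketch below is an assumption-laden simplification, not the visitor in `add_ped.py`. Paths are reduced to hypothetical `(first_node_name, is_source, process_name)` tuples, and `resolve_donor_name` and `select_processes` are illustrative names.

```python
def resolve_donor_name(first_node_name, is_source, suffix="-N1"):
    """Map the first node of a path to a donor name (hypothetical helper)."""
    if is_source:
        # Study paths start at the source, whose name is the donor name.
        return first_node_name
    if first_node_name.endswith(suffix):
        # Assay paths start at a sample such as "index-N1".
        return first_node_name[: -len(suffix)]
    return first_node_name


def select_processes(paths, donor_map):
    """Return the process names whose path starts at a donor in donor_map."""
    selected = []
    for first_node_name, is_source, process_name in paths:
        donor = resolve_donor_name(first_node_name, is_source)
        if donor not in donor_map:
            continue  # donor not in the PED file: leave this process untouched
        selected.append(process_name)
    return selected


donor_map = {"index": {"sex": "female", "disease": "affected"}}
paths = [
    ("index", True, "Sample collection (index)"),
    ("index-N1", False, "Library construction WES (index)"),
    ("father-N1", False, "Library construction WES (father)"),
    ("mother-N1", False, "Library construction WES (mother)"),
]
selected = select_processes(paths, donor_map)
```

Only the two `index` processes are selected, which mirrors the expected assay sheet: the library kit and instrument values appear on the `index-N1` row alone.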