Add API documentation
This PR added the documentation for APIs.

To preview the documentation website locally, run `mkdocs serve` in the root folder of this repo.

The online website will be set up once this PR is approved.
CunliangGeng authored Mar 7, 2024
1 parent 1948b8d commit 325b495
Showing 25 changed files with 91 additions and 13 deletions.
1 change: 1 addition & 0 deletions docs/api/antismash.md
@@ -0,0 +1 @@
::: nplinker.genomics.antismash
2 changes: 2 additions & 0 deletions docs/api/arranger.md
@@ -0,0 +1,2 @@

::: nplinker.arranger
2 changes: 2 additions & 0 deletions docs/api/bigscape.md
@@ -0,0 +1,2 @@
::: nplinker.genomics.bigscape
::: nplinker.genomics.bigscape.run_bigscape
1 change: 1 addition & 0 deletions docs/api/genomics.md
@@ -0,0 +1 @@
::: nplinker.genomics
1 change: 1 addition & 0 deletions docs/api/genomics_abc.md
@@ -0,0 +1 @@
::: nplinker.genomics.abc
1 change: 1 addition & 0 deletions docs/api/genomics_utils.md
@@ -0,0 +1 @@
::: nplinker.genomics.utils
13 changes: 13 additions & 0 deletions docs/api/gnps.md
@@ -0,0 +1,13 @@
::: nplinker.metabolomics.gnps
options:
members:
- GNPSFormat
- GNPSDownloader
- GNPSExtractor
- GNPSSpectrumLoader
- GNPSMolecularFamilyLoader
- GNPSAnnotationLoader
- GNPSFileMappingLoader
- gnps_format_from_archive
- gnps_format_from_file_mapping
- gnps_format_from_task_id
2 changes: 2 additions & 0 deletions docs/api/loader.md
@@ -0,0 +1,2 @@

::: nplinker.loader
1 change: 1 addition & 0 deletions docs/api/metabolomics.md
@@ -0,0 +1 @@
::: nplinker.metabolomics
1 change: 1 addition & 0 deletions docs/api/metabolomics_abc.md
@@ -0,0 +1 @@
::: nplinker.metabolomics.abc
1 change: 1 addition & 0 deletions docs/api/metabolomics_utils.md
@@ -0,0 +1 @@
::: nplinker.metabolomics.utils
1 change: 1 addition & 0 deletions docs/api/mibig.md
@@ -0,0 +1 @@
::: nplinker.genomics.mibig
1 change: 1 addition & 0 deletions docs/api/nplinker.md
@@ -0,0 +1 @@
::: nplinker.nplinker
1 change: 1 addition & 0 deletions docs/api/schema.md
@@ -0,0 +1 @@
::: nplinker.schemas
7 changes: 7 additions & 0 deletions docs/api/scoring.md
@@ -0,0 +1,7 @@
::: nplinker.scoring
options:
members:
- ScoringMethod
- MetcalfScoring
- LinkCollection
- ObjectLink
1 change: 1 addition & 0 deletions docs/api/strain.md
@@ -0,0 +1 @@
::: nplinker.strain
1 change: 1 addition & 0 deletions docs/api/strain_utils.md
@@ -0,0 +1 @@
::: nplinker.strain.utils
3 changes: 3 additions & 0 deletions docs/api/utils.md
@@ -0,0 +1,3 @@
::: nplinker.utils
options:
members_order: alphabetical
23 changes: 22 additions & 1 deletion mkdocs.yml
@@ -56,7 +56,28 @@ extra_javascript:
nav:
- Get Started:
- Welcome to NPLinker: index.md
- API:
- API Documentation:
- NPLinker: api/nplinker.md
- Dataset Arranger: api/arranger.md
- Dataset Loader: api/loader.md
- Genomics Data:
- Data Models: api/genomics.md
- Base Classes: api/genomics_abc.md
- MiBIG: api/mibig.md
- AntiSMASH: api/antismash.md
- BigScape: api/bigscape.md
- Utilities: api/genomics_utils.md
- Metabolomics Data:
- Data Models: api/metabolomics.md
- Base Classes: api/metabolomics_abc.md
- GNPS: api/gnps.md
- Utilities: api/metabolomics_utils.md
- Strain Data:
- Data Models: api/strain.md
- Utilities: api/strain_utils.md
- Scoring: api/scoring.md
- Schemas: api/schema.md
- General Utilities: api/utils.md

markdown_extensions:
- tables
6 changes: 3 additions & 3 deletions src/nplinker/genomics/utils.py
@@ -290,11 +290,11 @@ def get_mappings_strain_id_bgc_id(
Key is strain id and value is a set of BGC ids.
See Also:
`extract_mappings_strain_id_original_genome_id`: Extract mappings
- `extract_mappings_strain_id_original_genome_id`: Extract mappings
"strain_id <-> original_genome_id".
`extract_mappings_original_genome_id_resolved_genome_id`: Extract mappings
- `extract_mappings_original_genome_id_resolved_genome_id`: Extract mappings
"original_genome_id <-> resolved_genome_id".
`extract_mappings_resolved_genome_id_bgc_id`: Extract mappings
- `extract_mappings_resolved_genome_id_bgc_id`: Extract mappings
"resolved_genome_id <-> bgc_id".
"""
mappings_dict = {}
2 changes: 2 additions & 0 deletions src/nplinker/metabolomics/gnps/gnps_extractor.py
@@ -12,13 +12,15 @@ def __init__(self, file: str | PathLike, extract_dir: str | PathLike):
"""Class to extract files from a GNPS molecular networking archive(.zip).
Four files are extracted and renamed to the following names:
- file_mappings(.tsv/.csv)
- spectra.mgf
- molecular_families.tsv
- annotations.tsv
The files to be extracted are selected based on the GNPS workflow type,
as described below (in the order of the files above):
1. METABOLOMICS-SNETS
- clusterinfosummarygroup_attributes_withIDs_withcomponentID/*.tsv
- METABOLOMICS-SNETS*.mgf
9 changes: 8 additions & 1 deletion src/nplinker/metabolomics/gnps/gnps_file_mapping_loader.py
@@ -16,6 +16,7 @@ def __init__(self, file: str | PathLike):
The file mappings file is from GNPS output archive, as described below
for each GNPS workflow type:
1. METABOLOMICS-SNETS
- clusterinfosummarygroup_attributes_withIDs_withcomponentID/*.tsv
2. METABOLOMICS-SNETS-V2
@@ -29,7 +30,7 @@ def __init__(self, file: str | PathLike):
Raises:
ValueError: Raises ValueError if the file is not valid.
Example:
Examples:
>>> loader = GNPSFileMappingLoader("gnps_file_mappings.tsv")
>>> print(loader.mappings["1"])
['26c.mzXML']
@@ -137,6 +138,7 @@ def _load_snets(self) -> None:
"""Load file mapping from output of GNPS SNETS workflow.
The following columns are loaded:
- "cluster index": loaded as spectrum id
- "AllFiles": a list of files in which the spectrum occurs, separated
by '###'.
@@ -157,6 +159,7 @@ def _load_snetsv2(self) -> None:
"""Load file mapping from output of GNPS SNETS-V2 workflow.
The following columns are loaded:
- "cluster index": loaded as spectrum id
- "UniqueFileSources": a list of files in which the spectrum occurs,
separated by '|'.
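The two docstrings above differ only in the column read and the separator used. A minimal sketch of that splitting step follows; `split_file_list` is a hypothetical helper rather than the loader's actual code, and the second file name and the trailing separator are made-up sample data.

```python
def split_file_list(cell: str, separator: str) -> list[str]:
    """Split a delimited cell into file names, dropping empty entries.

    Hypothetical helper for illustration; not part of nplinker.
    """
    return [name for name in cell.split(separator) if name]


# SNETS: the "AllFiles" column separates file names with '###'.
snets_files = split_file_list("26c.mzXML###27c.mzXML###", "###")

# SNETS-V2: the "UniqueFileSources" column separates file names with '|'.
snetsv2_files = split_file_list("26c.mzXML|27c.mzXML", "|")
```

Dropping empty entries is an assumption made here so that a trailing separator does not yield an empty file name.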
@@ -174,13 +177,17 @@ def _load_fbmn(self):
"""Load file mapping from output of GNPS FBMN workflow.
The column "row ID" is loaded as spectrum id.
The column names containing " Peak area" are used to extract the file
names, and the values of these columns are used to determine whether
the spectrum occurs in the file. The file name is taken only if the
value is greater than 0.
An example data of the file is as follows:
```
row ID,5434_5433_mod.mzXML Peak area,5425_5426_mod.mzXML Peak area
1,1764067.8434999974,0.0
```
"""
pattern = " Peak area"
with open(self._file, mode="rt", encoding="utf-8") as f:
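The FBMN rule described in the docstring (take a file name only when its " Peak area" value is greater than 0) can be sketched as below. This is an illustration under stated assumptions, not the method's real body: `load_fbmn_mappings` is a hypothetical function that reads CSV text from a string instead of `self._file`.

```python
import csv
import io

PATTERN = " Peak area"


def load_fbmn_mappings(text: str) -> dict[str, list[str]]:
    """Map each "row ID" to the files whose peak area is greater than 0.

    Hypothetical stand-alone sketch; not part of nplinker.
    """
    mappings: dict[str, list[str]] = {}
    for row in csv.DictReader(io.StringIO(text)):
        files = [
            column.replace(PATTERN, "")  # strip the suffix to get the file name
            for column, value in row.items()
            if PATTERN in column and float(value) > 0
        ]
        mappings[row["row ID"]] = files
    return mappings


# Sample data taken from the docstring above.
sample = (
    "row ID,5434_5433_mod.mzXML Peak area,5425_5426_mod.mzXML Peak area\n"
    "1,1764067.8434999974,0.0\n"
)
fbmn_mappings = load_fbmn_mappings(sample)
```

With the docstring's sample row, only `5434_5433_mod.mzXML` is kept for spectrum `1`, because the second peak area is `0.0`.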
1 change: 1 addition & 0 deletions src/nplinker/metabolomics/gnps/gnps_format.py
@@ -105,6 +105,7 @@ def gnps_format_from_file_mapping(file: str | PathLike) -> GNPSFormat:
The GNPS file mapping file is located in different folders depending on the
GNPS workflow. Here are the locations in corresponding GNPS zip archives:
- METABOLOMICS-SNETS workflow: the .tsv file under folder "clusterinfosummarygroup_attributes_withIDs_withcomponentID"
- METABOLOMICS-SNETS-V2 workflow: the .clustersummary file (tsv) under folder "clusterinfosummarygroup_attributes_withIDs_withcomponentID"
- FEATURE-BASED-MOLECULAR-NETWORKING workflow: the .csv file under folder "quantification_table"
1 change: 1 addition & 0 deletions src/nplinker/metabolomics/gnps/gnps_spectrum_loader.py
@@ -14,6 +14,7 @@ def __init__(self, file: str | PathLike):
The spectrum file (.mgf) is from GNPS output archive, as described below
for each GNPS workflow type:
1. METABOLOMICS-SNETS
- METABOLOMICS-SNETS*.mgf
2. METABOLOMICS-SNETS-V2
21 changes: 13 additions & 8 deletions src/nplinker/strain/utils.py
@@ -20,9 +20,12 @@
def load_user_strains(json_file: str | PathLike) -> set[Strain]:
"""Load user specified strains from a JSON file.
The JSON file must follow the schema defined in "nplinker/schemas/user_strains.json".
The JSON file must follow the schema defined in `schemas/user_strains.json`.
An example content of the JSON file:
```
{"strain_ids": ["strain1", "strain2"]}
```
Args:
json_file: Path to the JSON file containing user specified strains.
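As a rough illustration of the JSON layout shown in the docstring, the sketch below reads the `strain_ids` list into a set of strings. It is an assumption-laden simplification: `read_user_strain_ids` is a hypothetical helper, it returns plain strings rather than `Strain` objects, and it checks only the top-level key instead of validating against `schemas/user_strains.json`.

```python
import json
import tempfile
from pathlib import Path


def read_user_strain_ids(json_file):
    """Return the ids listed under "strain_ids" (hypothetical helper, not nplinker's API)."""
    with open(json_file, encoding="utf-8") as f:
        data = json.load(f)
    # Minimal structural check; the real function relies on a JSON schema.
    if not isinstance(data, dict) or not isinstance(data.get("strain_ids"), list):
        raise ValueError(f"'{json_file}' must contain a JSON object with a 'strain_ids' list")
    return set(data["strain_ids"])


# Write the docstring's example content to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "user_strains.json"
    path.write_text('{"strain_ids": ["strain1", "strain2"]}', encoding="utf-8")
    strain_ids = read_user_strain_ids(path)
```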
@@ -53,10 +56,12 @@ def podp_generate_strain_mappings(
"""Generate strain mappings JSON file for PODP pipeline.
To get the strain mappings, we need to combine the following mappings:
- strain_id <-> original_genome_id <-> resolved_genome_id <-> bgc_id
- strain_id <-> MS_filename <-> spectrum_id
These mappings are extracted from the following files:
- "strain_id <-> original_genome_id" is extracted from `podp_project_json_file`.
- "original_genome_id <-> resolved_genome_id" is extracted from `genome_status_json_file`.
- "resolved_genome_id <-> bgc_id" is extracted from `genome_bgc_mappings_file`.
@@ -78,18 +83,18 @@ def podp_generate_strain_mappings(
The strain mappings stored in a StrainCollection object.
See Also:
`extract_mappings_strain_id_original_genome_id`: Extract mappings
- `extract_mappings_strain_id_original_genome_id`: Extract mappings
"strain_id <-> original_genome_id".
`extract_mappings_original_genome_id_resolved_genome_id`: Extract mappings
- `extract_mappings_original_genome_id_resolved_genome_id`: Extract mappings
"original_genome_id <-> resolved_genome_id".
`extract_mappings_resolved_genome_id_bgc_id`: Extract mappings
- `extract_mappings_resolved_genome_id_bgc_id`: Extract mappings
"resolved_genome_id <-> bgc_id".
`get_mappings_strain_id_bgc_id`: Get mappings "strain_id <-> bgc_id".
`extract_mappings_strain_id_ms_filename`: Extract mappings
- `get_mappings_strain_id_bgc_id`: Get mappings "strain_id <-> bgc_id".
- `extract_mappings_strain_id_ms_filename`: Extract mappings
"strain_id <-> MS_filename".
`extract_mappings_ms_filename_spectrum_id`: Extract mappings
- `extract_mappings_ms_filename_spectrum_id`: Extract mappings
"MS_filename <-> spectrum_id".
`get_mappings_strain_id_spectrum_id`: Get mappings "strain_id <-> spectrum_id".
- `get_mappings_strain_id_spectrum_id`: Get mappings "strain_id <-> spectrum_id".
"""
# Get mappings strain_id <-> original_genome_id <-> resolved_genome_id <-> bgc_id
mappings_strain_id_bgc_id = get_mappings_strain_id_bgc_id(
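The genomics chain described in the docstring (strain_id <-> original_genome_id <-> resolved_genome_id <-> bgc_id) amounts to composing three lookups. The sketch below shows one way that composition could look. The function name `chain_strain_to_bgc`, the shapes of the three input dictionaries, and all ids are assumptions for illustration; only the result shape (strain id mapped to a set of BGC ids) comes from the `get_mappings_strain_id_bgc_id` docstring.

```python
def chain_strain_to_bgc(
    strain_to_original: dict[str, set[str]],
    original_to_resolved: dict[str, str],
    resolved_to_bgc: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Compose three mappings into strain_id -> set of bgc_id (hypothetical sketch)."""
    result: dict[str, set[str]] = {}
    for strain_id, original_ids in strain_to_original.items():
        bgc_ids: set[str] = set()
        for original_id in original_ids:
            resolved_id = original_to_resolved.get(original_id)
            if resolved_id is None:
                continue  # genome could not be resolved; skip it
            bgc_ids |= resolved_to_bgc.get(resolved_id, set())
        if bgc_ids:
            result[strain_id] = bgc_ids
    return result


# Made-up ids, used only to exercise the composition.
strain_bgc = chain_strain_to_bgc(
    {"strain1": {"genomeA"}, "strain2": {"genomeB"}},
    {"genomeA": "resolvedA"},
    {"resolvedA": {"bgc1", "bgc2"}},
)
```

Here `strain2` is dropped because `genomeB` has no resolved genome id; whether unresolved strains should be dropped or reported is a design choice this sketch does not settle.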
