Enable querying of multiple genes in Bgee API #165

Open
wants to merge 5 commits into
base: main
4 changes: 2 additions & 2 deletions docs/src/en/bgee.md
@@ -9,8 +9,8 @@ Return format: JSON/CSV (command-line) or data frame (Python).
This module was written by [Sam Wagenaar](https://github.com/techno-sam).

**Positional argument**
`ens_id`
Ensembl gene ID, e.g. ENSG00000169194 or ENSSSCG00000014725.
`ens_ids`
One or more Ensembl gene IDs, e.g. ENSG00000169194 or ENSSSCG00000014725.

NOTE: Some of the species in [Bgee](https://www.bgee.org/) are not in Ensembl or Ensembl metazoa, and for those you can use NCBI gene IDs, e.g. 118215821 (a gene in _Anguilla anguilla_).

2 changes: 1 addition & 1 deletion docs/src/es/bgee.md
@@ -11,7 +11,7 @@ Este módulo fue escrito por [Sam Wagenaar](https://github.com/techno-sam).

**Argumento posicional**
`ens_id`
ID de gen Ensembl, por ejemplo, ENSG00000169194 o ENSSSCG00000014725.
Uno o varios ID de gen Ensembl, por ejemplo, ENSG00000169194 o ENSSSCG00000014725.

NOTA: Algunas de las especies en [Bgee](https://www.bgee.org/) no están en Ensembl, y para ellas puede utilizar los ID de genes del NCBI, p. 118215821 (un gen en _Anguilla anguilla_).

31 changes: 22 additions & 9 deletions gget/gget_bgee.py
@@ -100,35 +100,47 @@ def _bgee_orthologs(gene_id, json=False, verbose=True):
return df


def _bgee_expression(gene_id, json=False, verbose=True):
def _bgee_expression(gene_ids, json=False, verbose=True):
"""
Get expression data from Bgee

Args:

:param gene_id: Ensembl gene ID
:param gene_ids: Ensembl gene ID(s)
:param json: return JSON instead of DataFrame
:param verbose: log progress

Returns requested information as a DataFrame or JSON
"""
# must first obtain species
species = _bgee_species(gene_id, verbose=verbose)
# if single Ensembl ID passed as string, convert to list
if isinstance(gene_ids, str):
gene_ids = [gene_ids]

# make sure all gene IDs correspond to the same species
species_set = {_bgee_species(gene_id, verbose=verbose) for gene_id in gene_ids}
print(species_set)

if len(species_set) != 1:
raise RuntimeError("All gene_ids must be from a single species.")

# get the single species from the set
species = species_set.pop()

if verbose:
logger.info(f"Getting expression data for gene {gene_id} from Bgee")
logger.info(f"Getting expression data for gene {', '.join(gene_ids)} from Bgee")

# then obtain expression data
response = requests.get(
"https://bgee.org/api/",
params={
"display_type": "json",
"page": "gene",
"action": "expression",
"gene_id": gene_id,
"page": "data",
"action": "expr_calls",
"gene_id": gene_ids,
"species_id": species,
"cond_param": ["anat_entity", "cell_type"],
"data_type": "all",
"get_results": "true",
},
)
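
Because `gene_id` is now a list, the query string carries one `gene_id` entry per gene. The sketch below shows that serialization using only the standard library: `urlencode(..., doseq=True)` produces the same repeated-key shape that `requests` emits for list-valued `params`. The helper name `build_expr_calls_query`, the second gene ID, and the species ID value are illustrative assumptions, not part of this PR.

```python
from urllib.parse import urlencode, parse_qs


def build_expr_calls_query(gene_ids, species_id):
    """Hypothetical helper: build the query string for the expr_calls request."""
    # Mirror the PR: a single ID passed as a string becomes a one-element list.
    if isinstance(gene_ids, str):
        gene_ids = [gene_ids]

    params = {
        "display_type": "json",
        "page": "data",
        "action": "expr_calls",
        "gene_id": list(gene_ids),  # list value -> repeated gene_id=... keys
        "species_id": species_id,
        "cond_param": ["anat_entity", "cell_type"],
        "data_type": "all",
        "get_results": "true",
    }
    # doseq=True expands each list into one key=value pair per element.
    return urlencode(params, doseq=True)


# Example strings only; the species ID is an assumed value for illustration.
query = build_expr_calls_query(["ENSG00000169194", "ENSG00000153563"], 9606)
print(query)
print(parse_qs(query)["gene_id"])
```

Parsing the result back with `parse_qs` is a quick way to confirm that every gene ID survives as its own `gene_id` entry.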

@@ -138,7 +150,7 @@ def _bgee_expression(gene_id, json=False, verbose=True):
"Please double-check the arguments and try again.\n"
)

expression_data = response.json()["data"]["calls"]
expression_data = response.json()["data"]["expressionData"]["expressionCalls"]

df = json_list_to_df(
expression_data,
@@ -150,6 +162,7 @@ def _bgee_expression(gene_id, json=False, verbose=True):
("expression_state", "expressionState"),
],
)

df["score"] = df["score"].astype(float)

if json:
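
The single-species guard added to `_bgee_expression` can be exercised without network access. The sketch below assumes a stand-in lookup (`fake_species_lookup`, backed by a made-up ID-to-species table) in place of `_bgee_species`; `resolve_single_species` is a hypothetical name used for illustration, not a function in gget.

```python
def resolve_single_species(gene_ids, species_lookup):
    """Return the one species shared by all gene_ids, or raise RuntimeError.

    species_lookup is any callable mapping a gene ID to a species ID
    (an assumption standing in for _bgee_species).
    """
    # A single ID passed as a string is treated as a one-element list.
    if isinstance(gene_ids, str):
        gene_ids = [gene_ids]

    # Collapse the per-gene species into a set; exactly one member is valid.
    species_set = {species_lookup(gene_id) for gene_id in gene_ids}

    if len(species_set) != 1:
        raise RuntimeError("All gene_ids must be from a single species.")

    return species_set.pop()


# Illustrative table only: species IDs here are assumed, not fetched from Bgee.
_FAKE_SPECIES = {"ENSG00000169194": 9606, "ENSSSCG00000014725": 9823}


def fake_species_lookup(gene_id):
    return _FAKE_SPECIES[gene_id]


print(resolve_single_species("ENSG00000169194", fake_species_lookup))

try:
    resolve_single_species(
        ["ENSG00000169194", "ENSSSCG00000014725"], fake_species_lookup
    )
except RuntimeError as err:
    print(err)
```

Note that with this logic an empty list also raises the same `RuntimeError`, since the species set is then empty rather than of size one.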