How to retrieve PPI interface, protein-ligand interface and protein-DNA/RNA interface? #31

Open
yang-arina opened this issue Apr 1, 2024 · 7 comments

@yang-arina

Hi! Is the PISA API working?
I currently have a series of unp_ids, such as ['Q5VTD9', 'Q6PL24', 'Q9UKM9', 'Q9H0A3', 'Q9P2N6', 'P21127', 'Q8TD94', 'Q86SH2', 'Q9BYR3', 'Q99856']. How can I retrieve for each unp_id:

    1. all protein complexes and their PPI interface residues;
    2. all protein-ligand complexes and their protein interface residues;
    3. all protein-DNA/RNA complexes and their protein interface residues.

If you could help me with this, I would be extremely grateful!

@NatureGeorge
Owner

NatureGeorge commented Apr 4, 2024

Items 1 and 3 can be fulfilled with the current version of pdb-profiling.

# homomeric
pdb_profiling sifts-mapping --input demo_uniprot.txt --column UniProt --func pipe_select_ho --output ho_result.txt
# heteromeric
pdb_profiling sifts-mapping --input demo_uniprot.txt --column UniProt --func pipe_select_he --output he_result.txt
# protein-DNA/RNA
pdb_profiling sifts-mapping --input demo_uniprot.txt --column UniProt --func pipe_select_else --kwargs 'func="Protein/NA"' --output na_result.txt
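The commands above read accessions from demo_uniprot.txt under a column named UniProt. A minimal sketch for preparing that file from the list in the question is shown here; it assumes the input is a plain-text table with a single header row and one accession per line, which this thread does not confirm. The helper name `build_input_table` is hypothetical and not part of pdb-profiling.

```python
from pathlib import Path


def build_input_table(unp_ids, column="UniProt"):
    """Return a one-column table: a header row, then one accession per line.

    Assumption: pdb_profiling accepts a plain-text table whose header
    matches the value passed to --column.
    """
    # Drop duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(unp_ids))
    return "\n".join([column, *unique_ids]) + "\n"


if __name__ == "__main__":
    unp_ids = ['Q5VTD9', 'Q6PL24', 'Q9UKM9', 'Q9H0A3', 'Q9P2N6',
               'P21127', 'Q8TD94', 'Q86SH2', 'Q9BYR3', 'Q99856']
    # File name taken from the --input argument used in the commands above.
    Path("demo_uniprot.txt").write_text(build_input_table(unp_ids), encoding="utf-8")
```

If the tool expects a different delimiter or extra columns, only the string built inside the helper would need to change.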

The PISA API has been fixed, but it has also been updated, so the code and scripts need some modification (https://www.ebi.ac.uk/pdbe/news/improved-macromolecular-interactions-data-pisa-lite).
I will try to implement the new PISA API in the coming weeks.

In the meantime, you can try another tool with similar functionality: https://github.com/ELELAB/PDBminer (https://pubs.acs.org/doi/10.1021/acs.jcim.3c00884). Their tool also seems to have some bugs to fix, though.

@NatureGeorge
Owner

NatureGeorge commented Apr 4, 2024

Actually, to save you time and effort, you can go directly to https://www.pdbbind-plus.org.cn/ to retrieve such interaction data.

EDIT: It seems that they do not provide details of the interacting residues.

@yang-arina
Author

Thank you for your kind reply! I'll try to get the interface residues through the PISA API!

@NatureGeorge
Owner

The list of valid PISA API endpoints is here: https://www.ebi.ac.uk/pdbe/api/pisa.html

@yang-arina
Author

Hi, I would like to inquire about how to calculate the BS score. It's used to measure the mapping quality between UniProt and PDB chains, with a maximum value of 1. I have the UniProt fasta file and the corresponding PDB chain's pdb structure file. Could you please advise me on how to proceed with the calculation of the BS score? Thank you for your assistance.

@NatureGeorge
Owner

NatureGeorge commented Apr 23, 2024

Sorry for the late reply. I have been busy these days. The BS score calculation and the representative structure selection process for interacting proteins seem to need restoring. I came across https://pypi.org/project/arctic3d/ (https://www.nature.com/articles/s42003-023-05718-w) and found it useful. You can try it first.

@NatureGeorge
Owner

NatureGeorge commented Apr 23, 2024

Oh, maybe I misunderstood your needs. The way to calculate the BS score is straightforward. It is composed of aligned, C/N-terminal, insertion, and deletion scoring parts. If you want to compute it yourself, I recommend calculating only the aligned part and normalizing it by the full length of the UniProt sequence; that is enough for most cases.
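As a rough illustration of "the aligned part normalized by the full UniProt length", the sketch below counts the UniProt positions covered by aligned PDB-chain residues and divides by the sequence length. It is a simplified approximation and not the pdb-profiling implementation: it ignores residue identity and the terminal, insertion, and deletion terms, and it assumes you already have the aligned segments as 1-based, inclusive UniProt ranges (for example from a SIFTS mapping or your own pairwise alignment). The function name `aligned_coverage` is hypothetical.

```python
def aligned_coverage(unp_ranges, unp_length):
    """Fraction of UniProt positions covered by aligned PDB-chain residues.

    unp_ranges: iterable of (start, end) pairs, 1-based and inclusive,
                in UniProt numbering.
    unp_length: full length of the UniProt sequence.
    """
    if unp_length <= 0:
        raise ValueError("unp_length must be positive")
    covered = set()
    for start, end in unp_ranges:
        # Clip each segment to the sequence bounds before counting.
        lo, hi = max(start, 1), min(end, unp_length)
        covered.update(range(lo, hi + 1))
    # Overlapping segments are counted once because positions live in a set.
    return len(covered) / unp_length


# Two overlapping segments covering residues 1-100 of a 200-residue protein.
print(aligned_coverage([(1, 50), (41, 100)], unp_length=200))  # 0.5
```

Like the BS score described above, this value has a maximum of 1, reached when every UniProt position is covered.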


The BS score is intended for selecting representative structures for each UniProt entry. The calculated results can be found in the bs_score column of result.txt (or bs_score_1 and bs_score_2 for protein 1 and protein 2, respectively, in ho/he/na_result.txt). According to the BS score, as well as the exact region of the UniProt sequence that each PDB chain covers, we select the PDB structures with as little coverage overlap and as much overall coverage of the full-length UniProt sequence as possible. The boolean selection indicator can be found in the select_tag column (or select_tag_1 and select_tag_2).
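One way to picture "little overlap, much overall coverage" is a greedy pass over the candidates in descending score order. The sketch below is only an illustration of that idea, not the selection procedure pdb-profiling actually uses; the `max_overlap` threshold, the dictionary keys other than `bs_score`, and the structure identifiers are all hypothetical.

```python
def select_representatives(candidates, max_overlap=0.25):
    """Greedily pick structures that add new UniProt coverage.

    candidates: list of dicts with keys
        'id'       - any label for the PDB chain (hypothetical),
        'bs_score' - score used for ranking,
        'ranges'   - (start, end) UniProt ranges, 1-based and inclusive.
    max_overlap: largest allowed fraction of a candidate's positions
        that are already covered (an assumed threshold).
    Returns (selected ids in pick order, set of covered positions).
    """
    covered = set()
    selected = []
    # Visit the best-scoring candidates first.
    for cand in sorted(candidates, key=lambda c: c["bs_score"], reverse=True):
        positions = set()
        for start, end in cand["ranges"]:
            positions.update(range(start, end + 1))
        if not positions:
            continue
        overlap = len(positions & covered) / len(positions)
        if overlap <= max_overlap:
            selected.append(cand["id"])
            covered |= positions
    return selected, covered


candidates = [
    {"id": "struct_A", "bs_score": 0.50, "ranges": [(1, 100)]},
    {"id": "struct_B", "bs_score": 0.45, "ranges": [(10, 95)]},
    {"id": "struct_C", "bs_score": 0.30, "ranges": [(90, 150)]},
]
selected, covered = select_representatives(candidates)
print(selected, len(covered))  # ['struct_A', 'struct_C'] 150
```

Here struct_B is skipped because it lies entirely inside the region struct_A already covers, while struct_C is kept because most of its residues are new.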

For interacting pairs, we focus on how differently the interface residues cover the full-length UniProt pair; the boolean selection indicator can be found in the i_select_tag column.
