This repository includes all custom scripts and deep learning model code associated with the paper "An Explainable Language Model for Antibody Specificity Prediction Using Curated Influenza Hemagglutinin Antibodies".

nicwulab/HA_Abs


Sequence analysis of influenza hemagglutinin (HA) antibodies

This README describes the analysis in:
An explainable language model for antibody specificity prediction using curated influenza hemagglutinin antibodies

Contents

Env setup

To set up the environment with conda, create it from the provided environment file:

conda env create -f Ab_epitope/environment.yml

Dataset

CDR H3 analysis

  1. Extract CDR H3 sequences and references
    python3 script/parse_Ab_table.py

  2. Cluster CDR H3 sequences
    python3 script/CDRH3_clustering_optimal.py

  3. Analyze CDR H3 clustering results
    python3 script/analyze_CDRH3_cluster.py

  4. Analyze CDR H3 properties
    python3 script/analyze_CDRH3_property.py

  5. Create sequence logos for different CDR H3 clusters
    python3 script/CDRH3_seqlogo.py

  6. Plot CDR H3 property for HA head and stem antibodies
    Rscript script/plot_CDRH3_property.R
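
The six steps above can also be chained from a small driver. The sketch below is not part of the repository: `CDRH3_STEPS` and `run_steps` are hypothetical names, and it assumes the steps are meant to run in the listed order from the repository root, stopping at the first failure. By default it only prints the commands; pass `--run` to execute them.

```python
import subprocess
import sys

# Steps copied from the list above, in order: (interpreter, script path).
CDRH3_STEPS = [
    ("python3", "script/parse_Ab_table.py"),
    ("python3", "script/CDRH3_clustering_optimal.py"),
    ("python3", "script/analyze_CDRH3_cluster.py"),
    ("python3", "script/analyze_CDRH3_property.py"),
    ("python3", "script/CDRH3_seqlogo.py"),
    ("Rscript", "script/plot_CDRH3_property.R"),
]


def run_steps(steps, dry_run=True):
    """Run each step in order and return the command lines that were handled.

    With dry_run=True the commands are only collected, not executed.
    """
    handled = []
    for interpreter, script in steps:
        cmd = [interpreter, script]
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps never run on top of a failed one.
            subprocess.run(cmd, check=True)
        handled.append(" ".join(cmd))
    return handled


if __name__ == "__main__":
    # Assumption: executed from the repository root, where script/ lives.
    for line in run_steps(CDRH3_STEPS, dry_run="--run" not in sys.argv):
        print(line)
```

The same function would work for the germline usage steps by passing a different list of `(interpreter, script)` pairs.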

Germline usage analysis

  1. Clonotype assignment
    python3 script/assign_clonotype.py

  2. Compute germline usage and extract public clonotypes
    python3 script/extract_public_clonotype_VDJ.py

  3. Extract IGHD4-17-encoded head antibodies
    python3 script/analyze_IGHD4-17.py

  4. Analyze the occurrence of the YGD motif in CDR H3
    python3 script/analyze_YGD_motif.py

  5. Plot VDJ gene usage
    Rscript script/plot_VDJgene_freq.R

  6. Plot IGHV/IGK(L)V pairing frequency
    Rscript script/plot_Vpair_heatmap.R

  7. Plot frequency of YGD motif
    Rscript script/plot_YGD_freq.R
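
To illustrate what step 4 measures, here is a minimal sketch of counting how often a motif occurs in a set of CDR H3 sequences. It is an assumption about the general idea, not the logic of `script/analyze_YGD_motif.py`: the function name `motif_frequency` is hypothetical, and the sequences are made-up toy strings rather than entries from the dataset.

```python
def motif_frequency(cdrh3_seqs, motif="YGD"):
    """Return the fraction of sequences that contain the motif at least once."""
    # Normalize case and drop empty entries before counting.
    seqs = [s.strip().upper() for s in cdrh3_seqs if s and s.strip()]
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if motif in s)
    return hits / len(seqs)


# Hypothetical toy sequences, for illustration only (not from the dataset).
toy = ["ARDYGDYFDY", "ARGGSYGDLDV", "AKDLGWFDP", "ARESSGYYFDY"]
print(motif_frequency(toy))  # two of the four toy sequences contain "YGD"
```

A plain substring test counts a sequence once no matter how many times the motif appears in it; the actual script may define occurrence differently, for example by position within CDR H3.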

mBLM for specificity prediction

See Ab_epitope
