A Python 3 script for parsing the header information of PDB files.
The main purpose of this script is to extract some of the header information (Experiment Method, Resolution, R value, R_free value, and mean B-factor), then calculate grades for the Resolution and R_free value based on the grading used by FirstGlance in Jmol. Finally, it saves this information as a .csv file (via a Pandas DataFrame).
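As a rough illustration of the extraction step, the sketch below reads the experiment method from the `EXPDTA` record and the resolution from the `REMARK   2 RESOLUTION.` record. The function name `parse_method_and_resolution` and the sample lines are assumptions made for this example; they are not taken from parse_PDB_header.py, which may do this differently.

```python
import re

# Hypothetical helper, not taken from parse_PDB_header.py.
def parse_method_and_resolution(lines):
    """Return (experiment method, resolution in angstroms or None)."""
    method = None
    resolution = None
    for line in lines:
        if line.startswith("EXPDTA"):
            # The method follows the record name, e.g. "X-RAY DIFFRACTION".
            method = line[6:].strip()
        elif line.startswith("REMARK   2 RESOLUTION."):
            # Entries without a numeric resolution leave the value as None.
            match = re.search(r"RESOLUTION\.\s+([0-9.]+)\s+ANGSTROMS", line)
            if match:
                resolution = float(match.group(1))
    return method, resolution


# Illustrative header lines (made up for this sketch).
header = [
    "EXPDTA    X-RAY DIFFRACTION",
    "REMARK   2 RESOLUTION.    1.50 ANGSTROMS.",
]
print(parse_method_and_resolution(header))
```

Returning `None` for a missing resolution keeps the later grading and .csv steps free to decide how to report entries without a numeric value.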
Note: In some PDB files (due to the complexity of the protein's structure and the limitations of experimental detection), the R value is reported more than once with different values, even within the same PDB file, or is shown as NULL. The same applies to the R_free value. In addition, the mean B-factor is sometimes reported as NULL.
For example, in 1BRT.pdb:

    REMARK 3 FIT TO DATA USED IN REFINEMENT.
    REMARK 3 CROSS-VALIDATION METHOD : THROUGHOUT
    REMARK 3 FREE R VALUE TEST SET SELECTION : RANDOM
    REMARK 3 R VALUE (WORKING + TEST SET) : 0.140
    REMARK 3 R VALUE (WORKING SET) : 0.147
    REMARK 3 FREE R VALUE : 0.164
    REMARK 3 FREE R VALUE TEST SET SIZE (%) : 5.000
    REMARK 3 FREE R VALUE TEST SET COUNT : 2283
In 1GPD.pdb:

    REMARK 3 FIT TO DATA USED IN REFINEMENT.
    REMARK 3 CROSS-VALIDATION METHOD : NULL
    REMARK 3 FREE R VALUE TEST SET SELECTION : NULL
    REMARK 3 R VALUE (WORKING SET) : NULL
    REMARK 3 FREE R VALUE : NULL
    REMARK 3 FREE R VALUE TEST SET SIZE (%) : NULL
    REMARK 3 FREE R VALUE TEST SET COUNT : NULL
    REMARK 3 ESTIMATED ERROR OF FREE R VALUE : NULL
    ...
    REMARK 3 B VALUES.
    REMARK 3 FROM WILSON PLOT (A**2) : NULL
    REMARK 3 MEAN B VALUE (OVERALL, A**2) : NULL
    REMARK 3 OVERALL ANISOTROPIC B VALUE.
    REMARK 3 B11 (A**2) : NULL
    REMARK 3 B22 (A**2) : NULL
    REMARK 3 B33 (A**2) : NULL
    REMARK 3 B12 (A**2) : NULL
    REMARK 3 B13 (A**2) : NULL
    REMARK 3 B23 (A**2) : NULL
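One possible way to cope with both situations (several R values in one file, and NULL entries) is to collect every occurrence of a field into a list and map NULL to `None`. The sketch below is only an assumption about how this could be done; the names `LABELS`, `to_number`, and `parse_refinement_values` are hypothetical and do not come from the script itself.

```python
# Hypothetical mapping from REMARK 3 labels (as they appear in the excerpts
# above) to short field names; the field names are illustrative only.
LABELS = {
    "R VALUE (WORKING + TEST SET)": "r_value_all",
    "R VALUE (WORKING SET)": "r_value_working",
    "FREE R VALUE": "r_free",
    "MEAN B VALUE (OVERALL, A**2)": "mean_b",
}


def to_number(raw):
    """Convert a REMARK 3 value to float; NULL or unparsable text becomes None."""
    if raw == "NULL":
        return None
    try:
        return float(raw)
    except ValueError:
        return None


def parse_refinement_values(lines):
    """Collect every occurrence of each known field, in file order."""
    values = {}
    for line in lines:
        # Collapse runs of spaces so column padding does not matter.
        normalized = " ".join(line.split())
        if not normalized.startswith("REMARK 3 "):
            continue
        label, sep, raw = normalized[len("REMARK 3 "):].partition(":")
        field = LABELS.get(label.strip())
        if not sep or field is None:
            continue
        values.setdefault(field, []).append(to_number(raw.strip()))
    return values


sample_1brt = [
    "REMARK 3 R VALUE (WORKING + TEST SET) : 0.140",
    "REMARK 3 R VALUE (WORKING SET) : 0.147",
    "REMARK 3 FREE R VALUE : 0.164",
    "REMARK 3 FREE R VALUE TEST SET SIZE (%) : 5.000",
]
sample_1gpd = [
    "REMARK 3 R VALUE (WORKING SET) : NULL",
    "REMARK 3 FREE R VALUE : NULL",
    "REMARK 3 MEAN B VALUE (OVERALL, A**2) : NULL",
]
print(parse_refinement_values(sample_1brt))
print(parse_refinement_values(sample_1gpd))
```

Keeping a list per field leaves the choice of which R value to report (or how to flag a NULL) to a later step, instead of silently keeping the first or last match.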
For example, run the script with:
python parse_PDB_header.py
The program will then ask you for the directory that contains the PDB files.
>>> Please type the directory contains PDB files:
If you are already in that directory, you only need to type ./ as input.
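Once a directory has been entered, the PDB files in it have to be located. A minimal sketch of that step is shown below, assuming the files carry a .pdb extension and sit directly in the given directory; the function name `find_pdb_files` is hypothetical, and the script may collect its files in another way.

```python
from pathlib import Path


# Hypothetical helper, not taken from parse_PDB_header.py.
def find_pdb_files(directory):
    """Return the .pdb files directly inside `directory`, sorted by name."""
    folder = Path(directory).expanduser()
    if not folder.is_dir():
        raise NotADirectoryError(f"{directory!r} is not a directory")
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdb"
    )


# "./" refers to the current working directory, as in the prompt above.
for path in find_pdb_files("./"):
    print(path.name)
```

Checking the directory up front gives a clear error for a mistyped path before any parsing starts.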
I thank Wayne for the discussion about the calc_R_free_grade() and deal_round() functions.
I would also like to thank Zachary Ware for the detailed explanation of the Decimal() function, which was published on 2015-08-08 09:36.