Update results.md
BlueShark002 authored Sep 29, 2024
1 parent 2ad7ac5 commit 7a18d29
Showing 1 changed file with 17 additions and 0 deletions.
17 changes: 17 additions & 0 deletions docs/project/results.md
@@ -10,17 +10,34 @@ We collected and categorized a dataset of 1,030 extremophiles from various genom
## 2.Establishment and refinement of the iExtreme model for extremophile identification
The iExtreme model was developed by combining 1,030 extremophilic genomes with non-extremophilic genomes to predict extremophiles and their optimal living conditions. Using k-mer feature extraction and SVM classification, the model achieved prediction accuracies of 0.97 to 0.99 for halophiles, thermophiles, and pH-philes. iExtreme was also used to re-predict entries with missing data, identifying 129 previously overlooked extremophiles. In total, the model identified 356 halophiles, 688 thermophiles, and 168 pH-philes, emphasizing the role of extreme environments in their evolution. To improve accessibility, we created an interactive website where users can submit genomic data and the model automatically predicts the species' optimal living conditions.
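
The k-mer feature extraction step can be illustrated with a minimal sketch. This is not the iExtreme implementation: the function name `kmer_frequencies`, the choice of `k`, and the normalization are assumptions made for illustration only. It turns a nucleotide sequence into a fixed-length frequency vector of the kind a classifier such as an SVM could take as input.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence: str, k: int = 3) -> list:
    """Return normalized k-mer frequencies in a fixed A/C/G/T order.

    Hypothetical helper for illustration; not taken from iExtreme.
    """
    sequence = sequence.upper()
    # Fixed vocabulary so every genome maps to a vector of length 4**k.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Windows containing other symbols (e.g. N) are not in the vocabulary
    # and are therefore excluded from the total.
    total = sum(counts[kmer] for kmer in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


if __name__ == "__main__":
    features = kmer_frequencies("ATGCGATACGCTTGA", k=2)
    print(len(features), round(sum(features), 6))
```

Because the vocabulary order is fixed, vectors from different genomes are directly comparable, which is what a downstream classifier needs.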

![Fig.2](../img/fig2.png)

<center>Fig.2 Establishment of deep learning model for extremophile identification.</center>

## 3.Discovery of extremophiles using the iExtreme model
We applied the iExtreme model to predict and refine genomic data from established databases, including NCBI, ProPan, ProGenome, and GOMC. With this approach, we identified 1,166 extremophile genomes in the NCBI database and classified 729 species as extremophiles. Across all databases, we identified 520 new extremophilic species and 4,419 new extremophile genomes. Additionally, by applying iExtreme to viral genomes from the NCBI viral database, we discovered 47 haloviruses, 35 thermoviruses, and 30 pH-viruses, with 36 extreviruses included in our final database. Notably, most of the identified extremophilic viruses were DNA-based, particularly dsDNA, suggesting that DNA viruses exhibit higher tolerance to extreme conditions. We validated the model by confirming that the predicted optimal living conditions agreed with those observed in our collected extremophiles.

![Fig.3](../img/fig3.png)

<center>Fig.3 Extremophile screening using iExtreme.</center>

## 4.Unveiling extremozymes through structure-based clustering within the iExtreme database
We used structure-based clustering within the iExtreme database to discover new extremozymes. We focused on two key industrial enzymes, D-psicose 3-epimerase (DPEase) and α-amylase. By applying our clustering method, we identified 136 potential DPEases and selected four candidates with high structural similarity but low sequence homology. After testing, we found that NthYcjR and TcaYcjR displayed strong DPEase activity and excellent thermostability, surpassing the widely used Clo-DPEase under high-temperature conditions. Using the same approach, we discovered novel α-amylases from Thermospira aquatica and Thermoanaerobacter thermocopriae, which showed activity and thermostability similar to those of known thermophilic α-amylases. These findings demonstrate the effectiveness of structure-based clustering in identifying new extremozymes with great potential for industrial applications.
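
The selection criterion described above (high structural similarity, low sequence homology) can be sketched as a simple filter. Everything specific here is an assumption: the text does not say which structural metric or cutoffs were used, so the TM-score-style field, the thresholds `min_tm` and `max_identity`, and the candidate names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    tm_score: float      # structural similarity to the reference, 0..1 (assumed metric)
    seq_identity: float  # fraction of identical residues vs. the reference, 0..1


def select_candidates(candidates, min_tm=0.8, max_identity=0.4):
    """Keep structurally similar but sequence-divergent candidates.

    Thresholds are illustrative defaults, not values from the source.
    """
    picked = [
        c for c in candidates
        if c.tm_score >= min_tm and c.seq_identity <= max_identity
    ]
    # Rank the survivors by structural similarity, best first.
    return sorted(picked, key=lambda c: c.tm_score, reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for illustration only.
    pool = [
        Candidate("cand_A", 0.90, 0.30),
        Candidate("cand_B", 0.95, 0.70),
        Candidate("cand_C", 0.85, 0.20),
        Candidate("cand_D", 0.50, 0.10),
    ]
    for c in select_candidates(pool):
        print(c.name, c.tm_score, c.seq_identity)
```

Filtering on both axes at once is what separates this from a plain sequence search: a candidate with high sequence identity is likely already known, while one with low structural similarity is unlikely to share the fold.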

![Fig.4](../img/fig4.png)

<center>Fig.4 Novel extremozyme discovery using iExtreme and structure-based clustering.</center>

## 5.Developing a droplet-based PANCE method for the directed evolution of DPEase
We developed a droplet-based phage-assisted non-continuous evolution (PANCE) method for the directed evolution of D-psicose 3-epimerase (DPEase). We designed a microfluidic chip to encapsulate M13 phages and E. coli host strains in droplets, ensuring each droplet contained one phage. Using this method, we successfully evolved Clo-DPEase by linking DPEase activity to the expression of M13 phage gene gIII in response to D-allulose concentration.
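
The text does not describe how single-phage occupancy was achieved. As a rough sketch, if phages were loaded into droplets at random, occupancy would follow a Poisson distribution with mean `lam` phages per droplet. The Poisson assumption, the function names, and the example `lam` values below are illustrative and are not taken from the source.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k phages in a droplet under Poisson loading."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def droplet_occupancy(lam: float) -> dict:
    """Fractions of empty, single-phage, and multi-phage droplets."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


if __name__ == "__main__":
    # Hypothetical mean loadings, chosen only to show the trade-off.
    for lam in (0.1, 0.5, 1.0):
        print(lam, droplet_occupancy(lam))
```

Under this assumption, lowering `lam` reduces multi-phage droplets at the cost of more empty ones, which is the usual trade-off when tuning the phage concentration for encapsulation.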

We further explored physical mutagenesis techniques, including ARTP and microwave irradiation, to accelerate the evolution process, which reached a plateau by rounds 15-20. Additionally, we applied fluorescence-activated droplet sorting (FADS) to select high-activity phage mutants, speeding up the PANCE process two-fold.

As a result, we isolated a Clo-DPEase mutant, Phe246Tyr (Clo-DPEasemut), which exhibited a 1.9-fold higher catalytic efficiency (kcat/Km) than the wild type. The mutant also showed enhanced stability and a longer half-life, retaining a conversion rate above 28.5% after four uses in the industrial production of allulose, with a peak yield of 265 g/L. These findings highlight the potential of Clo-DPEasemut for industrial applications.
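
For readers unfamiliar with the metric, the fold change in catalytic efficiency is the ratio of kcat/Km for the mutant to kcat/Km for the wild type. The sketch below shows only the arithmetic. The kinetic values in it are placeholders chosen for illustration, not the measured parameters of Clo-DPEase or its mutant, and the function names are hypothetical.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat/Km; units follow the inputs (e.g. s^-1 / mM)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def fold_change(mutant_eff: float, wild_type_eff: float) -> float:
    """Ratio of mutant to wild-type catalytic efficiency."""
    if wild_type_eff <= 0:
        raise ValueError("wild-type efficiency must be positive")
    return mutant_eff / wild_type_eff


if __name__ == "__main__":
    # Placeholder kinetic parameters, NOT measured values from the source.
    wild_type = catalytic_efficiency(kcat=100.0, km=10.0)
    mutant = catalytic_efficiency(kcat=152.0, km=8.0)
    print(fold_change(mutant, wild_type))
```

A higher ratio can come from a larger kcat, a smaller Km, or both, so the fold change alone does not say which parameter the mutation affected.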

![Fig.5](../img/fig5.png)

<center>Fig.5 Establishment and optimization of PANCED for directed evolution of DPEase.</center>


