Metagenomic analysis to identify species composition of Fraxinus excelsior, Chalara fraxinea and infected material samples.
Christine Sambles and David Studholme. University of Exeter, Devon.
To check their species composition, we performed a metagenomic analysis on several datasets generated from samples described as Fraxinus excelsior, Chalara fraxinea or mixed infected material. We used MEGAN (v4, MEtaGenome Analyzer; Huson et al., 2011) and the assembled transcripts from F. excelsior and C. fraxinea to identify the taxonomic groups to which transcripts from the uninfected samples are allocated, for use as a reference for binning. We identified many transcripts specific to the infected samples (not in uninfected Fraxinus) that fell outside the Chalara and Fraxinus bins; these may indicate additional microbial species present during infection. These might include species acting synergistically with C. fraxinea during infection or opportunists present as a secondary consequence of infection. The uninfected F. excelsior sample identifies species that are part of the ‘normal’ or ‘healthy’ tree microbiota, which could therefore be excluded from the list of infection-related species.
Transcriptome assemblies:
F. excelsior: ATU1
C. fraxinea: KW1
Mixed material: AT1, AT2, Upton, Holt
BLASTX against GenBank:
F. excelsior: ATU1
C. fraxinea: KW1
Mixed material: AT1, AT2, Upton, Holt
We identified sequence similarity between assembled transcripts and GenBank protein sequences using BLASTX; we used as queries the transcripts from uninfected F. excelsior (ATU1), a C. fraxinea isolate (KW1) and four mixed material samples (AT1, AT2, Upton and Holt). We loaded the output from BLASTX into MEGAN and performed taxonomic binning using a minimum support value of 35, a minimum BLAST score of 50 and only retaining hits whose bit scores lie within 10% of the best score. The analyses were normalised, compared and rendered within MEGAN.
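The filtering thresholds described above can be illustrated with a minimal Python sketch. This is not MEGAN's implementation: it assumes each BLASTX hit has already been reduced to a hypothetical `(transcript, lineage, bit_score)` tuple, uses abbreviated illustrative lineages, and places each transcript at the lowest common ancestor (LCA) of its retained hits, which is the general approach MEGAN takes. The minimum support threshold of 35 is not modelled here.

```python
from collections import defaultdict

MIN_SCORE = 50.0    # minimum BLAST bit score (value from the text)
TOP_PERCENT = 10.0  # keep hits within 10% of the best bit score (value from the text)


def filter_hits(hits, min_score=MIN_SCORE, top_percent=TOP_PERCENT):
    """Group hits by transcript and keep the lineages of the strong ones."""
    by_query = defaultdict(list)
    for query, lineage, score in hits:
        if score >= min_score:
            by_query[query].append((lineage, score))
    kept = {}
    for query, query_hits in by_query.items():
        best = max(score for _, score in query_hits)
        cutoff = best * (1.0 - top_percent / 100.0)
        kept[query] = [lin for lin, score in query_hits if score >= cutoff]
    return kept


def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all root-to-leaf lineages."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared[-1] if shared else "root"


def bin_transcripts(hits):
    """Assign each transcript to the LCA of its retained hits."""
    return {q: lowest_common_ancestor(lins) for q, lins in filter_hits(hits).items()}


# Hypothetical hits with abbreviated lineages, for illustration only.
example_hits = [
    ("t1", ("Eukaryota", "Fungi", "Leotiomycetes", "Helotiales"), 200.0),
    ("t1", ("Eukaryota", "Fungi", "Leotiomycetes", "Helotiales"), 195.0),
    ("t1", ("Eukaryota", "Viridiplantae"), 60.0),   # more than 10% below the best hit
    ("t2", ("Eukaryota", "Fungi", "Leotiomycetes", "Helotiales"), 100.0),
    ("t2", ("Eukaryota", "Fungi", "Sordariomycetes"), 95.0),
    ("t3", ("Eukaryota", "Viridiplantae"), 40.0),   # below the minimum score
]
bins = bin_transcripts(example_hits)
```

In this toy input, `t1` is binned to Helotiales, `t2` falls back to Fungi because its two retained hits disagree below that level, and `t3` receives no assignment because its only hit scores below 50.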
For each of the six samples, we identified the numbers and percentages of transcripts assigned to Helotiales (the order in which C. fraxinea is classified) and to Viridiplantae (green plants). The results are summarised in Table 1. Transcripts from the (nominally) C. fraxinea isolate KW1 were binned into the classes Dothideomycetes, Eurotiomycetes, Leotiomycetes and Sordariomycetes, all of which belong to the subphylum Pezizomycotina. This result is consistent with the sequenced sample being pure Chalara. As expected, all of the binnable transcripts from F. excelsior ATU1 fell within the kingdom Viridiplantae, specifically within the flowering plants (Magnoliophyta).

| Sample | Sample Type | Helotiales | Viridiplantae | Non H/V |
| --- | --- | --- | --- | --- |
| ATU1 | F. excelsior | 0% | 79% | 21% |
| KW1 | C. fraxinea | 32% | 0% | 68% |
| AT1 | Mixed material | 16% | 47% | 37% |
| AT2 | Mixed material | 8.9% | 54% | 37% |
| Upton | Mixed material | 13% | 13% | 74% |
| Holt | Mixed material | 6.6% | 32% | 61% |

Table 1: Percentage of transcripts binned to Helotiales and Viridiplantae in normalised comparison for each sample.
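
The three percentage columns can be derived from per-transcript assignments as in the following sketch. It assumes a hypothetical list of lineages, one per transcript, with an empty tuple standing for a transcript that was unassigned or had no hit, so that such transcripts count towards the "Non H/V" column. The counts are invented for illustration and are not taken from any of the samples.

```python
def summarise_bins(lineages):
    """Return percentages of transcripts in Helotiales, Viridiplantae and neither."""
    total = len(lineages)
    if total == 0:
        raise ValueError("no transcripts to summarise")
    helotiales = sum("Helotiales" in lineage for lineage in lineages)
    viridiplantae = sum("Viridiplantae" in lineage for lineage in lineages)
    other = total - helotiales - viridiplantae
    return {
        "Helotiales": 100 * helotiales / total,
        "Viridiplantae": 100 * viridiplantae / total,
        "Non H/V": 100 * other / total,
    }


# Hypothetical assignments for ten transcripts (illustrative only).
toy_lineages = (
    [("Eukaryota", "Fungi", "Leotiomycetes", "Helotiales")] * 2
    + [("Eukaryota", "Viridiplantae", "Magnoliophyta")] * 5
    + [("Eukaryota", "Oomycetes")] * 2
    + [()]  # unassigned or no BLAST hit
)
summary = summarise_bins(toy_lineages)
```

Here `summary` is 20% Helotiales, 50% Viridiplantae and 30% Non H/V. The sketch does not reproduce the normalisation that MEGAN applies when comparing samples.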
In the data from the Upton mixed material, 74% of the transcripts were not binned within Helotiales or Viridiplantae, which is where C. fraxinea and F. excelsior transcripts are expected to fall, based on the results from the pure isolate of C. fraxinea and the uninfected F. excelsior. In the Upton data, 34% of all transcripts were assigned to Oomycetes, specifically 33% to Phytophthora spp. Additionally, 13% were not assigned to any taxon and a further 11% had no significant similarity, detectable by BLAST, to proteins in the GenBank database. The presence of Phytophthora spp. might be attributed to cross-lane contamination during sequencing, since the Norwich laboratory handling the Upton data also works with Phytophthora infestans. Similar contamination was also reported for Fera samples, in which Maize Chlorotic Mottle Virus (MCMV) and Sugarcane Mosaic Virus (SMV) sequences were present. This is a common problem in Illumina sequencing, which has led to the incorporation of a taxonomic binning step into sequencing pipelines, including at our own sequencing facility at the University of Exeter. These contamination issues highlight the importance of confirming the taxonomic distribution of sequence data, in addition to quality checks, before performing any downstream analyses. Once identified, the contaminant reads can be removed from the dataset.
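
As a rough illustration of that last step, the sketch below drops assembled sequences whose assigned lineage contains a contaminant taxon. It is a simplified stand-in, not the procedure used for these datasets: the FASTA text, the sequence names, the abbreviated lineages and the `assignments` mapping are all hypothetical, and removing raw reads (rather than assembled transcripts) would additionally require mapping reads to the flagged sequences.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the identifier
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def remove_contaminants(records, assignments, contaminant_taxa):
    """Keep sequences whose lineage contains none of the contaminant taxa."""
    contaminant_taxa = set(contaminant_taxa)
    return {
        name: seq
        for name, seq in records.items()
        if not contaminant_taxa & set(assignments.get(name, ()))
    }


# Hypothetical input, for illustration only.
fasta_text = """>t1 example transcript
ATGGCC
TTAA
>t2
ATGCGT
>t3
GGGTTT
"""
assignments = {
    "t1": ("Eukaryota", "Viridiplantae"),
    "t2": ("Eukaryota", "Oomycetes", "Phytophthora"),
    # t3 has no assignment and is therefore retained
}
records = parse_fasta(fasta_text)
cleaned = remove_contaminants(records, assignments, {"Phytophthora"})
```

Unassigned sequences are kept in this sketch, on the assumption that the absence of a taxonomic assignment is not evidence of contamination.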
Analysis of the Holt mixed material sample showed 1.5% of transcripts binned to Togninia minima, an ascomycete in the order Calosphaeriales. T. minima is a pathogen of grapevines and Prunus spp.; however, the closely related T. fraxinopennsylvanica (anamorph: Phaeoacremonium mortoniae) has been observed in dead vascular tissue of declining ash tree branches (Fraxinus latifolia) in California (Eskalen et al., 2005a; Eskalen et al., 2005b). It may be that T. fraxinopennsylvanica is present in the Holt material and that the transcripts were assigned to T. minima because it is the most closely related species for which extensive sequence data are available.
For the nominally pure sample of C. fraxinea isolate KW1, only 32% of transcripts were assigned to Helotiales. However, 66% of transcripts were assigned to Fungi, 16% were not assigned to any taxon and 17% had no significant similarity to proteins in the GenBank database. The low proportion assigned to Helotiales is likely to be due to insufficient sequence data in GenBank from Chalara and closely related species.
[Fig 1](http://figshare.com/articles/Metagenomic_analysis_of_Fraxinus_excelsior_Chalara_fraxinea_and_four_infected_material_samples_AT1_AT2_Upton_and_Holt_/807684): Metagenomic analysis of Fraxinus excelsior, Chalara fraxinea and four infected material samples (AT1, AT2, Upton and Holt).
Further analysis using alignments to the F. excelsior and C. fraxinea genomes will help to determine whether other taxa indicated by the MEGAN analysis, such as Togninia sp., are genuinely present or whether they reflect transcripts mis-assigned because of the lack of Chalara- and Fraxinus-related proteins in the database.