A recent post highlighted issues with analyzing fungal ITS data, and that inspired my labmate Sydney Glassman and me to share our experiences with using amplicons to characterize fungal communities. We are very excited that people are interested in delving into the wonderful world of fungi, and we wish to share our love of mycology with others! Fungi can have really interesting ecologies, and the field has recently had a lot of success developing tools for studying environmental fungi.
- First, the general issues. While ITS is the universal barcode for fungi, there are, of course, problems with it and with the primers used to amplify it. There are multiple, potentially different, copies of ITS per species, and the typical primers show taxonomic bias in what they target; for example, Cantharellus may be lost. Plus, there are two variable sections within the ITS region, ITS1 and ITS2. Given the read lengths of current sequencing technologies, only one is typically targeted, and there is debate over whether it is best to amplify ITS1 or ITS2. Our group amplifies ITS1, and we have had good luck with it. The primers we use, as well as the specifics of the bioinformatics pipeline that we employ, can be found in a recent paper by Smith and Peay.
- ITS is variable in length, which can make it trickier to merge paired reads than it is for, say, 16S. We remove priming/adapter sites and low-quality sequences from the ends of reads, and we have found that this greatly increases the number of reads that can be paired.
- The variable nature of ITS precludes any sort of alignment across broad groups of fungi, and thus fungal analyses are taxonomic rather than phylogenetic (i.e. no UniFrac). There are efforts afoot to change that, but currently, if you are interested in doing phylogenetics, you would need to target a region other than ITS, most often the ribosomal small subunit (18S). Meanwhile, for analyzing ITS, many of the default settings in QIIME, for instance, use a phylogenetically informed process, so it's important to use flags that skip the alignment and tree-building steps.
- The ITS database typically used for fungi is UNITE, which began as a database of ectomycorrhizal fungi. While it has expanded greatly over the years, there are still many lineages of fungi yet to be represented or named (beyond “uncultured environmental clone”) in the database. [AMENDED] While the UNITE database is the best resource out there for characterizing fungi, and it continues to improve, some form of de novo (or open) OTU picking strategy probably makes sense for most environmental fungal studies.
- [EDITED] After taxonomic assignment, many OTUs may be unassigned (for example, they appear as “No Blast Hit”). In our experience, when we BLAST these OTUs by hand on GenBank, they are spurious OTUs (for example, bacteriophage or chimeras). So, we tend to remove OTUs that are unassigned after taxonomic identification.
- Some fungi have two taxonomic names: one from when they were described in their sexual stage and another from the asexual stage, both given before molecular approaches revealed that this was one species, not two. This legacy remains in the database, so the richness of fungi can seem inflated. For example, the sexual stages of some Aspergillus species were named as Eurotium, and both of these genera appear in the database.
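
The last two points can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our actual pipeline: the function names, the toy OTU table, and the semicolon-delimited lineage strings with a `g__` genus prefix are all made up for the example, and the unassigned labels should be matched to whatever your taxonomy-assignment step actually writes out. The only synonym pair taken from the text above is Eurotium/Aspergillus.

```python
from collections import Counter

# Labels that mark an OTU as unassigned. These strings are assumptions;
# match them to the output of your own taxonomy-assignment step.
UNASSIGNED_LABELS = {"no blast hit", "unassigned"}

# Hypothetical synonym table: legacy genus name -> name to merge it under.
# Only the Eurotium/Aspergillus pair comes from the post itself.
GENUS_SYNONYMS = {"Eurotium": "Aspergillus"}


def drop_unassigned(otu_taxonomy):
    """Return a copy of the OTU -> taxonomy mapping without unassigned OTUs."""
    return {
        otu: tax
        for otu, tax in otu_taxonomy.items()
        if tax.strip().lower() not in UNASSIGNED_LABELS
    }


def genus_of(taxonomy):
    """Pull the genus out of a 'k__...;p__...;g__...' lineage, or return None."""
    for rank in taxonomy.split(";"):
        rank = rank.strip()
        if rank.startswith("g__") and len(rank) > 3:
            return rank[3:]
    return None


def genus_counts(otu_taxonomy, synonyms=GENUS_SYNONYMS):
    """Count OTUs per genus after collapsing known synonyms."""
    counts = Counter()
    for tax in otu_taxonomy.values():
        genus = genus_of(tax)
        if genus is not None:
            counts[synonyms.get(genus, genus)] += 1
    return counts


# Toy data, invented for illustration only.
example = {
    "OTU_1": "k__Fungi;p__Ascomycota;g__Aspergillus",
    "OTU_2": "k__Fungi;p__Ascomycota;g__Eurotium",
    "OTU_3": "No blast hit",
    "OTU_4": "k__Fungi;p__Basidiomycota;g__Cantharellus",
}

kept = drop_unassigned(example)
counts = genus_counts(kept)
print(sorted(kept))
print(dict(counts))
```

Collapsing synonyms before counting keeps a single lineage from being tallied as two genera, but the synonym table has to be curated by hand, so it is worth checking each pair against the current taxonomy before merging.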
We hope that sharing our experiences facilitates the inclusion of fungi across broad ecological settings.