In just over a decade, metagenomics has developed into a powerful and productive method in microbiology and microbial ecology. The ability to retrieve and organize bits and pieces of genomic DNA from any natural context has opened a window into the vast universe of uncultivated microbes. Tremendous progress has been made in computational approaches to interpret these sequence data, but none can completely recover the complex information encoded in metagenomes.
A number of challenges stand in the way. Simplifying assumptions are needed and lead to strong limitations and potential inaccuracies in practice. Critically, methodological improvements are difficult to gauge due to the lack of a general standard for comparison. Developers face a substantial burden to individually evaluate existing approaches, which consumes time and computational resources, and may introduce unintended biases.
The Critical Assessment of Metagenome Interpretation (CAMI) is a new community-led initiative designed to help tackle these problems by aiming for an independent, comprehensive and bias-free evaluation of methods. We are making extensive high-quality unpublished metagenomic data sets available for developers to test their short read assembly, binning and taxonomic profiling methods. The results of CAMI will provide exhaustive quantitative measurements of tool performance to serve as a guide to users under different scenarios, and to help developers identify promising directions for future work.
The competition is tentatively scheduled to open at the beginning of 2015. Key data sets are being generated, and CAMI is currently seeking additional data contributors to provide genomes of deep-branching lineages for data set generation. Contest participants can already register on the submission website, download test data sets, and upload their predictions for these.
To facilitate future benchmarking endeavours and the assessment of novel or altered software, reproducibility of the results is an important point on the agenda of CAMI. Contest participants are therefore encouraged to additionally submit, in a Docker container, the software that was used to generate their results. Among participants submitting reproducible results, CAMI will award a prize of 1000 Euros to three randomly chosen contestants with a submission performing better than a baseline. The results will be presented and discussed in a workshop a few months after the competition. For all reproducible contributions with permissions provided, a joint publication of the generated insights together with all CAMI contest participants and data contributors is planned.
If you are interested in participating, subscribe to the newsletter on the CAMI homepage and we will keep you updated. Also, please take our survey so that we get a better idea of your needs in terms of compute resources and reference databases.
Alice McHardy*, Tanja Woyke, Eddy Rubin, Nikos Kyrpides, Paul Schulze-Lefert, Julia Vorholt, Nicole Shapiro, Hans-Peter Klenk, Stephan Majda, Johannes Droege, Ivan Gregor, Peter Hofmann, Eik Dahms, Jessika Fiedler, Ruben Garrido-Oter, Yang Bai, Girish Srinivas, Phil Blood, Mihai Pop, Aaron Darling, Matthew DeMaere, Dmitri Turaev, Chris Hill, Peter Belmann, Andreas Bremges, Liren Huang, Thomas Rattei*, Alexander Sczyrba*
- CAMI Info Website: http://www.cami-challenge.org
- CAMI Data Submission Website: https://data.cami-challenge.org
- CAMI presented at the UMC meeting at the Isaac Newton Institute, Cambridge, UK.
- Twitter: @CAMI_challenge