We proudly announce the start of the second round of challenges of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI) and the release of the official challenge data sets!
Over the last two years, we received valuable feedback from the community on important challenges in the field and on how to design interesting new data sets and challenges. We incorporated many of your suggestions, thanks again! To help you familiarize yourself with the data set types and formats, example data sets with accompanying standards of truth have been made available over the last few months. Two multi-sample “toy” data sets, representing microbial communities from different human body sites and from mouse gut, are provided to allow participants to prepare for the challenges (https://data.cami-challenge.org/participate). These practice data sets are generated from known genomes, so reference-based methods (e.g., those using genome databases for their analysis) might perform better here than on real shotgun metagenomic data, where a substantial portion of microbial community members have not been sequenced.
The second CAMI challenge data sets will therefore again include new genomes from taxa (at different evolutionary distances) not found in public databases. Furthermore, a new focus will be on establishing the value of long sequencing reads for microbiome research, with data sets providing both long- and short-read data. Lastly, a clinical pathogen discovery challenge will be offered, mimicking an emergency diagnostic situation in the clinic.
Specifically, the second round of CAMI challenges comprises a metagenome assembly, a genome binning, a taxonomic binning and a taxonomic profiling challenge, across several multi-sample data sets from different environments. These include a marine data set and a high-strain diversity data set. A third data set and a pathogen detection challenge on a clinical sample will follow later (challenge dates will be announced).
We are looking forward to receiving your submissions!
The CAMI Team
CAMI II offers several challenges: an assembly, a genome binning, a taxonomic binning and a taxonomic profiling challenge, on several multi-sample data sets from different environments, including long and short read data. This includes a marine data set and a high-strain diversity data set, with a third data set to follow later. A pathogen detection challenge on a clinical sample will also follow later.
Assembly challenge: takes as input all read samples of a given data set and returns a cross-sample assembly. Assembly results can be submitted for short read data OR long read data, OR both data types combined. For methods incapable of handling the entire data set, the FIRST TEN samples of a data set can be assembled and a ten-sample cross-assembly submitted. The assembly challenge closes early: it ends after three months, when the gold standard assembly is released!
Profiling challenge: takes as input multiple read samples of a given data set and returns taxonomic profiles for all individual samples and one for the entire data set. This challenge closes after the second challenge period has ended.
Genome binning challenge: takes as input reads, or gold standard assemblies, or assemblies provided by CAMI after three months, for every sample individually. It returns a genome bin assignment for the analysed reads or contigs in CAMI format for every sample in a data set.
Taxon binning challenge: takes as input reads, or gold standard assemblies, or assemblies provided by CAMI after three months. It returns a taxon bin assignment for the analysed reads or contigs in CAMI format for every sample in a data set.
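To make the shape of a bin assignment concrete, here is a minimal Python sketch that writes one sample's assignments as a tab-separated table. The header lines (`@Version`, `@SampleID`, `@@SEQUENCEID`) and the version string are assumptions modeled on the Bioboxes binning format and are not taken from this announcement; the function names and sample identifiers are hypothetical. Please check the CAMI standards (FAQ) for the authoritative specification before submitting.

```python
import io


def write_binning(sample_id, assignments, out, column="BINID"):
    """Write (sequence ID, bin) pairs for one sample as a tab-separated table.

    ASSUMPTION: the header layout and the version string follow the Bioboxes
    binning format as recalled here; verify both against the CAMI FAQ.
    Use column="BINID" for genome binning or column="TAXID" for taxon binning.
    """
    if column not in ("BINID", "TAXID"):
        raise ValueError("column must be 'BINID' or 'TAXID'")
    out.write("@Version:0.9.1\n")            # placeholder version (assumption)
    out.write(f"@SampleID:{sample_id}\n")    # one file per sample
    out.write(f"@@SEQUENCEID\t{column}\n")   # column header line
    for seq_id, bin_id in assignments:
        out.write(f"{seq_id}\t{bin_id}\n")   # one read or contig per line


def render_binning(sample_id, assignments, column="BINID"):
    """Convenience wrapper returning the file content as a string."""
    buf = io.StringIO()
    write_binning(sample_id, assignments, buf, column)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical contig and bin names, for illustration only.
    print(render_binning("sample_0", [("contig_1", "bin_a"), ("contig_2", "bin_b")]))
```

Writing one file per sample mirrors the requirement that assignments are returned for every sample in a data set.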
The first challenges - metagenome assembly, taxonomic profiling, and taxonomic or genome binning of raw read data - start on January 16th, 2019. For taxonomic profiling and taxonomic or genome binning methods that use assembled data, assemblies will be provided on April 1st, 2019. The assembly challenge will close on Sunday, March 31st, 2019. All other challenges close on Sunday, June 30th, 2019.
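For participants who script their workflow, the timeline above can be captured in a few lines of Python. This is only a sketch: the dates come from the paragraph above, while the function names and the choice to treat each closing date as inclusive are assumptions.

```python
from datetime import date

START = date(2019, 1, 16)                # first challenges open
ASSEMBLY_CLOSE = date(2019, 3, 31)       # assembly challenge closes
ASSEMBLIES_PROVIDED = date(2019, 4, 1)   # assemblies become available as input
OTHER_CLOSE = date(2019, 6, 30)          # all other challenges close

CLOSING_DATES = {
    "assembly": ASSEMBLY_CLOSE,
    "genome binning": OTHER_CLOSE,
    "taxonomic binning": OTHER_CLOSE,
    "taxonomic profiling": OTHER_CLOSE,
}


def open_challenges(today):
    """Return the challenges open on `today` (closing day counted as open)."""
    if today < START:
        return []
    return [name for name, close in CLOSING_DATES.items() if today <= close]


def assemblies_available(today):
    """True once the provided assemblies can be used as input."""
    return today >= ASSEMBLIES_PROVIDED


if __name__ == "__main__":
    print(open_challenges(date(2019, 4, 15)))
    print(assemblies_available(date(2019, 4, 15)))
```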
To get an alert when contests start, you can follow @CAMI_challenge on Twitter.
We will announce when result submission to the CAMI platform opens. We are looking forward to receiving your reproducible results! CAMI will present all submitted results in anonymized form and indicate their performance using a range of metrics in comparison to other tools. Submitted results should be reproducible: please provide the software together with the exact database versions and parameter settings used.
Tools can be submitted in one of the following ways:
The output format must conform to the CAMI standards (FAQ) to allow automatic benchmarking of results. For software using large custom databases, please contact the CAMI team before providing it.
We wish to include all participants as co-authors on the joint CAMI publication, provided that they consent and that their results are reproducible. We also cordially invite you to the CAMI evaluation meeting taking place at the Microbiome COSI session at ISMB in Basel, Switzerland (July 21st-25th) next year. Mark the date and follow COSI updates if you wish to submit your related work for a short talk or poster presentation, or contact us if you would like to work with us on the final evaluation metrics and become part of the CAMI team.
Please fill in and submit the following form to get access to the CAMI 2 data sets. We will send you an email with download instructions.