Computational Challenge

Objective of the Challenge

(You can find a summary of this content in our Webinar.)

The Challenge, organized as part of the sbv IMPROVER project, aims to investigate the diagnostic potential of metagenomics data to discriminate patients with inflammatory bowel disease (IBD), which includes ulcerative colitis (UC) and Crohn’s disease (CD), from non-IBD subjects and, within IBD subjects, to separate UC from CD (Figure 4).

Figure 4. Overview of the Metagenomics Diagnosis for IBD challenge.

The Challenge is articulated into two sub-challenges:

  1. In the first sub-challenge (“MEDIC RAW”), you are provided with shotgun metagenomics sequencing reads from fecal samples of human subjects diagnosed with IBD (CD or UC) and of subjects without IBD (non-IBD). You can process the metagenomics data with your own analysis pipeline and classify the samples.
  2. In the second sub-challenge (“MEDIC PROCESSED”), you are provided with taxonomic and pathway abundance matrices resulting from the processing of raw metagenomics sequencing reads, enabling you to participate in the Challenge without having to process raw metagenomics data.

You can choose to participate in either one or both sub-challenges.

Participants' Specific Tasks

1. CLASS PREDICTIONS - For each sample, you will provide a confidence value in the interval [0,1], with 1 being the highest confidence that the sample belongs to "class 1", in the following comparisons:

  • IBD (class 1) vs non-IBD (class 2)
  • UC (class 1) vs non-IBD (class 2)
  • CD (class 1) vs non-IBD (class 2)
  • UC (class 1) vs CD (class 2)

To train your predictor models, you can use raw (sub‑challenge 1) or processed (sub‑challenge 2) shotgun metagenomics data from two independent published studies, both available to registered users in their Dashboard section. You can also use any additional private/public datasets you find suitable.

2. FEATURES/SIGNATURE - You will provide the list of selected features (a subset of TaxIDs or PathIDs) used in your classification prediction model(s) applied to the test dataset and, optionally, their associated importance values, if available (e.g. variable importance for features obtained using random forest-based approaches).
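A feature list with optional importance values could be assembled along the following lines. This is a sketch under stated assumptions: the feature identifiers and importance values below are made-up placeholders, and `rank_features` is a hypothetical helper. The actual file layout is given by the submission templates.

```python
from typing import Dict, List, Optional, Tuple


def rank_features(
    importances: Dict[str, float], top_n: Optional[int] = None
) -> List[Tuple[str, float]]:
    """Sort features by decreasing importance, breaking ties by identifier."""
    ranked = sorted(importances.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked if top_n is None else ranked[:top_n]


# Placeholder identifiers and importance values, for illustration only.
example_importances = {
    "TaxID_0001": 0.12,
    "PathID_0007": 0.40,
    "TaxID_0042": 0.0,
    "TaxID_0003": 0.40,
}

# Keep only features that actually contribute to the model.
selected = [(f, w) for f, w in rank_features(example_importances) if w > 0]

for feature_id, weight in selected:
    print(feature_id, weight)
```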

3. METHOD DESCRIPTION - You will describe your approaches, providing sufficient information to allow reproducibility. Your write-up should include:

  • A description of the feature extraction method(s).
  • A description of the classification/machine learning approach(es) used.

The description should include publication references if the method is published, the software and version, the software parameters used, the parameters/coefficients of the model applied for prediction, and a description of the metagenomics pipeline used to process the raw shotgun sequencing data.

A write-up template provides guidance regarding the minimal information needed.

How to Submit your Results on the sbv IMPROVER Website? 

More details on the data and information required for submissions (e.g. format templates for the various submission files and for the write-up, including the minimum information required to allow reproducibility) are provided in Section 6, "Participants’ tasks and submission on the sbv IMPROVER challenge website", of the Technical document, as well as in the related sections of the Challenge Rules.

Datasets Provided for the Challenge

The organizers provide participants with shotgun metagenomics sequencing data, as raw and processed data, for predictor model training and testing, as described in the Technical Document and shown below in Figure 5.

Figure 5. Schematic view of the shotgun metagenomic sequencing datasets provided for training and testing, including (i) raw data (FASTQ files) for sub-challenge 1 and (ii) processed data for sub-challenge 2 in the form of taxonomy and pathway abundance matrices.


More details are provided in Section 5 "Data files for download" of the Technical document.

Incentives and Awards

Thanks to your active participation in the Challenge, you will:

  • Showcase your talents as a data scientist.
  • Help the scientific community to benchmark computational methods objectively and establish standards and best practices in computational metagenomics analysis.
  • Gain early access to new benchmarking datasets.
  • Receive an independent assessment of your method(s).
  • Grow your professional network by engaging with researchers from around the world.

In addition, you will have the opportunity to receive the following Incentives (as defined in the Challenge Rules):

  • Win a cash prize worth 2,000 USD that will be awarded to the three best performing teams of each sub-challenge (see Rules & Awards section).
  • Contribute to writing peer-reviewed scientific article(s) describing the outcome of the Challenge. 


Get Started

Who can participate?

Anyone (refer to the “Eligibility” section of the Challenge Rules) with an interest in computational sciences (e.g. data analysis, metagenomics data, machine learning and artificial intelligence) can participate. Participants can either join the Challenge individually or team up with other participants and join as a team. Submissions will be ranked at the team level, but only the individual participants are eligible to receive the incentive described in section 8 below. Your team can make multiple submissions. The best submission will be retained for the final scoring. Please note that you will not receive any intermediary assessment of your Submission(s). Your participation in the Challenge is governed by the applicable terms and conditions, including the Challenge Rules and Terms of Use.

How and When?

The Challenge was open from September 2019 to February 29th, 2020. This Challenge is now closed.

To participate, please register on the sbv IMPROVER website, and create a team or join an existing one.

For any questions, please contact us.