Table of Contents
1. Workshop Schedule - August 23, 2020
Note: the time zone is UTC+1.
We will host two live sessions with panel discussion during the second one.
0:00am and 4:00pm | COCO introduction and results [Talk] | |
---|---|---|
0:15am and 4:15pm | COCO keypoints winner [Talk] | Team XForwardAI |
0:30am and 4:30pm | LVIS introduction and results [Talk] | Agrim Gupta |
0:40am and 4:40pm | LVIS winner [Talk] | Team LvisTraveler |
0:55am and 4:55pm | LVIS most innovative [Talk] | Team Asynchronous SSL |
1:10am and 5:10pm | LVIS spotlight [Talk][Video] | Team MMDet |
1:15am and 5:15pm | LVIS spotlight [Talk][Video] | Team CenterNet2 |
1:20am and 5:20pm | LVIS spotlight [Talk][Video] | Team PAI Vision |
1:25am and 5:25pm | LVIS spotlight [Talk][Video] | Team Innova |
5:30pm | Panel: The future of computer vision datasets, benchmarks, and challenges | Award committee |
2. Overview
The goal of the joint COCO and LVIS Workshop is to study object recognition in the context of scene understanding. This workshop will host the COCO suite of challenges and a new challenge on large vocabulary instance segmentation (LVIS). While both the COCO and LVIS challenges look at the general problem of visual recognition, the specific tasks in the challenges probe different aspects of the problem.
COCO is a widely used visual recognition dataset, designed to spur object detection research with a focus on full scene understanding. In particular: detecting non-iconic views of objects, localizing objects in images with pixel level precision, and detection of objects in complex scenes. The COCO dataset includes 330K images of complex scenes exhaustively annotated with 80 object categories with segmentation masks, 91 stuff categories with segmentation masks, person keypoint annotations, and 5 captions per image.
Large Vocabulary Instance Segmentation (LVIS) includes high-quality instance segmentations for more than 1000 entry-level object categories. The LVIS dataset contains a long-tail of categories with few examples, making it a distinct challenge from COCO and exposes shortcomings and new opportunities in machine learning. We expect this dataset to inspire new methods in the detection research community. This year we plan to host the first challenge for LVIS, a new large vocabulary dataset.
3. Challenge Dates
4. Organizers
4.1. COCO
- Alexander Kirillov (FAIR)
- Tsung-Yi Lin (Google Research)
- Yin Cui (Google Research)
- Matteo Ruggero Ronchi (California Institute of Technology)
- Natalia Neverova (FAIR)
- Vasil Khalidov (FAIR)
- Ross Girshick (FAIR)
- Piotr Dollar (FAIR)
4.2. LVIS
- Agrim Gupta (Stanford)
- Ross Girshick (FAIR)
4.3. Award committee
- Yin Cui (Google Research)
- Tsung-Yi Lin (Google Research)
- Alexander Kirillov (Facebook AI Research)
- Natalia Neverova (Facebook AI Research)
- Matteo Ruggero Ronchi (Caltech)
- Michael Maire (University of Chicago)
- Lubomir Bourdev (WaveOne, Inc.)
- James Hays (Georgia Tech)
- Larry Zitnick (Facebook AI Research)
- Ross Girshick (Facebook AI Research)
- Piotr Dollár (Facebook AI Research)
5. Rules and Awards
- Participants must submit a technical report that includes a detailed ablation study of their submission via CMT. For the technical report, use the following ECCV-based template. Suggested length of the report is 2-7 pages. The reports will be made public. This report will substitute the short text description that we requested previously. Only submissions with the report will be considered for any award and will be put in the COCO leaderboard.
- This year for each challenge track we will have two different awards: best result award and most innovative award. The most innovative award will be based on the method description in the submitted technical reports and decided by the COCO award committee. The commitee will invite teams to present at the workshop based on the innovations of the submissions rather than the best scores.
- This year we will award single best paper award for the most innovative and successful solution across all challenges. The winner will be determined by the workshop organization committee.
6. COCO Challenges
COCO is an image dataset designed to spur object detection research with a focus on detecting objects in context. The annotations include instance segmentations for object belonging to 80 categories, stuff segmentations for 91 categories, keypoint annotations for person instances, and five image captions per image. The specific tracks in the COCO 2018 Challenges are (1) object detection with segmentation masks (instance segmentation), (2) panoptic segmentation, (3) person keypoint estimation, and (4) DensePose. We describe each next. Note: neither object detection with bounding-box outputs nor stuff segmentation will be featured at the COCO 2020 challenge (but evaluation servers for both tasks remain open).
6.1. COCO Object Detection Task
The COCO Object Detection Task is designed to push the state of the art in object detection forward. Note: only the detection task with object segmentation output (that is, instance segmentation) will be featured at the COCO 2019 challenge. For full details of this task please see the COCO Object Detection Task.
6.2. COCO Panoptic Segmentation Task
The COCO Panoptic Segmentation Task has the goal of advancing the state of the art in scene segmentation. Panoptic segmentation addresses both stuff and thing classes, unifying the typically distinct semantic and instance segmentation tasks. For full details of this task please see the COCO Panoptic Segmentation Task.
6.3. COCO Keypoint Detection Task
The COCO Keypoint Detection Task requires localization of person keypoints in challenging, uncontrolled conditions. The keypoint task involves simultaneously detecting people and localizing their keypoints (person locations are not given at test time). For full details of this task please see the COCO Keypoint Detection Task.
6.4. COCO DensePose Task
The COCO DensePose Task requires dense estimation of human pose in challenging, uncontrolled conditions. The DensePose task involves simultaneously detecting people, segmenting their bodies and mapping all image pixels that belong to a human body to the 3D surface of the body. For full details of this task please see the COCO DensePose Task.
7. LVIS Challenge
LVIS is a new, large-scale instance segmentation dataset that features > 1000 object categories, many of which have very few training examples. LVIS presents a novel low-shot object detection challenge to encourage new research in object detection. For more information, please see LVIS challenge page.