The goal of the joint COCO and LVIS Workshop is to study object recognition in the context of scene understanding. This workshop will host the COCO suite of challenges and a new challenge on large vocabulary instance segmentation (LVIS). While both challenges address the general problem of visual recognition, their specific tasks probe different aspects of it.
COCO is a widely used visual recognition dataset, designed to spur object detection research with a focus on full scene understanding. In particular, it emphasizes detecting non-iconic views of objects, localizing objects with pixel-level precision, and detecting objects in complex scenes. The COCO dataset includes 330K images of complex scenes, exhaustively annotated with segmentation masks for 80 object categories and 91 stuff categories, person keypoint annotations, and 5 captions per image.
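To make the annotation types above concrete, here is a minimal sketch of the COCO-style annotation structure (a JSON dictionary with `images`, `annotations`, and `categories` keys). The image, ids, and coordinates below are invented for illustration; only the field names follow the COCO data format.

```python
# Illustrative COCO-style annotation record; values are made up,
# field names follow the COCO data format.
coco_style = {
    "images": [
        {"id": 1, "file_name": "000000000001.jpg", "height": 480, "width": 640}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 18,  # e.g. "dog" in COCO's 80-category list
            # Polygon segmentation: a flat list of x, y vertex coordinates.
            "segmentation": [[120.5, 200.0, 180.0, 200.0,
                              180.0, 260.0, 120.5, 260.0]],
            "bbox": [120.5, 200.0, 59.5, 60.0],  # [x, y, width, height]
            "area": 3570.0,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 18, "name": "dog", "supercategory": "animal"}],
}

# Group instance annotations by image, as detection code typically does.
anns_per_image = {}
for ann in coco_style["annotations"]:
    anns_per_image.setdefault(ann["image_id"], []).append(ann)

print(len(anns_per_image[1]))  # number of annotated instances in image 1
```

In practice this structure is loaded from a JSON file via the `pycocotools` API rather than built by hand; the sketch only shows the shape of the data.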
Large Vocabulary Instance Segmentation (LVIS) includes high-quality instance segmentations for more than 1000 entry-level object categories. The LVIS dataset contains a long tail of categories with few training examples, making it a distinct challenge from COCO and exposing both shortcomings of and new opportunities for existing machine learning methods. We expect this dataset to inspire new methods in the detection research community. This year we plan to host the first LVIS challenge.
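The long tail can be made concrete with LVIS's own frequency bins: categories are grouped by the number of training images they appear in (rare: 1-10 images, common: 11-100, frequent: more than 100). A minimal sketch, with invented per-category counts:

```python
# LVIS groups categories by how many training images contain them:
# rare (1-10), common (11-100), frequent (>100).
def lvis_frequency_bin(image_count: int) -> str:
    """Return the LVIS frequency bin for a category."""
    if image_count <= 10:
        return "rare"
    if image_count <= 100:
        return "common"
    return "frequent"

# Per-category image counts below are made up for illustration.
category_image_counts = {"crampon": 4, "teacup": 57, "person": 2300}
bins = {name: lvis_frequency_bin(n)
        for name, n in category_image_counts.items()}
print(bins)  # {'crampon': 'rare', 'teacup': 'common', 'person': 'frequent'}
```

Because rare categories have so few examples, per-bin evaluation of this kind is what distinguishes LVIS from COCO, where every category is well represented.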
Organizers
- Alexander Kirillov (FAIR)
- Tsung-Yi Lin (Google Research)
- Yin Cui (Google Research)
- Matteo Ruggero Ronchi (California Institute of Technology)
- Ross Girshick (FAIR)
- Piotr Dollar (FAIR)