Efficient Data Collection via Crowdsourcing

A series of tutorials

In these tutorials, leading researchers and engineers from Yandex share practical industrial experience in efficient data labeling via crowdsourcing. Most ML projects require training data, and often this data can only be obtained through human labeling. Moreover, as applications of AI multiply, increasingly nontrivial tasks for collecting human-labeled data arise. Producing such data at scale requires building a technological pipeline, which includes solving problems of quality control and smart distribution of tasks among workers.

We invite beginners, advanced specialists, and researchers to learn how to collect high-quality labeled data efficiently.

Past events

Tutorial at KDD 2019

Tutorial at WSDM 2020

Tutorial at SIGMOD/PODS 2020

Tutorial at CVPR 2020

Tutorial at TheWebConf 2021

Tutorial at NAACL 2021


Alexey Drutsa

Crowdsourcing & Research Departments, Yandex

Valentina Fedorova

Research Department, Yandex

Dmitry Ustalov

Crowdsourcing Department, Yandex

Olga Megorskaya

Crowdsourcing Department, Yandex

Evfrosiniya Zerminova

Crowdsourcing Department, Yandex

Daria Baidakova

Crowdsourcing Department, Yandex

Denis Rogachevsky

Self-Driving Cars Department, Yandex

Ivan Semchuk

Self-Driving Cars Department, Yandex
Wed Jun 02 2021 19:57:58 GMT+0300 (Moscow Standard Time)