In our hands-on tutorial, we present a systematic view of using Human-in-the-Loop methods to build scalable offline evaluation processes and, in particular, to obtain high-quality relevance judgements.
We will gather the world's best experts to discuss the key issues of preparing labeled data for machine learning, focusing on remoteness, fairness, and mechanisms in the context of crowdsourcing for data collection and labeling.
We will present the data processing pipeline required for self-driving cars to learn how to behave autonomously on the roads, and demonstrate how data annotation is a crucial part of making the learning process effective. During the tutorial, participants will practice launching projects from this pipeline on one of the largest crowdsourcing marketplaces.
We will introduce data labeling via public crowdsourcing marketplaces and present the key components of efficient label collection. This will be followed by a practice session, in which participants will launch their own label collection project on one of the largest crowdsourcing marketplaces.
In this tutorial, we share unique industry experience in efficient data labeling via crowdsourcing, presented by leading researchers and engineers from Yandex.