Toloka Aggregation Features
The data set contains about 60K crowdsourced labels for 1K tasks, along with ground truth labels for almost all of the tasks. The task was to classify websites into 5 categories according to the presence of adult content on them. Additionally, each task has 52 real-valued features that can be used to predict its category.
June 9, 2019
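As a rough illustration of how several crowdsourced labels per task can be combined into a single label and checked against ground truth, here is a minimal majority-vote sketch. It is only a baseline, not a method prescribed by the data set. The (task, worker, label) triple layout, the integer category codes 0 to 4, and the toy values are assumptions made for the example and are not taken from the data set files.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Aggregate (task_id, worker_id, label) triples into one label per task."""
    votes = defaultdict(Counter)
    for task_id, _worker_id, label in labels:
        votes[task_id][label] += 1
    # Pick the most frequent label; ties go to the smallest label so the
    # result is deterministic.
    return {
        task_id: min(counts, key=lambda lab: (-counts[lab], lab))
        for task_id, counts in votes.items()
    }


def accuracy(predicted, ground_truth):
    """Share of correct predictions among tasks that have a ground truth label."""
    scored = [t for t in predicted if t in ground_truth]
    if not scored:
        return 0.0
    return sum(predicted[t] == ground_truth[t] for t in scored) / len(scored)


# Toy, made-up records (hypothetical layout, not real data set rows).
toy_labels = [
    ("t1", "w1", 0), ("t1", "w2", 0), ("t1", "w3", 3),
    ("t2", "w1", 4), ("t2", "w2", 2), ("t2", "w3", 4),
    ("t3", "w1", 1), ("t3", "w2", 2),
]
# Ground truth is missing for t3, mirroring "almost all the tasks".
toy_truth = {"t1": 0, "t2": 4}

aggregated = majority_vote(toy_labels)
print(aggregated)
print(accuracy(aggregated, toy_truth))
```

Tasks without a ground truth label are skipped when scoring. The 52 per-task features are not used here; a feature-aware approach would need a separate model.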