Datasets

Check out the datasets we release to benefit the wider research community.
  • GraphLand

    Graph machine learning
    Gleb Bazhenov
    Oleg Platonov
    Liudmila Prokhorenkova

    GraphLand is a benchmark of 14 diverse graph datasets for node property prediction, drawn from a range of industrial applications. It allows evaluating graph ML models on graphs with diverse sizes, structural characteristics, and feature sets, all in a unified setting. It also makes it possible to investigate previously underexplored research questions, such as how realistic temporal distributional shifts affect graph ML model performance in transductive and inductive settings.

  • TabReD

    Tabular data
    Ivan Rubachev
    Nikolay Kartashev
    Yury Gorishniy
    Artem Babenko

    TabReD is a benchmark for evaluating tabular machine learning models under conditions representative of real-world deployments. It comprises eight datasets drawn from production ML systems at Yandex and from Kaggle competitions. TabReD addresses two gaps in existing benchmarks: (1) all datasets use time-based train/validation/test splits to evaluate models under temporal distribution drift, and (2) datasets are feature-rich (a median of 261 features vs. 13 to 23 in prior benchmarks) with extensive feature engineering, reflecting real ML pipelines. Experiments on TabReD demonstrate that methods successful on standard benchmarks may underperform on TabReD, making it a critical testbed for assessing whether tabular ML approaches generalize to industrial settings.
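
To illustrate what a time-based train/validation/test split means in practice, here is a minimal Python sketch. It is not TabReD's actual splitting code: the function name `time_based_split`, the `ts` field, and the cutoff dates are all hypothetical and chosen only for illustration.

```python
from datetime import date


def time_based_split(rows, timestamp_key, val_start, test_start):
    """Split rows chronologically using two cutoff dates (hypothetical helper).

    Rows before val_start go to train, rows from val_start up to (but not
    including) test_start go to validation, and the rest go to test.
    """
    train = [r for r in rows if r[timestamp_key] < val_start]
    val = [r for r in rows if val_start <= r[timestamp_key] < test_start]
    test = [r for r in rows if r[timestamp_key] >= test_start]
    return train, val, test


# Toy data: one row per day for 20 days, deliberately stored out of order.
rows = [{"ts": date(2024, 1, day), "x": day} for day in range(20, 0, -1)]

train, val, test = time_based_split(
    rows,
    timestamp_key="ts",
    val_start=date(2024, 1, 15),   # assumed cutoff, for illustration only
    test_start=date(2024, 1, 18),  # assumed cutoff, for illustration only
)
```

Unlike a random split, every validation row here is later than every training row, and every test row is later still, so a model is always evaluated on data from after its training period.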

  • Heterophilous graph datasets

    Graph machine learning
    Oleg Platonov
    Denis Kuznedelev
    Michael Diskin
    Artem Babenko
    Liudmila Prokhorenkova

    A graph dataset is called heterophilous if nodes prefer to connect to other nodes that are not similar to them. For example, in financial transaction networks, fraudsters often perform transactions with non-fraudulent users, and in dating networks, most connections are between people of opposite genders. Learning under heterophily is an important subfield of graph ML, so diverse and reliable benchmarks are essential.

    We propose a benchmark of five diverse heterophilous graphs that come from different domains and exhibit a variety of structural properties. Our benchmark includes a word dependency graph (Roman-empire), a product co-purchasing network (Amazon-ratings), a synthetic graph emulating the Minesweeper game (Minesweeper), a crowdsourcing platform worker network (Tolokers), and a question-answering website interaction network (Questions).
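
The notion of heterophily described above can be made concrete with a simple statistic. The sketch below computes the edge homophily ratio, i.e. the fraction of edges whose endpoints share a label; low values suggest a heterophilous graph. This is one common measure, not necessarily the one the benchmark authors use, and the function name `edge_homophily` and the toy graph are hypothetical.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges that connect two nodes with the same label."""
    if not edges:
        raise ValueError("edge homophily is undefined for a graph with no edges")
    same_label_edges = sum(1 for u, v in edges if labels[u] == labels[v])
    return same_label_edges / len(edges)


# Toy transaction graph (made up for illustration): fraudsters mostly
# transact with legitimate users, as in the example from the text.
labels = {0: "fraud", 1: "legit", 2: "legit", 3: "fraud"}
edges = [(0, 1), (0, 2), (3, 1), (3, 2), (1, 2)]

ratio = edge_homophily(edges, labels)  # only edge (1, 2) joins same-label nodes
```

In this toy graph only one of the five edges connects nodes with the same label, so the ratio is low and the graph would be considered heterophilous under this measure.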