### A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress?

ICLRNode classification is a classical graph representation learning task on which Graph Neural Networks (GNNs) have recently achieved strong results. However, it is often believed that standard GNNs only work well for homophilous graphs, i.e., graphs where edges tend to connect nodes of the same class. Graphs without this property are called heterophilous, and it is typically assumed that specialized methods are required to achieve strong performance on such graphs. In this work, we challenge this assumption. First, we show that the standard datasets used for evaluating heterophily-specific models have serious drawbacks, making results obtained by using them unreliable. The most significant of these drawbacks is the presence of a large number of duplicate nodes in the datasets Squirrel and Chameleon, which leads to train-test data leakage. We show that removing duplicate nodes strongly affects GNN performance on these datasets. Then, we propose a set of heterophilous graphs of varying properties that we believe can serve as a better benchmark for evaluating the performance of GNNs under heterophily. We show that standard GNNs achieve strong results on these heterophilous graphs, almost always outperforming specialized models. Our datasets and the code for reproducing our experiments are available at https://github.com/yandex-research/heterophilous-graphs

### Gradient Boosting Performs Gaussian Process Inference

ICLRThis paper shows that gradient boosting based on symmetric decision trees can be equivalently reformulated as a kernel method that converges to the solution of a certain Kernel Ridge Regression problem. Thus, we obtain the convergence to a Gaussian Process' posterior mean, which, in turn, allows us to easily transform gradient boosting into a sampler from the posterior to provide better knowledge uncertainty estimates through Monte-Carlo estimation of the posterior variance. We show that the proposed sampler allows for better knowledge uncertainty estimates leading to improved out-of-domain detection.

### Graph-based Nearest Neighbor Search in Hyperbolic Spaces

ICLRThe nearest neighbor search (NNS) problem is widely studied in Euclidean space, and graph-based algorithms are known to outperform other approaches for this task. However, hyperbolic geometry often allows for better data representation in various domains, including graphs, words, and images. In this paper, we show that graph-based approaches are also well suited for hyperbolic geometry. From a theoretical perspective, we rigorously analyze the time and space complexity of graph-based NNS, assuming that an n-element dataset is uniformly distributed within a d-dimensional ball of radius R in the hyperbolic space of curvature -1. Under some conditions on R and d, we derive the time and space complexity of graph-based NNS and compare the obtained results with known guarantees for the Euclidean case. Interestingly, in the dense setting (d << log(n)) and under some assumptions on the radius R, graph-based NNS has lower time complexity in the hyperbolic space. This agrees with our experiments: we consider datasets embedded in hyperbolic and Euclidean spaces and show that graph-based NNS can be more efficient in the hyperbolic space. We also demonstrate that graph-based methods outperform other existing baselines for hyperbolic NNS. Overall, our theoretical and empirical analysis suggests that graph-based NNS can be considered a default approach for similarity search in hyperbolic spaces.

# Liudmila Prokhorenkova

## Publications

## Posts

- March 6, 2023Research
### Introducing new heterophilous graph datasets

- October 12, 2022Research
### Graph-based nearest neighbor search

- December 8, 2021Research
### How to validate validation measures

## Datasets

### Heterophilous graph datasets

A graph dataset is called heterophilous if nodes prefer to connect to other nodes that are not similar to them. For example, in financial transaction networks, fraudsters often perform transactions with non-fraudulent users, and in dating networks, most connections are between people of opposite genders. Learning under heterophily is an important subfield of graph ML. Thus, having diverse and reliable benchmarks is essential.

We propose a benchmark of five diverse heterophilous graphs that come from different domains and exhibit a variety of structural properties. Our benchmark includes a word dependency graph Roman-empire, a product co-purchasing network Amazon-ratings, a synthetic graph emulating the minesweeper game Minesweeper, a crowdsourcing platform worker network Tolokers, and a question-answering website interaction network Questions.

### Shifts Dataset

The Shifts Dataset contains curated and labeled examples of real, 'in-the-wild' distributional shifts across three large-scale tasks. Specifically, it contains tabular weather prediction, machine translation, and vehicle motion prediction tasks' data used in Shifts Challenge 2021. Dataset shift is ubiquitous in all of these tasks and modalities.