Discovering graph structures and search algorithms
Imitation learning for graph-based search. The agent walks over the graph (black arrows), while the expert (BFS) provides the correct choices at each step (red arrows). The agent is then updated in a supervised manner to match the expert's choices.
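The loop in this caption can be illustrated with a minimal sketch, which is not the implementation behind the figure. It assumes an undirected, connected graph stored as an adjacency dictionary, and it replaces the learned agent with a hypothetical tabular policy: one logit per edge, with a softmax over the outgoing edges of the current vertex. The names `bfs_distances`, `imitation_step` and `logits` are illustrative only. The expert is a BFS from the target, so a "correct choice" at a vertex is any neighbor that is one hop closer to the target.

```python
import math
import random
from collections import deque


def bfs_distances(graph, target):
    """Hop distance from every vertex to `target` (undirected graph assumed)."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def imitation_step(graph, logits, start, target, lr=0.5, max_steps=20, rng=random):
    """One round: agent walk, expert labeling, supervised update (in place)."""
    dist = bfs_distances(graph, target)

    # 1) The agent walks with its own current policy (black arrows).
    vertex, visited = start, []
    for _ in range(max_steps):
        if vertex == target:
            break
        visited.append(vertex)
        probs = softmax([logits[(vertex, u)] for u in graph[vertex]])
        vertex = rng.choices(graph[vertex], weights=probs)[0]

    # 2) The BFS expert labels every visited vertex (red arrows), and
    # 3) the policy takes a cross-entropy gradient step toward those labels.
    for v in visited:
        nbrs = graph[v]
        probs = softmax([logits[(v, u)] for u in nbrs])
        good = [u for u in nbrs if dist[u] == dist[v] - 1]
        for u, p in zip(nbrs, probs):
            label = 1.0 / len(good) if u in good else 0.0
            logits[(v, u)] -= lr * (p - label)
```

Calling `imitation_step` repeatedly on a dictionary of zero-initialized logits shifts probability mass toward edges that lie on shortest paths to the target. Labels are collected on the vertices the agent itself visits rather than on the expert's trajectory, which is a design choice of this sketch.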
RL scheme for graph construction. Left: the environment is a similarity graph equipped with a search algorithm. Right: the agent obtains the state and uses a policy network to predict which outgoing edges to preserve.
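A toy version of this scheme can be sketched as follows, under several assumptions that are not taken from the figure. The policy network is replaced by a hypothetical table of per-edge logits. The search algorithm is a plain greedy walk. The reward (search accuracy minus a small cost per distance computation) and the REINFORCE update are illustrative choices. Vertices are assumed to be the integers `0..n-1`, indexing into `points`.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))


def greedy_search(graph, points, query, entry):
    """Environment's search: move to the closest neighbor until none improves.
    Returns (found_vertex, number_of_distance_computations)."""
    current, best = entry, math.dist(points[entry], query)
    n_dist = 1
    while True:
        candidates = [(math.dist(points[u], query), u) for u in graph[current]]
        n_dist += len(candidates)
        if not candidates or min(candidates)[0] >= best:
            return current, n_dist
        best, current = min(candidates)


def reinforce_step(graph, logits, points, queries, entry,
                   baseline=0.0, lr=0.1, cost=0.01, rng=random):
    """Sample which edges to keep, run search on the pruned graph, update logits."""
    # Action: an independent keep/drop decision for every outgoing edge.
    probs = {e: sigmoid(l) for e, l in logits.items()}
    keep = {e: rng.random() < p for e, p in probs.items()}
    pruned = {v: [u for u in nbrs if keep[(v, u)]] for v, nbrs in graph.items()}

    # Reward (assumed form): fraction of exact hits minus a cost on distance calls.
    reward = 0.0
    for q in queries:
        found, n_dist = greedy_search(pruned, points, q, entry)
        truth = min(range(len(points)), key=lambda i: math.dist(points[i], q))
        reward += (found == truth) - cost * n_dist
    reward /= len(queries)

    # REINFORCE: d/d(logit) of log Bernoulli(keep | p) equals (keep - p).
    for e in logits:
        logits[e] += lr * (reward - baseline) * (keep[e] - probs[e])
    return pruned, reward
```

In practice `baseline` would typically be a running average of past rewards to reduce gradient variance; it is left as a plain argument here to keep the sketch short.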