Sergey Kastryulin

Today, we present the [YaART: Yet Another ART Rendering Technology](https://arxiv.org/abs/2404.05666) paper. YaART is our production-grade text-to-image cascaded diffusion model aligned to human preferences using Reinforcement Learning from Human Feedback (RLHF). 

In this study, we discuss our approach, highlighting data selection, architecture design, and model training. We share the results of our investigation into the effect of model size and training dataset size, and comprehensively analyze how these choices affect both the efficiency of the training process and the quality of the generated images. Furthermore, we demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets, establishing a more efficient regime for diffusion model training. We base our experiments on human evaluations, using DrawBench and our more challenging YaBasket prompt set, which we make available to the research community via the [project page](https://ya.ru/ai/art/paper-yaart-v1).

Generative models in computer vision are a powerful tool for various applications.

Generative models

The Yandex Research team regularly contributes to the computer vision research community, mostly in the fields of image retrieval and generative modelling.

Computer vision

