Using the Delay in a Treatment Effect to Improve Sensitivity and Preserve Directionality of Engagement Metrics in A/B Experiments

State-of-the-art user engagement metrics (such as sessions per user) are widely used by modern Internet companies to evaluate ongoing updates of their web services via A/B testing. These metrics are predictive of companies' long-term goals, but this property comes at a cost: users learn an evaluated treatment slowly, which delays the treatment effect. The delay, in turn, lowers the sensitivity of the metrics and forces A/B experiments to run longer or to use a larger share of limited traffic. In this paper, we study how this delay in user learning can be used to improve the sensitivity of several popular metrics of user loyalty and activity. We consider both novel and previously known modifications of these metrics, including different methods of quantifying a trend in a metric's time series and delaying its calculation. These modifications are analyzed with respect to their sensitivity and directionality on a large set of A/B tests run on real users of Yandex. We find that it is mostly the loyalty metrics that benefit from the considered modifications. We also identify modifications that both increase the sensitivity of the source metric and remain consistent with the sign of its average treatment effect.
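To make the two families of modifications named in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's actual definitions: the function names `delayed_mean` and `linear_trend`, the use of a least-squares slope as the trend measure, and the toy daily values are all hypothetical.

```python
from statistics import mean


def delayed_mean(daily_values, skip_days):
    """Average a per-day metric after dropping the first `skip_days` days.

    Hypothetical stand-in for "delaying the calculation" of a metric:
    early days, when users have not yet learned the treatment, are ignored.
    """
    tail = daily_values[skip_days:]
    if not tail:
        raise ValueError("skip_days must be smaller than the number of days")
    return mean(tail)


def linear_trend(daily_values):
    """Least-squares slope of the metric against the day index.

    One possible (assumed) way to quantify a trend in a metric's time series.
    """
    n = len(daily_values)
    if n < 2:
        raise ValueError("need at least two days to estimate a trend")
    x_mean = (n - 1) / 2
    y_mean = mean(daily_values)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(daily_values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Toy data (made up): average sessions per user on each day of an experiment.
control = [1.0, 1.0, 1.0, 1.0]
treatment = [1.0, 1.5, 2.0, 2.5]  # the effect grows as users learn

full_period_diff = mean(treatment) - mean(control)
delayed_diff = delayed_mean(treatment, 2) - delayed_mean(control, 2)
trend_diff = linear_trend(treatment) - linear_trend(control)

print(full_period_diff, delayed_diff, trend_diff)
```

In this toy case the delayed difference is larger than the full-period difference, because the early days, when the effect has not yet appeared, no longer dilute it. Whether such a modification actually improves sensitivity on real experiments is an empirical question of the kind the paper studies.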
Published in
International Conference on World Wide Web
3 Apr 2017