We study revenue optimization learning algorithms for repeated posted-price auctions where a seller interacts with a (truthful or strategic) buyer that holds a fixed valuation. We focus on a practical situation in which the seller does not know in advance the number of played rounds (the time horizon) and has thus to use a horizon-independent pricing. First, we consider straightforward modifications of previously best known algorithms and show that these horizon-independent modifications have worser or even linear regret bounds. Second, we provide a thorough theoretical analysis of some broad families of consistent algorithms and show that there does not exist a no-regret horizon-independent algorithm in those families. Finally, we introduce a novel deterministic pricing algorithm that, on the one hand, is independent of the time horizon T and, on the other hand, has an optimal strategic regret upper bound in O(log log T). This result closes the logarithmic gap between the previously best known upper and lower bounds on strategic regret.
International Conference on World Wide Web
2 Apr 2017