Controlled experiments on the web: survey and practical guide

R. Kohavi, R. Longbotham, D. Sommerfield, and R. Henne.
Data Mining and Knowledge Discovery, 18 (1): 140--181 (2009)
DOI: 10.1007/s10618-008-0114-1

Abstract

The web provides an unprecedented opportunity to evaluate ideas quickly using controlled experiments, also called randomized experiments, A/B tests (and their generalizations), split tests, Control/Treatment tests, MultiVariable Tests (MVT) and parallel flights. Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior. We provide a practical guide to conducting online experiments, where end-users can help guide the development of features. Our experience indicates that significant learning and return-on-investment (ROI) are seen when development teams listen to their customers, not to the Highest Paid Person’s Opinion (HiPPO). We provide several examples of controlled experiments with surprising results. We review the important ingredients of running controlled experiments, and discuss their limitations (both technical and organizational). We focus on several areas that are critical to experimentation, including statistical power, sample size, and techniques for variance reduction. We describe common architectures for experimentation systems and analyze their advantages and disadvantages. We evaluate randomization and hashing techniques, which we show are not as simple in practice as is often assumed. Controlled experiments typically generate large amounts of data, which can be analyzed using data mining techniques to gain deeper understanding of the factors influencing the outcome of interest, leading to new hypotheses and creating a virtuous cycle of improvements. Organizations that embrace controlled experiments with clear evaluation criteria can evolve their systems with automated optimizations and real-time analyses. Based on our extensive practical experience with multiple systems and organizations, we share key lessons that will help practitioners in running trustworthy controlled experiments.

BibTeX key: kohavi2009controlled
entry type: article
year: 2009
journal: Data Mining and Knowledge Discovery
number: 1
pages: 140--181
publisher: Springer US
volume: 18
issn: 1384-5810
language: English
DOI: 10.1007/s10618-008-0114-1
url: http://dx.doi.org/10.1007/s10618-008-0114-1

Cite this publication

%0 Journal Article
%1 kohavi2009controlled
%A Kohavi, Ron
%A Longbotham, Roger
%A Sommerfield, Dan
%A Henne, Randal M.
%D 2009
%I Springer US
%J Data Mining and Knowledge Discovery
%K control experiment guide science survey web
%N 1
%P 140--181
%R 10.1007/s10618-008-0114-1
%T Controlled experiments on the web: survey and practical guide
%U http://dx.doi.org/10.1007/s10618-008-0114-1
%V 18
%X The web provides an unprecedented opportunity to evaluate ideas quickly using controlled experiments, also called randomized experiments, A/B tests (and their generalizations), split tests, Control/Treatment tests, MultiVariable Tests (MVT) and parallel flights. Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior. We provide a practical guide to conducting online experiments, where end-users can help guide the development of features. Our experience indicates that significant learning and return-on-investment (ROI) are seen when development teams listen to their customers, not to the Highest Paid Person’s Opinion (HiPPO). We provide several examples of controlled experiments with surprising results. We review the important ingredients of running controlled experiments, and discuss their limitations (both technical and organizational). We focus on several areas that are critical to experimentation, including statistical power, sample size, and techniques for variance reduction. We describe common architectures for experimentation systems and analyze their advantages and disadvantages. We evaluate randomization and hashing techniques, which we show are not as simple in practice as is often assumed. Controlled experiments typically generate large amounts of data, which can be analyzed using data mining techniques to gain deeper understanding of the factors influencing the outcome of interest, leading to new hypotheses and creating a virtuous cycle of improvements. Organizations that embrace controlled experiments with clear evaluation criteria can evolve their systems with automated optimizations and real-time analyses. Based on our extensive practical experience with multiple systems and organizations, we share key lessons that will help practitioners in running trustworthy controlled experiments.

@article{kohavi2009controlled,
  abstract = {The web provides an unprecedented opportunity to evaluate ideas quickly using controlled experiments, also called randomized experiments, A/B tests (and their generalizations), split tests, Control/Treatment tests, MultiVariable Tests (MVT) and parallel flights. Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior. We provide a practical guide to conducting online experiments, where end-users can help guide the development of features. Our experience indicates that significant learning and return-on-investment (ROI) are seen when development teams listen to their customers, not to the Highest Paid Person’s Opinion (HiPPO). We provide several examples of controlled experiments with surprising results. We review the important ingredients of running controlled experiments, and discuss their limitations (both technical and organizational). We focus on several areas that are critical to experimentation, including statistical power, sample size, and techniques for variance reduction. We describe common architectures for experimentation systems and analyze their advantages and disadvantages. We evaluate randomization and hashing techniques, which we show are not as simple in practice as is often assumed. Controlled experiments typically generate large amounts of data, which can be analyzed using data mining techniques to gain deeper understanding of the factors influencing the outcome of interest, leading to new hypotheses and creating a virtuous cycle of improvements. Organizations that embrace controlled experiments with clear evaluation criteria can evolve their systems with automated optimizations and real-time analyses. Based on our extensive practical experience with multiple systems and organizations, we share key lessons that will help practitioners in running trustworthy controlled experiments.},
  added-at = {2016-05-01T18:08:44.000+0200},
  author = {Kohavi, Ron and Longbotham, Roger and Sommerfield, Dan and Henne, Randal M.},
  biburl = {https://www.bibsonomy.org/bibtex/2765430dbee32a915ac2e1d5d3a5a3e37/nosebrain},
  doi = {10.1007/s10618-008-0114-1},
  interhash = {a5ecea5e3fdb00c5d15b4c8a06cb50e0},
  intrahash = {765430dbee32a915ac2e1d5d3a5a3e37},
  issn = {1384-5810},
  journal = {Data Mining and Knowledge Discovery},
  keywords = {control experiment guide science survey web},
  language = {English},
  number = 1,
  pages = {140--181},
  publisher = {Springer US},
  timestamp = {2016-05-01T18:08:44.000+0200},
  title = {Controlled experiments on the web: survey and practical guide},
  url = {http://dx.doi.org/10.1007/s10618-008-0114-1},
  volume = 18,
  year = 2009
}
