This video is a snapshot of Spotify's approach to software engineering and people management in 2014. It has since come to be known as the "Spotify Model" (although I didn't coin that term).
As a data-driven company, we rely heavily on experiments to guide products and features. At any given time, we have around 1,000 experiments running, and we’re adding more every day. Because we’re constantly increasing the number of experiments and logging corresponding data, we need a reliable, simple-to-use platform that engineers can use without error. To eliminate common errors made by experimenters, we introduced a lightweight config UI, a QA workflow, and simplified APIs supporting A/B testing across multiple platforms. (For more information about our dashboards and data pipeline, check out our previous experiments post.)
For over two years I’ve been a Data Scientist on the Growth team at Airbnb. When I first started at the company we were running fewer than 100 experiments in a given week; we’re now running about 700.
At Airbnb, we are constantly iterating on the user experience and product features. This can include changes to the look and feel of the website or native apps, optimizations for our smart pricing and search ranking algorithms, or even targeting the right content and timing for our email campaigns. For the majority of this work, we leverage our internal A/B Testing platform, the Experimentation Reporting Framework (ERF), to validate our hypotheses and quantify the impact of our work. Read about the basics of ERF and our philosophy on interpreting experiments.
At Airbnb we are always trying to learn more about our users and improve their experience on the site. Much of that learning and improvement comes through the deployment of controlled experiments. If you haven’t already read our other post about experimentation I highly recommend you do so, but I will summarize the two main points: (1) running controlled experiments is the best way to learn about your users, and (2) there are a lot of pitfalls when running experiments. To that end, we built a tool to make running experiments easier by hiding all the pitfalls and automating the analytical heavy lifting.
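For a simple conversion experiment, the "analytical heavy lifting" largely reduces to a significance test on the difference between two rates. A minimal sketch of a two-proportion z-test (this is a generic illustration, not Airbnb's actual tooling, and the counts below are invented):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via the error function).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: control converts 480/10,000, treatment 555/10,000.
z, p = two_proportion_ztest(conv_a=480, n_a=10_000, conv_b=555, n_b=10_000)
```

A real platform layers a lot on top of this (variance reduction, multiple-metric corrections, sequential monitoring), which is exactly the heavy lifting the post describes automating.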
Experimentation is at the core of how Uber improves the customer experience. Uber applies several experimental methodologies to use cases ranging from testing out a new feature to enhancing our app design.
Ax is an accessible, general-purpose platform for understanding, managing, deploying, and automating adaptive experiments. Adaptive experimentation is the machine-learning guided process of iteratively exploring a (possibly infinite) parameter space in order to identify optimal configurations in a resource-efficient manner. Ax currently supports Bayesian optimization and bandit optimization as exploration strategies. Bayesian optimization in Ax is powered by BoTorch, a modern library for Bayesian optimization research built on PyTorch.
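Ax's strategies are far more sophisticated, but the core idea behind bandit optimization, trading exploration of uncertain arms against exploitation of the best-looking one, can be sketched with a plain epsilon-greedy loop (a generic illustration only, not Ax's API; the arm success rates are invented):

```python
import random

def epsilon_greedy(true_rates, epsilon=0.1, steps=5000, seed=0):
    """Minimal epsilon-greedy bandit: with probability epsilon pick a random
    arm (explore); otherwise pick the arm with the best estimated value."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                              # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)         # exploit
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0     # Bernoulli
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return counts, values

# Three hypothetical arms with conversion rates 5%, 10%, and 20%.
counts, values = epsilon_greedy([0.05, 0.10, 0.20])
```

Ax replaces this hand-rolled loop with principled strategies (Thompson sampling for bandits, BoTorch-powered Bayesian optimization for continuous parameter spaces) that allocate traffic far more efficiently.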
The Twitter experimentation tool, Duck Duck Goose (DDG for short), was first created in 2010. It has evolved into a system that is capable of aggregating many terabytes of data, such as Tweets, social graph changes, server logs, and records of user interactions with web and mobile clients, to measure and analyze a large, flexible set of metrics.
Trustworthy data and analyses are key to making sound business decisions, particularly when it comes to A/B testing. Ignoring data quality issues or biases introduced through design and interpretations can lead to incorrect conclusions that could hurt your product.
Experimentation Platform (ExP) is a team of 60+ Data Scientists, Software Engineers, and Program Managers. Our mission is to accelerate innovation through trustworthy experimentation. Most major Microsoft products, such as Bing, Cortana, Edge, Exchange, Identity, MSN, Office client, Office online, Photos, Skype, Speech, Store, Teams, Visual Studio Code, Windows, and Xbox, use our platform ExP to run trustworthy Online Controlled Experiments, aka A/B tests.
Ever wonder how Netflix serves a great streaming experience with high-quality video and minimal playback interruptions? Thank the team of engineers and data scientists who constantly A/B test their innovations to our adaptive streaming and content delivery network algorithms. What about more obvious changes, such as the complete redesign of our UI layout or our new personalized homepage? Yes, all thoroughly A/B tested.
Another day, another custom script to analyze an A/B test. Maybe you’ve done this before and have an old script lying around. If it’s new, it’s probably going to take some time to set up, right? Not at Netflix.
The Netflix experimentation platform (XP) was founded on two key tenets: (1) that democratizing contributions will allow the platform to scale more efficiently, and (2) that non-distributed computation of causal models will lead to rapid innovations in statistical methodologies.