Mechanics of Testing a Big Hypothesis

Dec 10, 2020 | Last updated Feb 15, 2022

By Narek Tamoyan, Senior Product Manager

At Noom, one of our fundamental principles is learning fast and iterating rapidly on any initiative (not just product) we work on. Our project life cycle consists of the following steps:

  • Research
  • Ideate and prioritize
  • Design
  • Build
  • Test and learn
  • Iterate

We aim to keep this entire cycle under a couple of weeks (per project). However, sometimes we make exceptions and invest much more time/effort into projects.

In this article, I’ll talk about one of the most challenging problems my team recently tackled and how we decided to modify the “learn fast” principle to accommodate a significant product overhaul.

The Problem

Noom offers a psychology-based course to help users achieve their health goals and live healthier lives through behavior change. The course curriculum has a number of daily assignments (articles, quizzes, etc.) to help users gradually build healthy habits.

There were two primary problems with our approach to courses in 2019. First, a significant number of users fell behind in the course if they were not consistently engaging with it. Second, users felt overwhelmed by the number of daily assignments. These two problems were contributing to increased turnover during the two-week trial and low longer-term engagement.

We previously tested several different hypotheses through relatively small experiments, but the impact was either negative or neutral. For example, we tried to make a version that didn’t advance the curriculum if the app wasn’t opened. However, the challenge ran deeper than just users who didn’t open the app—some users who did open it still didn’t finish all of their articles, and they also fell behind in the course. To address this, we needed a more holistic approach.

The Task Force

To tackle such a large problem, we needed buy-in from my team before committing significant time and effort. Armed with what we learned from past experiments and user research, I pitched my team a design sprint (more about design sprints here), and we added a start date to our calendars.

The details and the twists and turns of the design sprint deserve their own blog post, but in the end, after a week of brainstorming, sketching, drawing, and debating, we had a functional prototype ready. We observed how our test users played with the prototype and documented every single piece of feedback we received.

Final Prototype

The primary hypothesis our initial experiment aimed to test was the following:

Allowing users to set a custom pace (the speed at which they progress through their course), and only advancing the course forward as the user engages with the content, would help users not feel behind or overwhelmed. This would reduce turnover as well as increase user engagement.

There were two cornerstone changes we included in our version:

  1. Make the course curriculum dynamic; the course doesn’t move forward without the user. This addresses the issue of users feeling they have fallen behind in their course.
  2. Introduce meaningful customization of pacing; each user can set their own pace through the course (the number of daily assignments they get). This addresses the issue of users feeling overwhelmed by the number of daily assignments.
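
To make these two changes concrete, here is a minimal sketch of how engagement-driven progression with a user-chosen pace could be modeled. It is an illustration only, not Noom’s implementation: the class name `CourseProgress`, its fields, and the default pace of three assignments are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CourseProgress:
    """Hypothetical per-user course state; every name here is illustrative."""
    curriculum: List[str]   # ordered assignments (articles, quizzes, ...)
    daily_pace: int = 3     # user-chosen number of assignments per day (assumed default)
    completed: int = 0      # number of assignments the user has finished

    def todays_assignments(self) -> List[str]:
        """Return the next batch, sized by the user's chosen pace.

        The batch always starts at the first unfinished assignment, so the
        course never moves ahead of the user, however many calendar days pass.
        """
        start = self.completed
        return self.curriculum[start:start + self.daily_pace]

    def complete(self, count: int = 1) -> None:
        """Record finished assignments; only this call advances the course."""
        self.completed = min(self.completed + count, len(self.curriculum))

    def set_pace(self, daily_pace: int) -> None:
        """Let the user choose how many assignments they get per day."""
        if daily_pace < 1:
            raise ValueError("daily_pace must be at least 1")
        self.daily_pace = daily_pace
```

The key design choice in this sketch is that progress is keyed to completed work rather than to the calendar: a user who skips a week comes back to the same assignments instead of a backlog. A real system would also need to track day boundaries so that the batch does not shift mid-day; that is left out here for brevity.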

These two changes fundamentally altered not just our course setup but also the very core of the user experience.

After roughly a month and a half of implementation, we were ready to launch our A/B experiment. I’ll break down what we learned into the following phases:

  • Experiment launch
  • Hypothesis validation/outcomes

Experiment Launch

As a Product Manager, one of the most exciting moments I have is looking at the very early findings/data after an experiment launch. However, I always put on my imaginary lab safety goggles to be extra cautious and look for possible side effects. This experiment was no exception; we started noticing some anomalies right away, due to some subtle bugs that only occurred at the scale of thousands of users.

As a result, we re-launched the experiment multiple times until we fixed all of the problems; this cost us an extra couple of weeks.

Hypothesis Validation

We finally got the outcomes we were looking for; both quantitative and qualitative insights supported the hypothesis! Specifically, trial retention improved by 2.5%, post-trial engagement improved by 3%, and 5% fewer users reported that they were behind in the program or overwhelmed.
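
For readers who want to run this kind of comparison themselves, the sketch below shows one common way to evaluate an A/B retention result: a pooled two-proportion z-test that reports both the absolute and the relative lift. This is an assumption about method, not a description of how Noom analyzed the experiment, and the counts in the usage lines are made up for illustration; they are not Noom data.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare control (a) and variant (b) rates with a pooled z-test.

    Returns (absolute_lift, relative_lift, z, two_sided_p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, (p_b - p_a) / p_a, z, p_value


# Hypothetical counts, chosen only to illustrate the call.
abs_lift, rel_lift, z, p = two_proportion_z_test(4000, 10000, 4100, 10000)
print(f"absolute lift={abs_lift:.4f}, relative lift={rel_lift:.4f}, z={z:.2f}, p={p:.3f}")
```

Reporting both lifts matters because a figure such as "2.5%" can mean either a relative change or a change in percentage points, and the two can differ substantially.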

The results were shared with everyone on the team, and we decided to move forward with baselining the change (making it the default experience) for all of our users. This is where the real challenges began. We faced three categories of issues:

  • Technical and operational complexities
  • Potential impact on ongoing experiments
  • Impact on team productivity

Technical and Operational Complexities

This initiative was changing the very foundation of our course and the core UX of the app. Building it as an experiment (quick and dirty) was one thing, but baselining it for all of our users was a completely different level of complexity.

In addition, the change was going to affect not only users’ behavior but also some of our coaching (Noom users have an assigned health coach throughout their course) and customer support protocols. As a result, it took us more time to build the baseline and update our operational protocols than it had taken to build the experiment.

The Takeaway—Consider Going Big If

  • The problem/opportunity is big enough, and it’s solvable (must be informed by both qualitative and quantitative data).
  • You have already tried smaller attempts to address it.
  • You have carefully user-tested your big idea/prototype.

Potential Impact on Ongoing Experiments

At the time, we had three other feature teams that had their own experiments. Some were successful and ready to be baselined, while others were waiting to be launched. Almost all of those initiatives were blocked by my project. The change we were introducing impacted how almost every other feature was experienced; it wouldn’t be feasible to baseline or test any changes to those features without getting this change out first.

The Takeaway—Be Aware of Becoming a Bottleneck

  • Align with other teams and anyone who might be impacted by your project.
  • Learn about upcoming initiatives, figure out whether any of them might be impacted by (or depend on) your project and by how much, and align your priorities accordingly.
  • Give all the “to be impacted” stakeholders a clear expectation of your findings, projections, and timelines.

Impact on Team Productivity

It took my team almost six months to complete the project (in fairness, I should mention we did a lot of refactoring along with the main launch). So for six months, almost 100% of everyone’s time was spent on building, tuning, troubleshooting, and fixing this big change.

As you might expect, this hurt my engineers’ productivity and left them feeling overwhelmed and in need of some fresh air. Unfortunately, we had no choice but to push through since the dependencies on the table were so big.

The Takeaway—Prepare Your Team Against Burnout

  • Give your team a clear expectation of the complexity and length of your project.
  • Be overly transparent regarding the risks, impact, and challenges throughout the project.
  • Burnout is almost unavoidable, so prepare for it and have a plan for how you will tackle it.

At Noom, we consider a project successful as long as it informs our next steps. We learned a lot during this process, which I hope will help the next time you tackle such a foundational product challenge.

If you are interested in Noom and want to join our growing product team, feel free to reach out to me directly or check our job openings.