A/B tests

You can create A/B tests in the Hypertune UI and then "drop" them anywhere in your flag targeting logic.

This means a single feature flag can have rules that enable the feature for specific users (e.g. employees, QA, and beta users) and a final default rule that A/B tests the feature on everyone else.

It also means you can reuse a single A/B test across the logic of different feature flags to roll out and test related features in sync.

This power and flexibility is unique to Hypertune.

Structure

Each A/B test has a:

  • Name

  • Set of dimensions

  • Set of features

Each dimension has a:

  • Name

  • Set of arms

Each arm has a:

  • Name

  • Traffic allocation percentage

Each feature has a:

  • Name

  • Type

A typical A/B test has a single dimension with two arms, Test and Control, each allocated 50% of traffic, and no features.
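To make the structure above concrete, here is a minimal sketch of it as plain Python data classes. The class and field names (`ABTest`, `Dimension`, `Arm`, `Feature`, `allocation_pct`) and the example test name are assumptions made for illustration; they are not Hypertune's schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Arm:
    name: str
    allocation_pct: float  # traffic allocation percentage, 0-100


@dataclass
class Dimension:
    name: str
    arms: list[Arm]


@dataclass
class Feature:
    name: str
    type: str  # illustrative only; the available type names are not listed here


@dataclass
class ABTest:
    name: str
    dimensions: list[Dimension]
    features: list[Feature] = field(default_factory=list)


# The "typical" test described above: one dimension, two 50% arms, no features.
typical_test = ABTest(
    name="New checkout flow",  # hypothetical test name
    dimensions=[
        Dimension(
            name="Default",  # hypothetical dimension name
            arms=[Arm("Test", 50.0), Arm("Control", 50.0)],
        )
    ],
)
```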

Using A/B tests in flag targeting logic

Once created, you can "drop" A/B tests anywhere in your flag targeting logic.

If the A/B test has more than one dimension, you select the one that is relevant to the flag.

Then for each arm in the dimension, you can set the flag values, or nest more flag targeting logic.

You also set the Unit ID of the A/B test, typically context.user.id. The arm is determined by a hash of the Unit ID, so the same user always ends up in the same arm of a test.
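The idea of hashing the Unit ID to pick an arm can be sketched as follows. This is an illustration of deterministic bucketing in general, not Hypertune's actual hashing scheme: the choice of SHA-256, the `salt` parameter, and the function names `bucket` and `assign_arm` are all assumptions.

```python
import hashlib
from typing import Optional


def bucket(unit_id: str, salt: str) -> float:
    """Map a unit ID to a stable point in [0, 100) with 0.01 granularity."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce it
    # to one of 10,000 slots.
    value = int.from_bytes(digest[:8], "big")
    return (value % 10_000) / 100


def assign_arm(
    unit_id: str, salt: str, arms: list[tuple[str, float]]
) -> Optional[str]:
    """Walk the cumulative allocation percentages and return the matching arm."""
    point = bucket(unit_id, salt)
    upper = 0.0
    for name, pct in arms:
        upper += pct
        if point < upper:
            return name
    return None  # allocations sum to less than 100 and the unit fell outside


ARMS = [("Test", 50.0), ("Control", 50.0)]

# The same unit ID always lands in the same arm, on any machine and any call.
first = assign_arm("user_123", "checkout-test", ARMS)
second = assign_arm("user_123", "checkout-test", ARMS)
print(first, first == second)
```

Because the assignment depends only on the Unit ID (and, in this sketch, a per-test salt), no assignment state needs to be stored to keep users in a consistent arm.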

If the A/B test has features, you also set the values of these features, or nest more logic for them.

When you evaluate a flag that uses an A/B test, an "exposure" will be logged with the Unit ID, the arm the unit was assigned for each dimension, and the feature values.
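As a rough picture of what such an exposure carries, here is a sketch of one as a record. The `Exposure` class, its field names, the in-memory `exposure_log`, and the example values are hypothetical; the actual payload format is not described here.

```python
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass
class Exposure:
    test_name: str
    unit_id: str
    arms: dict[str, str]  # dimension name -> assigned arm name
    feature_values: dict[str, Any] = field(default_factory=dict)


# Stand-in for an analytics backend; a real system would send this elsewhere.
exposure_log: list[Exposure] = []


def log_exposure(exposure: Exposure) -> None:
    exposure_log.append(exposure)


log_exposure(
    Exposure(
        test_name="CTA button test",
        unit_id="user_123",
        arms={"Button color": "Red", "Button text": "Sign up"},
        feature_values={"country": "GB"},  # hypothetical feature
    )
)
print(asdict(exposure_log[0]))
```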

Staged, percentage-based rollouts

A staged, percentage-based rollout is a special case of an A/B test with only one dimension and one arm.
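One way to picture this: the single arm's traffic allocation is the rollout percentage, and raising it over time widens the rollout. The sketch below assumes that units falling outside the arm get the flag's default value and that bucketing is hash-based as described earlier; the `in_rollout` helper, the SHA-256 hash, and the rollout name are illustrative assumptions, not Hypertune internals.

```python
import hashlib


def in_rollout(unit_id: str, rollout_name: str, pct: float) -> bool:
    """Return True if the unit falls inside the single arm's allocation."""
    digest = hashlib.sha256(f"{rollout_name}:{unit_id}".encode("utf-8")).digest()
    point = (int.from_bytes(digest[:8], "big") % 10_000) / 100  # in [0, 100)
    return point < pct


users = [f"user_{i}" for i in range(1000)]

# Stage 1: roll out to 10% of users. Stage 2: raise the allocation to 50%.
stage_10 = {u for u in users if in_rollout(u, "new-dashboard", 10)}
stage_50 = {u for u in users if in_rollout(u, "new-dashboard", 50)}

# Each unit's bucket is fixed, so everyone enabled at 10% stays enabled at 50%.
print(stage_10 <= stage_50)
```

Under these assumptions, increasing the percentage only ever adds users; nobody who already has the feature loses it between stages.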

Multivariate tests

A multivariate or multidimensional test is an A/B test with multiple dimensions.

This lets you test all combinations of the arms of each dimension. For example, you can have one dimension called Button color with three arms (Red, Blue, and Green) and another dimension called Button text with three arms (Sign up, Get access, and Request access).

You can drop the test into a String flag, buttonColor, that controls the color of your call-to-action button, select the Button color dimension, then set flag values for each arm.

And you can drop the same test into another String flag, buttonText, that controls the text of your call-to-action button, but this time select the Button text dimension, then set flag values for each arm.

This will test all 9 combinations of button color and button text.
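The combination count follows from assigning each dimension independently. The sketch below enumerates the 3 × 3 combinations and simulates per-dimension assignment. It assumes equal traffic splits and a hash salted by test and dimension name; the `arm_for` helper and the `cta-test` salt are hypothetical, not Hypertune's implementation.

```python
import hashlib
from itertools import product

DIMENSIONS = {
    "Button color": ["Red", "Blue", "Green"],
    "Button text": ["Sign up", "Get access", "Request access"],
}


def arm_for(unit_id: str, dimension: str) -> str:
    """Pick an arm for one dimension, independently of the other dimension."""
    arms = DIMENSIONS[dimension]
    key = f"cta-test:{dimension}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return arms[int.from_bytes(digest[:8], "big") % len(arms)]


# Every (color, text) pair that the test can produce.
all_combinations = list(product(*DIMENSIONS.values()))
print(len(all_combinations))

# The buttonColor flag reads one dimension and the buttonText flag the other,
# so across many users every pair shows up.
seen = {
    (arm_for(f"user_{i}", "Button color"), arm_for(f"user_{i}", "Button text"))
    for i in range(2000)
}
print(seen == set(all_combinations))
```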

Machine learning loops

You can convert any A/B test into a machine learning loop. Instead of setting traffic allocation percentages on the arms, you set a goal function, expressed as a formula of your event types.

Hypertune will then automatically and continuously learn the best arm for each dimension, given the provided features.

For example, you can set up a machine learning loop to personalize the headline on your landing page to each unique visitor, to maximize sign-ups.
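To give a feel for how a loop like this can learn without fixed traffic allocations, here is a toy epsilon-greedy bandit. This page does not say which algorithm Hypertune uses, so epsilon-greedy is a stand-in chosen for simplicity. The sketch also ignores features (it learns one best headline overall rather than per visitor), and the class name, headlines, and simulated sign-up rates are invented for the example.

```python
import random


class EpsilonGreedyLoop:
    """Mostly serve the best-known arm, but explore a random arm sometimes."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.reward = {arm: 0.0 for arm in self.arms}

    def mean(self, arm):
        # Average goal value observed for this arm so far (0.0 if never served).
        return self.reward[arm] / self.pulls[arm] if self.pulls[arm] else 0.0

    def best_arm(self):
        return max(self.arms, key=self.mean)

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)  # explore
        return self.best_arm()  # exploit

    def record(self, arm, goal_value):
        self.pulls[arm] += 1
        self.reward[arm] += goal_value


# Simulated visitors; the goal function here is simply "1 per sign-up".
true_signup_rate = {"Headline A": 0.05, "Headline B": 0.10, "Headline C": 0.30}
loop = EpsilonGreedyLoop(true_signup_rate, epsilon=0.1, seed=42)
visitors = random.Random(7)

for _ in range(5000):
    arm = loop.choose()
    signed_up = visitors.random() < true_signup_rate[arm]
    loop.record(arm, 1.0 if signed_up else 0.0)

print(loop.best_arm(), loop.pulls)
```

A feature-aware loop would instead learn a separate preference per visitor context, which is what makes per-visitor personalization possible.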

Viewing results

To view the results of your A/B tests and machine learning loops, ensure you've set up the event types you want to track, then build a funnel to compare conversion rates across different arms.
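The comparison a funnel makes can be sketched as a small calculation: for each arm, divide the number of exposed units that went on to convert by the number of exposed units. The data and the `conversion_by_arm` helper below are made up for illustration and are not produced by Hypertune.

```python
from collections import Counter

# Hypothetical exposure data: unit ID -> arm the unit was assigned.
exposures = {
    "u1": "Test", "u2": "Test", "u3": "Test", "u4": "Test",
    "u5": "Control", "u6": "Control", "u7": "Control", "u8": "Control",
}

# Hypothetical conversion events: unit IDs that signed up.
# "u9" was never exposed to the test, so it is ignored below.
sign_ups = {"u1", "u2", "u3", "u6", "u9"}


def conversion_by_arm(exposures, converted):
    """Return the conversion rate (converted / exposed) for each arm."""
    exposed = Counter(exposures.values())
    conversions = Counter(
        arm for unit, arm in exposures.items() if unit in converted
    )
    return {arm: conversions[arm] / exposed[arm] for arm in exposed}


rates = conversion_by_arm(exposures, sign_ups)
print(rates)  # {'Test': 0.75, 'Control': 0.25}
```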
