The Complete Playbook

100+ Growth Experiments for B2B Teams

A tactical, category-by-category guide to running experiments across every growth surface, from your website to paid ads, email, events, content, and beyond.

103 Experiments · 8 Categories · 7 Attributes each
Ed Farraye
Growth & Marketing Leader  ·  12+ years in B2B SaaS
Ed has led growth and marketing programs across early-stage startups and high-growth B2B companies, building go-to-market engines from zero and scaling them through Series A, B, and beyond. His work spans demand generation, product-led growth, lifecycle marketing, brand, and growth engineering, with a bias toward fast experimentation, rigorous attribution, and pragmatic execution over theory. This guide is a distillation of the experiments, frameworks, and hard lessons accumulated across those programs.
Some companies I've worked with or advised: HashiCorp, Samsara, Reddit, Linear, Warp, LangChain, Graphite, Stainless, Resolve.ai, Gusto, Stytch, Modal, Push Security

The right mindset before you run a single experiment

1. Most experiments will fail, and that's the point. The goal is not to find experiments that work. The goal is to run enough quality experiments to find the few that have outsized impact. A 20–30% success rate is healthy. What matters is that you're learning fast, documenting everything, and iterating relentlessly. The wins compound; the losses teach you where not to spend your next dollar.
2. Ship quality experiments quickly, but don't conflate speed with sloppiness. A fast, well-formed experiment with a clear hypothesis and proper tracking beats a slow, perfectly designed one every time. But a fast experiment with no baseline and broken attribution is just noise. The discipline is getting to "good enough to learn from" as fast as possible, not "perfect before we launch."
3. Pace yourself: close experiments before opening new ones. One of the most common growth mistakes is running too many experiments simultaneously. When you have five tests running at once, you rarely know which one moved the needle. Give each experiment a minimum run time, let it close, draw a conclusion, share it with the team, and then move to the next. The cadence of learning, not the volume of tests, is what compounds over time.
4. Document everything, even the failures. A failed experiment documented well is worth more than a successful one nobody wrote down. Your experiment tracker is your institutional memory. Future team members, new hires, and your future self will thank you for a clear record of what was tried, what the hypothesis was, and what happened.
How to use this guide

This guide is designed to be used in two ways: as a starting point for teams building their first experiment roadmap, and as a reference library for experienced growth practitioners looking for ideas they may not have considered. You don't need to read it front to back: jump to the category most relevant to your current focus, pick 2–3 experiments that match your available resources and ICP, and start there.

🗂  8 experiment categories
Experiments are organized by surface area: Website, Digital Ads, Content, Social Media, Events, Other Sponsorships, Email & User Comms, and PR & Influencers. Each category is self-contained: you can work through one at a time or jump across categories based on what your team needs most.
🧪  7 attributes per experiment
Every experiment card includes: a description, metrics to monitor, tools & technologies, a step-by-step guide, variations to run after the initial test, considerations before you start, and expected results. You should be able to walk away from any card with everything you need to execute.
Start with pre-work
Before running any experiment, work through the "Before You Start" tab. Skipping the pre-work, particularly metric alignment and tracking setup, is the single most common reason experiment results are inconclusive or misleading. Don't skip it.
📊  Prioritize ruthlessly
Not every experiment deserves your time right now. Use a simple ICE score (Impact × Confidence × Ease) to rank your shortlist. High-impact experiments on your highest-traffic or highest-intent surfaces almost always come first. Pick 2–3 to run at a time, not 10.
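To make ICE concrete, here's a minimal scoring sketch in Python; the experiment names and 1–10 scores are hypothetical placeholders, and the scale is whatever your team agrees on.

```python
# Minimal ICE scoring sketch: rank a shortlist of candidate experiments.
# Names and scores are hypothetical; score each dimension 1-10.
experiments = [
    {"name": "Homepage hero A/B test", "impact": 8, "confidence": 6, "ease": 7},
    {"name": "Pricing page transparency", "impact": 6, "confidence": 7, "ease": 9},
    {"name": "LinkedIn Lead Gen Forms", "impact": 7, "confidence": 5, "ease": 6},
]

for e in experiments:
    e["ice"] = e["impact"] * e["confidence"] * e["ease"]

# Highest ICE first; run the top 2-3, not all of them.
for e in sorted(experiments, key=lambda x: x["ice"], reverse=True):
    print(f"{e['name']}: ICE = {e['ice']}")
```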
🧭  Let your context guide where you start
Not all experiments are equally accessible given your team's resources and constraints, and that's fine. If design resources are hard to come by, lean into experiments with lower creative overhead first: LinkedIn InMail campaigns, content initiatives like SEO refreshes, or email sequences that live or die on copy alone. If you've already identified your weakest link as the bottom of the funnel, focus there: post-demo follow-up sequences, in-product upsell placement tests, and closed-lost re-engagement will move the needle faster than a brand awareness campaign. The best experiment roadmap is the one that's matched to where you are, not where you aspire to be.
What's inside every experiment card

Each of the 103 experiments in this guide follows a consistent structure so you can quickly assess fit and move to execution.

Description: what you're testing, why it matters, and what problem it solves
Metrics to monitor: primary KPI and supporting signals to watch
Tools & technologies: recommended stack for execution, with alternatives
Step-by-step guide: max 10 steps, written to be immediately actionable
Variations: follow-on tests to run once the initial experiment concludes
Considerations: prerequisites, watch-outs, and things to verify before launch
Expected results: benchmarks, typical lift ranges, and what success looks like
Before you run any experiment

Anyone can pick an experiment from this guide and start running it today. But the teams that get the most out of this playbook are the ones who do the pre-work first. These five steps don't take long, but skipping them is the fastest way to run a lot of experiments that teach you nothing.

1. Align on your north star metric, and your near-term proxy
Every company ultimately wants to influence revenue. But depending on your monetization model and sales motion, revenue may be a lagging indicator that takes 6–12 months to show up after an experiment runs. Before you test anything, align your team on the metric that will actually tell you whether an experiment worked in a timeframe you can act on.

If you're a complex enterprise SaaS with a 6-month sales cycle, your experiment metric is probably new qualified meetings or 1-year ACV pipeline, not closed revenue. If you're a PLG product where users can upgrade within hours of signing up, you might measure free-to-paid conversion or trial activation rate within 72 hours. If you're somewhere in between, you might track demo requests or sign-up starts as your primary proxy.

There's no universal right answer. The right answer is the one your whole team agrees to measure before the experiment launches, and doesn't change mid-flight.
Examples: New qualified meetings, 1-year ACV pipeline, demo requests, trial activations, free-to-paid conversion rate, sign-up starts, MQL volume.
2. Set up your tracking, and quantify its expected gaps
You cannot accurately measure what you haven't tracked. Before launching any experiment, audit your full attribution stack: UTM parameter coverage, conversion event configuration in GA4, CRM attribution logic, and ad platform pixel health.

The critical thing most teams skip is quantifying their expected data gap. No tracking setup is 100% accurate, and pretending it is leads to false conclusions. In most industries, UTM parameter coverage runs at 60–80% of actual traffic. But for some ICPs (cybersecurity professionals, developers, enterprise IT buyers), ad blocker rates can be 30–50%+, which means your measured traffic and conversions may represent significantly less than half of what's actually happening.

Spend time before your first experiment understanding what your coverage looks like. If your tracking captures 65% of conversions, a result that looks flat might actually be a 20% lift you're not seeing. Set thresholds accordingly.
High ad-blocker industries: cybersecurity, developer tools, IT/infrastructure, and privacy-conscious enterprise buyers. Expect materially lower UTM coverage and plan for it.
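To see how a coverage gap can hide a real result, here's a minimal sketch, assuming hypothetical coverage rates that differ between arms; the CVRs are illustrative, not benchmarks.

```python
# Sketch: uneven tracking coverage can make a true lift look flat.
# All numbers are hypothetical.

true_cvr_control = 0.023   # what actually happens on control
true_cvr_variant = 0.0276  # a true +20% lift on the variant

# Coverage differs by arm, e.g. the variant draws a more ad-blocked audience:
coverage_control = 0.80    # tracking captures 80% of control conversions
coverage_variant = 0.66    # but only 66% of variant conversions

measured_control = true_cvr_control * coverage_control
measured_variant = true_cvr_variant * coverage_variant

print(f"Measured control CVR: {measured_control:.2%}")  # ~1.84%
print(f"Measured variant CVR: {measured_variant:.2%}")  # ~1.82%, looks flat
```

A uniform gap scales both arms equally, but any asymmetry in coverage biases the comparison itself, which is why understanding your coverage before the first experiment matters.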
3. Define and document your ICP
Every experiment in this guide is more effective when it's aimed at a specific, well-defined audience. Before you start testing, make sure you have a clear, written definition of your Ideal Customer Profile, both at the individual level and the organizational level.

At the individual level: job title, seniority, department, and the specific pain they're trying to solve. At the organizational level: company size, industry verticals, tech stack signals, growth stage, and any other firmographic characteristics that predict a good fit.

The more tailored your experiment experience is to that specific person, the better it will perform. "Anyone at a 50–5,000 person company" is a universe, not an audience. "VP of Engineering at a Series B fintech company using AWS with a team of 20–100 engineers" is an audience you can write for, target precisely, and measure accurately.
Your ICP definition should be a living document: update it every quarter as you close deals and learn more about who actually converts and retains.
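One way to keep that living document honest is to store the ICP as structured data instead of prose, so changes are explicit and reviewable. A hypothetical sketch follows; every field and value is illustrative, not a prescribed schema.

```python
# Hypothetical ICP definition as structured, versionable data.
from dataclasses import dataclass

@dataclass
class ICP:
    # Individual level
    titles: list[str]
    seniority: str
    department: str
    pains: list[str]
    # Organizational level
    employee_range: tuple[int, int]   # (min, max) employees
    industries: list[str]
    tech_stack_signals: list[str]
    growth_stage: str
    last_reviewed: str = "2026-Q1"    # bump on the quarterly review

icp = ICP(
    titles=["VP of Engineering", "Head of Platform"],
    seniority="VP / Director",
    department="Engineering",
    pains=["slow CI pipelines", "flaky deploys"],
    employee_range=(50, 500),
    industries=["fintech"],
    tech_stack_signals=["AWS", "Kubernetes"],
    growth_stage="Series B",
)
```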
4. Establish your baselines
Before testing a new variant, you need something to compare against. Capture current conversion rates, traffic volumes, open rates, CPL, or whatever metric your experiment will influence, for a minimum of 2 full weeks before you launch.

Without a baseline, a result is just a number. With a baseline, it becomes a signal. If your homepage is converting at 2.3% today and your new hero variant drives 3.1%, that's a 35% lift worth understanding and building on. If you didn't know the 2.3%, you have no context for the 3.1%.

This applies even to experiments where you're not running an A/B test: if you're launching a new nurture sequence, know your current lead-to-opportunity rate before you turn it on.
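For reference, the lift arithmetic from the homepage example above:

```python
# Relative lift from the 2.3% -> 3.1% homepage example.
baseline_cvr = 0.023
variant_cvr = 0.031

relative_lift = (variant_cvr - baseline_cvr) / baseline_cvr
print(f"Relative lift: {relative_lift:.0%}")  # ~35%

# Without the 2.3% baseline, the 3.1% on its own is just a number.
```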
5. Set minimum run time, sample size, and confidence threshold before you launch
Ending an experiment early because it looks good (or bad) is one of the most common and costly mistakes in growth. Define your success criteria before the experiment starts, and commit to not acting on results until those criteria are met.

At minimum: set a minimum run time of 2 full business weeks to account for weekly seasonality. Calculate your required sample size upfront — if you only have 50 visitors per day, a 2-week test may not give you enough data to be conclusive, and you need to know that before you start.

On confidence thresholds: the industry standard is 95%, and that's the right bar when you have the traffic to reach it. But don't let perfect be the enemy of useful. If you're at an earlier stage with lower data volume, there's nothing wrong with acting on 80% or 90% confidence, especially when the cost of waiting for 95% is weeks of additional run time and the cost of being wrong is low. Be honest with yourself about which decisions warrant the higher bar and which don't. A homepage hero test on a high-traffic site should clear 95%. A headline test on a page with 30 visitors a day probably shouldn't wait that long.

Also define what you'll do with inconclusive results. "No significant difference" is a valid and useful result — it tells you that the variable you tested doesn't materially affect conversion, which is worth knowing.
Avoid running experiments over major holidays, product launches, or other known seasonal events — external factors will contaminate your results unless that contamination is intentional and accounted for.
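To sanity-check feasibility before launch, here's a minimal sample-size sketch using the standard normal-approximation formula for a two-proportion test; it assumes scipy is available, and the CVRs are the illustrative numbers from the baseline example.

```python
# Required sample size per arm for a two-proportion test
# (normal approximation; illustrative numbers, not benchmarks).
from scipy.stats import norm

def sample_size_per_arm(p1: float, p2: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors per arm needed to detect a shift from p1 to p2."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-sided test
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return int((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2) + 1

# Detecting a 2.3% -> 3.1% CVR change at 95% confidence, 80% power:
print(sample_size_per_arm(0.023, 0.031))  # ~6,440 visitors per arm

# At 50 visitors/day split across two arms, that's the better part of a year,
# exactly the kind of math to run before you launch, not after.
```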
The experiment tracker

Keep a central record of every experiment you run, including the ones that fail. Your experiment tracker is your team's institutional memory. It prevents you from re-running tests that already have answers, surfaces patterns across experiments over time, and creates accountability for closing tests and drawing conclusions before opening new ones.

📋  Experiment tracker template  ·  10 columns, one row per experiment

Experiment Name | Experiment Category | Start Date | End Date | Hypothesis | Metric & Baseline | Owner | Estimated Impact | Estimated Cost ($ & time) | Results & Next Steps
Homepage hero A/B test | Website | Jun 2 | Jun 16 | Adding video above fold will increase demo requests by 15%+ | Demo request CVR, baseline 2.3% | Sarah M. | +15% CVR lift (~12 additional demos/mo) | $500 design; 4h setup | 3.1% CVR (+35%) — ship winner; test CTA copy next
LinkedIn Lead Gen Forms — demo requests | Digital Ads | Jun 5 | Jun 26 | LGF format will reduce CPL vs. landing page by 20%+ | CPL, baseline $180 | James T. | –$36/lead; target $120 CPL | $3,000 budget; 3h setup | $142 CPL (–21%) — scale budget; test new audience segment
Pricing page transparency test | Website | Jun 9 | Jun 23 | Adding 'starts at' price will improve pricing CVR | Pricing CVR, baseline 1.1% | Sarah M. | +20–30% CVR lift (2–3 extra demos/mo) | $0; 2h copy + 1h dev | Inconclusive (n too small) — extend run to Jul 7
Your experiment here | Category | YYYY-MM-DD | YYYY-MM-DD | If we do X, Y metric will change by Z | Metric, baseline value | Name | +/– expected delta | $X; Yh time | Result — next action
Download the full tracker template
10 columns · 8 sample rows across all categories · ready for Excel or Google Sheets
Pacing rule: Close every experiment before opening a new one in the same surface area. Share conclusions with your team in writing, even a 3-sentence Slack summary. The discipline of closing loops is what separates teams that compound their learning from teams that run a lot of tests and never quite understand why things are or aren't working.
CHANGELOG v8 · 1 new experiment · Category renamed · TTI filter added · Last updated: March 15, 2026
Website and Product +1: Sign-up auth method optimization — test GitHub, Google, LinkedIn, Microsoft and email auth combinations to maximize sign-up completion rate for your ICP. Category renamed from "Website" to "Website and Product".
All experiments: slug fields and Time To Implement (TTI) metric added. TTI filter now available on the experiments tab.
Website +2: Self-serve checkout & billing flow optimization; Urgency & scarcity mechanics.
Email +2: In-product onboarding flow experiment; Loyalty & rewards program structure test.
Content +1: In-product cross-sell & upsell placement test (moved to Email & User Comms as it lives inside the product).
Total: 103 experiments across 8 categories.
103 Total · 19 Website & Product · 15 Digital Ads · 13 Content · 9 Social · 8 Events · 12 Sponsorships · 19 Email · 8 PR