The conversation stayed with me. Specifically the moment you pulled out a piece of paper and asked me to diagram the audiobook pipeline. I didn't get there. I've been thinking about it since.
What follows is what I built in the days after we met. The attribution framework we spent most of our time on. The pipeline I should have drawn in the room. Some follow-on thinking about the backlist case. And a few things I keep coming back to.
This is the kind of problem worth obsessing over.
We spent most of our time here, and I've kept thinking about it. The ROAS gap between UK and US isn't primarily a campaign execution problem. It's a measurement problem. You can't optimize what you can't see. Here's how I'd build the visibility stack, in sequence.
Free URL-based tracking tags appended to every off-Amazon traffic source: Google Ads, Meta, TikTok, email sends, influencer posts. When someone clicks through, the tag captures impressions, detail-page views, add-to-carts, and purchases on Amazon within a 14-day window, all tied back to the originating channel and campaign. Your team is already starting to roll these out. This is Phase 1: it closes the most obvious loop so you can finally see which external spend is driving Amazon revenue, title by title.
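Mechanically, this is just stamping every outbound link before it ships. A minimal sketch, with the parameter names (`maas`, `ref_`) and tag values as placeholders rather than the exact format Amazon issues:

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def tag_url(base_url: str, channel: str, campaign: str, tag_id: str) -> str:
    """Append channel/campaign attribution parameters to a product URL,
    preserving any query parameters already on the link."""
    parts = urlparse(base_url)
    query = dict(parse_qsl(parts.query))
    query.update({"maas": tag_id, "ref_": f"{channel}_{campaign}"})
    return urlunparse(parts._replace(query=urlencode(query)))

# One tagged link per (title, channel, campaign) triple:
link = tag_url("https://www.amazon.com/dp/B00EXAMPLE",
               channel="meta", campaign="spring_backlist", tag_id="tag123")
```

The point is that the tag is generated per channel and campaign at link-creation time, so the reporting rolls up cleanly without anyone hand-tagging sends.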
Amazon Marketing Cloud (AMC) is a privacy-safe data clean room built on AWS. You run custom SQL queries on Amazon's event-level data: path-to-purchase, cross-channel overlap, time-to-conversion by campaign type. In early 2025, AMC expanded to 28-day lookback windows and added first-touch and equal-weight attribution models alongside its default last-touch. Amazon Publisher Cloud, a separate product launched in 2023 and expanded in 2025, extends this by letting HC match its own first-party signals (Bible Gateway's 20M+ monthly visitors, Epic Reads, email lists) against Amazon's retail and behavioral data inside a clean room, without either party seeing the other's raw data. The output is audience segments HC actually owns and can activate programmatically. Everything in the attribution stack depends on getting this right.
Attribution tags only capture what they can track. Organic BookTok virality, editorial coverage, podcast placements: none of that shows up in a tracking link. A Bayesian media mix model estimates revenue contribution for every channel, including the ones you can't directly measure. Google released Meridian in January 2025, which is the current state of the art for full Bayesian MMM. It produces full probability distributions for ROI estimates rather than point estimates, with native support for Google Search and YouTube data. Calibrate it with geo-holdout tests: hold spend back in select markets, measure the revenue delta, and use that as ground truth to validate the model. At The Atlantic I used this same approach and found organic search was worth 40% more than last-touch attribution suggested. I'd expect the same pattern here.
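The geo-holdout readout itself is simple arithmetic: compare revenue in matched markets with spend on versus held back, and use the delta as ground truth against the MMM's posterior. The numbers below are illustrative, not case data:

```python
def holdout_lift(test_revenue, control_revenue, spend):
    """Incremental revenue and implied ROAS from a geo-holdout test.
    test_revenue/control_revenue: per-market revenue lists for matched
    markets with spend running vs. held back."""
    incremental = sum(test_revenue) - sum(control_revenue)
    return incremental, incremental / spend

inc, roas = holdout_lift(
    test_revenue=[120_000, 95_000, 88_000],     # markets with spend on
    control_revenue=[100_000, 90_000, 80_000],  # matched holdout markets
    spend=10_000,
)
# If the MMM's posterior ROI interval for the channel doesn't cover `roas`,
# the priors need recalibrating before you trust its organic estimates.
```

Market matching matters more than the math; pair markets on baseline sales and seasonality before reading the delta as causal.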
From the backlist case data: author new releases drive +88% on their backlist titles. Adaptations drive +222%. Neither lift gets captured by campaign attribution because no campaign caused them. A halo model tracks revenue lift from forecastable events (new releases, film announcements, major press) and attributes it correctly, even when no ad spend touched the title. You already know which authors have releases coming. You know which titles are in screen development. The model pre-positions those titles and measures the true return on author investment, not just campaign spend.
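As a sketch of what the halo ledger computes: take a title's baseline run rate, apply the event's lift multiplier (the +88% / +222% figures from the case data), and book the incremental revenue to the event rather than to any campaign. The 12-week halo window is my assumption, not a case number:

```python
# Lift multipliers from the case data; window length is an assumption.
EVENT_LIFT = {"author_new_release": 0.88, "adaptation": 2.22}

def halo_revenue(baseline_weekly_revenue: float, event_type: str,
                 weeks: int = 12) -> float:
    """Expected incremental backlist revenue over the halo window,
    attributed to the event rather than to ad spend."""
    return baseline_weekly_revenue * EVENT_LIFT[event_type] * weeks

# A backlist title doing $5k/week when a film adaptation is announced:
print(round(halo_revenue(5_000, "adaptation")))
```

In practice you'd fit the multipliers per genre and author tier from historical events rather than hard-coding the portfolio averages.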
Phase 1 closes Amazon's closed loop. Phase 2 builds visibility into what Amazon won't share. Phase 3 attributes revenue no campaign took credit for. Campaign-level tests show 4–17× ROAS. Title-level tracking shows 0.2–2.4×. Same campaigns, same catalog. The gap is measurement, not marketing performance. Don't scale spend until Phase 1 is in place.
This is the diagram you asked me to draw in the room. Here's what I think the full stack looks like: an end-to-end system for generating Audible-compliant audiobooks from backlist text, with AI agents handling quality validation at each step.
When a chunk fails validation, a repair agent adjusts synthesis prompt parameters (speaking rate, stability, style settings), splits the chunk further if length was the problem, and swaps the voice model if drift is the cause, retrying up to 3 times before flagging for human review.
An assembly pass then crossfades chunks at split points with 10–20 ms overlap to eliminate audible seams, normalizes loudness across the full assembled file, and applies a room-tone consistency pass so silence between sections sounds uniform.
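The repair loop is the part worth pinning down precisely. A sketch of the control flow, assuming a `synthesize(text, params)` callable and a `passes_qc(audio)` check; the parameter names and voice IDs are illustrative, not a real ElevenLabs or Play.ai API:

```python
MAX_RETRIES = 3

def repair_chunk(chunk: str, synthesize, passes_qc):
    """Retry a failed chunk, adjusting strategy per failure cause.
    Returns (audio, attempt_number) on success, (None, "human_review")
    after exhausting retries."""
    params = {"speaking_rate": 1.0, "stability": 0.5, "voice": "narrator_a"}
    pieces = [chunk]  # may grow if we split on length failures
    for attempt in range(1, MAX_RETRIES + 1):
        audio = b"".join(synthesize(p, params) for p in pieces)
        ok, cause = passes_qc(audio)
        if ok:
            return audio, attempt
        if cause == "length":       # chunk too long: halve every piece and retry
            pieces = [half for p in pieces
                      for half in (p[:len(p) // 2], p[len(p) // 2:])]
        elif cause == "drift":      # voice drift: swap the voice model
            params["voice"] = "narrator_b"
        else:                       # otherwise nudge stability upward
            params["stability"] = min(1.0, params["stability"] + 0.2)
    return None, "human_review"     # flag for a human, never ship silently
```

The key design choice is that each failure cause maps to exactly one remedy, so the retry history tells you why a chunk was hard, which feeds back into chunking heuristics.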
After our conversation I looked more closely at where each major provider actually sits on quality, stability, and cost for long-form production at scale:
| Provider | Quality ceiling | Long-form notes | Approx. cost / hr audio | Verdict |
|---|---|---|---|---|
| ElevenLabs (v3 / v2) | Highest naturalness; best emotional range and contextual nuance | Per-request context limit (~800 chars); chunking pipeline solves this directly | ~$15–30 | Premium choice |
| Play.ai (PlayHT) | Strong for US voices; slightly less expressive range than ElevenLabs | More consistent across long-form; fewer voice-drift issues per-request | ~$8–15 | Scale tier |
| Kokoro (open-source) | 82M-parameter model; competitive for neutral narration; limited emotional range | Self-hosted; no API limits; 6-hr audiobook generates in ~4 min on free GPU | ~$0.06 (compute) | Cost floor |
| Google Cloud TTS | Reliable but perceptibly synthetic at long-form; limited voice range | Excellent API stability and enterprise SLAs; best for structured/reference content | ~$10–20 | Not for frontlist |
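The table turns into catalog economics quickly. Taking the midpoint of each cost range and assuming a 10-hour title (both assumptions, for order-of-magnitude only):

```python
# Midpoints of the per-hour cost ranges in the table above.
COST_PER_HOUR = {"elevenlabs": 22.5, "playht": 11.5,
                 "kokoro": 0.06, "gcloud": 15.0}

def catalog_cost(provider: str, hours_per_title: float = 10,
                 titles: int = 1) -> float:
    """Rough synthesis cost for a slice of the catalog."""
    return COST_PER_HOUR[provider] * hours_per_title * titles

# 1,000 backlist titles: premium tier vs. cost floor
print(catalog_cost("elevenlabs", titles=1_000))
print(catalog_cost("kokoro", titles=1_000))
```

The spread is roughly three orders of magnitude, which is why the tiering argument (premium voices for frontlist-adjacent titles, Kokoro for deep backlist) holds even if individual price points move.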
Three things the case data made clear, and what I'd do about each.
The titles that did get attention grew 88–222% YoY. Author new release halos drive +88% on backlist titles. Film and TV adaptations drive +222% and generate 3.1× average revenue per title. These are forecastable events. You know which authors have releases coming, which titles are in screen development. Build a trigger-based system that automatically pre-positions and launches backlist campaigns when those events fire. No manual planning required per title. The system does it.
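The trigger rule itself is small. A sketch assuming an event calendar and an author-to-backlist index; the 6-week lead time and data shapes are my assumptions:

```python
from datetime import date, timedelta

LEAD_TIME = timedelta(weeks=6)  # pre-position campaigns ahead of the event

def campaigns_to_launch(events, backlist, today):
    """Return (title, event_type) pairs whose backlist campaigns should be
    live today: inside the window from LEAD_TIME before the event up to
    the event date itself."""
    launches = []
    for event in events:
        if event["date"] - LEAD_TIME <= today <= event["date"]:
            for title in backlist.get(event["author"], []):
                launches.append((title, event["type"]))
    return launches

events = [{"author": "A. Author", "type": "adaptation",
           "date": date(2025, 9, 1)}]
backlist = {"A. Author": ["Old Title 1", "Old Title 2"]}
print(campaigns_to_launch(events, backlist, date(2025, 8, 1)))
```

Run it daily against the release calendar and the screen-development tracker, and the pre-positioning happens without anyone planning per title.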
E-books are growing 43% YoY and carry the highest margin in the portfolio. The $2.99 promotional test returned 17× ROAS. The format responds aggressively to pricing. US changes e-book prices ~2× per year per ISBN. UK runs weekly. The gap is operational, not technical. A lightweight pricing engine on the elastic 30% of the catalog, trained on conversion data by genre, author tier, and seasonality, is achievable in 60–90 days with existing infrastructure. No new platforms needed.
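At its simplest, the engine is a margin-maximizing rule over observed conversion rates. The demand numbers and unit cost below are placeholders, not case data:

```python
def best_price(price_points, conv_rate, unit_cost: float = 0.30) -> float:
    """Pick the price maximizing expected margin per impression:
    conversion rate at that price times per-unit margin."""
    return max(price_points, key=lambda p: conv_rate(p) * (p - unit_cost))

# Illustrative demand curve for a backlist thriller e-book,
# estimated per genre / author tier / season:
demand = {2.99: 0.040, 4.99: 0.022, 6.99: 0.012}
print(best_price(demand.keys(), demand.get))
```

The real version swaps the hard-coded dict for conversion curves fitted from sales data, but the decision rule stays this simple, which is the point: weekly repricing is an ops loop, not a modeling project.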
Campaign-level tests show the spend works. Title-level tracking can't see the full causal chain because last-touch attribution at 30% coverage misses most of what drove the purchase. Deploying major backlist spend before closing the attribution gap (detailed in Section 1) is the wrong sequence. Phase 1 takes weeks. The ROI on getting measurement right before scaling is enormous.
A few things I've been pulling on since our conversation. Not fully formed, but worth digging into if given the chance.
Identify questions being asked on ChatGPT, Gemini, and Perplexity about topics HC titles answer. Generate authoritative long-form content at scale. Route readers to purchase. ChatGPT referrals already convert at 15.9% vs. 2–3% for organic search, without any optimization yet. The leverage across 200K titles is real. You mentioned your UK team is already doing this. The US isn't.
BookTok drove $760M+ in print sales in 2024. TikTok Shop now supports direct in-app purchase, and the Shop API gives more attribution signal than Amazon does. Your philosophy of finding format holes before they get competitive applies directly here. The measurement infrastructure should be in place now, before the channel gets expensive. TikTok Shop hit $15.8B in US sales in 2024, growing 650% in 16 months.
HC doesn't know who its readers are; Amazon keeps that. Amazon Publisher Cloud, covered in Section 1, is the fix: clean-room matching of HC's first-party signals against Amazon's retail data, without exposing individual identity, yielding audience segments HC actually owns and can activate programmatically. Beyond powering the attribution stack, it's the foundation for breaking the one-retailer dependency over time.
The pipeline in Section 2 solves the production cost problem. Voice selection is still manual and intuitive. A model that matches voice characteristics (pace, warmth, accent, register) to genre, subgenre, and reader demographics, trained on Audible completion rates and listener ratings, could improve acceptance rates and satisfaction scores at scale. The training signal exists in Audible's public review system right now.
The UK runs weekly pricing updates; the US runs ~2 changes per ISBN per year. Same case as above: a lightweight pricing engine on the elastic 30% of the catalog, using the simplest effective model (as you've found in the UK), achievable in 60–90 days. No new platforms, no new data sources. Measurable, attributable revenue lift without additional ad spend.