The conversation stayed with me. Specifically the moment you pulled out a piece of paper and asked me to diagram the audiobook pipeline. I didn't get there. I've been thinking about it since.
What follows is what I built in the days after we met. The attribution framework we spent most of our time on. The pipeline I should have drawn in the room. Some follow-on thinking about the backlist case. And a few things I keep coming back to.
This is the kind of problem worth obsessing over.
We spent most of our time here, and I've kept thinking about it. The ROAS gap between UK and US isn't primarily a campaign execution problem. It's a measurement problem. You can't optimize what you can't see. Here's how I'd build the visibility stack, in sequence.
Free URL-based tracking tags appended to every off-Amazon traffic source: Google Ads, Meta, TikTok, email sends, influencer posts. When someone clicks through, the tag captures impressions, detail-page views, add-to-carts, and purchases on Amazon within a 14-day window, all tied back to the originating channel and campaign. Your team is already starting to roll these out. This is Phase 1: it closes the most obvious loop so you can finally see which external spend is driving Amazon revenue, title by title.
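Mechanically, this is just stamping every outbound link before it ships. A minimal sketch, with the parameter names (`maas`, `ref_`) and tag values as placeholders rather than the exact format Amazon issues:

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def tag_url(base_url: str, channel: str, campaign: str, tag_id: str) -> str:
    """Append channel/campaign attribution parameters to a product URL,
    preserving any query parameters already on the link."""
    parts = urlparse(base_url)
    query = dict(parse_qsl(parts.query))
    query.update({"maas": tag_id, "ref_": f"{channel}_{campaign}"})
    return urlunparse(parts._replace(query=urlencode(query)))

# One tagged link per (title, channel, campaign) triple:
link = tag_url("https://www.amazon.com/dp/B00EXAMPLE",
               channel="meta", campaign="spring_backlist", tag_id="tag123")
```

The point is that the tag is generated per channel and campaign at link-creation time, so the reporting rolls up cleanly without anyone hand-tagging sends.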
Amazon Marketing Cloud (AMC) is a privacy-safe data clean room built on AWS. You run custom SQL queries on Amazon's event-level data: path-to-purchase, cross-channel overlap, time-to-conversion by campaign type. In early 2025, AMC expanded to 28-day lookback windows and added first-touch and equal-weight attribution models alongside its default last-touch. Amazon Publisher Cloud, a separate product launched in 2023 and expanded in 2025, extends this by letting HC match its own first-party signals (Bible Gateway's 20M+ monthly visitors, Epic Reads, email lists) against Amazon's retail and behavioral data inside a clean room, without either party seeing the other's raw data. The output is audience segments HC actually owns and can activate programmatically. Everything in the attribution stack depends on getting this right.
Attribution tags only capture what they can track. Organic BookTok virality, editorial coverage, podcast placements: none of that shows up in a tracking link. A Bayesian media mix model estimates revenue contribution for every channel, including the ones you can't directly measure. Google released Meridian in January 2025, which is the current state of the art for full Bayesian MMM. It produces full probability distributions for ROI estimates rather than point estimates, with native support for Google Search and YouTube data. Calibrate it with geo-holdout tests: hold spend back in select markets, measure the revenue delta, and use that as ground truth to validate the model. At The Atlantic I used this same approach and found organic search was worth 40% more than last-touch attribution suggested. I'd expect the same pattern here.
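The geo-holdout readout itself is simple arithmetic: compare revenue in matched markets with spend on versus held back, and use the delta as ground truth against the MMM's posterior. The numbers below are illustrative, not case data:

```python
def holdout_lift(test_revenue, control_revenue, spend):
    """Incremental revenue and implied ROAS from a geo-holdout test.
    test_revenue/control_revenue: per-market revenue lists for matched
    markets with spend running vs. held back."""
    incremental = sum(test_revenue) - sum(control_revenue)
    return incremental, incremental / spend

inc, roas = holdout_lift(
    test_revenue=[120_000, 95_000, 88_000],     # markets with spend on
    control_revenue=[100_000, 90_000, 80_000],  # matched holdout markets
    spend=10_000,
)
# If the MMM's posterior ROI interval for the channel doesn't cover `roas`,
# the priors need recalibrating before you trust its organic estimates.
```

Market matching matters more than the math; pair markets on baseline sales and seasonality before reading the delta as causal.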
From the backlist case data: author new releases drive +88% on their backlist titles. Adaptations drive +222%. Neither lift gets captured by campaign attribution because no campaign caused them. A halo model tracks revenue lift from forecastable events (new releases, film announcements, major press) and attributes it correctly, even when no ad spend touched the title. You already know which authors have releases coming. You know which titles are in screen development. The model pre-positions those titles and measures the true return on author investment, not just campaign spend.
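As a sketch of what the halo ledger computes: take a title's baseline run rate, apply the event's lift multiplier (the +88% / +222% figures from the case data), and book the incremental revenue to the event rather than to any campaign. The 12-week halo window is my assumption, not a case number:

```python
# Lift multipliers from the case data; window length is an assumption.
EVENT_LIFT = {"author_new_release": 0.88, "adaptation": 2.22}

def halo_revenue(baseline_weekly_revenue: float, event_type: str,
                 weeks: int = 12) -> float:
    """Expected incremental backlist revenue over the halo window,
    attributed to the event rather than to ad spend."""
    return baseline_weekly_revenue * EVENT_LIFT[event_type] * weeks

# A backlist title doing $5k/week when a film adaptation is announced:
print(round(halo_revenue(5_000, "adaptation")))
```

In practice you'd fit the multipliers per genre and author tier from historical events rather than hard-coding the portfolio averages.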
Phase 1 closes Amazon's closed loop. Phase 2 builds visibility into what Amazon won't share. Phase 3 attributes revenue no campaign took credit for. Campaign-level tests show 4–17× ROAS. Title-level tracking shows 0.2–2.4×. Same campaigns, same catalog. The gap is measurement, not marketing performance. Don't scale spend until Phase 1 is in place.
This is the diagram you asked me to draw in the room. Here's what I think the full stack looks like: an end-to-end system for generating Audible-compliant audiobooks from backlist text, with AI agents handling quality validation at each step.
When a chunk fails validation, a repair agent adjusts synthesis prompt parameters (speaking rate, stability, style settings), splits the chunk further if length was the problem, and swaps the voice model if drift is the cause, retrying up to 3 times before flagging for human review.
An assembly pass then crossfades chunks at split points with 10–20 ms overlap to eliminate audible seams, normalizes loudness across the full assembled file, and applies a room-tone consistency pass so silence between sections sounds uniform.
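The repair loop is the part worth pinning down precisely. A sketch of the control flow, assuming a `synthesize(text, params)` callable and a `passes_qc(audio)` check; the parameter names and voice IDs are illustrative, not a real ElevenLabs or Play.ai API:

```python
MAX_RETRIES = 3

def repair_chunk(chunk: str, synthesize, passes_qc):
    """Retry a failed chunk, adjusting strategy per failure cause.
    Returns (audio, attempt_number) on success, (None, "human_review")
    after exhausting retries."""
    params = {"speaking_rate": 1.0, "stability": 0.5, "voice": "narrator_a"}
    pieces = [chunk]  # may grow if we split on length failures
    for attempt in range(1, MAX_RETRIES + 1):
        audio = b"".join(synthesize(p, params) for p in pieces)
        ok, cause = passes_qc(audio)
        if ok:
            return audio, attempt
        if cause == "length":       # chunk too long: halve every piece and retry
            pieces = [half for p in pieces
                      for half in (p[:len(p) // 2], p[len(p) // 2:])]
        elif cause == "drift":      # voice drift: swap the voice model
            params["voice"] = "narrator_b"
        else:                       # otherwise nudge stability upward
            params["stability"] = min(1.0, params["stability"] + 0.2)
    return None, "human_review"     # flag for a human, never ship silently
```

The key design choice is that each failure cause maps to exactly one remedy, so the retry history tells you why a chunk was hard, which feeds back into chunking heuristics.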
After our conversation I looked more closely at where each major provider actually sits on quality, stability, and cost for long-form production at scale:
| Provider | Quality ceiling | Long-form notes | Approx. cost / hr audio | Verdict |
|---|---|---|---|---|
| ElevenLabs (v3 / v2) | Highest naturalness; best emotional range and contextual nuance | Per-request context limit (~800 chars); chunking pipeline solves this directly | ~$15–30 | Premium choice |
| Play.ai (PlayHT) | Strong for US voices; slightly less expressive range than ElevenLabs | More consistent across long-form; fewer voice-drift issues per-request | ~$8–15 | Scale tier |
| Kokoro (open-source) | 82M-parameter model; competitive for neutral narration; limited emotional range | Self-hosted; no API limits; 6-hr audiobook generates in ~4 min on free GPU | ~$0.06 (compute) | Cost floor |
| Google Cloud TTS | Reliable but perceptibly synthetic at long-form; limited voice range | Excellent API stability and enterprise SLAs; best for structured/reference content | ~$10–20 | Not for frontlist |
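The table turns into catalog economics quickly. Taking the midpoint of each cost range and assuming a 10-hour title (both assumptions, for order-of-magnitude only):

```python
# Midpoints of the per-hour cost ranges in the table above.
COST_PER_HOUR = {"elevenlabs": 22.5, "playht": 11.5,
                 "kokoro": 0.06, "gcloud": 15.0}

def catalog_cost(provider: str, hours_per_title: float = 10,
                 titles: int = 1) -> float:
    """Rough synthesis cost for a slice of the catalog."""
    return COST_PER_HOUR[provider] * hours_per_title * titles

# 1,000 backlist titles: premium tier vs. cost floor
print(catalog_cost("elevenlabs", titles=1_000))
print(catalog_cost("kokoro", titles=1_000))
```

The spread is roughly three orders of magnitude, which is why the tiering argument (premium voices for frontlist-adjacent titles, Kokoro for deep backlist) holds even if individual price points move.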
Three things the case data made clear, and what I'd do about each.
The titles that did get attention grew 88–222% YoY. Author new release halos drive +88% on backlist titles. Film and TV adaptations drive +222% and generate 3.1× average revenue per title. These are forecastable events. You know which authors have releases coming, which titles are in screen development. Build a trigger-based system that automatically pre-positions and launches backlist campaigns when those events fire. No manual planning required per title. The system does it.
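The trigger rule itself is small. A sketch assuming an event calendar and an author-to-backlist index; the 6-week lead time and data shapes are my assumptions:

```python
from datetime import date, timedelta

LEAD_TIME = timedelta(weeks=6)  # pre-position campaigns ahead of the event

def campaigns_to_launch(events, backlist, today):
    """Return (title, event_type) pairs whose backlist campaigns should be
    live today: inside the window from LEAD_TIME before the event up to
    the event date itself."""
    launches = []
    for event in events:
        if event["date"] - LEAD_TIME <= today <= event["date"]:
            for title in backlist.get(event["author"], []):
                launches.append((title, event["type"]))
    return launches

events = [{"author": "A. Author", "type": "adaptation",
           "date": date(2025, 9, 1)}]
backlist = {"A. Author": ["Old Title 1", "Old Title 2"]}
print(campaigns_to_launch(events, backlist, date(2025, 8, 1)))
```

Run it daily against the release calendar and the screen-development tracker, and the pre-positioning happens without anyone planning per title.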
E-books are growing 43% YoY and carry the highest margin in the portfolio. The $2.99 promotional test returned 17× ROAS. The format responds aggressively to pricing. US changes e-book prices ~2× per year per ISBN. UK runs weekly. The gap is operational, not technical. A lightweight pricing engine on the elastic 30% of the catalog, trained on conversion data by genre, author tier, and seasonality, is achievable in 60–90 days with existing infrastructure. No new platforms needed.
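At its simplest, the engine is a margin-maximizing rule over observed conversion rates. The demand numbers and unit cost below are placeholders, not case data:

```python
def best_price(price_points, conv_rate, unit_cost: float = 0.30) -> float:
    """Pick the price maximizing expected margin per impression:
    conversion rate at that price times per-unit margin."""
    return max(price_points, key=lambda p: conv_rate(p) * (p - unit_cost))

# Illustrative demand curve for a backlist thriller e-book,
# estimated per genre / author tier / season:
demand = {2.99: 0.040, 4.99: 0.022, 6.99: 0.012}
print(best_price(demand.keys(), demand.get))
```

The real version swaps the hard-coded dict for conversion curves fitted from sales data, but the decision rule stays this simple, which is the point: weekly repricing is an ops loop, not a modeling project.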
Campaign-level tests show the spend works. Title-level tracking can't see the full causal chain because last-touch attribution at 30% coverage misses most of what drove the purchase. Deploying major backlist spend before closing the attribution gap (detailed in Section 1) is the wrong sequence. Phase 1 takes weeks. The ROI on getting measurement right before scaling is enormous.
A few things I've been pulling on since our conversation. Not fully formed, but worth digging into if given the chance.
Identify questions being asked on ChatGPT, Gemini, and Perplexity about topics HC titles answer. Generate authoritative long-form content at scale. Route readers to purchase. ChatGPT referrals already convert at 15.9% vs. 2–3% for organic search, without any optimization yet. The leverage across 200K titles is real. You mentioned your UK team is already doing this. The US isn't.
BookTok drove $760M+ in print sales in 2024. TikTok Shop now supports direct in-app purchase, and the Shop API gives more attribution signal than Amazon does. Your philosophy of finding format holes before they get competitive applies directly here. The measurement infrastructure should be in place now, before the channel gets expensive. TikTok Shop hit $15.8B in US sales in 2024, growing 650% in 16 months.
HC doesn't know who its readers are; Amazon keeps that. Amazon Publisher Cloud, covered in Section 1, is the fix: clean-room matching of HC's first-party signals against Amazon's retail data, without exposing individual identity, yielding audience segments HC actually owns and can activate programmatically. Beyond powering the attribution stack, it's the foundation for breaking the one-retailer dependency over time.
The pipeline in Section 2 solves the production cost problem. Voice selection is still manual and intuitive. A model that matches voice characteristics (pace, warmth, accent, register) to genre, subgenre, and reader demographics, trained on Audible completion rates and listener ratings, could improve acceptance rates and satisfaction scores at scale. The training signal exists in Audible's public review system right now.
The UK runs weekly pricing updates; the US runs ~2 changes per ISBN per year. Same case as above: a lightweight pricing engine on the elastic 30% of the catalog, using the simplest effective model (as you've found in the UK), achievable in 60–90 days. No new platforms, no new data sources. Measurable, attributable revenue lift without additional ad spend.