Alvar,

Thank you again for taking the time to speak with me about the team and this role. The conversation opened up several questions I've been thinking through since, in particular the moment you pulled out the paper and asked me to diagram the audiobook pipeline. I got part of the way there, and the gap you filled in, the use case with agents and reinforcement, was fascinating to consider.

I went through my notes from the meeting and pulled together more ideas for the attribution stack, the pipeline I should have drawn in the room, and what I'd do about the backlist.

Bryan Davis
VP, Growth & Analytics candidate
Section 01

The Attribution Problem

The UK/US ROAS gap is a measurement problem, not a campaign problem. Here's the stack, in sequence.

1. Amazon Attribution Tags

  • Free URL tags on every off-Amazon source: Google, Meta, TikTok, email, influencer
  • 14-day purchase window captures impressions, page views, add-to-carts, and purchases
  • Ties each sale to the originating channel and campaign, title by title
  • Your team is rolling these out now. This is Phase 1.
Phase 1
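The mechanics are simple: every off-Amazon destination URL carries a per-channel tag, so each sale resolves back to its originating channel and campaign. A sketch of the tag registry (the `maas` parameter and all tag values below are placeholders; the real tags come out of the Attribution console):

```python
from urllib.parse import urlencode

# Placeholder tag registry: the Attribution console generates one tag
# per channel / campaign / title; these values are hypothetical.
TAGS = {
    ("meta", "fall-backlist"):   "maas_tag_meta_fb01",
    ("tiktok", "fall-backlist"): "maas_tag_ttk_fb01",
    ("email", "fall-backlist"):  "maas_tag_eml_fb01",
}

def tagged_url(asin: str, channel: str, campaign: str) -> str:
    """Destination URL for an off-Amazon ad, carrying the channel's tag."""
    return f"https://www.amazon.com/dp/{asin}?{urlencode({'maas': TAGS[(channel, campaign)]})}"
```

Once every channel's links run through a registry like this, the 14-day window reports roll up cleanly by channel and title.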
2. Amazon Marketing Cloud (AMC) + Publisher Cloud

  • AMC: privacy-safe clean room; custom SQL on Amazon event-level data: path-to-purchase, cross-channel overlap, time-to-conversion
  • 28-day lookback (expanded 2025); first-touch and equal-weight models alongside last-touch
  • Publisher Cloud: match HC first-party signals (Bible Gateway 20M+ visitors, Epic Reads, email lists) against Amazon retail data; no raw data exposed on either side
  • Output: audience segments HC owns and can activate programmatically
Phase 2
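The difference between those attribution models is easy to show on plain event paths. A minimal sketch with hypothetical touch data (AMC would run the equivalent as SQL over its event tables):

```python
from collections import defaultdict

def attribute(paths, model="last"):
    """Credit revenue to channels under first-touch, last-touch, or
    equal-weight attribution. Each path is (ordered_touches, revenue)."""
    credit = defaultdict(float)
    for touches, revenue in paths:
        if model == "first":
            credit[touches[0]] += revenue      # all credit to first touch
        elif model == "last":
            credit[touches[-1]] += revenue     # all credit to last touch
        else:
            for t in touches:                  # equal weight across path
                credit[t] += revenue / len(touches)
    return dict(credit)

# hypothetical path-to-purchase data
paths = [(["tiktok", "email", "amazon_search"], 20.0),
         (["meta", "amazon_search"], 15.0)]
```

Under last-touch, amazon_search claims everything; first-touch reveals TikTok and Meta started both paths. That gap is exactly what the UK/US ROAS comparison is hiding.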
3. Bayesian Media Mix Modeling

  • Tags only capture tracked channels. Organic BookTok, editorial, podcasts are invisible.
  • Bayesian MMM estimates revenue contribution for every channel, including untracked ones
  • Google Meridian (Jan 2025): full Bayesian; probability distributions, not point estimates; native Google/YouTube support
  • Calibrate with geo-holdout tests: hold spend in select markets, measure the delta, use as ground truth
  • At The Atlantic, the same approach showed organic search was worth 40% more than last-touch attribution suggested
Phase 2
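The geo-holdout calibration reduces to a simple delta: compare revenue growth in treated markets against markets where spend was paused. A sketch with hypothetical market numbers:

```python
def geo_holdout_lift(treated, holdout):
    """Estimate incremental lift as the ratio of revenue growth in
    treated geos to growth in held-out geos (spend paused), minus 1."""
    def growth(markets):
        pre = sum(m["pre"] for m in markets)
        post = sum(m["post"] for m in markets)
        return post / pre
    return growth(treated) / growth(holdout) - 1.0

# hypothetical pre/post revenue per market
treated = [{"pre": 100.0, "post": 130.0}, {"pre": 200.0, "post": 260.0}]
holdout = [{"pre": 150.0, "post": 156.0}]
```

Here treated markets grew 30% while holdouts grew 4%, giving ~25% incremental lift, the ground-truth number the MMM's priors get calibrated against.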
4. Halo Effect Modeling

  • New releases drive +88% on backlist titles; adaptations drive +222% and 3.1× avg revenue
  • Neither lift is captured by campaign attribution. No campaign caused them.
  • Halo model tracks lift from forecastable events: releases, film deals, major press
  • You already know which authors have releases coming, which titles are in screen development. Pre-position before the event fires.
Phase 3
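Pre-positioning is then a ranking problem: apply the lift figures to each title's baseline and sort by expected incremental revenue. A sketch (lift values come from the case data; the titles and baselines are hypothetical):

```python
HALO_LIFT = {"new_release": 0.88, "adaptation": 2.22}  # lifts from the case data

def incremental_halo(baseline: float, event: str) -> float:
    """Forecast incremental backlist revenue ahead of a known event."""
    return baseline * HALO_LIFT[event]

# hypothetical titles: (title, trailing baseline revenue, upcoming event)
events = [("title_a", 10_000.0, "adaptation"),
          ("title_b", 40_000.0, "new_release")]
ranked = sorted(events, key=lambda t: incremental_halo(t[1], t[2]), reverse=True)
```

Note the ranking is not just "adaptations first": a big-baseline title with a modest release lift can out-earn a small title with a screen deal, which is why the math should run per title.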

The sequence matters

  • Phase 1: Close Amazon's measurement loop
  • Phase 2: See what Amazon won't share
  • Phase 3: Attribute revenue no campaign claimed

4–17× ROAS in campaign tests vs. 0.2–2.4× at title-level tracking.
Same campaigns. Same catalog. The gap is measurement, not performance.
Don't scale spend before Phase 1 is in place.
Section 02

The Audiobook Pipeline

End-to-end system for Audible-compliant audiobook generation from backlist text. AI agents handle quality validation at every step.

3 AI agents orchestrate quality at every step
Agent 1
Acceptance Agent
Validates every synthesized chunk against ACX specs before it advances
Stack
  • ffprobe / ffmpeg for bit rate, RMS, peak
  • sox for noise floor
  • Whisper for pacing and pronunciation check
  • pyannote for voice drift detection
Agent 2
Remediation Agent
Receives failed chunks, adjusts parameters, retries synthesis up to 3 times
Stack
  • LLM (Claude) reasons on failure code
  • Adjusts ElevenLabs API params: stability, speaking rate, voice ID
  • Re-splits chunk if length was the cause
  • Flags to human queue after 3 failures
Agent 3
Final Validation Agent
Full-file QA on assembled audiobook before ACX submission
Stack
  • Same spec checks as Agent 1, full-file scope
  • Spectral analysis for seam artifacts at stitch points
  • Speaker verification across all chapters
  • Timestamped QA report; failed chapters loop only
Orchestration
LangGraph
Models the pipeline as a state graph. Each agent is a node; edges are conditional on pass/fail. Hundreds of chunks process in parallel. A single failure does not block the rest of the book.
State graph · Conditional edges · Async / parallel
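The conditional-edge logic for a single chunk can be sketched in pure Python. This is the control flow only, not actual LangGraph API calls; the three callables stand in for the synthesis step, Agent 1, and Agent 2:

```python
def run_chunk(chunk, synthesize, accept, remediate, max_retries=3):
    """Route one chunk through the graph's conditional edges:
    synthesize -> accept -> pass, or remediate -> retry, or human queue."""
    params = {"stability": 0.5}            # starting synthesis settings
    for attempt in range(max_retries + 1):
        audio = synthesize(chunk, params)  # Stage 2
        ok, failure_code = accept(audio)   # Agent 1
        if ok:
            return ("pass", audio)
        if attempt == max_retries:
            return ("human_review", failure_code)
        params = remediate(params, failure_code)  # Agent 2
```

In the real graph each `run_chunk` is an independent traversal, which is what lets hundreds of chunks run in parallel without one failure blocking the book.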
01 · Input: Manuscript preprocessing
  • Cleaned source text from editorial system
  • Frontmatter stripped
  • Character names and proper nouns flagged for pronunciation consistency
02 · Stage 1: Chunking Engine
  • Segments text at ≤800 chars per request (ElevenLabs per-request context limit)
  • Breaks at sentence or paragraph boundaries only, never mid-clause
  • Tags each chunk: position, chapter, voice cues (dialogue vs. narration)
  • Queued for parallel processing
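The chunking rule is simple enough to sketch directly (the sentence-splitting regex is an assumption; production would also handle abbreviations and dialogue tags):

```python
import re

def chunk_text(text: str, limit: int = 800):
    """Split text into <=limit-char chunks, breaking only at sentence
    boundaries, never mid-clause."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        candidate = f"{current} {s}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = s  # a single over-limit sentence stays whole here
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be tagged with position, chapter, and voice cues before it enters the synthesis queue.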
03 · Stage 2 (ElevenLabs): Voice Synthesis
  • Parallel batch generation via ElevenLabs API
  • Voice ID pinning: same voice profile locked across every chunk
  • Eleven v3 for premium frontlist (highest emotional range)
  • Multilingual v2 for backlist at scale
↓ every chunk
✦ Agent 1
Acceptance Agent
Validates every synthesized chunk against ACX specs before it can advance.
  • Bit rate: ≥ 192 kbps CBR MP3
  • Sample rate: 44.1 kHz
  • RMS loudness: −23 dB to −18 dB
  • Peak amplitude: ≤ −3 dBFS
  • Noise floor: ≤ −60 dBFS
  • Quality scoring: pacing, pronunciation, voice drift → pass / fail
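The acceptance check itself is a pure function over measurements (field names here are assumptions; the values would come from the ffprobe/sox step upstream):

```python
def acx_check(m):
    """Validate one chunk's measurements against the ACX retail specs."""
    failures = []
    if m["bitrate_kbps"] < 192:
        failures.append("bitrate")
    if m["sample_rate_hz"] != 44100:
        failures.append("sample_rate")
    if not -23.0 <= m["rms_db"] <= -18.0:
        failures.append("rms")
    if m["peak_dbfs"] > -3.0:
        failures.append("peak")
    if m["noise_floor_dbfs"] > -60.0:
        failures.append("noise_floor")
    return (len(failures) == 0, failures)

good = {"bitrate_kbps": 192, "sample_rate_hz": 44100,
        "rms_db": -20.0, "peak_dbfs": -3.5, "noise_floor_dbfs": -65.0}
```

Returning the failure list, not just a boolean, is what gives Agent 2 a failure code to reason on.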
✦ Agent 2
Remediation Agent
  • Adjusts speaking rate, stability, and style settings
  • Splits chunk further if length caused the failure
  • Swaps voice model if drift detected
  • Retries 3× before flagging for human review
Returns to Stage 2 for re-synthesis
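The failure-code-to-parameter mapping might look like this (the codes and adjustment sizes are assumptions; stability and speaking rate mirror ElevenLabs' documented voice settings):

```python
def remediate_params(params: dict, failure_code: str, chunk_len: int) -> dict:
    """Map an Agent 1 failure code to adjusted retry parameters."""
    p = dict(params)
    if failure_code == "pacing":
        # slow delivery slightly on a pacing failure
        p["speaking_rate"] = round(p.get("speaking_rate", 1.0) * 0.9, 2)
    elif failure_code == "voice_drift":
        # raise stability to pull the voice back toward its profile
        p["stability"] = round(min(1.0, p.get("stability", 0.5) + 0.2), 2)
    elif failure_code == "length":
        # hand the chunk back to the chunker to re-split
        p["resplit_at"] = chunk_len // 2
    return p
```

Keeping the mapping deterministic for common codes and reserving the LLM for ambiguous failures keeps retries cheap and auditable.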
Pass Path
Stitch & Normalize
  • 10–20 ms crossfade at split points to eliminate audible seams
  • Loudness normalization across the full assembled file
  • Room-tone consistency pass for uniform silence
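The crossfade at each seam is a linear blend over the overlapping samples. A minimal sketch on raw sample lists:

```python
def crossfade(a, b, n):
    """Blend the last n samples of chunk a into the first n samples of
    chunk b, ramping weights linearly so the seam is inaudible."""
    seam = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight ramps toward chunk b
        seam.append(a[len(a) - n + i] * (1 - w) + b[i] * w)
    return a[:-n] + seam + b[n:]

# 15 ms at 44.1 kHz is ~660 samples; tiny illustration with n=2:
out = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 2)
```

Production stitching would work on decoded PCM buffers rather than Python lists, but the weighting logic is the same.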
↓ assembled file
✦ Agent 3
Final Validation Agent
  • Full-file compliance check against ACX retail standards
  • Voice consistency verified across all chunks
  • Stitch artifacts flagged; QA report with per-segment pass/fail
  • Failing chapters loop individually. Full book not re-processed.
Audible-Ready Audiobook (.mp3)
  • $100–300 per title (AI)
  • $1,500+ per title (human narration)
  • $6–7M cost difference at 5,000 titles
ElevenLabs

Already the active partner with existing credits. The synthesis stage is provider-agnostic by design; the agent layer sits above the API and is not tied to any single vendor. ElevenLabs is the right input here: highest naturalness, best emotional range, and the chunking pipeline directly solves its per-request context limit.
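Provider-agnostic here just means the agent layer codes against an interface, never a vendor SDK. A sketch (the `synthesize` signature is an assumption, not ElevenLabs' actual client API):

```python
from typing import Protocol

class SynthesisProvider(Protocol):
    """Any vendor that turns (text, voice_id, params) into audio bytes
    can plug in underneath the agent layer."""
    def synthesize(self, text: str, voice_id: str, **params) -> bytes: ...

class FakeProvider:
    """Stand-in for a vendor adapter; the real one wraps the vendor SDK
    behind the same method."""
    def synthesize(self, text, voice_id, **params):
        return f"{voice_id}:{text}".encode()

def synthesize_chunk(provider: SynthesisProvider, chunk: str, voice_id: str) -> bytes:
    # Agents call the interface; swapping vendors touches one adapter.
    return provider.synthesize(chunk, voice_id, stability=0.5)
```

This is also what makes the acceptance/remediation agents reusable if a second provider is added for backlist-scale batches.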

Section 03

The Backlist Case

Three things the case data made clear, and what I'd do about each.

78% of backlist titles got zero investment

The backlist doesn't have a demand problem. It has an attention allocation problem.

  • Titles that got attention grew 88–222% YoY
  • Author halos: +88% on backlist; adaptations: +222%, 3.1× avg revenue per title
  • These are forecastable: you know which authors have releases coming, which titles are in screen development
  • Fix: trigger-based system that auto-launches campaigns when events fire; no manual planning per title
59% margin on e-books, yet only 10% of backlist revenue

Every dollar shifted from print to e-book is worth approximately 2× in profit.

  • E-books: 59% margin, growing 43% YoY. Highest margin in the portfolio.
  • $2.99 promo test: 17× ROAS; the format responds strongly to price
  • US: ~2 price changes/year per ISBN; UK runs weekly
  • Fix: pricing engine on elastic 30% of catalog, trained on genre/author tier/seasonality; 60–90 days, no new platforms
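One way such an engine could pick a weekly price: maximize expected profit under a constant-elasticity demand curve (the elasticity values are assumed inputs, estimated per genre, author tier, and season):

```python
def best_price(base_price, base_units, elasticity, candidates, margin_rate=0.59):
    """Pick the candidate price that maximizes expected weekly profit,
    assuming demand scales as (price / base_price) ** elasticity."""
    def profit(p):
        units = base_units * (p / base_price) ** elasticity
        return units * p * margin_rate

    return max(candidates, key=profit)

# e-book at $9.99 selling 100/wk; candidate weekly price points
price = best_price(9.99, 100, -2.0, [2.99, 4.99, 9.99, 14.99])
```

With elasticity near −2 (highly price-sensitive, consistent with the 17× promo result) the deep promo wins; with inelastic demand the engine holds or raises price. The model is deliberately simple: the 60–90 day claim depends on not building anything fancier than this at first.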
4–17× ROAS in campaign tests vs. 0.2–2.4× title-level

Same campaigns, same catalog. The gap is measurement, not marketing performance.

  • Campaign-level: 4–17× ROAS. Title-level: 0.2–2.4×.
  • Last-touch at 30% coverage can't see the full causal chain
  • Scaling spend before closing the attribution gap is the wrong sequence
  • Fix: Phase 1 (Section 1) takes weeks. Close it first, then scale.
"The backlist isn't underperforming. It's being ignored. When it can't be measured, it keeps getting ignored."
Section 04

What I'm Thinking About

A few things I've been pulling on since our conversation. Not fully formed, but worth digging into if given the chance.

🔍
AI Revenue

LLM Content Pipelines for Discovery

  • Find questions on ChatGPT/Gemini/Perplexity that HC titles answer; generate content at scale
  • ChatGPT referrals convert at 15.9% vs. 2–3% for organic search, before any optimization
  • The UK team is doing this. The US isn't.
📱
Discovery

TikTok Shop: First-Mover Attribution

  • BookTok drove $760M+ in print sales in 2024
  • TikTok Shop: direct in-app purchase + better attribution signal than Amazon
  • $15.8B US sales in 2024, up 650% in 16 months
  • Get measurement in place before the channel gets expensive
🧩
Infrastructure

Publisher Cloud as First-Party Data Layer

  • HC has no first-party reader data. Amazon keeps it.
  • Publisher Cloud: match HC signals (Bible Gateway 20M+, Epic Reads, email) vs. Amazon retail data in a clean room
  • Output: audience segments HC owns and can activate programmatically
  • Foundation for the attribution stack and reducing Amazon dependency
🎙️
AI Product

Dynamic Voice Casting for Backlist

  • Pipeline solves cost; voice selection is still manual
  • Model: match voice characteristics (pace, warmth, accent, register) to genre and reader demographics
  • Train on Audible completion rates and listener ratings
  • The training signal exists in Audible's public review system right now
💰
Quick Win

Weekly Pricing Cadence: Fastest Revenue Path

  • US: ~2 price changes/year per ISBN. UK: weekly.
  • Pricing engine on elastic 30% of catalog: trained on genre, author tier, seasonality
  • 60–90 days, no new platforms, no new data sources
  • Measurable revenue lift without additional ad spend