Alvar,

Thank you again for taking the time to speak with me about the team and this role. The conversation opened up several questions I've been thinking through since, in particular the moment you pulled out the paper and asked me to diagram the audiobook pipeline. I got part of the way there, and the gap you filled in, the use case with agents and reinforcement, was fascinating to consider.

I went through my notes from the meeting and pulled together more ideas for the attribution stack, the pipeline I should have drawn in the room, and what I'd do about the backlist.

Bryan Davis
VP, Growth & Analytics candidate
Section 01

The Attribution Problem

The UK/US ROAS gap is a measurement problem, not a campaign problem. Here's the stack, in sequence.

1. Amazon Attribution Tags

  • Free URL tags on every off-Amazon source: Google, Meta, TikTok, email, influencer
  • 14-day purchase window captures impressions, page views, add-to-carts, and purchases
  • Ties each sale to the originating channel and campaign, title by title
  • Your team is rolling these out now. This is Phase 1.
Phase 1
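The mechanics are simple: every off-Amazon destination URL carries a per-channel tag, so each sale resolves back to its originating channel and campaign. A sketch of the tag registry (the `maas` parameter and all tag values below are placeholders; the real tags come out of the Attribution console):

```python
from urllib.parse import urlencode

# Placeholder tag registry: the Attribution console generates one tag
# per channel / campaign / title; these values are hypothetical.
TAGS = {
    ("meta", "fall-backlist"):   "maas_tag_meta_fb01",
    ("tiktok", "fall-backlist"): "maas_tag_ttk_fb01",
    ("email", "fall-backlist"):  "maas_tag_eml_fb01",
}

def tagged_url(asin: str, channel: str, campaign: str) -> str:
    """Destination URL for an off-Amazon ad, carrying the channel's tag."""
    return f"https://www.amazon.com/dp/{asin}?{urlencode({'maas': TAGS[(channel, campaign)]})}"
```

Once every channel's links run through a registry like this, the 14-day window reports roll up cleanly by channel and title.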
2. Amazon Marketing Cloud (AMC) + Publisher Cloud

  • AMC: privacy-safe clean room; custom SQL on Amazon event-level data: path-to-purchase, cross-channel overlap, time-to-conversion
  • 28-day lookback (expanded 2025); first-touch and equal-weight models alongside last-touch
  • Publisher Cloud: match HC first-party signals (Bible Gateway 20M+ visitors, Epic Reads, email lists) against Amazon retail data; no raw data exposed on either side
  • Output: audience segments HC owns and can activate programmatically
Phase 2
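The difference between those attribution models is easy to show on plain event paths. A minimal sketch with hypothetical touch data (AMC would run the equivalent as SQL over its event tables):

```python
from collections import defaultdict

def attribute(paths, model="last"):
    """Credit revenue to channels under first-touch, last-touch, or
    equal-weight attribution. Each path is (ordered_touches, revenue)."""
    credit = defaultdict(float)
    for touches, revenue in paths:
        if model == "first":
            credit[touches[0]] += revenue      # all credit to first touch
        elif model == "last":
            credit[touches[-1]] += revenue     # all credit to last touch
        else:
            for t in touches:                  # equal weight across path
                credit[t] += revenue / len(touches)
    return dict(credit)

# hypothetical path-to-purchase data
paths = [(["tiktok", "email", "amazon_search"], 20.0),
         (["meta", "amazon_search"], 15.0)]
```

Under last-touch, amazon_search claims everything; first-touch reveals TikTok and Meta started both paths. That gap is exactly what the UK/US ROAS comparison is hiding.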
3. Bayesian Media Mix Modeling

  • Tags only capture tracked channels. Organic BookTok, editorial, podcasts are invisible.
  • Bayesian MMM estimates revenue contribution for every channel, including untracked ones
  • Google Meridian (Jan 2025): full Bayesian; probability distributions, not point estimates; native Google/YouTube support
  • Calibrate with geo-holdout tests: hold spend in select markets, measure the delta, use as ground truth
  • At The Atlantic, the same approach showed organic search was worth 40% more than last-touch attribution suggested
Phase 2
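The geo-holdout calibration reduces to a simple delta: compare revenue growth in treated markets against markets where spend was paused. A sketch with hypothetical market numbers:

```python
def geo_holdout_lift(treated, holdout):
    """Estimate incremental lift as the ratio of revenue growth in
    treated geos to growth in held-out geos (spend paused), minus 1."""
    def growth(markets):
        pre = sum(m["pre"] for m in markets)
        post = sum(m["post"] for m in markets)
        return post / pre
    return growth(treated) / growth(holdout) - 1.0

# hypothetical pre/post revenue per market
treated = [{"pre": 100.0, "post": 130.0}, {"pre": 200.0, "post": 260.0}]
holdout = [{"pre": 150.0, "post": 156.0}]
```

Here treated markets grew 30% while holdouts grew 4%, giving ~25% incremental lift, the ground-truth number the MMM's priors get calibrated against.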
4. Halo Effect Modeling

  • New releases drive +88% on backlist titles; adaptations drive +222% and 3.1× avg revenue
  • Neither lift is captured by campaign attribution. No campaign caused them.
  • Halo model tracks lift from forecastable events: releases, film deals, major press
  • You already know which authors have releases coming, which titles are in screen development. Pre-position before the event fires.
Phase 3
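Pre-positioning is then a ranking problem: apply the lift figures to each title's baseline and sort by expected incremental revenue. A sketch (lift values come from the case data; the titles and baselines are hypothetical):

```python
HALO_LIFT = {"new_release": 0.88, "adaptation": 2.22}  # lifts from the case data

def incremental_halo(baseline: float, event: str) -> float:
    """Forecast incremental backlist revenue ahead of a known event."""
    return baseline * HALO_LIFT[event]

# hypothetical titles: (title, trailing baseline revenue, upcoming event)
events = [("title_a", 10_000.0, "adaptation"),
          ("title_b", 40_000.0, "new_release")]
ranked = sorted(events, key=lambda t: incremental_halo(t[1], t[2]), reverse=True)
```

Note the ranking is not just "adaptations first": a big-baseline title with a modest release lift can out-earn a small title with a screen deal, which is why the math should run per title.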

The sequence matters

  • Phase 1: Close Amazon's measurement loop
  • Phase 2: See what Amazon won't share
  • Phase 3: Attribute revenue no campaign claimed

4–17× ROAS in campaign tests vs. 0.2–2.4× at title-level tracking.
Same campaigns. Same catalog. The gap is measurement, not performance.
Don't scale spend before Phase 1 is in place.
Section 02

The Audiobook Pipeline

End-to-end system for Audible-compliant audiobook generation from backlist text. AI agents handle quality validation at every step.

3 AI agents orchestrate quality at every step
Agent 1
Acceptance Agent
Validates every synthesized chunk against ACX specs before it advances
Stack
  • ffprobe / ffmpeg for bit rate, RMS, peak
  • sox for noise floor
  • Whisper for pacing and pronunciation check
  • pyannote for voice drift detection
Agent 2
Remediation Agent
Receives failed chunks, adjusts parameters, retries synthesis up to 3 times
Stack
  • LLM (Claude) reasons on failure code
  • Adjusts ElevenLabs API params: stability, speaking rate, voice ID
  • Re-splits chunk if length was the cause
  • Flags to human queue after 3 failures
Agent 3
Final Validation Agent
Full-file QA on assembled audiobook before ACX submission
Stack
  • Same spec checks as Agent 1, full-file scope
  • Spectral analysis for seam artifacts at stitch points
  • Speaker verification across all chapters
  • Timestamped QA report; failed chapters loop only
Orchestration
LangGraph
Models the pipeline as a state graph. Each agent is a node; edges are conditional on pass/fail. Hundreds of chunks process in parallel. A single failure does not block the rest of the book.
State graph · Conditional edges · Async / parallel
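The conditional-edge logic for a single chunk can be sketched in pure Python. This is the control flow only, not actual LangGraph API calls; the three callables stand in for the synthesis step, Agent 1, and Agent 2:

```python
def run_chunk(chunk, synthesize, accept, remediate, max_retries=3):
    """Route one chunk through the graph's conditional edges:
    synthesize -> accept -> pass, or remediate -> retry, or human queue."""
    params = {"stability": 0.5}            # starting synthesis settings
    for attempt in range(max_retries + 1):
        audio = synthesize(chunk, params)  # Stage 2
        ok, failure_code = accept(audio)   # Agent 1
        if ok:
            return ("pass", audio)
        if attempt == max_retries:
            return ("human_review", failure_code)
        params = remediate(params, failure_code)  # Agent 2
```

In the real graph each `run_chunk` is an independent traversal, which is what lets hundreds of chunks run in parallel without one failure blocking the book.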
01 · Input: Manuscript preprocessing
  • Cleaned source text from editorial system
  • Frontmatter stripped
  • Character names and proper nouns flagged for pronunciation consistency
02 · Stage 1: Chunking Engine
  • Segments text at ≤800 chars per request (ElevenLabs per-request context limit)
  • Breaks at sentence or paragraph boundaries only, never mid-clause
  • Tags each chunk: position, chapter, voice cues (dialogue vs. narration)
  • Queued for parallel processing
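The chunking rule is simple enough to sketch directly (the sentence-splitting regex is an assumption; production would also handle abbreviations and dialogue tags):

```python
import re

def chunk_text(text: str, limit: int = 800):
    """Split text into <=limit-char chunks, breaking only at sentence
    boundaries, never mid-clause."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        candidate = f"{current} {s}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = s  # a single over-limit sentence stays whole here
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be tagged with position, chapter, and voice cues before it enters the synthesis queue.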
03 · Stage 2 (ElevenLabs): Voice Synthesis
  • Parallel batch generation via ElevenLabs API
  • Voice ID pinning: same voice profile locked across every chunk
  • Eleven v3 for premium frontlist (highest emotional range)
  • Multilingual v2 for backlist at scale
↓ every chunk
✦ Agent 1
Acceptance Agent
Validates every synthesized chunk against ACX specs before it can advance.
  • Bit rate: ≥ 192 kbps CBR MP3
  • Sample rate: 44.1 kHz
  • RMS loudness: −23 dB to −18 dB
  • Peak amplitude: ≤ −3 dBFS
  • Noise floor: ≤ −60 dBFS
  • Quality scoring: pacing, pronunciation, voice drift → pass / fail
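The acceptance check itself is a pure function over measurements (field names here are assumptions; the values would come from the ffprobe/sox step upstream):

```python
def acx_check(m):
    """Validate one chunk's measurements against the ACX retail specs."""
    failures = []
    if m["bitrate_kbps"] < 192:
        failures.append("bitrate")
    if m["sample_rate_hz"] != 44100:
        failures.append("sample_rate")
    if not -23.0 <= m["rms_db"] <= -18.0:
        failures.append("rms")
    if m["peak_dbfs"] > -3.0:
        failures.append("peak")
    if m["noise_floor_dbfs"] > -60.0:
        failures.append("noise_floor")
    return (len(failures) == 0, failures)

good = {"bitrate_kbps": 192, "sample_rate_hz": 44100,
        "rms_db": -20.0, "peak_dbfs": -3.5, "noise_floor_dbfs": -65.0}
```

Returning the failure list, not just a boolean, is what gives Agent 2 a failure code to reason on.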
✦ Agent 2
Remediation Agent
  • Adjusts speaking rate, stability, and style settings
  • Splits chunk further if length caused the failure
  • Swaps voice model if drift detected
  • Retries 3× before flagging for human review
Returns to Stage 2 for re-synthesis
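The failure-code-to-parameter mapping might look like this (the codes and adjustment sizes are assumptions; stability and speaking rate mirror ElevenLabs' documented voice settings):

```python
def remediate_params(params: dict, failure_code: str, chunk_len: int) -> dict:
    """Map an Agent 1 failure code to adjusted retry parameters."""
    p = dict(params)
    if failure_code == "pacing":
        # slow delivery slightly on a pacing failure
        p["speaking_rate"] = round(p.get("speaking_rate", 1.0) * 0.9, 2)
    elif failure_code == "voice_drift":
        # raise stability to pull the voice back toward its profile
        p["stability"] = round(min(1.0, p.get("stability", 0.5) + 0.2), 2)
    elif failure_code == "length":
        # hand the chunk back to the chunker to re-split
        p["resplit_at"] = chunk_len // 2
    return p
```

Keeping the mapping deterministic for common codes and reserving the LLM for ambiguous failures keeps retries cheap and auditable.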
Pass Path
Stitch & Normalize
  • 10–20 ms crossfade at split points to eliminate audible seams
  • Loudness normalization across the full assembled file
  • Room-tone consistency pass for uniform silence
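The crossfade at each seam is a linear blend over the overlapping samples. A minimal sketch on raw sample lists:

```python
def crossfade(a, b, n):
    """Blend the last n samples of chunk a into the first n samples of
    chunk b, ramping weights linearly so the seam is inaudible."""
    seam = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight ramps toward chunk b
        seam.append(a[len(a) - n + i] * (1 - w) + b[i] * w)
    return a[:-n] + seam + b[n:]

# 15 ms at 44.1 kHz is ~660 samples; tiny illustration with n=2:
out = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 2)
```

Production stitching would work on decoded PCM buffers rather than Python lists, but the weighting logic is the same.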
↓ assembled file
✦ Agent 3
Final Validation Agent
  • Full-file compliance check against ACX retail standards
  • Voice consistency verified across all chunks
  • Stitch artifacts flagged; QA report with per-segment pass/fail
  • Failing chapters loop individually. Full book not re-processed.
Audible-Ready Audiobook (.mp3)
  • $100–300 per title (AI)
  • $1,500+ per title (human narration)
  • $6–7M cost difference at 5,000 titles
ElevenLabs

Already the active partner with existing credits. The synthesis stage is provider-agnostic by design; the agent layer sits above the API and is not tied to any single vendor. ElevenLabs is the right input here: highest naturalness, best emotional range, and the chunking pipeline directly solves its per-request context limit.
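Provider-agnostic here just means the agent layer codes against an interface, never a vendor SDK. A sketch (the `synthesize` signature is an assumption, not ElevenLabs' actual client API):

```python
from typing import Protocol

class SynthesisProvider(Protocol):
    """Any vendor that turns (text, voice_id, params) into audio bytes
    can plug in underneath the agent layer."""
    def synthesize(self, text: str, voice_id: str, **params) -> bytes: ...

class FakeProvider:
    """Stand-in for a vendor adapter; the real one wraps the vendor SDK
    behind the same method."""
    def synthesize(self, text, voice_id, **params):
        return f"{voice_id}:{text}".encode()

def synthesize_chunk(provider: SynthesisProvider, chunk: str, voice_id: str) -> bytes:
    # Agents call the interface; swapping vendors touches one adapter.
    return provider.synthesize(chunk, voice_id, stability=0.5)
```

This is also what makes the acceptance/remediation agents reusable if a second provider is added for backlist-scale batches.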

Section 03

The Backlist Case

Three things the case data made clear, and what I'd do about each.

78% of backlist titles got zero investment

The backlist doesn't have a demand problem. It has an attention allocation problem.

  • Titles that got attention grew 88–222% YoY
  • Author halos: +88% on backlist; adaptations: +222%, 3.1× avg revenue per title
  • These are forecastable: you know which authors have releases coming, which titles are in screen development
  • Fix: trigger-based system that auto-launches campaigns when events fire; no manual planning per title
59% margin on e-books, yet only 10% of backlist revenue

Every dollar shifted from print to e-book is worth approximately 2× in profit.

  • E-books: 59% margin, growing 43% YoY. Highest margin in the portfolio.
  • $2.99 promo test: 17× ROAS; the format responds strongly to price
  • US: ~2 price changes/year per ISBN; UK runs weekly
  • Fix: pricing engine on elastic 30% of catalog, trained on genre/author tier/seasonality; 60–90 days, no new platforms
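One way such an engine could pick a weekly price: maximize expected profit under a constant-elasticity demand curve (the elasticity values are assumed inputs, estimated per genre, author tier, and season):

```python
def best_price(base_price, base_units, elasticity, candidates, margin_rate=0.59):
    """Pick the candidate price that maximizes expected weekly profit,
    assuming demand scales as (price / base_price) ** elasticity."""
    def profit(p):
        units = base_units * (p / base_price) ** elasticity
        return units * p * margin_rate

    return max(candidates, key=profit)

# e-book at $9.99 selling 100/wk; candidate weekly price points
price = best_price(9.99, 100, -2.0, [2.99, 4.99, 9.99, 14.99])
```

With elasticity near −2 (highly price-sensitive, consistent with the 17× promo result) the deep promo wins; with inelastic demand the engine holds or raises price. The model is deliberately simple: the 60–90 day claim depends on not building anything fancier than this at first.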
4–17× ROAS in campaign tests vs. 0.2–2.4× title-level

Same campaigns, same catalog. The gap is measurement, not marketing performance.

  • Campaign-level: 4–17× ROAS. Title-level: 0.2–2.4×.
  • Last-touch at 30% coverage can't see the full causal chain
  • Scaling spend before closing the attribution gap is the wrong sequence
  • Fix: Phase 1 (Section 1) takes weeks. Close it first, then scale.
"The backlist isn't underperforming. It's being ignored. When it can't be measured, it keeps getting ignored."
Section 04

What I'm Thinking About

A few things I've been pulling on since our conversation. Not fully formed, but worth digging into if given the chance.

🔍
AI Revenue

LLM Content Pipelines for Discovery

  • Find questions on ChatGPT/Gemini/Perplexity that HC titles answer; generate content at scale
  • ChatGPT referrals convert at 15.9% vs. 2–3% for organic search, before any optimization
  • The UK team is doing this. The US isn't.
📱
Discovery

TikTok Shop: First-Mover Attribution

  • BookTok drove $760M+ in print sales in 2024
  • TikTok Shop: direct in-app purchase + better attribution signal than Amazon
  • $15.8B US sales in 2024, up 650% in 16 months
  • Get measurement in place before the channel gets expensive
🧩
Infrastructure

Publisher Cloud as First-Party Data Layer

  • HC has no first-party reader data. Amazon keeps it.
  • Publisher Cloud: match HC signals (Bible Gateway 20M+, Epic Reads, email) vs. Amazon retail data in a clean room
  • Output: audience segments HC owns and can activate programmatically
  • Foundation for the attribution stack and reducing Amazon dependency
🎙️
AI Product

Dynamic Voice Casting for Backlist

  • Pipeline solves cost; voice selection is still manual
  • Model: match voice characteristics (pace, warmth, accent, register) to genre and reader demographics
  • Train on Audible completion rates and listener ratings
  • The training signal exists in Audible's public review system right now
💰
Quick Win

Weekly Pricing Cadence: Fastest Revenue Path

  • US: ~2 price changes/year per ISBN. UK: weekly.
  • Pricing engine on elastic 30% of catalog: trained on genre, author tier, seasonality
  • 60–90 days, no new platforms, no new data sources
  • Measurable revenue lift without additional ad spend