Live · Product · 2025

Make My Story

Your child as the hero — a personalized, illustrated book in under 4 minutes

The Problem

Children's books sold as 'personalized' usually aren't. Most let you pick from a handful of pre-drawn characters — brown hair, blue eyes, kid with glasses — and drop a name into a templated story. Kids who don't fit the available molds get left out. Parents who want to teach something specific — a value, a fear to overcome, a lesson tied to their family — can't. Human illustrators do it properly, but charge hundreds of dollars and take weeks. AI seemed like the obvious fix. It isn't: ask Claude or ChatGPT for a children's book and characters drift page to page, narratives lose their thread, and illustrations stop matching the story mid-book.

The Solution

True personalization from a single photo — no character picker, no templates. A story written from scratch around the child's actual age, interests, and whatever the parent wants to teach, with consistent character likeness across all 12 illustrations. Achieving this required solving the consistency and coherence problems that naive AI approaches can't handle: a multi-stage pipeline where each stage uses the right model, structures context deliberately, and validates output before passing it forward. The pipeline is deterministic and resumable — a failure costs a retry of one stage, not the whole book.

My son loved stories where he was the hero — not a character that kind of looked like him, but him. Creating those stories took real effort, and nothing on the market did it properly. A weekend script using LLMs proved parents wanted this. Getting it to production quality — truly personalized, visually consistent, reliably good — turned out to be a genuinely hard engineering problem.

Parents upload a photo, pick a theme, and receive a fully illustrated 12-page book in under 4 minutes — with their child as the hero, consistent likeness across every page, voice narration, and a PDF. They see a story preview before they're charged, can refine it with page-level feedback, and only commit once they're happy.

How It Works

The system decomposes book generation into discrete pipeline stages, each with one job:

  • Character analysis extracts a visual description from the photo.
  • Story generation builds an outline, then writes each page sequentially.
  • Illustration generates 12 images using that character description as a persistent reference.
  • Validation checks narrative logic before illustrations run and visual identity after each image.
  • Assembly produces the final PDF and narration.

Every step's output is stored. If anything fails mid-run, the pipeline resumes exactly where it left off.
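The staged, checkpointed design can be sketched roughly like this. It is a minimal sketch, not the actual implementation: the stage names mirror the pipeline above, and the JSON-on-disk store stands in for the real persistence layer.

```python
import json
from pathlib import Path

# Illustrative stage names mirroring the pipeline described above.
STAGES = ["character_analysis", "story_outline", "story_pages",
          "illustrations", "assembly"]

def run_pipeline(book_id, stage_fns, store_dir="runs"):
    """Run stages in order, persisting each output as it completes.

    On a re-run after a failure, stages with a stored output are loaded
    instead of re-executed, so the pipeline resumes where it left off.
    """
    run_dir = Path(store_dir) / book_id
    run_dir.mkdir(parents=True, exist_ok=True)
    context = {}
    for stage in STAGES:
        out_file = run_dir / f"{stage}.json"
        if out_file.exists():                      # checkpoint hit: skip the work
            context[stage] = json.loads(out_file.read_text())
            continue
        result = stage_fns[stage](context)         # stage sees all prior outputs
        out_file.write_text(json.dumps(result))    # persist before moving on
        context[stage] = result
    return context
```

Because each stage only reads prior outputs from `context`, a crash during illustration costs exactly one stage on retry, never the story or character analysis.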

The Hard Parts

Consistency. Keeping the same child recognizable across 12 independently generated illustrations required a specific approach to how character context is structured and passed between steps. Getting it wrong produces a beautiful first page and an unrecognizable character by the end — I got it wrong several times before getting it right.
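One way to structure that persistent character context: extract a single canonical description once, then prepend it verbatim to every per-page prompt, so each image is anchored to the same reference instead of drifting from the previous page. A minimal sketch; the field names and prompt wording here are assumptions, not the production prompts.

```python
def build_illustration_prompt(character_ref: str, scene: str, style: str) -> str:
    """Compose an image prompt that always leads with the same character block.

    `character_ref` comes from the one-time photo analysis and is reused
    verbatim on all 12 pages; only the scene description changes.
    """
    return (
        f"CHARACTER (must match exactly on every page):\n{character_ref}\n\n"
        f"ART STYLE: {style}\n\n"
        f"SCENE: {scene}"
    )
```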

Validation layers. Narrative logic is checked before illustrations are generated; visual identity is checked after each image. Catching failures early — before the most expensive operations — was a core design principle. A failed validation before image generation costs a fraction of one caught after.
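The cheap-before-expensive ordering can be sketched as a gate: a text-only story check runs before any image is rendered, and the per-image identity check drives bounded retries. All the callables here are stand-ins for the real validators and generator.

```python
def generate_validated_book(pages, check_story, render, check_identity,
                            max_retries=2):
    """Fail fast on cheap checks; spend on rendering only after the story passes."""
    issues = check_story(pages)              # cheap, text-only narrative check
    if issues:
        raise ValueError(f"story failed validation: {issues}")
    images = []
    for i, page in enumerate(pages):
        for _ in range(max_retries + 1):
            img = render(page)               # expensive image generation
            if check_identity(img):          # vision check against the source photo
                images.append(img)
                break
        else:
            raise RuntimeError(f"identity check kept failing on page {i}")
    return images
```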

Feedback loops. Users can refine the story before committing, with global or page-specific feedback injected into a re-run. After completion, individual pages can be regenerated selectively, and every generation attempt is stored — users can activate any variant or roll back for free.
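Keeping every attempt makes activation and rollback trivial bookkeeping. A minimal per-page variant store might look like this; it is an illustrative sketch, not the actual data model.

```python
class PageVariants:
    """Every generation attempt for one page is kept; exactly one is active."""

    def __init__(self):
        self.variants = []     # e.g. storage keys for rendered images
        self.active = -1

    def add(self, image_ref):
        self.variants.append(image_ref)
        self.active = len(self.variants) - 1   # newest attempt shown by default

    def activate(self, index):
        """Rollback or re-activation: point the page at any stored attempt."""
        if not 0 <= index < len(self.variants):
            raise IndexError("no such variant")
        self.active = index

    def current(self):
        return self.variants[self.active]
```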

Model assignment. Creative steps use the most capable models. Validation and classification use faster, cheaper ones — a model optimized for rule-checking produces more reliable validation than a creative model asked to follow rules.
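In code this is just an explicit task-to-model routing table rather than one default model everywhere. The model names below are placeholders, not the ones actually used.

```python
# Placeholder tiers; the real assignments map to specific Claude/Gemini models.
CREATIVE_MODEL = "large-creative-model"
CHECKER_MODEL = "small-fast-model"

TASK_MODELS = {
    "story_writing":      CREATIVE_MODEL,
    "character_analysis": CREATIVE_MODEL,
    "story_validation":   CHECKER_MODEL,
    "feedback_triage":    CHECKER_MODEL,
}

def model_for(task):
    """Fail loudly on unrouted tasks instead of silently using a default."""
    if task not in TASK_MODELS:
        raise ValueError(f"no model assigned for task: {task}")
    return TASK_MODELS[task]
```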

Evaluation

The hardest quality dimension to measure was character identity fidelity. I built a quantitative evaluation framework using vision models as judges, scoring generated images against source photos — enabling systematic A/B testing of prompt strategies with results that could be compared, not just eyeballed.

This drove threshold calibration: the identity score that decides whether a page passes or triggers a retry. Set too high, you waste money regenerating images parents would have accepted. Set too low, you ship pages that break the magic. Prompt changes that looked like obvious improvements sometimes made things measurably worse — the framework was what made it possible to tell the difference.
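A sketch of the judge-and-threshold loop, assuming the vision judge is wrapped to return a 0-1 identity score. The 0.75 threshold is illustrative; calibrating the real value was the point of the framework.

```python
import statistics

def score_pages(source_photo, images, judge):
    """Score each generated page against the source photo.

    `judge` stands in for a vision-model call that compares the two
    images and returns a 0-1 identity score.
    """
    return [judge(source_photo, img) for img in images]

def gate(score, threshold=0.75):
    """The pass/retry decision that threshold calibration tunes."""
    return "pass" if score >= threshold else "retry"

def compare_prompt_strategies(scores_a, scores_b):
    """A/B comparison on mean identity score instead of eyeballing pages."""
    return statistics.mean(scores_a) - statistics.mean(scores_b)
```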

What I Learned

Decomposition is the core skill. The gap between "ask an AI to make a book" and "produce a consistently high-quality personalized book" isn't filled by better prompting — it's filled by better problem decomposition.

Evaluation unlocks iteration. Without objective measurement, every change is a guess. Building the eval framework was slower than shipping features, but it's what made meaningful iteration possible.

Design for the distribution, not the happy path. AI output quality is a distribution. What matters is that the distribution is good enough that retry logic handles edge cases without exploding costs.

The UX of uncertainty is part of the product. Showing a preview before charging, letting parents refine before committing, surfacing progress during generation — this is inseparable from the technical architecture.

Tech Stack

Next.js 15 · TypeScript · Python 3.12 · Anthropic Claude · Google Gemini · Trigger.dev · Supabase · Stripe · Vercel · Cloudflare

Outcomes & Impact

  • Launched to market with paying customers within 3 months of inception
  • Generates a complete 12-page illustrated book in under 4 minutes
  • Built end-to-end as a solo technical founder, including ~760 automated tests
  • Quantitative evaluation framework with vision-model-as-judge for character identity fidelity