KidStory: Storybook for Kids

AI-Powered Interactive Storybooks for Children
Speak a story. Watch it come to life.

Overview

KidStory is an AI-powered platform that transforms a child's spoken or typed story idea into a fully illustrated, narrated, and interactive storybook. The app orchestrates multiple AI models to produce a cohesive multimedia experience: story text, watercolor illustrations, expressive narration, and interactive quizzes.

Story text and illustrations are generated together in a single interleaved stream, so pictures appear while the story is still being written.

For Parents

Register an account, then create stories for your children to read and enjoy together.

For Kids

Let your child speak their own story idea and watch the AI bring it to life with pictures and narration.

Key Features

  • Voice & Text Input: Children can speak or type their story ideas
  • AI Story Generation: Creates complete stories of 4-6 pages based on the child's input
  • Interleaved Illustration Generation: Story text and page illustrations generated together in one AI stream
  • Character Consistency: Upload character photos to maintain appearance across all illustrations
  • Multiple Narrator Voices: Choose from 3 AI narrator personalities with different voices
  • Interactive Storybooks: Page-turning interface with audio narration for each page
  • Magic Quiz: Post-story interactive quiz with voice input and AI-generated questions
  • Progress Tracking: Real-time visual feedback during story generation
  • Child-Safe Content: Built-in guardrails keep content age-appropriate for children aged 3-10
  • User Library: Save and revisit created stories

Technology Stack

Frontend

  • Framework: Next.js 16.1.6 (App Router)
  • UI Library: React 19.2.3
  • Language: TypeScript 5
  • Styling: Tailwind CSS 4
  • Animations: Framer Motion 12.35.0
  • State Management: @tanstack/react-query 5.90.21

Backend & AI

  • AI Models: Google Gemini via Vertex AI
    • gemini-2.5-flash-image (interleaved story + image generation)
    • gemini-2.5-flash-preview-tts (text-to-speech narration)
    • gemini-2.5-flash (text-only quiz generation)
  • AI SDK: @google/genai 1.44.0+ (Vertex AI integration with vertexai: true); this is the only AI SDK the app uses
  • Database: Firebase Firestore
  • Authentication: Firebase Auth (Google OAuth)
  • Storage: Google Cloud Storage (images & audio files)
  • Cloud Services: Google Cloud Platform

Development Tools

  • Package Manager: npm
  • Linting: ESLint 9
  • Build Tool: Next.js
  • Containerization: Docker

System Architecture

[Diagram: KidStory system architecture]

AI Generation Pipeline

1. User Input → Voice/Text Story Idea
2. Story Generation → Gemini 2.5 Flash Image (interleaved text + images)
3. Parallel Processing →
   ├─ Images → Generated via interleaved output
   └─ Audio Narration → Gemini 2.5 Flash TTS
4. Assembly → Combine images + audio + text
5. Storage → Save to Firestore + GCS
6. Delivery → Interactive storybook with quizzes
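
The parallel narration and assembly steps (3 and 4) can be sketched as follows. This is a minimal illustration, not the app's actual code: `GeneratedPage`, `AssembledPage`, and `assemblePages` are hypothetical names, and the narration call is passed in as a function so the sketch does not depend on any SDK.

```typescript
// Hypothetical shapes; the app's real types live in types/index.ts and may differ.
interface GeneratedPage {
  text: string;
  imageBase64: string; // produced inline by the interleaved story call
}

interface AssembledPage extends GeneratedPage {
  audioUrl: string | null; // null when narration failed for this page
}

// Narration is injected (assumption) so this sketch stays SDK-free.
type Narrate = (text: string) => Promise<string>;

async function assemblePages(
  pages: GeneratedPage[],
  narrate: Narrate,
): Promise<AssembledPage[]> {
  // Start every narration request at once instead of page by page.
  // A failed narration degrades to a silent page rather than failing the book.
  const audioUrls = await Promise.all(
    pages.map((page) => narrate(page.text).catch(() => null)),
  );

  // Promise.all preserves input order, so index i still matches page i.
  return pages.map((page, i) => ({ ...page, audioUrl: audioUrls[i] }));
}
```

Because all requests start before any of them is awaited, total narration time is roughly that of the slowest page rather than the sum of all pages.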

Key Technical Components

Frontend Hooks:

  • useStoryGenerator: Manages streaming story generation with real-time updates
  • useStoryAssembler: Coordinates parallel image/audio generation and assembly
  • useVoiceInput: Handles speech recognition for story input and quiz answers
  • useUserStories: Manages Firestore story library operations
  • useAuth: Firebase authentication wrapper

API Routes:

  • /api/generate-story: Streams interleaved story JSON + images from Gemini
  • /api/generate-image: Creates individual page illustrations with character reference support
  • /api/generate-audio: Generates speech using Gemini TTS with WAV output
  • /api/live-quiz: Creates interactive quizzes with text-only questions and TTS narration
  • /api/save-story: Persists completed stories to Firestore
  • /api/save-quiz-score: Records quiz results

Core Components:

  • StoryBook: Main story reader with page-turning animations
  • LiveQuizModal: Interactive quiz with voice input and AI feedback
  • CreateStoryPage: Story creation wizard with voice/text input
  • Dashboard: User story library and statistics
  • StoryPage: Individual story page with image and text

Getting Started

Prerequisites

  • Node.js 20+
  • npm or yarn
  • Google Cloud Project with the Vertex AI API enabled
  • Firebase Project with Firestore and Authentication
  • Google Cloud Storage bucket

Installation

  1. Clone and navigate to project

    bash
    cd storybook-for-kids-app
  2. Install dependencies

    bash
    npm install
  3. Configure environment variables

    Create .env.local in the project root:

    env
    # Firebase Client Configuration (from Firebase Console)
    NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
    NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
    NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
    NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
    NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
    NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id
    NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID=your_measurement_id
    
    # Firebase Admin (Service Account)
    FIREBASE_CLIENT_EMAIL=service-account@project.iam.gserviceaccount.com
    GOOGLE_CLOUD_PROJECT=your_project_id
    
    # Google Cloud Storage
    GCS_BUCKET_NAME=storybook-for-kids-media
    
    # Gemini Models (via Vertex AI)
    GEMINI_IMAGE_MODEL=gemini-2.5-flash-image
    GEMINI_TTS_MODEL=gemini-2.5-flash-preview-tts
    GEMINI_QUIZ_MODEL=gemini-2.5-flash
    
    # App Configuration
    NEXTAUTH_URL=http://localhost:3000
    NEXTAUTH_SECRET=your_secret_key_here
  4. Set up Firebase

    • Enable Google Sign-In in Firebase Authentication
    • Create Firestore database in native mode
    • Set up storage rules for your GCS bucket
  5. Set up Google Cloud

    • Enable Vertex AI API in Google Cloud Console
    • Create service account with appropriate permissions
    • Set up Application Default Credentials for Vertex AI access
  6. Run the development server

    bash
    npm run dev
  7. Open http://localhost:3000 in your browser

Development

bash
# Development server
npm run dev

# Build for production
npm run build

# Start production server
npm start

# Lint code
npm run lint

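Deployment (No Local Docker Needed)

The app deploys through Google Cloud Build, which builds the container image in the cloud, so Docker does not need to be installed on your machine:

bash
# Deploy with one command (the container build runs remotely in Cloud Build)
gcloud builds submit --config=cloudbuild.yaml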

What happens:

  1. ✅ Your code uploads to Google Cloud
  2. ✅ Dependencies installed with npm ci
  3. ✅ Next.js app built with npm run build
  4. ✅ Deployed directly to Cloud Run
  5. ✅ Environment variables set automatically
  6. ✅ HTTPS setup automatically

After deployment:

bash
# Get your app URL
gcloud run services describe storybook-for-kids --format='value(status.url)'

# Check logs
gcloud run services logs read storybook-for-kids

Project Structure

storybook-for-kids-app/
├── app/                          # Next.js App Router
│   ├── (auth)/                   # Authentication routes
│   │   └── login/                # Login page
│   ├── (dashboard)/              # Dashboard routes
│   │   └── dashboard/            # User dashboard
│   ├── api/                      # API routes
│   │   ├── generate-audio/       # TTS narration
│   │   ├── generate-image/       # Image generation
│   │   ├── generate-story/       # Story generation (interleaved)
│   │   ├── live-quiz/            # Interactive quiz
│   │   ├── save-quiz-score/      # Quiz results
│   │   └── save-story/           # Story persistence
│   ├── create/                   # Story creation page
│   ├── story/[id]/               # Story viewer
│   ├── layout.tsx                # Root layout
│   ├── page.tsx                  # Home page
│   └── globals.css               # Global styles
├── components/                   # React components
│   ├── auth/                     # Authentication components
│   ├── providers/                # Context providers
│   ├── story/                    # Story-related components
│   │   ├── LiveQuizModal.tsx     # Interactive quiz
│   │   ├── MicButton.tsx         # Voice input button
│   │   ├── PageTurn.tsx          # Page turning animation
│   │   ├── StoryBook.tsx         # Main story viewer
│   │   ├── StoryCard.tsx         # Story library card
│   │   └── StoryPage.tsx         # Individual story page
│   └── ui/                       # UI components
│       ├── AudioPlayer.tsx       # Audio playback
│       ├── LoadingSparkles.tsx   # Loading animation
│       └── Toast.tsx             # Notification system
├── lib/                          # Utility libraries
│   ├── firebase/                 # Firebase configuration
│   │   ├── admin.ts              # Server-side Firebase
│   │   └── client.ts             # Client-side Firebase
│   ├── gcp/                      # Google Cloud services
│   │   ├── gemini.ts             # Gemini model configuration
│   │   ├── imagen.ts             # Image generation
│   │   ├── storage.ts            # Cloud Storage operations
│   │   └── tts.ts                # Text-to-speech synthesis
│   ├── hooks/                    # Custom React hooks
│   │   ├── useAuth.ts            # Authentication hook
│   │   ├── useStoryAssembler.ts  # Story assembly coordination
│   │   ├── useStoryGenerator.ts  # Story generation state
│   │   ├── useUserStories.ts     # Story library management
│   │   └── useVoiceInput.ts      # Speech recognition
│   └── utils/                    # Utility functions
│       ├── imageCompressor.ts    # Image compression
│       └── utils.ts              # General utilities
├── public/                       # Static assets
│   ├── audio/                    # Audio samples
│   ├── sounds/                   # Sound effects
│   └── animation/                # Lottie animations
├── types/                        # TypeScript definitions
│   ├── index.ts                  # Main type definitions
│   └── speech.d.ts               # Speech recognition types
├── document/                     # Documentation
├── middleware.ts                 # Next.js middleware
├── next.config.mjs               # Next.js configuration
├── package.json                  # Dependencies
├── Dockerfile                    # Container configuration
└── cloudbuild.yaml              # CI/CD configuration

Documentation

For judges and reviewers, we provide comprehensive documentation covering every aspect of the project:

Document | Description
How This App Works | Agentic workflow, model orchestration, and step-by-step explanation of story creation and quiz flows
System Architecture & Diagrams | Complete architecture diagram, sequence diagrams, state machines, and data flow visualizations
Database Schema | Firestore data models, collection structure, and Cloud Storage layout
Hackathon Requirements | How the project meets every Gemini Live Agent Challenge requirement with technical evidence
Deployment Guide | Step-by-step Google Cloud Run deployment instructions, IAM setup, and troubleshooting

Usage Guide

Creating a Story

  1. Sign in with Google account
  2. Click "New Story" from dashboard
  3. Choose input method: Voice or text
  4. Speak or type your story idea
  5. Optional: Upload character photos and name them
  6. Select narrator voice (Luna, Stella, or Kiko)
  7. Choose page count (4, 5, or 6 pages)
  8. Click "Create My Story"
  9. Watch the magic happen as AI generates your story
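
The choices in steps 4, 6, and 7 map naturally onto a small validation step before generation starts. The following is a sketch under assumptions: `StoryRequest` and `validateStoryRequest` are hypothetical names, and the real app may validate differently or in a different place.

```typescript
// Narrator names and page counts come from the steps above.
const NARRATORS = ["Luna", "Stella", "Kiko"] as const;
type Narrator = (typeof NARRATORS)[number];
type PageCount = 4 | 5 | 6;

interface StoryRequest {
  idea: string;
  narrator: Narrator;
  pageCount: PageCount;
}

type ValidationResult =
  | { ok: true; value: StoryRequest }
  | { ok: false; errors: string[] };

function isNarrator(v: unknown): v is Narrator {
  return NARRATORS.some((n) => n === v);
}

function isPageCount(v: unknown): v is PageCount {
  return v === 4 || v === 5 || v === 6;
}

// Returns a typed request, or every problem found so the UI can list them all.
function validateStoryRequest(input: {
  idea?: string;
  narrator?: string;
  pageCount?: number;
}): ValidationResult {
  const errors: string[] = [];
  const idea = (input.idea || "").trim();
  const narrator = input.narrator;
  const pageCount = input.pageCount;

  if (idea.length === 0) errors.push("Story idea is required.");
  if (!isNarrator(narrator)) errors.push("Narrator must be Luna, Stella, or Kiko.");
  if (!isPageCount(pageCount)) errors.push("Page count must be 4, 5, or 6.");

  if (errors.length > 0 || !isNarrator(narrator) || !isPageCount(pageCount)) {
    return { ok: false, errors };
  }
  return { ok: true, value: { idea, narrator, pageCount } };
}
```

Running the same check on the client and in the API route would keep malformed requests from reaching the model.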

Reading a Story

  1. Open story from your library
  2. Use arrow keys or click arrows to turn pages
  3. Listen to narration with audio player
  4. Complete the quiz after finishing the story
  5. Save your score and share your achievement

Quiz Features

  • Voice input for answering questions
  • AI-generated, text-only questions with TTS narration
  • Sound effects based on AI suggestions
  • Score tracking and saving
  • Celebratory animations for correct answers
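
Spoken answers rarely match the expected answer character for character, so some normalization is needed before scoring. The sketch below shows one possible approach and is not taken from the app: `normalizeAnswer` and `isCorrectAnswer` are hypothetical helpers, and the matching rule (whole-word containment) is an assumption.

```typescript
// Lowercase, strip punctuation, and collapse whitespace so that
// "The Dragon!" and "the dragon" compare equal.
// Limitation: only ASCII letters and digits are kept.
function normalizeAnswer(s: string): string {
  return s
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// A transcript counts as correct when it contains the expected answer as
// whole words, so filler such as "um, it was" is tolerated while
// "category" does not match "cat".
function isCorrectAnswer(transcript: string, expected: string): boolean {
  const heard = " " + normalizeAnswer(transcript) + " ";
  const want = normalizeAnswer(expected);
  if (want.length === 0) return false;
  return heard.indexOf(" " + want + " ") !== -1;
}
```

A stricter or fuzzier rule (for example edit distance) may suit young speakers better; this version only illustrates the normalization step.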

Technical Implementation Details

Interleaved Generation

The app uses Gemini's native interleaved output capability for story generation via Vertex AI:

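typescript
// Story generation with interleaved output via Vertex AI.
// `ai` is a GoogleGenAI client from @google/genai created with vertexai: true;
// `parts` holds the story prompt plus any character reference images.
const result = await ai.models.generateContentStream({
  model: "gemini-2.5-flash-image",
  contents: [{ role: "user", parts }],
  config: {
    responseModalities: ["TEXT", "IMAGE"], // Single call, mixed output
  },
});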
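
On the receiving side, the mixed text and image parts have to be grouped into pages. The following is a rough sketch of that grouping, assuming the model emits each page's text before that page's image; the app's real parsing (which streams story JSON) may differ. `StreamPart` is a simplified stand-in for the SDK's part type, and `foldParts` and `DraftPage` are hypothetical names.

```typescript
// Simplified view of a response part: each carries either text or inline
// image data. The real SDK type has more fields.
interface StreamPart {
  text?: string;
  inlineData?: { mimeType: string; data: string }; // data is base64
}

interface DraftPage {
  text: string;
  image: { mimeType: string; data: string } | null;
}

// Assumption: text for a page arrives before the image that illustrates it.
function foldParts(parts: StreamPart[]): DraftPage[] {
  const pages: DraftPage[] = [];
  let pendingText = "";

  for (const part of parts) {
    if (part.text) {
      pendingText += part.text; // text may arrive split across several chunks
    } else if (part.inlineData) {
      // An image closes the current page.
      pages.push({ text: pendingText.trim(), image: part.inlineData });
      pendingText = "";
    }
  }

  // Trailing text with no image still becomes a page.
  if (pendingText.trim().length > 0) {
    pages.push({ text: pendingText.trim(), image: null });
  }
  return pages;
}
```

In a streaming setting the same logic would run incrementally, emitting each page to the UI as soon as its image part arrives.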

Character Consistency

When users upload character photos, the app:

  1. Compresses the images and encodes them as base64
  2. Sends them as reference images to Gemini
  3. Instructs the model to keep each character's appearance consistent across all illustrations
  4. Uses character names in image prompts for accurate representation
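
The steps above amount to building a request `parts` array that mixes labeled reference photos with the text prompt. A minimal sketch follows; the prompt wording, `CharacterRef`, and `buildStoryParts` are illustrative assumptions rather than the app's actual code.

```typescript
interface CharacterRef {
  name: string;
  mimeType: "image/jpeg" | "image/png";
  base64: string; // already compressed and base64-encoded
}

// Simplified request part: either text or inline image data.
type RequestPart =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

function buildStoryParts(idea: string, characters: CharacterRef[]): RequestPart[] {
  const parts: RequestPart[] = [];

  for (const c of characters) {
    // Label each photo so the prompt can refer to the character by name.
    parts.push({ text: "Reference photo for the character named " + c.name + ":" });
    parts.push({ inlineData: { mimeType: c.mimeType, data: c.base64 } });
  }

  parts.push({ text: "Write an illustrated children's story about: " + idea });

  if (characters.length > 0) {
    const names = characters.map((c) => c.name).join(", ");
    // Hypothetical wording; the model is asked, not guaranteed, to comply.
    parts.push({
      text: "Keep " + names + " looking the same as in the reference photos on every page.",
    });
  }
  return parts;
}
```

The result can be passed as `parts` in a `contents: [{ role: "user", parts }]` request like the one shown earlier in this section.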

Real-time Progress

  • Streaming updates during story generation
  • Parallel processing of images and audio
  • Visual progress indicators for each page
  • Error handling with user-friendly messages
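
Per-page progress like this is often modeled as a small pure state update that the UI renders from. The sketch below is one way to do it, not the app's implementation; `PageProgress`, `StepEvent`, `applyStep`, and `percentComplete` are hypothetical names.

```typescript
type StepStatus = "pending" | "done" | "error";

interface PageProgress {
  text: StepStatus;
  image: StepStatus;
  audio: StepStatus;
}

interface StepEvent {
  page: number; // zero-based page index
  step: keyof PageProgress;
  status: StepStatus;
}

function initProgress(pageCount: number): PageProgress[] {
  const pages: PageProgress[] = [];
  for (let i = 0; i < pageCount; i++) {
    pages.push({ text: "pending", image: "pending", audio: "pending" });
  }
  return pages;
}

// Pure update: returns a new array so React state changes are detected.
function applyStep(state: PageProgress[], event: StepEvent): PageProgress[] {
  return state.map((page, i) => {
    if (i !== event.page) return page;
    const next: PageProgress = { ...page };
    next[event.step] = event.status;
    return next;
  });
}

// Counts settled steps (done or error) so the bar can finish even if a step fails.
function percentComplete(state: PageProgress[]): number {
  const total = state.length * 3;
  if (total === 0) return 0;
  let settled = 0;
  for (const page of state) {
    for (const status of [page.text, page.image, page.audio]) {
      if (status !== "pending") settled++;
    }
  }
  return Math.round((settled / total) * 100);
}
```

A hook such as useStoryAssembler could hold this array in state and call `applyStep` whenever a stream chunk or an audio request settles.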

Performance Optimizations

  • Image compression before upload
  • Parallel API calls for faster generation
  • Cached signed URLs for media files
  • Lazy loading of story components
  • Optimized animations with Framer Motion
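
Caching signed URLs can be as simple as an in-memory map with a time-to-live. The sketch below is an assumption about how such a cache might look, not the app's code: `createSignedUrlCache` is a hypothetical helper, and the signing function is injected so no storage SDK is needed here.

```typescript
// The signer is injected (assumption); in practice it would wrap the
// Cloud Storage client's signed-URL call.
type Signer = (objectPath: string) => Promise<string>;

interface CacheEntry {
  url: Promise<string>;
  expiresAt: number; // epoch milliseconds
}

// ttlMs should be shorter than the signed URL's own lifetime so that a
// cached URL never outlives its signature.
function createSignedUrlCache(
  sign: Signer,
  ttlMs: number,
  now: () => number = Date.now,
): (objectPath: string) => Promise<string> {
  const entries = new Map<string, CacheEntry>();

  return function getSignedUrl(objectPath: string): Promise<string> {
    const hit = entries.get(objectPath);
    if (hit && hit.expiresAt > now()) return hit.url;

    // Cache the promise itself so concurrent callers share one signing request.
    const url = sign(objectPath);
    entries.set(objectPath, { url, expiresAt: now() + ttlMs });

    // Do not keep failures around; the next call should retry.
    url.catch(() => {
      const current = entries.get(objectPath);
      if (current && current.url === url) entries.delete(objectPath);
    });
    return url;
  };
}
```

An in-memory cache like this is per server instance, so on Cloud Run each container would hold its own copy.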

Environment Variables Reference

Variable | Description | Required
NEXT_PUBLIC_FIREBASE_API_KEY | Firebase web API key | Yes
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN | Firebase auth domain | Yes
NEXT_PUBLIC_FIREBASE_PROJECT_ID | Firebase project ID | Yes
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET | Firebase storage bucket | Yes
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID | Firebase messaging sender ID | Yes
NEXT_PUBLIC_FIREBASE_APP_ID | Firebase app ID | Yes
FIREBASE_CLIENT_EMAIL | Service account email | Yes
GOOGLE_CLOUD_PROJECT | GCP project ID | Yes
GCS_BUCKET_NAME | Cloud Storage bucket name | No (default: storybook-for-kids-media)
GEMINI_IMAGE_MODEL | Interleaved story model | No (default: gemini-2.5-flash-image)
GEMINI_TTS_MODEL | Text-to-speech model | No (default: gemini-2.5-flash-preview-tts)
GEMINI_QUIZ_MODEL | Quiz generation model | No (default: gemini-2.5-flash)
NEXTAUTH_URL | NextAuth URL | Yes
NEXTAUTH_SECRET | NextAuth secret | Yes
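
The required and default values in this table could be enforced at startup with a small loader. The sketch below covers only the server-side variables and is an assumption about structure, not the app's code: `ServerConfig` and `loadServerConfig` are hypothetical names.

```typescript
type Env = Record<string, string | undefined>;

interface ServerConfig {
  project: string;
  clientEmail: string;
  bucket: string;
  imageModel: string;
  ttsModel: string;
  quizModel: string;
}

// Server-side variables marked "Yes" in the table above.
const REQUIRED_SERVER_VARS = ["GOOGLE_CLOUD_PROJECT", "FIREBASE_CLIENT_EMAIL"];

function loadServerConfig(env: Env): ServerConfig {
  const missing = REQUIRED_SERVER_VARS.filter((key) => !env[key]);
  if (missing.length > 0) {
    // Fail fast with every missing name, not just the first.
    throw new Error("Missing required environment variables: " + missing.join(", "));
  }
  return {
    project: env.GOOGLE_CLOUD_PROJECT as string,
    clientEmail: env.FIREBASE_CLIENT_EMAIL as string,
    // Defaults mirror the "No (default: ...)" rows of the table.
    bucket: env.GCS_BUCKET_NAME || "storybook-for-kids-media",
    imageModel: env.GEMINI_IMAGE_MODEL || "gemini-2.5-flash-image",
    ttsModel: env.GEMINI_TTS_MODEL || "gemini-2.5-flash-preview-tts",
    quizModel: env.GEMINI_QUIZ_MODEL || "gemini-2.5-flash",
  };
}
```

In a Node.js process this would typically be called once as `loadServerConfig(process.env)` so that misconfiguration surfaces at boot rather than on the first request.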

Troubleshooting

Common Issues

  1. Firebase Authentication Fails

    • Check Firebase project configuration
    • Verify Google Sign-In is enabled
    • Ensure correct API keys in environment variables
  2. Vertex AI API Errors

    • Verify Vertex AI API is enabled in Google Cloud
    • Check service account permissions
    • Ensure billing is enabled for the project
    • Verify Application Default Credentials are set up
  3. Image Generation Failures

    • Check character photo format (JPEG/PNG)
    • Verify the image size after compression and base64 encoding
    • Ensure the service account has permission to call the Gemini image model on Vertex AI
  4. Audio Playback Issues

    • Check browser audio permissions
    • Verify the Gemini TTS model is available to your project in Vertex AI
    • Test with different narrator voices

Development Tips

  • Use Chrome for best voice recognition support
  • Monitor browser console for API errors
  • Check Firebase console for authentication logs
  • Use .env.example as template for environment variables

Made with ❤️ for children everywhere

Released under the MIT License.