KidStory: Storybook for Kids

AI-Powered Interactive Storybooks for Children
Speak a story. Watch it come to life.
Overview
KidStory is an AI-powered platform that transforms a child's spoken or typed story idea into a fully illustrated, narrated, and interactive storybook. The app orchestrates multiple AI models to create a cohesive multimedia experience — generating story text, watercolor illustrations, expressive narration, and interactive quizzes.
The application features interleaved AI generation where story content and illustrations are created together in a single stream, providing a magical experience where visuals appear as the story is written.
For Parents
Register an account, then create stories for your children to read and enjoy together.
For Kids
Let your child speak their own story idea and watch the AI bring it to life with pictures and narration.
Key Features
- Voice & Text Input: Children can speak or type their story ideas
- AI Story Generation: Creates complete stories of 4-6 pages based on the child's input
- Interleaved Illustration Generation: Story text and page illustrations generated together in one AI stream
- Character Consistency: Upload character photos to maintain appearance across all illustrations
- Multiple Narrator Voices: Choose from 3 AI narrator personalities with different voices
- Interactive Storybooks: Page-turning interface with audio narration for each page
- Magic Quiz: Post-story interactive quiz with voice input and AI-generated questions
- Progress Tracking: Real-time visual feedback during story generation
- Child-Safe Content: Built-in guardrails keep content age-appropriate for children aged 3-10
- User Library: Save and revisit created stories
Technology Stack
Frontend
- Framework: Next.js 16.1.6 (App Router)
- UI Library: React 19.2.3
- Language: TypeScript 5
- Styling: Tailwind CSS 4
- Animations: Framer Motion 12.35.0
- State Management: @tanstack/react-query 5.90.21
Backend & AI
- AI Models: Google Gemini via Vertex AI
  - gemini-2.5-flash-image (interleaved story + image generation)
  - gemini-2.5-flash-preview-tts (text-to-speech narration)
  - gemini-2.5-flash (text-only quiz generation)
- AI SDKs:
  - @google/genai 1.44.0+ (Vertex AI integration with vertexai: true)
- Single SDK Architecture: Uses only @google/genai
- Database: Firebase Firestore
- Authentication: Firebase Auth (Google OAuth)
- Storage: Google Cloud Storage (images & audio files)
- Cloud Services: Google Cloud Platform
Development Tools
- Package Manager: npm
- Linting: ESLint 9
- Build Tool: Next.js
- Containerization: Docker
System Architecture
AI Generation Pipeline
1. User Input → Voice/Text Story Idea
2. Story Generation → Gemini 2.5 Flash Image (interleaved text + images)
3. Parallel Processing →
├─ Images → Generated via interleaved output
└─ Audio Narration → Gemini 2.5 Flash TTS
4. Assembly → Combine images + audio + text
5. Storage → Save to Firestore + GCS
6. Delivery → Interactive storybook with quizzes
Key Technical Components
Frontend Hooks:
- useStoryGenerator: Manages streaming story generation with real-time updates
- useStoryAssembler: Coordinates parallel image/audio generation and assembly
- useVoiceInput: Handles speech recognition for story input and quiz answers
- useUserStories: Manages Firestore story library operations
- useAuth: Firebase authentication wrapper
API Routes:
- /api/generate-story: Streams interleaved story JSON + images from Gemini
- /api/generate-image: Creates individual page illustrations with character reference support
- /api/generate-audio: Generates speech using Gemini TTS with WAV output
- /api/live-quiz: Creates interactive quizzes with text-only questions and TTS narration
- /api/save-story: Persists completed stories to Firestore
- /api/save-quiz-score: Records quiz results
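The /api/generate-audio route is described as producing WAV output. If the TTS model returns raw PCM samples, something has to prepend a RIFF/WAV header before a browser can play the audio. The following is a minimal sketch of that step, not the project's actual code: `pcmToWav` is a hypothetical helper name, and the defaults (24 kHz, 16-bit, mono) are an assumption that should be checked against the audio format the model really returns.

```typescript
// Hypothetical helper: wraps raw little-endian PCM samples in a 44-byte RIFF/WAV header.
// Defaults assume 24 kHz, 16-bit, mono audio; adjust them to match the real model output.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate: number = 24000,
  channels: number = 1,
  bitsPerSample: number = 16
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign; // bytes per second of audio
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data
  out.set(pcm, 44); // samples follow the header unchanged
  return out;
}
```

The returned bytes could then be uploaded to Cloud Storage or sent back with an audio/wav content type.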
Core Components:
- StoryBook: Main story reader with page-turning animations
- LiveQuizModal: Interactive quiz with voice input and AI feedback
- CreateStoryPage: Story creation wizard with voice/text input
- Dashboard: User story library and statistics
- StoryPage: Individual story page with image and text
Getting Started
Prerequisites
- Node.js 20+
- npm or yarn
- Google Cloud project with the Vertex AI API enabled (used for Gemini access)
- Firebase Project with Firestore and Authentication
- Google Cloud Storage bucket
Installation
Clone and navigate to project
cd storybook-for-kids-app
Install dependencies
npm install
Configure Environment Variables
Create .env.local in the project root:
# Firebase Client Configuration (from Firebase Console)
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id
NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID=your_measurement_id
# Firebase Admin (Service Account)
FIREBASE_CLIENT_EMAIL=service-account@project.iam.gserviceaccount.com
GOOGLE_CLOUD_PROJECT=your_project_id
# Google Cloud Storage
GCS_BUCKET_NAME=storybook-for-kids-media
# Gemini Models (via Vertex AI)
GEMINI_IMAGE_MODEL=gemini-2.5-flash-image
GEMINI_TTS_MODEL=gemini-2.5-flash-preview-tts
GEMINI_QUIZ_MODEL=gemini-2.5-flash
# App Configuration
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=your_secret_key_here
Set up Firebase
- Enable Google Sign-In in Firebase Authentication
- Create Firestore database in native mode
- Set up storage rules for your GCS bucket
Set up Google Cloud
- Enable Vertex AI API in Google Cloud Console
- Create service account with appropriate permissions
- Set up Application Default Credentials for Vertex AI access
Run the development server
npm run dev
Open http://localhost:3000 in your browser
Development
# Development server
npm run dev
# Build for production
npm run build
# Start production server
npm start
# Lint code
npm run lint
Deployment (no local Docker needed)
The app deploys through Google Cloud Build, which builds and ships it remotely, so Docker does not need to be installed on your machine:
# Deploy with one command (no local Docker needed)
gcloud builds submit --config=cloudbuild.yaml
What happens:
- ✅ Your code uploads to Google Cloud
- ✅ Dependencies installed with npm ci
- ✅ Next.js app built with npm run build
- ✅ Deployed directly to Cloud Run
- ✅ Environment variables set automatically
- ✅ HTTPS setup automatically
After deployment:
# Get your app URL
gcloud run services describe storybook-for-kids --format='value(status.url)'
# Check logs
gcloud run services logs read storybook-for-kids
Project Structure
storybook-for-kids-app/
├── app/  # Next.js App Router
│   ├── (auth)/  # Authentication routes
│   │   └── login/  # Login page
│   ├── (dashboard)/  # Dashboard routes
│   │   └── dashboard/  # User dashboard
│   ├── api/  # API routes
│   │   ├── generate-audio/  # TTS narration
│   │   ├── generate-image/  # Image generation
│   │   ├── generate-story/  # Story generation (interleaved)
│   │   ├── live-quiz/  # Interactive quiz
│   │   ├── save-quiz-score/  # Quiz results
│   │   └── save-story/  # Story persistence
│   ├── create/  # Story creation page
│   ├── story/[id]/  # Story viewer
│   ├── layout.tsx  # Root layout
│   ├── page.tsx  # Home page
│   └── globals.css  # Global styles
├── components/  # React components
│   ├── auth/  # Authentication components
│   ├── providers/  # Context providers
│   ├── story/  # Story-related components
│   │   ├── LiveQuizModal.tsx  # Interactive quiz
│   │   ├── MicButton.tsx  # Voice input button
│   │   ├── PageTurn.tsx  # Page turning animation
│   │   ├── StoryBook.tsx  # Main story viewer
│   │   ├── StoryCard.tsx  # Story library card
│   │   └── StoryPage.tsx  # Individual story page
│   └── ui/  # UI components
│       ├── AudioPlayer.tsx  # Audio playback
│       ├── LoadingSparkles.tsx  # Loading animation
│       └── Toast.tsx  # Notification system
├── lib/  # Utility libraries
│   ├── firebase/  # Firebase configuration
│   │   ├── admin.ts  # Server-side Firebase
│   │   └── client.ts  # Client-side Firebase
│   ├── gcp/  # Google Cloud services
│   │   ├── gemini.ts  # Gemini model configuration
│   │   ├── imagen.ts  # Image generation
│   │   ├── storage.ts  # Cloud Storage operations
│   │   └── tts.ts  # Text-to-speech synthesis
│   ├── hooks/  # Custom React hooks
│   │   ├── useAuth.ts  # Authentication hook
│   │   ├── useStoryAssembler.ts  # Story assembly coordination
│   │   ├── useStoryGenerator.ts  # Story generation state
│   │   ├── useUserStories.ts  # Story library management
│   │   └── useVoiceInput.ts  # Speech recognition
│   └── utils/  # Utility functions
│       ├── imageCompressor.ts  # Image compression
│       └── utils.ts  # General utilities
├── public/  # Static assets
│   ├── audio/  # Audio samples
│   ├── sounds/  # Sound effects
│   └── animation/  # Lottie animations
├── types/  # TypeScript definitions
│   ├── index.ts  # Main type definitions
│   └── speech.d.ts  # Speech recognition types
├── document/  # Documentation
├── middleware.ts  # Next.js middleware
├── next.config.mjs  # Next.js configuration
├── package.json  # Dependencies
├── Dockerfile  # Container configuration
└── cloudbuild.yaml  # CI/CD configuration
Documentation
For judges and reviewers, we provide comprehensive documentation covering every aspect of the project:
| Document | Description |
|---|---|
| How This App Works | Agentic workflow, model orchestration, and step-by-step explanation of story creation and quiz flows |
| System Architecture & Diagrams | Complete architecture diagram, sequence diagrams, state machines, and data flow visualizations |
| Database Schema | Firestore data models, collection structure, and Cloud Storage layout |
| Hackathon Requirements | How the project meets every Gemini Live Agent Challenge requirement with technical evidence |
| Deployment Guide | Step-by-step Google Cloud Run deployment instructions, IAM setup, and troubleshooting |
Usage Guide
Creating a Story
- Sign in with Google account
- Click "New Story" from dashboard
- Choose input method: Voice or text
- Speak or type your story idea
- Optional: Upload character photos and name them
- Select narrator voice (Luna, Stella, or Kiko)
- Choose page count (4, 5, or 6 pages)
- Click "Create My Story"
- Watch the magic happen as AI generates your story
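The options in these steps (a non-empty idea, one of three narrators, 4 to 6 pages) suggest a small validation step before any generation request is sent. A minimal sketch of what that check might look like follows; the narrator names and page counts come from the steps above, while `validateStoryInput` and its error messages are hypothetical and not taken from the project.

```typescript
// Narrator names and page counts come from the steps above; the function name
// and the error messages are hypothetical.
const NARRATOR_NAMES: string[] = ["Luna", "Stella", "Kiko"];
const PAGE_COUNTS: number[] = [4, 5, 6];

// Returns a list of human-readable problems; an empty list means the input is valid.
function validateStoryInput(idea: string, narrator: string, pageCount: number): string[] {
  const problems: string[] = [];
  if (idea.trim().length === 0) {
    problems.push("Story idea is empty.");
  }
  if (NARRATOR_NAMES.indexOf(narrator) === -1) {
    problems.push(`Unknown narrator: ${narrator}`);
  }
  if (PAGE_COUNTS.indexOf(pageCount) === -1) {
    problems.push("Page count must be 4, 5, or 6.");
  }
  return problems;
}
```

Returning every problem at once, rather than throwing on the first, lets a form show all issues to the user in a single pass.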
Reading a Story
- Open story from your library
- Use arrow keys or click arrows to turn pages
- Listen to narration with audio player
- Complete the quiz after finishing the story
- Save your score and share your achievement
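Arrow-key page turning can be reduced to a small pure function that clamps the page index to the book. This is only an illustrative sketch: `turnPage` is a hypothetical name, and the real StoryBook component may handle navigation differently.

```typescript
// Hypothetical helper: maps a key press to the next page index, clamped to the book.
// "ArrowLeft" and "ArrowRight" are the standard KeyboardEvent.key values.
function turnPage(current: number, key: string, pageCount: number): number {
  if (key === "ArrowRight") return Math.min(current + 1, pageCount - 1);
  if (key === "ArrowLeft") return Math.max(current - 1, 0);
  return current; // any other key leaves the page unchanged
}
```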
Quiz Features
- Voice input for answering questions
- AI-generated illustrations for each question
- Sound effects based on AI suggestions
- Score tracking and saving
- Celebratory animations for correct answers
Technical Implementation Details
Interleaved Generation
The app uses Gemini's native interleaved output capability for story generation via Vertex AI:
// Story generation with interleaved output via Vertex AI
const result = await ai.models.generateContentStream({
  model: "gemini-2.5-flash-image",
  contents: [{ role: "user", parts }],
  config: {
    responseModalities: ["TEXT", "IMAGE"], // Single call, mixed output
  },
});
Character Consistency
When users upload character photos, the app:
- Compresses images to base64
- Sends them as reference images to Gemini
- Ensures consistent character appearance across all illustrations
- Uses character names in image prompts for accurate representation
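The steps above can be sketched as a function that builds the `parts` array for the generation request. This is a hedged illustration rather than the project's code: the local types mirror the text and inline-data part shapes used in Gemini requests, while `CharacterPhoto`, `buildStoryParts`, and the prompt wording are assumptions.

```typescript
// Local types mirroring the text / inlineData part shapes used in Gemini requests.
interface TextPart { text: string }
interface InlineDataPart { inlineData: { mimeType: string; data: string } }
type RequestPart = TextPart | InlineDataPart;

// Hypothetical shape for an uploaded character photo, already compressed to base64.
interface CharacterPhoto {
  name: string;
  mimeType: "image/jpeg" | "image/png";
  base64: string;
}

// Builds the parts array: one labelled reference image per character, then the prompt.
function buildStoryParts(idea: string, characters: CharacterPhoto[]): RequestPart[] {
  const parts: RequestPart[] = [];
  for (const character of characters) {
    parts.push({ text: `Reference photo for the character named ${character.name}:` });
    parts.push({ inlineData: { mimeType: character.mimeType, data: character.base64 } });
  }
  const names = characters.map((character) => character.name).join(", ");
  const cast = names
    ? ` Keep these characters visually consistent on every page: ${names}.`
    : "";
  parts.push({ text: `Write an illustrated children's story about: ${idea}.${cast}` });
  return parts;
}
```

Placing each name immediately before its image is one way to tie a name to a face; whether the model honors that pairing reliably would need to be verified in practice.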
Real-time Progress
- Streaming updates during story generation
- Parallel processing of images and audio
- Visual progress indicators for each page
- Error handling with user-friendly messages
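Streaming updates of this kind usually come down to folding each streamed chunk into an accumulated state that the UI renders. Here is a minimal sketch, assuming chunks shaped like the SDK's streamed responses (candidates, then content, then parts); `StoryProgress` and `applyChunk` are hypothetical names.

```typescript
// Local types mirroring the shape of streamed chunks (candidates -> content -> parts).
interface StreamPart {
  text?: string;
  inlineData?: { mimeType?: string; data?: string };
}
interface StreamChunk {
  candidates?: { content?: { parts?: StreamPart[] } }[];
}

// Hypothetical accumulator: story text so far plus base64 images in arrival order.
interface StoryProgress {
  text: string;
  images: string[];
}

// Folds one streamed chunk into the progress state without mutating the previous state.
function applyChunk(progress: StoryProgress, chunk: StreamChunk): StoryProgress {
  const parts = chunk.candidates?.[0]?.content?.parts ?? [];
  let text = progress.text;
  const images = progress.images.slice();
  for (const part of parts) {
    if (part.text) text += part.text;
    if (part.inlineData?.data) images.push(part.inlineData.data);
  }
  return { text, images };
}
```

In a route or hook this could be called once per chunk inside a `for await` loop over the stream returned by `generateContentStream`, updating the progress indicator after each call.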
Performance Optimizations
- Image compression before upload
- Parallel API calls for faster generation
- Cached signed URLs for media files
- Lazy loading of story components
- Optimized animations with Framer Motion
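Caching signed URLs can be illustrated with a small in-memory cache that drops entries once they expire. This is a sketch under stated assumptions, not the project's implementation: `SignedUrlCache` is a hypothetical class, and the real app may cache elsewhere (for example in Firestore or on the client).

```typescript
// Hypothetical in-memory cache for signed media URLs, keyed by storage object path.
// The clock is injectable so expiry can be tested without waiting.
class SignedUrlCache {
  private entries = new Map<string, { url: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Stores a URL together with its lifetime in milliseconds.
  set(path: string, url: string, ttlMs: number): void {
    this.entries.set(path, { url, expiresAt: this.now() + ttlMs });
  }

  // Returns the cached URL, or undefined if it is missing or already expired.
  get(path: string): string | undefined {
    const entry = this.entries.get(path);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(path);
      return undefined;
    }
    return entry.url;
  }
}
```

Setting the cache lifetime somewhat shorter than the signed URL's own validity avoids handing out a link that expires moments after it is returned.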
Environment Variables Reference
| Variable | Description | Required |
|---|---|---|
| NEXT_PUBLIC_FIREBASE_API_KEY | Firebase web API key | Yes |
| NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN | Firebase auth domain | Yes |
| NEXT_PUBLIC_FIREBASE_PROJECT_ID | Firebase project ID | Yes |
| NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET | Firebase storage bucket | Yes |
| NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID | Firebase messaging sender ID | Yes |
| NEXT_PUBLIC_FIREBASE_APP_ID | Firebase app ID | Yes |
| FIREBASE_CLIENT_EMAIL | Service account email | Yes |
| GOOGLE_CLOUD_PROJECT | GCP project ID | Yes |
| GCS_BUCKET_NAME | Cloud Storage bucket name | No (default: storybook-for-kids-media) |
| GEMINI_IMAGE_MODEL | Interleaved story model | No (default: gemini-2.5-flash-image) |
| GEMINI_TTS_MODEL | Text-to-speech model | No (default: gemini-2.5-flash-preview-tts) |
| GEMINI_QUIZ_MODEL | Quiz generation model | No (default: gemini-2.5-flash) |
| NEXTAUTH_URL | NextAuth URL | Yes |
| NEXTAUTH_SECRET | NextAuth secret | Yes |
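The table can be turned into a startup check that fails fast on missing required variables and fills in the documented defaults. The sketch below takes its variable names and default values from the table; `loadServerConfig` and `ServerConfig` are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Variable names and defaults come from the table above; the function and
// type names are hypothetical.
const REQUIRED_VARS: string[] = [
  "NEXT_PUBLIC_FIREBASE_API_KEY",
  "NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN",
  "NEXT_PUBLIC_FIREBASE_PROJECT_ID",
  "NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET",
  "NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID",
  "NEXT_PUBLIC_FIREBASE_APP_ID",
  "FIREBASE_CLIENT_EMAIL",
  "GOOGLE_CLOUD_PROJECT",
  "NEXTAUTH_URL",
  "NEXTAUTH_SECRET",
];

interface ServerConfig {
  bucket: string;
  imageModel: string;
  ttsModel: string;
  quizModel: string;
}

// Throws if any required variable is unset or empty, then applies the optional defaults.
function loadServerConfig(env: Record<string, string | undefined>): ServerConfig {
  const missing = REQUIRED_VARS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    bucket: env["GCS_BUCKET_NAME"] || "storybook-for-kids-media",
    imageModel: env["GEMINI_IMAGE_MODEL"] || "gemini-2.5-flash-image",
    ttsModel: env["GEMINI_TTS_MODEL"] || "gemini-2.5-flash-preview-tts",
    quizModel: env["GEMINI_QUIZ_MODEL"] || "gemini-2.5-flash",
  };
}
```

On the server this would typically be called once with `process.env`, so a misconfigured deployment fails at startup rather than on the first story request.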
Troubleshooting
Common Issues
Firebase Authentication Fails
- Check Firebase project configuration
- Verify Google Sign-In is enabled
- Ensure correct API keys in environment variables
Vertex AI API Errors
- Verify Vertex AI API is enabled in Google Cloud
- Check service account permissions
- Ensure billing is enabled for the project
- Verify Application Default Credentials are set up
Image Generation Failures
- Check character photo format (JPEG/PNG)
- Verify image size (compressed to base64)
- Ensure Gemini has image generation permissions
Audio Playback Issues
- Check browser audio permissions
- Verify TTS API is enabled
- Test with different narrator voices
Development Tips
- Use Chrome for best voice recognition support
- Monitor browser console for API errors
- Check Firebase console for authentication logs
- Use .env.example as a template for environment variables

