Project Proposal: Auteur AI (Flow Edition)

1. Project Overview

This project aims to build Auteur AI, a production-grade, self-hosted asset management suite for Google Flow (Veo 3.1). The system will be a full-stack web application designed to run on a Linux cluster using Docker Compose. It features a React frontend, Python FastAPI backend, PostgreSQL database, MinIO object storage, and integration with an external OpenAI-compatible API for script analysis and JSON generation.

2. Requirements

Infrastructure & Architecture

Deployment: Docker Compose (Microservices-ready monolith).
Services:
- frontend: Nginx serving React code.
- backend: FastAPI (Port 8000).
- db: PostgreSQL 16 (Port 5432).
- minio: S3-compatible storage (Port 9000/9001).
- redis: Message broker (Port 6379).
- worker: Python background worker (Celery/ARQ).
AI Integration: External OpenAI-compatible API (e.g., OpenAI, Anthropic, OpenRouter, or self-hosted vLLM external to this stack).

Backend (Python/FastAPI)

Framework: FastAPI with Async SQLAlchemy.
Database: PostgreSQL with Alembic for migrations.
Storage: boto3 for MinIO interactions.
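AI: openai Python library, configured from OPENAI_API_BASE and OPENAI_API_KEY, to support any compatible provider. Version 1.x of the library reads OPENAI_BASE_URL rather than OPENAI_API_BASE by default, so the base URL should be passed to the client explicitly as base_url.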
Core Modules:
- Asset Library: Upload handling, storage in MinIO, thumbnail generation.
- Script Parser: Ingest text/fountain files, chunking, AI analysis (Scene/Shot identification).
- Flow Assembly: Generating Google Veo compliant JSON payloads via LLM.
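
As an illustration of the provider-agnostic AI configuration, here is a minimal sketch of how the backend might resolve its settings from the environment. The helper name openai_client_kwargs is hypothetical. The returned dict is intended to be passed to openai.AsyncOpenAI(**kwargs), whose constructor accepts api_key and base_url.

```python
import os
from typing import Dict, Mapping, Optional


def openai_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for an OpenAI-compatible client.

    Hypothetical helper: reads the variable names used in this proposal and
    returns a dict suitable for openai.AsyncOpenAI(**kwargs).
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")

    kwargs = {"api_key": api_key}

    # Only override the base URL when one is configured; otherwise the
    # client library falls back to its own default endpoint.
    base_url = env.get("OPENAI_API_BASE", "").strip()
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs
```

Accepting the environment as a parameter keeps the helper easy to test without touching the real process environment.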

Frontend (React/Vite)

Stack: React 18, TypeScript, TailwindCSS.
UI Library: Shadcn/UI, Lucide-react.
State Management: TanStack Query (Server state), Zustand (Client state).
Features:
- Asset Library with drag-and-drop uploads.
- Script ingestion and review interface.
- "IDE-like" JSON editor (Monaco Editor).
- Optimistic UI updates.

Data Models

Projects: Top-level container.
Ingredients: Assets (Characters, Locations, Objects) with S3 keys and metadata.
Scenes: Script sections.
Shots: Atomic units with descriptions, durations, and extensive metadata (Slot system, Veo JSON).
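
To make the hierarchy concrete, the four models could be sketched as plain dataclasses. This is only an outline under assumptions: the field names, the shape of the slot mapping, and the total_duration helper are hypothetical, and the real tables would be SQLAlchemy models with Pydantic schemas alongside.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Ingredient:
    name: str
    kind: str  # assumed values: "character", "location", "object"
    s3_key: str  # object key in MinIO
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Shot:
    description: str
    duration_seconds: float
    # Hypothetical slot system: slot name -> ingredient name.
    slots: Dict[str, str] = field(default_factory=dict)
    # Generated Veo payload, empty until Flow Assembly has run.
    veo_json: Optional[Dict[str, Any]] = None


@dataclass
class Scene:
    heading: str
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Project:
    name: str
    ingredients: List[Ingredient] = field(default_factory=list)
    scenes: List[Scene] = field(default_factory=list)

    def total_duration(self) -> float:
        """Sum the duration of every shot in every scene."""
        return sum(s.duration_seconds for sc in self.scenes for s in sc.shots)
```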

3. Implementation Plan

I propose the following phased approach to build the application:

Phase 1: Infrastructure & Initialization

Set up the project repository structure.
Create docker-compose.yml with all defined services (db, minio, redis, backend, worker, and a frontend stub).
Verify container orchestration and networking.
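
One way to verify networking after the stack starts is a small TCP reachability check. The sketch below is an assumption about how that check could look, not part of the stack itself: check_ports is a hypothetical helper, and the service names in DEFAULT_PORTS simply mirror the ports listed under Requirements (9001 is assumed to be the MinIO console).

```python
import socket
from typing import Dict

# Default ports from the Requirements section.
DEFAULT_PORTS = {
    "backend": 8000,
    "db": 5432,
    "minio": 9000,
    "minio-console": 9001,  # assumption: second MinIO port is the console
    "redis": 6379,
}


def check_ports(host: str, ports: Dict[str, int], timeout: float = 1.0) -> Dict[str, bool]:
    """Return, per service name, whether a TCP connection could be opened."""
    results = {}
    for name, port in ports.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[name] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[name] = False
    return results
```

Calling check_ports("localhost", DEFAULT_PORTS) on the host only confirms that the published ports accept connections; it says nothing about whether each service is healthy.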

Phase 2: Backend Core

Initialize the FastAPI project with Poetry or pip.
Configure the database connection (Async SQLAlchemy).
Define Pydantic models and SQLAlchemy tables based on the schema.
Set up Alembic and run initial migrations.
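
For the database connection, a minimal sketch of building the async connection URL from environment variables follows. It assumes the asyncpg driver and uses hypothetical variable names and defaults (POSTGRES_HOST, POSTGRES_PORT, and the "auteur" database name are not specified in this proposal). The resulting string would be passed to SQLAlchemy's create_async_engine.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a SQLAlchemy async URL, assuming the asyncpg driver."""
    env = os.environ if env is None else env

    # Percent-encode credentials so characters such as "@" or "/" in a
    # generated password cannot break the URL structure.
    user = quote(env.get("POSTGRES_USER", "postgres"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")

    host = env.get("POSTGRES_HOST", "db")  # assumed: the Compose service name
    port = env.get("POSTGRES_PORT", "5432")
    name = env.get("POSTGRES_DB", "auteur")  # hypothetical default
    return f"postgresql+asyncpg://{user}:{password}@{host}:{port}/{name}"
```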

Phase 3: Asset Management (MVP)

Implement MinIO client setup.
Build POST /api/assets/upload.
Implement background worker for thumbnail generation (Redis + Worker).
Create basic Frontend Asset Library view to test uploads and retrieval.
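
The upload endpoint and the thumbnail worker need to agree on where objects live in MinIO. The sketch below shows one possible key scheme, assuming content-addressed names; the layout (projects/<id>/assets/... and projects/<id>/thumbnails/...) and both function names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def asset_object_key(project_id: str, filename: str, data: bytes) -> str:
    """Derive a deterministic object key from the file contents.

    Hashing the bytes means re-uploading the same file maps to the same
    key, which makes uploads idempotent.
    """
    digest = hashlib.sha256(data).hexdigest()
    suffix = PurePosixPath(filename).suffix.lower()
    return f"projects/{project_id}/assets/{digest}{suffix}"


def thumbnail_object_key(asset_key: str) -> str:
    """Map an asset key to the key of its JPEG thumbnail."""
    path = PurePosixPath(asset_key)
    # projects/<id>/assets/<digest>.<ext> -> projects/<id>/thumbnails/<digest>.jpg
    return str(path.parent.parent / "thumbnails" / f"{path.stem}.jpg")
```

With this scheme the worker only needs the asset key from the queue message to know where to write its output.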

Phase 4: Script Ingestion & AI Integration

Implement POST /api/scripts/parse.
Integrate openai client.
Develop the prompt engineering logic for Script -> Visual Actions.
Build Frontend interface for Script upload and Shot review.
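
Before any AI analysis, the parser has to chunk the script. A rough sketch of scene-level chunking for Fountain-style text is shown below. It is a simplification under assumptions: it treats any line starting with INT, EXT, EST, INT./EXT, INT/EXT, or I/E (followed by a dot or a space) as a scene heading, and it ignores forced headings and the blank-line rules of the full Fountain syntax. The function name split_scenes is hypothetical.

```python
import re
from typing import List, Tuple

# Simplified scene-heading test; case-insensitive, as in Fountain.
SCENE_HEADING = re.compile(r"^(?:INT\./EXT|INT/EXT|INT|EXT|EST|I/E)[. ]", re.IGNORECASE)


def split_scenes(script: str) -> List[Tuple[str, str]]:
    """Return (heading, body) pairs; text before the first heading is dropped."""
    scenes: List[Tuple[str, str]] = []
    heading = None
    body: List[str] = []

    for line in script.splitlines():
        stripped = line.strip()
        if SCENE_HEADING.match(stripped):
            # A new heading closes the scene collected so far.
            if heading is not None:
                scenes.append((heading, "\n".join(body).strip()))
            heading, body = stripped, []
        elif heading is not None:
            body.append(line)

    if heading is not None:
        scenes.append((heading, "\n".join(body).strip()))
    return scenes
```

Each (heading, body) pair could then be sent to the LLM as one chunk for shot identification.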

Phase 5: Flow Generation & UI Polish

Implement POST /api/shots/:id/generate-flow.
Integrate Monaco Editor for JSON result viewing.
Finalize UI styling and responsiveness.
Write comprehensive documentation.
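
Because the Flow payload comes from an LLM, the generate-flow endpoint should validate it before storing or displaying it. The sketch below assumes the model may wrap its JSON in a Markdown code fence. The required keys are placeholders only, since the actual Veo payload schema is not defined in this proposal, and parse_flow_payload is a hypothetical name.

```python
import json
import re
from typing import Any, Dict

# Placeholder keys; the real Veo schema is not specified here.
REQUIRED_KEYS = {"prompt", "duration_seconds"}

# Matches an optional Markdown code fence around the JSON body.
_TICKS = "`" * 3
_FENCE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL)


def parse_flow_payload(llm_text: str) -> Dict[str, Any]:
    """Extract and validate a JSON object from raw LLM output.

    Raises ValueError (json.JSONDecodeError is a subclass) when the text
    is not valid JSON, is not an object, or lacks a required key.
    """
    match = _FENCE.search(llm_text)
    raw = match.group(1) if match else llm_text

    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")

    missing = REQUIRED_KEYS - set(payload)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return payload
```

The endpoint could return the validated object to the Monaco editor and surface the ValueError message to the user when validation fails.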

4. Questions & Clarifications Needed

AI Provider Details: Since we are using OpenAI-compatible endpoints, please make sure you have the OPENAI_API_BASE and OPENAI_API_KEY ready for the configuration. If you are using a specific model (e.g., gpt-4o, claude-3-5-sonnet, deepseek-chat), please specify it so we can set a sensible default in the .env file.
Ports: Are the default ports (8000, 5432, 9000, 9001, 6379) free to use on your host, or should we map them to different external ports?
Credentials: Do you have a specific preference for the initial database and MinIO credentials, or should I generate secure defaults for the .env file?

Please review this proposal. Upon your confirmation, I will begin with Phase 1.
