# Project Proposal: Auteur AI (Flow Edition)

## 1. Project Overview

This project aims to build **Auteur AI**, a production-grade, self-hosted asset management suite for Google Flow (Veo 3.1). The system will be a full-stack web application designed to run on a Linux cluster using Docker Compose. It features a React frontend, Python FastAPI backend, PostgreSQL database, MinIO object storage, and integration with an external OpenAI-compatible API for script analysis and JSON generation.

## 2. Requirements

### Infrastructure & Architecture

- **Deployment**: Docker Compose (Microservices-ready monolith).
- **Services**:
  - `frontend`: Nginx serving React code.
  - `backend`: FastAPI (Port 8000).
  - `db`: PostgreSQL 16 (Port 5432).
  - `minio`: S3-compatible storage (Port 9000/9001).
  - `redis`: Message broker (Port 6379).
  - `worker`: Python background worker (Celery/ARQ).
- **AI Integration**: External OpenAI-compatible API (e.g., OpenAI, Anthropic, OpenRouter, or self-hosted vLLM external to this stack).

### Backend (Python/FastAPI)

- **Framework**: FastAPI with Async SQLAlchemy.
- **Database**: PostgreSQL with Alembic for migrations.
- **Storage**: `boto3` for MinIO interactions.
- **AI**: `openai` Python library (configured with `OPENAI_API_BASE` and `OPENAI_API_KEY`) to support any compatible provider.
- **Core Modules**:
  - **Asset Library**: Upload handling, storage in MinIO, thumbnail generation.
  - **Script Parser**: Ingest text/fountain files, chunking, AI analysis (Scene/Shot identification).
  - **Flow Assembly**: Generating Google Veo compliant JSON payloads via LLM.

### Frontend (React/Vite)

- **Stack**: React 18, TypeScript, TailwindCSS.
- **UI Library**: Shadcn/UI, Lucide-react.
- **State Management**: TanStack Query (Server state), Zustand (Client state).
- **Features**:
  - Asset Library with drag-and-drop uploads.
  - Script ingestion and review interface.
  - "IDE-like" JSON editor (Monaco Editor).
  - Optimistic UI updates.

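On the backend side, the provider-agnostic AI configuration listed above (`OPENAI_API_BASE` and `OPENAI_API_KEY`) could be loaded roughly as follows. This is a minimal sketch using only the standard library, not a settled design: the `AISettings` class, the `load_ai_settings` helper, the `OPENAI_MODEL` variable, and the fallback values are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AISettings:
    """Connection details for an OpenAI-compatible provider (hypothetical type)."""

    base_url: str
    api_key: str
    model: str


def load_ai_settings(env: Optional[Mapping[str, str]] = None) -> AISettings:
    """Read AI settings from the environment, failing early if the key is missing."""
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY must be set")

    # Assumed fallback: the public OpenAI endpoint. A self-hosted vLLM server
    # or OpenRouter would override this through OPENAI_API_BASE.
    base_url = env.get("OPENAI_API_BASE", "https://api.openai.com/v1").rstrip("/")

    # OPENAI_MODEL is a hypothetical variable name; "gpt-4o" is a placeholder
    # default until the model question in Section 4 is answered.
    model = env.get("OPENAI_MODEL", "gpt-4o")

    return AISettings(base_url=base_url, api_key=api_key, model=model)
```

The resulting values would then be handed to the `openai` client at startup (its constructor accepts `base_url` and `api_key` arguments), so switching providers stays a change to `.env` rather than to code.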
### Data Models

- **Projects**: Top-level container.
- **Ingredients**: Assets (Characters, Locations, Objects) with S3 keys and metadata.
- **Scenes**: Script sections.
- **Shots**: Atomic units with descriptions, durations, and extensive metadata (Slot system, Veo JSON).

## 3. Implementation Plan

I propose the following phased approach to build the application:

### Phase 1: Infrastructure & Initialization

- Set up the project repository structure.
- Create `docker-compose.yml` with all defined services (DB, MinIO, Redis, Backend, Frontend stub).
- Verify container orchestration and networking.

### Phase 2: Backend Core

- Initialize FastAPI project with Poetry or pip.
- Configure Database connection (Async SQLAlchemy).
- Define Pydantic models and SQLAlchemy tables based on the schema.
- Set up Alembic and run initial migrations.

### Phase 3: Asset Management (MVP)

- Implement MinIO client setup.
- Build `POST /api/assets/upload`.
- Implement background worker for thumbnail generation (Redis + Worker).
- Create basic Frontend Asset Library view to test uploads and retrieval.

### Phase 4: Script Ingestion & AI Integration

- Implement `POST /api/scripts/parse`.
- Integrate `openai` client.
- Develop the prompt engineering logic for Script -> Visual Actions.
- Build Frontend interface for Script upload and Shot review.

### Phase 5: Flow Generation & UI Polish

- Implement `POST /api/shots/:id/generate-flow`.
- Integrate Monaco Editor for JSON result viewing.
- Finalize UI styling and responsiveness.
- Comprehensive documentation.

## 4. Questions & Clarifications Needed

1. **AI Provider Details**: Since we are using OpenAI-compatible endpoints, please have the `OPENAI_API_BASE` and `OPENAI_API_KEY` values ready for configuration. If you are using a specific model (e.g., `gpt-4o`, `claude-3-5-sonnet`, `deepseek-chat`), please specify it so we can set a sensible default in the `.env` file.
2. **Ports**: Are the default ports (8000, 5432, 9000, 9001, 6379) free to use on your host, or should we map them to different external ports?
3. **Credentials**: Do you have a specific preference for the initial database/MinIO credentials, or should I generate secure defaults for the `.env` file?

Please review this proposal. Upon your confirmation, I will begin with Phase 1.
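
To illustrate the "generate secure defaults" option from question 3, here is a minimal sketch of how the initial credentials might be produced. The helper names `generate_env_defaults` and `render_env` and the `auteur` user and database names are hypothetical. The variable names follow the conventions of the official PostgreSQL and MinIO container images, which is an assumption about which images the stack will use.

```python
import secrets
from typing import Dict


def generate_env_defaults() -> Dict[str, str]:
    """Return initial credentials with randomly generated passwords."""
    return {
        # "auteur" is a placeholder user/database name, not a decided value.
        "POSTGRES_USER": "auteur",
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        "POSTGRES_DB": "auteur",
        "MINIO_ROOT_USER": "auteur",
        "MINIO_ROOT_PASSWORD": secrets.token_urlsafe(24),
    }


def render_env(values: Dict[str, str]) -> str:
    """Render key/value pairs in .env format, one KEY=value per line."""
    return "".join(f"{key}={value}\n" for key, value in values.items())


if __name__ == "__main__":
    # Print to stdout so the result can be reviewed before it is saved as .env.
    print(render_env(generate_env_defaults()), end="")
```

`secrets.token_urlsafe` draws from the operating system's cryptographic random source and emits only letters, digits, `-`, and `_`, so the generated values need no quoting in a `.env` file.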