4.2 KiB
4.2 KiB
Project Proposal: Auteur AI (Flow Edition)
1. Project Overview
This project aims to build Auteur AI, a production-grade, self-hosted asset management suite for Google Flow (Veo 3.1). The system will be a full-stack web application designed to run on a Linux cluster using Docker Compose. It features a React frontend, Python FastAPI backend, PostgreSQL database, MinIO object storage, and integration with an external OpenAI-compatible API for script analysis and JSON generation.
2. Requirements
Infrastructure & Architecture
- Deployment: Docker Compose (Microservices-ready monolith).
- Services:
frontend: Nginx serving React code.backend: FastAPI (Port 8000).db: PostgreSQL 16 (Port 5432).minio: S3-compatible storage (Port 9000/9001).redis: Message broker (Port 6379).worker: Python background worker (Celery/ARQ).
- AI Integration: External OpenAI-compatible API (e.g., OpenAI, Anthropic, OpenRouter, or self-hosted vLLM external to this stack).
Backend (Python/FastAPI)
- Framework: FastAPI with Async SQLAlchemy.
- Database: PostgreSQL with Alembic for migrations.
- Storage:
boto3for MinIO interactions. - AI:
openaiPython library (configured withOPENAI_API_BASEandOPENAI_API_KEY) to support any compatible provider. - Core Modules:
- Asset Library: Upload handling, storage in MinIO, thumbnail generation.
- Script Parser: Ingest text/fountain files, chunking, AI analysis (Scene/Shot identification).
- Flow Assembly: Generating Google Veo compliant JSON payloads via LLM.
Frontend (React/Vite)
- Stack: React 18, TypeScript, TailwindCSS.
- UI Library: Shadcn/UI, Lucide-react.
- State Management: TanStack Query (Server state), Zustand (Client state).
- Features:
- Asset Library with drag-and-drop uploads.
- Script ingestion and review interface.
- "IDE-like" JSON editor (Monaco Editor).
- Optimistic UI updates.
Data Models
- Projects: Top-level container.
- Ingredients: Assets (Characters, Locations, Objects) with S3 keys and metadata.
- Scenes: Script sections.
- Shots: Atomic units with descriptions, durations, and extensive metadata (Slot system, Veo JSON).
3. Implementation Plan
I propose the following phased approach to build the application:
Phase 1: Infrastructure & Initialization
- Set up the project repository structure.
- Create
docker-compose.ymlwith all defined services (Db, MinIO, Redis, Backend, Frontend stub). - Verify container orchestration and networking.
Phase 2: Backend Core
- Initialize FastAPI project with poetry or pip.
- Configure Database connection (Async SQLAlchemy).
- Define Pydantic models and SQLAlchemy tables based on the schema.
- Set up Alembic and run initial migrations.
Phase 3: Asset Management (MVP)
- Implement MinIO client setup.
- Build
POST /api/assets/upload. - Implement background worker for thumbnail generation (Redis + Worker).
- Create basic Frontend Asset Library view to test uploads and retrieval.
Phase 4: Script Ingestion & AI Integration
- Implement
POST /api/scripts/parse. - Integrate
openaiclient. - Develop the prompt engineering logic for Script -> Visual Actions.
- Build Frontend interface for Script upload and Shot review.
Phase 5: Flow Generation & UI Polish
- Implement
POST /api/shots/:id/generate-flow. - Integrate Monaco Editor for JSON result viewing.
- Finalize UI styling and responsiveness.
- Comprehensive documentation.
4. Questions & Clarifications Needed
- AI Provider Details: Since we are using OpenAI-compatible endpoints, please make sure you have the
OPENAI_API_BASEandOPENAI_API_KEYready for the configuration. If you are using a specific model (e.g.,gpt-4o,claude-3-5-sonnet,deepseek-chat), please specify it so we can set a sensible default in the.envfile. - Ports: Are the default ports (8000, 5432, 9000, 9001, 6379) free to use on your host, or should we map them to different external ports?
- Credentials: Do you have specific preference for the initial database/MinIO credentials, or should I generate secure defaults for the
.envfile?
Please review this proposal. Upon your confirmation, I will begin with Phase 1.