2026-01-27 14:00:02 +01:00

Project Proposal: Auteur AI (Flow Edition)

1. Project Overview

This project aims to build Auteur AI, a production-grade, self-hosted asset management suite for Google Flow (Veo 3.1). The system will be a full-stack web application designed to run on a Linux cluster using Docker Compose. It features a React frontend, Python FastAPI backend, PostgreSQL database, MinIO object storage, and integration with an external OpenAI-compatible API for script analysis and JSON generation.

2. Requirements

Infrastructure & Architecture

  • Deployment: Docker Compose (Microservices-ready monolith).
  • Services:
    • frontend: Nginx serving React code.
    • backend: FastAPI (Port 8000).
    • db: PostgreSQL 16 (Port 5432).
    • minio: S3-compatible storage (Port 9000 for the S3 API, Port 9001 for the web console).
    • redis: Message broker (Port 6379).
    • worker: Python background worker (Celery/ARQ).
  • AI Integration: External OpenAI-compatible API (e.g., OpenAI, Anthropic, OpenRouter, or self-hosted vLLM external to this stack).

Backend (Python/FastAPI)

  • Framework: FastAPI with Async SQLAlchemy.
  • Database: PostgreSQL with Alembic for migrations.
  • Storage: boto3 for MinIO interactions.
  • AI: openai Python library, with OPENAI_API_BASE passed to the client as its base URL and OPENAI_API_KEY as its API key, to support any compatible provider.
  • Core Modules:
    • Asset Library: Upload handling, storage in MinIO, thumbnail generation.
    • Script Parser: Ingest plain-text and Fountain files, split them into chunks, and run AI analysis (Scene/Shot identification).
    • Flow Assembly: Generate Veo-compliant JSON payloads via LLM.
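
The AI configuration above could be wired up roughly as follows. This is a minimal sketch, assuming the backend reads OPENAI_API_BASE and OPENAI_API_KEY from the environment and passes them to the client explicitly. The names AISettings, load_ai_settings, make_client, the OPENAI_MODEL variable, and the gpt-4o default are illustrative assumptions, not decided parts of the design.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AISettings:
    """Hypothetical settings holder; field names are illustrative."""
    api_base: str
    api_key: str
    model: str


def load_ai_settings(env=None) -> AISettings:
    """Read provider settings from the environment (or from a dict in tests)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return AISettings(
        # Falls back to the public OpenAI endpoint when no base URL is given.
        api_base=env.get("OPENAI_API_BASE", "https://api.openai.com/v1"),
        api_key=api_key,
        model=env.get("OPENAI_MODEL", "gpt-4o"),  # assumed variable and default
    )


def make_client(settings: AISettings):
    """Build an async OpenAI-compatible client (needs the openai>=1.0 package)."""
    from openai import AsyncOpenAI  # imported lazily so this module loads without it

    return AsyncOpenAI(base_url=settings.api_base, api_key=settings.api_key)
```

Passing the base URL explicitly keeps the provider choice in one place, so switching to OpenRouter or a self-hosted vLLM endpoint should only require changing the .env file.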

Frontend (React/Vite)

  • Stack: React 18, TypeScript, TailwindCSS.
  • UI Library: Shadcn/UI, Lucide-react.
  • State Management: TanStack Query (Server state), Zustand (Client state).
  • Features:
    • Asset Library with drag-and-drop uploads.
    • Script ingestion and review interface.
    • "IDE-like" JSON editor (Monaco Editor).
    • Optimistic UI updates.

Data Models

  • Projects: Top-level container.
  • Ingredients: Assets (Characters, Locations, Objects) with S3 keys and metadata.
  • Scenes: Script sections.
  • Shots: Atomic units with descriptions, durations, and extensive metadata (Slot system, Veo JSON).
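
The hierarchy above can be sketched with plain dataclasses. This is only an outline of the relationships, not the final schema: the real backend would express these as SQLAlchemy tables and Pydantic models, and every field name here is an assumption. In particular, the source does not define the slot system, so it is modelled here as a mapping from a slot name to an ingredient.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class IngredientKind(str, Enum):
    CHARACTER = "character"
    LOCATION = "location"
    OBJECT = "object"


@dataclass
class Ingredient:
    name: str
    kind: IngredientKind
    s3_key: str  # object key in MinIO
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class Shot:
    description: str
    duration_seconds: float
    # Assumed shape of the slot system: slot name -> referenced ingredient.
    slots: dict[str, Ingredient] = field(default_factory=dict)
    veo_json: Optional[dict[str, Any]] = None  # filled in by Flow Assembly


@dataclass
class Scene:
    heading: str
    shots: list[Shot] = field(default_factory=list)


@dataclass
class Project:
    name: str
    ingredients: list[Ingredient] = field(default_factory=list)
    scenes: list[Scene] = field(default_factory=list)

    def total_duration(self) -> float:
        """Sum the durations of all shots in all scenes."""
        return sum(s.duration_seconds for sc in self.scenes for s in sc.shots)
```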

3. Implementation Plan

I propose the following phased approach to build the application:

Phase 1: Infrastructure & Initialization

  • Set up the project repository structure.
  • Create docker-compose.yml with the services defined above (db, minio, redis, backend, and a frontend stub).
  • Verify container orchestration and networking.
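
One way to verify networking once the stack is up is a small TCP reachability check. The sketch below assumes the default ports from the service list are published on the local host. The service labels, and the split of MinIO's two ports into API and console, are assumptions made for illustration.

```python
import socket

# Default ports from the service list; adjust if they are remapped on the host.
SERVICE_PORTS = {
    "backend": 8000,
    "db": 5432,
    "minio-api": 9000,      # assumed role of port 9000
    "minio-console": 9001,  # assumed role of port 9001
    "redis": 6379,
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "127.0.0.1") -> dict[str, bool]:
    """Map each service label to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in SERVICE_PORTS.items()}
```

Calling check_services() after `docker compose up -d` gives a quick pass/fail map. Run before the stack starts, the same call shows whether any of the default ports is already taken on the host.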

Phase 2: Backend Core

  • Initialize the FastAPI project with Poetry or pip.
  • Configure the database connection (Async SQLAlchemy).
  • Define Pydantic models and SQLAlchemy tables based on the data models in Section 2.
  • Set up Alembic and run initial migrations.

Phase 3: Asset Management (MVP)

  • Implement MinIO client setup.
  • Build POST /api/assets/upload.
  • Implement background worker for thumbnail generation (Redis + Worker).
  • Create basic Frontend Asset Library view to test uploads and retrieval.
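
The storage step behind the upload endpoint might look like the sketch below. It assumes content-addressed object keys and a boto3 S3 client pointed at MinIO. The key layout and the function names are illustrative choices, not something the proposal fixes.

```python
import hashlib
import mimetypes
from pathlib import PurePosixPath


def build_object_key(project_id: int, filename: str, data: bytes) -> str:
    """Derive a content-addressed key: projects/<id>/assets/<sha256><ext>."""
    digest = hashlib.sha256(data).hexdigest()
    suffix = PurePosixPath(filename).suffix.lower()
    return f"projects/{project_id}/assets/{digest}{suffix}"


def upload_asset(s3_client, bucket: str, project_id: int, filename: str, data: bytes) -> str:
    """Store the bytes through an S3 client and return the object key.

    s3_client is expected to be a boto3 client, for example:
        boto3.client("s3", endpoint_url="http://minio:9000",
                     aws_access_key_id=..., aws_secret_access_key=...)
    """
    key = build_object_key(project_id, filename, data)
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    s3_client.put_object(Bucket=bucket, Key=key, Body=data, ContentType=content_type)
    return key
```

Hashing the content means that re-uploading an identical file maps to the same key, and the returned key is what the thumbnail worker would receive over Redis.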

Phase 4: Script Ingestion & AI Integration

  • Implement POST /api/scripts/parse.
  • Integrate openai client.
  • Develop the prompts that turn script text into visual actions (Script -> Visual Actions).
  • Build Frontend interface for Script upload and Shot review.
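
Before any text reaches the LLM, the parser could split a Fountain script into per-scene chunks. The sketch below is a simplified take on Fountain scene headings: it recognizes lines that follow a blank line and begin with INT, EXT, EST, INT./EXT, INT/EXT, or I/E. It ignores forced headings (a leading period) and drops any text before the first heading, such as a title page. The function name and return shape are assumptions.

```python
import re

# Scene heading prefixes, followed by a period or a space.
SCENE_HEADING = re.compile(r"^(INT\.?/EXT|INT|EXT|EST|I/E)[. ]", re.IGNORECASE)


def split_scenes(script: str) -> list[dict]:
    """Split a script into scenes, each with a heading and its body text."""
    scenes = []
    current = None
    previous_blank = True  # the start of the file counts as a blank line
    for line in script.splitlines():
        stripped = line.strip()
        if previous_blank and SCENE_HEADING.match(stripped):
            current = {"heading": stripped, "lines": []}
            scenes.append(current)
        elif current is not None:
            current["lines"].append(line)
        previous_blank = stripped == ""
    return [
        {"heading": s["heading"], "body": "\n".join(s["lines"]).strip()}
        for s in scenes
    ]
```

Each returned scene is a natural chunk for one analysis request, which keeps prompts short and lets a failed scene be retried on its own.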

Phase 5: Flow Generation & UI Polish

  • Implement POST /api/shots/:id/generate-flow.
  • Integrate Monaco Editor for JSON result viewing.
  • Finalize UI styling and responsiveness.
  • Write comprehensive documentation.
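
LLM replies often wrap JSON in prose or code fences, so the generate-flow endpoint will likely need a defensive parsing step before the result reaches the editor. The sketch below shows one way to do that. The required keys are purely hypothetical placeholders, because the proposal does not specify the Veo payload schema.

```python
import json

# Hypothetical placeholder keys; replace with the real payload schema.
REQUIRED_KEYS = ("prompt", "duration_seconds")


def extract_json_object(reply: str) -> dict:
    """Parse the outermost {...} span of a model reply as JSON."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    # json.JSONDecodeError is a ValueError subclass, so callers catch one type.
    return json.loads(reply[start : end + 1])


def validate_flow_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    duration = payload.get("duration_seconds")
    if duration is not None and (
        not isinstance(duration, (int, float)) or duration <= 0
    ):
        problems.append("duration_seconds must be a positive number")
    return problems
```

Returning a list of problems instead of raising lets the frontend surface all of them at once next to the Monaco editor.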

4. Questions & Clarifications Needed

  1. AI Provider Details: Because the backend targets OpenAI-compatible endpoints, please have the OPENAI_API_BASE and OPENAI_API_KEY values ready. If you plan to use a specific model (e.g., gpt-4o, claude-3-5-sonnet, deepseek-chat), please name it so we can set a sensible default in the .env file.
  2. Ports: Are the default ports (8000, 5432, 9000, 9001, 6379) free to use on your host, or should we map them to different external ports?
  3. Credentials: Do you have a specific preference for the initial database/MinIO credentials, or should I generate secure defaults for the .env file?
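
If secure defaults are preferred, they could be generated along the lines of the sketch below. It uses the environment variable names read by the official PostgreSQL and MinIO container images. The "auteur" usernames, the password length, and the function names are assumptions.

```python
import secrets


def generate_env_defaults() -> dict[str, str]:
    """Create random credentials for the database and object storage."""
    return {
        "POSTGRES_USER": "auteur",  # assumed username
        "POSTGRES_PASSWORD": secrets.token_urlsafe(24),
        "MINIO_ROOT_USER": "auteur",  # assumed username
        "MINIO_ROOT_PASSWORD": secrets.token_urlsafe(24),
    }


def render_env(values: dict[str, str]) -> str:
    """Render KEY=value lines suitable for a .env file."""
    return "".join(f"{key}={value}\n" for key, value in values.items())
```

The URL-safe alphabet avoids characters that need quoting in a .env file, and the output of render_env can be written to disk once and kept out of version control.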

Please review this proposal. Upon your confirmation, I will begin with Phase 1.