first commit
This commit is contained in:
94
PROPOSAL.md
Normal file
94
PROPOSAL.md
Normal file
@@ -0,0 +1,94 @@
|
||||
# Project Proposal: Auteur AI (Flow Edition)
|
||||
|
||||
## 1. Project Overview
|
||||
|
||||
This project aims to build **Auteur AI**, a production-grade, self-hosted asset management suite for Google Flow (Veo 3.1). The system will be a full-stack web application designed to run on a Linux cluster using Docker Compose. It features a React frontend, Python FastAPI backend, PostgreSQL database, MinIO object storage, and integration with an external OpenAI-compatible API for script analysis and JSON generation.
|
||||
|
||||
## 2. Requirements
|
||||
|
||||
### Infrastructure & Architecture
|
||||
|
||||
- **Deployment**: Docker Compose (Microservices-ready monolith).
|
||||
- **Services**:
|
||||
- `frontend`: Nginx serving React code.
|
||||
- `backend`: FastAPI (Port 8000).
|
||||
- `db`: PostgreSQL 16 (Port 5432).
|
||||
- `minio`: S3-compatible storage (Port 9000/9001).
|
||||
- `redis`: Message broker (Port 6379).
|
||||
- `worker`: Python background worker (Celery/ARQ).
|
||||
- **AI Integration**: External OpenAI-compatible API (e.g., OpenAI, Anthropic, OpenRouter, or self-hosted vLLM external to this stack).
|
||||
|
||||
### Backend (Python/FastAPI)
|
||||
|
||||
- **Framework**: FastAPI with Async SQLAlchemy.
|
||||
- **Database**: PostgreSQL with Alembic for migrations.
|
||||
- **Storage**: `boto3` for MinIO interactions.
|
||||
- **AI**: `openai` Python library (configured with `OPENAI_API_BASE` and `OPENAI_API_KEY`) to support any compatible provider.
|
||||
- **Core Modules**:
|
||||
- **Asset Library**: Upload handling, storage in MinIO, thumbnail generation.
|
||||
- **Script Parser**: Ingest text/fountain files, chunking, AI analysis (Scene/Shot identification).
|
||||
- **Flow Assembly**: Generating Google Veo compliant JSON payloads via LLM.
|
||||
|
||||
### Frontend (React/Vite)
|
||||
|
||||
- **Stack**: React 18, TypeScript, TailwindCSS.
|
||||
- **UI Library**: Shadcn/UI, Lucide-react.
|
||||
- **State Management**: TanStack Query (Server state), Zustand (Client state).
|
||||
- **Features**:
|
||||
- Asset Library with drag-and-drop uploads.
|
||||
- Script ingestion and review interface.
|
||||
- "IDE-like" JSON editor (Monaco Editor).
|
||||
- Optimistic UI updates.
|
||||
|
||||
### Data Models
|
||||
|
||||
- **Projects**: Top-level container.
|
||||
- **Ingredients**: Assets (Characters, Locations, Objects) with S3 keys and metadata.
|
||||
- **Scenes**: Script sections.
|
||||
- **Shots**: Atomic units with descriptions, durations, and extensive metadata (Slot system, Veo JSON).
|
||||
|
||||
## 3. Implementation Plan
|
||||
|
||||
I propose the following phased approach to build the application:
|
||||
|
||||
### Phase 1: Infrastructure & Initialization
|
||||
|
||||
- Set up the project repository structure.
|
||||
- Create `docker-compose.yml` with all defined services (Db, MinIO, Redis, Backend, Frontend stub).
|
||||
- Verify container orchestration and networking.
|
||||
|
||||
### Phase 2: Backend Core
|
||||
|
||||
- Initialize FastAPI project with poetry or pip.
|
||||
- Configure Database connection (Async SQLAlchemy).
|
||||
- Define Pydantic models and SQLAlchemy tables based on the schema.
|
||||
- Set up Alembic and run initial migrations.
|
||||
|
||||
### Phase 3: Asset Management (MVP)
|
||||
|
||||
- Implement MinIO client setup.
|
||||
- Build `POST /api/assets/upload`.
|
||||
- Implement background worker for thumbnail generation (Redis + Worker).
|
||||
- Create basic Frontend Asset Library view to test uploads and retrieval.
|
||||
|
||||
### Phase 4: Script Ingestion & AI Integration
|
||||
|
||||
- Implement `POST /api/scripts/parse`.
|
||||
- Integrate `openai` client.
|
||||
- Develop the prompt engineering logic for Script -> Visual Actions.
|
||||
- Build Frontend interface for Script upload and Shot review.
|
||||
|
||||
### Phase 5: Flow Generation & UI Polish
|
||||
|
||||
- Implement `POST /api/shots/:id/generate-flow`.
|
||||
- Integrate Monaco Editor for JSON result viewing.
|
||||
- Finalize UI styling and responsiveness.
|
||||
- Comprehensive documentation.
|
||||
|
||||
## 4. Questions & Clarifications Needed
|
||||
|
||||
1. **AI Provider Details**: Since we are using OpenAI-compatible endpoints, please make sure you have the `OPENAI_API_BASE` and `OPENAI_API_KEY` ready for the configuration. If you are using a specific model (e.g., `gpt-4o`, `claude-3-5-sonnet`, `deepseek-chat`), please specify it so we can set a sensible default in the `.env` file.
|
||||
2. **Ports**: Are the default ports (8000, 5432, 9000, 9001, 6379) free to use on your host, or should we map them to different external ports?
|
||||
3. **Credentials**: Do you have specific preference for the initial database/MinIO credentials, or should I generate secure defaults for the `.env` file?
|
||||
|
||||
Please review this proposal. Upon your confirmation, I will begin with Phase 1.
|
||||
Reference in New Issue
Block a user