---
title: Video Rag
emoji: 🎥
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---


# Video RAG System Project

This FastAPI-based Video RAG (Retrieval-Augmented Generation) system provides endpoints to:

1. **Register & Authenticate** users
2. **Transcribe** YouTube or uploaded videos
3. **Query** the RAG system
4. **Manage** sessions (list, view, delete)

---

## Endpoint Flow

```mermaid
graph TD
  A[POST /register] --> B[POST /token]
  B --> C[POST /transcribe]
  B --> D[POST /upload]
  C --> E[Start RAG session]
  D --> E
  E --> F[POST /query]
  E --> G[GET /sessions]
  G --> H[GET /sessions/{session_id}]
  H --> F
  G --> I[DELETE /sessions/{session_id}]
```

1. **User Registration & Login**  
   - **POST /register**: Create a new user.  
   - **POST /token**: Obtain a JWT access token.

2. **Video Transcription**  
   - **POST /transcribe** (YouTube URL): Transcribe via Google GenAI → split & store chunks → initialize chat history → return `session_id`.  
   - **POST /upload** (Multipart Form Video): Upload & transcribe file → split & store chunks → initialize chat history → return `session_id`.

3. **Query RAG System**  
   - **POST /query** with `{ session_id, query }`:  
     • Rebuild FAISS retriever from MongoDB chunks  
     • Invoke ConversationalRetrievalChain  
     • Append messages to chat history  
     • Return `{ answer, session_id, source_documents }`

4. **Session Management**  
   - **GET /sessions**: List all sessions for the current user.  
   - **GET /sessions/{session_id}**: Get full transcription & Q&A history.  
   - **DELETE /sessions/{session_id}**: Remove metadata, chunks, chat history, and video files.
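
As a rough client-side illustration of the flow above, the following Python sketch builds (but does not send) the HTTP request for each step using only the standard library. The base URL, the `username`/`password` field names, the form-encoded body for `/token`, and the `url` field for `/transcribe` are assumptions made for illustration; only the paths and the `{ session_id, query }` body come from this document.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the app is reachable locally on the Space's app_port.
BASE_URL = "http://localhost:7860"


def register_request(username: str, password: str) -> Request:
    """Build POST /register with a JSON body (field names are hypothetical)."""
    body = json.dumps({"username": username, "password": password}).encode()
    return Request(
        f"{BASE_URL}/register",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def token_request(username: str, password: str) -> Request:
    """Build POST /token; a form-encoded body is assumed here."""
    body = urlencode({"username": username, "password": password}).encode()
    return Request(
        f"{BASE_URL}/token",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def authed_request(
    method: str, path: str, token: str, payload: Optional[dict] = None
) -> Request:
    """Build an authenticated request carrying the JWT as a Bearer token."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode()
        headers["Content-Type"] = "application/json"
    return Request(f"{BASE_URL}{path}", data=data, method=method, headers=headers)


def build_flow(token: str, session_id: str, question: str) -> list:
    """Return the authenticated requests of the flow, in order."""
    return [
        # Hypothetical body field "url" for the YouTube link.
        authed_request("POST", "/transcribe", token, {"url": "https://youtu.be/..."}),
        # Body shape { session_id, query } as documented above.
        authed_request(
            "POST", "/query", token, {"session_id": session_id, "query": question}
        ),
        authed_request("GET", "/sessions", token),
        authed_request("GET", f"/sessions/{session_id}", token),
        authed_request("DELETE", f"/sessions/{session_id}", token),
    ]
```

Each returned object can be passed to `urllib.request.urlopen` once the server is running. In a real client the `session_id` would be read from the `/transcribe` or `/upload` response rather than supplied up front.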

---

## README.md

```markdown
# Video RAG System

## Overview
A FastAPI application that:

- Authenticates users (JWT)
- Transcribes videos (YouTube or upload) via Google GenAI
- Stores transcription chunks in MongoDB
- Builds a FAISS retriever on demand
- Provides a conversational retrieval endpoint
- Manages sessions and associated data

## API Endpoints

| Method | Path                       | Auth Required | Description                                   |
|--------|----------------------------|---------------|-----------------------------------------------|
| POST   | /register                  | No            | Create a new user                             |
| POST   | /token                     | No            | Login and return JWT token                    |
| POST   | /transcribe                | Yes           | Transcribe YouTube video and init session     |
| POST   | /upload                    | Yes           | Upload & transcribe video file                |
| POST   | /query                     | Yes           | Run Q&A against a session                     |
| GET    | /sessions                  | Yes           | List all user sessions                        |
| GET    | /sessions/{session_id}     | Yes           | Get session transcription & chat history      |
| DELETE | /sessions/{session_id}     | Yes           | Delete session & all associated data          |

## Usage
1. Clone repo & install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
2. Create `.env` with your credentials (MongoDB, JWT secret, API keys).
3. Run the app:
   ```bash
   uvicorn app.main:app --reload
   ```
4. Interact via HTTP clients (curl, Postman) following the flow above.

## Folder Structure
```
rag_system/
├── app/
│   ├── main.py
│   ├── config.py
│   ├── dependencies.py
│   ├── models/
│   ├── db/
│   ├── services/
│   ├── routes/
│   └── utils/
├── temp_videos/
├── .env
├── requirements.txt
└── README.md
```
```
```