<div align="center">
<img width="1200" height="475" alt="GHBanner" src="https://github.com/user-attachments/assets/0aa67016-6eaf-458a-adb2-6e31a0763ed6" />
</div>
|
|
## Run Locally

**Prerequisites:** Node.js

1. Install dependencies:
   `npm install`
2. Set `GEMINI_API_KEY` in [.env.local](.env.local) to your Gemini API key.
3. Run the app:
   `npm run dev`
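
For step 2, `.env.local` should contain a single line of the form below (the value shown is a placeholder, not a real credential):

```
GEMINI_API_KEY=your_gemini_api_key_here
```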
|
|
|
|
## Workflow Overview
|
|
```mermaid
flowchart TD
    subgraph S1 [Phase 1: Data Ingestion]
        A[User Selects Working Group] -->|SA1-6, RAN1-2| B[Fetch Meetings via POST]
        B --> C[User Selects Meeting]
        C --> D[Filter Docs by Metadata]
        D --> E[Extract Raw Text]
    end

    subgraph S2 ["Phase 2: Refinement & Caching"]
        E --> F{Text in Cache?}
        F -- Yes --> G[Retrieve Cached Refinement]
        F -- No --> H[LLM Processing]
        H --> I["Task: Dense Chunking & 'What's New'"]
        I --> J[Store in Dataset]
        J --> G
    end

    subgraph S3 [Phase 3: Pattern Analysis]
        G --> K[User Selects Pattern/Prompt]
        K --> L{Result in Cache?}
        L -- Yes --> M[Retrieve Analysis]
        L -- No --> N[Execute Pattern]
        N --> O[Multi-Model Verification]
        O --> P[Store Result]
    end

    S1 --> S2 --> S3
```
|
|
### Detailed Process Specification

#### Phase 1: Data Ingestion & Extraction
The user navigates a strict hierarchy to isolate relevant source text.
1. **Working Group Selection:** The user selects one group from the allowlist: `['SA1', 'SA2', 'SA3', 'SA4', 'SA5', 'SA6', 'RAN1', 'RAN2']`.
2. **Meeting Retrieval:** The system sends a `POST` request containing the selected working group to the endpoint and receives the meeting list.
3. **Document Filtering:** The user selects a meeting, then filters the resulting file list using the available metadata.
4. **Text Extraction:** The system extracts raw content from the filtered files into a list of text strings.
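
As a rough sketch of steps 1–2 above — the request-body shape and helper names here are assumptions for illustration, not the app's actual API — the allowlist check and `POST` body construction might look like:

```typescript
// The eight working groups the UI allows (matches the allowlist above).
const WORKING_GROUPS = ["SA1", "SA2", "SA3", "SA4", "SA5", "SA6", "RAN1", "RAN2"] as const;
type WorkingGroup = (typeof WORKING_GROUPS)[number];

// Type guard: narrows an arbitrary string to the allowlisted union.
function isWorkingGroup(value: string): value is WorkingGroup {
  return (WORKING_GROUPS as readonly string[]).includes(value);
}

// Builds the POST body for the meeting-list request; rejects anything
// outside the allowlist before a network call is ever made.
function buildMeetingRequest(group: string): { workingGroup: WorkingGroup } {
  if (!isWorkingGroup(group)) {
    throw new Error(`Working group not in allowlist: ${group}`);
  }
  return { workingGroup: group };
}
```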
|
|
#### Phase 2: Content Refinement (with Caching)
Raw text is processed into high-value summaries to reduce noise.
* **Cache Check:** Before processing, check the dataset for an existing `(text_hash, refined_output)` pair to avoid duplicate processing.
* **LLM Processing:** If not cached, pass the text to the selected LLM (a default is provided and is user-changeable).
* **Prompt Objective:**
    1. Create information-dense chunks (minimizing near-duplicates).
    2. Generate a "What's New" paragraph wrapped in `SUGGESTION START` and `SUGGESTION END` tags.
* **Storage:** Save the input text and the LLM output to the dataset.
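
The cache check above can be sketched as follows — a minimal in-memory version, assuming a SHA-256 `text_hash` and an injected `refine()` callback standing in for the real LLM call and dataset storage:

```typescript
import { createHash } from "node:crypto";

type RefineFn = (text: string) => Promise<string>;

// In-memory stand-in for the dataset: text_hash -> refined_output.
const cache = new Map<string, string>();

// Stable key for a piece of raw text.
function textHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

async function refineWithCache(text: string, refine: RefineFn): Promise<string> {
  const key = textHash(text);
  const hit = cache.get(key);
  if (hit !== undefined) return hit;   // cache hit: reuse prior refinement
  const refined = await refine(text);  // LLM call happens only on a miss
  cache.set(key, refined);             // store the (text_hash, refined_output) pair
  return refined;
}
```

Keying on a hash of the raw text (rather than a filename) means the same document re-fetched from a different meeting still hits the cache.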
|
|
#### Phase 3: Pattern Analysis & Verification
Refined text is analyzed using user-defined patterns.
* **Pattern Selection:** The user applies a specific prompt/pattern to the refined documents.
* **Cache Check:** Check the results database for an existing `(document_id, pattern_id)` result.
* **Execution & Verification:**
    * Run the selected pattern against the documents.
    * **Verifier Mode:** Optionally run the same input across multiple models in parallel and compare the results for accuracy.
* **Storage:** Save the final analysis in the database to avoid future re-computation.
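
Verifier mode can be sketched as below — the model runners are placeholders for real LLM clients, and the agreement check (exact string equality) is a simplifying assumption:

```typescript
type ModelRun = (prompt: string) => Promise<string>;

interface VerifiedResult {
  outputs: string[];   // one answer per model, in input order
  unanimous: boolean;  // true when every model returned identical text
}

// Runs the same pattern prompt across several models in parallel and
// flags disagreement so the user can inspect the divergent outputs.
async function runWithVerification(prompt: string, models: ModelRun[]): Promise<VerifiedResult> {
  const outputs = await Promise.all(models.map((run) => run(prompt)));
  const unanimous = outputs.every((o) => o === outputs[0]);
  return { outputs, unanimous };
}
```

In practice a fuzzier comparison (e.g. a similarity score over the answers) would likely replace exact equality, since different models rarely produce byte-identical text.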