---
title: Council Topics Classifier
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.36.0
app_file: src/streamlit_app.py
pinned: false
license: cc-by-4.0
---

# 🏛️ Council Topics Classifier

**Council Topics Classifier** is a system that automatically identifies the topics of **discussion subjects in Portuguese municipal meeting minutes**.

---

## 🎯 About

This demo showcases the classifier's ability to:
- Detect topics in discussion subjects taken from Portuguese municipal texts
- Use a hybrid feature set (TF-IDF + BERTimbau embeddings)
- Combine Logistic Regression and Gradient Boosting models in an adaptive weighted ensemble
- Apply dynamic thresholds optimized per topic
- Handle imbalanced topic distributions with active learning
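
The hybrid feature set can be pictured as a sparse TF-IDF matrix concatenated column-wise with dense sentence embeddings. The sketch below is only an illustration under stated assumptions: `embed` is a hypothetical stand-in that returns placeholder vectors instead of real BERTimbau embeddings, the three sample texts are invented, and the project's actual preprocessing may differ.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer


def embed(texts, dim=8):
    """Hypothetical stand-in for a BERTimbau sentence encoder.

    Returns deterministic pseudo-random vectors so the example runs offline;
    a real pipeline would return model embeddings here.
    """
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), dim))


# Invented sample inputs, not taken from the project's data.
texts = [
    "Aprovação do orçamento municipal para 2025",
    "Requalificação da rede viária do concelho",
    "Atribuição de apoios às associações culturais",
]

# Sparse lexical features over word 1- to 3-grams.
tfidf = TfidfVectorizer(ngram_range=(1, 3))
X_tfidf = tfidf.fit_transform(texts)

# Dense contextual features (placeholder vectors in this sketch).
X_emb = embed(texts)

# Concatenate column-wise into one hybrid matrix, keeping it sparse.
X_hybrid = hstack([X_tfidf, csr_matrix(X_emb)]).tocsr()
print(X_hybrid.shape)
```

Keeping the combined matrix sparse avoids densifying the TF-IDF block, which usually has far more columns than the embedding block.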

---

## 📊 Model Performance

- **Model Architecture**: one Logistic Regression model plus three Gradient Boosting models
- **Features**: TF-IDF (1–3 n-grams) + BERTimbau contextual embeddings
- **Adaptive weighting**: for rare topics the Logistic Regression (LogReg) output gets the higher weight; for common topics the Gradient Boosting (GB) output gets the higher weight
- **Dynamic thresholds**: optimized per topic using validation data
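
One way to read the adaptive weighting and the per-topic thresholds is sketched below. Everything specific in it is an assumption made for illustration: the linear frequency-to-weight mapping, the 0.3 to 0.7 weight range, the F1-based grid search, and all toy numbers. The function names `adaptive_weights`, `blend`, and `tune_thresholds` are hypothetical and are not taken from the project's code.

```python
import numpy as np


def adaptive_weights(topic_freq, low=0.3, high=0.7):
    """LogReg weight per topic: the rarer the topic, the higher the weight.

    The linear mapping and the [low, high] range are assumptions.
    """
    freq = np.asarray(topic_freq, dtype=float)
    span = freq.max() - freq.min()
    # Rarity is 1 for the rarest topic and 0 for the most common one.
    rarity = (freq.max() - freq) / span if span > 0 else np.zeros_like(freq)
    return low + (high - low) * rarity


def blend(p_logreg, p_gb_list, w_logreg):
    """Average the GB models, then mix with LogReg using per-topic weights."""
    p_gb = np.mean(p_gb_list, axis=0)
    return w_logreg * p_logreg + (1.0 - w_logreg) * p_gb


def tune_thresholds(y_val, val_scores, grid=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """For each topic, keep the grid value with the best validation F1."""
    best = []
    for k in range(y_val.shape[1]):
        y, s = y_val[:, k] == 1, val_scores[:, k]
        best_t, best_f1 = grid[0], -1.0
        for t in grid:
            pred = s >= t
            tp = np.sum(pred & y)
            fp = np.sum(pred & ~y)
            fn = np.sum(~pred & y)
            denom = 2 * tp + fp + fn
            f1 = 2 * tp / denom if denom else 0.0
            if f1 > best_f1:  # strict ">" keeps the lowest threshold on ties
                best_t, best_f1 = t, f1
        best.append(best_t)
    return np.array(best)


# Toy setup: 3 topics ordered common, medium, rare.
topic_freq = [500, 120, 10]
w = adaptive_weights(topic_freq)

# Toy validation split: labels and already-blended scores (4 docs x 3 topics).
y_val = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 0]])
val_scores = np.array([
    [0.90, 0.20, 0.10],
    [0.45, 0.80, 0.20],
    [0.35, 0.65, 0.32],
    [0.10, 0.55, 0.05],
])
thresholds = tune_thresholds(y_val, val_scores)

# Two new documents scored by LogReg and by three GB models.
p_logreg = np.array([[0.80, 0.20, 0.60], [0.10, 0.70, 0.05]])
p_gb_list = [
    np.array([[0.90, 0.10, 0.30], [0.20, 0.60, 0.10]]),
    np.array([[0.70, 0.30, 0.20], [0.10, 0.80, 0.00]]),
    np.array([[0.80, 0.20, 0.40], [0.30, 0.70, 0.05]]),
]
scores = blend(p_logreg, p_gb_list, w)
labels = scores >= thresholds  # one threshold per topic, broadcast over docs
print(thresholds)
print(labels)
```

With these toy values the rare third topic is predicted for the first document mainly because its LogReg score carries the larger weight, and it clears a lower threshold than the two more common topics.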

---

## 🚀 Usage

1. **Try Your Own Text**: Paste Portuguese municipal text into the input area
2. **Demo Examples**: Select one of the pre-loaded examples to see its topic predictions
3. **View Results**: Confidence scores for each predicted topic are displayed interactively

---

## 🔧 Running Locally

```bash
pip install -r requirements.txt
# Entry point as declared by app_file in the Space metadata
streamlit run src/streamlit_app.py
```