---
license: mit
tags:
- recommendation-system
- collaborative-filtering
- matrix-factorization
- movie-recommendations
- movielens
- machine-learning
library_name: scikit-learn
---

# DataSynthis_ML_JobTask

A movie recommendation system that applies item-based collaborative filtering and matrix factorization to the MovieLens 100k dataset.

## Model Description

This model provides personalized movie recommendations using two classical algorithms:

- **Collaborative Filtering (CF)**: Item-based recommendations that score movies by the cosine similarity between their rating vectors
- **Matrix Factorization (SVD)**: Truncated Singular Value Decomposition, which projects the rating matrix onto a small number of latent components

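To make the item-based approach concrete, here is a minimal sketch on a made-up 4x4 rating matrix. The matrix, the `predict_rating` helper, and the similarity-weighted average are illustrative assumptions, not the code shipped in `model`.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy user-item matrix (rows: users, columns: movies); 0 means "not rated".
# These values are invented for illustration, not taken from MovieLens.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# cosine_similarity compares rows, so transpose to compare movie columns.
item_sim = cosine_similarity(ratings.T)


def predict_rating(user_idx: int, item_idx: int) -> float:
    """Similarity-weighted average of the ratings the user has already given."""
    user_ratings = ratings[user_idx]
    rated = user_ratings > 0
    weights = item_sim[item_idx, rated]
    if weights.sum() == 0:
        return 0.0  # no similar rated movie to borrow information from
    return float(weights @ user_ratings[rated] / weights.sum())


# Estimate how user 0 would rate movie 2, which they have not rated yet.
cf_score = predict_rating(user_idx=0, item_idx=2)
print(round(cf_score, 2))
```

Movie 2 is most similar to movie 3, which user 0 rated low, so the estimate is pulled toward the low end of the scale.
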
## Dataset

- **MovieLens 100k**: 100,000 ratings from 943 users on 1,682 movies
- **User ID Range**: 1-943
- **Movie Count**: 1,682 unique movies
- **Rating Scale**: 1-5 stars

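The MovieLens 100k ratings file (`u.data`) is tab-separated with four columns: user ID, item ID, rating, and timestamp. The sketch below parses a few made-up rows in that layout and pivots them into a user-item matrix. The column names and the `pivot_table` step are assumptions about one reasonable way to prepare the data; the repository's `load_data()` may do this differently.

```python
import io

import pandas as pd

# Made-up rows in the u.data layout: user_id <TAB> movie_id <TAB> rating <TAB> timestamp
sample = (
    "1\t10\t5\t881250949\n"
    "1\t20\t3\t881250950\n"
    "2\t10\t4\t881250951\n"
    "3\t20\t1\t881250952\n"
)

ratings = pd.read_csv(
    io.StringIO(sample),  # replace with the path to u.data for the real dataset
    sep="\t",
    names=["user_id", "movie_id", "rating", "timestamp"],
)

# One row per user, one column per movie; unrated pairs are filled with 0.
user_item = ratings.pivot_table(
    index="user_id", columns="movie_id", values="rating", fill_value=0
)
print(user_item)
```
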
## Usage

### Python

```python
from model import predict

# Get recommendations using SVD (default)
recommendations = predict(user_id=1, n_recommendations=10, method="svd")

# Get recommendations using collaborative filtering
recommendations = predict(user_id=1, n_recommendations=10, method="cf")

print(recommendations)
```

### Parameters

- **user_id** (int): User ID between 1-943 (required)
- **n_recommendations** (int): Number of recommendations between 1-20 (default: 10)
- **method** (str): "svd" for matrix factorization or "cf" for collaborative filtering (default: "svd")

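If you call `predict` from your own service, it can help to check arguments against the documented ranges first. The `validate_request` helper below is hypothetical (it is not part of `model`) and only encodes the limits listed above.

```python
VALID_METHODS = {"svd", "cf"}


def validate_request(user_id, n_recommendations=10, method="svd"):
    """Hypothetical pre-check mirroring the documented parameter ranges."""
    if not isinstance(user_id, int) or not 1 <= user_id <= 943:
        raise ValueError("user_id must be an integer between 1 and 943")
    if not isinstance(n_recommendations, int) or not 1 <= n_recommendations <= 20:
        raise ValueError("n_recommendations must be an integer between 1 and 20")
    if method not in VALID_METHODS:
        raise ValueError('method must be "svd" or "cf"')
    return user_id, n_recommendations, method


# Valid arguments pass through unchanged; invalid ones raise ValueError.
print(validate_request(1))
```
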
### Output

Returns a list of dictionaries with movie recommendations:

```json
[
  {
    "movie_id": 50,
    "title": "Star Wars (1977)",
    "predicted_rating": 4.5
  },
  {
    "movie_id": 181,
    "title": "Return of the Jedi (1983)",
    "predicted_rating": 4.3
  }
]
```

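Because the result is a plain list of dictionaries, it can be filtered and sorted with standard Python. This sketch reuses the two example entries shown above; the 4.4 threshold is an arbitrary value chosen for illustration.

```python
import json

# The example output from above, parsed into Python objects.
recommendations = json.loads("""
[
  {"movie_id": 50, "title": "Star Wars (1977)", "predicted_rating": 4.5},
  {"movie_id": 181, "title": "Return of the Jedi (1983)", "predicted_rating": 4.3}
]
""")

# Highest-scoring recommendation.
top = max(recommendations, key=lambda rec: rec["predicted_rating"])

# Titles whose predicted rating clears an arbitrary threshold.
strong = [rec["title"] for rec in recommendations if rec["predicted_rating"] >= 4.4]

print(top["title"], strong)
```
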
## Model Performance

- **SVD Method**: Fast predictions with good accuracy using 20 latent components
- **Collaborative Filtering**: More interpretable, since each recommendation traces back to similar movies the user has rated
- **Cold Start Handling**: Requests for unknown users are handled gracefully with an error instead of recommendations

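As a rough illustration of the 20-component factorization, the sketch below fits scikit-learn's `TruncatedSVD` to a random stand-in matrix (not MovieLens data) and ranks a user's unrated movies by reconstructed score. The matrix size, the `top_n` helper, and the ranking rule are assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(0)

# Random stand-in data: 50 users x 40 movies, roughly 30% of cells rated 1-5.
shape = (50, 40)
ratings = rng.integers(1, 6, size=shape) * (rng.random(shape) < 0.3)
ratings = ratings.astype(float)  # 0.0 marks an unrated movie

svd = TruncatedSVD(n_components=20, random_state=0)
user_factors = svd.fit_transform(ratings)  # shape (50, 20)
item_factors = svd.components_             # shape (20, 40)

# Low-rank reconstruction: one approximate score per (user, movie) pair.
predicted = user_factors @ item_factors


def top_n(user_idx: int, n: int = 5) -> list:
    """Return up to n unrated movie indices with the highest reconstructed scores."""
    unseen = np.flatnonzero(ratings[user_idx] == 0)
    order = np.argsort(predicted[user_idx, unseen])[::-1]
    return unseen[order][:n].tolist()


recs = top_n(user_idx=0)
print(recs)
```
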
## Technical Details

- **Framework**: Scikit-learn
- **Algorithms**: TruncatedSVD, Cosine Similarity
- **Data Processing**: Pandas for efficient matrix operations
- **Memory Efficient**: Optimized for large-scale recommendation tasks

## Installation

```bash
pip install pandas numpy scikit-learn
```

## Training

The model is pre-trained on the MovieLens 100k dataset. To retrain:

```python
from model import MovieRecommender

model = MovieRecommender()
model.load_data()
model.train()
model.save_model("movie_recommender.pkl")
```

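The `.pkl` extension suggests the saved model is a pickle file, although how `save_model` serializes it is not documented here. Assuming that is the case, a save and reload round trip looks roughly like the sketch below; the `state` dictionary is a hypothetical stand-in for the trained model. Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for trained model state (not the real MovieRecommender).
state = {"n_components": 20, "movie_ids": [50, 181]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "movie_recommender.pkl")

    # Save the state to disk.
    with open(path, "wb") as fh:
        pickle.dump(state, fh)

    # Load it back, as a later process would before serving predictions.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

print(restored == state)
```
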
## Citation

```bibtex
@misc{datasynthis_ml_jobtask,
  title={DataSynthis ML JobTask: Movie Recommendation System},
  author={tasdid25},
  year={2025},
  url={https://huggingface.co/tasdid25/DataSynthis_ML_JobTask}
}
```

## License

MIT License - see LICENSE file for details.