beyoru/MinCoder-4B-Exp

	---
	base_model: Qwen/Qwen3-4B-Instruct-2507
	tags:
	- transformers
	- qwen3
	license: apache-2.0
	language:
	- en
	library_name: transformers
	---

	## Model details
	This model is fine-tuned from Qwen3-4B-Instruct-2507 using a custom reinforcement learning (RL) framework. The framework rewards the model for producing solutions that pass automated test cases, similar to how submissions to programming tasks are judged on LeetCode.

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
	</p>


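	Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards: a generated solution is scored by whether it passes the tests, not by how closely it matches a reference answer. This is intended to promote generalization and reasoning ability in algorithmic problem-solving.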
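To make the idea of a test-case-based reward concrete, here is a minimal Python sketch. It is an illustration only and not the framework used to train this model, which the card does not describe. The names `reward_from_tests`, `entry_point`, `GOOD`, `BAD`, and `TESTS`, as well as the choice of "fraction of tests passed" as the reward, are assumptions made for this example. It also runs candidate code with plain `exec` and no timeout, whereas a real training setup would presumably sandbox execution.

```python
from typing import Any, Dict, Sequence, Tuple

# One case = (positional arguments, expected return value). Hypothetical format.
Case = Tuple[Tuple[Any, ...], Any]


def reward_from_tests(candidate_src: str, entry_point: str, tests: Sequence[Case]) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    namespace: Dict[str, Any] = {}
    try:
        # Assumption: unsandboxed exec, acceptable only for a toy illustration.
        exec(candidate_src, namespace)
    except Exception:
        return 0.0  # code that does not even load earns no reward
    func = namespace.get(entry_point)
    if not callable(func) or not tests:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case counts as a failure
    return passed / len(tests)


# A LeetCode-style task: return the indices of two numbers that sum to target.
TESTS = [
    (([2, 7, 11, 15], 9), [0, 1]),
    (([3, 2, 4], 6), [1, 2]),
    (([3, 3], 6), [0, 1]),
]

GOOD = """
def two_sum(nums, target):
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
"""

# Always answers [0, 1]; it passes only the cases where that happens to be right.
BAD = """
def two_sum(nums, target):
    return [0, 1]
"""

if __name__ == "__main__":
    print(reward_from_tests(GOOD, "two_sum", TESTS))
    print(reward_from_tests(BAD, "two_sum", TESTS))
```

No reference solution appears anywhere in the reward computation: the correct candidate and any other correct implementation would receive the same score, which is the property the paragraph above describes.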

	> This is an experimental model.