---
datasets:
- codeparrot/self-instruct-starcoder
pipeline_tag: text2text-generation
metrics:
- code_eval
library_name: transformers
tags:
- code
model-index:
- name: StarCoder-SelfInstruct
  results:
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: InstructHumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.391
      verified: false
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.346
      verified: false
---

# Model Card for Self-instruct-starcoder

<!-- Provide a quick summary of what the model is/does. -->

This model is an instruction-tuned version of ⭐️ StarCoder. It was fine-tuned on the [Self-instruct-starcoder](https://huggingface.co/datasets/codeparrot/self-instruct-starcoder) instruction dataset, which was built by bootstrapping on StarCoder's own generations.
## Uses

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
The model was fine-tuned on prompts that follow this template:
```
Question: <instruction>

Answer: <output>
```
With the model and tokenizer loaded, the following code makes the model generate an answer to a given instruction:

```python
# Assumes `model` and `tokenizer` have already been loaded.
instruction = "Write a function to compute the GCD between two integers a and b"
prompt = f"Question:{instruction}\n\nAnswer:"
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
completion = model.generate(input_ids, max_length=200)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.batch_decode(completion[:, input_ids.shape[1]:])[0])
```
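As a minimal sketch of the prompt handling (no model download needed), here is how the template above can be built and how the generated answer can be separated from the echoed prompt. The helper names `build_prompt` and `extract_answer` are illustrative, not part of any library:

```python
def build_prompt(instruction: str) -> str:
    # Mirrors the fine-tuning template: "Question: <instruction>\n\nAnswer: <output>"
    return f"Question:{instruction}\n\nAnswer:"

def extract_answer(full_text: str, prompt: str) -> str:
    # Decoded output typically starts with the prompt; keep only the new part.
    return full_text[len(prompt):].strip()

prompt = build_prompt("Write a function to compute the GCD between two integers a and b")
# Simulated model output: the prompt followed by a generated answer.
simulated = prompt + " def gcd(a, b):\n    while b:\n        a, b = b, a % b\n    return a"
print(extract_answer(simulated, prompt))
```

Slicing the decoded text (or, equivalently, slicing `completion` by `input_ids.shape[1]` as in the snippet above) avoids printing the question back to the user.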

## More information

For additional information, check:
- [self-instruct-starcoder](https://huggingface.co/codeparrot/self-instruct-starcoder)
- [starcoder](https://huggingface.co/bigcode/starcoder)