BFCL results
#41 by paverGulyaevich - opened
Hi. I cannot reproduce the results you reported on your blog for the BFCL benchmark. I ran the evaluation via the OpenAI client and got an average multi-turn score of around 15%. Could you specify exactly how you ran the evaluation on that benchmark?