Thanks for sharing your results; they're indeed pretty different. I looked at the source again, and I did append a "# " before every prompt sent to those 10 `code` models [0] (during testing I thought that formatting the prompt as a Python comment might help them).
Will re-run the script without that to see if it matches your results.
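For clarity, the tweak in question amounts to something like the sketch below. The function and model names here are hypothetical placeholders, not taken from the actual script:

```python
# Hypothetical sketch of the prompt tweak described above: prompts sent to
# the `code` models were prefixed with "# " so they read as Python comments.
CODE_MODELS = {"example/code-model"}  # placeholder, not the real model list

def build_prompt(prompt: str, model: str) -> str:
    # Only the code models got the "# " prefix; others were left untouched.
    if model in CODE_MODELS:
        return "# " + prompt
    return prompt

print(build_prompt("write fizzbuzz", "example/code-model"))   # "# write fizzbuzz"
print(build_prompt("write fizzbuzz", "example/chat-model"))   # "write fizzbuzz"
```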
[0] https://docs.together.ai/docs/models-inference#code-models