Many code models, like the recent OpenCoder, can perform fill-in-the-middle (FIM) tasks, similar to Microsoft’s GitHub Copilot.

You give the model a prefix and a suffix, and it then tries to generate what comes in between the two, hoping that what it comes up with is useful to the programmer.
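For anyone unfamiliar, here is a minimal sketch of how such a prompt is usually put together. It assumes the FIM special tokens documented for Qwen2.5-Coder; other models use different tokens, and `build_fim_prompt` is just a name I made up for illustration.

```python
# FIM special tokens as documented for Qwen2.5-Coder (assumption: your model
# uses the same ones; OpenCoder and others may differ).
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange prefix and suffix so the model is asked to generate the middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Text before the cursor is the prefix, text after the cursor is the suffix.
prompt = build_fim_prompt("def add(a, b):\n    ", "\n\nprint(add(1, 2))\n")
```

The model is expected to continue after the final token with only the missing middle part.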

I don’t understand how we are supposed to treat these generations.

Qwen Coder (1.5B and 7B), for example, likes to generate the completion first and then rewrite what is already in the suffix. Sometimes it adds three entirely new functions out of nowhere, which don’t even have anything to do with the script itself.
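One way to treat such generations might be to cut the output off at the point where it starts repeating the suffix. This is only a rough sketch of that idea, not something the models or any tool prescribe; `trim_suffix_echo` and `min_len` are hypothetical names.

```python
def trim_suffix_echo(completion: str, suffix: str, min_len: int = 4) -> str:
    """Drop everything from the point where the completion repeats the suffix."""
    # Use the first non-blank line of the suffix as the marker for an echo.
    marker = next((ln.strip() for ln in suffix.splitlines() if ln.strip()), "")
    if len(marker) < min_len:
        # Too short to match reliably (for example a lone ")"), so keep everything.
        return completion
    idx = completion.find(marker)
    if idx == -1:
        return completion  # the suffix was not echoed
    # Keep only the text before the echoed suffix; trailing whitespace is left as is.
    return completion[:idx]


raw = "return a + b\n\nprint(add(1, 2))\n\ndef unrelated():\n    pass\n"
trimmed = trim_suffix_echo(raw, "\n\nprint(add(1, 2))\n")
```

This also throws away any extra functions the model adds after the echoed suffix, but it would cut too early if the real completion legitimately contains the same line as the suffix.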

With both Qwen Coder and OpenCoder, I have found that if you put only whitespace as the suffix (essentially the part that comes after your cursor), the model generates a normal chat response with markdown and everything.

This is some weird behaviour. I might have to put some fake code as the suffix to get some actually useful code completions.
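Roughly what I have in mind, as an untested sketch: the placeholder text and the names `PLACEHOLDER_SUFFIX` and `effective_suffix` are made up, and I have not verified that this particular filler actually stops the chat-style responses.

```python
# Hypothetical filler used when there is nothing after the cursor.
PLACEHOLDER_SUFFIX = "\n\nif __name__ == \"__main__\":\n    main()\n"


def effective_suffix(suffix: str) -> str:
    """Return the real suffix, or a placeholder if it is empty or whitespace-only."""
    return suffix if suffix.strip() else PLACEHOLDER_SUFFIX


# A whitespace-only suffix gets swapped out, a real one is passed through untouched.
at_end_of_file = effective_suffix("   \n")
in_the_middle = effective_suffix("\nprint(result)\n")
```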

    • Smorty [she/her]OP
      12 days ago

      I use Ollama for generation, so all the formatting is presumably done for me. Sometimes it does generate correct code though, so it seems to be working.

      But thank you for the link!

    • @Deckweiss
      13 days ago

      Not OP, but I’ve looked at the examples for 15 minutes out of curiosity and I still have no idea what the fuck they mean by that.

      • ffhein
        23 days ago

        Yeah, the examples are not explained well. But if they’re using some other software/code for inference, it might be configured wrong or not compatible with qwen-coder FIM, and produce strange results for that reason.