Playing around with the FOSS game Cataclysm: DDA, I felt compelled to parse and connect the C++ and JSON to see the relationships and complexity. It’s the first time I’ve really felt motivated to do so. I’m just trying to wrap my head around how some features are implemented, like z-levels, mining tools, and various actions; simple stuff really. I find it challenging to read through a code base this large, so I started scripting a way to track down objects across it and see what is defined in JSON and what is hard coded. Is this normal? Obvious? Are there FOSS alternatives for doing this? I’m basically chaining a bunch of grep commands and printing pretty trees with bat.
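
For anyone who wants to try the same idea without the grep chain, here is a rough Python sketch of that kind of cross-referencing. It is not my actual script, and it rests on assumptions: that the JSON files hold objects (or arrays of objects) with string "id" and optional "type" fields, that the C++ sources end in .cpp or .h, and that a hard-coded reference shows up as the id inside a double-quoted string literal. All function names are made up for illustration.

```python
import json
import re
from pathlib import Path

# Assumption: an id referenced from C++ appears as a plain quoted literal.
STRING_LITERAL = re.compile(r'"([A-Za-z0-9_]+)"')


def collect_json_ids(json_root):
    """Map each string "id" found in JSON files to its (type, file) definitions."""
    ids = {}
    for path in sorted(Path(json_root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        entries = data if isinstance(data, list) else [data]
        for entry in entries:
            if isinstance(entry, dict) and isinstance(entry.get("id"), str):
                ids.setdefault(entry["id"], []).append(
                    (entry.get("type", "?"), str(path))
                )
    return ids


def find_hardcoded_refs(src_root, ids):
    """Return {id: [(file, line_number), ...]} for ids quoted in C++ sources."""
    refs = {}
    for path in sorted(Path(src_root).rglob("*")):
        if not path.is_file() or path.suffix not in {".cpp", ".h"}:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in STRING_LITERAL.finditer(line):
                if match.group(1) in ids:
                    refs.setdefault(match.group(1), []).append((str(path), lineno))
    return refs


def print_tree(ids, refs):
    """Print each id with where it is defined and where it is referenced."""
    for obj_id in sorted(ids):
        print(obj_id)
        for obj_type, path in ids[obj_id]:
            print(f"  defined  [{obj_type}] {path}")
        for path, lineno in refs.get(obj_id, []):
            print(f"  used in  {path}:{lineno}")
```

Ids that appear in the JSON but never in the source are most likely data-only, while ids that turn up in both are the ones with some hard-coded behaviour attached. Short or generic ids will produce false positives, so treat the output as a starting point for reading the code.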

  • @j4k3OP
    4 months ago

    Yeah, this has been my experience too. LLMs don’t handle project-specific code styles very well either, or cases where there are several ways of doing things.

    Actually, earlier today I was asking Mixtral 8x7B about some bash ideas. I kept getting suggestions to use find and sed commands, which I find unreadable and inflexible for my evolving scripts. They are fine for a specific one-off task, but I’ll move to Python before I want to fuss with either.

    Anyways, I changed the starting prompt to something like ‘Common sense questions and answers with Richard Stallman’s AI assistant.’ The results were remarkable and interesting on many levels. The answers always terminated cleanly instead of continuing with another question/answer pair, one included a short footnote about the static nature of LLM learning and capabilities, and the responses were much better quality in general. The LLM responded on a much higher level than normal in this specific context. I think the combination of Stallman’s AI background and bash scripting builds a lot of momentum here. I tried it on a whim, but it paid dividends and is a keeper of a prompting strategy.

    Overall, the way my scripts collect relationships in the source code would probably make for a productive chunking strategy for a RAG agent. I don’t think an AI would be good at what I’m doing at this stage, but it could use that info. It might even be possible to integrate the scripts as a pseudo database in the LLM model loader code for further prompting.
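
To make the chunking idea a bit more concrete, here is a minimal Python sketch, again only an illustration under assumptions. The shape of the `relationships` dict, the `Chunk` class, and the `retrieve` function are all hypothetical, and the keyword-overlap ranking is a stand-in for the embedding search a real RAG setup would more likely use.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    obj_id: str
    text: str
    sources: list = field(default_factory=list)


def build_chunks(relationships):
    """Turn one record per object into one text chunk per object.

    Assumed (hypothetical) input shape:
    {id: {"defined": [json_path, ...], "used": [(src_path, line), ...]}}
    """
    chunks = []
    for obj_id in sorted(relationships):
        rel = relationships[obj_id]
        defined = rel.get("defined", [])
        used = rel.get("used", [])
        lines = [f"Object: {obj_id}"]
        lines += [f"Defined in JSON: {p}" for p in defined]
        lines += [f"Referenced in source: {p}:{n}" for p, n in used]
        sources = defined + [p for p, _ in used]
        chunks.append(Chunk(obj_id, "\n".join(lines), sources))
    return chunks


def retrieve(chunks, query, limit=3):
    """Rank chunks by naive keyword overlap with the query (placeholder only)."""
    words = set(query.lower().split())
    scored = []
    for chunk in chunks:
        text = chunk.text.lower()
        score = sum(1 for word in words if word in text)
        if score:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:limit]]
```

The point of chunking per object rather than per file is that each chunk carries one object's definition sites and call sites together, which is the relationship the scripts are already collecting. The retrieved chunk text could then be pasted into the prompt ahead of the question.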