Dreams of AI

The Picard Maneuver · 7 months ago


Ephera · 7 months ago

Well, no, I’m just saying the text generation stuff did not change anything about that process.
It can try to generate the right text for the requests to grab this data, but since there’s going to be practically no documentation for those store APIs out there, it will struggle to do so from its training data alone.

So, what you do instead is have a human figure out the API of each store that needs to be integrated, plus ideally a transformation of the returned data into a shared, documented format. And then you give the text generation a trivial way to generate the text that makes use of that.
So, basically you preface the whole user conversation with “If I ask for prices of Todd’s Tater Tots, run ./prices_todds_tater_tots.sh for that and use the result according to the JSON schema in prices_store.schema.json.”

And then you repeat that for all the other stores, for some math API and some navigation API and then you’ve got a chance that the text generation figures out the right semantics of how these things should be called.
Semantics is what it’s good at. But the rest is still the same process as ten years ago.
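
To make that concrete, here is a minimal sketch of how such a preface could be assembled from a list of human-written integrations. Everything except the Todd’s Tater Tots entry is made up for illustration: the `Tool` class, `build_preamble`, and the second entry’s script and schema names are hypothetical, and this is not how any particular product does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """One human-built integration that the text generation may call."""
    topic: str    # what the user might ask about
    command: str  # script a human wrote to fetch the data
    schema: str   # documented format of the script's output


def build_preamble(tools):
    """Turn the integrations into instructions placed before the user conversation."""
    return "\n".join(
        f"If I ask for {t.topic}, run {t.command} for that and use the result "
        f"according to the JSON schema in {t.schema}."
        for t in tools
    )


TOOLS = [
    # Taken from the example above.
    Tool("prices of Todd's Tater Tots",
         "./prices_todds_tater_tots.sh",
         "prices_store.schema.json"),
    # Hypothetical second store, only here to show the repetition.
    Tool("prices of Example Mart",
         "./prices_example_mart.sh",
         "prices_store.schema.json"),
]

if __name__ == "__main__":
    print(build_preamble(TOOLS))
```

The part that takes work is still writing each script and agreeing on the schema. The preface itself is just string formatting.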

@danc4498 · 7 months ago

I think you’re underselling what ChatGPT is capable of. It is able to take outside data in and use it with the rest of the model. Bing does it with its web index data. I was able to ask what the cheapest gas station near me is, and Bing gave me a list, likely coming from GasBuddy.

Ephera · 7 months ago

Yeah, it can easily do that, if such a comparison service already exists. Then it’s just yet another API that it calls, or in this case, it more likely just does a Bing search and recounts the top results. But I haven’t yet heard of such a service existing for groceries, so it would still need to be built.

I understand what you’re saying. In theory, it’s possible. But in practice, it is not just a matter of linearly improving LLMs until, at some point, they’ll just do it on their own. For any task that takes more than a few steps, the LLM’s chance of getting each step right multiplies, so the overall error rate compounds.

The error rate would need to get orders of magnitude lower for that multiplication to not explode with many steps. The alternative is removing steps that the LLM needs to do, as I described. Some colleagues at my day job do basically nothing else now.
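
A rough sketch of that multiplication, under the simplifying assumption that every step fails independently with the same probability. The error rates and step counts below are made-up illustration values, not measurements of any model, and `chain_success` is just a name picked for this example.

```python
def chain_success(per_step_error, steps):
    """Probability that all steps succeed, assuming independent, equal error rates."""
    return (1.0 - per_step_error) ** steps


if __name__ == "__main__":
    # Illustrative numbers only.
    for error in (0.05, 0.005):
        for steps in (2, 20):
            chance = chain_success(error, steps)
            print(f"error/step={error}, steps={steps}: all correct {chance:.1%}")
```

With these numbers, a 5% error per step leaves roughly a one-in-three chance of getting 20 steps all right, while cutting the chain to 2 steps gets you to about 90% without the model improving at all.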