Ah, I missed the voice recordings part; I can see an LLM being used to annotate those. I meant to emphasise that this is separate from the LLM assistants and coding models that are most visible to consumers right now.
Is there a reason to believe they don’t use the same models? You wouldn’t necessarily need something specialized for ‘curating data’. The drug identification stuff is definitely separate though.
For text classification? They might. There’s a million things they could be using. I’ve definitely seen people just throw ChatGPT at text and ask it to generate a label for tagging and classification, but it’s much cheaper to use a fine-tuned RoBERTa or some other encoder-only model. Both are language models built on the transformer architecture; one is the generative kind we’re all familiar with, the other is better suited to text classification tasks.
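To make the difference concrete, here's a rough Python sketch of the two approaches. Everything in it is made up for illustration: the label set, the prompt wording, and the `complete` / `score` callables, which stand in for a chat-style model call and a fine-tuned encoder with a classification head. It's not anyone's actual pipeline.

```python
from typing import Callable, Sequence

# Hypothetical label set, purely for illustration.
LABELS = ["billing", "technical", "other"]


def build_prompt(text: str, labels: Sequence[str]) -> str:
    """Build an instruction asking a generative model to pick one label."""
    options = ", ".join(labels)
    return (
        f"Classify the text into exactly one of: {options}.\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )


def parse_label(reply: str, labels: Sequence[str], fallback: str = "unknown") -> str:
    """Map free-form model output back onto the label set.

    A generative model returns text, so the reply has to be checked;
    anything that matches no label falls back to `fallback`.
    """
    cleaned = reply.strip().lower()
    for label in labels:
        if label.lower() in cleaned:
            return label
    return fallback


def label_with_generative_model(
    text: str, labels: Sequence[str], complete: Callable[[str], str]
) -> str:
    """Approach 1: prompt a chat-style model and parse its answer.

    `complete` is an assumed stand-in for whatever chat API is in use.
    """
    return parse_label(complete(build_prompt(text, labels)), labels)


def label_with_encoder(
    text: str, labels: Sequence[str], score: Callable[[str], Sequence[float]]
) -> str:
    """Approach 2: take the argmax over per-label scores.

    `score` is an assumed stand-in for a fine-tuned encoder-only model
    that returns one logit per label, so no text parsing is needed.
    """
    logits = score(text)
    best = max(range(len(labels)), key=lambda i: logits[i])
    return labels[best]
```

In practice `score` could wrap something like a RoBERTa checkpoint loaded through Hugging Face's `AutoModelForSequenceClassification`, and `complete` could wrap a hosted chat model. The encoder does one forward pass and returns a fixed set of scores, while the generative route has to produce text token by token and then have that text parsed, which is a big part of why the encoder tends to be cheaper at volume.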