AI, trained almost exclusively on gigantic databases of human-authored text such as Reddit and Quora, with little regard or filtering for the quality of that text given the scope of the data to process, mirrors the biases of humanity? Shocking
It’s statistically accurate, though. There are more male doctors in the world than there are female ones.
The most common surname for doctors in the US is Patel. That doesn’t mean you should assume a doctor is named Patel.
No, but it does mean that if I am suddenly called upon to invent a doctor, “Patel” is a reasonable name to throw out there.
Sure, but so is Johnson (the 2nd most common). Just like it’s reasonable to invent a male or female doctor.
LLMs are statistics machines, though.
If you forced them to say the names of the doctors, I’m pretty sure “Dr. Patel” would be overrepresented (or represented just as much as in the training data).
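Roughly the effect I mean, as a toy sketch (the surname counts below are made up, not real training-data frequencies): if the model just samples surnames in proportion to how often it saw them, the most common one dominates as soon as you ask it to invent a lot of doctors.

```python
import random
from collections import Counter

# Made-up surname counts standing in for training-data frequencies.
surname_weights = {"Patel": 5, "Johnson": 4, "Smith": 4, "Garcia": 3, "Lee": 3}

# Invent 1,000 doctors by sampling in proportion to those counts,
# the way a pure statistics machine would.
names = random.choices(
    list(surname_weights), weights=list(surname_weights.values()), k=1000
)

# "Dr. Patel" comes out as the single most common pick, mirroring the data.
print(Counter(names).most_common())
```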
LLMs are not replacements for human beings. If you don’t want the most bland output, you either train your own with a dataset that fits your needs, specifically tell the LLM in the prompt (which it will sometimes just ignore), or pay a human to draw it.
Downvotes for facts
It loves to not answer the way you ask, all the time.
I once had it give me a list of names for something.
Then I asked for more names but it repeated a couple.
So I said give me more names and no repeats.
Guess what… it still repeated names, and it even labeled one as a repeat, something kinda like below:
John
Jane
Jerry (mentioned already in a previous message)
Jason
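Since “no repeats” in the prompt clearly isn’t enforced, the workaround I ended up leaning on is to deduplicate on my side. A minimal sketch, assuming a hypothetical `generate_names()` helper standing in for however you actually call the model:

```python
# Hypothetical helper standing in for the real model call;
# pretend each call returns one batch of names from the chat.
def generate_names(batch: int) -> list[str]:
    batches = [["John", "Jane", "Jerry"], ["Jerry", "Jason", "Jane", "Joan"]]
    return batches[batch]

seen: set[str] = set(generate_names(0))  # names from the first reply

# Filter the next batch yourself instead of trusting the prompt.
fresh = [name for name in generate_names(1) if name not in seen]
print(fresh)  # ['Jason', 'Joan'] -- the repeats never reach your list
```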