It’s vital to “keep humans in the loop” to avoid humanizing machine-learning models in research

Machine-learning models are quickly becoming common tools in scientific research. These artificial intelligence systems are helping bioengineers discover new potential antibiotics, veterinarians interpret animals’ facial expressions, papyrologists read words on ancient scrolls, mathematicians solve baffling problems and climatologists predict sea-ice movements. Some scientists are even probing large language models’ potential as proxies or replacements for human participants in psychology and behavioral research. In one recent example, computer scientists ran ChatGPT through the conditions of the Milgram shock experiment (the famous study on obedience in which people gave what they believed were increasingly painful electric shocks to an unseen person when told to do so by an authority figure) and other well-known psychology studies. The artificial intelligence model responded much as humans did: 75 percent of simulated participants administered shocks of 300 volts and above.

But relying on these machine-learning algorithms also carries risks. Some of those risks are commonly acknowledged, such as generative AI’s tendency to spit out occasional “hallucinations” (factual inaccuracies or nonsense). Artificial intelligence tools can also replicate and even amplify human biases about characteristics such as race and gender. And the AI boom, which has given rise to complex, trillion-variable models, requires water- and energy-hungry data centers that likely have high environmental costs.

  • @[email protected]
    8 months ago

    This physically hurts to comprehend. ChatGPT and the other Large Language Models that make up the current AI boom in popular science and tech spaces right now are not sentient. Please leave armchair misapplications of psychology at the door.

    They are the wrong tool for any job besides text parsing and generation.

    There are vague arguments to be made that, since their training corpus is built from an absurd amount of human-produced text, the resulting model may somewhat represent a condensed sort of summary of the emotions and psychology in the training data, but that’s a hell of a stretch.

    It is literally and conceptually impossible to run “ChatGPT through the conditions of the Milgram shock experiment”. It cannot administer a shock to anyone. It does not understand the concept of a shock. It does not understand the concept of pain, or of being. It has not been given any input or output capabilities except text.


    Machine learning being used to optimize designs towards certain metrics? Hell yes!

    Using an LLM as a human participant analog? Do not pass go, please leave all professional credentials and accomplishments in the shred bin. I guess you can collect $200. There has to be some sort of gain out there or people wouldn’t keep misapplying LLMs to problem spaces and publishing it as “groundbreaking research”.

    • @GlitterInfection
      8 months ago

      The part you’re calling “a hell of a stretch” is actually the reason LLMs work. An LLM is not a good text parser; it’s a great pattern matcher, and it matches patterns that aren’t obvious or intuitive.

      Many of the listed uses are actually great for this type of tech.

      In theory, because of the amount of data used, there should be matched patterns that would allow it to be used for psychological research. Replicating well known studies in that area with the tech is a good way to test that theory.

      Using it as a first-line simulation might not be a bad idea as long as it’s followed up with a real study to validate the results.

      We just need to make sure that humans are checking the work properly because, as you say, it’s not sentient, nor is it really capable of following a code, like the scientific method.

      The real thing to fear is humans not doing their part out of greed, laziness, or malice.