• @BetaDoggo_
      21 year ago

      It could be used to train a reward model, similar to what is currently done with RLHF.