• @BetaDoggo_
    English
    22
    3 months ago

    How many times is this same article going to be written? Model collapse from synthetic data is not a concern at any scale when human data is in the mix. We now have entire series of models trained on mostly synthetic data: https://huggingface.co/docs/transformers/main/model_doc/phi3. When using entirely unassisted outputs, error accumulates with each generation, but this isn’t a concern in any real scenario.

    • Something Burger 🍔
      English
      34
      3 months ago

      As the number of articles about this exact subject increases, so does the likelihood of AI only being able to write about this very subject.