I have tried so many prompts, yet the results from Bing are always pretty weak compared to the real gpt-4. I also prompted it to write some Russian poems, and so far it has only spewed out gibberish with no rhymes. On the other hand, the real gpt-4 can sometimes produce really impressive shit that I read with interest.

Another thing I noticed is that if I try to get Bing to generate something inappropriate, it’ll go along and do it for a second, but then it’ll quickly wipe its message. That’s interesting because it suggests that the underlying model isn’t the same as OpenAI’s own model, which seems unable to generate harmful content in the first place.

  • @AtmaJnana
    21 year ago

    Software usually isn’t monolithic. And this software, in particular, is way more complicated than you give it credit for. Consequently, you overlook many variables that would affect your casual testing.