Many Gen Z employees say ChatGPT is giving better career advice than their bosses

Nearly half of Gen Z workers say they get better job advice from ChatGPT than their managers, according to a recent survey.

  • @kromem
    11 months ago

    There’s something to be said for the abilities of a tool reflecting its wielder.

    In research circles, the most advanced prompting pipelines reach a 90% success rate on tasks the same model gets right only around 30% of the time with naive zero-shot prompting.

    At a minimum, people should be familiar with chain-of-thought prompting if they use the models. It is very easy to incorporate and makes a huge difference on complex problems.
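
    For anyone who hasn't seen it, here is a minimal sketch of the difference between a naive zero-shot prompt and a zero-shot chain-of-thought prompt. The function names and the "Answer:" output convention are my own assumptions for illustration, not part of any library; the only established piece is the "Let's think step by step" cue.

```python
def zero_shot_prompt(question: str) -> str:
    """Naive prompt: just the question, asking for an answer directly."""
    return f"Q: {question}\nA:"


def chain_of_thought_prompt(question: str) -> str:
    """Zero-shot chain-of-thought prompt: ask for reasoning before the answer."""
    return (
        f"Q: {question}\n"
        "A: Let's think step by step, then state the final answer on its own "
        "line starting with 'Answer:'."
    )


def extract_answer(completion: str) -> str:
    """Pull the final answer out of a completion that follows the format above.

    The 'Answer:' marker is a hypothetical convention set by the prompt.
    """
    for line in reversed(completion.splitlines()):
        if line.strip().startswith("Answer:"):
            return line.split("Answer:", 1)[1].strip()
    return completion.strip()  # fall back to the raw text if no marker is found
```

    You would send the output of `chain_of_thought_prompt(...)` to whichever model you use and run `extract_answer(...)` on the reply.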

    Though for anyone actually building serious pipelines for these products, the best technique I’ve seen to date was this one from DeepMind:

    “We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. Core to the framework is a self-discovery process where LLMs select multiple atomic reasoning modules such as critical thinking and step-by-step thinking, and compose them into an explicit reasoning structure for LLMs to follow during decoding. SELF-DISCOVER substantially improves GPT-4 and PaLM 2’s performance on challenging reasoning benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, by as much as 32% compared to Chain of Thought (CoT). Furthermore, SELF-DISCOVER outperforms inference-intensive methods such as CoT-Self-Consistency by more than 20%, while requiring 10-40x fewer inference compute. Finally, we show that the self-discovered reasoning structures are universally applicable across model families: from PaLM 2-L to GPT-4, and from GPT-4 to Llama2, and share commonalities with human reasoning patterns.”
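
    A loose sketch of the idea as the abstract describes it (select reasoning modules, compose them into a structure, then follow that structure) might look like the following. To be clear, this is not the paper's actual prompts or code: the module list, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns the model's text) are all assumptions for illustration.

```python
from typing import Callable

# Hypothetical module descriptions; the paper uses its own, longer list.
REASONING_MODULES = [
    "Use critical thinking: question assumptions and evaluate the evidence.",
    "Think step by step.",
    "Break the problem into smaller sub-problems.",
    "Consider a simpler version of the problem first.",
]


def self_discover(task: str, llm: Callable[[str], str]) -> str:
    """Discover a reasoning structure for the task, then solve by following it."""
    modules = "\n".join(f"- {m}" for m in REASONING_MODULES)

    # Step 1: let the model pick which modules suit this task.
    selected = llm(
        f"Task:\n{task}\n\nReasoning modules:\n{modules}\n\n"
        "Select the modules that are most useful for solving this task."
    )

    # Step 2: compose the selected modules into an explicit plan.
    structure = llm(
        f"Task:\n{task}\n\nSelected modules:\n{selected}\n\n"
        "Compose these into a numbered, step-by-step reasoning structure. "
        "Do not solve the task yet."
    )

    # Step 3: solve the task by filling in the structure.
    return llm(
        f"Task:\n{task}\n\nReasoning structure:\n{structure}\n\n"
        "Follow the structure step by step and give the final answer."
    )
```

    The sketch makes three model calls per task. Passing `llm` in as a plain callable keeps it independent of any particular vendor SDK.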

    So yes, maybe you aren’t getting a lot out of the models. But a lot of people are, and the difference between your experiences and theirs may just boil down to experience in using the tool. If I had only used Photoshop for an hour or two, I might complain about how the software sucks at making good-looking images. But we both know it wouldn’t be the software’s fault.

    • Rikudou_Sage
      11 months ago

      Well, one more comment like that and I guess I’m gonna have to edit my original comment, because I don’t want to explain again. I’m getting quite a lot out of LLMs (GPT-4, to be specific); it’s just that they’re very stupid. When they don’t straight up lie, they don’t know stuff. It’s quite simple, really: I usually deal with very complex problems that few people have dealt with, so the AI has (close to) no data on them, runs in circles, and is not able to help.

      But when presented with questions that it has training data on, it’s brilliant. Recently I needed to use reflection to get all types implementing an interface in .NET, with the caveat that the interface is generic. GPT-4 was able to solve that problem by the 3rd message in the conversation, while I’m pretty sure it would have taken me hours, because I’d have needed to learn a lot of .NET’s internal workings before arriving at the quite simple solution.

      So, good career advice: which one do you feel it is? A simple question with a straightforward correct solution, or a complex and nuanced issue where there isn’t one general truth? Because the only correct answer to a request for career advice from someone who doesn’t know your situation extensively is (a version of) “I don’t know, what’s your situation in detail?”. Knowing GPT, it didn’t ask that question.

      So yes, LLMs are great! Just learn which use cases they excel at and don’t ask them for complex advice.

      • @kromem
        11 months ago

        “When they don’t straight up lie, they don’t know stuff. It’s quite simple, really: I usually deal with very complex problems that few people have dealt with, so the AI has (close to) no data on them, runs in circles, and is not able to help.”

        You need to provide it the data. The fact that they know things at all from pretraining was kind of a surprise to everyone in the industry. Their current use case as a Google replacement is really not ideally aligned with their capabilities. But the models have turned out to be surprisingly good at in-context learning, and context windows keep growing, so depending on the model you can absolutely provide relevant reference material to ground the responses in a factual reference point before asking for deeper analysis. It’s hard to give specific recommendations without knowing more about what you are trying to accomplish, but “they’re very stupid” runs extremely counter to most of what I’ve seen at this point. In the rare cases where it does seem true, there’s usually something more nuanced getting in the way, and a slight modification to what or how I’m asking gets past it.
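
        As a rough sketch of what "provide it the data" can look like, here is one way to put reference material ahead of the question. The function name, the prompt wording, and the character budget are assumptions for illustration; real context limits are measured in tokens and vary by model.

```python
def grounded_prompt(question: str, references: list[str], max_chars: int = 8000) -> str:
    """Build a prompt that places reference material ahead of the question.

    max_chars is a crude, hypothetical stand-in for a context-window budget.
    """
    sections = []
    used = 0
    for i, ref in enumerate(references, start=1):
        if used + len(ref) > max_chars:
            break  # stop before exceeding the budget
        sections.append(f"[Reference {i}]\n{ref}")
        used += len(ref)

    context = "\n\n".join(sections)
    return (
        "Use only the reference material below. If it does not contain the "
        "answer, say so instead of guessing.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
```

        Put the most relevant documents first, since anything past the budget is dropped.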

        “Knowing GPT, it didn’t ask that question.”

        Really? I find that the chat models are almost overtuned toward asking for more details as part of their reengagement strategy. In fact, a number of the employment-related usage examples I’ve seen were things like users having the model ask a series of questions about work history and responsibilities in order to summarize resume fodder. So again, maybe a bit of a difference between users of the tools.

        “Just learn which use cases they excel at and don’t ask them for complex advice.”

        My use of the models is almost entirely related to complex scenarios, and while I’d agree that something like GPT-3 is dumb as shit, GPT-4 is probably among the smarter interactions I’ve had in my life, and I used to consult for C-suite execs of Fortune 500s. One of my favorite results came from explaining the factors I suspected were causing it to get a question wrong: it generated a correct workaround that was quite brilliant (the issue was token similarity to a standard form of a question, and the proposed solution was replacing the nouns with emojis, which did bypass the similarity bias and allowed it to answer correctly when it was failing before). In spite of there being no introspection capabilities, giving it background details resulted in novel and ultimately correct out-of-the-box solutions.
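
        The emoji workaround is simple enough to sketch. This is my own reconstruction of the idea, not what the model produced: the caller lists the nouns by hand, and the emoji pool and function name are made up for illustration.

```python
import re

# Hypothetical placeholder pool; any visually distinct symbols would do.
EMOJI_POOL = ["🍎", "🚗", "🐢", "🎈", "🔑", "🌙"]


def replace_nouns_with_emojis(text: str, nouns: list[str]) -> tuple[str, dict[str, str]]:
    """Swap each listed noun for an emoji and return the text plus the mapping.

    Only exact whole-word matches (case-insensitive) are replaced, so plurals
    and other inflections must be listed separately.
    """
    if len(nouns) > len(EMOJI_POOL):
        raise ValueError("not enough emoji placeholders for the given nouns")

    mapping = dict(zip(nouns, EMOJI_POOL))
    for noun, emoji in mapping.items():
        pattern = rf"\b{re.escape(noun)}\b"
        text = re.sub(pattern, emoji, text, flags=re.IGNORECASE)
    return text, mapping
```

        The returned mapping lets you translate the model's answer back into the original nouns afterwards.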

        From the sound of it, you are trying to use it for coding. I recommend switching to one of the models that specializes in that rather than using a generalist model.

        And on the off chance you are using the free 3.5 version: well, stop that. That one sucks and is like using an Atari when there’s a PS3 available instead. Don’t make the mistake of extrapolating where the tech is at based on outdated tech being provided for free as a foot in the door.