• @[email protected]
    42 months ago

    Why settle for good enough when you have a term that is both actually correct and more widely understood?

          • @[email protected]
            22 months ago

            That’s basically like saying that typical smartphones are square because it’s close enough to rectangle and rectangle is too vague of a term. The point of more specific terms is to narrow down the set of possibilities. If you use “square” to mean the set of rectangles, then you lose the ability to do that and now both words are equally vague.

            • @[email protected]
              12 months ago

              Is this referring to what I said about Markov chains or stochastic processes? If it’s the former, the only discriminating factor is beam and not all LLMs use that. If it’s the latter, then I don’t know what you mean. Molecular diffusion is a classic stochastic process; I am 100% correct in my example.

              • @[email protected]
                12 months ago

                It’s in reference to your complaint about the imprecision of “stochastic process”. I’m not disagreeing that molecular diffusion is a stochastic process. I’m saying that if you want to use “Markov process” to describe a non-Markovian stochastic process, then you no longer have the precision you’re looking for and now molecular diffusion also falls under your new definition of Markov process.

                • @[email protected]
                  02 months ago

                  Okay so both of those ideas are incorrect.

                  As I said, many are literally Markovian and the main discriminator is beam, which does not really matter for helping people understand my meaning nor should it confuse anyone that understands this topic. I will repeat: there are examples that are literally Markovian. In your example, it would be me saying there are rectangular phones but you step in to say, “but look those ones are curved! You should call it a shape, not a rectangle.” I’m not really wrong and your point is a nitpick that makes communication worse.

                  In terms of stochastic processes, no, that is incredibly vague, just like calling a phone a “shape” would not be more descriptive or communicate better. So many things follow stochastic processes that are nothing like a Markov chain, whereas LLMs are like Markov chains, either literally being them or being a modified version that uses derived tree representations.

                  • @[email protected]
                    02 months ago

                    I’m not familiar with the term “beam” in the context of LLMs, so that’s not factored into my argument in any way. LLMs generate text based on the history of tokens generated thus far, not just the last token. That is by definition non-Markovian. You can argue that an augmented state space would make it Markovian, but you can say that about any stochastic process. Once you start doing that, both become mathematically equivalent. Thinking about this a bit more, I don’t think it really makes sense to talk about a process being Markovian or not without a wider context, so I’ll let this one go.
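To make the augmented-state point concrete, here is a minimal sketch in Python. It is a toy, not an LLM: the two-token vocabulary, the probabilities, and the names `next_token_probs` and `augmented_transitions` are all made up for illustration. The next token depends on the last two tokens, so the token sequence alone is not first-order Markov, but the same process is Markov once the state is taken to be the pair of most recent tokens.

```python
from itertools import product

TOKENS = ("a", "b")  # hypothetical two-token vocabulary


def next_token_probs(history):
    """Toy model: the next-token distribution depends on the last TWO tokens."""
    prev2 = tuple(history[-2:])
    p_a = 0.9 if prev2 == ("a", "a") else 0.2  # made-up probabilities
    return {"a": p_a, "b": 1.0 - p_a}


def augmented_transitions():
    """Transition table over augmented states, where a state is a token pair.

    From state (x, y), emitting token t moves to state (y, t). Each row
    depends only on the current augmented state, which is the Markov property.
    """
    table = {}
    for state in product(TOKENS, repeat=2):
        probs = next_token_probs(state)
        table[state] = {(state[1], tok): p for tok, p in probs.items()}
    return table


# Same last token "a", different earlier token: the distributions differ,
# so conditioning on the last token alone is not enough.
print(next_token_probs(["a", "a"]) != next_token_probs(["b", "a"]))

# In the augmented state space, each state has a fixed outgoing distribution.
for state, row in augmented_transitions().items():
    print(state, row)
```

The same construction works for any finite context length, which is why the augmentation by itself does not separate one finite-memory process from another.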

                    “nitpick that makes communication worse”

                    How many readers do you think know what “Markov” means? How many would know what “stochastic” or “random” means? I’m willing to bet that the former is a strict subset of the latter.