• @[email protected]
    English
    7
    10 months ago

    I mean, geometry/trig have some of the simplest, most straightforward, least ambiguous rulesets of any math. Why wouldn’t a computer outperform a human?

    • @[email protected]
      English
      6
      10 months ago

      From the article:

      For many years, we’ve had software that can generate lists of valid conclusions that can be drawn from a set of starting assumptions. Simple geometry problems can be solved by “brute force”: mechanically listing every possible fact that can be inferred from the given assumption, then listing every possible inference from those facts, and so on until you reach the desired conclusion.

      But this kind of brute-force search isn’t feasible for an IMO-level geometry problem because the search space is too large. Not only do harder problems require longer proofs, but sophisticated proofs often require the introduction of new elements to the initial figure—as with point D in the above proof. Once you allow for these kinds of “auxiliary points,” the space of possible proofs explodes and brute-force methods become impractical.

      So, mathematicians must develop an intuition about which proof steps will likely lead to a successful result. DeepMind’s breakthrough was to use a language model to provide the same kind of intuitive guidance to an automated search process.
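
To make the "brute force" part concrete, here is a minimal sketch of that kind of exhaustive inference (forward chaining) in Python. The rules and fact strings are made up for illustration, and the function name `forward_chain` is my own; this is not how DeepMind's system or any real geometry prover represents things.

```python
def forward_chain(facts, rules):
    """Return every fact derivable from `facts` using `rules`.

    Each rule is a (premises, conclusion) pair: once all premises are
    known, the conclusion is added. Repeat until nothing new appears.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy rules: plain strings stand in for geometric facts.
RULES = [
    (("AB = AC",), "triangle ABC is isosceles"),
    (("triangle ABC is isosceles",), "angle B = angle C"),
    (("angle B = angle C", "angle A = 60"), "triangle ABC is equilateral"),
]

closure = forward_chain({"AB = AC", "angle A = 60"}, RULES)
print("triangle ABC is equilateral" in closure)
```

This terminates quickly only because the set of possible facts is tiny and fixed. Once the search is allowed to introduce auxiliary points, new facts about new objects can always be generated, which is the blow-up the article describes.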

    • Kogasa
      English
      4
      10 months ago

      Geometry is a bit tricky. A lot of “obvious” facts about geometry are less obvious to prove from a given collection of axioms forming a model of geometry, because their “obviousness” stems from our natural faculties for understanding space and position. Historically, some things that seemed “obviously” true in geometry have turned out to be false, or to depend on unwritten assumptions, for complex reasons. In this light, it may be surprising if current AI can beat humans’ intuition plus logic using purely analytic tools.