And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.
[Alicia Curth’s] team argued that the double-descent phenomenon—where models appear to perform better, then worse, and then better again as they get bigger—arises because of the way the complexity of the models was measured.
The same goes for emergent abilities: they also stop looking abrupt once you use a different metric.
A lot of things boil down to measuring them right. Unfortunately, that is an unsolved problem in general ¯\_(ツ)_/¯
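To make the metric point concrete, here is a toy sketch (not taken from the article or from any paper's code). It assumes per-token accuracy improves smoothly with model size and that tokens are scored independently; the accuracy values and the answer length are made up for illustration.

```python
# Toy illustration: every number here is invented, not a measurement.

def exact_match(per_token_acc: float, length: int) -> float:
    """Chance of getting all `length` tokens right, assuming independent tokens."""
    return per_token_acc ** length

# Hypothetical per-token accuracies for successively larger models.
per_token = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]
ANSWER_LENGTH = 20  # hypothetical answer length in tokens

exact = [exact_match(p, ANSWER_LENGTH) for p in per_token]

for p, em in zip(per_token, exact):
    print(f"per-token accuracy {p:.2f} -> exact match {em:.4f}")
```

The per-token column climbs steadily, while the exact-match column sits near zero for most of the range and then shoots up at the end. Same underlying models, different metric, and only one of them looks like a sudden jump.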