What surprised you the most this year?
The development of distributed training for AI models, from DiLoCo and DiPaCo at Google DeepMind to DisTrO from Nous Research. It completely changes the game in how we should think about training large models and who can train them. It means that raw compute alone will no longer be the decisive advantage, and that we will have little control over the ways in which knowledge is produced.
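The systems named above share a common idea: workers train independently for many steps and communicate only rarely, instead of synchronizing gradients every step. Below is a minimal illustrative sketch of that pattern (local updates followed by infrequent averaging); the loss, data, and hyperparameters are invented for the example and are not taken from DiLoCo, DiPaCo, or DisTrO themselves.

```python
# Sketch of low-communication distributed training: each worker runs
# many local SGD steps on its own data shard, and the replicas
# synchronize only once per outer round by averaging their parameters.
# Everything here (toy least-squares problem, learning rate, step
# counts) is an illustrative assumption, not any system's actual setup.
import numpy as np

def local_sgd(theta, data, lr=0.1, steps=20):
    """Run `steps` SGD steps on a 1-D least-squares loss (x*theta - y)^2."""
    x, y = data
    for _ in range(steps):
        grad = 2 * x * (x * theta - y)   # per-example gradient
        theta = theta - lr * grad.mean()
    return theta

rng = np.random.default_rng(0)
true_w = 3.0
# Each of 4 workers holds its own private shard of the data.
shards = []
for _ in range(4):
    x = rng.normal(size=50)
    shards.append((x, x * true_w + rng.normal(scale=0.1, size=50)))

theta = 0.0                      # shared (outer) parameters
for outer_round in range(5):     # infrequent synchronization rounds
    # Workers train independently from the same starting point...
    local_results = [local_sgd(theta, shard) for shard in shards]
    # ...then communicate once: average the local results (outer step).
    theta = float(np.mean(local_results))

print(theta)  # converges near the true weight, 3.0
```

The point of the sketch is the communication pattern: four workers exchange parameters 5 times total rather than on every one of the 20 local steps, which is what makes training feasible over slow links between separate machines or data centers.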
That answer comes from Rohit Krishnan, who takes on several other questions in the interview, conducted by Derek Robertson, via Mike Doherty.