In this episode, Ben Sorscher, a PhD student at Stanford, discusses reducing the size of the data sets used to train models, particularly large language models. These models are pushing the limits of scaling because of the enormous cost of training them and the environmental impact of generating the electricity they consume.