Eye On AI

Week Ending 4.3.2022

RESEARCH WATCH: 4.3.2022

SPONSORED BY

ClearML is an open-source MLOps solution. Whether you're a data engineer, ML engineer, DevOps engineer, or data scientist, ClearML is hands-down the best collaborative MLOps tool, with full visibility and extensibility.

This week was active for "Computer Science", with 1,495 new papers.

  • The paper discussed most in the news over the past week was "You Cannot Always Win the Race: Analyzing the LFENCE/JMP Mitigation for Branch Target Injection" by Alyssa Milburn (Intel) et al. (Mar 2022), which was referenced 19 times, including in the Intel article Chips & Salsa Episode 13: Intel STORM team. The paper got social media traction with 8 shares.

  • Leading researcher Oriol Vinyals (DeepMind) published "Training Compute-Optimal Large Language Models", which had 29 shares over the past 4 days and was also the most-shared paper on social media this week, with 677 tweets. @arankomatsuzaki (Aran Komatsuzaki) tweeted "Training Compute-Optimal Large Language Models Trains Chinchilla, which is Gopher w/ the same compute budget but with 70B parameters and 4x more more data. It significantly outperforms Gopher, e.g. by >7% on MMLU". A back-of-the-envelope sketch of the compute trade-off the tweet describes appears below.
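
The following sketch is not from the paper; it only illustrates the kind of trade-off the Chinchilla result describes, using the widely cited approximation that training compute is roughly C ≈ 6·N·D (N parameters, D training tokens) and the rough "about 20 tokens per parameter" rule of thumb. Both constants are simplifications of the paper's fitted scaling laws.

```python
# Back-of-the-envelope sketch (not the paper's method): approximate training
# FLOPs as C ~= 6 * N * D and assume a compute-optimal model wants roughly
# 20 training tokens per parameter, then solve for N and D at a fixed budget.

def approx_train_flops(n_params: float, n_tokens: float) -> float:
    """Common approximation: ~6 FLOPs per parameter per training token."""
    return 6.0 * n_params * n_tokens

def compute_optimal_split(flops_budget: float, tokens_per_param: float = 20.0):
    """Solve C = 6 * N * (tokens_per_param * N) for N, then derive D."""
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

if __name__ == "__main__":
    # Roughly Gopher's budget: 280B parameters trained on ~300B tokens.
    budget = approx_train_flops(280e9, 300e9)
    n, d = compute_optimal_split(budget)
    print(f"~{budget:.1e} FLOPs -> ~{n/1e9:.0f}B params, ~{d/1e12:.1f}T tokens")
    # Prints roughly 65B parameters and 1.3T tokens, in the same ballpark as
    # Chinchilla's 70B parameters trained on substantially more data.
```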

This week was very active for "Computer Science - Artificial Intelligence", with 215 new papers.

  • The paper discussed most in the news over the past week was by a team at Google: "Memorizing Transformers" by Yuhuai Wu et al. (Mar 2022), which was referenced 3 times, including in the article Can a language model acquire new knowledge by simply reading new data? in Analytics India Magazine. The paper got social media traction with 61 shares. The authors extend language models with the ability to memorize the internal representations of past inputs; a minimal sketch of the idea appears after this list. A user, @PaperTldr, tweeted "🗜72% In this work, we extend language models with internal representations of past inputs to improve their ability to lookup into past non - differentiable memory of recent benchmarks and tasks, including generic web, math papers, books, well as a".

  • Leading researcher Pieter Abbeel (UC Berkeley) published "Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design". The authors present a supervised pretraining approach to learn circuit representations that can be adapted to new circuit topologies or unseen prediction tasks. @CyrusHakha tweeted "Can we pre-train Graph Neural Networks for few-shot circuit modeling and design? We show that by pre-training deep GNNs to predict bias of circuits we can learn models that can be reused to do sample efficient circuit optimization and modeling. 📖 1/N".

  • The paper shared the most on social media this week is by a team at Tel Aviv University: "Transformer Language Models without Positional Encodings Still Learn Positional Information" by Adi Haviv et al. (Mar 2022) with 185 shares. @Seb_Bratieres (Sébastien Bratières) tweeted "Received wisdom in DL: Transformers need positional encoding to make up for their intrinsically permutation-invariant architecture🤷. Or maybe they don't?🤔".

  • The most influential Twitter user discussing papers is Mark Riedl, who shared "STaR: Bootstrapping Reasoning With Reasoning" by Eric Zelikman et al. (Mar 2022) and said: "Is it possible to be both unsurprised that this works and also amazed that this works? Use prompting to generate “chains of thought”, then train on those chains to improve language model performance". A schematic sketch of the STaR bootstrapping loop appears below.
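
As referenced in the "Memorizing Transformers" item above, here is a minimal numpy sketch of the external-memory idea: one attention layer gets access to a non-differentiable store of (key, value) representations saved from past inputs and performs a top-k lookup into it. The class name, array shapes, combination rule, and fixed gate value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class ExternalMemory:
    """Stores past (key, value) pairs; lookup is top-k by dot-product score."""

    def __init__(self, dim: int):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def add(self, keys, values):
        # The memory is non-differentiable: representations are simply stored.
        self.keys = np.vstack([self.keys, keys])
        self.values = np.vstack([self.values, values])

    def lookup(self, queries, top_k: int = 4):
        scores = queries @ self.keys.T                   # (n_queries, n_memories)
        idx = np.argsort(-scores, axis=-1)[:, :top_k]    # indices of top-k keys
        top_scores = np.take_along_axis(scores, idx, axis=-1)
        weights = softmax(top_scores)                    # attention over retrieved items
        return np.einsum("qk,qkd->qd", weights, self.values[idx])

# Toy usage: blend retrieved memory values with an ordinary attention output.
rng = np.random.default_rng(0)
dim = 16
memory = ExternalMemory(dim)
memory.add(rng.normal(size=(32, dim)), rng.normal(size=(32, dim)))  # past context

queries = rng.normal(size=(4, dim))       # queries from the current segment
local_out = rng.normal(size=(4, dim))     # stand-in for local self-attention output
gate = 0.5                                # the paper learns a gate per head; fixed here
output = gate * memory.lookup(queries) + (1.0 - gate) * local_out
print(output.shape)                       # -> (4, 16)
```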
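
And here is a schematic sketch of the STaR bootstrapping loop summarized in the last item: sample a rationale and answer, keep the rationales that led to correct answers (falling back to a hint-conditioned "rationalization" pass for the rest), fine-tune on the accepted rationales, and repeat. The generate, rationalize, and finetune callables are hypothetical stand-ins for a real language-model API, not the paper's code.

```python
from typing import Callable, Iterable, List, Tuple

Example = Tuple[str, str]      # (question, gold_answer)
Triple = Tuple[str, str, str]  # (question, rationale, answer)

def star_bootstrap(
    generate: Callable[[str], Tuple[str, str]],          # question -> (rationale, answer)
    rationalize: Callable[[str, str], Tuple[str, str]],  # (question, gold hint) -> (rationale, answer)
    finetune: Callable[[List[Triple]], None],            # train on accepted triples
    dataset: Iterable[Example],
    n_iterations: int = 3,
) -> None:
    """Outer loop: generate rationales, keep those that reach the gold answer,
    fine-tune on them, and repeat with the improved model."""
    examples = list(dataset)
    for _ in range(n_iterations):
        accepted: List[Triple] = []
        for question, gold in examples:
            rationale, answer = generate(question)
            if answer == gold:
                accepted.append((question, rationale, gold))
            else:
                # Rationalization: condition on the gold answer as a hint and ask
                # the model to justify it, so hard examples still contribute data.
                rationale, answer = rationalize(question, gold)
                if answer == gold:
                    accepted.append((question, rationale, gold))
        finetune(accepted)  # the next iteration samples from the updated model
```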

This week was very active for "Computer Science - Computer Vision and Pattern Recognition", with 497 new papers.

Over the past week, 18 new papers were published in "Computer Science - Computers and Society".

This week was very active for "Computer Science - Human-Computer Interaction", with 40 new papers.

This week was extremely active for "Computer Science - Learning", with 465 new papers.

  • The paper discussed most in the news over the past week was by a team at Google: "Pathways: Asynchronous Distributed Dataflow for ML" by Paul Barham et al. (Mar 2022), which was referenced 3 times, including in TheRegister.com's article How Google hopes to build more efficient, multi-capability AI systems. The paper got social media traction with 185 shares. A Twitter user, @kaushik_bokka, posted "Uff. Weekend Read! Seems to have a lot of gems and interesting takeaways in there", while @_arohan_ observed "Really cool work on scaling beyond what one might think is possible, with specific focus on MPMD (IIUC). Lot of model architecture ideas waiting to be unlocked".

  • Leading researcher Oriol Vinyals (DeepMind) came out with "Training Compute-Optimal Large Language Models" (also covered above), which had 29 shares over the past 4 days and was the most-shared paper on social media with 677 tweets.

Over the past week, 14 new papers were published in "Computer Science - Multiagent Systems".

Over the past week, 20 new papers were published in "Computer Science - Neural and Evolutionary Computing".

This week was very active for "Computer Science - Robotics", with 93 new papers.


EYE ON A.I. GETS READERS UP TO DATE ON THE LATEST FUNDING NEWS AND RELATED ISSUES. SUBSCRIBE FOR THE WEEKLY NEWSLETTER.