When AI’s Large Language Models Shrink | IEEE Spectrum (03/31/2023)

oogieva25 · April 14, 2023, 4:10pm

The large language model revolution is pushing state-of-the-art AI research out of the reach of ordinary AI laboratories. This concentrates the power of the technology in a few big players, forces the research community to rely on their APIs, and removes the ability of external researchers to probe these models for safety concerns. It is also an impractical road to scaling, one that eventually runs into a wall for innovation.

IEEE Spectrum

Dylan Patel of the consultancy SemiAnalysis says, “We won’t be able to make models bigger forever. There comes a point where even with hardware improvements, given the pace that we’re increasing the model size, we just can’t.” And so, the study and development of technology with smaller models now matters more than ever.

Last year, DeepMind showed (and researchers at Meta, Nvidia and Stanford confirmed) that training smaller models on far more data could significantly boost performance. Additionally, Patel brings up the promising “mixture of experts” technique, which trains smaller, specialized sub-models for various tasks rather than relying on one large, more general model. He and Sara Hooker, research leader at Cohere For AI, also talk about exploiting the sparsity of models to compress them by finding ways to remove empty parameters.
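For anyone curious what "removing empty parameters" can look like in practice, here is a minimal sketch of magnitude pruning, one common way to exploit sparsity. It is not taken from the article: it assumes "empty" means weights at or near zero, and the names `prune_by_magnitude` and `to_sparse` are made up for illustration. Real models use tensor libraries rather than plain Python lists.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0.0.

    `sparsity` is the fraction of weights to remove (0.0 to 1.0).
    Assumption: a weight near zero contributes little and can be dropped.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0.0 and 1.0")
    k = int(len(weights) * sparsity)  # number of weights to zero out
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def to_sparse(weights):
    """Keep only (index, value) pairs for the non-zero weights."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


layer = [0.5, -0.01, 0.0, 0.02, -0.9, 0.003]
pruned = prune_by_magnitude(layer, sparsity=0.5)
compact = to_sparse(pruned)

print(pruned)   # half of the entries are now exactly zero
print(compact)  # only the surviving weights need to be stored
```

The compression comes from the second step: once half the weights are exactly zero, only the surviving (index, value) pairs have to be kept. Whether accuracy holds up after pruning is something that has to be measured on the model in question.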

Still, Patel concedes that the large-model path has a necessary place in research and development. “The max size is going to continue to grow, and the quality at small sizes is going to continue to grow,” he says. “I think there’s two divergent paths, and you’re kind of following both.”
