Each new generation of large language models consumes a staggering amount of resources.

Meta, for instance, trained its new Llama 3 models with about 10 times more data and 100 times more compute than Llama 2. Amid a chip shortage, it used two 24,000-GPU clusters, with each chip costing around the price of a luxury car. It employed so much data in its AI work that it considered buying the publishing house Simon & Schuster to find more.

Afterward, even its executives wondered aloud if the pace was sustainable.

“It is unclear whether we need to continue scaling or whether we need more innovation on post-training,” Ahmad Al-Dahle, Meta’s VP of GenAI, told me in an interview last week. “Is the infrastructure investment unsustainable over the long run? I don’t think we know.”

For Meta — and its counterparts running large language models — the question of whether throwing more data, compute, and energy at the problem will lead to further scale looms large. Since LLMs entered the popular imagination, the best path to exponential improvement seemed to be combining these ingredients and allowing the magic to happen. But with the upper bound of all three potentially in sight, the industry will need newer techniques, more efficient training, and custom-built hardware to progress. Without advances in these areas, LLMs may indeed hit a wall.

The path of continued scale probably starts with better methods to train and run LLMs, some of which are already in motion. “We are starting to see new kinds of architectures that are going to change how these models scale in the future,” Swami Sivasubramanian, VP of AI and Data at Amazon Web Services, told me in an interview Thursday night. Sivasubramanian said researchers at Stanford and elsewhere are getting models to learn faster with the same amount of data, and to run inference ten times cheaper. “I’m actually very optimistic about the future when it comes to novel model architectures, which has the potential to disrupt the space,” he said.

Already, new methods of training these models seem to be paying off. “The smallest Llama 3 is basically as powerful as the biggest Llama 2,” Mark Zuckerberg said on the Dwarkesh Patel podcast last week.

To fuel these models — and get around the potential bottleneck of exhausting real-world data — synthetic data created by AI is playing a key role. Though not fully proven yet, this data has already made its way into model training. “Our coding abilities on Llama 3 is exceptionally high,” Meta’s Al-Dahle said. “Part of that was really being innovative and pushing on our ability to leverage models to generate synthetic data.”
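
To make that concrete, here is a minimal, hypothetical sketch of the general technique (not Meta’s actual pipeline): a “teacher” model writes candidate coding problems and solutions, and a cheap automated filter discards bad output before anything enters the training set. The call_llm helper and model name below are placeholders.

```python
# Hypothetical sketch of synthetic data generation for code training.
# call_llm is a placeholder for any hosted model API; nothing here
# reflects Meta's actual pipeline.
import json

PROMPT = (
    "Write one self-contained Python programming problem and a correct "
    "solution. Respond as JSON with keys 'problem' and 'solution'."
)

def call_llm(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your model API here")

def make_synthetic_examples(model: str = "teacher-llm", n: int = 1000) -> list[dict]:
    examples = []
    for _ in range(n):
        raw = call_llm(model, PROMPT)
        try:
            record = json.loads(raw)
            # Cheap quality filter: keep only solutions that at least parse.
            compile(record["solution"], "<synthetic>", "exec")
        except (json.JSONDecodeError, KeyError, SyntaxError):
            continue  # discard malformed generations
        examples.append(record)
    return examples
```

In practice, labs layer far stronger filters on top — execution tests, deduplication, model-based grading — but the generate-then-filter loop is the core idea.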

Along with finding better models, LLM progress likely depends on building chips that can train and run these models faster and more efficiently than general-purpose hardware. While NVIDIA GPUs are exceptionally useful for large language models, they aren’t purpose-built for them. Now some chips built specifically for generative AI are showing promise. Researchers like Andrew Ng have praised Groq, one buzzy name in the space, as the type of chip fast enough to take generative AI to the next level, especially as the field pushes toward agents.

Meanwhile, companies like Amazon, Intel, Google, and others are building “accelerators,” or custom chips that can run AI processes quickly. At Amazon, Sivasubramanian said, the company’s purpose-built Trainium chips are “designed with the sole purpose of being able to train these large language models” and are already four times faster than the first generation.

Given the need and the opportunity ahead, it’s no wonder OpenAI CEO Sam Altman is reportedly raising a lot of money to build chips powerful enough to achieve his aims.

The one LLM constraint that’s been little discussed is energy, and it may be the most important. “There’s a capital question of — at what point does it stop being worth it to put the capital in? — but I actually think before we hit that, you’re going to run into energy constraints,” Zuckerberg told Patel. He floated the idea of building a 1-gigawatt data center to advance AI, something approximating the output of a meaningful nuclear power plant. But given regulatory approvals and the build-out’s complexity, it could take years to produce. “I think it will happen,” he said. “This is only a matter of time.”
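
For scale, a quick back-of-envelope calculation (my numbers, not Zuckerberg’s) shows why the nuclear comparison fits:

```python
# Back-of-envelope: annual energy draw of a 1-gigawatt data center,
# assuming continuous operation. Illustrative figures, not Meta's plans.
power_gw = 1.0
hours_per_year = 24 * 365          # 8,760 hours
twh_per_year = power_gw * hours_per_year / 1000
print(f"{twh_per_year:.2f} TWh per year")  # ~8.76 TWh
# A large nuclear reactor (~1 GW at roughly a 90% capacity factor)
# generates about 7.9 TWh per year, so one reactor is the right scale.
```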

Until we get to such a massive energy allocation, it may be difficult to say how much room LLMs have left to improve. But it seems like, sooner or later, we will find out. “I am not thinking about it myself,” Sivasubramanian said with a laugh, of a nuclear-level plant to run AI models, “but I can’t speak to my infra team.”

✅ Tech exec-approved (sponsor)

Right now, new AI tools are hitting the market daily. They’re not all created equal… but execs from Google and Meta have opted to invest in RAD AI, a branded-content tool billed as the essential AI for brands. Why?

  • RAD AI has achieved ~3X revenue growth from 2022 to 2023 — while landing major clients, including Hasbro, Skechers, Sweetgreen, and more.

  • Brands using RAD AI have seen 3.5X ROI on campaigns and marketing channels.

  • It’s a proven, working AI backed by 6,500+ investors and the Adobe Fund for Design.

Invest before April 29, when the RAD AI investment round closes.*

Learn More

What Else I’m Reading, Etc.

TSMC’s expansion into the U.S. is a culture clash disaster [Rest of World]

Cathie Wood’s ARK has lost investors $14.3 billion and they’re bailing big time [WSJ]

FTC bans most noncompetes [NPR]

TikTok as we know it, ban or not, is toast [Spyglass]

Worldcoin faces Orb shortage [Semafor]

Why the campus protestors are wearing masks [Semafor]

Quote Of The Week

“It’s not like life is going to be hunky-dory forever.”

Google Search head Prabhakar Raghavan, addressing employees at an internal meeting.

Number of The Week

$1 billion+

Perplexity’s new valuation after a $63 million funding round announced this week. The company makes around $20 million per year.

This Week on Big Technology Podcast: Apple’s AI Play — With M.G. Siegler

You can listen on Apple, Spotify, or wherever you get your podcasts.

Advertise on Big Technology?

Reach 165,000+ plugged-in tech readers with your company’s latest campaign, product, or thought leadership. To learn more, write [email protected] or reply to this email.

Send me news, gossip, and scoops?

I’m always looking for new stories to write about, no matter how big or small, from within the tech giants and the broader tech industry. You can share your tips here. I will never publish identifying details without permission.

Thanks again for reading. Please share Big Technology if you like it!

And hit that Like button so Big Technology doesn’t hit a wall.

My book Always Day One digs into the tech giants’ inner workings, focusing on automation and culture. I’d be thrilled if you’d give it a read. You can find it here.

Questions? Email me by responding to this email, or by writing [email protected]

News tips? Find me on Signal at 516-695-8680

*Advertiser’s Disclosure: This is a paid advertisement for RAD AI’s Regulation CF offering. Please read the offering circular and related risks at invest.radintel.ai.
