Today's episode is seriously mind-expanding. In it, Mark and Charles detail how they're pushing the A.I. frontier through LLM merging, extremely efficient (even CPU-only!) LLM training, and *Small* Language Models.
Mark McQuade:
• Is Co-Founder and CEO of Arcee.ai.
• Previously, he held client-facing roles at Hugging Face and Roboflow, and led the data science and engineering practice of a Rackspace company.
• He studied electronic engineering at Fleming College in Canada.
Charles Goddard:
• Is Chief of Frontier Research at Arcee.ai.
• Previously, he was a software engineer at Apple and the famed NASA Jet Propulsion Laboratory.
• He studied engineering at Olin College in Massachusetts.
Today’s episode is relatively technical, so it will likely appeal most to hands-on practitioners like data scientists and ML engineers. In it, Charles and Mark detail:
• How their impressive open-source model-merging approach combines the capabilities of multiple LLMs without increasing model size (a minimal code sketch of the idea follows this list).
• A separate open-source approach for training LLMs efficiently by targeting specific modules of the network to train while freezing the rest (also sketched below).
• The pros and cons of Mixture-of-Experts versus Mixture-of-Agents approaches.
• How to enable small language models to outcompete the big foundation LLMs like GPT-4, Gemini and Claude.
• How to leverage open-source projects to land big enterprise contracts and attract big chunks of venture capital.
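
To give a concrete feel for the model-merging idea, here is a minimal sketch of the simplest possible case: plain linear weight averaging of two fine-tunes that share a base architecture. The model names are placeholders, and Arcee's open-source mergekit library implements far more sophisticated merge methods than this, so treat it purely as an illustration of why a merged model stays the same size as its parents.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder names: two fine-tunes of the same base architecture.
model_a = AutoModelForCausalLM.from_pretrained("org/finetune-a")
model_b = AutoModelForCausalLM.from_pretrained("org/finetune-b")

with torch.no_grad():
    state_a = model_a.state_dict()
    state_b = model_b.state_dict()

    # Linear merge: average each tensor. Because weights are combined
    # rather than models stacked, the result has exactly the same
    # parameter count as either parent.
    merged = {name: 0.5 * state_a[name] + 0.5 * state_b[name] for name in state_a}

    model_a.load_state_dict(merged)

model_a.save_pretrained("./merged-model")
```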
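And to make the "train some modules, freeze the rest" idea tangible, the toy PyTorch snippet below disables gradients for every parameter and re-enables them only for a hypothetical list of target layers, so the optimizer updates just a small slice of the network. The selection rule here is invented for illustration; the approach Charles and Mark discuss chooses which modules to train much more cleverly.

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("org/small-base-model")

# Hypothetical choice: only train the attention projections of two layers.
TARGETS = ("layers.10.self_attn", "layers.11.self_attn")

for name, param in model.named_parameters():
    # Freeze everything except the targeted modules.
    param.requires_grad = any(t in name for t in TARGETS)

# Only the unfrozen parameters are handed to the optimizer, which keeps
# gradient and optimizer-state memory (and compute) much lower.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=2e-5)
```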
On that final note, congrats to the Arcee.ai team on announcing their $24m Series A round this very day... unsurprising given their tremendously innovative tech and rapid revenue ramp-up! It's very rare to see runaway A.I. startup successes like this one.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.