Today’s episode is all about RFM-1, an LLM trained for robotics applications that completely blows my mind because of the implications for what can now suddenly be accomplished so easily with robotics.
Deep Reinforcement Learning for Maximizing Profits, with Prof. Barrett Thomas
Today, Prof. Barrett Thomas blends his rich technical understanding of Deep Reinforcement Learning with his commercial savviness to eloquently detail how Deep RL can be leveraged to minimize costs and maximize profits.
Barrett:
• Is Research Professor in Business Analytics and Senior Associate Dean at the University of Iowa’s College of Business.
• As will soon be unsurprising to you when you hear how well he communicates complex concepts, he’s won multiple teaching awards (amongst other academic prizes).
• He holds a PhD in Industrial and Operations Engineering from the University of Michigan.
Today’s episode is a technical one that will appeal primarily to hands-on practitioners like data scientists, software developers and machine learning engineers.
In this episode, Barrett details:
• What Markov Decision Processes (MDPs) are and how they relate to Deep Reinforcement Learning (a toy illustration follows this list).
• How operations research leverages neural networks to minimize business costs and maximize business profits.
• How same-day delivery has been made possible by machine learning.
• How aerial drones and autonomous vehicles will revolutionize supply chains and transportation.
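For readers who want a concrete anchor for those concepts, here is a deliberately tiny, hedged illustration (my own example, not Barrett's models): a five-state Markov Decision Process solved with tabular Q-learning. Deep RL scales this same idea up by swapping the Q-table for a neural network; all states, rewards and hyperparameters below are made up purely for illustration.

```python
import random

# Toy MDP: states 0..4 on a line; the agent starts at 0 and earns +1 for reaching state 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                      # step left or right
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(2000):                   # training episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy action selection.
        a = random.choice(ACTIONS) if random.random() < epsilon else max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else -0.01      # small step cost encourages short routes
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])   # Q-learning update
        s = s_next

# Greedy policy per state (should point toward the goal, i.e. +1 everywhere left of it).
print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES)})
```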
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
In Case You Missed It in March 2024
We're trying something novel on the SuperDataScience Podcast today: an ICYMI ("in case you missed it") episode that highlights the most gripping moments from my conversations with guests over the past month.
Please let me know what you think of this! Does it work for you? What would you change about it? Should we stop doing these entirely? Let me know right here on this post; your voice matters :)
For this inaugural ICYMI episode, conversation highlights include:
1. Sebastian Raschka, PhD, on how Lightning AI makes LLM training and deployment easy (from Episode #767).
2. Dr. Travis Oliphant, creator of the ubiquitous NumPy and SciPy libraries, on the future of scientific computing (#765).
3. Award-winning, A.I.-focused venture capitalist Rudina Seseri letting us know what it takes to get a VC firm to invest in you (#763).
4. Prof. Zachary Lipton on his roadmap from AI startup to long-term commercial success (#769).
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Gradient Boosting: XGBoost, LightGBM and CatBoost, with Kirill Eremenko
You wanted more of Kirill Eremenko, now you've got it! Kirill returns to the show today to detail Decision Trees, Random Forests and all three of the leading gradient-boosting algorithms: XGBoost, LightGBM and CatBoost 😸
If you don’t already know him, Kirill:
• Is Founder and CEO of SuperDataScience, an e-learning platform that is the namesake of this very podcast.
• Launched the SuperDataScience Podcast in 2016 and hosted the show until he passed me the reins four years ago.
• Has reached more than 2.7 million students through the courses he’s published on Udemy, making him Udemy’s most popular data science instructor.
Today’s episode is a highly technical one focused specifically on Gradient Boosting methods and the foundational theory required to understand them. I expect this episode will be of interest primarily to hands-on practitioners like data scientists, software developers and machine learning engineers.
In this episode, Kirill details:
• Decision Trees.
• How Decision Trees are ensembled into Random Forests via Bootstrap Aggregation.
• How the AdaBoost algorithm formed a bridge from Random Forests to Gradient Boosting.
• How Gradient Boosting works for both regression and classification tasks.
• All three of the most popular Gradient Boosting approaches — XGBoost, LightGBM and CatBoost — as well as when you should choose them (a minimal code sketch follows this list).
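As a hedged, minimal illustration of how interchangeable the three libraries' scikit-learn-style interfaces are (my own sketch, not material from the episode; assumes xgboost, lightgbm and catboost are installed), here is the same toy classification task fit with each:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

# Synthetic data purely for demonstration.
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    "XGBoost": XGBClassifier(n_estimators=200, learning_rate=0.1),
    "LightGBM": LGBMClassifier(n_estimators=200, learning_rate=0.1),
    "CatBoost": CatBoostClassifier(n_estimators=200, learning_rate=0.1, verbose=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)                       # identical fit/predict interface
    print(f"{name} test accuracy: {model.score(X_test, y_test):.3f}")
```

In practice the choice between them tends to hinge on the factors Kirill discusses in the episode (dataset size, categorical features, training speed) rather than on API differences.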
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
The Neuroscientific Guide to Confidence
The inspiring entrepreneur Lucy Antrobus has run confidence-building workshops for thousands of people. In today's episode, she details her neuroscience-backed formula for developing bulletproof confidence.
Lucy:
• Advises the United Nations on innovation for impact.
• Was previously Founder/CEO of an award-winning NGO and Co-founder/COO of an edtech company.
• Critically for today’s episode, she has run confidence-building workshops for over 1000 people of 30+ nationalities, including refugees who have just arrived in Switzerland.
Today’s episode should be fascinating to anyone!
In it, Lucy details:
• The science of confidence, which we can grow through repetition and practice, much like we can develop muscles by repeating lifts at the gym.
• Concrete guidance from neuroscience research on what we can do to develop healthy confidence in ourselves and in those around us.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Generative AI for Medicine, with Prof. Zack Lipton
Generative A.I. is rapidly transforming medicine. My guest today is brilliant, inspiring Prof. Zachary Lipton — Chief Scientific Officer and CTO of Abridge, a startup that has quickly raised $208m to lead the transformation!
More on Zack:
• Is an Associate Professor in the Machine Learning Department of Carnegie Mellon University's School of Computer Science.
• Highly-cited (23k+ citations) with research spanning core ML methods and theory, as well as applications in healthcare and NLP.
• Directs the Approximately Correct Machine Intelligence (ACMI) Lab at CMU, where they build robust systems for the real world.
• Is also a jazz saxophonist! 🎷
Despite Zack being such a deep technical expert, most of today’s content will be of interest to anyone who’d like to hear about the cutting edge of generative A.I. applications in healthcare.
The tech whose development Zack leads at Abridge, and which you can hear about in today's episode:
• Initial deployment uses ambient listening and generative A.I. to reduce the cognitive burden of clinical documentation, reducing burnout as well as enabling clinicians to spend less time with computers and more with patients.
• Industry-leading automatic speech recognition engine specifically designed for healthcare applications; can accurately transcribe speech in challenging environments, e.g., when there is background noise or when multiple people are speaking.
• Supports 14+ languages including handling code-switching (where speakers shift between languages) and interpreter-mediated conversations.
• In-house LLM development allows greater customization and responsible-use features, such as transparency (e.g., links to source transcript/audio) and evidence extraction (verification process).
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Is Claude 3 Better than GPT-4?
Across a broad range of LLM benchmarks, including MMLU, GPQA, grade-school math and many others, Claude 3 Opus, Anthropic's largest and most powerful Claude 3 model…
Open-Source LLM Libraries and Techniques, with Dr. Sebastian Raschka
Today's superhuman guest is Dr. Sebastian Raschka, author of the bestselling "Machine Learning with PyTorch and Scikit-Learn" book, iconic technical blogger (>350k combined followers) and Staff Research Engineer at Lightning AI. Hear him detail open-source libraries and techniques for LLMs.
More on Sebastian:
• Is Staff Research Engineer at Lightning AI, the company behind the popular PyTorch Lightning open-source library for training and deploying PyTorch models, including Large Language Models (LLMs), with ease.
• Iconic technical blogger (50k subscribers) and social-media contributor (>350k combined followers across LinkedIn and Twitter)
• Was previously Assistant Professor of Statistics at University of Wisconsin-Madison.
• Holds a PhD in statistical data mining from Michigan State University.
Today’s episode is technical and will primarily be of interest to hands-on practitioners like data scientists, software developers and machine learning engineers.
In it, Sebastian details:
• The many super-helpful open-source libraries whose development Lightning AI leads.
• DoRA (Weight-Decomposed Low-Rank Adaptation) parameter-efficient fine-tuning.
• Google’s “open-source” Gemma models.
• Multi-query attention (a minimal PyTorch sketch follows this list).
• The leading alternatives to RLHF.
• Where he sees the next big opportunities in LLM development.
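Since multi-query attention comes up in the list above, here is a minimal, hedged PyTorch sketch of the core idea (my own illustration, not Sebastian's code): all query heads share a single key/value head, which is what shrinks the KV cache relative to standard multi-head attention. Shapes and weights below are arbitrary.

```python
import torch

def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Multi-query attention: n_heads query projections share ONE key/value head."""
    B, T, d_model = x.shape
    d_head = d_model // n_heads
    q = (x @ w_q).view(B, T, n_heads, d_head).transpose(1, 2)  # (B, H, T, d_head)
    k = (x @ w_k).unsqueeze(1)                                  # (B, 1, T, d_head), shared across heads
    v = (x @ w_v).unsqueeze(1)                                  # (B, 1, T, d_head), shared across heads
    att = (q @ k.transpose(-2, -1)) / d_head**0.5               # (B, H, T, T); k broadcasts over heads
    att = att.softmax(dim=-1)
    out = att @ v                                               # (B, H, T, d_head); v broadcasts over heads
    return out.transpose(1, 2).reshape(B, T, d_model)

# Tiny smoke test with random weights (dimensions chosen arbitrarily for illustration).
B, T, d_model, n_heads = 2, 5, 32, 4
x = torch.randn(B, T, d_model)
w_q = torch.randn(d_model, d_model)
w_k = torch.randn(d_model, d_model // n_heads)  # single shared key head
w_v = torch.randn(d_model, d_model // n_heads)  # single shared value head
print(multi_query_attention(x, w_q, w_k, w_v, n_heads).shape)  # torch.Size([2, 5, 32])
```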
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Vonnegut's Player Piano (1952): An Eerie Novel on the Current AI Revolution
Player Piano, despite being written seven decades ago, could not be more relevant to the AI revolution that’s accelerated dramatically in the past year.
NumPy, SciPy and the Economics of Open-Source, with Dr. Travis Oliphant
Huge episode today with iconic Dr. Travis Oliphant, creator of NumPy and SciPy, the standard libraries for numeric operations (downloaded 8 million and 3 million times PER DAY, respectively). Hear about the future of open-source, including the impact of GenAI.
More on Travis:
• Founded Anaconda, Inc., the company behind the also-ubiquitous conda package and environment manager for Python.
• Founded the massive PyData conferences and communities as well as its associated non-profit foundation, NumFOCUS.
• Currently serves as the CEO of two firms: OpenTeams and Quansight.
• Holds a PhD in biomedical engineering from the Mayo Clinic in Minnesota.
Today’s episode will primarily be of interest to hands-on practitioners like data scientists, software developers and machine learning engineers.
In it, Travis details:
• How his journey creating open-source software began and how NumPy and SciPy grew to become the most popular foundational Python libraries for working with data.
• How he identifies commercial opportunities to support his vast open-source efforts and communities.
• How AI, particularly generative AI, is transforming open-source development.
• Where open-source innovation is headed in the years to come.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
The Top 10 Episodes of 2023
In 2023, we had a new record of 4 million combined podcast downloads and YouTube views. That’s up from 3.3 million a year earlier; thank you for your support listening, rating, sharing, liking, commenting on episodes and so on!
The Best A.I. Startup Opportunities, with venture capitalist Rudina Seseri
How should an A.I. startup find product-market fit? How do some A.I. startups become spectacularly successful? The renowned (and highly technical!) A.I. venture-capital investor Rudina Seseri answers these questions and more in today's episode.
Rudina:
• Founder and Managing Partner of Glasswing Ventures in Boston.
• Led investments and/or served on the Board of Directors of more than a dozen SaaS startups, many of which were acquired.
• Was named Startup Boston's 2022 "Investor of the Year" amongst many other formal recognitions.
• Is a sought-after keynote speaker on investing in A.I. startups.
• Executive Fellow at Harvard Business School.
• Holds an MBA from Harvard University.
Today’s episode will be interesting to anyone who’s keen on scaling their impact with A.I., particularly through A.I. startups or investment.
In this episode, Rudina details:
• How data are used to assess venture capital investments.
• What makes particular AI startups so spectacularly successful.
• Her "A.I. Palette" for examining categories of machine learning models and mapping them to categories of training data.
• How Generative AI isn’t a fad, but it is still only a component of the impact that AI more broadly can make.
• The automated systems she has built for staying up to date on all of the most impactful AI developments.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Gemini 1.5 Pro, the Million-Token-Context LLM
In Episode #761, we detailed the public release of Google’s Gemini Ultra, the only LLM in the same class as OpenAI’s GPT-4 in terms of capabilities. Hot on the heels of that announcement comes the release of Gemini 1.5 Pro.
Gemini Ultra: How to Release an A.I. Product for Billions of Users, with Google’s Lisa Cohen
Google recently released Gemini Ultra, their largest language model. I love Ultra and now use it instead of GPT-4 on many tasks. Today's guest, Lisa Cohen, leads Gemini's rollout; hear from her how a company with billions of users rolls out new A.I. products.
More on Gemini Ultra:
• The only LLM with comparable capabilities to GPT-4 (in my experience as well as on benchmark evaluations, although I know benchmarking has plenty of issues!)
• Ultra maintains attention across large context windows (Gemini 1.5 Pro has a million-token context, btw!), competently generating natural language and code.
• Like GPT-4V, Ultra is multi-modal and so accepts both an image and text as input at the same time.
• Piggybacking on Google's excellence at search, I’ve found Gemini Ultra to be particularly effective at tasks that involve real-time search (the Google "Bard" project that focused on real-time information retrieval was renamed "Gemini" when Gemini Ultra was released).
Lisa Cohen is perhaps the best person on the planet to speak to about the momentous Gemini releases: she is Director of Data Science & Engineering for Google's Gemini, Assistant and Search Platforms. In addition, she:
• Was previously Senior Director of Data Science at Twitter and Principal Director of Data Science at Microsoft.
• Holds a Master's in Applied Math from Harvard University.
In this episode, Lisa details:
• The three LLMs in Google’s Gemini family and how the largest one, Gemini Ultra, fits in.
• The many ways you can access Gemini models today (a minimal API-call sketch follows this list).
• How absolutely enormous LLM projects are carried out and how they’re rolled out safely and confidently to literally billions of users.
• How LLMs like Gemini Ultra are transforming life and work for everyone from data scientists to educators to children, and how this transformation will continue in the coming years.
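As a hedged illustration of one of those access routes, here is a minimal sketch using Google's google-generativeai Python SDK. The model name below is illustrative only: Gemini Ultra itself has chiefly been available through the Gemini Advanced product, so swap in whichever Gemini model your API key can reach.

```python
# pip install google-generativeai
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")          # placeholder; supply your own key
model = genai.GenerativeModel("gemini-pro")      # illustrative model name; use one your key can access
response = model.generate_content("Summarize the Transformer architecture in two sentences.")
print(response.text)
```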
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Humans Love A.I.-Crafted Beer
I recently recorded tipplers' reactions as they had their first taste of the A.I.-crafted "Krohn&Borg" lager I co-developed. Today's episode illustrates the result: Humans love A.I. beer! There's also cool content on using CRISPR-Cas9 to modify yeast genes.
Thanks again to Beau Warren, Head Brewer at Species X Beer Project, for the opportunity to collaborate on this delicious project. You can check out Episode #755 for tons of detail on the ML packages used and the models developed to craft beer with A.I.
And thanks to all of the guests/judges in today's episode:
• Rehgan Avon of AlignAI
• Alexandra Hagmeyer (Dauterman) of Path Robotics
• Kelsey Dingelstedt of Women in Analytics (WIA)
• William McFarland of Omega Yeast
• Jim Lachey of the Super Bowl XXVI-winning Washington Commanders
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko
Last month, Kirill Eremenko was on the show to detail Decoder-Only Transformers (like the GPT series). It was our most popular episode ever, so he's come right back today to detail an even more sophisticated architecture: Encoder-Decoder Transformers.
If you don’t already know him, Kirill:
• Is Founder and CEO of SuperDataScience, an e-learning platform that is the namesake of this podcast.
• Founded the Super Data Science Podcast in 2016 and hosted the show until he passed me the reins a little over three years ago.
• Has reached more than 2.7 million students through the courses he’s published on Udemy, making him Udemy’s most popular data science instructor.
Kirill was most recently on the show for Episode #747 to provide a technical introduction to the Transformer module that underpins all the major modern Large Language Models (LLMs) like the GPT, Gemini, Llama and BERT architectures. We received an unprecedented amount of positive feedback from that episode, demanding more! So here we are.
That episode, #747, and today’s are perhaps the two most technical episodes of this podcast ever, so they will probably appeal mostly to hands-on practitioners like data scientists and ML engineers, particularly those who already have some understanding of deep neural networks.
In this episode, Kirill:
• Reviews the key Transformer theory that we covered in Episode #747, namely the individual neural-network components of the Decoder-Only architecture that prevails in generative LLMs like the GPT series models.
• Builds on that to detail the full Encoder-Decoder Transformer architecture introduced by Google in their “Attention is All You Need” paper and also used in other models that excel at both natural-language understanding and generation, such as T5 and BART (a minimal PyTorch sketch follows this list).
• Discusses the performance and capability pros and cons of full Encoder-Decoder architectures relative to Decoder-Only architectures like GPT and Encoder-Only architectures like BERT.
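For readers who want to see the full Encoder-Decoder architecture in code, here is a minimal, shape-level sketch (my own illustration, not material from the episode) using PyTorch's built-in nn.Transformer module; the dimensions are arbitrary and token embedding/output layers are omitted for brevity.

```python
import torch
import torch.nn as nn

d_model = 64
model = nn.Transformer(
    d_model=d_model, nhead=8,
    num_encoder_layers=2, num_decoder_layers=2,
    batch_first=True,
)

src = torch.randn(4, 10, d_model)   # embedded source sequence: (batch, src_len, d_model)
tgt = torch.randn(4, 7, d_model)    # embedded target tokens generated so far
# Causal mask so each target position attends only to earlier target positions.
tgt_mask = nn.Transformer.generate_square_subsequent_mask(7)

out = model(src, tgt, tgt_mask=tgt_mask)
print(out.shape)  # torch.Size([4, 7, 64]): one d_model vector per target position
```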
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
The Mamba Architecture: Superior to Transformers in LLMs
Modern, cutting-edge A.I. depends almost entirely on the Transformer. But now the first serious contender to the Transformer has emerged: it’s called Mamba, and we’ve got the full paper, "Mamba: Linear-Time Sequence Modeling with Selective State Spaces", written by researchers at Carnegie Mellon and Princeton.
How to Speak so You Blow Listeners’ Minds, with Cole Nussbaumer Knaflic
Cole Nussbaumer Knaflic's book, "storytelling with data", has sold over 500k copies... wild! In today's episode, Cole details the best tricks from her latest book, "storytelling with you" — a goldmine on how to inform and profoundly engage people.
Cole:
• Is the author of “storytelling with data”, which has sold half a million copies, has been translated into over 20 languages and is used by more than 100 universities. Nearly a decade after publication, it is still the #1 bestseller in several Amazon categories.
• Also wrote the hands-on follow-on, “storytelling with data: let’s practice!”, a bestseller in its own right.
• Serves as the Founder and CEO of the storytelling with data company, which provides data-storytelling workshops and other resources.
• Was previously a People Analytics Manager at Google.
• Holds a degree in math as well as an MBA from the University of Washington.
Today’s episode will be of interest to anyone who’d like to communicate so effectively and compellingly that people are blown away.
In this episode, Cole details:
• Her top tips for planning, creating and delivering an incredible presentation.
• A few special tips for communicating data effectively for all of you data nerds like me.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
AlphaGeometry: AI is Suddenly as Capable as the Brightest Math Minds
Google DeepMind's open-sourced AlphaGeometry blends "fast thinking" (like intuition) with "slow thinking" (like careful, conscious reasoning) to enable a big leap forward in A.I. capability and match human Math Olympiad gold medalists on geometry problems.
KEY CONTEXT
• A couple of weeks ago, DeepMind published AlphaGeometry in the prestigious peer-reviewed journal Nature.
• DeepMind focused on geometry due to its demand for high-level reasoning and logical deduction, posing a unique challenge that traditional ML models struggle with.
MASSIVE RESULTS
• AlphaGeometry tackled 30 International Mathematical Olympiad problems, solving 25. This outperforms human Olympiad bronze and silver medalists' averages (who solved 19.3 and 22.9, respectively) and closely rivals gold medalists (who solved 25.9).
• This new system crushes the previous state-of-the-art A.I., which solved only 10 out of 30 problems.
• Beyond solving problems, AlphaGeometry also generates understandable proofs, making A.I.-generated solutions more accessible to humans.
HOW?
• AlphaGeometry uses a new method of generating synthetic theorems and proofs, simulating 100 million unique examples to overcome the limitations of (expensive, laborious) human-generated proofs.
• It combines a neural (deep learning) language model for intuitive guesswork with a symbolic deduction engine for logical problem-solving, mirroring the "fast" and "slow" thinking processes of human cognition (per Daniel Kahneman's "Thinking, Fast and Slow"); a toy sketch of this loop follows.
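To make that "fast + slow" loop concrete, here is a deliberately toy, hypothetical sketch (nothing here comes from the AlphaGeometry codebase; the facts, rules and proposer are invented): a symbolic engine forward-chains rules until it stalls, then a stand-in for the neural language model proposes an auxiliary construction and deduction resumes.

```python
def deductive_closure(facts, rules):
    """Symbolic engine: forward-chain implication rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def neural_proposer(facts, stuck_goal):
    """Stand-in for the language model: suggest one auxiliary construction.
    In AlphaGeometry this role is played by a Transformer trained on ~100M
    synthetic proofs; here it is a hard-coded hint so the toy runs end to end."""
    if "aux_point_constructed" not in facts:
        return "aux_point_constructed"
    return None

def solve(initial_facts, rules, goal, max_proposals=5):
    facts = set(initial_facts)
    for _ in range(max_proposals + 1):
        facts = deductive_closure(facts, rules)   # slow, exact deduction
        if goal in facts:
            return True, facts
        proposal = neural_proposer(facts, goal)   # fast, intuitive guess
        if proposal is None:
            return False, facts
        facts.add(proposal)
    return False, facts

# Toy "geometry": the goal is only reachable after the auxiliary construction.
rules = [
    (("given_A", "given_B"), "lemma_1"),
    (("lemma_1", "aux_point_constructed"), "goal_theorem"),
]
ok, _ = solve({"given_A", "given_B"}, rules, "goal_theorem")
print("proved:", ok)
```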
IMPACT
• A.I. that can "think fast and slow" like AlphaGeometry could generalize across mathematical fields and potentially other scientific disciplines, pushing the boundaries of human knowledge and problem-solving capabilities.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Brewing Beer with A.I., with Beau Warren
In today's episode, Beau Warren of the innovative "Species X" brewery details how we collaborated on an A.I. model to craft the perfect beer. The result is a lager dubbed "Krohn&Borg"; join us in Columbus, Ohio on Thursday night to try it yourself! 🍻