Back in Episode #748 earlier this year, I covered the five levels of Artificial General Intelligence. Well, today, inspired by my first-ever experience in an autonomous vehicle (a Waymo ride while in San Francisco recently), we’ve got an episode on the five levels of motor-vehicle automation.
Agentic AI, with Shingai Manjengwa
Today's episode is all about Agentic A.I. — perhaps the hottest topic in A.I. today. Astoundingly intelligent and articulate Shingai Manjengwa couldn't be a better guide for us on this hot topic 🔥
Shingai:
Head of A.I. Education at ChainML, a prestigious startup focused on developing tools for a future powered by A.I. agents.
Founder and former CEO of Fireside Analytics Inc., whose online data-science courses have been taken by 500,000 unique students.
Previously was Director of Technical Education at the prominent global A.I. research center, the Vector Institute in Toronto.
Holds an MSc in Business Analytics from New York University.
Today’s episode should appeal equally to hands-on practitioners like data scientists and to folks who simply want to stay abreast of the most cutting-edge A.I. techniques.
In today’s episode, Shingai details:
What A.I. agents are.
Why agents are the most exciting, fastest-growing A.I. application today.
How LLMs relate to agentic A.I.
Why multi-agent systems are particularly powerful.
How blockchain technology enables humans to better understand and trust A.I. agents.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
In Case You Missed It in July 2024
In July, we had yet another bevy of extraordinary guests on the SuperDataScience Podcast I host. ICYMI, today's episode highlights the most fascinating moments from my convos with them.
Specifically, conversation highlights include:
1. Iconic Daliana Liu (ex-AWS Senior Data Scientist; host of The Data Scientist Show) on the hard skills data scientists need most in today's market.
2. Pulitzer prize-winning journalist and many-time NY Times bestselling author Charles Duhigg on the secrets to being a "Supercommunicator", i.e., getting people invested in your ideas and opening up.
3. Arcee.ai's CEO Mark McQuade and Chief of Frontier Research Charles Goddard on the frontier "model merging" technique whereby the capabilities of multiple LLMs can be combined without increasing model size.
4. Prolific Google DeepMind researcher Dr. Rosanne Liu (no relation to Daliana!) on her landmark “Beyond the Imitation Game” paper, particularly why all LLM benchmarks are flawed.
5. Andrey Kurenkov, PhD (A.I. Engineering Lead at Astrocade and founder/co-host of my favorite podcast, "Last Week in A.I.") on how Artificial Superintelligence (ASI) may be just a few years away and what the implications could be for us as individuals and as a society.
Superintelligence and the Six Singularities, with Dr. Daniel Hulme
Artificial Superintelligence (ASI) could be realized in our lifetime... some even think within a few years. Today's brilliant guest, Dr. Daniel Hulme, details the six major ways society could be overhauled by ASI.
More on Daniel:
• Chief A.I. Officer at marketing giant WPP.
• CEO of A.I. consulting-services company Satalia.
• Entrepreneur-in-Residence at one of the world's top A.I.-research universities, UCL.
• Co-founder of Faculty and speaker at Singularity University.
• Holds an Eng.D. in computational complexity from UCL.
Today’s episode should be of interest to everyone. In it, Daniel details:
• How and when Artificial Superintelligence (ASI) may arise.
• The six types of Singularity ASI is expected to unleash.
• Neuromorphic computing.
• How to align A.I. interests with human interests.
• Ways human work could be dramatically automated not just in the future… but this very day.
Llama 3.1 405B: The First Open-Source Frontier LLM
Meta releasing its giant (405-billion parameter) Llama 3.1 model is a game-changer: For the first time, an "open-source" LLM competes at the frontier (against proprietary models GPT-4o and Claude).
How to Be a Supercommunicator, with Charles Duhigg
Today, Pulitzer Prize winner and NY Times bestselling author Charles Duhigg reveals how you can become a "Supercommunicator", allowing you to connect with anyone, form deep bonds and get more done with others.
More on Charles:
• Pulitzer prize-winning journalist who currently writes for The New Yorker.
• His first book, "The Power of Habit", was published in 2012, spent over three years on New York Times bestseller lists and was translated into 40 languages.
• His second book, "Smarter Faster Better", was published in 2016 and was also a New York Times bestseller.
• Is a graduate of Yale University and Harvard Business School.
Today’s episode should be of great interest to everyone. In it, Charles provides the key takeaways from "Supercommunicators" including:
• Step-by-step instructions on how to connect meaningfully with anyone.
• The three types of conversation and how to ascertain which one you’re in at any given moment.
• How to have productive conflicts without the conversation spiraling out of control.
• How generative A.I. is transforming our conversations today and how the technology may transform them even more dramatically in the future.
How to Thrive in Your (Data Science) Career, with Daliana Liu
In today's episode, renowned Daliana Liu details how to overcome common (unhelpful!) career mindsets and thrive professionally, including finding your niche and getting promoted... all without burning out!
If you haven’t already heard of her, Daliana:
• Is well-known for her content creation on data science careers, particularly career-growth strategies, leading her to have >280,000 LinkedIn followers.
• Hosts The Data Scientist Show, which is in the top 2% of all podcasts globally in terms of downloads.
• Specializes in 1:1 career coaching as well as coaching groups through structured programs like her upcoming "Survive and Thrive in Data Science and AI Careers" course.
• Previously worked as a Senior Data Scientist at AWS and Predibase (a Bay Area open-source LLMs startup).
• Holds a Master's in Statistics from UC Irvine.
Today’s episode is well-suited to *anyone* who’d like to thrive more than ever professionally; it will particularly appeal to data scientists and related professionals like data analysts, ML engineers and software developers… but most of the advice Daliana covers is beneficial to anyone.
In today’s episode, Daliana details:
• Common unhelpful career mindsets and how to overcome them.
• How to find the role you really want as opposed to the one you think you want.
• How to find your niche in a fast-moving field.
• How to offset common professional issues like imposter syndrome, distraction and burnout.
• Her top tips for accelerating a technical career.
• The must-know tech skills for data scientists in today’s market.
In Case You Missed It in June 2024
June was yet another month of phenomenal guests on the SuperDataScience Podcast I host. ICYMI, today's episode highlights the most fascinating moments of my conversations with them.
Specifically, conversation highlights include:
1. Dr. Jason Yosinski, one of my all-time favorite deep-learning researchers and CEO/co-founder of climate-tech startup Windscape AI, shares the secrets to capturing investor interest and what it takes to turn heads in the AI startup scene. Spoiler alert: it’s more than just having a great idea! 🚀
2. Dr. Gina Guillaume-Joseph, systems engineer and A.I.-regulation guru, details the evolving regulatory field for A.I., helping you ensure that the A.I. systems you deploy won't fall foul of any laws.
3. Alexandre Andorra, co-founder of PyMC Labs and host of the Learning Bayesian Statistics podcast, on why being able to crunch larger and larger datasets has helped us to use a powerful modeling technique that was originally devised centuries ago (Bayesian stats, of course!).
4. Dr. Nathan Lambert, research scientist at the Allen Institute for AI (AI2) who previously built out the reinforcement learning from human feedback (RLHF) team at Hugging Face, on the lack of robustness in RLHF and how that could impact the future development and deployment of AI systems.
Merged LLMs Are Smaller And More Capable, with Arcee AI’s Mark McQuade and Charles Goddard
Today's episode is seriously mind-expanding. In it, Mark and Charles detail how they're pushing the A.I. frontier through LLM merging, extremely efficient (even CPU-only!) LLM training, and *Small* Language Models.
Mark McQuade:
• Is Co-Founder and CEO of Arcee.ai.
• Previously, he held client-facing roles at Hugging Face and Roboflow and led the data science and engineering practice of a Rackspace company.
• He studied electronic engineering at Fleming College in Canada.
Charles Goddard:
• Is Chief of Frontier Research at Arcee.ai.
• Previously, he was a software engineer at Apple and the famed NASA Jet Propulsion Laboratory.
• Studied engineering at Olin College in Massachusetts.
Today’s episode is relatively technical so will likely appeal most to hands-on practitioners like data scientists and ML engineers. In it, Charles and Mark detail:
• How their impressive open-source model-merging approach combines the capabilities of multiple LLMs without increasing the model’s size.
• A separate open-source approach for training LLMs efficiently by targeting specific modules of the network to train while freezing others.
• The pros and cons of Mixture-of-Experts versus Mixture-of-Agents approaches.
• How to enable small language models to outcompete the big foundation LLMs like GPT-4, Gemini and Claude.
• How to leverage open-source projects to land big enterprise contracts and attract big chunks of venture capital.
On that final note, congrats to the Arcee.ai team on announcing their $24m Series A round this very day... unsurprising given their tremendously innovative tech and rapid revenue ramp-up! It's very rare to see runaway A.I. startup successes like this one.
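Model merging comes in several flavors; as a toy illustration of the simplest one (a linear merge, not necessarily Arcee's actual method), here's how a weighted average of same-shaped parameter sets combines models without growing the merged model's size:

```python
def linear_merge(models, weights):
    """Toy linear merge: per-parameter weighted average across models.

    `models` is a list of state dicts (parameter name -> list of floats)
    with identical architectures; the merged model is the same size as
    any single input model.
    """
    assert len(models) == len(weights)
    total = sum(weights)
    merged = {}
    for name in models[0]:
        vecs = [m[name] for m in models]
        merged[name] = [
            sum(w * v[i] for w, v in zip(weights, vecs)) / total
            for i in range(len(vecs[0]))
        ]
    return merged

# Two tiny "models" with identical parameter names and shapes:
m1 = {"layer.weight": [1.0, 2.0]}
m2 = {"layer.weight": [3.0, 4.0]}
print(linear_merge([m1, m2], [0.5, 0.5]))  # {'layer.weight': [2.0, 3.0]}
```

Real merging toolkits operate on tensors and offer more sophisticated methods than plain averaging, but the key property is visible even in this sketch: the merged model has exactly as many parameters as each input.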
A Transformative Century of Technological Progress, with Annie P.
For today's special episode (#800), I learned from my 94-year-old grandmother the tricks to living before electricity or running water... and how the wild tech transformation of the past century has impacted her.
In a bit more detail, in this episode, Annie covers:
What work and life were like growing up on a farm with no electricity or running water.
How education, communication, security, entertainment and food storage have evolved over her lifetime.
Similarities between geopolitical events in the 1930s and events transpiring today.
AGI Could Be Near: Dystopian and Utopian Implications, with Dr. Andrey Kurenkov
In today's episode, you can immerse yourself in one of my favorite on-air convos ever: with the exceptional Andrey Kurenkov, on how soon AGI could be realized and the potential utopian/dystopian implications.
Andrey:
Founded and co-hosts my favorite podcast, "Last Week in A.I.", a weekly program that recaps all of the A.I.-related news you need to know about.
Is an ML Scientist at Astrocade, an NVIDIA-backed generative AI platform that converts your natural-language prompt into a functional video game.
Holds a PhD in Computer Science from Stanford University, with research focused on robotics and reinforcement learning.
Today’s episode should be of interest to just about anyone!
In it, Andrey details:
The genesis of the wide range of A.I. publications and podcasts he’s founded.
What the future of text-to-video-game generative A.I. could look like.
Why “A.I. as a product” rarely works commercially, but what you can succeed at with A.I. instead.
Why A.I. robotics is suddenly progressing so rapidly.
In one of my favorite on-air conversations ever, how soon AGI could be realized and the potentially dystopian or utopian implications.
Claude 3.5 Sonnet: Frontier Capabilities & Slick New “Artifacts” UI
Anthropic has released its latest model, Claude 3.5 Sonnet. This might not seem like a big deal because it’s not a “whole number” release like Claude 3 was or Claude 4 eventually will be, but in fact it is: the model now appears to represent the state of the art for text-in/text-out generative LLMs, outcompeting other frontier models like OpenAI’s GPT-4o and Google’s Gemini.
Deep Learning Classics and Trends, with Dr. Rosanne Liu
Today's guest is the amazing Google DeepMind research scientist, Dr. Rosanne Liu!
Rosanne:
• Is a Research Scientist at Google DeepMind in California.
• Is Co-Founder and Executive Director of ML Collective, a non-profit that provides global ML research training and mentorship.
• Was a founding member of Uber AI Labs, where she served as a Senior Research Scientist.
• Has published deep-learning research in top academic venues such as NeurIPS, ICLR, ICML and Science, and her work has been covered in publications like WIRED and the MIT Tech Review.
• Holds a PhD in Computer Science from Northwestern University.
Today’s episode, particularly in the second half when we dig into Rosanne’s fascinating research, is relatively technical so will probably appeal most to hands-on practitioners like data scientists and ML engineers.
In today’s episode, Rosanne details:
• The problem she founded the ML Collective to solve.
• How her work on the “intrinsic dimension” of deep learning models inspired the now-standard LoRA approach to fine-tuning LLMs.
• The thorny problems with LLM evaluation benchmarks and how they might be solved.
• The pros and cons of curiosity- vs goal-driven ML research.
• The positive impacts of diversity, equity and inclusion in the ML community.
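The "intrinsic dimension" idea (that networks can often be adapted in a surprisingly low-dimensional subspace) is the intuition behind LoRA. A minimal sketch of the mechanics, with illustrative shapes rather than any particular model's:

```python
import numpy as np

# LoRA sketch: instead of updating a full d x d weight matrix W, learn a
# low-rank update B @ A with rank r << d, so the adapted weight is
# W + B @ A. Only B and A (2 * d * r values) are trained; W stays frozen.
rng = np.random.default_rng(0)
d, r = 512, 8

W = rng.standard_normal((d, d))         # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection (zero init)

x = rng.standard_normal(d)
y = W @ x + B @ (A @ x)                 # adapted forward pass

# With B initialized to zero, the adapter starts as an exact no-op:
assert np.allclose(y, W @ x)

# Trainable-parameter comparison: full update vs. low-rank update
print(d * d, 2 * d * r)  # 262144 8192
```

The zero initialization of `B` is why fine-tuning starts from exactly the pretrained model's behavior, and the `2*d*r` vs. `d*d` count is where the memory savings come from.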
Earth’s Coming Population Collapse and How AI Can Help, with Simon Kuestenmacher
Worried about overpopulation? Excessive immigration? In today's episode, demographer Simon Kuestenmacher reveals the data on why we should be more concerned about the opposite: the coming global-population collapse.
Simon:
• Is Co-Founder and Director of The Demographics Group, a firm that provides advice on demographic data to businesses and governments.
• Writes a regular column on demographics for The Australian, the antipodean country’s most widely-read newspaper.
• Holds a Master’s in Urban Geography from the University of Melbourne.
Today’s episode should be of great interest to anyone! In it, Simon details:
• Why demography is the closest thing we have to a crystal ball.
• Why the world is at a greater risk of underpopulation than overpopulation by humans this century.
• How, in less than a decade, developed nations that depend on migrants to prevent their populations from declining will run out of immigrants.
• How A.I. and automation may solve both the coming low-migration crisis and the later global underpopulation crisis.
• The implications of vastly life-extending healthcare breakthroughs.
• What you can do in your career to prepare for the coming demographic and technological shifts.
Fast-Evolving Data and AI Regulatory Frameworks, with Dr. Gina Guillaume-Joseph
A.I. regulatory frameworks are proliferating globally, protecting personal privacy while unlocking "dark data" for A.I.-model training. In today's episode, Dr. Gina Guillaume-Joseph is our expert guide to these A.I. regulations.
Gina:
Was, until recently, the CTO responsible for Government at Workday, aligning the HRtech giant with the U.S. federal government’s tech transformation strategy.
Prior to Workday, was Director of Technology at financial giant Capital One.
Earlier, spent 16 years supporting the federal government as a contractor with leading firms like Booz Allen Hamilton and The MITRE Corporation.
Now works as a fractional Chief Information Officer and as Adjunct Faculty at The George Washington University.
Holds a PhD in Systems Engineering from George Washington University and a Bachelor’s in Computer Science from Boston College.
Today’s episode should be of interest to just about anyone who would listen to this podcast because it focuses on the data and A.I. regulatory frameworks that will transform our industry.
In today’s episode, Gina details:
The “dark data conundrum”.
The most important data and A.I. regulations of recent years as well as those that are coming soon.
The pros and cons of being or hiring a fractional executive.
What system engineering is and why it’s an invaluable background for implementing large-scale A.I. projects.
Exciting (and Frightening!) Trends in Open-Source AI
Friday's short episode of my podcast features four data-science luminaries (Emily Zabor, James David Long, Drew Conway and Jared Lander) expounding on the most exciting open-source A.I. trends they see.
Bayesian Methods and Applications, with Alexandre Andorra
Is he a man or a country? Find out in today's episode with Alexandre Andorra — developer of the leading Bayesian library for Python, implementer of commercial Bayesian models and leading Bayesian educator/podcaster!
More on Alex:
• Co-Founder and Principal Data Scientist at PyMC Labs, a firm that develops PyMC (the leading Python library for Bayesian statistics) and consults with their clients to implement profit-increasing Bayesian models.
• Co-Founder and Instructor at an online learning platform called Intuitive Bayes that provides free Bayesian stats education.
• Creator and Host of an excellent podcast called Learning Bayesian Statistics.
Today’s episode will probably appeal most to hands-on practitioners like statisticians, data scientists and machine learning engineers, but the episode also serves as an introduction to Bayesian statistics for anyone who’d like to learn about this important, unique and powerful field.
In today’s episode, Alex details:
• What Bayesian statistics is.
• The situations where Bayesian stats can solve problems that no other approach can.
• Resources for learning Bayesian stats.
• The key Python libraries for implementing Bayesian models yourself.
• How Gaussian Processes can be incorporated into a Bayesian framework in order to allow for especially advanced and flexible models.
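For a taste of what Bayesian updating looks like (a generic textbook example, not one from the episode), here's the Beta-Binomial conjugate update in plain Python; for the kinds of real-world models Alex builds, you'd reach for PyMC:

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior on a success probability, observing
    `successes` and `failures` yields a Beta(alpha + successes,
    beta + failures) posterior -- the "hello world" of Bayesian stats.
    """
    return alpha + successes, beta + failures

# Flat Beta(1, 1) prior on a coin's heads probability; observe 7 heads, 3 tails.
a, b = beta_binomial_update(1, 1, successes=7, failures=3)
posterior_mean = a / (a + b)
print(a, b, posterior_mean)  # 8 4 0.666...
```

The posterior mean (8/12 ≈ 0.67) sits between the prior mean (0.5) and the observed frequency (0.7), pulled toward the data as evidence accumulates, which is the core Bayesian idea in miniature.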
In Case You Missed It in May 2024
We had another incredible set of guests in May on the SuperDataScience Podcast I host. ICYMI, today's episode highlights the most fascinating moments of my conversations with them.
Specifically, conversation highlights include:
1. Dr. Luis Serrano, a math- and ML-education YouTuber with 150k subscribers, explaining what language embeddings are, how they function, and how essential they are for running semantic search queries.
2. Sol Rashidi, serial C-suite data-role executive at Fortune 100s and bestselling author of "Your A.I. Survival Guide", on her approach to building data teams.
3. Co-founder of the MLOps Community, Demetrios Brinkmann, on the differences between ML Engineering and MLOps roles.
4. Navdeep Martin, an entrepreneur blending climate tech and generative A.I. in her latest startup, on opportunities where you can tackle climate change with technological innovation yourself.
Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert
In today's episode, the renowned RLHF thought-leader Dr. Nathan Lambert digs into the origins of RLHF, its role today in fine-tuning LLMs, emerging alternatives to RLHF... and how GenAI may democratize (human) education!
Nathan:
• Is a Research Scientist at the Allen Institute for AI (AI2) in Seattle, where he’s focused on fine-tuning Large Language Models (LLMs) based on human preferences as well as advocating for open-source AI.
• Is renowned for his technical newsletter on AI called "Interconnects".
• Previously helped build an RLHF (reinforcement learning from human feedback) research team at Hugging Face.
• Holds a PhD from the University of California, Berkeley, focused on reinforcement learning and robotics, during which he worked at both Meta AI and Google DeepMind.
Today’s episode will probably appeal most to hands-on practitioners like data scientists and machine learning engineers, but anyone who’d like to hear from a talented communicator who works at the cutting edge of AI research may learn a lot by tuning in.
In today’s episode, Nathan details:
• What RLHF is and how its roots can be traced back to ancient philosophy and modern economics.
• Why RLHF is the most popular technique for fine-tuning LLMs.
• Powerful alternatives to RLHF such as RLAIF (reinforcement learning from A.I. feedback) and distilled direct preference optimization (dDPO).
• Limitations of RLHF.
• Why he considers AI to often be more alchemy than science.
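For a taste of the math under the hood: RLHF reward models are commonly trained with a Bradley-Terry-style objective, where the probability that one response is preferred over another is the sigmoid of their reward difference. A minimal sketch with scalar rewards:

```python
import math

def preference_prob(r_chosen, r_rejected):
    """Bradley-Terry preference probability: sigmoid of the reward gap."""
    return 1.0 / (1.0 + math.exp(-(r_chosen - r_rejected)))

# Equal rewards -> a coin flip; a larger gap pushes the probability toward 1.
print(preference_prob(0.0, 0.0))   # 0.5
print(preference_prob(2.0, -1.0))  # ~0.95
```

Training maximizes this probability on human-labeled preference pairs, which is how "humans preferred A over B" gets converted into a scalar reward signal the RL step can optimize.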
Open-Source Libraries for Data Science at the New York R Conference
For today's short episode, I asked four data-science luminaries about their favorite open-source libraries. Hear what Emily Zabor, James David Long, Drew Conway and Jared Lander chose, live on stage!