As I mentioned in my year-end recap episode on December 29th, in 2022 this show enjoyed growth of 90% — that means the show’s audience nearly doubled relative to 2021, netting over 3.3 million podcast downloads and YouTube views in 2022. So, first off, thank you so much for listening to the show and for telling your friends and colleagues about the show — essentially all of our growth is organic so we’re highly dependent on your personal recommendations.
Anyway, given our near-doubling in 2022, I didn’t think the show could be growing much faster and thus I’ve been delightfully surprised that so far in 2023, growth has accelerated, with January being by far our biggest month ever and February not far behind.
With all of you new SuperDataScience listeners out there, I put together today’s episode to fill you in on the most listened-to episodes of 2022, giving you a data-backed set of outstanding episodes that you might want to go back and check out if you’re hankering for more content. For veteran listeners, this episode could be informative too: It’ll ensure that you didn’t miss any of the most popular episodes from last year that sound interesting to you.
One thing you might be wondering is why I’m airing a best-of-2022 episode in March. Well, there are two factors. First, internally at the SuperDataScience Podcast we use the 30-day mark after an episode’s release as our quantitative Key Performance Indicator as to how an episode’s been received by you. Second, I typically record episodes several weeks ahead of their release to give our production team plenty of time to clean them up real nice for you and to leave a few episodes in the pipeline in case I get ill.
All right, explanations out of the way, let’s dig into — quantitatively-speaking — the ten top-performing episodes of 2022.
The tenth-most popular episode featured Ann Emery providing a super-slick overview of how to influence others with your data. This was a tremendously practical episode packed with tips on effective data visualization, data presentation, and data storytelling. Evidently, a lot of folks were hankering for this practical info, so thanks, Ann!
The ninth-most popular episode starred the sage Sadie St. Lawrence, who — in the first episode of 2022 — predicted (remarkably accurately, it turns out) the data science trends for the year ahead. Special mention to Sadie for cracking the top ten two years in a row!
In eighth place is an episode with Kian Katanforoosh, a renowned Stanford lecturer and CEO of Workera, a fast-growing platform for data scientists, software engineers, and other technical practitioners to upskill systematically while staying in their job.
In seventh place is an episode featuring Dr. Noam Brown. His episode — on A.I. for crushing humans at poker — is close to my heart because it was the first one ever recorded with a live audience. I was nervous about it for months and so it’s a huge relief to me that the inaugural live recording was well-received by SuperDataScience listeners. If you’re looking for more info on game-playing A.I., the very next episode of this show (#663, which will be released on Tuesday) features Dr. Brown’s colleague Alexander Holden Miller detailing Meta’s astounding new A.I. that uses natural language to negotiate and build trust with humans in order to excel at an extremely complex board game. It’s an extraordinary achievement — in my view, a much bigger deal than ChatGPT — and so I highly recommend checking that forthcoming episode out.
Oops, I’m supposed to be focusing on the past! Back to the countdown: The sixth most listened-to episode in 2022 was with the renowned and exceptionally wise technologist Erik Bernhardsson covering tools for deploying data models efficiently into production.
In fifth was YouTube superstar Tina Huang who opened up about her typical workday at one of the world's largest tech companies, her strategies for efficient learning, and how best to prepare for a career in data science from scratch.
In fourth was another YouTube sensation, Shashank Kalanithi, who detailed how to get started in a data analytics career and how you can grow from that first data analyst role.
Spotting a clear trend here, our bronze medal goes to yet another YouTuber, this time the brilliant mind behind the mega-popular Python Simplified channel, Ms. Mariya Sha. In her wide-ranging and deeply philosophical appearance, amongst many topics Mariya covered how you can make learning any new machine learning concept much simpler.
In second place — our silver medalist — was New York University professor Jennifer Hill. A personal icon of mine — whose statistical modeling textbook I fell in love with when I was starting my PhD 16 years ago — Prof. Hill’s episode dug into how to design experiments in order to confidently infer causality from the results as well as her favorite Bayesian methods for analyzing causal direction. Hmm, 2021’s most popular episode also featured Bayesian stats so it seems like I should be booking more Bayesian experts in 2023!
Finally… are you ready for it? In first place, the most popular SuperDataScience episode of 2022 featured the author and educator Matt Harrison on effectively programming with the ubiquitous Pandas library for data processing. Matt’s highly practical episode (#557) has already racked up over 60,000 listens, with no doubt tens of thousands more still to come.
What’s even more impressive about Matt’s stand-out episode, as well as the other top-ten episodes from the first few months of last year (those with Sadie, Tina, and Noam), is that they ranked so highly despite a critical flaw in my simple popularity-assessing methodology. Because the SuperDataScience podcast became significantly more popular each quarter of last year, guests whose episodes aired near the end of the year had an advantage over those whose episodes aired earlier. As I mentioned in my top-ten-episodes-of-2021 episode a year ago, it would have been more sophisticated to, say, fit a locally estimated scatterplot smoothing (LOESS) curve to the data and then rank the episodes by how far they sit above that curve, i.e., by the largest positive residuals. I considered that again this year, but it adds a bit of voodoo and controversy to my results (e.g., did I choose the correct regression method? What if I picked a different one?), so for yet another year I’m considering my easy-to-understand approach of taking listens at the 30-day mark to be sufficient.
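For anyone curious what that residual-based ranking might look like, here is a minimal sketch in plain Python. It is not the method used for this countdown, and everything in it is an assumption made for illustration: the function names `loess_fit` and `rank_by_residual`, the `frac` bandwidth setting, and above all the numbers, which are synthetic and not real listen counts. It fits a simple local linear smoother with tricube weights (the core of LOESS, without the robustness iterations a library implementation such as statsmodels' `lowess` offers) and ranks episodes by how far they sit above the fitted curve.

```python
import math


def loess_fit(xs, ys, frac=0.5):
    """Return LOESS-style fitted values (local linear fit, tricube weights) at each x."""
    n = len(xs)
    k = min(n, max(2, math.ceil(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        # Bandwidth: distance to the k-th nearest point (fall back to 1.0 if it is zero).
        d_max = sorted(dists)[k - 1] or 1.0
        # Tricube weights: nearby points count most; points at or beyond d_max count zero.
        ws = [(1 - min(d / d_max, 1.0) ** 3) ** 3 for d in dists]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0  # degenerate case: use the weighted mean
        fitted.append(my + slope * (x0 - mx))
    return fitted


def rank_by_residual(xs, ys, frac=0.5):
    """Return episode indices ordered from largest to smallest residual above the curve."""
    fitted = loess_fit(xs, ys, frac)
    residuals = [y - f for y, f in zip(ys, fitted)]
    return sorted(range(len(xs)), key=lambda i: residuals[i], reverse=True)


# Synthetic data (NOT real podcast figures): 30-day listens rise steadily with
# release week, and the episode released in week 2 gets an extra bump.
weeks = list(range(12))
listens = [1000 + 100 * w for w in weeks]
listens[2] += 450

raw_top = max(range(len(listens)), key=lambda i: listens[i])
ranked = rank_by_residual(weeks, listens)
print("Top by raw 30-day listens: week", weeks[raw_top])
print("Top by residual above trend: week", weeks[ranked[0]])
```

On this made-up series, the raw count favours the most recent episode (week 11), while the residual ranking puts the week-2 episode first because it outperformed the trend at the time it aired. The result does depend on the choice of `frac`, which is exactly the kind of judgment call that makes the approach harder to explain than a plain 30-day count.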
Speaking of the show becoming more popular over time, to give you a sense of how much the show has taken off in recent months, the first five episodes of 2023 would all have cracked 2022’s top ten. Q1 of 2023 could be the first time we hit one million listens in a single quarter. Not only are we the most listened-to show in the data science industry, we’re also in the top 15 technology podcasts and are tantalizingly close to breaking into the top 1000 podcasts worldwide across any category. I think this is pretty impressive given how narrow a niche data science is, so thank you so much for listening and spreading the word about our show!
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.