This week's guest is Doug Eisenstein, an exceptionally clear and content-rich communicator. He fills us in on the complexity of engineering a coherent source of truth for financial models, integrating hundreds of data sources.
Topics covered in the episode include:
• A breakdown of the primary financial sectors and departments
• Why data source integration for finance is wildly complicated
• Specific data engineering approaches that resolve these issues including entity resolution, knowledge graph mapping and tri-temporality.
Twenty years ago, Doug founded the consulting firm Advanti, which has since become a critical provider of solutions to complex data engineering problems faced by some of the world's largest banks and asset managers, including Morgan Stanley, Bank of America, Citibank, and State Street.
Listen or watch here.
Filtering by Category: SuperDataScience
Algorithm Aversion
Setting Yourself Apart in Data Science Interviews
For this week's guest episode, I interrogated Andrew Jones on his data science interview secrets. If you want to improve your interview performance — especially if you're in a data-related career — this episode's for you.
Andrew has held a number of senior data roles over the past decade, including at the tech giant Amazon. In those roles, Andrew interviewed hundreds upon hundreds of data scientists, leading him to create his Data Science Infinity educational program, a curriculum that provides you with the hard and soft skills you need to set yourself apart from other data scientists during the interview process.
Listen or watch here.
Continuous Calendars
Extremely practical post for you today! It's on the Continuous Calendar, which in my opinion is vastly superior to the standard monthly calendar in every imaginable respect. Click through for more detail.
Performance Marketing Analytics
My guest this week is Kris Tait, who fills us in on how data and machine learning have transformed — and will continue to transform — marketing, enabling even small firms to effectively target customers and grow their revenue.
In this episode of the SuperDataScience show, we cover:
• What performance marketing is
• The rapidly shifting digital marketing ecosystem, as well as how data and ML can mitigate the risks associated with these changes
• The sweet spot for augmenting human marketers' skills with machines
• How any firm should define metrics to maximize return on marketing investment, thereby ensuring broader commercial success
• The most useful modern data science tools for global digital marketing
Kris is the managing director for the US at Croud - Performance Marketing Agency of the Year, an innovative marketing agency that is driven by data analytics and machine learning algorithms.
Listen or watch here.
Top Resume Tips
In recent weeks, I've received several messages from folks struggling to get callbacks for Data Scientist interviews. In reviewing their résumés, I realized there are five specific tips that I highly recommend adhering to.
You can listen or watch here.
Knowledge Graphs
In this week's guest episode, wildly intelligent and meticulously communicative Maureen Teyssier, Ph.D. explains what Knowledge Graphs are, why they're so powerful, and how to grow a flourishing data science team.
In more detail, in today’s episode we cover:
• The theory and applications of Knowledge Graphs, a cool and powerful data type at the heart of much of Maureen’s work at Reonomy
• The data science techniques that Reonomy use to flow data through extremely high-volume pipelines, enabling them to efficiently apply models to their massive data sets
• What Maureen looks for in the data scientists that she hires and the tools and approaches she leverages in order to grow a highly effective data science team
• The differences between data scientists, data analysts, data engineers, and machine learning engineers.
• Maureen’s fascinating academic work in which she used gigantic supercomputers to simulate solar systems and galaxies
Maureen is Chief Data Scientist at Reonomy, a very well-funded New York start-up — they’ve raised over 100 million dollars — that is transforming the world of commercial real estate with data and data science. Prior to working in industry, Maureen was an academic working in the field of computational astrophysics; she obtained her PhD from Columbia University in the City of New York and then carried out research at Rutgers University in New Jersey.
Listen here.
Five Keys to Success
I've recently been able to achieve markedly better results than ever before across my personal and professional lives. For Five-Minute Friday, I reflect on five keys to success that may allow achievement of many complex, long-term goals.
You can listen or watch here.
How to Thrive as an Early-Career Data Scientist
Getting started in data science? Today's episode is for you! Sidney Arcidiacono is absolutely crushing her first year in the field; we discuss the options for getting started in the field and top tips for early-career success.
Trained as a phlebotomist (blood-sample collection), Sidney was inspired by the potential for machine learning to revolutionize healthcare, so she jumped feet first into a full-time computer science degree at Make School, specializing in the data science track. From no familiarity with code or models just a year ago, Sidney's immersion has paid off: she's now fluent in the modern data science software stack and landed a summer data science internship at GreenLight Biosciences, Inc., an RNA therapeutics firm (RNA being the technology behind the Pfizer/BioNTech and Moderna vaccines).
Sidney is terrifically sharp and engaging; I think you'll enjoy hearing from her as much as I did during filming.
Watch or listen here.
Peer-Driven Learning
"Peer-driven" learning — where you are formally taught by your coworkers — not only results in team members learning key new skills, but can have added benefits like team bonding, confidence, and innovation. Something to try!
Today's episode is directly inspired by a LinkedIn post by Laura Rodriguez. She tagged me in the post, citing a SuperDataScience episode on communication and relating it to her workplace at ForwardKeys. Thank you, Laura!
The 20% of Analytics Driving 80% of ROI
Today’s episode is with freakin' David Langer, people!! (So obviously it's brilliant, witty, and full of laughs.) He fills us in on the most powerful 20% of analytics — the analytics that drive 80% of companies’ return on investment.
Publishing under his Dave on Data brand, Dave's YouTube channel is top-notch, with several videos that have over a million views (and the thumbnails are hilarious; check 'em out). He is an exceptionally accomplished data scientist and software engineer, including spending nearly a decade at Microsoft's Global HQ, where his titles included principal software architect, principal data scientist, and director of analytics.
Topics in the episode include:
• Surprisingly powerful modeling approaches in spreadsheet tools like Excel
• The SQL databases we'll need if the data sets we're working with are too big for spreadsheets
• Why R programming is easy and should be our default language choice for moderate to advanced statistical analysis
• How companies can maximize value from machine learning
Listen or watch here.
The Machine Learning House
In last week’s Five-Minute Friday, I discussed how, in the data science field, the learning never stops. But there’s one big counterpoint: The foundational subjects that underlie the field barely change at all, decade after decade.
These subjects — linear algebra, calculus, probability, statistics, data structures, and algorithms — build a strong foundation for your “Machine Learning House”. Today's Five-Minute Friday articulates my perspective that investing time in studying these foundational subjects will reap great dividends throughout your data science career.
You can listen or watch here.
Machine Learning at NVIDIA
This week's guest is absolute rockstar Dr. Anima Anandkumar, who's both a professor at Caltech and director of ML research at NVIDIA. The episode is exceptionally content-rich but also lots of fun; Anima was a joy to film with.
In the episode, Anima fills us in on:
• The cutting-edge interdisciplinary research she carries out (applying highly optimized mathematical operations to allow state-of-the-art ML models to be executed on NVIDIA's state-of-the-art GPUs)
• How this blending of leading software and leading hardware enables world-changing innovations across disparate fields, from healthcare to robotics
• What it's like in the workweek of a researcher bridging the academic and industrial realms
• The open-source data science tools and techniques that she most highly recommends
Listen or watch here.
99 Days to Your First Data Science Job
He's BAAAAACK! Kirill Eremenko is the GUEST on the SuperDataScience show for the first time. In this episode, he details his exceptional new learning pathway that enables folks to land their first data science job in 99 days.
We also cover:
• What Kirill's been up to; life philosophies he's honed
• 5 myths holding people back from starting a data science career
• 5 items you need to land a data science job
Kirill created the SuperDataScience podcast in 2016 and hosted the program (over 400 episodes!) until passing the torch to yours truly on January 1st.
Kirill also founded the SuperDataScience company and is the firm’s CEO today. SuperDataScience.com, the namesake of this podcast, is a comprehensive online education platform for data science and related data specializations. Through SuperDataScience.com and his Udemy courses, Kirill has taught well over a million students worldwide, launching countless data science careers.
You can listen or watch here.
Learning Deep Learning Together
I'm joined today by Prof. Konrad Körding of the University of Pennsylvania, a world-leading researcher on links between biological neuroscience and A.I. He also leads Neuromatch Academy, a super cool group-based deep learning school.
Neuromatch is an innovative, hands-on program for learning deep learning that matches students with similar interests, languages, and time zones into tight-knit study teams. This matching approach is wildly successful, with 86% of students completing the program, compared to a 10% industry average.
In the first half of the episode, we go over the details of the Neuromatch curriculum, providing you with a survey of all of the state-of-the-art deep learning approaches. The second half is a mind-blowing exploration of the limits of artificial neural networks today and how incorporating more biological neuroscience may enable machines to develop artificial general intelligence (AGI) — i.e., machines that learn as well as humans do.
Listen or watch here.
The History of Data
Last month, I thought I was taking a risk by doing an episode on the History of Algebra, but it was an unusually popular episode! To follow up, today's Five-Minute Friday is on the four-billion-year History of Data — hope you enjoy it 😁
You can watch or listen here.
High-Impact Data Science Made Easy
Today, the wise Noah Gift weighs pros and cons of data science learning options (university degrees vs online certifications; full-time vs on-the-job) as well as how MLOps can quickly make you exponentially more impactful.
Noah has held countless technical leadership roles at organizations ranging from tech start-ups he founded to prominent institutions like ABC, Caltech, and AT&T. Today, Noah is the founder of a consultancy called Pragmatic AI Labs, and he devises and teaches data science curricula at several of the most prestigious American universities, including Duke, Northwestern, and Berkeley. He has written eight books, including the bestselling Python for DevOps and the forthcoming Practical MLOps.
On top of all that incredible background, Noah has rich, well-formed life philosophies, which we dig into in detail. I learned a ton from him during this episode, and have been thinking about concepts we discussed time and again since filming. I highly recommend checking the episode out!
You can listen or watch here.
Good vs. Great Data Scientists
What separates a good data scientist from a great one? I asked this on Twitter recently and received hundreds of replies — some witty, others very thoughtful. For today's Five-Minute Friday episode, I review and summarize the thread.
The Tweet has had a crazy 7k engagements on 220k impressions so far — evidently it's a topic that lots of people have an opinion on. I highlighted some of my favorite individual replies in the video, including those from Martin Goodson, Chris Albon, Brandon Rohrer, Chelsea Parlett-Pelleriti, and Isabella Ghement.
What do you think? Let me know if I missed anything important!
You can listen to or watch my video summary here, or you can click through for the blog-post version.
The full Twitter thread is here if you'd like to dig through the entirety of the collective wisdom.
Analytics for Commercial and Personal Success
I believe the easiest way to attain success — in personal or professional endeavors alike — is to rigorously track and analyze the right data. Konrad Kopczynski is a master on this topic and he joins me for this week's guest episode.
Whether you're developing machine learning models, maximizing your company's profitability, or tackling a full-length Ironman triathlon, if you're disciplined about data collection, tracking, and reflection, you can iterate, improve, and achieve your dream state. This is a central tenet of my life and much of my ideology on it has been influenced by my near-decade-long friendship with Konrad.
Konrad is the founder and managing partner of impakt Advisors, a consultancy that specializes in harnessing data for, well, impact. They structure the various data sources into thoughtfully constructed data warehouses and then layer on top analytics, data-science models, and visualizations to enable real-time reports, dashboards, and predictions across all the key areas of a business, including digital marketing, customer retention, behavioral segmentation, and profit margin.
Listen or watch here.
A.I. vs Machine Learning vs Deep Learning
"A.I.", "Machine Learning", and "Deep Learning" are terms that are often thrown around interchangeably. They shouldn't be! For Five-Minute Friday this week, I define each of the three terms in straightforward language.
You can watch or listen to the episode here, or read my blog-post version.