Today's episode is all about Polars, the hot Python library that offers up to 100x speedups on DataFrame operations relative to pandas. Marco Gorelli, a core Polars developer, is our gifted guide.
Marco is a tremendously talented communicator of complex technical topics, making him the perfect guest for this highly technical episode. He:
• Is a core developer of the popular Python libraries pandas and Polars.
• Is the creator of the Narwhals library.
• Has spoken at several major Python conferences (such as PyData), taught Polars professionally, and written the first complete Polars plugins tutorial.
• Currently works as a Senior Software Engineer at Quansight Labs.
• Previously worked as a data scientist and was one of the prize winners (from among more than 100,000 entrants!) of the M6 forecasting competition.
• Holds a Master’s in Mathematics and the Foundations of Computer Science from the University of Oxford.
Today’s episode will appeal primarily to hands-on technical folks like data scientists, ML engineers and software developers.
In today’s episode, Marco details:
• What Polars, the hot, fast-growing Python library for working with DataFrames, is (it already has 65 million downloads and 28k GitHub stars).
• How Polars offers up to 100x speedups relative to pandas on DataFrame operations.
• How the lightweight, dependency-free Narwhals package he created allows for easy compatibility between different DataFrame libraries such as Polars and pandas.
• How he got addicted to open-source development.
• The simple trick he used to become a prize winner in super-popular forecasting competitions.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.