Room: Room 111
April 5
11:00–11:25
In 2023, we saw several libraries - which had previously only supported pandas - add support for other dataframe libraries such as Polars, Modin, and cuDF.
This talk could be of interest to anyone working with dataframes. In particular, those maintaining or contributing to libraries which use dataframes will learn about how they can best support multiple dataframe libraries.
Some interest in dataframes and dataframe-consuming libraries
In 2023, we saw several libraries - which had previously only supported pandas - add support for other dataframe libraries, with a particular emphasis on Polars. They typically did this in one of three ways:
to_pandas or via the Dataframe Interchange Protocol)

These all represented a quality-of-life improvement for Polars users. But were there drawbacks?
So what can be done instead? A solution I'll present is to use Narwhals, which guarantees that your code will work the same way across dataframe libraries - even ones that don't exist yet - all using familiar Polars syntax.
The format will roughly be:
to_pandas?

By the end of the talk, attendees will have learned about the dataframe ecosystem, and those involved with dataframe-consuming libraries will know all they need in order to effectively support multiple dataframe libraries. Library maintainers and contributors will get the most out of the talk, but anyone regularly using dataframes will also learn a lot about the tools they use.
Marco is a core dev of pandas and Polars and works at Quansight Labs as a Senior Software Engineer. He also consults and trains clients professionally on Polars. He wrote the first Polars Plugins Tutorial and has taught Polars Plugins to clients.
He has a background in Mathematics and holds an MSc from the University of Oxford, and was one of the prize winners in the M6 Forecasting Competition (2nd place overall Q1).