This book was inspired by a series of blog posts recommending a Pythonic data science stack that mimics the comfort and familiarity of R’s tidyverse tools. In particular, this book explores the Polars library, a rising star in the DataFrame space (alongside established players like Pandas, Dask, Modin, Ray, and Vaex).
In a nutshell, Polars is defined as a query engine with a DataFrame front-end. It offers a rich set of intuitive functions and principled workflows for data manipulation and analysis. Designed from the ground up with performance in mind, Polars is also noted for its lightning-fast execution speed. After a quick rise in popularity, Polars has reached the milestone of version 1.0.
The majority of this book contains structured examples of data wrangling tasks that demonstrate idiomatic Polars code alongside the related dplyr/tidyr code for comparison. Some examples also include discussion of API choices.
This work builds upon the organizational style of https://ddotta.github.io/cookbook-rpolars/ but the content and examples are tailored to the Polars library in Python. Credit goes to Damien Dotta; all remaining errors are mine.
Contributing
Feel free to open an issue if you notice any problems with this book. It’s free and open source, and your feedback is valuable to me.