Data Day - Apr 9
Talk
confirmed
Python, Rust and Arrow for data processing
Speaker
Paulius Venclovas
Paulius Venclovas is a Senior Data Engineer at "Flo Health", where he focuses on delivering data processing and governance solutions on the Databricks platform. Previously he worked at the financial startup "Curve" and the data consultancy "Beyond Analysis". Paulius also holds a Master’s degree in Computing (AI/ML) from Imperial College London.
Abstract
Python struggles with heavy data loads. Rust offers speed, and PyO3 makes bridging the two seamless. This talk shows how to build a shared Rust core to avoid code duplication. I will also cover using Apache Arrow for zero-copy data sharing, which removes serialization costs entirely. Discover how this stack enables high-performance data processing in Python and PySpark.