Confirmed Talks

Web day

Keynote | Daniel Roy Greenfeld
Keynote | Tom Christie
How we Develop and Maintain a Modern Python Service at Mozilla: Merino as Example | Taddes (Tadas) Korris | Details
At Mozilla, we maintain services that are used by millions of users daily. These services are the backbone of expanding Firefox and providing users with useful features, all while protecting privacy. Learn how one service, Merino, was planned to meet user needs at scale. This service provides users with search recommendations and suggestions from local and remote providers. Get some insights into how we develop, deploy, monitor, and maintain this modern Python service.
Scaling, Refactoring and fixing a Django MVP for Production | Piotr Gryko | Details
So you've built an AI startup using async Django: the MVP looks great and your handful of users love it. Now you need to clean up the MVP so you can scale. This is part two of building an AI startup with async Django. We talk about moving from ChromaDB to OpenSearch/Elasticsearch, moving document processing steps to Celery/RabbitMQ, self-hosting via vLLM, migrating from Django templates to a React app, and better monitoring and logging.
Cracking the Code: Decoding Anti-Bot Systems! | Fabien Vauchelles | Details
Join us for a presentation where we reveal the mysteries of anti-bot systems guarding websites, APIs, and mobile applications! 🌐📲
🛠️ What's in store:
1/ Exploring the Defence Layers
2/ Anti-Bot Reputation Score Demystified
3/ Strategies for Evasion
After this talk, you'll emerge well-equipped with the knowledge to navigate and comprehend the nuances of these protective measures! 🚀🔒
µDjango 2.0, an asynchronous microservices technique. | Maxim Danilov | Details
A standard Django project involves working with multiple files and folders from the start. Let's see how working with a Django project changes when we have only one file. This solution automatically transforms Django into a microservice-oriented async framework with a "batteries included" philosophy.
Data Harvest: Unlocking Insights with Web Scraping | Yuliia Barabash | Details
In today's data-driven world, knowing how to gather and analyze information is more critical than ever. Join us for a compact session on using Python to crawl the web and solve real-time problems. We'll cover the basics, and then dive into a practical example of collecting data from the internet.
FastDjango: Conjuring Powerful APIs with the Sorcery of Django Ninja | Julius Boakye | Workshop | Details
Dive into the world of modern web development by fusing the power of Django and FastAPI. This talk will guide you through the process of building robust, scalable, and efficient APIs using Django Ninja, a web framework that combines Django's reliability and FastAPI's speed. We'll explore how to leverage Django's ORM and user authentication while enjoying FastAPI's performance and type checking. Whether you're a Django veteran looking to supercharge your APIs or a beginner eager to learn cutting-edge techniq
How to Utilize Machine Learning for Better Web Scraping | Tadas Gedgaudas | Details
Join Tadas Gedgaudas in an enlightening talk on revolutionizing web scraping with machine learning. Uncover how ChatGPT can adapt to website layout changes, making scraping more efficient and reducing maintenance needs. Delve into data structurization with ML, the seamless integration of ChatGPT for parsing, and its practical impact for developers.
Python behind the scenes of Danske Bank's Cloud Migration at Scale | Romualdas | Details
A glance behind the curtain at how the execution phase of Danske Bank's lift-and-shift journey to the public cloud is going. Let's dive deep into some of the technical challenges and the snek (Python) stack standing right at the front, helping orchestrate cloud migration at scale.

Python day

Keynote | Arjan Codes
Keynote | Robert Smallshire
Watsonx: A GenAI platform that's built for business | Robert Dzisevič | Details
The hype around GenAI keeps rising. Nowadays, almost every company wants to adopt this technology in its business, but successfully delivering a GenAI project takes much more than just figuring out what to ask ChatGPT. During the presentation, I'll introduce you to an AI platform that allows users to deliver GenAI projects with confidence.
Using Rust & PyO3 to make Pydantic v2 even faster | David Hewitt | Details
In this talk we'll review some of the changes we've made to Pydantic since 2.0 to push performance even further. This is possible largely because Pydantic chose to implement the core in Rust. We'll focus on two main topics:
- Come learn about optimizations Pydantic has been working on since 2.0
- Come see our draft ideas for how Pydantic v3 could be even faster than v2
You should leave this talk excited about performance wins for your apps using Pydantic and inspired to try Rust in your own code.
Unleashing Python's potential with MAX Platform | Antanas Daujotis | Details
The talk will address Python's limitations in AI and how MAX Platform can overcome them by offering superior speed, seamless Python code execution, and hardware compatibility. It aims to inspire Pythonistas to explore MAX Platform and unlock new possibilities in AI development and beyond.
Object-Oriented Programming the way it should be | Laimonas Sutkus | Workshop | Details
While Functional Programming gains traction, I'll showcase how OOP, done right, yields clean, efficient code. Explore a fresh perspective, gain insights, and reshape your coding approach.
The role of Rust, Zig and C++ in the Python ecosystem | Cristián Maureira-Fredes | Details
Python's ecosystem is one of the best out there, and this is mainly due to its community and what lies inside its core: a C API. Being partially written in C enables Python to interact with many other languages, some of which you may know, like C++, Rust or Zig. But how does it work? In this talk, you will learn how Python can embrace the power and performance of other languages in order to expose modules that improve the whole ecosystem.
Grokking Event-Driven Web App with Python | Tung Hoang | Details
Crafting scalable event-driven applications using Python can be a tricky endeavor, requiring careful consideration of various factors, from understanding synchronous and asynchronous network calls to tackling the Python Global Interpreter Lock (GIL) bottleneck and implementing robust auto-scaling strategies. This talk delves into advanced techniques and concepts for designing and implementing scalable event-driven applications with Python, empowering you to overcome these challenges effectively.
503 days working full-time on FOSS: lessons learned | Rodrigo Girão Serrão | Details
I've been working full-time on a Python FOSS project for 503 days, so what did I learn? Am I a better (Python) programmer? Better teammate? Better person? In this talk I will share some lessons I learned over the course of these 503 days:
- how to get a tech job in this day & age
- how to put your ego aside
- how to deal with mistakes
- how to interact with users & contributors online
- how it feels to collaborate on a large codebase
As for the first 3 questions... Ask my colleagues!
Deadcode - a tool to find and fix dead (unused) Python code | Albertas Gimbutas | Details
This talk introduces deadcode, a newly developed Python package that detects and automatically fixes unused Python code. We will look at real-world scenarios in which deadcode saves development time, present the main features and options of the package, and show why this tool is superior to vulture. Some implementation details and complexities will also be discussed.
Analyzing STDF production test data in the silicon manufacturing industry using construct | Franz Haas | Details
The amount of data and the complexity of the queries are not particularly large in this industry. The challenge comes from using the STDF format, a binary file format with roots in the 1980s. A method to make this data source available to modern data analysis tools (Jupyter/Streamlit) using the construct library will be discussed. The focus is on how the data can be collected, converted and made available in a fast and efficient way, using both PyPy and CPython.
Designing for tomorrow's programming workflows | Matthew Honnibal | Details
New tools are changing how people program, and even _who_ programs. Type hints, modern editor support and, more recently, AI-powered tools like GitHub Copilot and ChatGPT are truly transforming our workflows and improving developer productivity. But what does this mean for how we should be writing and designing our APIs and libraries?
Pointers? In My Python? | Eli Holderness | Details
Learn about Python's memory handling, including:
- what pointers are, and why they matter
- what object IDs are, and what they mean
- how CPython can tell when you're done with an object, and what happens next
No C knowledge required!
Lessons Learned From Maintaining SDK in Python for Three Years | Adam Furmanek | Details
Let’s see how to build an SDK that works for years and is used by other developers. We’ll learn which patterns actually work, how mistakes made in the early stage affect the software years later, and how to make sure we don’t break users’ code when introducing changes.
Let’s create a Python Debugger together | Johannes Bechberger | Workshop | Details
Debuggers are indispensable tools for all Python developers, empowering them to conquer bugs and unravel complex systems. Let's create our own.
Python package creation using bleeding edge toolset | Albertas Gimbutas | Workshop | Details
We will create a new Python package from scratch using best practices and deploy it to pypi.org. We will also learn the benefits of bleeding-edge tools for code linting, unit testing and deployment, and how to use them. Let's make the Python ecosystem even more awesome!
Deep Dive into Asynchronous SQLAlchemy - Transactions and Connections | Damian Wysocki | Details
SQLAlchemy is one of the most popular ORM libraries in Python. In this talk I will present caveats and gotchas that other Pythonistas may run into while writing an asynchronous backend application using SQLAlchemy as an ORM. We will mainly focus on how SQLAlchemy handles transactions and connections to the database, and what issues we may face because of it.

Data day

Keynote | Ritchie Vink
Keynote | Ines Montani
functime: a next generation ML forecasting library powered by Polars | Luca Baggi | Details
Polars conquered dataframes, and now it is coming for machine learning! With Polars-powered feature extraction and a best-in-class set of diagnostic tools, functime enables **forecasting thousands of time series all at once, from the comfort of your laptop**. Though forecasting practitioners are the intended audience, the talk has something for every data scientist. With Polars, we can **push the boundary of what "reasonable scale" means - and build a new generation of tools for machine learning**.
Introduction to Polars DataFrames - how to supercharge your data workflows | Marco Gorelli | Details
Polars is the new dataframe on the block taking the world by storm. You'll learn:
- what Polars is, and what it can do for you
- Polars basics and core concepts (including expressions and lazy computation)
- how to work with different datatypes, and how the List datatype gives you superpowers
- interoperability with other tools: NumPy, SciPy, Arrow, pandas, Numba
- migrating from pandas
What better way to learn it than by attending a PyCon Lithuania tutorial, delivered by a Polars core dev?
🧼 From GPU-poor to data-rich: data quality practices for LLM fine-tuning | Gabriel Martín Blázquez, David Berenstein | Details
If you are GPU-poor, you need to become data-rich. I will give an overview of what we learned from looking at Alpaca, LIMA, Dolly, UltraFeedback and Zephyr, and how we applied that, by becoming data-rich, to fine-tuning the state-of-the-art open source LLMs Notus and Notux.
Making an e-shop search bar your friend with Pinecone's hybrid search | Martynas Venckus | Details
Fast and accurate search results are a crucial component of any e-shop and can make the difference between high user satisfaction and user frustration. With recent advancements in vector search technologies, enhanced search systems have become more efficient, leading to better user experiences and improved conversion rates. In this talk, we’ll explore how to implement a hybrid search system for a non-English e-commerce site using Pinecone, a high-performance vector search engine.
Speed up open source LLM-serving with llama-cpp-python | Isaac Chung | Details
Large language models (LLMs) often require huge compute resources to serve. This is a common challenge for those who want to avoid sharing their data with cloud API providers, or to deploy their stack in air-gapped environments. We will take a look at how the open source llama-cpp-python library opens the door to lower hardware requirements and simplifies deployment significantly.
DataFrame interoperability - what's been achieved, and what comes next? | Marco Gorelli | Details
In 2023, we saw several libraries that had previously only supported pandas add support for other dataframe libraries such as Polars, Modin, and cuDF.
- How did they do it?
- Are there any drawbacks to how they did it?
- What comes next, and what other solutions are there?
This talk could be of interest to anyone working with dataframes. In particular, those maintaining or contributing to libraries which use dataframes will learn how they can best support multiple dataframe libraries.
Understanding ChatGPT: Embeddings, Transformers and Reinforcement Learning with Human Feedback | Luca Baggi | Details
Write-Audit-Publish Pattern in Modern Data Pipelines | Tomas Peluritis | Details
Data is the new oil, and like oil, it can leak and poison the surrounding environment. What happens if you pollute one of the datasets used in dashboards that decision makers rely on? In this talk, I will explain the reemergence of the Write-Audit-Publish pattern and how you can achieve it using Apache Iceberg and Apache Spark.
Transcend the Knowledge Barriers in RAG: Setup, Chat State, and More | Isaac Chung | Details
Developer tools power many LLM-based chat and Retrieval Augmented Generation applications today. However, there is a non-trivial knowledge barrier for entrants that could hinder developer experience. Our discussion intends to offer actionable insights into building and maintaining generative AI solutions in a secure and economical way, thereby improving the developer experience in this Generative AI wave.
The pragmatic Pythonic data engineer | Robson Junior | Details
Learn to make practical decisions in data engineering with Python's vast ecosystem. Avoid blindly following market guidelines and consider the reality of your situation for better performance and architecture
Generative AI in Lithuanian language | Vytautas Bielinskas | Details
A presentation about how we (a few local NLP enthusiasts) trained a language Transformer to generate meaningful text in the Lithuanian language. Everything was based on volunteer work with a strong R&D flavor. During this presentation I will cover not only what kind of data we used to train this model and what results we got, but also other initiatives we are driving in the NLP field. I will try to make the presentation both technical and interactive.
Transforming Data Insights: Creating Dynamic Animated Stories with Python and ipyvizzu-story | Peter Vidos | Workshop | Details
Unlocking the value of data often hinges on the ability to communicate insights effectively to non-technical audiences. What if you could go beyond static charts and captivate your audience with animated data stories? Join us in this workshop to discover the power of animated storytelling using ipyvizzu-story, an innovative open-source presentation tool designed to work seamlessly within Jupyter Notebook and similar platforms.
[MLOps] CI/CD in the age of Machine Learning | Emmanuel-Lin Toulemonde | Details
Machine learning models are a new artifact to build, version and deploy. Explore their impact on your architecture.
Customizing LLMs: A Guide to Fine-Tuning Open Source Models | Maria Jose Molina Contreras | Details
In today's world, large language models (LLMs) are revolutionizing how we interact with technology, allowing us to have conversations, organize data, and write text with minimal human effort. However, when using an LLM, you have likely received incorrect or insufficiently specialized answers. For this reason, fine-tuning models that have been pre-trained on a large corpus of data is crucial to: (1) improve the quality of responses, and (2) tune the model to a specific domain.
RAG on KDTree | Jan Bartnitsky | Details
How to use KDTree from the sklearn library to prototype RAG (Retrieval-Augmented Generation) applications.
Revenue based scoring in `GridSearchCV`: a case for the new metadata routing in scikit-learn | Adrin Jalali | Details
Passing metadata such as `sample_weight` and `groups` through a scikit-learn `cross_validate`, `GridSearchCV`, or a `Pipeline` to the right estimators, scorers, and CV splitters has been either cumbersome, hacky, or impossible. The new metadata routing mechanism in scikit-learn enables you to pass metadata through these objects. As a use-case, we study how you can implement a revenue sensitive scoring while doing a hyperparameter search within a `GridSearchCV` object.
A 101 in time series analytics with Apache Arrow, Pandas and Parquet | Zoe Steinkamp | Workshop | Details
Columnar databases are on the rise! They provide an efficient and scalable data warehouse for many use cases including time series data. The problem? Many conventional database drivers and querying methods become the bottleneck for data processing and analytics within our client-side applications. Learn how to leverage open-source projects like Apache Arrow Flight and Apache Parquet alongside industry-standard analytics tools to build the foundations of a performant analytics application.
Data Processing with Apache Spark and Apache Iceberg | Tomas Peluritis | Details
"Data Processing with Apache Spark and Apache Iceberg" is a dynamic workshop designed to equip data professionals with advanced skills in managing and processing large-scale data. Participants will be introduced to the essential table formats before delving into Apache Iceberg's integration with Apache Spark. This session focuses on practical applications, including schema evolution and efficient file management, to enhance data processing efficiency and scalability. Ideal for data engineers and scientists,
Streaming DataFrames: A New Way to Process Streaming Data in Python | Tomáš Neubauer | Details
Introducing an open source library in Python: Quix Streams. It solves all the complexities of stream processing in a cloud native package with a familiar Pandas DataFrame-like API. This library lets you work with data as if it were static in your Jupyter Notebook, without any of the hassle associated with streaming technologies. Our mission is to bring masses of Python developers into streaming and make the journey as smooth as possible so real-time applications using ML are not so difficult