Confirmed Talks
Web day
- Keynote | Daniel Roy Greenfeld
- Keynote | Tom Christie
- How we Develop and Maintain a Modern Python Service at Mozilla: Merino as Example | Tadas Korris | At Mozilla, we maintain services that are used by millions of users daily. These services are the backbone of expanding Firefox and providing users with useful features, all while protecting privacy. Learn how one service, Merino, was planned to meet user needs at scale. This service provides users with search recommendations and suggestions from local and remote providers. Get some insights into how we develop, deploy, monitor, and maintain this modern Python service.
- So you've built an AI startup using Async Django: the MVP looks great and your handful of users love it. Now you need to clean up the MVP so you can scale. This is Part Two of building an AI startup with Async Django. We talk about moving from ChromaDB to OpenSearch/Elasticsearch, moving document processing steps to Celery/RabbitMQ, self-hosting via vLLM, migrating from Django templates to a ReactJS app, and better monitoring and logging.
- Join us for a presentation where we share the mysteries of anti-bot systems guarding websites, APIs, and mobile applications! What's in Store: 1/ Exploring the Defence Layers 2/ Anti-Bot Reputation Score Demystified 3/ Strategies for Evasion. After this talk, you'll emerge well equipped with the knowledge to navigate and comprehend the nuances of these protective measures!
- A standard Django project involves working with multiple files and folders from the start. Let's see how working with a Django project changes when we have only one file. This solution automatically transforms Django into a microservice-oriented async framework with a "batteries included" philosophy.
- In today's data-driven world, knowing how to gather and analyze information is more critical than ever. Join us for a compact session on using Python to crawl the web and solve real-time problems. We'll cover the basics, and then dive into a practical example of collecting data from the internet.
- FastDjango: Conjuring Powerful APIs with the Sorcery of Django Ninja | Julius Boakye | Workshop | Dive into the world of modern web development by fusing the power of Django and FastAPI. This talk will guide you through the process of building robust, scalable, and efficient APIs using Django Ninja, a web framework that combines Django's reliability and FastAPI's speed. We'll explore how to leverage Django's ORM and user authentication while enjoying FastAPI's performance and type checking. Whether you're a Django veteran looking to supercharge your APIs or a beginner eager to learn cutting-edge techniques...
- Join Tadas Gedgaudas in an enlightening talk on revolutionizing web scraping with machine learning. Uncover how ChatGPT can adapt to website layout changes, making scraping more efficient and reducing maintenance needs. Delve into data structurization with ML, the seamless integration of ChatGPT for parsing, and its practical impact for developers.
- A glance behind the curtain at how the execution phase of Danske Bank's Lift and Shift journey to the public cloud is going. Let's dive deep into some of the technical challenges and the snek Python stack standing right in front, helping orchestrate cloud migration at scale.
- Saying bye to the Keyboard, Hello to Alexa with Python AWS Lambda | Laysa Uchoa, Yuliia Barabash | Join us, and discover how Alexa's ability to recognize and convert speech into text can be used to create applications that break the monotony of your daily routine without the need to use a keyboard at all. We will teach you about the main components of Alexa, how to get started with the Developer console, and how to customize Alexa using our favorite language, Python, in a serverless way. We will also demonstrate how to incorporate Alexa into your daily developer life, and you might find that, after this t...
- Raw Django doesn't take the first places when comparing the performance of Python web frameworks. However, it can be pretty fast if we identify the bottlenecks and find ways to avoid them. Comparing performance and implementation complexity before and after gives us an understanding of which features should be implemented and what can be skipped.
- Frameworks like Django use advanced Python features to provide devs with the magical tools they know and love. In this live-coded talk we'll take a look at a couple of Django snippets that use descriptors under the hood and we'll use them as motivating examples for why Python needs descriptors. By the end of the talk, you'll understand how descriptors work and how they power Django behind the scenes.
- Enhance Django with HTMX: Elevate your web applications with seamless client-side interactivity and build dynamic, engaging experiences without page reloads - no React / Vue / Angular required!
- In this presentation, we will explore observability in a Python web application. We will delve into a real-world application and discuss the importance of observability for services. We will focus on the three foundations of observability: logs, metrics, and tracing. Discover some tools for observing and monitoring, including a demo of how to integrate Datadog into a Python service. The presentation will show examples of logs and metrics and demonstrate how to trace a request.
- Unlock the full potential of web scraping with this session! From novice to virtuoso, join us on an exciting journey of data extraction as we unravel secrets and advanced techniques. Session Highlights: 1/ Building Web Scrapers - The Art Unveiled 2/ Proxy and Browser Farms Adventure 3/ Scrapoxy Orchestration - Elevate Your Scalability 4/ Protection Measures Disclosed. This concise session will immerse you in the fascinating world of web scraping.
- Django's async capabilities and batteries-included tooling make it an ideal framework for quickly building MVPs and iterating. This talk demonstrates building a document search MVP with Django templates, ChromaDB, and hosted large language models. It then shows how to refactor and scale it using Elasticsearch, Celery/RabbitMQ workers, React, self-hosted vLLM, and auth. With Django async, you can rapidly build, constantly improve, and deploy the latest AI models in your product.
- Facing challenges with search capabilities in your web applications? Discover how the combination of OpenSearch, Python, and serverless architecture can be your solution. This talk provides hands-on examples, from building efficient queries to implementing production-ready practices. You'll gain actionable insights and the practical know-how to build and deploy robust, query-efficient search applications that solve real-world challenges.
- Lessons (I Wish I Knew These Before) from Migrating a Farm of Django Projects from On-Premises to AWS with Kubernetes | Justinas Kuizinas | At Corner Case Technologies, we offer clients a service to migrate from on-premises infrastructure to AWS for various purposes, including high availability, cost optimization, and maintainability. Each migration is unique and necessitates thorough preparation for planning, execution, and subsequent development. In this talk, I will present a specific use case of a migration that we conducted, with a particular focus on the lessons learned during the planning and execution phases.
Python day
- Keynote | Arjan Codes
- Keynote | Robert Smallshire
- The hype for GenAI keeps rising. Nowadays, almost every company wants to adopt this technology in their business, but successfully delivering a GenAI project takes much more than just figuring out what to ask ChatGPT. During the presentation, I'll introduce you to an AI platform that allows users to deliver GenAI projects with confidence.
- In this talk we'll review some of the changes we've made to Pydantic since 2.0 to push performance even further. This is possible largely because Pydantic chose to implement the core in Rust. We'll focus on two main topics: - Come learn about optimizations Pydantic has been working on since 2.0 - Come see our draft ideas for how Pydantic v3 could be even faster than v2. You should leave this talk excited about performance wins for your apps using Pydantic and inspired to try Rust in your own code.
- The speech will address Python's limitations in AI and how MAX Platform can overcome them by offering superior speed, seamless Python code execution, and hardware compatibility. It will inspire Pythonistas to explore MAX Platform and unlock new possibilities in AI development and beyond.
- While Functional Programming gains traction, I'll showcase how OOP, done right, yields clean, efficient code. Explore a fresh perspective, gain insights, and reshape your coding approach.
- Python's ecosystem is one of the best out there, and this is mainly due to its community and what lies inside its core, a C API. Being partially written in C enables Python to interact with many other languages you might know, like C++, Rust, or Zig. But how does it work? In this talk, you will learn how Python can embrace the power and performance of other languages in order to expose modules that improve the whole ecosystem.
- Crafting scalable event-driven applications using Python can be a tricky endeavor, requiring careful consideration of various factors, from understanding synchronous and asynchronous network calls to tackling the Python Global Interpreter Lock (GIL) bottleneck and implementing robust auto-scaling strategies. This talk delves into advanced techniques and concepts for designing and implementing scalable event-driven applications with Python, empowering you to overcome these challenges effectively.
- I've been working full-time on a Python FOSS project for 503 days, so what did I learn? Am I a better (Python) programmer? Better teammate? Better person? In this talk I will share some lessons I learned over the course of these 503 days: - how to get a tech job in this day & age - how to put your ego aside - how to deal with mistakes - how to interact with users & contributors online - how it feels to collaborate on a large codebase. As for the first 3 questions... Ask my colleagues!
- A newly developed Python package, deadcode, which detects and automatically fixes unused Python code, will be introduced. Real-world scenarios in which deadcode saves development time will be provided. The main features and options of the deadcode package will be presented, and it will be shown why this tool is superior to vulture. Some implementation details and complexities will also be discussed.
- analyzing stdf production test data in the silicon manufacturing industry using construct | Franz Haas | The data volume and the complexity of the queries are not particularly large in this industry. The challenge comes from using the STDF format, a binary file format with roots in the 1980s. A method to make this data source available to modern data analysis tools (Jupyter/Streamlit) using the construct library will be discussed. The focus is on how the data can be collected, converted, and made available in a fast and efficient way, using both PyPy and CPython.
- New tools are changing how people program, and even _who_ programs. Type hints, modern editor support and, more recently, AI-powered tools like GitHub Copilot and ChatGPT are truly transforming our workflows and improving developer productivity. But what does this mean for how we should be writing and designing our APIs and libraries?
- Learn about Python's memory handling, including: - what pointers are, and why it matters - what object IDs are, and what they mean - how CPython can tell when you're done with an object, and what happens next. No C knowledge required!
- Let's see how to build an SDK that works for years and is used by other developers. We'll learn which patterns actually work, how mistakes made in the early stage affect the software years later, and how to make sure we don't break users' code when introducing changes.
- Debuggers are indispensable tools for all Python developers, empowering them to conquer bugs and unravel complex systems. Let's create our own.
- We will create a new Python package from scratch using best practices and deploy it to pypi.org. We will also learn the benefits of bleeding-edge tools for code linting, unit testing, and deployment, and how to use them. Let's make the Python ecosystem even more awesome!
- SQLAlchemy is one of the most popular ORM libraries in Python. In this talk I will present the caveats and gotchas that other Pythonistas may run into while writing an asynchronous backend application with SQLAlchemy as the ORM. We will mainly focus on how SQLAlchemy handles transactions and connections to the database and what issues we may face because of it.
- Performing climate science within the context of climate change requires creative solutions to challenges such as data collection and storage management, optimizations for better memory and CPU usage, in addition to ensuring that analysis outputs are trustworthy. This talk will showcase xclim and finch, two pieces of software built for performing climate analyses on large datasets using Python, WPS, and the PANGEO software stack of technologies.
- Agenda: - What is mutation testing? - Why isn't test coverage enough? - What are its pros and cons? - How does it work (overview *and* details)? - Simple example (finding and fixing bad test) - Complex example (finding and fixing bad/missing test) - Complex example (finding and fixing redundant code) - FAQs -- history, why it's so CPU/RAM intensive, and more if time allows - Unusual applications, if time allows - Wrapup - Q&A
- In May 2023, there was a big buzz in the AI community as a brand-new programming language called 'Mojo' made its debut. People were talking about it in blog posts like: 'Mojo may be the biggest programming language advance in decades'. In this talk, we'll dive into Mojo, checking out what it promises and where it stands right now, and also pondering what the future could hold for it. Target Audience: Software Developers. Prerequisites: General knowledge about programming languages.
- An overcomplicated project increases development and maintenance time. If a complete redesign is not possible, we can distribute the complexity across the existing codebase. If AI assistants cannot help us with this task yet, we should discuss manual methods and tools that can be useful. Using examples of real large projects, we will discuss how, despite different business types and geographical and social contexts, these projects share similar architectural mistakes, and how they can be redesigned.
- Sometimes you have a Python object and you want it somewhere else: maybe you want to save your data to disk and load it again tomorrow; or you want to send some complex parameters over the network. I'll talk about pickle, the usual way to do this, including ways it can go wrong and how to extend it, and compare it to other approaches like JSON or storing in a database; I'll stick a little bit of theory in my talk too.
Data day
- Keynote | Ritchie Vink
- Keynote | Ines Montani
- Polars conquered dataframes, and now it is coming for machine learning! With Polars-powered feature-extraction and a best-of-the-class set of diagnostic tools, functime enables **forecasting thousands of time series all at once, from the comfort of your laptop**. Though forecasting practitioners are the intended audience, the talk has something for every data scientist. With Polars, we can **push the boundary for what "reasonable scale" means - and build a new generation of tools for machine learning**.
- Introduction to Polars DataFrames - how to supercharge your data workflows | Marco Gorelli | Polars is the new dataframe on the block taking the world by storm. You'll learn: - what Polars is, and what it can do for you - Polars basics and core concepts (including expressions and lazy computation) - how to work with different datatypes, and how the List datatype gives you superpowers - interoperability with other tools: NumPy, SciPy, Arrow, pandas, Numba - migrating from pandas. What better way to learn it than by attending a PyCon Lithuania tutorial, delivered by a Polars core dev?
- 🧼 From GPU-poor to data-rich: data quality practices for LLM fine-tuning | Gabriel Martín Blázquez, David Berenstein | If you are GPU-poor you need to become data-rich. I will give an overview of what we learned from looking at Alpaca, LIMA, Dolly, UltraFeedback and Zephyr and how we applied that to fine-tuning the state-of-the-art open source LLMs Notus and Notux by becoming data-rich.
- Fast and accurate search results are a crucial component of any e-shop and can make the difference between high user satisfaction and user frustration. With recent advancements in vector search technologies, enhanced search systems have become more efficient, leading to better user experiences and improved conversion rates. In this talk, we'll explore how to implement a hybrid search system for a non-English e-commerce site using Pinecone, a high-performance vector search engine.
- Large language models (LLMs) often require huge compute resources to serve. This is a common challenge for those who want to avoid sharing their data with cloud API providers, or to deploy their stack in air-gapped environments. We will take a look at how the open source llama-cpp-python library opens the door to lower hardware requirements and simplifies deployment significantly.
- In 2023, we saw several libraries - which had previously only supported pandas - add support for other dataframe libraries such as Polars, Modin, and cuDF. - How did they do it? - Are there any drawbacks to how they did it? - What comes next, and what other solutions are there? This talk could be of interest to anyone working with dataframes. In particular, those maintaining or contributing to libraries which use dataframes will learn about how they can best support multiple dataframe libraries.
- Understanding ChatGPT: Embeddings, Transformers and Reinforcement Learning with Human Feedback | Luca Baggi
- Data is the new oil, and like oil it can leak and poison the surrounding environment. What happens if you pollute one of the datasets behind a decision-maker-facing dashboard? In this talk, I will explain the reemergence of the Write-Audit-Publish pattern and how you can achieve it using Apache Iceberg and Apache Spark.
- Developer tools power many LLM-based chat and Retrieval Augmented Generation applications today. However, there is a non-trivial knowledge barrier for entrants that could hinder developer experience. Our discussion intends to offer actionable insights into building and maintaining generative AI solutions in a secure and economical way, thereby improving the developer experience in this Generative AI wave.
- Learn to make practical decisions in data engineering with Python's vast ecosystem. Avoid blindly following market guidelines and consider the reality of your situation for better performance and architecture.
- A presentation about how we (a few local NLP enthusiasts) trained a language Transformer to generate meaningful text in Lithuanian. Everything was based on volunteer work with a strong R&D flavor. During this presentation I will not only cover what kind of data we used to train this model and what results we got, but also present other initiatives we drive in the NLP field. I will try to make the presentation both technical and interactive.
- Transforming Data Insights: Creating Dynamic Animated Stories with Python and ipyvizzu-story | Peter Vidos | Workshop | Unlocking the value of data often hinges on the ability to communicate insights effectively to non-technical audiences. What if you could go beyond static charts and captivate your audience with animated data stories? Join us in this workshop to discover the power of animated storytelling using ipyvizzu-story, an innovative open-source presentation tool designed to work seamlessly within Jupyter Notebook and similar platforms.
- Machine learning models are a new artifact to build, version, and deploy. Explore their impact on your architecture.
- In today's world, large language models (LLMs) are revolutionizing how we interact with technology, allowing us to have conversations, organize data, and write text with minimal human effort. However, when using an LLM, you have likely received incorrect or insufficiently specialized answers. For this reason, fine-tuning models that have been pre-trained on a large corpus of data is crucial to: (1) obtain better response quality, and (2) tune the model to a specific domain.
- How to use KDTree from the sklearn library to prototype RAG (Retrieval-Augmented Generation) applications.
- Revenue based scoring in `GridSearchCV`: a case for the new metadata routing in scikit-learn | Adrin Jalali | Passing metadata such as `sample_weight` and `groups` through a scikit-learn `cross_validate`, `GridSearchCV`, or a `Pipeline` to the right estimators, scorers, and CV splitters has been either cumbersome, hacky, or impossible. The new metadata routing mechanism in scikit-learn enables you to pass metadata through these objects. As a use-case, we study how you can implement a revenue sensitive scoring while doing a hyperparameter search within a `GridSearchCV` object.
- A 101 in time series analytics with Apache Arrow, Pandas and Parquet | Zoe Steinkamp | Workshop | Columnar databases are on the rise! They provide an efficient and scalable data warehouse for many use cases including time series data. The problem? Many conventional database drivers and querying methods become the bottleneck for data processing and analytics within our client-side applications. Learn how to leverage open-source projects like Apache Arrow Flight and Apache Parquet alongside industry-standard analytics tools to build the foundations of a performant analytics application.
- "Data Processing with Apache Spark and Apache Iceberg" is a dynamic workshop designed to equip data professionals with advanced skills in managing and processing large-scale data. Participants will be introduced to the essential table formats before delving into Apache Iceberg's integration with Apache Spark. This session focuses on practical applications, including schema evolution and efficient file management, to enhance data processing efficiency and scalability. Ideal for data engineers and scientists, ...
- Introducing an open source Python library: Quix Streams. It solves all the complexities of stream processing in a cloud-native package with a familiar Pandas DataFrame API. This library lets you work with streaming data as if it were static data in your Jupyter Notebook, without the hassle associated with streaming technologies. Our mission is to bring masses of Python developers into streaming and make the journey as smooth as possible, so that real-time applications using ML are not so difficult.
- Python is a leading language of choice for the Databricks and ML ecosystem, alongside a delta tables stack leveraging Unity catalog to manage petabytes of structured data. To build and experiment with ML data and models, version control has become the backbone of modern machine learning (ML) projects, bringing critical aspects of reproducibility and experimentation to teams who are able to experiment in isolation, while still collaborating on projects.
- Machine learning (ML) model serialization helps to optimize inference latency, memory, and disk space requirements and provides more options for model deployment. We will explore the use cases that benefit the most from this technique and some drawbacks.
- In 2023, vector databases are attracting great interest, as evidenced by the Google Trends search statistics. This type of database has a direct link with Large Language Models (LLM), such as ChatGPT, by enabling "Retrieval Augmented Generation" (RAG), for example. This approach offers the possibility of exploiting the power of a conversational agent using our own data. But... Do you really need a vector database?