Cloud Data Solutions Are Overrated: Building a Pan-European Business Database for Lunch Money
Speaker
Antanas Baltrušaitis
After 18 years in the data and analytics industry - spanning roles from banking strategy at Nordea to Head of Data at Luminor and Girteka - I made a conscious decision to step out of the corporate world.
Why? Because my skillset doesn't fit a standard job description.
I am a true Generalist in a world that often tries to specialize. I don’t just analyze data; I engineer the pipelines, write the code, design the UI, and map the business strategy. I realized that my ability to manage the entire data journey - from raw SQL to the final user experience - was best utilized in building my own solutions rather than managing narrow slices of corporate infrastructure.
Today, I am fully dedicated to Product Development. I build software where data isn't just a byproduct; it is the core engine.
My flagship product, Scoris, is Lithuania's premier open business data aggregator, built on the belief that public data should be accessible and actionable. I also created Oriux, a weather app designed for precision. A few other large products are about to launch.
I bridge the gap between complex data engineering and tangible business value. I am no longer just advising on strategy; I am executing it, line by line and database by database.
Abstract
We are told that modern data engineering requires expensive cloud warehouses and enterprise SaaS. This talk challenges that narrative. I will show how I built scoris.eu - aggregating business data from hundreds of sources across Lithuania, Latvia, Estonia, Finland, and the UK - as a solo developer. Using a purely open-source Python stack (dlt, dbt, Prefect) on cheap infrastructure, I will demonstrate that with the right architecture, you can integrate data at scale without burning cash on the cloud.
Description
The modern data industry has convinced us that "Big Data" problems require "Big Cloud" solutions. We are taught that to integrate hundreds of diverse data sources, we need Snowflake, Databricks, managed Airflow, and a five-figure monthly AWS bill. But what if you are a solo developer with a massive goal and a "lunch money" budget?
In this talk, I will present the engineering journey behind scoris.eu, a project that aggregates, cleans, and standardizes business data from national registries across Europe (currently live with Lithuania, Latvia, Estonia, Finland, and the UK). We will move past the "Hello World" examples and look at the gritty reality of handling hundreds of disparate data sources, schema changes, and messy government data without a dedicated DevOps team.
We will cover:
- The "Lunch Money" Stack: A deep dive into the open-source triumvirate: dlt (Data Load Tool) for robust extraction, dbt for transformation, and Prefect for orchestration.
- Schema Evolution on Autopilot: How to handle government APIs that change without warning, using Pythonic ingestion that adapts automatically.
- The Hardware Reality: Why a cheap VPS (Virtual Private Server) is often superior to serverless functions for batch processing, and how to optimize local execution.
- The Cost Comparison: A transparent look at the actual running costs of scoris.eu versus the estimated cost of running the equivalent architecture on a major cloud provider.
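To make the "Schema Evolution on Autopilot" item above concrete, here is a minimal sketch of the underlying idea using only the Python standard library. It is an illustration under stated assumptions, not dlt's API and not how scoris.eu is implemented: the function name, the table name, and the sample records are all hypothetical. When an incoming record carries a field the table has not seen before, the loader adds a nullable column instead of failing.

```python
import sqlite3


def load_with_schema_evolution(conn, table, records):
    """Insert dict records, adding any previously unseen columns first.

    Hypothetical helper for illustration only. Identifiers are interpolated
    into SQL, so table and field names must come from a trusted source.
    """
    cur = conn.cursor()
    cur.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (_row_id INTEGER PRIMARY KEY)')
    # Column names the table already has (index 1 of each table_info row).
    known = {row[1] for row in cur.execute(f'PRAGMA table_info("{table}")')}
    for record in records:
        if not record:
            continue  # nothing to insert
        # A field we have not seen before becomes a new nullable column.
        for field in record:
            if field not in known:
                cur.execute(f'ALTER TABLE "{table}" ADD COLUMN "{field}"')
                known.add(field)
        cols = ", ".join(f'"{c}"' for c in record)
        marks = ", ".join("?" for _ in record)
        cur.execute(
            f'INSERT INTO "{table}" ({cols}) VALUES ({marks})',
            list(record.values()),
        )
    conn.commit()
    return sorted(known)


# Made-up sample data: the second batch introduces a new field, as a
# registry API might do without warning.
conn = sqlite3.connect(":memory:")
batch_1 = [{"code": "1001", "name": "Example UAB"}]
batch_2 = [{"code": "1002", "name": "Example OU", "vat_number": "EE000"}]

load_with_schema_evolution(conn, "companies", batch_1)
columns = load_with_schema_evolution(conn, "companies", batch_2)
rows = conn.execute(
    "SELECT code, vat_number FROM companies ORDER BY code"
).fetchall()
```

Rows loaded before the new field appeared simply hold NULL in the added column, so older data stays queryable alongside the new shape. A production loader would also need to handle type changes and renamed fields, which this sketch deliberately leaves out.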
This talk is for Data Engineers and Python developers who suspect that "Modern Data Stack" complexity has gone too far. You will leave with a blueprint for building scalable, resilient data platforms that run on your laptop or a cheap server - proving that smart engineering beats infinite cloud budgets every time.