This talk explores the synergy between Apache Beam and Apache Airflow, demonstrating how to build a robust, end-to-end data engineering workflow. We'll dive into the challenges of orchestrating complex data processing tasks and show how combining Airflow's scheduling capabilities with Beam's data processing framework yields more efficient and manageable data pipelines. The session will cover integration with Google Cloud Platform services, including Cloud Functions, BigQuery, and Gemini AI models.
Basic Python and SQL Knowledge
Problem Addressed: Organizations face the challenge of orchestrating complex, end-to-end data engineering workflows that integrate batch and streaming processing, scheduling, cloud services, and AI models. This talk tackles the often-overlooked synergy between Apache Beam and Apache Airflow, two powerful tools in the data engineering ecosystem that are rarely used in tandem. We'll explore how combining these technologies with Google Cloud Platform services and AI models can improve data pipeline architecture.
Relevance to the Audience: As data volumes explode and processing requirements become increasingly complex, data engineers and scientists are under pressure to build scalable, maintainable pipelines that can handle diverse data sources and downstream applications. This topic is crucial for professionals looking to:
Solutions and Key Takeaways: Attendees will gain practical insights and hands-on knowledge to:
Sadeeq is a Data Analytics Specialist at Google Cloud in the UK. His role involves understanding customers' data engineering and analytics challenges and goals, and helping them through their digital transformation journeys as they adopt solutions primarily on Google Cloud Platform, as well as on-premises or on other clouds.
Whilst in Nigeria, Sadeeq worked as a Software Engineer at a few startups, then gradually transitioned into Data Engineering at FMDQ Group. He then moved to Portugal for an MSc degree in Data Science and Advanced Analytics at NOVA University of Lisbon.
He previously worked at KPMG and Microsoft, and his decade of industry experience includes consulting with and for other notable Fortune 500 companies on data-centric projects.