
Product Manager @ClickHouse
I've been working with data, big and small, for the last 10 years: first as a Data Engineer slogging through ETL pipelines and big migration projects, then in Product-shaped roles helping build data streaming infrastructure (Apache Flink, Materialize). I'm currently a Product Manager at ClickHouse, focusing on our real-time data ingestion service.
In this talk, we'll walk through the evolution of Change Data Capture (CDC) as the enabler for high-performance, real-time analytics on transactional data, and explore what's missing to make it work for the 99%. A decade after Debezium entered the scene and commoditized CDC, we're still struggling to bridge OLTP and OLAP for analytics. Expensive batch jobs evolved into overcomplicated streaming architectures, HTAP promised to ditch the need for data movement, and someone told us to "just use Postgres". How much progress have we made, and where do we go from here? We'll share lessons learned over the past decade of watching hundreds of customers run CDC at scale, and how we can use them to build best-of-breed experiences for high-performance, real-time analytics.

Key Takeaways:
- Learn how CDC evolved over the past decade
- Understand the trade-offs between different CDC architectures
- See what works at scale for teams without massive engineering resources
- Explore what's next for CDC and unified data stacks