Distributed Stream Processing in Practice [Scalable, Real-time Data Pipelines]
Details
- Date: June 19, 2025
- Time: 11:00 AM EDT, 08:30 PM IST
- Duration: 60 mins
Summary
This technical session examines real-world challenges and patterns in building distributed stream processing systems. We focus on scalability, fault tolerance, and latency trade-offs through a concrete case study, using specific frameworks like Apache Storm as supporting tools to illustrate production concepts.
- ✅ Master real-world challenges - Understand scalability, fault tolerance, and latency trade-offs in production
- ✅ See architectural patterns - Stateless vs. stateful processing, event time vs. processing time decisions
- ✅ Handle scale bottlenecks - Partitioning strategies, backpressure handling, and scheduling challenges
- ✅ Learn from concrete examples - Real ML feature generation pipeline using Storm and Kafka
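As a small taste of the event time vs. processing time decision listed above, here is a minimal Python sketch. It is not taken from the session material: the `Event` fields, the 60-second tumbling window, and the sample timestamps are all hypothetical, chosen only to show how one late-arriving record lands in a different window depending on which clock is used.

```python
from collections import Counter
from dataclasses import dataclass

WINDOW_SECONDS = 60  # hypothetical tumbling-window size


@dataclass(frozen=True)
class Event:
    key: str
    event_time: int    # seconds at which the event actually occurred
    arrival_time: int  # seconds at which the pipeline received it


def window_start(timestamp: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the tumbling window containing `timestamp`."""
    return timestamp - (timestamp % size)


def count_by_window(events, use_event_time: bool) -> Counter:
    """Count events per window, bucketing by event time or by arrival time."""
    counts = Counter()
    for e in events:
        ts = e.event_time if use_event_time else e.arrival_time
        counts[window_start(ts)] += 1
    return counts


# Illustrative data: the third event occurred at t=58 but arrived at t=75.
events = [
    Event("user-1", event_time=10, arrival_time=12),
    Event("user-2", event_time=50, arrival_time=55),
    Event("user-1", event_time=58, arrival_time=75),  # arrives late
]

by_event_time = count_by_window(events, use_event_time=True)
by_processing_time = count_by_window(events, use_event_time=False)

print("event time:     ", dict(by_event_time))
print("processing time:", dict(by_processing_time))
```

Bucketing by event time keeps all three records in the window where they occurred, while bucketing by processing time pushes the late record into the next window. A production system would also need a policy for how long to wait for late data, which this sketch leaves out.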
Hosted By

Harsha Kalbalia
[Moderator] GTM & Founding Member @ Datazip
Harsha is a user-first GTM specialist at Datazip, transforming early-stage startups from zero to one. With a knack for technical market strategy and a startup enthusiast's mindset, she bridges the gap between innovative solutions and meaningful market adoption.

Hasan Geren
Data Engineer @ ProcurePro
Hasan's career includes Data Engineering, where he has:
- Designed and optimised scalable databases and cloud storage architectures.
- Built low-latency data pipelines to support real-time applications and analytics dashboards.
- Developed AI/ML-based solutions, including LSTM models and recommendation systems to enhance user engagement.
- Collaborated across teams to drive actionable insights, ensuring data solutions align with business goals.