Discussion about this post

User's avatar
Neural Foundry's avatar

Brilliant deep dive into teh streaming shift. The row-group merging trick is the kind of optimization that makes or breaks real-time at scale, I've seen teams burn compute like crazy doing row-level compaction and never realize why latency kept creeping up. Linking Flink checkpoints to Hudi commits sounds obvious in hindsight, but coordinating those failure boundaries is probabaly the reason most folks just stick with batch ingesion.

No posts

Ready for more?