
It's a way of doing stream processing that sacrifices a little bit of update latency for higher throughput and exactly-once processing semantics. Regular streaming has single-digit millisecond update latency, while microbatching has at least a few hundred milliseconds of update latency.
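To make the tradeoff concrete, here is a minimal sketch (in Python, not from the original comment) of the accumulation loop at the heart of microbatching: records are buffered for a fixed interval and then handed off as one batch, so per-record overhead is amortized at the cost of that interval's worth of latency. The source.poll interface is a hypothetical stand-in for whatever queue or log the system reads from.

    import time

    def microbatch_loop(source, process_batch, interval_secs=0.5):
        # Buffer incoming records for a fixed interval, then process them
        # as one batch. Handling records in bulk amortizes per-record
        # overhead (higher throughput) at the cost of up to interval_secs
        # of added update latency.
        while True:
            deadline = time.time() + interval_secs
            batch = []
            while time.time() < deadline:
                record = source.poll(timeout=max(0.0, deadline - time.time()))
                if record is not None:
                    batch.append(record)
            if batch:
                process_batch(batch)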

By "exactly-once processing semantics", I mean that regardless of how many failures are on the cluster (e.g. nodes losing power, network partitions), the updates into all partitions of all indexes will be as if there were no failures at all. Pure streaming, on the other hand, has either "at-least once" or "at-most once" semantics.
