https://github.com/cubejs/cluster2/tree/cluster3 is about to go GA, and I’m quite excited about putting it together.
A few things we added along the way are quite interesting:
- nanny monitoring for unresponsive workers
- brute-force killing via ‘SIGTERM’, using an exception to determine whether the process has actually been collected
- pause/resume of a worker (to take it out of traffic and run an explicit GC!)
- a rewritten warmup sequence that puts warming-up workers on different ports (ensuring each one gets warmed up before serving its first request!)
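
To make the ‘SIGTERM’ item above a bit more concrete, here is a minimal sketch of how such a kill-and-verify step could look in plain Node.js. It is not taken from the cluster2 code base: `isAlive` and `terminate` are hypothetical helper names, the timeout and polling values are arbitrary, and the post does not say which exception is relied on. The sketch assumes the `ESRCH` error that `process.kill(pid, 0)` throws once a pid no longer exists.

```javascript
'use strict';

// Hypothetical helpers for illustration only; not part of cluster2's API.

// Signal 0 delivers nothing: it only asks the OS whether the pid exists.
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // ESRCH: no such process, so it has been collected.
    // Anything else (e.g. EPERM) means the pid still exists.
    return err.code !== 'ESRCH';
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Send SIGTERM, then poll until the pid is gone or the timeout elapses.
// Resolves to true if the process was collected, false if it is still around.
async function terminate(pid, { timeoutMs = 5000, pollMs = 50 } = {}) {
  try {
    process.kill(pid, 'SIGTERM');
  } catch (err) {
    if (err.code === 'ESRCH') return true; // already gone before we signalled
    throw err;
  }
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (!isAlive(pid)) return true;
    await sleep(pollMs);
  }
  return !isAlive(pid);
}
```

A master process could call `terminate(worker.process.pid)` and escalate (for example to ‘SIGKILL’) only when the promise resolves to `false`. Polling with signal 0 is just one option; listening for the worker's `exit` event is an alternative when the master is also the parent of the process.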