Escalating via post-it note just to get some health checks

turtleyacht · on April 10, 2023

If it can't be easily simulated by QA--and infra had to get involved in this case, in order to hang the service--it's hard to even test Rachel's suggestion like she mentioned.

No dev team would add a /hang endpoint just sitting out there, either.

Every environment, from local to pre-production, may successfully bind the port. So it becomes an operations issue, almost like an emergent thing (a service among services in an ecosystem).

Ideally, the port binding retry gets added to an internal framework, so that all teams benefit. The first version might be a hack, but later improvements multiply every service's robustness.

Although, it seems like there's a threshold for distributed port binding retries where they synchronize (?), and nothing can rebind.

shlip · on April 10, 2023

I think the author is referring to this at the end of the article :

https://blog.backslasher.net/1.5GB-string.html