
Postgres LISTEN/NOTIFY was a consistent pain point for Oban (background job processing framework for Elixir) for a while. The payload size limitations and connection pooler issues alone would cause subtle breakage.
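For anyone who hasn't hit the payload limit: in the default Postgres configuration a NOTIFY payload must be shorter than 8000 bytes. A common workaround is to send a small reference when the message is too big. Here is a minimal sketch of that guard in Python; `build_notify_payload` and the `store_ref` callback are hypothetical names for illustration, not anything from Oban.

```python
import json

# Assumption: default PostgreSQL configuration, where a NOTIFY payload
# must be shorter than 8000 bytes.
MAX_NOTIFY_BYTES = 7999


def build_notify_payload(message: dict, store_ref) -> str:
    """Return a JSON payload that fits in NOTIFY, or a small reference to it."""
    encoded = json.dumps(message, separators=(",", ":"))
    if len(encoded.encode("utf-8")) <= MAX_NOTIFY_BYTES:
        return encoded
    # Too large: persist the body elsewhere via the hypothetical callback
    # and send only a reference that listeners can look up.
    ref = store_ref(encoded)
    return json.dumps({"ref": ref}, separators=(",", ":"))
```

The size check is on encoded bytes rather than characters, since multi-byte UTF-8 text can exceed the limit while looking short.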

It was particularly ironic because Elixir has a fantastic distribution and pubsub story thanks to distributed Erlang. That’s much more commonly used in apps now than 5 or so years ago, when 40-50% of apps weren’t clustered. The shift is thanks to the rise of platforms like Fly that made clustering easier, and the decline of Heroku, which made it nearly impossible.



How did you resolve this? Did you consider listening to the WAL?


We have Postgres based pubsub, but encourage people to use a distributed Erlang based notifier instead whenever possible. Another important change was removing insert triggers, partially for the exact reasons mentioned in this post.


> Another important change was removing insert triggers, partially for the exact reasons mentioned in this post.

What did you replace them with instead?


In app notifications, which can be disabled. Our triggers were only used to get subsecond job dispatching though.
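Roughly, the idea looks like the sketch below: the application announces the insert itself instead of relying on a database trigger, and the notification is optional. This is an illustrative Python sketch only, not Oban's actual implementation; `JobQueue` and its `notify` callback are hypothetical names, and the list stands in for the jobs table.

```python
from typing import Callable, Optional


class JobQueue:
    def __init__(self, notify: Optional[Callable[[str], None]] = None):
        self.jobs = []        # stand-in for the jobs table
        self.notify = notify  # None means notifications are disabled

    def insert(self, queue: str, args: dict) -> dict:
        job = {"id": len(self.jobs) + 1, "queue": queue, "args": args}
        self.jobs.append(job)
        # The application, not a database trigger, announces the insert.
        if self.notify is not None:
            self.notify(queue)
        return job
```

With `notify` left as `None` the job is still stored; it just isn't announced, so presumably it gets picked up on the next poll rather than within a second.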


Distributed Erlang if the application is clustered, Redis if it is not.

Source: Dev at one of the companies that hit this issue with Oban


What about Heroku made Erlang clustering difficult? It's had the same DNS clustering feature that Fly has, and they've had it since 2017: https://devcenter.heroku.com/articles/dyno-dns-service-disco....


The problem was with restrictive connections, not DNS based discovery for clustering. It wasn't possible (as far as I'm aware) to connect directly from one dyno to another through tcp/udp.


That is not an issue when using Private Spaces, which have been available since 2015


I didn’t realize Oban didn’t use Mnesia (Erlang built-in).


Very, very few applications use Mnesia. There’s absolutely no way I would recommend it over Postgres.


I have heard that Mnesia is very unreliable, which is a damn shame.

I wonder if that is fixable, or just inherent to its design.


My understanding is that mnesia is sort of a relic. Really hard to work with and lots of edge / failure cases.

I'm not sure if it should be salvaged?


I think RabbitMQ still uses it by default for its metadata storage. Is it problematic?


They are in the process of migrating away from it: https://www.rabbitmq.com/docs/metadata-store


can you explain why?


Mnesia along with clustering was a recipe for split-brain disasters a few years ago. I assume that hasn't been addressed.


I have only worked with a product that used it, so no direct experience, but one problem that was often mentioned is split-brains happening very frequently.



