Is it though? This is a genuine question. My intuition is that the investment of time / stress / risk to become good at this is unlikely to have high ROI to either the person putting in that time or to the business paying them to do so. But maybe that's not right.
It's certainly more nuanced than my pithy comment suggests. I've done both self-managed and managed, and felt self-managing was a good use of my time given the size of the organizations, the profile of the workloads and the cost differential. There is a whole universe of technology businesses that do not earn SV/FAANG levels of ROI - for them, self-managed is a reasonable allocation of effort.
One point to keep in mind is that the effort is not constant. Once you reach a certain level of competency and stability in your setup, there is not much difference in time spent.
I also felt that self-managed gave us more flexibility in terms of tuning.
My final point is that any investment in databases whether as a developer or as an ops person is long-lived and will pay dividends for a longer time than almost all other technologies.
Managing the PostgreSQL databases is a medium to low complexity task as I see it.
Take two equivalent machines, set up with streaming replication exactly as described in the documentation, add Bacula for backups to an off-site location for point-in-time recovery.
We haven't felt the need to set up auto fail-over to the hot spare; that would take some extra effort (and is included with AWS equivalents?) but nothing I'd be scared of.
Add monitoring that the DB servers are working, replication is up-to-date and the backups are working.
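A minimal sketch of that setup on a modern PostgreSQL (12+), with placeholder hostnames and paths - this is the documented approach, not our exact configuration:

```
# postgresql.conf on the primary (defaults are already close to this)
wal_level = replica
max_wal_senders = 10

# pg_hba.conf on the primary: allow the standby's replication user
host  replication  replicator  standby.example.com  scram-sha-256

# On the standby: clone the primary; -R writes primary_conninfo
# into postgresql.auto.conf and creates standby.signal
pg_basebackup -h primary.example.com -U replicator \
    -D /var/lib/postgresql/16/main -R -P
```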
> We haven't felt the need to set up auto fail-over to the hot spare; that would take some extra effort (and is included with AWS equivalents?) but nothing I'd be scared of.
This part is actually the scariest, since there are something like ten different third-party solutions of unknown stability and maintainability.
> Managing the PostgreSQL databases is a medium to low complexity task as I see it.
Same here. But I assume you have managed PostgreSQL in the past. I have. There are a large number of software devs who have not. For them, it is not a low complexity task. And I can understand that.
I am a software dev for our small org and I run the servers and services we need. I use ansible and terraform to automate as much as I can. And recently I have added LLMs to the mix. If something goes wrong, I ask Claude to use the ansible and terraform skills that I created for it, to find out what is going on. It is surprisingly good at this. Similarly I use LLMs to create new services or change configuration on existing ones. I review the changes before they are applied, but this process greatly simplifies service management.
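A workflow like the one described might look something like this minimal Ansible task file (the modules are real Ansible builtins; the role layout, version numbers and handler name are placeholders):

```yaml
# roles/postgres/tasks/main.yml - illustrative only
- name: Ensure PostgreSQL is installed
  ansible.builtin.apt:
    name: postgresql-16
    state: present

- name: Deploy tuned postgresql.conf from a template
  ansible.builtin.template:
    src: postgresql.conf.j2
    dest: /etc/postgresql/16/main/postgresql.conf
  notify: restart postgresql

- name: Ensure the service is running and enabled at boot
  ansible.builtin.service:
    name: postgresql
    state: started
    enabled: true
```

Keeping everything in files like this is also what makes the LLM review loop workable: the model proposes a diff to the task file, and a human reviews it before it is applied.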
> Same here. But I assume you have managed PostgreSQL in the past. I have. There are a large number of software devs who have not. For them, it is not a low complexity task. And I can understand that.
I'd say needing to read the documentation for the first time is what bumps it up from low complexity to medium. And then at medium you should still do it if there's a significant cost difference.
You can ask an AI or StackOverflow or whatever for the correct way to start a standby replica, though I think the PostgreSQL documentation is very good.
But if you were in my team I'd expect you to have read at least some of the documentation for any service you provision (self-hosted or cloud) and be able to explain how it is configured, and to document any caveats, surprises or special concerns and where our setup differs / will differ from the documented default. That could be comments in a provisioning file, or in the internal wiki.
That probably increases our baseline complexity since "I pressed a button on AWS YOLO" isn't accepted. I think it increases our reliability and reduces our overall complexity by avoiding a proliferation of services.
I hope this is the correct service, "Amazon RDS for PostgreSQL"? [1]
The main pair of PostgreSQL servers we have at work each have two 32-core (64-vthread) CPUs, so I think that's 128 vCPU each in AWS terms. They also have 768GiB RAM. This is more than we need, and you'll see why at the end, but I'll be generous and leave this as the default the calculator suggests, which is db.m5.12xlarge with 48 vCPU and 192GiB RAM.
That would cost $6559/month, or less if reserved, which I assume makes sense in this case: $106400 for 3 years.
Each server has 2TB of RAID disk, of which currently 1TB is used for database data.
That is an additional $245/month.
"CloudWatch Database Insights" looks to be more detailed than the monitoring tool we have, so I will exclude that ($438/month) and exclude the auto-failover proxy as ours is a manual failover.
With the 3-year upfront cost this is $115000, or $192000 for 5 years.
Alternatively, buying two of yesterday's [2] list-price [3] Dell servers which I think are close enough is $40k with five years warranty (including next-business-day replacement parts as necessary).
That leaves $150000 for hosting, which as you can see from [4] won't come anywhere close.
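For what it's worth, the arithmetic above checks out; here it is in a few lines (all figures are the ones quoted in this thread, nothing is recomputed from current AWS pricing, and the 5-year reserved rate is extrapolated linearly, which is an assumption):

```python
# Figures as quoted above; this only checks the totals are self-consistent.
reserved_3yr = 106_400      # db.m5.12xlarge, 3-year reserved upfront
storage_monthly = 245       # storage cost per month

aws_3yr = reserved_3yr + storage_monthly * 36
print(aws_3yr)              # 115220, roughly the $115000 quoted

# 5-year figure, extrapolating the 3-year reserved rate linearly
aws_5yr = round(reserved_3yr / 36 * 60) + storage_monthly * 60
print(aws_5yr)              # 192033, matching the $192000 quoted

dell_servers = 40_000       # two list-price servers, 5-year warranty
print(aws_5yr - dell_servers)  # 152033, the ~$150000 left for hosting
```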
We overprovision the DB server so it has the same CPU and RAM as our processing cluster nodes — that means we can swap things around in some emergency as we can easily handle one fewer cluster node, though this has never been necessary.
When the servers are out of warranty, depending on your business, you may be able to continue using them for a non-prod environment. Significant failures are still very unusual, but minor failures (HDDs) are more common and something we need to know how to handle anyway.
For what it's worth, I have also managed my own databases, but that's exactly why I don't think it's a good use of my time. Because it does take time! And managed database options are abundant, inexpensive, and perform well. So I just don't really see the appeal of putting time into this.
If you have a database, you still have work to do - optimizing, understanding indexes, etc. Managed services don't solve these problems for you magically and once you do them, just running the db itself isn't such a big deal and it's probably easier to tune for what you want to do.
Absolutely yes. But you have to do this either way. So it's just purely additive work to run the infrastructure as well.
I think if it were true that the tuning is easier if you run the infrastructure yourself, then this would be a good point. But in my experience, this isn't the case for a couple reasons. First of all, the majority of tuning wins (indexes, etc.) are not on the infrastructure side, so it's not a big win to run it yourself. But then also, the professionals working at a managed DB vendor are better at doing the kind of tuning that is useful on the infra side.
Maybe, but you're paying through the nose continually for something you could learn to do once, or that someone on your team could pick up with a little time and practice. If this is a tradeoff you want to make, that's fine, but when learning that extra 10% can halve your hosting costs, it's well worth it.
It's not the learning, it's the ongoing commitment of time and energy into something that is not a differentiator for the business (unless it is actually a database hosting business).
I can see how the cost savings could justify that, but I think it makes sense to bias toward avoiding investing in things that are not related to the core competency of the business.
This sounds medium to high complexity to me. You need to do all those things, and also have multiple people who know how to do them, and also make sure that you don't lose all the people who know how to do them, and have one of those people on call to be able to troubleshoot and fix things if they go wrong, and have processes around all that. (At least if you are running in production with real customers depending on you, you should have all those things.)
With a managed solution, all of that is amortized into your monthly payment, and you're sharing the cost of it across all the customers of the provider of the managed offering.
Personally, I would rather focus on things that are in or at least closer to the core competency of our business, and hire out this kind of thing.
You are right. Are you actually seriously weighing fully managed vs. self-managed at this point? Please just go the AWS route and thank me later :)
I ran through roughly our numbers here [1], it looks like self-hosted costs us about 25% of AWS.
I didn't include labour costs, but the self-hosted tasks (set up of hardware, OS, DB, backup, monitoring, replacing a failed component which would be really unusual) are small compared to the labour costs of the DB generally (optimizing indices, moving data around for testing etc, restoring from a backup).
Yes thank you for that. I always feel like these up front cost analyses miss (or underrate) the ongoing operational cost to monitor and be on call to fix infrastructure when problems occur. But I do find it plausible that the cost savings are such that this can be a wise choice nonetheless.
I really do not think so. Most startups should instead focus on their core competency and direct engineering resources toward their edge. Once you're at $100M ARR, feel free to mess around with whatever DB setup you want.