Sorry for not answering everyone individually, but I see some confusion due to the lack of context about what we do as a company.
First things first, Nhost falls into the category of backend-as-a-service. We provision and operate infrastructure at scale, and we also provide and run the necessary services for features such as user authentication and file storage, for users creating applications and businesses. A project/backend comprises a Postgres database and the aforementioned services; none of it is shared. You get your own GraphQL engine, your own auth service, etc. We also provide the means to interface with the backend through our official SDKs.
Some points I see mentioned below that are worth exploring:
- One RDS instance per tenant is prohibitive from a cost perspective, obviously. RDS is expensive and we have a very generous free tier.
- We run the infrastructure for thousands of projects/backends which we have absolutely no control over what they are used for. Users might be building a simple job board, or the next Facebook (please don't). This means we have no idea what the workloads and access patterns will look like.
- RDS is mature and a great product, AWS is a billion-dollar company, etc - that is all true. But it is also true that we cannot control whether a user's project is missing an index, and RDS does not provide any means to limit CPU/memory usage per database/tenant.
- We had a couple of discussions with folks at AWS and, for the reasons already mentioned, there was no obvious solution to our problem. Let me reiterate: the folks who own the service didn't have a solution to our problem given our constraints.
- Yes, this is a DIY scenario, but this is part of our core business.
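To make the resource-limiting point concrete: this is not Nhost's actual setup, just an illustrative sketch of the kind of per-tenant cap that RDS lacks but that running each tenant's Postgres in its own container gives you. The container name, image tag, password, and limit values are all assumptions:

```shell
# Hypothetical example: cap one tenant's Postgres at 1 CPU and 2 GiB RAM.
# A tenant's runaway query (e.g. from a missing index) then exhausts only
# its own quota instead of starving every tenant on the instance.
docker run -d \
  --name tenant-42-postgres \
  --cpus="1.0" \
  --memory="2g" \
  --memory-swap="2g" \
  -e POSTGRES_PASSWORD=changeme \
  postgres:15
```

On Kubernetes the same idea is expressed as `resources.requests`/`resources.limits` on the tenant's pod spec, enforced by cgroups on the node.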
I hope this clarifies some of the doubts. I expect to publish a more detailed and technical blog post about our experience soon.
By the way, we are hiring. If you think what we're doing is interesting and you have experience operating Postgres at scale, please write me an email at nuno@nhost.io. And don't forget to star us at https://github.com/nhost/nhost.
Indeed RDS was never designed to be "re-sold", and assuming that a single PG instance will handle lots of different users is naive. Turns out if you're aiming to be an infra provider, building your own infra is the way to go. Who would have thought?
If I were launching a BaaS I wouldn't touch AWS. Grab a few Hetzner bare metal servers and set up your infra. You're leaving a massive profit margin to AWS when you don't have to.
Also would like to know this. This post is a bit light on content. It sounds like they just moved to K8s from RDS. In my experience, Postgres works decently but there are sharp edges running it containerized (OOMs in subprocesses might not be caught by the container runtime, and shared memory in Docker is pitifully low at 64 MB by default).
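For anyone who hasn't hit the shared-memory edge mentioned above: Docker mounts `/dev/shm` at 64 MB by default, which is far too small for Postgres's dynamic shared memory (parallel query workers, etc.). A sketch of how you'd check and fix it (the `postgres:15` tag and sizes are illustrative assumptions):

```shell
# Inspect the container's /dev/shm; with no flags, Docker's default
# tmpfs mount there is 64 MB.
docker run --rm postgres:15 df -h /dev/shm

# Raise it explicitly when running Postgres in a container:
docker run -d --shm-size=256m -e POSTGRES_PASSWORD=changeme postgres:15
```

On Kubernetes the equivalent workaround is an `emptyDir` volume with `medium: Memory` mounted at `/dev/shm`.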
From other comments, it looks like they rolled their own solution. Perhaps they had unique requirements, but it seems short-sighted to forego the automation an operator brings.
And what are your cost savings compared to RDS? I had a similar problem where we had to provision five databases for five different teams. RDS is really expensive. Is your solution open source? I would like to try it.
I hope to have a more detailed analysis to share when we have more accurate data. We launched individual instances recently and although I don't have exact numbers, the price difference will be significant. Just imagine how much it would cost to have 1 RDS instance per tenant (we have thousands).
We haven't open-sourced any of this work yet but we hope to do it soon. Join us on discord if you want to follow along (https://nhost.io/discord).
I'm guessing that they're betting that they can put X idle customers on one machine, and so pay X/machine cost for their free tier.
A while ago, I worked for a company that offered a hosted version of their application that required Postgres, etcd, Kubernetes, etc. It was set up so that every customer got their own GCP project, containing a K8s cluster, Cloud Storage, and a Postgres instance. The k8s cluster ("workspace") then contained dedicated nodes (4vCPU x 16G RAM at a minimum, autoscaling up according to their workload including GPU compute), SSDs, a public-facing LoadBalancer, etc. This is good for per-customer isolation, but quite costly at idle, on the order of several hundred dollars a month. Users expect this kind of isolation (and need the SOC2 and similar checkmarks for sure), but they don't expect to be charged when they're not running anything, which was a problem for us.
If I were doing this again, I would do it differently, at least for the MVP. One option is to make the application multi-tenant aware and isolate at the application level instead of at the GCP project level. This might be more difficult to get certified and might not meet everyone's HIPAA-like compliance goals, but it is a good starting point, especially for free trials.
The other option that was very appealing to me is to give each user a VM that just gets de-scheduled when no requests are being made. Instead of k8s managing nodes, nodes would manage k8s. The downside there is that cluster size is limited to whatever the largest node you can buy is, but honestly, 448vCPUs is a ton (AWS's max instance size at the moment), so it's a very workable solution.

When users sign up, create a VM image that runs K8s, Minio, Postgres, etc. and route traffic to it with a shared L7 router/front proxy. If their workloads autoscale up, freeze and migrate the VM to a machine with more resources. If they're not using it for a while, freeze it completely, and reprogram your front proxy to point at a program that waits for an RPC / web request and starts up the VM when one comes in.

Now your idle cost is the cost of your block storage, modulo deduplication, instead of dedicated CPU cores and RAM. You also get a lot of knobs to control your actual compute cost; you aren't reliant on your users provisioning spot instances from their cloud provider, you can just tell cron jobs to run when CPU load is lowest, or set your own rate to incentivize off-peak usage. And, you can pretty much get away with charging nothing for idle instances, limit free trials in aggregate to X CPU cores, etc. I think it would have been good, though complex.
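A minimal sketch of the "wake the VM on first request" idea above, assuming libvirt-managed VMs, with `nc` and `socat` doing the forwarding. Every name, address, and port here is a hypothetical stand-in; a real front proxy would do this in-process rather than via a shell shim:

```shell
#!/bin/sh
# Hypothetical wake-on-request shim: the front proxy hands idle tenants'
# connections to this script (e.g. via socat's EXEC handler or systemd
# socket activation). It resumes the tenant's stopped VM, waits for the
# service port to answer, then splices the connection through.
TENANT_VM="tenant-42"           # assumed libvirt domain name
TENANT_HOST="10.0.42.10"        # assumed VM service address
TENANT_PORT="5432"

# Start the VM if it is not already running (virsh is libvirt's CLI;
# a domain with a managed-save image resumes from it on start).
virsh domstate "$TENANT_VM" | grep -q running || virsh start "$TENANT_VM"

# Wait until the service answers, then forward stdin/stdout to it.
until nc -z "$TENANT_HOST" "$TENANT_PORT"; do sleep 0.5; done
exec socat - "TCP:${TENANT_HOST}:${TENANT_PORT}"
```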
TL;DR: RDS is a highly-available always-on service. But customers might not want HA or always-on. By being able to turn off the database at the right moment, you can save a lot of money on compute, which makes things like good free trials more economically viable. I think OP is on the right track to a successful k8s-based business and wish them great luck!