If you want to send out a series of long-running SQL calls and wait for them all in a batch, that's doable with Postgresql's async support, but they'd each be on a distinct database connection, so you wouldn't get transactional consistency between the calls; maybe that's not important. You could do the same thing with threads, though that means spinning up that many threads; at least it's something you could test in the short term to see whether it's actually feasible.
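As a rough sketch of that pattern (assuming aiopg and Python 3.5+ async/await syntax; the DSN and queries here are made up), each query gets its own pooled connection and the whole batch is awaited at once:

    import asyncio
    import aiopg

    DSN = "dbname=erp user=app host=127.0.0.1"  # hypothetical connection string

    async def run_query(pool, sql):
        # each call checks out its own connection, so there is no shared transaction
        async with pool.acquire() as conn:
            async with conn.cursor() as cur:
                await cur.execute(sql)
                return await cur.fetchall()

    async def main():
        # stand-ins for the long-running queries
        queries = ["SELECT pg_sleep(2), 1", "SELECT pg_sleep(2), 2"]
        async with aiopg.create_pool(DSN) as pool:
            # fire them all off and wait for the whole batch
            return await asyncio.gather(*(run_query(pool, q) for q in queries))

    results = asyncio.get_event_loop().run_until_complete(main())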
The rudimentary SQLAlchemy-emulation system within aiopg can probably accommodate this use case right now, but it is Postgresql-specific. "Legacy ERP system" sounds like there's a different database in play there; if you are relying upon closed-source drivers, you'd have to get access to a non-blocking API within them. Otherwise you're stuck with threads.
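For what it's worth, the aiopg.sa layer looks roughly like this (just a sketch; the table and connection parameters are invented for illustration):

    import sqlalchemy as sa
    from aiopg.sa import create_engine

    metadata = sa.MetaData()
    # hypothetical table for illustration only
    orders = sa.Table("orders", metadata,
                      sa.Column("id", sa.Integer, primary_key=True),
                      sa.Column("status", sa.String(32)))

    async def fetch_open_orders():
        # the engine wraps a pool of async psycopg2 connections
        async with create_engine(dsn="dbname=erp user=app host=127.0.0.1") as engine:
            async with engine.acquire() as conn:
                rows = []
                # SQLAlchemy Core expressions, executed without blocking the loop
                async for row in conn.execute(
                        orders.select().where(orders.c.status == "open")):
                    rows.append(row)
                return rows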
acveilleux's point about caching here is very relevant and I thought of that also. If these are indeed independently read sets of fairly static data, pulling them from a cache is the more traditional approach to reducing latency (as well as to reducing the need for 800+ database connections).
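Even a tiny in-process cache illustrates the idea (names and TTL are made up; a real deployment would more likely put memcached/redis or an HTTP cache in front):

    import time

    _cache = {}
    CACHE_TTL = 30  # seconds; arbitrary for illustration

    def cached_query(conn, sql):
        # serve a recent result if we have one, otherwise hit the database
        now = time.time()
        hit = _cache.get(sql)
        if hit and now - hit[0] < CACHE_TTL:
            return hit[1]
        with conn.cursor() as cur:
            cur.execute(sql)
            rows = cur.fetchall()
        _cache[sql] = (now, rows)
        return rows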
Postgres lets you synchronize snapshots across connections so that they all see the same data (though subsequent changes are not visible to the other transactions unless you export another snapshot).
http://www.postgresql.org/docs/9.4/static/functions-admin.ht...
This lets you parallelize work across multiple processes while maintaining consistency. When the client is the limiting factor, you can use this with multiprocessing; when the db server is the limiting factor, you can just use threads (or async). Postgres backend processes are essentially single-threaded.
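A bare-bones sketch of what that looks like from Python with psycopg2 (the connection string and table are made up; pg_export_snapshot() and SET TRANSACTION SNAPSHOT are the Postgres pieces, and the exporting transaction has to stay open while the workers import the snapshot):

    import psycopg2
    import psycopg2.extensions

    DSN = "dbname=app"  # hypothetical connection string

    # exporting connection: open a REPEATABLE READ transaction and export its snapshot
    exporter = psycopg2.connect(DSN)
    exporter.set_session(
        isolation_level=psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ)
    with exporter.cursor() as cur:
        cur.execute("SELECT pg_export_snapshot()")
        snapshot_id = cur.fetchone()[0]

    # each worker connection adopts the same snapshot, so they all see identical data
    worker = psycopg2.connect(DSN)
    worker.set_session(
        isolation_level=psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ)
    with worker.cursor() as cur:
        cur.execute("SET TRANSACTION SNAPSHOT %s", (snapshot_id,))
        cur.execute("SELECT count(*) FROM some_table")  # hypothetical table
        print(cur.fetchone())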
So caching. Doesn't the db do that? And as much as I hate to say it, are the added complexities (webserver caches) better than the even more traditional approach - throw hardware at it? Always lots to think about. Thanks!
When I was more junior I was always told to use caching as a last resort. It's a good attitude to take to make sure you're not doing something stupid and hiding it with caching. These days though I look for caching opportunities up-front. In fact, I'll design with them in mind.
I did some work a while back for a client who were expecting a lot of read load. Their backend had a bunch of constantly changing data in mongo, but it only refreshed every 10 seconds. I initially told them to just output the aggregated data to an S3 object and have all the clients fetch it from there. They decided to run loads of servers instead, muttering something about AWS Autoscale (even though I told them that wouldn't help).
As expected, I got a call one Friday evening asking if I could take a look at why their servers were timing out. When I got there, there were about 15 frontend servers hammering one poor mongo box that was running the same aggregation again and again, and within any 10-second window always getting the same result. I stripped it down to one frontend box with an nginx cache (after jumping through a hoop to support JSONP).
After the dust settled, they apparently didn't want to admit that it was something that could be solved just with caching, so it was described to the business as a configuration issue.