"2. Push all your application logic into Postgres functions and procedures"
Why are functions and procedures (an abstraction layer at db layer) considered harmful to performance when the same abstraction layer will be required at the application layer (introducing out of process overhead and possibly network traffic)? I don't agree with this advice. (Or I don't understand it.)
Worst mistake I've ever made was implementing logic in the db - it made for horrible debugging. It was only a few small bits of logic, but man, the number of gotchas years later from not realising something was there... I certainly think you either have to put all/most of your logic in the DB or none of it. Definitely not a sprinkling.
That's right, you either do all of it in the DB or none of it. Mixing the two makes long-term maintenance complicated unless your overall solution is very well documented and the documentation is very well maintained. That's two rare "very well"s.
I once worked for a place years ago that had these crazy data integrity bugs. There were about 35 different bugs, and we suspected that most of them were pieces of legacy code not cleaned up from acquisitions (it was a healthcare company).
Anyways, if you were able to solve one, you got a $5k bonus and a two week vacation. While I was there, someone figured one of them out, and it was due to logic in the DB. ...but I have a feeling that all of them were due to logic in the DB.
1. It's much easier to debug concurrency issues when you use SPs, but much harder to debug anything else.
2. At some point you will want to move some of your data into another system, and will have to pull the logic out into the application layer.
3. PL/pgSQL (or any other imperative SQL extension) isn't something you can find lots of devs on the market for.
4. Upgrades and rollbacks are much more painful and require downtime.
My team wrote a few critically important pieces of software that are running on Oracle, and here's why we did this:
1. Concurrency issues were the biggest pain point that we tried to avoid. We still had to fix a lot of bugs in the logic itself, and debugging them without unit tests was painful.
2. We were tightly integrated with another system written in PL/SQL. When we started on v2, an independent solution, I moved almost all logic out of the database except for the critical synchronization logic.
3. We had a veteran team of PL/SQL devs in house. We still needed to get a subcontractor that wrote the API layer in Java, something PL/SQL isn't suited for at all.
4. Upgrades and rollbacks were a pain, especially after we had to move to a 24x7 SLA that left us with no upgrade window. Oracle has edition-based redefinition, but Postgres doesn't.
A middle ground that has had some success is managing a queue in Postgres that hands business or application logic off to the app, whether it's a microservice or a monolith.
Yes, that's basically what we ended up doing: a queue of tasks in Postgres that a variable number of workers could access via a clever SP that encapsulated all inter-task conflicts and spat out the next task you were allowed to process.
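For anyone curious, the usual Postgres building block for this kind of worker-queue SP is `FOR UPDATE SKIP LOCKED`. A rough sketch (table and column names are made up, and this doesn't attempt the inter-task conflict logic the comment mentions):

```sql
-- Hypothetical task queue: each worker atomically claims the next
-- unclaimed task without blocking on rows other workers hold locked.
CREATE TABLE tasks (
    id         bigserial PRIMARY KEY,
    payload    jsonb NOT NULL,
    claimed_at timestamptz
);

-- The core of a "give me the next task" function:
WITH next_task AS (
    SELECT id
    FROM tasks
    WHERE claimed_at IS NULL
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED   -- skip rows locked by other workers
)
UPDATE tasks t
SET claimed_at = now()
FROM next_task
WHERE t.id = next_task.id
RETURNING t.id, t.payload;
```

Without `SKIP LOCKED`, concurrent workers would queue up behind each other's row locks instead of each grabbing a different task.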
Author seems to be arguing against long functions/procedures. But if you move that to the client, presumably with ORM support, you're going to be executing more or less the same sequence of SQL queries and commands. The only difference is that when doing it on the client you will have a lot of latency.
Yes, you can cache some data in between those commands to avoid repeating the same queries, and if you use temp tables to do so, they use memory only while it's available - otherwise they spill to disk, so you're limited by actual storage size rather than RAM.
The only time I had memory issues with PostgreSQL was when I put too much data in arrays and JSONs. Those are memory-only variables. For example, I'd return a huge JSON to the client and run out of memory on PostgreSQL. I started streaming row by row and the problem was solved.
Fwiw the specific case which motivated that section in the post was a set of recursive functions we used to denormalise an irregular graph structure (so not suitable for CTE) into a single blob of JSON to be sent to another data store. 99% of the time there were no issues with this but at times of heavier load and on complex subgraphs, those recursive call stacks contributed to severe replication lag on the replicas they were running on.
Moving the traversal logic into the application and just sending SQL queries to Postgres (we don't use an ORM) eliminated the lag. RTT between the application and the db was a few ms and this wasn't user-facing logic anyway, so extra latency wasn't an issue in this case.
Probably the fundamental problem here was a sub-optimal schema, but sometimes you're just working with what you've got. Plus a commenter on Reddit pointed out that if we used pure SQL functions instead of PL/pgSQL, we'd also have seen better performance then.
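For context on that last point: a function written in `LANGUAGE sql` can often be inlined into the calling query by the planner, whereas a `LANGUAGE plpgsql` function is an opaque call executed once per invocation. A toy illustration (function names are made up):

```sql
-- PL/pgSQL version: opaque to the planner, interpreted on each call.
CREATE FUNCTION total_plpgsql(price numeric, qty int) RETURNS numeric
LANGUAGE plpgsql AS $$
BEGIN
    RETURN price * qty;
END;
$$;

-- SQL version: simple enough that the planner can inline the body
-- into the calling query and optimise through it.
CREATE FUNCTION total_sql(price numeric, qty int) RETURNS numeric
LANGUAGE sql IMMUTABLE AS $$
    SELECT price * qty;
$$;
```

Inlining only happens when the SQL function meets certain conditions (single `SELECT`, appropriate volatility, not `SECURITY DEFINER`, etc.), but when it does, the per-call overhead disappears.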
> Probably the fundamental problem here was a sub-optimal schema, but sometimes you're just working with what you've got. Plus a commenter on Reddit pointed out that if we used pure SQL functions instead of PL/pgSQL, we'd also have seen better performance then.
So, would the better advice not have been to use simpler SQL instead of complex recursive statements, instead of taking a drastic approach to abandon ship (move logic to a completely new layer)?
Also, if you're doing string concats manually for your JSON, might this cause some overhead for larger objects?
> So, would the better advice not have been to use simpler SQL instead of complex recursive statements, instead of taking a drastic approach to abandon ship (move logic to a completely new layer)?
Probably, yep! But I didn't know that when I wrote it.
I didn't want to give any concrete advice at all tbh. The entire rationale for the post was that I'm not an expert and I've broken prod in some surprising ways and if I share those ways maybe it will stop other people making similar mistakes in future. But I guess I over-stepped in my discussion for this mistake, sorry about that.
> Also, if you're doing string concats manually for your Json, this might cause some overhead for larger objects. ??
Good point, I hadn't considered that part of it. It wasn't string concats, we were building it with `jsonb_set`, but I can definitely see the JSON structure in memory as being part of the problem now you mention it (although maybe that reinforces the argument for doing it in the application layer).
I’m fully against using triggers to implement business logic, but find stored procedures can be great for encapsulating some elements of application/business logic. I’ve got several applications that access the same databases, and putting logic in the database avoids needing to duplicate it in clients.
Most comments about debuggability are nonsense. It's just different, with some pros and cons. One simple example: if you have a bug in production, you're not going to attach a debugger to your production application. But you can absolutely open a readonly connection to the database and start running queries, invoking functions, etc. It helps if you architect your functions to distinguish pure readonly functions from those with side effects, but you can still debug even if that's not the case.
I think this is a commonly stated fact, but I don't find it particularly true. Like any other technology, you just need to put in some initial effort to set up your test framework. In the case of PostgreSQL, pgTAP does a great job.
It is commonly stated and I found it to be very true. PostgreSQL is quite advanced in its procedural aspects (Oracle isn't too far behind either), but they were not designed with a particular focus on debugging. I need hacks like creating temp tables to dump records at a given stage vs. simply setting a breakpoint. I can unit test the shit out of bog-standard Java code; PL/SQL for all its capabilities doesn't even come close. The one area where this tilts the other way is when you need to do heavy processing on a high volume of data on the database side; a well-written stored proc will handily outperform application-side logic simply due to the network latency involved. But for typical use cases, putting complex business logic in stored procs just isn't worth it.
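For what it's worth, the "temp table instead of a breakpoint" hack looks something like this (sketch - `staging_rows` and the batch id are hypothetical):

```sql
DO $$
DECLARE
    row_cnt bigint;
BEGIN
    -- Poor man's breakpoint: snapshot intermediate state into a temp
    -- table so it can be inspected after the block finishes.
    CREATE TEMP TABLE debug_snapshot AS
    SELECT * FROM staging_rows WHERE batch_id = 42;

    SELECT count(*) INTO row_cnt FROM debug_snapshot;

    -- Poor man's log line; shows up as a NOTICE on the client.
    RAISE NOTICE 'batch 42: % rows snapshotted', row_cnt;
END;
$$;
```

It works, but compared to setting a breakpoint and inspecting variables in an IDE, it's clearly a workaround.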
The context here was that it’s not free, as I understood it. So, moving logic to the database, might make it faster, but that doesn’t mean that it’s instantaneous or that I no longer have to think about the scaling concerns of it.
So, personally, I read that section as "logic in the database is not a zero-cost abstraction."
In theory it's attractive to perform business logic in the DB with functions and procedures but in practice the "devops experience" is painful.
Functions are restricted in their available languages and in your ability to A/B test and scale them. There's also complexity entailed by having two sources of business logic: people can forget which one does what, and you need to constantly switch back and forth between procedures and app code when debugging.
Additionally, the networking, resiliency, and update patterns of databases are often not well suited to functions. You may want your functions to have public internet access but not your DB, or rollbacks of your data but not of your function versions.
All of these issues can be overcome by people who're confident DBAs and sysadmin types in addition to being application developers but that's a small group of people.
I wish there were more startups in this area working to improve the developer experience of DB functions because you're correct about the superior performance and I'm pretty sure most of the issues I raised could be solved with well thought out tooling. However, at the moment such tools don't exist so these functions are painful to use.
It is always easier to scale horizontally at the application layer (just add more servers) than at the database layer (which involves syncing data between multiple database instances).
In my experience, more often than not Postgres performance problems aren't really caused by the database, but by badly designed schemas or queries. For a lot of developers, the thinking goes that 10s of millions of rows sounds a lot like big data, so they must start building microservices, distributed systems, and K/V stores and horizontally scale a la Google, whereas their entire dataset could actually fit in the RAM of a single server and they could avoid the majority of the complexity if they just scaled vertically, which is usually much, much easier.
I think many people underestimate the capabilities of SQL databases by a couple orders of magnitude. I once worked on a feature that integrated tightly with a third-party service. Their API didn't have any support for aggregate queries, and my company was smaller without a real BI or data team, so I ended up writing a tool to dump our account data into a local Postgres db in order to do some data analysis. By the time I left the company that db was approaching 50 GB, the table holding the primary data had about 40 million rows, and a couple of the supporting tables were over 100 million rows. This was all on a 2018-era Dell dev laptop - a fairly powerful machine (6 core/12 thread, 32 GB RAM, SSD), but certainly no server. It took about 90 seconds to update the materialized views that summarized the data I looked at most frequently. More than acceptable for my use case, and there was a lot of room for improvement in that schema (it was pretty much a straight dump of the API data).
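The materialized-view part of that workflow is roughly this shape (table and column names here are invented, not the actual schema):

```sql
-- Hypothetical daily summary over the dumped API data.
CREATE MATERIALIZED VIEW daily_totals AS
SELECT date_trunc('day', created_at) AS day,
       count(*)                      AS events,
       sum(amount)                   AS total
FROM raw_events
GROUP BY 1;

-- A unique index is required for CONCURRENTLY refreshes,
-- which don't block readers of the view.
CREATE UNIQUE INDEX ON daily_totals (day);

-- The periodic "update the summaries" step:
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_totals;
```

The refresh recomputes the whole view, which is where a ~90-second runtime over tens of millions of rows is entirely plausible on a laptop.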
Came here to say exactly this. Over the last ~12 years working with PostgreSQL I've dealt with quite a few performance-related issues - almost all were poorly written queries.
Start with EXPLAIN ANALYZE and work from there. You can use tools where you paste its output and it shows you the data in a more easily understood format.
I don't have any other good recommendations beyond learning how to read and use EXPLAIN [1]. This should typically be the first tool when you have a slow query.
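For a concrete starting point, this is the usual incantation (the query itself is just an example):

```sql
-- ANALYZE actually executes the query and reports real timings;
-- BUFFERS adds shared-buffer hit/read counts per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days';
```

The things to look for first are a `Seq Scan` where you expected an index, row-count estimates wildly off from actuals, and nodes whose actual time dominates the total.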
100%. Exaggerating the bigness of their own data is a common phenomenon. Sometimes you're talking to a group of developers who are all so impressed with this bigness, yet every one of them has a phone in their pocket that could fit their entire dataset.
A lot of this comes from people who want "horizontal" scaling - the camp that thinks everything should be in the middle tier (Java/C#/etc.). Also, cost on AWS is cheap for those and expensive for RDS. In the end the DB will be the bottleneck. Of course DevOps people can also create a cache layer etc. to lessen the stress on the db.
I think the article is kinda mixing two points here.
On the one hand, it is sensible to try and keep all your business logic in one place (could be the database, could be the application), as spreading it across multiple places can make it hard to maintain.
The current trend is to do your business logic in the application and treat the db as a data storage layer. The point in the article is that if you're using this model, but then stored procs and functions start creeping in to your db and it turns out they're actually doing some heavy lifting, then this can negatively impact the performance of your 'data storage layer' (which is actually not a data storage layer any more).
Why is splitting logic between Postgres and an application considered worse than splitting it between multiple microservices? A DB is a storage service with INSERT/SELECT/etc. as an API. Why can't we extend this API to include stored procedures too? Indexes are commonly used to enforce data integrity. Why can't we use triggers to do this even better?
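As a sketch of the kind of trigger-enforced invariant being described - a cross-row rule a plain CHECK constraint can't express (schema entirely hypothetical):

```sql
-- Reject overlapping bookings for the same room.
-- Assumes: bookings(id, room_id, period tstzrange).
CREATE FUNCTION reject_overlap() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF EXISTS (
        SELECT 1 FROM bookings b
        WHERE b.room_id = NEW.room_id
          AND b.id <> NEW.id
          AND b.period && NEW.period   -- range overlap operator
    ) THEN
        RAISE EXCEPTION 'overlapping booking for room %', NEW.room_id;
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER bookings_no_overlap
BEFORE INSERT OR UPDATE ON bookings
FOR EACH ROW EXECUTE FUNCTION reject_overlap();
```

(For this particular rule an `EXCLUDE USING gist` constraint is actually the better tool, and unlike the trigger it's safe under concurrent inserts - but triggers cover invariants that no declarative constraint can.)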
For sure, it doesn't matter where you put it as long as it's in one place. Look at a company like Supabase and their product PostgREST: it's just a much faster way to develop an API, and it will scale too. Often it's about how one scales 'horizontally'.
I've been out of the trenches for some time, but when I participated in projects that relied heavily on stored procedures, we felt constrained in terms of flexibility (the language options were very restricted, and we could not use common libraries), tooling (IDE support was not great, nor was it straightforward to debug the code), and scalability (vertical instead of horizontal). This approach also introduced heavy coupling.
It is true that we were much more familiar with application layer technologies, but the lack of expertise can also be considered a restriction, I think.
While I agree with all the other commenters about debugging and scaling issues, at least some of the time the stored procedure route can be very powerful.
Stored procedures eliminate insane levels of latency when there are many records to be updated in ways that are hard to do in application-layer code without repeated calls to the db. I use them a lot for DB maintenance. Often for that kind of work it's also a lot simpler and easier to reason about than app-layer code.
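The difference is one set-based statement versus N application-side round trips. A minimal sketch of the set-based form (table and column names invented):

```sql
-- Apply all pending payments in a single statement, instead of
-- SELECTing the list and issuing one UPDATE per row from the app.
UPDATE accounts a
SET balance = a.balance + p.amount
FROM pending_payments p
WHERE p.account_id = a.id
  AND p.applied = false;

-- (Real code would also mark the payments applied in the same
-- transaction, e.g. with a second UPDATE on pending_payments.)
```

With even 1 ms of round-trip time, doing this row by row from the app costs ~17 minutes per million rows in latency alone; the single statement pays that cost once.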
You're better off keeping heavy processing away from the database. Your application layer can scale almost indefinitely, and the main bottleneck for a typical system is usually the database.
As a rule, processing cost should give you a default bias toward moving anything away from the database. Multiple sources, the need for temporary storage, and the existence of concerns that don't deal directly with your data should bias you further toward moving your code away from the database.
On the other hand, data consolidation and enforcing non-local (to a record) rules should bias you towards moving your code into the database. If a lot of those happen, moving it there may even reduce the load of your database.
Any one sided advice here is guaranteed to be wrong.
I work with all major RDBMS on the market (I integrate with ERPs/accounting packages, so even niche RDBMS and things it's an insult to call a DBMS).
ANYONE that has a problem with RDBMS "functions and procedures & views (!)" is invariably mishandling the RDBMS: bad schemas; zero refactoring in the DB after years/decades (!) of cruft; poor re-implementation of things the RDBMS already has (like, for example, date types); procedural code doing stuff that SQL already does more easily and in less time; overly big SQL that never, ever uses VIEWS to abstract things away; an RDBMS that was never upgraded (or is assumed it never will be), so nothing invented in the last ~10 years gets used.
And that is a short list.
---
If you consider the RDBMS the BEST programming language environment (sans SQL, but still better than most languages for data!) and apply the most BASIC ideas around it - think a little about how to design your schemas considering the queries you will run, do some refactoring at least once every 5 years (pls!), use the CORRECT data types (pls pls!), use VIEWS (pls pls pls!), etc. - your logic in triggers/functions MUST BE short and EASY.
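On the "use VIEWS to abstract away" point, the idea is simply to name the hairy joins once so every query (and future refactor) touches one place - a minimal sketch with made-up tables:

```sql
-- Hide the join and filter behind a stable name; callers just
-- SELECT from the view, and schema refactors only touch this spot.
CREATE VIEW active_customer_orders AS
SELECT c.id   AS customer_id,
       c.name AS customer_name,
       o.id   AS order_id,
       o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.status = 'active';
```

Postgres inlines plain views into the calling query, so this abstraction normally costs nothing at execution time.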
Having application logic in the db is harder to debug and test (and possibly scale, but that also depends). But as you mention, it can be much faster if the logic is working on a lot of records.
Also, IME, the data store often far outlives the original application. Having the logic tightly coupled to the data model means future applications are less likely to break the assumptions made in the original data model.
Think about it this way: you have to implement the same amount of business logic in any case. The only question being discussed here is where the work will be performed. The author is talking about scaling. If you have 3 server instances but 1 database instance, it's better (generally speaking) to put the logic in the servers because there are 3 of them. That will scale better. In the case of Postgres, even if you have replicas, they'll be read replicas. If you put everything in Postgres you are putting everything in the bottleneck. You can add more server instances but you can't add more database writers.
I'd say because it is a much more specialized skill than programming in python / ruby / JS / or whatever your app language. Ideally, I would say "use the best tool for the job", which may very well be a stored procedure for data locality reasons, but practically speaking, with a larger team, you may be asking for trouble.