REINDEX concurrently made a huge difference for us.

We have a main database that started at version 9.6 and was upgraded along the way. The largest table is huge (billions of rows, terabytes of disk space) and gets a lot of deletes and updates.

Vacuums could no longer finish on that table (we killed it after ~90 days).

Reindexing (plus a vacuum that skips index cleanup) dropped our db load from 60 to 20 and fixed the autovacuums, which now finish in under a day. The indexes on that table had accumulated a lot of bloat, and I think newer Postgres versions also improved the on-disk index layout.
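
For reference, the relevant commands look roughly like this (table and index names are placeholders, and the exact options may differ from what we run; I'd expect the skip-index-cleanup vacuum to be the INDEX_CLEANUP option, and both it and REINDEX CONCURRENTLY need Postgres 12 or newer):

    -- Rebuild one bloated index without blocking writes on the table
    REINDEX INDEX CONCURRENTLY big_table_customer_id_idx;

    -- Or rebuild every index on the table in one go
    REINDEX TABLE CONCURRENTLY big_table;

    -- Vacuum the heap but skip the index cleanup phase,
    -- so the pass can actually finish
    VACUUM (INDEX_CLEANUP OFF, VERBOSE) big_table;

The concurrent rebuild needs roughly a second copy's worth of disk for each index while it runs, and if it fails it leaves an invalid index behind that you have to drop by hand.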

We now have a monthly cron job to reindex all indexes.
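
The cron job is just a thin wrapper around a single statement, something along these lines (database name is a placeholder; a database-wide concurrent reindex also needs 12+ and skips the system catalogs):

    -- Rebuild every index in the database, one at a time, without blocking writes
    -- (cannot run inside a transaction block)
    REINDEX DATABASE CONCURRENTLY mydb;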



Have you tried pg_repack? We had some great success with it on large tables with lots of bloat.
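
If you want to try it, the setup is roughly this (db/table names are placeholders; it needs the server-side extension plus the command-line client, and enough free disk to hold a full copy of the table while it works):

    -- Install the extension in the target database first
    CREATE EXTENSION pg_repack;
    -- The rebuild itself is driven by the pg_repack command-line client, e.g.:
    --   pg_repack --table=public.big_table mydb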


It’ll probably shave 50% off that table’s disk space.

However, we’re actually in the process of sharding the db, and since we’re copying data over customer by customer, we’ll shed the bloat along the way. The resulting shards will be a lot more manageable, so we can run pg_repack on them with more confidence.


How are you measuring your db load?


I just assumed they were talking about regular system load.


Correct. Just plain Unix load (i.e. the number of concurrently busy processes).



