Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The recently released sqlite_rsync utility uses a version of the rsync algorithm optimized to work on the internal structure of a SQLite database. It compares the internal data pages efficiently, then only syncs changed or missing pages.

Nice tricks in the article, but you can more easily use the builtin utility now :)

I blogged about how it works in detail here: https://nochlin.com/blog/how-the-new-sqlite3_rsync-utility-w...



Also note:

sqlite3_rsync is now built into the rsync.net platform.

  ssh user@rsync.net sqlite3_rsync … blah blah …
… just added last week and not rolled out in all regions but … all initial users reported it worked exactly as they expected it to.


sqlite_rsync can only be used in WAL mode. A further constraint of WAL mode is the database file must be stored on local disk. Clearly, you'd want to do this almost all the time, but for the times this is not possible this utility won't work.


I just checked in an experimental change to sqlite3_rsync that allows it to work on non-WAL-mode database files, as long as you do not use the --wal-only command-line option. The downside of this is that the origin database will block all writers while the sync is going on, and the replicate database will block both reads and writers during the sync, because to do otherwise requires WAL-mode. Nevertheless, being able to sync DELETE-mode databases might well be useful, as you observe.

If you are able, please try out this enhancement and let me know if it solves your problem. See <https://sqlite.org/src/info/2025-05-01T16:07Z> for the patch.


Update: This enhancement is now on trunk and will be included in the 3.50.0 release of SQLite due out in about four weeks.


WAL mode works on many network filesystems provided it's being written from a single host at a time.


I'm not sure understand your comment. Regardless of WAL or network filesystem usage, the sqlite file cannot be written to from multiple processes simultaneously. Am I missing something here, or did you misstate?


You can have multiple writer connections from multiple writer processes (on one host at a time). Their transactions will be safely serialized by SQLite.

So, sure, from a low-level point of view, the write transactions are not really simultaneous (they never are with SQLite3), but from the user perspective they are in the sense that you don't need to serialize them yourself.


Demands increasing page size if you sync frequently (bandwidth).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: