
There are some easy answers here:

* Bigger blocks = better performance. The bigger you can make them, the faster you'll go. Your limiting factor is usually the resolution the user actually needs (i.e. aggregating into larger blocks will inevitably result in under-utilized space). A rough read-size comparison follows after this list.

* Disk, SSD and NFS don't all belong in the same category. Most modern storage products are developed with the expectation that the media is SSD; virtually nobody wants to enter the HDD market. The performance gap is just too big, and the existing products that still use HDDs rely on fast caching in something like flash memory anyway. NFS is a hopelessly backwards and outdated technology. It's the lowest common denominator, which is why various storage products still support it, but if you want to go fast, forget about it. The tradeoff here is usually between writing your own client (typically a kernel module) to do I/O efficiently, or sparing users the need to install a custom kernel module (often a security-audit issue) and letting them go slow...

* "OS disk cache" is somewhat of a misnomer, and two different things tend to get confused here. The OS doesn't cache data written to the disk -- the disk itself does; the OS just provides the mechanism to talk to the disk and instruct it to flush that cache. Separately, there's the filesystem (page) cache -- that's what the OS does: it keeps the contents of recently accessed files in memory it manages. A small write/fsync sketch follows after this list.

* I/O through mmap is a gimmick -- just one of the ways to abuse a system API to do something it's not really intended to do. You can safely ignore it. If you are looking into making I/O more efficient, look into io_uring (a minimal example follows after this list).
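
To make the block-size point concrete, here is a rough C sketch that reads the same file sequentially with different read() sizes. The /tmp/testfile path and the size range are made up for illustration; real numbers depend on the device and on dropping caches between runs.

    /* Rough sketch: time sequential reads of the same file at different block
       sizes. Path and size range are arbitrary placeholders. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    static double read_whole(const char *path, size_t bs)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); exit(1); }

        char *buf = malloc(bs);
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        while (read(fd, buf, bs) > 0)
            ;                                   /* plain sequential read loop */

        clock_gettime(CLOCK_MONOTONIC, &t1);
        free(buf);
        close(fd);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void)
    {
        const char *path = "/tmp/testfile";     /* hypothetical test file */
        for (size_t bs = 4096; bs <= 4 * 1024 * 1024; bs *= 4)
            printf("bs=%8zu bytes: %.3f s\n", bs, read_whole(path, bs));
        return 0;
    }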

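For the cache distinction, a minimal sketch: write() only lands data in the kernel page cache, and fsync() is the call that asks the kernel to push it to the device (and have the device flush its own volatile write cache). The file name is arbitrary.

    /* Rough sketch: write() puts data in the kernel page cache; fsync() forces
       it to the device and asks the device to flush its own volatile cache. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/journal.log", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        const char rec[] = "commit record\n";
        if (write(fd, rec, strlen(rec)) < 0) { perror("write"); return 1; }
        /* Here the record may still live only in the page cache. */

        if (fsync(fd) < 0) { perror("fsync"); return 1; }
        /* Only after fsync() returns should the record be considered durable. */

        close(fd);
        return 0;
    }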

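And a minimal io_uring example using liburing (assumed installed; link with -luring), doing a single asynchronous 4 KiB read from a hypothetical /tmp/testfile:

    /* Rough sketch: one asynchronous read via io_uring, using liburing. */
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        struct io_uring ring;
        if (io_uring_queue_init(8, &ring, 0) < 0) {
            fprintf(stderr, "io_uring_queue_init failed\n");
            return 1;
        }

        int fd = open("/tmp/testfile", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        char buf[4096];
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);  /* 4 KiB read at offset 0 */
        io_uring_submit(&ring);                            /* hand the request to the kernel */

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);                    /* wait for the completion */
        printf("read returned %d\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        close(fd);
        io_uring_queue_exit(&ring);
        return 0;
    }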

I've spent a lot of my career automating datacenter and HPC environments and I disagree with several of these points.

* Big distributed storage systems still use HDDs, usually within a tiered system that also includes SSDs and NVMe.

* A good nfs server implementation will beat the pants off all the cloud vendors. It's still highly relevant in physical datacenters.

* Mmap is used heavily in a ton of software for good reason. On top of that it's part of the POSIX API.

* While block size is one of those things where it usually doesn't matter until it does, just saying bigger blocks are faster is a bit misleading.


For the record, I work in an HPC environment, but my background is originally in storage.

> Big distributed storage systems still use HDDs

So what? Did you read what I wrote? I wrote about developing new systems, not supporting old ones...

> A good nfs server implementation will beat the pants off all the cloud vendors.

What are you even talking about? What do cloud vendors have to do with this? Did you read what you replied to?

> Mmap is used heavily in a ton of software for good reason

So what? OP is asking in the context of writing a database / doing disk I/O. It's the wrong system API for that. It's intended to let applications "easily" save their in-memory data; if that's what it's used for, then it's fine. If it's used to implement a filesystem, then the filesystem's authors don't understand what they are doing. Also, being part of POSIX or any other standard doesn't grant magical immunity to being bad functionality... just look at the history of UNIX / Linux repeatedly failing to come up with a usable interface for asynchronous I/O -- and sure enough, all of those iterations made it into the standard.
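
For reference, this is the mmap read path being argued about -- a rough sketch with a made-up file name. The point of contention is that page-fault timing and writeback are left entirely to the kernel, instead of the explicit read()/write()/fsync() calls a storage engine would otherwise control:

    /* Rough sketch of the mmap read path under discussion: map a file and touch
       its pages directly. When pages fault in (and when dirty pages are written
       back, for writable mappings) is decided by the kernel, not the application. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/testfile", O_RDONLY);   /* hypothetical data file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0) { perror("fstat"); return 1; }

        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* No explicit read() calls: each access below may trigger a page fault. */
        long sum = 0;
        for (off_t i = 0; i < st.st_size; i++)
            sum += (unsigned char)p[i];
        printf("checksum: %ld\n", sum);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }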

> just saying bigger blocks are faster is a bit misleading.

It's not misleading. For a one-paragraph answer, it's perfectly correct. And no, block size is a very important aspect of any storage system; it's not something that may simply not matter.

---

As an aside: you sound pretentious and try to pass yourself off as knowledgeable by saying things that have a drop of truth in them but are mostly fancily dressed nonsense. Just stop. It's embarrassing.


NFS and Samba are right out IMO



