
The article claims that an allocator that splits memory based on allocation size is called a "buddy allocator". That's misleading: an allocator that sets aside an area for each size class is usually called a "slab allocator". A "buddy allocator" is one that, when needed, subdivides a power-of-two-sized memory area into two half-sized areas that are "buddies", does so recursively to satisfy allocations, and coalesces the buddies again once both are free.
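To make the buddy mechanics concrete, here is a minimal sketch (hypothetical code, not from the article) of the address math that makes splitting and coalescing cheap: because blocks are power-of-two sized and aligned to their size, a block's buddy is found by XOR-ing its offset with its size.

    /* Hypothetical sketch of buddy-allocator address math.
       In a power-of-two arena where each block's offset is aligned to its
       size, a block's buddy is at offset ^ size; when both halves of a pair
       are free, they coalesce into one block of twice the size. */
    #include <stddef.h>
    #include <stdio.h>

    static size_t buddy_of(size_t offset, size_t size) {
        return offset ^ size;  /* flip the bit that distinguishes the pair */
    }

    int main(void) {
        /* Splitting a 256-byte block at offset 0 yields buddies at 0 and 128. */
        printf("buddy of offset 0,   size 128: %zu\n", buddy_of(0, 128));   /* 128 */
        printf("buddy of offset 128, size 128: %zu\n", buddy_of(128, 128)); /* 0 */
        return 0;
    }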

E.g. the Linux kernel used (not sure if it's still like this) a buddy allocator to allocate pages and power-of-two blocks of pages, and slab allocators to subdivide those pages and allocate data structures from them.

Another important thing the article doesn't mention is that most production allocators make use of thread-local storage and either keep per-thread caches of free blocks or sometimes whole per-thread memory regions. This reduces lock contention and provides memory that is more likely to be in the current core's cache.
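As a rough sketch of the per-thread cache idea (the names here, like tcache_alloc, are made up for illustration and are not any real allocator's API), each thread keeps a small free list in thread-local storage and only falls back to the shared, locked allocator on a miss:

    /* Hypothetical per-thread free-block cache; assumes C11 _Thread_local and
       ignores size classes (a real allocator keeps one cache per size class,
       and blocks must be at least sizeof(struct free_block)). */
    #include <stdlib.h>

    struct free_block { struct free_block *next; };

    static _Thread_local struct free_block *tcache = NULL;  /* this thread's list */

    void *tcache_alloc(size_t size) {
        if (tcache) {                       /* fast path: no lock, likely cache-warm */
            struct free_block *b = tcache;
            tcache = b->next;
            return b;
        }
        return malloc(size);                /* slow path: shared allocator */
    }

    void tcache_free(void *p) {
        struct free_block *b = p;           /* push onto this thread's list */
        b->next = tcache;
        tcache = b;
    }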



You're absolutely right, I've corrected this. Thank you!

I had originally written about threading and locality but it made the post too long and complicated, so I cut it out for the final draft. You can see remnants of it if you check the HTML comments in the post :D


Linux had a bitmap-based "buddy allocator" (power of two); it is not bitmap based anymore, since the complexity was no longer worth it performance-wise and simplicity was restored.

Then Linux has the various slab allocators (slub/slob/slab), built on top of the "buddy allocator".

User-level code should use mmap-ed regions that are not virtual-address stable (slabs + offsets). Legacy "libc" services were built as virtual-address-stable services, which are kind of expensive to manage on a modern paginated system. Virtual-address-stable regions should be kept to a minimum (that horrible ELF static TLS). There is a workaround though (but the default Linux overcommit policy could kill your process): a user process queries the amount of RAM on the system and mmaps a region of roughly (care of the overcommit policy) the same size, only once per process lifetime. Then you have a virtual-address-stable region which can use most if not all of the available RAM (excluding hot-memory addition...). It should be very easy to manage with lists.
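A hedged sketch of that workaround on Linux (sysconf plus one big anonymous mapping at startup; whether it survives depends on the overcommit policy mentioned above):

    /* Sketch of reserving roughly all physical RAM once per process lifetime.
       Assumes Linux; with the default overcommit policy the mapping may
       succeed and the process may still be killed later when pages are touched. */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        long pages = sysconf(_SC_PHYS_PAGES);
        long psize = sysconf(_SC_PAGE_SIZE);
        size_t len = (size_t)pages * (size_t)psize;

        /* MAP_NORESERVE: do not reserve swap for the whole region up front;
           physical pages are only faulted in as they are touched. */
        void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        printf("reserved %zu bytes at %p\n", len, base);
        /* Carve the region up with free lists; addresses stay stable for the
           life of the process. */
        munmap(base, len);
        return 0;
    }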


In this case, one should start by implementing the -Xmx switch, then gradually add the rest.




