charleslmunger · 2026-02-22T23:10:01 1771801801

The default Gboard keyboard has settings for always showing the number row, or only showing it when entering a password. There is also a setting for the "suggestion strip" under "corrections & suggestions". You can also drag to resize the keyboard itself in the Gboard menu, to scale the height.

Now, whether your users will do that to play your game is a different story, but the options exist.

charleslmunger · 2026-02-15T05:43:09 1771134189

Not OP, but I used Arch for a while in 2011, and at some point an update moved /bin to /usr/bin or something like that and gave me an unbootable system. This was a massive inconvenience and it took me many hours to un-hose that system, so I switched to Ubuntu. Then Ubuntu became terrible with snaps and other user-hostile software, so I switched to PopOS, then I got frustrated with out-of-date software and Cosmic being incomplete, and am now on Arch with KDE.

Back then I used Arch because I thought it would be cool and it's what Linux wizards use. Now Arch has gotten older, I've gotten older, and now I'm using Arch again because I've become (more of a) Linux wizard.

mjevans · 2026-02-15T08:41:16 1771144876

The silly move from /bin to /usr/bin broke lots of distros. Probably would have worked out if they'd had cp --reflink=auto --update to help ease migrations from files in /bin to /usr/bin and then just symlinked /bin to /usr/bin . However then any setups where /usr is a distinct filesystem from / would hard-require initramfs to set that up before handoff.

The python (is python2) transition was even sillier though. Breaking changes to the API, and they wanted to (and did!) force re-pointing the command to python3? That's still actively breaking stuff today in places that are forsaken enough to have to support python2 legacy systems.

dboon · 2026-02-15T16:44:48 1771173888

Arch + KDE is pretty sweet. It looks gorgeous out of the box, and gives you a system that mostly just works but is still everything you love about Arch

cromka · 2026-02-15T10:51:49 1771152709

Also not OP. I gave up Arch around 2011 as well, after I wasn't able to mount a USB pendrive at the uni while I was rushing somewhere. This was embarrassing and actually a serious issue; it took some time to get fixed upstream, and finding a workaround was also annoying. This is when I gave up on it and never looked back, but I did, indeed, learn all about Linux internals from dailying Arch for 3 or so years.

charleslmunger · 2026-02-14T14:12:32 1771078352

You have a list of IDs, and want to make them compact for storage or transport - fast and simple way is to sort and delta encode.
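A minimal sketch of the sort-and-delta-encode idea, assuming the IDs are unsigned 64-bit integers. The function names `delta_encode` and `delta_decode` are made up for illustration:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Sorts the IDs, then stores the first ID followed by the gap to each successor.
// Takes the vector by value so the caller's copy is left untouched.
std::vector<std::uint64_t> delta_encode(std::vector<std::uint64_t> ids) {
    std::sort(ids.begin(), ids.end());
    std::vector<std::uint64_t> deltas;
    deltas.reserve(ids.size());
    std::uint64_t prev = 0;
    for (std::uint64_t id : ids) {
        deltas.push_back(id - prev);  // non-negative because the input is sorted
        prev = id;
    }
    return deltas;
}

// Reverses delta_encode with a running sum; the result comes back sorted.
std::vector<std::uint64_t> delta_decode(const std::vector<std::uint64_t>& deltas) {
    std::vector<std::uint64_t> ids;
    ids.reserve(deltas.size());
    std::uint64_t running = 0;
    for (std::uint64_t d : deltas) {
        running += d;
        ids.push_back(running);
    }
    return ids;
}
```

The deltas of densely allocated IDs tend to be small numbers, so the space saving only shows up once they are fed to a variable-length integer encoding or a general-purpose compressor; that step is left out here.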

hinkley · 2026-02-15T17:05:01 1771175101

Hmm. That’s fair, though I’d probably use set operations instead. What you find though is that for most other problems besides diffing, ID order is not chronological order, so you need to sort by a date stamp instead. But I’m typically letting the database do that, so I’m a consumer of sorted numbers, but not an implementor. Because what I sort is nearly always compound sorts. By field A, then field B and field C if those two still don’t cut it.

charleslmunger · 2026-02-06T01:30:26 1770341426

C99, but with a million macros backporting features from newer language versions and compiler extensions. Lovely features you don't get with ordinary C99:

free_sized

#embed

static_assert

Types for enum

Alignof, alignas, aligned_alloc

_Atomic

charleslmunger · 2026-01-22T00:02:45 1769040165

What hardware are you running on where the cost of a relaxed 64 bit load and a branch is significant compared to a (possibly contended) cas?

You could always use ldset on arm for this.
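The code being replied to isn't shown here, so this is only a guess at the shape of the pattern: a sketch that assumes the operation is "set a bit in a shared 64-bit word", with a hypothetical helper named `set_bit_once`. The relaxed load and branch skip the read-modify-write when the bit is already set, and `fetch_or` is the kind of operation compilers typically lower to the LDSET family on AArch64 when LSE atomics are enabled:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: sets `bit` in `word` and returns true if this call
// was the one that set it.
bool set_bit_once(std::atomic<std::uint64_t>& word, std::uint64_t bit) {
    // Cheap path: a relaxed load and a branch. No exclusive ownership of the
    // cache line is needed when the bit is already set.
    if (word.load(std::memory_order_relaxed) & bit) {
        return false;
    }
    // Single atomic OR instead of a compare-and-swap loop. The previous value
    // tells us whether another thread won the race after our load.
    std::uint64_t prev = word.fetch_or(bit, std::memory_order_acq_rel);
    return (prev & bit) == 0;
}
```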

charleslmunger · 2026-01-02T05:10:57 1767330657

Yup this works, but there's as of yet no UHBR13.5 or better input, so you're not getting the full HDMI 2.1 equivalent. But if you don't care about 24 bits per pixel DSC then you can have an otherwise flawless 4K 120Hz experience.

https://trychen.com/feature/video-bandwidth

charleslmunger · 2025-12-10T00:12:40 1765325560

It's so weird to see the leading heroin story phrased like a hypothetical, when:

1. Heroin itself was marketed as a "non-addictive morphine substitute", and sold to the public. It didn't become a controlled substance until 1914 (according to Wikipedia).

2. The opioid crisis was basically started and perpetuated by Purdue Pharma, again marketing Oxycodone with the label “Delayed absorption as provided by OxyContin tablets, is believed to reduce the abuse liability of a drug.” and other more egregious advertising.

3. Britain went to war with China twice to force the Qing dynasty to allow them to sell opium there.

4. President Franklin D. Roosevelt's grandfather made a ton of money in the opium trade.

It's supposed to be a sort of shocking hypothetical, except that's basically the history of the actual drug.

charleslmunger · 2025-12-08T02:07:34 1765159654

>Critical section under 100ns, low contention (2-4 threads): Spinlock. You’ll waste less time spinning than you would on a context switch.

If your sections are that short then you can use a hybrid mutex and never actually park. Unless you're wrong about how long things take, in which case you'll save yourself.

>alignas(64) in C++

    std::hardware_destructive_interference_size

Exists so you don't have to guess, although in practice it'll basically always be 64.
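A minimal sketch of how that constant might be used to keep two hot counters on separate cache lines. The names `kCacheLine`, `PaddedCounter` and `Counters` are made up for illustration, and the 64-byte fallback is an assumption for standard libraries that don't define the feature-test macro:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t kCacheLine = std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t kCacheLine = 64;  // assumed fallback, not queried from hardware
#endif

// Over-aligning the struct also rounds its size up to a multiple of the
// alignment, so adjacent instances never share a line of that size.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Hypothetical producer/consumer statistics written by different threads.
struct Counters {
    PaddedCounter produced;
    PaddedCounter consumed;
};

static_assert(alignof(PaddedCounter) == kCacheLine);
static_assert(sizeof(PaddedCounter) % kCacheLine == 0);
```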

The code samples also don't obey the basic best practices for spinlocks for x86_64 or arm64. Spinlocks should perform a relaxed read in the loop, and only attempt a compare and set with acquire order if the first check shows the lock is unowned. This avoids hammering the CPU with cache coherency traffic.

Similarly, the x86 PAUSE instruction isn't mentioned, even though it exists specifically to signal spin sections to the CPU.
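For reference, a sketch of the test-and-test-and-set shape described above: a relaxed read in the inner loop, a spin hint, and a compare-and-set with acquire order only once the lock looks free. `TtasSpinLock` and `cpu_relax` are illustrative names, and using `yield` as the AArch64 hint is an assumption about what is appropriate there:

```cpp
#include <atomic>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#endif

// Tells the core it is inside a spin-wait loop. Compiles to nothing on
// architectures not listed here.
inline void cpu_relax() {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_pause();
#elif defined(__aarch64__)
    __asm__ __volatile__("yield");
#endif
}

class TtasSpinLock {
public:
    void lock() {
        for (;;) {
            // Read-only spin: waiters keep the cache line in shared state
            // instead of fighting over exclusive ownership.
            while (locked_.load(std::memory_order_relaxed)) {
                cpu_relax();
            }
            // The lock looked free, so now attempt the acquiring CAS.
            bool expected = false;
            if (locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
                return;
            }
        }
    }

    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

The class satisfies the standard Lockable requirements, so it can be used with `std::lock_guard<TtasSpinLock>`.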

Spinlocks outside the kernel are a bad idea in almost all cases, except dedicated nonpreemptable cases; use a hybrid mutex. Spinning for consumer threads can be done in specialty exclusive thread per core cases where you want to minimize wakeup costs, but that's not the same as a spinlock which would cause any contending thread to spin.
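A rough sketch of the hybrid idea, assuming a bounded number of acquisition attempts followed by a normal blocking lock. `SpinThenBlockMutex` and the limit of 100 are illustrative choices, not taken from any particular library, and production implementations usually inspect their own lock word with relaxed loads rather than calling `try_lock` in a loop:

```cpp
#include <mutex>

class SpinThenBlockMutex {
public:
    void lock() {
        // Optimistic phase: a few attempts that never enter the kernel.
        for (int i = 0; i < kSpinLimit; ++i) {
            if (inner_.try_lock()) {
                return;
            }
            // A real implementation would issue a pause/yield hint here.
        }
        // Fallback: block in the underlying mutex so a long wait does not
        // burn CPU time.
        inner_.lock();
    }

    bool try_lock() { return inner_.try_lock(); }

    void unlock() { inner_.unlock(); }

private:
    static constexpr int kSpinLimit = 100;  // assumed value, tune per workload
    std::mutex inner_;
};
```

If the critical section really is short, the first loop almost always succeeds and the thread never parks; if the estimate was wrong, the blocking path bounds the damage.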

raggi · 2025-12-08T02:46:00 1765161960

> Spinlocks outside the kernel are a bad idea in almost all cases, except dedicated nonpreemptable cases; use a hybrid mutex. Spinning for consumer threads can be done in specialty exclusive thread per core cases where you want to minimize wakeup costs, but that's not the same as a spinlock which would cause any contending thread to spin.

Very much this. Spins benchmark well but scale poorly.

magicalhippo · 2025-12-08T02:33:51 1765161231

> Spinlocks outside the kernel are a bad idea in almost all cases, except dedicated nonpreemptable cases; use a hybrid mutex

Yeah, pure spinlocks in user-space programs are a big no-no in my book. With a hybrid mutex, if you're on the happy path then it costs you nothing extra in terms of performance, and if you for some reason slide off the happy path you have a sensible fall-back.

charleshn · 2025-12-08T04:04:16 1765166656

> std::hardware_destructive_interference_size Exists so you don't have to guess, although in practice it'll basically always be 64.

Unfortunately it's not quite true, due to e.g. spatial prefetching [0]. See e.g. Folly's definition [1].

[0] https://community.intel.com/t5/Intel-Moderncode-for-Parallel...

[1] https://github.com/facebook/folly/blob/d2e6fe65dfd6b30a9d504...

menaerus · 2025-12-08T11:10:54 1765192254

Some things from the article are debatable for sure, and some are maybe missing, like the PAUSE instruction you mention, which I also was not aware of, but generally speaking I thought it was really good content. Lean system engineering skills applied to real-world problems. I especially appreciated the examples of large-scale infra codebases doing it in practice.

surajrmal · 2025-12-08T04:28:30 1765168110

Hybrid locks are also bad for overall system performance, because they maximize local application performance at the system's expense. There is a reason default lock implementations from OSes don't spin even a little bit.

menaerus · 2025-12-08T08:20:16 1765182016

> There is a reason default lock implementations from OS don't spin even a little bit.

glibc pthread mutex uses a user-space spinlock to mitigate the syscall cost for uncontended cases.

charleslmunger · 2025-12-08T09:36:59 1765186619

That depends on your workload. If you're making a game that's expected to use near 100% of system resources, or a real time service pinned to specific cores, your local application is the overall system.

surajrmal · 2025-12-10T15:08:21 1765379301

Totally agree. However it's important to differentiate those workloads from the average workload which is to participate in a larger system.

imtringued · 2025-12-08T17:58:34 1765216714

This is nonsense. If the lock hasn't been acquired, you don't spin to begin with and if the lock has been acquired and the lock is being released shortly after, the spinning avoids a context switch. If the maximum number of retries has been reached, the thread was going to sleep anyway and starts scheduling the next thread (which was only delayed by the few attempted spins). This means in the worst case the next spin will only happen once all the other queued up threads have had their turn and that's assuming you're immediately running into another acquired lock.

surajrmal · 2025-12-10T15:21:40 1765380100

It makes the worst case sufficiently bad and unfair that it makes things worse overall. If the lock is contended by a thread with higher priority, then the blocking thread will have its priority increased. Now if the thread that ends up getting the lock is one spinning on it rather than the actual high priority one, then this will repeat, leading to large latency in front of the high priority thread and a lot of misaligned CPU utilization by a lower priority thread.

Spinning on a CAS is far more expensive than spinning on most other instructions as well, as it affects all cores that may try to access that cache line, which may include things other than the lock itself.

Also consider how the system acts under high CPU load. You will end up with threads holding locks while not running, so the majority of the time you miss the lock, you spin all 100 times. This just exacerbates the CPU load issues even more. Hybrid locks are only helpful under lower CPU load.

nly · 2025-12-08T09:44:49 1765187089

GNU libc posix mutexes do spin...

surajrmal · 2025-12-10T05:46:19 1765345579

And I think it's a poor choice that causes worse system performance. Android's bionic doesn't spin, nor does Windows or Fuchsia. Avoiding the syscall overhead is generally detrimental to overall system performance, especially when the CPU load is high.

saagarjha · 2025-12-08T04:20:30 1765167630

> std::hardware_destructive_interference_size

Of course, this is just the number the compiler thinks is good. It’s not necessarily the number that is actually good for your target machine.

nly · 2025-12-08T09:41:15 1765186875

The PAUSE instruction isn't actually as good as it used to be. In, iirc, Skylake Intel massively increased the latency to improve utilisation under hyperthreading. The latency of this instruction is now really high.

Most people using spinlocks really care about latency, and many will have hyperthreading disabled to reduce jitter

SkiFire13 · 2025-12-08T10:51:56 1765191116

If the PAUSE instruction is too fast doesn't that kinda defeat its purpose?

menaerus · 2025-12-08T14:31:07 1765204267

Yeah, I think so too now that I read some documentation about it. It appears that the main issue with the spinlock pattern is that it incurs "a severe performance penalty when exiting the [spinlock] loop because it [CPU] detects a possible memory order violation." [0].

~10 years ago, on Haswell, it took ~9 cycles to retire, and from Skylake onward, with some exceptions, it takes an order of magnitude more: ~140 cycles.

These numbers alone suggest that it really messes with the CPU pipeline, perhaps BP (?) or speculative execution (?) or both (?), such that it will basically force the CPU to flush the whole pipeline. This is at least how I read this. I will remember this instruction as the "damage control" instruction from now on.

[0] https://www.felixcloutier.com/x86/pause

nly · 2025-12-10T10:12:08 1765361528

Not sure if you'll see this now, but the actual reason you want to use it is as a speculation barrier and a hint to various predictors.

Lfence is the better choice these days.

charleslmunger · 2025-12-02T04:36:51 1764650211

>The compiled compressed binary for an APK

This doesn't undermine your argument at all, but we should not be compressing native libs in APKs.

https://developer.android.com/guide/topics/manifest/applicat...

charleslmunger · 2025-11-29T18:53:36 1764442416

>Not at all? Most memory-safety issues will never even show up in the radar

Citation needed? There's all sorts of problems that don't "show up" but are bad. Obvious historical examples would be heartbleed and cloudbleed, or this ancient GTA bug [1].

1: https://cookieplmonster.github.io/2025/04/23/gta-san-andreas...