*>What is stopping [...] Java, JS, and C# files in UTF-8?* The output of files o...

kevin_thibedeau · on Feb 8, 2022

Win32 narrow API calls support UTF-8 natively now.
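For context, a rough sketch of why the active code page matters to narrow (byte-string) APIs. It does not call Win32. Python's `cp1252` and `utf-8` codecs stand in for a legacy ANSI code page and code page 65001, and `fits_code_page` is a made-up helper name:

```python
# Illustration only: Python codecs stand in for Windows code pages.
# "cp1252" plays the role of a legacy ANSI code page, "utf-8" the role of 65001.

def fits_code_page(text: str, codec: str) -> bool:
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

sample = "na\u00efve \u65e5\u672c\u8a9e"  # "naïve" followed by three kanji

legacy_ok = fits_code_page(sample, "cp1252")  # kanji are not in cp1252
utf8_ok = fits_code_page(sample, "utf-8")     # UTF-8 covers all of Unicode

print(legacy_ok, utf8_ok)
```

A byte-string API under a legacy code page simply has no way to spell the second word, which is the gap a UTF-8 code page closes.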

mark-r · on Feb 8, 2022

Code page 65001 has existed for a long time now, but it was discouraged because there were a lot of corner cases that didn't work. Did they finally get all the kinks out of it?

kevin_thibedeau · on Feb 8, 2022

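Yes. An application can now switch its own code page to UTF-8 (for example, through the `activeCodePage` setting in its manifest on recent Windows 10 builds).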

Raymonf · on Feb 9, 2022

Yes, Windows 10 has continually improved UTF-8 support. You can even set applications to use it by default now.

colejohnson66 · on Feb 8, 2022

UTF-16*, not UCS-2. Although there are probably many programs that assume UCS-2.
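A small sketch of where the UCS-2 assumption breaks: any character outside the Basic Multilingual Plane takes two 16-bit code units (a surrogate pair) in UTF-16, so code that treats one code unit as one character miscounts. `utf16_code_units` is a helper name invented for this example:

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2

bmp_char = "\u00e9"         # U+00E9, inside the BMP
astral_char = "\U0001F600"  # U+1F600, outside the BMP

bmp_units = utf16_code_units(bmp_char)        # one code unit
astral_units = utf16_code_units(astral_char)  # two code units (surrogate pair)

# Pull the two code units apart to see the surrogate pair itself.
raw = astral_char.encode("utf-16-le")
high = int.from_bytes(raw[0:2], "little")
low = int.from_bytes(raw[2:4], "little")

print(bmp_units, astral_units, hex(high), hex(low))
```

A program that assumes UCS-2 would report a length of 2 for the second string, or cut it in half when truncating.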

mark-r · on Feb 8, 2022

When Windows adopted Unicode, I think the only encoding available was UCS-2. They converted pretty quickly to UTF-16 though, and I think the same is true of everybody else who started with UCS-2. Unfortunately UTF-16 has its own set of hassles.

account42 · on Feb 9, 2022

Technically, they converted to WTF-16 [0] since many places, including filenames, allow you to use unpaired surrogates.

[0] https://simonsapin.github.io/wtf-8/
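To make "unpaired surrogate" concrete, here is a sketch using Python's `surrogatepass` error handler. That handler is only similar in spirit to WTF-8, not an implementation of it (for one thing, WTF-8 requires properly paired surrogates to be encoded as a single 4-byte sequence, which this handler does not enforce):

```python
lone = "\ud800"  # a high surrogate with no low surrogate after it

# Strict UTF-8 refuses to encode a lone surrogate.
try:
    lone.encode("utf-8")
    strict_accepts = True
except UnicodeEncodeError:
    strict_accepts = False

# The relaxed handler writes it out as an ordinary 3-byte sequence.
relaxed_bytes = lone.encode("utf-8", "surrogatepass")
round_trip = relaxed_bytes.decode("utf-8", "surrogatepass")

# The same lone code unit stored as 16-bit data, as a filename might hold it.
as_utf16 = lone.encode("utf-16-le", "surrogatepass")

print(strict_accepts, relaxed_bytes, round_trip == lone, as_utf16)
```

Data like `as_utf16` is what a system that allows arbitrary 16-bit units can hand you, and a strict UTF-16 or UTF-8 converter will reject it.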

nwallin · on Feb 8, 2022

Note that the asterisk in `UTF-16*` is a really big asterisk. I fixed a UCS-2 bug last week at my day job.

WorldMaker · on Feb 9, 2022

Yeah, in practice there are often more hacks like WTF-8 and WTF-16 on systems that started out as UCS-2 (including Windows and JS) than is healthy: https://simonsapin.github.io/wtf-8/