100M+ is a bit more than i would expect for an image format. have i not been pay...

aw1621107 · 2025-12-01T17:48:24 1764611304

According to tokei, the lib/ directory from the reference implementation [0] has 93821 lines of C++ code and 22164 lines of "C Header" (which seems to be a mix of C++ headers, C headers, and headers that are compatible with both C and C++). The tools/ directory adds 16314 lines of C++ code and 1952 lines of "C Header".

So, at least if GP was talking about libjxl, "100K+" would be more accurate.

[0]: https://github.com/libjxl/libjxl

jiggawatts · 2025-12-01T22:02:43 1764626563

One of the best ways to measure code complexity is to zip up the source code. This eliminates a lot of the redundancies and is a more direct measure of entropy/complexity than almost anything else.

By that metric, jpeg-xl is about 4x the size of the jpeg or png codebase.
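For anyone who wants to try this, here is a minimal Python sketch of the idea. It is an assumption-laden illustration, not the exact procedure used for the 4x figure above: the extension list, the `compressed_size` helper name, and the choice of LZMA over zip's deflate are all mine. It feeds every matching file through one compression stream so that redundancy across files is also squeezed out.

```python
import lzma
from pathlib import Path

# Assumed set of extensions that count as "source"; adjust per codebase.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".h", ".hpp"}


def compressed_size(root, suffixes=SOURCE_SUFFIXES):
    """Return (raw_bytes, compressed_bytes) for matching files under root.

    Hypothetical helper: all files go through a single LZMA stream, so
    repetition shared between files is compressed away as well.
    """
    compressor = lzma.LZMACompressor()
    raw = 0
    packed = 0
    # Sort paths so the result does not depend on directory listing order.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            data = path.read_bytes()
            raw += len(data)
            packed += len(compressor.compress(data))
    # flush() emits whatever the compressor is still buffering.
    packed += len(compressor.flush())
    return raw, packed
```

Calling `compressed_size("lib")` on two checkouts and comparing the second number gives a rough relative measure. The absolute values depend on the compressor and on which files are included, so only the ratio between codebases is meaningful.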

tkfoss · 2025-12-01T22:27:58 1764628078

Interesting approach

jiggawatts · 2025-12-02T02:16:33 1764641793

It comes from the "intelligence is a form of compression" hypothesis that has been floating around in the ML space. Also, with a good compression algorithm it is a fairly direct measure of entropy, which is quite well correlated with what a developer might consider code size and/or complexity.

tkfoss · 2025-12-03T23:49:18 1764805758

I'm familiar with the concept[1], but I'm unsure if it's a good showcase of code complexity. I've tested some internal microservices I'm deeply familiar with and found no correlation...

[1] for the past ~15 years actually, got introduced to the concept through works of mr. Hutter, after becoming aware of his Prize, and I'm dabbling in compression to this day (right now trying to improve on Bellard's nncp)

account42 · 2025-12-02T10:47:23 1764672443

Your method would still judge well-documented code with lots of intermediate variables as more complex than undocumented code golf soup.

palmotea · 2025-12-01T18:30:21 1764613821

>> 100M+ is a bit more than i would expect for an image format. have i not been paying attention

> So at least if GP was talking about libjxl "100K+" would be more accurate.

M can mean thousands, and I think it's commonly used that way in finance and finance-adjacent areas: https://www.chicagomanualofstyle.org/qanda/data/faq/topics/A...:

> A. You’ve identified two commonly used conventions in finance, one derived from Greek and the other from Latin, but neither one is standard.

> Starting with the second convention, M is used for amounts in the thousands and MM for amounts in the millions (usually without a space between the number and the abbreviation—e.g., $150M for $150,000 and $150MM for $150 million). This convention overlaps with the conventions for writing roman numerals, according to which a thousand is represented by M (from mille, the Latin word for “thousand”). Any similarity with roman numerals ends there, however, because MM in roman numerals means two thousand, not a thousand thousands, or one million, as in financial contexts...

https://www.accountingcoach.com/blog/what-does-m-and-mm-stan...:

> An expense of $60,000 could be written as $60M. Internet advertisers are familiar with CPM which is the cost per thousand impressions.

> The letter k is also used to represent one thousand. For example, an annual salary of $60,000 might appear as $60k instead of $60M.
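To make the ambiguity concrete, here is a small Python sketch that expands the same abbreviation under each reading. The `FINANCE` and `SI_STYLE` tables and the `expand` helper are hypothetical names made up for illustration; the multipliers follow the two conventions quoted above plus the usual k/M usage among programmers.

```python
import re

# Finance convention quoted above: M = thousand (mille), MM = million.
FINANCE = {"M": 1_000, "MM": 1_000_000}
# Convention most programmers assume: k/K = thousand, M = million.
SI_STYLE = {"k": 1_000, "K": 1_000, "M": 1_000_000}


def expand(amount, table):
    """Expand a string like '$150M' into an integer using the given table."""
    match = re.fullmatch(r"\$?(\d+)([A-Za-z]*)", amount)
    if match is None:
        raise ValueError(f"unrecognized amount: {amount!r}")
    number, suffix = match.groups()
    # A bare number has no suffix and is returned unchanged.
    return int(number) * table[suffix] if suffix else int(number)


print(expand("$150M", FINANCE))   # 150000
print(expand("$150MM", FINANCE))  # 150000000
print(expand("100M", FINANCE))    # 100000
print(expand("100M", SI_STYLE))   # 100000000
```

The last two lines are the whole disagreement in this thread: the same "100M" differs by a factor of 1,000 depending on which table the reader has in mind.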

WheatMillington · 2025-12-01T18:54:04 1764615244

I assume this is regional... I work in accounting and finance in New Zealand (generally following ordinary Western/Commonwealth standards) and I've never heard of using M for thousands. If I used that I would confuse the hell out of everyone around me.

mkaic · 2025-12-01T19:01:31 1764615691

"It's... a regional dialect."

"What region?"

"Er, upstate New York."

"Really. Well, I'm from Utica and I've never heard anyone use the phrase '100M' to mean '100 thousand'"

"Oh, no, not in Utica. It's an Albany expression."

qingcharles · 2025-12-01T22:30:59 1764628259

In some areas M is mille as in the Latin/French/Italian word for thousand, e.g.

https://en.wikipedia.org/wiki/Cost_per_mille

dataflow · 2025-12-01T19:38:11 1764617891

Okay, but this is... not finance? And the article itself wrote 100K. Rewriting that as 100M does nobody a favor.

sealeck · 2025-12-01T20:15:07 1764620107

I don't think many (if any) programmers would imagine 100M lines of code to mean 100,000 lines of code and not 100,000,000...

uselesswords · 2025-12-01T20:49:10 1764622150

Technically right is the worst kind of right

palmotea · 2025-12-02T06:48:05 1764658085

I'm surprised at the negative reaction to having it pointed out that the OP may not be wrong, just using a dialect.

munificent · 2025-12-01T18:06:20 1764612380

The article says 100K, not 100M. I'm guessing that's what the parent comment meant.

100MLOC for an image format would be bananas. You could fit the entire codebases of a couple of modern operating systems, a handful of AAA videogames, and still have room for several web apps and command line utilities in 100MLOC.

JyrkiAlakuijala · 2025-12-01T18:24:55 1764613495

the article includes test code and encoder code; that is not how we compute the decoder size

the decoder is something around 30 kloc

crooked-v · 2025-12-01T17:44:30 1764611070

It's a container format that does about a bajillion things - lossy, lossless, multiple modes optimized for different image types (photography vs digital design), modern encode/decode algorithms, perceptual color space, adaptive quantization, efficient ultra-high-resolution decoding and display, partial and complete animation, tile handling, everything JPEG does, and a bunch more.

furyofantares · 2025-12-01T17:47:42 1764611262

The Linux kernel is 40M lines of code after 34 years of development.

OP might as well have said "infinite lines of code" for JPEGXL and wouldn't have been much less accurate. Although I'm guessing they meant 100k.

EMM_386 · 2025-12-02T20:50:19 1764708619

You are correct, "K" not "M" in my typo.

GaggiX · 2025-12-01T18:03:45 1764612225

They wanted to say 100K instead of 100M

EMM_386 · 2025-12-02T20:51:19 1764708679

They did indeed.