Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Integer math in JavaScript (darpinian.com)
90 points by simonebrunozzi on July 17, 2022 | hide | past | favorite | 32 comments


Aren't most JS engines able to do this optimization regardless of |0, especially during JIT? The rationale behind the |0 trick in asm.js was the lack of an integer type in pure js, and thus the need to avoid precision errors (which can't happen if you limit yourself to 32 bits since IEEE double-precision floats have 53 bits of mantissa).

This sounds a bit unnecessary given the capabilities of modern engines. I benchmarked some additions on my machine on nodejs and there is no difference between the naive and |0 implementations.


> Aren't most JS engines able to do this optimization regardless of |0

Yes - if you are taking 'after every math operation' literally. JavaScript engines (and other dynamic language engines - nothing special about JavaScript) track types at a finger-grained level than simple generic 'number' or not. They already know if a number is a guaranteed integer, and even in some cases which bits can be set in that number. Sometimes the compiler doesn't want to prove more advanced properties like range when for example a value escapes, due to the overhead of tracking this, and then it's useful to do the trick in this article.

Here's a worked example in a Ruby compiler for example https://chrisseaton.com/truffleruby/stamping-out-overflow-ch....

But the author knows infinitely more than me about JavaScript compilation - they're ex-Google and Meta, including working on both Chromium and WebKit. They could show their working by for example showing some compiler IR for the engines they know about.


Author here. I did work on Chromium but not on V8. There are definitely people here in this comments section who know a lot more than me about these topics!

It's true that JS engines already optimize code to use integer math extensively, without |0 annotations. There are situations where this is made difficult or impossible due to requirements of the language, and |0 can help in these situations.

There's a whole lot more to know about this topic for sure. My article would be a lot better with some examples where |0 gives a benefit and the corresponding IR. This post already took a lot more time to put together than I expected, and I didn't know if that many people would read it. Now that it hit HN, maybe I can justify the extra work.


I think a critical point to is that |0 forces integer arithmetic, beyond just ensuring that integer arithmetic is used to do the calculation. E.g. (1/2) |0 gives a very different result than (1/2), so using |0 ensures that integer arithmetic is used even if you don't know what the operands are.


I mean, no, not really, unless you're appending |0 to everything like a compiler would, i.e. ((a|0) / (b|0))|0.

I'm not saying it's useless, asm.js used this trick to great effect. I'm just saying that unless the situation is more complex and doing a bitwise operation actually encodes information (like the &0xff example in a sibling comment, from which the compiler can deduce "this is a uint8"), it's unlikely you're giving the compiler more information than it's already capable of deducing on its own.


> unless you're appending |0 to everything like a compiler would, i.e. ((a|0) / (b|0))|0.

It effectively does do that though, since earlier in your code `a` and `b` would be defined with `let a = ... |0` so now `a` and `a|0` are equivalent in any subsequent expressions.


Yes - local variables are just indirections through a name - in a compiler referring to a and referring to the original ... |0 expression are the same thing.

https://twitter.com/ChrisGSeaton/status/1530579735012593665


The compiler has to prove that it can use 31-bit integers.

By forcing your number into a 31-bit integer, you provide definite type info to work with.


Not necessarily. JSC has 32 bit integer fast paths. So it depends on the engine.


I mean this is covered in the article, right?

Its possible until you do division or some other operation where the value needs to be promoted for language correctness. At that point the programmer must constrain the result to assure it stays in an integer.

In theory I guess one could be more selective about the |0 and apply it only after operations where the hint is needed.


Yes.


If you want bigger than 32 bits but not the whole 64 bits (up to 53 bits I think) you can just do Math.floor(). But you'll want a range check in that case.

On that note: why does |0 and >>>0 needlessly chop to 32 bits when they can represent more?


Be careful with Math.floor on negative numbers. It rounds -1.1 to -2. You probably want the new Math.trunc.


given the popularity of javascript, it's about time we had better native tools for dealing with numbers in general, but it seems like the discussion is very slow on this front. for instance, the decimal proposal is still at stage 1 [1]. and no, BigInt does not solve that.

[1] https://github.com/tc39/proposal-decimal


JavaScript now has Bigint! (ES2020)


It's a bit strange that it got BigInt before Int64, for most applications 64 bits are plenty and arbitrary precision arithmetic is just needless overhead. Maybe a sufficiently smart JS engine could deduce that a BigInt can't exceed 2^64 and lower it to native 64 bit math, but that kind of magic is hard to rely on.


Full 64 bit would very likely need to be boxed (heap allocated). I assume bigint is not boxed up to 40-something bits and instead stored as NaN[0]. Once you start boxing, the additional few instructions to handle things above 64 bit are probably not that important, so they may as well be generic.

[0] https://anniecherkaev.com/the-secret-life-of-nan


Interesting that there are 15*(2^47) unused values in this scheme in the range FFFF:xxx1:xxxx:xxxx - FFFF:xxxF:xxxx:xxxx. Could these be put to any practical use?


https://github.com/philihp/nanbox

You don't even need a Double, you can use a Float NaN to encode strings:

> IEEE 754 encodes 32-bit floating points with 1 bit for the sign, 8 bits for the exponent, and 23 bits for the number part (mantissa). (...)

> 23 bits isn't much space, but conveniently the max size of a Unicode codepoint is 0x10FFFF, making it a 22-bit charset. When you encode your strings in UTF-32, you're only going to be using 22 of those bits, so masking the top portion with the NaN signature makes all of your characters NaNs.


Not that strange considering Python effectively dropped Int64 support.

In Python 2.7 there was a distinction between when Python would use the "long int" vs the traditional Int64. This distinction was eventually entirely dropped in favor of all ints being "long ints". In case you are unfamiliar python implement ints internally as a list of digits. This is now true no matter the number of digits in the int.

What interesting about this is that there actually are edge cases where performance in Python 2.7 will be superior to Python 3. There's a nice foot note about this in "High Performance Python".


> It's a bit strange that it got BigInt before Int64

JavaScript has Int64?


Sorry, no it doesn't. I meant there should be an Int64 type and it probably should have been implemented before they dived into arbitrary precision integers.

I suppose the answer if you really care about performance is "just use WebAssembly" since that does have native 64 bit integers.


Why would it need a 64 bit int? How often do you go over 2*56 in Javascript?


All sorts of hash functions expect 64-bit intermediate values. Worse, these algorithms are liable to go haywire, and return very-not-randomized results if intermediate values are silently truncated to 56 bits (or magically become floating-point and thus lose their lower-order one bits) as happens when they're naively translated from C to JS.


As a pure quantity representing a real-world value, probably not very often, but just as a series of bits or a more abstract quantity like a memory address, it will often go above 2^56. Many algorithms and binary file encodings use 64 bit integers so interfacing with them in Javascript is hard.


BigInt exists, but there is a chicken and egg issue.

It's slow, so nobody wants to use it and nobody uses it, so they don't see a need to optimize.


Do JS engines actually still have any Asm.js optimisation code (did Chrome ever have any)? I assumed they would remove it now that WASM exists.


> Do JS engines actually still have any Asm.js optimisation code

It's not 'Asm.js' optimisation code - it's normal optimisations that were there before Asm.js, and are still useful.

(I'm sure there are some exceptions and some thing were Asm.js-specific, but the basics are generic.)


I think some of those optimizations were implemented because of asm.js, at least in Spider Monkey and old Edge's JavaScript engine, but yes, they should be generic in theory.

https://en.wikipedia.org/wiki/Asmjs#Implementations


I added the |0 optimization to JSC’s optimizer before asm.js was a thing. It worked for any use of a number in bit math, so like ~~x also had the same effect as x|0.

Hilariously, I didn’t actually do the folding of x|0 to x because I didn’t know folks would do that. I just wanted to eliminate int overflow checks if the use didn’t care about the high bits. We only implemented the folding once asm.js became a thing.


No there definitely was Asm.js optimisation code, at least in Firefox. That was the entire point.





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: