Round error issue - produce money for free on itBit bitcoin exchange (hackerone.com)
70 points by waffle_ss on March 3, 2017 | 60 comments


Disclosure: I'm CTO of BitMEX, a Bitcoin derivatives exchange built on kdb+/q. We don't compete directly with itBit or any of the exchanges I mention below (we don't do spot trading, they don't do derivatives) but we are in a similar space.

Audits matter. Your books should always sum to zero. Time and time again, we've seen bugs like this across the space: there was the infamous Bitcoinica "infinite leverage" bug [1], a rounding error in another exchange's order sizes, and it's theorized that a similar bug was exploited in the downfall of MtGox to create virtually infinite balances, which were then used to buy Bitcoin up to $1,000 in 2013.

Whether it's in your engine or outside of it, whether it's a Bitcoin product or any other financial product, make sure there is something - somewhere - that is regularly summing up all accounts and ensuring that every cent or satoshi is accounted for.

We run such a system at every execution. If the books don't sum to zero, we shut down. It's happened three times in two and a half years, each time due to tiny mathematical errors in our liquidation process, each time for less than 10,000 satoshis (~$0.10). We sleep much better at night knowing it's there.

Audits would catch the error caused by float math. And of course, don't do float math.

1. I can't find the original reference to it, but it was originally possible to simply edit the DOM to submit any leverage amount you pleased (such as 100,000x; never trust the client!) and then run your account massively into profit - or bankruptcy.
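Not their actual system, obviously, but a minimal sketch of such an invariant check (account names and amounts are hypothetical):

```python
from decimal import Decimal

def audit(accounts):
    """Sum every balance in the ledger; in a double-entry system
    the grand total must be exactly zero."""
    total = sum(accounts.values(), Decimal(0))
    if total != 0:
        # In production this would halt the engine and page someone.
        raise RuntimeError(f"books do not sum to zero: off by {total}")

# Every debit has a matching credit, so this ledger balances.
ledger = {
    "alice": Decimal("1.50000000"),
    "bob": Decimal("-0.50000000"),
    "exchange_fees": Decimal("-1.00000000"),
}
audit(ledger)  # passes silently
```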


Yeah. I was the first employee / architect at Kraken (left 2015) and can confirm that these issues were similarly very much at the forefront of our minds in the design phase, before we even dared to soft open. (Frankly, given the clear demand for satoshi-level accuracy, it seems strange to have missed this one!)


That's definitely great to have. Or once the system gets too big and it becomes infeasible to sum all accounts, you can also just make sure that you will never create a transaction where the entries within it do not sum to 0 (and make sure your entries are either all written or none).
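A toy sketch of that invariant - reject any transaction whose entries don't net to zero (names hypothetical):

```python
from decimal import Decimal

def post_transaction(ledger, entries):
    """Apply a list of (account, amount) entries atomically,
    refusing any transaction that does not net to zero."""
    if sum(amount for _, amount in entries) != 0:
        raise ValueError("unbalanced transaction")
    for account, amount in entries:
        ledger[account] = ledger.get(account, Decimal(0)) + amount

ledger = {}
# Move 0.1 BTC from alice to bob: the two entries cancel exactly.
post_transaction(ledger, [("alice", Decimal("-0.1")), ("bob", Decimal("0.1"))])
```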


Why would you ever use floats for finance?

That's like the industry that epitomizes the use case for fixed point!


It seems logical if you don't understand how floating-point numbers work under the hood.


Those that do not understand why floating point math has no place in the world of money are doomed to repeat this mistake. Over and over and over ...

The snippet of what I presume is their API response is interesting:

     {"currencyCode":"XBT","balance":100000.00000008,"available":100000.00000008,"onDeposit":0}
The amount fields are JSON numbers. That's fine for integers up to 2^53 [1] but can have issues with floating point math. They'd be better off representing everything as satoshis and doing integer math on everything.

I bet their entire system is based on floating point math. Tsk, tsk ...

[1]: Yes, probably not an issue unless your Bill Gates and convert your net worth to Zimbabwean dollars...
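A hedged sketch of the satoshi conversion suggested above - parse the amount from its textual form rather than as a float (the helper name is made up):

```python
from decimal import Decimal

SATOSHIS_PER_BTC = 100_000_000

def btc_to_satoshis(amount: str) -> int:
    """Parse a BTC amount from its textual representation and convert
    it to integer satoshis, rejecting sub-satoshi precision."""
    sats = Decimal(amount) * SATOSHIS_PER_BTC
    if sats != sats.to_integral_value():
        raise ValueError(f"sub-satoshi amount: {amount}")
    return int(sats)

print(btc_to_satoshis("100000.00000008"))  # 10000000000008
```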


As someone who is trying to build deterministic simulations on top of floats - the idea of using floats for money boggles my mind. Getting a simple addition to give the exact same result on different machines - bit for bit - is already causing headaches.


That's odd. IEEE 754 requires that basic operations produce the closest possible result to the exact answer. (Those operations are simple arithmetic, square roots, remainders, rounding, and various format conversions.) As long as the rounding mode is the same on all machines, they should all give identical results for something like addition unless you're using hardware that doesn't do IEEE 754.

You will, of course, run into huge trouble if you perform more complex operations. All bets are off once you start doing things like trigonometry or logarithms, unless you implement it yourself.


Even if you only have 2 additions in a row, the following things cause problems: Rounding mode, distributivity (or rather, lack thereof), intermediate precision, compiler optimisation, denormals, real IEEE754 conformity of the platform... The list goes on.


A common cause of this is 32-bit x86 builds that use x87, which uses 80-bit precision which sounds like a good thing except that it only does it while values are in registers, which from the programmer's viewpoint means the precision might as well be fluctuating at random. Solution: don't use x87. There are a few compiler options for this, but the most obvious one is to just always build 64-bit.


* you are


Sounds like someone used floating point to store currency. Bad idea, unless it's part of a macroeconomics model or something.


This is why, when handling currency, you store in the lowest denomination possible and use whole numbers (if doing dollars, store cents as integers).


That does not solve the problem in general; you may still end up with fractional quantities if you, for example, calculate a 1% fee. The correct way to do it is to use a decimal floating point format AND carefully pay attention to rounding.

Also note that the way you round usually depends on the context: for some calculations you want to round up, for some down, and for some in yet another way. So you perform explicit rounding with an explicit rounding mode everywhere it is required and don't assume that implicit rounding or the default rounding mode will do the right thing, because in general they won't.

EDIT: To expand a bit on the context. If you, for example, charge a percentage fee, you can choose between rounding up, collecting at least the fee you are asking for, or rounding down, never charging the customer more than you are asking for. Or you can try to be less biased and round either up or down depending on the fractional part. This may still require tie-breaking, either in a direction of your choice, or randomized, or by some rule like round half to even. Different choices will obviously have different biases.

While this may seem pedantic, such things have been exploited; see for example »Computer Capers: Tales of Electronic Thievery, Embezzlement & Fraud«. [1][2] Also note that there may even be legal regulations prescribing how you can and cannot round, for example between the Euro and the replaced national currencies. [3] I also have some faint memories of a case against someone systematically exploiting rounding errors, but I cannot come up with a reference and may even misremember this.

[1] http://www.snopes.com/business/bank/salami.asp

[2] https://www.amazon.com/Computer-Capers-Electronic-Thievery-E...

[3] http://www.sysmod.com/eurofaq.htm#ROUNDING
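For illustration, Python's `decimal` module makes the rounding mode explicit per operation (figures hypothetical):

```python
from decimal import Decimal, ROUND_UP, ROUND_DOWN, ROUND_HALF_EVEN

amount = Decimal("599")         # cents
fee = amount * Decimal("0.01")  # 1% fee -> 5.99 cents

# Choose the rounding direction explicitly for each use case:
print(fee.quantize(Decimal("1"), rounding=ROUND_UP))         # 6
print(fee.quantize(Decimal("1"), rounding=ROUND_DOWN))       # 5
print(fee.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))  # 6
```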


Won't decimal floats still result in errors like this if the absolute value gets too large? You can avoid that by ensuring that the mantissa has enough digits to hold the largest value you'll ever encounter with enough significance to represent the smallest value you'll ever encounter, but at that point you might as well use an integer with that smallest value as the unit.


Of course, you can push any system to its limits, and you can certainly replace your calculations with integer arithmetic. But why would you? Decimal floating point types are readily available and usually offer enough range and precision for all practical purposes. So why reinvent the wheel? And just imagine the pain you are going to go through if you have to adjust your base denomination, say because the law suddenly requires two additional digits of precision. You would have to touch every number you ever stored.

Someone else in the comments mentioned arbitrary precision arithmetic, but even that doesn't solve all your problems magically. Not only is it comparatively slow and more cumbersome to deal with due to its variable-length nature, but it still fails to correctly deal with one third, so rounding will still be required unless you manage to avoid all the bad numbers.

You may then be tempted to switch to a rational number type with arbitrary precision integers as numerator and denominator. While that solves the one-third problem, you still have to approximate irrational numbers and transcendental functions, for example Euler's constant, which is not unlikely to appear in interest calculations. So you still have to think about rounding, even if possibly in fewer places.


Use fixed-point decimal math.


It also helps to only round once. Calculate and save intermediate values.

    fee = accountX * 0.01
    accountY = accountY + fee
    accountX = accountX - fee
vs

    accountY = accountY + accountX * 0.01
    accountX = accountX * 0.99
I've seen a fair bit of (non financial) accounting go wrong like this, where fractions just slip through the cracks.
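The compute-once version can be demonstrated with `decimal` (precision and fee chosen arbitrarily):

```python
from decimal import Decimal, ROUND_DOWN

account_x = Decimal("100.0001")
account_y = Decimal("0")
total_before = account_x + account_y

# Compute the fee once, round it once, then move exactly that amount.
fee = (account_x * Decimal("0.01")).quantize(Decimal("0.0001"), rounding=ROUND_DOWN)
account_y += fee
account_x -= fee

# Nothing is created or destroyed: the totals match to the last digit.
assert account_x + account_y == total_before
```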


That is not the correct way. The correct way is to use fixed-decimal math, and pay careful attention to rounding.

No floats. Not even once.


No floats. Not even once.

Do you mean no decimal floating point numbers or did you misread my comment as advocating binary floating point numbers plus careful rounding?

Fixed point decimal numbers are of course also okay, it just changes what you have to pay attention to. And often fixed point decimal numbers are in some sense secretly floating point decimal numbers, they just do not automatically adjust the decimal point [1] but require doing so explicitly.

And this may of course be useful because you will run into an overflow and not accidentally lose precision when numbers become large. On the other hand you may lose precision if numbers become small and you only have a fixed precision.

[1] I am not sure what the important factor is for calling a number type fixed point or floating point. Is it the ability to move the decimal point around or is it doing so automatically?


Ah, sounds like a terminology issue. Sorry if this gets pedantic, but I wanted a solid response for anyone who stumbles on this thread down the road.

Fixed-point is a way of doing math, as well as a data representation. Think of it as keeping a very large integer, plus another int that represents where the decimal place should go. Hell, you can even think of it as a string of numbers and an optional '.', removing the issue of overflow as well.

When you do fixed-point math, you have to be explicit about rounding and precision changes, or set a default way to handle those situations. When two numbers are added, do we inherit the precision of the more or less precise number? What about division, where we run into necessarily repeating decimals?

Floating point math, on the other hand, uses a mantissa and necessarily loses precision. Since a floating point number approximates the reals using a fixed number of bits (often 32, 64, or 128), there are a ton of amounts it can't exactly represent. But that's not an acceptable situation in finance - we don't need infinity, but we do need countability within a range with some minimum step size.

An example- let's say we're adding amounts of bitcoin in Python

  > 1000.1234 + 1000.1234
  2000.2468
  > 10000.1234 * 3
  30000.3702
Looks good, but now we have a whale, a high frequency trader, or some HackerOne bounty hunter with API access

  > 10000.1234 * 50
  500006.17000000004
That calculation, at face value, just created .00000000004 bitcoin out of thin air. That's less than a satoshi, but if you keep going with a strategy like this on an automated platform, you can turn it into real money.

Instead, you want to use fixed-point.

  > from decimal import Decimal
  > Decimal('10000.1234') * Decimal('50')
  Decimal('500006.1700')
Note that in this case the result keeps the four decimal places of the more precise operand, and no money was created out of thin air.

You can also set precision explicitly (`Decimal('1000.1234').quantize(Decimal('0.00000001'))`), handle rounding per operation or across calculations, etc.


Why, though? What are your arguments for fixed point over decimal floating point? If you ensure a sufficiently large mantissa, it will be more resilient in procedures like calculating a small percentage of a small value.


This is a fairly settled issue in finance AFAIK, but on to the arguments.

When doing math in finance, we need to be explicit about the precision we use, because we're counting finite things. Things like significant figures and rounding matter, and can't be ignored by imagining we have an "infinite precision" that throws away information, as floating point math does.

Using fixed-point decimal math as default helps mitigate this class of mistakes in this domain. It's similar to using a memory safe language outside embedded programming- if there's no value in the other option, and it's a source of bugs, eliminate it.

NB: Using fixed-point decimal math is similar to the advice to use integers, though it's a bit more flexible (eg keeping amounts in "BTC" with a fixed 8 decimal places instead of "satoshis").


Or use a language with arbitrary precision numbers (they are only about 50 years old, so I can see why many people don't use them /s). As someone else already pointed out, this still doesn't solve calculating fees as a percentage, among other issues.


Arbitrary precision rationals increase in storage and processing time each time you multiply them, so they open you up to DoS attacks.

My order of preference:

* Fixed-point arithmetic: All-around safe, though you'll probably need to take a maintenance window later if you need extra precision.

* Floating point arithmetic: Sounds dangerous, but you can eliminate most of the risk using database checks.

* Arbitrary-precision arithmetic: Almost any place they're used opens you up to DoS attacks; it's very difficult to detect the problem. I'd ensure my database and application server have debugging symbols installed, replace libgmp with something that alerts by generating core dumps when a timing threshold is passed, use a language where backtraces are easy to read in gdb, and ensure that all arbitrary-precision arithmetic is written against gmp. I'm not sure what I'd do when I detected the problem though: If I truncated 1 + 1/(10^10)! to 1 to resolve the issue, it's equivalent to using floating point in the first place.


> Arbitrary precision rationals increase in storage and processing time each time you multiply them

Only if you store the arbitrary precision number. Last I checked you can't actually have a tenth of a cent, so it has to be rounded at some point. Like when you store it. I was advocating calculation be done with arbitrary precision numbers which do not have this problem.

> Fixed-point arithmetic

This is a decent solution as long as you have enough precision and a strategy for detecting lost cents.

> Floating point arithmetic

Which is non-deterministic depending on optimization level and hardware. Wait for the bank to switch to a new cluster and watch all of their numbers change for the lols. No wonder banks don't know how much money they have.


I believe there are libraries to do this that can be used in any language.

http://speleotrove.com/decimal/ etc.

https://hn.algolia.com/?query=speleotrove&sort=byDate&dateRa...


Instructions unclear; booted left-pad. (If a developer is stupid enough to use floats for money, I doubt they are aware of how library systems work.)


C'mon, the Egyptians had arbitrary precision numbers.


Don't you still have to be smart about rounding at some point? Otherwise compounding a (say) 1% fee will lead to arbitrarily long numbers.


Yes. But then you write that at the boundary, along with the other consistency checks. And you are forced to think about it at the appropriate time, instead of forgetting about it entirely (floats) or introducing subtle errors (integers, then likely casting to a float to calculate 1% and casting back to an integer; hint: that's also a bad idea because it has floats in it, which - as we've learned recently on HN - are not that deterministic or precise).


Let's say you're not a complete idiot and avoid the use of floats altogether, what subtle errors would you avoid by using arbitrary precision numbers over integers?


Well, first riddle me this: how do you calculate a ⅓% (1/3 of a percent) fee with integers? Depending on your choice of solution I can tell you how it has a problem and how arbitrary precision numbers would solve it.


That is something I would be interested in. I always assumed arbitrary precision arithmetic would also have to turn one third into some finite approximation, though you can choose however much precision you need or want.

I ran across this once when I implemented a rational number type with arbitrary precision integers and wanted to print the decimal representation. I wanted to determine the repeating and non-repeating fractional parts but failed to find any applicable math or algorithm.

If I am not misremembering and did not do my research too poorly, this is an open problem, and at best you can start expanding the digits and look for a state you were in before. But at some point you have to give up, because you cannot even determine the length of the repeating and non-repeating fractional parts in advance, and even some relatively small numerators and denominators can produce pretty long decimal representations before they start repeating.
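For what it's worth, the repeating part can be recovered by tracking remainders during long division - a remainder must repeat within at most `denominator` steps (a sketch, not a production routine):

```python
def decimal_expansion(numerator: int, denominator: int):
    """Long division with remainder tracking: when a remainder recurs,
    the digits produced since its first appearance form the repetend."""
    integer_part, remainder = divmod(numerator, denominator)
    seen = {}      # remainder -> index in digits where it first occurred
    digits = []
    while remainder and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if not remainder:  # expansion terminates
        return integer_part, "".join(digits), ""
    start = seen[remainder]
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 6))  # (0, '1', '6')  i.e. 1/6 = 0.1(6)
```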


I'd do: fee = amount / 300

More generally: represent the percentage as a rational number, then multiply the amount by the numerator and then divide by the denominator.


With integers this will round towards zero. It may be that you only wanted to charge a fee of 1 for an amount of 599, but it would not be unreasonable to want to charge a fee of 2 in this case. You could of course fix that by adding 299 or maybe 150 before the division, but that makes for some pretty hard to understand code.


You can wrap up the logic for doing a division with floor, ceiling, or whatever rounding in a function so you don't get ugly and error prone stuff at each call site.

If you wanted your fees to round up then the actual code would look something like:

  fee = round_up_multiply(amount, 1, 300)
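A minimal integer-only sketch of such a helper (`round_up_multiply` is just the name suggested above, not an existing library function):

```python
def round_up_multiply(amount: int, numerator: int, denominator: int) -> int:
    """amount * numerator / denominator, rounded up, using only
    integer arithmetic (assumes non-negative inputs)."""
    return -(-amount * numerator // denominator)  # ceiling division

# 1/3% fee on 599: 599/300 = 1.996..., rounded up to 2.
print(round_up_multiply(599, 1, 300))  # 2
```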


At this point (danbruc correctly covered the first part of my point) you have basically reinvented an arbitrary precision library by storing the numerator, denominator and then having a utility function to choose the exact behavior of the result when converting back to numbers you use for storage and show to the user.

This isn't a fixed point number system, because you are now using a non-fixed-point representation (e.g. two numbers, which can't be represented in fixed point because of their repeating nature) to represent your fraction. Trying to get all of that behavior right is more error prone than just using an arbitrary precision library in the first place.


Do you see how you will end up reimplementing a fixed point decimal number type?


Of course. An integer that represents a number of some subdivision of your nominal "one" unit is a fixed-point decimal number type. They're two names for the same thing.

But the other commenter was discussing arbitrary precision arithmetic libraries, not fixed-point decimals.


Of course. An integer that represents a number of some subdivision of your nominal "one" unit is a fixed-point decimal number type. They're two names for the same thing.

And an array of single precision floats is a zero-terminated UTF-8 encoded string. Using an actual fixed point decimal number type is still preferable over going straight to the underlying integer, even if it is only for the easier to read source code with decimal points where they belong instead of having to perform the transformation in your head while reading.

But the other commenter was discussing arbitrary precision arithmetic libraries, not fixed-point decimals.

Fair point.


I take your point, but the nitpicker in me (some might say there's nothing else in there) wants to point out that an array of floats is highly unlikely to be valid UTF-8.

If your language of choice has a good fixed-point decimal library, or you can switch to one that does, then I agree that would definitely be the superior solution.


I take your point, but the nitpicker in me (some might say there's nothing else in there) wants to point out that an array of floats is highly unlikely to be valid UTF-8.

I was afraid of that response and thought about using an integer array and ASCII, but it is so nicely abstruse, I could not resist.


Exactly. (And yes, it's good to point out the right way, not just the wrong way!) I've heard of some applications using tenths of a cent since they needed more precision.

For Bitcoin applications, I think you'd want to work in integer satoshis.


The default currency type I've seen used for accounting is a decimal with 4 digits after the decimal point. I forget the leading digits, but it is something like 14.

Basically, I think the idea with 4 digits after the decimal is that the smallest denomination you can have is 1 penny, and that allows you to calculate percentages of pennies accurately to a hundredth of a penny.


What if you have to calculate, say, a 0.5% fee on a few satoshis?


Calculate it and round it off after calculation to the nearest satoshi.


The lowest denomination of the dollar is the mill, not the cent.


I think banks keep track of calculations (like interest) beyond the cent, e.g. down to the 4th decimal place of a dollar, 1/100 of a penny. Then they round to the nearest cent when settling.
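As a sketch (rate and precision choices hypothetical), accruing at extra precision and rounding only at settlement might look like:

```python
from decimal import Decimal, ROUND_HALF_EVEN

balance = Decimal("1234.56")    # dollars
daily_rate = Decimal("0.0001")  # hypothetical daily interest rate

# Accrue interest at four decimal places of a dollar...
accrued = (balance * daily_rate).quantize(Decimal("0.0001"), rounding=ROUND_HALF_EVEN)
print(accrued)  # 0.1235

# ...then round to the nearest cent only when settling.
settled = accrued.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
print(settled)  # 0.12
```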


There are still items in the US that use mills (1/1000 of a dollar). Two examples: property tax and gas.


Reminds me of the penny shaving plot in Office Space.

https://www.youtube.com/watch?v=6pX3OcPAHR8


All I get from that site, with some ad blocking on Firefox 51, is:

    11:34:40.979 SecurityError: The operation is insecure. 1 frontend.fe083880.js:15
	e.exports<.initialize https://hackerone.com/assets/frontend.fe083880.js:15:31757
	t.Model https://hackerone.com/assets/vendor.b247925b.js:31:4648
	e.exports<.constructor https://hackerone.com/assets/frontend.fe083880.js:2:20109
	q/r< https://hackerone.com/assets/vendor.b247925b.js:31:22455
	<anonymous> https://hackerone.com/assets/frontend.fe083880.js:19:31012
	t https://hackerone.com/assets/vendor.b247925b.js:1:96
	n https://hackerone.com/assets/frontend.fe083880.js:26:14283
	forEach self-hosted
	<anonymous> https://hackerone.com/assets/frontend.fe083880.js:96:23586
	t https://hackerone.com/assets/vendor.b247925b.js:1:96
	window.webpackJsonp https://hackerone.com/assets/vendor.b247925b.js:1:418
	<anonymous> https://hackerone.com/assets/frontend.fe083880.js:1:1
Fail, Hacker One.


Isn't this the same as the premise of the movie Office Space?


To quote from that movie:

“Yeah, they did it in Superman III.”


And then a rounding error in the software to exploit the rounding error ends up almost getting them caught.


I don't understand how you can run a financial exchange and not have break analysis running periodically (if not on every transaction). How else are you going to catch misinteractions between complex systems on an ongoing basis?

Apparently Fintech startups have no need for the wisdom built up over the course of 500 years of modern banking, because this stuff is Banking 101 for any engineer at a bank.


It requires 3*10^8 transactions to get the amount of the bounty. Good deal.


It's a good thing nobody has access to machines that can rapidly perform repetitive tasks.


Maybe people at NASA.


Maybe people at NSA. FTFY



