Hacker News | bfgeek's comments

One has to wonder if this is due to the global memory shortage. ("Oh - changing our memory allocator to be more efficient will yield $XXM in savings over the next year.")

Facebook gave talks about this years ago (10+). Nobody was allowed to share real numbers, but several Facebook employees were allowed to share that the company had measured savings from optimizations. Reading between the lines, a 0.1% efficiency improvement to some parts of Facebook would save them $100,000 a month (again, real numbers were never publicly shared, so there is a range - it can't be less than $20,000), and so they had teams of people whose job it was to find those improvements.

Most of the savings seemed to come from HVAC costs, followed by buying fewer computers and, in turn, fewer data centers. I'm sure these days saving memory is also a big deal, but it doesn't seem to have been back then.

The above was already the case 10 years ago, so LLMs are at most another factor added on.


I don't have many regrets about having spent my career in (relatively) tiny companies by comparison, but it sure does sound fun to be on the other side for this kind of thing - the scale where micro-optimizations have macro impact.

In startups I've put more effort into squeezing blood from a stone for far less change, even when the change was proportionally more significant to the business. Sometimes it would be neat to say "something I did saved $X million or Y kWh of energy" or whatever.


I've worked on optimizing systems in that ballpark. Memory is worth saving, but it isn't necessarily 1:1 with increasing revenue the way CPU is. For CPU we have tables to calculate the infra cost savings (we're not really going to free up the server; it's more that the system is self-balancing, so it can run harder with the freed CPU). But for memory, as long as we can load in whatever we want (rec systems or AI models), we're in the clear, so the marginal headroom isn't as important. It's more of a side thing that people optimizing CPU also get wins in by chance, because the skill sets are similar.

I've heard of some people getting banned from FB to save memory space? Surely that can't be the case but I swear I've seen something like that

There are some people who think they can beat the system by treating apps like Telegram and Discord as free cloud storage, and they certainly get banned to save storage space.

> LLMs are at most another factor added on

At most... Think 10x rather than 0.1x or 1x.


On top of cost, they probably cannot get as much memory as they order in a timely fashion so offsetting that with greater efficiency matters right now.

Yeah, identifying single-digit millions of savings out of profiles is relatively common practice at Meta. It's ~easy to come up with a big number when the impact is scaled across a very large number of servers. There is a culture of measuring and documenting these quantified wins.

Oooh maybe finally time for lovingly hand-optimized assembly to come back in fashion! (It probably has in AI workloads or so I daydream)

With that company's reputation, one can imagine a lot of backstories that are even more depressing than a memory shortage.

Not just the shortage - any improvement to LLM/electricity/server memory footprint is becoming much more valuable as time goes on. If you can get 10% faster, you can easily get a lead in the LLM race. The incentives to transparently improve performance are tremendous.

> changing our memory allocator

they've been using jemalloc (and employing "je") since 2009.


HTML parsing supports some of this, e.g.:

  text <b>bold <i>bold-italic</b> italic</i>


that becomes:

    text <b>bold <i>bold-italic</i></b><i> italic</i>
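A toy sketch of this kind of recovery (not the real HTML5 "adoption agency" algorithm, just the core idea: on a mismatched close tag, close the intervening tags and reopen them afterwards):

```python
import re

def fixup(html):
    # Much-simplified mis-nesting recovery; real parsers follow the
    # HTML spec's "adoption agency" algorithm, which is more involved.
    out, stack = [], []
    for tok in re.split(r'(</?\w+>)', html):
        m = re.fullmatch(r'<(/?)(\w+)>', tok)
        if not m:
            out.append(tok)                  # plain text
        elif not m.group(1):                 # open tag
            stack.append(m.group(2))
            out.append(tok)
        else:                                # close tag
            name, reopen = m.group(2), []
            while stack and stack[-1] != name:
                reopen.append(stack.pop())   # close intervening tags...
                out.append(f'</{reopen[-1]}>')
            if stack:
                stack.pop()
            out.append(f'</{name}>')
            for t in reversed(reopen):       # ...then reopen them
                stack.append(t)
                out.append(f'<{t}>')
    return ''.join(out)

print(fixup('text <b>bold <i>bold-italic</b> italic</i>'))
# -> text <b>bold <i>bold-italic</i></b><i> italic</i>
```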


Yeah. After parsing.


Blink's (Chromium) text layout engine works the following way.

1. Lay out the entire paragraph of text as a single line.

2. If this doesn't fit into the available width, bisect to the nearest line-break opportunity which might fit.

3. Reshape the text up until this line-break opportunity.

4. If it fits, great! If not, go to 2.

This converges because it always steps backwards, and it avoids contradictory situations.
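A rough sketch of that loop in Python, assuming a fixed-width glyph model in place of real shaping (a real engine would re-shape the candidate line via HarfBuzz at step 3):

```python
import bisect

CHAR_W = 7  # assumed fixed glyph width; real shaped text is not monospace

def measure(text):
    # stand-in for shaping the run and summing glyph advances
    return len(text) * CHAR_W

def break_lines(text, avail):
    # break opportunities: just past each space, plus end-of-text
    breaks = [i + 1 for i, c in enumerate(text) if c == ' '] + [len(text)]
    lines, start = [], 0
    while start < len(text):
        # 1. lay out the entire remaining text as a single line
        end = len(text)
        while measure(text[start:end]) > avail:
            # 2. bisect to the nearest break opportunity that might fit
            budget = min(start + avail // CHAR_W, end - 1)
            k = bisect.bisect_right(breaks, budget) - 1
            if k < 0 or breaks[k] <= start:
                break  # unbreakable run longer than the line: overflow
            # 3. "reshape" up to that opportunity; 4. loop if it still
            # overflows - `end` only ever decreases, so this converges
            end = breaks[k]
        lines.append(text[start:end].rstrip())
        start = end
    return lines

print(break_lines('lay out all the text', 8 * CHAR_W))
# -> ['lay out', 'all the', 'text']
```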

HarfBuzz also provides points within a section of text at which it is safe to reuse the previous shaping result, so reshaping typically involves only a small portion of text at the end of the line, if any. https://github.com/harfbuzz/harfbuzz/issues/224

This approach is different to how many text layout engines approach this problem e.g. by adding "one word at a time" to the line, and checking at each stage if it fits.


> This approach is different to how many text layout engines approach this problem e.g. by adding "one word at a time" to the line, and checking at each stage if it fits.

Do you know why Chrome does it this way?


We found it was roughly on par performance-wise for simple text (Latin), and faster for more complex scripts (Thai, Hindi, etc.). It's also more correct when there is kerning across spaces, hyphenation, etc.

For the word-by-word approach to be performant you need a cache for each word you encounter. The shape-by-paragraph approach we found was faster for cold-start (e.g. the first time you visit a webpage). But this is also more difficult to show in standard benchmarks as benchmarks typically reuse the same renderer process.


You can exploit flexbox for this type of layout: https://bfgeek.com/flexbox-image-gallery/


> "At this point, the engineers in Australia decided that a brute-force approach to their safe problem was warranted and applied a power drill to the task. An hour later, the safe was open—but even the newly retrieved cards triggered the same error message."

What happened here (from what I recall) was far funnier than this gives it credit for.

The SREs first attempted to use a mallet (hammer) on the safe (which they had to first buy from the local hardware store - don't worry it got expensed later), then after multiple rounds of "persuasion" they eventually called in a professional (aka. a locksmith) who used a drill+crowbar to finally liberate the keycard.

The postmortem had fun step by step photos of the safe in various stages of disassembly.


What are the chances that the photos could be shared?


One thing to keep in mind when building these large lists of fonts is that they are generally terrible for performance when the first font is available but lacks the appropriate glyphs for what you are trying to display (this isn't an issue if the font isn't available at all).

This is generally more of an issue with non-Latin scripts (or when emoji are present, for example), with developers adding a font which has no glyph coverage - or only sparse glyph coverage.

Chrome/Firefox devtools both have a section ("Rendered Fonts"/"Used Fonts") which shows which glyphs are used from which font.

Additionally if you are showing non-latin, make sure to language tag your markup: https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...

`font-family: sans-serif`, if not language tagged, will incur a similar fallback performance penalty (the browser will have to check the "english" sans-serif font, find no glyphs, then use the "other-lang" sans-serif font).
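For instance (a sketch - the `lang` attribute lets the browser resolve `sans-serif` to a script-appropriate font directly):

```html
<!-- without lang, the browser resolves sans-serif for the page's
     default language, finds no Thai glyphs, and only then falls back -->
<p lang="th" style="font-family: sans-serif">สวัสดี</p>
```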


> which are more powerful and is out-of-spec

These are in the specification here: https://drafts.fxtf.org/filter-effects-1/#typedef-filter-url

And used by backdrop-filter here: https://drafts.fxtf.org/filter-effects-2/#BackdropFilterProp...


My biggest pet peeve is designers using high-end Apple displays.

Your average consumer is using an ultra-cheap LCD panel that has nowhere near the contrast ratio you are designing your mocks on; all of your subtle tints get saturated out.

This is similar to good audio engineers back in the day wiring up a dirt cheap car speaker to mix albums.


Those displays also have a huge resolution and eye-blindingly bright contrast by default, which is also how you get UI elements which are excessively large, tons of wasted space padding, and insanely low contrast.


> This is similar to good audio engineers back in the day wiring up a dirt cheap car speaker to mix albums.

Isn't that the opposite of what's happening?

I have decent audio equipment at home. I'd rather listen to releases that were mixed and mastered with professional grade gear.

Similarly, I'd like to get the most out of my high-end Apple display.

Optimizing your product for the lowest common denominator in music/image quality sounds like a terrible idea. The people with crappy gear probably don't care that much either way.


Ideally, you do both. Optimize on crap hardware, tweak on nice hardware.


This was likely more effective quite a few years ago, but it's not particularly important today.

Changing height typically only shifts elements, and browser engines typically won't re-lay them out due to position changes alone.

"overflow: clip" is also much more lightweight than "overflow: hidden"


Part of the design constraint here is to reuse the existing multi-column layout properties, which have been around for a long time - https://developer.mozilla.org/en-US/docs/Web/CSS/column-rule

This proposal extends this mechanism to be more general.

