Unless marketing blogs from any company *specifically* say what model they are t...

mquander · 2025-08-23T05:41:54 1755927714

You should not assume that "they've chosen one particular tiny model", or "it's a median across all models including the tiny one they use for all search queries" because those are totally made up assumptions that have nothing to do with what they say they measured. They measured the Gemini Apps product that completes text prompts. They also provided a chart showing that the thing they are measuring scores comparably to GPT-4o on LM Arena.

penteract · 2025-08-23T06:56:33 1755932193

From the report:

> To calculate the energy consumption for the median Gemini Apps text prompt on a given day, we first determine the average energy/prompt for each model, and then rank these models by their energy/prompt values. We then construct a cumulative distribution of text prompts along this energy-ranked list to identify the model that serves the 50-th percentile prompt.

They are measuring more than one model. I assume this statement describes how they chose which model to report the LM arena score for, and it's a ridiculous way to do so - the LM arena score calculated this way could change dramatically day-to-day.

mgraczyk · 2025-08-23T06:08:10 1755929290

> total energy use across their whole AI product range, including training, is the only number that really matters.

What if they are serving more requests?

mgraczyk · 2025-08-23T05:35:25 1755927325

They did specifically say in the linked report

esperent · 2025-08-23T05:57:24 1755928644

Here's the report. Could you tell me where in it you found a link to 33x reduction (or any large reduction) for any specific non-tiny model? Because all I can find is lots of references to "median Gemini". In fact, I would say they're being extremely careful in this paper not to mention any particular Google models with regards to energy reduction.

https://services.google.com/fh/files/misc/measuring_the_envi...

mgraczyk · 2025-08-23T06:09:28 1755929368

Figure 4

I think you are assuming we are talking about swapping API usage from one model to another. That is not what happened. A specific product doing a specific thing uses less energy now.

To clarify: the way models become more efficient is usually by training a new one with a new architecture, quantization, etc.

This is analogous to making a computer more efficient by putting a new CPU in it. It would be completely normal to say that you made the computer more efficient, even though you've actually swapped out the hardware.

sigilis · 2025-08-23T06:31:22 1755930682

Don’t they call all their LLM models Gemini? The paper indicates that they specifically used all the AI models to come up with this figure when they describe the methodology. It looks like they even include classification and search models in this estimate.

I’m inclined to believe that they are issuing a misleading figure here, myself.

mgraczyk · 2025-08-23T06:37:20 1755931040

They reuse the word here for a product, not a model. It's the name of a specific product surface. There is no single model and the models used change over time and for different requests

immibis · 2025-08-23T06:42:27 1755931347

So it includes both tiny models and large models?

mgraczyk · 2025-08-23T06:44:30 1755931470

I would assume so. One important trend is that models have gotten more intelligent for the same size, so for a given product you can use a smaller model.

Again this is pretty similar to how CPUs have changed

immibis · 2025-08-23T15:25:21 1755962721

So it's not a specific product doing a specific thing, but the average across different things?

simianwords · 2025-08-23T11:26:17 1755948377

“Gemini App” would be the specific Gemini App in the App Store. Why would it be anything different?

esperent · 2025-08-23T06:23:09 1755930189

> Figure 4: Median Gemini Apps text prompt emissions over time—broken down by Scope 2 MB emissions (top) and Scope 1+3 emissions (bottom). Over 12 months, we see that AI model efficiency efforts have led to a 47x reduction in the Scope 2 MB emissions per prompt, and 36x reduction in the Scope 1+3 emissions per user prompt—equivalent to a 44x reduction in total emissions per prompt.

Again, it's talking about "median Gemini" while being very careful not to name any specific numbers for any specific models.

logicprog · 2025-08-23T11:31:01 1755948661

You're grouping those words wrong. As another commenter pointed out to you, which you ignored, it's median (Gemini Apps) not (median Gemini) Apps. Gemini Apps is a highly specific thing — with a legal definition even iirc — that does not include search, and encompasses a list of models you can actually see and know.

esperent · 2025-08-24T07:31:42 1756020702

I didn't ignore it, I actually spent some time researching to find out what Google means by "Gemini Apps" (plural) and whether it includes search AI overview, and I can't get a clear answer anywhere.

Of course, Gemini App (singular) means the mobile app. But it seems that the term Gemini Apps (plural) is being used by Google to refer to any way in which users can access the Gemini models, and also they do clearly state that a version of Gemini isused to generate the search overviews.

So it still seems reasonably likely, until they confirm otherwise, that this median includes search overview.

simianwords · 2025-08-24T09:18:09 1756027089

"This section presents the environmental impact metrics for the Gemini Apps AI assistant" is this also not specific enough?

esperent · 2025-08-24T13:34:25 1756042465

No, because unless they state otherwise we should assume that they consider search overview to be an AI assistant (they definitely believe this) and also that it's one of the Gemini Apps.

Look, there's not enough information to answer this within the paper. I'm not willing to give Google the benefit of the doubt on vague language, and you are. I'm assuming they're a huge basicappy evil corporation whose every publication is gone over and reworded by marketing to make them look good, and you're assuming... whatever.

That's fine by me, we disagree. Let's stop here.

simianwords · 2025-08-23T11:25:24 1755948324

What do you think the Gemini app means? It can only mean the consumer facing actually existing Gemini App that exposes 2 models.

esperent · 2025-08-24T13:35:31 1756042531

They refer to Gemini Apps, plural. One of those apps is also called the Gemini App, singular.

mgraczyk · 2025-08-23T06:36:32 1755930992

That isn't what that means. Look at the paragraph above that where they explain.

This is the median model used to serve requests for a specific product surface. It's exactly analogous to upgrading the CPU in a computer over time

tovej · 2025-08-23T07:53:17 1755935597

The median does not move if the upper tail shifts, it only moves if the median moves.

The fact that they do not report the mean is concerning. The mean captures the entire distribution and could actually be used to calculate the expected value of energy used.

The median only tells you which point separates the upper half from the lower half, if you don't know anything else about the distribution you cannot use it for any kind of analysis.

esperent · 2025-08-23T07:21:22 1755933682

I can't copy text from that pdf on my phone, but the paragraph above says exactly what you'd expect: they're using a "median" value from a "typical user" across all Gemini models. While being very careful not to list the specific models which are used to calculate this median, because it almost certainly includes the tiny model used to show AI summaries on google.com, which would massively skew the median value. As someone above said, it's like adding 8 extra meals of a single lettuce leaf and then claiming you reduced the median caloric intake of your meals.

simianwords · 2025-08-23T11:23:42 1755948222

This doesn’t check out. It is not reasonable to interpret “Gemini app” as also including a functionality that is embedded in google searches.

Gemini app is a specific thing: the Gemini App that actually exists.

How can Gemini App also include their internal augmented functionality on search which itself is not an application?

tupshin · 2025-08-23T11:50:28 1755949828

If I, as a regular Google user ask in the search "is this search powered by Gemini?", the AI generated result is in the affirmative.

"Yes, this search is powered by a customized version of the Gemini model for its generative AI features."

Based on that, I'm not sure how it is reasonable to claim that Gemini App has a legal term that is exclusive of its use in search.

Amusingly, it refuses to answer if i ask "is this search powered by Gemini app?"

simianwords · 2025-08-23T12:03:33 1755950613

What? The paper clearly says "This section presents the environmental impact metrics for the Gemini Apps AI assistant". You are going through lots of hoops instead of just reading the paper.