Hacker News | TrainedMonkey's comments

I really enjoyed the fantasy of many small farmers. It felt rustic. However, as I understand it, the modern world is moving towards megacorps and economies of scale.

Has moved.

Why is that?

You honestly don't know of Moritz Hardt?

Why so snarky? I also didn't know who he was:

> I'm a director at the Max Planck Institute for Intelligent Systems. Prior to joining the institute, I was Associate Professor for Electrical Engineering and Computer Sciences at the University of California, Berkeley. My research contributes to the scientific foundations of machine learning and algorithmic decision making with a focus on social questions.[0]

Also simply knowing of him doesn't answer the question.

[0] https://mrtz.org/


sry just a joke man

xkcd 1053, my friend.

> it doesn't dramatically reduce screen brightness or image quality.

AFAIK it significantly decreases the brightness. JerryRigEverything demonstrates this here: https://youtu.be/TRW4W7KkJXs?t=32


"Significantly" and "dramatically" are two different things. I was sceptical when buying it, but I have no problem using the display with the privacy screen on, and I don't see that much difference in brightness, even in direct sunlight, fwiw.

Bonus: with it on you can stretch your battery life. Having only half the pixels active saves quite a bit of battery, who knew!


You’re paying more for less brightness.

Only if you turn it on for the whole screen at all times, and you are still getting a privacy screen out of it, so it's not a loss with no benefits.


Servicing is not done by Apple; it's done by third-party contractors. They work from a rubric of possible issues provided by Apple, and their profit margins are thin. I suspect contacting Apple support about Apple support issues would have resulted in a swift replacement of the item.

Will need to wait for real benchmarks, but based on OpenAI's marketing, Instant is their latency-optimized offering. For a voice interface, you don't actually need high tok/s because speech is slow; time to first token matters much more.
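To put rough numbers on the latency point, here is a back-of-the-envelope sketch. Every figure in it is an assumption for illustration (a speaking rate of about 150 words per minute, about 1.3 tokens per word, and two made-up model profiles), not a measurement of any real model.

```python
# All numbers below are illustrative assumptions, not measurements.
SPEECH_WORDS_PER_MIN = 150   # assumed conversational speaking rate
TOKENS_PER_WORD = 1.3        # assumed rough average for English text


def tokens_per_second_needed(words_per_min=SPEECH_WORDS_PER_MIN,
                             tokens_per_word=TOKENS_PER_WORD):
    """Token rate a model must sustain to keep up with speech playback."""
    return words_per_min / 60 * tokens_per_word


def time_until_audio_starts(ttft_s, first_chunk_tokens, tok_per_s):
    """Seconds until the first chunk of text is ready for speech synthesis.

    We wait ttft_s for the first token, then generate the remaining
    tokens of the chunk at tok_per_s.
    """
    return ttft_s + (first_chunk_tokens - 1) / tok_per_s


needed = tokens_per_second_needed()

# Two hypothetical model profiles: quick to start vs. high throughput.
quick_start = time_until_audio_starts(ttft_s=0.2, first_chunk_tokens=10, tok_per_s=30)
high_throughput = time_until_audio_starts(ttft_s=1.0, first_chunk_tokens=10, tok_per_s=200)

print(f"tok/s needed to keep up with speech: {needed:.2f}")
print(f"quick-start model, delay before audio: {quick_start:.3f}s")
print(f"high-throughput model, delay before audio: {high_throughput:.3f}s")
```

Under these assumptions, both hypothetical models generate far faster than speech consumes tokens, so the delay a listener notices is dominated by time to first token rather than throughput.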


Apple seems to be pushing for accessibility and volume: cheaper phones, Mac minis, and an entry-level Mac that will be introduced on Wednesday.


Tbh, feels like the market is pushing for that and Apple is responding.


Exceptionalism says we have the best of everything, including idiots.


What if, and hear me out here, "You don't have to"


No, what he is saying is that benchmarks are static and there is tremendous reputational and financial pressure to make benchmark number go up. So you add specific problems to training data... The result is that the model is smarter, but the benchmarks overstate the progress. Sure there are problem sets designed to be secret, but keeping secrets is hard given the fraction of planetary resources we are dedicating to making the AI numbers go up.

I have two comments of my own to add to that. The first is that there is a problem-alignment issue at play. Specifically, the benchmarks are mostly self-contained problems with well-defined solutions and specific prompt language, while human tasks are open-ended, with messy prompts and a lot of steering. The second is that it would be interesting to test older models on brand new benchmarks to see how they compare.


> No, what he is saying is that benchmarks are static and there is tremendous reputational and financial pressure to make benchmark number go up.

That's a much better way to say it than I did.

These models are known for being open weights, but they're still products that Alibaba Cloud is trying to sell. They have product managers and PR and marketing people under pressure to get people using them.

This VentureBeat article is basically a PR piece for the models and Alibaba Cloud hosting. The pricing table is right in the article.

It's cool that they release the models for us to use, but don't think they're operating entirely altruistically. They're playing a business game just like everyone else.


There should be a way to turn the questions we ask LLMs into benchmarks.

That way, we can have a benchmark that is always up to date.


There are a few “updating” benchmarks out there. I periodically take a look at these two:

https://swe-rebench.com/

https://livebench.ai/

