patresh's comments

I believe OP's point is that, for a given model quality, inference cost decreases dramatically over time. The article you linked talks about effective total inference costs, which seem to be increasing.

Those are not contradictory: a company's inference costs can increase because it deploys more models (Sora), deploys larger models, does more reasoning, and serves more demand.

However, if we look purely at how much it costs to run inference on a fixed number of requests at a fixed model quality, I am quite convinced that inference costs are decreasing dramatically. Here's a model from late 2025 [1] (see the Model performance section) with benchmarks comparing a 72B-parameter model from early 2025 (Qwen2.5) to the late-2025 8B Qwen3 model.

The 9x smaller model outperforms the larger one from earlier the same year on 27 of the 40 benchmarks they were evaluated on, which is just astounding.

[1] https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct


They're likely of limited use for someone looking for introductory ML material, but for someone who has done some computer vision and used various types of convolution layers, it can be useful to see a summary with visualizations.
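
For instance, a quick PyTorch sketch (just an illustration I put together, not taken from the linked material) showing how three common convolution types differ in their output shapes:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

    # Standard 3x3 convolution: padding=1 preserves the spatial size.
    standard = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    # Dilated convolution: same kernel size, larger receptive field;
    # padding=2 keeps the output at 32x32.
    dilated = nn.Conv2d(3, 8, kernel_size=3, padding=2, dilation=2)

    # Transposed convolution: upsamples, here doubling height and width.
    transposed = nn.ConvTranspose2d(3, 8, kernel_size=2, stride=2)

    print(standard(x).shape)    # torch.Size([1, 8, 32, 32])
    print(dilated(x).shape)     # torch.Size([1, 8, 32, 32])
    print(transposed(x).shape)  # torch.Size([1, 8, 64, 64])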


Does anyone have experience with longer DeepResearch tasks with Mammouth? How does it compare to using Gemini's / ChatGPT's DeepResearch or GPTResearcher + API-based alternatives?

For standard questions, I feel like it doesn't matter too much what you use. When it comes to multi-step search + reasoning flows (look for alternatives, fetch pricing and feature lists, compare, etc.), the differences are larger because of the engineering glue and prompting around the pure LLM inference, which is what makes the tools more or less powerful.
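
To illustrate what I mean by that glue, here's a rough sketch of such a flow; search_web(), fetch_page(), and llm() are hypothetical stand-ins, not the actual API of any of these tools:

    # Hypothetical sketch of a multi-step research flow. search_web(),
    # fetch_page(), and llm() stand in for whatever search backend and
    # model a real tool wires together.

    def deep_research(question: str, max_rounds: int = 3) -> str:
        notes: list[str] = []
        for _ in range(max_rounds):
            # Let the model decide what to look up next, given notes so far.
            query = llm(f"Question: {question}\nNotes: {notes}\nNext search query:")
            # Search and read a few sources: the "glue" around pure inference.
            for url in search_web(query)[:3]:
                page = fetch_page(url)
                notes.append(llm(f"Extract facts relevant to {question!r}:\n{page}"))
            # Stop early if the model believes it has enough evidence.
            if "ENOUGH" in llm(f"Notes: {notes}\nReply ENOUGH or MORE:"):
                break
        # Final synthesis: compare alternatives, pricing, features, etc.
        return llm(f"Write a comparison answering {question!r} from:\n{notes}")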


I've had no issues with the app lately, but it's still missing the ability to build a local search index and search e-mail content, like the web client can.


Yes, I don't mean that HN doesn't experience toxicity, but to put things in context: if you read random posts on X versus HN, there is no comparison.

Moderation certainly helps; could there be ways to make it scale with less manual supervision? Or a system that organizes people under certain rule-sets and distributes them into suitably sized groups?

I do agree with your statement that "Good discussions evolve naturally and also randomly". But let's say your platform becomes popular: it will attract players who will want to exploit it to sway opinions for their own gain, and I believe it is becoming increasingly cheap to game such systems and to simulate whole crowds. So the limits I suggest are mostly with this in mind.

Indeed, perhaps the term "social platform" is vague, and the "optimal rules" could differ between a mega-forum, a network of friends, and generic post sharing.

I'm wondering whether some sort of taxonomy of these rule-sets or levers exists. Or a review paper on what has been tried and what effects it had? There are so many possible ways to structure online social interactions.


> if you read random posts on X versus HN, there is no comparison.

Fair. At this point, I'm not sure X should still be called social; it's really just a mess of bots and voices.

> Moderation certainly helps; could there be ways to make it scale with less manual supervision?

This would be the golden goose of communication. Everyone wants good automated moderation, but depending on the topic, crowd, and size, it's really hard, and probably expensive, depending on the solution. The main problem is that you need a very good understanding of any disputed topic to judge whether something is good for the discussion or not. And not even all human mods have this on all topics.

> let's say your platform becomes popular: it will attract players who will want to exploit it to sway opinions for their own gain, and I believe it is becoming increasingly cheap to game such systems and to simulate whole crowds. So the limits I suggest are mostly with this in mind.

Understandable. And yes, this often happens: a community grows, gains numbers, and the vibe and focus shift in some way. It's similar to what is usually called "going mainstream". Numbers influence the community, and it's hard to preserve the original character. And this is the normal social dynamic. Communication is always, at some level, about "swaying opinions" and exploiting others for some goal.

So if I understand you correctly, you want to isolate the bad actors and limit their impact? The question is whether you can successfully separate them from honest actors, or even good actors. Maybe a mechanical or automatic way to build up reputation, social standing, and social impact might be a way. HN, for example, uses karma points to unlock certain features at certain levels. Maybe if you build a more detailed karma system, one more complex than plain points, it would be possible to create a semi-automated system for healthy social interactions?

As I already said, I don't like simple voting systems, because they are too simple and tend to drift into pure number games. For example, nobody knows why something receives votes, and people tend to vote more for certain comments which are not necessarily beneficial for the discussion. So I think a more diversified voting scheme, with meaningful votes, would be better. On GitHub, people use emojis to communicate their reaction to messages in issues, and some projects even make use of them for certain actions. So using a set of preselected emojis with specific positive and negative meanings would IMHO enhance the simple voting system, and maybe allow automated reputation-building, which an automated modding system could then use.
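
A toy sketch of what that could look like; the reaction vocabulary and weights below are invented for illustration, not taken from GitHub or HN:

    # Illustrative only: this reaction set and its weights are made up.
    REACTION_WEIGHTS = {
        "insightful": 2.0,     # adds new information to the discussion
        "well_sourced": 1.5,   # backs claims with references
        "funny": 0.5,          # pleasant but not substantive
        "off_topic": -1.0,
        "hostile": -2.0,
    }

    def reputation(reactions_per_comment: list[dict[str, int]]) -> float:
        """Aggregate signed reaction counts into one reputation score."""
        score = 0.0
        for reactions in reactions_per_comment:
            for name, count in reactions.items():
                score += REACTION_WEIGHTS.get(name, 0.0) * count
        return score

    # A user with two comments: one well received, one flagged as hostile.
    print(reputation([{"insightful": 3, "funny": 1}, {"hostile": 2}]))  # 2.5

An automated modding system could then gate features on this score the way HN gates downvoting on karma, but with some signal about why the score is what it is.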

> I'm wondering whether some sort of taxonomy of these rule-sets or levers exists.

There is a broad body of knowledge in communication science, diplomacy, psychology, sociology, etc. But whether it can be applied to a social platform is a different question. Social platforms should be easy and simple; people want to chat and entertain themselves. If you make it too complicated or annoying, they won't participate much, and the platform will die. The biggest problem is again resources: manpower for modding, manpower for organization, time invested in using the platform.

And thinking about it, there are also all kinds of specialized subreddits which have strict rules about how they communicate and toward which goal. They are usually fairly good, tame, and focused in their disputes.


Indeed, there are different societal structures that would attract one type of person more than another.

I wonder if it would be possible to simulate this to understand what behaviors emerge if you set certain types of rules. It is certainly difficult to make LLMs act out coherent personalities in realistic ways, but I wonder if one could get an approximation.
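
Something like this toy loop is what I have in mind; the personas, the rule text, and the llm() callable are all hypothetical:

    # Toy sketch: persona prompts plus a platform rule-set, with an LLM
    # call standing in for each simulated user. Everything here is made up.
    PERSONAS = ["patient expert", "contrarian", "karma farmer"]
    RULES = "Posts under 20 characters are rejected; personal attacks are removed."

    def simulate_round(llm, thread: list[str]) -> list[str]:
        for persona in PERSONAS:
            post = llm(f"You are a {persona} on a forum with these rules: {RULES}\n"
                       f"Thread so far: {thread}\nWrite your next post:")
            # Crude stand-in for automated moderation enforcing the rule-set.
            if len(post) >= 20 and "attack" not in post.lower():
                thread.append(post)
        return thread

Run it for many rounds under different rule-sets and compare what the threads look like; whether the personas stay coherent enough for the comparison to mean anything is exactly the open question.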

Perhaps what I have in mind is also not best described as "pleasant", but rather something that is net-positive for society: something society as a whole is better off having than not. This is arguably the case for HN, but not necessarily for some of the bigger platforms out there.


I also enjoy watching Charles, a French-Canadian cyclist currently cycling from Canada to Europe. As a geologist, he regularly explains the rock formations and rock types he encounters.

https://www.youtube.com/c/Charlesenv%C3%A9lo


> currently cycling from Canada to Europe.

Isn't there, like, the ocean? Or does he go the Karl Bushby way over the Bering Strait?


If the diagram is representative of what is happening, it would seem that each cluster is represented as a hypersphere, possibly using the cluster centroid as the center and the max distance from the centroid to any cluster member as the radius. Those hyperspheres can then overlap. Not sure that is what is actually happening, though.
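
For concreteness, here's a small numpy sketch of that reading (purely my guess at the mechanism, not confirmed by the project):

    import numpy as np

    def hypersphere(points: np.ndarray) -> tuple[np.ndarray, float]:
        """Represent a cluster by its centroid and the max member distance."""
        centroid = points.mean(axis=0)
        radius = float(np.linalg.norm(points - centroid, axis=1).max())
        return centroid, radius

    def overlaps(a, b) -> bool:
        """Two hyperspheres overlap iff centroid distance <= sum of radii."""
        (ca, ra), (cb, rb) = a, b
        return float(np.linalg.norm(ca - cb)) <= ra + rb

    rng = np.random.default_rng(0)
    a = hypersphere(rng.normal(0.0, 1.0, size=(50, 8)))  # cluster near origin
    b = hypersphere(rng.normal(2.0, 1.0, size=(50, 8)))  # shifted cluster
    print(overlaps(a, b))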


What is the clustering performed on? Is another embedding model used to produce the embeddings or do they come from the LLM?

Typically, LLMs don't produce embeddings that are directly usable for clustering or retrieval, so embedding models trained with contrastive learning are used instead; however, there seems to be no mention of any models other than LLMs.
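
For context, the usual pipeline looks something like this sketch, assuming sentence-transformers and scikit-learn; the model name is just a common example, not necessarily what this project uses:

    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans

    texts = ["reset my password", "forgot login", "refund request", "billing issue"]

    # A contrastively trained embedding model, unlike raw LLM hidden states.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(texts, normalize_embeddings=True)

    labels = KMeans(n_clusters=2, n_init="auto").fit_predict(embeddings)
    print(list(zip(texts, labels)))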

I'm also curious about what type of clustering is used here.


I agree with your premise that there is often an unproductive pendulum-like phenomenon in public debates where interpretations swing from one extreme to the other, making nuanced discussions difficult.

However, I don't believe PG's article was meant to address the elephant; rather, it was a meta-level thesis on how he sees debates being shut down by orthodoxy, and for that he does suggest what he thinks would be a possible solution.

Perhaps the thesis could have gained from being more balanced, to, as you say, "avoid giving tacit permissions for the extremists on the other side"? On the other hand, does one always have to shield one's expressions with disclaimers, and is one not free to share thoughts, however raw, in order to express, discuss, learn, and update one's beliefs?

There is likely a bigger responsibility to avoid misinterpretation when one has a larger audience, but ultimately I believe that as long as a rational and nuanced discussion takes up the good points and produces a productive debate, it should be okay.

How can we create incentives to have a more nuanced discussion?


> does one always have to shield one's expressions with disclaimers, and is one not free to share thoughts, however raw, in order to express, discuss, learn, and update one's beliefs?

The problem with one-sided criticism of extremism is:

1. It is indistinguishable from the default extreme-vs-extreme debate. So it amplifies stupidity all around.

2. The takeaway is unclear. Are all programs to counterbalance discrimination just evil things from the bottom up?

3. It ignores the middle ground. Guess we'd better give up on being more fair, on benefiting more from society's outcasts, and on fairness in general? It must be anti-capitalist, anti-technology, anti-patriotic, or something?

None of that is helpful.

So yes, I would say quite strongly: addressing complex, divisive issues requires wide situational awareness, nuance, intellectual humility, curiosity, honesty, and an aim to move discussion away from division and toward solutions.

Not more reactionary communiques.

---

A completely different approach would be: these DEI programs are out of hand and creating new problems of their own. Not good. But the status quo they are meant to address isn't good either.

So, here are some thoughts on how we could systematically address harmful discrimination in a way that doesn't forget to be fair to everyone else... And being fair to everyone is the point of all this, right?

If anyone might be a useful mentor here, it could be PG, if he steps back and thinks about things more. It fits with his general quest to help startups succeed on all fronts: wisdom for handling side issues well, professionally, and creatively, so they don't keep cropping up as distractions.

Does PG or YC have a sensible, practical, low-ideology view on ensuring that hiring and employee treatment reflect and benefit from diversity, avoiding the pitfalls of unfair discrimination without creating new ones, and defining diversity to mean ALL of us?

That might take more thought, but it would be well worth a PG post. It is also liable to strike more people, FROM ALL SIDES or NO SIDES, as worthy of consideration.

