smokel · 2026-03-24T07:52:16 1774338736

We need a website with refutations that one can easily link to. This interpretation of LLMs is outdated and unproductive.

smokel · 2026-03-24T07:50:58 1774338658

No. AlphaGo does search, and does so imperfectly. It does come up with creative new patterns not seen before.

smokel · 2026-03-22T19:45:19 1774208719

I understand that in a research lab or in academia, this is common practice. But in the more menial coding industry that most of us are probably in, how do you find time for this? Do people read papers in their spare time and discuss over lunch, or are there enlightened managers who support this during working hours?

Foe · 2026-03-22T20:05:25 1774209925

Good question. Most people read the paper on their own time, and we meet over lunch. The meetings themselves are just an hour, so it's not a massive time block. I've found that the people who show up are the ones who are genuinely curious and would be reading this stuff anyway (and sometimes just need a commitment/accountability to do it). Having a group gives them a reason to do it on a schedule.

oa335 · 2026-03-22T20:53:34 1774212814

> The meetings themselves are just an hour, so it's not a massive time block

How exactly are the meetings structured? I.e., is someone leading the discussion? Does each person go around and share thoughts? Etc.

Foe · 2026-03-23T02:40:56 1774233656

We usually start with quick overall impressions, then go around with a few prompts: "what's something new you learned?", "what didn't you like?", and "what didn't you fully understand?" (every paper has something, whether it's the evaluation methodology or some algorithm detail). That last question tends to drive most of the discussion because people chime in and build on each other's answers. Sometimes you get lucky with domain expertise in the room. For example, when we read "What Every Programmer Should Know About Memory"[1], one of the attendees was a former Intel engineer who spent their career in memory systems. They answered questions the rest of us wouldn't have even known to ask.

[1] https://people.freebsd.org/~lstewart/articles/cpumemory.pdf

zihotki · 2026-03-22T20:35:10 1774211710

That implies that you have a fixed time for lunch and also chat during lunch. I may be in the minority, but I prefer to eat when I'm hungry and focus on the food instead of chatting. And there are also allergies: as a celiac, I have big trouble eating together with others, since they may accidentally contaminate my food.

tanjtanjtanj · 2026-03-22T23:28:37 1774222117

I’m actually curious here, not trying to question your experience but does other people’s food regularly contaminate your food when you eat at the same table as them?

I’ve lived with a celiac sufferer before and I’ve never heard of something that extreme, but everyone’s different.

fc417fc802 · 2026-03-23T13:50:53 1774273853

The degree of sensitivity of allergies varies widely. For example there are people who only have a problem after consuming a large scoop of peanut butter but there are also those who will end up in the hospital from trace amounts that you'd have difficulty spotting with the naked eye.

FuriouslyAdrift · 2026-03-23T15:24:04 1774279444

I dated a woman with celiac sprue (which I guess was extreme.. her mother had to have a bowel resection due to celiac related issues) and she had sudden anaphylaxis at a restaurant that required the use of an epi-pen and an ambulance.

The reaction was caused by the micro-brewery that had opened next door and all the wheat dust in the ventilation system.

tayo42 · 2026-03-22T23:51:12 1774223472

When I've seen this done, yeah, you block a fixed time for a "meeting" during lunch time.

titanomachy · 2026-03-23T08:35:04 1774254904

It sounds like you could get very high ROI from chilling out a little bit. If one social lunch per month is an unfathomable hardship then you're probably leaving a lot of other opportunities on the table. Do you have OCD or social anxiety or something?

obezyian · 2026-03-23T09:56:51 1774259811

Apparently, people with celiac disease do have "anxiety or something":

https://en.wikipedia.org/wiki/Coeliac_disease#Dietary_challe...

eldenring · 2026-03-22T20:03:51 1774209831

I'm not sure what you mean by menial coding, but all my employers have supported this in the past. This was a variety of companies: big tech, startups, etc. I think it's more likely your employer is the outlier.

nico · 2026-03-22T20:15:06 1774210506

I’ve been scolded for reading books and documentation for the tasks and software I was asked to build (at a startup) during my regular work hours

No company I’ve worked at has ever had dedicated time for reading papers or articles

Maybe I’ve only worked at outliers?

userbinator · 2026-03-22T22:34:23 1774218863

All the companies I've worked at implicitly assume that you're supposed to use your working hours for more than just coding, including learning what you need for the task at hand, although if you're looking at very beginner material that might raise some suspicion.

nico · 2026-03-23T03:19:22 1774235962

In the case I mentioned above, the company wanted me to build a search engine before Elasticsearch existed, and before there was full-text search in popular DBs like Postgres or MySQL. The CTO/founder gave me his credit card and told me to buy whatever books I needed. I bought about 5 different relevant books. Work days were about 10-12 hours, and they still wanted me to read/research on my own time.

johngossman · 2026-03-22T20:41:41 1774212101

In 35 years in the industry, reading and studying during work hours were always supported. Frankly, most places would let us play video games during work hours as long as we met our deadlines.

amarant · 2026-03-23T00:15:54 1774224954

I've had mandated gaming on Friday after lunch. But this was in the gaming industry so it's "market research"!

We also often played board games. My favourite was playing Secret Hitler with my team that one time. That was fun! (I managed to become "untouchable" while also being Hitler. That's a memorable moment!)

smokel · 2026-03-23T09:30:20 1774258220

Interestingly, the person at Microsoft states in a reply that even most of them have to pick this up in their spare time. Judging from other replies, it seems that there are quite some differences in how companies approach this.

What I meant by "menial coding" is those jobs where people have to submit TPS reports on how many hours they spend on each customer. Reading a paper such as this one [1] is typically not directly necessary for being a good frontend developer, but it might stimulate someone to develop into a more fruitful employee in the long run. Managers would have to explain to their customer why time is being spent on that, and that requires some vision and creativity, which is not always a given.

[1] https://arxiv.org/abs/1706.03762

thi2 · 2026-03-22T20:28:41 1774211321

That's my experience as well. Of course not ten papers a day, but some learning is always encouraged.

One company had a +1 day. You worked 4 days, had 1 day for learning - everything relevant for the job was fine.

yellow_postit · 2026-03-23T01:32:19 1774229539

In my experience it is a lot like finding time to work on "strategy". There's never really explicit time given, you have to make it in the day, and it's often the most valuable time spent.

rectang · 2026-03-23T03:51:47 1774237907

The groups I ran were scheduled during lunch. Technical management would look the other way if we ran over time or if people spent a certain amount of their work day reading the material.

Even if you have enlightened technical management, it's helpful if you don't force them to spend political capital justifying groups like this. Getting our enlightened CTO to spend a few hundred dollars on books was easy when we were a startup. Once we got acquired, making that argument to unreceptive higher-ups wasn't worth it for anybody.

zihotki · 2026-03-22T20:24:19 1774211059

This is a very good question. I also struggle to find a good way to process various signals (papers, techniques, etc.) with my co-workers while maintaining a proper work-life balance. Either you have to be a full-time geek, or be left behind.

munchbunny · 2026-03-22T23:24:00 1774221840

I sneak thirty minutes in here and there for it regardless of my manager. If you work, say, 40-45 hours a week, you’re probably doing 20 hours of true focused productivity. It’s easier to borrow here and there from the other half of the time to flip through a paper or two.

OJFord · 2026-03-23T10:40:51 1774262451

If it happens in the office and on the calendar then I can't imagine it being an issue? (vs. an extended jolly at the pub every lunch through the afternoon for unofficial 'reading group'!) Would take quite a micromanaging and anti-L&D employer/manager.

markn951 · 2026-03-22T20:38:08 1774211888

Speaking as a SWE manager who explicitly “mandates” this (not actually mandatory, but I strongly encourage following your passions and interests in an academic kind of way!): we do exist, and I assume I’m not the only one :)

My team almost always can find an hour between tasks organically so I’ve never really had to push

Insanity · 2026-03-23T01:52:28 1774230748

I'm a SWE manager as well. I always tell my team that learning is part of the job, and so it can happen on the job. To be honest, it has worked out pretty well. I also lead by example: I'll read something interesting during work and share it with the team.

vasco · 2026-03-23T06:27:19 1774247239

You could realistically read 2-3 papers per visit to Hacker News if you were doing that instead. Me too.

smokel · 2026-03-22T15:20:44 1774192844

No, there is one award each year, and this year it is shared equally between two people: Charles Bennett and Gilles Brassard. This happens more often, and it has even been shared between three people (in 2002, 2007 and 2018).

smokel · 2026-03-22T13:12:59 1774185179

This article made me enthusiastic to dive into Bayesian statistics (again). A quick search led me to Think Bayes [1], which also introduces the concepts using Python, and seems to have a little more depth.

[1] https://allendowney.github.io/ThinkBayes2/

smokel · 2026-03-15T16:21:24 1773591684

> AI really wants to use Project Panama

It would help if you briefly specified the AI you are using here. There are wildly different results between using, say, an 8B open-weights LLM and Claude Opus 4.6.

matt_heimer · 2026-03-15T18:25:49 1773599149

I've been using several. LM Studio and any of the open weight models that can fit my GPU's RAM (24GB) are not great in this area. The Claude models are slightly better but not worth the extra cost most of the time, since I typically have to spend almost the same amount of time reworking and re-prompting, plus it's very easy to exhaust credits/tokens. I mostly bounce back and forth between the codex and Gemini models right now, and this includes using pro models with high reasoning.

smokel · 2026-03-11T12:21:25 1773231685

> Imagine that we made an LLM out of all dolphin songs ever recorded, would such LLM ever reach human level intelligence? Obviously and intuitively the answer is NO.

Not so fast. People have built pretty amazing thought frameworks out of a few axioms, a few bits, or a few operations in a Turing machine. Dolphin songs are probably more than enough to encode the Game of Life. It's just how you look at it that makes it intelligence.

smokel · 2026-03-09T21:39:11 1773092351

I've written an Obsidian clone for myself, which has proper Emacs keybindings. Took me a few hours too many to get in all the features that I need.

What I find interesting is that I have little motivation to open source it. Making it usable for others requires a substantial amount of time, which would otherwise be just a fraction of the development time.

xorvoid · 2026-03-09T21:55:35 1773093335

I was thinking about doing the same. Build a clone with AI custom tailored for my own quirks. And not bothering to open source it because it's too bespoke for anyone else. How hard was this? Can you share any advice?

smokel · 2026-03-10T07:20:58 1773127258

It turned out to be pretty hard in some places. I'm using CodeMirror as the basic building block, which is great, but it does not support WYSIWYG table editing out of the box. Getting that to work requires one to use a separate CodeMirror instance for the cell editor, which makes things rather complicated. For the LLM as well :)

I think I've spent ~20 hours and a couple hundred dollars of Claude Opus tokens in Cursor. So it's not cheaper or easier, but the amount of frustration saved with having proper Emacs keybindings might delay catastrophic global warming by a few days.

Oh, and of course I'm not compatible with all the Obsidian extensions, nor do I have proper hosting for server-side sync yet. All in all, a fool's errand, but I'm having fun.

jerlendds · 2026-03-15T21:51:45 1773611505

I'm doing the exact same thing, but I'm building my Obsidian clone with Rust and gpui, and primarily with Codex. So far I estimate I've been solely vibe coding it for ~15 hours now, with only one small change made by hand. I'd be interested in comparing notes and our different approaches to this. Feel free to shoot me an email at jerlendds at osintbuddy dot com if you want to chat.

I have a small demo video of yesterday's work here: https://github.com/jerlendds/mdi

There have since been many additions; I'll update the video tonight.

xorvoid · 2026-03-10T16:00:42 1773158442

Thank you! Re extensions: my thinking was that if you build a clone, then extensions become irrelevant. Just build what you need directly into the software. Extension systems always seemed to me to be second-class citizens. I think I read an old story of Linus Torvalds using an old fork of microemacs, and whenever he disliked something he would just go tweak its C code (e.g. key bindings). I'm kind of thinking that, but done with an LLM. Software could in theory be smaller and more bespoke. And if you want it to work differently, you just prompt an LLM to change the actual source code. Then you don't need higher-level configuration/customization interfaces. Simpler software.

smokel · 2026-03-10T19:48:25 1773172105

This is an interesting topic. I always loved the idea of extensions, for multiple reasons. But they do have their disadvantages, and I'm eager to find out how extension systems will hold up in the time of LLMs.

A major advantage of (certain) extension mechanisms is that you can update them in real-time. For example, in Emacs you can change functions without losing the current state of the application. In Processing or live coding environments, you can even update functions that affect real-time animation or audio.

Another advantage is that they can expose a very nice API that allows other people to learn an abstraction of the core application. If you are the sole developer, and if you can spend the time to keep an active memory of the core application, this does not help much. But it can certainly help others to build upon your foundation. GIMP and Emacs are great examples of this.

A disadvantage is that you have to keep supporting the extension mechanism, or otherwise extensions will break. That makes an ecosystem somewhat slower to adapt. Emacs is the prime example here. We're still stuck with single-threaded text mode :)
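To make the live-update point concrete, here is a minimal Python sketch of the idea: commands live in a lookup table, so one of them can be redefined while the application state stays intact. This is only an analogy under stated assumptions, not how Emacs or Processing implement it; the `App` class and its `define` and `run` methods are hypothetical names invented for the illustration.

```python
from typing import Callable, Dict, List


class App:
    """Toy application whose commands can be replaced while it keeps running."""

    def __init__(self) -> None:
        # State that must survive a command being redefined.
        self.buffer: List[str] = []
        # Commands are looked up by name on every call, so a new
        # definition takes effect immediately.
        self.commands: Dict[str, Callable[["App", str], None]] = {}

    def define(self, name: str, fn: Callable[["App", str], None]) -> None:
        """Install or replace a command without touching the state."""
        self.commands[name] = fn

    def run(self, name: str, arg: str) -> None:
        """Dispatch to whatever definition is current for this name."""
        self.commands[name](self, arg)


def insert_plain(app: App, text: str) -> None:
    app.buffer.append(text)


def insert_shouting(app: App, text: str) -> None:
    app.buffer.append(text.upper())


if __name__ == "__main__":
    app = App()
    app.define("insert", insert_plain)
    app.run("insert", "hello")
    # "Hot" redefinition: the buffer written so far is kept.
    app.define("insert", insert_shouting)
    app.run("insert", "world")
    print(app.buffer)
```

The same indirection is what an extension author has to keep stable, which is the maintenance cost mentioned above.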

bityard · 2026-03-09T22:37:20 1773095840

I have a theory (and I'm sure I am far from the first one to voice it) that the number of useful open source projects released to the public will be on the decline now that anyone can scratch their own itch with a few hours of vibe coding. Why would I spend hours evaluating a dozen different note-taking applications and _maybe_ find one that is _kinda close_ to what I want, if I can instead have Claude vibe me one up _exactly_ the way I want it?

(I actually did write my own note-taking application, but that was before LLMs were any good at writing code.)

archagon · 2026-03-10T00:12:39 1773101559

Because when it eventually and inevitably corrupts your data, you won't know what to do or have any recourse?

TheAceOfHearts · 2026-03-10T01:01:01 1773104461

Surely any sane person vibe coding a note taking app just has it save all the notes as markdown files to disk? At that point making a backup is trivial and they're unlikely to get corrupted.
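A minimal sketch of what "Markdown files on disk plus a trivial backup" could look like in Python. Everything here is an assumption made for illustration: the function names, the slug-based file naming, and the timestamped backup folders are hypothetical and not taken from any app discussed in this thread.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_note(notes_dir: Path, title: str, body: str) -> Path:
    """Write one note as a Markdown file."""
    notes_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: lowercase title, whitespace becomes dashes.
    slug = "-".join(title.lower().split())
    target = notes_dir / f"{slug}.md"
    tmp = target.with_suffix(".md.tmp")
    tmp.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    # Rename over the old file so an interrupted write does not
    # truncate the existing note.
    tmp.replace(target)
    return target


def backup_notes(notes_dir: Path, backup_root: Path) -> Path:
    """Copy the whole notes directory into a timestamped backup folder."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    destination = backup_root / f"notes-{stamp}"
    shutil.copytree(notes_dir, destination)
    return destination


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    note = save_note(root / "notes", "Reading Group", "Papers to discuss.")
    copy = backup_notes(root / "notes", root / "backups")
    print(note, copy)
```

Because the notes are plain files, any other backup tool (rsync, git, a cloud sync folder) would work just as well; the copy step is only the simplest possible version.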

archagon · 2026-03-10T01:03:57 1773104637

So why vibe code a version of a thing that already exists in a dozen different permutations, and with actual eyes on the codebase?

smokel · 2026-03-10T07:26:14 1773127574

In a typical open source project only one person has had a look at a particular piece of code. Only in the larger and more mature projects do people actually spend time reviewing code. Also, if you don't pay for the free code, there is often no serious recourse to recover your data either.

As stated in my first comment, Obsidian does not support Emacs keybindings properly, nor is it open source. Writing an extension to add Emacs keybindings is not at all trivial, because you have to work around a lot of existing and undocumented functionality.

There are other reasons for not vibe coding your own alternative, but as LLMs keep progressing, these reasons may become less relevant.

smokel · 2026-03-08T09:15:01 1772961301

> This outperforms the majority of online llm services

I assume you mean outperforms in speed on the same model, not in usability compared to other more capable models.

(For those who are getting their hopes up on using local LLMs to be any replacement for Sonnet or Opus.)

moffkalast · 2026-03-08T10:04:14 1772964254

Obviously it won't reach the quality of a paid-tier, 2T-sized SOTA model, but it can probably roughly match Haiku at the very least. And for tasks that aren't super complex, that's already enough.

Personally though, I find Qwen useless for anything but coding tasks because of its insufferable sycophancy. It's like 4o dialed up to 20, every reply starts with "You are absolutely right" with zero self awareness. And for coding, only the best model available is usually sensible to use otherwise it's just wasted time.

Anduia · 2026-03-08T10:13:20 1772964800

That's why I start any prompt to Qwen 3.5 with:

persona: brief rude senior

amelius · 2026-03-08T13:16:08 1772975768

I'm using:

persona: drunken sailor

Because then at least the tone matches the quality of the output and I'm reminded of what I can expect.

moffkalast · 2026-03-08T20:50:35 1773003035

But then what do you do with it early in the morning?

amelius · 2026-03-08T21:10:25 1773004225

For starters, shave his belly with a rusty razor, obviously ;)

dlcarrier · 2026-03-08T16:39:39 1772987979

Does it tend to break out into sea shanties?

drob518 · 2026-03-08T18:30:31 1772994631

Yo, ho, ho, and a bottle of rum.

yunnpp · 2026-03-08T19:45:47 1772999147

https://www.youtube.com/watch?v=C_k8wYuk8PQ

em500 · 2026-03-08T13:31:06 1772976666

This also works

persona: emotionless vulcan

dlcarrier · 2026-03-08T16:42:55 1772988175

Does "persona: air traffic controller" work?

If I could set up a voice assistant that actually verifies commands, instead of assuming it heard everything correctly 100% of the time, it might even be useful.

9wzYQbTYsAIc · 2026-03-08T15:47:33 1772984853

persona: fair witness

https://fairwitness.bot/

Chris2048 · 2026-03-08T20:19:54 1773001194

You just paste in that YAML? Is this an official llm config format that is parsed out?

9wzYQbTYsAIc · 2026-03-10T17:31:57 1773163917

Yeah, just paste it in there - the LLM will figure it out. Play with it if you want to tweak the formatting - you could try JSON instead, but for readability I went with YAML.
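As described above, the persona line is just text placed in front of the prompt, not a config format that anything parses. A minimal Python sketch of that prefixing step follows; the `with_persona` helper is a hypothetical name, and whether a given model actually honors the header is an assumption to test rather than a guarantee.

```python
def with_persona(persona: str, prompt: str) -> str:
    """Prefix a prompt with a one-line, YAML-style persona header."""
    # Collapse whitespace so the header always stays on a single line.
    persona = " ".join(persona.split())
    return f"persona: {persona}\n\n{prompt.strip()}"


if __name__ == "__main__":
    print(with_persona("brief rude senior", "Review this function for bugs."))
```

The resulting string is what gets pasted or sent as the message; no special API field is involved.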

ranger_danger · 2026-03-09T04:36:35 1773030995

wow I had no idea you could do that. this changes everything for me.

varispeed · 2026-03-08T14:59:10 1772981950

persona: party delegate in a rural province who doesn't want to be there

lemonginger · 2026-03-08T11:33:30 1772969610

gamechanger

andai · 2026-03-11T19:42:40 1773258160

>for coding, only the best model available is usually sensible to use otherwise it's just wasted time.

I had the opposite experience. Gave a small model and a big model the same 3 tasks. The small model was done in 30 sec. The large model took 90 sec (3x longer) and cost 3x more. Depending on the task, the benchies just tell you how much you are over-paying and over-waiting.

wehadit · 2026-03-12T20:37:28 1773347848

If you use the models like we execute coding tasks, older models outperform latest models. There's this prep tax that happens even before we start coding, i.e., extract requirements from tools, context from code, comments and decisions from conversations, ACs from Jira/Notion, stitch them together, design tailored coding standards and then code. If you automate the prep tax, the generated code is close to production ready code and may require 1-2 iterations max. I gave it a try and compared the results and found the output to be 92% accurate while same done on Claude Code gave 68% accuracy. Prep tax is the cue here

itsTyrion · 2026-03-12T01:05:54 1773277554

oh? I used it in t3 chat before, with traits `concise` `avoid unnecessary flattery/affirmation/praise` `witty` `feel free to match potential user's sarcasm`

and it does use that sarcasm permission at times (I still dislike the way it generally communicates)

ggregoire · 2026-03-08T14:11:52 1772979112

> I find Qwen useless for anything but coding tasks because of its insufferable sycophancy

We've used Qwen at work since 2.0 for text/image/video analysis (summarization, categorization, NER, etc.), and I think it's impressive. We ask for JSON and always ask "do not explain your response".
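A minimal Python sketch of the "ask for JSON, do not explain" pattern, with a parser that tolerates a model wrapping the object in extra text anyway. The prompt wording beyond the quoted instruction, the function names, and the sample reply are all assumptions for illustration; no particular model API is implied.

```python
import json

# The second sentence is the instruction quoted above; the first is an assumption.
INSTRUCTION = "Respond with a single JSON object. Do not explain your response."


def build_prompt(task: str, text: str) -> str:
    """Combine the task, the JSON-only instruction, and the input text."""
    return f"{task}\n\n{INSTRUCTION}\n\nText:\n{text}"


def parse_json_reply(reply: str) -> dict:
    """Return the first JSON object found in a reply, ignoring stray text."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(reply, start)
            return value
        except json.JSONDecodeError:
            start = reply.find("{", start + 1)
    raise ValueError("no JSON object found in reply")


if __name__ == "__main__":
    prompt = build_prompt("Categorize the text.", "Qwen 3.5 was released.")
    # Hypothetical reply from a model that explained itself anyway.
    reply = 'Here you go: {"category": "news"} Hope that helps.'
    print(parse_json_reply(reply))
```

Parsing defensively like this is a design choice: the instruction reduces chatter, but it does not guarantee the reply is pure JSON.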

segmondy · 2026-03-08T16:56:37 1772988997

You can replace Sonnet and Opus with local models, you just need to run the larger ones.

smokel · 2026-03-07T15:25:06 1772897106

Apparently, there is no scientific evidence that ANC does or does not cause tinnitus.

ANC reduces background noise, which typically allows users to listen at lower volumes, thereby reducing total sound exposure to the ear. So if the user adapts their volume, that would lead to less risk of tinnitus. This works for me :)

But there are lots of people on forums suggesting that there is a link between tinnitus and ANC. One reason could be that ANC headphones allow you to listen very accurately to inner auditory signals, and if you already had some tinnitus, you might start to notice it.