> Q: Are the releases aligned with pre-training efforts?
> A: There used to be a time not that long ago, maybe half a year, distant past, where the models would align with RL runs or pretraining runs ... now the naming is by capability. GPT5 is a capable model; 5.1 is a more capable model
> I also think it’s important to notice that a lot of these challenges happen with humans too. The concept of prompt injection isn’t that different from social engineering, right? When somebody calls in and says, “Oh, I forgot my password, can you just help me this one time?”
I wonder if the error propagation problem could be solved with a “branching” generator? Basically at every token you fork off N new streams, with some tree pruning policy to avoid exponential blowup. With a bit of bookkeeping you could make an attention mask to support the parallel streams in the same context sharing prefixes. Perhaps that would allow more of an e2e error minimization than the greedy generation algorithm in use today?
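Roughly, the fork-and-prune loop I'm imagining (with a hypothetical next-token-candidates call standing in for the model, returning (token . logprob) pairs best-first, and all parameters made up) admittedly ends up looking a lot like beam search:

;; Sketch only. Each stream is (tokens . cumulative-logprob); pruning keeps
;; the MAX-STREAMS best streams so the pool can't blow up exponentially.
(defun branching-generate (prefix steps &key (fanout 3) (max-streams 16))
  (let ((streams (list (cons prefix 0.0))))
    (dotimes (i steps streams)
      (let* ((forked
               (loop for (tokens . score) in streams
                     for candidates = (next-token-candidates tokens) ; hypothetical model call
                     append (loop for (tok . logprob)
                                    in (subseq candidates
                                               0 (min fanout (length candidates)))
                                  collect (cons (append tokens (list tok))
                                                (+ score logprob)))))
             (n-forked (length forked)))
        ;; pruning policy: keep only the best cumulative scores
        (setf streams (subseq (sort forked #'> :key #'cdr)
                              0 (min max-streams n-forked)))))))

The attention-mask bookkeeping that would let the forked streams share the prefix in one context is the part not shown here.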
Having the data structures is nice and all, but using them is kind of painful. They are certainly second class.
Having to use accessor functions or destructuring macros instead of just a period or -> is often annoying too. The lack of syntax has cons as well as pros.
Writing a reader macro that allows for something like...
[some-numbers 0]
...to get the first (many programming languages make this mistake, using 0 to refer to the first element of a collection, so we can forgive CL for this) element. But I'm curious how you can write...
(object -> slot)
...without getting an error about OBJECT not being a valid function or macro.
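For reference, the bracket half can be as small as this (the ELT expansion and the names are just my illustration):

;; Make #\] a terminating macro character, like #\).
(set-macro-character #\] (get-macro-character #\) nil))
;; Read [coll idx] as (elt coll idx); a minimal sketch, not a robust implementation.
(set-macro-character #\[
  (lambda (stream char)
    (declare (ignore char))
    (let ((forms (read-delimited-list #\] stream t)))
      (list 'elt (first forms) (second forms)))))

After that, [some-numbers 0] reads as (elt some-numbers 0).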
The Lisp 1.5 Programmer's Manual, dated 1962, already describes a 0-based array feature. Lisp was clearly one of the historic instigators of zero-based arrays, rather than just playing along.
Yes, but the various Lisps that Common Lisp is the more-or-less common subset of are (were?) all 0-indexed. Between easy heap implementation (left is (ash index 1), right is (1+ (ash index 1)), parent is (ash index -1)) and easy last-element selection ((nth (length seq) seq)) I prefer 1-indexing, but I realize that's an unpopular opinion.
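Spelled out next to the 0-indexed equivalents (my own comparison, with CL-style names):

;; 1-indexed heap navigation, as above:
(defun heap-left   (i) (ash i 1))        ; 2i
(defun heap-right  (i) (1+ (ash i 1)))   ; 2i + 1
(defun heap-parent (i) (ash i -1))       ; floor(i / 2)
;; The 0-indexed versions each need an extra offset:
;; left = 2i + 1, right = 2i + 2, parent = floor((i - 1) / 2)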
A late reply, but it's worth showing one way of doing this. First, your concern about object not being a valid function or macro isn't relevant at read time. Second, note that Lisp already has similar syntax: '(1 . 2) is essentially (cons 1 2). Implementing this kind of syntax is not a privilege of the implementation alone. You're allowed to redefine your own reader for the left paren. In SBCL:
You can write `(set-macro-character #\( 'sb-impl::read-list)` and everything continues to work just fine. You can also jump-to-source and modify it if you want -- though it's cleaner to just copy it out to your own project; that's what I did for a quick hack/proof of concept. Essentially, I added the following before the existing (when...) that handles the special dot syntax:
(when (and (eq firstchar #\-)
           (eq (peek-char t stream t nil t) #\>))
  (read-char stream t) ; actually read the nextchar > to discard it
  (let ((next-obj (read stream)))
    (sb-impl::flush-whitespace stream rt)
    (return `(slot-value ,@listtail ',next-obj))))
I won't claim this is good or proper, but it shows that it's quite feasible. We've turned (foo -> bar) into (slot-value foo 'bar).
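With that change loaded, a quick illustration (class and slot names made up):

(defclass point () ((x :initform 1) (y :initform 2)))
(let ((p (make-instance 'point)))
  (p -> x))  ; reads as (slot-value p 'x) => 1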
Personally I wouldn't use this even if it was more properly/carefully implemented. (There's really no reason to replace the default left-paren reader, and no reason we have to have a space surrounding the "->". One thing I like about the infix reader macro package https://github.com/quil-lang/cmu-infix is that it doesn't care about spaces, I can write #I(1+1 + 4) and get 6.) I'm quite happy putting my class in its own package, and thus getting the primary tab-completion behavior I care about. e.g. "(ma:<tab>" could complete to "(math:" and then "(math:v<tab>" could complete to a list of options like "vector-x" "vector-y" or so on. I also like the somewhat unusual approach of naming my accessors with a dot prefix, e.g. (.x vec) and (.y vec), or even (math:.x vec) if I haven't imported the symbol.
Sparse attention essentially combines 3 types of attention optimizations:
1. Compression of sequential blocks of key/value vectors into coarse block-level representations, so each query attends over far fewer entries than the full KV cache
2. Selectively computing uncompressed attention on a subset of tokens based on the compressed blocks with the highest attention scores
3. Using a sliding window for local attention at full resolution (a rough sketch of how these branches pick positions follows below)
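Here is a toy paraphrase of the selection logic, not code from the paper (block size, top-k, and window length are made-up parameters), showing which positions a single query would attend to at full resolution:

(defun nsa-selected-positions (query-pos block-scores
                               &key (block-size 4) (top-k 2) (window 8))
  "BLOCK-SCORES holds one attention score per compressed KV block.
Returns the sorted token positions <= QUERY-POS that get full-resolution attention."
  (let* ((n-blocks (length block-scores))
         (indexed (loop for score in block-scores
                        for i from 0
                        collect (cons i score)))
         ;; branch 2: the blocks whose compressed score ranks in the top K
         (chosen (mapcar #'car (subseq (sort indexed #'> :key #'cdr)
                                       0 (min top-k n-blocks))))
         (positions '()))
    (dolist (b chosen)
      (loop for p from (* b block-size) below (* (1+ b) block-size)
            when (<= p query-pos) do (push p positions)))
    ;; branch 3: sliding window over the most recent tokens
    (loop for p from (max 0 (- query-pos window -1)) to query-pos
          do (push p positions))
    (sort (remove-duplicates positions) #'<)))

This only covers which positions branches 2 and 3 touch; the compressed branch itself and how the three outputs get combined are not shown.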
> Both Full Attention and sparse attention models are pretrained on 270B tokens of 8k-length texts, followed by continued training and supervised fine-tuning on 32k-length texts with YaRN to achieve long-context adaptation. Both models are trained to full convergence to ensure fair comparison.
> our experiments adopt a backbone combining Grouped-Query Attention (GQA) and Mixture-of-Experts (MoE), featuring 27B total parameters with 3B active parameters
Evaluated on MMLU, MMLU-PRO, CMMLU, BBH, GSM8K, MATH, DROP, MBPP, and HumanEval. NSA outperforms full attention on 7/9.
Beats out H2O, InfLLM, Quest, Exact-Top, and full attention on LongBench
Perfect retrieval on 64k needle-in-a-haystack
The CoT eval is less convincing, but NSA outperforms full attention on AIME24.
Training speed of 2-9x vs. FlashAttention
Decoding speedup of 4-12x vs. full attention ["expected"? Didn't see comparison to other attention mechanisms]
Great to see this is alive and progressing! I believe Ohm started life in Alan Kay’s research group, to build a graphical OS and office suite in 10k lines of code. I found this talk immensely inspiring https://m.youtube.com/watch?v=ubaX1Smg6pY
Very close! Alex Warth created OMeta (https://en.wikipedia.org/wiki/OMeta) as part of the STEPS project. Ohm was designed as a kind of successor to OMeta, but was created after STEPS.
My takeaway was that autocomplete, boilerplate, and one-off scripts are the main use cases. To use an analogy, I think the code assistants are more like an upgrade from handsaw to power tools and less like hiring a carpenter. (Which is not what the hype engine will claim).
For me, only the one-off script (write-only code) use-case is useful. I've had the best results on this with Claude.
Emacs abbrevs/snippets (+ choice of language) virtually eliminate the boilerplate problem, so I don't have a use for assistants there.
For autocomplete, I find that LSP completion engines provide 95% of the value for 1% of the latency. Physically typing the code is a small % of my time/energy, so the value is more about getting the right names, argument order, and other fiddly details I may not remember exactly. But I find that LSP-powered autocomplete and tooltips largely solve those challenges.
> like an upgrade from handsaw to power tools and less like hiring a carpenter. (Which is not what the hype engine will claim).
I 100% agree with the not-hiring-a-carpenter part, but we need a better way to describe the improvement over just a handsaw. If you have domain knowledge, it can become an incredible design aid/partner. Here is a real-world example of how it is changing things for me.
I have a TreeTable component which I built 100% with an LLM, and when I need to update it, I just follow the instructions in this chat:
I'm thoroughly impressed as it suggested data structures and more for me to think about. And here I am asking it to review what was discussed to make the information easier to understand.
All of this cost me less than a penny. I'm still waiting for my Anthropic API limit to reset and I'm going to ask Sonnet for feedback as well, and I figure that will cost me 5 cents.
I fully understand the not hiring a carpenter part, but I think what LLMs bring to the table is SO MUCH more than an upgrade to a power tool. If you know what you need and can clearly articulate it well enough, there really is no limit to what you can build with proper instructions, provided the solution is in its training data and you have a good enough BS detector.
> If you know what you need and can clearly articulate it well enough, there really is no limit to what you can build with proper instructions, provided the solution is in its training data and you have a good enough BS detector.
In other words: you must already know how to do what you are asking the LLM to do.
In other words: it may make sense if typing speed is your bottleneck and you are dealing with repetitive tasks that have been solved well many times (i.e., you want an advanced autocomplete).
This basically makes it useless for me. Typing speed is not a bottleneck, I automate or abstract away repetition, and I seek novel tasks that have not yet been well solved—or I just reuse those existing solutions (maybe even contributing to respective OSS projects).
In the cases where something new was needed in areas I don’t know well, it completely failed me. NB: I never actually used it myself; I only gave in to a suggestion by a friend (whom LLMs reportedly help) to use his LLM-wrangling skills in a thorny case.
> In other words: you must already know how to do what you are asking the LLM to do.
Those who will benefit the most will be senior developers. They might not know the exact problem or language, but they should know enough to guide the LLM.
> In other words: it may make sense if typing speed is your bottleneck and you are dealing with repetitive tasks that have been solved well many times (i.e., you want an advanced autocomplete).
I definitely use an LLM as a typist and I love it. I've come to a point now where I mentally ask myself, "Will it take more time to do it myself or to explain it?" Another factor is cost, as you can rack up a bill pretty quickly with Claude Sonnet if you ask it to generate a lot of code.
But honestly, what I love about integrating an LLM into my workflow is that I'm better able to capture and summarize my thought process. I've also found LLMs can better articulate my thoughts most of the time. If you know how to prompt an LLM, it almost feels like you are working with a knowledgeable colleague.
> I never actually used it myself; I only gave in to a suggestion by a friend (whom LLMs reportedly help) to use his LLM-wrangling skills in a thorny case.
LLMs are definitely not for everyone, but I personally cannot see myself coding without LLMs now. Just asking for variable name suggestions is pretty useful. Or describing something vague and having it properly articulate my thoughts is amazing. I think we like to believe what we do is rather unique, but I think a lot of things that we need to do have already been done. Whether it is in the training data is another thing, though.
> They might not know the exact problem or language, but they should know enough to guide the LLM.
I was in this exact situation. I was working in an unfamiliar area with a hardware SDK in C that I needed to rewrite for my runtime, or at least call its C functions from my runtime, or at least understand how the poorly written (but working) example SDK invocation works in C by commenting it. The LLMs failed to help with any of that; they produced code that was 1) incorrect (literally doing the opposite of what’s expected) and 2) full of obvious comments and missing implementations (like a “cleanup if needed” comment in the empty deinit function).
Later it turned out there is actually an SDK for my runtime; I just failed to find it at first, so the code the LLM could have used or pointed me to actually existed (it just wasn't very easy to find).
Those were two top LLMs as of December 2024. It left me unimpressed.
I don’t think I would be compelled to guide them; once I understood how the code worked, it was faster to just write it or read the relevant reference.
My friend, who volunteered to waste those precious tokens to help with my project, does use chatbots a lot while coding, but he’s more of an intermediate than a senior developer.
> Just asking for variable name suggestions is pretty useful.
I can’t see myself asking anyone, much less an LLM, for the name of a variable. I am known to ask about and/or look up, say, subject domain terminology that I then use when naming things, but to name things well you first need to have a full picture of what you are making. Our job is to have one…
I think you make a very good point about your existing devenv. I recently turned off GitHub Copilot after maybe 2 years of use — I didn’t realize how often I was using its completions over LSPs.
Quality of Life went up massively. LSPs and nvim-cmp have come a long way (although one of these days I’ll try blink.cmp)
https://youtu.be/3K-R4yVjJfU?si=JdVyYOlxUbEcvEEo&t=2624