adoos's comments

GPT is really bad at optimizing prompts this way because it has no way to simulate the effects; they're way too complex. Tools like this need to log and A/B test.

GPT can be layered and made into an agent, etc., to do the A/B testing, or to make prompts longer by adding more edge cases as time goes by. But the effects of one single word change are far too complex for GPT's base output to understand anything about.


I'm sure it could be improved, including telling it to do what you suggest. Have you tried it as-is, though?


Yes, I used it. The optimized prompt was not better for my use case. The playground was useful, though. I believe prompts are really only optimized by running them through many scenarios and understanding how changing a single word affects things down the line, and then a bunch of hardcoded conditions to change the system/assistant messages on demand as an output of the tool.
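A minimal sketch of that log-and-A/B loop, assuming an OpenAI-style chat API; the model name, prompt variants, scenarios, and score() metric are all hypothetical placeholders:

    # A/B harness: run two system-prompt variants over logged scenarios
    # and compare scores; pick the winner empirically, not by intuition.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    VARIANTS = {
        "A": "You are a concise assistant.",
        "B": "You are a concise assistant. Think step by step.",
    }

    def run(system_prompt: str, user_msg: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_msg},
            ],
        )
        return resp.choices[0].message.content

    def score(output: str) -> float:
        # Placeholder metric: substitute human ratings or a task check.
        return float(len(output) < 500)

    scenarios = ["Summarize X...", "Explain Y..."]  # logged real inputs
    totals = {name: 0.0 for name in VARIANTS}
    for user_msg in scenarios:
        for name, prompt in VARIANTS.items():
            totals[name] += score(run(prompt, user_msg))
    print(totals)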


A form factor smaller than 1865 with an energy-dense chemistry is rare. Chinese manufacturers make LFP in that size, but only a handful of them still do. So size doesn't matter, but it basically matters :D


LFP is going to be manufactured in North America soon.


It's good at temporal reasoning, and causality is baked in. I spent a lot of time asking GPT to tell me what is happening at the current moment of a story, and it always responds with a causal representation, where humans might tend to be more visual, etc. Remember, time is not real anyway; we just have a bunch of codependent stuff happening, so GPT gets it. What it lacks is just memory, experience, and some other things needed to showcase the ability better. I think it's the training on code, more than language, that gave it logical reasoning. Humans are logical sometimes, but our code really is the summit of our logic.

Anyway, regardless of how inherently good they are at temporal reasoning, I think a secondary module explicitly for reasoning will come around soon. I believe that in the brain some neurons organize into hexagons or other geometries to better capture logic, math, etc. The LLM basically needs some rigidity in it if we don't want fuzzy outputs.

And the largest danger is not people getting lazy and letting the LLM do it. That kind of danger is really long-term, globalization-type danger. Short term, we've got much more to worry about.


> and it always responds with a causal representation.

It responds with a language representation. It uses "causal" words because that's how the English language works: we have tenses.

> I think a secondary module explicitly for reasoning will come around soon.

This has been an unsolved, actively-researched problem for ages – certainly since before you were born. I doubt very much that a solution will "come around soon"; and even if it does, integrating the solution into a GPT-based system would be a second unsolved problem – though probably a much easier (and more pointless) one. If you have any ideas, I invite you to pursue them, after a quick literature search.


It describes the present moment as a series of causal events. Like event x led to y which led to z. It doesn't matter if you ask it for English or code, or ask it not to use any tenses; those conditions don't affect its baseline understanding. I might be missing your point though.

On the second thing: from any point in history at which someone said "coming soon", well the current moment is the most accurate time to say it. And especially with events x and y and ChatGPT right behind us. ChatGPT has basically been a problem since before I was born too, but stating as much a few months ago would have been just as pessimistic as the statement you made. I say this only because I think the LLM hallucination problem may be simple. But it's only a hunch, based on our wetware.


> Like event x led to y which led to z.

Grammar parsers have been able to do this since the 90s. There is no reason to believe that it's not just a slightly-fancier grammar parser: the kinds of errors it makes are those you'd expect from a pre-biased stochastic grammar parser.

> But it's only a hunch, based on our wetware.

Our "wetware" fundamentally does not work like a GPT model. We don't build sentences as a stream of tokens. (Most people describe a "train of thought", and we have reason to believe there's even more going on than is subjectively accessible.) ChatGPT does not present any kind of progress towards the reasoning problem. It is an expensive toy, built using a (2017, based on 1992) technology that represented progress towards better compression algorithms, and provided some techniques useful for computational linguistics and machine translation. The only technological advance it represents is "hey, we threw a load of money at this!".

The "LLM hallucination problem" is not simple. It's as fundamental as the AI-upscaler hallucination problem. There is no difference between a GPT model's "wow amazing" and its "hallucinations": eliminate one, and you eliminate the other.

These technologies are useful and interesting, but they don't do what they don't do. If you try to use them to do something they can't, bad things will happen. (The greatest impact will probably not be on the decision-makers.)

> well the current moment is the most accurate time to say it.

This is true of every event that is expected to happen in the future.


The take that it's a sophisticated grammar parser is fine. Could be, lol. But when it's better than humans, the definitions can just get tossed as usage changes. You can't deny its impact (or you can, but it's a bit intellectually dishonest to call it old tech with money and nothin' special based on impact alone). But that's your experience, so it's fine.

For the stuff about it being a hard problem: now, I know you aren't expressly making a false equivalence, right? But I did say simple, not easy. You are saying hard, not complex.

I think there's too much digression here. You're clearly smart and knowledgeable but think LLMs are overrated; fine.

And yes, I know it's always the best time to say it; that's the point of a glass half full, some sugar in the tea, or anything else nice.


(It's not just a grammar parser, for the record: that was imprecise of me. The best description of the thing is the thing itself. But, when considering those properties, that's sufficient.)

> But when it's better than humans, the definitions can just get tossed as usage changes.

I'm not sure what this means. We have the habit of formally specifying a problem, solving that specification, then realising that we haven't actually solved the original problem. Remember Deep Blue? (We could usually figure this out in advance – and usually, somebody does, but they're not listened to.) ChatGPT is just the latest in a long line.

> You are saying hard, not complex.

Because reasoning is simple. Mathematical reasoning can be described in, what, two-dozen axioms? And scientists are making pretty good progress at describing large chunks of reality mathematically. Heck, we even have languages like (formal dialects of) Lojban, and algorithms to translate many natural languages into it (woo! transformers!).

… Except our current, simple reasoning algorithms are computationally-intractable. Reasoning becomes a hard problem with a complex solution if you want it to run fast: you have to start considering special-cases individually. We haven't got algorithms for all the special-cases, and those special-cases can look quite different. (Look at some heuristic algorithms for NP-hard problems if you want to see what I mean.)

> but think LLMs are overrated,

I think they're not rated. People look at the marketing copy and the hype, have a cursory play with OpenAI's ChatGPT or GPT-4, and go "hey, it does what they say it can!" (even though it can't). Most discussion seems to be about that idea, rather than the thing that actually exists (transformer models, but BIG). … but others in this thread seem to be actually discussing transformers, so I'll stop yelling at clouds.


Yeah! I'm going to ramble for a bit because I'm about to leave the house and just want to print all my thoughts on the matter. I actually think there will be a new genre of 'non-fiction' games once hallucinations are minimized. For example, I experimented with an animal game built on Wikipedia, where you could play as just about any animal and learn about its unique life cycle and so on. You would go through a few years of its life, die, and be reborn as a new animal. It was really fun, especially if you start at the beginning of life on Earth; GPT-4 will already let you evolve pretty reasonable traits. It can simulate living through historical events, etc.

The problem, though, is cost. If we want the game to be biologically accurate (e.g. for plants or bacteria, the decisions boil down to much simpler chemical responses rather than more abstract human concepts), then my prompts get way too big. GPT-4 is too averagey to get it perfect without defining a ton of edge cases, and then getting it to not hallucinate just makes it not feasible. Scraping Wikipedia for random animals, loading in that data, and making sure that animal, its environment, the types of actions it can take, and so on are right is just too expensive to prompt engineer. I tried to simulate all the scenarios out and print them into mermaid.js, and it is a Cambrian explosion. At the point I had an interesting game with realistic characteristics, I was spending $10 every few hours just by myself to test it.

So IMO there will be a fine-tuned model for this task. GPT-4+ will probably be used for the highest-level functions, like deciding if a goal was met and understanding its own limits, and more finely tuned models will live down in the stem, for example to check if an action an animal can take is realistic (e.g. I am a bear, so I have a paw, so I can do this). That kind of stuff needs to get hardcoded into the model at the bottom layers.
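A rough sketch of that layered split, assuming the OpenAI chat API; the model names, prompts, and the use of gpt-3.5-turbo as a stand-in for the cheap "stem" model are my own placeholders, not a tested setup:

    # Cheap model screens action plausibility; the expensive model only
    # runs for high-level narration, keeping per-turn costs down.
    from openai import OpenAI

    client = OpenAI()

    def chat(model: str, system: str, user: str) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": user},
            ],
        )
        return resp.choices[0].message.content

    def action_is_realistic(animal: str, action: str) -> bool:
        # Stand-in for a fine-tuned stem model with biology hardcoded.
        answer = chat(
            "gpt-3.5-turbo",
            "Answer only YES or NO.",
            f"Can a {animal} physically perform this action: {action}?",
        )
        return answer.strip().upper().startswith("YES")

    def narrate(animal: str, action: str) -> str:
        # Top-level model, reserved for goal-checking and storytelling.
        return chat(
            "gpt-4",
            f"You narrate the life of a {animal} with biological accuracy.",
            f"The {animal} attempts to: {action}. Narrate what happens.",
        )

    if action_is_realistic("bear", "swipe a salmon out of the river"):
        print(narrate("bear", "swipe a salmon out of the river"))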


Yeah, so I had this problem too. The first thing is, if you don't want it to generate these, then don't use a system message; go for the assistant role when you are giving it valence, because it seems to honor that role a lot more...

Second, you want to split this into multiple agents: one agent can continue the story, and another determines if the character dies, and so on. The stories can get REALLY horrendous that way. For the decision agent, you can give that particular one a system message that says it is a hypothetical story, so extremely bad things are OK -- and it will honor it completely.

You can also cut off the agent mid-sentence and have another agent start from where it left off. Do this with the token limit! That's the secret sauce; otherwise it will be too easy for it to settle back into averages. For me this produced much more imaginative but still cohesive content. If you let ChatGPT create the whole story in a single conversation with a single system message, it gets quite boring fast.
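A small sketch of that handoff, assuming the OpenAI chat API; the personas, model name, and token budget are illustrative guesses:

    # max_tokens cuts each agent off hard, often mid-sentence; the next
    # agent (a different persona) continues from the raw, unfinished text.
    from openai import OpenAI

    client = OpenAI()

    def continue_story(persona: str, story_so_far: str, budget: int) -> str:
        resp = client.chat.completions.create(
            model="gpt-4",
            max_tokens=budget,  # the hard cutoff is the point
            messages=[
                # Assistant role for valence, per the advice above.
                {"role": "assistant", "content": persona},
                {"role": "user", "content": "Continue this story exactly "
                 "where it stops, even mid-sentence:\n" + story_so_far},
            ],
        )
        return resp.choices[0].message.content

    story = "The bear woke at dawn, hungrier than it had ever been."
    for persona in ["You write vivid action.", "You write quiet dread."]:
        # Splice directly onto the cut-off text, no cleanup in between.
        story += continue_story(persona, story, budget=120)
    print(story)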

Win-or-lose tweaking is definitely the most interesting part of the problem, IMO. What I did was actually have the referee bot conclude the story, and that way you can push it towards a win or a loss, which I find really interesting. So when you prompt a bot to see if a goal is won or lost, having it reason about the ending, create an ending, or any infinite variation of those words will affect its determination... much in the same way a human simulates "future thoughts" to determine if a goal has been completed, by what possible consequences result and so on.
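A sketch of that referee pattern, again assuming the OpenAI chat API; the exact wording of the "write an ending" instruction is the knob being described, and the prompts and model name are placeholders:

    # The referee writes its own ending first, then judges win/lose from
    # that ending; swapping "write an ending" for "reason about the
    # ending" shifts its verdicts, as described above.
    from openai import OpenAI

    client = OpenAI()

    def ask(messages: list) -> str:
        resp = client.chat.completions.create(model="gpt-4", messages=messages)
        return resp.choices[0].message.content

    def referee(story: str, goal: str) -> str:
        ending = ask([
            {"role": "system", "content": "This is a hypothetical story; "
             "extreme outcomes are acceptable."},
            {"role": "user", "content": "Write a one-paragraph ending for "
             "this story:\n" + story},
        ])
        verdict = ask([
            {"role": "user", "content": f"Goal: {goal}\nEnding: {ending}\n"
             "Was the goal achieved? Answer only WIN or LOSE."},
        ])
        return verdict.strip()

    print(referee("The bear stalked the salmon run...", "Eat before nightfall"))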


Wow, nice! Lol, so interesting to see your approach. I released almost exactly the same thing yesterday on HN:

https://emojistory.herokuapp.com/As-a-HackerNews-user,-submi...


Your approach is very cool!


Awesome stuff. I think a bot server is a logical choice and a good one. It will be neat to see what people can come up with.


I think everyone agrees (even if unconsciously) with your feelings. Most players had a defensive advantage in movement speed before, but no longer. Reduced security and other implications favor large troops and rush strategies, which leaves the traditional strategies you mentioned less effective. The problem is, war in the real world almost always has some kind of defensive advantage, so the cohesion of the General narrative is disrupted when this core expectation is broken. My 2 cents, even if non-specific, because I think this is the right perspective for the highest-level analysis. I don't necessarily feel the new change is bad, but it changes the game mechanics to favor play styles that do not tell a cohesive story.

My example is just anecdotal. I used to play, and in just 300 moves I was able to tell a full story: how I advanced and beat a certain enemy through an interesting battle, how I went into unknown territory and was surprised by what I found. There really was a cohesive narrative and nice storytelling in the previous version. This is what made the game fun, fundamentally: mini stories of kingdoms locked in war, with interesting battles.

Now, the stories are not told well. Since I am queuing constantly, I am not even paying attention to where my general is most of the time. I am charging half-blind on my bull, hoping I run over a general. And likewise, if I am on the defensive, there is nothing to do but hope the huge army does not snake through and hit my general. The sense of a story was largely destroyed.

And that's why I think the game is less fun now, even if it is not so obvious why.

