Yeah, this is in my opinion the biggest limitation of the current gen GPT 4o ima...

atommclain · 2025-04-08T14:59:38 1744124378

I thought the selection tool allows you to limit the area of the image that a revision will make changes to, but I tested it and I still see changes outside of the selected area which is good to know.

As an example the tape spindles, among other changes, are different: https://chatgpt.com/share/67f53965-9480-800a-a166-a6c1faa87c...

https://help.openai.com/en/articles/9055440-editing-your-ima...

qingcharles · 2025-04-08T16:54:22 1744131262

Yeah, I'm not sure what the selection brush actually does. Is it just a hint to the LLM?

danielbln · 2025-04-08T12:20:16 1744114816

It just means that you comp it together manually. That's still much better than having to set up some inpainting pipeline or whatever.

wavemode · 2025-04-08T14:20:33 1744122033

Is manually comping actually going to be easier (let alone, give better results) than inpainting? I can imagine it working in simple cases, but for anything involving 3D geometry you'll likely run into issues of things not quite lining up between the first and second image.

echelon · 2025-04-08T13:37:10 1744119430

100%. Multimodal images surpass ComfyUI and inpainting (for now). It's a step function improvement in image generation.

I'm hoping we see an open weights or open source model with these capabilities soon, because good tools need open models.

As has happened in the past, once an open implementation of DallE or whatever comes out, the open source community pushes the capabilities much further by writing lots of training, extensions, and pipelines. The results look significantly better than closed SaaS models.

iandanforth · 2025-04-09T02:00:51 1744164051

Fwiw pixlr is a good pairing with GPT 4o for just this. Generate with 4o then use pixlr AI tools to edit bits. Especially for removals pixlr (and I'm sure others) are much much faster and quite reliable.