
OpenAI has said it's both native image generation and autoregressive. The outputs show the signs of it too.

It's probably an implementation of VAR (https://arxiv.org/abs/2404.02905) - autoregressive image generation with a small twist. Rather than predicting every token at the target resolution directly, it starts by predicting the image at a small resolution, then predicts progressively higher resolutions, each conditioned on the coarser ones, until it reaches the desired resolution.
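
For anyone who wants the coarse-to-fine idea in code, here is a rough toy sketch of the loop shape only. It is not the paper's implementation and says nothing about what OpenAI actually runs. The names `toy_predict_scale` and `generate_coarse_to_fine`, the scale schedule (1, 2, 4, 8) and the vocabulary size are all made up for illustration, and the "model" is a random stand-in where a real system would use a trained transformer plus an image tokenizer and decoder.

```python
import random


def upsample_nearest(grid, size):
    """Nearest-neighbour resize of a square 2D list of token ids to size x size."""
    old = len(grid)
    return [[grid[r * old // size][c * old // size] for c in range(size)]
            for r in range(size)]


def toy_predict_scale(coarser_maps, size, vocab_size, rng):
    """Hypothetical stand-in for the model: returns a size x size token map.

    A real model would condition on all coarser maps and emit every token
    of this scale in one step. Here we just upsample the previous map and
    randomly resample about 20% of the tokens.
    """
    if not coarser_maps:
        return [[rng.randrange(vocab_size) for _ in range(size)]
                for _ in range(size)]
    base = upsample_nearest(coarser_maps[-1], size)
    return [[tok if rng.random() < 0.8 else rng.randrange(vocab_size)
             for tok in row]
            for row in base]


def generate_coarse_to_fine(scales=(1, 2, 4, 8), vocab_size=16, seed=0):
    """Autoregress over scales: one prediction step per resolution."""
    rng = random.Random(seed)
    maps = []
    for size in scales:
        maps.append(toy_predict_scale(maps, size, vocab_size, rng))
    return maps


if __name__ == "__main__":
    for token_map in generate_coarse_to_fine():
        print(f"{len(token_map)}x{len(token_map)} token map")
```

The point of the sketch is the step count: with this schedule the loop runs 4 times to fill an 8x8 map, whereas token-by-token raster generation would need 64 steps for the same map.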


