Description: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design
We introduce DEsignBench, a text-to-image (T2I) generation benchmark tailored for visual design scenarios. Recent T2I models, such as DALL-E 3, have demonstrated remarkable capabilities in generating photorealistic images that align closely with textual inputs. While the allure of creating visually captivating images is undeniable, our emphasis extends beyond mere aesthetic pleasure: we aim to investigate the potential of using these powerful models in authentic design contexts. In pursuit of this go
Idea2Img has a large multimodal model (LMM), GPT-4V(ision), interact with a T2I model to probe the T2I model's usage for automatic image design and generation. Idea2Img uses GPT-4V to improve, assess, and verify multimodal content.
The Idea2Img framework enables LMMs to mimic human-like exploration when using a T2I model, so that they can design and generate an imagined image specified as a multimodal input IDEA.
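
The improve, assess, and verify roles described above can be read as one iterative loop. The following is a minimal sketch of that reading, not the authors' implementation: the function `refine_idea`, its four callables, the `max_rounds` limit, and the toy stand-ins are all hypothetical names introduced for illustration. No real LMM or T2I API is called, and images are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class RefinementResult:
    prompt: str   # last prompt sent to the (hypothetical) T2I model
    image: str    # draft chosen in the last round
    rounds: int   # number of rounds actually run


def refine_idea(
    idea: str,
    improve: Callable[[str, str, Optional[str]], str],
    generate: Callable[[str], List[str]],
    assess: Callable[[str, List[str]], str],
    verify: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> RefinementResult:
    """Hypothetical loop: improve the prompt, generate drafts, pick one, check it."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    prompt, feedback, image = idea, None, ""
    for round_no in range(1, max_rounds + 1):
        prompt = improve(idea, prompt, feedback)   # LMM role: improving
        drafts = generate(prompt)                  # T2I model produces drafts
        image = assess(idea, drafts)               # LMM role: assessing
        done, feedback = verify(idea, image)       # LMM role: verifying
        if done:
            break
    return RefinementResult(prompt, image, round_no)


# Toy stand-ins (assumptions) so the sketch runs without any model.
def toy_improve(idea: str, prompt: str, feedback: Optional[str]) -> str:
    # Append the previous round's feedback to the prompt, if there is any.
    return prompt if not feedback else f"{prompt}, {feedback}"


def toy_generate(prompt: str) -> List[str]:
    # Pretend to render one image; the "image" is just a tagged string.
    return [f"<image for: {prompt}>"]


def toy_assess(idea: str, drafts: List[str]) -> str:
    # Pick the longest draft string as a placeholder for a quality judgment.
    return max(drafts, key=len)


def toy_verify(idea: str, image: str) -> Tuple[bool, str]:
    # Accept the image only once it mentions "blue"; otherwise ask for it.
    ok = "blue" in image
    return ok, "" if ok else "make the logo blue"


result = refine_idea(
    "a logo for a coffee shop",
    toy_improve, toy_generate, toy_assess, toy_verify,
)
print(result.rounds, result.prompt)
```

Passing the four roles in as callables keeps the loop independent of any particular model, so in this sketch the toy functions could be swapped for real model calls without changing `refine_idea`.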