Visual ChatGPT: Combining Talking with Visual Elements

Asif Iqbal
2 min read · Mar 11, 2023


Microsoft researchers have unveiled Visual ChatGPT, an open-source project that connects ChatGPT to a range of Visual Foundation Models, so users can send and receive images during a conversation.

Demo: Visual ChatGPT | GitHub

Currently, ChatGPT cannot generate or manipulate images on its own; it can only write text descriptions, which can then be used as prompts for tools like Stable Diffusion, DALL-E, or Midjourney. With the Visual ChatGPT project, the AI system gains the ability to produce images, edit them, remove objects, and perform similar tasks.

System Architecture | Visual ChatGPT

Visual ChatGPT allows:

  1. sending and receiving not only text but also images;
  2. asking complex visual questions or giving visual editing instructions that require several AI models to work together over multiple steps;
  3. providing feedback and asking for corrected results.
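
To make the idea of several models working together over multiple steps more concrete, here is a minimal sketch in Python. It is not the project's actual code: the tool names (`generate`, `remove`, `edit`), the stub functions, and `run_plan` are all hypothetical, and images are represented as plain strings so the example runs without any model installed. It only illustrates how a dialogue system might pass the output of one visual tool into the next.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for Visual Foundation Models (assumption, not the
# project's API). Each takes an image reference plus an instruction and
# returns a new image reference; strings are used instead of real images.
def generate_image(_image: str, prompt: str) -> str:
    return f"generated({prompt})"

def remove_object(image: str, target: str) -> str:
    return f"removed[{target}]({image})"

def edit_image(image: str, instruction: str) -> str:
    return f"edited[{instruction}]({image})"

# Registry mapping a tool name to the function that implements it.
TOOLS: Dict[str, Callable[[str, str], str]] = {
    "generate": generate_image,
    "remove": remove_object,
    "edit": edit_image,
}

def run_plan(plan: List[Tuple[str, str]], image: str = "") -> List[str]:
    """Run each (tool, argument) step on the previous step's output.

    Returns every intermediate result, so an earlier version is still
    available if the user asks for a correction.
    """
    history: List[str] = []
    for tool_name, argument in plan:
        if tool_name not in TOOLS:
            raise ValueError(f"unknown tool: {tool_name}")
        image = TOOLS[tool_name](image, argument)
        history.append(image)
    return history

if __name__ == "__main__":
    steps = [
        ("generate", "a cat on a sofa"),
        ("remove", "sofa"),
        ("edit", "make it night"),
    ]
    for result in run_plan(steps):
        print(result)
```

In a real system the plan would come from the language model rather than being written by hand, and each tool would call an actual image model; keeping the intermediate results is one simple way to support the feedback and correction step listed above.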

Find the project on GitHub

Read the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

Learn more about GPT: https://twitter.com/BeingOvee/status/1634211136836079618

— — — — — — — — — — — — —

Find me on: Twitter | Linkedin | Instagram | Facebook | Pinterest
