Microsoft Eases AI Image Generation Across the Board

Capitalizing on the current generative AI hype, Microsoft is making it easier for users to create their own images across its consumer app portfolio.

For its AI image generation, Microsoft uses DALL·E 2 technology from partner OpenAI, the creator of the sentient-sounding ChatGPT chatbot based on the GPT series of machine learning large language models (LLMs), recently advanced to GPT-4. DALL·E 2 is an advanced AI system that can create realistic images and art from a natural language description, thanks to breakthroughs in natural language processing (NLP). It even allows for combining concepts, attributes and styles from typed-in commands.

While the DALL·E site can be used to generate images, users these days are likely to get a "The server is currently overloaded with other requests" message.

But Microsoft has been making it easier to use DALL·E 2 to generate images in apps including the "new Bing" search site and a preview of Microsoft Designer. As a preview, the latter requires that interested users join a waitlist to be granted access to the graphic design app, though Microsoft reportedly recently removed the waitlist for the new Bing.

Microsoft Designer (Preview)
Microsoft Designer is meant to help users create professional quality social media posts, invitations, digital postcards, graphics and more.

Using Microsoft Designer is simple, but first you have to go to the web site, enter your email address and click the "Join the waitlist" button. Upon being accepted, just revisit the site and click on "Add image" to start with an existing image from your device or "Generate image" to describe the image you want and have AI create it for you.

[Image: Microsoft Designer (source: Microsoft).]

The tool seems to have a social media/marketing focus and automatically adds branding-speak text to generated images, no matter what their intended purpose. But all aspects of text and images can be customized afterward.

Microsoft earlier this year published "How to use AI image prompts to generate art using DALL·E" guidance that works for the DALL·E site or Designer.

"The benefit of using Designer is that it is also a graphic design app, so you'll not only get unique images generated from the ideas you type, but you also can add more design elements like text or graphics and AI-powered editing experiences that will perfectly integrate it all into a design," the guidance says.

The prompting guidance further advises users to be specific, using lots of adjectives and other details, and also to add directive details. Instead of just directing the AI to create an image in an "oil painting" style, for example, users should specifically ask it for an "oil-on-canvas, masterpiece by Caravaggio, from 1599."
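
For readers who assemble prompts programmatically, that advice can be approximated with a small helper. The following Python snippet is only an illustrative sketch: the `build_prompt` function, its parameters and the fruit-bowl subject are hypothetical and not part of Microsoft's guidance or any DALL·E API. It simply joins a subject with adjectives and directive style details into one prompt string.

```python
def build_prompt(subject, adjectives=(), style_details=()):
    """Combine a subject, descriptive adjectives and directive style details.

    Hypothetical helper for illustration only; it does not call any
    image-generation service.
    """
    # Adjectives go directly in front of the subject: "dramatic candlelit still life"
    described_subject = " ".join([*adjectives, subject])
    # Directive details (medium, artist, date) follow as comma-separated phrases
    parts = [described_subject, *style_details]
    return ", ".join(part.strip() for part in parts if part.strip())


# The subject below is an invented example; the style details are the ones
# quoted from the guidance.
prompt = build_prompt(
    "still life of a fruit bowl",
    adjectives=["dramatic", "candlelit"],
    style_details=["oil-on-canvas", "masterpiece by Caravaggio", "from 1599"],
)
print(prompt)
```

The resulting string can then be pasted into the "Generate image" box in Designer or into the DALL·E site.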

Pitfalls to avoid, meanwhile, include:

  • Complex scenes with multiple subjects
  • Detailed layout requests (for example, "A big red Object X on the left, friendly Object Y on the right, a small Object Z wearing Item A above them")
  • Images with multiple faces (these are often distorted)
  • Requests for text (for example, "a sign saying 'Happy birthday!'"), because the generator doesn't know how to spell!
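
Some of those pitfalls can be caught with a rough check before a prompt is submitted. The Python sketch below is an assumption-laden illustration, not anything Microsoft provides: the `find_prompt_pitfalls` function and its keyword lists are hypothetical, and it covers only three of the four pitfalls (layout requests, multiple faces and requests for text) using simple keyword matching.

```python
import re

# Hypothetical keyword patterns; these are rough guesses, not an official list.
LAYOUT_PATTERN = r"\b(on the left|on the right|above|below|behind|in front of)\b"
FACES_PATTERN = r"\b(faces|people|crowd|group of)\b"
TEXT_PATTERN = r"\b(saying|that says|that reads|with the words)\b"


def find_prompt_pitfalls(prompt):
    """Return labels for pitfalls that the prompt appears to contain."""
    lowered = prompt.lower()
    warnings = []
    # Two or more positional phrases suggest a detailed layout request
    if len(re.findall(LAYOUT_PATTERN, lowered)) >= 2:
        warnings.append("layout")
    # Words implying several people suggest multiple (often distorted) faces
    if re.search(FACES_PATTERN, lowered):
        warnings.append("faces")
    # Phrases like "a sign saying ..." suggest a request for rendered text
    if re.search(TEXT_PATTERN, lowered):
        warnings.append("text")
    return warnings


print(find_prompt_pitfalls("a sign saying 'Happy birthday!'"))
print(find_prompt_pitfalls("an astronaut walking through a galaxy of sunflowers"))
```

A check like this will miss many cases and flag some harmless prompts, so it is best treated as a reminder of the guidance rather than a filter.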

Designer does, however, know how to generate an image of "an astronaut walking through a galaxy of sunflowers", offering up these options:

[Image: Generated Microsoft Designer Image (source: Microsoft Designer).]

Microsoft borrowed the "Copilot" moniker from the GitHub Copilot "AI pair programmer" tool to infuse AI tech into its portfolio of products across the board, and verbiage from the Designer site indicates its name may change to "Designer Copilot" after the preview stage.

Bing Image Creator
While Microsoft's "new Bing" search experience has been powered by OpenAI's GPT-4 LLM for a while now, the company added Bing Image Creator just a couple of weeks ago.

"We're excited to announce we are bringing Bing Image Creator, new AI-powered visual Stories and updated Knowledge Cards to the new Bing and Edge preview," Microsoft said in a March 21 announcement. "Powered by an advanced version of the DALL∙E model from our partners at OpenAI, Bing Image Creator allows you to create an image simply by using your own words to describe the picture you want to see. Now you can generate both written and visual content in one place, from within chat."

Note that using the image generator requires switching to the Creative mode.

The functionality is also supposed to have been rolled out to the Edge browser, accessible by an Image Creator icon in the sidebar, but this reporter couldn't find that icon, even after updating Edge and even downloading an Insider build.

"Since making the new Bing available in preview, we have been testing it with people to get real-world feedback to learn and improve the experience," Microsoft said. "People used it in some ways we expected and others we didn't. In this spirit of learning and continuing to build new capabilities responsibly, we're rolling out Bing Image Creator in a phased approach by flighting with a set of preview users before expanding more broadly. We will initially only include Image Creator in the Creative mode of Bing chat and our intention is to make it available in Balanced and Precise mode over time. We are also working on some ongoing optimizations for how Image Creator works in multi-turn chats. We continue to believe the best way to bring these technologies to market is to test them carefully, in the open, where everyone can provide feedback."

[Image: Generated Bing Image (source: Microsoft Designer).]

Using Image Creator is also dead simple. Inputting the previous request to generate an image of an astronaut walking through a galaxy of sunflowers resulted in the image candidates depicted above.

[Image: Generated DALL·E Image (source: DALL·E).]

After trying the DALL·E site for more than an hour, we finally got the same image generation request to create some options, shown above (we have no idea what the first option is -- apparently a "hallucination" mistaking some kind of strawberry/pretzel mutation for an astronaut walking through a galaxy of sunflowers).

Artistic considerations aside, using Microsoft tooling in Bing, Microsoft Designer and the company's other consumer apps at least provides more immediate results than the overused DALL·E site.

More thoughts about Designer can be found in the February article, "The Impact that Designer May Have on the Future of Microsoft 365."

About the Author

David Ramel is an editor and writer for Converge360.
