Google has introduced Whisk, its newest artificial intelligence tool, which lets users upload images and generate a combined, AI-created image without requiring any text input. The tool pairs Google’s Gemini AI with DeepMind’s Imagen 3 image model to create new images from the uploaded ones.
Users can input images depicting subjects, settings, and styles before Whisk combines everything into one image. Unlike traditional image editors, Whisk is designed as a creative tool for quick inspiration. It does not aim to produce refined professional work but rather fun AI-generated art.
Since OpenAI launched its text-to-image creation tool Dall-E in 2021, the concept of AI-generated artwork has become popular on social media. Google’s Whisk builds upon this concept by offering an image-to-image generator. Users can remix their inputs and mix categories to produce different images such as plushie toys or stickers.
“Whisk is designed to allow users to remix a subject, scene, and style in new and creative ways, offering rapid visual exploration instead of pixel-perfect edits,” said Thomas Iljic, director of product management at Google Labs. The tool captures the essence of the subject rather than an exact replica, allowing for unique variations.
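Whisk works by having Gemini automatically write a caption for each uploaded image and then feeding those captions to Imagen 3, which generates the final image. This process allows for creative remixing, but because only the captioned characteristics carry over, the end product might not match the original input images.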
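The caption-then-generate flow described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `WhiskInputs`, `caption_image`, and `build_prompt` are hypothetical, each uploaded image is stood in for by a text description, and the captioning step is mimicked by crude truncation rather than a real Gemini call. The final hand-off to an image model is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class WhiskInputs:
    # Each field is a text stand-in for an uploaded image (assumption).
    subject: str
    scene: str
    style: str


def caption_image(image_description: str, max_words: int = 4) -> str:
    """Hypothetical captioner: keeps only the first few words,
    mimicking how a caption drops detail from the original image."""
    return " ".join(image_description.split()[:max_words])


def build_prompt(inputs: WhiskInputs) -> str:
    """Combine the three captions into one text prompt that would
    then be passed to an image generator (not shown here)."""
    subject = caption_image(inputs.subject)
    scene = caption_image(inputs.scene)
    style = caption_image(inputs.style)
    return f"{subject} in {scene}, rendered as {style}"


inputs = WhiskInputs(
    subject="a grey tabby cat with a torn left ear and green collar",
    scene="a rainy Tokyo street at night with neon signs",
    style="a hand-stitched felt plushie toy with button eyes",
)
prompt = build_prompt(inputs)
print(prompt)
```

In this toy version the torn ear and the neon signs never reach the prompt, which is one way to see why a generated image can capture the gist of a subject without reproducing it exactly.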
The tool is currently available as a website on Google Labs for users in the US and is still at an early stage of development. With the release of Whisk, Google aims to flex its muscles in its AI competition with OpenAI, demonstrating the potential of its DeepMind asset.
Source: https://edition.cnn.com/2024/12/17/business/google-ai-whisk-image-prompts/index.html