Stability AI has released the latest version of its popular open-source image generator, Stable Diffusion 2.0. Compared to the first model, version 2.0 brings several major improvements and new features.
Here is an overview of what's new in Stable Diffusion 2.0:
Brand new text encoder (OpenCLIP), developed by LAION
Upscaler Diffusion model that enhances the resolution of images by a factor of 4
Brand new depth-guided stable diffusion model
Brand new text-guided inpainting model
Let's look at each of them.
New Text Encoder
Stable Diffusion 2.0 uses OpenCLIP as its text encoder, and the new diffusion model is trained from scratch on 5.85 billion CLIP-filtered image-text pairs.
The result is a stunning high-definition image like this.
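As a rough illustration of how the new model can be used, here is a minimal text-to-image sketch with the Hugging Face diffusers library. The checkpoint name, prompt, and output path are assumptions for illustration and may differ from your setup.

```python
# Minimal text-to-image sketch (assumes the diffusers library and the
# public "stabilityai/stable-diffusion-2" checkpoint; adjust as needed).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a photograph of an astronaut riding a horse"  # placeholder prompt
image = pipe(prompt).images[0]   # run the diffusion loop, take the first image
image.save("astronaut.png")
```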
New Upscaler
Stable Diffusion 2.0 also includes an Upscaler Diffusion model that enhances the resolution of images by a factor of 4. Below is an example of the model upscaling a low-resolution generated image (128x128) into a higher-resolution image (512x512). Combined with the text-to-image models, Stable Diffusion 2.0 can now generate images with resolutions of 2048x2048 or even higher.
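For reference, a minimal sketch of running the 4x upscaler through the diffusers library might look like the following; the checkpoint name, prompt, and image paths are assumptions for illustration.

```python
# Sketch of the 4x upscaler via diffusers (assumed checkpoint name
# "stabilityai/stable-diffusion-x4-upscaler"; file paths are placeholders).
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
).to("cuda")

low_res = Image.open("generated_128.png").convert("RGB")  # e.g. a 128x128 generation
prompt = "a white cat"                                     # a text prompt guides the upscaling
upscaled = pipe(prompt=prompt, image=low_res).images[0]    # 4x larger, e.g. 512x512
upscaled.save("upscaled_512.png")
```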
Depth Recognition
The new depth-guided Stable Diffusion model, called depth2img, extends the image-to-image feature from V1 with new possibilities for creative applications. Depth2img infers the depth of an input image (using an existing depth-estimation model) and then generates new images conditioned on both the text prompt and the depth information.
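A rough sketch of depth2img with the diffusers library could look like this; the checkpoint name, strength value, prompt, and input image are assumptions for illustration.

```python
# Sketch of depth-guided image-to-image (assumed checkpoint
# "stabilityai/stable-diffusion-2-depth"; input path is a placeholder).
import torch
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("room.png").convert("RGB")
prompt = "a cozy living room, watercolor style"
# The pipeline estimates a depth map from init_image and uses it, together
# with the prompt, to keep the scene's structure while changing its look.
result = pipe(prompt=prompt, image=init_image, strength=0.8).images[0]
result.save("room_watercolor.png")
```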
Text-Guided Inpainting Model
SD 2.0 now supports text-guided inpainting: you mask the parts of an image you want to change and describe in natural language what should appear there.
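As an illustration, a minimal inpainting sketch with the diffusers library might look like the following; the checkpoint name, prompt, and image/mask paths are assumptions for illustration.

```python
# Sketch of text-guided inpainting (assumed checkpoint
# "stabilityai/stable-diffusion-2-inpainting"; image/mask paths are placeholders).
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("dog_on_bench.png").convert("RGB")
mask = Image.open("dog_mask.png").convert("RGB")   # white = region to repaint
prompt = "a yellow cat sitting on a park bench"
result = pipe(prompt=prompt, image=image, mask_image=mask).images[0]
result.save("cat_on_bench.png")
```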
The project remains open source; you can download or fork it from GitHub.
A demo application is available as a Hugging Face Space: https://huggingface.co/spaces/stabilityai/stable-diffusion
The new version will also be available in DreamStudio in the next few days.
If you’re interested in accessing the service via API, you can check out the documentation here.
Overall, I am in awe of the people behind this technology. Many thought the project would go closed-source, but here we are. Let me end with this quote from Stability AI.