Stability AI today announced the release of SDXL 0.9, the most advanced model in the Stable Diffusion text-to-image suite. Following the successful launch of the Stable Diffusion XL beta in April, SDXL 0.9 delivers significantly improved image detail and composition over the previous version.
The model can be accessed via Clipdrop today, and API access will be available soon. The research weights are now available, and the open release is planned for mid-July with the move to version 1.0.
Although the model can run on a modern consumer GPU, SDXL 0.9 represents a breakthrough in creative applications of generative artificial intelligence for imagery. The ability to create hyper-realistic artwork for film, television, music, and instructional videos, as well as offer advances in design and industrial applications, places SDXL at the forefront of real-world applications for AI-generated imagery.
Sample prompts tested on both versions of SDXL (beta on the left, 0.9 on the right) show how far the model has come in just two months.
The SDXL series also offers features that go beyond basic text prompting. These include image-to-image prompting (entering an image to generate variations of it), inpainting (reconstructing missing parts of an image), and outpainting (creating a seamless extension of an existing image).
The main driver of this progress in composition is SDXL 0.9's significant increase in parameter count (the total number of weights and biases in the neural network) compared to the beta version.
SDXL 0.9 has one of the largest parameter counts of any open-source image model, boasting a 3.5-billion-parameter base model and a 6.6-billion-parameter ensemble pipeline (the final output is produced by running the two models in sequence and combining the results). The second-stage model adds finer details to the output generated by the first stage. By comparison, the beta version runs on 3.1 billion parameters and uses a single model.
SDXL 0.9 runs on two CLIP text encoders, including one of the largest OpenCLIP models trained to date (OpenCLIP ViT-G/14), which boosts its processing power and its ability to produce realistic images with greater depth at a higher 1024×1024 resolution.
The SDXL team will soon be releasing a research blog that will describe the specifications and testing of this model in more detail.
Despite its powerful output and advanced architecture, SDXL 0.9 can run on a modern consumer GPU: it requires only a Windows 10 or 11 or Linux operating system, 16GB of RAM, and an Nvidia GeForce RTX 20-series graphics card (or equivalent or higher) with a minimum of 8GB of VRAM. Linux users can alternatively use a compatible AMD graphics card with 16GB of VRAM.
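As a rough sanity check on those requirements, the parameter counts above can be translated into a weight footprint. This is a back-of-the-envelope sketch, not an official figure: it assumes the weights are stored in half precision (2 bytes per parameter), and actual memory use is higher once activations, the VAE, and the text encoders are included.

```python
# Estimate the on-GPU size of SDXL 0.9's weights in half precision.
# Parameter counts (3.5B base, 6.6B ensemble) come from the article;
# the fp16 assumption is ours, and runtime memory will be larger.
BYTES_PER_PARAM_FP16 = 2

def weight_gib(params_billions: float) -> float:
    """Weight footprint in GiB for a model of the given size."""
    return params_billions * 1e9 * BYTES_PER_PARAM_FP16 / 1024**3

print(f"base model  ≈ {weight_gib(3.5):.1f} GiB")  # ≈ 6.5 GiB
print(f"full pipeline ≈ {weight_gib(6.6):.1f} GiB")  # ≈ 12.3 GiB
```

The base model's weights alone land just under the 8GB VRAM minimum, which is consistent with the stated requirements; the full two-stage pipeline does not fit in 8GB at once, so in practice the stages are run one at a time.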
Since the beta launch of SDXL on April 13, we've received great feedback from our community of users on Discord, which numbers almost 7,000. These users have created over 700,000 images, averaging over 20,000 per day. Over 54,000 images have been entered into the Discord community "Showdowns" and 3,521 SDXL images have been nominated as winners.
SDXL 0.9 is now available on Stability AI's Clipdrop platform. Stability AI API and DreamStudio customers will be able to access the model starting June 26, as will other leading image-generation tools such as NightCafe.
SDXL 0.9 will be made available for research purposes only for a limited period, to gather feedback and fully refine the model before its public release. The code to run the model will be publicly available on GitHub.
SDXL 0.9 will be followed by the full public release of SDXL 1.0, scheduled for mid-July (exact date to be announced).