Stability AI has announced the release of its new Stable Audio model, which uses a diffusion architecture to generate audio recordings from text prompts. The model was trained on an extensive dataset of over 800,000 audio files, including finished songs, sound effects, and instrumental parts, paired with textual metadata and totaling more than 19,500 hours of audio.
The company trained the model on data from AudioSparx, which licenses a commercial music library, meaning Stability AI obtained permission to use the copyrighted content.
Users can choose from three pricing plans. The free plan allows generating up to 20 tracks per month, each up to 45 seconds long. The Professional plan costs $11.99 per month (excluding taxes) and allows generating up to 500 tracks per month, each up to 90 seconds long. Terms of the enterprise plan are negotiated individually.
Commercial use of generated tracks is available only to paid subscribers. Using the generated tracks to train your own artificial intelligence models is also prohibited.
We tried the generator ourselves: when we attempted to create a melody, the system returned errors several times due to heavy server load and failed to produce a playable track.
Stability AI's user documentation states that Stable Audio can be used to create not only complete songs but also individual instrumental parts and sound effects.
The company is not the first to develop such neural networks. OpenAI, for example, introduced its Jukebox model in 2020, while Google has AudioLM, which generates audio continuations from audio prompts, and MusicLM, which generates music from text descriptions.
In addition, Meta (banned in Russia) released the MusicGen music generator in June 2023 and in August introduced AudioCraft, a framework for creating sounds and environmental effects.
Ailib neural network catalog. All information is taken from public sources.