Content generation is a demanding undertaking, often carried out by a single person or a small group. Many of the tasks involved require a high level of craftsmanship and are therefore accessible only to a small number of experts after extensive training. In an era where a huge audience consumes content voraciously, content creators need novel tools to assist them in their creative process.
Music creation uses novel technologies in many ways, and composers routinely employ a wide range of techniques. These technologies have evolved dramatically in recent years with the introduction of AI into music generation. While this is a great opportunity for creators, it demands an increasing level of technical proficiency that is out of reach for non-experts. A new industry may emerge from this set of possibilities and requirements, giving each user a personal media experience tailored to their taste. Authors need novel tools to develop their work and expand their creativity beyond the constraints of tasks that could be automated.
Our goal is to narrow the gap between novel content creation techniques, in particular those for raw audio music, and the artists and creators who could use them. The content may also be created and manipulated in real time in response to external inputs. Content creation tasks vary in complexity, and some are already automated. In this context, novel tools can make creation more efficient: the artist or creator concentrates on the deeply creative tasks, while the less critical parts are delegated to an AI assistant.
Creative assistants have wide application across many fields of creation. While visual arts are well suited to powerful AI tools based on convolutional neural networks (CNNs), recent developments in recurrent neural networks (RNNs) and similar techniques allow the efficient processing of music, speech, or video. As a result, these techniques are expected to soon cover the complete spectrum of content production.
This approach also has potential impact in a broad range of other disciplines where combining data sources could add value. Beyond visual arts, there are many areas of human creation, such as games, where language, image, sound, and other information are combined in complex ways. Tools to generate, modify, and synchronise all these data sources are needed to automatically ensure the quality and consistency of novel digital content. In this use case, we collaborate with music composers who have previous experience with AI to help us define new work methodologies and new ways of using AI in their creative work. In particular, we focus on coupling raw audio CNNs with GANs trained with mood and emotional labels, lyrics, and other data extracted from musical knowledge. General-purpose RNNs used in speech synthesis will also be explored to process raw audio from music tracks.
This use case aims to produce a novel way of co-creating content based on AI tools. Our objective is to collaborate with authors to meet the development requirements needed to bring these tools to a point where they can be applied to a wide range of content.
Discover more on this topic in the whitepaper "Exploring AI Music Composition Tools for Humans".