The day DJing changed — source separation for djay Pro AI and VirtualDJ

Where Stems failed, djay Pro AI and VirtualDJ 2021 may well succeed, because both can now separate music into vocals, instruments, and beats on the fly.

I had an entirely different article planned, one that told the story of my two-hour pre-release djay Pro AI sofa mixing session. It started like this:

“Bugger me — it’s 1.30 am”.

Turning to my better half on the sofa, I assured her it’s not a timely request but an exclamation.

“I know. I left you to play. I haven’t seen you this happy in a long time”.

I had only expected one game-changing announcement that day, but then VirtualDJ green-lit their own take on precisely the same thing. Thus that particular story arc got spiked.

So what happened? Well, Algoriddim announced djay Pro AI for iOS and iPadOS, a ground-up rebuild complete with audio source splitting tech dubbed Neural Mix. And faced with little choice, Atomix released VirtualDJ 2021 for macOS and Windows, which offers the very same feature: real-time audio source splitting into stems.

And the DJ game… changed forever? Will never be the same again? Time will tell, but such hyperbolic marketing phrases do seem somewhat appropriate as I write this.

This is not a glib statement dished out for dramatic effect: the introduction of music source splitting just changed everything. And while it’s early days (actually still hours at the time of writing), you’re witnessing the next real revolution. I’ll explain why shortly.

As I’m stepping away from pure DJ news reporting, I’ll let you discover the respective djay Pro AI and VirtualDJ 2021 news here and here. Instead, I want to take you down a different path, to explain how we got here, and why this is so damned important.

DISCLAIMER — I’ve been Algoriddim’s video making guy for years. This however doesn’t stop me having an independent opinion.

In the beginning

It has long been my contention that everything that needs to be done by DJs has been rinsed to the nth degree via DJ technology, and that the next real innovation will happen with music. This manifested itself with the lurch towards streaming services, and the plumbing in of said services into the usual suspects’ software, and now with Denon DJ directly into hardware.

But we’re talking about individual track manipulation rather than the delivery of them. We’ve always been able to screw around with our music in all manner of ways, either directly in software to create new versions, or while performing via loops, hot cues, samples, effects, and filters.

Let’s talk about Spleeter

For DJing, this is a whole new ballgame, but one that has developed over a period of time with a number of products. Most recently, Deezer’s Spleeter technology made huge waves with real demonstrations of actively extracting decent stems. Not the cleanest of stems, you understand, but nobody would argue that this was anything less than audio sorcery.

The problem was usage. You couldn’t just download an app and extract away: this was command line stuff. Hell, people struggle with the App Store, let alone navigating arcane instructions on GitHub.

But the real joy is that it was open-source, meaning that anyone could use it. And while Algoriddim isn’t expressly saying they’re not using Spleeter, Atomix goes out of its way to stress that their take is all their own work. Let’s see if it was worth it should Algoriddim’s patent application get approved.

The fundamental difference between old methods and this new one is immediacy. Being baked into performance software means that this all happens live. No prep is needed, and it works with any audio source, including streamed music. I happily smashed out two hours’ worth via TIDAL with djay Pro AI without a single hitch.

But it is transient and temporary. There’s no extraction of stems to a saved file — record labels and streaming platforms might have an issue with that. But there’s nothing stopping you recording your output in real-time. Ugh — how positively archaic.

The same but different

On the face of it, Algoriddim and Atomix just announced much the same thing. And on one level, yes that’s true. And while the end result is largely the same, the implementation and target audiences are quite different.

Firstly, djay Pro AI calls it Neural Mix, and it works by isolating beats, instruments, and vocals. You have full control over these independently on all four decks, but in two-deck mode, you can solo, mute, or swap sources with the other deck. Or you can combine beats and instruments, or instruments and vocals, to switch or fade between them for an instant acapella or beats. There are also options for viewing the source waveforms.
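
For the code-minded, the solo, mute, swap, and fade controls described above can be sketched in a few lines of Python. This is purely an illustrative model and not Algoriddim's implementation: the `Deck` class, the `swap` helper, and the idea of treating each stem as a plain list of samples are all my own assumptions, and it presumes the separation has already happened.

```python
from dataclasses import dataclass, field

STEMS = ("beats", "instruments", "vocals")


@dataclass
class Deck:
    """Hypothetical deck holding three already-separated stems."""
    stems: dict  # stem name -> list of float samples (same length per stem)
    gains: dict = field(default_factory=lambda: {name: 1.0 for name in STEMS})

    def mute(self, name):
        """Silence one stem and leave the others untouched."""
        self.gains[name] = 0.0

    def solo(self, name):
        """Keep one stem and silence the rest (solo vocals = acapella)."""
        for stem in STEMS:
            self.gains[stem] = 1.0 if stem == name else 0.0

    def fade(self, position):
        """Linear fade: 0.0 = beats + instruments only, 1.0 = vocals only."""
        self.gains["beats"] = self.gains["instruments"] = 1.0 - position
        self.gains["vocals"] = position

    def output(self):
        """Sum the gain-weighted stems back into one signal."""
        length = len(self.stems[STEMS[0]])
        return [
            sum(self.gains[name] * self.stems[name][i] for name in STEMS)
            for i in range(length)
        ]


def swap(deck_a, deck_b, name):
    """Exchange one stem between two decks, e.g. trade vocals."""
    deck_a.stems[name], deck_b.stems[name] = deck_b.stems[name], deck_a.stems[name]


# Toy two-sample "tracks" so the arithmetic is easy to follow.
deck_a = Deck({"beats": [1.0, 1.0], "instruments": [0.5, 0.5], "vocals": [0.25, 0.25]})
deck_b = Deck({"beats": [2.0, 2.0], "instruments": [0.0, 0.0], "vocals": [4.0, 4.0]})

deck_a.solo("vocals")          # instant acapella on deck A
swap(deck_a, deck_b, "vocals")  # deck A now carries deck B's vocal
```

The point of the sketch is that once the stems exist, every performance control is just a gain change or a swap, which is why it feels so immediate in use.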

It’s very simple but incredibly powerful when you experience it for yourself. Of the two, it’s the one that delivers instant gratification, and it’s implemented in the most uncomplicated way. It just works, and it works instantly too, even on my comparatively elderly iPhone 7 Plus.

[Screenshot] Look at the pads and EQs.

VirtualDJ, however, takes the base function of splitting source audio and creates its own more expansive take on it. It takes more of a complementary EQ approach to things, with an added stem on/off pad mode too. Interestingly, VirtualDJ splits the song into five stems (vocals, instruments, bass, kick, and hi-hat) and combines them for different modes.

Atomix does stress that to get the instant feel of real-time source splitting, you’ll need a Mac or PC with some grunt. My 2014 MacBook Pro does work pretty well, but it’s not quite the absolutely instant feel of djay Pro AI.

From what I can tell, the big difference is that djay Pro AI quite literally does it in real time, i.e. it starts analysing chunks at the playhead, whereas VirtualDJ analyses the whole track, hence needing the powerful machine to deliver that necessary instant response.
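
To make that difference concrete, here is a minimal Python sketch of the two strategies as I understand them. Neither company has published how its engine works, so treat all of it as assumption: `fake_separate` is a hypothetical stand-in for a real separation model, and the chunk size and function names are invented for illustration.

```python
def fake_separate(samples):
    """Hypothetical stand-in for a separation model.

    It simply halves each sample into two 'stems' so the example
    runs without any audio or machine learning library.
    """
    return {
        "vocals": [s / 2 for s in samples],
        "rest": [s / 2 for s in samples],
    }


def stream_from_playhead(track, playhead, chunk_size):
    """Chunked approach: separate lazily, starting at the playhead.

    Only one chunk has to be processed before sound can come out.
    """
    for start in range(playhead, len(track), chunk_size):
        yield fake_separate(track[start:start + chunk_size])


def separate_whole_track(track):
    """Whole-track approach: everything is processed before playback."""
    return fake_separate(track)


def samples_before_first_audio(track_len, chunk_size, streaming):
    """How many samples must be processed before any stem is audible."""
    return min(chunk_size, track_len) if streaming else track_len


# A four-minute track at 44.1 kHz, with an assumed 4096-sample chunk.
track_len = 240 * 44100
chunked_wait = samples_before_first_audio(track_len, 4096, streaming=True)
whole_wait = samples_before_first_audio(track_len, 4096, streaming=False)
```

With these assumed numbers, the chunked route needs 4,096 samples processed before the first stem is audible, while the whole-track route needs all 10,584,000, which would explain why the latter leans so heavily on a fast machine.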

Out of the starting gates, djay Pro AI’s version is more polished, easier to understand and delivers that instant feel. VirtualDJ feels like it needs a little more work, but the five-way stem extraction may well be a winner for some. Make it separate out kick and snare/clap and that’s me sold. They’re obviously continuously developing it as it’s had three updates since launch.

Ultimately, they’re two very different animals. One is on the iOS/iPadOS platform and the other is macOS/Windows. One is simple, the other does more. Let the stems race begin.

Check out this djay Pro AI mix courtesy of Crossfader

BUT HOW DOES IT SOUND?

This, to quote football pundits, is definitely a game of two halves. The very first time you try either software, you get a genuine wow feeling, as if everything you knew about DJing just changed and you can’t go back. 

The experience of removing your first vocal or making that long-yearned-for instrumental is epic. And having done it repeatedly over the last two weeks or so, it’s a feeling that shows no signs of going away any time soon.

That said, the more you try it out, the more you realise that this is early days for this technology. There will be numerous ways to implement wrangling of stems, but the key is the quality of the output.

Right now it’s a mixed bag, but that is only to be expected. The source material matters, so music with space around the drums, instruments, and vocals will clearly deliver the best results.

Vocals clearly work best. Even pushing some very angry Sepultura through them pulled out a pretty clean vocal. But there’s no denying the slightly reverse reverb feel to the stems. But I stress that this is when sitting down and listening for such things with a single track. When mixing, it’s much less pronounced — it’s like your brain hears something familiar in a mix and uses real intelligence to fill in the audible gaps in quality.

It would be folly to expect true stems level quality from every track any time soon. But it will get better, and quickly too. It just needs some machines to do some more learning and to feed that right into the AI that’s driving this stuff, or however the hell this is working anyway.

But there’s no two ways about it — even at this fledgeling stage, this technology is pretty bloody magical. And as it develops, the possibilities become unfathomable.

An interesting thought to ponder: given that SoundCloud is about to be flooded with pretty shitty mashups made by DJs trying to be producers using this tech (wanna hear my Good Times/Another One Bites The Dust mashup? No?), will it make the music industry think about monetising real stems at last? Or is that just too progressive a thought for them to comprehend? Will streaming takedowns be impacted because this newfangled layering of stems will confuse the algorithms?

But what about other software? I’d say Pioneer DJ’s rekordbox is already tiptoeing through the idea of deeper track analysis with the recently announced vocal position feature. Native Instruments will probably be devastated that free stems will kill off their real Stems project, although I’d argue that they were never really into it anyway.

And already commenters are looking at Serato to respond. I doubt they’ll be too worried about people migrating from Serato DJ Pro to djay and VirtualDJ though, but it could turn the heads of newer DJs less entrenched in a particular ecosystem.

BEYOND SOFTWARE

The immediate problem is making it work with hardware. Short term, you can shift map controllers, but that’s an immediate workaround implementation rather than a proper solution.

The knock-on effect will be a slew of new full controllers designed to give direct hardware access to these new features. Having just been presented with a new cash cow, I’m sure the industry is collectively rubbing its hands together. Product managers will be working out how to implement these new features in hardware form as we speak.

I urge caution to the industry — not everyone will want to use this source splitting tech at this point, so don’t rush to deliver it to everyone. Take baby steps to deliver some modular controller solutions and see what sticks first. This might not be the technology to quickly cram into your range just yet.

For most, especially mixer users, it might be wise to lay hands on a modular controller like a Kontrol X1 or Korg Nano and play with this new feature for yourself. And should this feature take off, keeping up with the Joneses and one-upmanship dictate that hardware churn will be equally rapid too.

Also, given the march towards standalone performance, it’s just a matter of time before we see this happening in hardware too. Looks like a potential update to the already stunning Denon DJ SC6000/M Primes just got handed its next major USP. Perhaps I’ll wait for whatever those will be to arrive.

The promo video I made for Algoriddim. No explanations — just that oh wow moment.

Summing up

Wearing my editor’s hat, it’s been a while since I had an oh shit moment. And while a few bits of hardware have delivered that in recent years, none of them has been a real revolution. You can count those on your fingers over the last couple of decades.

But for me, source separation offers the next real shift in how we play music to a crowd. When we look back at things we now take for granted, they all started somewhere, and were pretty bloody awful by modern standards, and should by all rights have been doomed to fail. 

At the start, DVS latency felt more like a delay effect than a feature. But the promise was so strong that people stuck with it. And I feel the same about this.

Does it deliver studio-grade stems? Of course not. It would be unwise for anyone to say it never will, though, because we’re only seven months past the launch of Spleeter, and those algorithms can only improve dramatically, just as digital audio compression and DVS latency did.

But it’s a start, and a bloody good one too. I sat completely lost in music for two hours not even noticing the diminished quality. It was just pure unadulterated fun and made me think more about possibilities than it did the sound.

The biggest test of all is not our DJ ears, but those of the audience. They’ll soon tell you if it isn’t good enough. And sure, a discerning audience only happy with lossless recordings being played via a £5K rotary through a Void sound system won’t be having any of that nonsense. But even in these early days, I’m certain that your average audience will love being regaled with vocals over beats that previously were strangers. Cheap beer and a few pills can make anything sound a-maaay-zing maaate.

I haven’t been this excited for the DJ future in a very long time. It’s finally going in the direction I’ve wanted it to for years.

Mark Settle

The old Editor of DJWORX - you can now find Mark at WORXLAB

31 Comments

  1. I wouldn’t sign up for the new rekordbox plan since neither the cloud-based music service nor their new 3 band waveform is a great upgrade. But I would subscribe to djay Pro AI if it were available for iMacs and MacBook Pros.

    IMO, iPads are a pain to add music to since iTunes deletes any music you haven’t bought from iTunes when syncing. Plus I add my music by USB sticks or drives.

    • VDJ does have low/mid/high crossfaders, but very few skins use them. I suppose they could be mapped to a controller, but I think controllers with horizontal faders would be a little hard to find.

      Maybe Atomix should reimagine the 3 band crossfaders now that stem EQ is a thing.

  2. The more I experiment with djay Pro AI, the more I am impressed with it. When it comes to software, no one at Pioneer DJ, Serato, or Native Instruments can compete with Algoriddim. They are outstanding.

    Their software is really fast and accurate with BPM analysis and sync. Also, they have the best audio separation for stems. Furthermore, they have been consistently awarded for their software development.

    Areas of improvement for Algoriddim are their effects. They are still not at Pioneer DJ’s level. Pioneer has the best effects.

  3. @Djworx. You need to network with more studio & multimedia producers. Stem extraction has been available in pro audio editing software for many years. It’s the audio equivalent of spectral layers in photography & photo editing software. It’s a great feature to have in DJ software but there is no single program which works. You need to mix & match all the available tools to get the best results.

    It’s available in ACID Pro but you only get three converted audio tracks & you have no control over the results.
    There’s a VST version named Regroover.
    iZotope RX 7 has a very powerful feature named Music Rebalance but it’s expensive. You also have online versions where you upload your file to a server & it converts them. They are usually related to karaoke.

    PhonicMind vocal remover is powerful but it has an online subscription version which costs more than a digital WAV file.
    MVSEP is a free online version but it only converts to MP3.

    There are many more which perform the same task; however, the best versions are the ones which give you as much control as possible over the conversion. I generally use RX 7 for vocal extracts as it’s more detailed. I perform several conversions using small steps & split the stereo channel to mono to minimise phasing & artifacts.

    The only rule I’ve noted is that you can raise or reduce the levels of the audio of the items as a whole & not experience much quality loss. However, the quality is compromised if you completely remove a component.

    The most frustrating observation is that you have to try the conversion on every issued version of the audio source to be certain. I’ve had poor results on my vinyl rips & had success on a digitally released version of the same audio & vice versa. I’ve also had poor results in RX 7 but success with another version which I’ve listed.

    A very fast CPU is also required but it’s a very useful tool. I was surprised to note how well it works on the old live disco rap tapes from 1978.

    • I’m more than aware of the path taken to get to this point, and specifically talk about Spleeter because this is likely to be at least the inspiration for the feature appearing now. And for DJs to be able to do it in real time in DJ software with zero preparation is a first.

  4. Sounds super-phasey to me…. I don’t think it’s nice to hear on a big-room PA, maybe as an effect, but not for a longer period. But the tech will definitely improve, so a good start!

  5. I see this more as an effect than a feature. Also, is there any issue with copyright? I’m sure Metallica wouldn’t want me removing their vocals and mixing in Usher. It’s not for me; I’d rather use a professional acapella and instrumental / loop than have a vocal suddenly pop up in the middle of a mix because there is a change in frequency. Also, the amount of improper key mixing is making me cringe like fingers going across a chalkboard.

    • The advantage of this new method clearly is that you don’t have to grid acapellas upfront (which can be a pain in the ass).
      Also, with Mixed in Key, harmonic mixing shouldn’t be a problem for any professional.

      Unfortunately, my software of choice (Rekordbox DJ & Traktor) doesn’t have that feature yet, and I’m not going to switch to djay or Virtual DJ.

      • Kevin, it is worth a try but maybe not worth changing your software of choice. I personally like to always try all available software and new DJ products to stay updated on the options available.

    • I think you’re right on the copyright issue and that is why I believe Algoriddim has released this on iPhones and iPads, since it gives Apple better copyright control.

      Anyway, I have tried both versions and I personally feel djay Pro AI is outstanding and the best choice for 4 deck mixing. It offers the best audio separation currently. Plus you get the fastest and most accurate BPM analysis and morph mixing. I rate it a 9/10.

      Virtual DJ is a close second and really deserves a closer look. I highly recommend trying both and you might be surprised by a lot of the features they offer. VDJ really shines in 2 deck pro mode. Love how VDJ maps vocal, instru, and beat for highs, mids, and lows for supported controllers, players, and mixers. I do wish VDJ would do a better job supporting original STEMS. Audio separation for original STEMS is poor and has phase issues even for true STEMS. I rate it an 8/10.

      Anyhow, I believe Rekordbox will be the first to follow, then Serato, and Native Instruments will just hit the snooze button. Traktor support won’t happen I think for at least 2 or more years based on past trends.

    • Metallica wouldn’t want me removing their vocals and mixing in Usher

      Not if you then started distributing it / profiting from it, but creating it for yourself & playing it at gigs shouldn’t cause any harm.

      • Correct. I agree. I mean I already take acapellas and play them over instrumentals on my radio shows and podcasts but they are purchased as such and listed on my track listings as two separate tracks. This is similar to that, but done on the fly with added latitude as the technology is manipulating the control of the frequencies.

  6. The separation of kicks, bass, and instruments is frankly incredible, even on the first few tracks I threw in. The vocals and hi-hats can sound a bit phasey, depending on the track, but you’re absolutely right, that effect largely disappears when you’re actually mixing. Fucking stoked.

    • That’s the thing with so many tech based advances — DJs will sit in controlled conditions and listen for flaws. But in real world conditions with real world people, so many of these things don’t matter, or at least are acceptable because the end result works regardless of quality.

  7. Atomix have just added the ability to pre-analyse tracks and save the stems so loading time is reduced.

    People without the beefy NVIDIA GTX graphics cards Atomix recommends were complaining about how long it took to analyse when the track loaded (which also pushes the CPU as it can’t use the GPU). Now users can “rip” the stems in advance, which get saved to a single VDJ-specific file (not NI format) for each track.

    Also worth mentioning that Atomix has added this in the 64-bit version of VDJ only. Apparently the 32-bit version isn’t up to the job. This means that the requirement is Windows 10 64-bit only on PC. Mac users are 64-bit anyway.

    Your VDJ screen shots only show three EQ modes and four knobs on the right side. I don’t know why but I’m getting another mode (five way ‘stems’ mode) and five knobs on the right.

  8. I remember seeing the Spleeter thing and hoping at some point Serato would build this kind of functionality in. I wouldn’t even be that bothered about needing it done real time, just the ability to easily prep a few tracks like this nicely in your library / interface – rather than doing it all myself in Python.

  9. I’d argue that vocals is where this system suffers the most. But I’m not even sure this is due to the software, but rather at least partially caused by the fact the voices just aren’t totally there within the track. They’re usually heavily processed with effects and compression (sometimes side chained), so I’m not really surprised. In my experience, it still sits well within the elements of other tracks, as long as they’re in key.

  10. djay Pro AI and VirtualDJ have definitely changed the DJ business. If you want to be successful in this business now, you should know how to work with both of these things. Otherwise, you’ll fail in this business.

  11. There is a good chance that djay Pro 3 comes to Mac with Neural Mix since Apple’s new chipset launches 11.17.20. I wouldn’t be surprised if djay Pro 3 is even used at launch to demo the new processing power of the M1 chip.
