Date: 2023-06-05

Set to a Ye-style beat that he found on YouTube, Nickson’s Ye-voiced verses make the rapper seem to apologise for his shocking antisemitic outbursts last year. “I attacked a whole religion all because of my ignorance,” Nickson rapped in the vocal guise of Kanye. (In reality, the rapper offered a sorry-not-sorry apology last year in which he said he didn’t regret his comments.)

“When I made that video, these machine-learning models were brand new,” Nickson told me in a video call, sitting behind a microphone in his filming studio in Charlotte, North Carolina. The 37-year-old is a tech entrepreneur and content creator. He came across the Kanye voice model while browsing a Ye-inspired music-remix forum called Yedits on the internet site Reddit.

“It was a novelty, no one had seen it,” he said of the AI-generated Ye voice. “Like, the tutorial had about 20 views on YouTube. And I looked at it and went, ‘Oh my God.’ The reason I knew it was going to be huge wasn’t just that it was novel and cool, but also because the copyright conversation around it is going to change everything.”

ETHICAL QUESTIONS RAISED

Ethical questions are also raised by voice cloning. Nickson, who isn’t African-American, was criticised online for using a black American voice. “I had a lot of comments calling it digital blackface. I was trying to explain to people, hey look, at the time this was the only good model available.”

Elsewhere on his YouTube channel are guides to making your own celebrity voice. Led by his tutorials, I enrol as a member of an AI hub on Discord, the social-media platform founded by computer gamers. There you can find vocal models and links to the programming tools for processing them.

These tools have abstruse names like “so-vits-svc” and initially look bewildering, though it’s possible to use them without programming experience. The voice models are formulated from a cappella vocals taken from recordings, which are turned into sets of data. It takes several hours of processing to create a convincing musical voice. Modellers refer to this as “training”, as though the vocal clone were a pet.

Amid the Travis Scotts and Bad Bunnies on the Discord hub is a Tom Waits voice. It’s demonstrated by a clip of the AI-generated Waits bellowing a semi-plausible version of Lil Nas X’s country-rap hit Old Town Road. But I can’t make the model work. So my next port of call is a website to do it for me.