Microsoft wants to use AI to bleep out bad words in Xbox Live party chat

Today, Microsoft announced that it’s rolling out filters that will let Xbox Live players automatically screen the text-based messages they receive according to one of four maturity tiers: “Friendly, Medium, Mature, and Unfiltered.” That’s a long-overdue feature for a major communication platform that’s well over a decade old now, but it’s not really anything new in terms of online content moderation writ large.
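To make the tier idea concrete, here is a minimal sketch of how a tiered text filter might work. It is not Microsoft's implementation: the tier names come from the announcement, but the word list, the `MIN_TIER_ALLOWED` mapping, and the `filter_message` function are hypothetical stand-ins chosen for illustration.

```python
import re
from enum import IntEnum


class Tier(IntEnum):
    """The four tiers named in the announcement, strictest first."""
    FRIENDLY = 0
    MEDIUM = 1
    MATURE = 2
    UNFILTERED = 3


# Hypothetical word list: each term maps to the least strict tier
# at which it is shown unmasked. Real lists would be far larger.
MIN_TIER_ALLOWED = {
    "darn": Tier.MEDIUM,
    "crud": Tier.MATURE,
    "blast": Tier.UNFILTERED,
}

_WORD = re.compile(r"[A-Za-z']+")


def filter_message(text: str, tier: Tier) -> str:
    """Mask every listed word that the recipient's tier does not allow."""
    def mask(match: re.Match) -> str:
        word = match.group(0)
        required = MIN_TIER_ALLOWED.get(word.lower())
        if required is not None and tier < required:
            return "*" * len(word)
        return word

    return _WORD.sub(mask, text)


print(filter_message("Darn, that was crud", Tier.FRIENDLY))
print(filter_message("Darn, that was crud", Tier.MEDIUM))
```

Because the filter runs on the recipient's side of the conversation, two players reading the same message could see different versions of it, which matches the per-player control the announcement describes.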

What’s more interesting is a “looking ahead” promise Microsoft made at the end of the announcement:

Ultimately our vision is to supplement our existing efforts and leverage our company efforts in AI and machine learning technology to provide filtration across all types of content on Xbox Live, delivering control to each and every individual player. Your feedback is more important than ever as we continue to evolve this experience and make Xbox a safe, welcome and inclusive place to game.

That’s all a bit vague, but The Verge reports on the real thrust of that passage: an effort by the company to “tackle the challenge of voice chat toxicity on Xbox Live.” That means leveraging Microsoft’s existing efforts in speech-to-text machine-learning algorithms to automatically filter out swear words that might come up in an Xbox Live party chat.

The ultimate goal, Xbox Live program manager Rob Smith told The Verge, is a system “similar to what you’d expect on broadcast TV where people are having a conversation, and in real time, we’re able to detect a bad phrase and beep it out for users who don’t want to see that.” While broadcast networks still use live engineers to censor their tape-delayed content (with sometimes disastrous errors), Microsoft’s plans involve using machine learning to filter voice communication on a much larger scale without centralized human intervention.
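As a rough illustration of the "beep it out" step, the sketch below assumes a speech-to-text system has already produced a list of words with start and end times in seconds, sorted by start time. That input format, the `BLOCKED` word list, and the `bleep_intervals` function are all assumptions made for this example; nothing here reflects how Microsoft's system actually works.

```python
from typing import List, Set, Tuple

# Hypothetical list of phrases a listener has chosen not to hear.
BLOCKED: Set[str] = {"darn", "crud"}

TimedWord = Tuple[str, float, float]  # (word, start_seconds, end_seconds)


def bleep_intervals(
    words: List[TimedWord],
    blocked: Set[str] = BLOCKED,
    pad: float = 0.05,
) -> List[Tuple[float, float]]:
    """Return merged time ranges of audio that should be replaced by a tone.

    Assumes `words` is sorted by start time. `pad` widens each range
    slightly so the edges of a word are not left audible.
    """
    intervals: List[Tuple[float, float]] = []
    for word, start, end in words:
        if word.lower().strip(".,!?") not in blocked:
            continue
        lo, hi = max(0.0, start - pad), end + pad
        if intervals and lo <= intervals[-1][1]:
            # Overlaps or touches the previous range: extend it.
            prev_lo, prev_hi = intervals[-1]
            intervals[-1] = (prev_lo, max(prev_hi, hi))
        else:
            intervals.append((lo, hi))
    return intervals


transcript = [("well", 0.0, 0.3), ("darn", 0.3, 0.6), ("crud", 0.6, 0.9), ("ok", 1.0, 1.2)]
print(bleep_intervals(transcript, pad=0.0))
```

The hard part in practice is not this bookkeeping but getting accurate word timings quickly enough that the audio can still be delivered without a noticeable delay.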

That should be welcome news for users worried about Amazon-style human reviews of Alexa audio commands. Smith stressed that “we have to respect privacy requirements at the end of the day for our users, so we’ll step into it in a thoughtful manner, and transparency will be our guiding principle to have us do the right thing for our gamers.”

While speech-to-text algorithms are getting pretty quick and accurate these days, we’re probably still some way off from a centralized server being able to reliably detect and block spoken swearing in real time. And even a slight delay in transmission could have an outsized impact on players who need instant communication to work together in a fast-paced gaming situation.

In the meantime, Smith said an algorithm could measure an overall “level of toxicity” in group chat and hand out automatic mutes to problematic users. Hopefully that kind of measurement could also affect a player’s rating under the Xbox Live reputation system, which can already lead to bans for excessive profanity in human-reviewed video clips.
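One simple way such a "level of toxicity" measurement could work is a rolling window over each player's recent utterances, with a mute once too many of them are flagged. The sketch below is purely illustrative: the `ToxicityMonitor` class, the word list, the window size, and the threshold are all hypothetical, and Smith did not describe how Microsoft would actually compute the score.

```python
from collections import defaultdict, deque

# Hypothetical flagged terms; a real system would likely use a trained model.
FLAGGED = {"darn", "crud"}


class ToxicityMonitor:
    """Mute a player once enough of their recent utterances are flagged."""

    def __init__(self, window: int = 5, threshold: float = 0.6):
        self.threshold = threshold
        # Per-player history of the last `window` utterances (True = flagged).
        self.history = defaultdict(lambda: deque(maxlen=window))
        self.muted = set()

    def record(self, player: str, transcript: str) -> bool:
        """Log one transcribed utterance; return True if the player is muted."""
        words = (w.strip(".,!?") for w in transcript.lower().split())
        self.history[player].append(any(w in FLAGGED for w in words))

        recent = self.history[player]
        # Only judge a player once the window is full, to avoid muting
        # someone over a single early outburst.
        if len(recent) == recent.maxlen:
            if sum(recent) / len(recent) >= self.threshold:
                self.muted.add(player)
        return player in self.muted


monitor = ToxicityMonitor(window=3, threshold=0.6)
for line in ["darn it", "nice shot", "crud!"]:
    print(monitor.record("player_a", line))
```

Unlike real-time bleeping, a score like this can be computed after the fact, so it does not need to add any delay to the voice stream itself.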
