Audio moderation is the branch of social media and digital moderation that deals with audio and video, or any sound that flows through channels such as TikTok and Facebook Live Audio. Moderating audio content has become increasingly difficult as platforms push short videos that layer many interaction styles, from stickers to captions to songs to text overlays.

“Audio presents a fundamentally different set of challenges for moderation than text-based communication. It’s more ephemeral and it is harder to research and action,” said Discord’s chief legal officer, Clint Smith, referring to the platform’s moderation struggles with its new Stage Channels audio feature.

The whole crew of audio-based social channels – TikTok, Instagram’s Reels, Clubhouse, Discord’s Stage Channels – has only just started to research solutions, and the tools for audio content moderation lag far behind the social listening tools for text-based conversations. A few companies are developing speech analysis APIs, but there is no streamlined approach.

Because of this, social platforms don’t have much of a tracking system for audio. Instead, they tend to resort to blocking users outright, which creates its own set of accessibility issues.

Moderating comments and threads on social media is an arduous process in general. Audio adds a difficult and nuanced layer: listening to, tracking, and reporting audio takes more time, requires both humans and tech to uncover layers of meaning, and makes it easy for an offense to get lost in the mix. Plus, there are few third-party social listening platforms that can fully track audio.

Luckily, platforms like Instagram (which recently released a Reels API) are providing ways for people and brands to track their information on audio. In the meantime, we’ll need to face these challenges head-on.

Challenge #1: Artificial Intelligence Can’t Detect Sarcasm

At this stage, artificial intelligence can transcribe audio and crawl through the content to find buzzwords and phrases, but there are gaps. AI auto-moderators are useful for identifying users who misuse these platforms, yet they easily take words out of context, and they struggle to separate overlapping voices and tones. Both AI and human moderators can detect and filter popular expressions, but some words that appear negative are used harmlessly.
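To make the out-of-context problem concrete, here is a minimal Python sketch of keyword flagging over a transcript. It assumes the audio has already been transcribed to text; the watchlist and the function name `flag_transcript` are hypothetical and only for illustration, not any platform's actual filter.

```python
import re

# Hypothetical watchlist; a real one would come from your community guidelines.
WATCHLIST = {"kill"}

def flag_transcript(transcript, watchlist=WATCHLIST, window=3):
    """Return (term, context) pairs for every watchlist word found in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    hits = []
    for i, word in enumerate(words):
        if word in watchlist:
            # Keep a few words on either side so a human can judge the intent.
            start = max(0, i - window)
            context = " ".join(words[start:i + window + 1])
            hits.append((word, context))
    return hits

threat = flag_transcript("I will kill you if you come back")
praise = flag_transcript("This song is going to kill at the party")
```

Both transcripts are flagged for the same word, although only the first is a threat. The filter sees the term and not the intent, which is why the surrounding context is kept for a human reviewer.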

By contrast, human moderators are far less likely to miss those nuances and cultural differences of language, especially if you employ a global team. Even so, it’s nearly impossible for moderators to sift through every conversation as it happens, and verbal attacks can happen quickly and have a long-lasting effect on users. Combining tech and humans is a business’s best bet.

Discord, an audio-focused gaming platform, found a happy medium to address nuance in tone and syntax. The company notes that moderation is a combination of tech and human skill: bots should be used in tandem with community moderators to help mitigate the risks of audio platforms. If you’re on Discord, this option is available for both text and voice chat. Discord is a place for anyone looking to chat or join online communities. Here, users can start their own communities (known as “servers”) and invite others to join. Inside each server, users can set up channels dedicated to specific topics so that members can congregate around the issues they’d like to discuss.

Discord is also free to join, offering unlimited messaging and complete access to your messages, history, and communities. It’s free to start your own communities/servers, too. That said, Discord also offers paid subscriptions with extra perks, such as animated avatars, custom sitewide emojis, larger file upload sizes, free games, and more.

Challenge #2: Nuanced Guidelines

In order to moderate, brands should have guidelines that help moderators make real-time decisions quickly. Creating audio guidelines differs greatly from straightforward text moderation. Moderators have to listen closely for intonation and intent, so the guidelines must include specific words, terms, sounds, and definitions to ensure that they account for any utterance that can break the community guidelines.

If you’re a brand in the early stages on an audio platform – or simply diving into TikTok or Reels – creating an all-encompassing mod guide will bring its challenges, but it is imperative. For example, unlike offending text comments on social media platforms, offending audio cannot be hidden or filtered from chat participants. Moderators will have to take harsher action.

Plus, these platforms only recently released tracking APIs, so you’re looking at a strong need to track manually. Brands have to find the best way to classify offenses – or, conversely, opportunities to engage – in order to direct their moderators.

Challenge #3: Lack of Resources

Generally speaking, big brands running three or more social platforms will need to hire for a dedicated moderator role or outsource their community management. Brands that are already short on budget, employee time, or specialization may not be able to comb through audio to find inappropriate content. With hundreds of millions of active users on audio-focused platforms, it is almost impossible to have the resources to do so.

As a brand, the rule of thumb is to staff enough moderators with regional and cultural expertise who can represent an audience and understand its colloquialisms. It’s your best defense against offenders, or just downright trolls. Speech happens rapidly and can be missed if a moderator isn’t in the chat to hear it or if it isn’t reported. Your moderation guidelines can only work if the offense is heard.

Challenge #4: Privacy Issues

While not real time, one audio moderation strategy is to review recordings. Recordings give moderators a chance to review reported incidents and run audits on audio files to ensure that community guidelines are being followed within each community. This also allows moderators to have evidence of any actions that are taken in the event that anyone wants to appeal the decisions made by moderators.

The problem with this approach is that it can infringe on people’s privacy. Users may be less likely to participate in servers or platforms that record, or may simply feel restricted in using their voice. Privacy on private (yet highly public) platforms is a broader issue that countries around the world continue to grapple with.

Twitter Spaces addressed this concern by choosing to record all spaces and keep the audio for 30 days in case community guidelines weren’t followed. If a space is reported or a violation is detected, Twitter keeps the audio for 90 additional days for review and to give the parties involved time to appeal. As of now, this may be the most thoughtful option to protect all participants, as the exposure is only temporary. It is worth considering when developing your own strategy.
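As a rough illustration of a time-boxed retention rule like the one described above, the Python sketch below computes when a recording could be deleted. It assumes the 90 additional days are added on top of the initial 30; the function and constant names are hypothetical and do not reflect Twitter's implementation.

```python
from datetime import date, timedelta

BASE_RETENTION_DAYS = 30      # default retention after a session ends
EXTENDED_RETENTION_DAYS = 90  # extra time when a session is reported or flagged (assumed additive)

def deletion_date(ended_on, flagged):
    """Return the date on which a recording may be deleted."""
    days = BASE_RETENTION_DAYS
    if flagged:
        days += EXTENDED_RETENTION_DAYS
    return ended_on + timedelta(days=days)

routine = deletion_date(date(2022, 1, 1), flagged=False)  # 2022-01-31
disputed = deletion_date(date(2022, 1, 1), flagged=True)  # 2022-05-01
```

Writing the rule down this explicitly also makes it easier to tell users exactly how long their audio is kept and why.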

Challenge #5: Moderator Mental Health

Moderators are tasked with being objective when monitoring content. So even when they’re watching potentially triggering or violent content, they are tasked with observation and decision making.

In the past few years, content moderators have spoken out about PTSD diagnoses resulting from encountering a high volume of repeated, difficult content on a daily basis. There are also concerns about anxiety, depression, and general burnout.

A common complaint among moderators is a lack of mental health benefits at their company. According to a Virginia Tech study, being proactive about maintaining mental wellness is the best way to protect social media moderators. This includes resilience training, coping mechanisms, and access to mental health care. It is essential to offer mental health and wellness benefits, and to encourage your moderators to take breaks and be vocal about any topics or content that are difficult to handle.

So, what should brands do about audio moderation?

Moderation is not a one-size-fits-all task. When creating a moderation plan, always consider context: take a hard look at your audience, the immediate and predicted social climate, and, importantly, a realistic view of your team’s capabilities and time limits. Integrate the tools and tech, such as social listening tools, that will make jobs easier, smoother, and less time consuming. Assessing detailed reports and acting on them will also protect your moderators and users.

Our final recommendation for audio moderation is this: use AI filters and human moderators together. If you choose just AI, you’ll miss nuanced opportunities to engage and limit your overall reach. If you choose just humans, you’re looking at burnout and potentially missed mentions. If your employees are too strapped, or not trained, for the task, you may need to outsource your moderation services or work in tandem with a global company dedicated to the job.
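One way to picture the AI-plus-human approach is a simple triage step, sketched below in Python. It assumes an AI filter that assigns each clip a score between 0 and 1 for how likely it is to break the guidelines; the `Clip` type, the thresholds, and the queue names are all hypothetical and would need tuning for a real community.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    clip_id: str
    ai_score: float  # hypothetical 0.0-1.0 "likely violation" score from an AI filter

def triage(clip, urgent_threshold=0.9, review_threshold=0.4):
    """Route a clip to a queue based on its AI score."""
    if clip.ai_score >= urgent_threshold:
        return "urgent_human_review"   # likely violation: a moderator looks first
    if clip.ai_score >= review_threshold:
        return "human_review"          # ambiguous: context and tone need a person
    return "spot_check"                # low risk: sampled occasionally for audits

queues = [triage(Clip("a", 0.97)), triage(Clip("b", 0.55)), triage(Clip("c", 0.05))]
```

In this sketch the AI only prioritizes what humans hear first. Every action still passes through a moderator, which keeps sarcasm and context in human hands while reducing how much audio each person has to sit through.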


ICUC is a 24/7 global social media community management and online moderation team who integrates into your team to ensure your brand is safe online. Learn more about our audio moderation and social listening services: Social Listening Service Page.
