Listening to the conversations happening on social media can be a great way to gauge public opinion about certain topics among target audiences. For campaigns in tight races, it can be a valuable resource. But social listening also has many limitations, and it’s not a replacement for polling.
I’ve spent the past seven years working at the intersection of politics and social media, developing, directing, and executing social media strategies for a range of clients on and off—and up and down—the ballot.
For almost all of the campaigns and clients I’ve worked with, social listening has been an important part of ongoing reporting and analysis. Since 2013, I’ve had a front-row seat to the evolution of social listening tools and capabilities. I’ve used or demoed virtually every major social listening platform on the market today, and have worked with data teams to develop and implement social listening infrastructures in-house.
Social listening can be a useful tool if you know how to use it — and if you understand its limitations. This is a complicated topic with many dimensions, and my goal in this piece is to outline some common pitfalls to watch out for when using this evolving technology, as well as some opportunities and applications in which it truly can excel.
Make sure you understand who you are and aren’t listening to.
In the overwhelming majority of its applications, social listening is only able to capture public online conversations—some of which are much easier to identify than others. Public or private, remember that you’re only “hearing” from those who want their voices to be heard.
Today, 72 percent of U.S. adults use some type of social media, a share that has stayed fairly steady since 2016. But that 72 percent doesn’t map neatly onto the 55 percent who actually vote.
Though campaigns can still see and collect insights from the public comments that people leave on, say, the campaign’s own public Facebook or Instagram content, it’s important to remember that these comments are being made directly to a candidate or campaign.
In other words, these are things they want the campaign to know, not necessarily what they’d say behind its back. For instance, Twitter replies from protected accounts stay private, though they still count toward the total number of “replies” shown next to a tweet.
Meanwhile, though campaigns can identify and analyze public conversations across platforms like Twitter, Reddit, and Facebook, those conversations are limited to self-selecting and often anonymized communities of users—not a representative sample constructed by pollsters.
Consider that just 22 percent of U.S. adults use Twitter. And according to Pew, these users have, on average, some significant demographic differences from U.S. adults overall, including skewing younger and being more likely to identify as Democrats.
This content and its volume are also prone to manipulation by bots, hackers, trolls, or all of the above. Social listening platforms and data scientists have developed ways to identify and filter out these kinds of accounts. And in 2016, we saw that fake accounts can have a demonstrable influence on real people’s views and experiences online.
Sure, it’s important to separate, as much as possible, what fake accounts are saying from the content coming from real users. But as long as fake accounts are present, there’s real value in measuring and assessing their content, too.
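For a sense of what that filtering can look like, here’s a minimal sketch of the kind of heuristics a data team might start with. The account fields and thresholds are illustrative assumptions for the example, not any platform’s actual bot-detection criteria.

```python
from dataclasses import dataclass

@dataclass
class Account:
    handle: str
    account_age_days: int
    tweets_per_day: float
    followers: int
    following: int
    has_default_avatar: bool

def looks_automated(acct: Account) -> bool:
    """Flag accounts whose activity pattern suggests automation.

    Thresholds here are placeholders, not real detection criteria.
    """
    if acct.account_age_days < 30 and acct.tweets_per_day > 100:
        return True   # brand-new account posting at extremely high volume
    if acct.has_default_avatar and acct.followers < 10:
        return True   # no profile effort and essentially no audience
    if acct.following > 0 and acct.followers / acct.following < 0.01:
        return True   # follows thousands, followed by almost no one
    return False

# Usage: separate likely-fake chatter from the rest before measuring volume.
accounts = [
    Account("maybe_bot_4821", 12, 250.0, 3, 4000, True),
    Account("probably_a_person", 2400, 2.5, 180, 300, False),
]
organic = [a for a in accounts if not looks_automated(a)]
```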
Platforms including Facebook and Twitter are also stepping up their efforts to crack down on fake engagement. But user growth is one of the most important metrics for online platforms, reducing some of the incentive for them to remove all fake accounts. Plus, there can be a real value to keeping anonymity an option for some users, such as those speaking out or organizing online against oppressive regimes.
On social media, as in real life, the loudest voices can also be in the minority. In the private sector, research has repeatedly shown that consumers with “extreme experiences” are more likely to leave a review, and the same holds for the increasingly polarized online discourse about politics. In fact, an estimated 80 percent of tweets come from the most active 10 percent of accounts.
Due to growing distrust of social media platforms and increased awareness of social media surveillance, many people expressing their authentic opinions online opt for harder-to-capture ways of communicating. For instance, some use asterisks and special characters to evade keyword searches for names like D*n*ld Tr*mp, or get creative with nicknames. Many simply comment on a linked article, attach a screenshot of a tweet (or a screenshot of an article), or use video to get their point across, avoiding searchable keywords altogether.
Data scientists can search for every symbol-ridden, misspelled variant of a range of keywords and creative nicknames they can think of, and use tools to pick up text, logos, and locations on public photos, as well as identify leading shares of specific links or articles. Machine learning is even capable of auto-transcribing audio into searchable keywords.
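For illustration, here’s a rough sketch of how a team might catch some of those symbol-ridden variants with a fuzzy keyword pattern. The character-substitution map is an assumption made up for this example, not an exhaustive list.

```python
import re

# Map each "evadable" letter to the characters people commonly swap in.
# This list is illustrative only.
OBFUSCATIONS = {
    "a": "[a@4*]", "e": "[e3*]", "i": "[i1!*]", "o": "[o0*]", "u": "[uv*]",
}

def fuzzy_pattern(keyword: str) -> re.Pattern:
    """Build a case-insensitive regex that tolerates common character swaps."""
    parts = [OBFUSCATIONS.get(ch.lower(), re.escape(ch)) for ch in keyword]
    return re.compile("".join(parts), re.IGNORECASE)

pattern = fuzzy_pattern("Donald Trump")
posts = ["D*n*ld Tr*mp is trending again", "Unrelated post about dinner plans"]
matches = [p for p in posts if pattern.search(p)]  # catches the obfuscated mention
```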
But that’s all just the public conversation. What about all that’s being said on private profiles, in DM rooms and group chats, or in closed or invite-only forums, groups, and subreddits?
It’s worth noting that 45 percent of social media users keep all of their accounts private, while 63 percent of users say private messaging apps are where they feel most comfortable sharing and discussing content. Even software that scrapes audiovisual content isn’t able to override privacy settings to capture insights about conversations and content that’s shared in non-public online spaces.
Plus, social listening is always inherently at the mercy of the social media platforms themselves — and their ever-changing APIs — the most prominent of which has an extensive and well-documented history of making inaccurate claims and providing inflated metrics.
So, if social listening has all these limitations and is not a replacement for polling, why do we keep using it? How can it be a valuable tool?
It can be a great complement to your own social media performance data and can help inform a strategic content program.
On Facebook, standard metrics for post performance include the number of reactions, comments, and shares. Beyond that, consider looking at the kinds of reactions users are leaving, as well as the content of their comments and shares, to understand the quality of your audience’s response to your content, not just the quantity.
You can do this at scale by looking at keyword volume, thoughtfully grouping similar terms together to more readily identify patterns in how your audience is responding to a specific piece of content, or which topics they want to hear more about.
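As a concrete sketch, here’s one way that grouping might look in practice. The theme names and keyword lists are placeholders; in a real program they’d come from your own issue areas and content calendar.

```python
from collections import Counter

# Illustrative themes and keywords only; substitute your own.
THEMES = {
    "health care": ["medicare", "insurance", "premiums", "prescription"],
    "economy": ["jobs", "wages", "taxes", "small business"],
    "climate": ["climate", "emissions", "green new deal"],
}

def theme_volume(comments: list[str]) -> Counter:
    """Count how many comments touch each theme."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()
        for theme, keywords in THEMES.items():
            if any(kw in text for kw in keywords):
                counts[theme] += 1
    return counts

comments = [
    "My insurance premiums doubled this year",
    "What's the plan for small business taxes?",
    "Love the climate plan!",
]
print(theme_volume(comments).most_common())
```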
On Twitter, consider looking at the content of top user replies, mentions, and retweets with comment, as these all can lend themselves to valuable insights. For instance, Warren for President’s “billionaire tears” mug was a very simple example of social listening in action, launched in direct response to a torrent of users urging Elizabeth Warren to sell one.
It quickly became our campaign’s best-selling piece of merchandise, both through organic social and overall. You, too, can do this at scale: track keyword volume, then look at the most-engaged public content about your top keywords and their common variants (including in multiple languages, as appropriate) to identify trends.
If calibrated correctly, social listening can also help summarize evolving public online conversation about major movements like #BlackLivesMatter, events like a #DemDebate or the #DemConvention, topics like COVID-19, candidates like the one you’re working to elect, and specific articles or landing pages like the GOTV action page you’re launching next week.
It’s also great for gauging interest and sampling public conversations among vocal supporters, followers, or a well-defined audience like verified journalists, prominent progressives, or elected leaders on Twitter.
In any case, be careful about drawing conclusions from metrics alone without looking at the raw data. During the primary debates in Nevada and South Carolina, for example, users “liked” tweets mentioning Warren about as much as they “liked” tweets about Mike Bloomberg.
If you read those metrics without looking at the tweets themselves, or without at least having watched either debate, it would be tempting to conclude that social media audiences “liked” the two candidates’ performances about equally. But as anyone who saw those debates can imagine, the content of those two sets of tweets was very different. Technology like natural language processing can help surface these differences programmatically, but the quality of NLP-equipped tools varies widely, especially across languages, and the better ones often carry a hefty price tag.
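To make the point concrete, here’s a toy example of why the raw text matters: identical “like” counts can sit on top of very different tones. The tiny word lists and sample tweets are illustrative assumptions; real NLP tooling would use far richer models than this.

```python
# Crude lexicon-based tone scoring, for illustration only.
POSITIVE = {"great", "strong", "won", "impressive", "love"}
NEGATIVE = {"disaster", "weak", "awful", "worst", "mess"}

def crude_tone(tweets: list[str]) -> float:
    """Return an average positive-minus-negative word score per tweet."""
    score = 0
    for t in tweets:
        words = set(t.lower().split())
        score += len(words & POSITIVE) - len(words & NEGATIVE)
    return score / max(len(tweets), 1)

candidate_a_tweets = ["Such a strong night for her", "That answer was impressive"]
candidate_b_tweets = ["What a disaster of a debate", "His worst performance yet"]
print(crude_tone(candidate_a_tweets), crude_tone(candidate_b_tweets))
```

Two sets of tweets with the same number of likes can still score very differently here, which is exactly the nuance that a likes-only dashboard hides.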
It all comes back to approaching social listening data with an honest and thoughtful understanding of the audiences and topics it can and can’t capture, a curiosity about the context of its content, and an appreciation of the nuance therein.
Polling has its own limitations, and it has failed us in many ways, especially in 2016. On Warren’s campaign, we had a saying for this: “Don’t ride the pollercoaster.” But don’t go all in on a social listening coaster instead; no single approach is ever perfect or complete. What we can be certain of is that more data helps us make more informed decisions.
Anastasia develops and directs innovative digital strategies for leading progressive campaigns and organizations. Before serving as Elizabeth Warren's Social Media Director, Anastasia launched and led the Social Media Department at Trilogy Interactive, where she worked with dozens of high-profile clients on refining and growing their online presence.