Survey text analysis beyond word clouds

Quick disclaimer/TLDR.

This article is written for fairly technical people who love detail. If you don’t, here are a few key things you can take away:

  • Traditional survey text analysis is basically limited to word clouds (which were cool in 2001) and yields little to no actual insight from your hard-earned opinion data.
  • If you think machine learning will fix it, you’re close, but survey text alone isn’t rich enough to yield a useful output.
  • Networked surveys are a new kind of survey that creates way richer insights from your open-ended responses.
  • Networked surveys multiply text data per-respondent 27x on average and provide higher ROI on your panel investment.
  • Networked surveys automatically discover respondent personas based on their behavior and reactions (not just text).
  • Networked surveys mean change, and if you don’t like change, stick to what you know.
  • If you want to get started, or learn more about networked surveys in a more market-y way, click on this.

Alright, game on.

Survey text analysis is limited.

Are you a technical/analytical thinker? Do you run surveys often? If you are, and you do, then you probably already know where they are strong and where they have limits.

One of those limits is text analysis, and as markets move to contextual insights to comply with GDPR, this barrier can be paralyzing. At the center of the challenge is the very nature of survey text data collection.

You invest cash to get a representative sample of your market, collect answers to open-ended questions, and then invest again in time spent tagging each of the ~3,000 opinions you collected for keywords and themes. If you spend one minute tagging each opinion, that’s 50 billable hours ($7,500 @ $150/hr) spent on tagging alone. You spent all of that time tagging text and what you got back was… the number of times you tagged your text… the depth ends there.
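If you want to plug in your own numbers, the tagging math above fits in a few lines. This is just a sketch of that arithmetic; the function name `tagging_cost` is made up for illustration, and the inputs are the figures quoted in the paragraph.

```python
# Hypothetical helper; the figures passed in below are the ones quoted in the text.
def tagging_cost(opinions: int, minutes_per_opinion: float, hourly_rate: float) -> tuple[float, float]:
    """Return (billable hours, cost in dollars) for manually tagging opinions."""
    hours = opinions * minutes_per_opinion / 60
    return hours, hours * hourly_rate


hours, cost = tagging_cost(opinions=3000, minutes_per_opinion=1, hourly_rate=150)
print(f"{hours:.0f} hours, ${cost:,.0f}")  # 50 hours, $7,500
```

Double the time per opinion or the size of the sample and the bill doubles with it, while the output stays a list of counts.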

Survey text machine learning is limited.

Now, you might say, ‘but what about natural language processing (NLP) and machine learning (ML) approaches to analyzing text for sentiment?’ The problem with survey opinion data is that it’s flat text, with no metadata. Because you’re feeding the machine learning algorithm short-length, flat text with no additional descriptors, there’s just not enough training data for the ML algorithm to yield useful or relevant insights.

If you’ve ever made use of these machine learning text analysis tools for surveys, you’ve probably already been disappointed by the obviousness of the conclusions they draw. You’ve likely also had to manually adjust outputs, because your human perspective is more relevant to the problem you’re solving. That human perspective is what makes networked surveys so powerful. But, before we get into what they are and why they’re useful, let’s travel back in time to simpler days.

Remember when search engines sucked?

If you’re ancient enough to remember Lycos, Ask Jeeves, and Yahoo! search, you remember that they were pretty basic. Put a keyword in the box, and if the keyword appeared most often in the text of a result, that result displayed highest for your search. That was it – flat text-level thinking. This led SEOs of yesteryear to stuff keywords into content in hopes that it would rank higher, and it worked… This made for a frustrating search experience and was the de facto standard until Google entered the market with a simple and obvious discovery: links.

They realized that text alone wasn’t enough to determine that a result was relevant to your search. There was a trove of additional non-text human behavioral signals in the form of links between web content that formed a massive network. This network dramatically improved the quality of search results and set a new bar for what search engines needed to be.

Networked surveys go beyond text-level thinking.

Ok, back to 2018. Traditional survey text analysis is a lot like Lycos right now. You tag your text, and those tags that show up most often get ranked higher in your report (or worse, your word cloud, but don’t get me started there). That’s where your insight really ends – frequency – making your text analysis output about as high-quality as a circa-‘95 search result.
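To make the point concrete, here is roughly what frequency-only reporting boils down to. It’s a minimal sketch with made-up responses and tags (the `tagged_responses` data and the `tag_frequencies` name are assumptions for illustration, not output from any real survey tool).

```python
from collections import Counter

# Hypothetical tagged responses: each open-ended answer plus the tags an analyst assigned.
tagged_responses = [
    {"text": "More shade by the pool please", "tags": ["pool", "canopies"]},
    {"text": "The pool closes too early", "tags": ["pool", "hours"]},
    {"text": "Add fitness stations", "tags": ["fitness stations"]},
]


def tag_frequencies(responses):
    """Count how often each tag was applied across all responses."""
    return Counter(tag for r in responses for tag in r["tags"])


# The whole "analysis" is a list of tags sorted by how often they were applied.
report = tag_frequencies(tagged_responses).most_common()
print(report)
```

Nothing in that report says who agrees with whom, or how strongly. It only says which tag was typed most.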

Now let’s talk about networked surveys. Networked surveys work exactly the same way as a traditional survey in terms of sampling and survey distribution. You log into the software, create a survey, get a link, and share it with your panel. They integrate with panel recruiters like Research Now/SSI and with panel marketplaces. They support Google Analytics URL tracking parameters for digital survey recruitment attribution. They support demographic exclusion rules to filter out bad sample fits and save you budget. All of the basics are covered, including basic survey question types (like Likert scales, multiple choice single-answer, multiple choice multiple-answer, open-ended, etc.). They also support a new question type: the networked question.

Networked questions spin up miniature disposable social networks inside of your survey. Sounds badass, but what does it mean? The simplest version of a networked question takes an open-ended text response and lets other respondents react to it along a rating scale (from positive to negative, agree to disagree, not interesting to interesting, etc.).

Example of a networked survey in action:

These non-text human rating signals between opinions and respondents create a large network of opinions (the opinion network) and have a huge multiplying effect on text data. Now, in addition to each respondent’s open-ended text, you get on average an extra 27 open-ended data points, each paired with that respondent’s reaction to it on a scale.

Here’s an example of an opinion network (nodes are opinions, edges are shared respondents).
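For the technically curious, the structure in that caption (nodes are opinions, edges are shared respondents) can be sketched from raw reaction data. The sample `reactions`, the -2 to +2 scale, and the `opinion_network` function are all assumptions for illustration. This is not the product’s actual data model.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical reactions: (respondent, opinion_id, score on a -2..+2 agreement scale).
reactions = [
    ("ana", "o1", 2), ("ana", "o2", 1),
    ("ben", "o1", 1), ("ben", "o2", -2), ("ben", "o3", 2),
    ("cy", "o2", 2), ("cy", "o3", 1),
]


def opinion_network(reactions):
    """Map each pair of opinions to the set of respondents who reacted to both."""
    rated_by = defaultdict(set)
    for respondent, opinion, _score in reactions:
        rated_by[opinion].add(respondent)

    edges = {}
    for a, b in combinations(sorted(rated_by), 2):
        shared = rated_by[a] & rated_by[b]
        if shared:  # only connect opinions that share at least one respondent
            edges[(a, b)] = shared
    return edges


edges = opinion_network(reactions)
for pair, shared in edges.items():
    print(pair, sorted(shared))
```

This sketch ignores the scores and only records who touched what. Weighting each edge by how similarly the shared respondents scored both opinions would be a natural next step.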

That many more qualitative data points per respondent means a significantly increased depth of knowledge when multiplied across your sample and a much bigger return on your survey investment. But the increased quantity of qualitative data isn’t the only advantage networked surveys give you. The extra data enables much richer discovery of patterns that only occur in the network of opinions that forms (e.g. Craig likes pizza, but not spicy pizza, but spicy tacos are yum).

Here’s a comparison between traditional and networked (you’ll want to scroll over it)


Networked surveys let you cluster your tags together to form respondent segments.

Now here’s something cool you can do with all of your tags. Because networked surveys track respondent scores for each opinion, and those scores can be aggregated by tags, we can ask questions like “which tags tend to be scored similarly by the same people?” If tags have high score similarity and respondent overlap, we group them together into segments and find useful patterns that are not text-dependent. For example, what do you think pool goers care about in parks? Take a minute.

Ok, time’s up. Were canopies on the list? It might seem pretty obvious once you hear it, and it makes sense, but it wasn’t top of mind for me, at least. Did you know people who care about fitness stations also care about the same things? Not to mention other seemingly unrelated things like “invasive control.”

Here’s an example of a single respondent segment (“the pool goer”). The bars represent levels of agreement with each tag/factor.
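Here is one way the “which tags tend to be scored similarly by the same people?” question from this section could be answered in code. Treat it as a sketch under stated assumptions: the sample scores, the 1 to 5 scale, the Jaccard overlap measure, the mean-gap similarity, and both thresholds are choices made for illustration, not the method the product actually uses.

```python
from itertools import combinations

# Hypothetical data: tag -> {respondent: mean score that respondent gave to
# opinions carrying this tag}, on an assumed 1..5 agreement scale.
tag_scores = {
    "pool":             {"ana": 5, "ben": 4, "cy": 5},
    "canopies":         {"ana": 5, "ben": 4, "cy": 4},
    "fitness stations": {"ana": 4, "ben": 5, "cy": 5},
    "dog park":         {"dee": 5, "eli": 4},
}
SCALE_RANGE = 4  # 5 - 1


def overlap(a, b):
    """Jaccard overlap of the respondents who scored each tag."""
    return len(a.keys() & b.keys()) / len(a.keys() | b.keys())


def similarity(a, b):
    """1 minus the mean absolute score gap over shared respondents, scaled to 0..1."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    gap = sum(abs(a[r] - b[r]) for r in shared) / len(shared)
    return 1 - gap / SCALE_RANGE


def segments(tag_scores, min_overlap=0.5, min_similarity=0.8):
    """Group tags into connected components of 'scored alike by the same people'."""
    parent = {t: t for t in tag_scores}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    for a, b in combinations(tag_scores, 2):
        if (overlap(tag_scores[a], tag_scores[b]) >= min_overlap
                and similarity(tag_scores[a], tag_scores[b]) >= min_similarity):
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for t in tag_scores:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(g) for g in groups.values())


print(segments(tag_scores))
# [['canopies', 'fitness stations', 'pool'], ['dog park']]
```

Notice that none of the grouping depends on the words in the tags. “Canopies” lands next to “pool” because the same people scored them alike, which is the non-text signal this section is about.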

Networked surveys also measure complex behavior like persuasion.

More advanced versions of networked surveys can even measure persuasion. Take the Net Promoter Score for example. If you ask an NPS scale question, followed by an open-ended question, networked surveys make it possible to segment customer loyalty feedback along an extra dimension that traditional surveys can’t access. You get to find out which opinions were exclusively those of your promoters (your promoter tribe), which were written by your promoters and got your detractors to agree with them (positive persuasive), which opinions were exclusively those of your detractors (your staunch opposition), and which were written by your detractors and got your promoters to agree with them (negative persuasive/leaks in your dam). This means you can tell which opinions persuade your market to (and not to) buy your product, recommend your brand, vote for your candidate, use your service, engage in your workplace, donate to your cause… etc.
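The four buckets above can be expressed as a small classifier. This is a sketch, not the product’s logic: it uses the standard NPS cut-offs (9-10 promoter, 7-8 passive, 0-6 detractor), and it assumes a single agreeing respondent from the opposite camp is enough to count as “persuasive.” A real analysis would likely use a threshold. The function names are made up for illustration.

```python
def nps_group(score: int) -> str:
    """Standard NPS buckets: 9-10 promoter, 7-8 passive, 0-6 detractor."""
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def classify_opinion(author_nps: int, agreeing_nps: list[int]) -> str:
    """Label an opinion by who wrote it and whether the opposite camp agreed with it.

    agreeing_nps holds the NPS scores of respondents who agreed with the opinion.
    Assumption: one agreeing respondent from the opposite camp counts as crossover.
    """
    author = nps_group(author_nps)
    if author == "passive":
        return "unclassified"  # the four buckets only cover promoters and detractors
    opposite = "detractor" if author == "promoter" else "promoter"
    crossed_over = any(nps_group(s) == opposite for s in agreeing_nps)
    if author == "promoter":
        return "positive persuasive" if crossed_over else "promoter tribe"
    return "negative persuasive" if crossed_over else "staunch opposition"


print(classify_opinion(10, [9, 10]))  # promoter tribe
print(classify_opinion(9, [10, 3]))   # positive persuasive
print(classify_opinion(2, [1, 4]))    # staunch opposition
print(classify_opinion(4, [9]))       # negative persuasive
```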

Here’s an example of how networked surveys can be used to increase donations.

Networked surveys are actually pretty straightforward to learn and run with, but they’re not for people who don’t like change. If you’re cool with word clouds and basic validation of your assumptions, there’s nothing wrong with a traditional survey. But if you’re an innovator angling for a way to get past word clouds, you have little to lose and a lot (27x) to gain.

This has been fun! If you’re planning research for yourself or a client and want a trial, contact us below and we’ll help you figure out the right setup.

~ Alan


AI and the Opinion Network

Our Vision for Artificial Intelligence

Artificial Intelligence is automated enlightenment. It has the power to solve hard problems that are normally handled by people, because it’s trained by people. It is a mirror onto ourselves and, as a result, an incredible catalyst for human innovation. For that reason, we have decided to double down on our vision:

To train social-facing Artificial Intelligence systems with a deeper qualitative understanding of people, using our Networked Survey™.

Individuals have opinions on a near infinite number of topics. Groups of people do too, and there is significant overlap. We can imagine connecting people based on the opinions they have in common. If we transposed our thinking, we would find relationships between opinions based on the people that align with them. This network of connected opinions is what we call the “Opinion Network.” The Opinion Network is ancient, but it is fairly new terrain for research. It needed a new research methodology, so we developed a solution, the Networked Survey. Networked Surveys allow us to sample and map the Opinion Network from topic to topic.

Because the Opinion Network is pervasive across verticals, segments, and cultures, it fits nicely as a general source of training data for machine learning applications. With this in mind, our AI strategy is unique in that it is not centered on the end goal of building systems that autonomously make decisions.

Predictions made by machine learning algorithms are only as good as the training data informing them. For this reason, our AI strategy is to support its development across an unlimited field of qualitative applications using our Networked Survey technology.

We are already making headway proving this strategy out with two major brands, each developing feature selection and data collection strategies using our Networked Survey technology. Active applications range from predictive modeling of consumer behavior to qualitative team alignment inputs for a global brand.

We are also making strides in developing a normative set of training data on topics relevant to our user base, which is made up of strategists, researchers, and marketers at global brands, political organizations, and membership organizations. Agreeable will iterate on these topical areas in a longitudinal study that maps small mutations to the Opinion Network over time. We plan on launching this initiative in Q2 2018 and making results available quarterly to our customers.

We are truly excited to be a part of the growing field of Artificial Intelligence.


~ Alan

Alan Garcia
Founder & CEO, Agreeable Research


The Opinion Network™

The thing about the Opinion Network is that it’s THE thing. It’s ancient, ever-changing, and we’ve only begun to scratch its surface.

It sounds intangible, but the truth is we are all physically connected to it. When you share an opinion with me, you wirelessly transmit a signal over the air. I hear it, interpret it, and accept or reject it, and that creates a bond between us. When you imagine the overlap between the opinions we share, you discover that we are engulfed by a massive ocean of opinions that we constantly update at local and global levels. It is the SOURCE of all social insight.

Existing research methods get close to taking a snapshot of it, but the images we get back are low-resolution and distorted. Now a new approach is taking center stage, and its applications are endless.