March 4, 2013

Total Survey Error: nanny to some, wise elder for some, strange parental friend for others

Total Survey Error and I are long-time acquaintances, just getting to know each other better. Looking at TSE is, for me, like looking at my work in survey research through a distorted mirror to an alternate universe. This week, I’ve spent some time closely reading Groves’ Past, Present and Future of Total Survey Error, and it provided some historical context to the framework, as well as an experienced account of its strengths and weaknesses.

Errors are an important area of study across many fields. Historically, models about error assumed that people didn’t really make errors often. Those attitudes are alive and well in many fields and workplaces today. Instead of carefully considering errors, they are often dismissed as indicators of incompetence. However, some workplaces are changing the way they approach errors. I did some collaborative research on medical errors in 2012 and was introduced to the term HRO or High-Reliability Organization. This is an error focused model of management that assumes that errors will be made, and not all errors can be anticipated. Therefore, every error should be embraced as a learning opportunity to build a better organizational framework.

From time to time, various members of our working group have been driven to create checklists for particular aspects of our work. In my experience, the checklists are very helpful for work that we do infrequently and virtually useless for work that we do daily. Writing a checklist for your daily work is a bit like writing instructions on how you brush your teeth and expecting to keep those instructions updated whenever you make a change of sorts. Undoubtedly, you’ll reread the instructions and wonder when you switched from a vertical to a circular motion for a given tooth. And yet there are so many important elements to our work, and so many areas where people could make less than ideal decisions (small or large). From this need rose Deming, with the first survey quality checklist. After Deming, a few other models arose. Eventually, TSE became the cumulative working framework or foundational framework for the field of survey research.

In my last blog, I spoke about the strangeness of coming across a foundational framework after working in the field without one. The framework is a conceptually important one, separating out sources of errors in ways that make shortcomings and strengths apparent and clarifying what is more or less known about a project.

But in practice, this model has not become the applied working model that its founders and biggest proponents expected it to be. This is for two reasons (that I’ll focus on), one of which Groves mentioned in some detail in this paper and one of which he barely touched on (but likely drove him out of the field).

1. The framework has mathematical properties, and this has led to its more intensive use on aspects of the survey process that are traditionally quantitative. TSE research in areas of sampling, coverage, response and aspects of analysis is quite common, but TSE research in other areas is much less common. In fact, many of the less quantifiable parts of the survey process are almost dismissed in favor of the more quantifiable parts. A survey with a particularly low TSE value could have huge underlying problems or be of minimal use once complete.
2. The framework doesn’t explicitly consider the human factors that govern research behind the scenes. Groves mentioned that the end users of the data are not deeply considered in the model, but neither are the other financial and personal (and personafinancial) constraints that govern much decision making. Ideally, the end goal of research is high quality research that yields a useful and relevant response for as little cost as possible. In practice, however, the goal is both to keep costs low and to satisfy a system of interrelated (and often conflicting) personal or professional (personaprofessional?) interests. If the most influential of these interests are not particularly interested in (or appreciative of) the model, practitioners are highly unlikely to take the time to apply it.
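
The mathematical properties mentioned in point 1 are usually summarized as mean squared error. A minimal sketch follows. The notation is my own shorthand and not taken from Groves’ paper, and the additive split of the bias into coverage, nonresponse and measurement pieces is a simplifying assumption that ignores interactions between error sources.

```latex
\[
\mathrm{MSE}(\hat{y}) = \mathrm{E}\big[(\hat{y} - Y)^2\big]
                      = \mathrm{Var}(\hat{y}) + \mathrm{Bias}(\hat{y})^2
\]
\[
\mathrm{Bias}(\hat{y}) \approx B_{\text{coverage}} + B_{\text{nonresponse}} + B_{\text{measurement}}
\]
```

Nothing in either line captures whether the questions were worth asking or whether the results are usable, which is how a low value can coexist with the problems described in point 1.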

Survey research requires very close attention to detail in order to minimize errors. It requires an intimate working knowledge of math and of computer programming. It also benefits from a knowledge of human behavior and the research environment. If I were to recommend any changes to the TSE model, I would recommend a bit more task based detail, to incorporate more of the highly valued working knowledge that is often inherent and unspoken in the training of new researchers. I would also recommend more of an HRO orientation toward error, anticipating and embracing unexpected errors as a source of additions to the model. And I would recommend some deeper incorporation of the personal and financial constraints and the roles they play (clearly an easier change to introduce than to flesh out in any great detail!). I would recommend a shift of focus, away from the quantitative modeling aspects and to the overall applicability and importance of a detailed, applied working model.

I’ve suggested before that survey research does not have a strong enough public face for the general public to understand or deeply value our work. A model that is better embraced by the field could form the basis for a public face, but the model would have to appeal to practitioners on a practical level. The question is: how do you get members of a well established field who have long been working within it and gaining expertise to accept a framework that grew into a foundational piece independent of their work?

Posted in commentary, research, research methodology, Strategic Communications, survey methodology
Tagged AAPOR, commentary, data analysis, design, quantitative, research, research methodology

November 7, 2012

What do all of these polling strategies add up to?

Yesterday was a big first for research methodologists across many disciplines. For some of the newer methods, it was the first election that they could be applied to in real time. For some of the older methods, this election was the first to bring competing methodologies, and not just methodological critiques.

Real time sentiment analysis from sites like this summarized Twitter’s take on the election. This paper sought to predict electoral turnout using google searches. InsideFacebook attempted to use Facebook data to track voting. And those are just a few of a rapid proliferation of data sources, analytic strategies and visualizations.

One could ask, who are the winners? Some (including me) were quick to declare a victory for the well honed craft of traditional pollsters, who showed that they were able to repeat their studies with little noise, and that their results were predictive of a wider real world phenomenon. Some could call a victory for the emerging field of Data Science. Obama’s Chief Data Scientist is already beginning to be recognized. Comparisons of analytic strategies will spring up all over the place in the coming weeks. The election provided a rare opportunity where so many strategies and so many people were working in one topical area. The comparisons will tell us a lot about where we are in the data horse race.

In fact, most of these methods were successful predictors in spite of their complicated underpinnings. The google searches took into account searches for variations of “vote,” which worked as a kind of reliable predictor but belied the complicated web of naturalistic search terms (which I alluded to in an earlier post about the natural development of hashtags, as explained by Rami Khater of Al Jazeera’s The Stream, a social network generated newscast). I was a real-world example of this methodological complication. Before I went to vote, I googled “sample ballot.” Similar intent, but I wouldn’t have been caught in the analyst’s net.
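
To make the “analyst’s net” concrete, here is a minimal sketch of a keyword filter for variations of “vote.” The pattern, the function name and the sample queries are all hypothetical illustrations of mine, not the method used in the paper mentioned above.

```python
import re

# Hypothetical net: matches "vote", "votes", "voted", "voter(s)" and "voting".
VOTE_PATTERN = re.compile(r"\bvot(e|es|ed|er|ers|ing)\b", re.IGNORECASE)


def counts_as_vote_search(query: str) -> bool:
    """Return True if the query contains a variation of 'vote'."""
    return VOTE_PATTERN.search(query) is not None


# Made-up queries, all with roughly the same intent.
queries = [
    "where do I vote",
    "voting hours",
    "sample ballot",
    "polling place near me",
]

caught = [q for q in queries if counts_as_vote_search(q)]
missed = [q for q in queries if not counts_as_vote_search(q)]

print("caught:", caught)
print("missed:", missed)
```

My “sample ballot” search lands in the missed list, along with anything else that expresses the same intent in different words.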

If you look deeper at the Sentiment Analysis tools that allow you to view the specific tweets that comprise their categorizations, you will quickly see that, although the overall trends were in fact predictive of the election results, the data coding was messy, because language is messy.

And the victorious predictive ability of traditional polling methods belies the complicated nature of interviewing as a data collection technique. Survey methodologists work hard to standardize research interviews in order to maximize the reliability of the interviews. Sometimes these interviews are standardized to the point of recording. Sometimes the interviews are so scripted that interviewers are not allowed to clarify questions, only to repeat them. Critiques of this kind of standardization are common in survey methodology, most notably from Nora Cate Schaeffer, who has raised many important considerations within the survey methodology community while still strongly supporting the importance of interviewing as a methodological tool. My reading assignment for my ethnography class this week is a chapter by Charles Briggs from 1986 (Briggs – Learning how to ask) that proves that many of the new methodological critiques are in fact old methodological critiques. But the critiques are rarely heeded, because they are difficult to apply.

I am currently working on a project that demonstrates some of the problems with standardizing interviews. I am revising a script we used to call a representative sample of U.S. high schools. The script was last used four years ago in a highly successful effort that led to an admirable 98% response rate. But to my surprise, when I went to pull up the old script I found instead a system of scripts. What was an online and phone survey had spawned fax and e-mail versions. What was intended to be a survey of principals now had a set of potential respondents from the schools, each with their own strengths and weaknesses. Answers to common questions from school staff were loosely scripted on an addendum to the original script. A set of tips for phonecallers included points such as “make sure to catch the name of the person who transfers you, so that you can specifically say that Ms X from the office suggested I talk to you” and “If you get transferred to the teacher, make sure you are not talking to the whole class over the loudspeaker.”

Heidi Hamilton, chair of the Georgetown Linguistics department, often refers to conversation as “climbing a tree that climbs back.” In fact, we often talk about meaning as mutually constituted between all of the participants in a conversation. The conversation itself cannot be taken outside of the context in which it lives. The many documents I found from the phonecallers show just how relevant these observations can be in an applied research environment.

The big question that arises from all of this is one of a practical strategy. In particular, I had to figure out how to best address the interview campaign that we had actually run when preparing to rerun the campaign we had intended to run. My solution was to integrate the feedback from the phonecallers and loosen up the script. But I suspect that this tactic will work differently with different phonecallers. I’ve certainly worked with a variety of phonecallers, from those that preferred a script to those that preferred to talk off the cuff. Which makes the best phonecaller? Neither. Both. The ideal phonecaller works with the situation that is presented to them nimbly and professionally while collecting complete and relevant data from the most reliable source. As much of the time as possible.

At this point, I’ve come pretty far afield of my original point, which is that all of these competing predictive strategies have complicated underpinnings.

And what of that?

I believe that the best research is conscious of its strengths and weaknesses and not afraid to work with other strategies in order to generate the most comprehensive picture. As we see comparisons and horse races develop between analytic strategies, I think the best analyses we’ll see will be the ones that fit the results of each of the strategies together, simultaneously developing a fuller breakdown of the election and a fuller picture of our new research environment.

Posted in big data, commentary, data, ethnography of communication, event, outside story, research, research methodology, social media, sociolinguistics, survey methodology
Tagged commentary, data, data analysis, design, discourse analysis, facebook, linguistics, linguistics in action, Natural Language Processing, qualitative, quantitative, research methodology, sentiment analysis, social media analysis, text analysis, text analytics, twitter

October 1, 2012

Notes on the Past, Present and Future of Survey Methodology from #dcaapor

I had wanted to write these notes up into paragraphs, but I think the notes will be more timely, relevant and readable if I share them as they are. This was a really great conference- very relevant and timely- based on a really great issue of Public Opinion Quarterly. As I was reminded at the DC African Festival (a great festival, lots of fun, highly recommended) on Saturday, “In order to understand the future you must embrace the past.”

DC AAPOR Annual Public Opinion Quarterly Special Issue Conference

75th Anniversary Edition

The Past, Present and Future of Survey Methodology and Public Opinion Research

Look out for slides from the event here: http://www.dc-aapor.org/pastevents.php

Note: Of course, I took more notes in some sessions than others…

Peter Miller:

– Adaptive design- tracking changes in estimates across mailing waves and tracking response bias, is becoming standard practice at Census

– Check out Howard Schuman’s article tracking attitudes toward Christopher Columbus

Ended up doing some field research in the public library, reading children’s books

Stanley Presser:

– Findings have no meaning independent of the method with which they were collected

– Balance of substance and method make POQ unique (this was a repeated theme)

Robert Groves:

– The survey was the most important invention in Social Science in the 20th century – quote credit?

– 3 eras of survey research (boundaries somewhat arbitrary)

1930-1960
- Foundation laid, practical development
1960-1990
- Founders pass on their survey endeavors to their protégés
- From face to face to phone and computer methods
- Emergence & Dominance of Dillman method
- Growth of methodological research
- Total Survey Error perspective dominates
- Big increase in federal surveys
- Expansion of survey centers & private sector organizations
- Some articles say survey method dying because of nonresponse and inflating costs. This is a perennial debate. Groves speculated that around every big election time, someone finds it in their interest to doubt the polls and assigns a jr reporter to write a piece calling the polls into question.
1990→
- Influence of other fields, such as social cognitive psychology
- Nonresponse up, costs up → volunteer panels
- Mobile phones decrease cost effectiveness of phone surveys
- Rise of internet only survey groups
- Increase in surveys
- Organizational/ business/ management skills more influential than science/ scientists
- Now: software platforms, culture clash with all sides saying “Who are these people? Why do they talk so funny? Why don’t they know what we know?”
- Future
  - Rise of organic data
  - Use of administrative data
  - Combining data sets
  - Proprietary data sets
  - Multi-mode
  - More statistical gymnastics

Mike Brick:

Society’s demand for information is Insatiable
Re: Heckathorn/ Respondent Driven samples
- Adaptive/ indirect sampling is better
- Model based methods
  - Missing data problem
  - Cost the main driver now
  - Estimation methods
  - Future
    - Rise of multi-frame surveys
    - Administrative records
    - Sampling theory w/nonsampling errors at design & data collection stages
      - Sample allocation
      - Responsive & adaptive design
      - Undercoverage bias can’t be fixed at the back end
        
        *Biggest problem we face*
        
        Worse than nonresponse
        
        Doug Rivers (2007)
        
        Math sampling
        
        Web & volunteer samples
        
        1st shot at a theory of nonprobability sampling
        
        Quota sampling failed in 2 high profile examples
        
        Problem: sample from interviews/ biased
        
        But that’s FIXABLE
        
        Observational
        
        Case control & eval studies
        
        Focus on single treatment effect
        
        “tougher to measure everything than to measure one thing”

Mick Couper:

– Mode an outdated concept

Too much variety and complexity
Modes are multidimensional
- Degree of interviewer involvement
- Degree of contact
- Channels of communication
- Level of privacy
- Technology (used by whom?)
- Synchronous vs. asynchronous
More important to look at dimensions other than mode
Mode is an attribute of a respondent or item
Basic assumption of mixed mode is that there is no difference in responses by mode, but this is NOT true
- We know of many documented, nonignorable, nonexplainable mode differences
- Not “the emperor has no clothes” but “the emperor is wearing suggestive clothes”
- Dilemma: differences not well understood
  - Sometimes theory comes after facts
  - That’s where we are now- waiting for the theory to catch up (like where we are on nonprobability sampling)

– So, the case for mixed mode collection so far is mixed

Mail w/web option has been shown to have a lower response rate than mail only across 24-26 studies, at least!!
- (including Dillman, JPSM, …)
- Why? What can we do to fix this?
- Sequential modes?
  - Evidence is really mixed
  - The impetus for this is more cost than response rate
  - No evidence that it brings in a better mix of people

– What about Organic data?

Cheap, easily available
But good?
Disadvantages:
- One var at a time
- No covariates
- Stability of estimates over time?
- Potential for mischief
  - E.g. open or call-in polls
  - My e.g. #muslimrage
Organic data wide, thin
Survey data narrow, deep

– Face to face

Benchmark, gold standard, increasingly rare

– Interviewers

Especially helpful in some cases
- Nonobservation
- Explaining, clarifying

– Future

Technical changes will drive dev’t
Modes and combinations of modes will proliferate
Selection bias The Biggest Threat
Further proliferation of surveys
- Difficult for us to distinguish our work from “any idiot out there doing them”

– Surveys are tools for democracy

Shouldn’t be restricted to tools for the elite
BUT
There have to be some minimum standards

– “Surveys are tools and methodologists are the toolmakers”

Nora Cate Schaeffer:

– Jen Dykema read & summarized 78 design papers- her summary is available in the appendix of the paper

– Dynamic interactive displays for respondent in order to help collect complex data

– Making decisions when writing questions

See flow chart in paper
- Some decisions are nested
Question characteristics
- E.g. presence or absence of a feature
  - E.g. response choices

Sunshine Hillygus:

– Political polling is “a bit of a bar trick”

The best value in polls is in understanding why the election went the way it did

– Final note: “The things we know as a field are going to be important going forward, even if it’s not in the way they’ve been used in the past”

Lori Young and Diana Mutz:

– Biggest issues:

Diversity
Selective exposure
Interpersonal communication

– 2 kinds of search, influence of each

Collaborative filter matching, like Amazon
- Political targeting
- Contentious issue: 80% of people said that if they knew a politician was targeting them they wouldn’t vote for that candidate
  - My note: interesting to think about people’s relationships with their superficial categories of identity- it’s taken for granted so much in social science research, yet not by the people within the categories

– Search engines: the new gatekeepers

Page rank & other algorithms
No one knows what influence personalization of search results will have
Study on search learning: gave systematically different input to train engines (given the same start point); results changed fast and substantively

Rob Santos:

– Necessity mother of invention

Economic pressure
Reduce costs
Entrepreneurial spirit
Profit
Societal changes
- Demographic diversification
  - Globalization
  - Multi-lingual
  - Multi-cultural
  - Privacy concerns
  - Declining participation

– Bottom line: we adapt. Our industry Always Evolves

– We’re “in the midst of a renaissance, reinventing ourselves”

Me: That’s framing for you! Wow!

– On the rise:

Big Data
Synthetic Data
- Transportation industry
- Census
- Simulation studies
  - E.g. How many people would pay x amount of income tax under y policy?
Bayesian Methods
- Apply to probability and nonprobability samples
New generation
- Accustomed to and EXPECT rapid technological turnover
- Fully enmeshed in social media

– 3 big changes:

Non-probability sampling
- “Train already left the station”
- Level of sophistication varies
- Model based inference
- Wide public acceptance
- Already a proliferation
Communication technology
- Passive data collection
  - Behaviors
    - E.g. pos (point of service) apps
    - Attitudes or opinions
  - Real time collection
    - Prompted recall (apps)
    - Burden reduction
      - Gamification
Big Data
- What is it?
- Data too big to store
  - (me: think “firehoses”)
  - Volume, velocity, variety
  - Fuzzy inferences
  - Not necessarily statistical
  - Coarse insights

– We need to ask tough questions

(theme of next AAPOR conference is just that)
We need to question probability samples, too
- Flawed designs abound
- High nonresponse & noncoverage
- Can’t just scrutinize nonprobability samples
Nonprobability designs
- Some good, well accepted methods
- Diagnostics for measurement
  - How to measure validity?
  - What are the clues?
  - How to create a research agenda to establish validity?
Expanding the players
- Multidisciplinary
  - Substantive scientists
  - Math stats
  - Modelers
  - Econometricians
We need
- Conversations with practitioners
- Better listening skills

– AAPOR’s role

Create forum for conversation
Encourage transparency
Engage in outreach
Understanding limitations but learning approaches

– We need to explore the utility of nonprobability samples

– Insight doesn’t have to be purely from statistical inferences

– The biggest players in big data to date include:

Computational scientists
Modelers/ synthetic data’ers

– We are not a “one size fits all” society, and our research tools should reflect that

My big questions:

– “What are the borders of our field?”

– “What makes us who we are, if we don’t do surveys even primarily?”

Linguistic notes:

– Use of we/who/us

– Metaphors: “harvest” “firehose”

– Use of specialized vocabulary

– Use of the word “comfortable”

– Interview as a service encounter?

Other notes:

– This reminds me of Colm O’Muircheartaigh- from that old JPSM distinguished lecture

Embracing diversity
Allowing noise
Encouraging mixed methods

I wish his voice was a part of this discussion…

Posted in commentary, data, event, research, research methodology, skills, survey methodology
Tagged AAPOR, data, data analysis, design, quantitative, research methodology

July 8, 2012

Do you ever think about interfaces? Because I do. All the time.

Did you ever see the movie Singles? It came out in the early 90s, shortly before the alternative scene really blew up and I dyed [part of] my hair blue and thought seriously about piercings. Singles was a part of the growth of the alternative movement. In the movie, there is a moment when one character says to another “Do you ever think about traffic? Because I do. All the time.” I spent quite a bit of time obsessing over that line, about what it meant, and, more deeply, what it signaled.

I still think about that line. As I drove toward the turnoff to my mom’s street during our 4th of July vacation, I saw what looked like the turn lane for her street, but it was actually an intersection-less left-turning split immediately preceding the real left turn lane for her street. It threw me off every time, and I kept remembering that romantic moment in Singles when the two characters were getting to know each other’s quirks, and the man was talking about traffic. And it was okay, even cool, to be quirky and think or talk about traffic, even during a romantic moment.

I don’t think about traffic often. But I am no less quirky. Lately, I tend to think about interfaces. Before my first brush with NLP (Natural Language Processing), I thought quite a bit about alternatives to e-mail. Since I discovered the world of text analytics, I have been thinking quite a bit about ways to integrate the knowledge across different fields about methods for text analysis and the needs of quantitative and qualitative researchers. I want to think outside of the sentiment box, because I believe that sentiment analysis does not fully address the underlying richness of textual data. I want to find a way to give researchers what they need, not what they think they want. Recently, my thinking on this topic has flipped. Instead of thinking from the data end, or the analytic possibilities end, or about what programs already exist and what they do, I have started to think about interfaces. This feels like a real epiphany. Once we think about the problem from an interface, or user experience perspective, we can better utilize existing technology and harness user expectations.

Have you read the new Imagine book about how creativity works? I believe that this strategy is the natural step after spending time zoning out on the web, thinking, or not thinking, about research. The more time you cruise, the better feel you develop for what works and what doesn’t, the more you learn what to expect. Interfaces are simply the masks we put on datasets of all sorts. The data could be the world wide web as a whole, results from a site or time period, a database of merchandise, or even a set of open ended survey responses. The goal is to streamline the searching interface and then make it available for use on any number of datasets. We use NLP every day when we search the internet, or shop. We understand it intuitively. Why don’t we extend that understanding to text analysis?

I find myself thinking about what this interface should look like and what I want this program to do.

Not traffic, not as romantic. But still quirky and all-encompassing.

Posted in commentary, data, research, research methodology, social media
Tagged commentary, data, data analysis, design, discourse analysis, Natural Language Processing, qualitative, research methodology, sentiment analysis, social media analysis, text analysis, text analytics

June 12, 2012

Question Writing is an Art

As a survey researcher, I like to participate in surveys with enough regularity to keep current on any trends in methodology. As a web designer, an aspect of successful design is a seamlessness with the visitor’s expectations. So if the survey design realm has moved toward submit buttons on the upper right hand corner of individual pages, your idea (no matter how clever) to put a submit button on the upper left can result in a disconnect on the part of the user that will affect their behavior on the page. In fact, the survey design world has evolved quite a bit in the last few years, and it is easy to design something that reflects poorly on the quality of your research endeavor. But these design concerns are less of an issue than they have been, because most researchers are using templates.

Yet there is still value in keeping current.

And sometimes we encounter questions that lend themselves to an explanation of the importance of question writing. These questions are a gift for a field that is so difficult to describe in terms of knowledge and skills!

Here is a question I encountered today (I won’t reveal the source):

How often do you purchase potato chips when you eat out at any quick service and fast food restaurants?

2x a week or more
1x a week
1x every 2-3 weeks
1x a month
1x every 2-3 months
Less than 1x every 3 months
Never

This is a prime example of a double barreled question, and it is also an especially difficult question to answer. In my case, I rarely eat at quick service restaurants, especially sandwich places, like this one, that offer potato chips. When I do eat at them, I am tempted to order chips. About half the time I will give in to the temptation with a bag of SunChips, which I’m pretty sure are not made of potato.

In bigger firms that have more time to work through, this information would come out in the process of a cognitive interview or think aloud during the pretesting phase. Many firms, however, have staunchly resisted these important steps in the surveying process, because of their time and expense. It is important to note that the time and expense involved with trying to make usable answers out of poorly written questions can be immense.

I have spent some time thinking about alternatives to cognitive testing, because I have some close experience with places that do not use this method. I suspect that this is a good place for text analytics, because of the power of reaching people quickly and potentially cheaply (depending on your embedded TA processes). Although oftentimes we are nervous about web analytics because of their representativeness, the bar for representativeness is significantly lower in the pretesting stage than in the analysis phase.

But, no matter what pretesting model you choose, it is important to look closely at the questions that you are asking. Are you asking a single question, or would these questions be better separated out into a series?

How often do you eat at quick service sandwich restaurants?

When you eat at quick service restaurants, do you order [potato] chips?

What kind of [potato] chips do you order?

The lesson of all of this is that question writing is important, and the questions we write in surveys will determine the kind of survey responses we receive and the usability of our answers.

Posted in commentary, data, research, research methodology, skills, social media, survey methodology
Tagged design, Natural Language Processing, qualitative, quantitative, research methodology, social media analysis, text analysis, text analytics

February 8, 2012

The 11 Best Art and Design Books of 2011

http://www.brainpickings.org/index.php/2011/11/28/best-art-design-books-2011/

Free Range Research

An aspiring postdisciplinarian surfs through the ebbs and flows of the changing research environment
