Can they just get along? Situated Cognition and Survey Response

Finally, I’m going to take a moment to talk about Norbert Schwarz’s JPSM Distinguished Lecture on March 30! I’ve attended a few events and had a few experiences lately that I’m eager to blog about, but sometimes life has plans for us that don’t involve blogging. Today, I would say, is no different, except that I woke up thinking about this lecture!

Ok, enough about me, more about Schwarz.

I should start by saying that I am a longtime fan of Schwarz. In Fall 2009, I had just discovered the MLC program and finished what was a whirlwind application process, and I was first trying to wrap my head around the field of sociolinguistics and its intersection with my career in survey methodology. I had attended a presentation of an ethnography of communication pilot study to the McDonough School of Business, and, to my great shock, I came across a survey methodology paper that spoke of the Logic of Conversation and the role of Gricean maxims in survey responses. This fantastic piece is the work of Norbert Schwarz, and I’ve kept it nearby ever since. In it, Schwarz addresses the conversational expectations of survey respondents and shows how they respond not only to the question at hand, but also to these expectations.

It’s common in every survey to look at some of the responses and wonder how in the world they could have come about. I addressed this in an earlier blog post, where one researcher had gone as far as to call respondents stupid. Oftentimes we think of respondents “getting it right” or “getting it wrong.” But there is a larger phenomenon underlying what appear to be strange responses, and it’s something that we experience ourselves when we attempt to respond to surveys.

We write survey questions with a mechanistic expectation: that if we ask a question, we will hear back the answer to that question. But communication is not mechanistic, and we are not necessarily aware of this. We’re aware of misunderstandings, but we’re not often aware of the tiny sphere of focus and the interpretive frames that we apply to every utterance we hear and utter. This is no fault of our own; it is a survival tool. We simply cannot process all of the information that we’re constantly inundated with.

In survey research, we’re aware that small differences in question format can influence responses. We’re aware that changing a scale will change the numeric range of the responses. We see that changing labels on a scalar question changes the results. We’re aware that sometimes answers appear to be absolute contradictions and seem to us to be impossible. These are especially large challenges for us, and they are the purview of linguistics.

Schwarz, however, is not a linguist. He is a cognitive scientist, and his lecture was not about the linguistic basis of apparently wonky response phenomena. Instead, he spoke about situated cognition.

Situated cognition makes a lot of intuitive sense. It is a well-documented psychological finding that we don’t hold attitudes, beliefs and responses at a fixed location in our minds; instead, we create or recreate them each time. This process allows for much more influence from “what’s on our mind,” making situational or contextual factors much more important and decreasing the reliability, or repeatability, of survey responses. This is not a hard pill for someone (me) with a background in cognitive science and sociolinguistics to swallow, but the effect on the audience was remarkable. How does someone from a field that thrives on the mechanistic nature of responses take the suggestion that what they’re measuring is not a distinctly measurable entity so much as a complicated, potentially unreliable act of nature?

One of the discussants used a couple that he was not very fond of as an example of a stable opinion. I believe this example lends itself well to further exploration. If he had just met the couple and had a negative experience with them, his evaluation of his opinion toward the couple would depend on the degree of negativity of the experience, his predisposition to give or not give them the benefit of the doubt, and his degree of concern about expressing a negative opinion to the interviewer or survey researchers. After this point, his evaluation would be increasingly influenced by his further experiences with the couple: their valence (negative, positive or neutral), their recency, and their salience. Essentially, his response would reflect a complicated underlying equation and be the output of situated cognition.
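To make that concrete, here is a toy sketch (my own illustration, not Schwarz’s model) of how a situated evaluation might weight accessible experiences by valence, recency, and salience:

```python
from dataclasses import dataclass

@dataclass
class Experience:
    valence: float   # -1 (very negative) to +1 (very positive)
    recency: float   # 0 (long ago) to 1 (just happened)
    salience: float  # 0 (forgettable) to 1 (vivid)

def situated_evaluation(experiences, benefit_of_doubt=0.0):
    """Toy model: the reported 'attitude' is a weighted average of
    whatever experiences are accessible at the moment of asking,
    not a stored value retrieved from memory."""
    weights = [e.recency * e.salience for e in experiences]
    total = sum(weights) or 1.0
    weighted = sum(e.valence * w for e, w in zip(experiences, weights))
    return weighted / total + benefit_of_doubt

# The same respondent, asked at two different times, draws on
# differently accessible experiences and reports different opinions.
at_first = [Experience(valence=-0.8, recency=1.0, salience=0.9)]
much_later = [
    Experience(valence=-0.8, recency=0.2, salience=0.9),  # first encounter, now faded
    Experience(valence=0.4, recency=1.0, salience=0.6),   # newer, milder experience
]
print(situated_evaluation(at_first))    # -0.8: strongly negative
print(situated_evaluation(much_later))  # ~0.12: pulled toward the recent experience
```

The point of the toy is only that the “equation” is re-solved at each asking, so the same respondent can honestly report different opinions at different times.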

But what is a survey researcher supposed to do with this information?

It would be easy at this point to throw the baby out with the bathwater and cast doubt on the whole survey and response process. But that’s not necessary, and that’s not the point.

The point is that each method of analysis has its own unique set of strengths and weaknesses. It is important to know the strengths and weaknesses of your methods in order to better understand what exactly you are finding and what your findings mean. It also behooves us to supplement across methodologies. A reliable survey response is a strong finding, but it can mask underlying factors that can be accessed through other methodologies. As Pew demonstrated in their Kony 2012 report, mixing methodologies can lead to a clearer, more nuanced narrative than any single method could yield.

It would be easy to dismiss Schwarz’s findings, or to dismiss survey methodology. But dismissing either would be foolish, rash and unnecessary. Instead, let’s build on both. A wider foundation can build a better house, but the best house will need to take down some old walls and rethink its floorplan.

JPSM Distinguished Lecture

Tomorrow the Joint Program in Survey Methodology is having a special lecture at the University of Maryland.

Do survey respondents lie?

Situated cognition and socially desirable responding
Prof. Norbert Schwarz, University of Michigan

Survey researchers commonly assume that people know what they do, know what they believe, and can report on it with candor and accuracy, as Angus Campbell put it. From this perspective, many findings suggest that survey respondents are less than candid. The best known example is the observation that answers to racial attitude questions vary as a function of the interviewer’s race. Challenging this interpretation, a large body of social psychological research shows similar context effects under conditions that do not lend themselves to this interpretation, including conditions that use implicit attitude measures, which are not subject to deliberate “faking”.

From a situated cognition perspective, such findings reflect that attitude questions assess context sensitive evaluations that respondents form on the spot, drawing on information that is accessible at that point in time. The underlying processes operate in daily life as well as in survey interviews and reflect the situated nature of human judgment rather than a deliberate attempt to report a socially desirable answer.

I review relevant findings and discuss their implications for survey measurement.

Friday, March 30, 2012, 3:00 PM – 5:00 PM

2205 LeFrak Hall, University of Maryland, College Park MD USA

Metro stop: College Park on the Green Line. See http://www.jpsm.umd.edu/jpsm/?geninfo/directions.htm for directions and parking information.


Discussants: Paul Beatty, NCHS and David Cantor, Westat


A reception follows the lecture.

Amazing Presentation on Infographics

I had the privilege this week of attending a webinar by Matthew Ericson of the New York Times about innovative graphic presentations of data. There were some truly amazing interactive displays included in this presentation, and the presenter had a lot of very insightful suggestions for rethinking data presentation:


http://www.ericson.net/files/aapor-shared.pdf


He spoke about the role of good interactive presentation in situating data, providing context, developing layers, and telling a story. A lot of the time, the distribution of the data, and its relationship with data from other sources, is its most interesting layer. In an innovative presentation of data, we must balance the expectations of the audience, who become interactants with the data and must be able to manipulate it easily, with a complementary layer of expertise or context.

For example, data about Mariano Rivera’s pitching style could best be understood by the placement of the ball at the hitter’s decision-making point. In a graphic about Rivera’s success, the reporters were able to show how radically different pitches were virtually indistinguishable at the crucial decision-making point for the hitter.

He referred to infographics as the “gamification of news.”


To connect his presentation to this ongoing discussion of text analytics, check out the way he displayed word frequencies:

http://www.ericson.net/presentations/aiga-for-web.pdf

Interestingly, it is still problematic, but it is super cool looking…


And, speaking of infographics, check out this awesome one that Pew debuted today:

http://features.pewforum.org/religious-migration/map.php#/Destination/UnitedStates/

Qualitative and Quantitative methods revisited

At the GURT 2012 conference last week, another graduate student, whose work is primarily quantitative, asked me about the issue of data quality in qualitative research. She asked how, if your results are not repeated over a large group of people, you know that your results are reliable or representative. In the process of answering her, I realized that in some ways quantitative and qualitative research are ideologically opposed. Quantitative research is validated in part by the reliability of a result across a variety of people and contexts, but those people and contexts are not its focus; they are its rationale. In qualitative research, by contrast, the context and the people are the focus. Qualitative research is more about putting a microscope up to an element in the data and closely observing how it works in the context of its surroundings. The research questions are starkly different, but they clearly complement each other.

In the world of text analytics, the analytic focus is largely quantitative, which in some ways works against the very nature of the data. This imbalance then calls the quality of the quantitative analysis into question, and because the analysis is not used in a traditional way to recreate the microfocus, the advantages of the qualitative microfocus are also diminished.

We spoke a lot at GURT about the paucity of theory in the field of text analytics. The technical ability applied to the problems is quite strong. There are quite a few very talented programmers hard at work conquering some of the technical issues inherent in text analysis, some on the computer science end of things, some on the computational linguistics end, and some fortunate enough to work from both ends with knowledge from both fields. But it is not enough to be able to solve problems and answer questions; we have to know which questions to ask.

It has been relatively easy to blame the fast-growing set of consumers for their immediate hunger for data analysis. To satisfy this hunger, companies like Open Amplify double their output monthly and are working as hard and fast as they can to keep up with the demand. But the demand generally comes with the same level of linguistic knowledge that most laypeople have. We, as language users, are constantly inundated with language, and we only consciously process a very small proportion of it. So we don’t instinctively ask the questions that our data is really best suited to answer.

The text analytic world is responding to questions of “what are people talking about?” with word frequencies, comparative word frequencies, and sentiment analyses tied to those word frequencies. But we don’t use language that way! If I ask you about your phone, you’re likely to respond about its features, its usefulness, its price, or how well you’ve adapted to it. If I ask 100 people about their phones, how much good will it do me to aggregate across responses? There is a good deal of work to be done in finding intertextual references to phones (e.g. “springy keypad” or “data plan”) and in assigning a negative value to “limited calling plan” and a positive one to “limited call interference.”

When I asked a coworker how our advisory committee meeting was going while I was at the conference, she answered “delicious.” We communicate by keying on shared knowledge, and as we communicate we build senses of particular topics that are specific to our conversation. If my coworker had answered with a comment about the potato salad, and I had played off of that, would we be talking about potato salad in any way equivalent to the way we might at a summer picnic? I would argue that as we joked on, we would in fact be talking about something quite different from the potato salad itself. We would likely be using the potato salad as a stand-in for the meeting we were really discussing. Should that conversation be used by market researchers at a potato salad corporation?
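To make the word-frequency critique concrete, here is a minimal sketch (the responses and lexicon are invented) of the default frequency-plus-lexicon aggregation described above:

```python
from collections import Counter

# A hypothetical handful of answers to "Tell me about your phone."
responses = [
    "love the screen but the data plan is limited",
    "springy keypad, but a limited calling plan",
    "great price and barely any call interference",
]

# Step 1: word frequencies -- the standard answer to
# "what are people talking about?"
tokens = [w.strip(",.") for r in responses for w in r.split()]
print(Counter(tokens).most_common(5))

# Step 2: sentiment tied to those frequencies, via a tiny polarity lexicon.
# "limited" gets one fixed score, although "limited calling plan" is a
# complaint and "limited call interference" would be praise.
lexicon = {"love": 1, "great": 1, "limited": -1, "barely": -1}
print("aggregate sentiment:", sum(lexicon.get(w, 0) for w in tokens))
```

Note how the fixed lexicon scores “limited” and “barely” identically everywhere, even when their conversational meanings point in opposite directions.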

In fact, the topics that we discuss are quite variable, and the specific meanings that elements take on within a conversation are better understood in the context of that conversation, as part of a qualitative analysis, than in an aggregated quantitative analysis.

The big essential questions that we need to grapple with, as a field, at this point are questions like:

‘what kind of questions is this kind of data best suited to answer?’

‘how can our knowledge of linguistics and discourse be transferred into quantifiable questions that could feed the field of text analysis?’

‘what kinds of questions can we ask of textual data that will reframe the way that people think about the usefulness of textual data?’

‘how can we best harness this fast growing mass of textual data in the most useful, reliable ways?’

I would argue that these are questions that discourse analysts are best suited to answer, but in order to ask them, they/we must be able to leave our qualitative bunkers and open our minds to the complementary potential for quantitative analysis. I would also argue that a popularized appreciation for the value of discourse analysis would also lend some legitimacy to a field that is largely unknown.

On the way to work this morning, I listened to an interview with Naomi Wolf. She spoke in part of the chutzpah of presenting academic knowledge in a widely accessible format. Academic perspective, she argued, is too often kept within academic circles, far away from the general population who could really use and appreciate it. Georgetown professor Deborah Tannen made some important steps in the popularization of sociolinguistics. I believe that what I am suggesting is a quantitative extension of that popularization. People could not have imagined that a book about something as obscure as conversation analysis could be interesting or so widely applicable to their own lives; the general population was not rushing the doors, desperate for these books. Hers was not a case of giving people what they wanted. It was a case of giving people something that would be widely useful, and people embraced Dr. Tannen’s books as such.

Let us now use the luxury of time that the academic sector has, but the commercial sector certainly does not, to do what we do best: theorize! A new, great plain awaits. Let us head west!

Another CLIP

I missed today’s CLIP. Too much work and too much rain. But the description of it made it sound especially interesting, because the speaker is obviously really grappling with the concept of context. It would have been interesting to have heard what he did with it and how he used linguistics (he specifically mentioned the field, albeit probably not in a discourse analytic type of way). I will have to follow up with him or with his papers. Thankfully, he’s local!

Here’s the summary:

February 29: Vlad Eidelman, Unsupervised Textual Analysis with Rich Features

Learning how to properly partition a set of documents into categories in an unsupervised manner is quite challenging, since documents are inherently multidimensional, and a given set of documents can be correctly partitioned along a number of dimensions, depending on the criterion. Since the partition criterion for a supervised model is encoded in the data via the class labels, even the standard information retrieval representation of a document as a vector of term frequencies is sufficient for many state-of-the-art classification models. This representation is especially well suited for the most common application: topic (or thematic) analysis, where term presence is highly indicative of class. Furthermore, for tasks where term presence may not be adequate, such as sentiment or perspective analysis, discriminative models have the ability to incorporate complex features, allowing them to generalize and adapt to the specific domain. In the case where we do not have access to resources for supervised training, we must turn to unsupervised clustering models. Clustering models rely almost exclusively on a simple bag-of-words vector representation, which performs well for topic analysis, but unfortunately, is not guaranteed to perform well for a different task.

In this talk, I will present a feature-enhanced unsupervised model for categorizing textual data. The presented model allows for the integration of arbitrary features of the observations within a document. While in generative models the observed context is usually a single unigram, or bigram, our model can robustly expand the context to extract features from a block of text of larger size. After presenting the model derivation, I will describe the use of complex automatically derived linguistic and statistical features across three practical tasks with different criteria: perspective, sentiment, and topic analysis. I show that by introducing domain relevant features, we can guide the model towards the task-specific partition we want to learn. For each task, our feature enhanced model outperforms strong baselines and state-of-the-art models.
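For reference, here is a minimal sketch of the bag-of-words clustering baseline the abstract contrasts against, using scikit-learn (the documents are invented):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented documents spanning two topics (politics, sports).
docs = [
    "the senate passed the budget bill",
    "congress debated the new budget proposal",
    "the team won the championship game",
    "fans celebrated the playoff victory",
]

# Each document becomes a vector of weighted term frequencies...
X = TfidfVectorizer(stop_words="english").fit_transform(docs)

# ...partitioned with k-means. This recovers topical clusters well, but
# nothing in the representation lets us steer the partition toward a
# different criterion (sentiment, perspective) -- the gap the talk's
# feature-enhanced model is meant to address.
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))
```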

Bio: Vladimir Eidelman is a fourth-year Ph.D. student in the Department of Computer Science at the University of Maryland, working primarily with Philip Resnik. He received his B.S. in Computer Science and Philosophy from Columbia University in 2008 and an M.S. in Computer Science from UMD in 2010. His research interests are in machine learning and natural language processing problems, such as machine translation, structured prediction, and unsupervised learning. He is the recipient of the National Science Foundation Graduate Research and National Defense Science and Engineering Graduate Fellowships.

Observations on another CLIP event: ESL and MT

Today I attended another CLIP colloquium at the University of Maryland:

Feb 22: Rebecca Hwa, The Role of Machine Translation in Modeling English as a Second Language (ESL) Writings

She addressed these research questions:

1. How patterned are the errors of English language learners?

1a. Could ‘English with mistakes’ be used as an input for machine translation?

1b. Could that be used to improve MT outputs?

1c. Could these findings be used for EFL training?


Her presentation made me think a lot about the role of linguistics in this type of work and about the nature of English.

First, I am coming to firmly believe that the best text processing should be done in partnership between linguists and computer scientists. Linguistics provides the most thorough and reliable frame for computer scientists to key off of, and once you stray from the nature of what you’re trying to represent, you end up astray.

So, for example, in the first part of her research presentation she talked about a project involving machine translation and English language learners of all backgrounds. One woman in the audience kept asking questions about the conglomeration of non-native English speakers, and I assumed she was from the English department. The issue of mistakes in language use is a huge one, and a focus has to be chosen from which to do the work. Maybe language background would be a more productive way to narrow the focus; it would allow for much more specific structural guidance and established bodies of knowledge on language interference.

Second, she spoke about Chinese English language learners in particular and her investigation of lexical choice. Often English language learners’ written English is marked by lexical choices that appear strange to native English speakers. Her hypothesis was that the words used in place of the correct words were similar in some way to the correct words, most likely by context. She played a lot with the definition of context: was it proximity? Was it a specific grammatical relationship? This discussion was fascinating, but it probably could have benefited from some restrictions on the context of the errors she was targeting. Again, this is from the linguistics end of the linguistics–computer science spectrum.
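To make the two candidate definitions of context concrete, here is a small sketch (my framing, not Hwa’s implementation) contrasting a proximity window with a grammatical relationship, using spaCy:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline
doc = nlp("The new phone has a springy keypad and a limited plan.")

def window_context(doc, i, size=2):
    """Context as proximity: the tokens within +/- size positions."""
    return [t.text for t in doc[max(0, i - size): i + size + 1] if t.i != i]

def dependency_context(token):
    """Context as grammatical relationship: the head and dependents."""
    return [(token.head.text, token.dep_)] + [(c.text, c.dep_) for c in token.children]

keypad = next(t for t in doc if t.text == "keypad")
print(window_context(doc, keypad.i))   # ['a', 'springy', 'and', 'a']
print(dependency_context(keypad))      # the verb it attaches to, plus its modifiers
```

The two definitions pick out different neighbors for the same word, which is exactly why pinning down “similar by context” matters for her hypothesis.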

Her talk made me think a lot about the nature of English. I often think about what it means to be a global language. English is spoken in many places where there are no native speakers, and it is spoken in many places that we don’t traditionally think of as native English places. Often the English that arises from these contexts is judged to be full of errors, but I don’t necessarily agree with this. Instead, I would ask two questions:

1. Is the variation patterned?

2. Is communication successful?

If the answer to both questions is yes, then I don’t think the speaker is producing errors so much as a different variety of English. Varieties of English are not all treated with the same respect, but I suspect that this has more to do with the prejudices of the person judging the grammar than with any deficiency on the part of the speaker.

AAPOR Conference Preliminary Program is Up!

This is exciting!

The conference theme this year is New Frontiers in Public Opinion Research, and now we can get a first glimpse at AAPOR’s take on the future of the field! There are quite a few sessions on web survey design, paradata, alternative data sources, and the potential of social media. It will be interesting to see which of the sessions will have a sociolinguistic bent, because many certainly have that potential. There are also sessions on interviewer effects and context effects, which may even use Conversation Analysis (CA) approaches.

http://www.aapor.org/AM/Template.cfm?Section=AAPOR_Annual_Conference&Template=/CM/ContentDisplay.cfm&ContentID=4986

Patterning in Language, revisited

Language can be pretty mindblowing.
In my paper on the potential of Natural Language Processing (NLP) for social science research, I called NLP a kind of oil rig for the vast reserves of data that we are increasingly desperate to tap.
Sometimes the rigging runs smoothly. This week I read a chapter about compliments in Linguistics at Work. In the chapter, Nessa Wolfson describes her investigations into the patterning of compliments in English. Although some of her commentary in this chapter seems far off base to me (I’ll address this in another post), her quantitative findings are strong. She discovered that 54% of the compliments in her corpus fell into a single syntactic pattern, 85% fell into three syntactic patterns, and 97% fell into a total of nine syntactic patterns. She also found that 86% of the compliments with a semantically positive verb used just two common verbs, ‘like’ and ‘love.’ And she discovered some strong patterning in the adjectival compliments as well.


Linguistic patterns such as these are generally not something that native speakers of a language are aware of, yet they offer great potential to English Language Learners and NLP programmers. It is precisely patterns such as these that NLP programmers use in order to mine information from large bodies of textual data. When language is patterned this strongly, it is significantly easier to mine, which makes a strong case for the effectiveness of NLP as a rig and syntax as the bones of the rig.
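As a rough illustration of how such patterning becomes minable, here is a toy sketch that matches Wolfson’s most frequent frame, ‘NP is/looks (really) ADJ,’ with a regular expression (the adjective list is a stand-in; a real system would use a parser and a fuller lexicon):

```python
import re

# Toy approximation of the frame "NP is/looks (really) ADJ",
# e.g. "Your hair looks really great".
COMPLIMENT = re.compile(
    r"\b(your|that|this|the)\s+\w+\s+(is|looks?)\s+(really\s+)?"
    r"(nice|great|beautiful|good|lovely)\b",
    re.IGNORECASE,
)

utterances = [
    "Your hair looks really great today",
    "That jacket is lovely",
    "I have to run to a meeting",
]
for u in utterances:
    print("compliment" if COMPLIMENT.search(u) else "other", "-", u)
```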


But as strongly as language patterns in some areas, it is also profoundly conflicted in others.


This week I attended a CLIP Colloquium at the University of Maryland. The speaker was Jan Wiebe, and the title of her talk was ‘Subjectivity and Sentiment Analysis: From Words to Discourse.’ In an information-packed, hour-long talk, Wiebe essentially covered her long history with sentiment analysis and discussed her current research (I took 11 pages of notes! Totally mindblowing). Wiebe approached one of the essential struggles of linguistics, the spectrum between language out of context and language in context (from words to discourse), from a computer science perspective. She spoke about the programming tools and transformations that she had developed and worked with in order to take data out of context in an automated way and build their meaning back in a patterned way. For each stage or transformation, she spoke of the complications and potential errors she had encountered.


She spoke of her team’s efforts to tag word senses in WordNet by their subjective or objective orientation and their positive and negative meanings. Her team has created a downloadable subjectivity lexicon, and they hope to make a subjectivity phrase classifier available this spring. For the sense labeling, they decided to use coarser groupings than WordNet in order to improve accuracy: instead of associating words with their senses, they associate them only along usage domains, or s/o (subjective/objective) and p/n/n (positive/negative/neutral). This increases the accuracy of the tags, but it doesn’t account for context effects such as polarity shifting, e.g. from wonderfully (+) and horrid (-) to wonderfully horrid (+). The subjectivity phrase classifier will be a next step in the transition between prior polarity (out-of-context, word-level orientation, as in the subjectivity lexicon) and contextual polarity (the ultimate polarity of the sentence, taking into account phrase dependency and longer-distance negation such as “not only good, but amazing”).
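Here is a minimal sketch of that distinction (the lexicon and shift rule are invented, not Wiebe’s classifier): prior polarity is a word-level lookup, while contextual polarity has to account for shifts like the intensified negative above:

```python
# Prior polarity: out-of-context, word-level orientation,
# as in a subjectivity lexicon (entries invented).
PRIOR = {"wonderfully": "+", "horrid": "-", "good": "+", "amazing": "+"}

def prior_polarity(word):
    return PRIOR.get(word.lower(), "neutral")

def contextual_polarity(phrase):
    """Toy shift rule: a positive intensifier before a negative adjective
    flips the phrase ('wonderfully horrid' -> +). Real contextual
    classifiers also handle negation, dependencies, and much more."""
    words = phrase.lower().split()
    if (len(words) == 2 and prior_polarity(words[0]) == "+"
            and prior_polarity(words[1]) == "-"):
        return "+"
    return prior_polarity(words[-1])

print([prior_polarity(w) for w in ["wonderfully", "horrid"]])  # ['+', '-']
print(contextual_polarity("wonderfully horrid"))               # '+'
```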


She also spoke of her team’s research into debate sites. They annotate individual postings by their target relationships (same/alternative/part/anaphora, etc.), p/n/n, and reinforcing vs. non-reinforcing. So, for example, in a debate between blackberries and iphones, where the sides are predetermined by the setup of the site, she can connect relationships to stances: “fast keyboard” is a positive stance toward a blackberry, “slower keyboard” reflects a negative orientation toward an iphone, and a pro-iphone post that mentions the blackberry’s “fast keyboard” is making a concession rather than an argument in favor of the blackberry.
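As a data structure, the annotation scheme she described might look something like the following (my reconstruction; the field names are hypothetical):

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class DebatePostAnnotation:
    stance_side: str  # the side set by the site, e.g. "iphone"
    target: str       # what the opinion is about
    target_relation: Literal["same", "alternative", "part", "anaphora"]
    polarity: Literal["positive", "negative", "neutral"]
    reinforcing: bool  # does the opinion support the post's stance?

# A pro-iphone post praising the blackberry's "fast keyboard":
# positive polarity toward the alternative does not support the
# stance, so it is annotated as a concession (non-reinforcing).
concession = DebatePostAnnotation(
    stance_side="iphone",
    target="keyboard",
    target_relation="alternative",
    polarity="positive",
    reinforcing=False,
)
print(concession)
```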


In sum, she discussed the transformations between words out of context and words in context, a transformation which is far from complete. She discussed the subjectivity or objectivity of individual words, but then showed how these could be transformed through context. She showed the way phrases with the same syntactic structure can have completely different meanings. She spoke of the difficulty of isolating targets, or the subjects of the speech. She spoke of the interdependent structures in discourse, and the way that each compounding phrase in a sentence can change the overall directionality. She spoke of her efforts to account for these more complex structures with a phrase-level classifier, and of her research into more indirect references in language. Each of these steps is a separate area of research, and each compounds error along the path between words and discourse.


Patterning such as Wolfson found shows the great potential of NLP, but research such as Wiebe’s shows the complicated nature of putting these patterns into use. In fact, this was exactly my experience working with NLP. NLP is a constant struggle between linguistic patterning and the complicated nature of discourse. It is an important and growing field, but the problems it poses will not be quickly resolved. The rigs are being built, but the quality of the oil is still dubious.

A visit from Noam Chomsky

On Friday January 27th, I went to hear Noam Chomsky give a talk to the Linguistics department at the University of Maryland. Chomsky gave three public talks during his visit to UMD; a multidisciplinary dean’s lecture on Thursday afternoon, the linguistics department talk on Friday morning, and a talk at the Clarice Smith Performing Arts Center on Friday afternoon. I thought it would be especially cool to hear him talk about Linguistics, now that I’ve had some exposure to it beyond my undergraduate Cognitive Science classes.

I was, of course, supremely ignorant about Chomsky’s role in Linguistics. I had read Chafe’s criticism of Chomsky’s movement. Chafe’s criticism was introduced in part as a reaction to research such as the fMRI research that I’ve actually taken part in. But I believe that the fMRI program is a solid, albeit overused, field of study (I recently attended a conference of survey researchers where one researcher was in the process of proposing an fMRI program to analyze the survey response process), and I believe that Chafe made some solid and excellent contributions to Linguistics. I had also heard something about a split between Generative Linguistics and Functional Linguistics from my friend Holly, who appears to have studied Linguistics on a planet apart from the one I’m studying on (Two separate planets? Most likely I’m the one on the separate planet…).

Chomsky began the talk very close to my programmer’s heart. He essentially spoke of reframing the apparent complexity of data in terms of its more simply structured source. As a programmer, I often see people agonizing over apparently complex patterns in data that were produced by a single programming error. As a computer scientist of sorts, I have been trained to begin by tracing apparently complex problems down to their roots, and following the roots outward until an error is encountered. The idea of looking at linguistics from this perspective appealed strongly to my training in neuroscience. I eagerly anticipated his argument.

He then spoke about his minimalist program. He mentioned his issue with the term, saying that the term was misleading, because it renamed a program of study that hadn’t changed and was simply a continuation of ongoing work. He then defended the minimalist perspective, by saying that it was simply one research program out of many. All are necessary and do different things, so if you don’t like the minimalist program, you should simply follow another. When Chomsky mentioned that some people differed with him about this ‘minimalist program,’ whatever that was, I sat up in my chair and geared up for a fight! I don’t know why people resent Chomsky so much, but maybe this was it?!

Well, touché. This was not the lead-up to a fight. This was the lead-up to a talk that I didn’t understand at all. He spoke in formulas, none of which were meaningful to me (I mean, what is the social context of X+Y=[XY]? Why is this such a revelatory departure from X+Y=Z? What the heck are X? And Y? And Z?). He spoke about binding theory (never heard of it), repetitions vs. copies (never heard of them in this context), and phase theory (?). He spoke mostly about internal and external merges (don’t know anything about either).

So what did I get out of the talk? I got a reminder that I think like a programmer and feel most comfortable in situations where people approach problems like computer scientists, from root to tips instead of tips to root. I also got a reminder that I’m no linguist, rather someone who is gaining valuable training in discourse analysis.