A fleet of research possibilities and a scattering of updates

Tomorrow is my first day of my 3rd year as a Masters student in the MLC program at Georgetown University. I’m taking the slowwww route through higher ed, as happens when you work full-time, have two kids and are an only child who lost her mother along the way.

This semester I will [finally] take the class I’ve been borrowing pieces from for the past two years: Ethnography of Communication. I’ve decided to use this opportunity to do an ethnography of DC taxi drivers. My husband is a DC taxi driver, so in essence this research will build on years of daily conversations. I find that the representation of DC taxi drivers in the news never quite approximates what I’ve seen, and that is my real motivation for the project. I have a couple of enthusiastic collaborators: my husband and a friend whose husband is also a DC taxi driver and who has been a vocal advocate for DC taxi drivers.

I am really eager to get back into linguistics study. I’ve been learning powerful sociolinguistic methods to recognize and interpret patterning in discourse, but it is a challenge not to fall into the age-old habit of studying aboutness or topicality, which is much less patterned and powerful.

I have been fortunate enough to combine some of my new qualitative methods with my more quantitative work on some of the reports I’ve completed over the summer. I’m using the open-ended responses that we usually don’t fully exploit in order to tell more detailed stories in our survey reports. But balancing quantitative and qualitative methods is very difficult, as I’ve mentioned before, because the power punch of good narrative blows away the quiet power of high-quality, representative statistical analysis. Reporting qualitative findings has to be done very carefully.

Over the summer I had the wonderful opportunity to apply my sociolinguistics education to a medical setting. Last May, while my mom was on life support, we were touched by a medical error when my mom was mistakenly declared brain dead. Because she was an organ donor, her life support was not withdrawn before the error was recognized. But the fallout from the error was tremendous. The problem arose because two of her doctors were consulting by phone about their patients, and each thought they were talking about a different patient. In collaboration with one of the doctors involved, I’ve learned a great amount about medical errors and looked at the role of linguistics in bringing awareness to potential errors of miscommunication in conversation. This project was different from other research I’ve done, because it did not involve conducting new research, but rather rereading foundational research and focusing on conversational structure.

In this case, my recommendations were for an awareness of existing conversational structures, rather than an imposition of a new order or procedure. My recommendations, developed in conjunction with Dr. Heidi Hamilton, the chair of our linguistics department and a medical communication expert, were to be aware of conversational transition points, to focus on the patient identifiers used, and to avoid reaching back or ahead to other patients while discussing a single patient. Each patient discussion must be treated as a separate conversation. Conversation is one of the largest sources of medical error, and approaching it carefully is critically important. My mom’s doctor and I hope to make a Grand Rounds presentation out of this effort.

On a personal level, this summer has been one of great transitions. I like to joke that the next time my mom passes away I’ll be better equipped to handle it all. I have learned quite a bit about real estate and estate law and estate sales and more. And about grieving, of course. Having just cleaned through my mom’s house last week, I am beginning this new school year more physically, mentally and emotionally tired than I have ever felt. A close friend of mine has recently finished an extended series of chemo and radiation, and she told me that she is reveling in her strength as it returns. I am also reveling in my own strength, as it returns. I may not be ready for the semester or the new school year, but I am ready for the first day of class tomorrow. And I’m hopeful. For the semester, for the research ahead, for my family, and for myself. I’m grateful for the guidance of my newest guardian angel and the inspiration of great research.

A snapshot from a lunchtime walk

In the words of Sri Aurobindo, “By your stumbling the world is perfected”

Could our attitude toward marketing determine our field’s future?

In our office, we call it the “cocktail party question:” What do you do for a living? For those of us who work in the area of survey research, this can be a particularly difficult question to answer. Not only do people rarely know much about our work, but they rarely have a great deal of interest in it. I like to think of myself as a survey methodologist, but it is easier in social situations to discuss the focus of my research than my passion for methodology. I work at the American Institute of Physics, so I describe my work as “studying people who study physics.” Usually this description is greeted with an uncomfortable laugh, and the conversation progresses elsewhere. Score!

But the wider lack of understanding of survey research can have larger implications than simply awkward social situations. It can also cause tension with clients who don’t understand our work, our process, or where and how we add expertise to the process. Toward this end, I once wrote a guide for working with clients that separated out each stage in the survey process and detailed what expertise the researcher brings to the stage and what expertise we need from the client. I hoped that it would be a way of both separating and affirming the roles of client and researcher and advertising our firm and our field. I have not yet had the opportunity to use this piece, because of the nature of my current projects, but I’d be happy to share it with anyone who is interested in using or adapting it.

I think about that piece often as I see more talk about big data and social media analysis. Data seems to be everywhere and free, and I wonder what effect this buzz will have on a body of research consumers who might not have respected the role of the researchers from the get-go. We worried when Survey Monkey and other automated survey tools came along, but the current bevy of tools and attitudes could have an exponentially larger impact on our practice.

Survey researchers often thumb their nose at advertising, despite the heavy methodological overlap. Oftentimes there is a knee-jerk reaction against marketing speak. Not only do survey methodologists often thumb their/our noses at the goal and importance of advertising, but they/we often thumb their/our nose at what appears to be evidence of less rigorous methodology. This has led us to a ridiculous point where data and analyses have evolved quickly with the demand and heavy use of advertising and market researchers and evolved strikingly little in more traditional survey areas, like polling and educational research. Much of the rhetoric about social media analysis, text analysis, social network analysis and big data is directed at the marketing and advertising crowd. Translating it to a wider research context and communicating it to a field that is often not eager to adapt to it can be difficult. And yet the exchange of ideas between the sister fields has never been more crucial to our mutual survival and relevance.

One of the goals of this blog has been to approach the changing landscape of research from a methodologically sound, interdisciplinary perspective that doesn’t suffer from the artificial walls and divisions. As I’ve worked on the blog, my own research methodology has evolved considerably. I’m relying more heavily on mixed methods and trying to use and integrate different tools into my work. I’ve learned quite a bit from researchers with a wide variety of backgrounds, and I often feel like I’m belted into a car with the windows down, hurtling down the highways of progress at top speed and trying to control the airflow. And then I often glimpse other survey researchers out the window, driving slowly, sensibly along the access road alongside the highway. I wonder if my mentors feel the change of landscape as viscerally as I do. I wonder how to carry forward the anchors and quality controls that led to such high quality research in the survey realm. I wonder about the future. And the present. About who’s driving, and who in what car is talking to who? Using what gps?

Mostly I wonder: could our negative attitude toward advertising and market research drive us right into obscurity? Are we too quick to misjudge the magnitude of the changes afoot?

 

This post is meant to be provocative, and I hope it inspires some good conversation.

Rethinking demographics in research

I read a blog post on the LoveStats blog today that referred to one of the most widely regarded critiques of social media research: the lack of demographic information.

In traditional survey research, demographic information is a critically important piece of the analysis. We often ask questions like “Yes, 50% of the respondents said they had encountered gender harassment, but what is the breakdown by gender?” The prospect of not having this demographic information is a large enough game changer to cast the field of social media research into the shade.
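To make that kind of breakdown concrete, here is a minimal sketch in Python. The respondents, the group labels, and the function name `harassment_rate_by_group` are all invented for illustration; they are assumptions, not data or code from any real survey.

```python
from collections import Counter

# Hypothetical respondents: (gender, reported_harassment).
# These eight rows are invented purely for illustration.
responses = [
    ("woman", True), ("woman", True), ("woman", True), ("woman", False),
    ("man", True), ("man", False), ("man", False), ("man", False),
]

def harassment_rate_by_group(rows):
    """Return {group: share of that group that answered yes}."""
    totals = Counter(group for group, _ in rows)
    yes = Counter(group for group, answered_yes in rows if answered_yes)
    return {group: yes[group] / totals[group] for group in totals}

# The headline number hides the difference between groups.
overall = sum(answered_yes for _, answered_yes in responses) / len(responses)
by_gender = harassment_rate_by_group(responses)

print(f"overall: {overall:.0%}")
for group, rate in by_gender.items():
    print(f"{group}: {rate:.0%}")
```

In this made-up sample the overall figure is 50%, but the rate is 75% for one group and 25% for the other, which is exactly the kind of detail the breakdown question is after.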

Here I’d like to take a sidestep and borrow a debate from linguistics. In the linguistic subfield of conversation analysis, there are two main streams of thought about analysis. One believes in gathering as much outside data as possible, often through ethnographic research, to inform a detailed understanding of the conversation. The second stream is rooted in the purity of the data. This stream emphasizes our dynamic construction of identity over the stability of identity. The underlying foundation of this stream is that we continually construct and reconstruct the most important and relevant elements of our identity in the process of our interaction. Take, for example, a study of an interaction between a doctor and a patient. The first school would bring into the analysis a body of knowledge about interactions between doctors and patients. The second would believe that this body of knowledge is potentially irrelevant or even corrupting to the analysis, and if the relationship is in fact relevant it will be constructed within the excerpt of study. This raises the question: are all interactions between doctors and patients primarily doctor-patient interactions? We could address this further through the concept of framing and embedded frames (à la Goffman), but we won’t do that right now.

Instead, I’ll ask another question:
If we are studying gender discrimination, is it necessary to have a variable for gender within our data source?

My knee-jerk reaction to this question, because of my quantitative background, is yes. But looking deeper: is gender always relevant? This does strongly depend on the data source, so let’s assume for this example that the stimulus was a question on a survey that was not directly about discrimination, but rather more general (e.g. “Additional Comments:”).

What if we took that second CA approach, the purist approach, and said that where gender is applicable to the response, it will be constructed within that response? The question now becomes ‘how is gender constructed within a response?’ This is a beautiful and interesting question for a linguist, and it may be a question that much better fits the underlying data and provides deeper insight into the data. It also turns the age-old analytic strategy on its head. Now we can ask whether a priori assumptions that the demographics could or do matter are just rote research or truly the productive and informative measures that we’ve built them up to be.

I believe that this is a key difference between analysis types. In the qualitative analysis of open-ended survey questions, it isn’t very meaningful to say x% of the respondents mentioned z, and y% of the respondents mentioned d, because a nonmention of z or d is not really meaningful. Instead we go deeper into the data to see what was said about d or z. So the goal is not prevalence, but description. On the other hand, prevalence is a hugely important aspect of quantitative analysis, as are other fun statistics which feed off of demographic variables.

The lesson in all of this is to think carefully about what is meaningful information that is relevant to your analysis and not to make assumptions across analytic strategies.

Do you ever think about interfaces? Because I do. All the time.

Did you ever see the movie Singles? It came out in the early 90s, shortly before the alternative scene really blew up and I dyed [part of] my hair blue and thought seriously about piercings. Singles was a part of the growth of the alternative movement. In the movie, there is a moment when one character says to another “Do you ever think about traffic? Because I do. All the time.” I spent quite a bit of time obsessing over that line, about what it meant, and, more deeply, what it signaled.

I still think about that line. As I drove toward the turnoff to my mom’s street during our 4th of July vacation, I saw what looked like the turn lane for her street, but it was actually an intersection-less, left-turning split immediately preceding the real left turn lane for her street. It threw me off every time, and I kept remembering that romantic moment in Singles when the two characters were getting to know each other’s quirks, and the man was talking about traffic. And it was okay, even cool, to be quirky and think or talk about traffic, even during a romantic moment.

I don’t think about traffic often. But I am no less quirky. Lately, I tend to think about interfaces. Before my first brush with NLP (Natural Language Processing), I thought quite a bit about alternatives to e-mail. Since I discovered the world of text analytics, I have been thinking quite a bit about ways to integrate the knowledge across different fields about methods for text analysis and the needs of quantitative and qualitative researchers. I want to think outside of the sentiment box, because I believe that sentiment analysis does not fully address the underlying richness of textual data. I want to find a way to give researchers what they need, not what they think they want. Recently, my thinking on this topic has flipped. Instead of thinking from the data end, or the analytic possibilities end, or about what programs already exist and what they do, I have started to think about interfaces. This feels like a real epiphany. Once we think about the problem from an interface, or user experience perspective, we can better utilize existing technology and harness user expectations.

Have you read the new Imagine book about how creativity works? I believe that this strategy is the natural step after spending time zoning out on the web, thinking, or not thinking, about research. The more time you cruise, the better feel you develop for what works and what doesn’t, the more you learn what to expect. Interfaces are simply the masks we put on datasets of all sorts. The data could be the world wide web as a whole, results from a site or time period, a database of merchandise, or even a set of open ended survey responses. The goal is to streamline the searching interface and then make it available for use on any number of datasets. We use NLP every day when we search the internet, or shop. We understand it intuitively. Why don’t we extend that understanding to text analysis?

I find myself thinking about what this interface should look like and what I want this program to do.

Not traffic, not as romantic. But still quirky and all-encompassing.

Question Writing is an Art

As a survey researcher, I like to participate in surveys with enough regularity to keep current on any trends in methodology. As a web designer, I know that one aspect of successful design is seamlessness with the visitor’s expectations. So if the survey design realm has moved toward submit buttons on the upper right-hand corner of individual pages, your idea (no matter how clever) to put a submit button on the upper left can result in a disconnect on the part of the user that will affect their behavior on the page. In fact, the survey design world has evolved quite a bit in the last few years, and it is easy to design something that reflects poorly on the quality of your research endeavor. But these design concerns are less of an issue than they have been, because most researchers are using templates.

Yet there is still value in keeping current.

And sometimes we encounter questions that lend themselves to an explanation of the importance of question writing. These questions are a gift for a field that is so difficult to describe in terms of knowledge and skills!

Here is a question I encountered today (I won’t reveal the source):

How often do you purchase potato chips when you eat out at any quick service and fast food restaurants?

2x a week or more
1x a week
1x every 2-3 weeks
1x a month
1x every 2-3 months
Less than 1x every 3 months
Never

This is a prime example of a double-barreled question, and it is also an especially difficult question to answer. In my case, I rarely eat at quick service restaurants, especially sandwich places, like this one, that offer potato chips. When I do eat at them, I am tempted to order chips. About half the time I will give in to the temptation with a bag of SunChips, which I’m pretty sure are not made of potato.

In bigger firms that have more time to work through their questions, this information would come out in a cognitive interview or think-aloud during the pretesting phase. Many firms, however, have staunchly resisted these important steps in the surveying process because of their time and expense. It is important to note that the time and expense involved in trying to make usable answers out of poorly written questions can be immense.

I have spent some time thinking about alternatives to cognitive testing, because I have some close experience with places that do not use this method. I suspect that this is a good place for text analytics, because of the power of reaching people quickly and potentially cheaply (depending on your embedded TA processes). Although oftentimes we are nervous about web analytics because of their representativeness, the bar for representativeness is significantly lower in the pretesting stage than in the analysis phase.

But, no matter what pretesting model you choose, it is important to look closely at the questions that you are asking. Are you asking a single question, or would these questions be better separated out into a series?

How often do you eat at quick service sandwich restaurants?

When you eat at quick service restaurants, do you order [potato] chips?

What kind of [potato] chips do you order?
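One way to picture the series above is as a short questionnaire with skip logic, so that nobody is asked a follow-up that doesn’t apply to them. The sketch below is only an illustration under my own assumptions: the question ids, the `skip_if` rules, the answer wording ("Never", "Yes") and the `ask` callback are all hypothetical and not taken from any survey tool.

```python
# Hypothetical question ids, skip rules, and answer wording; not from any survey tool.
QUESTIONS = [
    {"id": "visit_frequency",
     "text": "How often do you eat at quick service sandwich restaurants?"},
    {"id": "orders_chips",
     "text": "When you eat at quick service restaurants, do you order [potato] chips?",
     # Skip when the respondent never visits these restaurants.
     "skip_if": lambda answers: answers.get("visit_frequency") == "Never"},
    {"id": "chip_kind",
     "text": "What kind of [potato] chips do you order?",
     # Skip unless the respondent said they order chips.
     "skip_if": lambda answers: answers.get("orders_chips") != "Yes"},
]

def run_survey(ask):
    """Ask each question in order, skipping those whose skip rule applies.

    `ask` is a hypothetical callback that takes the question text
    and returns the respondent's answer as a string.
    """
    answers = {}
    for question in QUESTIONS:
        skip_if = question.get("skip_if")
        if skip_if is not None and skip_if(answers):
            continue
        answers[question["id"]] = ask(question["text"])
    return answers
```

A respondent who answers "Never" to the first question is asked nothing further, so each remaining answer can be read on its own terms instead of being untangled from a single overloaded item.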

The lesson of all of this is that question writing is important, and the questions we write in surveys will determine the kind of survey responses we receive and the usability of our answers.

To go big, first think small

We use language all of the time. Because of this, we are all experts in language use. As native speakers of a language, we are experts in the intricacies of that language.

Why, then, do people study linguistics? Aren’t we all linguists?

Absolutely not.

We are experts in *using* language, but we are not experts in the methods we employ. Believe it or not, much of the process of speaking and hearing is not conscious. If it were, we would be sensorially overwhelmed by the sheer volume of words around us. Instead, listening comprehension involves a process of merging what we expect to hear with what we gauge to be the most important elements of what we do hear. The process of speaking involves merging our estimates of what the people we communicate with know and expect to hear with our understanding of the social expectations surrounding our words and our relationships and distilling these sources into a workable expression. The hearer will reconstruct elements of this process using cues that are sometimes conscious and sometimes not.

We often think of language as simple and mechanistic, but it is not simple at all. As conversation analysts, our job is to study conversation that we have access to in an attempt to reconstruct the elements that constituted the interaction. Even small chunks of conversation encode quite a bit of information.

The process of conversation analysis is very much contrary to our sense of language as regular language users. This makes the process of explaining our research to people outside our field difficult. It is difficult to justify the research, and it is difficult to explain why such small pieces of data can be so useful, when most other fields of research rely on greater volumes of data.

In fact, a greater volume of data can be more harmful than helpful in conversation analysis. Conversation is heavily dependent on its context: on the people conversing, their relationship, their expectations, their experiences that day, the things on their mind, what they expect from each other and the situation, their understanding of language and expectations, and more. The same sentence can have greatly different meanings once those factors are taken into account.

At a time when there is so much talk of the glory of big data, it is especially important to keep in mind the contributions of small data. These contributions are the ones that jeopardize the utility and promise of big data, and if these contributions can be captured in creative ways, they will be the true promise of the field.

Not what language users expect to see, but rather what we use every day, more or less consciously.

Data Journalism, like photography, “involves selection, filtering, framing, composition and emphasis”

Beautiful:

“Creating a good piece of data journalism or a good data-driven app is often more like an art than a science. Like photography, it involves selection, filtering, framing, composition and emphasis. It involves making sources sing and pursuing truth – and truth often doesn’t come easily.” -Jonathan Gray

Whole article:

http://www.guardian.co.uk/news/datablog/2012/may/31/data-journalism-focused-critical

Truly, at a time when the buzz about big data is at such a peak, it is nice to hear a voice of reason and temper! Folks: big data will not do all that it is talked up to do. It will, in fact, do something surprising and different. And that something will come from the interdisciplinary thought leaders in fields like natural language processing and linguistics. That *something,* not the data itself, will be the new oil.

Janet Harkness- the passing of a great mind in survey research

I just received this announcement from WAPOR. This is sad news. I attended an AAPOR short course that she helped to teach in the Spring of 2010, and she was very sharp, insightful and kind. At the time I was researching multilingual, multinational and multicultural surveys, and her writing was one of my mainstays.

“Janet Harkness died on Memorial Day (May 28, 2012) in Germany at age 63.  Harkness was the Director of the Survey Research and Methodology graduate program and Gallup Research Center, and holder of the Donald and Shirley Clifton Chair in Survey Science at the University of Nebraska-Lincoln.  She was the founder and Chair of the Organizing Committee on the International Workshop on Comparative Survey Design and Implementation (CSDI).  Her many contributions to cross-national and cross-cultural survey research included service as Head of the International Social Survey Programme’s Methodology Committee (1997-2008),  board member of the National Science Foundation’s (USA) Social, Behavioral & Economic Sciences Advisory Board (2008-present),  board member of the  Deutsches Jugendinstitut (Germany) Advisory Board (2009-present), Co-initiator of the Cross-Cultural Survey Guidelines Initiative, Chair of the Organizing Committee for the International Conference on Survey Methods in Multicultural, Multinational and Multiregional Contexts (3MC, Berlin 2008), and member of the European Social Survey’s (ESS) Central Coordinating Team. The ESS was awarded the European Union’s top annual science award, the Descartes Prize, in 2005.  She has been a member of WAPOR since 2009.

Besides her substantial contributions and organizational achievements in cross-national survey research, Harkness made major contributions to the scholarly literature including Cross-Cultural Survey Equivalence (1998), Cross-Cultural Survey Methods (with F.J.R. Van de Vijver and P. Ph. Mohler, 2003), and Survey Methods in Multicultural, Multinational, and Multiregional Contexts (with M. Braun, B. Edwards, T.P. Johnson, L.F Lyberg, P. PH. Mohler, B. Pennell and T.W. Smith, 2010).

As her professional colleague, Don Dillman, Regents Professor at Washington State University, noted of Janet “I don’t know of anyone who has done as much thinking as she has about cross-cultural surveys, and how measurement differs across languages and countries…That’s one of the major challenges we now face in doing surveys as we increasingly shift to a world-wide emphasis in survey design.”

She is survived by her husband Peter Ph. Mohler.”

——-

10/15/12 Edited to add some great news:

Announcement for the Janet A. Harkness Student Paper Award

The Janet A. Harkness Student Paper Award will be issued annually, starting in 2013, by the World Association for Public Opinion Research (WAPOR) and the American Association for Public Opinion Research (AAPOR) to honor the memory of Dr. Harkness and the inspiration she brought to her students and colleagues.

In particular, WAPOR and AAPOR will consider papers related to the study of multi-national/ multi-cultural/multi-lingual survey research (aka 3M survey research), or to the theory and methods of 3M survey research, including statistics and statistical techniques used in such research. Paper topics might include: (a) methodological issues in 3M surveys; (b) public opinion in 3M settings; (c) theoretical issues in the formation, quality, or change in 3M public opinion; (d) or substantive findings about 3M public opinion. The competition committee encourages submissions that deal with the topics of the annual conferences, for which the call for papers are posted on both associations’ websites in the fall.

Submissions to the Harkness award competition are anticipated to be 15-25 pages in length. A prize of $750 will be awarded to the winning paper and the author(s) of the paper will be invited to deliver it as a part of either the annual WAPOR conference, AAPOR conference, or in certain years the WAPOR-AAPOR joint conference.

For a winning paper with one author, WAPOR and AAPOR will pay for the author’s travel expenses to and from the nearest WAPOR or AAPOR annual conference for that year. However, for a winning submission with multiple authors, WAPOR and AAPOR will pay only for the primary author (or his/her designee, who must be a co-author) to present the paper. Up to two other papers each year may receive an Honorable Mention designation with each receiving a $100 cash prize (though no travel expenses).

All authors must be current students (graduate or undergraduate) at the time of the submission, or must have received their degree during the preceding calendar year. The research must have been substantially completed while the author was (all authors were) enrolled in a degree program. Preference will be given to papers based on research not presented elsewhere.

A panel of public opinion researchers from WAPOR and AAPOR’s membership – drawn from academic, government, and commercial sectors – will judge the papers.

The 2013 Call-for-Submission of papers for the Harkness Award will be issued in the near future.

Remotely following AAPOR conference #aapor

The AAPOR 2012 conference began today in sunny Orlando, Florida. This is my favorite conference of the year, and I am sorry to miss it. Fortunately, Twitter is bringing a lot of the action to home viewers like us!

https://twitter.com/#!/search/realtime/%23AAPOR

I will keep retweeting some of the action. For those of you who may be concerned that this represents a new era of heavy tweeting for me, rest assured: it won’t!

And for anyone who has been wondering what happened to me and my blog, please stay tuned. I am working on an exciting new project that I will eagerly share in due time.

Facebook Measures Happiness in Status Updates?

From FlowingData:

http://flowingdata.com/2009/10/05/facebook-measures-happiness-in-status-updates/

Does anyone have a link to the original report?

I really wish I had more of a window into the methodology of this one!

A couple of questions:

What is happiness?

How can it be measured or signaled? What kinds of data are representing happiness? Is this just an expanded or open ended sentiment analysis? Is the technology such that this would be a valid study?

Are Facebook statuses a sensible place to investigate happiness?

What is this study representing? Constituting? Perpetuating?

 

Edited to Add: http://blog.facebook.com/blog.php?post=150162112130