The curse of the elevator speech

Yesterday I was involved in an innocent watercooler chat in which I was asked what Sociolinguistics is. This should be an easy enough question, because I just got a master’s degree in it. But it’s not. Sociolinguistics is a large field that means different things to different people. For every way of studying language, there are social and behavioral correlates that can also be studied. So a sociolinguist could focus on any number of linguistic areas, including phonology, syntax, semantics, or, in my case, discourse. My studies focus on the ways in which people use language, and the units of analysis in my studies are above the sentence level. Because Linguistics is such a large and siloed field, explaining Sociolinguistics through the lens of discourse analysis feels a bit like explaining vegetarianism through a pescatarian lens. The real vegetarians and the real linguists would balk.

There was a follow-up question at the water cooler about y’all. “Is it a Southern thing?” My answer to this was so admittedly lame that I’ve been trying to think of a better one (sometimes even the most casual conversations linger, don’t they?).

My favorite quote of this past semester was from Jan Blommaert: “Language reflects a life, and not just a birth, and it is a life that is lived in a real sociocultural, historical and political space.” Y’all has long been considered a southernism, but when I think back to my own experience with it, it was never about southern language or southern identity. One big clue to this is that I do sometimes use y’all, but I don’t use other southern language features along with it.

If I wanted to investigate y’all further from a sociolinguistic perspective, I would take language samples from either one speaker or a variety of speakers (a sampling choice with clear, meaningful consequences) and track the uses of y’all to see when it is invoked and what function it serves. My best guess is that it does relational work and invokes registers that are more casual and nonthreatening. But without data, that is nothing but an uninformed guess.
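As a rough illustration of the tracking step, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the sample utterances and speaker labels are invented, `find_tokens` is a hypothetical helper, and a real study would work from transcribed recordings with the function of each token coded by hand. It only shows how one might pull each use of y’all out of a set of samples along with a few words of surrounding context.

```python
import re
from collections import Counter

# Hypothetical transcribed samples as (speaker, utterance) pairs.
# These are invented for illustration, not real data.
samples = [
    ("A", "Are y'all coming to lunch?"),
    ("B", "I sent the report to the committee this morning."),
    ("A", "Thanks, y’all, that was a huge help."),
    ("C", "Do you all have the agenda?"),
]

# Matches "y'all" with a straight or curly apostrophe, ignoring case.
YALL = re.compile(r"\by['’]all\b", re.IGNORECASE)

def find_tokens(samples, window=3):
    """Return each use of y'all with up to `window` words of context on each side."""
    hits = []
    for speaker, utterance in samples:
        words = utterance.split()
        for i, word in enumerate(words):
            if YALL.search(word):
                hits.append({
                    "speaker": speaker,
                    "left": " ".join(words[max(0, i - window):i]),
                    "token": word,
                    "right": " ".join(words[i + 1:i + 1 + window]),
                })
    return hits

hits = find_tokens(samples)

# Count tokens per speaker; the interpretive work (what each token is doing) is still manual.
per_speaker = Counter(hit["speaker"] for hit in hits)

for hit in hits:
    print(f'{hit["speaker"]}: {hit["left"]} [{hit["token"]}] {hit["right"]}')
```

A listing like this is only a starting point. The question of what function each use serves would still have to be answered by reading every token in its full conversational context.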

This work has likely been done before. It would be interesting to see.
(ETA: Here is an example of this kind of work in action, by Barbara Johnstone)

What is the role of Ethnography and Microanalysis in Online Research?

There is a large disconnect in online research.

The largest, highest profile, highest value and most widely practiced side of online research was created out of a high demand to analyze the large amount of consumer data that is constantly being created and is largely publicly available. This tremendous demand led to research methods that were created in relative haste. Math and programming skills thrived in a realm where social science barely made a whisper. The notion of atheoretical research grew. The level of programming and mathematical competence required to do this work continues to grow higher every day, as the fields of data science and machine learning become continually more nuanced.

The other side of online research, lower profile, lower value and increasingly practiced, is academic research. Turning academia toward online research has been like turning a massive ocean liner. For a while online research was not well respected. At this point it is increasingly well respected, thriving in a variety of fields and in a much needed interdisciplinary way, and driven by a search for a better understanding of online behavior and better theories to drive analyses.

I see great value in the intersection between these areas. I imagine that the best programmers have a big appetite for any theory they can use to drive their work in useful and productive ways. But I don’t see this value coming to bear on the market. Hiring is almost universally focused on programmers and data scientists, and the microanalytic work that is done seems largely invisible to the larger entities out there.

It is common to consider quantitative and qualitative research methods as two separate languages with few bilinguals. At the AAPOR conference in Boston last week, Paul Lavrakas mentioned a book he is working on with Margaret Roller which expands the Total Survey Error model to both quantitative and qualitative research methodology. I spoke with Margaret Roller about the book, and she emphasized the importance of qualitative researchers being able to talk more fluently and openly about methodology and quality controls. I believe that this, although a huge challenge in wording and framing, is a very important step for qualitative research, in part because quality frameworks lend credibility to qualitative research in the eyes of a wider research community. I wish this book a great deal of success, and I hope that it is able to find an audience and a frame outside the realm of survey research (although survey research has a great deal of foundational research, it is not well known outside of the field, and this book will merit a wider audience).

But outside of this book, I’m not quite sure where or how the work of bringing these two distinct areas of research together can or will be done.

Also at the AAPOR conference last week, I participated in a panel on The Role of Blogs in Public Opinion Research (intro here and summary here). Blogs serve a special purpose in the field of research. Academic research is foundational and important, but the publish rate on papers is low, and the burden of proof is high. Articles that are published are crafted as an argument. But what of the bumps along the road? The meditations on methodology that arise? Blogs provide a way for researchers to work through challenges and to publish their failures. They provide an experimental space where fields and ideas can come together that previously hadn’t mixed. They provide a space for finding, testing, and crossing boundaries.

Beyond this, they are a vehicle for dissemination. They are accessible and informally advertised. The time frame to publish is short, the burden lower (although I’d like to believe that you have to earn your audience with your words). They are a public face to research.

I hope that we will continue to test these boundaries, to cross over barriers like quantitative and qualitative that are unhelpful and obtrusive. I hope that we will be able to see that we all need each other as researchers, and the quality research that we all want to work for will only be achieved through the mutual recognition that we need.

Revisiting Latino/a identity using Census data

On April 10, I attended a talk by Jennifer Leeman (Research Sociolinguist @Census and Assistant Professor @George Mason) entitled “Spanish and Latino/a identity in the US Census.” This was a great talk. I’ll include the abstract below, but here are some of her main points:

  • Census categories promote and legitimize certain understandings, particularly because the Census, as a tool of the government, has an appearance of neutrality
  • Census must use categories from OMB
  • The distinction between race and ethnicity is fuzzy and full of history.
    • In the past, this category has been measured by surname, mother tongue, birthplace
    • Treated as hereditary (“perpetual foreigner” status)
    • Self-ID is new; before, the interviewer would judge and record
  • In the interview context, macro & micro meet
    • Macro level demographic categories
    • Micro:
      • Interactional participant roles
      • Indexed through labels & structure
      • Ascribed vs claimed identities
  • The study: 117 telephone interviews in Spanish
    • 2 questions, ethnicity & race
    • Ethnicity includes Hispano, Latino, Español
      • Intended as synonyms but treated as a choice by respondents
      • Different categories than English (Adaptive design at work!)
  • The interviewers played a big role in the elicitation
    • Some interviewers emphasized standardization
      • This method functions differently in different conversational contexts
    • Some interviewers provided “teaching moments” or on-the-fly definitions
      • Official discourses mediated through interviewer ideologies
      • Definitions vary
  • Race question also problematic
    • Different conceptions of Indioamericana
      • Central, South or North American?
  • Role of language
    • Assumption of monolinguality problematic, bilingual and multilingual quite common, partial and mixed language resources
    • “White” spoken in English different from “white” spoken in Spanish
    • Length of time in country, generation in country belies fluid borders
  • Coding process
    • Coding responses such as “American, born here”
    • ~40% Latino say “other”
    • Other category ~ 90% Hispanic (after recoding)
  • So:
    • Likely result: one “check all that apply” question
      • People don’t read help texts
    • Inherent belief that there is an ideal question out there with “all the right categories”
      • Leeman is not yet ready to believe this
    • The takeaway for survey researchers:
      • Carefully consider what you’re asking, how you’re asking it and what information you’re trying to collect
  • See also Pew Hispanic Center report on Latino/a identity




Censuses play a crucial role in the institutionalization and circulation of specific constructions of national identity, national belonging, and social difference, and they are a key site for the production and institutionalization of racial discourse (Anderson 1991; Kertzer & Arel 2002; Nobles 2000; Urla 1994).  With the recent growth in the Latina/o population, there has been increased interest in the official construction of the “Hispanic/Latino/Spanish origin” category (e.g., Rodriguez 2000; Rumbaut 2006; Haney López 2005).  However, the role of language in ethnoracial classification has been largely overlooked (Leeman 2004). So too, little attention has been paid to the processes by which the official classifications become public understandings of ethnoracial difference, or to the ways in which immigrants are interpellated into new racial subjectivities.

This presentation addresses these gaps by examining the ideological role of Spanish in the history of the US Census Bureau’s classifications of Latina/os as well as in the official construction of the current “Hispanic/Latino/Spanish origin” category. Further, in order to gain a better understanding of the role of census-taking in the production of new subjectivities, I analyze Spanish-language telephone interviews conducted as part of Census 2010.  Insights from recent sociocultural research on language and identity (Bucholtz and Hall 2005) inform my analysis of how racial identities are instantiated and negotiated, and how respondents alternatively resist and take up the identities ascribed to them.

* Dr. Leeman is a Department of Spanish & Portuguese Graduate (GSAS 2000).

Digital Democracy Remixed

I recently transitioned from my study of the many reasons why the voice of DC taxi drivers is largely absent from online discussions into a study of the powerful voice of the Kenyan people in shaping their political narrative using social media. I discovered a few interesting things about digital democracy and social media research along the way, and the contrast between the groups was particularly useful.

Here are some key points:

  • The methods of sensemaking that journalists use in social media are similar to other methods of social media research, except for a few key factors, the most important of which is that the bar for verification is higher.
  • The search for identifiable news sources is important to journalists and stands in contrast with research methods that are built on anonymity. This means that the input that journalists will ultimately use will be on a smaller scale than the automated analyses of large datasets widely used in social media research.
  • The ultimate information sources for journalists will be small, but the phenomena that will capture their attention will likely be big. Although journalists need to dig deep into information, something in the large expanse of social media conversation must capture or flag their initial attention.
  • It takes some social media savvy to catch the attention of journalists. This social media savvy outweighs linguistic correctness in the ultimate process of getting noticed. Journalists act as intermediaries between social media participants and a larger public audience, and part of the intermediary process is language correcting.
  • Social media savvy is not just about being online. It is about participating in social media platforms in a publicly accessible way with regard to publicly relevant topics and using the patterned dialogic conventions of the platform on a scale that can ultimately draw attention. Many people and publics go online but do not do this.

The analysis of social media data for this project was particularly interesting. My data source was the comments following this posting on the Al Jazeera English Facebook feed.


It evolved quite organically. After a number of rounds of coding I noticed that I kept drawing diagrams in the margins of some of the comments. I combined the diagrams into this framework:


Once this framework was built, I looked closely at the ways in which participants used it. Sometimes participants made distinct discursive moves between these levels. But when I tried to map the participants’ movements on their individual diagrams, I noticed that my depictions of their movements rarely matched when I returned to a diagram. Although my coding of the framework was very reliable, my coding of the movements was not reliable at all. This led me to notice that oftentimes the frames were being used more indexically. Participants were indexing levels of the frame, and this indexical process created powerful frame shifts. So, on the level of Kenyan politics exclusively, Uhuru’s crimes had one meaning. But juxtaposed against the crimes of other national leaders, Uhuru’s crimes had a dramatically different meaning. Similarly, when the legitimacy of the ICC was questioned, the charges took on a dramatically different meaning. When Uhuru’s crimes were embedded in the postcolonial East vs West dynamic, they shrank to the degree that the indictments seemed petty and hypocritical. And, ultimately, when religion was invoked, the persecution of one man seemed wholly irrelevant and sacrilegious.

These powerful frame shifts enable the Kenyan public to have a powerful, narrative-changing voice in social media. And their social media savvy enables them to gain the attention of media sources that amplify their voices and thus redefine their public narrative.