Fitness for Purpose, Representativeness and the perils of online reviews

Have you ever planned a trip online? In January, when I traveled to Amsterdam, I did all of the legwork online and ended up in a surprising place.

Amsterdam City Center is extremely easy to navigate. From the train station (a quick ride from the airport and a quick ride around The Netherlands), the canals extend outward like spokes. Each canal is flanked by streets. Then the city has a number of concentric rings emanating from the train station. Not only is the underlying map easy to navigate, but there is also a traveler station at the center, and maps are posted at regular intervals. English-speaking tourists will find that many people speak English, and that Dutch overlaps with English enough to be comprehensible after even a short exposure.

But the city center experience was not as smooth for me. I studied map after map in the city center without finding my hotel. I asked for directions, and no one had heard of the hotel or the street it was on. The traveler center seemed flummoxed as well. Eventually I found someone who could help and found myself on a long commuter tram ride well outside the city center and tourist areas. The hotel had received great reviews and recommendations from many travelers. But clearly, the travelers who boasted about it were not quite the typical travelers, who likely would have ended up in one of the many hotels I saw from the tram window.

Have you ever discovered a restaurant online? I recently went to a nice, local restaurant that I’d been reading about for years. I ordered the truffle fries (fries with truffle salt and some kind of fondue sauce) because people had really raved about them, only to discover once they arrived that they were fundamentally french fries (totally not my bag; I hate fried food).

These review sites are not representative of anything. And yet we (myself included) repeatedly use them as if they were reliable sources of information. One could easily argue that they may not be representative, but that they are good enough for their intended use (“fitness for purpose,” a big, controversial notion from a recent AAPOR task force report on Nonprobability Sampling). I would argue that they are clearly not excellent for their intended use. But does that invalidate them altogether? They often provide the only window that we have into whatever it is that we intend them for.

Truffle fries aside, the restaurant was great. And location aside, the hotel was definitely an interesting experience.

Toilet capsule in hotel room (with frosted glass rotating pane for some degree of privacy)

What next, after graduation?

A question that recent graduates are often asked is “what next, now that you’ve graduated?” This is a different question for graduates in different stages of their lives. When I finished my bachelor’s I could answer with the types of jobs I was applying to and my plans of where to live next. In fact, I wasn’t one to leave these big questions unanswered: I moved and began a full-time research position within a few weeks of my last set of finals. I was eager to begin my life without school. Nine months later I began another research position, chosen because of the sheer intensity and rigor of the interview (I had two interviewers firing questions at me, and I loved it. Crazy, right?). At this point, I’ve been at the second job for about 14 years.

What keeps you at a job for 14 years? This is an important question, because staying with a job when everything is no longer fresh and new is a special sort of challenge. There have been a few keys:

1. Stay in the moment. There are quite a few different projects that I juggle at once, and I work on each project across multiple stages. For each of these stages in the research process, I have elements that I particularly enjoy. I try to focus on these key elements while I work on each project.

2. Know yourself. As a worker, I know that I have little patience for repetitive tasks. I tend to be very hardworking and productive, but when tasks become repetitive I quickly get distracted. If I can, I always delegate these tasks away. If I can’t, I juggle them with other projects that complement them, such as tasks that I need to spend more time thinking strategically about or tasks that either have a deadline or can be given a set of short-term goals. This way, I feel productive and maintain my morale.

3. Feed yourself. I’ve also learned that I hunger to learn new things. I take advantage of every opportunity to learn new things, to share the new knowledge with my coworkers, and to integrate the things I learn into my work. This keeps my projects fresh. In addition to the standard, core reports that I produce, for example, I add new kinds of analyses or data. This makes the reports more interesting to produce, and it probably keeps them fresh for the reader as well.

4. Maintain relationships. I’ve been lucky enough to work with people I genuinely enjoy and to see them through marriages, graduations, births, and deaths, as well as the silly packages they receive at work. This helps to make work an enjoyable place.

5. Keep moving. Go to the gym, if you can. Go on a walk, if you can. Get up and stretch. Drink a lot of fluids.

Now, back to the question. “What next, after graduation?” For me, this is not a question with a clear, obvious answer. School disturbs the equilibrium of everyday life. Juggling work, school and family left me on a constant cycle of challenges and [mostly] successes. How do you come down from that? What happens to that level of productivity? As a mom, there is a looming stack of laundry, dishes and other household tasks always waiting at the ready. In the past week alone, I’ve spent over 6 hours doing make-up gymnastics lessons (with another 2.5 hours coming tomorrow!). Life expands to fit any empty spaces. But given a trade-off between reading Blommaert and folding laundry…

I read a commencement speech by David Foster Wallace that addressed the monotony of life and the power of being alive through the seemingly routine moments. I plan to do just that, but I was shocked to see it laid out in a commencement address. To be a student is to be saddled with the potential of what life could be, and that stands in such contrast to the smaller, daily joys of life without school. I often wondered how well prepared the students around me who hadn’t yet left academia were for life “on the other side.” Now I can see why some people choose to stay in school! If it weren’t for the many sacrifices my family made in order for me to go to school, I probably would have already enrolled in a PhD program.

The transition is surprisingly difficult, and I haven’t yet figured it out.

Representativeness, qual & quant, and Big Data. Lost in translation?

My biggest challenge in coming from a quantitative background to a qualitative research program was representativeness. I came to class firmly rooted in the principle of representativeness, and my classmates seemed not to have any idea why it mattered so much to me. Time after time I would get caught up in my data selection. I would pose the wider challenge of representativeness to a colleague, and they would ask, “Representative of what? Why?”

In the survey research world, the researcher begins with a population of interest and finds a way to collect a representative sample of the population for study. In the qualitative world that accompanies survey research, units of analysis are generally people, and people are chosen for their representativeness. Representativeness is often constructed from demographic characteristics. If you’ve read this blog before, you know of my issues with demographics. Too often, demographic variables are used as a knee-jerk choice instead of better-considered variables that are more relevant to the analysis at hand. (Maybe the census collects gender and not program availability, for example, but just because a variable is available and somewhat correlated doesn’t mean that it is in fact a relevant variable, especially when the focus of study is a population for whom gender is such an integral societal difference.)

And yet I spent a whole semester studying 5 minutes of conversation between 4 people. What was that representative of? Nothing but itself. It couldn’t have been exchanged for any other 5 minutes of conversation. It was simply a conversation that this group had and forgot. But over the course of the semester, this piece of conversation taught me countless aspects of conversation research. Every time I delved back into the data, it became richer. It was my first step into the world of microanalysis, where I discovered that just about anything can be a rich dataset if you use it carefully. A snapshot of people at a lecture? Well, how are their bodies oriented? A snapshot of video? A treasure trove of gestures and facial expressions. A piece of graffiti? Semiotic analysis! It goes on. The world of microanalysis is built on the practice of layered noticing. It goes deep rather than wide.

But what is it representative of? How could a conversation be representative? Would I need to collect more conversations, but restrict the participants? Collect conversations with more participants, but in similar contexts? How much or how many would be enough?

In the world of microanalysis, people and objects constantly create and recreate themselves. You consistently create and recreate yourself, but your recreations generally fall into a similar range that makes you different from your neighbors. There are big themes in small moments. But what are the small moments representative of? Themselves. Simply, plainly, nothing more and nothing else. Does that mean that they don’t matter? I would argue that there is no better way to understand the world around us in deep detail than through microanalysis. I would also argue that macroanalysis is an important part of discovering the wider patterns in the world around us.

Recently a NY Times blog post by Quentin Hardy has garnered quite a bit of attention.

Why Big Data is Not Truth: http://bits.blogs.nytimes.com/2013/06/01/why-big-data-is-not-truth/

This post has really struck a chord with me, because I have had a hard time understanding Hardy’s complaint. Is big data truth? Is any data truth? All data is what it is: a collection of some sort, gathered under a specific set of circumstances. Even data that we hope to be more representative has sampling and contextual limitations. Responsible analysts should always be upfront about what their data represents. Is big data less truthful than other kinds of data? It may be less representative than, say, a systematically collected political poll. But it is what it is: different data, collected under different circumstances in a different way. It shouldn’t be equated with other data that was collected differently. One true weakness of many large-scale analyses is blindness to the nature of the data, but that is a byproduct of the training algorithms that are used for much of the analysis. The algorithms need large training datasets, from anywhere. These sets are often developed through massive web crawlers. Here, context gets dicey. How does a researcher represent the data properly when they have no idea what it is? Hopefully researchers in this context will be wholly aware that, although their data has certain uses, it also has certain [huge] limitations.

I suspect that Hardy’s complaint is with the representation of massive datasets collected by web crawlers as a complete truth on which any analysis could be run and from which all of the greater truths of the world could be revealed. On this note, Hardy is exactly right. Data simply is what it is, nothing more and nothing less. And any analysis that focuses on an unknown dataset is just that: an analysis without context. Which is not to say that all analyses need to be representative, but rather that all responsible analyses of good quality need to be self-aware. If you do not know what the data represents and when and how it was collected, then you cannot begin to discuss the usefulness of any analysis of it.