The Role of Gratitude in Research

Research, like most things in life, is best approached with gratitude. In this post, I’ll share a bit about what I’m grateful for, an exercise in gratitude, and some food for thought about the role of gratitude in research.

First, here is a window into what I’m feeling grateful for.

Grateful for the challenge of research

Research can provide a challenging career. While it is possible to find positions in research that are more repetitive, most positions afford many opportunities for learning about new subject matter and new methods. Each new research question provides fresh challenges to implement. And with the body of literature and informal sources available, there is always the ability to read more deeply about the work that others have done. I am grateful for the perpetual learning experiences that research has brought.

Grateful for the versatility of research

One of my favorite aspects of a career in research is the versatility. I’ve been able to work in neuropsychology, physics education, sociolinguistics, social media research, media measurement and in public health using a great variety of research methods.

Grateful for my colleagues

Over the years I’ve had the pleasure of working with people that I respect, learn from and genuinely enjoy. I’m grateful for their help, their wisdom, their curiosity, their enthusiasm, their support, their friendship, and their comforting awkwardness.

Grateful for the research opportunities

I am grateful for the opportunity to study people. I am grateful for the people who agree to participate in research and who honestly share what is in their hearts or on their minds. Some opinions and experiences are easier to share than others. I am grateful for all of it. The qualitative work that I am currently involved with is often built on individual and group interviews that can be a powerful experience for the participant and the interviewer, and I am so grateful to the participants and the process for bringing this to fruition.

 

Now, let’s take a minute to Go Beyond the Gush. It is easy to get swept up in the everyday grind of research, whether because the research approval process seems unnecessarily repetitive or cumbersome, or data needs more wrangling than predicted, or the meetings seem endless and the emails, texts and phone calls seem constant, or the people working on a project are particularly difficult to corral, or the behavior that you need to observe in your research is particularly difficult to isolate, or… We can all get caught in the slog of research. But gratitude can help.

Here is an exercise:

Let’s take a minute to get very basic with this. First, think of the reasons why you enjoy your work. Then let’s take it back even further.

  • Be grateful to have a topic to research or to have the ability to find one. Be grateful for the ability to be curious and to find unanswered questions.
  • Be grateful to have the support to pursue this topic as a professional or as a student. Research costs time, money and many other resources.
  • Be grateful to have the skills to approach the topic. Think of all of the training that provided these skills. Think of the resources that are available to you to help you learn what you need.
  • Be grateful for your strength. You have the ability to tackle what comes your way.
  • Be grateful for the people who must come together to make this work happen. Sometimes we get stuck thinking of one person’s habits or quirks or in finding fault with the people around us. Some groups are more cohesive than others, and each person brings a different set of skills. Take a step back from that. Let go of it for a minute and take a fresh look. First see yourself as someone with strengths and weaknesses. Then see your colleagues in this light as well. Allow yourself to forgive yourself and others.
  • Be grateful for the challenges your work brings. Sometimes it seems to bring too many challenges. But those challenges are keeping you sharp. And in some way, they will offer you the opportunity to learn and grow.
  • Be grateful for research participants. These are the people who make our work possible by letting us into their world in some way. That is a privilege.

 

What do exercises like this gain you? A few things, really. Peace of mind. A break from the stress and an opportunity to just feel grateful. Perspective. A chance to put challenges that seem constant or insurmountable into a smaller box. The opportunity to see the people around us from a fresh perspective and hear them more clearly. A better insulation against the instability that affects us all. And an opportunity to see our research in context and think more broadly about the effect it has. The work we do affects people’s lives, but these basic mechanisms can become lost to us when we lose perspective. With fresh perspective and gratitude, we can better see these mechanisms in action and produce work that better respects all involved. No research exists in a vacuum, and the better we can understand the role our research plays in a wider context the better stewards we can be over this tremendous privilege we’ve been granted.

Thanks for listening.

The future doesn’t belong to big or small data. It belongs to the disruptors.

Research is evolving fast. There is less support and more doubt for traditional methods, a fast-changing set of expectations from end-users, and a fast-evolving field of nontraditional methods and approaches. The future of research does not belong to big data or small data. It belongs to the disruptors. It belongs to those who can recognize and challenge the assumptions underlying their methodologies. The future belongs to creative approaches, connected data, and collaboration.

 

Research requires listening and understanding.

In order to create research that is useful, there needs to be a deep understanding of end-users, clients, and the context of the products we create. This requires listening, understanding, and creating opportunities to learn more, both by representing end-users and clients more directly in the development process and by using qualitative research methods. Qualitative research provides methods of collecting and analyzing information about people, in-person, virtually and through behavioral data sources, and it must play a vitally important role in evolving research methods.

 

Research requires ever-changing analytic capabilities and creative, open minds.

We live in an era when data is plentiful. But the data looks different from what we saw in the past. We need capable and versatile technical workers who are able to process data. And we need the creativity to put the data to use in ways that benefit end-users.

 

Research must embrace diversity.

Creative strategy and good user-focus can’t spring from echo chambers. We need to connect to divergent experiences and views early and often in order to create good products. People with divergent views can raise questions earlier in the development process and allow us to integrate holistic solutions for problems we could not have thought of alone. Diverse experiences allow us to be more creative because they provide more material to inspire us. And diversity is crucial for us to successfully compete in the global marketplace.

 

Research design must be iterative.

If we want to create new ways of analyzing and connecting data, we have to be free to experiment with new methods, test new methods, and allow end-users to test proposed solutions. Often what we create doesn’t function the way that we expect it to. In an era where data does not need to be designed and collected, we have the flexibility to find creative ideas (“ugly babies”), nurture them, test them out, and tweak what doesn’t work.

 

Silos no longer make sense in research.

It no longer makes sense to separate end users from developers or quantitative from qualitative. The best disruptive, creative potential lies in the mingling of methods and people. The most useful products are the ones that can be created collaboratively.

 

Research can be agile.

Agile development has become standard practice in much of the software development world, but it makes sense for research as well. Agile teams can involve end-users, UX researchers, quantitative methodologists and qualitative methodologists. Research can be built by agile, creative teams that feel free to question and inspire each other.

 

Creating high-performing teams has never mattered more.

There is a growing body of great research about what makes a highly effective team. Effective teams are empathetic and open. They consider each other’s interests. They are practical and focused on the end product. They are comfortable asking questions and brainstorming solutions. They work collaboratively, and they celebrate their accomplishments.

 

The future of research is bigger than any one person or silo. It requires us to come together in new ways. I already see some firms moving in this direction- kudos to them. A new era is here, and I’m excited for us all!

That which is bigger than us

We learn about things that are bigger than ourselves in layers, and we accomplish tasks that are bigger than ourselves one step at a time.

 

In college, this knowledge came as a revelation to me. Instead of being something to learn, memorize and stand atop, knowledge was something that was created in pieces. Knowledge came to be about process.

 

In graduate school, I again came to relish the mystery of the analytic process through the activities of conversation analysis and discourse analysis. Over and again, I began with a small piece of data, like a conversation or a snippet of video, and watched it come to life through rounds of observation. Something that began as a digestible piece that wouldn’t necessarily attract attention became a multilayered journey into all of the pieces that comprise the situated social actions we make every day.

 

As a parent, I learned almost immediately that parenthood was bigger than me. I learned that I couldn’t be, do or know it all, and I learned that the choices and priorities I made dramatically governed the shape of my family. I learned that I could not be perfect in anyone’s eyes, and I could never measure up to every standard by which I was being measured. I learned that I was ultimately responsible for something I valued more than I had imagined possible, and that I ultimately had to accept and embrace my unique approach to the task. I could only strive to be a parent in the ways in which I was capable, and I could never fit anyone else’s vision. I learned that my shortcomings had to be a bridge of understanding to other parents, who also found themselves unequal to the task at their hands.

 

In my professional life, I’ve learned to relish the possibilities and opportunities that teamwork can bring. As a team we can achieve far more and greater things than we could ever achieve as individuals, and that which we can accomplish can be an inspiration. As a manager the most I could wish for is a team that is inspired by process and by potential, who can love the work and love the product of that work.

 

Ultimately, that’s what I wish for all things that are bigger than myself. Inspiration, pride, a love of the journey and the process- to love life and be surrounded by others who love life, in all its complications, challenges, ups and downs.

 

But all of this talk of inspiration neglects the other side of things that are bigger than us. When we make choices of where to focus our time and energy, other elements are always neglected. As a parent, I have to remind myself that I may not be a go-to mom at bake sale time, but I have other qualities to offer my kids. Even as we work to get things done there is always an undercurrent of things not getting done. And there are times when the journey ahead is more daunting than inspiring. There are the moments when all of the work we’ve accomplished becomes undone before our eyes. There are the toddlers behind us as we clean, some more and some less metaphorical, dumping toys and laughing. And there are the mountains ahead that seem to be too big to climb.

 

There is a TED talk that has been making the rounds lately about emotional hygiene. In it, the speaker talks about how we handle failure and disappointments. We all encounter failures and disappointments, small and large, every day. We conquer our to-do lists one day, only to see them build back up the next day. Sometimes our hard work is unrecognized. Sometimes our efforts are not enough. It’s one thing to love process, but what do we do when the process can’t fit the task ahead? How do we handle ambiguity? To ignore these challenges is to undercut the complicated texture of life.

I believe that part of embracing life is to embrace the mess; to embrace that which is bigger than ourselves; to keep feeling around the darkness until we find our way; to have faith that there is a path through the darkness; to continually double back to our rocks; to embrace the challenges and embrace our core that guides us through them; to recognize the downs and the ups, and to know where within ourselves to find the strength to persevere. These moments, these challenges allow us to hear, see and do that which is much grander than what we could see, hear or achieve alone or in any immediate sense. These are the elements that give depth to our lives. These are the challenges that define our lives and make life worth living.

Understanding news consumption and production can be like understanding the air we breathe

A careful, systematic look at the way you encounter news might just dramatically change your understanding of the genre. Here are some observations about creating and consuming news in our current information ecosystem.

Creating News

News is not one size fits all, and news methodology can’t be one size fits all. This is probably a well-known fact to people with more of a journalism background, but it is often overlooked by people who are newer to the field. Here are a few points that stem from these differences:

– Social media can be a great source for information about breaking events that have a critical base of witnesses with internet access.

– Social media is no substitute for news that has very few witnesses with privileged access to information.

– The core job of newsmakers is to keep the public informed about unfolding events. Oftentimes newsmakers are as invisible to their audiences as the people who develop dictionaries are. The audience assumes that the major events they see covered are the objectively most-major events, often without any understanding of the curation involved. Newsmakers provide a vital public service and have a moral obligation to the public, but that obligation is far from straightforward.

– News consumers may choose to engage most deeply in the topics they are most interested in, but that doesn’t invalidate a basic desire to know what’s going on in the world. This is why I like to advocate for eye tracking as an engagement metric- the current tracking metrics don’t reflect the most basic function of the news media.

 

Consuming News

News exposure is seamlessly integrated into our daily experiences. As a child, I would watch multiple newscasts with my mom, and we would both scan the newspapers regularly. As a new parent, I visited multiple websites to collect news from different perspectives and regularly watched multiple newscasts- this seemed like an essential tie between the small world of new parenthood and the larger world outside my door. But these days I work long hours and rarely catch newscasts or have time to visit multiple news sites. Someone recently asked me which news outlets I follow, and I was surprised that the answer didn’t come very readily to me. I’ve been making a careful effort to observe my contact with news stories, outlets and journalists, and I highly recommend this exercise to anyone interested in understanding or measuring media use.

Here is some of what I’ve observed:

– Twitter is the first platform I think of when I think of news. I think of it as my own curated stream of news amidst the wider raging river of information flow. But when it comes to news stories in particular, I often hear about them not because I seek them out or curate them but because my streams are based on people who have a variety of interests. I hear about emerging news because people go off-topic in their Twitter streams, not because I seek it out. I often value this dynamic as a kind of filter of its own, because major events enter my stream from a variety of perspectives, but the majority of news does not.

– Re: Interest-based streams- I mostly follow researchers on Twitter. As a result, I can follow conferences as they happen or read interesting articles as they come out. Is this news? What makes it news?

– Platforms morph based on the way people use them. See the @clintonyates Twitter feed for an example of a journalist using Twitter to tell resonant stories in a unique way that defies traditional uses of the platform.

– Re: Instagram- I love to follow Instagrammers because I really love photography. Some of the Instagrammers I follow are photojournalists. This is an area of news coverage that is rarely considered in depth. And sometimes I wonder whether these pictures are only news if they contain, and I read, captions explaining their context and importance.

– Facebook is often discussed as a news source, but it is very important when discussing Facebook as a news source to consider the social context of information. I will share news from news sources only if I think it is something I can share without harming valued personal relationships with people across many ideological spectra and backgrounds. That said, some of my friends will regularly share the pieces that I choose not to. When I see those articles from these friends I will put the articles in the context of what I’ve seen from those people in the past, my patterns with them in regard to the topic, and my social patterns with them in general.

– It is important to recognize that news items on Facebook can come from news sources, interest groups or pages, interested people, or simply from Facebook. The source interacts with the platform to create the stimulus.

– Re: other fora- There are many more news sources that I follow to varying degrees. I receive research updates and daily briefings from Pew and Nielsen, which I read with varying frequency (the only one I read every day is the Daily Briefing from the Pew Journalism Project). I also receive e-mails from research and technical lists, lists about STEM education, community lists, blog notifications and emails from LinkedIn. I read the Sunday paper, and weekly updates from my employer, and I regularly hear and participate in discussions in my workplace and outside of it. Each of these is a potential news source that may bring in other news sources.

– These sources listed together may appear to amount to a critical mass of time, but I was not aware of that critical mass until I stopped to observe it. Our choices and actions regarding media consumption are as unconscious as many other choices we make with our time.

All of this is to say that news is as seamlessly integrated into my environment as the air I breathe, and it stems from sources of all kinds. Every story has a different way of intersecting with and co-creating my own. Whereas news media has a particularly strong history of top-down and one-way dissemination, it is much more ubiquitous, multi-directional and part of our ecosystem now than ever before. We are consumers and participants in very different ways, and understanding these is a key to understanding and developing tools for news in the future.

 

* A side note re: pay to read. My advice to news outlets is to find a way to integrate pre-existing online funding resources (like Amazon, PayPal, etc.) in a collective or semi-standardized way, so that people don’t have to provide financial information to anyone new, and so that people can pay small fees (e.g. 25 cents for a long-read or something that required a good deal of expense to produce, 5 or 10 cents for smaller or shorter pieces) with a single click and pay as they go to read around a variety of sources.

Ruby slippers? The professional skills that parenthood builds

As a parent of older children, I am strongly aware of the ways in which parenthood has affected my career. I’m also aware of the many professional skills that parenthood has reinforced in me. Lately I’ve found myself discussing these skills with other parents, many of whom had always focused more on the drawbacks of parenting than on the advantages it brings to the workplace. These skills can be like our ruby slippers. They are wonderful, and we’ve developed them along the road without ever realizing what we’ve had.

For example:

1. The buck stops here.

There is a moment in (very) early parenthood when you hear your child cry and wonder what someone will do to soothe them. In the next moment you realize that you (more than anyone else on earth) are the one who is supposed to soothe the child. This is a big step in your transformation into parenthood. This sets the stage for you to advocate for your child, defend your child and soothe your child. But it also transforms you as a person, from someone who expects others to do things to someone who expects to do things yourself. The guts with which you advocate for your children should also help you advocate for yourself and your coworkers, and the proactive habits you develop can permeate everything you do, both in the home and in the workplace.

2. Efficiency

Wasting time is a big deal for parents. I am happy to waste time on a kayak, at the beach, or hiking with my kids. But I am not willing to redo work that I have already done. This distinction has made me very aware of my time use and organization. Although multiple layers of checks and balances can be great, I don’t want to read the same email twice, shuffle the same piece of paper twice or spend time trying to figure out where I left off with a project- potentially redoing work that I have already done. My time feels precious, and that drives me to be significantly more organized and efficient. It also drives me to think carefully about process and streamline what I can to maximize quality and minimize unintentional duplication.

3. Prioritizing

There is more work to do than you will ever be able to keep up with. You can’t work full time or overtime and pursue professional development, and then come home and keep a perfectly clean and maintained home, cook a full meal, keep up with the laundry, spend time doing homework and teaching extra lessons, attend PTA and school events, take your kids to lessons of all kinds, do bathtime and bedtime rituals, juggle sick kids and dentist appointments, keep up with all of the bills, paperwork and repairs that arise, exercise regularly, keep up with the news and trends, pursue spiritual fulfillment, participate in your community, develop your hobbies and interests, spend time with your extended family, and enjoy leisure time. You will have to prioritize the things that you find most important and necessary. Thinking strategically about your time is also a really great professional skill that will help you to better organize your time and the time of your team.

4. Reconciling differences

The priorities that you have chosen from the list above will evolve over time, and they will be different from the priorities that others choose. Your priorities will differ from other adults’ priorities, and they will differ from your kids’ priorities (and their priorities will differ, too!). Somehow you will have to reconcile these differences, and “my way or the highway” will only get you so far. At work you will also find that you have different priorities than other people you work with. Managing differences in priorities is a great professional skill to have.

5. Dealing with personal conflicts

One amazing lesson of parenthood is that just when you are ready to turn and run for cover from your child is just about when you need to spend more time together. Find a change of scenery and an activity that you both enjoy, and retreat to it together. Whatever challenge you were stuck on will usually become much easier to pull through after a break. The same trap of pulling away and developing conflicts happens in professional environments. These traps can grow into irreconcilable differences if they are left to fester, but they can often be little more than small bumps in the road if they are caught and acted on early.

6. Gratitude for intellectual challenges

I love parenthood. But as much as I LOVE Boynton and Dr. Seuss books, I am also very happy to balance out family time with activities that stretch my mind and make me think. Before I became a parent I took a career for granted. Of course I would always be working! But in the early days of parenthood it sometimes felt like a miracle to walk into the office on time, fully dressed, and rested enough to do my work. I am very fortunate to not only have a job that supports my family, but to have a job that keeps me intellectually stimulated and interested. I am interested in research methodology, and I am extremely grateful to be able to pursue that interest. For someone who had labeled solo trips to the grocery store “me time” for years, graduate school felt almost like a really great book club. It was an excuse to read great books, write interesting papers and have regular discussions with adults who shared some of the same interests. A career is not just a responsibility; it is a privilege.

7. Explanations, explanations and more explanations

I spent a few nights while I was in graduate school reading academic articles to my family and explaining why they were so cool and interesting. I also practice talks with my family, taking detours to make sure they understand what I’m saying. Being able to communicate about your work with any audience is a real gift. Not only does it help you develop a great understanding of your work, it helps prepare you to interact with a wide variety of people.

 
There are many more topics along these lines that I could cover, and maybe I will cover them another day. But for now I hope you’ve taken some inspiration from the advantages that parenthood brings to a career. Parenting can make you more focused, more proactive, better able to deal with an array of people, and more grateful for work and the challenges it brings. It brings challenges, but it also fortifies you.

The surprising unpredictability of language in use

This morning I received an e-mail from an international professional association that I belong to. The e-mail was in English, but it was not written by an American. As a linguist, I recognized the differences in formality and word use as signs that the person who wrote the e-mail is speaking from a set of experiences with English that differ from my own. Nothing in the e-mail was grammatically incorrect (although as a linguist I am hesitant to judge any linguistic differences as correct or incorrect, especially out of context).

Then later this afternoon I saw a tweet from Twitter on the correct use of Twitter abbreviations (RT, MT, etc.). If the growth of new Twitter users has indeed leveled off, then Twitter is lucky, because the more Twitter grows, the less it will be able to influence the language use of its base.

Language is a living entity that grows, evolves and takes shape based on individual experiences and individual perceptions of language use. If you think carefully about your experiences with language learning, you will quickly see that single exposures and dictionary definitions teach you little, but repeated viewings across contexts teach you much more about language.

Language use is patterned. Every word combination has a likelihood of appearing together, and that likelihood varies based on a host of contextual factors. Language use is complex. We use words in a variety of ways across a variety of contexts. These facts make language interesting, but they also obscure language use from casual understanding. The complicated nature of language in use interferes with analysts who build assumptions about language into their research strategies without realizing that their assumptions would not stand up to careful observation or study.
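For readers who like to see an idea in code, here is a minimal sketch of what "a likelihood of appearing together" can mean in practice. It estimates, from a tiny made-up sentence, how often each word is followed by each other word. The function name `bigram_probabilities` and the toy corpus are my own assumptions for illustration, and a real study would need far more text and would condition on the contextual factors mentioned above.

```python
from collections import Counter


def bigram_probabilities(tokens):
    """Estimate P(second word | first word) from adjacent word pairs."""
    # Count every adjacent pair, e.g. ("the", "cat").
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count how often each word appears as the first word of a pair.
    first_counts = Counter(tokens[:-1])
    return {
        (first, second): count / first_counts[first]
        for (first, second), count in pair_counts.items()
    }


# Toy corpus, invented purely for illustration.
text = "the cat sat on the mat and the cat slept"
probs = bigram_probabilities(text.split())

# List the pairs from most to least likely continuation.
for (first, second), p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{first} -> {second}: {p:.2f}")
```

Even in this toy sentence, "the" is followed by "cat" twice as often as by "mat". Change the text and the pattern changes with it, which is exactly why assumptions about language use need to be checked against observation.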

I would advise anyone involved in the study of language use (either as a primary or secondary aspect of their analysis) to take language use seriously. Fortunately, linguistics is fun and language is everywhere. So hop to it!

Reporting on the AAPOR 69th national conference in Anaheim #aapor

Last week AAPOR held its 69th annual conference in sunny (and hot) Anaheim, California.

Palm Trees in the conference center area

My biggest takeaway from this year’s conference is that AAPOR is a very healthy organization. AAPOR attendees were genuinely happy to be at the conference, enthusiastic about AAPOR and excited about the conference material. Many participants consider AAPOR their intellectual and professional home base and really relished the opportunity to be around kindred spirits (often socially awkward professionals who are genuinely excited about our niche). All of the presentations I saw firsthand or heard about were solid and dense, and the presenters were excited about their work and their findings. Membership, conference attendance, journal and conference submissions and volunteer participation are all quite strong.

 

At this point in time, the field of survey research is encountering a set of challenges. Nonresponse is a growing challenge, and other forms of data and analysis are increasingly in vogue. I was really excited to see that AAPOR members are greeting these challenges and others head on. For this particular write-up, I will focus on these two challenges. I hope that others will address some of the other main conference themes and add their notes and resources to those I’ve gathered below.

 

As survey nonresponse becomes more of a challenge, survey researchers are moving from traditional measures of response quality (e.g. response rates) to newer measures (e.g. nonresponse bias). Researchers are increasingly anchoring their discussions about survey quality within the Total Survey Error framework, which offers a contextual basis for understanding the problem more deeply. Instead of focusing on an across-the-board rise in response rates, researchers are strategizing their resources with the goal of reducing nonresponse bias. This includes understanding response propensity (Who is likely not to respond to the survey? Who is most likely to drop out of a panel study? What are some of the barriers to survey participation?), looking for substantive measures that correlate with response propensity (e.g. Are small, rural private schools less likely to respond to a school survey? Are substance users less likely to respond to a survey about substance abuse?), and continuous monitoring of paradata during the collection period (e.g. developing differential strategies by disposition code, focusing the most successful interviewers on the most reluctant cases, or concentrating collection strategies where they are expected to be most effective). This area of strategizing emerged in AAPOR circles a few years ago with discussions of nonresponse propensity modeling, a process which is surely much more accessible than it sounds, but it has really evolved into a practical and useful tool that can help a research shop of any size increase survey quality and lower costs.
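To make the difference between response rates and nonresponse bias concrete, here is a small sketch. It uses the standard deterministic identity for the bias of a respondent mean: the nonresponse rate multiplied by the gap between respondents and nonrespondents. The function name `nonresponse_bias` and every number below are invented for illustration; none of them come from the conference presentations.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean relative to the full-sample mean.

    Deterministic identity:
        bias = (1 - response_rate) * (mean_respondents - mean_nonrespondents)
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return (1 - response_rate) * (mean_respondents - mean_nonrespondents)


# Invented numbers: the share of sample members reporting some behavior.
# Survey A: low response rate, but respondents resemble nonrespondents.
bias_a = nonresponse_bias(0.30, 0.52, 0.50)
# Survey B: higher response rate, but respondents differ from nonrespondents.
bias_b = nonresponse_bias(0.60, 0.60, 0.40)

print(f"Survey A: response rate 30%, bias {bias_a:+.3f}")
print(f"Survey B: response rate 60%, bias {bias_b:+.3f}")
```

In this made-up comparison the survey with double the response rate carries the larger bias, because what matters is how different the nonrespondents are, not only how many of them there are. In practice the nonrespondent mean is unknown, which is where propensity models and paradata come in.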

 

Another big takeaway for me was the volume of discussions and presentations that spoke to the fast-emerging world of data science and big data. Many people spoke of the importance of our voice in the realm of data science, particularly with our professional focus on understanding and mitigating errors in the research process. A few practitioners applied error frameworks to analyses of organic data, and some talks were based on analyses of organic data. This year AAPOR also sponsored a research hack to investigate the potential for Instagram as a research tool for Feed the Hungry. These discussions, presentations and activities made it clear that AAPOR will continue to have a strong voice in the changing research environment, and the task force reports and initiatives from both the membership and education committees reinforced AAPOR’s ability to be right on top of the many changes afoot. I’m eager to see AAPOR’s changing role take shape.

“If you had asked social scientists even 20 years ago what powers they dreamed of acquiring, they might have cited the capacity to track the behaviors, purchases, movements, interactions, and thoughts of whole cities of people, in real time.” – N.A.  Christakis. 24 June 2011. New York Times, via Craig Hill (RTI)

 

AAPOR is a very strong, well-loved organization, and it is building a very strong future from a very solid foundation.

 

 


 

MORE DETAILED NOTES:

This conference is huge, and I could not possibly cover all of it on my own, so I will share my own notes along with the notes and resources I can collect from other attendees. If you have any materials to share, please send them to me! The more information I am able to collect here, the better a resource it will be for people interested in AAPOR or the conference.

 

Patrick Ruffini assembled the tweets from the conference into this storify

 

Annie, the blogger behind LoveStats, had quite a few posts from the conference. I sat on a panel with Annie on the role of blogs in public opinion research (organized by Joe Murphy for the 68th annual AAPOR conference), and Annie blew me away by live-blogging the event from the stage! Clearly, she is the fastest blogger in the West and the East! Her posts from Anaheim included:

Your Significance Test Proves Nothing

Do panel companies manage their panels?

Gender bias among AAPOR presenters

What I hate about you AAPOR

How to correct scale distribution errors

What I like about you AAPOR

I poo poo on your significance tests

When is survey burden the fault of the responders?

How many survey contacts is enough?

 

My full notes are available here (please excuse any formatting irregularities). Unfortunately, they are not as extensive as I would have liked, because wifi and power were in short supply. I also wish I had settled into a better seat and covered some of the talks in greater detail, including Don Dillman’s talk, which was a real highlight of the conference!

I believe Rob Santos’ professional address will be available for viewing or listening soon, if it is not already available. He is a very eloquent speaker, and he made some really great points, so this will be well worth your time.

 

Let’s talk about data cleaning

Data cleaning has a bad rep. In fact, it has long been considered the grunt work of the data analysis enterprise. I recently came across a piece of writing in the Harvard Business Review that lamented the amount of time data scientists spend cleaning their data. The author feared that data scientists’ skills were being wasted on the cleaning process when they could be using their time for the analyses we so desperately need them to do.

I’ll admit that I haven’t always loved the process of cleaning data. But my view of the process has evolved significantly over the last few years.

As a survey researcher, my cleaning process used to begin with a tall stack of paper forms. Answers that did not make logical sense during the checking process sparked a trip to the file folders to find the form in question. The forms often held physical evidence of indecision on the part of the respondent, such as eraser marks or an explanation in the margin, which could not have been reflected properly by the data entry person. We lost this part of the process when we moved to web surveys. It sometimes felt like a web survey left the respondent no way to communicate with the researcher about their unique situations. Data cleaning lost its personalized feel and detective story luster and became routine and tedious.

Despite some of the affordances of the movement to web surveys, much of the cleaning process stayed rooted in the old techniques. Each form has its own id number, and the programmers would use those id numbers for corrections:

if id=1234567, set var1=5, set var7=62

At this point a “good programmer” would also document the changes for future collaborators:

*this person was not actually a forest ranger, and they were born in 1962
if id=1234567, set var1=5, set var7=62

Making these changes grew tedious very quickly, and the process seemed to drag on for ages. The researcher would check the data for potential errors, scour the records that could hold those errors for any kind of evidence of the respondent’s intentions, and then handle each form one at a time.

My techniques for cleaning data have changed dramatically since those days. My goal is to use id numbers as rarely as possible, and instead to ask myself questions like “how can I tell that these people are not forest rangers?” The answers to these questions evoke a subtly different technique:

* these people are not actually forest rangers
if var7=35 and var1=2 and var10 contains ‘fire fighter’, set var1=5

This technique requires honing and testing (adjusting the precision and recall), but I’ve found it to be faster, more efficient, more comprehensive and, most of all, more fun (oh hallelujah!). It makes me wonder whether we have perpetually undercut the quality of the data cleaning we do simply because we hold the process in such low esteem.
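For readers who want to see what honing and testing such a rule might look like, here is a minimal Python sketch. It assumes a small sample of records that someone has checked by hand, applies the rule from the example above, and then measures precision (how many of the recoded records really needed recoding) and recall (how many of the records that needed recoding the rule actually caught). The field names var1, var7 and var10 come from the example; the sample records, the hand-checked set of ids, the function names, and the choice to ignore letter case are all assumptions made for illustration.

```python
def is_not_forest_ranger(record):
    """Rule from the example: var7 is 35, var1 is 2, and the open-ended
    var10 contains 'fire fighter' (ignoring case, which is an assumption)."""
    return (
        record["var7"] == 35
        and record["var1"] == 2
        and "fire fighter" in record["var10"].lower()
    )


def apply_rule(records):
    """Recode var1 to 5 wherever the rule matches; return the ids changed."""
    flagged = set()
    for record in records:
        if is_not_forest_ranger(record):
            record["var1"] = 5
            flagged.add(record["id"])
    return flagged


def precision_recall(flagged, truly_wrong):
    """Compare the ids the rule changed with the ids a human says were wrong."""
    true_positives = len(flagged & truly_wrong)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truly_wrong) if truly_wrong else 0.0
    return precision, recall


# Hypothetical hand-checked sample.
sample = [
    {"id": 1, "var1": 2, "var7": 35, "var10": "Fire fighter, county dept."},
    {"id": 2, "var1": 2, "var7": 35, "var10": "firefighter"},
    {"id": 3, "var1": 2, "var7": 35, "var10": "forest ranger"},
    {"id": 4, "var1": 2, "var7": 35, "var10": "park ranger, volunteer fire fighter"},
]
truly_wrong = {1, 2}  # ids a human reviewer decided were miscoded

flagged = apply_rule(sample)
precision, recall = precision_recall(flagged, truly_wrong)
print(sorted(flagged), precision, recall)
```

In this made-up sample the rule recodes records 1 and 4 but misses record 2, because “firefighter” is written as one word, and it wrongly recodes record 4. Both numbers come out at 0.5, which is exactly the kind of result that sends you back to adjust the rule and test it again.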

So far I have not discussed data cleaning for other types of data. I’m currently working on a corpus of Twitter data, and I don’t see much of a difference in the cleaning process. The data types and programming statements I use are different, but the process is very close. It’s an interesting and challenging process that involves detective work, a better and growing understanding of the intricacies of the dataset, a growing set of programming skills, and a growing understanding of the natural language use in your dataset. The process mirrors the analysis to such a degree that I’m not really sure why it would be such a bad thing for analysts to be involved in data cleaning.

I’d be interested to hear what my readers have to say about this. Is our notion of the value and challenge of data cleaning antiquated? Is data cleaning a burden that an analyst should bear? And why is there so little talk about data cleaning, when we could all stand to learn so much from each other in the way of data structuring code and more?

Professional Identity: Who am I? And who are you?

Last night I acted as a mentor at the annual Career Exploration Expo sponsored by my graduate program. Many of the students had questions about developing a professional identity. This makes sense, of course, because graduate school is an important time for discovering and developing a professional identity.

People enter our program (and many others) with a wide variety of backgrounds and interests. They choose from a variety of classes that fit their interests and goals. And then they try to map their experience onto job categories. But boxes are difficult to climb into and out of, and students soon discover that none of the boxes is a perfect fit.

I experienced this myself. I entered the program with an extensive and unquestioned background in survey research. Early in my college years (while I was studying and working in neuropsychology) I began to manage a clinical dataset in SPSS. Working with patients and patient files was very interesting, but to my surprise working with data using statistical software felt right to me much in the way that Ethiopian meals include injera and Japanese meals include rice (IC 2006 (1997) Ohnuki Tierney Emiko). I was actually teased by my friends about my love of data! This affinity served me well, and I enjoyed working with a variety of data sets while moving across fields and statistical programming languages.

But my graduate program blew my mind. I felt like I had spent my life underwater and then discovered the sky and continents. I discovered many new kinds of data and analytic strategies, all of which were challenging and rewarding. These discoveries inspired me to start this blog and have inspired me to attend a wide variety of events and read some very interesting work that I never would have discovered on my own. Hopefully followers of this blog have enjoyed this journey as much as I have!

As a recent graduate, I sometimes feel torn between worlds. I still work as a survey researcher, but I’m inspired by research methods that are beyond the scope of my regular work. Another recent graduate of our program who is involved in market research framed her strategy in a way that really resonated with me: “I give my customers what they want and something else, and they grow to appreciate the ‘something else.’” That sums up my current strategy. I do the survey management and analysis that is expected of me in a timely, high-quality way. But I am also using my newly acquired knowledge to incorporate text analysis into our data cleaning process in order to streamline it, increasing both the speed and the quality of the process and making it better equipped to handle the data from future surveys. I do the traditional quantitative analyses, but I supplement them with analyses of the open-ended responses that use more flexible text analytic strategies. These analyses spark more quantitative analyses and make for much better (richer, more readable and more inspired) reports.

Our goal as professionals should be to find a professional identity that best capitalizes on  our unique knowledge, skills and abilities. There is only one professional identity that does all of that, and it is the one you have already chosen and continue to choose every day. We are faced with countless choices about what classes to take, what to read, what to attend, what to become involved in, and what to prioritize, and we make countless assessments about each. Was it worthwhile? Did I enjoy it? Would I do it again? Each of these choices constitutes your own unique professional self, a self which you are continually manufacturing. You are composed of your past, your present, and your future, and your future will undoubtedly be a continuation of your past and present. The best career coach you have is inside of you.

Now your professional identity is much more uniquely or narrowly focused than the generic titles and fields that you see in the professional marketplace. Keep in mind that each job listing that you see represents a set of needs that a particular organization has. Is this a set of needs that you are ready to fill? Is this a set of needs that you would like to fill? You are the only one who knows the answers to these questions.

Because it turns out that you are your best career coach, and you have been all along.

In praise of getting things wrong and working toward better

“An expert is a man who has made all the mistakes which can be made in a very narrow field” -Niels Bohr

I’ve been reading “In the Plex,” a book about the history of Google by Steven Levy. I highly recommend this book, because as I read it I am increasingly aware of the ways in which Google’s constant presence invisibly shapes our daily lives. Levy makes a point in the book of attributing some of Google’s constant evolution to its obsession with failure. In search terms, isolating failures is relatively easy: if people soon return to the search page, reframe their query, or continue down through lower-ranked results, their search was a relative failure. Failures are identified and isolated by Google and then obsessed over until the PageRank algorithm can be appropriately tweaked in a way that passes rigorous testing protocols.

In this way, Google is similar to an increasing number of failure-focused initiatives, including some of the engineering-based models that have been applied to healthcare and more. These voices are increasingly the source of innovations that are continually shaping and reshaping our future. But their evangelizers’ rhetoric of failure and success can be hard for us to wrap our heads around, as people who naturally fear and avoid failure and focus on it in a negative way.

Over the weekend, while I was practicing Yoga, I told one of my kids my favorite part of the practice (note: not a good time for chatting). I love that Yoga is a process. One day you will be able to do something that you may or may not be able to do the next day, and vice versa. My practice involves quite a bit of balancing on one foot, and there are days when that balance feels effortless and days when that balance feels impossible. But the effortless days only come because I continue to practice despite the disappointments of my wobblier days. Yoga instructors sometimes talk about the power of intentions and working in ways that align with our intentions. One of my kids pointed out that the wobbly days, as I call them, are exactly the reason why she hates Yoga. She believes that she’s no good at it, and because of her assessment she will avoid it. You can probably guess that this conversation is far from over between us.

We see attitudes like these affecting people (including ourselves) every day. Some people theorize that the lower representation of women in STEM (Science, Technology, Engineering and Math) fields is due to a larger proportion of women than men who doubt their abilities or judge their abilities more harshly. We hear about graduate students who experience what is sometimes called the ‘imposter syndrome.’ I remember some students in my graduate classes who chose not to participate in class for fear they would sound stupid. I’ve heard of medical practitioners who were so worried that they would make another mistake that they were afraid to practice. As a writer, I know that the power of self-doubt can cause writer’s block, but I also know how much easier it is to edit or rewrite.

I would encourage all of you to embrace your failures, your mistakes, your shortcomings, your missteps and your errors and see them as part of a process and not an endpoint. These stumbling points are the key points of growth: the key moments for us to learn and to redirect our actions to better suit our intentions. To err is human, but to learn from our missteps is surely something greater.