Understanding news consumption and production can be like understanding the air we breathe

A careful, systematic look at the way you encounter news might just dramatically change your understanding of the genre. Here are some observations about creating and consuming news in our current information ecosystem.

Creating News

News is not one size fits all, and news methodology can’t be one size fits all either. This is probably well known to people with more of a journalism background, but it is often overlooked by people who are newer to the field. Here are a few points that stem from these differences:

– Social media can be a great source for information about breaking events that have a critical base of witnesses with internet access.

– Social media is no substitute for news that has very few witnesses with privileged access to information.

– The core job of newsmakers is to keep the public informed about unfolding events. Oftentimes newsmakers are as invisible to their audiences as the people who develop dictionaries are. The audience assumes that the major events they see covered are the objectively most-major events, often without any understanding of the curation involved. Newsmakers provide a vital public service and have a moral obligation to the public, but that obligation is far from straightforward.

– News consumers may choose to engage most deeply in the topics they are most interested in, but that doesn’t invalidate a basic desire to know what’s going on in the world. This is why I like to advocate for eye tracking as an engagement metric: the current tracking metrics don’t reflect the most basic function of the news media.

 

Consuming News

News exposure is seamlessly integrated into our daily experiences. As a child, I would watch multiple newscasts with my mom, and we would both scan the newspapers regularly. As a new parent, I visited multiple websites to collect news from different perspectives and regularly watched multiple newscasts. This seemed like an essential tie between the small world of new parenthood and the larger world outside my door. But these days I work long hours and rarely catch newscasts or have time to visit multiple news sites. Someone recently asked me which news outlets I follow, and I was surprised that the answer didn’t come very readily to me. I’ve been making a careful effort to observe my contact with news stories, outlets and journalists, and I highly recommend this exercise to anyone interested in understanding or measuring media use.

Here is some of what I’ve observed:

– Twitter is the first platform I think of when I think of news. I think of it as my own curated stream of news amidst the wider raging river of information flow. But when it comes to news stories in particular, I often hear about them not because I seek them out or curate them but because my streams are based on people who have a variety of interests and who go off-topic in their Twitter streams. I often value this dynamic as a kind of filter of its own, because major events enter my stream from a variety of perspectives, but the majority of news does not.

– Re: Interest-based streams- I mostly follow researchers on Twitter. As a result, I can follow conferences as they happen or read interesting articles as they come out. Is this news? What makes it news?

– Platforms morph based on the way people use them. See @clintonyates Twitter feed for an example of a journalist using Twitter to tell resonant stories in a unique way that defies traditional uses of the platform.

– Re: Instagram- I love to follow Instagrammers because I really love photography. Some of the instagrammers I follow are photojournalists. This is an area of news coverage that is rarely considered in depth. And sometimes I wonder whether these pictures are only news if they contain, and I read, captions explaining their context and importance?

– Facebook is often discussed as a news source, but it is very important when discussing Facebook as a news source to consider the social context of information. I will share news from news sources only if I think it is something I can share without harming valued personal relationships with people across many ideological spectra and backgrounds. That said, some of my friends will regularly share the pieces that I choose not to. When I see those articles from these friends I will put the articles in the context of what I’ve seen from those people in the past, my patterns with them with regard to the topic, and my social patterns with them in general.

– It is important to recognize that news items on Facebook can come from news sources, interest groups or pages, interested people, or simply from Facebook. The source interacts with the platform to create the stimulus.

– Re: other fora- There are many more news sources that I follow to varying degrees. I receive research updates and daily briefings from Pew and Nielsen, which I read with varying frequency (the only one I read every day is the Daily Briefing from the Pew Journalism Project). I also receive e-mails from research and technical lists, lists about STEM education, community lists, blog notifications and emails from LinkedIn. I read the Sunday paper, and weekly updates from my employer, and I regularly hear and participate in discussions in my workplace and outside of it. Each of these is a potential news source that may bring in other news sources.

– These sources listed together may appear to amount to a critical mass of time, but I was not aware of that critical mass until I stopped to observe it. My choices and actions regarding media consumption are as unconscious as many other choices I make with my time.

All of this is to say that news is as seamlessly integrated into my environment as the air I breathe, and it stems from sources of all kinds. Every story has a different way of intersecting with and co-creating my own. Whereas news media has a particularly strong history of top-down and one-way dissemination, it is much more ubiquitous, multi-directional and part of our ecosystem now than ever before. We are consumers and participants in very different ways, and understanding these is a key to understanding and developing tools for news in the future.

 

* A side note re: pay to read. My advice to news outlets is to find a way to integrate pre-existing online funding resources (like Amazon, PayPal, etc.) in a collective or semi-standardized way, so that people don’t have to provide financial information to anyone new, and so that people can pay small fees (e.g. 25 cents for a long-read or something that required a good deal of expense to produce, 5 or 10 cents for smaller or shorter pieces) with a single click and pay as they go to read around a variety of sources.

Reflections and Notes from the Sentiment Analysis Symposium #SAS14

The Sentiment Analysis Symposium took place in NY this week in the beautiful offices of the New York Academy of Sciences. The Symposium was framed as a transition into a new era of sentiment analysis, an era of human analytics or humetrics.

The view from the New York Academy of Sciences is really stunning!

Two main points struck me during the event. One is that context is extremely important for developing high quality analytics, but the actual shape that “context” takes varies greatly. The second is a seeming disconnect between the product developers, who are eagerly developing new and better measures, and the customers, who want better usability, more customer support, more customized metrics that fit their preexisting analytic frameworks and a better understanding of why social media analysis is worth their time, effort and money.

Below is a summary of some of the key points. My detailed notes from each of the speakers can be viewed here. I attended both the more technical Technology and Innovation Session and the Symposium itself.

Context is in. But what is context?

The big takeaway from the Technology and Innovation session, which was then carried into the second day of the Sentiment Analysis Symposium, was that context is important. But context was defined in a number of different ways.

 

New measures are coming, and old measures are improving.

The innovative new strategies presented at the Symposium made for really amazing presentations. New measures include voice intonation, facial expressions via remote video connections, measures of galvanic skin response, self tagged sentiment data from social media sharing sites, a variety of measures from people who have embraced the “quantified self” movement, metadata from cellphone connections (including location, etc.), behavioral patterning on the individual and group level, and quite a bit of network analysis. Some speakers showcased systems that involved a variety of linked data or highly visual analytic components. Each of these measures increases the accuracy of preexisting measures and complicates their implementation, bringing new sets of challenges to the industry.

Here is a networked representation of the emotion transition dynamics of ‘Hopeful’

This software package is calculating emotional reactions to a YouTube video that is both funny and mean

Meanwhile, traditional text-based sentiment analyses are also improving. Both the quality of machine learning algorithms and the quality of rule based systems are improving quickly. New strategies include looking at text data pragmatically (e.g. What are common linguistic patterns in specific goal directed behavior strategies?), gaining domain level specificity, adding steps for genre detection to increase accuracy and looking across languages. New analytic strategies are integrated into algorithms and complementary suites of algorithms are implemented as ensembles. Multilingual analysis is a particular challenge to ML techniques, but can be achieved with a high degree of accuracy using rule based techniques. The attendees appeared to agree that rule based systems are much more accurate than machine learning algorithms, but the time and expertise involved have caused them to fall out of vogue.

 

“The industry as a whole needs to grow up”

I suspect that Chris Boudreaux of Accenture shocked the room when he said “the industry as a whole really needs to grow up.” Speaking off the cuff, without his slides after a mishap and adventure, Boudreaux gave the customer point of view toward social media analytics. He said that social media analysis needs to be more reliable, accessible, actionable and dependable. Companies need to move past the startup phase to a new phase of accountability. Tools need to integrate into preexisting analytic structures and metrics, to be accessible to customers who are not experts, and to come better supported.

Boudreaux spoke of the need for social media companies to better understand their customers. Instead of marketing tools to their wider base of potential customers, the tools seem to be developed and marketed solely to market researchers. This has led to a more rapid adoption among the market research community and a general skepticism or ambivalence across other industries, who don’t see how using these tools would benefit them.

The companies who truly value and want to expand their customer base will focus on the usability of their dashboards. This is an area ripe for a growing legion of usability experts and usability testing. These dashboards cannot restrict API access and understanding to data scientist experts. These companies will develop, market and support their dashboards through productive partnerships with their customers, generating measures that are specifically relevant to them and personalized dashboards that fit into preexisting metrics and are easy for the customers to understand and react to in a very practical and personalized sense.

Some companies have already started to work with their customers in more productive ways. Crimson Hexagon, for example, employs people who specialize in using their dashboard. These employees work with customers to better understand and support their use of the platform and run studies of their own using the platform, becoming an internal element in the quality feedback loop.

 

Less Traditional fields for Social Media Analysis:

There was a wide spread of fields represented at the Symposium. I spoke with someone involved in text analysis for legal reasons, including jury analyses. I saw an NYPD name tag. Financial services were well represented. Publishing houses were present. Some health related organizations were present, including neuroscience specialists, medical practitioners interested in predicting early symptoms of diseases like Alzheimer’s, medical specialists interested in helping improve the lives of people with diseases like Autism (e.g. with facial emotion recognition devices), pharmaceutical companies interested in understanding medical literature on a massive scale as well as patient conversation about prescriptions and participation in medical trials. There were traditional market research firms, and many new startups with a wide variety of focuses and functions. There were also established technology companies (e.g. IBM and Dell) with innovation wings and many academic departments. I’m sure I’ve missed many of the entities present or following remotely.

The better research providers can understand the potential breadth of applications of their research, the more they can improve the specific areas of interest to these communities.

 

Rethinking the Public Image of Sentiment Analysis:

There was some concern that “social” is beginning to have too much baggage to be an attractive label, causing people to think immediately of top platforms such as Facebook and Twitter and belying the true breadth of the industry. This prompted a movement toward other terms at the symposium, including human analytics, humetrics, and measures of human engagement.

 

Accuracy

Accuracy tops out at about 80%, because that’s the limit of inter-rater reliability in sentiment analysis. Understanding the more difficult data is an important challenge for social media analysts. It is important for there to be honesty with customers and with each other about the areas where automated tagging fails. This particular area was a kind of elephant in the room: always present, but rarely mentioned.
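To make the inter-rater ceiling concrete, here is a minimal sketch of how agreement between two human raters can be measured. The labels below are invented for illustration and were not presented at the Symposium. Raw percent agreement is shown next to Cohen’s kappa, a standard statistic that discounts the agreement expected by chance. If two careful people only agree with each other about 80% of the time, an algorithm scored against either of them has little room to do measurably better.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters chose the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same label at random,
    # given how often each rater used each label.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sentiment labels from two human raters on the same ten posts.
rater_a_labels = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "neu", "pos", "neg"]
rater_b_labels = ["pos", "neu", "neg", "neu", "pos", "neg", "pos", "neu", "pos", "neg"]

print("Percent agreement:", percent_agreement(rater_a_labels, rater_b_labels))
print("Cohen's kappa:", round(cohens_kappa(rater_a_labels, rater_b_labels), 2))
```

The posts where the raters disagree are the “more difficult data” worth studying on their own.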

Although an 80% accuracy rate is really fantastic compared to no measure at all, and it is an amazing accomplishment given the financial constraints that analysts encounter, it is not an accuracy rate that works across industries and sectors. It is important to consider the “fitness for use” of an analysis. For some industries, an error is not a big deal. If a company is able to respond to 80% of the tweets directed at them in real-time, they are doing quite well. But when real people or weightier consequences are involved, this kind of error rate is blatantly unacceptable. These are the areas where human involvement in the analysis is absolutely critical. Where, honestly speaking, are algorithms performing fantastically, and where are they falling short? In the areas where they fall short, human experts should be deployed, adding behavioral and linguistic insight to the analysis.
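One way to picture “fitness for use” is as a confidence threshold that decides which automated tags are trusted and which go to a human expert. The sketch below is only an illustration under my own assumptions: the `ScoredItem` fields, the example posts, the confidence scores and both thresholds are hypothetical, and no speaker described this exact setup. The point is that the same classifier output can be good enough for one use and unacceptable for another.

```python
from dataclasses import dataclass


@dataclass
class ScoredItem:
    text: str          # the post being analyzed
    label: str         # sentiment label assigned by the (hypothetical) algorithm
    confidence: float  # the algorithm's confidence in that label, from 0.0 to 1.0


def route_for_review(items, threshold):
    """Split items into those tagged automatically and those sent to a human."""
    automated = [item for item in items if item.confidence >= threshold]
    needs_human = [item for item in items if item.confidence < threshold]
    return automated, needs_human


# Hypothetical classifier output.
scored = [
    ScoredItem("Love the new phone", "pos", 0.97),
    ScoredItem("Well, that was just great...", "pos", 0.55),
    ScoredItem("Delivery was late again", "neg", 0.88),
]

# A low-stakes use (e.g. triaging customer tweets) can accept a lower threshold.
auto_low, human_low = route_for_review(scored, threshold=0.6)

# A high-stakes use (e.g. anything affecting real people) demands a higher one.
auto_high, human_high = route_for_review(scored, threshold=0.95)

print(len(auto_low), "automated /", len(human_low), "to human review at 0.6")
print(len(auto_high), "automated /", len(human_high), "to human review at 0.95")
```

Raising the threshold sends more of the ambiguous material, such as the sarcastic second post, to the people best equipped to read it.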

One excellent example of Fitness for Use was the presentation by Capital Market Exchange. This company operationalizes sentiment as expert opinion. They mine a variety of sources for expert opinions about investing, and then format the commonalities in an actionable way, leading to a substantial improvement above market performance for their investors. They are able to gain a great deal of market traction that pure sentiment analysts have not by valuing the preexisting knowledge structures in their industry.

 

Targeting the weaknesses

It is important that the field look carefully at areas where algorithms do and do not work. The areas where they don’t work represent whole fields of study, many of which have legions of social media analysts at the ready. This includes less traditional areas of linguistics, such as Sociolinguistics, Conversation Analysis (e.g. looking at expected pair parts) and Discourse Analysis (e.g. understanding identity construction), as well as Ethnography (with fast growing subfields, such as Netnography), Psychology and Behavioral Economics. It is time to think strategically to better understand the data from new perspectives. It is time to more seriously evaluate and invest in neutral responses.

 

Summing Up

Social media data analysis, large scale text analysis and sentiment analysis have enjoyed a kind of honeymoon period. With so many new and fast growing data sources, a plethora of growing needs and applications, and a competitive and fast growing set of analytic strategies, the field has been growing at an astronomical rate. But this excitement has to be balanced out with the practical needs of the marketplace. It is time for growing technologies to better listen to and accommodate the needs of the customer base. This shift will help ensure the viability of the field and free developers up to embrace the spirit of intellectual creativity.

This is an exciting time for a fast growing field!

Thank you to Seth Grimes for organizing such a great event.

 

Planning another Online Research, Offline lunch

I’m planning another Online Research, Offline lunch for researchers in the Washington DC area later this month. The specific date and location are TBA, but it will be toward the end of February near Metro Center.

These lunches are designed to welcome professionals and students involved in online research across a variety of disciplines, fields and sectors. Past attendees have had a wide array of interests and specialties, including usability and interface design, data science, natural language processing, social network analysis, social media monitoring, discourse analysis, netnography, digital humanities and library science.

The goal of this series is to provide an informal venue for a diverse set of researchers to talk with each other and gain a wider context for understanding their work. The lunches are an informal and flexible way for researchers to meet each other, talk and learn. Although Washington DC is a great meeting place for specific areas of online research, there are few informal opportunities for interdisciplinary gatherings of professionals and academics.

Here is a form that can be used to add new people to the list. If you’re already on the list you do not need to sign up again. Please feel free to share the form with anyone else who may be interested:

Storytelling about the Past and Predicting the Future: On People, Computers and Research in 2014 and Beyond

My Grandma was a force to be reckoned with. My grandfather was a writer, and he described her driving down the street amidst symphonies. She was beautiful and stubborn, strong willed and sharp. Once a young woman with the good looks of a model, she wore high heels and took daily trips to the gym well into her 90’s. At the age of 94 she managed to run across her house, turn off the water and stand with her hand on her hip in front of the shower before I returned from the next room over with the shampoo I forgot (lest I waste water).

My Grandma, looking amazing

A few years ago I visited her in Florida. She collected work for all of her visitors to do, and we were busy from the moment I arrived. To my surprise, many of the tasks she had gathered involved dealing with customer service and discovering the truth in advertisements. At one point she led me into the local pharmacy with a stack of papers and asked to see the manager. Once she found the manager she began to go through the papers one by one and ask about them. The first paper on the stack was about the Magic Jack. He showed her the package, and she questioned him in depth about how it worked. I was shocked. I’d never thought of a store manager in this role before.

After that trip I began to pay closer attention to the ways in which the people around me dealt with customer service, and I became a kind of customer service liaison for my family. My older family members had an expectation that any customer service agent be both extensively knowledgeable and dependably respectful, but the problems of customer service seemed to have grown beyond this small, personable level to a point where a large network of people with structurally different areas of knowledge act together to form a question answering system. The amount and structure of knowledge necessary has become the focus of the customer service problem, and people everywhere complain about the lack of knowledge, ability and pleasant attitude of the customer service agents they encounter.

This is a problem with many layers and levels to it, and it is a problem that reflects the developing data science industry well. In order to deliver good customer service a great deal of information has to be organized and structured in a meaningful way to allow for optimal extraction. But this layer cannot be everything. The customer service interaction itself needs to be set up in such a way as to allow customers to feel satisfied. People expect personalized, accurate interactions that are structured in a way that is intuitive to them. The customer service experience cannot be the domain of the data scientists. If it is automated, it requires usability experts to develop and test systems that are intuitive and easy to use. If it is done by people, the people need to have access to the expertise necessary for them to do their job and be trained in successful interpersonal interaction. I believe that this whole system could be integrated well under a single goal: to provide timely and direct answers to customer inquiries in 3 steps or fewer.

The past few years have brought a rapid increase in customization. We have learned to expect the information around us to be customized, curated and preprocessed. We expect customer service to know intuitively what our problems are and answer them with ease. We expect Facebook to know what we want to see and customize our streams appropriately. We expect news sites to be structured to reflect the way we use them. This increase in demand and expectations is the drive behind our hunger for data science, and it will fuel a boom in data and information science positions until we have a ubiquitous underlayer of organized information across all necessary domains.

But data and information science are new fields and not well understood. Our expectations as users exceed the abilities of this fast-evolving field. We attract pioneers who are willing to step into a field that is changing shape beneath their feet as they work. But we ask for and expect too much of a result, because these pioneers can’t be everything across all fields. They are an important structural layer of our newly unfolding economy, but in each case, another layer of people is needed in order to achieve the end result.

Usability is an important step above the data and information science layer. Through usability studies, Facebook will eventually learn that people and goals are not constant across all visits. Sometimes I look at Facebook simply to see if I’ve missed any big developments in the lives of my friends and loved ones. Sometimes I want to catch news. Sometimes I’m bored and looking for ridiculous stuff to entertain me. Sometimes I have my daughter next to me and want to show her funny pet pictures that I normally wouldn’t look twice at. Through usability studies, Facebook will eventually learn that users need some control over the information presented to them when they visit.

Through usability studies newspapers will better understand the important practice of headline scanning and develop pay models that work with people’s reading habits. Through qualitative research newspapers will understand their importance as the originators of news about big events with few witnesses, like peace treaties and celebrity births and deaths, and the real value of social media for events with large numbers of witnesses and points of view. News media sources are deep in a period of transition where they are learning to better understand dissemination, virality, clicks, page views, reader behavior and reader expectations, and the strengths and weaknesses of social media news sources.

There have been many blog posts (like this one) about Isaac Asimov’s predictions for the future, because he was so right about so many things. At this point we’re at a unique vantage point where his notions of machine programmers and machine tenders are taking deeper shape. This year we will continue to see these changes form and reform around us.

Planning a second “Online Research, Offline Lunch”

In August we hosted the first Online Research, Offline Lunch for researchers involved in online research in any field, discipline or sector in the DC area. Although Washington DC is a great meeting place for specific areas of online research, there are few opportunities for interdisciplinary gatherings of professionals and academics. These lunches provide an informal opportunity for a diverse set of online researchers to listen and talk respectfully about our interests and our work and to see our endeavors from new, valuable perspectives. We kept the first gathering small. But the enthusiasm for this small event was quite large, and it was a great success! We had interesting conversations, learned a lot, made some valuable connections, and promised to meet again.

Many expressed interest in the lunches but weren’t able to attend. If you have any specific scheduling requests, please let me know now. Although I certainly can’t accommodate everyone’s preferences, I will do my best to take them into account.

Here is a form that can be used to add new people to the list. If you’re already on the list you do not need to sign up again. Please feel free to share the form with anyone else who may be interested:

 

Reflections on Digital Dualism & Social Media Research from #SMSociety13

I am frustrated by both Digital Dualism and the fight against Digital Dualism.

Digital dualism is the belief that online and offline are different worlds. It shows up relatively harmlessly when someone calls a group of people who are on their devices “antisocial,” but it is much more harmful in the way it pervades the language we use about online communication (e.g. “real” vs. “virtual”).

Many researchers have done important work countering digital dualism. For example, at the recent Social Media & Society conference, Jeffrey Keefer briefly discussed his doctoral work in which he showed that the support that doctoral students offered each other online was both very real and very helpful. I think it’s a shame that anyone ever doubted the power of a social network during such a challenging time, and I’m happy to see that argument trounced! Wooooh, go Jeffrey! (now a well-deserved Dr Keefer!)

Digital dualism is a false distinction, but it is built in part on a distinction that is also very real and very important. Online space and offline space are different spaces. People can act in either to achieve their goals in very real ways, but, although both are very real, they are very different. The set of qualities with which the two overlap and differ and even blur into each other changes every day. For example, “real name” branding online and GPS enabled in-person gaming across college campuses continue to blur boundaries.

But the private and segmented aspects of online communication are important as well. Sometimes criticism of online space is based on this segmentation, but communities of interest are longstanding phenomena. A book club is expected to be a club for people with a shared interest in books. A workplace is a place for people with shared professional interests. A swim team is for people who want to swim together. And none of these relationships would be confused with the longstanding close personal relationships we share with friends and family. When online activities are compared with offline ones, often people are falsely comparing interest related activities online with the longstanding close personal ties we share with friends and family. In an effort to counter this, some have made moves to make online communication more unified and holistic. But they do this at the expense of one of the greatest strengths of online communication.

Let’s discuss my recent trip to Halifax for this conference as an example.

My friends and family saw this picture:

Voila! Rethinking Digital Democracy! More of a “Hey mom, here’s my poster!” shot than a “Read and engage with my argument!” shot

My dad saw this one:

Not bad for airport fare, eh?

This picture showed up on Instagram:

It’s a glass wall, but it looks like water!

People on Spotify might have followed the music I listened to, and people on Goodreads may have followed my inflight reading.

My Twitter followers and those following the conference online saw this:

Talking about remix culture! Have I landed in heaven? #SMSociety13 #heaveninhalifax #niiice

— Casey Langer Tesfaye (@FreeRangeRsrch) September 15, 2013

And you have been presented with a different account altogether.

This fractioning makes sense to me, because I wouldn’t expect any one person to share this whole set of interests. I am able to freely discuss my area of interest with others who share the same interests.

Another presenter gave an example of LGBT youth on Facebook. The lack of anonymity can make it very hard for people who want to experiment or speak freely about a taboo topic to do so without it being taken out of context. Private and anonymous spaces that used to abound online are increasingly harder to find.

In my mind this harkens back a little to the early days of social media research, when research methods were deeply tied to descriptions of platforms and online activity on them. As platforms rose and fell, this research became increasingly useless. Researchers had to move their focus to online actions without trying to root them in platform or offline activity. Is social media research being hindered in similar ways, by answering old criticisms instead of focusing on current and future potential? Social media research needs to move away from these artificial roots. Instead of countering silly claims about social media being antisocial or anything less than real communication, we should focus our research activities on the ways in which people communicate online and the situated social actions and behaviors in online situations. This means, don’t try to ferret out people from usernames, or sort out who is behind a username. Don’t try to match across platforms. Don’t demand real names.

Honestly, anyone who is subjected to social feeds that contain quite a few posts outside their area of interest should be grateful to refocus and move on! People of abstract Instagram should be thrilled not to have seen a bowl of seafood chowder, and my family and friends should be thrilled not to have to hear me ramble on about digital dualism or context collapse!

I would love to discuss this further. If you’ve been waiting to post a comment on this blog, this is a great time for you to jump in and join the conversation!

Upcoming DC Event: Online Research Offline Lunch

ETA: Registration for this event is now CLOSED. If you have already signed up, you will receive a confirmation e-mail shortly. Any sign-ups after this date will be stored as a contact list for any future events. Thank you for your interest! We’re excited to gather with such a diverse and interesting group.

—–

Are you in or near the DC area? Come join us!

Although DC is a great meeting place for specific areas of online research, there are few opportunities for interdisciplinary gatherings of professionals and academics. This lunch will provide an informal opportunity for a diverse set of online researchers to listen and talk respectfully about our interests and our work and to see our endeavors from new, valuable perspectives.

Date & Time: August 6, 2013, 12:30 p.m.

Location: Near Gallery Place or Metro Center. Once we have a rough headcount, we’ll choose an appropriate location. (Feel free to suggest a place!)

Please RSVP using this form: