From AJR, October/November 2011

Aggregating Without Aggravating   

What are the best practices when it comes to collecting the material of other news outlets?

Wed. August 24, 2011

By Michaelle Bond
Michaelle Bond (mbond@ajr.umd.edu) is an AJR editorial assistant.     


The day after media columnist Simon Dumenco wrote a post commenting on the Twitter buzz surrounding both Apple's iCloud announcement and the Rep. Anthony Weiner scandal, he was indignant to see what he calls a thorough paraphrasing and rewriting of his story on The Huffington Post.

"It's a creepy feeling," Advertising Age's "Media Guy" says. The June 9 Huffington Post story presented much of his data and shared the same premise – Apple's bad luck in making its announcement on the day Weiner publicly admitted to sending explicit photos and messages online.

The Huffington Post story did link twice to the original piece. But Dumenco says the thoroughness of the summary gave readers no incentive to click on the links.

"It's kind of embarrassing that we've gotten to this point where we're accepting of this level of aggregation," Dumenco says.

The flap spotlights both aggregation's role in journalism's future and the question of just how it should be carried out. It's the execution of aggregation, not the idea of it, that causes controversy, says Jeff Sonderman, a digital media fellow at the Poynter Institute. He laid out a framework for journalists in his post "The seven steps to a successful aggregation strategy for your news organization." The post drew on both Poynter's experiences with its Romenesko media news aggregation blog and observation of the way other news aggregators do their jobs.

"It's important for all of us in the industry to make sure we move the conversation beyond 'Is aggregation good or bad?' " Sonderman says.

Aggregation has become essential for most news outlets and is an effective way of attracting Web visitors, he says. "For most news organizations, there's more news going on that the audience wants to know about than they themselves are able to cover," Sonderman says. "Unless you have the staff to give your readers all the information they want to know, you should probably use aggregation."

Readers rely on aggregators to help them sort through an overabundance of information online. And aggregation frees up resources: Outlets can acknowledge and link to good work by others rather than waste time re-creating what has already been reported, he says.

Sonderman advises that, while algorithms are the easiest way to break into the world of aggregation, nothing quite beats a human editor.

Technology news aggregator Techmeme hired Megan McCarthy as its first human editor in 2008. She's now the editor of its sister site, Mediagazer, a media news aggregator. McCarthy says people work together with the algorithms, and adding humans to the mix has its advantages.

"What humans do better than the algorithms is anticipate," she says. "I can see a story and think, 'Oh, that's going to be big.' But the algorithm doesn't know that."

Like Techmeme, Mediagazer focuses on what McCarthy calls "pure aggregation," displaying only the link and the first few lines of the story. "The value is, we bring all the views together, so the reader can then see the bigger picture," McCarthy says. The site's visitors see the top issues of the day grouped by story and, from there, they can delve deeper into specific viewpoints, she says.

Selecting and grouping aggregated items is one element that separates "curated" aggregation from indiscriminate aggregation, says Larry Kramer, a longtime newspaper reporter and editor who founded the financial Web site MarketWatch.

"It's not a value proposition to take everyone's stories and pile them on" your site, Kramer says. Curation is educated aggregation that brings together the best versions of a story from the most reliable sources, he says. This thinking led Dan Farber, editor-in-chief of CBSNews.com, to start using the term "aggrefilter" for model aggregators.

Frédéric Filloux, editor of the Monday Note, a newsletter covering digital media's business models, wrote a piece last year called "Aggregators: the good ones vs. the looters." In it, Filloux wrote that "good" aggregators avoid "self-sufficient" posts of other sources' information. He wrote that publishers should put their content behind a paywall or into a smartphone application to protect themselves from being "e-looted."

"Aggregation is absolutely fine as long as publishers benefit from it; it could be traffic or some kind of revenue sharing," the Paris-based Filloux wrote in an e-mail interview. Otherwise, aggregation will kill publishers' businesses, he says.

A drop in business for content producers is what Simon Dumenco fears will happen if Web sites continue to oversummarize the work of others. Dumenco, a longtime critic of The Huffington Post, decided to use Google Analytics to test the extent to which the Web site's summary of his story drove traffic to his original article. The Huffington Post says driving traffic to other sites is one of the important services it provides.

Techmeme also picked up Dumenco's Twitter story, so Dumenco compared traffic coming from both sites.

He found that the niche site Techmeme drove 746 pageviews to his original post. The Huffington Post drove 57 pageviews, according to Dumenco.

But The Huffington Post has its own analytics, which show that it sends millions of its own readers to other sites every month, says Managing Editor Nico Pitney.

The Huffington Post, which publishes original material as well as aggregating, posts more than 500 items every day and knows firsthand the frustration of other sites taking its material, he says. "It's something we're incredibly sensitive to as well," Pitney says. An instance like Dumenco's "is an extreme rarity," he says.

After Dumenco published his findings, The Huffington Post sent him a letter of apology, writing that his criticism was "completely valid" and counter to the way The Huffington Post trains its staff to act. The writer of the offending post was briefly suspended.

But Dumenco thinks writing thorough summaries and saying that it's aggregation is part of a widespread problem at The Huffington Post and elsewhere. "It's kind of the culture of the Internet world," he says. Small blogs copy whole passages of other writers' work all the time, he says. "But it doesn't really matter because maybe they'll get 50 pageviews," Dumenco says. On the other hand, The Huffington Post attracted more than 21 million unique visitors in June, according to Web traffic appraiser Compete.

"Just about every blog on the Internet aggregates to some extent," Dumenco says, "and the question is, do you do it fairly or unfairly?"

Unethical aggregators may face more than ill will. They are also taking legal risks by pushing the boundaries of fair use. But aggregators can reduce those risks in several ways, as suggested by the 2010 paper "The Rise of the News Aggregator: Legal Implications and Best Practices," by the Citizen Media Law Project out of Harvard University's Berkman Center for Internet & Society. To protect themselves, aggregators should use only the parts of the original story that are needed to identify the story or make a point, wrote Kimberley Isbell, the paper's author. Instead of saying exactly what the source said, aggregators should provide context or commentary. The original sources used in aggregation should vary. And the source should be identified prominently in the aggregated story, which should also link back to the original piece.

But this kind of aggregation is nothing new. Reporters were doing it in the pre-Internet age, says Bill Keller, executive editor of the New York Times.

"If aggregation means pulling together material from many different sources, then just about everybody who publishes, online or off, is an aggregator to some degree," Keller wrote in an e-mail to AJR. "The distinction between original journalism and aggregation is more a continuum than a binary distinction. A reporter covering the economy is drawing on economists, politicians, think tanks, government agencies." Bloggers following developing stories are farther along on the continuum, he says.

So although he has publicly criticized the aggregation methods of Arianna Huffington and her Huffington Post – he likened aggregators to Somali pirates in one much-discussed column – he is not anti-aggregation, he says. He just thinks it isn't a substitute for original reporting and shouldn't summarize the original sources too extensively.

"Actually, the aggregator I use most is a medium that doesn't really bill itself as an aggregator: Twitter," says Keller, who is stepping down on September 6 and will write an op-ed column. He uses the social networking and microblogging service as a "tip sheet" on what other people are reading. "And Twitter is what you'd call a 'good' aggregator; it's hard to steal somebody's work in 140 characters, so it's all about the link."
