The Associated Press plans to sic a scraper-bot on the Web soon to find swiped AP content. While no one would argue with taking on scraper sites, the vagueness of AP news editor Ted Bridis about what else might be targeted is worth considering.
In an interview with Ars Technica, Bridis talked of a new technology (which writer Matthew Lasar cleverly described as a “search-and-maybe-threaten bot”) that is on the horizon for the AP. The technology will identify and flag webpages copying entire AP articles. Once a page is flagged, AP lawyers would review it.
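The AP has not said how the bot would decide that a page copies an "entire" article, so the following is only a rough sketch of one common approach, not the AP's method. It breaks the article into overlapping five-word runs ("shingles") and measures what fraction of them turn up on a page. The function names, the five-word window, and the 0.8 cutoff are all assumptions made for illustration.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(article, page, k=5):
    """Fraction of the article's shingles that also appear in the page."""
    article_shingles = shingles(article, k)
    if not article_shingles:
        return 0.0
    shared = article_shingles & shingles(page, k)
    return len(shared) / len(article_shingles)


def flag_for_review(article, page, threshold=0.8, k=5):
    """True when the page reproduces most of the article.

    The 0.8 threshold is a hypothetical cutoff, not a figure from the AP.
    """
    return containment(article, page, k) >= threshold
```

Under these assumptions, a page that pastes in a whole article scores at or near 1.0 and gets flagged for a human to look at, while a page quoting a single sentence scores far lower. What such a measure cannot settle is where to put the cutoff, or what to do about a rewrite that shares facts but few exact word runs.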
A scraper-bot would be nice, a positive technology to evolve from this fiasco. What do you think?
Bridis insisted the news organization would not be going after bloggers or publications excerpting a paragraph of AP content and linking to the original. He admitted the AP sometimes borrows excerpts from newspapers and crafts its own story around them.
But Bridis stopped there and made no such concessions about the use of headlines and AP ledes. If the AP argues the so-called “hot news misappropriation” doctrine, this could affect search engines, aggregators, and sites like the Drudge Report that display headlines and the first line of an article.
Also on the radar would be articles written based on AP content, especially by commercial websites that rewrite with hedges like “the AP has reported” or “the AP said.” That’s where the vagueness is troubling, and where the lines are fairly blurry. It’s hard to tell whether the emphasis is on commercial use or on the method of attribution. It is also unclear what is meant by “rewriting.” Does he define rewriting only as reporting the facts with a word or two changed (i.e., plagiarism)? Or does Bridis also count retelling a story in different words, or even summarizing facts, as rewriting?
Should the AP be able to dictate which facts are fair to retell, which styles are acceptable for retelling them, which sentences are acceptable to excerpt, and how attribution is to be made? Let us know.
Depending on how these questions are answered, Bridis could be drawing a line between blogs and news sites, essentially saying nonprofit bloggers can quote and refer but commercial news sites cannot. He’s also drawing a line between textual storytelling and verbal storytelling. Bridis seems to suggest any commercial, textual relay of information wouldn’t be considered “fair” use, so long as the AP can, in a decentralized communication universe, prove it was the only outfit that knew certain facts. That argument is rather stunning considering the AP is a distributor of news first written elsewhere in the world at local publications.
What’s extra interesting is that though the AP has criticized fair use as a “misguided” legal theory, the organization is insisting on a legal theory of its own in the “hot news” doctrine, which is mostly a semantic device to create a separate category for “facts,” which are not copyrightable in the first place.
Ninety years ago, the AP sued William Randolph Hearst’s International News Service (INS) for swiping breaking news the AP had gathered and distributing it as its own. In a lengthy court battle that reached the Supreme Court, the “hot news” doctrine was born. Though the Court held that facts could not be copyrighted, it designated hot news (a scoop) as a special kind of property to which the outlet breaking the news had exclusive rights for a limited amount of time. Just how long these special kinds of facts are protected is unclear, especially in the Internet age, when hot news gets cold much faster.
To succeed in its efforts, the AP will have to clear significant legal hurdles. The organization will have to redefine fair use; get a court to uphold that some facts are protected and set some kind of timetable for that protection; explain how textually reporting facts to an Internet audience is different from reporting facts to any other audience by any other method; find a logical differentiation between bloggers and journalists, and between Internet forums or social networks and water cooler conversations; and convince courts that precedents regarding aggregating, linking and snippeting should be overturned, all while avoiding federal charges of anticompetitive behavior.
Those are some pretty tall hurdles, and a 90-year-old argument from a different world is unlikely to clear them.
This has heated debate written all over it. Sound off in the comment section.