Google And The Digitization of The Planet's Books

by Lee Gesmer on November 27, 2005

Copyright.

    “Imagine the cultural impact of putting tens of millions of previously inaccessible [books] into one vast index, every word of which is searchable by anyone, rich and poor, urban and rural, First World and Third, en tout langue — and all, of course, entirely for free.”

    Eric Schmidt, Google CEO

    “Mr Schmidt fails to mention that Google’s intent . . . is to make even more money. . . . Can it be so greedy that it seeks to bolster its profits by freely exploiting the rights of publishers and authors?”

    Patricia Schroeder, President, Association of American Publishers

The legal controversy over Google’s plan has reached fever pitch. Google intends to use a proprietary high-speed scanning process to copy (to start with) the entire book collections in the libraries at Michigan, Stanford, Oxford, the New York Public Library and Harvard, whether or not a particular book is under copyright, and to require copyright owners to notify Google if they wish to “opt out” of the program. It’s easy to see why this has caused an uproar. In almost every case until now, copyright litigation has implicated the legal rights of one copyright owner and one alleged infringer. In the few cases to involve broader rights, such as the Napster/Grokster line of file sharing cases, it was easy for anyone with a moral compass to conclude that wholesale copying and distribution of copyrighted music (or the encouragement of same) was wrong, and was a violation of the copyright laws.

Google is not threatening to Napsterize the book publishing industry, but its chutzpah is breathtaking: “Google Book Search” (f/k/a Google Print) proposes to scan every book, in every language, in-print or out-of-print, in-copyright or out-of-copyright, and make it freely searchable. If the copyright owner gives permission, Google will make available for Internet searches the full page on which a search result appears, along with adjacent pages. If Google doesn’t have the copyright owner’s permission, it will present only a “snippet” from the book to searchers.

Whew! You would be hard pressed to come up with anything to make the book publishing industry angrier (short of outright Napsterization), and Google’s proposal has had a predictable effect (although it’s not clear whether the Googlers in Mountain View anticipated that their altruistic undertaking would result in copyright infringement lawsuits brought by both the Association of American Publishers and the Authors Guild).

Here’s my view on the debate over the legality of Google’s on-again/off-again project. For the views of numerous other academics and observers (and more than you could possibly ever want to know about this issue, unless you are a lawyer for one of the sides in these lawsuits and are trolling for arguments), read here and here.

First, just as was true for many people when the peer-to-peer music file sharing issues first arose, there’s a lot of emotion over this issue. We are all consumers of information, and what consumer wouldn’t want “one giant electronic card catalogue that makes all the world’s books discoverable with just a few keystrokes by anyone, anywhere, anytime”? (Google’s Schmidt again). However, unless the courts rewrite the law (as Lawrence Lessig suggests), the legality of Google Book Search is going to be decided under the law as it exists today, not based on what’s good for the planet or Utopian philosophies of copyright law.

Second, I suspect that what’s really keeping the publishers and authors up nights is this question: who’s going to have control over this compilation of data? Sure, it’s “Do No Evil”-Google today, but who might have the resources to do the same thing, even on a smaller scale, in the future? And remember, the future is a long time. One can imagine the great-to-the-nth descendants of today’s publishers cursing their literary ancestors for allowing Google to take the first step down the slippery slope that leads, who knows where?

And, if that’s not enough to keep a good capitalist publisher awake nights, how long will it be before the entire estimated 20 million books can be copied onto some kind of high capacity optical/nanotech data storage (blah, blah, woof, woof) DVD? Fifty years? Less? What’s to stop some Google employee who only pretended to buy this “do no evil” stuff from stealing this collection someday, and selling a copy to a wealthy collector? What if Google goes bankrupt (stranger things have happened), and the compilation is sold to . . . . ? (complete the sentence yourself, Constant Reader – your imagination is almost certainly better than anything I could provide).

Bottom line: for the publishers, this is the Battle of the Somme and the Battle of Normandy rolled into one. For the publishers, lose here, and the consequences are too horrible to imagine. Or at least they so imagine.

Now, the dry legalities. There is no question that Google is making literal copies of copyrighted works. Not only do they not deny it, they boast about it. By doing so, they are violating the first of the six “exclusive rights” that belong to a copyright owner, the right to “reproduce the copyrighted work.” Accordingly, Google must fall back upon the copyright “fair use” doctrine, which states:

    [T]he fair use of a copyrighted work, including such use by reproduction in copies . . . for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright. In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include-

    (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
    (2) the nature of the copyrighted work;
    (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
    (4) the effect of the use upon the potential market for or value of the copyrighted work.

This statute may appear reasonably straightforward on its face, but of course in application it’s not. The courts have applied these factors in many different contexts, and reversals by appellate courts (including SCOTUS on many occasions) have proved that applying them in a given case is no simple matter.

It is almost inconceivable that any court would give Google summary judgment in this case (although the same is not true if the publishers/authors move for summary judgment). Assuming a trial, what kind of evidence might the jury expect to hear in the copyright “trial of the century”? Well, on the first factor, start with evidence of Google’s expected profits from this project, including all internal memos and analyses by Google projecting its business plans for Google Book Search, as well as Google’s contracts with the universities, one of which has already been made public. I’d be surprised if the publishers can’t show that Google Book Search is motivated by commercial interests. On the second factor, the jury will learn that many of the copyrighted books involved are works of fiction, which are granted far stronger protection than factual works.

On the third factor, consider that although Google claims it will display only “snippets,” it is copying the full text of the works, and the publishers/authors have no guarantee that Google will forever comply with its promise. What will the jury focus on, the amount and substantiality of the material copied, or the substantiality of the material displayed in a single search? Finally, there is the fourth factor, which focuses on the effect on the potential market or value. What effect might a compilation of this magnitude have on the market if Google lost control of all or even a part of it, a concern that is far from hypothetical, and that the publishers/authors are likely to press aggressively?

Bottom line: while as a consumer I would love to have access to a fully stocked Google Book Search, as a lawyer I would place my bets with the copyright owners. I do not agree with Jonathan Band’s argument that the Ninth Circuit’s decision in Kelly v. Arriba Soft is controlling, or even persuasive, precedent in this case. Arriba held that an online database of photographs, displayed as thumbnails and linked to the original web site that hosted the photo, fell within the fair use exception. The most significant difference, of course, is that in that case the copyright owners had freely put their works on the Internet themselves, and Arriba made copies only for the purpose of providing a directory for them.

When the publishers and authors first filed suit, my water cooler prediction was that Google would back down. Since then, Google has made aggressive statements to the press, asserting that it believes in its position and that it will take this case through the courts. Frankly, I don’t believe it, and I expect to see some form of out-of-court resolution that can be held up as a victory for both sides, but which is far less than Google Book Search as it was originally conceived.
