Not content with organizing billions of Web documents, Google Inc. is leading the charge in turning library collections into searchable digital content.
In announcing Tuesday that it is working with five major libraries to scan millions of books for inclusion in its Web index, Google opened another battle in the intense competition among the leading search engines.
Its major search competitors will likely respond by further expanding their own indexes with sources outside of traditional Web pages, analysts said.
Meanwhile, Google’s step toward becoming a digital library drew enthusiasm as well as uncertainty from librarians. They were optimistic that the project would raise the profile of libraries in the age of the Internet but worried that book collections might get lost in the sea of searchable information on Google.
“This is valuable content,” said Allen Weiner, a research director at Gartner Inc. “We’ve been focused on Web content, which has varying degrees of value, but this has a built-in marketplace and built-in demand.”
Google’s library project is part of the Google Print effort it started testing early this year and launched as a beta in October. Through Google Print, the Mountain View, Calif., company is working with publishers to include digital versions of books and periodicals in its search index of about 8 billion documents.
For the library project, Google is partnering with the New York Public Library and the libraries of Harvard University, Stanford University, the University of Oxford and the University of Michigan.
Scanning the libraries’ collections will take years, but Google already has made a small percentage available through its search engine, said Susan Wojcicki, Google’s director of product management.
Google has reached different arrangements with the libraries, each of which has a collection ranging from 7 million to 15 million books. It will scan the entire collections of the Stanford and Michigan libraries, while it will digitize works from 1900 and earlier at Oxford, Wojcicki said.
Harvard and the New York Public Library are starting with pilot projects of a subset of their collections.
“This is something we wanted to do when the company started, and it was the vision of founders before they even started Google,” Wojcicki said, noting Google’s origin as a library digitization project at Stanford. “This happened to be a time where Google had enough resources to take on such an endeavor.”
Google earlier this year raised $1.7 billion in one of the year’s most closely watched initial public offerings.
Google is classifying books into three categories to deal with copyright issues. For works in the public domain, Google plans to make the full text available as part of search results.
For those under copyright, Google will work with publishers to determine how much of the text will be shown, Wojcicki said. Where it has no publisher relationship, Google will show short excerpts or only bibliographical information.
Google also plans to display its sponsored links alongside the text of books where it has a publisher relationship and to share with publishers a portion of pay-per-click revenues, Wojcicki said. With public-domain works and the excerpts, no ads will be displayed.
In a preview page about the library results, Google also is displaying links for buying a book at an online bookseller or for borrowing it from a local library.
Proprietary Deals
The library partnerships offer one of the first concrete clues about Google’s strategic direction, Weiner said. The project will add new sets of non-Web information to its index.
“Google believes that its future is as a search purveyor where search is what drives the economics,” Weiner said.
Google Print pits the company more directly against Amazon.com Inc., which also has made books searchable by keywords.
Weiner said he doesn’t expect Yahoo Inc., Microsoft Corp.’s MSN division or Ask Jeeves Inc. to directly compete with Google’s drive to digitize library collections, but he said all search players will increasingly look to add new forms of content to their indexes.
“Companies will begin to create proprietary deals for searching up databases and new areas,” he said. “Their ability to strike these deals gives them areas of differentiation.”
Google plans to commingle the library collections with its overall Web results. Google Print results today appear atop the list of results when a searcher enters keywords associated with a book, such as part of a title or an author’s name, Wojcicki said.
Google’s approach of merging books into the same index as Web pages and other content raised concerns among some librarians.
“The bigger the Google database gets, the harder it will be to find all these snippets of things, especially since they do not provide a specific interface for these books,” said Steven Cohen, a librarian in New York and a contributing editor to the Weblog ResourceShelf.
Gary Price, a library and Internet research consultant, said he shared concerns about whether books from the library collections would be easy to find in Google, since few searchers use advanced queries or enter more than two or three keywords in a query.
Price, who is also the editor of ResourceShelf, said he hopes Google eventually provides libraries with a way to directly access the book collections in Google’s indexes for their Web sites and online efforts.
Yet Google’s move into scanning and indexing library collections could be a boon for libraries, which have struggled to market their research offerings as the Internet has grown in popularity, Cohen said. By making books more accessible online and including links to local libraries, Google could help raise the profile of libraries, he said.
Whatever the effect on libraries, the project will help Google maintain its top mindshare in search as it faces an increasing number of competitors, Price said.
“This has been another incredible marketing move for [Google],” Price said. “Not that the mission is not noble, and I applaud them for that, but it’s also another brilliant marketing move.”
Editor’s Note: This story was updated to correct information about which portion of Oxford University’s collection Google will scan and index.