Wednesday, January 14, 2009

Long Tale

Google's book scanning project turns up in the news every once in a while.
The project uses the collections of large research libraries to generate digital image files with keyword search capability.


As a glimpse of the research potential of this project, an "exact phrase" search for Parks Music Company, limited to the years 1910 through 1930, returns two items.
One uses the business as an example in a 1920 business English textbook. The other is a directory listing in the 1928 Editor and Publisher Marketing Guide.

The two examples highlight some of the project's features.

The 1920 textbook, from the research collection at the New York Public Library, is out of copyright. The library has given permission for the whole book to be viewed online, printed out, or used in other ways.
The library that owns the 1928 directory (one of the libraries within the University of California) allows only summary information about that title to be shown; since it was published after 1923, it is still subject to copyright.
Some of the more powerful features of the Google Books project will be available only to paying subscribers.

Another digitization project, with a different aim and different operating principles, is Project Gutenberg. That project is trying to generate electronic versions of complete texts. Its catalog even includes a few audiobook versions. Project Gutenberg content is available at no charge.
