Friday, October 17, 2008

Weekly Response 8

Chapter 1. Definition and Origins of OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH): greater interoperability between digital libraries and more efficient dissemination of information. Gee, that sounds like the general goal of all libraries. One thing I'm learning in this class is that while digital libraries are very different from traditional libraries in terms of structure and management, they have the same goals. Get information to the people! Preserve it for future people!

Scope: metadata, expressed in XML. The protocol is moving into working with other classes of metadata and full content. The metadata is specifically for document-like objects in digital form. Often digital libraries hold not just books and papers, but digital images, digital objects, and other things that require metadata.

Purpose: define a standard way to move metadata from point A to point B on the World Wide Web; to facilitate sharing and aggregation of metadata.

The protocol accomplishes this by dividing the universe into OAI data providers (which have the content and/or metadata) and OAI service providers (which harvest info from data providers and make it available). This follows the client/server model: data providers are the servers, and service providers are the clients. This model allows one-stop shopping.
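To make the data provider / service provider split concrete, here is a minimal sketch of what a service provider (the client) might do: build a ListRecords request for a data provider, then pull titles out of the XML that comes back. The base URL, the sample response, and the function names are made up for illustration; only the `verb` and `metadataPrefix` parameters and the XML namespaces come from the protocol itself. No network call is made here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses and by Dublin Core elements.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_harvest_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would request from a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# Hypothetical, heavily trimmed response from an imaginary data provider.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Made-Up Title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter("{%s}record" % OAI_NS):
        ident = record.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        title = record.findtext(".//{%s}title" % DC_NS)
        pairs.append((ident, title))
    return pairs


if __name__ == "__main__":
    print(build_harvest_url("http://example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

The service provider would store what it harvests and offer its own search over the combined metadata, which is where the one-stop shopping comes from.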

What it is not: an open access system, an archival standard, Dublin Core, or a realtime/dynamic search service.

Federated Searching: Put it in its place

Users want a search box! Give simple and easy access to information in one place, just like Google does. Whether or not the answer is the best one or from the best source is a moot point. Therefore, make federated searching mimic Google: one-stop shopping that spits out an answer.

The Truth about Federated Searching
1. Does not search everything, ever! You will still have to consult other sources.
2. You will still get duplicates. Truly avoiding duplication would take too long because of the downloading involved.
3. Relevancy is not perfect because it is only looking at the citation.
4. Federated searching ought to be used as a service, not purchased as software. Updates happen too often to make the software route feasible.
5. The federated search engine does not search your catalog better than you can; it only searches it as well as your own search engine can.
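Points 2 and 3 are easier to see with a toy example. The sketch below is my own illustration, not how any real federated search product works: it sends one query to a couple of pretend sources, merges the citations, and drops duplicates by comparing normalized titles. The source names, the records, and the function names are all invented.

```python
import re


def normalize(title):
    """Lowercase a title and collapse punctuation so near-identical citations match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def make_source(titles):
    """Build a pretend search function over a fixed list of titles."""
    def search(query):
        return [{"title": t} for t in titles if query.lower() in t.lower()]
    return search


def federated_search(query, sources):
    """Query every source in turn, merging results and skipping repeated titles."""
    merged = []
    seen = set()
    for name, search in sources.items():
        for citation in search(query):
            key = normalize(citation["title"])
            if key in seen:
                continue  # same citation already returned by an earlier source
            seen.add(key)
            merged.append({"source": name, **citation})
    return merged


# Hypothetical sources: a local catalog and a subscription database.
sources = {
    "catalog": make_source(["Digital Libraries", "Metadata Basics"]),
    "database": make_source(["Digital libraries.", "Digital Libraries: An Introduction"]),
}

if __name__ == "__main__":
    for hit in federated_search("digital", sources):
        print(hit["source"], "-", hit["title"])
```

Because the comparison only sees the citation, "Digital Libraries: An Introduction" survives as a separate hit even if it is really the same book as "Digital Libraries". Catching that would mean fetching and comparing far more than the citation.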

The Z39.50 Information Retrieval Standard

Z39.50 is a standard allowing patrons to search other libraries' catalogs using their own library's interface. A client machine sends a search to the server, and the matching records are retrieved and displayed on the client machine.

The server has all the catalog information; it finds the appropriate records and returns them to the client machine. Each set of database records has a set of access points for the collection.
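As a rough illustration of access points, here is a small sketch that turns an access point name and a search term into a query string in Prefix Query Format (PQF), a notation used by some Z39.50 client toolkits. The use-attribute numbers come from the Bib-1 attribute set; the dictionary, the function name, and the choice of PQF are my own assumptions for the example, and nothing is actually sent to a server.

```python
# A few Bib-1 "use" attributes, which name the access point being searched.
BIB1_USE = {
    "title": 4,
    "isbn": 7,
    "subject": 21,
    "author": 1003,
    "any": 1016,
}


def pqf_query(access_point, term):
    """Build a PQF query string for one access point and one search term."""
    if access_point not in BIB1_USE:
        raise ValueError("unknown access point: %s" % access_point)
    if '"' in term:
        # Keep the sketch simple: refuse terms that would need escaping.
        raise ValueError("double quotes are not supported in this sketch")
    return '@attr 1=%d "%s"' % (BIB1_USE[access_point], term)


if __name__ == "__main__":
    print(pqf_query("title", "digital libraries"))
    print(pqf_query("author", "Lynch"))
```

The client builds a query like this from whatever the patron typed into the familiar local interface, and the remote server decides which of its indexes corresponds to that access point.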

Search Engine Technology and Digital Libraries
Since libraries are academic institutions with minimal universal searching capacity, and places like Google are universal search engines with minimal (although still a lot!) academic focus, the best of both worlds would be to marry the two entities: the academic internet! Google does have Google Scholar now, although I am uncertain whether it existed in June 2004, when this article was written. My understanding is that Google Scholar works by bringing up papers and publications known to be 'academic' in nature that fulfill the search request. If you are searching from an academic IP address (like Pitt!), it will sort things so that emphasis is given to information available through the databases that the institution behind that IP address subscribes to. So, if you search Google Scholar from a Pitt computer, you are likely to retrieve full-text items that you could have found through a database available at Pitt, but with the comfort of the Google interface.

This article appears to focus on academic libraries indexing the academic internet and making it available. Essentially, they would be putting the "LIBRARIAN APPROVED!" stamp on it. This helps the uninitiated user tell an appropriate and trustworthy source from an inappropriate and untrustworthy one.
