Friday, September 12, 2008

Week 3 Readings

Lesk Ch. 2

Computer typesetting:
1. Printers
2. Word processing
a. exact appearance of the text
b. content of the text

Text Formats
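1. ASCII standard: 7-bit code (128 characters) covering the 26 Latin letters in upper and lower case, digits, punctuation, and control codes
2. Unicode is gaining popularity: designed to cover the characters of all major languages, originally at 16 bits per character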
3. Higher level descriptive systems: characters are marked for meaning
a. MARC: Machine-Readable Cataloging
b. SGML: Standard Generalized Markup Language
c. HTML: Hypertext Markup Language
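
To make the ASCII/Unicode difference concrete for myself, here is a minimal Python sketch. The sample strings and the helper name is_ascii are my own illustration, not anything from Lesk; it only shows that a 7-bit code cannot represent an accented letter, while Unicode encodings can.

```python
# ASCII is a 7-bit code, so every ASCII character has a code point below 128.
def is_ascii(s: str) -> bool:
    return all(ord(ch) < 128 for ch in s)

ascii_sample = "Digital Library"   # illustrative sample text
accented_sample = "caf\u00e9"      # "cafe" with an accented e (U+00E9)

# One byte per character when the text is pure ASCII.
ascii_bytes = ascii_sample.encode("ascii")

# UTF-16 uses two bytes for each of these four characters.
utf16_bytes = accented_sample.encode("utf-16-be")

# UTF-8 uses one byte for c, a, f and two bytes for the accented e.
utf8_bytes = accented_sample.encode("utf-8")

print(is_ascii(ascii_sample), is_ascii(accented_sample))
print(len(ascii_bytes), len(utf16_bytes), len(utf8_bytes))
```

Trying accented_sample.encode("ascii") raises UnicodeEncodeError, which is the limitation in practice.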

Document Conversion: analog to digital forms
1. Keying in: expensive
2. Scanning: less expensive
a. Optical character recognition: improving reliability
3. Converted documents can then be made available online: digital libraries!

Arms Ch. 3
1. Structure: elements of the document: font, characters, paragraphs, etc
2. Appearance: How the elements are arranged on the page
3. Page-description languages: describe appearance on the page. TeX, PostScript, PDF
4. Encoding characters: ASCII, Unicode, transliteration, SGML, HTML (simplified SGML), XML (bridge between SGML and HTML)
5. Style sheets (formatting on screen/printed page)
a. Cascading Style Sheets (CSS): used with HTML
b. Extensible Stylesheet Language (XSL): used with XML
6. Page description languages: layout
a. TeX: focus on mathematics
b. PostScript: graphical output for printing, with support for fonts
c. Portable Document Format (PDF): derived from PostScript. Reads much like paper, but on the screen. Can limit unlawful printing. Adobe provides an excellent, free PDF reader, which has made the format widely accepted.

Identifiers and Their Role In Networked Information Applications
1. ISBN, ISSN, OCLC, RLIN: make locating a given object easy.
2. New identifiers are emerging in the electronic world: URLs and URNs
a. URLs: not long lasting locators, very ephemeral.
b. URN: naming authority identifier and object identifier
c. OCLC persistent URL (PURL): maintained for a much longer time than regular URLs, so less likely to produce dead links.
d. Serial Item and Contribution Identifier (SICI): using the ISSN, can identify an individual journal issue or article.
e. Book Item and Contribution Identifier (BICI): can identify individual volumes or chapters within a work.
f. Digital Object Identifier (DOI): based on the URN idea. Can allow copyright limitations to control who has what kind of access.
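
The two-part URN idea (naming authority plus object identifier) can be sketched in a few lines of Python. This assumes the common urn:NID:NSS layout; the function name split_urn and the sample URN are only illustrations of mine, not something taken from the reading.

```python
from typing import Tuple

def split_urn(urn: str) -> Tuple[str, str]:
    """Split a URN into (naming authority identifier, object identifier)."""
    scheme, _, rest = urn.partition(":")
    nid, sep, nss = rest.partition(":")
    # Require the literal "urn" scheme and two non-empty parts after it.
    if scheme.lower() != "urn" or not sep or not nid or not nss:
        raise ValueError("not a URN of the form urn:NID:NSS: " + repr(urn))
    # The naming authority part is treated as case-insensitive here.
    return nid.lower(), nss

# Illustrative URN: the authority is the ISBN namespace, the rest names the object.
print(split_urn("urn:isbn:0451450523"))
```

The point is that nothing in the name says where the object lives, which is what separates a URN from a URL.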

Digital Object Identifier
1. DOI is the digital identifier of an object, not the identifier of a digital object. It is a persistent identifier.
2. It includes: syntax (the name), resolution of the name to the object, metadata describing the object, and a social infrastructure (policies) that makes the system interoperable
3. DOI does not preserve the object: it merely finds a way of sharing information about the object.
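
Here is a rough Python sketch of the syntax and resolution pieces. I am assuming the usual prefix/suffix form with a prefix beginning "10." and the public https://doi.org/ resolver; the function names are hypothetical, the sample DOI is just an example, and nothing here actually contacts the network.

```python
from typing import Tuple
from urllib.parse import quote

def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    # Assumption: DOI prefixes start with "10." and both parts are non-empty.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError("not a DOI of the form 10.PREFIX/SUFFIX: " + repr(doi))
    return prefix, suffix

def resolver_url(doi: str) -> str:
    """Build the URL a resolver would redirect from; no request is made."""
    split_doi(doi)  # validate first
    return "https://doi.org/" + quote(doi, safe="/")

print(split_doi("10.1000/182"))
print(resolver_url("10.1000/182"))
```

The name itself never changes; only what the resolver points to gets updated, which is where the persistence comes from.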


These four readings all center on communicating meaning about a given object or text. The characters on the page don't mean anything to a computer on their own, so it is necessary to tag them in an appropriate markup language to convey that meaning. Once you do that, the computer can organize the text the way you want.
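
A tiny Python sketch of what tagging buys you, using the standard library XML parser. The tag names (book, title, author) and the record contents are made up for illustration; they are not from any of the readings.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the tags say what each piece of text means.
record = (
    "<book>"
    "<title>Example Title</title>"
    "<author>A. Author</author>"
    "</book>"
)

root = ET.fromstring(record)

# Because the text is marked for meaning, the program can ask for the
# title or the author directly instead of guessing from position.
title = root.findtext("title")
author = root.findtext("author")

print(title)
print(author)
```

Without the tags, the same characters would just be one undifferentiated string.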

Affixing meaning also applies to identifiers. Without a good identifier, a given object will be very difficult to find. An identifier like a DOI not only helps the user reach the object, it also carries information about the object (metadata) that can be shared across a variety of systems. Because the DOI stays the same even if the object moves, the record remains persistent.

All of this applies to digital libraries. What is the point of having a digital library if you can't find what you are looking for? Or if you think you have found it, but can't be sure without looking at the entire object? Providing information about a given object is absolutely vital in any library, including digital libraries.

And, here is an entirely gratuitous puppy picture, for those who are interested.

We took her camping in Fayette county a few weeks ago. There was a lake there and she swam and swam and swam. She's a water dog, you might say.

Look at those little paws paddling! awwww.
