64th IFLA General Conference
August 16 - August 21, 1998

Code Number: 007-126-G
Division Number: IV.
Professional Group: Cataloging
Joint meeting with: -
Meeting Number: 126.
Simultaneous interpretation:   No

Cataloging / metadata: old wine in new bottles?

Stefan Gradmann, Pica, Leiden, Netherlands

Abstract:

The metadata approach makes it necessary to think about the relationship between descriptive data and referenced elements, especially in the field of document-like objects (DLO). It appears that in this area a certain number of fundamental parameters are subject to major changes. The paper identifies some of the fundamental differences between traditional cataloging and metadata production, or between the context of use and the relationship between metadata and the objects referenced by metadata or cataloging records.

The paper shows that recognizing these differences is a fundamental prerequisite for redefining the role of libraries in a constantly growing and changing information environment.


Introduction

An article written by R. Heery in 1996 (a long time ago, measured against the pace at which Internet standards develop) says: "The familiar data of a library catalog could be described as metadata, in which the catalog data is referred to as 'data about data'." (HEERY 1996a) If this statement is considered valid (and it appears to be, especially semantically), a correct but somewhat naive reaction from a library point of view would be to conclude that cataloging is simply a specific case of metadata generation, and to leave this vague talk aside: just keep cataloging as if nothing had happened.

One of the specific aims of this paper is to identify some of the reasons for the inappropriateness of such a response: the general purpose is to highlight some of the points that will make the metadata theme important to libraries in the near future.

As a result, the paper is not an introduction to metadata policy issues. Such knowledge can easily be obtained via the WWW: good starting points are the collection of material on metadata maintained by IFLA (http://ifla.inist.fr/II/metadata.htm) and the metadata page of UKOLN (http://www.ukoln.ac.uk/metadata/), both of which provide extensive information on all aspects of metadata. Anyone familiar with these general metadata pages will understand why the area is not treated exhaustively here: the paper does not seek to cover all metadata standards and activities, but concentrates on one example, perhaps the most prominent at the moment: the so-called "Dublin Core" (DC) set (background information at http://purl.org/metadata/dublin_core).

The paper is also not meant as a contribution to the relevant standardization processes in the area of DC or of existing and emerging library cataloging rules and formats (such as ISBD (ER)), nor does it argue in favor of one of these models. There are more suitable venues for this (such as the corresponding mailing lists), and there are certainly experts in both fields who are better qualified than the author of this paper to make such contributions.

What concerns me here is rather the question of a possible interrelationship between the cataloging and metadata approaches, and a very tentative and preliminary attempt to provide answers. Some claim that metadata and 'conventional' cataloging records complement each other in some respects, whereas in this paper I would like to emphasize that the two pursue fundamentally different, if not contradictory, working models, which are also based on different working concepts.

There are several good reasons, some explicit and some implicit, why the metadata community did not rely on extending the MARC format but created a completely new set of attributes. Some of these reasons stem from an outside view of what librarians do: a refreshing perspective that should get librarians thinking.

On the flip side, the metadata approach benefits from the bonus that comes with every new start. Once that is used up, metadata-based activities are likely to rediscover some of the problems and pitfalls that have plagued librarians over the past 30 years. While reinventing wheels is sometimes justified (and arguably is, given the current state of library automation), there are, on the other hand, good reasons to avoid mistakes that have already been made by others.

This paper intends to challenge and to stimulate discussion. I apologize for all the necessary simplifications and analogies that I use in this context: they are wrong, like all simplifications and analogies ...

Who does it, how is it done?

When looking at the typical results of DC metadata production, it is tempting, at least from a librarian's point of view, to assume that DC metadata is some kind of simplified cataloging format. Such a perspective is supported by metadata definitions like the following: "Metadata is data about data and thus provides basic information such as the author of a work, the date of creation, references to related works, etc. A recognizable form of metadata is the catalog card in a library; the information on this card is metadata about a book. Perhaps, without realizing it, you use metadata every day in your work ..." (MILLER 1996)

This is fully in line with a similar point that P. Caplan introduced into the DC discussion very early on. In an attempt to answer the question "What is metadata?" she states: "Metadata is really nothing more than data about data; a catalog record is a metadata record, as is a TEI header or any other form of description. We could call it cataloging, but for some people that term carries too much baggage, like the Anglo-American Cataloging Rules and USMARC. So to some extent this is a 'you call it corn, we call it maize' situation, but metadata is a good neutral term that covers all the bases." (CAPLAN 1995) (1)

In another attempt to give an overview of metadata formats, R. Heery mentions cataloging and DC in the same context, but points to a difference in complexity: a variety of formats are listed in the following table, arranged along a continuum from simple records (Band One) to complex, rich records (Band Four). A number of record types used in the bibliographic control process are identified and placed as follows:

Band One                      Band Two      Band Three                Band Four
Proprietary simple records:   Dublin Core   MARC                      ICPSR
NetFirst                      IAFA          TEI independent headers   FGDC
[...]                         [...]         [...]                     [...]
Publishers' CIP forms         CIP MARC      EDI messages

(HEERY 1996a)

All of this seems to indicate that the basic concern of this paper is in fact not an issue at all; it's about nothing but a slight change in terminology and variations in complexity.

More than a slight difference, however, can be seen in the following definition given by T. Berners-Lee: "Metadata is information about online publications or other things that machines can understand", which he continues as follows: "The expression 'machine understandable' is the key point. We are talking about information that software agents can use to make our lives easier, to make sure we obey our principles and the law, to verify that we can trust what we are doing, and to ensure that all of our work runs smoothly and quickly." (BERNERS-LEE 1998)

This is quite different from the "We can call it cataloging" position. While the main goals can be said to be the same as those of cataloging (reliability and authentication of the meta-information), the users of the information are different (software agents rather than library users), and the particular interest in efficiency implies that things should run more "smoothly and quickly" than in cataloging!

The difference becomes clearer when we consider an aspect that originally led to the DC initiative and that Stu Weibel recently recalled: "One of the original motivations for the DC workshops was the idea that authors could create their own descriptions." (WEIBEL 1998) (2) Not only is the production process different; the authors of the meta-information are no longer the "catalogers in libraries".

A further aspect to consider is that the DC initiative aimed at "simplifying the discovery of documents in a network environment" (LAGOZE 1997) and not primarily at document description. The metadata approach therefore fits the description paradigm of library cataloging only incidentally.

In fact, all of this leads to a clearer idea of the explicit and implicit requirements associated with the term metadata: metadata are intended for a context of use that differs from that of library catalogs; they are typically not created by professional catalogers; they are meant to be created more efficiently than cataloging records; they cover a special type of material (electronic documents); and (this is the point elaborated further below) the relationship between metadata and the referenced documents is fundamentally different from the relationship between a cataloging record and a book in a library.

Although the results of metadata production, the current DC data sets, can be semantically similar to a simplified cataloging record (and can easily be converted into a MARC format (3)), the whole context of the production and use of this information is different and is intended to bypass traditional cataloging. To see the metadata creation process as a kind of simplified cataloging is a serious misunderstanding.
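The remark that a DC data set can easily be converted into a MARC format can be made concrete with a small sketch. The crosswalk table below is an assumption chosen for illustration (the tag and subfield choices are not taken from the mapping paper cited in note 3), and the record, the function name dc_to_marc and the example address are hypothetical.

```python
# Hypothetical, heavily simplified crosswalk from unqualified Dublin Core
# elements to MARC (tag, subfield) pairs. The choices are assumptions made
# for illustration, not the mapping from the discussion paper in note 3.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "identifier": ("856", "u"),
}


def dc_to_marc(dc_record):
    """Convert a dict of DC element -> list of values into MARC-like fields."""
    fields = []
    for element, values in dc_record.items():
        if element not in DC_TO_MARC:
            continue  # elements without a mapping are dropped in this sketch
        tag, subfield = DC_TO_MARC[element]
        for value in values:
            fields.append((tag, subfield, value))
    return sorted(fields)  # sorted by tag, as in a MARC record


example = {
    "title": ["Cataloging / metadata: old wine in new bottles?"],
    "creator": ["Gradmann, Stefan"],
    "date": ["1998"],
    "identifier": ["http://example.org/paper.html"],  # hypothetical address
}
marc_fields = dc_to_marc(example)
```

The conversion itself is mechanical, which is the point of the argument: semantic similarity of the records says nothing about how, by whom, and for what environment they were produced.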

Who is it made for? And how is it used?

Cataloging records, as traditionally created by libraries, are more general in the sense that they are only slightly tailored to the potential users of the data. The future usage environment (integration into an OPAC or the filing of printed catalog cards in a card catalog) has so far had very little influence on the way information is created in the cataloging process, and has hardly influenced the semantics (as formalized in cataloging rules such as AACR2). This can be seen as an advantage. However, libraries are becoming increasingly aware of the disadvantage of a lack of end-user orientation in cataloging activities and are being forced to re-examine some of their principles, if only because of the growing awareness of costs in the political arena.

The same is not true of DC and other metadata initiatives: one of their main characteristics seems to me to be that they are very much based on specific end-user requirements. This could be seen as a disadvantage, as changes in end-user behavior and in the user environment can quickly have serious effects and carry the risk of data discontinuity, but nowadays this characteristic is more likely to be seen in a positive light. Whenever DC is introduced, specific types of material (electronic objects in the WWW environment) and specific ideas about the usage environment are cited as arguments for its origin (for example, increasing precision in connection with Internet search engines is one of the ever-recurring arguments in this context), and metadata sets are often developed with a specific user group in mind: the metaphor of the 'digital tourist' is used by the DC community in this sense.

This already applies in some respects to DC semantics. To give just one example: one of the basic assumptions here seems to be the uniqueness of documents, which ignores the fact that a 'work' (in the terminology of the 'Functional Requirements') can have different editions, and that multiple copies of these exist. The result is a 1:1 relationship between metadata and the physical resource, modeled on the "flat" information paradigm of the WWW. (4) This becomes even more tangible in connection with the corresponding syntax proposals, which are clearly based on use in the WWW environment. (5)
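The contrast between a work with editions and copies and the flat 1:1 relationship of a metadata record to one resource can be sketched as two data structures. All class names, field names and addresses below are hypothetical and serve only to illustrate the argument; they are not taken from the Functional Requirements or from any DC specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Copy:
    location: str  # call number or network address of one copy


@dataclass
class Edition:
    label: str
    copies: List[Copy] = field(default_factory=list)


@dataclass
class Work:
    title: str
    editions: List[Edition] = field(default_factory=list)


@dataclass
class FlatRecord:
    """One record per resource: the 1:1 model described in the text."""
    title: str
    identifier: str


def flatten(work: Work) -> List[FlatRecord]:
    """Emit one flat record per copy; edition and work links are lost."""
    return [
        FlatRecord(work.title, copy.location)
        for edition in work.editions
        for copy in edition.copies
    ]


work = Work(
    "Example work",
    [
        Edition("first edition", [Copy("http://example.org/ed1.html")]),
        Edition(
            "second edition",
            [Copy("http://example.org/ed2-a.html"), Copy("http://example.org/ed2-b.html")],
        ),
    ],
)
flat_records = flatten(work)
```

After flattening, the three records share a title but nothing states that two of them are copies of the same edition; that relationship would have to be reconstructed from outside the records.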

The fundamental difference can perhaps best be illustrated by comparing the respective relationships between cataloging records and books, and between metadata and the referenced documents.

In most local library systems, bibliographic records are usually supplemented by item records that contain the call number and thus the location of a book. The call number is then normally used, with the aid of the circulation functions of the respective library system and often with additional help from staff, to provide users with the object of their request, the corresponding book or document. The basic point is that this purpose of use has little or no effect on the bibliographic record and on cataloging.

These conditions are fundamentally different from those for metadata, as emphasized by R. Heery:

"Metadata differ from traditional cataloging data in that the address of an item of information is contained in the data set in a form that enables direct document access via appropriate application software, in other words, the data set contains precise access information and the network address." (HEERY 1996b)

Metadata is thus part of a specific technical information infrastructure. In some respects this even applies to the semantics, which were originally intended to be independent of the environment. The current value of a metadata record is determined to a large extent by two facts: that access to the document is actually contained in the record and also works (this explains the discussion within the DC community about the problem of 'non-functioning links' and the necessary connection with URN and other identification-related standardization processes), and that this access information meets the technical requirements of the application software needed to reach the information. To put this point in a very simplified way, one could say that a metadata record that contains an invalid access point is worth almost less than no record at all.

The conclusion of this section is that metadata is not only based on a different concept of creation, but also differs from cataloging records in its usage environment, to which it is technically tied to a high degree. While this seems to simplify matters a lot (direct document access made possible through standardized access methods), paradoxically it complicates things at the same time, since the role of metadata records in the information infrastructure depends on the development of rapidly changing Internet standards (to be clear, this is simply a statement of fact and should not be understood as a criticism of the metadata approach).
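A minimal sketch of what it means for the value of a record to depend on working access information is given below. The record layout, the function names and the example addresses are assumptions; the reachability check is passed in by the caller (for example an HTTP probe) so that the sketch itself needs no network access.

```python
from urllib.parse import urlparse


def has_access_point(record):
    """Return True if the record carries a syntactically usable network address."""
    parts = urlparse(record.get("identifier", ""))
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)


def usable_records(records, is_reachable):
    """Keep records whose address is well formed and passes the caller's check."""
    return [
        r for r in records
        if has_access_point(r) and is_reachable(r["identifier"])
    ]


# Hypothetical records; the addresses are placeholders.
records = [
    {"title": "A", "identifier": "http://example.org/a.html"},
    {"title": "B", "identifier": "http://example.org/gone.html"},
    {"title": "C", "identifier": "shelf 12"},  # a location, not a network address
    {"title": "D"},                            # no access information at all
]

dead_links = {"http://example.org/gone.html"}  # stand-in for a real link check
kept = usable_records(records, lambda url: url not in dead_links)
```

In this sketch only record A survives. A cataloging record would remain a valid description in cases B to D, whereas a metadata record in the sense discussed here loses most of its value.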

... and an opportunity for librarians?!

Some of the fundamental differences between bibliographic records and metadata, as well as the different creation mechanisms, should now be clearer: clear enough to understand that the two approaches belong to different information infrastructures and necessarily respond to them, even if there are points of contact and similarities.

It may well be possible to combine both information paradigms, as in the proposal by XU (1998) to use the library OPAC as a gateway to metadata repositories. I do not want to discuss this in detail, even though I have my personal doubts about its immediate practicality. However, this is an important direction for librarians to explore, and some of the current work at my institution, Pica, which combines library automation and Internet technologies, is going in the same direction, for instance in our WebDOC and DELTA projects.

However, there are other areas in which the metadata community can benefit from specific library knowledge and experience (or already does, given the many people who represent the library world in that community), and this perhaps applies to the so-called "qualified DC" to a greater extent than to the "simple DC". I am thinking of examples like the lessons that can be learned from the MARC experience and its footnote architecture, or the use of controlled vocabulary, which can lead to new discussions similar to those held in the library world about authority data in the past. There are other areas of this kind in which the necessary reinvention of the metadata wheel can avoid (and already avoids) problems that have been recognized in earlier contexts.

I would like to end this paper by pointing out two subject areas in which continuous, substantial contributions from the library world can be seen as particularly valuable for the metadata approach. A participant on the meta2 mailing list recently made the following observation:

"My own experience shows that what enables better results in library catalogs is not so much the format itself, but the information that has been put into the format. Librarians have traditionally followed the concept of consistency when creating library records (a consistent form of a name, a title and a definition of content). I gladly agree that it is a big step forward to search for "Green" as a name separately from "green" in the title, but it is nothing compared to the ability to choose the correct "David Green" from a multitude of people with the same name." (WEINHEIMER 1998)

Consistency and authority data really are areas of particular concern to librarians, and they may also play a role in the context of metadata by contributing to the overall consistency of what is produced. This also makes it clear that the intention here cannot be to convert the metadata approach back into a kind of traditional cataloging!
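The difference between searching a name field and selecting one particular person through authority data can be illustrated with a toy example. Everything below is invented for the illustration: the authority entries, the identifiers, the dates in the headings, the records and the function names.

```python
# Hypothetical authority file: one entry per person, with a unique id and
# one consistent heading. The dates in the headings are invented.
AUTHORITY = {
    "n001": "Green, David, 1950-",
    "n002": "Green, David, 1972-",
    "n003": "Smith, Anna",
}

# Hypothetical records. "creator" is the name as it appears on the item,
# "creator_id" is the link to the authority entry.
RECORDS = [
    {"title": "Green pastures", "creator": "Anna Smith", "creator_id": "n003"},
    {"title": "Metadata basics", "creator": "David Green", "creator_id": "n001"},
    {"title": "Network access", "creator": "David Green", "creator_id": "n002"},
]


def search_any_field(records, term):
    """Unfielded search: 'green' matches titles and names alike."""
    term = term.lower()
    return [
        r for r in records
        if term in r["title"].lower() or term in r["creator"].lower()
    ]


def search_creator_field(records, term):
    """Fielded search: only names match, but namesakes are still mixed."""
    term = term.lower()
    return [r for r in records if term in r["creator"].lower()]


def search_by_authority(records, authority_id):
    """Authority-controlled search: exactly one person's works."""
    return [r for r in records if r["creator_id"] == authority_id]


any_hits = search_any_field(RECORDS, "green")
name_hits = search_creator_field(RECORDS, "green")
person_hits = search_by_authority(RECORDS, "n001")
```

The fielded search removes the title match, which is the step forward the quotation concedes; only the authority link separates the two people called David Green.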

The second area I am thinking of is closely related to this and concerns the problem of metadata authentication. The recent report from the EC Metadata Workshop in Luxembourg states "that the current take-up of the Dublin Core is slow and that critical mass is missing". One of the problems is that search engines like AltaVista generate metadata from indexed keywords, and the lack of metadata authentication is one of the main reasons for this. A message from S. Weibel to meta2 describes the problem as follows: "But I have come to the conviction that we are moving from a 'where do I click' mentality to a 'whom do I trust' mentality. As a representative of the library community, I see this as an opportunity as well as a problem: public trust is our most important asset.

There are other functional communities that also help disseminate reliable document description: museums, governments, publishers, professional and commercial organizations. There is room for abuse in each of these systems, and there will be (or already is) in the metadata area as well. It becomes even more critical that those who offer reliable document descriptions in accordance with their mandate find common conventions (including the evaluation instruments) on which we can build the future that we imagine." (WEIBEL 1998)

The following proposal has been made in this context:

"We doubt that proper navigational meta-information is offered by recognized organizations. I would expect something like yellow pages to be developed - it will cost money to describe the resources; the more you pay, the more listings you will get. I'm referring to commercial listings here; I expect services like AltaVista, Yahoo!, and others to continue their free offerings, but I wouldn't be surprised if they focus on descriptive resources for sale." (ARNETT 1998)

I am not sure whether this is a promising or a desirable path: there may be a way to involve public institutions like libraries in this necessary process of providing information. While I would agree that there is a need for recognized agencies in this process, I am not sure that all of us would be happy to be totally reliant on commercial services in this vibrant process of information dissemination. Since this very idea runs counter to the current wave of deregulation, I think that it is an important point to think about.

To take up the title of this article: It should now have become clear that metadata is more than just a buzzword and not (just) old wine in new bottles. The approach this term stands for stems from an information paradigm that differs from library cataloging work; and I think that libraries should feel invited to follow this development closely and see it not as a possible threat, but rather as an opportunity to redefine their role in the context of the emerging information landscape.

Literature:

Arnett, Nick: Re: authentication of metadata. [email protected] (23 Jan. 1998) (= ARNETT 1998)

Berners-Lee, Tim: Metadata Architecture. Documents, Metadata, and Links. Last edit date: 1998/02/06 17:06:46. http://www.w3.org/DesignIssues/Metadata.html (= BERNERS-LEE 1998)

Caplan, Priscilla: You Call It Corn, We Call It Syntax-Independent Metadata for Document-Like Objects. In: The Public-Access Computer Systems Review 6, no. 4, 1995. http://info.lib.uh.edu/pr/v6/n4/capl6n4.html (= CAPLAN 1995)

Heery, Rachel: Metadata Formats. December 1996. Deliverable D1.1 - Work Package 1 of Telematics for Libraries project BIBLINK (LB 4034). http://www.ukoln.ac.uk/BIBLINK/wp1/d1.1/ (= HEERY 1996a)

Heery, Rachel: Review of Metadata Formats. In: Program, Vol. 30, No. 4, October 1996, pp. 345-373 (= HEERY 1996b)

Lagoze, Carl: From Static to Dynamic Surrogates. Resource Discovery in the Digital Age. In: D-Lib Magazine, June 1997. http://www.dlib.org/dlib/june97/06lagoze.html (= LAGOZE 1997)

Miller, Paul: Metadata for the Masses. In: Ariadne, 5, Sept. 1996. http://www.ariadne.ac.uk/issue5/metadata-masses/ (= MILLER 1996)

Metadata Workshop, Luxembourg - 1-2 December 1997. Workshop Report. http://hosted.ukoln.ac.uk/ec/metadata-1997/report/

Miller, Paul: An Introduction to the Resource Description Framework. In: D-Lib Magazine, May 1998 (= MILLER 1998)

Olson, Nancy B. (Ed.): Cataloging Internet Resources. A Manual and Practical Guide. Second edition. http://www.oclc.org/oclc/man/9256cat/toc.htm (= OLSON)

A User Guide for simple Dublin Core. Draft version 4.0 (05/15/1998); http://128.253.70.110/DC5/UserGuide4.html (= USER GUIDE)

Weibel, Stuart: Re: authentication of metadata. [email protected] (23 Jan. 1998) (= WEIBEL 1998)

Weibel, Stuart and Hakala, Juha: DC-5: The Helsinki Metadata Workshop: A Report on the Workshop and Subsequent Developments. Official report of the Helsinki DC Meeting. In: D-Lib Magazine, February 1998, http://www.dlib.org/dlib/february98/02weibel.html (= WEIBEL / HAKALA 1998)

Weinheimer, James: Re: authentication of metadata. [email protected] (23 Jan. 1998) (= WEINHEIMER 1998)

Xu, Amanda: Metadata Conversion and the Library OPAC. In: The Serials Librarian 33 (1-4) (Spring 1998), http://web.mit.edu/waynej/www/xu.htm (= XU 1998)

Notes:

  1. In connection with this definition, the DC drafts that are now being created are placed on the same level as other "standards from AACR2 to GILS that define metadata records" (CAPLAN 1995).
  2. Accordingly, specialist publications from the library sector such as OLSON do not mention the DC as a relevant cataloging tool.
  3. As evidenced in: Mapping the Dublin Core Metadata Elements to USMARC. OCLC Discussion Paper No. 86, May 1995. (http://ifla.inist.fr/documents/libraries/cataloging/dublin1.txt) and elsewhere.
  4. In WEIBEL / HAKALA 1998, this principle is "maintained" in clear awareness of the "complexity of relationships with related works" that resist a uniform definition or explanation.
  5. The current proposals for using XML based on the RDF syntax are a good example of this; cf. MILLER 1998