Archive for May, 2010

The future of bibliographic control: Data infrastructure

May 3, 2010

In January 2008, the Library of Congress Working Group (LCWG) on the Future of Bibliographic Control issued a report with findings on the state of bibliographic control, recommendations for the Library of Congress as well as libraries of all types across the country, and predictions of what might happen if the recommended changes do not take place. [1] Recommendations ranged from increasing the efficiency of bibliographic record production, to positioning cataloging technology – and the cataloging community as a whole – for the future, to strengthening the library and information science profession. As someone who works with other library technologies but has no cataloging experience (other than MLIS coursework), my interest in this topic lies primarily in the future of cataloging technology. As I see it, the future of bibliographic control is tied to data infrastructure.

I don’t think I did a satisfactory job of explaining my position in an earlier post; my criticism of RDA is not based on the vocabularies or the not-yet-released text but on the decision to retain the MARC21 standard with the implementation of RDA rules based on the FRBR model. I strongly believe that updating the metadata infrastructure will have benefits in several of the areas discussed in the LCWG report, including sharing bibliographic data, eliminating redundancies in cataloging, and strengthening the LIS community’s place in the information arena. Even before the report’s release two years ago, the cataloging community was issuing calls for a more extensible metadata infrastructure that would permit data sharing both within and outside libraryland. [2] An important outcome (perhaps the most important outcome?) of the LCWG report is the increased discussion of the metadata infrastructure issue among the cataloging community in the literature, the blogosphere, and email listservs. [3, 4]

Reducing redundancy and increasing efficiency by sharing metadata

The first set of recommendations in the LCWG report dealt with eliminating redundancies; this goal has not been accomplished yet, but the cataloging community’s discussions about formatting records in RDA to facilitate sharing among entities within and outside libraryland are a start. Among the LCWG recommendations to increase use of bibliographic data available earlier in the supply chain were recommendation 1.1.1.2:

“All: Analyze cataloging standards and modify them as necessary to ensure their ability to support data sharing with publisher and vendor partners;”

and recommendation 1.1.1.5:

“All: Work with publishers and other resource providers to coordinate data sharing in a way that works well for all partners.”

Not to minimize concerns about loss of bibliographic control, but libraries might as well take advantage of the metadata created elsewhere by trusted partners; if partners are selected carefully, the benefits should outweigh the risks. Shared metadata could reduce the number of redundant records and, by distributing the responsibility (or burden, if you wish) of creating metadata among more players, each party might reclaim time, funding, or manpower to put toward other efforts. Of course, these arguments are essentially theoretical until they can be tested. Getting libraries to a point where we can try sharing more data with other entities and information sources will require a shift in attitudes and comfort zones as well as a change in the technology supporting our records.

Positioning our technology for the future

Section 3 of the LCWG report called for the greater cataloging community to “Position Our Technology for the Future.” [1] The first recommendation in this section was to “develop a more flexible, extensible metadata carrier,” including:

“3.1.1.1 LC: Recognizing that Z39.2/MARC are no longer fit for the purpose, work with the library and other interested communities to specify and implement a carrier for bibliographic information that is capable of representing the full range of data of interest to libraries, and of facilitating the exchange of such data both within the library community and with related communities.”

One potential replacement for the MARC standard is RDF/XML. The RDF (Resource Description Framework) data model and XML (eXtensible Markup Language) syntax existed well before the release of the LCWG report but are getting more attention from the cataloging community as discussion turns to data management and the Semantic Web. [5] Although other languages might prove suitable, XML is well enough established to be in wide use, including by many of our potential partners in metadata sharing.
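To make this concrete, here is a minimal sketch of what a bibliographic description might look like in RDF/XML, using the Dublin Core element set. The record URI, title, and creator are invented for illustration; the subject URI is a real LCSH identifier published by LC.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- The thing being described is named by a URI, not a free-text
       string, so other systems can point at exactly this record. -->
  <rdf:Description rdf:about="http://example.org/records/12345">
    <dc:title>Fencing for Beginners</dc:title>
    <dc:creator>Doe, Jane</dc:creator>
    <dc:date>2009</dc:date>
    <!-- The subject is a link into a shared vocabulary (here LCSH)
         rather than a locally keyed text string. -->
    <dc:subject rdf:resource="http://id.loc.gov/authorities/sh93010603"/>
  </rdf:Description>
</rdf:RDF>
```

Any RDF-aware tool can parse this without knowing anything about library-specific record layouts, which is precisely the interoperability argument being made here.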

Like HTML, XML uses tags for markup, but it is far more adaptable (“extensible”) because users can define their own tags to serve as “containers” for different types of data. Programs can then use those tags to manipulate and display the data, pulling records from Web-accessible databases by their unique identifiers. Essentially, XML enables computers to read and process data, which is one of the main principles of the Semantic Web. MARC was designed to make metadata readable by machines, too (hence the name MAchine-Readable Cataloging), but the problem is that no one outside of libraries, publishers, and distributors is using MARC21. XML, on the other hand, is not only machine-readable but also machine-actionable, and it isn’t limited to libraries and related industries; it’s used in all kinds of fields.

What does this have to do with the future of bibliographic control? Packaging our metadata in a form that is flexible, machine-accessible, and, perhaps more importantly, used by others outside of libraries but within the information arena would permit more give-and-take in record creation, hopefully resulting in less duplication of effort and more accurate records (as long as everyone uses the same tags, which was touched on by recommendations 3.1.1.2 and 3.1.3 in the LCWG report and is another discussion unto itself). By letting the machines do the heavy lifting, so to speak, we could then use the data more efficiently and with more confidence. This would benefit both the cataloging community and our users.
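To illustrate the difference between merely machine-readable and machine-actionable, here is a toy sketch in Python; the record and its tag names are invented for this example, not drawn from any standard:

```python
import xml.etree.ElementTree as ET

# A hypothetical record using locally defined tags; the tag names
# are invented for this sketch, not part of any standard.
record_xml = """
<record id="12345">
  <title>Fencing for Beginners</title>
  <creator>Doe, Jane</creator>
  <subject uri="http://id.loc.gov/authorities/sh93010603">Fencing coaches</subject>
</record>
"""

record = ET.fromstring(record_xml)

# Because each piece of data lives in its own named container,
# a program can address it directly...
print(record.findtext("title"), "/", record.findtext("creator"))

# ...and follow the subject's identifier out to a shared,
# Web-accessible vocabulary.
subject = record.find("subject")
print("subject term:", subject.text)
print("subject URI: ", subject.get("uri"))
```

A display-oriented format offers no such handle: a program would have to guess which line of text was the title.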

Go where the patrons are, or: How I learned to stop worrying and love Web 2.0

Library users are demonstrating different search strategies than in the past; users now often seek bibliographic information from sources like Amazon.com and Google instead of the OPAC. [6] Web-based tools like LibraryThing pull in bibliographic metadata, reviews, and references to the item found elsewhere online (such as on Wikipedia). Sources like Amazon.com and Google are often more intuitive than a typical OPAC, so it’s not surprising that users gravitate to what they are comfortable with. Instead of watching from the sidelines, libraries should join in and take advantage of the metadata that’s already available on the Web. The phrase “Go where your patrons/users/customers are” is often applied to libraries’ use of Web-based technologies and social media, and it is applicable here too.

In addition to importing jacket cover images and professionally generated reviews from non-library sources, some library OPACs are also satisfying users’ desire to contribute user-generated content like ratings, reviews, and comments. Despite the increase in user-generated content, and users’ desire to create it, libraries want to maintain bibliographic control by not permitting users to edit catalog data. Although maintaining control in this manner is understandable, given that most users lack cataloging training, it seems that libraries could harvest some data from users – with some limits on what can be edited – with less effort than doing all original cataloging and without sacrificing the integrity of data created by trained catalogers. In other words, wouldn’t some help be better than no help? I don’t think the question can be answered adequately without giving it a shot. It is in the best interest of the LIS profession to implement and embrace the Web 2.0 features our patrons want; we can benefit from the give-and-take of metadata from patrons and other sources while keeping ourselves relevant as an online source of information.

Still waiting for the right outcome

In these areas, the outcome of the LCWG report so far has been more discussion than decision. In addition to a data container that will work with others inside and outside libraryland, a new data structure would, ideally, provide catalogers with linked access to standard vocabularies and accommodate newer forms of metadata like user-generated ratings, reviews, and tags. Developing standards is such an intricate and complex process, though, that it is better to take the time to examine the situation thoroughly and try to get it right the first time than to rush into a “solution” that does not support the desired functions and lacks long-term viability. That was part of the reasoning behind the LCWG’s recommendation 3.2.5 to suspend work on RDA – “Assurance that RDA is based on practical realities as well as on theoretical constructs will improve support for the code in the bibliographic control community” (p. 30) – a recommendation which has not been adopted by the Joint Steering Committee for Development of RDA. The retention of MARC21 will have implications for libraries’ ability to implement other LCWG recommendations, which might be realized sooner with the proper metadata infrastructure.

Notes

  1. Library of Congress Working Group on the Future of Bibliographic Control. (2008). On the Record: Report of the Library of Congress Working Group on the Future of Bibliographic Control. January 2008. Retrieved from: http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf.
  2. See, for example: Coyle, K. & Hillmann, D. (2007). Resource Description and Access (RDA): Cataloging rules for the 20th century. D-Lib Magazine, 13(1/2). Retrieved from: http://www.dlib.org/dlib/january07/coyle/01coyle.html.
  3. Coyle, K. (2010). RDA vocabularies for a twenty-first-century data environment. Library Technology Reports, 46(2). Retrieved from: http://alatechsource.metapress.com/content/k7r37m5m8252/?p=f27fdbe2e2904acfbea08ee4c96e8ad8&pi=1 (links to each of the six chapters/articles available here).
  4. RDA-L (RDA email listserv). Retrieved from: http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/.
  5. See, for example: Coyle, K. (2010). Understanding the Semantic Web: Bibliographic Data and Metadata. Library Technology Reports, 46(1). Retrieved from: http://alatechsource.metapress.com/content/g212v1783607/?p=a596ecbea377451cbc6a72c8e28bb711&pi=2 (links to each of the three chapters/articles available here).
  6. De Rosa, C. et al. (2005). Perceptions of Libraries and Information Resources. Dublin, OH: OCLC Online Computer Library Center. Retrieved from: http://www.oclc.org/reports/pdfs/Percept_all.pdf.

Improving Access to Rare, Unique, and Other Special Hidden Materials – It’s happening…

May 3, 2010

There were many great recommendations made in “On the Record” that produced a number of outcomes, some of which have already been touched upon in previous posts. The outcome I want to focus on is the improvement in access to rare and unique materials in special collections and archives. I feel that some of these improvements have resulted from LC stating in this report that these materials need to be brought to light and made accessible. A number of great objectives were listed for improving access, but I will focus on only a few.

2.1.2 Streamline Cataloging for Rare, Unique, and other Special Hidden Materials, Emphasizing Greater Coverage and Broader Access.

In the Library of Congress response to “On the Record,” a number of planned actions are listed for this particular objective, such as developing and sharing cataloging workflows that will allow the objective to be carried out in a practical manner. Most importantly, it mentions the need to develop technology that will automate metadata production.

A number of presentations at workshops and conferences have since focused particularly on cataloging workflows for special collections and rare materials, in what I feel is a response to these planned actions. An example is the Hidden Collections Symposium <http://www.clir.org/hiddencollections/symposium20100329.html> held in March 2010, where several presentations focused on cataloging workflows for special collections. The presentation on the “African Set Maps” project at the Library of Congress <http://www.clir.org/hiddencollections/symposium/LibraryofCongress.ppt> really highlights this push for greater coverage and broader access. The presentation makes clear that they are implementing linked data and attempting to share these materials via the web. They have links that go from the historical maps to a Google Earth view that highlights the area as it currently exists. One slide in particular gives a great chart of the interaction between the MARC records, the data entry form, the LC Web Portal, and Google Earth. My biggest takeaway from this relates to something Giwilker said in her post RDA’s “Legacy Approach”: “The trouble is that new solutions will have to be built on the old infrastructure. MARC can, and should be, expanded to deal with additional types of information, without reducing its present usefulness.” The map collection project underway at the Library of Congress shows that this is just what can be done by integrating the old infrastructure with new technologies.
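As a rough sketch of how such a link can ride along in a traditional MARC record, field 856 (Electronic Location and Access) can point from a map’s bibliographic record out to a Google Earth overlay; the URL and link text below are invented for illustration:

```
856 42 $u http://example.loc.gov/maps/africa/set42.kmz
       $y View this map's area in Google Earth
```

Here the first indicator (4) marks an HTTP link and the second (2) marks a related resource; the underlying MARC structure does not have to change at all.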

2.1.4 Encourage Digitization to Allow Broader Access.

I have been applying for jobs lately, and I have noticed a significant increase in the number of positions for digital librarians – far more than I saw when I started looking for jobs in 2008. I recently applied for one at the University of San Francisco for which the main duty would be to assess which areas of the library’s holdings should be made available digitally. While I have no direct sources to confirm that this is a result of “On the Record,” I think it’s not a terrible stretch to think that many departments may have been inspired by it.

2.1.4.1 LC: Study possibilities for computational access to digital content. Use this information in developing new rules and best practices.

At the PLA annual conference I sat through a demonstration of LibLime’s product ArchivalWare. I think this is a good example of products being developed that make use of computational access to digital content. This product allows users to search digital content via traditional Boolean searches, pattern searches, and concept searches. Pattern searches let users misspell a term and still retrieve the information they were looking for; similarly, if they type words out of order, the pattern search will retrieve them in the order they were intended. Add controlled vocabularies and this becomes an expanded and, in my opinion, exciting way to search for material in special collections. Concept searches were its coolest feature: controlled vocabularies are added to the bibliographic records to expand terms found in the collection. In the demonstration they showed how a user could search for “contaminated water” and retrieve material containing the text “polluted water,” because it was a related term in the controlled vocabulary. As with the map project at the Library of Congress, all of this is being done with traditional MARC records, in Koha.
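A concept search of this kind can be approximated with a simple query-expansion step: look the user’s terms up in a controlled vocabulary, add the related terms, and match records against any of them. Here is a minimal sketch in Python; the vocabulary and records are invented, and ArchivalWare’s actual implementation is surely more sophisticated:

```python
# A toy controlled vocabulary mapping a term to its related terms.
RELATED_TERMS = {
    "contaminated water": ["polluted water", "water pollution"],
}

RECORDS = [
    "Report on polluted water in the harbor district, 1954",
    "Minutes of the parks committee, 1961",
]

def concept_search(query, records):
    # Expand the query with any related terms from the vocabulary...
    terms = [query] + RELATED_TERMS.get(query.lower(), [])
    # ...then keep records containing any of the expanded terms.
    return [r for r in records
            if any(t.lower() in r.lower() for t in terms)]

print(concept_search("contaminated water", RECORDS))
# -> ['Report on polluted water in the harbor district, 1954']
```

The user asked for “contaminated water” and got back a record that only ever says “polluted water” – exactly the behavior shown in the demonstration.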

All of these projects I see happening make me think that the time spent re-training cataloging staff in RDA would be better spent training them in newer technologies that can be incorporated into the older infrastructure. I was once told that in order to become a good librarian, I would need to learn programming. I took an introductory programming class at the University of Washington and, to my dismay, found that I was no good at it … I thought about my future and my goal of being a cataloging librarian. Would I make a terrible cataloger if I couldn’t program? Will I be able to improve access to rare and unique materials if I can’t understand the new technology being developed? I have no answers to my own internal questions. However, I am pleased to see others coming up with such fantastic solutions to access issues for special materials.

http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf

http://www.loc.gov/bibliographic-future/news/LCWGResponse-Marcum-Final-061008.pdf

http://www.clir.org/hiddencollections/symposium20100329.html

http://www.clir.org/hiddencollections/symposium/LibraryofCongress.ppt

Questions about the Chaos

May 3, 2010

Diane encouraged us to post questions here for Jon about his lecture.

I’m curious about the distinction between the Semantic Web and the Linked Open Data Web. I was just getting my mind around having 2 webs, not 3! I know that linked data aren’t necessarily open and open data aren’t necessarily linked… but why doesn’t it matter if thinking machines can’t figure out the linking on this third web?

Also, I was wondering about the appearance of a person’s name as a Spanish label for Daytona Beach. This mistake got fixed when DBpedia was refreshed. If we were dealing with an error like this with linked library data, would a human being who noticed it be able to fix it?

Thanks!

User-generated cataloguing and its expanding role

May 1, 2010

Although it provides a set of guidelines for the library world as a whole, “On the Record”, the report of the Library of Congress Working Group on the Future of Bibliographic Control, was primarily directed at the Library of Congress itself. By June of the same year, the Library of Congress had published a report responding to the recommendations of “On the Record”. That report is available online at http://www.loc.gov/bibliographic-future/news/LCWGResponse-Marcum-Final-061008.pdf; it makes the point early on that it “is not an official program statement from the Library of Congress, nor is it an implementation plan”. However, the report is generally enthusiastic about the vision outlined in “On the Record”.

Among the recommendations of “On the Record” were three suggestions in section 4.1, “Design for Today’s and Tomorrow’s User”, regarding greater integration of library bibliographic data with external sources. The changes discussed are not those related to the possible use of FRBR to make library bibliographic data part of the Semantic Web; here “On the Record” deals with user-created content, such as ‘tags’ and user-supplied reviews. The Library of Congress responded to these suggestions by naming a number of small projects focused on similar goals, and resolving to support their work and seek out more such projects. Among the projects named were:

  • The Library of Congress’s own Bibliographic Enrichment Advisory Team (BEAT), which attempts to add data such as tables of contents and reviews to bibliographic records.
  • WPopac Project at Plymouth State University, which supports user-generated tags for records
  • PennTags at the University of Pennsylvania Libraries, similar to WPopac
  • The Library of Congress’s Prints and Photographs Division Flickr project, which allows users to tag photos, with significant guidelines regarding how to tag

These projects already existed before “On the Record” appeared, and the Library of Congress did not propose any new projects in response to the report.

The fate of these projects has been mixed:

  • The BEAT project’s website has not been updated since 2008, and the project itself is not mentioned in any other articles I have been able to locate since that date; as the website describes it as an all-volunteer project, it’s possible that the volunteers involved have moved on without anyone taking up the slack.
  • WPopac has been renamed Scriblio, and is still being worked on; the software is in use by several libraries. It is not clear if Scriblio still supports the addition of user-generated content.
  • PennTags still exists, and the University of Pennsylvania’s Franklin Library online catalog still includes a link to “Add to PennTags” at the bottom of each record. However, I was unable to locate any records which included PennTags content. Either the system has not been utilized enough to make tags available for all or most records, or the tags are only visible to those logged into the website as members of the UPenn community; while only allowing community members to contribute tags makes sense, it seems odd that only community members would be allowed to use them for browsing.
  • However, the Library of Congress’s Prints and Photographs Flickr project has been healthy and successful. In October of 2008, a report was released discussing the progress of the project. At that point, of the close to 5000 images the LOC had uploaded to Flickr, almost all had been tagged, and more than 65,000 tags had been added in total to the collection. Users had also added helpful comments to many photos, including information such as detailed description of locations and events, and links to related photos.

The Prints and Photographs project had one additional outcome that demonstrates the power of user-supplied content: it inspired the creation of the Flickr Commons, a more broad-reaching project to allow Internet users at large to assist in describing and tagging image collections of historical or cultural importance. This project currently includes collections from over 40 institutions, and is an apt demonstration of the power of user-supplied content.

The success of the Flickr Commons compared to other projects involving user-generated content suggests that the guidance suggested in recommendation 4.1.2.3 is a vital component in developing useful user-generated content. Other factors which might have contributed to its success were a connection with an already thriving collaborative community, as Flickr has a large user base and publicized the project heavily, and ease of contributing, as photographs are relatively uncontroversial and easy to identify. Collaborating with existing sites might be a worthwhile strategy to pursue for libraries attempting to add user-generated content.

The “Response” pointed out that “the relationship of entry vocabulary to controlled terms is a challenge for all catalogs”, and that accordingly, much guidance will need to be provided to allow users to add useful and meaningful tags. It mentioned the extensive guidance provided in the Flickr project as a reason for its success. This is an important point that will need to be remembered. The response also mentions the ongoing debate about pre-coordination and post-coordination of Library of Congress subject headings; the implications of user-generated content as related to subject headings are too numerous to discuss in detail here.

Although the “Response” shows good intentions, it seems that little has in fact been accomplished in response to its recommendations regarding community interaction. The Flickr Commons project arose independently of “On the Record”, although the Commons is aligned with the community-related goals stated in it. However, its success demonstrates that user-generated content can be used effectively to enhance a catalogue and that, indeed, given a sufficiently large and motivated community, it can become a catalogue in itself.

Why have more libraries not made the leap to include user-generated content? A possible reason is identified in the “Response” itself, although not as an answer to that question: “preserving the library-created data is essential to both access and reuse in the future”. Libraries may be concerned that user-generated content would not be sufficiently differentiated from content originating with catalogers, and that the content might not be of the best possible quality. This is a legitimate concern. However, accepting user-generated data “without interfering with the integrity of library-created data” (as suggestion 4.1.2.1 puts it) is a technical problem, not a systemic one. Appropriate OPAC interface design should allow segregation by content source, and perhaps even allow users to hide content from external sources if they do not think it helps them locate information.
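On that technical point, segregation can be as simple as storing a provenance field with every piece of content and filtering on it at display time. A minimal sketch in Python, with all names invented for the example:

```python
from dataclasses import dataclass

@dataclass
class CatalogNote:
    record_id: str
    text: str
    source: str  # e.g. "cataloger" or "patron"

notes = [
    CatalogNote("rec42", "First edition, with errata slip.", "cataloger"),
    CatalogNote("rec42", "Great read, highly recommended!", "patron"),
]

def visible_notes(notes, show_patron_content=True):
    """Library-created data is always shown; user-generated content
    can be hidden at the reader's request without being deleted."""
    return [n for n in notes
            if n.source == "cataloger" or show_patron_content]

for note in visible_notes(notes, show_patron_content=False):
    print(note.text)  # prints only the cataloger's note
```

Because the two streams never mix in storage, the integrity of the library-created data is preserved no matter what users contribute.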

Included as well under the general heading of “positioning our community for the future” were the broad suggestion of additional testing of FRBR (section 4.2) and a number of specific proposals related to reshaping the Library of Congress Subject Headings (under section 4.3). Proposal 4.3.1.2 was to “Make LCSH openly available for use by library and non-library stakeholders”; 4.3.1.1 was to “Transform LCSH into a tool that provides a more flexible means to create and modify subject authority data.” Again, an existing project was singled out to implement the suggestion: in this case SACO, the Subject Authority Cooperative Program, which provides a means for libraries to submit proposed changes to the LCSH quickly and easily. This project is still ongoing and is working very well; more than 50,000 proposals had been submitted as of January 2010. Currently, only libraries can participate, and they are required to have access to LOC’s online authority files. Making LCSH more openly available might facilitate the goal of improving it, as it would allow more public review and suggestions for improvement.

4.3.3.1 suggested creating more linkages between LCSH and other subject authority files; the Response said this was technically unfeasible, although desirable. If the creation of such subject linkages were done with the assistance of interested members of the public, the unmanageably large task might become possible to complete.

The possibilities of user-generated content for enhancing catalogues are well known, and “On the Record” acknowledged and encouraged them. Although the “Response” agreed that the proposals were good ideas, so far little has been done to implement them. In that sense, the report cannot be said to have had results at all. Nonetheless, that it brought these ideas into the public eye and that LOC agreed in principle with their ideas suggests that user-generated data has not been banished from the world of cataloguing – only put aside in the face of ongoing struggles to keep up with technology, and the ongoing RDA debate.

LC and the Semantic Web

May 1, 2010

Early in 2008, the Library of Congress Working Group on the Future of Bibliographic Control published the report On the Record. This Report, directed at both the Library of Congress and the American bibliographic community at large, was to serve as a “call to action” that “informs and broadens participation in discussion and debate, conveys a sense of urgency, stimulates collaboration, and catalyzes thoughtful and deliberate action.” (p. 3) The Working Group recognized that the “future of bibliographic control will be collaborative, decentralized, international in scope, and Web-based.” (p. 4) Pursuant to this assertion, the Working Group made a number of recommendations that push libraries ever closer to participation in the Semantic Web. The Semantic Web, with its inherent organization and utilization of linked data, is an ideal venue for libraries and, more importantly, library data. By putting library data into the structure of the web itself, thus allowing this data to be used by a wide variety of information communities, libraries can gain a greater sense of relevance and importance in this increasingly digital world.

This driving of library practice and standards towards the Semantic Web, and the clear articulation of both the benefits of the change and the problems inherent in the status quo, is perhaps one of the most important outcomes of the Report since its publication. Yet this drive has not been without difficulty or setbacks. While the Report calls for sweeping change and increased collaboration and communication, the actions of the Library of Congress and the reluctance of the library community do not necessarily echo this charge. As discussed in the Report and on the message boards for this course, the fate of libraries as significant information providers hangs on their ability to follow their users into the web. The third recommendation of the Report exhorts libraries to position themselves and their technology for the future “by recognizing that the World Wide Web is both our technology platform and the appropriate platform for the delivery of our standards.” (p. 5) Additionally, the library community must recognize that “machine applications” are also potential users of “the data we produce in the name of bibliographic control”. (p. 5) Currently, libraries, via their catalogs, are on the web. However, the data that libraries produce is locked within their databases. This data is not in the web, in that it cannot be utilized, shared, mashed up, or effectively linked to (or, at least, not with any real ease). By remaining on the outskirts of what has become a flourishing information and communication platform, libraries do themselves – and their patrons – a great disservice.

The Working Group focuses the efforts of libraries and LC on entering the Semantic Web by advocating a change in the current standards libraries use to maintain and share their data. One standard in particular is MARC. I have already written about what I perceive to be the limitations of MARC in the Semantic Web. In 3.1.1, the Working Group recognizes that “Z39.2/MARC are no longer fit” standards for metadata and calls on the Library of Congress to “work with the library and other interested communities” to create a metadata carrier that will be amenable to libraries and will allow them to exchange data with other information communities. (p. 25) By moving away from the MARC “stack” and by actively collaborating with other information communities, libraries will be well placed to interact within a web environment.

This is a fairly bold statement, particularly coming from an institution such as the Library of Congress, which currently holds the responsibility for maintaining MARC21 (p. 7). While other librarians and information professionals had been discussing this for some time, coming from the Library of Congress it carries a certain amount of weight. Even if LC does not have the mandate (p. 6), and matching funding, to be the “National” library, it has undertaken this role and it certainly leads by example. In this Report, LC demonstrates its open-mindedness and practicality by looking the future square in the eye.

However, I am unconvinced that LC has made much headway in this area – a move that I recognize is easier said than done. While bibliographic utilities such as OCLC [cite] can convert library data into other, more interoperable standards, the library community as a whole is still MARC-based. On April 21st, LC released more information on its testing of RDA. This testing is still inherently MARC-based, with additional fields added to bibliographic records to indicate manifestations, while works and expressions will be created as MARC authority records. I am saddened that LC is not taking the opportunity, with the emergence of the new cataloging code, to embrace a new carrier, instead of adding more complexity to the already dated MARC standard. This also, as indicated in class discussion by Elizabeth Mitchell, directly contradicts 3.1.3 of the Report, which calls for the entire library community to “include standard identifiers for individual data elements in bibliographic records.” MARC currently favors textual strings, not the URIs from which the Semantic Web draws its power. While I acknowledge that the blow of a new standard might be mitigated by encoding it in a familiar way, this could be a setback for libraries in their effort to enter the web.
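The contrast is easy to show side by side: a MARC subject field carries the heading as a textual string, while the Semantic Web names it with a dereferenceable URI. In the sketch below the record URI is invented; the LCSH URI is the real one discussed in the next paragraph.

```
A MARC subject heading, carried as a text string:

  650  0 $a Fencing coaches

The same assertion as an RDF triple (N-Triples), linked by URI:

  <http://example.org/records/12345> <http://purl.org/dc/terms/subject> <http://id.loc.gov/authorities/sh93010603> .
```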

The Library of Congress has been much more successful in preparing and releasing its vocabularies in forms more conducive to sharing via the web. This is a very important and very impressive move, as the controlled vocabularies utilized by libraries are part of what has led to the richness and coherence of our bibliographic descriptions. The web-friendly version of LCSH is located here: http://id.loc.gov/authorities/. In the “About” section, LC acknowledges the “Linked Data” community and provides a list of other vocabularies and codes that will soon receive the same treatment. Benefits, for both users and machines, are outlined. Users can download entire vocabularies in RDF/XML or N-Triples. Here, LC follows its own suggestions and embraces the power of the URI. Thus “Fencing coaches” can be found at http://id.loc.gov/authorities/sh93010603, a location based on an alphanumeric identifier instead of the usual textual string matching. Other communities on the web can now use this concept via this link or its RDF/XML form, forge links between this particular URI and similar or related concepts, and generally enhance what is already a pretty powerful tool. This also raises the profile of libraries, not only by bringing the data out into the web but also by demonstrating that libraries are now willing and interested in sharing and playing with the rest of the information community.
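Because each heading now has a stable URI, a program can fetch its machine-readable description directly. A minimal sketch in Python, assuming (as id.loc.gov has advertised) that the service returns RDF/XML via content negotiation:

```python
import urllib.request

# The URI LC publishes for the LCSH heading "Fencing coaches".
uri = "http://id.loc.gov/authorities/sh93010603"

# Ask for the machine-readable form instead of the HTML page.
# (The exact content-negotiation behavior is an assumption here.)
req = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})
with urllib.request.urlopen(req) as resp:
    rdf_xml = resp.read().decode("utf-8")

print(rdf_xml[:300])  # the start of the concept's RDF description
```

This is the give-and-take in practice: any system on the web, library or not, can pull the concept down and link to it.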

Yet this move was not without difficulty and controversy. LC launched this particular system only after it asked LC cataloger Ed Summers to shut down his own SKOS-generated version of LCSH, formerly located at http://lcsh.info. (More information on the creation of this vocabulary can be found here: http://arxiv.org/abs/0805.2855.) Though LC’s version hit the web less than a year after Summers’ site came down, by terminating this innovative service – a service that was already in use by others in the metadata and library communities – the Library of Congress looked somewhat reactionary, if not backwards (http://lcsh.info/comments1.html). I do understand that LC might want more centralized control over bibliographic tools that it developed over years, but I am not fully convinced that this aggregated information, added to for years by librarians around the country, is solely within its domain. Legal considerations aside, LC’s action, as a commenter on Summers’ closing post indicated, seemed to fly in the face of the Working Group’s recommendation that LC consider its strengths and priorities and allow others in the community to pick up the slack and innovate for it. While LC eventually joined the Semantic Web party, so to speak, it made clear that it would do so only on its own terms. Its actions might also discourage innovators who might otherwise have taken the time to prepare other LC tools, or tools involving LC data, for the Semantic Web.

Clearly, LC is trying to continue its work in bringing libraries into the Semantic Web, and I applaud this commitment. In furtherance of this goal, I would like to see LC move towards adopting or integrating the RDA vocabularies to augment or supplement its existing vocabularies and resources. Initially, in the Report, the Working Group did call for the cessation of RDA development (3.2.5). This was due to the unsatisfactory business case, a lack of confidence in the benefits of the new code, and a sense that FRBR was perhaps too untried for straight implementation (p. 25). LC later moved towards recommending a period of testing instead (see LC’s response to the Report). In the Report, the Working Group also called for RDA/DCMI collaboration and development of a “Bibliographic Description Vocabulary” (p. 25). As seen in LC’s response to the Report, it is still committed to supporting this work and to developing other vocabularies along the same lines (LC response, p. 41). By helping incorporate the RDA vocabularies into RDA testing and implementation, LC could truly start moving towards cataloging in a web environment. This step could only be improved by solid efforts towards finding a replacement carrier for data, ideally one that will interoperate with MARC. While this will undoubtedly be easier said than done, it is necessary to the future not only of bibliographic control, but of libraries themselves.