May 22, 2013

Does the Creation of EAD Finding Aids Inhibit Archival Activities?

One of the primary outcomes of processing is establishing access to collections, traditionally assumed to be enabled by arrangement and description of materials and generation of a finding aid. The theory here is if you can find an item’s exact (Or approximate. Or possible.) location within a shelf or box or folder, you can access it.

If I haven’t explicitly stated it, I hope that some of my other posts have at least implied that this concept is a fallacy, because the discovery of an audiovisual item does not assure accessibility to its content if that object is not in a playable condition. Access may come eventually through a reformatting effort or researcher request, but that is not a certainty.

This is why I have written that, in most cases, the outcomes of processing audiovisual material need to focus on planning and preservation activities first. The archivist’s responsibilities include provision of findability, accessibility, and sustainability to collections. Processing has the capability of hitting at least parts of all of these marks, but, given that a focus of processing is generation of a finding aid, those responsibilities can at times be seen through that prism: the old “when all you have is a hammer, every problem looks like a nail” thing. The danger here is the potential for the finding aid to become the endpoint rather than a stepping stone to further collection management, especially in institutions dealing with severe backlogs and growing collections.

Of course it can’t be expected that all archival activities support all needed outcomes. However, in spite of the traditional thinking that multiple touches of the object for cataloging or processing lead to an increase in the cost of managing that asset, I feel we have to start looking at the approach of making multiple passes on the object and its content in order to help achieve description, accessibility, and persistence.

If we are looking to optimize the use of resources by focusing on the outcomes they support, we need to ask if those outcomes are being optimally realized. In other words, if we are focused on producing finding aids, are those actually doing the job we think they are?

To put it succinctly, no, not in the way they could and the way they need to be.

I’m talking specifically here about EAD-based finding aids. I will not discuss the structure of that schema here nor the structure of DACS because, though they have not been ideally suited for audiovisual collections in the past, that is not the issue here. Rather, the issue is that EAD finding aids are not doing the job of being findable because EAD is not discoverable through Internet search engines, and the records or online aids themselves are frequently segregated from the library’s catalog or main site.

Even in cases where box and folder browsing is sufficient for onsite access, this lack of integration and external searchability inhibits discovery and use by a wider community. And we’re not really talking anything complex here like linked open data or developing a brand new schema. EAD already has an LOC (and likely other) approved crosswalk to MARC and other standards. With legal exceptions, why wouldn’t all finding aid records also be in the main library catalog? And why are finding aids frequently buried within the Library website under multiple clicks to the Special Collections and Archives (which may even be divided out further if an organization has multiple archive divisions) rather than being front and center along with access to the public catalog?

I understand that there are often extenuating circumstances here, such as collections with access restrictions or sensitive content. However, I can’t help but suspect that this segregation is in part self-imposed, coming out of an outdated paranoia about limiting access to “approved” researchers or obscuring the contents of a collection to avoid legal issues. Or, alternatively, self-imposed by the enforcement of traditions and standards that have, for the most part, failed to adapt to changes in culture and technology. Changes which, to be honest, are not new. Motion pictures have been around for over 100 years. We’ll see how well the new DACS addresses audiovisual content, but, even if the issue is slightly better resolved, why has it taken this long?

There is nothing inherently wrong with the concept of finding aids, but there is something wrong with funneling resources into an inefficient system. This was the lesson of MPLP, though Meissner and Greene’s approach was to attack the inefficiency of the methodology. Having reviewed that, we also need to review the outcome of that method similarly and ask if the outcomes are meeting the needs of those we serve — our users, our institutions, and our collections — and if the outcomes support sufficient return on the investment of resources.

We have to admit that we often have a limited opportunity to test that ROI because, in part, the lack of access perpetuated by inadequate means of findability creates a self-fulfilling prophecy of lack within archives. At administrative levels, limited resources are allocated to collections because limited access suggests such resources are not necessary. At the ground level, that lack of resources makes caretakers wary of promoting greater access because providing that access is difficult for under-staffed departments to support, and so they do not (or cannot, as is often the case with audiovisual materials). Back at the administrative levels, usage or request reports show that rates are not significantly increasing, which is taken as proof that resource needs remain minimal because requests for and fulfillment of access to records are minimal. And the cycle continues. No one is served. Not the administrator, nor the archival staff, nor those whom they aim to serve.

Frankly speaking, the search portals in libraries and online are probably better and certainly more user friendly than EAD finding aids. I’m not saying we need to jettison finding aids entirely because they contain much information that supports context, provenance, collection management, and other archive-specific or internal data. Rather, what we need to focus on more is how we can shape the creation of finding aids so that the data can be mapped to other portals users are more likely to access in a manner that is simple to achieve, does not lose meaning in the migration, and supports the needs expressed in those other systems. Likely this means that EAD is just one of many forms the data takes, not the ultimate container.
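To make the mapping idea a little more concrete, here is a minimal sketch of pulling collection-level fields out of an EAD record into a flat structure that a catalog or search portal could ingest. Everything specific in it is an assumption for illustration: the sample record is invented, it uses un-namespaced EAD 2002 style element names, and the target field names (title, date, identifier, extent, description) are hypothetical stand-ins rather than the official crosswalk to MARC or any other standard.

```python
import xml.etree.ElementTree as ET

# Invented sample record; real finding aids are far richer and may be namespaced.
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Home Movies</unittitle>
      <unitdate>1950-1975</unitdate>
      <unitid>MS-0001</unitid>
      <physdesc><extent>12 film reels</extent></physdesc>
      <abstract>Hypothetical collection used only for illustration.</abstract>
    </did>
  </archdesc>
</ead>"""

# Hypothetical target fields mapped to EAD element paths (relative to <ead>).
# This is an illustrative mapping, not an approved crosswalk.
FIELD_MAP = {
    "title": "archdesc/did/unittitle",
    "date": "archdesc/did/unitdate",
    "identifier": "archdesc/did/unitid",
    "extent": "archdesc/did/physdesc/extent",
    "description": "archdesc/did/abstract",
}


def ead_to_flat_record(ead_xml: str) -> dict:
    """Extract collection-level fields from an EAD string into a flat dict."""
    root = ET.fromstring(ead_xml)
    record = {}
    for field, path in FIELD_MAP.items():
        node = root.find(path)
        if node is None:
            continue  # Skip fields the finding aid does not supply.
        # Join nested text and collapse whitespace so mixed content survives.
        text = " ".join("".join(node.itertext()).split())
        if text:
            record[field] = text
    return record


record = ead_to_flat_record(SAMPLE_EAD)
```

The point of the sketch is only that the finding aid stays the fuller source record while a simpler derivative travels to wherever users are actually searching; which fields travel, and under what names, would depend on the receiving system.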

There are of course archival institutions working beyond EAD in innovative areas, but overall we have to accept that findability and access are no longer centralized, and realize that the structure of EAD finding aids and the manners in which they are distributed ignore or actively work against that fact. Only by integrating with other systems and technologies can archives maintain necessary authority over their records, more fully supporting archival services and responsibilities in a changing landscape.

Joshua Ranger

  1. Adam Wead says:

    Just yesterday, we discussed something very similar.

    The particular problem we face is that we have large unprocessed collections of hybrid materials that will ultimately need digital access. So do we process traditionally, so to speak, and create a finding aid, then digitize? That could be a very long wait before materials are accessible.

    Why not the reverse? Digitize with preservation in mind, and use the results of that process to drive the creation of the finding aid. Collections could be sorted at a very high level, something like an MPLP approach, and then funneled through an ingestion process that creates digital access sooner, along with technical and other auto-generated metadata that can be used to further process the collection.

    The benefit is you have access sooner and can use the additional information about the content to create a better finding aid.

    • Josh says:

      Thanks for the comment, Adam. I would definitely agree with you here. We have transitioned to recommending that reverse approach over the last several years because of the need to reformat magnetic media collections in the near term and the delays that deeper descriptive cataloging (or processing like rehousing) can incur (not to mention the frequent inability to play back many audiovisual formats). If you’re digitizing, there are points such as QC where you can review for descriptive information (or find out if you have duplicate content), and being able to access the content away from a dedicated deck makes it easier to distribute the work (and, as you say, the technical data at ingest will give you a much better, more controlled set of information like duration, aspect ratio, etc.). It’s in the same vein that you might decide to bake all of your 1/2″ open reel video as part of a regular workflow rather than taking the time to try and discern which reels might possibly have SBS. I wonder also about the amber-ization of the finding aid: would it be updated after the fact when all that new information is found, or would we get too bogged down with creating new finding aids for other collections?

  2. Megan McShea says:

    People keep sending this link to me from your recent newsletter blast, so looks like I’d better respond! My response has gotten to be almost as long as your original post, so Kate Theimer has generously agreed to post my response on her blog, ArchivesNext:

    • Josh says:

      So we’re having some trouble with WordPress and the above comment is spam that got through to the comments and I can’t delete it.

      No, seriously, thanks for sharing the link, and thanks for the response — it’s a great post and I really appreciate the start of a conversation here. Or on whatever riverside cliff or open field we choose to duel in.

    • Josh says:

      Because my mom and at least one other person kept emailing me about whether I would respond (though that person’s name did seem to be Gucci Handbags Cheap): “Data is a Simple Machine”, and blog post response to Megan’s response

  3. Tessa says:

    Could you say more about how EAD is not discoverable by internet search engines? I was under the opposite impression because of its XML-ness.

  4. Chris says:

    Hi Tessa, and thanks for the comment. I’ll field this one because it was my incessant harping on this detail that was the impetus for that aspect of Josh’s post. Granted, the way we’ve stated it is overly simplistic, but the problem of EAD findability via search engines is a longstanding one. Yes, it has gotten better, but there are still many institutions that have finding aids and finding aid portals which are not findable using Google, Bing, etc. Findability is of course based on a number of factors outside of the data itself, including the infrastructure, architecture, configuration of systems, and more. These aspects affect whether a search engine crawls the data at all, which portions of the data it crawls, and how well the pages are optimized for search.

    The causes of EAD findability issues are varied and range from technical to cultural but I have consistently seen first-hand, and heard of from frustrated archivists within organizations large and small, the woes of EAD findability using search engines. I have seen significant improvements – again for a variety of reasons – over the past few years but as part of a community highly concerned with access we can’t rest on our laurels. It’s critically important to continue pushing for greater findability and access to information utilizing the most widely and commonly used mechanisms and tools.

