Why We Shouldn't Save Everything

February 6, 2013

Several years ago I created an equation that calculated the advancement of a society in relation to its acquisition of archived knowledge. A positive result meant the culture was advancing, a negative that it was declining.


Where lb=Weight of Cultural Burden, n=Cultural Impact, q=Quantity, & d=Amount of Existing Detritus

I’ll take a slight pause for the audience laughter to die down. But for you non-maths-nerds out there, the gimmick was that e^(πi) = -1, so the outcome of the equation would always be 0 (stasis) or a negative number (decline), primarily because the production of detritus in the culture always greatly outweighs the production of standout cultural works.

Yes, I have always been this much of a punk ass fool.

Regardless, I do believe that, though my expression may be flawed (even if it made me giggle), it holds a grain of truth. Most things are not great. Most things are not even good. This does not mean that the non-great do not deserve to be saved — each collection has its own mission and reasoning — but it does imply that, of the masses and masses of content we create, there is a steep curve measuring usefulness. This usefulness may be predicated on quality, content, source, methodology, or some other factor that makes the content valuable, representative, tangential, or research-worthy. The degree, scope, and reach of these factors vary, as does the amount of importance we place on each factor personally, locally, institutionally, and beyond.

In other words, one man’s trash….

This is really nothing new (as the use of a hoary adage suggests). Prioritization and deaccessioning are part and parcel of the archival practice. More highly valued materials receive more attention and more resources. Lower value materials receive less attention and, in some cases, are/should be discarded.

Not everything can be saved. Nor should it. Not just when judged at a valuation level, but at a level of content and institutional use.

However, because we are aware of mass losses of records of the past, we are hyper-aware (and in some cases hyper-vigilant) about losing anything in the present, about letting one film frame or slip of paper evade our acid-free grasps. This extremism is wrong, because it causes fear and paralysis, or, alternately, overreaction. We can become so afraid of making a mistake in our preservation choices that we freeze up, or we can’t see beyond the cost and resources needed for an entire collection to see where we can take smaller bites, or we waste resources on inessential materials and activities to the detriment of other ones. No collections benefit as much as they could in these scenarios.

One issue I see here is the undue influence of the concept of author and manuscript, especially when it comes to audiovisual archives. In an author’s or individual’s archive there is an aura projected onto the materials — everything they touched or every revision made matters.

In an archive of easily reproducible materials there is bound to be duplication, low-quality viewing copies, transfers that have no meaning beyond their role in moving from platform to platform within a production process, and content that is of poor quality but was never discarded by the creator.

(And to be honest, I personally feel that the totemic aura of the object is grossly misplaced in most cases, whether the item in question is reproducible or not.)

The question we need to look at is what role the asset plays/played in the day-to-day activity of the collection. In the case of an individual author, versions or revisions may be of value in tracking the creative process. In a production environment, versioning or dubs or rough cuts may (and do) lack such value. The review copies and rough edits that cycle around a production environment are of practical, of-the-moment value but do not necessarily mean anything to the work process — the same way multiple copies of a manuscript distributed to colleagues do not reflect unique content (unless annotated).

But really, even if that is the case, we have to ask ourselves how much is enough? We don’t know what a researcher 100 years from now will be interested in, but does that matter? What is the value of that research point versus the cost of storage and preservation? The archivist’s job, in part, is to support researchers, but also to care for the collection. Does that footnote in a maybe future dissertation warrant an investment that subtracts from the ability to provide broader care?

It may seem like it does because long-term, day-to-day collection management has no wow factor and no direct feedback, and its value can be difficult to communicate to administrators. The researcher finding one thing, exclaiming Eureka, and expressing gratitude provides that warm, energizing feeling that one’s work has been done. It is a silver dollar found in the middle of a reseeded forest. Something to spend now instead of realizing an expansive value later.

What this issue often comes down to is making a decision for which we cannot see the long-term outcome. In the short term we can assess the potential benefits, but the risk that we were wrong often prompts inaction. Realistically, though, in the long term our inaction is a much greater burden than any action we take. At some point in the future the burden of too many assets, or undocumented assets, or half-cared-for assets, or un-reformatted assets will have to be dealt with. And at that point the costs will be greater and the options will be fewer. That is not what we should be saving for the future.

Joshua Ranger

Join the discussion 9 Comments

  • kevin says:

    Interesting post – again! I guess my question to you is: is limited resources the first argument against saving everything? I think limited resources is part of the equation, but I don’t think it’s the whole thing – and in my mind, it’s not the most important thing. Even with unlimited resources, the idea of “saving everything” is ultimately ridiculous and becomes a Borges Library of Babel. A seemingly infinite collection of pure gibberish, with a few hidden gems, which are all the more hidden by all the gibberish. So. I would say we do not save everything first, because we are not infinite beings who can comprehend “everything”, and then also because the whole cost thing. (I will grant you that “we don’t save everything because we are not infinite beings” is perhaps not as palatable an answer as “we don’t save everything because it costs too much.”)

    • Josh says:

      Well jeez, Kevin, I thought I had finally got it right! No, really, as before, I totally agree with you on a philosophical level. The idea of saving everything is ludicrous because it is overwhelming, obscuring, and ultimately impossible if taken at face value. Also, as an Early Americanist by initial training, I have my Hawthornian reservations about the burdens of the past (okay, he’s a little late period for me, but whatever). Obviously I couldn’t call the post “One Reason Why We Shouldn’t Archive Everything” because that’s not going to draw people in, so I did take a more resource-centric angle — which I see not just as cost but all energy and effort (time, people, knowledge, equipment, etc.). I tend to that focus because though the philosophical bent makes sense, as you say, it’s not very palatable — especially not to administrators, researchers, funders, etc. In other words, the people you have to convince that you’re doing the right thing and they should support you or help get more resources your way.

  • Jonathan says:

    You make a solid argument for a more practical, measured approach to archiving focussed on saving the “best” material while not letting the perfect become the enemy of the good or quantity compromise quality.

    If only there were a way to remove subjective human judgement and let an algorithm take over. Because sometimes the greatest enemy is our cultural and temporal bias. We often cannot predict what future generations will find valuable (just as my mother tossed my Mad magazines and Magic cards into the garbage) so isn’t it sometimes best to take a naive approach? Isn’t there a certain wisdom to hoarding? (I’m playing devil’s advocate here).

    For digital archiving, fortunately, disk space becomes cheaper at an exponential rate. Yesterday’s gigabyte is today’s terabyte, and tomorrow’s petabyte. While storage may never catch up with global output, and of course we can’t save everything, there’s an argument for the approach of indiscriminate “dump” archiving popularized by the Internet Archive and Archive Team, which especially makes sense for born-digital content. The fact that web crawling bots take the snapshots and make the decisions of which websites to wrangle into the Wayback Machine, while letting humans choose to more thoroughly archive sites of special importance, lends itself to a T-shaped approach that’s both broad and deep.

    Making heavy-handed relevance decisions by presuming to know what future generations will value, and curating what makes it into an archive, runs the risk of radically limiting the scope of materials saved. Why not prioritize the stuff that we deem important, while also striving for indiscriminate diversity to save what we might not realize is important and otherwise might have thrown out?

    • Josh says:

      You’re right, Jonathan, if only there were a way to remove humans from the equation we wouldn’t keep messing stuff up! Seriously, though, I would counter that algorithms are created (and continually tweaked) by humans and face the same constant assessment/re-assessment that our other non-automated decisions do.
      I’ll take your devilish challenge and say maybe that fear of future generations is not our problem in a way. We make the decisions we do to our best ability, and the next in line makes decisions based on those that came before. It’s kind of false to assume that culture continually advances and that each new generation makes better choices or is purer than the last — just as it is false to assume we are in a constant state of decline/backsliding. Those themselves are cultural and temporal biases. And if we still had all the things our mothers threw out, those things wouldn’t be very valuable (and likely we would have tossed them ourselves after the 10th time we had to move them to a new house).
      I agree that the storage and collecting possibilities of on-line file-based materials change the playing field a bit (I still see too many assets buried on hard drives and physical media to include all digital assets in this), but I still see problems. I think what Brewster and the Archive Team have done is great, but there’s still the issue of such efforts being driven by one or a few visionaries. What happens when that singular drive is gone? Or what happens when the next technological shift makes that work obsolete and we’re facing another overwhelming recovery/migration issue like with the piles of U-matics, et al. that are out there? I think there is certainly room for less discriminate processes, but we also have to be cognizant that those processes are not an endpoint and will become a burden of sorts to the future.

  • Jesse says:

    Great post, there are so many variables that must be taken into account when deciding on the quantity of material to be kept. With the whole data-mining buzz in the digital humanities more seems better. But especially for AV archives, I have serious doubts whether big data will really become the main research method (see my blogpost). Can you comment on the reason why you feel the totemic aura is often overrated? Thanks!

    • Josh says:

      Thanks, Jesse. I enjoyed your post as well. I think you’re spot on — there are many challenges to data-mining and AV materials. We’re more likely to see it work with audio before moving image, but still, it’s going to require a lot of innovation or creation of supplemental data to really work. We’ve been doing some research with some companies that have search engines which search across multiple data sets as a way of figuring out options, but it’s hard to see data-mining working with pure image in the near future. In response to the totemic aura, I’ve written about it some before in “Is There a Right Time to Let Go of Original Materials” and “Is Hoarding an Archival Activity”. Ultimately I feel there are two major issues. First, the more we cling to the object, the harder it becomes for us to do our work as archivists, to make rational decisions about collection management or even, in some cases, remember that we personally do not have ownership of the materials and should care for them and provide access to them in a manner that reflects that fact. Second, on a more personal level, I do not believe that we should create totems and rituals around particular humans and their creations/activities. It skews our ability to evaluate (in content and monetary terms — see Antiques Roadshow) and can be dangerous by establishing an inequality amongst humans.

  • Kevin says:

    On second thought, perhaps my reduction to the absurdly literal reading of “saving everything” is not particularly helpful as saving everything isn’t really the question. The question is saving specific things, and yes, the value proposition of judging an object/piece of information’s worth versus the value of resources used to preserve it is certainly worthwhile. And even if available resources increased, there are some scenarios where the potential reward for preserving is so low that the cost is just not justifiable.
    Sorry for littering your comment sections

  • Megan says:

    Great topic, great discussion! I’m going to avoid the hard part of the question (is there an optimal general approach to defining research value or historical value for archives?), and make a humble suggestion. I think it would be a tremendous service to archives everywhere if the AV specialists could come up with some guidelines for sorting out complex collections of media production material. Even a first-pass triage approach to determining what the core elements or versions are, what’s unique, and what is of use and no use to the archives or its users, could save archives a lot of space and headaches. It’s a simple piece of this impossibly complex puzzle, but it’s do-able and could go a long way in helping non-AV specialists make appraisal decisions for AV archival collections.

    • Josh says:

      Thanks Megan — Thanks for making suggestions that will make my career redundant! Seriously though, it is a good idea for a resource, though I can see the hornet’s nest of disagreement over such a set of recommendations…But that’s par for the course. I can actually envision it as like the mirror reverse of NARA’s Digitization Services Branch’s guidelines for selecting the right target format for your needs: a straightforward outline of each element type, what its uses were under production, what you can do with it in reformatting and what you would possibly lose if you got rid of it. Again though, beyond the first-pass level, it comes down to a decision of what your risk appetite is. In a recent project an organization was on the fence about what level of effort to expend on some DVDs that had been burned from their video collection. Of course the original/master tape is where you want to focus and what you want to use for transfers, but there’s also the slight gnawing feeling over the sunk cost fallacy (Well we spent a good chunk of money doing that project…) as well as the What If… fear of maybe that DVD was the last time the tape was playable.
