What’s the Real Impact of SHA-256?

Originally published 2014.08.05. Kam Woods at the University of Maryland identified an error in our original script. The tests are being rerun this weekend, and updated numbers and a revised report will be republished next week. We appreciate the feedback and support engendered by an open data approach to research.

A variety of algorithms can be used for generating checksums, with two in particular, MD5 and SHA-256, being the most common. The comparative benefits and drawbacks of both are well understood: while MD5 is weaker against random and deliberate collisions, it is faster to generate than SHA-256. However, there are no published empirical estimates of the difference in time-to-generate between MD5 and SHA-256 in archival and repository environments, which makes it difficult to decide on an informed basis which algorithm to implement for preservation monitoring. This white paper documents a comparative checksum test of the same files under the same conditions, leading to some surprising findings about the actual processing speed of SHA-256.
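To make the idea of a like-for-like comparison concrete, here is a minimal Python sketch of how one might time both algorithms against the same file. It is not the script used for this white paper; the function names (`timed_checksum`, `compare_algorithms`) and the 1 MiB chunk size are assumptions chosen for illustration.

```python
import hashlib
import time


def timed_checksum(path, algorithm, chunk_size=1024 * 1024):
    """Return (hex digest, elapsed seconds) for one file and one algorithm.

    Hypothetical helper: the chunk size is an arbitrary illustrative choice.
    """
    digest = hashlib.new(algorithm)
    start = time.perf_counter()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    elapsed = time.perf_counter() - start
    return digest.hexdigest(), elapsed


def compare_algorithms(path, algorithms=("md5", "sha256")):
    """Checksum the same file once per algorithm and collect the results."""
    return {name: timed_checksum(path, name) for name in algorithms}
```

Note that the elapsed time includes reading the file, so whichever algorithm runs second may benefit from the operating system's file cache. A fair comparison would need to control for this, for example by alternating the order across repeated runs.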