Getting the SHA-1 (or MD5) hash of a directory

By Stephen Akiki

By definition, a cryptographic hash is "a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an accidental or intentional change to the data will change the hash value".
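
As a small illustration of that definition, the sketch below uses Python's standard hashlib module to hash two byte strings that differ by a single character. The sample strings and variable names are made up for this example.

```python
import hashlib

# Two hypothetical inputs that differ only in one character
original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy cog"

digest_a = hashlib.sha1(original).hexdigest()
digest_b = hashlib.sha1(modified).hexdigest()

# Deterministic: hashing the same data again gives the same value
assert digest_a == hashlib.sha1(original).hexdigest()
# Fixed size: a SHA-1 digest is 160 bits, i.e. 40 hex characters
assert len(digest_a) == len(digest_b) == 40
# A one-character change produces a different digest
assert digest_a != digest_b

print(digest_a)
print(digest_b)
```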

Usually these hashes are used to "fingerprint" individual files, but to do the same for a directory you have to do something like the code below. It lets you "fingerprint" a directory and tell whether the contents of the files in it have changed. For those interested in cryptography, Applied Cryptography by Bruce Schneier is a great starting point.

# Copyright (c) 2009 Stephen Akiki
# MIT License (Means you can do whatever you want with this)
#  See
# Error Codes:
#   -1 -> Directory does not exist
#   -2 -> General error (see stack traceback)

def GetHashofDirs(directory, verbose=0):
  import hashlib, os
  SHAhash = hashlib.sha1()
  if not os.path.exists(directory):
    return -1

  try:
    for root, dirs, files in os.walk(directory):
      for names in files:
        if verbose == 1:
          print('Hashing %s' % names)
        filepath = os.path.join(root, names)
        try:
          f1 = open(filepath, 'rb')
        except (IOError, OSError):
          # You can't open the file for some reason, so skip it
          continue

        try:
          while 1:
            # Read file in as little chunks (the 4096-byte size is an assumed value)
            buf = f1.read(4096)
            if not buf: break
            # Feed each chunk into the running directory hash
            SHAhash.update(buf)
        finally:
          f1.close()

  except Exception:
    import traceback
    # Print the stack traceback
    traceback.print_exc()
    return -2

  return SHAhash.hexdigest()

print(GetHashofDirs('My Documents', 1))
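
One caveat worth sketching: GetHashofDirs visits files in whatever order os.walk yields them and feeds only file contents into the hash, so the result can depend on filesystem ordering and will not notice a rename. The variant below is one possible way to address both, assuming Python 3. The name hash_directory, the chunk_size parameter, and the choice to mix in each relative path followed by a per-file digest are assumptions of this sketch, not part of the original code. Empty directories are still not covered.

```python
import hashlib
import os

def hash_directory(directory, chunk_size=4096):
    """Return a SHA-1 hex digest over relative paths and file contents.

    hash_directory and chunk_size are hypothetical names for this sketch.
    """
    if not os.path.isdir(directory):
        raise ValueError("not a directory: %r" % (directory,))

    sha = hashlib.sha1()
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # sorting in place fixes the order in which os.walk descends
        for name in sorted(files):
            filepath = os.path.join(root, name)
            relpath = os.path.relpath(filepath, directory)

            # Hash the file contents chunk by chunk to keep memory use small
            file_sha = hashlib.sha1()
            with open(filepath, "rb") as f:
                for chunk in iter(lambda: f.read(chunk_size), b""):
                    file_sha.update(chunk)

            # Mix in the relative path (NUL-terminated) so renames and moves
            # change the result, then the fixed-size digest of the contents
            sha.update(os.fsencode(relpath.replace(os.sep, "/")) + b"\0")
            sha.update(file_sha.digest())
    return sha.hexdigest()
```

Calling hash_directory on the same directory before and after some operation and comparing the two strings shows whether any file was added, removed, renamed, or modified in between. Unlike GetHashofDirs, this sketch raises an exception instead of returning an error code.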