Fun with Python: File Hashing


Introduction

In this installment of Fun with Python, I wanted to test the built-in Python library hashlib, so I wrote a simple script that generates a hash of a single file.

As always, this code is provided for educational purposes only and I make no guarantees or warranties. Use at your own risk. This code is released under the GNU GPL.

My instructions are for Windows and I assume some basic level of computer proficiency.


Preparation

The script can be run on any file, but it is best to stay safe during testing.

Create a new folder to hold the script and the test file.

I tested this script by moving the file to different locations and verifying that the hash didn’t change, but I won’t include that here.

Copy the following text into Notepad and save it as “file.txt”.

 

'Twas brillibright, and slorpy spores Did glimmergloom beneath the yews; All wimplewents, with gleeful roars, Frolicked 'mid the glooby ooze.

Beware the Snorblefang, my kin, The blorp-eyed fiend with wibble claws! It lurks where gabbled groobs begin, And whispers dark in slither laws.

Go forth! Your zaggly sword in hand, Through narble woods and twinkling gloam; Then dance upon the Starnax strand, For hero's light shall guide thee home.

 

The Code

 

import os
import hashlib

app_running = True
dir_input = False

# Clear the console ('cls' on Windows, 'clear' elsewhere).
def clear():
	os.system('cls' if os.name == 'nt' else 'clear')

# Show the menu and pass the chosen hash type to hash_a_file().
def menu(dir):
	clear()
	ui = ""
	print("*********** file_hasher ***********")
	print(" (1) Hash a file - sha256 ")
	print(" (2) Hash a file - md5 ")
	print(" (3) Quit ")
	print("****************************************")
	while True:
		ui = input("_: ")
		match ui:
			case "1":
				hash_a_file("sha256", dir)
			case "2":
				hash_a_file("md5", dir)
			case "3":
				quit()
			case _:
				print("Error: command not recognized.")

# Hash one file in the chosen directory and print the result.
def hash_a_file(h_type, dir):
	ui = input("Which file: ")
	file_path = dir + "\\" + ui
	file_to_hash = open(file_path, "rb")
	# The case strings must match what menu() passes: "sha256" or "md5".
	match h_type:
		case "sha256":
			file_hash = hashlib.sha256()
		case "md5":
			file_hash = hashlib.md5()
	# Read the file 8192 bytes at a time and feed each chunk to the hash.
	while chunk := file_to_hash.read(8192):
		file_hash.update(chunk)
	match h_type:
		case "sha256":
			print("SHA-256 Hash: ", file_hash.hexdigest())
		case "md5":
			print("MD5 Hash: ", file_hash.hexdigest())
	while True:
		ui = input("(R)eturn to menu? Or, (Q)uit?")
		match ui:
			case "R" | "r":
				menu(dir)
			case "Q" | "q":
				quit()

# Ask for the directory to search, then open the menu.
def get_file_path():
	clear()
	while True:
		ui = input("Search which directory? ")
		if ui:
			menu(ui)
		else:
			print("Invalid.")

if __name__ == '__main__':
	get_file_path()


The Script in Action

This script contains a while loop to keep it running, so it can be run directly, or from the command line by navigating to its location and typing the command:

python file_hasher.py

The script will first prompt for the location of the file to be hashed. Input the folder path and press Enter.

The menu will appear. Choose 1 to generate a SHA-256 hash or 2 to generate an MD5 hash.

It will then prompt for the name of the file. Input the name and press Enter.

Finally, it will generate the hash based on the chosen type.


The Code Analyzed

If you have followed along with my scripts, you should recognize my menus. They tend to take up a good bulk of my scripts, but I like to make the scripts interactive and able to stay running with while loops.

The heart of this script lies in the function hash_a_file():

def hash_a_file(h_type, dir):
	ui = input("Which file:  ")
	match h_type:
		case "sha256":
			file_hash = hashlib.sha256()
		case "md5":
			file_hash = hashlib.md5()
	file_path = dir + "\\" + ui
	file_to_hash = open(file_path, "rb")
	while chunk := file_to_hash.read(8192):
		file_hash.update(chunk)
	match h_type:
		case "sha256":
			print("SHA-256 Hash:  ", file_hash.hexdigest())
		case "md5":
			print("MD5 Hash:  ", file_hash.hexdigest())
	while True:
		ui = input("(R)eturn to menu?  Or, (Q)uit?")
		match ui:
			case "R" | "r":
				menu(dir)
			case "Q" | "q":
				quit()

The function takes two arguments. The first, h_type, is the type of hash to generate, which the menu passes based on the user’s selection. The second, dir, is the directory where the file is located, which was passed to the menu when the script first ran.

The function asks the user to provide the name of the file, and then combines that input with dir to identify the full file path. It is stored in the variable file_path.
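As a possible alternative to joining the pieces with a literal backslash, the standard library's os.path.join picks the separator for the current operating system. This is a minimal sketch; the helper name build_path is my own invention and is not part of the script:

```python
import os

def build_path(directory, file_name):
    # Hypothetical helper: os.path.join inserts the separator used by
    # the current OS instead of a hard-coded "\\".
    return os.path.join(directory, file_name)

print(build_path("test_folder", "file.txt"))
```

On Windows this should produce the same backslash-separated path the script builds by hand, while also working on other systems.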

It then opens the file in read-only binary mode by passing “rb”. Binary mode is necessary because the hash objects operate on bytes rather than text. The open file object is stored in the variable file_to_hash:

file_to_hash = open(file_path, "rb")
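A with block is a common alternative to calling open() on its own, because it closes the file automatically when the block ends. The sketch below is a suggestion, not the script's code: the function name read_first_bytes is hypothetical, and the example writes its own temporary file so it can run anywhere.

```python
import os
import tempfile

def read_first_bytes(path, count):
    # "rb" opens the file read-only in binary mode, so read() returns bytes.
    # The with block closes the file on exit, even if read() raises.
    with open(path, "rb") as handle:
        return handle.read(count)

# Create a throwaway test file, then read it back as bytes.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "file.txt")
    with open(sample, "wb") as out:
        out.write(b"'Twas brillibright")
    data = read_first_bytes(sample, 5)
    print(type(data), data)
```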

In the next step it reads the file in small “chunks” using the file object's .read() method, which keeps memory use low when hashing large files. It is my understanding that 8192 bytes (8 KB) is a common choice for the chunk size, so that is what is passed to .read(). First, though, it decides which hash type to use based on h_type.

	match h_type:
		case "sha256":
			file_hash = hashlib.sha256()
		case "md5":
			file_hash = hashlib.md5()
	while chunk := file_to_hash.read(8192):
		file_hash.update(chunk)
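To illustrate why chunking is safe, the sketch below feeds the same bytes to a hash object in 8192-byte pieces and in a single call, then checks that the two digests agree. It uses io.BytesIO as a stand-in for an open file and hashlib.new(), which selects the algorithm by name. The helper hash_stream is hypothetical and not a function from the script.

```python
import hashlib
import io

def hash_stream(stream, h_type, chunk_size=8192):
    # hashlib.new() builds a hash object from an algorithm name.
    file_hash = hashlib.new(h_type)
    # Feed the data to the hash one chunk at a time.
    while chunk := stream.read(chunk_size):
        file_hash.update(chunk)
    return file_hash.hexdigest()

data = b"slorpy spores " * 5000   # 70,000 bytes, so several chunks
chunked = hash_stream(io.BytesIO(data), "sha256")
one_shot = hashlib.sha256(data).hexdigest()
print(chunked == one_shot)
```

Calling update() repeatedly is equivalent to hashing the concatenated data in one go, so the chunk size affects memory use but not the result.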

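Note the use of :=, the assignment expression (sometimes called the “walrus operator”) added in Python 3.8. It assigns a value to a variable and returns that value in the same expression, so this while loop can read a chunk and test whether it is empty in one step. The match statement used throughout the script requires Python 3.10 or later.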
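As a small illustration of what := saves, the two loops below do the same work. The first assigns and tests in one expression, and the second is the longer form it replaces. The data and variable names are made up for the example.

```python
import io

# With the assignment expression: read and test in one step.
stream = io.BytesIO(b"abcdefghij")
walrus_chunks = []
while chunk := stream.read(4):
    walrus_chunks.append(chunk)

# Without it: read, test, and break explicitly.
stream = io.BytesIO(b"abcdefghij")
classic_chunks = []
while True:
    chunk = stream.read(4)
    if not chunk:
        break
    classic_chunks.append(chunk)

print(walrus_chunks)   # [b'abcd', b'efgh', b'ij']
```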

Next, it prints the hash by calling the hash object's hexdigest() method, labeling the output based on h_type.

	match h_type:
		case "sha256":
			print("SHA-256 Hash:  ", file_hash.hexdigest())
		case "md5":
			print("MD5 Hash:  ", file_hash.hexdigest())
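For reference, hexdigest() returns the hash as a string of hexadecimal characters, while the related digest() method returns the same value as raw bytes. A quick check, using a made-up input rather than a file:

```python
import hashlib

file_hash = hashlib.sha256(b"glooby ooze")
raw = file_hash.digest()        # 32 raw bytes for SHA-256
text = file_hash.hexdigest()    # the same value as 64 hex characters
print(len(raw), len(text))      # 32 64
print(text == raw.hex())        # True
```

The hexadecimal form is the one normally published alongside downloads, which makes it the convenient choice for copying and comparing.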

Finally, it uses a while loop to get input, which gives the user a chance to copy the hash. The options are simply to quit or return to the menu.


Conclusion

The hashlib library is a very useful tool for the defender on a budget. I already have a few thoughts on how this can be extended or combined with previous scripts. For example, it can easily be modified to search multiple directories and hash multiple files, then compare the results to a list of indicators of compromise (IoCs). I’ll keep tweaking these scripts as I think of new ideas and learn new concepts.
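The idea of hashing several files and comparing them to a list of IoCs could be sketched roughly as follows. Everything here is illustrative: the function names, the temporary files, and the “known bad” set, which is built from one of the sample files rather than from any real threat feed.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=8192):
    # Hash one file in chunks, as in hash_a_file() above.
    file_hash = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            file_hash.update(chunk)
    return file_hash.hexdigest()

def find_matches(directory, known_hashes):
    # Walk the directory tree and collect files whose hash is in the set.
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            if sha256_of_file(path) in known_hashes:
                matches.append(path)
    return matches

# Demo with two throwaway files; one is deliberately listed as "bad".
with tempfile.TemporaryDirectory() as folder:
    for name, body in [("good.txt", b"wimplewents"), ("bad.txt", b"Snorblefang")]:
        with open(os.path.join(folder, name), "wb") as out:
            out.write(body)
    known_bad = {hashlib.sha256(b"Snorblefang").hexdigest()}
    hits = [os.path.basename(p) for p in find_matches(folder, known_bad)]
    print(hits)   # ['bad.txt']
```

Storing the known hashes in a set keeps each lookup fast even when the list is long.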


Side note, the silly test text was generated by Copi and inspired by Lewis Carroll’s Jabberwocky.

Have fun, and thanks for reading.


Daily Cuppa

Today’s cup of tea is Mandarin Mint Mindfulness provided by Yogi.


If you enjoyed creating file hashes, or the site in general, you can buy the author a cup of tea and show your appreciation. The author is also available for work.
