How to do this Python project
Overview: In this project, you will be creating code to compress and decompress image files.
Goal: To gain experience with compression algorithms.
Requirements: You are going to implement a compression algorithm in Python 3.x. You have been
given a set of example images and your goal is to compress them as much as possible without losing any
perceptible information – upon decompression they should appear identical to the original images.
Images are essentially stored as a series of points of color, where each point is represented as a
combination of red, green and blue (rgb). Each component of the rgb value ranges between 0-255, so
for example: (100, 0, 200) would represent a shade of purple. Using a fixed length encoding, each
component of the rgb value requires 8 bits to encode (2^8 = 256) meaning that the entire rgb value
requires 24 bits to encode. You could use a compression algorithm like Huffman encoding to reduce the
number of bits needed for more common values and thereby reduce the total number of bits needed to
encode your image.
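As an illustration of the Huffman idea described above, here is a minimal sketch that builds a code table from value frequencies. It uses only the standard library (heapq and collections.Counter); the function name build_huffman_codes and the sample data are hypothetical, and the assignment does not prescribe this particular structure. It treats each color component as one symbol, which is only one possible choice.

```python
import heapq
from collections import Counter


def build_huffman_codes(values):
    """Return a dict mapping each distinct value to a bit string."""
    freq = Counter(values)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct value still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie breaker, {value: partial code}).
    heap = [(count, i, {value: ""}) for i, (value, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in left.items()}
        merged.update({v: "1" + c for v, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical sample: seven 8-bit component values.
sample = [255, 255, 255, 255, 0, 0, 128]
codes = build_huffman_codes(sample)
encoded = "".join(codes[v] for v in sample)
print(codes, len(encoded), "bits vs", 8 * len(sample), "fixed-length bits")
```

In this sample the most common value (255) receives a one-bit code, so the seven values fit in 10 bits instead of 56.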
You can use the Pillow library of Python to read in image files and extract the rgb values of individual
points. To install the library in PyCharm, go to Project Settings -> Project Interpreter. To the right of the
list of packages will be a small + sign. Click on it to open up the window of available packages and search
for Pillow – since there are several similarly named libraries, you want the one written by Alex Clark and
its full name should be Python Imaging Library (Fork). Then click on Install Package. (It’s possible you
may get an error if you have an older version of pip installed – it would be listed on the prior page of
installed packages. If the error occurs, first install the latest version of pip, then install Pillow.)
Once you have Pillow installed, you can include the library into your project code with:
from PIL import Image
You can then open image files using, for example:
img = Image.open("powercat.bmp")
If you wanted to display the image in your standard image viewer, you could use:
img.show()
If you want to access the pixel values of points in an image, you’ll first need to call load() to create a
pixel access object and then you can access individual pixels by indexing it with an (x, y) pair:
width, height = img.size
px = img.load()
for x in range(width):
    for y in range(height):
        print(px[x,y])
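Instead of printing each pixel, a compressor usually needs to know how often each value occurs. The sketch below counts pixel values through the same px[x, y] indexing shown above. The function name count_pixels is hypothetical, and a plain dict keyed by (x, y) stands in for the pixel access object so the example runs without an image file. It assumes each pixel is an (r, g, b) tuple; for images in other modes, calling img.convert("RGB") before load() may be needed.

```python
from collections import Counter


def count_pixels(px, width, height):
    """Count how often each pixel value occurs, reading px[x, y] row by row."""
    counts = Counter()
    for y in range(height):
        for x in range(width):
            counts[px[x, y]] += 1
    return counts


# Stand-in for the object returned by img.load(): a dict keyed by (x, y).
demo_px = {
    (0, 0): (100, 0, 200), (1, 0): (100, 0, 200),
    (0, 1): (0, 0, 0),     (1, 1): (100, 0, 200),
}
demo_counts = count_pixels(demo_px, 2, 2)
print(demo_counts.most_common(1))
```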
Your code should read in an image file, compute how many bits are required for a fixed length encoding
and then apply a compression algorithm to create a smaller encoding – you need to implement the
compression, you cannot use a compression library. You should output how many bits are required to
store the image in your compressed format as well as the compression ratio achieved. When it comes
to saving your compressed image, you won’t be able to save it as a standard image format, since you will
have created your own encoding, but you can save the sequence of bits into a text or binary file.
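If you choose a binary file, the bit sequence has to be padded to a whole number of bytes. The following is one possible sketch, not a required format: the helper names pack_bits and unpack_bits are hypothetical, and the layout (a leading byte recording how many padding bits were added) is an assumption made for this example.

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes.

    The first byte stores how many padding bits were appended (0-7).
    """
    padding = (8 - len(bits) % 8) % 8
    padded = bits + "0" * padding
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([padding]) + body


def unpack_bits(data):
    """Inverse of pack_bits: recover the original bit string."""
    padding = data[0]
    bits = "".join(format(byte, "08b") for byte in data[1:])
    return bits[:len(bits) - padding] if padding else bits


packed = pack_bits("1101000110")
restored = unpack_bits(packed)
print(len(packed), "bytes on disk,", len(restored), "bits restored")
```

The bytes returned by pack_bits could then be written with a file opened in "wb" mode and read back in "rb" mode. Saving the bits as a text file of '0' and '1' characters is simpler but uses one byte per bit.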
Your code should also be able to prompt the user for the filename of a text file containing a compressed
sequence of bits and then decompress that file into the original image – you can assume that the file
uses the same compression format as the last file you compressed. So, for example, if you compressed
pacificat.bmp into a series of bits stored in pacificat.txt and then the user asked you to decompress
alt_pacificat.txt, you could assume that alt_pacificat.txt used the same compression data structure as
pacificat.txt (it might be a subset of the data from the original image, for example).
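Decompression with a prefix-free code (such as a Huffman code) can be sketched as a scan that emits a value whenever the bits read so far match a code. The function name decode_bits and the small code table below are hypothetical and only illustrate the idea; your own data structure (for example a tree) may look different.

```python
def decode_bits(bits, codes):
    """Decode a bit string using a prefix-free code table {value: code}."""
    lookup = {code: value for value, code in codes.items()}
    values, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            # A complete code has been read; emit its value and start over.
            values.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit sequence ended in the middle of a code")
    return values


# Hypothetical code table: the most frequent value has the shortest code.
example_codes = {255: "1", 0: "01", 128: "00"}
decoded = decode_bits("1101001", example_codes)
print(decoded)  # [255, 255, 0, 128, 255]
```

To rebuild the image, the decoded values would still have to be regrouped into pixels, which means the image width and height must be known at decompression time.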
There are a number of libraries that can help you store formatted data into a file from Python. If you
research the options and find a way to store your compression data structure into a file, such that the
user can select both a bit file and a data structure file and use the data structure to decompress the bit
file, then you can earn an extra half point on the rubric.
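One option for storing the data structure is the json module from the standard library. The sketch below is only an assumption about what such a file could contain: a code table mapping integer values (0-255) to bit strings, plus the image dimensions. The function names dump_code_table and load_code_table are hypothetical.

```python
import json


def dump_code_table(codes, width, height):
    """Serialize a {value: code} table plus the image size to a JSON string."""
    payload = {
        "width": width,
        "height": height,
        # JSON object keys must be strings, so store code -> value instead.
        "codes": {code: value for value, code in codes.items()},
    }
    return json.dumps(payload)


def load_code_table(text):
    """Rebuild the {value: code} table and image size from a JSON string."""
    payload = json.loads(text)
    codes = {value: code for code, value in payload["codes"].items()}
    return codes, payload["width"], payload["height"]


table = {255: "1", 0: "01", 128: "00"}
text = dump_code_table(table, 640, 480)
loaded, w, h = load_code_table(text)
print(loaded == table, w, h)
```

The JSON string can be written to and read from an ordinary text file. If your symbols are whole (r, g, b) tuples rather than single integers, this layout would need adjusting, because JSON turns tuples into lists.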
Your code should compute the following statistics each time the compression code is called on a file:
• Runtime in milliseconds of the compression process, including any time needed to create the
  data structures used to help with the compression process.
• Number of bits needed by your algorithm to encode the contents of the file.
• Number of bits needed by a fixed length encoding to encode the contents of the file.
• The compression ratio achieved.
For the decompression process, you only need to track the runtime in milliseconds.
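The statistics listed above could be gathered along these lines. This is a sketch under two assumptions that the assignment text does not spell out: the compression ratio is taken as fixed-length bits divided by compressed bits, and the fixed-length size is 24 bits per pixel. The helper names compression_stats and timed_ms are hypothetical.

```python
import time


def compression_stats(num_pixels, compressed_bits):
    """Return (fixed_length_bits, compression_ratio) for an RGB image."""
    fixed_bits = num_pixels * 24  # 8 bits for each of r, g and b
    ratio = fixed_bits / compressed_bits if compressed_bits else float("inf")
    return fixed_bits, ratio


def timed_ms(func, *args):
    """Call func(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Hypothetical numbers: a 2x2 image compressed to 48 bits.
(fixed_bits, ratio), elapsed = timed_ms(compression_stats, 4, 48)
print(fixed_bits, "fixed-length bits, ratio", ratio, "in", round(elapsed, 3), "ms")
```

Wrapping the whole compression call (including building the code table) in timed_ms would cover the data-structure setup time; the same wrapper can be reused for decompression.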
Algorithm: In addition to your code, you need to turn in a well-formatted document containing the
pseudo-code for your core algorithm (written in the same format as the textbook uses for pseudo-code)
along with any proofs you decide to include about your algorithm – see grading section below.