
Reading A Binary File As Plain Text Using Python

A friend of mine has written simple poetry using C's fprintf function. It was written using the 'wb' option so the generated file is in binary. I'd like to use Python to show the p

Solution 1:

When you read text from a file, you have to know (or correctly guess) the character encoding that was used to write it. If the reading program assumes the wrong encoding, you get a few strange characters in the text if you're lucky and utter garbage if you're not. Note that opening the file with 'wb' in C does not make its content any less textual: the b flag only turns off newline translation on platforms such as Windows, and the bytes fprintf writes are still encoded text.
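To make that concrete, here is a small sketch of what a wrong guess looks like. The sample string and the choice of UTF-8 and Latin-1 are assumptions picked for illustration; they are not taken from the actual poetry file.

```python
# Sample text standing in for a line of the poem (an assumption for illustration).
original = "caf\u00e9 au lait"

# Pretend the writing program stored the text as UTF-8.
raw = original.encode("utf-8")

# Decoding with the matching encoding gives the text back unchanged.
right = raw.decode("utf-8")

# Latin-1 maps every byte to some character, so the wrong guess never
# raises an error; it silently produces strange characters instead.
wrong = raw.decode("latin-1")

print(right)
print(wrong)
```

The second line printed shows two odd characters where the single accented letter should be, which is the "lucky" kind of failure: the text is damaged but still recognizable.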

Don't guess if you can find out: ask your friend which character encoding the poetry was written in, then open the file in Python with that encoding specified. Say the answer is "UTF-16-LE" (for the sake of example); you would then write:

with open("poetry.bin", encoding="utf-16-le") as f:
    print(f.read())

It seems you're still on Python 2, though, where the built-in open has no encoding parameter, so use io.open instead:

import io
with io.open("poetry.bin", encoding="utf-16-le") as f:
    print f.read()

If you can't ask, you could start by trying UTF-8; it is a very commonly used encoding.
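If you really do have to guess, one possible approach is to read the raw bytes and try a short list of candidate encodings in order. The sketch below is only an illustration: decode_first_match is a hypothetical helper, the candidate list is an assumption, and the check for embedded NUL characters is a rough heuristic (ASCII-only UTF-16 text is technically valid UTF-8, just with a NUL after every letter).

```python
def decode_first_match(raw, candidates=("utf-8", "utf-16-le")):
    """Return (text, encoding) for the first candidate that decodes raw cleanly.

    Hypothetical helper for illustration; the candidate list is an assumption.
    """
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # bytes are not valid in this encoding, try the next one
        if "\x00" in text:
            continue  # heuristic: embedded NULs suggest a wrong guess
        return text, enc
    raise ValueError("none of the candidate encodings fit: %r" % (candidates,))


# Sample bytes standing in for the file's contents (an assumption).
sample = "Roses are red".encode("utf-16-le")
text, used = decode_first_match(sample)
print(used, text)
```

For the real file you would read the bytes with open("poetry.bin", "rb") and pass the result of f.read() to the helper. Keep in mind that a clean decode is not proof the guess was right, so look at the result before trusting it.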

