
How To Return Unique Words From The Text File Using Python

How do I return all the unique words from a text file using Python? For example:

I am not a robot
I am a human

Should return:

I am not a robot human

Here is what I've done so f

Solution 1:

for word in word_list:
    if word not in word_list:

This condition is never true: each word is drawn from word_list by the for loop, so it is always in word_list.

Instead of that logic, use a set:

unique_words = set(word_list)
for word in unique_words:
    file.write(str(word) + "\n")

A set only holds unique members, which is exactly what you're trying to achieve.

Note that order won't be preserved, but you didn't specify if that's a requirement.
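If the original order does matter, one possible variation is to deduplicate with dict.fromkeys, since dictionary keys are unique and keep insertion order on Python 3.7 and later. A minimal sketch, where the sample word_list is made up from the question's example and stands in for the words read from the file:

```python
# Hypothetical sample input, taken from the example in the question.
word_list = "I am not a robot I am a human".split()

# dict keys are unique and keep first-insertion order (Python 3.7+).
unique_in_order = list(dict.fromkeys(word_list))

print(unique_in_order)  # ['I', 'am', 'not', 'a', 'robot', 'human']
```

The resulting list can be written to the output file with the same loop as above.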

Solution 2:

Simply iterate over the lines in the file, split each line into words, and use a set to keep only the unique ones.

from itertools import chain

def unique_words(lines):
    # Split each line on whitespace and collect all words into one set.
    return set(chain.from_iterable(line.split() for line in lines if line))

Then do the following to read all the unique words from a file and print them:

with open(filename, 'r') as f:
    print(unique_words(f))
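To try this approach without a real file, a quick self-contained check might look like the following. It repeats the function so the snippet runs on its own; io.StringIO and the sample text are assumptions for the demo and simply stand in for an open text file:

```python
import io
from itertools import chain


def unique_words(lines):
    # Split every line on whitespace and collect the words into a set.
    return set(chain.from_iterable(line.split() for line in lines))


# io.StringIO behaves like an open text file: iterating it yields lines.
sample = io.StringIO("I am not a robot\nI am a human\n")
words = unique_words(sample)

# A set has no defined order, so sort it for a stable display.
print(sorted(words))  # ['I', 'a', 'am', 'human', 'not', 'robot']
```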

Solution 3:

This seems to be a typical application for the collections module:

...
import collections
d = collections.OrderedDict()
for word in wordlist:
    d[word] = None
# use this if you also want to count the words:
# for word in wordlist: d[word] = d.get(word, 0) + 1
for k in d.keys():
    print(k)

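You could also use collections.Counter, which also counts the elements you feed in. On Python versions before 3.7 the order of the words would get lost, though; from 3.7 on, Counter keeps insertion order like a regular dict. I added a commented-out line for counting while keeping the order.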
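For comparison, a small sketch of the Counter variant, assuming Python 3.7 or later and a made-up wordlist built from the question's example:

```python
from collections import Counter

# Hypothetical sample input, taken from the example in the question.
wordlist = "I am not a robot I am a human".split()

# Counter is a dict subclass that maps each word to its number of occurrences.
counts = Counter(wordlist)

# Iteration follows first-insertion order on Python 3.7+.
for word, count in counts.items():
    print(word, count)
```

The keys of counts are the unique words, so list(counts) gives the same result as the OrderedDict approach while also keeping the tallies.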

Solution 4:

string = "I am not a robot\n I am a human"
list_str = string.split()
print(list(set(list_str)))

Solution 5:

def unique_file(input_filename, output_filename):
    # Read the whole input file and split it into words on whitespace.
    with open(input_filename, 'r') as input_file:
        word_list = input_file.read().split()
    duplicates = []  # words already written, in order of first appearance
    with open(output_filename, 'w') as output_file:
        for word in word_list:
            if word not in duplicates:
                duplicates.append(word)
                output_file.write(word + "\n")

This code loops over every word and, if the word is not yet in the list duplicates, appends it to that list and writes it to the output file. Each word is therefore written once, in order of first appearance.
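Testing membership against a list scans the list on every check, so this loop can slow down noticeably on large files. One possible variation, sketched here with a hypothetical helper named unique_in_order, keeps a set next to the result list so each check is a constant-time lookup on average while the order of first appearance is still preserved:

```python
def unique_in_order(words):
    """Return the words with repeats removed, keeping first-seen order."""
    seen = set()   # fast membership checks
    result = []    # preserves the order of first appearance
    for word in words:
        if word not in seen:
            seen.add(word)
            result.append(word)
    return result


print(unique_in_order("I am not a robot I am a human".split()))
# ['I', 'am', 'not', 'a', 'robot', 'human']
```

The returned list can then be written to the output file one word per line, as in the function above.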
