What Is Efficient Way To Check If Current Word Is Close To A Word In String?

January 25, 2024

Consider the examples below. Example 1: str1 = 'wow...it looks amazing', str2 = 'looks amazi'. Here amazi is close to amazing; str2 is mistyped. I wanted to write a program...

Solution 1:

There are a lot of ways to approach this. This one solves all of your examples. I added a minimum similarity filter to return only the higher-quality matches. This is what allows the 'ly' to be dropped in the last sample, as it is not all that close to any of the words.

Documentation

You can install levenshtein with pip install python-Levenshtein

import Levenshtein

def find_match(str1,str2):
    min_similarity = .75
    output = []
    results = [[Levenshtein.jaro_winkler(x,y) for x in str1.split()] for y in str2.split()]
    for x in results:
        if max(x) >= min_similarity:
            output.append(str1.split()[x.index(max(x))])
    return output

Results for each sample you proposed:

find_match("is looking good", "looks goo")
['looking', 'good']

find_match("you are really looking good", "lok goo")
['looking', 'good']

find_match("Stu is actually SEVERLY sunburnt....it hurts!!!", "hurts!!")
['hurts!!!']

find_match("you guys were absolutely amazing tonight, a...", "ly amazin")
['amazing']
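
If installing a third-party package is not an option, a similar filter can be sketched with the standard library's difflib. Note that this swaps in a different technique: difflib.get_close_matches scores words with SequenceMatcher ratios rather than Jaro-Winkler, so the cutoff of 0.6 used here is an assumption and is not equivalent to min_similarity = .75. The name find_match_difflib is hypothetical.

```python
import difflib


def find_match_difflib(str1, str2, cutoff=0.6):
    """For each word of str2, return its closest word in str1 if one clears the cutoff."""
    candidates = str1.split()
    output = []
    for typed in str2.split():
        # n=1 keeps only the best-scoring candidate; words scoring below cutoff are skipped
        best = difflib.get_close_matches(typed, candidates, n=1, cutoff=cutoff)
        if best:
            output.append(best[0])
    return output


print(find_match_difflib("is looking good", "looks goo"))
```

As with the Levenshtein version, raising the cutoff makes the filter stricter, and a typed word with no candidate above the cutoff is simply left out of the result.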

Solution 2:

Like this:

str1 = "wow...it looks amazing"
str2 =  "looks amazi"
str3 = []

# Checking for similar strings in both strings:
for n in str1.split():
    for m in str2.split():
        if m in n:
            str3.append(n)

# If found 2 similar strings:
if len(str3) == 2:
    # If their indexes align:
    if str1.split().index(str3[1]) - str1.split().index(str3[0]) == 1:
        print(' '.join(str3))

elif len(str3) == 1:
    print(str3[0])

Output:

looks amazing

UPDATE with condition given by the OP:

str1 = "good..."
str2 =  "god.."
str3 = []

# Checking for similar strings in both strings:
for n in str1.split():
    for m in str2.split():

        # Calculating matching character in the 2 words:
        c = ''
        for i in m:
            if i in n:
                c += i
        # If the amount of matching characters is greater or equal to 50% the length of the larger word
        # or the smaller word is in the larger word:
        if len(c) >= len(n)*0.50 or m in n:
            str3.append(n)


# If found 2 similar strings:
if len(str3) == 2:
    # If their indexes align:
    if str1.split().index(str3[1]) - str1.split().index(str3[0]) == 1:
        print(' '.join(str3))

elif len(str3) == 1:
    print(str3[0])

Solution 3:

I worked through it with regular expressions:

import re

def check_regex(str1, str2):
    # New list to store the updated value
    str_new = []
    for i in str2:
        # Regular expressions for comparing the strings
        x = ['['+i+']', '^'+i, i+'$', '('+i+')']
        for k in x:
            h = 0
            for j in str1:
                # Conditions to make sure the word is close enough to the particular word
                if "".join(re.findall(k,j)) == i or ("".join(re.findall(k,j)) in i and abs(len("".join(re.findall(k,j)))-len(i)) == 1 and len(i) != 2):
                    str_new.append(j)
                    h = 1
                    break
            if h == 1:
                break
    return str_new

str1 = input().split()
str2 = input().split()
print(" ".join(check_regex(str1, str2)))

Solution 4:

You can use the Jaccard coefficient in this case. First, split the first and second strings on spaces. Then, for every word in str2, compute the Jaccard coefficient with every word in str1, and replace it with the word that gives the highest Jaccard coefficient.

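You can use sklearn.metrics.jaccard_score. Note that it compares arrays of labels rather than raw strings, so each word would first need to be encoded (for example, as a binary vector over characters).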
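
The steps above can be sketched in plain Python without scikit-learn. This is a minimal sketch under one assumption the answer does not state: the coefficient is taken over the set of characters in each word. The helper names jaccard and closest_words are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient of the character sets of two words."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Both words are empty: define the coefficient as 0 to avoid dividing by zero
        return 0.0
    return len(set_a & set_b) / len(union)


def closest_words(str1, str2):
    """For each word in str2, pick the word of str1 with the highest Jaccard coefficient."""
    candidates = str1.split()
    if not candidates:
        return []
    return [max(candidates, key=lambda w: jaccard(w, typed)) for typed in str2.split()]


print(closest_words("wow...it looks amazing", "looks amazi"))
# ['looks', 'amazing']
```

Because character sets ignore order and repetition, anagrams score 1.0 against each other, and every typed word is replaced by something even when nothing is close. Adding a minimum-coefficient threshold, as in Solution 1, would filter out the weak matches.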
