Including Word Boundary In String Modification To Be More Specific

Background The following is a minor change from modification of skipping empty list and continuing with function import pandas as pd Names = [list(['ann']), list(

Solution 1:

You need to add a word boundary (\b) on both sides of each string in the lists in df.loc[m].P_Name, so that only whole words are matched:

s = df.loc[m].P_Name.map(lambda x: [r'\b' + item + r'\b' for item in x])

Out[71]:
0                   [\bann\b]
2    [\belisabeth\b, \blis\b]
3           [\bhis\b, \bhe\b]
Name: P_Name, dtype: object

df.loc[m, 'Text'].replace(s, '**BLOCK**', regex=True)

Out[72]:
0       **BLOCK** had an anniversery today
2    I like **BLOCK** and **BLOCK** 5 lists
3    one day **BLOCK** and **BLOCK** cheated
Name: Text, dtype: object
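
As a minimal, pandas-free sketch of why the word boundary matters, the snippet below builds one bounded pattern per row with Python's re module. The helper name block_names and its mask parameter are illustrative and not part of the original answer. re.escape is added on the assumption that a name might contain regex metacharacters.

```python
import re


def block_names(text, names, mask="**BLOCK**"):
    """Replace whole-word occurrences of each name in `text` with `mask`.

    `block_names` and `mask` are hypothetical names used for illustration.
    """
    if not names:  # empty list: nothing to replace, return the text unchanged
        return text
    # Escape each name, join with '|', and wrap the group in \b ... \b
    # so that 'ann' cannot match inside 'anniversery'.
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = r"\b(?:" + alternatives + r")\b"
    return re.sub(pattern, mask, text)


# Without boundaries, 'he' also matches inside 'cheated'.
print(re.sub("he", "**BLOCK**", "cheated"))

# With boundaries, only whole words are replaced.
print(block_names("ann had an anniversery today", ["ann"]))
print(block_names("one day he and his cheated", ["his", "he"]))
```

The same function could be applied row by row, for example by zipping the Text and P_Name columns, if a single regex per row is preferred over a list of patterns.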

Solution 2:

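Sometimes a for loop is good practice. Here each Text is split on whitespace, and any token that exactly matches a name in P_Name is replaced, so only whole words are blocked: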

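df['New'] = [pd.Series(x).replace(dict.fromkeys(y, '**BLOCK**')).str.cat(sep=' ') for x, y in zip(df.Text.str.split(), df.P_Name)]
# set rows whose P_Name list is empty to NaN (assigning back instead of relying on inplace=True)
df['New'] = df['New'].where(df.P_Name.astype(bool))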
df
                                Text  ...                                      New
0       ann had an anniversery today  ...       **BLOCK** had an anniversery today
1                       nothing here  ...                                      NaN
2  I like elisabeth and lis 5 lists   ...   I like **BLOCK** and **BLOCK** 5 lists
3         one day he and his cheated  ...  one day **BLOCK** and **BLOCK** cheated
4                          same here  ...                                      NaN

[5 rows x 4 columns]
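
The per-row logic of this solution can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name block_tokens and the mask parameter are hypothetical, and None stands in for the NaN that pandas shows for rows with an empty name list.

```python
def block_tokens(text, names, mask="**BLOCK**"):
    """Replace whitespace-separated tokens that exactly equal a name.

    `block_tokens` and `mask` are hypothetical names used for illustration.
    """
    if not names:  # empty list: mirror the NaN rows by returning None
        return None
    lookup = set(names)
    # str.split() with no argument drops leading/trailing whitespace,
    # so the rejoined text has single spaces between tokens.
    return " ".join(mask if token in lookup else token for token in text.split())


print(block_tokens("I like elisabeth and lis 5 lists ", ["elisabeth", "lis"]))
print(block_tokens("nothing here", []))

# Exact token matching does not see through punctuation:
# 'ann,' is not equal to 'ann', so it is left untouched.
print(block_tokens("ann, had an anniversery today", ["ann"]))
```

One trade-off of the token approach is that a name followed directly by punctuation is not replaced, whereas the \b regex in Solution 1 would still match it.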
