
How To Remove The Â\xa0 From List Of Strings In Python

I tried using replace in Python, but it didn't work. my_list=[['the', 'production', 'business', 'environmentâ\xa0evaluating', 'the'], ['impact', 'of', 'the', 'en

Solution 1:

Use the unicodedata library. That way you keep more information from each word.

import unicodedata
final_list = [[unicodedata.normalize("NFKD", word) for word in ls] for ls in my_list]
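To see what the normalization step does, here is a small sketch on a single sample word taken from the question. NFKD splits â into a plain a followed by a combining circumflex (U+0302) and turns the no-break space \xa0 into an ordinary space. The variable names are only illustrative.

```python
import unicodedata

# Sample word: 'â' (U+00E2) followed by a no-break space (U+00A0).
word = 'environment\u00e2\u00a0evaluating'
normalized = unicodedata.normalize("NFKD", word)

# Show the code points that replaced the two original characters.
print([hex(ord(ch)) for ch in normalized[11:14]])  # ['0x61', '0x302', '0x20']
```

The normalized string is one character longer than the original, because â became two code points.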

To also replace â with a:

very_final_list = [[word.encode('ascii', 'ignore') for word in ls] for ls in final_list]

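If you want to remove â completely, you can use the line below instead. Note that NFKD has already split â into a followed by the combining circumflex U+0302, so that two-character sequence is what has to be replaced:

very_final_list = [[word.replace('a\u0302', '') for word in ls] for ls in final_list]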

The encode call returns bytes, so to remove the b' prefix in front of every string, decode the result back to a string as UTF-8.
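As a minimal sketch of that round trip on one sample word (assumed to be already NFKD-normalized):

```python
# An NFKD-normalized sample word: 'a' + combining circumflex + regular space.
word = 'environmenta\u0302 evaluating'

as_bytes = word.encode('ascii', 'ignore')  # non-ASCII code points are dropped
as_text = as_bytes.decode('utf-8')         # bytes -> str, so no b'' prefix

print(as_bytes)  # b'environmenta evaluating'
print(as_text)   # environmenta evaluating
```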

So putting everything together,

import unicodedata
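# Decompose accented letters and turn \xa0 into a regular space
final_list = [[unicodedata.normalize("NFKD", word) for word in ls] for ls in my_list]
# Drop everything that is not ASCII, then convert the bytes back to str
very_final_list = [[word.encode('ascii', 'ignore').decode('utf-8') for word in ls] for ls in final_list]
# Alternative: remove the decomposed 'â' (a + U+0302) entirely
#very_final_list = [[word.replace('a\u0302', '') for word in ls] for ls in final_list]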

And here is the final result:

[['the', 'production', 'business', 'environmenta evaluating', 'the'], ['impact', 'of', 'the', 'environmental', 'influences', 'such'], ['as', 'political', 'economic', 'technological', 'sociodemographica ']]

If you swap the two very_final_list statements (comment out the first and uncomment the second), this is the output:

[['the', 'production', 'business', 'environment evaluating', 'the'], ['impact', 'of', 'the', 'environmental', 'influences', 'such'], ['as', 'political', 'economic', 'technological', 'sociodemographic ']]
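The two variants can also be wrapped in one function. This is only a sketch: clean_nested and its keep_base_letter flag are hypothetical names, not part of the original answer.

```python
import unicodedata

def clean_nested(rows, keep_base_letter=True):
    """Hypothetical helper: clean every word in a list of lists."""
    cleaned = []
    for row in rows:
        new_row = []
        for word in row:
            # Decompose 'â' into 'a' + U+0302 and map \xa0 to a space.
            word = unicodedata.normalize("NFKD", word)
            if not keep_base_letter:
                # Remove the decomposed 'â' completely.
                word = word.replace('a\u0302', '')
            # Drop any remaining non-ASCII code points.
            new_row.append(word.encode('ascii', 'ignore').decode('ascii'))
        cleaned.append(new_row)
    return cleaned

sample = [['business', 'environment\u00e2\u00a0evaluating'],
          ['sociodemographic\u00e2\u00a0']]
print(clean_nested(sample))
print(clean_nested(sample, keep_base_letter=False))
```

With keep_base_letter=True the â becomes a plain a; with False it disappears, matching the two outputs shown above.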

Solution 2:

lst = []
for l in my_list:
    lst.append([s.replace(u'\xa0','') for s in l])

Output:

[['the', 'production', 'business', 'environmentâevaluating', 'the'],
 ['impact', 'of', 'the', 'environmental', 'influences', 'such'],
 ['as', 'political', 'economic', 'technological', 'sociodemographicâ']]

Hmm, I think the other answer breaks the structure of my_list. This approach is easy too: the replacement itself is just one line.
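Note that the output above still contains â, because only \xa0 is replaced. If both characters should go, one possible variation (not from the original answer) is str.translate with a deletion table. The name drop_table is illustrative.

```python
# Map 'â' (U+00E2) and the no-break space (U+00A0) to None so both are deleted.
drop_table = str.maketrans('', '', '\u00e2\u00a0')

my_list = [['the', 'environment\u00e2\u00a0evaluating'],
           ['sociodemographic\u00e2\u00a0']]
lst = [[s.translate(drop_table) for s in l] for l in my_list]

print(lst)  # [['the', 'environmentevaluating'], ['sociodemographic']]
```

Because nothing is inserted in place of the deleted characters, the two joined words run together; replace \xa0 with a space first if the word boundary matters.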

Solution 3:

Updated: a nested list comprehension should make this work for you

[[w.replace("â\xa0", " ") for w in words] for words in my_list]

Output

[['the', 'production', 'business', 'environment evaluating', 'the'],
['impact', 'of', 'the', 'environmental', 'influences', 'such'],
['as', 'political', 'economic', 'technological', 'sociodemographic ']]
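One thing to keep in mind with this approach: replace only matches the exact two-character sequence â followed by \xa0. The sketch below uses a made-up second row to show that a lone \xa0 is left untouched.

```python
# The first row comes from the question; the second row is a made-up example
# with a no-break space that is NOT preceded by 'â'.
my_list = [['business', 'environment\u00e2\u00a0evaluating'],
           ['some\u00a0word']]

result = [[w.replace("\u00e2\u00a0", " ") for w in words] for words in my_list]

print(result[0])        # ['business', 'environment evaluating']
print(result[1] == my_list[1])  # True: the lone \xa0 was not replaced
```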
