How To Remove The Â\xa0 From List Of Strings In Python
I tried using replace in Python, but it didn't work. my_list=[['the', 'production', 'business', 'environmentâ\xa0evaluating', 'the'], ['impact', 'of', 'the', 'en
Solution 1:
Use the unicodedata library. That way you preserve more information from each word.
import unicodedata
final_list = [[unicodedata.normalize("NFKD", word) for word in ls] for ls in my_list]
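To see what the normalization step does, here is a small sketch on a single string. The sample value is the word from the question's list, written with escapes so the characters are unambiguous; the variable names are only for illustration.

```python
import unicodedata

# "â" is U+00E2 and the no-break space is U+00A0
sample = "environment\u00e2\u00a0evaluating"
normalized = unicodedata.normalize("NFKD", sample)

# NFKD splits "â" into "a" plus a combining circumflex (U+0302)
# and turns the no-break space into a plain space (U+0020)
print([hex(ord(ch)) for ch in normalized[11:14]])  # ['0x61', '0x302', '0x20']
```

The combining circumflex is not ASCII, which is why a later encode('ascii', 'ignore') drops it and leaves a plain "a" behind.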
To also replace â with a:
very_final_list = [[word.encode('ascii', 'ignore') for word in ls] for ls in final_list]
If you want to completely remove â, then you can use:
very_final_list = [[word.replace('â', '') for word in ls] for ls in final_list]
To remove the b' in front of every string, decode it back to UTF-8.
So putting everything together,
import unicodedata
final_list = [[unicodedata.normalize("NFKD", word) for word in ls] for ls in my_list]
very_final_list = [[word.encode('ascii', 'ignore').decode('utf-8') for word in ls] for ls in final_list]
#very_final_list = [[word.replace('â', '') for word in ls] for ls in final_list]
And here is the final result:
[['the', 'production', 'business', 'environmenta evaluating', 'the'], ['impact', 'of', 'the', 'environmental', 'influences', 'such'], ['as', 'political', 'economic', 'technological', 'sociodemographica ']]
If you switch the very_final_list statements, then this is the output:
[['the', 'production', 'business', 'environment evaluating', 'the'], ['impact', 'of', 'the', 'environmental', 'influences', 'such'], ['as', 'political', 'economic', 'technological', 'sociodemographic ']]
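One caveat, as far as I can tell: after NFKD a precomposed "â" (U+00E2) no longer exists in the string, so replace('â', '') only matches if the literal in your source file is in the same decomposed form. A minimal sketch that sidesteps this by normalizing with NFKC instead, which still converts the no-break space to a regular space but keeps "â" as one character. The shortened my_list is sample data based on the question, not the full list.

```python
import unicodedata

# Shortened sample data based on the question's list
my_list = [['business', 'environment\u00e2\u00a0evaluating', 'the'],
           ['technological', 'sociodemographic\u00e2\u00a0']]

# NFKC maps U+00A0 to a plain space and leaves "â" (U+00E2) precomposed,
# so the replace below can match it
cleaned = [[unicodedata.normalize("NFKC", word).replace("\u00e2", "") for word in ls]
           for ls in my_list]
print(cleaned)
```

Doing the replace before any normalization would work as well.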
Solution 2:
lst = []
for l in my_list:
    lst.append([s.replace(u'\xa0', '') for s in l])
Output:
[['the', 'production', 'business', 'environmentâevaluating', 'the'],
['impact', 'of', 'the', 'environmental', 'influences', 'such'],
['as', 'political', 'economic', 'technological', 'sociodemographicâ']]
Hmm, I think the other answer breaks the structure of my_list. But it is easy too: only one line.
Solution 3:
Updated: a nested list comprehension should make this work for you.
[[w.replace("â\xa0", " ") for w in words] for words in my_list]
Output
[['the', 'production', 'business', 'environment evaluating', 'the'],
['impact', 'of', 'the', 'environmental', 'influences', 'such'],
['as', 'political', 'economic', 'technological', 'sociodemographic ']]
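Note that the last word still ends in a space ('sociodemographic '). If that is unwanted, a small variation with strip() should take care of it. This is an addition to the answer above, and the shortened my_list is sample data based on the question.

```python
# Shortened sample data based on the question's list
my_list = [['the', 'environment\u00e2\u00a0evaluating'],
           ['technological', 'sociodemographic\u00e2\u00a0']]

# Replace the stray "â" + no-break space pair with a space,
# then trim whitespace left at either end of the word
cleaned = [[w.replace("\u00e2\u00a0", " ").strip() for w in words]
           for words in my_list]
print(cleaned)
```

The inner space in 'environment evaluating' is kept; only leading and trailing whitespace is removed.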