How To Convert Strings In A Pandas Dataframe To A List Or An Array Of Characters?

I have a dataframe called data, a column of which contains strings. I want to extract the characters from the strings because my goal is to one-hot encode them and make the usable

Solution 1:

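You can split each string into its characters with a list comprehension that calls list on every value, then wrap the result in np.array if you need an array. This works when data is an iterable of strings, such as a list or a Series; iterating over a DataFrame directly yields its column labels, not its rows:

import numpy as np
predictors = np.array([list(x) for x in data])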

Or, if the strings live in a DataFrame column such as predictors['Sequence'], convert that column:

a = np.array([list(x) for x in predictors['Sequence']])
print(a)
[['D' 'K' 'W' 'L']
 ['F' 'C' 'H' 'N']
 ['K' 'D' 'Q' 'P']
 ['S' 'G' 'H' 'C']
 ['K' 'I' 'G' 'T']
 ['P' 'G' 'P' 'T']]
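
For reference, here is a self-contained sketch of the column conversion. The sample DataFrame is hypothetical, rebuilt from the output shown above, and the approach assumes every string has the same length, so that the nested lists form a regular 2-D array:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, reconstructed from the printed output above
predictors = pd.DataFrame(
    {"Sequence": ["DKWL", "FCHN", "KDQP", "SGHC", "KIGT", "PGPT"]}
)

# list(x) splits a string into single characters;
# equal-length strings give a 2-D array of shape (rows, string length)
a = np.array([list(x) for x in predictors["Sequence"]])

print(a.shape)  # (6, 4)
print(a[0])     # ['D' 'K' 'W' 'L']
```

If the strings differ in length, the rows no longer line up into a rectangular array, so the Series-of-lists form is likely the safer choice.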

To get a Series of lists instead of a 2-D array, use:

s = predictors['Sequence'].apply(list)
print(s)
0    [D, K, W, L]
1    [F, C, H, N]
2    [K, D, Q, P]
3    [S, G, H, C]
4    [K, I, G, T]
5    [P, G, P, T]
Name: Sequence, dtype: object
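
Since the stated goal is one-hot encoding, one possible next step is sketched below. This is not part of the original answer: the sample data and the pos0, pos1, ... column names are assumptions made for illustration, and pd.get_dummies is only one of several ways to do the encoding. It also assumes all strings have the same length.

```python
import pandas as pd

# Hypothetical sample data (first three sequences from the output above)
predictors = pd.DataFrame({"Sequence": ["DKWL", "FCHN", "KDQP"]})

# Expand the Series of lists into one column per character position
chars = pd.DataFrame(
    predictors["Sequence"].apply(list).tolist(),
    index=predictors.index,
)
# Illustrative column names: pos0, pos1, ...
chars.columns = [f"pos{i}" for i in range(chars.shape[1])]

# One indicator column per (position, letter) pair, e.g. "pos0_D"
one_hot = pd.get_dummies(chars)

print(one_hot.shape)  # (3, 12)
```

Each row then has exactly one active indicator per character position.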