Vectorized Format Function For Pandas Series
Say I start with a Series of unformatted phone numbers (as strings), and I would like to format them as (XXX) YYY-ZZZZ. I can get the sub-components of my input using regular exp
Solution 1:
You can do this directly with Series.str.replace():
In [47]: s = pandas.Series(["1234567890", "5552348866", "13434"])
In [49]: s
Out[49]:
0    1234567890
1    5552348866
2         13434
dtype: object
In [50]: s.str.replace(r"(\d{3})(\d{3})(\d{4})", r"(\1) \2-\3", regex=True)
Out[50]:
0    (123) 456-7890
1    (555) 234-8866
2             13434
dtype: object
You could also imagine doing another transformation first to remove any non-digit characters.
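A minimal sketch of that two-step idea, assuming the raw strings contain punctuation and spaces. The sample values and the names `raw`, `digits` and `formatted` are made up for illustration. The `^`/`$` anchors are an addition of mine, not part of the answer above; they restrict formatting to strings that are exactly ten digits.

```python
import pandas as pd

# Hypothetical sample data: phone numbers with assorted punctuation.
raw = pd.Series(["123-456-7890", "(555) 234 8866", "13434"])

# Step 1: strip every non-digit character.
digits = raw.str.replace(r"\D", "", regex=True)

# Step 2: reformat strings that are exactly ten digits.
# Values that do not match the pattern are left unchanged.
formatted = digits.str.replace(
    r"^(\d{3})(\d{3})(\d{4})$", r"(\1) \2-\3", regex=True
)

print(formatted)
```

Both steps stay vectorized, so no Python-level loop or apply() is needed. Entries that are too short, such as "13434", pass through untouched and can be filtered out afterwards if required.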