Extract The Name And Span Of Regex Matched Groups
I have a regex that looks like: rgx = '(?P<foo>ABC)(?P<bar>DEF)?(?P<norf>HIJK)'. Getting the matched string is no problem with m.group(name). However, I need to extra
Solution 1:
Iterate over the names of the matched groups (the keys of m.groupdict()) and print the span of each one by calling m.span(name):
import re

rgx = '(?P<foo>ABC)(?P<bar>DEF)?(?P<norf>HIJK)'
p = re.compile(rgx, re.IGNORECASE)
m = p.match('ABCDEFHIJKLM')
for key in m.groupdict():
    print(key, m.span(key))
This prints:
foo (0, 3)
bar (3, 6)
norf (6, 10)
Edit: Since the keys of a dictionary are not guaranteed to be ordered (before Python 3.7), you may wish to explicitly choose the order in which the keys are iterated over. In the example below, sorted(...) is a list of the group names sorted by each group's span tuple, that is, by where the group sits in the matched string:
for key in sorted(m.groupdict(), key=m.span):
    print(key, m.span(key))
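One case the example above does not exercise is the optional group failing to match. As a minimal sketch (the input string 'abchijk' and the variable name matched are chosen here for illustration), a group that did not participate has the value None in groupdict() and a span of (-1, -1), so it can be filtered out before sorting by start offset:

```python
import re

rgx = '(?P<foo>ABC)(?P<bar>DEF)?(?P<norf>HIJK)'
m = re.match(rgx, 'abchijk', re.IGNORECASE)

# Keep only the groups that took part in the match,
# ordered by where they start in the input string.
matched = sorted(
    (name for name, value in m.groupdict().items() if value is not None),
    key=m.start,
)
for name in matched:
    print(name, m.span(name))
```

Here bar is skipped because the optional DEF group matched nothing; without the filter it would be reported with the span (-1, -1).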
Solution 2:
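You can use RegexObject.groupindex, the compiled pattern's mapping from group names to group numbers. Sorting it by group number gives the names in the order they appear in the pattern: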
p = re.compile(rgx, re.IGNORECASE)
m = p.match('ABCDEFHIJK')
for name, n in sorted(m.re.groupindex.items(), key=lambda x: x[1]):
    print(name, m.group(n), m.span(n))
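If the name, text, and span are needed in more than one place, the same groupindex approach can be wrapped in a small helper. This is only a sketch: named_spans is a hypothetical name introduced here, not part of the re module.

```python
import re

def named_spans(match):
    """Return (name, matched_text, span) for each named group, in pattern order."""
    # groupindex maps group name -> group number; sort by the number.
    by_number = sorted(match.re.groupindex.items(), key=lambda item: item[1])
    return [(name, match.group(n), match.span(n)) for name, n in by_number]

p = re.compile('(?P<foo>ABC)(?P<bar>DEF)?(?P<norf>HIJK)', re.IGNORECASE)
m = p.match('ABCDEFHIJK')
for name, text, span in named_spans(m):
    print(name, text, span)
```

Returning a list of tuples rather than printing keeps the ordering explicit and lets the caller decide how to handle groups whose text is None.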