Extracting Raw XML via lxml etree
I'm trying to extract raw XML from an XML file. So if my data is: ... Lots of XML ...
Solution 1:
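You can use etree.tostring() to serialize each element back to XML. It returns bytes by default, so call .decode() on the result (or pass encoding="unicode") to get a str.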
Example...
from lxml import etree
xml = """
<xml>
<getThese>
<clonedKey>1</clonedKey>
<clonedKey>2</clonedKey>
<clonedKey>3</clonedKey>
<randomStuff>this is a sentence</randomStuff>
</getThese>
<getThese>
<clonedKey>6</clonedKey>
<clonedKey>8</clonedKey>
<clonedKey>3</clonedKey>
<randomStuff>more words</randomStuff>
</getThese>
</xml>
"""
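# remove_blank_text=True drops whitespace-only text nodes, so the output is compact
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.fromstring(xml, parser=parser)
elems = []
# Serialize each <getThese> child of the root; tostring() returns bytes, so decode to str
for elem in tree.xpath("getThese"):
    elems.append(etree.tostring(elem).decode())
print(elems)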
Printed output...
['<getThese><clonedKey>1</clonedKey><clonedKey>2</clonedKey><clonedKey>3</clonedKey><randomStuff>this is a sentence</randomStuff></getThese>', '<getThese><clonedKey>6</clonedKey><clonedKey>8</clonedKey><clonedKey>3</clonedKey><randomStuff>more words</randomStuff></getThese>']