
Tips On Parsing A Custom File Format Python

I developed a custom system which simulates web activity, for example downloading files and such. I also have a custom file format to feed into this system. I am looking to change

Solution 1:

One way to do this is to apply a series of regular expression substitutions that rewrite the file into the new format. I didn't completely understand the rules of your format, so I implemented the general idea; there are some differences from what you want, and you'll have to make some adjustments to fine-tune it. The output.txt shown below is what the script produces when your example is used as input.txt.

code

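import re

# Read the whole file at once so patterns can match across line breaks.
with open('input.txt') as f:
    data = f.read()

# Collapse "'StepN' => { action => " into "'N' => ".
data = re.sub(r"    'Step([0-9]+)' =>\s+{\s+action\s+=> ", r"    '\1'     => ", data)
# Remove the pass, accept_multi_match, fail and matchtype fields, and comment lines.
data = re.sub(r"',\s+pass\s+[^,]+,", "", data)
data = re.sub(r"',\s+accept_multi_match\s+[^,]+,", "", data)
data = re.sub(r"\n +#.*\n", "\n", data)
data = re.sub(r"',\s+fail\s+[^,]+,", "", data)
data = re.sub(r"',\s+matchtype\s+[^,]+,", "", data)
# Fold the inputstring and matchstring values into the action string.
data = re.sub(r"',\s+inputstring\s+=> '", ",", data)
data = re.sub(r"\s+matchstring\s+=> '", ",", data)
# Replace each block's closing brace line with the end of the quoted entry.
data = re.sub(r"\n        },", "',", data)

with open('output.txt', 'w') as f:
    f.write(data)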

output.txt

[settings]
email_to=people
special_websurf_processing=1
period_0_1_only=1
crc_recheck=0

[macro]
%::WebSurfRules =
    (
    '1'     => 'NAVIGATE,http://www.tda-sgft.com/TdaWeb/jsp/fondos/Fondos.tda',','2'     => 'CLICK_REFERENCE,phHttpDest->\{\'FirstClick\'\}',
    '3'     => 'CLICK_REFERENCE,phHttpDest->\{\'SecondClick\'\}',',
    '4'     => 'CLICK_REFERENCE,phHttpDest->\{\'DealClick\'\}',
    '5'     => 'CLICK_REFERENCE,phHttpDest->\{\'LinkClick2\'\}',
    '6'     => 'CLICK_REFERENCE,phHttpDest->\{\'DocClick\'\}',',
    '7'     => 'CLICK_DOWNLOAD_OK',',
    );
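
When fine-tuning the substitutions, it can help to run them one at a time on a small sample and see how many times each pattern matched. The following is only a sketch: the original input.txt is not shown here, so SAMPLE is a hypothetical block whose layout is guessed from the patterns above, and apply_rules is an illustrative helper name. The first two rules are taken from the script; the third is an adapted closing rule, not the one in the script, which consumes the preceding quote and comma so that no stray `','` is left behind.

```python
import re

# Hypothetical sample: the real input.txt is not shown, so this layout is
# an assumption inferred from the patterns used in the script above.
SAMPLE = (
    "    'Step1' =>\n"
    "        {\n"
    "        action           => 'NAVIGATE',\n"
    "        inputstring      => 'http://example.com/',\n"
    "        },\n"
)

# Ordered (pattern, replacement) pairs. Order matters: later rules see the
# text as already rewritten by earlier ones.
RULES = [
    # From the script: "'StepN' => { action => " becomes "'N' => ".
    (r"    'Step([0-9]+)' =>\s+{\s+action\s+=> ", r"    '\1'     => "),
    # From the script: fold the inputstring value into the action string.
    (r"',\s+inputstring\s+=> '", ","),
    # Adapted (assumption): swallow the quote, comma and closing brace
    # together, then emit a single "'," to end the entry.
    (r"',\s+},", "',"),
]


def apply_rules(text, rules):
    """Apply each substitution in order and report its match count."""
    for pattern, replacement in rules:
        text, count = re.subn(pattern, replacement, text)
        print(f"{count} match(es) for {pattern!r}")
    return text


result = apply_rules(SAMPLE, RULES)
print(result)
```

A rule that reports 0 matches on a sample where it should apply is usually the one whose whitespace or punctuation needs adjusting.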

...
