Parsing a CSV File and Aggregating Values in Python

September 26, 2022

I'm looking to parse a CSV file and aggregate two columns. Data in the CSV file:

'IP Address', Severity
10.0.0.1, High
10.0.0.1, High
10.0.0.1, Low
10.0.0.1, Medium
10.0.0.2, Medium
10.0

Solution 1:

import csv
from collections import defaultdict

# newline='' lets the csv module handle line endings itself (Python 3)
with open('text.txt', newline='') as f, open('ofile.csv', 'w', newline='') as g:
    reader, writer = csv.reader(f), csv.writer(g)
    results = defaultdict(list)
    next(reader)  # skip header line
    for ip, severity in reader:
        # severity keeps the space that follows the comma, e.g. " High"
        results[ip].append(severity)
    writer.writerow(["'IP Address'", " High", " Medium", " Low"])  # write headers
    for ip, severities in sorted(results.items()):
        writer.writerow([ip] + [severities.count(t) for t in [" High", " Medium", " Low"]])

Produces:

'IP Address', High, Medium, Low
10.0.0.1,2,1,1
10.0.0.2,1,1,1
10.0.0.3,1,2,0
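
Solution 1 matches the severities with their leading space (" High") because the sample data has a space after each comma. As a minimal sketch of an alternative, assuming the same two-column layout, passing skipinitialspace=True to csv.reader drops that space while parsing, so the levels can be compared as plain "High", "Medium", and "Low". The SAMPLE string and the function name count_severities are made up for illustration and are not part of the original answer.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample mirroring the layout in the question (space after each comma).
SAMPLE = (
    "'IP Address', Severity\n"
    "10.0.0.1, High\n"
    "10.0.0.1, High\n"
    "10.0.0.1, Low\n"
    "10.0.0.1, Medium\n"
    "10.0.0.2, Medium\n"
)

def count_severities(lines, levels=("High", "Medium", "Low")):
    """Return one [ip, count, count, ...] row per IP, sorted by IP string."""
    # skipinitialspace=True ignores whitespace right after each delimiter
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader)  # skip header line
    results = defaultdict(list)
    for ip, severity in reader:
        results[ip].append(severity)
    return [
        [ip] + [severities.count(level) for level in levels]
        for ip, severities in sorted(results.items())
    ]

rows = count_severities(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

The same function should work on a real file by passing an open file object instead of the io.StringIO wrapper.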

Solution 2:

Here is my solution, ag.py:

import collections
import csv
import sys

# Maps each IP to a Counter of severity levels
output = collections.defaultdict(collections.Counter)

with open(sys.argv[1], newline='') as infile:
    reader = csv.reader(infile)
    next(reader)  # Skip header line
    for ip, level in reader:
        level = level.strip()  # Remove surrounding spaces
        output[ip][level] += 1

print("'IP Address',High,Medium,Low")
for ip, count in output.items():
    # A Counter returns 0 for levels that never occurred
    print('{0},{1[High]},{1[Medium]},{1[Low]}'.format(ip, count))

To run the solution, issue the following command:

python ag.py data.csv

Discussion

output is a dictionary whose keys are IP addresses and whose values are collections.Counter objects.
Each Counter counts the 'High', 'Medium', and 'Low' entries for a particular IP.
My solution prints to stdout; you can modify it to write to a file instead.
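
One way to send the result to a file rather than stdout is sketched below, assuming the same defaultdict-of-Counter structure described above. The helper names aggregate and write_counts, the LEVELS tuple, and the sample data are hypothetical. An in-memory buffer stands in for the output file so the example is self-contained.

```python
import collections
import csv
import io

LEVELS = ("High", "Medium", "Low")  # column order for the report

def aggregate(infile):
    """Build {ip: Counter of severity levels} from a two-column CSV."""
    output = collections.defaultdict(collections.Counter)
    reader = csv.reader(infile)
    next(reader)  # skip header line
    for ip, level in reader:
        output[ip][level.strip()] += 1
    return output

def write_counts(output, outfile):
    """Write one row per IP with the count for each level."""
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(["'IP Address'", *LEVELS])
    for ip, count in sorted(output.items()):
        # Missing levels read as 0 from a Counter
        writer.writerow([ip] + [count[level] for level in LEVELS])

# Hypothetical input; a real run would pass open file objects instead.
sample = io.StringIO(
    "'IP Address', Severity\n"
    "10.0.0.1, High\n"
    "10.0.0.1, High\n"
    "10.0.0.1, Low\n"
    "10.0.0.2, Medium\n"
)
buffer = io.StringIO()
write_counts(aggregate(sample), buffer)
result = buffer.getvalue()
print(result)
```

To write to disk, the buffer can be replaced with something like open('ofile.csv', 'w', newline=''), where the file name is only an example.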
