
Python Parse File For Ip Addresses

I have a file with several IP addresses. There are about 900 IPs on 4 lines of txt. I would like the output to be 1 IP per line. How can I accomplish this? Based on other code, I h

Solution 1:

The $ anchor in your expression is preventing you from finding anything but the last entry. Remove that, then use the list returned by .findall():

found = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})',text)
ips.extend(found)

re.findall() will always return a list, which could be empty.

  • If you only want unique addresses, use a set instead of a list.
  • If you need to validate IP addresses (including ignoring private-use networks and local addresses), consider using the ipaddress.IPv4Address class.
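
The two suggestions above could be combined roughly as follows. This is only a sketch, not the asker's code: the helper name `valid_unique_ips` and the sample `text` are made up for illustration, and it assumes "validate" means "parses as an IPv4 address and is not private, loopback, or link-local".

```python
import ipaddress
import re

IP_PATTERN = re.compile(r'\d{1,3}(?:\.\d{1,3}){3}')


def valid_unique_ips(text):
    """Return the set of unique, non-private IPv4 addresses found in text."""
    unique = set()
    for candidate in IP_PATTERN.findall(text):
        try:
            address = ipaddress.IPv4Address(candidate)
        except ValueError:
            # e.g. "999.1.1.1" matches the regex but is not a valid address
            continue
        # Assumption: private, loopback and link-local addresses are unwanted.
        if address.is_private or address.is_loopback or address.is_link_local:
            continue
        unique.add(candidate)
    return unique


# Hypothetical sample input
text = "8.8.8.8 10.0.0.1 8.8.8.8 999.1.1.1 127.0.0.1 1.1.1.1"
print(sorted(valid_unique_ips(text)))
```

A set does not keep the order in which addresses appeared, so sort the result (or use a different structure) if order matters.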

Solution 2:

The findall function returns a list of matches; you aren't iterating through each match.

# The trailing $ is dropped so that every address matches, not only the last one.
regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})', text)
if regex is not None:
    for match in regex:
        if match not in ips:
            ips.append(match)

Solution 3:

Extracting IP Addresses From File

I answered a similar question in this discussion. In short, it's a solution based on one of my ongoing projects for extracting Network and Host Based Indicators from different types of input data (e.g. string, file, blog posting, etc.): https://github.com/JohnnyWachter/intel


I would import the IPAddresses and Data classes, then use them to accomplish your task in the following manner:

#!/usr/bin/env python
"""Extract IPv4 Addresses From Input File."""

from Data import CleanData  # Format and Clean the Input Data.
from IPAddresses import ExtractIPs  # Extract IPs From Input Data.


def get_ip_addresses(input_file_path):
    """
    Read contents of input file and extract IPv4 Addresses.
    :param input_file_path: fully qualified path to input file. Expecting str
    :returns: dictionary of IPv4 and IPv4-like Address lists
    :rtype: dict
    """

    input_data = []  # Empty list to house formatted input data.

    input_data.extend(CleanData(input_file_path).to_list())

    results = ExtractIPs(input_data).get_ipv4_results()

    return results

Now that you have a dictionary of lists, you can access the data you want and output it however you like. The example below uses the function above, printing the results to the console and writing them to a specified output file:

    # Extract the desired data using the aforementioned function.
    ipv4_list = get_ip_addresses('/path/to/input/file')

    # Open your output file in 'append' mode.
    with open('/path/to/output/file', 'a') as outfile:

        # Ensure that the list of valid IPv4 Addresses is not empty.
        if ipv4_list['valid_ips']:

            for ip_address in ipv4_list['valid_ips']:

                # Print to console.
                print(ip_address)

                # Write to output file, one address per line.
                outfile.write(ip_address + '\n')

Solution 4:

Without the re.MULTILINE flag, $ matches only at the end of the string (or just before a trailing newline at the end).
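
A small experiment shows the difference. The two-line `text` below is made-up sample data, not the asker's file:

```python
import re

PATTERN = r'\d{1,3}(?:\.\d{1,3}){3}'

# Hypothetical two-line input with two addresses per line
text = "10.0.0.1 10.0.0.2\n10.0.0.3 10.0.0.4"

anchored = re.findall(PATTERN + r'$', text)
multiline = re.findall(PATTERN + r'$', text, re.MULTILINE)
unanchored = re.findall(PATTERN, text)

print(anchored)    # only the address at the very end of the string
print(multiline)   # the last address on each line
print(unanchored)  # every address
```

Even with re.MULTILINE the anchored pattern still misses addresses in the middle of a line, so dropping the $ is the simpler fix here.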

To make debugging easier, split the code into several parts that you can test independently.

import re

def extract_ips(data):
    return re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", data)

If the input file is small and you don't need to preserve the original order of the IPs:

with open(filename) as infile, open(outfilename, "w") as outfile:
    outfile.write("\n".join(set(extract_ips(infile.read()))))

Otherwise:

with open(filename) as infile, open(outfilename, "w") as outfile:
    seen = set()
    for line in infile:
        for ip in extract_ips(line):
            if ip not in seen:
                seen.add(ip)
                print(ip, file=outfile)  # Python 3 form of "print >>outfile, ip"
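
If the file fits in memory but the order still matters, a more compact variant is possible. This is a sketch rather than part of the original answer: `unique_in_order` is a hypothetical helper, and it relies on dict keys preserving insertion order, which Python guarantees from version 3.7.

```python
import re


def extract_ips(data):
    return re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", data)


def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item."""
    # dict keys keep insertion order (guaranteed since Python 3.7)
    return list(dict.fromkeys(items))


# Hypothetical sample data
sample = "10.0.0.2 10.0.0.1 10.0.0.2\n10.0.0.3 10.0.0.1"
print("\n".join(unique_in_order(extract_ips(sample))))
```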
