Memory Error Python Processing Large File Line By Line
I am trying to concatenate model output files. The model run was broken up into 5 parts, and each output file corresponds to one of those partial runs, due to the way the software writes its output to file.
Solution 1:
It is not necessary to read the entire contents of each file into memory before writing to the output file; large files will simply consume available memory, possibly all of it.
Instead, read and write one line at a time. Also open the output file only once, and choose a name that will not be picked up and treated as an input file itself; otherwise you run the risk of concatenating the output file onto itself (not yet a problem here, but it could be if you also process files from the current directory), assuming loading it doesn't already exhaust memory.
import os.path

with open('output.txt', 'w') as outfile:
    for folder in range(1, 6):
        # each partial run's output lives in a folder named 1..5
        for name in os.listdir(str(folder)):
            if "domain" in name:
                with open(os.path.join(str(folder), name)) as file_content_list:
                    for line in file_content_list:
                        # perform corrections/modifications to line here
                        outfile.write(line)
Now you can process the data in a line-oriented manner: just modify each line before writing it to the output file.
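As an illustration of the kind of per-line change that could go in the commented line inside the loop, here is a minimal sketch; fix_line and the 'NaN' placeholder it replaces are hypothetical, standing in for whatever correction the model output actually needs:

def fix_line(line):
    # Hypothetical correction: strip trailing whitespace, replace a 'NaN'
    # placeholder with a numeric default, then restore the newline so the
    # concatenated output keeps one record per line.
    return line.rstrip().replace('NaN', '0.0') + '\n'

Inside the inner loop, outfile.write(fix_line(line)) would then replace the plain outfile.write(line).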