Buffer Underrun And ResourceExhausted Errors With TensorFlow
Solution 1:
You are running out of memory. Your network may simply need more memory than your machine has, so the first step in tracking down excessive memory usage is to find out what is using it.
Here's one approach that uses timeline and statssummarizer: https://gist.github.com/yaroslavvb/08afccbe087171881ceafc0c98abca05
This prints several tables, one of which lists the tensors sorted by memory usage. Check that nothing unusually large shows up there.
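As a rough cross-check on that table, you can estimate tensor sizes by hand from their shapes and dtypes. The sketch below is plain Python and does not touch TensorFlow; the tensor names and shapes are made-up examples, and the dtype table only covers a few common types.

```python
from functools import reduce
from operator import mul

# Bytes per element for a few common dtypes (not an exhaustive list).
DTYPE_BYTES = {"float16": 2, "float32": 4, "float64": 8, "int32": 4, "int64": 8}


def tensor_bytes(shape, dtype="float32"):
    """Return the size in bytes of a dense tensor with the given shape."""
    return reduce(mul, shape, 1) * DTYPE_BYTES[dtype]


def rank_by_memory(tensors):
    """Sort (name, shape, dtype) triples by estimated size, largest first."""
    sized = [(name, tensor_bytes(shape, dtype)) for name, shape, dtype in tensors]
    return sorted(sized, key=lambda item: item[1], reverse=True)


# Hypothetical tensors, for illustration only.
example_tensors = [
    ("logits", (64, 1000), "float32"),
    ("conv1/output", (64, 112, 112, 64), "float32"),
    ("fc/weights", (4096, 1000), "float32"),
]

for name, nbytes in rank_by_memory(example_tensors):
    print(f"{name:15s} {nbytes / 2**20:8.1f} MiB")
```

An activation whose size scales with the batch size, like the hypothetical `conv1/output` here, is often the first thing to shrink: halving the batch halves its footprint.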
You can also view a memory timeline in the Chrome trace visualizer, as detailed here.
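A minimal sketch of producing such a trace, assuming a TensorFlow 1.x-style graph and session (accessed through `tf.compat.v1`). The helper name `write_chrome_trace` and the default output path are my own choices, not part of TensorFlow; `sess` and `fetches` are whatever session and ops you already run.

```python
def write_chrome_trace(sess, fetches, feed_dict=None, path="timeline.json"):
    """Run one step with full tracing and dump a Chrome-format trace file.

    Assumes `sess` is a tf.compat.v1.Session and `fetches` are ops or
    tensors from its graph. The helper name and `path` are illustrative.
    """
    # Imported lazily so this module can be loaded without TensorFlow.
    import tensorflow as tf
    from tensorflow.python.client import timeline

    tf1 = tf.compat.v1
    run_options = tf1.RunOptions(trace_level=tf1.RunOptions.FULL_TRACE)
    run_metadata = tf1.RunMetadata()

    # Tracing slows the step down, so do this for a single step only.
    result = sess.run(
        fetches,
        feed_dict=feed_dict,
        options=run_options,
        run_metadata=run_metadata,
    )

    trace = timeline.Timeline(run_metadata.step_stats)
    with open(path, "w") as f:
        # show_memory=True adds memory usage information to the trace.
        f.write(trace.generate_chrome_trace_format(show_memory=True))
    return result
```

Open `chrome://tracing` in Chrome and load the resulting JSON file to browse the step.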
A more advanced technique is to plot a timeline of memory allocations and deallocations, as done in this issue.
In theory, your memory usage shouldn't grow between steps unless you are creating new stateful ops (Variables). In practice, I found that global memory allocation can grow if the sizes of your tensors change between steps.
A workaround is to periodically save your parameters to a checkpoint and restart your script.
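The checkpoint-and-restart pattern can be sketched as follows. This is a framework-agnostic illustration in plain Python: the JSON file stands in for a real checkpoint (in TensorFlow 1.x you would typically use `tf.train.Saver`), the "training step" is a placeholder, and all function and parameter names are hypothetical.

```python
import json
import os


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "params": {}}
    with open(path) as f:
        return json.load(f)


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run_training(path, total_steps, steps_per_run, save_every):
    """Train for at most `steps_per_run` steps, resuming from `path`.

    Returns True once `total_steps` has been reached, False if the
    script should be restarted to continue.
    """
    state = load_checkpoint(path)
    stop_at = min(total_steps, state["step"] + steps_per_run)
    while state["step"] < stop_at:
        state["step"] += 1
        # Placeholder for a real training step that updates parameters.
        state["params"]["w"] = state["params"].get("w", 0.0) + 0.1
        if state["step"] % save_every == 0 or state["step"] == stop_at:
            save_checkpoint(path, state)
    return state["step"] >= total_steps
```

A wrapper script or shell loop can then relaunch the process until `run_training` reports completion, so that each run starts with freshly allocated memory.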