fileChunker (class)
class fileChunker(filepath, batchSize)
Bases: object
The fileChunker iterator - iterate over large line-based files to reduce memory footprint
Key Arguments
filepath – path to the large file to iterate over
batchSize – size of the chunks to return, in lines
Usage
To set up your logger, settings and database connections, please use the fundamentals package (see the package tutorial).
To create a fileChunker iterator and process the file in batches of 100000 lines, use the following:
    from fundamentals.files import fileChunker

    fc = fileChunker(
        filepath="/path/to/large/file.csv",
        batchSize=100000
    )
    for i in fc:
        # each i is one batch of up to batchSize lines
        print(len(i))
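To illustrate why iterating in batches keeps the memory footprint low, here is a minimal sketch of the general technique using only the standard library. It is not the fundamentals implementation: the function name `iter_line_batches` and its `batch_size` parameter are hypothetical, and the sketch assumes each batch is a list of raw lines with their newline characters kept.

```python
from itertools import islice


def iter_line_batches(filepath, batch_size):
    """Yield successive lists of at most batch_size lines from filepath.

    Hypothetical helper for illustration only; not part of fundamentals.
    """
    with open(filepath, "r") as handle:
        while True:
            # islice pulls at most batch_size lines from the open file,
            # so only one batch is held in memory at a time
            batch = list(islice(handle, batch_size))
            if not batch:
                # nothing left to read: end the iteration
                return
            yield batch
```

With this approach the final batch may be shorter than `batch_size` when the line count is not an exact multiple of it, so downstream code should not assume a fixed batch length.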
Methods