fileChunker (class)

class fileChunker(filepath, batchSize)

Bases: object

The fileChunker iterator: iterate over a large line-based file in batches of lines, rather than loading it whole, to reduce memory footprint

Key Arguments

  • filepath – path to the large file to iterate over

  • batchSize – number of lines to return in each chunk


To set up your logger, settings and database connections, please use the fundamentals package (see tutorial here).

To create a fileChunker iterator and then process the file in batches of 100000 lines, use the following:

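from fundamentals.files import fileChunker
fc = fileChunker(
    filepath="/path/to/large/file.txt",  # placeholder path; substitute your own file
    batchSize=100000
)
for i in fc:
    print(len(i))  # number of lines in the current batch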
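
For readers curious about how line batching of this kind can work, the following is a minimal sketch and not the fundamentals source. The class name `LineBatcher` and the helper `batch_sizes` are hypothetical, introduced only for illustration. The sketch assumes each batch is a list of raw lines with their trailing newlines kept; the real fileChunker may behave differently.

```python
from itertools import islice


class LineBatcher(object):
    """Hypothetical stand-in that yields a file's lines in fixed-size batches."""

    def __init__(self, filepath, batchSize):
        self.filepath = filepath
        self.batchSize = batchSize

    def __iter__(self):
        # The file is read incrementally, so only one batch is in memory at a time
        with open(self.filepath, "r") as handle:
            while True:
                # islice takes at most batchSize lines from the open handle
                batch = list(islice(handle, self.batchSize))
                if not batch:
                    break  # end of file reached
                yield batch


def batch_sizes(filepath, batchSize):
    """Return the number of lines in each batch, in file order."""
    return [len(batch) for batch in LineBatcher(filepath, batchSize)]
```

Under these assumptions, every batch holds batchSize lines except possibly the last, which holds whatever remains in the file.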