fileChunker (class)

class fundamentals.files.fileChunker(filepath, batchSize)

Bases: object

The fileChunker iterator: iterate over large line-based files in batches to reduce the memory footprint

Key Arguments

  • filepath – path to the large file to iterate over

  • batchSize – number of lines in each chunk returned

Usage

To set up your logger, settings and database connections, please use the fundamentals package (see the tutorial at https://fundamentals.readthedocs.io/en/master/initialisation.html).

To initiate a fileChunker iterator and then process the file in batches of 100000 lines, use the following:

```python
from fundamentals.files import fileChunker

fc = fileChunker(
    filepath="/path/to/large/file.csv",
    batchSize=100000
)
for i in fc:
    # each i is one batch of lines from the file
    print(len(i))
```
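For readers curious how batch-wise iteration keeps memory use low, here is a minimal sketch of the general technique. It is not the fundamentals implementation: the function name `iter_line_batches` and its `batch_size` parameter are hypothetical, and UTF-8 text input is assumed.

```python
from itertools import islice


def iter_line_batches(filepath, batch_size):
    """Yield successive lists of at most `batch_size` lines from `filepath`.

    Hypothetical helper for illustration only; not part of fundamentals.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # Assumption: the file is UTF-8 encoded text
    with open(filepath, "r", encoding="utf-8") as handle:
        while True:
            # islice pulls at most batch_size lines from the open handle,
            # so only one batch is held in memory at a time
            batch = list(islice(handle, batch_size))
            if not batch:
                break
            yield batch
```

Each yielded batch is a list of lines with their line endings intact. The final batch may be shorter than `batch_size` when the line count is not an exact multiple.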

Methods