Large CSV loading

I have increased the memory allocation for OR to 4096, and done the same for the JDK.
The CSV file I’m trying to open is 2.5GB!
Is this likely to be impossible??

The time remaining just keeps climbing …

I am curious if you would have any luck with the 4.0-alpha1 version (or even better, building it from the 4.0 branch from source)? This version is not meant to be stable but given that 3.x will definitely struggle opening something like that, why not give it a try…

However, I have never tried it with such a big file, and whether it works will depend heavily on using only features that scale well.

Hello, I had a similar problem with memory usage getting stuck. It turned out that my CSV file had some problems with special characters or encoding (I'm not sure which). What I did was remove all the lines except a few to see whether the file could be read properly.
Try the following:
- Create a copy of your CSV file, delete all the lines except the first 100 or 1000, and check whether it loads. If that doesn't work, it's probably a problem with the format.
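With a 2.5GB file, trimming it in an editor may itself be slow, so here is a minimal sketch of the same step done with a short script. The function name `copy_head` and the file names are made up for illustration, and UTF-8 is only an assumed encoding. If the first lines are not valid in that encoding, the read raises `UnicodeDecodeError`, which is itself a hint that encoding is the problem.

```python
import itertools


def copy_head(src_path, dst_path, n_lines=1000, encoding="utf-8"):
    """Copy the first n_lines lines of src_path to dst_path.

    Returns the number of lines actually written. Only n_lines lines are
    read, so this stays cheap even for a multi-gigabyte source file.
    """
    written = 0
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in itertools.islice(src, n_lines):
            dst.write(line)
            written += 1
    return written


# Hypothetical usage:
# copy_head("big.csv", "sample.csv", n_lines=1000)
```

Note that this cuts on physical lines, so if a quoted field contains a line break, the last record in the sample may be incomplete.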

Thank you for your advice -
I think one thing was that I hadn't edited the correct .ini file. However, due to time constraints I ended up loading the whole thing into PostgreSQL in chunks.
Even though I was starting OR from the command line with a memory switch, I hadn't edited REFINE_MEMORY in refine.ini. I will try again and post the result.
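For anyone who wants to try the chunked route, here is a rough sketch of one way to split a large CSV into smaller files that each repeat the header row. It is not necessarily how the load above was done. The function name `split_csv`, the output naming scheme, and the chunk size are all assumptions for illustration. Each piece could then be loaded separately, for example with PostgreSQL's COPY.

```python
import csv


def split_csv(src_path, out_prefix, rows_per_chunk=100_000, encoding="utf-8"):
    """Split src_path into files holding at most rows_per_chunk data rows.

    Every output file starts with the header row of the source file.
    Returns the list of output paths, in order.
    """
    paths = []
    with open(src_path, "r", encoding=encoding, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:  # empty source file: nothing to split
            return paths
        out = None
        writer = None
        count = 0
        try:
            for row in reader:
                # Start a new chunk file when none is open or the current one is full.
                if out is None or count >= rows_per_chunk:
                    if out is not None:
                        out.close()
                    path = f"{out_prefix}_{len(paths):04d}.csv"
                    out = open(path, "w", encoding=encoding, newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    paths.append(path)
                    count = 0
                writer.writerow(row)
                count += 1
        finally:
            if out is not None:
                out.close()
    return paths
```

Because rows are streamed one at a time, memory use stays small regardless of the size of the source file.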

The percentage calculation is, obviously, buggy, but if the time keeps climbing for more than a minute or two, it's time to bail. Java's low-memory behavior is kind of pathological in that it'll keep trying to grind away when it really has no hope of completing.

How much memory you'll need will really depend on the "shape" of the CSV file (columns, rows, data types, etc), but I would expect that you'd need more than 4GB to open a 2.5 GB file, although if it's a few columns of relatively long strings, you might squeak by.

Tom

Happy to say a fix has been put in for this bug after seeing the OP's screenshot.

And it’s released in 3.7.1 too.
