Hello all!
I am running into a difficult problem with OpenRefine that is seriously affecting my data. I have a large project in which I was trying to split multi-valued cells. The most recent time I tried, the project crashed, and I saw a screen telling me I was out of memory. I increased the memory allocation and opened the project again, but when I did, my data was messed up: it seems to have been split, but the original cells remained too, and the values are no longer kept together or in order. For instance, cells that looked like the following now look like this:
Original Data (top to bottom): abc, def, ghi, jkl
Corrupted Data Now (top to bottom): abc, a, def, ghi, b, c, jkl, d
I attempted to go back earlier into the project's history before this occurred, but when I attempt to do so I get an error saying "java.lang.IndexOutOfBoundsException: Index 180 out of bounds for length 91". Is the history of the project totally lost? I would like to be able to salvage all of the work I have done.
Thanks,
Ella
Hi @ellathompson, I'm afraid I don't have any wisdom on the solution, but I would be really surprised if it has anything to do with editing the openrefine.l4j file.
What version of OpenRefine are you using?
I wonder if @Rory is aware of any reported bugs that relate to this behaviour. @tfmorris or @thadguidry may have insight into how you might recover the situation, if they are around.
Sorry for the delayed response! I don’t think I’ve seen any similar issues that have come up recently.
Just to make sure I understand the issue, this is what I’m reading from the original message:
- OpenRefine ran out of memory while splitting multi-valued cells
- After restarting OpenRefine with a larger memory allocation, the project was corrupted:
  - the source data is intact but the split cells are no longer attached to the correct record
  - it’s not possible to go back to an earlier point in the project’s history due to the index out of bounds exception
Is that an accurate read of the situation (especially with regard to the source data remaining non-corrupted)?
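To pin down what "attached to the correct record" means here, below is a rough Python sketch of what I would expect a split that completes normally to produce. It is only an illustration, not OpenRefine's actual code: the function name `split_multi_valued_cells`, the column names `id` and `tags`, and the comma separator are all made up for the example.

```python
def split_multi_valued_cells(rows, column, separator=","):
    """Give each value in `column` its own row, keeping record order.

    Hypothetical helper for illustration only; not part of OpenRefine.
    """
    result = []
    for row in rows:
        cell = row.get(column)
        if not cell:
            # Nothing to split: keep the row unchanged.
            result.append(dict(row))
            continue
        values = cell.split(separator)
        # The first value stays on the original row of the record.
        first = dict(row)
        first[column] = values[0]
        result.append(first)
        # Each remaining value gets a new row directly below it,
        # with every other column left blank.
        for value in values[1:]:
            extra = {key: None for key in row}
            extra[column] = value
            result.append(extra)
    return result


rows = [
    {"id": "1", "tags": "a,b"},
    {"id": "2", "tags": "c"},
]
split_rows = split_multi_valued_cells(rows, "tags")
for row in split_rows:
    print(row)
```

In this sketch the extra rows always sit directly under the row they came from, so the values stay grouped with their record. If the corrupted project shows split values scattered among unrelated rows, that would suggest the operation was interrupted partway through.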
@ellathompson, do you happen to have a longer error output when you get the IndexOutOfBoundsException? That would be useful in finding out where this issue originates in the code.