Cell cross not fully working: Error: java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0

Greetings,

I'm trying to use the cell.cross function and I'm not sure why it is not finding all the matches. For the cells that do not have a match it instead displays the error "Error: java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0"

471490 exists in my first list (there are about 20 numbers that should match but don't)

but is not found when using cell.cross

I originally had these as text and then converted to numbers thinking that might be the issue but I get the error either way.

I was using 3.8, updated to 3.8.2 and still get the error. I'm on Windows 10 using Chrome. I'm baffled as to why it is only happening with certain elements of the data.

Thank you!
Jennifer

The error message almost certainly means that no matches were found for the given value.

The result from cross is therefore empty, and accessing the first element of an empty result gives you the IndexOutOfBoundsException you mention.

This is quite a technical way of handling this problem.
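To illustrate the mechanism outside OpenRefine, here is a minimal sketch in Python (not GREL). It models a cross lookup as a dictionary from key to a list of matching rows. The helper names `build_index` and `first_match` and the sample data are hypothetical, and this is not how OpenRefine implements cross internally; it only shows why taking the first element of an empty result fails, and how a guard avoids it.

```python
from collections import defaultdict

def build_index(rows, key_column):
    """Group rows by the value in key_column, like a lookup table."""
    index = defaultdict(list)
    for row in rows:
        index[row[key_column]].append(row)
    return index

def first_match(index, key, default=None):
    """Return the first matching row, or default when nothing matches."""
    matches = index.get(key, [])  # empty list when there is no match
    return matches[0] if matches else default

# Hypothetical data standing in for the second project.
other_project = [{"id": "471490", "title": "example row"}]
index = build_index(other_project, "id")

hit = first_match(index, "471490")
miss = first_match(index, "471490\u200b")  # same digits plus a zero-width space

# Indexing the empty result directly is what produces the error.
try:
    index.get("000000", [])[0]
    raised = False
except IndexError:
    raised = True
```

Here `hit` is the matching row, `miss` is `None`, and the unguarded lookup raises Python's equivalent of the IndexOutOfBoundsException.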

You can influence how OpenRefine handles "errors" with the "On error" settings in the dialog.

The preview will still show you the error message, so you have a chance to see the error message, even when you decide to leave cells with errors empty or keep the original values for these cells.

But the cell isn't empty - the value is there.

I would focus on what makes those "about 20" values different from all the other ones which do work. The difference may be in either project, but they differ in a way that keeps OpenRefine from matching them. Leading/trailing whitespace is one common invisible difference which could cause this.
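As a rough illustration of how two values can look identical and still fail to match, here is a small Python sketch (not GREL). The sample values are hypothetical, and Python's `strip()` is only a stand-in: OpenRefine's own trim may treat some characters differently. It shows that trimming fixes ordinary padding, while a zero-width space survives it.

```python
reference = "471490"
padded = " 471490 "        # ordinary leading/trailing spaces
hidden = "471490\u200b"    # trailing zero-width space, invisible on screen

# Trimming removes ordinary whitespace, so this value matches afterwards.
padded_matches = padded.strip() == reference

# A zero-width space is not whitespace to strip(), so this one still differs.
hidden_matches = hidden.strip() == reference

# ascii() escapes non-ASCII characters, which makes the difference visible.
hidden_shown = ascii(hidden)
```

So having trimmed both columns does not by itself rule out an invisible character.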

Tom

Thanks, I had already trimmed both columns. I cannot see any visible difference between them.

Hi @jenyoung

Are you able to share the projects? If so, posting them here may help me or others diagnose the issue.

I'd also suggest:

  • try using value.cross instead of cell.cross and see if that changes (It shouldn't but it's an area where we have seen some issues in previous versions)
  • truncate your expression to cell.cross("Cage Notations ead","id") and report what that shows in the preview - just to confirm that the issue is that no match is being found in the "Cage Notations ead" project
  • In the "Cage Notations ead" project, do a "Unicode char-code" facet on the id column (this is available under Facets -> Customised facets) and look for any outlier characters. This can help point to invisible characters in the affected cells: assuming the column should only contain digits, anything in the facet outside the range 48-57 is a non-numeric character, which would be unexpected
  • try manually editing one of the affected numbers in "Cage Notations ead" (find the cell and edit by hand, clearing the value as it stands and typing in the id again) and see if the lookup works after this - again if this fixes things it would suggest that the issue is some invisible character in the cell
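The idea behind the char-code check can be sketched outside OpenRefine as follows. This is a Python approximation rather than the facet itself; the function names and sample ids are hypothetical. It counts every code point used in a column and reports the ones outside 48-57, the codes of the digits 0-9.

```python
from collections import Counter

def char_code_counts(values):
    """Count every code point used across the values, like a char-code facet."""
    counts = Counter()
    for value in values:
        counts.update(ord(ch) for ch in value)
    return counts

def non_digit_codes(values):
    """Return the code points outside 48-57 (the ASCII digits 0-9)."""
    counts = char_code_counts(values)
    return sorted(code for code in counts if not 48 <= code <= 57)

# Hypothetical ids: one clean, one with a zero-width space, one with a leading space.
ids = ["471490", "471491\u200b", " 471492"]
outliers = non_digit_codes(ids)
```

With this sample, `outliers` contains 32 (space) and 8203 (zero-width space). Any such value in a column of ids points to the cells worth retyping by hand.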

These are just a few of the things I'd try if I were diagnosing the issue in this situation. Please do report back on whether any of this helps, and we can try to suggest other diagnosis/fixing steps.