[OPENREFINE 3.2] Error occurred during file formatting

I have a Spring Boot application that uses OpenRefine 3.2 for file formatting. While the formatting itself works correctly, I’m currently encountering an issue during the creation of an OpenRefine project. The following error is thrown:

java.lang.NumberFormatException: For input string: "null"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:589)
        at java.lang.Long.parseLong(Long.java:631)
        at com.google.refine.commands.Command.getProject(Command.java:171)
        at com.google.refine.commands.project.GetModelsCommand.internalRespond(GetModelsCommand.java:110)
        at com.google.refine.commands.project.GetModelsCommand.doGet(GetModelsCommand.java:66)
        at com.google.refine.RefineServlet.service(RefineServlet.java:182)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
        at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
        at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

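This error appears to be caused by OpenRefine receiving the literal string "null" as the project ID: Command.getProject passes it to Long.parseLong, which fails. In other words, the project does not seem to have been created, and my application ends up with projectId = null.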
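To illustrate the failure in isolation, here is a minimal sketch. It is not OpenRefine's code: the class name ProjectIdCheck and the helper parseProjectId are hypothetical and exist only for this example. It shows that a null reference turned into text becomes the four-character string "null", that Long.parseLong rejects that string with a NumberFormatException (the same exception type as in the trace above), and how a caller could validate the ID before sending a request.

```java
import java.util.OptionalLong;

class ProjectIdCheck {

    // Hypothetical helper: returns the parsed ID, or empty when the value
    // is missing, is the literal text "null", or is not a valid long.
    static OptionalLong parseProjectId(String raw) {
        if (raw == null || raw.isEmpty() || "null".equals(raw)) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(raw));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        // Converting a null reference to text yields the string "null",
        // which is how a URL can end up containing project=null.
        String raw = String.valueOf((Object) null);

        try {
            Long.parseLong(raw);
        } catch (NumberFormatException e) {
            // Same exception type as in the stack trace.
            System.out.println("Parsing failed: " + e.getMessage());
        }

        // The guarded version reports a missing ID instead of throwing.
        System.out.println("ID present: " + parseProjectId(raw).isPresent());
        System.out.println("Valid ID: " + parseProjectId("1234567890123").getAsLong());
    }
}
```

A check like this on the client side would surface the missing ID at the point where project creation fails, rather than later in a follow-up request.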

Notes:

  • The same code works perfectly in the production environment, so this is likely a configuration-related issue.

  • I’m currently running OpenRefine on Windows using the .bat file, and I've attached the configuration file I'm using.

If anyone has any idea how to fix this issue, I’d greatly appreciate your help.

  • refine.ini:
# NOTE: This file is not read if you run the Refine executable directly
# It is only read if you use the refine shell script or refine.bat

no_proxy="localhost,127.0.0.1"
#REFINE_PORT=3334
#REFINE_HOST=127.0.0.1
#REFINE_WEBAPP=main\webapp

# Memory and max form size allocations
#REFINE_MAX_FORM_CONTENT_SIZE=1048576
REFINE_MEMORY=512M
REFINE_MIN_MEMORY=512M
JAVA_MEMORY=2048M
# Some sample configurations. These have no defaults.
# Java options
JAVA_OPTIONS=-Drefine.data_dir="C:\Users\******\openrefine-3.2" -Drefine.headless=true -Djava.net.preferIPv4Stack=true -Dsun.net.client.defaultConnectTimeout=60000 -Dsun.net.client.defaultReadTimeout=60000 -Dhttp.proxyHost= -Dhttp.proxyPort= -Dhttps.proxyHost= -Dhttps.proxyPort=
#JAVA_HOME=C:\Program Files\Java\jdk1.8.0_151
JAVA_HOME=C:\Users\******\jdk-8.0.402.6-hotspot
#JAVA_OPTIONS=-XX:+UseParallelGC -verbose:gc -Drefine.headless=true
#JAVA_OPTIONS=-Drefine.data_dir=C:\Users\user\AppData\Roaming\OpenRefine

# Uncomment to increase autosave period to 60 mins (default: 5 minutes) for better performance of long-lasting transformations
#REFINE_AUTOSAVE_PERIOD=60

Hey @hamdi_ghribi, welcome to the forum! I think there are a couple of ways project data can get corrupted, so it’d be helpful to have some more information.

Are you able to elaborate on the application? Were there any recent changes to how you’re creating and managing projects with OpenRefine? Does the data used in your production environment differ in any way from the data you’re testing with?

On a related note, you mentioned you’re using 3.2, though the most recent version is 3.9.3. Have you tried using a more recent version?

Answers to the questions:

  • Have there been any recent changes in how you create and manage projects?
    No, no changes have been made.

  • Is the data used in production different from the data you’re testing with?
    The data is similar.

  • You mentioned using version 3.2, whereas the latest version is 3.9.3. Have you tried a more recent version?
    I can't, because I’m required to use the same version deployed in the production environment.

Actually, OpenRefine is able to format the files, but it fails to create the OpenRefine project.

Thanks for that context! Are you able to share the steps you’re following to create a project?

As for my comment about using a newer version of OpenRefine, I meant to say that you could try upgrading in both development and production environments. I realize that can be a big task, but I was mostly wondering what we could do as a project to make it easier to upgrade.