Let's get the ball rolling on releasing 3.9.1. The issues I am aware of that should be patched for this version are:
Are you aware of any other issues that need urgent fixing and releasing? Or other bug fixes in the master branch that should be backported too?
Yes, it would be nice to put effort into fixing regressions, like the BOM detection for CSV files. A few users have reported it to me as a pain point in training sessions (FYI, I have also heard privately that some government data used in newsrooms is particularly affected by this): Encoding Regression from other issue #6595 · Issue #7039 · OpenRefine/OpenRefine · GitHub
Doable? Easy?
The other one is that some older OpenRefine projects can no longer be opened since 3.8 landed, because of changes in Jackson. I don't know whether a quick fix for us would be to just set a hardcoded higher limit in the Jackson config?
opened 01:45AM - 18 Jun 24 UTC
Type: Bug
persistence
Priority: High
I recently wanted to go back and run some updated operations on an older SEC dat… a project I originally created in OpenRefine 3.4.1 (or it might have been 3.6, dunno) and this time using 3.8.1 for testing purposes.
One thing I am unsure about: maybe the project itself is slightly damaged, and that leads Jackson to parse a very large value that it shouldn't, as in the error below? Or the project might be just fine, and something else is causing it to hit this limit now, like a Jackson bug that has not been uncovered yet?
NOTE: I did not have this issue with the project using 3.4.1 or 3.6 in 2022 when I created and used it then.
### Current Results
I received the following Java stacktrace when trying to open the file from Project Manager screen:
```
09:09:35.555 [ project_utilities] Failed to load from data file C:\Users\thadg\AppData\Roaming\OpenRefine\2360997867709.project / data.zip (334ms)
com.fasterxml.jackson.databind.JsonMappingException: String value length (20054016) exceeds the maximum allowed (20000000, from `StreamReadConstraints.getMaxStringLength()`) (through reference chain: com.google.refine.model.Row["cells"]->java.util.ArrayList[6]->com.google.refine.model.Cell["v"])
at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:402)
at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:361)
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.wrapAndThrow(BeanDeserializerBase.java:1937)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:572)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:440)
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1493)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:348)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:359)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28)
at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:545)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:570)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:440)
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1493)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:348)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:342)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4899)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3846)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3814)
at com.google.refine.model.Row.loadStreaming(Row.java:225)
at com.google.refine.model.Row.load(Row.java:207)
at com.google.refine.model.Project.loadFromReader(Project.java:233)
at com.google.refine.model.Project.loadFromInputStream(Project.java:193)
at com.google.refine.io.ProjectUtilities.loadFromFile(ProjectUtilities.java:142)
at com.google.refine.io.ProjectUtilities.load(ProjectUtilities.java:121)
at com.google.refine.io.FileProjectManager.loadProject(FileProjectManager.java:270)
at com.google.refine.ProjectManager.getProject(ProjectManager.java:559)
at com.google.refine.commands.Command.getProject(Command.java:180)
at com.google.refine.commands.project.GetProjectMetadataCommand.doGet(GetProjectMetadataCommand.java:54)
at com.google.refine.RefineServlet.service(RefineServlet.java:181)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:750)
at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1410)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:529)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1570)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1384)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1543)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1306)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at com.google.refine.ValidateHostHandler.handle(ValidateHostHandler.java:93)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.Server.handle(Server.java:563)
at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
```
### Expected Behavior
The project loads and shows the data grid in 3.8.1, without an endless spinner in the browser and without a Java stacktrace in the terminal.
### Screenshots
### Versions
- Operating System: Windows 11 Pro
- Browser Version: Edge latest
- JRE or JDK Version: whatever we now ship with 3.8.1 embedded Java
- OpenRefine: 3.8.1
### Datasets
[2360997867709.project.zip](https://github.com/user-attachments/files/15879019/2360997867709.project.zip)
### Additional context
jackson-core [has a configuration option for `StreamReadConstraints` to control the maximum string length](https://github.com/FasterXML/jackson/discussions/217), and the default limit is `20000000` (20M). Could we simply increase that? Or would it be better to add a new preference setting that the Jackson option is read from, with a default in the preference key/value such as `jackson.getMaxStringLength = 100000000` (100M)?
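To make the second option a bit more concrete, here is a minimal sketch of resolving such a preference with a fallback. Everything in it is an assumption for illustration: the class name `MaxStringLengthPreference` is hypothetical, `java.util.Properties` merely stands in for OpenRefine's real preference store, and the key and the 100M default are just the values floated above. The Jackson wiring appears only as a comment so the snippet runs without jackson-core on the classpath; it relies on `StreamReadConstraints`, which exists in jackson-core 2.15 and later.

```java
import java.util.Properties;

public class MaxStringLengthPreference {
    // Preference key and default as suggested above; both are placeholders.
    static final String KEY = "jackson.getMaxStringLength";
    static final int DEFAULT_MAX = 100_000_000;
    // Jackson's built-in limit, as reported in the stack trace.
    static final int JACKSON_DEFAULT = 20_000_000;

    // Returns the configured limit, falling back to DEFAULT_MAX when the
    // preference is missing or not a valid integer.
    static int resolve(Properties prefs) {
        String raw = prefs.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_MAX;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            // Design choice of this sketch: never go below Jackson's own default.
            return Math.max(value, JACKSON_DEFAULT);
        } catch (NumberFormatException e) {
            return DEFAULT_MAX;
        }
    }

    public static void main(String[] args) {
        Properties prefs = new Properties(); // stand-in for the real preference store
        prefs.setProperty(KEY, "250000000");
        int limit = resolve(prefs);
        System.out.println("max string length: " + limit);

        // With jackson-core 2.15+ available, the limit could then be applied
        // roughly like this (not compiled here):
        //   StreamReadConstraints c = StreamReadConstraints.builder()
        //           .maxStringLength(limit).build();
        //   JsonFactory f = JsonFactory.builder().streamReadConstraints(c).build();
        //   ObjectMapper mapper = new ObjectMapper(f);
    }
}
```

Clamping to the built-in 20M means a mistyped small value cannot make things worse than they are today; whether that is the right behaviour is open for discussion.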
I made a PR for the Jackson issue: #7191.
I wonder if @Rory could be tempted to have a look at the encoding issue above? I don't have much mental context around it, but I could also look into it.
Rory
March 6, 2025, 10:11pm
I'd be happy to look into the encoding issue. It looks like the ticket has enough for me to try to reproduce it, so I'll get started and reach out if (or when) I have some follow-up questions.