4.0 architecture and future JDK 20+ compatibility

@antonin_d I just noticed that David Delabassee posted the latest JDK updates to the Jython mailing list.
Of importance are the upcoming changes to Locale handling and the upgrade to Unicode CLDR Version 42.

This may be something to track or think about for the 4.0 architecture, now or later.

Heads-Up - JDK 20 - Support for Unicode CLDR Version 42

The JDK’s locale data is based on the Unicode Consortium’s Unicode
Common Locale Data Repository (CLDR). As mentioned in the December 2022
Quality Outreach newsletter [1], JDK 20 upgraded CLDR [2] to version 42
[3], which was released in October 2022. This version includes a “more
sophisticated handling of spaces” [4] that replaces regular spaces with
non-breaking spaces (NBSP / \u00A0) or narrow non-breaking spaces
(NNBSP / \u202F):

  • in time formats between a and time
  • in unit formats between {0} and unit
  • in Cyrillic date formats before year marker such as г

Other noticeable changes include:

  • " at " is no longer used for standard date/time format [5]
  • fix first day of week info for China (CN) [6]
  • Japanese: Support numbers up to 9999京 [7]

As a consequence, production and test code that produces or parses
locale-dependent strings like formatted dates and times may change
behavior in potentially breaking ways (e.g. when a handcrafted datetime
string with a regular space is parsed, but the parser now expects an
NBSP or NNBSP). Issues can be hard to analyze because expected and
actual strings look very similar or even identical in various text
representations. To detect and fix these issues, make sure to use a text
editor that displays different kinds of spaces differently.
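
To make the parsing problem above concrete, here is a minimal Java sketch, not anything taken from OpenRefine or the JDK itself. The class name `SpaceNormalizer` and its methods are hypothetical. It assumes the input is a 12-hour time such as "10:15 AM" and that mapping NBSP/NNBSP back to a regular space before parsing is acceptable for the use case.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, for illustration only.
public final class SpaceNormalizer {
    private SpaceNormalizer() {}

    // Replaces NBSP (U+00A0) and NNBSP (U+202F) with a regular space.
    public static String normalizeSpaces(String s) {
        return s.replace('\u00A0', ' ').replace('\u202F', ' ');
    }

    // Makes the special spaces visible, e.g. in test failure messages.
    public static String showSpaces(String s) {
        return s.replace("\u00A0", "<NBSP>").replace("\u202F", "<NNBSP>");
    }

    // Parses with an explicit pattern, so the separator between time and
    // AM/PM marker does not come from the JDK's bundled locale data.
    public static LocalTime parseTime(String input) {
        DateTimeFormatter f = DateTimeFormatter.ofPattern("h:mm a", Locale.US);
        return LocalTime.parse(normalizeSpaces(input), f);
    }
}
```

With this, `SpaceNormalizer.parseTime` should accept a regular space, an NBSP, or an NNBSP before the AM/PM marker, and `showSpaces` can be used in assertion messages so that two similar-looking strings are distinguishable.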

If the required fixes can’t be implemented when upgrading to JDK 20,
consider using the JVM argument -Djava.locale.providers=COMPAT to use
legacy locale data. Note that this limits some locale-related
functionality, so treat it as a temporary workaround, not a proper
solution. Moreover, the COMPAT option will eventually be removed.
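
Since the COMPAT workaround is meant to be temporary, it may help to make it visible at startup. The following is a rough sketch under assumptions: the class `LocaleProviderCheck` and the method `usesCompat` are hypothetical names, and the check treats the `java.locale.providers` value as a comma-separated list that it matches against the exact string `COMPAT`.

```java
import java.util.Arrays;

// Hypothetical startup check, for illustration only.
public final class LocaleProviderCheck {
    private LocaleProviderCheck() {}

    // Returns true if the given java.locale.providers value lists COMPAT.
    // Assumes a comma-separated list; a null value means the property is unset.
    public static boolean usesCompat(String propertyValue) {
        if (propertyValue == null) {
            return false;
        }
        return Arrays.stream(propertyValue.split(","))
                .map(String::trim)
                .anyMatch("COMPAT"::equals);
    }

    public static void main(String[] args) {
        String value = System.getProperty("java.locale.providers");
        if (usesCompat(value)) {
            System.err.println("Warning: running with legacy COMPAT locale data");
        }
    }
}
```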

It is also important to keep in mind that this kind of locale data
evolves regularly so programs parsing/composing the locale data by
themselves should be routinely checked with each JDK release.
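
One way to do that kind of routine check is to print which space characters the running JDK actually emits in a formatted value. This is only a sketch: the class `SpaceReport` and the method `spaceCodePoints` are hypothetical, and the choice of a short US time format is an assumption made for the example. The output is deliberately not stated here because it depends on the JDK's locale data.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical diagnostic, for illustration only.
public final class SpaceReport {
    private SpaceReport() {}

    // Lists every space-separator code point in s, in order, as "U+XXXX".
    public static List<String> spaceCodePoints(String s) {
        List<String> found = new ArrayList<>();
        s.codePoints()
         .filter(Character::isSpaceChar)
         .forEach(cp -> found.add(String.format("U+%04X", cp)));
        return found;
    }

    public static void main(String[] args) {
        // Format a fixed time using the locale data bundled with this JDK.
        String formatted = DateTimeFormatter.ofLocalizedTime(FormatStyle.SHORT)
                .withLocale(Locale.US)
                .format(LocalTime.of(15, 30));
        System.out.println(Runtime.version() + " -> " + spaceCodePoints(formatted));
    }
}
```

Running `main` on two JDK versions and comparing the printed code points shows whether the separator changed between them.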

[2] [JDK-8284840] Update CLDR to Version 42.0 - Java Bug System
[3] Unicode CLDR - CLDR 42 Release Note
[4] [CLDR-14032] - Unicode Consortium
[5] [CLDR-14831] - Unicode Consortium
[6] [CLDR-11510] - Unicode Consortium
[7] [CLDR-15966] - Unicode Consortium

Hi @thadguidry, at first glance I do not think those changes should interfere with OpenRefine 3 or 4. I guess we can start thinking about adding JDK 20 to the CI and seeing how our test suite runs there.