Java import ordering

Antonin has proposed changing the package naming from com.google.refine to org.openrefine when we do the next breaking (i.e., backward-incompatible) release. I think that makes sense if we're going to be updating the extension interface and public APIs anyway.

Since we'll be doing wholesale edits of import statements anyway, it'd also be a good time to make any updates to the ordering of import statements. The standard that we currently use is pretty much based on the Apache model: java, javax, all non-project imports, project imports (although it looks like we have a slight variation for the third category, where we list org.* separately before com.*). Static imports weren't a thing back in 2010, but it makes sense to put them first since that's what most folks do.

Google's standard is the utmost in simplicity with just two blocks: static & non-static with a single blank line separating the two.
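For illustration, here is a minimal sketch of what the two-block Google layout looks like in a file. The class and method names are hypothetical and only exist so that every import is used; only JDK classes are imported so the file compiles on its own.

```java
// Google-style layout: all static imports in one block, a single blank line,
// then all non-static imports in one block. Each block is in ASCII order and
// there are no wildcard imports.
import static java.util.Objects.requireNonNull;
import static java.util.stream.Collectors.toList;

import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class GoogleImportLayout {

    /** Upper-cases each name; hypothetical helper so the imports above are used. */
    public static List<String> shout(String... names) {
        requireNonNull(names, "names");
        return Arrays.stream(names)
                .map(n -> n.toUpperCase(Locale.ROOT))
                .collect(toList());
    }

    public static void main(String[] args) {
        System.out.println(shout("java", "javax")); // prints [JAVA, JAVAX]
    }
}
```

Note that there are no blank lines inside either block, so java.*, javax.*, and third-party imports would all run together in the second block.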

IntelliJ has a threshold (default 5) where it switches to wildcard imports, but Google disallows wildcard imports. IntelliJ also lets you always use wildcards for certain packages (e.g. java.awt.*). Since we rarely have a high number of import statements, I think we can just stick with Google's "no wildcard imports" rule.

What do others think? Although sticking with something close to our current standard will minimize noise in the diff, it doesn't make a huge difference given the volume of other changes, so if folks really want to go with the Google scheme or something else we can.

Tom

I don't have strong feelings about import ordering, but I am generally keen to normalize those via a linter. I am not sure if the one we currently use supports that though.

I have looked into this a little more and I think the impsort-maven-plugin would likely be the best bet for this. I also considered the following:

  • our existing formatter plugin does not support reordering imports
  • checkstyle can be used to enforce a particular import order and a Maven plugin is available to run it in the build, however it is only able to check for style violations and not correct them
  • the OpenRewrite Maven plugin can be used to reorder imports, but it's a rather heavyweight solution. It does not really seem to be designed for fast linting but rather to perform one-off migrations (which can be much more complex than reordering imports, hence the heavy machinery). Also, it is able to rewrite much more than Java files, and there does not seem to be a way to restrict it to parsing Java files only (I opened an issue about it)

I'm looking into making a configuration for this plugin which matches the existing order as much as possible. We should have a documented way to configure major IDEs to follow the enforced import order (ideally that setting should be auto-discovered by the IDE).

Thanks for investigating linters! Having complete tooling of IDEs and a linter would be ideal, but I see the IDEs as being the most important piece since they touch the code the most and, ideally, we'd just like the linter to verify that they didn't mess anything up.

The current import ordering was documented in Refine.importorder and the project-specific settings file org.eclipse.jdt.ui.prefs, both of which were deleted in 2018. Unfortunately, Eclipse maintains the import order separately from all the rest of the code style settings, so it doesn't get imported with the current code style XML file that we use. The original ordering was:

java
javax
org
com
com.google
com.google.refine

This gave pride of place to com.google.*, which isn't really justified, and caused edu.* and a few other low-frequency things to collate in weird places. Having org.* collate before com.* was an Apache-ism, I think, but pure alphabetical is simpler.

My suggestion is that we use this order:

java
javax
com.google.refine
org.openrefine

The last two are mutually exclusive, but this covers us both before and after the rename. I also suggest that we set the wildcard import threshold in IntelliJ to 99, effectively disabling wildcard imports (default is 5). Eclipse has only a single "blank line after import groups" setting, which is less flexible than what IntelliJ offers; that's a restriction we need to reflect in the IntelliJ settings.
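To make the proposed grouping concrete, here is a rough sketch of a sorter that applies it. It is not how any of the tools implement this. It ASSUMES that all other packages (com.*, edu.*, org.*, ...) form one alphabetical group between javax and the project packages, which the list above doesn't spell out, and it ignores static imports. The class name, method names, and sample import strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ImportOrderSketch {

    // Groups named in the proposal, in the proposed order.
    private static final List<String> LEADING = List.of("java", "javax");
    private static final List<String> PROJECT = List.of("com.google.refine", "org.openrefine");

    /** True if the import is the package itself or lies somewhere under it. */
    static boolean inPackage(String name, String pkg) {
        return name.equals(pkg) || name.startsWith(pkg + ".");
    }

    /** Rank of the group an import belongs to; lower ranks sort first. */
    static int groupRank(String name) {
        // Check project packages first so com.google.refine is not treated as generic com.*.
        for (int i = 0; i < PROJECT.size(); i++) {
            if (inPackage(name, PROJECT.get(i))) {
                return LEADING.size() + 1 + i;
            }
        }
        for (int i = 0; i < LEADING.size(); i++) {
            if (inPackage(name, LEADING.get(i))) {
                return i;
            }
        }
        // ASSUMPTION: everything else is a single alphabetical group in the middle.
        return LEADING.size();
    }

    /** Returns a sorted copy: by group rank first, then alphabetically within a group. */
    static List<String> sortImports(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort((a, b) -> {
            int byGroup = Integer.compare(groupRank(a), groupRank(b));
            return byGroup != 0 ? byGroup : a.compareTo(b);
        });
        return sorted;
    }

    public static void main(String[] args) {
        // Sample names are illustrative strings only; nothing here is actually imported.
        List<String> imports = List.of(
                "org.openrefine.model.Project",
                "org.apache.commons.lang3.StringUtils",
                "com.google.refine.model.Project",
                "javax.servlet.http.HttpServletRequest",
                "com.fasterxml.jackson.annotation.JsonProperty",
                "java.util.List",
                "edu.mit.simile.butterfly.ButterflyModule");
        sortImports(imports).forEach(System.out::println);
    }
}
```

With the sample input, java.util.List comes first, then the javax entry, then the com.fasterxml, edu.mit, and org.apache entries in alphabetical order, and finally the com.google.refine and org.openrefine entries.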

The new order potentially causes a little churn for com/edu/org, but we tend not to have imports from different package TLDs in the same module. If there are only, e.g., com.* or org.* packages, the ordering will be unchanged.

I haven't used it, but the Eclipse Code Formatter plugin for IntelliJ might provide some benefit if we're going to keep Eclipse settings as the canonical source of truth. Personally, I use IntelliJ, but I'm not sure what other developers use. I suspect EditorConfig might actually be the best way forward.

But back to import ordering - what ordering do people want to use? That's the main thing I wanted to get consensus on. If we can agree on a starting point, we can check to see if all the tools support it.

Tom

I have zero requirements about the particular order, so let's just go for the order you suggest. I am fine with disabling wildcard imports too.