I started drafting the questions for the 2024 user survey here. I based them on the questions from the 2022 survey, with small modifications. The document is open for comments and suggestions. Please add any question you think is relevant. I would like to have a final version of the questions by the end of the month.
We will most likely add a section regarding the Mission, Vision, and Values, as we work with the selected consulting company (I will share more on this soon).
@zoecooper thanks for your feedback, I answered in the document directly.
Regarding your additional questions, should we make them conditional, shown only to those who answered at least 3 out of 5 (so at least occasionally, once per month, or in 50% of their projects) to the following question:
How often do you use the following features? Creating repeatable workflows
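To make the proposed condition concrete, here is a minimal sketch of the skip logic, assuming answers are stored as an integer from 1 to 5. The function name and the threshold constant are hypothetical, and the survey tool we end up using may express this rule differently.

```python
# Hypothetical threshold: 3 out of 5 corresponds to "at least occasionally".
FOLLOW_UP_THRESHOLD = 3


def show_workflow_followups(frequency_rating: int,
                            threshold: int = FOLLOW_UP_THRESHOLD) -> bool:
    """Return True if the respondent should see the follow-up questions.

    frequency_rating is the 1-5 answer to "How often do you use the
    following features? Creating repeatable workflows".
    """
    if not 1 <= frequency_rating <= 5:
        raise ValueError("frequency_rating must be between 1 and 5")
    return frequency_rating >= threshold
```

With this rule, someone answering 2 skips the follow-up questions, while someone answering 3 or more sees them.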
Going through the current questions, I realized that the question "How often do you use the following features? Working with very large datasets" is not very precise.
How should we define what constitutes a large dataset? We can measure it in terms of either the number of rows or the file size. Alternatively, we can ask which users have edited the RAM allocation of OpenRefine as a proxy.
Is OpenRefine's installation on your work computer done by your IT staff or by yourself?
IT Staff
Myself
N/A (not used at work)
This would tell us quite a bit, like how many use it in their job and also who controls the installation environment. This will help us with packaging concerns.
The other thing that I'd like to see partitioned in the data is the work environment OS and the personal environment OS.
OpenRefine is used at work with the OS being:
Windows
Mac
Linux
N/A (I only use it personally)
OpenRefine is used personally and my OS is:
Windows
Mac
Linux
N/A (I only use it at work)
Also, we don't seem to have a way to clearly see if they use OpenRefine at work or personally through a percentage of time used. We only asked "mainly" with one choice? It would be better to simply ask for a percentage of work and personal use through 2 questions instead:
I took a pass through and left some comments in the document, but I was feeling pretty strongly anchored by the previous surveys, so I probably should have done my own clean-sheet version first and then compared it to what's there.
I agree with Thad's comments about work vs non-work usage. One way to capture that might be to expand the "How often do you use OpenRefine?" question to cover both cases separately.
I like the binary voting methodology of allourideas, but I think the quality of the results will depend heavily on getting it seeded well and also on filtering/post-processing the results. The previous survey had people adding duplicates of pre-existing items, and the reported results didn't filter out low-frequency results.
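As a rough illustration of the kind of post-processing I mean, here is a minimal sketch that merges duplicate suggestions and drops low-frequency ones. It assumes the results can be exported as (suggestion text, vote count) pairs; that export format, the function names, and the `min_votes` cutoff are all assumptions for illustration, not something the platform is known to provide in this shape.

```python
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.casefold())
    return " ".join(text.split())


def merge_and_filter(ideas, min_votes=5):
    """Merge suggestions that are identical after normalization,
    then keep only those with at least min_votes votes in total.

    ideas: iterable of (text, votes) pairs (hypothetical export format).
    Returns a list of (text, total_votes) sorted by votes, highest first.
    """
    totals = defaultdict(int)
    labels = {}
    for text, votes in ideas:
        key = normalize(text)
        totals[key] += votes
        # Keep the first wording seen as the display label.
        labels.setdefault(key, text.strip())
    kept = [(labels[key], total) for key, total in totals.items()
            if total >= min_votes]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

This only catches duplicates that differ in case, punctuation, or spacing. Suggestions that say the same thing in different words would still need manual moderation.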
Some other thoughts:
inputs/outputs/transformations are important and we should make sure we capture them well
do we care more about what users' organizations do or what their role/function is?
do we care about organization size (maybe not, but it seems like it could inform about the environment users are working in)
why is there only a single choice which covers all of "for-profit" when OSM gets its own item?
similar to above, I feel that the granularity/range of answers for some of the other questions is off. e.g. do we care more about users with 1 vs 6 months experience than the entire interval between 2 and 14 years? Previous years' results may help inform how these should be skewed.
I wish we had a "Click here to generate an anonymized usage report for the last 12 months" button. That would allow the survey to be focused on qualitative questions. Perhaps next survey...
I rewrote the question regarding professional or non-professional/casual users to use a slider
I replaced the question regarding where OpenRefine is installed with one asking who installed it, with an option for the hosted version. We can infer whether it is run locally or hosted from the answer.
I added the following questions:
Which OS are you using?
With which browser do you use OpenRefine?
Question to capture the input format
Question to capture the output format generated
Questions from @zoecooper regarding how users manage workflows
I would prefer to know the user's role rather than the organization they work for. For instance, I would prefer to know if you are a librarian rather than whether you work in a public library or a university.
I started a document here to list the suggestions for the Allourideas platform. I used answers from the previous survey (minus WikiCommons-specific questions) and results from my user interviews. However, I feel like the list of entries could be endless with the 407 open feature request issues. Note that with Allourideas, we have the option to moderate user suggestions before they are added to the survey.
The majority of the 407 feature request issues (sorted by those with lots of thumbs up), if I take a step back and look at them holistically through categorical grouping, seem to fall into 2 categorical areas:
More power and control for ad hoc editing of the grid in general (inserting rows, adding/removing rows, adding blank columns). Almost like many are asking, can you just make OpenRefine work like a spreadsheet sometimes? Wondering if that's perhaps a new 3rd mode? Rows/Records/Spreadsheet? Feels weird, but that's what I'm seeing in many of those top-ranked issues.
Reconciliation options and general quality of life improvements for Reconciliation.
It is time for a final review to finalize the survey. I accepted all the changes, so it's now easier to read. Bocoup will share their questions regarding OpenRefine's mission, vision, and values by next week. The survey is quite long, but I guess it is okay since we do it every two years. I would appreciate feedback from those who have more experience.
We still need to finalize the list of questions we want to pre-seed in the Allourideas platform. I am keen to hear from the development team (@antonin_d @thadguidry @tfmorris @abbe98 ...) on how to phrase those so we have actionable feedback (for example, what does "Better reconciliation" mean?).