Is OpenRefine a Digital Public Good? YES! err SHOULD BE!

TL;DR - I just don't think a lot of the world that needs us... even knows about us!

CKAN recently received recognition as a Digital Public Good (DPG) from the DPG Alliance.
Wow!

I am thinking this might be a good opportunity for OpenRefine to apply as a DPG as well, especially with version 4.0 eventually coming, where one highlight is that it will finally address running OpenRefine on lower-end hardware.
I think we have all the evidence that they require for submission.
We also, in my opinion, easily fit the DPG Standard.
Finally, I think we align with the United Nations' 17 Sustainable Development Goals (SDGs). In fact, page 6 of their progress report alone has these quotes:

The picture is incomplete due to persistent challenges in securing timely data
across all 169 targets. While progress has been made in improving data for SDG
monitoring, with the number of indicators included in the global SDG database
increasing from 115 in 2016 to 225 in 2022, there are still significant gaps in
geographic coverage, timeliness, and disaggregation. The chart below indicates
that for 9 of the 17 SDGs, only around half of the 193 countries or areas have
internationally comparable data since 2015, and only around 21% of countries
have data for SDG13 (climate action)

Closing data gaps to reap the data dividend will be a key priority for the
UN system in advance of the SDG Summit and beyond.

A major surge in concerted action is needed to ensure
developing countries have access to the financing and technologies needed to
accelerate SDG implementation

We fit into many of their goals at the data collection, data cleaning, and data monitoring level. However, Goal 17 - Partnerships for the Goals is where I think we likely fit best, and most technology partnerships and software stack types were mentioned in some of the videos I watched.

Data, monitoring and accountability
Target 17.18: In 2022, 147 countries and territories reported having national
statistical legislation compliant with the Fundamental Principles of Official
Statistics. In 2022, 156 countries and territories reported implementing a national
statistical plan with 100 of the plans fully funded, compared to 81 countries
implementing a national statistical plan with 17 fully funded in 2016. However,
due to long-lasting impacts of the pandemic and limited human and financial
capacity in strategic planning, many national statistical offices are implementing
expired strategic plans for their statistical activities, which may not fully cover
their evolving development objectives and emerging demands for data.
Target 17.19: International funding for data and statistics amounted to $542
million in 2020, a decrease of over $100 million and $155 million from funding
levels in 2019 and 2018, respectively. This rate is also a decline of 16% since 2015. While this decrease could be partially attributed to pandemic-induced
funding and policy shifts, it could reflect the long-standing challenges in
mainstreaming data activities, the limited pool of donors, and the low strategic
priority of statistics

But I found many more places where systemic data collection and data cleaning efforts were mentioned over the years. (I keyword-searched many of their PDFs for "data" and just browsed through them.)

The challenges that I read about are not only about collecting data, but oftentimes about cleaning and making sense of ground-truth data by reporters, scientists, etc. These are fields where OpenRefine has directly contributed to usage and progress over the years. And we have plans for doing even more with upcoming new features and enhancements (data joining, reproducibility, data validation, and enhanced reconciliation, just to name a few).

@Ainali and @Martin, do you think you could take it from here and apply?

@Martin Maybe a call with SDGS is best? Maybe we could ask for more info about partnerships or the STI Forum? I see that tools like OpenRefine are likely used among many partners, but not all. Perhaps SDGS could somehow leverage additional training and support? I also found Search | SDG Help Desk (unescap.org) and Digital Technologies for the SDGs | SDG Help Desk (unescap.org), an area where I have direct experience from my time at Ericsson working with Internet of Things (IoT) and information and communication technologies (ICT) for environmental data monitoring solutions, communications, etc. Needless to say, data collection, quickly finding outliers, and cleaning are a BIG deal here.

We likely also need to find additional orgs within the UN and its committee member orgs where we might leverage private/public partnerships. Grant opportunities might instead flip around and come directly to us if we reached out more to much larger entities (like UN orgs and others) that often have a direct process involving field data collection where cleaning and wrangling are needed. So environmental orgs, World Health, Geo, etc. all spring to mind.

Lots of good articles/pubs also if you just search “data” on the main SDGS site.

I think this is a good idea, and think OR would be a good fit, both as you say, directly on SDG 17, but also as a meta-level tool to enable work on other goals, specifically reporting on them.

I have applied twice, for Govdirectory and the Standard for Public Code, and having gone through it twice, I would suggest starting an Etherpad or similar so that we can write our application collaboratively. Unfortunately, I am travelling for two weeks and won’t have time to do it now, but I am happy to help review later.

@thadguidry, thanks for sharing this. It is very interesting, and I think it can align well with the current effort on the CZI Diversity Grant.

@Ainali, your experience submitting two projects will be helpful in preparing the submission.

I have started writing down answers to the questions in the submission form in this public Etherpad. Everyone, please feel free to help fill in the blanks.

Added some comments and filled in a few. The legal questions should be discussed in your meetings with CS&S to be absolutely sure on those. They should also help with the final review before submission.