New Reconciliation Service (The Movie Database - TMDB)

Hi all, I built a reconciliation service for OpenRefine that reconciles movies using the API from The Movie Database (TMDB). There is a local version and a public one (running on Render) so that people who are not well-versed in programming can use it too. I have posted the public version on the OpenRefine reconciliation service test bench.

I am a film historian and so developed this for my own use - my coding skills are rudimentary and this is my first time writing something like this. I hope it will be useful for others, and would appreciate any feedback on the implementation! Do let me know if there is anything I should change and what else I could do to get the service out there.

Very cool. Congratulations, and welcome to the community!

I'd be curious to hear what you used as a starting point, if any, and what could be done to make creating reconciliation services like yours easier.

Tom

Thanks, Tom!

To be honest, there was definitely considerable help from AI for the actual coding. Most of my programming before this had been for data analysis (I’d never used something like Flask to build an actual application before).

I feel like a tutorial on how to build a reconciliation service on top of an API that requires a key would be helpful, but I’m also not sure how big the user base for this would be! I was driven to this because reconciliation through Wikidata isn’t very good for films: match rates are low, and wrong matches often come back with high confidence because movies with the same title are very common.
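
To illustrate the same-title problem, here is a minimal sketch of how a service could turn TMDB search results into reconciliation candidates. This is not the code of the service announced above. The function names, the 0.5 year penalty, and the sample records (including their ids) are made up for illustration. It assumes the results look like the `results` list returned by TMDB's movie search (`id`, `title`, `release_date`) and that candidates carry the `id`, `name`, `score`, `match` and `type` fields that OpenRefine reads from a reconciliation response.

```python
from difflib import SequenceMatcher


def score_candidate(query_title, query_year, movie):
    """Score one TMDB-style search result against the query on a 0-100 scale."""
    title = movie.get("title", "")
    # Case-insensitive title similarity in [0, 1].
    similarity = SequenceMatcher(None, query_title.lower(), title.lower()).ratio()
    score = 100.0 * similarity
    # Release dates are assumed to look like "1954-11-03"; the first four
    # characters are the year.
    release_year = (movie.get("release_date") or "")[:4]
    if query_year is not None and release_year and release_year != str(query_year):
        score *= 0.5  # hypothetical penalty for a year mismatch
    return round(score, 1)


def to_reconciliation_result(query_title, query_year, tmdb_results):
    """Build the per-query result object of a reconciliation response."""
    candidates = []
    for movie in tmdb_results:
        year = (movie.get("release_date") or "")[:4]
        title = movie.get("title", "")
        candidates.append({
            "id": str(movie["id"]),
            # Showing the year in the name helps a human tell remakes apart.
            "name": f"{title} ({year})" if year else title,
            "score": score_candidate(query_title, query_year, movie),
            "match": False,
            "type": [{"id": "movie", "name": "Movie"}],
        })
    candidates.sort(key=lambda c: c["score"], reverse=True)
    # Auto-match only when exactly one candidate has a perfect score.
    perfect = [c for c in candidates if c["score"] == 100.0]
    if len(perfect) == 1:
        perfect[0]["match"] = True
    return {"result": candidates}


# Made-up sample data: two films sharing a title, as often happens with remakes.
sample = [
    {"id": 1, "title": "Godzilla", "release_date": "1954-11-03"},
    {"id": 2, "title": "Godzilla", "release_date": "2014-05-14"},
]
with_year = to_reconciliation_result("Godzilla", 1954, sample)
without_year = to_reconciliation_result("Godzilla", None, sample)
```

With the year supplied, only the 1954 record keeps a perfect score and is flagged as a match. Without it, both records tie and neither is auto-matched, so the decision is left to the user instead of being made with false confidence.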

Excellent! Can I ask why you chose TMDB instead of IMDb? Just curious :slight_smile:

We will probably also make use of it for our CDCA project. Thanks!

Having other people use it would make me very happy - if you guys notice anything that could be improved, let me know! The CDCA project sounds really cool!

IMDb is a great source of information, but unfortunately their data sharing policies are terrible. Access to their API alone costs hundreds of thousands of dollars. Incidentally (and since I am already in “here’s something I did” mode :sweat_smile:), I’ve actually written an article about IMDb’s history: how it slowly shifted from a commons-based project into a commercial one and, in the process, became less and less willing to share data that its community of users generates for free.

TMDB, on the other hand, is an open access database that has a free and well-supported API. Though the database itself is not well-known, it is used by a lot of other websites (for instance, Letterboxd). If my project makes more people aware of TMDB, that would make me happy too. They deserve more recognition.

Side note: I just saw that TMDB currently has no Wikipedia page in English, which is crazy (they do for 15 other languages!). If anyone here has some Wikipedia clout (I figure if there’s one group that is likely to have some wiki editors it’s this one!), maybe you can get the decision to delete that reversed?

But since we're all data geeks here, it does have a Wikidata page: https://www.wikidata.org/wiki/Q20828898

True, and most films on Wikidata have a TMDB ID listed in their properties as well.
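
As a rough sketch of how that link could be used, the helper below builds a SPARQL query that looks up the Wikidata item carrying a given TMDB movie ID. The function name is hypothetical, and the property number `P4947` (TMDB movie ID) is written from memory, so please check it on Wikidata before relying on it. The query is only built as a string here; running it against the Wikidata Query Service is left out.

```python
def wikidata_query_for_tmdb_id(tmdb_id):
    """Return a SPARQL query string that finds the film with this TMDB movie ID."""
    tmdb_id = str(tmdb_id).strip()
    # Reject anything non-numeric so arbitrary text never ends up in the query.
    if not tmdb_id.isdigit():
        raise ValueError("TMDB movie IDs are expected to be numeric")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        # Assumption: P4947 is the Wikidata property for the TMDB movie ID.
        f'  ?item wdt:P4947 "{tmdb_id}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


query = wikidata_query_for_tmdb_id(12345)  # 12345 is a placeholder ID
```

Pasting the resulting string into the Wikidata Query Service should return the matching item and its English label, if one exists for that ID.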