This is an awesome piece of work. Thank you!
Years ago, I was a digital investigator for the government, working internet investigations focused on child exploitation. One of the biggest challenges I faced in building systems at that time was identifying images that were similar but not identical.
With images that were exactly the same, it was a simple matter of grabbing a cryptographic hash and doing a comparison. However, any modification, such as scaling or cropping, changed the hash completely and created a computationally difficult obstacle to get around. Over time, with a lot of playing around with various libraries, I developed some hacks that got close enough to hand candidates off for manual review.
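For anyone who hasn't run into this, here is a minimal sketch of the exact-match case and why it breaks down. It assumes SHA-256 as the cryptographic hash and uses made-up bytes as a stand-in for an image file; the helper name `sha256_hex` is just for illustration, not anything from a real system.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the raw bytes of an image file (hypothetical data).
original = bytes(range(256)) * 4

# A byte-for-byte copy hashes to the same digest, so comparison is trivial.
exact_copy = bytes(original)

# Flip a single bit to simulate the smallest possible modification.
tampered = bytearray(original)
tampered[0] ^= 1
modified = bytes(tampered)

same = sha256_hex(original) == sha256_hex(exact_copy)
different = sha256_hex(original) != sha256_hex(modified)

print("exact copy matches:", same)
print("modified file differs:", different)
```

A real resize or crop changes far more than one bit, so the digests share nothing useful and the hash comparison says nothing about how similar two images are.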
A method such as this would have saved hundreds of hours and produced better results. It's amazing the level of tech we currently have at our fingertips with just a few lines of code.
Kudos!