AverageHash image algorithm calculating comparison incorrectly #3295
Reason

It is caused by the image resizing algorithm. The results of resizing the input images to 8x8 with each of the INTER_* methods are shown below, from left to right. Currently img_hash uses the INTER_LINEAR_EXACT method; the resized images are not similar, and the hash values are very different.

How to fix (temporary)

With this input data, the patch below seems to mitigate the problem. However, robustness and performance are generally in a trade-off relationship, and resizing with INTER_AREA inside img_hash may sometimes be worse for performance reasons. So I think it is difficult to submit this suggestion as a merge request.
How to fix (another way)

Another solution is to resize the image before calculating the hash value. Using the mean() function over each ROI seems to work well for these images.
Sample code is here. (This test code does not provide sufficient ROI range validation, so some errors may happen.)

```cpp
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/img_hash.hpp"

#include <iostream>

using namespace cv;
using namespace cv::img_hash;
using namespace std;

Mat resize8x8(Mat &src)
{
    Mat dst(8, 8, CV_8UC1, Scalar(0));
    Mat grayImg;
    cvtColor(src, grayImg, COLOR_BGR2GRAY);

    for (int y = 0; y < 8; y++)
    {
        for (int x = 0; x < 8; x++)
        {
            // Average each source block instead of sampling a few pixels.
            const Rect roi_rect = Rect(
                grayImg.cols * x / 8, grayImg.rows * y / 8,
                grayImg.cols / 8, grayImg.rows / 8);
            const Mat roi(grayImg, roi_rect);
            dst.at<uint8_t>(y, x) = static_cast<uint8_t>(mean(roi)[0]);
        }
    }

#if 0
    {
        // Optionally dump the 8x8 thumbnails for inspection.
        static int n = 0;
        imwrite(cv::format("dst_%d.png", n), dst);
        n++;
    }
#endif
    return dst;
}

int main(int argc, char **argv)
{
    const Ptr<ImgHashBase> func = AverageHash::create();

    Mat a = imread("A.jpg");
    Mat b = imread("B.jpg");
    Mat hashA, hashB;

    // Hash the full-size images (current img_hash behaviour).
    func->compute(a, hashA);
    func->compute(b, hashB);
    cout << "Hash A = " << hashA << endl;
    cout << "Hash B = " << hashB << endl;
    cout << "compare: " << func->compare(hashA, hashB) << endl << endl;

    // Pre-shrink with block averaging before hashing.
    a = resize8x8(a);
    b = resize8x8(b);
    func->compute(a, hashA);
    func->compute(b, hashB);
    cout << "Hash A = " << hashA << endl;
    cout << "Hash B = " << hashB << endl;
    cout << "compare: " << func->compare(hashA, hashB) << endl << endl;

    return 0;
}
```
Thanks @Kumataro, using the following pre-computed resizing in Python gives me much saner results and a much smaller difference:

```python
source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)
```

Now my different test images differ by at most 2 points, where before they would differ by up to 12, even when the image I was comparing against was sourced from a screen capture of my comparison target! (Meaning OpenCV's implementation of pHash is really sensitive to resizing, which makes sense given your explanation and the fix.) To avoid having to install the whole package of contrib/extra modules, here's my final implementation:

```python
from typing import cast

import cv2
import numpy as np
import numpy.typing as npt
import scipy.fftpack
from cv2.typing import MatLike


def __cv2_phash(image: MatLike, hash_size: int = 8, highfreq_factor: int = 4):
    """Implementation copied from https://github.com/JohannesBuchner/imagehash/blob/38005924fe9be17cfed145bbc6d83b09ef8be025/imagehash/__init__.py#L260 ."""  # noqa: E501
    # OpenCV has its own pHash comparison implementation in `cv2.img_hash`,
    # but it requires contrib/extra modules and is inaccurate unless we
    # precompute the size with a specific interpolation.
    # See: https://github.com/opencv/opencv_contrib/issues/3295#issuecomment-1172878684
    #
    # pHash = cv2.img_hash.PHash.create()
    # source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
    # capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)
    # source_hash = pHash.compute(source)
    # capture_hash = pHash.compute(capture)
    # hash_diff = pHash.compare(source_hash, capture_hash)
    img_size = hash_size * highfreq_factor
    image = cv2.cvtColor(image, cv2.COLOR_BGRA2GRAY)
    image = cv2.resize(image, (img_size, img_size), interpolation=cv2.INTER_AREA)
    dct = cast(npt.NDArray[np.float64], scipy.fftpack.dct(scipy.fftpack.dct(image, axis=0), axis=1))
    dct_low_frequency = dct[:hash_size, :hash_size]
    median = np.median(dct_low_frequency)
    return dct_low_frequency > median


source_hash = __cv2_phash(source)
capture_hash = __cv2_phash(capture)
hash_diff = np.count_nonzero(source_hash != capture_hash)
```
Turns out using the built-in `cv2.img_hash` with the pre-resize works:

```python
import cv2
from cv2.typing import MatLike

CV2_PHASH_SIZE = 8


def __cv2_phash(source: MatLike, capture: MatLike):
    """
    OpenCV has its own pHash comparison implementation in `cv2.img_hash`,
    but is inaccurate unless we precompute the size with a specific interpolation.
    See: https://github.com/opencv/opencv_contrib/issues/3295#issuecomment-1172878684
    """
    phash = cv2.img_hash.PHash.create()
    source = cv2.resize(source, (CV2_PHASH_SIZE, CV2_PHASH_SIZE), interpolation=cv2.INTER_AREA)
    capture = cv2.resize(capture, (CV2_PHASH_SIZE, CV2_PHASH_SIZE), interpolation=cv2.INTER_AREA)
    source_hash = phash.compute(source)
    capture_hash = phash.compute(capture)
    hash_diff = phash.compare(source_hash, capture_hash)
    return 1 - (hash_diff / 64.0)


def compare_phash(source: MatLike, capture: MatLike, mask: MatLike | None = None):
    """
    Compares the Perceptual Hash of the two given images and returns the similarity between the two.
    @param source: Image of any given shape as a numpy array
    @param capture: Image of any given shape as a numpy array
    @param mask: An image matching the dimensions of the source, but 1 channel grayscale
    @return: The similarity between the hashes of the image as a number 0 to 1.
    """
    # Apply the mask to the source and capture before calculating the
    # pHash for each of the images. As a result of this, this function
    # is not going to be very helpful for large masks as the images
    # when shrunk down to 8x8 will mostly be the same.
    # (is_valid_image is a helper defined elsewhere in this project.)
    if is_valid_image(mask):
        source = cv2.bitwise_and(source, source, mask=mask)
        capture = cv2.bitwise_and(capture, capture, mask=mask)
    return __cv2_phash(source, capture)
```
System information (version)
Detailed description
The AverageHash image comparison algorithm calculates a Hamming distance that is too large when comparing the following two screenshots. The Hamming distance calculated is 57, although one can see that the images are practically identical, apart from some text toward the bottom. I tried other open-source AverageHash implementations (for example imghash) and received Hamming distances of between 0 and 3.
NOTE: Do not navigate to okta[.]ru[.]com as the domain may be malicious.


Steps to reproduce
One can utilize any of the opencv client libraries to reproduce the behavior. I have tried with gocv and opencv-python.
Below is a simple Python program utilizing opencv-python that can be used to reproduce the issue using the above screenshots.

Issue submission checklist

- [x] I have checked forum.opencv.org, Stack Overflow, etc. and have not found any solution