
AverageHash image algorithm calculating comparison incorrectly #3295


Open
3 tasks done
joshuaherrera opened this issue Jun 29, 2022 · 4 comments
joshuaherrera commented Jun 29, 2022

System information (version)
  • OpenCV => 4.6
  • Operating System / Platform => MacOS Monterey arm64
  • Compiler => clang
Detailed description

The AverageHash image comparison algorithm calculates a Hamming distance that is far too large when comparing the following two screenshots. The computed Hamming distance is 57, even though the images are practically identical apart from some text toward the bottom. I tried other open-source AverageHash implementations (for example imghash) and received Hamming distances between 0 and 3.

NOTE: Do not navigate to okta[.]ru[.]com as the domain may be malicious.
[Screenshots attached: or2crop, or1crop]

Steps to reproduce

One can use any of the OpenCV client libraries to reproduce the behavior; I have tried with gocv and opencv-python.
Below is a simple Python program using opencv-python that reproduces the issue with the above screenshots.

import cv2
import sys

hasher = cv2.img_hash.AverageHash_create()
a1 = cv2.imread(sys.argv[1])
a2 = cv2.imread(sys.argv[2])
a1h = hasher.compute(a1)
a2h = hasher.compute(a2)
diff = hasher.compare(a1h,a2h)
print(f"image1 hash is: {a1h}")
print(f"image2 hash is: {a2h}")
print(f"image hamming distance is: {diff}")
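For context, AverageHash itself is simple once the 8x8 grid exists: threshold each pixel against the image mean and pack the 64 bits into 8 bytes, so all of the algorithm's sensitivity lives in the resize step. A minimal pure-Python sketch (the grid values and the LSB-first packing are illustrative assumptions, not OpenCV's exact internals):

```python
# Pure-Python sketch of the AverageHash core, assuming the input has already
# been reduced to an 8x8 grayscale grid (the resize step is exactly what this
# issue is about). Bit packing is LSB-first per row, one common convention;
# OpenCV's exact packing may differ.

def average_hash_8x8(pixels):
    """pixels: 8 rows of 8 grayscale values -> 8-byte hash."""
    flat = [v for row in pixels for v in row]
    avg = sum(flat) / len(flat)
    bits = [1 if v > avg else 0 for v in flat]  # one bit per pixel
    return bytes(
        sum(bit << i for i, bit in enumerate(bits[r * 8:(r + 1) * 8]))
        for r in range(8)
    )

# Hypothetical grid: two bright pixels in the top row, everything else dark.
grid = [[0] * 8 for _ in range(8)]
grid[0][3] = grid[0][4] = 255
print(list(average_hash_8x8(grid)))  # -> [24, 0, 0, 0, 0, 0, 0, 0]
```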
Issue submission checklist
  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues,
    forum.opencv.org, Stack Overflow, etc and have not found any solution
  • I updated to the latest OpenCV version and the issue is still there

Kumataro commented Jul 2, 2022

Reason

This is caused by the image-resizing algorithm.

https://github.com/opencv/opencv_contrib/blob/4.6.0/modules/img_hash/src/average_hash.cpp#L29

        cv::resize(input, resizeImg, cv::Size(8,8), 0, 0, INTER_LINEAR_EXACT);

The results of resizing the input images to 8x8 with each of the INTER_* methods are shown below.

Currently img_hash uses the INTER_LINEAR_EXACT method. The resized images are not similar, and the hash values are very different.

https://docs.opencv.org/4.6.0/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121

(From left to right: NEAREST, LINEAR, CUBIC, AREA, LANCZOS4, LINEAR_EXACT, NEAREST_EXACT.)

[Image: 8x8 resize results for each interpolation method]

How to fix (temporary)

With this input data, the patch below seems to mitigate the problem.

        cv::resize(input, resizeImg, cv::Size(8,8), 0, 0, INTER_AREA);
kmtr@kmtr-virtual-machine:~/work/studyC3295/A$ ./a.out A.jpg B.jpg
[[[0]
grayImg [ 43,  43,  43,  54,  54,  43,  43,  43;
  43,  43,  48,  49,  49,  47,  43,  43;
  43,  43,  52,  56,  41,  44,  43,  43;
  43,  43,  55,  69,  62,  50,  43,  43;
  43,  43,  54,  56,  51,  50,  43,  43;
  43,  43,  54,  66,  45,  45,  43,  43;
  43,  43,  47,  48,  48,  47,  43,  43;
  43,  43,  43,  44,  44,  43,  43,  43]
[[[1]
grayImg [ 43,  43,  43,  54,  54,  43,  43,  43;
  43,  43,  48,  49,  49,  47,  43,  43;
  43,  43,  52,  56,  41,  44,  43,  43;
  43,  43,  55,  69,  62,  50,  43,  43;
  43,  43,  54,  56,  51,  50,  43,  43;
  43,  43,  54,  59,  49,  45,  43,  43;
  43,  43,  47,  48,  48,  47,  43,  43;
  43,  43,  43,  44,  44,  43,  43,  43]
Hash A = [ 24,  28,  12,  60,  60,  12,  24,   0]
Hash B = [ 24,  60,  12,  60,  60,  28,  60,   0]
compare: 4
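For reference, the compare value is nothing more than the Hamming distance over the 64 hash bits; a pure-Python check against the hash values printed in this thread:

```python
# The compare value is the Hamming distance between the two 8-byte hashes:
# XOR each pair of bytes, then count the set bits.

def hamming(hash_a, hash_b):
    return sum(bin(a ^ b).count("1") for a, b in zip(hash_a, hash_b))

# Hashes from the run with the AREA-resize patch applied:
print(hamming([24, 28, 12, 60, 60, 12, 24, 0],
              [24, 60, 12, 60, 60, 28, 60, 0]))        # -> 4
# Hashes from the unpatched run reported in this issue:
print(hamming([0, 0, 0, 0, 0, 24, 0, 0],
              [255, 255, 231, 231, 231, 247, 255, 255]))  # -> 57
```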

However, robustness and performance are generally in a trade-off relationship.

I believe resizing with AREA may not always be the better choice for img_hash, for performance reasons.

So I think it would be difficult to submit this suggestion as a merge request.


Kumataro commented Jul 2, 2022

How to fix (another way)

Another solution is to resize the image before computing the hash value.

Using the mean() function seems to work well for these images.

kmtr@kmtr-virtual-machine:~/work/studyC3295/B$ ./a.out
Hash A = [  0,   0,   0,   0,   0,  24,   0,   0]
Hash B = [255, 255, 231, 231, 231, 247, 255, 255]
compare: 57

Hash A = [ 24,  28,  12,  60,  60,  12,  24,   0]
Hash B = [ 24,  28,  12,  60,  60,  28,  24,   0]
compare: 1

Sample code is below. (This test code does not provide sufficient ROI range validation, so errors may occur for some inputs.)

#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/img_hash.hpp"

#include <iostream>

using namespace cv;
using namespace cv::img_hash;
using namespace std;

Mat resize8x8(Mat &src)
{
    Mat dst(8,8,CV_8UC1,Scalar(0));

    Mat grayImg;
    cvtColor( src, grayImg, COLOR_BGR2GRAY );

    for(int y = 0 ; y < 8 ; y ++ )
    {
        for(int x = 0 ; x < 8 ; x ++ )
        {
            const Rect roi_rect = Rect
            (
                grayImg.cols * x / 8, grayImg.rows * y / 8,
                grayImg.cols / 8,     grayImg.rows / 8
            );
            const Mat roi(grayImg, roi_rect );

            dst.at<uint8_t>(y,x) = saturate_cast<uint8_t>( mean( roi )[0] );
        }
    }
#if 0
{
static int n = 0;
imwrite(cv::format("dst_%d.png",n), dst);
n++;
}
#endif

    return dst;
}

int main(int argc, char **argv)
{
    const Ptr<ImgHashBase> func = AverageHash::create();

    Mat a = imread("A.jpg");
    Mat b = imread("B.jpg");
    Mat hashA, hashB;

    func->compute(a, hashA);
    func->compute(b, hashB);

    cout << "Hash A = " << hashA << endl;
    cout << "Hash B = " << hashB << endl;

    cout << "compare: " << func->compare(hashA, hashB) << endl << endl;

    a = resize8x8(a);
    b = resize8x8(b);

    func->compute(a, hashA);
    func->compute(b, hashB);

    cout << "Hash A = " << hashA << endl;
    cout << "Hash B = " << hashB << endl;

    cout << "compare: " << func->compare(hashA, hashB) << endl << endl;
    return 0;
}
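The block-mean trick in resize8x8 can also be sketched without OpenCV at all. A minimal pure-Python version, under the simplifying assumption that the image dimensions divide evenly by 8; each output pixel is the integer mean of its source block, which is in spirit what INTER_AREA downsampling computes:

```python
# Pure-Python block-mean reduction to 8x8, mirroring the resize8x8 idea above.
# Assumes rows % 8 == 0 and cols % 8 == 0 for simplicity; the C++ version
# uses integer ROI math to handle arbitrary sizes.

def resize8x8_mean(gray, rows, cols):
    """gray: `rows` lists of `cols` grayscale ints -> 8x8 grid of block means."""
    bh, bw = rows // 8, cols // 8
    out = []
    for by in range(8):
        row = []
        for bx in range(8):
            block = [gray[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            row.append(sum(block) // len(block))  # integer mean of the block
        out.append(row)
    return out
```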

Avasam commented Dec 13, 2023

Thanks @Kumataro. Using the following pre-resizing in Python gives me much saner results, and a much smaller difference from imagehash's Pillow-based implementation:

source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)

Now my test images differ by at most 2 points, where before they would differ by up to 12 when the image I was comparing against was sourced from a screen capture of my comparison target (meaning OpenCV's implementation of pHash is really sensitive to resizing, which makes sense given your explanation and the fix).


To avoid having to install the whole package of contrib/extra modules, here's my final implementation:

from typing import cast

import cv2
import numpy as np
import numpy.typing as npt
import scipy.fftpack
from cv2.typing import MatLike

def __cv2_phash(image: MatLike, hash_size: int = 8, highfreq_factor: int = 4):
    """Implementation copied from https://github.com/JohannesBuchner/imagehash/blob/38005924fe9be17cfed145bbc6d83b09ef8be025/imagehash/__init__.py#L260 ."""  # noqa: E501
    # OpenCV has its own pHash comparison implementation in `cv2.img_hash`, but it requires the contrib/extra
    # modules and is inaccurate unless we pre-resize the images with a specific interpolation.
    # See: https://github.com/opencv/opencv_contrib/issues/3295#issuecomment-1172878684
    #
    # pHash = cv2.img_hash.PHash.create()
    # source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
    # capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)
    # source_hash = pHash.compute(source)
    # capture_hash = pHash.compute(capture)
    # hash_diff = pHash.compare(source_hash, capture_hash)

    img_size = hash_size * highfreq_factor
    image = cv2.cvtColor(image, cv2.COLOR_BGRA2GRAY)
    image = cv2.resize(image, (img_size, img_size), interpolation=cv2.INTER_AREA)
    dct = cast(npt.NDArray[np.float64], scipy.fftpack.dct(scipy.fftpack.dct(image, axis=0), axis=1))
    dct_low_frequency = dct[:hash_size, :hash_size]
    median = np.median(dct_low_frequency)
    return dct_low_frequency > median
    
source_hash = __cv2_phash(source)
capture_hash = __cv2_phash(capture)
hash_diff = np.count_nonzero(source_hash != capture_hash)

Avasam commented Jun 15, 2025

Turns out using opencv-contrib-python-headless (given I was already using opencv-python-headless before) actually results in PyInstaller onefile builds that are ~22MB smaller compared to importing that single scipy function. So here's my updated implementation:

import cv2
from cv2.typing import MatLike

CV2_PHASH_SIZE = 8


def __cv2_phash(source: MatLike, capture: MatLike):
    """
    OpenCV has its own pHash comparison implementation in `cv2.img_hash`,
    but it is inaccurate unless we pre-resize the images with a specific interpolation.

    See: https://github.com/opencv/opencv_contrib/issues/3295#issuecomment-1172878684
    """
    phash = cv2.img_hash.PHash.create()
    source = cv2.resize(source, (CV2_PHASH_SIZE, CV2_PHASH_SIZE), interpolation=cv2.INTER_AREA)
    capture = cv2.resize(capture, (CV2_PHASH_SIZE, CV2_PHASH_SIZE), interpolation=cv2.INTER_AREA)
    source_hash = phash.compute(source)
    capture_hash = phash.compute(capture)
    hash_diff = phash.compare(source_hash, capture_hash)
    return 1 - (hash_diff / 64.0)


def compare_phash(source: MatLike, capture: MatLike, mask: MatLike | None = None):
    """
    Compares the Perceptual Hash of the two given images and returns the similarity between the two.

    @param source: Image of any given shape as a numpy array
    @param capture: Image of any given shape as a numpy array
    @param mask: An image matching the dimensions of the source, but 1 channel grayscale
    @return: The similarity between the hashes of the image as a number 0 to 1.
    """
    # Apply the mask to the source and capture before calculating the
    # pHash for each of the images. As a result of this, this function
    # is not going to be very helpful for large masks as the images
    # when shrunk down to 8x8 will mostly be the same.
    if is_valid_image(mask):
        source = cv2.bitwise_and(source, source, mask=mask)
        capture = cv2.bitwise_and(capture, capture, mask=mask)

    return __cv2_phash(source, capture)
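As a sanity check on the 1 - hash_diff / 64.0 conversion above, plugging in the two distances seen earlier in this thread shows the scale of the fix (a standalone sketch, not part of the original code):

```python
# 1 - hash_diff / 64 maps a 64-bit Hamming distance onto a 0..1 similarity.

def phash_similarity(hash_diff, hash_bits=64):
    """Convert a Hamming distance over `hash_bits` bits to a similarity score."""
    return 1 - hash_diff / hash_bits

print(phash_similarity(57))  # -> 0.109375 (unpatched distance: barely similar)
print(phash_similarity(1))   # -> 0.984375 (after pre-resizing: near-identical)
```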
