How do we check similarity between hash values of two audio files in Python?

Question

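About the data: we have 2 identical video files. Their audio is also the same but differs in quality: one is 128kbps and the other is 320kbps.

We used ffmpeg to extract the audio from the videos and generated a hash value for each audio file with the following command:

ffmpeg -loglevel error -i 320kbps.wav -map 0 -f hash -

The output is:

SHA256=4c77a4a73f9fa99ee219f0019e99a367c4ab72242623f10d1dc35d12f3be726c

We did the same for the other audio file that we need to compare:

C:\FFMPEG>ffmpeg -loglevel error -i 128kbps.wav -map 0 -f hash -
SHA256=f8ca7622da40473d375765e1d4337bdf035441bbd01187b69e4d059514b2d69a

Now we know that these audio files and their hash values are different, but we want to know how different or similar they actually are, e.g. something like a distance of 3 between a and b.

Can someone help?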

Answer 1

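You cannot use a SHA256 hash for this. That is intentional: if you could, it would weaken the security of the hash. What you are suggesting is akin to differential cryptanalysis, and SHA256 is a modern cryptographic hash designed to resist such attacks.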

Answer 2

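You cannot use a cryptographic hash like SHA-256 to measure the distance between two audio files. Cryptographic hashes are deliberately designed to be unpredictable and, ideally, to reveal no information about the input that was hashed.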
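As a minimal illustration of that unpredictability, assuming nothing beyond Python's standard `hashlib`, the sketch below hashes two inputs that differ in a single bit and counts how many of the 256 digest bits differ. The helper name `sha256_bit_distance` is made up for this example. For a hash like SHA-256 the count should land somewhere near half of the bits, no matter how small the change to the input is, so the number says nothing useful about how similar the inputs are.

```python
import hashlib


def sha256_bit_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between the SHA-256 digests of a and b."""
    digest_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    digest_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(digest_a ^ digest_b).count("1")


# Two 1024-byte inputs that differ in exactly one bit.
original = b"\x00" * 1024
flipped = b"\x01" + b"\x00" * 1023

# Identical inputs always give identical digests.
print(sha256_bit_distance(original, original))

# A one-bit change scrambles roughly half of the 256 output bits.
print(sha256_bit_distance(original, flipped))
```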

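However, there are many suitable acoustic fingerprinting algorithms that accept a segment of audio and return a fingerprint vector. You can then measure the similarity of two audio clips by seeing how close together their corresponding fingerprint vectors are.

Choosing an acoustic fingerprinting algorithm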

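Chromaprint is a popular open source acoustic fingerprinting algorithm with bindings and reimplementations in many popular languages. Chromaprint is used by the AcoustID project, which is building an open source database to collect fingerprints and metadata for popular music.

The researcher Joren Six has also written and open-sourced the acoustic fingerprinting libraries Panako and Olaf. However, both are currently licensed under AGPLv3 and may infringe on US patents that are still in force.

Several companies, such as Pex, sell APIs for checking if arbitrary audio files contain copyrighted material. If you sign up for Pex, they will give you their closed-source SDK for generating acoustic fingerprints according to their algorithm.

Generating and comparing fingerprints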

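I am going to assume that you chose Chromaprint and that you want to compare fingerprints using Python, although the general principles apply to other fingerprinting libraries. You will have to install libchromaprint and an FFT library.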

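Install libchromaprint or the fpcalc command line tool.
Install the pyacoustid Python library from PyPI. It will look for your existing installation of libchromaprint or fpcalc.
Normalize your audio files to remove differences that could confuse Chromaprint, such as silence at the beginning of an audio file. Also keep in mind that Chromaprint ...
Although I typically measure the distance between vectors using NumPy, many Chromaprint users compare two audio files by computing the xor function between the fingerprints and counting the number of 1 bits.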
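If you installed only the fpcalc command line tool, one option is to call it directly instead of going through the bindings. The following is a rough sketch, not a definitive implementation. It assumes an fpcalc build that supports the `-raw` and `-json` options and that its JSON output carries the raw integers under a `"fingerprint"` key. The helper names `parse_fpcalc_json` and `fpcalc_fingerprint` are hypothetical.

```python
import json
import subprocess
from typing import List


def parse_fpcalc_json(output: str) -> List[int]:
    """
    Extracts the raw fingerprint from fpcalc's JSON output.

    Assumption: the JSON object has a "fingerprint" key holding a
    list of integers, which is what `fpcalc -raw -json` is expected
    to print.
    """
    data = json.loads(output)
    # Mask to 32 bits so that signed output (e.g. from `-signed`)
    # is treated as unsigned.
    return [int(value) & 0xFFFFFFFF for value in data["fingerprint"]]


def fpcalc_fingerprint(filename: str) -> List[int]:
    """Runs fpcalc on a file; requires fpcalc to be on your PATH."""
    result = subprocess.run(
        ["fpcalc", "-raw", "-json", filename],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_fpcalc_json(result.stdout)
```

The parsing is kept separate from the subprocess call so that it can be checked without fpcalc installed.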

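Here is some relatively simple Python code for computing the distance between two fingerprints. If I were building a production service, though, I would implement the comparison in C++ or Rust.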

from operator import xor
from typing import List

# These imports should be in your Python module path
# after installing the `pyacoustid` package from PyPI.
import acoustid
import chromaprint


def get_fingerprint(filename: str) -> List[int]:
    """
    Reads an audio file from the filesystem and returns a
    fingerprint.

    Args:
        filename: The filename of an audio file on the local
            filesystem to read.

    Returns:
        Returns a list of 32-bit integers. Two fingerprints can
        be roughly compared by counting the number of
        corresponding bits that are different from each other.
    """
    _, encoded = acoustid.fingerprint_file(filename)
    fingerprint, _ = chromaprint.decode_fingerprint(
        encoded
    )
    return fingerprint


def fingerprint_distance(
    f1: List[int],
    f2: List[int],
    fingerprint_len: int,
) -> float:
    """
    Returns a normalized distance between two fingerprints.

    Args:
        f1: The first fingerprint.

        f2: The second fingerprint.

        fingerprint_len: Only compare the first `fingerprint_len`
            integers in each fingerprint. This is useful
            when comparing audio samples of a different length.

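    Returns:
        Returns a number between 0.0 and 1.0 representing
        the distance between two fingerprints, i.e. the
        fraction of compared bits that differ.
    """
    if fingerprint_len <= 0:
        raise ValueError("fingerprint_len must be positive")
    max_hamming_weight = 32 * fingerprint_len
    # Mask each XOR result to 32 bits so that a negative value
    # (if the decoder returns signed integers) does not confuse
    # the bit count.
    hamming_weight = sum(
        bin(xor(f1[i], f2[i]) & 0xFFFFFFFF).count("1")
        for i in range(fingerprint_len)
    )
    return hamming_weight / max_hamming_weight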

The functions above let you compare two fingerprints as follows:

>>> f1 = get_fingerprint("1.mp3")
>>> f2 = get_fingerprint("2.mp3")
>>> f_len = min(len(f1), len(f2))
>>> fingerprint_distance(f1, f2, f_len)
0.35 # for example
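The comparison above lines the two fingerprints up from their first integer. If one file has extra leading silence, the fingerprints may be shifted relative to each other, which inflates the distance. One rough workaround is to try a small range of offsets and keep the smallest distance. This is a sketch of my own under that assumption, not part of Chromaprint or pyacoustid; the names `bit_distance` and `best_offset_distance` and the `max_offset` and `min_overlap` parameters are hypothetical.

```python
from typing import List, Tuple


def bit_distance(f1: List[int], f2: List[int]) -> float:
    """Fraction of differing bits over the overlapping prefix."""
    n = min(len(f1), len(f2))
    if n == 0:
        return 1.0  # nothing to compare, treat as maximally distant
    differing = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1")
        for a, b in zip(f1, f2)
    )
    return differing / (32 * n)


def best_offset_distance(
    f1: List[int],
    f2: List[int],
    max_offset: int = 10,
    min_overlap: int = 4,
) -> Tuple[float, int]:
    """
    Tries shifting f1 against f2 by up to `max_offset` integers in
    either direction and returns (smallest distance, offset).
    A positive offset means f1 has extra integers at its start.
    """
    best = (1.0, 0)
    for offset in range(-max_offset, max_offset + 1):
        if offset >= 0:
            a, b = f1[offset:], f2
        else:
            a, b = f1, f2[-offset:]
        # Skip alignments with too little overlap to be meaningful.
        if min(len(a), len(b)) < min_overlap:
            continue
        distance = bit_distance(a, b)
        if distance < best[0]:
            best = (distance, offset)
    return best


# Toy data: `shifted` is `base` with two extra integers in front.
base = [0x0F0F0F0F, 0x12345678, 0xDEADBEEF, 0x00000000,
        0xFFFFFFFF, 0xCAFEBABE, 0x13579BDF, 0x2468ACE0]
shifted = [0xAAAAAAAA, 0x55555555] + base
print(best_offset_distance(shifted, base, max_offset=3))
```

If you prefer a similarity score over a distance, `1.0 - distance` gives the fraction of matching bits.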

You can read more about how to use Chromaprint to compute the distance between different audio files. This mailing list thread describes the theory of how to compare Chromaprint fingerprints. This GitHub Gist offers another implementation.


