OpenCV, vocabulary matrix obtained by BOWKMeansTrainer

Question

I'm trying to implement BoF by following this code. In particular, this piece of code:

//featuresUnclustered contains all the feature descriptors of all images
//Construct BOWKMeansTrainer
//the number of bags
int dictionarySize=200;
//define Term Criteria
TermCriteria tc(CV_TERMCRIT_ITER,100,0.001);
//retries number
int retries=1;
//necessary flags
int flags=KMEANS_PP_CENTERS;
//Create the BoW (or BoF) trainer
BOWKMeansTrainer bowTrainer(dictionarySize,tc,retries,flags);
//cluster the feature vectors
cout<<"starting k-means..."<<endl;
Mat dictionary=bowTrainer.cluster(featuresUnclustered);    
//store the vocabulary
FileStorage fs("dictionary.yml", FileStorage::WRITE);
fs << "vocabulary" << dictionary;
fs.release();

I obtain the file dictionary.yml in this format:

%YAML:1.0
vocabulary: !!opencv-matrix
   rows: 200
   cols: 128
   dt: f
   data: [ 8.19999981e+00, 1.20000005e+00, 1., 24., 5.82000008e+01,
   ...
   ]

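Now, my question is: each row represents a centroid (and we have 200 centroids, given by dictionarySize) and since SIFT's descriptor size is 128 bit, each centroid has the same dimension. Is that correct?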

Answer 1

Each row represents a centroid (and we have 200 centroids, given by dictionarySize) and since SIFT's descriptor size is 128 bit, each centroid has the same dimension. Is that correct?

Yes, that's correct.

Well, a SIFT descriptor has 128 values (not bits). In OpenCV, each value is a float, i.e. 32 bits. But yes, each centroid has 128 values.
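To make the values-versus-bits distinction concrete, here is a small sketch of the arithmetic. The names kSiftValues, descriptorBytes and descriptorBits are illustrative and are not OpenCV identifiers.

```cpp
#include <climits>
#include <cstddef>

// Illustrative constant (not an OpenCV identifier):
// a SIFT descriptor holds 128 values per keypoint.
constexpr std::size_t kSiftValues = 128;

// Bytes occupied by one descriptor when each value is stored as a float.
constexpr std::size_t descriptorBytes() {
    return kSiftValues * sizeof(float);
}

// Bits occupied by one descriptor when each value is stored as a float.
constexpr std::size_t descriptorBits() {
    return descriptorBytes() * CHAR_BIT;
}
```

With a 4-byte float this gives 512 bytes, or 4096 bits, per descriptor, which is why "128 bit" understates the size.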

The K of k-means (dictionarySize) is the number of centroids. Each centroid has the same dimension N as the feature you use, so 128 for SIFT.

The dictionary will be a K x N matrix, in this case 200 x 128.
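As a rough illustration of that layout, the sketch below models the dictionary as a flat list of K * N floats, assuming the values are stored row by row as in the data list of the YAML file. The Dictionary struct and the centroid function are hypothetical stand-ins written in plain C++, not cv::Mat or any OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the K x N dictionary (not cv::Mat).
struct Dictionary {
    std::size_t rows;         // K, the number of centroids
    std::size_t cols;         // N, the number of values per centroid
    std::vector<float> data;  // rows * cols values, stored row by row
};

// Returns a copy of centroid k, i.e. row k of the matrix.
std::vector<float> centroid(const Dictionary& d, std::size_t k) {
    assert(k < d.rows);
    assert(d.data.size() == d.rows * d.cols);
    const auto first = d.data.begin() + static_cast<std::ptrdiff_t>(k * d.cols);
    return std::vector<float>(first, first + static_cast<std::ptrdiff_t>(d.cols));
}
```

For the file in the question, rows would be 200 and cols 128, so data would hold 25600 values and centroid(d, k) would return the 128 values of the k-th visual word.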

Keep in mind that the BoW histogram (the global descriptor computed using the dictionary) will have K values.
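A minimal sketch of why the histogram has K values: each local descriptor votes for its nearest centroid, so there is one bin per centroid regardless of N. This is plain C++ with hypothetical helper names (squaredDistance, bowHistogram), it keeps raw counts with no normalization, and it does not use OpenCV's BOWImgDescriptorExtractor, which is the class meant for this step.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;

// Squared Euclidean distance between two vectors of equal length.
float squaredDistance(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper: for each centroid, counts how many descriptors
// are closest to it. The result has one bin per centroid (K bins).
std::vector<int> bowHistogram(const std::vector<Vec>& centroids,
                              const std::vector<Vec>& descriptors) {
    std::vector<int> histogram(centroids.size(), 0);
    if (centroids.empty()) {
        return histogram;  // no centroids, so there is nothing to vote for
    }
    for (const Vec& d : descriptors) {
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t k = 0; k < centroids.size(); ++k) {
            const float dist = squaredDistance(d, centroids[k]);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        ++histogram[best];
    }
    return histogram;
}
```

With the 200 x 128 dictionary from the question, an image with any number of SIFT descriptors would map to a histogram of 200 bins.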

c++

opencv

k-means