In a set of possibly noisy data, given that I know the real data should have evenly spaced peaks, how can I detect the desired data using MATLAB?
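I have a set of measurements that should, in theory, store only the power peaks arriving at a receiver. I know these peaks should occur at 4-second intervals (approximately, at least, since in a real scenario I expect some deviation).
The problem is that the system can also receive random data from sources other than the one I am studying, or echoes from the same source, as in the example image:
In this image, the blue data is the real data and the red data is random data that should be ignored.
What is the best way, using MATLAB (and perhaps some statistics), to detect the data most likely to be the desired data?
(Sometimes the "parasite" data can also be spaced 4 seconds apart, if it is an echo.)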
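The following code finds the time tags that belong to the longest series whose gaps are close to multiples of 4.
The algorithm assumes that the series may have missing samples (it does not search for continuity).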
%T is the X coordinate of your graph (time tags).
%Note: the amplitude is irrelevant here.
T = [1, 2, 5, 6, 7, 10, 12, 14];
%Create all possible combinations of indexes of T.
[Y, X] = meshgrid(1:length(T));
%G is the matrix of all pairwise gaps:
%T(1) - T(1), T(2) - T(1), T(3) - T(1)...
%Computing every gap (including the reversed ones and T(1) - T(1)) is inefficient,
%but it is a common way to solve problems in MATLAB.
G = T(X) - T(Y);
%Ignore the sign of the gaps.
G = abs(G);
%Zero out all gaps that are not a multiple of 4, within a tolerance of 0.1.
%This removes gaps like 5, 11, and 12.7...
G((mod(G, 4) > 0.1) & (mod(G, 4) < 3.9)) = 0;
%C is a counter vector: it counts the nonzero gaps in each column.
%C(n) is the number of other time tags whose gap to T(n) is close to a multiple of 4.
C = sum(G > 0, 1);
%Only indexes that belong to the maximum series are valid.
ind = (C == max(C));
%Result: the time tags that belong to the longest series.
resT = T(ind)
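As a rough sketch of the same idea with the period and tolerance pulled out as parameters: the names `period` and `tol` are illustrative and not part of the code above, and the `T - T.'` expression assumes MATLAB R2016b or later (implicit expansion). Instead of testing `mod` against two thresholds, it measures how far each gap is from the nearest multiple of the period.

```matlab
% Sketch only: 'period' and 'tol' are assumed parameter names.
T = [1, 2, 5, 6, 7, 10, 12, 14];
period = 4;   % expected spacing between real peaks
tol = 0.1;    % allowed deviation from a multiple of the period

% All pairwise gaps (implicit expansion, assumes R2016b or later).
G = abs(T - T.');

% Distance of each gap from the nearest multiple of the period.
d = abs(G - period * round(G / period));

% A pair is compatible if its gap is a nonzero near-multiple of the period.
ok = (d <= tol) & (G > tol);

% Count the compatible partners of each time tag.
C = sum(ok, 1);

% Keep the time tags that have the most partners.
resT = T(C == max(C));
disp(resT)
```

For the sample `T`, the counts are `[1 3 1 3 0 3 0 3]`, so the selected time tags are 2, 6, 10 and 14. As with the code above, if two unrelated series happen to tie for the maximum count, both are returned.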
Note:
If you are looking for the longest series without gaps, you can use the following code:
%T must be sorted in ascending order.
T = [1, 2, 5, 6, 7, 10, 12, 14];
len = length(T);
%C(i) counts the gaps of about 4 in the series that starts at T(i).
C = zeros(1, len);
for i = 1:len-1
    j = i;
    k = i+1;
    while (k <= len)
        gap = T(k) - T(j);
        if (abs(gap - 4) < 0.1)
            C(i) = C(i) + 1; %Increase series counter.
            %Continue searching from the new end of the series.
            j = k;
            k = j+1;
        else
            k = k+1;
        end
        if (gap > 4.1)
            %Break the series if the gap is above 4.1.
            break;
        end
    end
end
%Now find(C == max(C)) is the index of the beginning of the longest contiguous series.
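The loop above only reports where the longest series starts. A minimal sketch that also collects the members of that series could look like the following; `period`, `tol`, `chain` and `best` are names introduced here for illustration, and `T` is assumed to be sorted in ascending order.

```matlab
% Sketch only: 'period', 'tol', 'chain' and 'best' are assumed names.
T = [1, 2, 5, 6, 7, 10, 12, 14];   % must be sorted ascending
period = 4;
tol = 0.1;

best = [];
for i = 1:numel(T)
    chain = T(i);                  % series starting at T(i)
    for k = i+1:numel(T)
        gap = T(k) - chain(end);   % gap to the current end of the series
        if abs(gap - period) < tol
            chain(end+1) = T(k);   % extend the series
        elseif gap > period + tol
            break;                 % next expected sample is missing
        end
    end
    if numel(chain) > numel(best)
        best = chain;              % keep the longest series found so far
    end
end
disp(best)
```

For the sample `T`, the longest series is 2, 6, 10, 14. When several series have the same length, this sketch keeps the first one it finds.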