How to exclude NaNs from ranking a vector

Question

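We are working on MATLAB code for ranking stocks. Our dataset is incomplete, so we have to deal with some NaNs. However, in the code we use for sorting, the NaNs end up ranked highest. Our aim is to exclude the NaNs from the ranking. How can this be done?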

Consider the following example with Y and stockid:

Y = [1.2 1.3 NaN 0.9 0.95 NaN 0.8 0.7];
stockid = [801 802 803 804 805 806 807 808];
[totalmonths,totalstocks] = size(Y);
nbrstocks = totalstocks - sum(isnan(Y));
[B,I] = sort(Y,'descend');
ncandidates = 4;
idwinner(1:ncandidates) = stockid(I(1:ncandidates));

Running the program gives:

Y =

    1.2000    1.3000       NaN    0.9000    0.9500       NaN    0.8000    0.7000
idwinner =

   803   806   802   801

So 803 corresponds to NaN, 806 corresponds to NaN, 802 corresponds to 1.3, and so on.

The result we want looks like this:

Y =

    1.2000    1.3000       NaN    0.9000    0.9500       NaN    0.8000    0.7000
idwinner =

   802   801   805   804

So, how can we exclude the NaNs from the ranking?

Answer 1

Use

Y(isnan(Y)) = -inf;

before calling sort. This changes the NaN values to -inf, so those values will be the lowest in a descending sort.

Alternatively, if you don't want to change any values in Y, you can use an intermediate index like this:

Y = [1.2 1.3 NaN 0.9 0.95 NaN 0.8 0.7];
stockid = [801 802 803 804 805 806 807 808];

ind = find(~isnan(Y)); % intermediate index that tells which elements are numbers
[B,I] = sort(Y(ind),'descend');
ncandidates = 4;
idwinner(1:ncandidates) = stockid(ind(I(1:ncandidates))); % apply intermediate index
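
As a quick check, the two approaches above can be run side by side on the question's data. This is only a sketch: Ycopy, winnersA and winnersB are hypothetical names introduced here, and the -inf replacement is applied to a copy so that Y itself stays untouched.

```matlab
Y = [1.2 1.3 NaN 0.9 0.95 NaN 0.8 0.7];
stockid = [801 802 803 804 805 806 807 808];
ncandidates = 4;

% Approach 1: replace NaN by -Inf in a copy (Ycopy is a hypothetical name)
Ycopy = Y;
Ycopy(isnan(Ycopy)) = -Inf;
[~, I1] = sort(Ycopy, 'descend');
winnersA = stockid(I1(1:ncandidates));

% Approach 2: sort only the non-NaN entries through an intermediate index
ind = find(~isnan(Y));
[~, I2] = sort(Y(ind), 'descend');
winnersB = stockid(ind(I2(1:ncandidates)));

disp(winnersA)
disp(winnersB)
```

Both variants select 802, 801, 805 and 804 here. They differ when ncandidates exceeds the number of non-NaN entries: the first would then return stocks with missing data, while the second would fail with an index out of range.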

Answer 2

After your sort statement, add the line: I = I(~isnan(B)); which removes the indices associated with NaNs before you select from stockid.
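
A minimal sketch of that suggestion applied to the question's data. The min(...) guard on ncandidates is an addition for illustration, not part of the original answer.

```matlab
Y = [1.2 1.3 NaN 0.9 0.95 NaN 0.8 0.7];
stockid = [801 802 803 804 805 806 807 808];

[B, I] = sort(Y, 'descend');      % in a descending sort the NaNs come first
I = I(~isnan(B));                 % keep only indices whose sorted value is a number

ncandidates = min(4, numel(I));   % guard: never ask for more stocks than have data
idwinner = stockid(I(1:ncandidates));
disp(idwinner)
```

Because the filter acts on the index vector I rather than on Y, the NaNs remain in Y for any later processing.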

Answer 3

 I = I(~isnan(B));

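works best for us, because it does not overwrite the NaNs the way

 Y(isnan(Y)) = -inf;

does. This matters because we later also have to determine the loser portfolio from the stocks with the lowest returns. That does not work with the latter code, because all the NaN entries, rather than the stocks with actual data, would then have the lowest returns.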
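
To illustrate the point about the loser portfolio, here is a small sketch on the question's data. The names nper, idloser, Ycopy and idloserInf are hypothetical, and a portfolio size of 2 is assumed so that winners and losers do not overlap among the six stocks with data.

```matlab
Y = [1.2 1.3 NaN 0.9 0.95 NaN 0.8 0.7];
stockid = [801 802 803 804 805 806 807 808];
nper = 2;                                   % hypothetical portfolio size

% Filtering the index vector: NaN stocks are excluded at both ends
[B, I] = sort(Y, 'descend');
I = I(~isnan(B));
idwinner = stockid(I(1:nper));              % highest returns
idloser  = stockid(I(end-nper+1:end));      % lowest returns among real data

% Replacing NaN by -Inf (on a copy): the NaN stocks end up as the "losers"
Ycopy = Y;
Ycopy(isnan(Ycopy)) = -Inf;
[~, J] = sort(Ycopy, 'descend');
idloserInf = stockid(J(end-nper+1:end));

disp(idwinner)
disp(idloser)
disp(idloserInf)
```

With the filtered index the losers are 807 and 808, the two stocks with the lowest actual returns. With the -inf replacement the bottom two slots go to 803 and 806, the stocks with missing data.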
