How to fill a 3D array with contents from two other 3D arrays with unequal first and second dimensions

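I have two 3D arrays of data from two different vendors. For both arrays, the dimensions are:

Dimension 1: dates

Dimension 2: instruments (different futures deliveries)

Dimension 3: six instrument attributes (open, high, low, close, volume, openInterest)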

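For each 3D array, I have two variables that hold its dates and instruments (e.g., A1Times and A1Inst in my code).

However, although there is significant overlap, the dates and instruments in the two arrays are not identical. Some dates and/or instruments may exist in Array1 but not in Array2, and vice versa.

I am trying to create Array3, a third 3D data array whose first dimension is the union of the dates from both sources, whose second dimension is the union of the available instruments, and whose third dimension is the same six instrument attributes.

I'd like to populate Array3 from Array2 whenever possible, and fall back to Array1 only when Array2 has nothing. So, for a given instrument and date, if data exists in both Array1 and Array2, I want to populate Array3 from Array2.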

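I tried a solution that converts slices of the arrays to timetables, uses retime to bring the slices onto the same set of times, and copies the data into the third array. It is slow, and I think there must be a better way. I would appreciate it if someone could show me a vectorized way to do this.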

Array1 = randn(4,5,6); % time x instrument x attribute
A1Times = datetime([today-3:today]', 'ConvertFrom','datenum'); % times of first dimension of Array1
A1Inst = [3 4 5 6 7]';    % instruments of second dimension of Array1
Array1(round(1 + (numel(Array1)-1).*rand(round(numel(Array1)/5),1))) = NaN; % put a few random NaNs in the array

Array2 = randn(6,8,6);
A2Times = datetime([today-2:today+3]','ConvertFrom','datenum'); % times of first dimension of Array2
A2Inst = [1 2 5 6 7 8 9 10]'; % instruments of second dimension of Array2
Array2(round(1 + (numel(Array2)-1).*rand(round(numel(Array2)/5),1))) = NaN; % put a few random NaNs in the array

% third dimension will always be the same for both matrices

dateUnion = union(A1Times,A2Times);
instrumentUnion = union(A1Inst,A2Inst);

% Initialize A3:

Array3 = NaN(numel(dateUnion),numel(instrumentUnion),6);

% what I want to do:
% if data exists in both Array1 and Array2, populate Array3 with data from Array2
% if data doesn't exist in Array2 and does exist in Array1, populate Array3 from Array1


%% clumsy retime solution, with two for loops

A1varnames = matlab.lang.makeValidName(cellstr([repmat('Array1Instrument',numel(A1Inst),1) num2str(A1Inst)]));
A2varnames = matlab.lang.makeValidName(cellstr([repmat('Array2Instrument',numel(A2Inst),1) num2str(A2Inst)]));

for ij = 1:6 % looping through third dimension

    A1layer = array2timetable(Array1(:,:,ij),'RowTimes',A1Times);
    A1layer.Properties.VariableNames = A1varnames;

    A2layer = array2timetable(Array2(:,:,ij),'RowTimes',A2Times);
    A2layer.Properties.VariableNames = A2varnames;

    A1layer = retime(A1layer,dateUnion);
    A2layer = retime(A2layer,dateUnion);

    for ii = 1:numel(instrumentUnion)
        [~,A1loc] = ismember(instrumentUnion(ii),A1Inst);
        [~,A2loc] = ismember(instrumentUnion(ii),A2Inst);

        if (A1loc == 0)
            Array3(:,ii,ij) = A2layer{:,A2loc};
        elseif A2loc == 0
            Array3(:,ii,ij) = A1layer{:,A1loc};
        else % if instrument exists in both sources
            A1vec = A1layer{:,A1loc};
            A2vec = A2layer{:,A2loc};
            % if data exists in Array2 and Array1, choose Array2
            % if data exists in Array2 and not Array1, choose Array2
            % if data exists in Array1 and not Array2, choose Array1
            bothpopulated = ~isnan(A1vec) & ~isnan(A2vec);
            onlyA2populated = ~isnan(A2vec) & isnan(A1vec);
            onlyA1populated = isnan(A2vec) & ~isnan(A1vec);
            Array3(bothpopulated,ii,ij) = A2vec(bothpopulated);
            Array3(onlyA2populated,ii,ij) = A2vec(onlyA2populated);
            Array3(onlyA1populated,ii,ij) = A1vec(onlyA1populated);
        end
    end
end

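First, you need to map AxTimes and AxInst to consecutive integers so that they can be used for multidimensional array indexing. The third output of unique gives you those indices. After that, you only need logical and multidimensional array indexing to assign the values. Here I have simplified your example and changed A1Times and A2Times to plain numbers.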
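As a small illustration of that mapping step, here is a sketch with made-up label vectors (a and b are hypothetical names, not part of the question's data). It only shows how the third output of unique turns labels into positions within the sorted union; the simplified example itself follows after it.

```matlab
% Hypothetical label vectors standing in for A1Inst and A2Inst.
a = [30 40 50].';
b = [10 40 60].';

% u is the sorted union of the labels; u(iu) reproduces [a; b].
[u, ~, iu] = unique([a; b]);

ia = iu(1:numel(a));        % positions of a's labels within u
ib = iu(numel(a)+1:end);    % positions of b's labels within u
```

Because ia and ib are ordinary integer indices, they can be used directly as row or column subscripts into an array laid out on the union grid.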

Array1 = randn(4,5,6);
A1Times = [1 2 3 4].';
A1Inst = [3 4 5 6 7].';
Array1(round(1 + (numel(Array1)-1).*rand(round(numel(Array1)/5),1))) = NaN;

Array2 = randn(6,8,6);
A2Times = [3 4 5 6 7 8].';
A2Inst = [1 2 5 6 7 8 9 10].'; 
Array2(round(1 + (numel(Array2)-1).*rand(round(numel(Array2)/5),1))) = NaN;

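% Map every date/instrument label to its position in the sorted union.
% The first numel(A1...) entries of iut/iui belong to Array1, the rest to Array2.
[ut,~,iut] = unique([A1Times; A2Times]);
[ui,~,iui] = unique([A1Inst; A2Inst]);

% Start from an all-NaN array on the union grid
Array3 = NaN(numel(ut), numel(ui), 6);

% Array2 has priority, so copy it in first
Array3(iut(numel(A1Times)+1:end), iui(numel(A1Inst)+1:end), :) = Array2;

% Cells covered by Array1 that are still NaN after copying Array2
idx3 = false(size(Array3));
idx3(iut(1:numel(A1Times)), iui(1:numel(A1Inst)), :) = true;
idx3 = idx3 & isnan(Array3);

% The same mask expressed in Array1's own coordinates
idx1 = idx3(iut(1:numel(A1Times)), iui(1:numel(A1Inst)), :);

% Fill the gaps from Array1. The element order on both sides matches only
% when A1Times and A1Inst are sorted ascending, as they are here.
Array3(idx3) = Array1(idx1);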
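
If you would rather keep the datetime values and the union calls from the question, the following is a minimal sketch of a variant, not a drop-in replacement. It assumes union and ismember accept datetime arrays (they do in recent MATLAB releases), and it uses small deterministic inputs that are made up for illustration. The names Full1, Full2 and useA1 are hypothetical. Both sources are expanded onto the union grid first, so the result does not depend on the labels being sorted, at the cost of two temporary arrays.

```matlab
% Made-up deterministic inputs (not the asker's data).
A1Times = datetime(2024,1,1) + days(0:3).';   % Jan 1 .. Jan 4
A1Inst  = [3 4 5 6 7].';
Array1  = reshape(1:4*5*6, 4, 5, 6);

A2Times = datetime(2024,1,2) + days(0:5).';   % Jan 2 .. Jan 7
A2Inst  = [1 2 5 6 7 8 9 10].';
Array2  = -reshape(1:6*8*6, 6, 8, 6);
Array2(1,3,1) = NaN;   % Jan 2, instrument 5 is missing in Array2

dateUnion = union(A1Times, A2Times);
instUnion = union(A1Inst, A2Inst);

% Row/column positions of each source on the union grid
[~, r1] = ismember(A1Times, dateUnion);
[~, c1] = ismember(A1Inst,  instUnion);
[~, r2] = ismember(A2Times, dateUnion);
[~, c2] = ismember(A2Inst,  instUnion);

% Expand both sources onto the union grid, NaN where a source has no data
nAttr = size(Array1, 3);
Full1 = NaN(numel(dateUnion), numel(instUnion), nAttr);
Full2 = Full1;
Full1(r1, c1, :) = Array1;
Full2(r2, c2, :) = Array2;

% Prefer Array2; fall back to Array1 wherever Array2 is NaN
Array3 = Full2;
useA1 = isnan(Full2);
Array3(useA1) = Full1(useA1);
```

Since Full1 and Full2 share the same shape, a single logical mask addresses matching cells in both, which avoids having to line up two differently shaped masks.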