Counting and displaying sum of occurrences

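Part of my data (a cell array of strings) is shown below. I want to count the occurrences of specific strings (e.g. 'P0702', 'P0882', etc.) and display the total number of occurrences in the output format shown below: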

'1FA'   '2012'  'F' ''  ''  ''  ''  ''  'P0702' 'P0882' 
'1Fc'   '2012'  'r' ''  ''  ''  ''  ''  'P0702' ''  ''  ''  
'1FA'   '2012'  'f' ''  ''  ''  ''  ''  'P0702' 'P0882' ''  
'1FA'   '2012'  'y' ''  ''  ''  'P0702' ''  ''  ''  ''  ''  
'1FA'   '2012'  'g' ''  ''  ''  ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'u' ''  'P0702' 'P0882' ''  ''  ''  ''  ''  
'1FA'   '2012'  'y' ''  'P0702' ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'n' ''  'P0702' ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'j' ''  ''  ''  ''  ''  ''  ''  ''  'P0702'                                
'1FA'   '2012'  'u' 'P0702' ''  ''  ''  ''  ''  ''  ''  ''  
'1FM'   '2013'  'x' ''  ''  ''  ''  ''  'P1921' ''  ''  ''
'1FM'   '2013'  'c' ''  'P1711' ''  ''  ''  ''  ''  ''  ''
'1FM'   '2013'  'c' ''  ''  ''  ''  ''  'P0702' 'P0882' ''
'1FM'   '2009'  'E' ''  ''  ''  ''  ''  ''  ''  'P0500' 

Output:

        sum of counts above      
P0702   15
P0500    1
P1711    1

and so on.

I tried using sum(strcmp(d,{'P0882'}),2); to tell me how many times 'P0882' occurs, but it is hard to apply this to every string in the data.

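You can do the following: basically apply strcmp as you suggested, but inside a loop, after first determining the unique strings (data names) to count.

I modified the data you provided slightly so that every row has the same number of columns. The code is commented and easy to follow: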

C = {'1FA'   '2012'  'F' ''  ''  ''  ''  ''  'P0702' 'P0882' ;
'1Fc'   '2012'  'r' ''  ''  ''  ''  ''  'P0702' '';
'1FA'   '2012'  'f' ''  ''  ''  ''  ''  'P0702' 'P0882';
'1FA'   '2012'  'y' ''  ''  ''  'P0702' ''  ''  '';
'1FA'   '2012'  'g' ''  ''  ''  ''  ''  ''  '';
'1FA'   '2012'  'u' ''  'P0702' 'P0882' ''  ''  ''  ''  ;
'1FA'   '2012'  'y' ''  'P0702' ''  ''  ''  ''  '' ;
'1FA'   '2012'  'n' ''  'P0702' ''  ''  ''  ''  '' ;
'1FA'   '2012'  'j' ''  ''  ''  ''  ''  ''  'P0702' ;  
'1FA'   '2012'  'u' 'P0702' ''  ''  ''  ''  '' '' ;
'1FM'   '2013'  'x' ''  ''  ''  ''  ''  'P1921' '';
'1FM'   '2013'  'c' ''  'P1711' ''  ''  ''  ''  '';
'1FM'   '2013'  'c' ''  ''  ''  ''  ''  'P0702' 'P0882';
'1FM'   '2009'  'E' ''  ''  ''  ''  ''  '' 'P0500'}

%// Find the unique strings whose occurrences we want to count.
strings = unique(C(:,4:end));

%// Remove empty cells automatically.
strings = strings(~cellfun(@isempty,strings));

%// Initialize output cell array
Output = cell(numel(strings),2);

%// Count occurrences. The two lines in the loop can be combined into one using concatenation.
for k = 1:numel(strings)

    Output{k,1} = strings{k};    
    Output{k,2} = sum(sum(strcmp(C(:,4:end),strings{k})));

end

Let's put the result into a nice table:

T = table(Output(:,2),'RowNames',Output(:,1),'VariableNames',{'TotalOccurences'})

Output:

T = 

             TotalOccurences
             _______________

    P0500    [ 1]           
    P0702    [10]           
    P0882    [ 4]           
    P1711    [ 1]           
    P1921    [ 1]
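
The counts above are displayed in brackets because Output(:,2) is a cell column. As a minimal sketch, assuming you would rather have a plain numeric column, the counts can be collected in a numeric vector before building the table. The small cell array C below is a hypothetical stand-in for the full data, and the names X, codes, counts and TotalOccurrences are chosen only for this illustration:

```matlab
% Hypothetical 3-row sample with the same layout as the question's data.
C = {'1FA' '2012' 'F' ''      'P0702' 'P0882';
     '1Fc' '2012' 'r' 'P0702' ''      '';
     '1FM' '2013' 'x' ''      'P1921' ''};

% Keep only the columns that hold the codes, then list the unique non-empty codes.
X = C(:,4:end);
codes = unique(X(~cellfun(@isempty, X)));

% Count each code into a numeric column vector instead of a cell array.
counts = zeros(numel(codes), 1);
for k = 1:numel(codes)
    counts(k) = nnz(strcmp(X, codes{k}));
end

% A numeric variable is displayed without brackets.
T = table(counts, 'RowNames', codes, 'VariableNames', {'TotalOccurrences'})
```

With a numeric column, expressions such as T.TotalOccurrences > 1 work directly, without a cell2mat step.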

If you don't have access to the table function, you can create a cell array with headers and change the loop slightly:

%// Initialize output cell array
Output = cell(numel(strings)+1,2);

%// Count occurrences
for k = 1:numel(strings)

    Output{k+1,1} = strings{k};    
    Output{k+1,2} = sum(sum(strcmp(C(:,4:end),strings{k})));

end
%T = table(Output(:,2),'RowNames',Output(:,1),'VariableNames',{'TotalOccurences'})

Output(1,:) = {'Data' 'Occurence'}

Output:

Output = 

    'Data'     'Occurence'
    'P0500'    [        1]
    'P0702'    [       10]
    'P0882'    [        4]
    'P1711'    [        1]
    'P1921'    [        1]
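
The same header-plus-counts cell array can also be assembled without an explicit for loop. The following is only a sketch of that idea using cellfun; the small cell array C is a hypothetical stand-in for the full data, and the names X, codes and counts are chosen for this illustration:

```matlab
% Hypothetical 3-row sample with the same layout as the question's data.
C = {'1FA' '2012' 'F' ''      'P0702' 'P0882';
     '1Fc' '2012' 'r' 'P0702' ''      '';
     '1FM' '2013' 'x' ''      'P1921' ''};

% Keep only the code columns and list the unique non-empty codes (column cell).
X = C(:,4:end);
codes = unique(X(~cellfun(@isempty, X)));

% For every code, count how many cells of X match it exactly.
counts = cellfun(@(s) nnz(strcmp(X, s)), codes);

% Header row on top, then one row per code.
Output = [{'Data' 'Occurrence'}; codes num2cell(counts)]
```

cellfun still calls strcmp once per unique code, so this is mainly a matter of compactness rather than speed.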

If you have the Statistics Toolbox, you can simply use tabulate (here data is the cell array from the question):

%// get only relevant part
X = data(:,4:end);

%// tabulate
tabulate(X(:))

It already gives nicely formatted output:

  Value    Count   Percent
  P0702       10     58.82%
  P1711        1      5.88%
  P0882        4     23.53%
  P1921        1      5.88%
  P0500        1      5.88%

Or, using standard functions:

X = data(:,4:end)
[a,~,x] = unique(X(~strcmp(X,'')))
occ = hist(x(:),1:numel(a))
out = [a num2cell(occ).']
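
MathWorks no longer recommends hist for new code, so here is a hedged variant of the same idea that counts the index vector returned by unique with accumarray instead. The small cell array data is a hypothetical stand-in for the full data set:

```matlab
% Hypothetical 3-row sample with the same layout as the question's data.
data = {'1FA' '2012' 'F' ''      'P0702' 'P0882';
        '1Fc' '2012' 'r' 'P0702' ''      '';
        '1FM' '2013' 'x' ''      'P1921' ''};

% Keep only the relevant columns and drop the empty strings.
X = data(:,4:end);
[a, ~, x] = unique(X(~strcmp(X, '')));

% x(i) is the position in a of the i-th non-empty string,
% so summing ones per index gives the number of occurrences.
occ = accumarray(x(:), 1);

% Pair each unique string with its count.
out = [a num2cell(occ)]
```

accumarray returns a column vector here, so no transpose is needed before concatenating with a.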

You can count the occurrences of all strings without a loop. Let C be your cell array.

[uniqueStrings, ~, v] = unique(C);
counts = histc(v, 1:max(v));
result = [uniqueStrings(:) num2cell(counts(:))];

For your example, this gives

result = 
    ''         [81]
    '1FA'      [ 9]
    '1FM'      [ 4]
    '1Fc'      [ 1]
    '2009'     [ 1]
    '2012'     [10]
    '2013'     [ 3]
    'E'        [ 1]
    'F'        [ 1]
    'P0500'    [ 1]
    'P0702'    [10]
    'P0882'    [ 4]
    'P1711'    [ 1]
    'P1921'    [ 1]
    'c'        [ 2]
    'f'        [ 1]
    'g'        [ 1]
    'j'        [ 1]
    'n'        [ 1]
    'r'        [ 1]
    'u'        [ 2]
    'x'        [ 1]
    'y'        [ 2]
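
The result above also counts the empty strings and the first three columns. If only the codes matter, one possible refinement is to restrict the input to columns 4 onward and drop the empty cells before calling unique. This sketch assumes MATLAB R2014b or later for histcounts, and the small cell array C is a hypothetical stand-in for the full data:

```matlab
% Hypothetical 3-row sample with the same layout as the question's data.
C = {'1FA' '2012' 'F' ''      'P0702' 'P0882';
     '1Fc' '2012' 'r' 'P0702' ''      '';
     '1FM' '2013' 'x' ''      'P1921' ''};

% Restrict to the code columns and remove the empty cells.
X = C(:,4:end);
X = X(~cellfun(@isempty, X));

% v maps every remaining cell to its position in uniqueStrings.
[uniqueStrings, ~, v] = unique(X);

% One bin per unique string, centered on the integers 1..n.
edges = 0.5:1:numel(uniqueStrings) + 0.5;
counts = histcounts(v, edges);

result = [uniqueStrings(:) num2cell(counts(:))]
```

On older releases without histcounts, the histc call from the answer above can be kept and only the filtering step added.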