Percentage calculation of answers in a Matlab Cell - (multiple answer, single row)

Question

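I am doing statistical analysis in MATLAB and have run into a small problem. I need to calculate the percentage of correct answers for a particular question. I store the answers in a cell array. Here it is: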

myStruct.nm_answers=
'${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas' % = participant1
 NaN % = participant 2
''${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83},${e://Field/n91},${e://Field/n99},${e://Fiel...'' <Preview truncated at 128 characters>'          % = participant 3
''${e://Field/n11},${e://Field/n19},${e://Field/n43},${e://Field/n59},${e://Field/n67},${e://Field/n83},${e://Field/n107},${e://Fi...'' <Preview truncated at 128 characters>' %= participant 4
 ...
% goes until participant 150

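Each row of the cell array holds one participant's answers. This preview shows 4 participants. It looks messy, I know, because I recorded all of a participant's answers in a single row. (I have one multiple-choice question with 40 options, and every selected option is recorded in that one row.)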

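I have 20 wrong and 20 correct options, so my multiple-choice question has 40 different options. Every answer that starts with ${e://Field/ counts as a correct answer, and every name such as Sam or Thomas (see participant 1) counts as a wrong answer.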

In addition, I also want to count the options that were not selected. So 20 - (# of correct answers) counts as "should have selected", and 20 - (# of wrong answers) counts as "should not have selected".

I need to calculate the rate of correct responses for each participant.

It would be = (# of should not have been selected + # of correct answers)/40.
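As a quick check of this formula, here is the arithmetic for participant 1 from the preview, who ticked three correct options and two wrong ones (Sam, Thomas). The variable names are illustrative only and do not appear in the original post.

```matlab
% Worked example for participant 1 (variable names are illustrative).
nCorrectOptions = 20;   % correct options offered in the question
nWrongOptions   = 20;   % wrong options offered in the question

nCorrectTicked = 3;     % ${e://Field/n11}, ${e://Field/n99}, ${e://Field/n147}
nWrongTicked   = 2;     % Sam, Thomas

shouldHaveSelected  = nCorrectOptions - nCorrectTicked;   % 17 correct options missed
correctlyUnselected = nWrongOptions - nWrongTicked;       % 18 wrong options left unticked

rate = (correctlyUnselected + nCorrectTicked) / (nCorrectOptions + nWrongOptions);
% rate is 21/40 = 0.525
```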

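I cannot use the find function to get the count for each condition (correct, wrong, should have selected, ...). It throws an error because the variable is a cell array.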

 correctansw=length(find(myStruct.nm_answers == '${e://Field/n'));

    Undefined operator '==' for input arguments of type 'cell'.

Also, I cannot use the strcmp function, because all of a participant's answers are stored together in a single (row, column) cell.

What should I do?
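To illustrate why the comparison fails and what a prefix comparison could look like: == is not defined for cell arrays, and strcmp compares whole strings, so it cannot see the individual answers inside one comma-separated row. One possible sketch, assuming the answers are always separated by commas, is to split a row first and then test each piece with strncmp. The names row, parts and prefix are made up for this example.

```matlab
% One participant's row, copied from the preview above.
row = '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';

% Split on commas and trim the spaces around each answer.
parts = strtrim(strsplit(row, ','));

% An answer counts as correct when it starts with the ${e://Field/ prefix.
prefix    = '${e://Field/';
isCorrect = strncmp(parts, prefix, length(prefix));

nCorrect = sum(isCorrect);    % 3 for this row
nWrong   = sum(~isCorrect);   % 2 for this row (Sam, Thomas)
```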

My answer

I combined the two answers I received; here is my code for this problem:

numberCorrect = cellfun(@(x) length(strfind(x, 'e://Field/')), myStruct.nm_answers); % correct answers

numberanswers = cellfun(@(x) length(strfind(x, ',')), myStruct.nm_answers) + 1; % all answers

numberanswers(7,1)=0; numberanswers(15,1)=0; % ... (and so on for the other NaN rows)
% since I did +1, the NaN rows would otherwise count as 1 answer
numberofUncorrect = numberanswers-numberCorrect;
correctunticks= 20- numberofUncorrect;

myStruct.nm_perc= (correctunticks+numberCorrect)/40 ;

myStruct.nm_perc(7,1)= NaN;
myStruct.nm_perc(15,1)= NaN;
myStruct.nm_perc(38,1)= NaN;
myStruct.nm_perc(74,1)= NaN;
myStruct.nm_perc(105,1)= NaN;

clear numberanswers numberCorrect numberofUncorrect correctunticks

Since I only have 5 NaNs, I could do this by hand, but in the future I will use @TomasoBelluzzo's NaN code. It is a cleaner and quicker way!
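For reference, the hard-coded row numbers could also be replaced by a logical mask. The following is only a sketch on toy data, not the code from Answer 2: nm_answers stands in for myStruct.nm_answers, hasText and the other names are invented here, and it assumes that every text row is non-empty and that anything that is not text means "missing".

```matlab
% Toy data standing in for myStruct.nm_answers (3 participants, one missing).
nm_answers = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    NaN;
    '${e://Field/n3},${e://Field/n11}'
};

% Rows that hold text; everything else (here NaN) is treated as missing.
hasText = cellfun(@ischar, nm_answers);

% Count only on the text rows, so strfind never sees a numeric NaN.
numberCorrect = zeros(size(nm_answers));
numberAnswers = zeros(size(nm_answers));
numberCorrect(hasText) = cellfun(@(x) numel(strfind(x, 'e://Field/')), nm_answers(hasText));
numberAnswers(hasText) = cellfun(@(x) numel(strfind(x, ',')), nm_answers(hasText)) + 1;

numberWrong    = numberAnswers - numberCorrect;
correctUnticks = 20 - numberWrong;

nm_perc = (correctUnticks + numberCorrect) / 40;
nm_perc(~hasText) = NaN;   % missing participants get NaN instead of a score
```

With this mask, adding or removing participants does not require editing any row indices.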

Answer 1

Since you have a cell array of strings, you can use cellfun and strfind to find the matches. For example:

nm = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    '${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83}'
}

Then you can count the number of occurrences of e://Field/ in each cell with

numberCorrect = cellfun(@(x) length(strfind(x, 'e://Field/')), nm);

For this example it returns 3; 5. To finish the percentage, you can then divide by 40, or add the division directly to the cellfun call:

percentCorrect = cellfun(@(x) length(strfind(x, 'e://Field/')) / 40, nm);
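If a wrong answer could ever contain the text e://Field/ itself, a stricter count may be safer. The sketch below swaps strfind for regexp and only counts complete tokens; the token shape ${e://Field/n<digits>} is an assumption inferred from the preview in the question, and numberCorrectStrict is a name made up for this example.

```matlab
nm = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    '${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83}'
};

% Match only complete tokens of the form ${e://Field/n<digits>}.
pattern = '\$\{e://Field/n\d+\}';
numberCorrectStrict = cellfun(@(x) numel(regexp(x, pattern, 'match')), nm);
% numberCorrectStrict is [3; 5] for this example
```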

Answer 2

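The count function is probably what you are looking for. The pattern of your correct answers is regular and easy to find with a text search, so it is better to focus on the correct answers than to try to detect the wrong ones with regular expressions.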

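The excerpt you posted is a bit messy and hard to read, at least from my phone... but let's assume that your answers are structured as a row vector of cells whose underlying values are character arrays (for simplicity, let's call that variable answers). Then: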

answers_total = count(answers,',') + 1;
answers_correct = count(answers,'${e://Field/n');
% answers_wrong = answers_total - answers_correct;

ratio = (answers_correct ./ answers_total) .* 100;

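The ratio variable will be a vector of double values with the same shape as answers, in which each element is the percentage of correct answers given by one participant, in the order defined by your data.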

The code handles a different number of answers per participant without any problem.
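Note that ratio above is correct answers over answers given, which differs from the score described in the question. A possible adaptation, sketched on toy data and assuming 20 correct and 20 wrong options as stated in the question, is shown below. count was introduced in R2016b; the variable score is a name chosen for this example.

```matlab
% Toy data; all cells are character vectors (no NaN rows here).
answers = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas', ...
    '${e://Field/n3},${e://Field/n11}'
};

answers_total   = count(answers, ',') + 1;
answers_correct = count(answers, '${e://Field/n');
answers_wrong   = answers_total - answers_correct;

% Score from the question: correct ticks plus wrong options left unticked, out of 40.
score = (answers_correct + (20 - answers_wrong)) ./ 40;
% score is [0.525 0.55]
```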

EDIT

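I just noticed that your variable can contain NaNs. I suppose they represent participants who... well, did not participate. I suggest you avoid mixing variable types like this, especially if you want a calculation approach that is as standardized as possible... they only make everything more complicated. Replace them with empty strings, and then my solution can be adjusted as follows: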

answers_empty = cellfun(@isempty,answers);

answers_total = count(answers,',');
answers_total(~answers_empty) = answers_total(~answers_empty) + 1;

answers_correct = count(answers,'${e://Field/n');

ratio = (answers_correct ./ answers_total) .* 100;
ratio(answers_empty) = 0;
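The replacement step itself is not shown above. One way it might be done, assuming that every cell that is not a character vector should become an empty string, is sketched here on toy data; notText is a name made up for this example.

```matlab
% Toy data in the mixed form from the question: text cells plus a numeric NaN.
answers = {'${e://Field/n11}, Sam', NaN, '${e://Field/n3},${e://Field/n11}'};

% Replace every non-text cell with an empty character vector.
notText = ~cellfun(@ischar, answers);
answers(notText) = {''};

% The cell array now holds text only, so isempty and count work on it.
answers_empty = cellfun(@isempty, answers);   % [false true false]
```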
