How to compute confidence intervals and plot them on a bar plot

How do I plot a bar plot from

data = 1x10 cell

where each element of the cell has a different dimension, such as 3x100, 3x40, 66x2, etc.?

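My goal is a grouped bar plot with 10 groups of bars, and 3 bars in each group, one per value. On each bar, I want to show the median of the values, and I want to calculate the confidence interval and show it additionally.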

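The example I have in mind has no groups of bars; I mention it only to show how I would like the confidence intervals to be displayed. On this site I found an example whose solution contains this line: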

e1 = errorbar(mean(data), ci95);

My problem is that I cannot find ci95 defined anywhere.

So, is there another efficient way to do this without installing or downloading anything additional?

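I am not sure what your data looks like, since in your question you say that the elements of the cell contain data of different dimensions, like

3x100, 3x40, 66x2

I therefore assume that your data can be arranged either in columns or in rows, and that not every element needs three bars.

Since you did not provide a small sample of data for us to test with, I generated some artificial data: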

data = cell(1,10);

% Random length of the data
l = randi(500, 10, 1) + 50;  

% Random "width" of the data, with 3 more likely
w = randi(4, 10, 1);
w(w==4) = 3;
% random "direction" of the data
d = randi(2, 10, 1);

% sigma of the data (in fraction of mean)
sigma = rand(10,1) / 3;

% means of the data
dmean = randi(150,10,1);
dsigma = dmean.*sigma;

for c = 1 : 10
    if d(c) == 1
        data{c} = randn(l(c), w(c)) .* dsigma(c) + dmean(c);
    else
        data{c} = randn(w(c), l(c)) .* dsigma(c) + dmean(c);
    end
end

Next thing is

On the bar, I want it to be shown the median of the values, and I want to calculate the confidence interval and show it additionally.

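Are you really sure you want to plot the median? The median of some data is not connected to the variance of the data, and thus no type of error bar is required. I guess you want to show the mean. If you really want to show the median, a box plot might be a better alternative.

The following code computes the mean and plots it in a bar plot: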

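% One row per cell, up to three columns (bars) per group.
% Entries that are never filled stay 0.
means = zeros(numel(data),3);
stds = zeros(numel(data),3);
n = zeros(numel(data),3);
for c = 1:numel(data)
    d = data{c};
    % Orient the data so that each series is a column
    if size(d,1) < size(d,2)
        d = d';
    end
    cols = size(d,2);
    % nanmean and nanstd ignore NaNs (Statistics Toolbox functions)
    means(c, 1:cols) = nanmean(d);
    stds(c, 1:cols) = nanstd(d);
    % Number of non-NaN samples per column
    n(c, 1:cols) = sum(~isnan(d));
end

% Grouped bar plot: one group per cell, one bar per column
b = bar(means);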

Now, we need to compute the length of the error bars. Typical choices are the standard deviation of the data (already computed by the code above and stored in stds), the standard error, or the 95% confidence interval (which is 1.96 times the standard error, assuming the underlying data follow a normal distribution).

% for standard deviation use stds

% for standard error
ste = stds./sqrt(n);

% for 95% confidence interval
ci95 = 1.96 * ste;
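
Written out, the two quantities computed above look like this. The symbols are my own shorthand and not part of the code: \bar{x}, s and n stand for one entry of means, stds and n respectively. The factor 1.96 is the 0.975 quantile of the standard normal distribution, so the interval is an approximation that becomes too narrow when n is small.

```latex
\mathrm{SE} = \frac{s}{\sqrt{n}}, \qquad
\mathrm{CI}_{95\%} = \bar{x} \pm 1.96\,\mathrm{SE}
                   = \bar{x} \pm 1.96\,\frac{s}{\sqrt{n}}
```

For example, a column with s = 10 and n = 100 has SE = 1, so its error bar extends 1.96 above and below the mean.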

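The last thing is to plot the error bars. Here I chose ci95, as you asked in your question; if you want to change that, simply change the variable in the call to errorbar: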

for c = 1:3
    % Centre each error bar on its bar: group position plus series offset
    e = errorbar(b(c).XData + b(c).XOffset, means(:,c), ci95(:, c));
    e.LineStyle = 'none';   % draw only the error bars, no connecting line
end

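I found that Patrick Happel's answer did not work, because the figure window (and therefore the variable b) is cleared by the subsequent call to errorbar. Simply adding a hold on command fixes this. To avoid confusion, here is a new answer that reproduces all of Patrick's original code, plus my small tweak: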

%% Old answer
%Just to be safe, let's clear everything
clear all

data = cell(1,10);

% Random length of the data
l = randi(500, 10, 1) + 50;  

% Random "width" of the data, with 3 more likely
w = randi(4, 10, 1);
w(w==4) = 3;
% random "direction" of the data
d = randi(2, 10, 1);

% sigma of the data (in fraction of mean)
sigma = rand(10,1) / 3;

% means of the data
dmean = randi(150,10,1);
dsigma = dmean.*sigma;

for c = 1 : 10
    if d(c) == 1
        data{c} = randn(l(c), w(c)) .* dsigma(c) + dmean(c);
    else
        data{c} = randn(w(c), l(c)) .* dsigma(c) + dmean(c);
    end
end
%============================================
%Next thing is 
%    On the bar, I want it to be shown the median of the values, and I
%    want to calculate the confidence interval and show it additionally.
%
%Are you really sure you want to plot the median? The median of some data
%is not connected to the variance of the data, and thus no type of error
%bar is required. I guess you want to show the mean. If you really want
%to show the median, a box plot might be a better alternative.
%
%The following code computes and plots the mean in a bar plot:
%============================================
means = zeros(numel(data),3);
stds = zeros(numel(data),3);
n = zeros(numel(data),3);
for c = 1:numel(data)
    d = data{c};
    if size(d,1) < size(d,2)
        d = d';
    end
    cols = size(d,2);
    means(c, 1:cols) = nanmean(d);
    stds(c, 1:cols) = nanstd(d);
    n(c, 1:cols) = sum(~isnan((d)));
end

b = bar(means);

%% New code
%This ensures that b continues to reference existing data in the next for
%loop, as the graphics objects can otherwise be deleted.  
hold on
%% Continuing Patrick Happel's answer
%============================================
%Now, we need to compute the length of the error bars. Typical choices are
%the standard deviation of the data (already computed by the code above,
%stored in stds), the standard error or the 95% confidence interval (which
%is 1.96 times the standard error, assuming the underlying data
%follow a normal distribution).
%============================================
% for standard deviation use stds

% for standard error
ste = stds./sqrt(n);

% for 95% confidence interval
ci95 = 1.96 * ste;
%============================================
%Last thing is to plot the error bars. Here I chose the ci95 as you asked
%in your question, if you want to change that, simply change the variable
%in the call to errorbar:
%============================================
for c = 1:3
    % Centre each error bar on its bar: group position plus series offset
    e = errorbar(b(c).XData + b(c).XOffset, means(:,c), ci95(:, c));
    e.LineStyle = 'none';   % draw only the error bars, no connecting line
end
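
To see the effect of hold on in isolation, here is a minimal sketch with made-up numbers. The variables vals and err are hypothetical values chosen for illustration and are not taken from the data above.

```matlab
% Minimal sketch: hypothetical bar heights and error-bar half-lengths
vals = [3 5 4];
err  = [0.5 0.8 0.3];

figure;
b = bar(vals);            % draw the bars
hold on                   % keep the bars when the next plot command runs
e = errorbar(1:numel(vals), vals, err);
e.LineStyle = 'none';     % show only the error bars, no connecting line
hold off
```

If the hold on line is removed, the call to errorbar replaces the contents of the axes, so the bars disappear and b no longer refers to a valid graphics object.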