Editing hbar graph (Stata)
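My code below produces the attached graph. However, I have tried to make two adjustments without success.
1- I want to organize the Y axis so that November comes before December for every industry, instead of the order depending on which month has more jobs, as in the current graph.
2- I also tried to add labels on the Y axis so that it shows only "Nov" and "Dec" with no additional text. Stata does not produce any error, but the graph does not change.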
preserve
drop if total_jobs_industry<15
graph hbar (count) total_jobs_industry, over(month) over(industry, sort(1)) subtitle("Jobs by Industry and month", span)
restore
I know I can manually edit small details of the graph in Stata, but I prefer to automate the process as much as possible.
Data example:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float total_jobs_industry str39 industry str8 month
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
38 "Computer Hardware & Software" "Dec_2020"
12 "Consulting" "Dec_2020"
63 "" "Dec_2020"
32 "IT Services" "Dec_2020"
32 "IT Services" "Nov_2020"
38 "Computer Hardware & Software" "Nov_2020"
12 "Aerospace & Defense" "Nov_2020"
12 "Accounting" "Nov_2020"
12 "Accounting" "Dec_2020"
end
When I run it with sum instead of count, I get the following graph:
preserve
drop if total_jobs_industry<15
graph hbar (sum) total_jobs_industry, over(month) over(industry, sort(1)) subtitle("Jobs by Industry and month", span)
restore
Also, this is how I created the variable that counts the number of jobs in each industry:
// The variable id contains observation number running from 1 to X and nt is the total number of observations
generate id = _n
generate nt = _N
// Sorting by industry. Now n1 is the observation number within each industry group and total_jobs_industry is the total number of observations for each industry group.
sort industry
by industry: generate n1 = _n
by industry: generate total_jobs_industry = _N
order total_jobs_industry, a(industry)
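This is a very puzzling question. The following list of reasons is not complete.
The post seems to mix older and newer versions of itself and is inconsistent. You can't reasonably expect us to decode such a tangled story reliably. The standard here is to provide a minimal verifiable example, and this thread doesn't meet that standard. See guidance here.
None of the graphs shown is consistent with the data given.
I find it hard to believe that (count) makes sense for your data. It counts non-missing values, but your key variable appears to be total_jobs_industry, which already holds a count. On the other hand, (sum), given the different treatments and numbers of observations, seems to conflate quite different kinds of calculation.
There appear to be duplicate observations in your example data.
You state that you 'also tried to add labels on the Y-axis where it just shows "Nov" and "Dec"', but nothing in your code shows any such attempt.
You expect Nov_2020 to sort before Dec_2020, which won't happen because, so far as Stata is concerned, month is just a string variable, so the fact that D sorts before N is what matters. That is why December sorts before November. It has nothing to do with sorting on industry values, which only affects the ordering of the groups of bars. You're not using Stata's date variable functionality.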
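To make the earlier point about (count) and (sum) concrete, here is a minimal sketch using collapse to reproduce what those statistics compute for each industry-month group. The three observations and the names n_obs and total are made up for illustration. It assumes, as in the question, that total_jobs_industry already holds a total repeated on every observation.

```stata
clear
input float total_jobs_industry str11 industry str8 month
32 "IT Services" "Nov_2020"
32 "IT Services" "Nov_2020"
32 "IT Services" "Dec_2020"
end

* (count) counts non-missing observations in each group;
* (sum) adds up the values that are repeated on each observation
collapse (count) n_obs = total_jobs_industry ///
         (sum)   total = total_jobs_industry, by(industry month)

* collapse leaves the data sorted by industry and month (as strings)
list
```

In this sketch neither statistic returns the stored value of 32 for the November group, which has two observations: (count) reports the number of rows and (sum) counts the stored total twice.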
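With the exception of the last problem, I doubt I can make sense of any of these. It seems to be a limitation of graph hbar that it ignores the display format of a time variable, so I used value labels to ensure that Nov and Dec sort in the order you want.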
clear
input float total_jobs_industry str39 industry str8 month
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
38 "Computer Hardware & Software" "Dec_2020"
12 "Consulting" "Dec_2020"
63 "" "Dec_2020"
32 "IT Services" "Dec_2020"
32 "IT Services" "Nov_2020"
38 "Computer Hardware & Software" "Nov_2020"
12 "Aerospace & Defense" "Nov_2020"
12 "Accounting" "Nov_2020"
12 "Accounting" "Dec_2020"
end
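* Drop the exact duplicate observations
duplicates drop
* Convert the string month to a numeric Stata monthly date
gen mdate = monthly(month, "MY")
levelsof mdate, local(months)
* c(Mons) holds "Jan Feb ... Dec"; tokenize stores the names in local macros 1 to 12
tokenize "`c(Mons)'"
* Label each monthly date with its abbreviated month name
foreach m of local months {
    local month = month(dofm(`m'))
    label def mdate `m' "``month''", modify
}
label val mdate mdate
set scheme s1color
* (asis) plots the values as they are; the dates sort numerically, so Nov precedes Dec
graph hbar (asis) total_jobs_industry, over(mdate) over(industry, sort(1) descending)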
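The sorting behaviour described above can be checked on its own. The following is a minimal sketch with a two-observation dataset made up for illustration. Replacing the underscore with a space before calling monthly() is a precaution added here, not part of the code above.

```stata
clear
input str8 month
"Nov_2020"
"Dec_2020"
end

* As a string, "Dec_2020" sorts before "Nov_2020" because D precedes N
sort month
list month

* Convert to a numeric Stata monthly date
generate mdate = monthly(subinstr(month, "_", " ", .), "MY")
format mdate %tm

* As a monthly date, November 2020 sorts before December 2020
sort mdate
list month mdate
```

Once the variable is numeric, any command that orders categories by value, including over() in graph hbar, should place November first.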