Sensitivity vs Positive Predictive Value - which is best?

Question

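I am trying to build a model on a class-imbalanced dataset (binary: 1 at 25% and 0 at 75%). I have tried classification algorithms and ensemble techniques. I am a bit confused about the following points, since I am more interested in predicting the 1s.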

1. Should I give preference to sensitivity or to positive predictive value?
Some ensemble techniques give at most 45% sensitivity and a low positive predictive value,
and some give 62% positive predictive value and low sensitivity.


2. My dataset has around 450K observations and 250 features.
After a power test I took 10K observations by simple random sampling. When I rank
variables by importance using ensemble techniques, the top features
differ from those I got when I tried with 150K observations.
Based on my intuition and domain knowledge, the features that came up as important in
the 150K-observation sample seem more relevant. What is the best practice?

3. Lastly, can I use the variable importance generated by RF in other ensemble
techniques to predict the accuracy?

Could you help me with this? I am a bit confused.

Answer 1

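Whether to prefer sensitivity or positive predictive value depends on the end goal of your analysis. The difference between the two is explained well here: https://onlinecourses.science.psu.edu/stat507/node/71/ In short, they are two measures that look at the results from two different angles. Sensitivity is the probability that the test detects the "condition" when it is actually present: TP / (TP + FN). Positive predictive value looks at how prevalent the "condition" is among the cases that test positive: TP / (TP + FP).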
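To make the two definitions concrete, here is a minimal sketch in plain Python that computes both from true and predicted labels. The function name `sensitivity_and_ppv` and the toy labels are made up for illustration and are not taken from the question's data.

```python
def sensitivity_and_ppv(y_true, y_pred):
    """Return (sensitivity, PPV) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    # Sensitivity: share of actual positives that the model finds.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # PPV: share of predicted positives that are actually positive.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Hypothetical toy labels: 4 positives, 8 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

sens, ppv = sensitivity_and_ppv(y_true, y_pred)
print(f"sensitivity={sens:.3f}, ppv={ppv:.3f}")
```

If missing a 1 is the costly mistake, sensitivity is the number to watch; if acting on a false 1 is the costly mistake, PPV matters more.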

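Accuracy depends on your classification results: it is defined as (true positives + true negatives) / (total). It is not determined by the variable importance generated by RF.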
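As a small illustration of that definition on a 25% / 75% split like the one in the question, the sketch below scores a hypothetical classifier that always predicts 0. The labels are synthetic and the helper name `accuracy` is an assumption for this example.

```python
def accuracy(y_true, y_pred):
    """(true positives + true negatives) / total."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Synthetic labels with the same class ratio as the question: 25% ones, 75% zeros.
y_true = [1] * 25 + [0] * 75
# Hypothetical degenerate model that never predicts the positive class.
y_pred_all_zero = [0] * 100

acc = accuracy(y_true, y_pred_all_zero)
print(f"accuracy of always predicting 0: {acc:.2f}")
```

Such a model never finds a single 1 and still scores well on accuracy, which is why accuracy alone can mislead on imbalanced data when the 1s are what you care about.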

In addition, it is possible to compensate for the imbalance in the dataset; see https://stats.stackexchange.com/questions/264798/random-forest-unbalanced-dataset-for-training-test
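One common way to compensate is to weight each class inversely to its frequency. The sketch below is only an assumption about what such a correction could look like, not necessarily what the linked thread recommends. It uses the formula n_samples / (n_classes * class_count), and the helper name `balanced_class_weights` is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Synthetic labels with the question's class ratio: 25% ones, 75% zeros.
labels = [1] * 25 + [0] * 75
weights = balanced_class_weights(labels)
print(weights)
```

With these weights each class contributes the same total weight during training, so errors on the minority class 1 count more. Resampling (oversampling the 1s or undersampling the 0s) is an alternative with a similar aim.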

statistics

regression

classification

ensemble-learning