NLP dictionary with sentiment score

Can anyone help me find a description of each value in this dictionary? https://github.com/cjhutto/vaderSentiment/blob/master/vaderSentiment/vader_sentiment_lexicon.txt

For example, take the following lines:

abuse -3.2 0.6 [-4, -2, -3, -4, -3, -4, -3, -3, -3, -3]

abused -2.3 0.64031 [-2, -2, -3, -2, -2, -4, -2, -2, -2, -2]

abuser -2.6 0.4899 [-3, -2, -3, -3, -2, -3, -2, -2, -3, -3]

abusers -2.6 1.0198 [-2, -3, -3, -3, -3, -2, -3, -4, -3, 0]

abuses -2.6 0.66332 [-3, -2, -3, -3, -3, -3, -1, -2, -3, -3]

abusing -2.0 1.41421 [-1, -2, -2, -4, -4, -2, -3, -1, 1, -2]

abusive -3.2 0.74833 [-4, -3, -3, -4, -4, -3, -4, -2, -3, -2]

abusively -2.8 0.6 [-3, -4, -3, -2, -3, -2, -2, -3, -3, -3]

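Which value represents the sentiment score, and what do the values in the square brackets [] represent?

A quick search shows that their research paper explains this: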

We used a wisdom-of-the-crowd (WotC) approach (Surowiecki, 2004) to acquire a valid point estimate for the sentiment valence (intensity) of each context-free candidate feature. We collected intensity ratings on each of our candidate lexical features from ten independent human raters (for a total of 90,000+ ratings). Features were rated on a scale from “[–4] Extremely Negative” to “[4] Extremely Positive”, with allowance for “[0] Neutral (or Neither, N/A)”. Ratings were obtained using Amazon Mechanical Turk (AMT), a micro-labor website where workers perform minor tasks in exchange for a small amount of money...

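So the 10 numbers in [] are the sentiment intensity ratings given by the 10 human raters, and the first number after the word is the overall sentiment intensity rating, i.e. the mean of those 10 ratings. The second number is the standard deviation of the ratings.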
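This reading can be checked against the first sample row from the question. The snippet below is only a minimal sketch: the helper name `parse_lexicon_line` is hypothetical, the token is written as `abuse`, and it assumes the fields in the lexicon file are tab-separated. It recomputes the mean and the population standard deviation from the bracketed ratings and compares them with the second and third columns.

```python
import ast
import statistics


def parse_lexicon_line(line):
    """Split one lexicon line into (token, mean, std, ratings).

    Assumes four tab-separated fields; this helper is illustrative,
    not part of the vaderSentiment package.
    """
    token, mean, std, ratings = line.rstrip("\n").split("\t")
    # The last field is a Python-style list literal such as "[-4, -2, ...]".
    return token, float(mean), float(std), ast.literal_eval(ratings)


# First sample row from the question, with tabs between the fields.
sample = "abuse\t-3.2\t0.6\t[-4, -2, -3, -4, -3, -4, -3, -3, -3, -3]"
token, mean, std, ratings = parse_lexicon_line(sample)

# Recompute the two summary columns from the ten raw ratings.
recomputed_mean = statistics.mean(ratings)
recomputed_std = statistics.pstdev(ratings)  # population standard deviation

print(token, mean, recomputed_mean)
print(token, std, round(recomputed_std, 5))
```

For this row the recomputed values agree with the file, so the second column is the mean of the ten ratings and the third is their spread, computed with the population formula (dividing by 10) rather than the sample formula (dividing by 9).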