Sampling from a list of dicts

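I have data like the example below. It is very large, and I would like to take the first 10 items from it. It looks like a list of dicts, but if I try user_train[:5] I get an error. I can pull out one item at a time, as in user_train[4]. Any hints are much appreciated.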

Code:

user_train[0]

Output:

[{u'asin': u'B00APT3MHO',
  u'helpful': [0, 0],
  u'overall': 5.0,
  'productid': 1,
  u'reviewText': u"Good for someone who likes skinny jeans but doesn't look great in the legging-tight ones. A little stretchy. Not super tight in the knee or ankle, but snug on the thigh and calf.",
  u'reviewTime': u'11 17, 2013',
  u'reviewerID': u'A1JWX45KHE34AL',
  u'reviewerName': u'varnienarsil',
  u'summary': u'Love these jeans',
  u'unixReviewTime': 1384646400},
 {u'asin': u'B00CJ5NH36',
  u'helpful': [0, 0],
  u'overall': 5.0,
  'productid': 2,
  u'reviewText': u"This shirt with it's bold graphic is seriously adorable. I have pretty narrow shoulders, and like the way the sleeves slope off them. The shirt fits loosely in a way that is flattering and I liked the length. I'm no model, but the shirt looks on me as great as it looks in the photo.",
  u'reviewTime': u'11 17, 2013',
  u'reviewerID': u'A1JWX45KHE34AL',
  u'reviewerName': u'varnienarsil',
  u'summary': u'As cute as it looks',
  u'unixReviewTime': 1384646400},
 {u'asin': u'B00F9NGAPM',
  u'helpful': [1, 1],
  u'overall': 3.0,
  'productid': 4,
  u'reviewText': u"The shirt is a little flowy-er than I expected. I like the way it drapes, but the arms are a bit loose (and on me, short—I'm pretty tall). Has a sort of after-yoga feel rather than the urban feel I was looking for. Super comfortable.",
  u'reviewTime': u'11 17, 2013',
  u'reviewerID': u'A1JWX45KHE34AL',
  u'reviewerName': u'varnienarsil',
  u'summary': u"Like, don't love",
  u'unixReviewTime': 1384646400}]



Update:

code:

user_train[:5]

error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-bb27c2e9fa75> in <module>()
----> 1 user_train[:5]

TypeError: unhashable type

If you just want a workaround, you can do:

sample = [user_train[x] for x in range(10)]

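This is called a list comprehension. The "unhashable type" error usually means the object is a dict rather than a list: slicing a dict passes a slice object as the key, and in the Python 2 shown here a slice cannot be hashed. Since user_train[0] and user_train[4] work, user_train is probably a dict keyed by integers, with each value being a list of review dicts.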
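As a rough illustration of the workaround, here is a minimal sketch written for Python 3. It assumes user_train is a dict keyed by consecutive integers, which the question does not confirm; the small user_train built here is a made-up stand-in, not the real data. It also shows itertools.islice as an alternative that does not depend on the keys being 0, 1, 2, ...

```python
from itertools import islice

# Hypothetical stand-in for user_train (assumption): a dict keyed by an
# integer index, where each value is a list of review dicts.
user_train = {
    i: [{"asin": "B%03d" % i, "overall": 5.0, "productid": i + 1}]
    for i in range(100)
}

# Confirm what kind of container this actually is before slicing it.
print(type(user_train).__name__)

# Workaround from the answer: look up the first 10 keys one by one.
# This only works if the keys 0..9 all exist.
sample = [user_train[i] for i in range(10)]

# Alternative: take the first 10 values in iteration order, whatever
# the keys happen to be (insertion order on Python 3.7+).
first_ten = list(islice(user_train.values(), 10))

print(len(sample), len(first_ten))
```

If type(user_train) turns out to be list after all, plain slicing such as user_train[:10] should work and the error is coming from somewhere else.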