Google Prediction API - Training data syntax for multi classification

I am trying to use the Google Prediction API to classify my data. Each item in my database can be assigned multiple categories.

For example, "My Nexus phone is rebooting constantly" could be assigned both the #Android and #troubleshooting tags.

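I want to upload my training data to Google, but I am not sure how to apply both tags to the same content. In the example below, I found the syntax for assigning one category to each piece of content, which looks like this: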

"Android" ,"My Nexus phone is rebooting constantly"

What is the correct syntax for training data where one item has multiple categories?

Unless I am misunderstanding your question, I think the answer is in the documentation.

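Specifically, the section on text strings explains that when you submit a text string, the system actually splits it into multiple strings, using whitespace as the delimiter. They note that "Godzilla vs Mothra" is treated as "Godzilla", "vs", and "Mothra". So in your case, you could just use "Android troubleshooting". The system will split it into "Android" and "troubleshooting".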
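As a rough illustration of the whitespace splitting this answer describes, here is a minimal sketch in plain Python. It only mimics the described behaviour with the standard `str.split()`; it is not the Prediction API's actual tokenizer, and the function name `split_on_whitespace` is hypothetical.

```python
# Illustration only: plain Python whitespace splitting, used as a stand-in
# for the behaviour the answer describes. Not the Prediction API's own code.

def split_on_whitespace(text):
    """Split a text string into tokens, using whitespace as the delimiter."""
    return text.split()


if __name__ == "__main__":
    print(split_on_whitespace("Godzilla vs Mothra"))
    print(split_on_whitespace("Android troubleshooting"))
```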

From the documentation:

Each line can have only one label assigned, but you can apply multiple labels to one example by repeating the example and applying a different label to each copy. For example:

"excited", "OMG! Just had a fabulous day!"

"annoying", "OMG! Just had a fabulous day!"

If you send a tweet to this model, you might get a classification like this: "excited":0.6, "annoying":0.2.
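Applied to the question's data, the "repeat the example once per label" approach could be sketched as follows. This is a minimal sketch under stated assumptions: the names `to_training_rows`, `to_training_csv`, and `items` are hypothetical, and the fully quoted label-then-text CSV layout is assumed from the examples in this thread rather than taken from an API specification.

```python
import csv
import io

# Hypothetical helpers: expand items that carry several labels into
# one training row per label, repeating the text for each label.

def to_training_rows(items):
    """Yield one (label, text) row per label attached to each text."""
    for text, labels in items:
        for label in labels:
            yield (label, text)


def to_training_csv(items):
    """Render the expanded rows as CSV text with every field quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(to_training_rows(items))
    return buffer.getvalue()


# Assumed sample data, taken from the question's example.
items = [
    ("My Nexus phone is rebooting constantly", ["Android", "troubleshooting"]),
]

if __name__ == "__main__":
    print(to_training_csv(items), end="")
```

With the sample item above, this produces two rows with the same text, one labeled "Android" and one labeled "troubleshooting".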

