Low commit performance when loading records into JanusGraph
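I am trying to load a large amount of data from MySQL into JanusGraph. I use a Cassandra backend with Elasticsearch for indexing, and I commit each record to JanusGraph individually, using 8 threads.
At first the program loads about 280 records/sec,
but after it has been running for a while the rate drops to 1~10 records/sec.
I tried changing settings such as buffer-size, page-size, block-size and renew-percentage, but saw no noticeable improvement.
I would like to know whether I am missing something, and what is causing this.
The code below is my commit procedure; dataMap is a fastJson object and g is the JanusGraph traversal source: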
// status and bind are defined elsewhere (not shown here).
Long countryId = dataMap.getLong("countryId");
Long uid = dataMap.getLong("uid");
String phoneNum = dataMap.getString("phoneNumber");
String fbId = dataMap.getString("fbId");
Long createTime = dataMap.getLong("createTime");
if (uid == null) {
    return;
}

// Create the user vertex and commit it (first commit for this record).
Vertex uidVertex = g.addV("uid").next();
uidVertex.property("uid_code", uid);
if (createTime != null)
    uidVertex.property("create_time", createTime);
if (status != null)
    uidVertex.property("status", status);
g.tx().commit();

// Link the user to a phone vertex (second commit).
if (phoneNum != null) {
    Vertex phoneVertex = KfkMsgParser.createMerge(g, "phone", "phone_num", phoneNum);
    Edge selfPhone = uidVertex.addEdge("user_phone", phoneVertex);
    selfPhone.property("create_time", bind.of("create_time", dataMap.getLong("createTime")));
    selfPhone.property("uid_code", bind.of("uid_code", uid));
    selfPhone.property("phone_num", bind.of("phone_num", phoneNum));
    g.tx().commit();
}

// Link the user to a Facebook account vertex (third commit).
if (fbId != null) {
    long endTamp2 = System.currentTimeMillis();
    Vertex fbVertext = KfkMsgParser.createMerge(g, "fb_id", "fb_account", fbId);
    Edge selfFb = uidVertex.addEdge("user_fb", fbVertext);
    if (createTime != null)
        selfFb.property("create_time", bind.of("create_time", createTime));
    g.tx().commit();
}
This is the createMerge function:
private static Vertex createMerge(GraphTraversalSource g, String label, String propertyKey, Object propertyValue) {
    Optional<Vertex> vertexOptional = g.V().hasLabel(label).has(propertyKey, propertyValue).tryNext();
    if (vertexOptional.isPresent()) {
        return vertexOptional.get();
    }
    Vertex vertex = g.addV(label).next();
    vertex.property(propertyKey, propertyValue);
    return vertex;
}
There is an error when building the index.
I found this thread in the Google group: https://groups.google.com/forum/#!msg/janusgraph-users/VPIUdlC4wNo/KiHM-s2aAwAJ
and learned from it that 2000~3000 records/sec is achievable.
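
One variable worth isolating is commit frequency, since the code above commits up to three times per record. The following is only a minimal sketch of committing once per batch instead. BatchCommitter, its method names and the batch size of 500 are hypothetical and not part of JanusGraph or of the code above; the commit action is passed in as a Runnable, which in the loader would presumably be () -> g.tx().commit(). Whether this addresses the slowdown described here is untested.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not a JanusGraph class): runs a commit action
 * once every batchSize records instead of after every write.
 * Not thread-safe; the assumption is one instance per loader thread.
 */
public class BatchCommitter {
    private final int batchSize;
    private final Runnable commitAction; // assumed to wrap g.tx().commit()
    private int pending = 0;
    private int commits = 0;

    public BatchCommitter(int batchSize, Runnable commitAction) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.commitAction = commitAction;
    }

    /** Call after all vertices and edges of one record have been added. */
    public void recordDone() {
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Commits any pending records; call once more at the end of the load. */
    public void flush() {
        if (pending == 0) {
            return;
        }
        commitAction.run();
        commits++;
        pending = 0;
    }

    public int commitCount() {
        return commits;
    }

    public static void main(String[] args) {
        // Stand-in for the real commit: just counts how often it is called.
        AtomicInteger calls = new AtomicInteger();
        BatchCommitter committer = new BatchCommitter(500, calls::incrementAndGet);
        for (int i = 0; i < 1200; i++) {
            committer.recordDone();
        }
        committer.flush(); // commit the final partial batch
        System.out.println("records=1200 commits=" + calls.get());
    }
}
```

In the loader, the three g.tx().commit() calls would be replaced by a single committer.recordDone() at the end of each record, with committer.flush() called when the thread finishes. A larger batch means fewer round trips to the backend, but more work is lost and must be retried if a commit fails.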