Returning sets of lists in graph databases
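This is a follow-up to an earlier question. In case that question is deleted or changes significantly, I'll restate the problem here:
I'm currently looking at modelling tertiary education courses and other such entities (MATH101, BIOL360, BSc, etc.), and one of the options we're considering is a graph database. I'm not familiar with graph databases except in theory.
One use case for this database would be querying possible pathways through courses; for example, answering the question "what minimum combinations of courses are valid to fulfill the requirements to receive a Bachelor of Science in Computer Science with Honours?". Some requirements will be simple (the qualification requires that you complete Comp101, Math101 and Comp201), and some will offer options (the qualification requires that you complete 80 points of papers classified as "science" at 100-level or above).
I found neo4j lists, which look promising, but what I really seem to want is the ability to return a list of lists, where each component list represents one potential pathway. I don't see a way of generating such a list of lists, however, so I'm guessing I have something wrong at the conceptual level.
One way I could do this is with a loop that looks at the qualification node, takes one possible combination of nodes, recursively satisfies the requirements of those nodes, and then moves on to the next possible combination. As a database developer, the idea of using loops for what is in theory a clearly solvable set-based operation is the kind of thing that keeps me awake at night, so I'd go to great lengths to avoid such an abomination. How do I structure a query to build such a set?
Again, I've tagged Neo4J because I'm leaning towards it: as far as I know it's the most widely known and used graph DBMS, and a very elegant solution to my previous question worked in it. I'm also open to solutions in other databases, though. In fact, if this is possible in the very new SQL Server offering, that would probably be ideal, since the rest of our infrastructure is on it.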
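To recap my comments: recursion or looping should not be necessary, and using lists is probably not the right approach either. You should just use a good graph-oriented data model (one that makes full use of relationships and uses indexes to kick off queries).
At the risk of belaboring the point, the following query uses a fairly simple data model to build part of a school graph: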
CREATE
(sci:Area {name: 'Science'}),
(hum:Area {name: 'Humanities'}),
(bio:Department {name: 'Biology'})-[:IN_AREA]->(sci),
(phy:Department {name: 'Physics'})-[:IN_AREA]->(sci),
(che:Department {name: 'Chemistry'})-[:IN_AREA]->(sci),
(eng:Department {name: 'English'})-[:IN_AREA]->(hum),
(his:Department {name: 'History'})-[:IN_AREA]->(hum),
(soc:Department {name: 'Sociology'})-[:IN_AREA]->(hum),
(bioMaj:Major {name: 'Biology'})-[:IN_DEPT]->(bio),
(phyMaj:Major {name: 'Physics'})-[:IN_DEPT]->(phy),
(bio101:Course {id: 'Bio101', name: 'Introductory Biology', level: 101, credits: 3}) -[:IN_DEPT]->(bio),
(che101:Course {id: 'Chem101', name: 'Introductory Chemistry', level: 101, credits: 4})-[:IN_DEPT]->(che),
(phy101:Course {id: 'Phys101', name: 'Newtonian Physics', level: 101, credits: 5}) -[:IN_DEPT]->(phy),
(phy201:Course {id: 'Phys201', name: 'Mechanics', level: 201, credits: 4}) -[:IN_DEPT]->(phy),
(phy202:Course {id: 'Phys202', name: 'Elec & Mag', level: 202, credits: 4}) -[:IN_DEPT]->(phy),
(eng101:Course {id: 'Eng101', name: 'Intro to Poetry', level: 101, credits: 3}) -[:IN_DEPT]->(eng),
(eng102:Course {id: 'Eng102', name: 'Intro to Drama', level: 102, credits: 3}) -[:IN_DEPT]->(eng),
(eng103:Course {id: 'Eng103', name: 'Intro to Fiction', level: 103, credits: 3}) -[:IN_DEPT]->(eng),
(eng202:Course {id: 'Eng201', name: 'Medieval Literature', level: 201, credits: 4}) -[:IN_DEPT]->(eng),
(his100:Course {id: 'Hist100', name: 'Global History', level: 100, credits: 3}) -[:IN_DEPT]->(his),
(soc100:Course {id: 'Soc100', name: 'Intro to Sociology', level: 100, credits: 3}) -[:IN_DEPT]->(soc),
(fred:Student {id: 123456, name: 'Fred Smith'})-[:HAS_MAJOR]->(bioMaj),
(sue:Student {id: 987654, name: 'Sue Jones'})-[:HAS_MAJOR]->(phyMaj),
(fred)-[:ENROLLED_IN {year: 2017, term: 1, grade: 3.73}]->(bio101),
(fred)-[:ENROLLED_IN {year: 2017, term: 1, grade: 3.62}]->(eng101),
(fred)-[:ENROLLED_IN {year: 2017, term: 2, grade: 3.55}]->(che101),
(fred)-[:ENROLLED_IN {year: 2017, term: 2, grade: 2.95}]->(eng102),
(fred)-[:ENROLLED_IN {year: 2018, term: 1, grade: 3.13}]->(eng202),
(fred)-[:ENROLLED_IN {year: 2018, term: 1, grade: 3.68}]->(phy101),
(sue) -[:ENROLLED_IN {year: 2017, term: 1, grade: 3.55}]->(che101),
(sue) -[:ENROLLED_IN {year: 2017, term: 1, grade: 3.66}]->(eng101),
(sue) -[:ENROLLED_IN {year: 2017, term: 2, grade: 3.77}]->(phy201),
(sue) -[:ENROLLED_IN {year: 2017, term: 2, grade: 3.44}]->(soc100),
(sue) -[:ENROLLED_IN {year: 2018, term: 1, grade: 3.33}]->(eng202),
(sue) -[:ENROLLED_IN {year: 2018, term: 1, grade: 3.22}]->(phy101);
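The recap mentions using indexes to kick off queries. As a rough sketch only, indexes on the properties that queries against this model are likely to start from could look like the following. The index names `area_name` and `course_id` are hypothetical, and the syntax shown assumes a recent Neo4j release (4.x or later); older 3.x releases use the form `CREATE INDEX ON :Area(name)` instead.

```cypher
// Assumed: queries start from an Area looked up by name.
CREATE INDEX area_name IF NOT EXISTS FOR (a:Area) ON (a.name);

// Assumed: individual courses are looked up by their id property.
CREATE INDEX course_id IF NOT EXISTS FOR (c:Course) ON (c.id);
```

Note that the planner can only use such an index when the node pattern carries the label, as in `(a1:Area)`.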
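Suppose there is a requirement that all science students must earn a grade of 3.0+ in 3 humanities courses, and that at least one of those humanities courses must be at the 200+ level.
We can find all science students who have satisfied that requirement (e.g., Sue Jones) this way: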
MATCH (a1:Area)<-[:IN_AREA]-()<-[:IN_DEPT]-()<-[:HAS_MAJOR]-(student)-[e:ENROLLED_IN]->(course)-[:IN_DEPT]->()-[:IN_AREA]->(a2:Area)
WHERE a1.name = 'Science' AND a2.name = 'Humanities' AND e.grade >= 3.0
WITH student, COLLECT(course) AS courses
WHERE SIZE(courses) >= 3 AND ANY(c IN courses WHERE c.level >= 200)
RETURN student;
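Since the question asks about returning lists, here is a hedged variant of the same query that also returns the courses that satisfied the requirement, one list per student. It is only a sketch: the `DISTINCT` is my addition (it guards against a course being counted twice, for example if a student has two science majors), and the column names `student` and `courseIds` are made up for illustration.

```cypher
// Same pattern and filters as the query above.
MATCH (a1:Area)<-[:IN_AREA]-()<-[:IN_DEPT]-()<-[:HAS_MAJOR]-(student)-[e:ENROLLED_IN]->(course)-[:IN_DEPT]->()-[:IN_AREA]->(a2:Area)
WHERE a1.name = 'Science' AND a2.name = 'Humanities' AND e.grade >= 3.0
WITH student, COLLECT(DISTINCT course) AS courses
WHERE SIZE(courses) >= 3 AND ANY(c IN courses WHERE c.level >= 200)
// One row per qualifying student; courseIds is a list of course ids.
RETURN student.name AS student, [c IN courses | c.id] AS courseIds;
```

With the sample data this should produce a single row for Sue Jones listing Eng101, Soc100 and Eng201 (the order inside the list is not guaranteed). Replacing the last line with `RETURN COLLECT([c IN courses | c.id]) AS courseLists;` would instead give one row containing a list of lists.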
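Conversely, we can find all science students who have not yet satisfied that requirement (e.g., Fred Smith) this way. (Below, if the OPTIONAL MATCH and its WHERE find no matches, courses will be an empty collection.)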
MATCH (a1:Area)<-[:IN_AREA]-()<-[:IN_DEPT]-()<-[:HAS_MAJOR]-(student)
WHERE a1.name = 'Science'
OPTIONAL MATCH (student)-[e:ENROLLED_IN]->(course)-[:IN_DEPT]->()-[:IN_AREA]->(a2:Area)
WHERE a2.name = 'Humanities' AND e.grade >= 3.0
WITH student, COLLECT(course) AS courses
WHERE SIZE(courses) < 3 OR NONE(c IN courses WHERE c.level >= 200)
RETURN student;
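As a possible next step towards the "pathways" in the question, the sketch below lists, for each student who has not yet met the requirement, the humanities courses they have already passed and the ones still open to them. This is an illustration under the same sample model, not a complete pathway solver; the names `passed`, `candidate`, `passedIds` and `candidateIds` are hypothetical.

```cypher
// Science students, with the humanities courses they passed at 3.0+ (possibly none).
MATCH (:Area {name: 'Science'})<-[:IN_AREA]-()<-[:IN_DEPT]-()<-[:HAS_MAJOR]-(student)
OPTIONAL MATCH (student)-[e:ENROLLED_IN]->(course)-[:IN_DEPT]->()-[:IN_AREA]->(:Area {name: 'Humanities'})
WHERE e.grade >= 3.0
WITH student, COLLECT(DISTINCT course) AS passed
// Keep only students who have not yet met the requirement.
WHERE SIZE(passed) < 3 OR NONE(c IN passed WHERE c.level >= 200)
// Every humanities course the student has not already passed is a candidate.
MATCH (candidate:Course)-[:IN_DEPT]->()-[:IN_AREA]->(:Area {name: 'Humanities'})
WHERE NOT candidate IN passed
WITH student, passed, COLLECT(candidate.id) AS candidateIds
RETURN student.name AS student,
       [c IN passed | c.id] AS passedIds,
       candidateIds;
```

For Fred Smith in the sample data, the passed courses should be Eng101 and Eng201, leaving Eng102, Eng103, Hist100 and Soc100 as candidates. Working out which combinations of candidates complete the requirement would still need further filtering on count and level.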