MySQL PHP If Keyword Phrase From Table Does Not Exist In Sentence Table Show Results
I have two tables: Sentences and Negatives.
I want to select the rows of Sentences.sentence that do not contain any of the phrases in Negatives.negphrase.
Sentences has 200k records and Negatives has 50k records.
Sentences.sentence Sample Data
=============================
- university lab on campus
- laboratory designs
- lab coats
- math lab
- methane production
- meth lab
Negatives.negphrase Sample Data
======================================
- coats
- math lab
- meth
Desired Result Set
==================
- university lab on campus
- laboratory designs
- methane production
I tried to use the result from another question of mine, but the database times out:
SELECT Sentences.sentence
FROM Sentences, Negatives
GROUP BY Sentences.sentence
HAVING (((Max(InStr(" " & sentence & " "," " & negphrase & " ")))=0));
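The query above is written in Access-style syntax: in MySQL, `&` is a bitwise AND, not string concatenation. A rough MySQL rendering of the same idea, offered only as a sketch (the aliases `s` and `n` are my own), could look like the following. It still pairs every sentence with every phrase, so with 200k sentences and 50k phrases it evaluates on the order of 10 billion comparisons, which is consistent with the timeout.

```sql
-- Sketch: MySQL version of the attempted query, using CONCAT instead of "&".
-- The cross join is kept on purpose to mirror the original; it is the expensive part.
SELECT s.sentence
FROM Sentences AS s
CROSS JOIN Negatives AS n
GROUP BY s.sentence
-- INSTR returns 0 when the padded phrase is not found in the padded sentence,
-- so a maximum of 0 means no phrase matched as a whole word.
HAVING MAX(INSTR(CONCAT(' ', s.sentence, ' '), CONCAT(' ', n.negphrase, ' '))) = 0;
```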
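My Answer
I'm going to give Richard the accepted answer because his solution does work on smaller record sets, but it does not work on larger ones. Below is the PHP code I used to load all the negative keywords into an array and then loop through that array, running one UPDATE statement per keyword to flag a new column, 'negmatch', in the Sentences table. I then use that column in a separate query: SELECT sentence FROM Sentences WHERE negmatch <> 1.
I only need to run this code once for all the negative phrases. When I add more keywords later, I use the same code without the loop to search the sentences again (code not shown below). The code takes 6.5 minutes to loop through 2800 UPDATE statements, so the initial load is fairly long, but once it is done it does not have to be done again.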
<?php
$mysqli = new mysqli("localhost", "myuser", "myuserpassword", "database");

/* check connection */
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}

if ($result = $mysqli->query("SELECT negphrase FROM Negatives")) {
    $row_cnt = $result->num_rows;
    printf("Negative keywords have %d rows.\n", $row_cnt); // print count of rows

    $negative = array();
    while ($row = $result->fetch_assoc()) { // loop through all results by row
        $negative[] = trim($row['negphrase']);
    }

    /* free result set */
    $result->close();

    $data = array_filter($negative); // drop empty phrases
    $datacount = 1;
    foreach ($data as $val) { // run one UPDATE per negative phrase
        // [[:<:]] and [[:>:]] are word-boundary markers; the phrase itself is used as a regex.
        // Escape the pattern so a quote in a phrase cannot break the SQL string.
        $pattern = $mysqli->real_escape_string('[[:<:]]' . $val . '[[:>:]]');
        $updatequery = "UPDATE Sentences SET negmatch=1 WHERE sentence REGEXP '" . $pattern . "'";
        echo $updatequery . "<br />";
        mysqli_query($mysqli, $updatequery) or die(mysqli_error($mysqli));
        echo $datacount . " " . $val . "<br />";
        $datacount++;
    }
}

$mysqli->close();
unset($result, $row, $mysqli, $negative, $data, $val, $pattern, $updatequery, $datacount, $row_cnt);
?>
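The PHP above assumes the 'negmatch' column already exists, and the answer does not show how it was created or how a single new keyword is applied later. One possible setup is sketched below. The column type and default are my assumptions, and 'new phrase' is a hypothetical keyword. Declaring the column NOT NULL with a default of 0 matters here: if unflagged rows held NULL instead, `negmatch <> 1` would not return them.

```sql
-- Assumed one-time setup: the flag column written by the UPDATE loop.
ALTER TABLE Sentences
    ADD COLUMN negmatch TINYINT(1) NOT NULL DEFAULT 0;

-- Applying one newly added keyword without the loop ('new phrase' is a placeholder).
-- Rows already flagged are skipped. The [[:<:]] / [[:>:]] markers follow the PHP code;
-- MySQL 8.0.4 and later use \\b for word boundaries instead.
UPDATE Sentences
SET negmatch = 1
WHERE negmatch = 0
  AND sentence REGEXP '[[:<:]]new phrase[[:>:]]';

-- Reading back the sentences that matched no negative phrase.
SELECT sentence
FROM Sentences
WHERE negmatch <> 1;
```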
SELECT t1.sentence FROM Sentences as t1
inner join Negatives as t2 on t1.sentence != t2.negphrase
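Make sure both columns are properly indexed.
Use a negated left join (an anti-join). It returns only the rows from the Sentences table that,
according to the matching rule, have no match in the Negatives table.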
select * from Sentences s
left join Negatives n
on (concat(" ",s.sentence," ") like concat("% ",n.negphrase," %"))
where n.negphrase is null
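The same anti-join can also be written with NOT EXISTS, which returns only the sentence column and lets the inner lookup stop at the first matching phrase. This is a sketch of an equivalent formulation, not a measured improvement: the LIKE pattern starts with a wildcard, so an index on either column is unlikely to help. Any `%` or `_` inside a phrase is treated as a wildcard by LIKE.

```sql
-- Sketch: NOT EXISTS form of the left-join anti-join above.
SELECT s.sentence
FROM Sentences AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM Negatives AS n
    -- Padding both sides with spaces makes the phrase match whole words only.
    WHERE CONCAT(' ', s.sentence, ' ') LIKE CONCAT('% ', n.negphrase, ' %')
);
```

With the sample phrases 'coats', 'math lab' and 'meth', this keeps 'methane production' (since 'meth' is not a whole word in it) and drops 'lab coats', 'math lab' and 'meth lab'.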
Tested against the following data:
CREATE TABLE IF NOT EXISTS `Negatives` (
`negphrase` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Negatives` (`negphrase`) VALUES
('coats'),
('math lab'),
('meth');
CREATE TABLE IF NOT EXISTS `Sentences` (
`sentence` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Sentences` (`sentence`) VALUES
('university lab on campus'),
('laboratory designs'),
('lab coats'),
('math lab'),
('methane production'),
('meth lab'),
('testing sentence');