php preg_replace separate dots

Question

I have a script that extracts keywords from a string. The code is:

<?php
$text = "This is some text. This is some text. Vending Machines are great.Баста - ЧК (Чистый Кайф)";
$string = preg_replace('/[^\p{L}\p{N}\s]/u', '', $text);
$string = preg_replace('/\s+/', ' ', $string);
$string = preg_replace('/\s+/', ' ', $string);
$string = mb_strtolower($string, 'UTF-8');
$keywords = explode(' ', $string);
var_dump($keywords);
?>

It works well, but I have a problem. This code returns:

array (size=15)
  0 => string 'this' (length=4)
  1 => string 'is' (length=2)
  2 => string 'some' (length=4)
  3 => string 'text' (length=4)
  4 => string 'this' (length=4)
  5 => string 'is' (length=2)
  6 => string 'some' (length=4)
  7 => string 'text' (length=4)
  8 => string 'vending' (length=7)
  9 => string 'machines' (length=8)
  10 => string 'are' (length=3)
  11 => string 'greatбаста' (length=15)
  12 => string 'чк' (length=4)
  13 => string 'чистый' (length=12)
  14 => string 'кайф' (length=8)

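Why is element 11 of the array 'greatбаста'? I want to separate the words great and баста.

I need something that replaces . with a dot and a space (. ) when the dot is immediately followed by another character.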

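Examples:
This is a good day.It is sunny => This is a good day. It is sunny (replace . with a dot and a space (. ))
This is a good day. It is sunny => This is a good day. It is sunny (nothing is replaced, because there is already a space after the dot)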

Answer 1

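The first replacement should insert a space instead of an empty string, and the final string should be trimmed before it is split.

Use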

$text = "This is some text. This is some text. Vending Machines are great.Баста - ЧК (Чистый Кайф)";
$string = preg_replace('/[^\p{L}\p{N}\s]/u', ' ', $text); // <= Replace with space
$string = preg_replace('/\s+/', ' ', $string);
$string = mb_strtolower($string, 'UTF-8');
$keywords = explode(' ', trim($string));        // <= Use trim to remove leading/trailing spaces
var_dump($keywords);

See the IDEONE demo.

I also guess you do not need the duplicated $string = preg_replace('/\s+/', ' ', $string); line.
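If the punctuation has to be kept and only the spacing fixed, as in the examples in the question, a lookahead can insert the missing space. This is a minimal sketch and not part of the answer above; the function name spaceAfterDots is made up for illustration.

```php
<?php
// Hypothetical helper, named here only for illustration.
function spaceAfterDots(string $text): string
{
    // \.(?=\S) matches a dot only when the next character is not whitespace.
    // The lookahead also fails at the end of the string, so a trailing dot is left alone.
    return preg_replace('/\.(?=\S)/u', '. ', $text);
}

echo spaceAfterDots("This is a good day.It is sunny"), "\n";
echo spaceAfterDots("This is a good day. It is sunny"), "\n";
```

Both calls print "This is a good day. It is sunny". Note that this simple pattern also splits decimals such as 3.14 and ellipses, so it probably needs tightening for text that contains them.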

Answer 2

You only need 2 regular expressions.

Find: [^\p{L}\p{N}\s.]+
Replace: nothing

Find: [\s.]+
Replace: a space

Then do an explode.

Short and to the point!!
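The two find/replace steps above could be wired together in PHP roughly as follows. This is only a sketch: the function name extractKeywords, the trim, the lowercasing (taken from the question's code), and the empty-string guard are assumptions added for illustration.

```php
<?php
// Hypothetical helper; the two patterns are the ones listed in this answer.
function extractKeywords(string $text): array
{
    // Step 1: drop everything that is not a letter, digit, whitespace, or dot.
    $s = preg_replace('/[^\p{L}\p{N}\s.]+/u', '', $text);
    // Step 2: collapse each run of whitespace and dots into a single space.
    $s = preg_replace('/[\s.]+/u', ' ', $s);
    // Lowercase as in the question, and trim so explode yields no empty edge items.
    $s = mb_strtolower(trim($s), 'UTF-8');
    // explode('') would return [''], so return an empty list in that case.
    return $s === '' ? [] : explode(' ', $s);
}

$text = "This is some text. This is some text. Vending Machines are great.Баста - ЧК (Чистый Кайф)";
print_r(extractKeywords($text));
```

With the sample text this gives 16 keywords, with great and баста as separate items. Because step 1 deletes characters rather than replacing them with a space, a hyphenated word such as foo-bar would come out joined as foobar.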

php

regex

preg-replace