PHP: split string with a list of different delimiters and keep info about the delimiters

This is effectively a combination of two SO questions that I came across but found no solution in: "split string into words by using space a delimiter" and "Split string by other strings".

Suppose the array of delimiters and the text are

$splitby = array('dlmtr1','dlmtr2','dlmtr3',' ','dlmtr5','dlmtr6');

$text = '  dlmtr1This is     the string dlmtr2dlmtr2TTTdlmtr5WWWWW ';


The expected result is:

$textArr = array('This', 'is', 'the', 'string', 'TTT', 'WWWWW');

$delimiterArr = array('  dlmtr1', ' ', '     ', ' ', ' dlmtr2dlmtr2', 'dlmtr5', ' ');

In other words, the following holds:

$text == $delimiterArr[0] . $textArr[0] . $delimiterArr[1] . $textArr[1] . ... . $delimiterArr[count($delimiterArr) - 1];

P.S. So, as you can see, each item of $delimiterArr consists of one or more delimiters.
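The identity above can be checked mechanically. The following is only a minimal sketch: `interleave` is a hypothetical helper name, not something from the question, and it assumes the text starts with a delimiter run and that the run before `TTT` includes the space that precedes it in `$text`.

```php
<?php
// Hypothetical helper: rebuilds the text by alternating delimiter runs and
// words, starting with a delimiter run. A missing word counts as ''.
function interleave(array $delimiterArr, array $textArr): string
{
    $out = '';
    foreach ($delimiterArr as $i => $delimiter) {
        $out .= $delimiter . ($textArr[$i] ?? '');
    }
    return $out;
}

$text = '  dlmtr1This is     the string dlmtr2dlmtr2TTTdlmtr5WWWWW ';
$textArr = array('This', 'is', 'the', 'string', 'TTT', 'WWWWW');
// Assumption: the fifth run carries the leading space found in $text.
$delimiterArr = array('  dlmtr1', ' ', '     ', ' ', ' dlmtr2dlmtr2', 'dlmtr5', ' ');

// True when the two arrays reconstruct the original text exactly.
$roundTrips = interleave($delimiterArr, $textArr) === $text;
var_dump($roundTrips);
```

Any candidate solution can be run through the same check: if the round trip fails, some delimiter run was split or lost.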

A possible first step toward a solution is this pattern:

$pattern = '/\s?'.implode('\s?|\s?', $splitby).'\s?/';

But whatever I try from there, I keep getting wrong results.

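Update: here is what I get. The result is close to the expected one, but the problem is that the delimiters are split apart, whereas they should come together as they appear in the text.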

$splitby = array('dlmtr1','dlmtr2','dlmtr3',' ','dlmtr5','dlmtr6');
$text = '  dlmtr1This is     the string dlmtr2dlmtr2TTTdlmtr5WWWWW ';

$pattern = '/\s?'.implode('\s?|\s?', $splitby).'\s?/';
$result = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);
preg_match_all($pattern, $text, $matches);
print_r($result);
print_r($matches[0]);

Result:

Array
(
    [0] => This
    [1] => is
    [2] => the
    [3] => string
    [4] => TTT
    [5] => WWWWW
)
Array
(
    [0] =>   
    [1] => dlmtr1 '[0] and [1] should come together
    [2] =>  
    [3] =>    
    [4] =>   
    [5] =>  
    [6] =>  dlmtr2 '[6] and [7] should come together
    [7] => dlmtr2
    [8] => dlmtr5
    [9] =>  
)

Thanks.

The code below works as expected.

$splitby = array('dlmtr1','dlmtr2','dlmtr3',' ','dlmtr5','dlmtr6');
$text = '  dlmtr1This is     the string dlmtr2dlmtr2TTTdlmtr5WWWWW ';

preg_match_all("/\s*(dlmtr[1-6])+\s*|\s+/", $text, $matches);
echo "<pre>";print_r($matches[0]);echo "</pre>";

Array
(
    [0] =>   dlmtr1
    [1] =>  
    [2] =>      
    [3] =>  
    [4] =>  dlmtr2dlmtr2
    [5] => dlmtr5
    [6] =>  
)

$result = explode(' ', trim(preg_replace("/\s*(dlmtr[1-6])+\s*|\s+/",' ', $text)));
echo "<pre>";print_r($result);echo "</pre>";

Array
(
    [0] => This
    [1] => is
    [2] => the
    [3] => string
    [4] => TTT
    [5] => WWWWW
)
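The pattern above hard-codes the delimiter names. As a possible generalization (a sketch, not the answerer's method), the pattern can be built from `$splitby` itself and both arrays collected in a single `preg_split` call with `PREG_SPLIT_DELIM_CAPTURE`. It assumes no delimiter is a prefix of another, and it treats only the literal entries of `$splitby` as delimiters, so tabs or newlines are not matched the way `\s` would match them. The variable names `$alternatives` and `$parts` are illustrative.

```php
<?php
$splitby = array('dlmtr1', 'dlmtr2', 'dlmtr3', ' ', 'dlmtr5', 'dlmtr6');
$text = '  dlmtr1This is     the string dlmtr2dlmtr2TTTdlmtr5WWWWW ';

// Escape each delimiter so regex metacharacters are taken literally.
$alternatives = array_map(function ($d) {
    return preg_quote($d, '/');
}, $splitby);

// One or more consecutive delimiters form a single captured run.
$pattern = '/((?:' . implode('|', $alternatives) . ')+)/';

// With DELIM_CAPTURE the result alternates: text, run, text, run, ...
// Even indexes hold text (possibly empty at the ends), odd indexes hold runs.
$parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

$textArr = array();
$delimiterArr = array();
foreach ($parts as $i => $part) {
    if ($i % 2 === 1) {
        $delimiterArr[] = $part;
    } elseif ($part !== '') {
        $textArr[] = $part;
    }
}

print_r($textArr);
print_r($delimiterArr);
```

Because nothing is discarded by `preg_split` here, `implode('', $parts)` gives back `$text`, which makes the result easy to verify. If the text does not begin with a delimiter, `$textArr[0]` comes before `$delimiterArr[0]`, so the concatenation order from the question would need adjusting.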