Keep text within quotation marks intact while splitting text
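I need the data inside quotation marks in the string [$str] not to be split.
In this case, "Accounting company" should stay together as one string rather than being broken apart.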
<?php
$str =
'#PROGRAM "Accounting company" 98.2
#GENERATED 2020715 "SE"';
$data = explode("\n", $str);
foreach($data as &$value){
$value = preg_split("/\s+/", $value);
}
var_dump($data);
Result:
array(2) {
[0]=>
array(4) {
[0]=>
string(8) "#PROGRAM"
[1]=>
string(11) ""Accounting" // Unwanted split
[2]=>
string(8) "company"" // Unwanted split
[3]=>
string(4) "98.2"
}
[1]=>
&array(4) {
[0]=>
string(0) ""
[1]=>
string(10) "#GENERATED"
[2]=>
string(7) "2020715"
[3]=>
string(4) ""SE""
}
}
Desired result:
array(2) {
[0]=>
array(3) {
[0]=>
string(8) "#PROGRAM"
[1]=>
string(20) ""Accounting company""
[2]=>
string(4) "98.2"
}
[1]=>
&array(4) {
[0]=>
string(0) ""
[1]=>
string(10) "#GENERATED"
[2]=>
string(7) "2020715"
[3]=>
string(4) ""SE""
}
}
Here is a solution without regular expressions:
$str =
'#PROGRAM "Accounting company" 98.2
#GENERATED 2020715 "SE"';
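$data = [];
$rows = explode("\n", $str);
foreach ($rows as $row) {
    // Reset per row so an unbalanced quote cannot leak into the next line.
    $quoted = false;
    $index = 0;
    $temp = [];
    $length = strlen($row);
    for ($i = 0; $i < $length; $i++) {
        // A double quote toggles the "inside quotes" state.
        if ($row[$i] === '"') $quoted = !$quoted;
        // A space outside quotes ends the current token.
        if ($row[$i] === " " && !$quoted) {
            $index++;
            continue;
        }
        $temp[$index] = ($temp[$index] ?? "") . $row[$i];
    }
    // Reindex to close the gaps left by leading or repeated spaces.
    $data[] = array_values($temp);
}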
var_dump($data);
Result:
array(2) {
[0]=>
array(3) {
[0]=>
string(8) "#PROGRAM"
[1]=>
string(20) ""Accounting company""
[2]=>
string(4) "98.2"
}
[1]=>
array(3) {
[0]=>
string(10) "#GENERATED"
[1]=>
string(7) "2020715"
[2]=>
string(4) ""SE""
}
}
I'm still looking for a regex solution, though :)
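You can use a SKIP FAIL pattern to skip over everything matched from an opening to a closing double quote, and then split on one or more horizontal whitespace characters. Used as-is, this keeps the empty element at [1][0], which appears when the second line of $str starts with whitespace: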
"[^"]*"(*SKIP)(*FAIL)|\h+
$str =
'#PROGRAM "Accounting company" 98.2
#GENERATED 2020715 "SE"';
$data = explode("\n", $str);
foreach ($data as &$value) {
    $value = preg_split("/\"[^\"]*\"(*SKIP)(*FAIL)|\h+/", $value);
}
unset($value); // break the reference left over from the loop
print_r($data);
Output:
Array
(
[0] => #PROGRAM
[1] => "Accounting company"
[2] => 98.2
)
Array
(
[0] =>
[1] => #GENERATED
[2] => 2020715
[3] => "SE"
)
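To make it easier to see what the two alternatives in the pattern do, here is a small sketch that lists where the pattern matches on a single line instead of splitting on it. The variable names are only for illustration. The left alternative consumes a whole quoted section and then discards it with (*SKIP)(*FAIL), so the space inside the quotes is never offered as a delimiter; only the right alternative, \h+, ever produces a match.

```php
<?php
$line = '#PROGRAM "Accounting company" 98.2';
$pattern = '/"[^"]*"(*SKIP)(*FAIL)|\h+/';

// PREG_OFFSET_CAPTURE returns each match as [matched text, byte offset].
preg_match_all($pattern, $line, $matches, PREG_OFFSET_CAPTURE);

// Keep only the offsets of the matched delimiters.
$offsets = array_column($matches[0], 1);
print_r($offsets); // offsets 8 and 29; the space at offset 20 (inside the quotes) is skipped
```

These two offsets are exactly the positions at which preg_split cuts the line.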
If you don't want the empty entry in the second array, you can use the PREG_SPLIT_NO_EMPTY flag:
$value = preg_split("/\"[^\"]*\"(*SKIP)(*FAIL)|\h+/", $value, -1, PREG_SPLIT_NO_EMPTY);
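Putting the pieces together, a minimal sketch of the whole thing as one function might look like the following. The function name splitOutsideQuotes is hypothetical, and the sample input assumes the second line is indented, which is what would produce the empty leading entry without the flag. Collecting the rows into a new array also avoids the by-reference foreach.

```php
<?php
// Hypothetical helper: split every line of $text on horizontal whitespace
// that lies outside double quotes.
function splitOutsideQuotes(string $text): array
{
    $pattern = '/"[^"]*"(*SKIP)(*FAIL)|\h+/';
    $rows = [];
    foreach (explode("\n", $text) as $line) {
        // PREG_SPLIT_NO_EMPTY drops the empty token caused by leading whitespace.
        $rows[] = preg_split($pattern, $line, -1, PREG_SPLIT_NO_EMPTY);
    }
    return $rows;
}

// Assumed sample input: the second line starts with two spaces.
$str = "#PROGRAM \"Accounting company\" 98.2\n  #GENERATED 2020715 \"SE\"";
$result = splitOutsideQuotes($str);
print_r($result);
```

Note that \h does not match a carriage return, so if the input uses "\r\n" line endings, the last token of each line would probably keep a trailing "\r" unless the lines are trimmed first.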