Using preg_replace() to convert CamelCase to snake_case

I currently have a method that converts my camelCase strings to snake case, but it is split across three calls to preg_replace():

public function camelToUnderscore($string, $us = "-")
{
    // insert hyphen between any letter and the beginning of a numeric chain
    $string = preg_replace('/([a-z]+)([0-9]+)/i', '$1'.$us.'$2', $string);
    // insert hyphen between any lower-to-upper-case letter chain
    $string = preg_replace('/([a-z]+)([A-Z]+)/', '$1'.$us.'$2', $string);
    // insert hyphen between the end of a numeric chain and the beginning of an alpha chain
    $string = preg_replace('/([0-9]+)([a-z]+)/i', '$1'.$us.'$2', $string);

    // Lowercase
    $string = strtolower($string);

    return $string;
}

I wrote tests to verify its accuracy, and it works correctly with the following array of inputs (array('input' => 'output')):

$test_values = [
    'foo'       => 'foo',
    'fooBar'    => 'foo-bar',
    'foo123'    => 'foo-123',
    '123Foo'    => '123-foo',
    'fooBar123' => 'foo-bar-123',
    'foo123Bar' => 'foo-123-bar',
    '123FooBar' => '123-foo-bar',
];
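A check like the one described could be sketched as follows. This is only a minimal harness, not the asker's actual test code: the standalone function name camelToUnderscoreThreeStep and the helper findFailures are hypothetical, and the replacement string assumes the two capture groups are re-emitted around the separator.

```php
<?php
// Standalone copy of the three-step conversion, written as a plain
// function (hypothetical name) so it can run outside a class.
function camelToUnderscoreThreeStep(string $string, string $us = '-'): string
{
    // letters followed by digits
    $string = preg_replace('/([a-z]+)([0-9]+)/i', '$1' . $us . '$2', $string);
    // lowercase run followed by uppercase run
    $string = preg_replace('/([a-z]+)([A-Z]+)/', '$1' . $us . '$2', $string);
    // digits followed by letters
    $string = preg_replace('/([0-9]+)([a-z]+)/i', '$1' . $us . '$2', $string);

    return strtolower($string);
}

$test_values = [
    'foo'       => 'foo',
    'fooBar'    => 'foo-bar',
    'foo123'    => 'foo-123',
    '123Foo'    => '123-foo',
    'fooBar123' => 'foo-bar-123',
    'foo123Bar' => 'foo-123-bar',
    '123FooBar' => '123-foo-bar',
];

// Hypothetical helper: returns input => actual output for every case
// whose result differs from the expected value.
function findFailures(callable $convert, array $cases): array
{
    $failures = [];
    foreach ($cases as $input => $expected) {
        $actual = $convert((string) $input);
        if ($actual !== $expected) {
            $failures[$input] = $actual;
        }
    }
    return $failures;
}

$failures = findFailures('camelToUnderscoreThreeStep', $test_values);
echo $failures === [] ? "all cases pass\n" : print_r($failures, true);
```

Because findFailures takes any callable, the same table can be reused to check a one-line replacement for the function.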

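I would like to know whether there is a way to reduce my preg_replace() calls to a single line that gives me the same result. Any ideas?

NOTE: Referring to this post, my research turned up a preg_replace() regular expression that gets me almost the result I want, except that it does not convert foo123 to foo-123.

From a colleague: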

$string = preg_replace(array($pattern1, $pattern2), $us.'', $string); might work

My solution:

public function camelToUnderscore($string, $us = "-")
{
    $patterns = [
        '/([a-z]+)([0-9]+)/i',
        '/([a-z]+)([A-Z]+)/',
        '/([0-9]+)([a-z]+)/i'
    ];
    $string = preg_replace($patterns, '$1'.$us.'$2', $string);

    // Lowercase
    $string = strtolower($string);

    return $string;
}
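For anyone wondering how the array form behaves: preg_replace() applies the patterns in array order, each one to the result of the previous, and reuses a single string replacement for all of them. A small sketch comparing it to an explicit chain (variable names here are only for illustration, and the separator is hard-coded as a hyphen):

```php
<?php
$patterns = [
    '/([a-z]+)([0-9]+)/i',
    '/([a-z]+)([A-Z]+)/',
    '/([0-9]+)([a-z]+)/i',
];
$replacement = '$1-$2';
$input = 'fooBar123';

// Array form: one call, patterns applied in array order.
$viaArray = preg_replace($patterns, $replacement, $input);

// Chained form: one explicit call per pattern.
$viaChain = $input;
foreach ($patterns as $pattern) {
    $viaChain = preg_replace($pattern, $replacement, $viaChain);
}

var_dump($viaArray === $viaChain);  // bool(true)
echo strtolower($viaArray), "\n";   // foo-bar-123
```

So the array version is a shorter way of writing the same three passes, not a single-pass regex.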

You can use lookarounds to do all of that in a single regex:

function camelToUnderscore($string, $us = "-") {
    return strtolower(preg_replace(
        '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/', $us, $string));
}

RegEx Demo

Code Demo

Regex explanation:

(?<=\d)(?=[A-Za-z])  # if previous position has a digit and next has a letter
|                    # OR
(?<=[A-Za-z])(?=\d)  # if previous position has a letter and next has a digit
|                    # OR
(?<=[a-z])(?=[A-Z])  # if previous position has a lowercase and next has an uppercase letter
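
Since every alternative is built only from lookarounds, the pattern matches zero-width positions rather than characters. One way to see those positions is to feed the same pattern to preg_split(). This is just an illustrative sketch, and splitAtBoundaries is a hypothetical helper name:

```php
<?php
// Same boundary pattern as in the answer above.
$boundary = '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/';

// Hypothetical helper: split a string at every zero-width boundary.
function splitAtBoundaries(string $string, string $boundary): array
{
    return preg_split($boundary, $string);
}

foreach (['fooBar', 'foo123Bar', '123FooBar'] as $input) {
    $parts = splitAtBoundaries($input, $boundary);
    echo $input, ' => ', implode(' | ', $parts), "\n";
}
// fooBar => foo | Bar
// foo123Bar => foo | 123 | Bar
// 123FooBar => 123 | Foo | Bar
```

preg_replace() with the separator as the replacement inserts it at exactly these split points, which is why no capture groups are needed.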

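Here are my two cents, based on the duplicate post that was flagged earlier. The accepted solution here is great; I just wanted to try solving it with what was shared there: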

function camelToUnderscore($string, $us = "-") {
    return strtolower(preg_replace('/(?<!^)[A-Z]+|(?<!^|\d)[\d]+/', $us.'$0', $string));
}

Example:

Array
(
    [0] => foo
    [1] => fooBar
    [2] => foo123
    [3] => 123Foo
    [4] => fooBar123
    [5] => foo123Bar
    [6] => 123FooBar
)

foreach ($arr as $item) {
    echo camelToUnderscore($item);
    echo "\r\n";
}

Output:

foo
foo-bar
foo-123
123-foo
foo-bar-123
foo-123-bar
123-foo-bar

Explanation:

(?<!^)[A-Z]+      // Match one or more capital letters not at the start of the string
|                 // OR
(?<!^|\d)[\d]+    // Match one or more digits not at the start of the string and not preceded by a digit

$us.'$0'          // Replace each match with the separator followed by the match itself

online regex

The question is already solved, so I won't say I hope this helps, but maybe someone will find it useful.


EDIT

This regex has limitations:

foo123bar => foo-123bar
fooBARFoo => foo-barfoo

Thanks to @urban for pointing this out. Here is his link to tests of the three solutions posted on this question:

three solutions demo
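
The two limitation cases can also be reproduced locally by running the lookaround version and the match-based version side by side. This is a rough sketch only: the function names viaLookarounds and viaMatchPrefix are made up for the comparison, and the regexes are copied from the answers above.

```php
<?php
// Lookaround version: inserts the separator at zero-width boundaries.
function viaLookarounds(string $string, string $us = '-'): string
{
    return strtolower(preg_replace(
        '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/',
        $us,
        $string
    ));
}

// Match-based version: prefixes runs of capitals or digits with the separator.
function viaMatchPrefix(string $string, string $us = '-'): string
{
    return strtolower(preg_replace(
        '/(?<!^)[A-Z]+|(?<!^|\d)[\d]+/',
        $us . '$0',
        $string
    ));
}

foreach (['foo123bar', 'fooBARFoo'] as $input) {
    printf(
        "%-10s lookarounds: %-12s match-prefix: %s\n",
        $input,
        viaLookarounds($input),
        viaMatchPrefix($input)
    );
}
```

With these definitions, foo123bar should come out as foo-123-bar from the lookaround version and foo-123bar from the match-based one, while fooBARFoo gives foo-barfoo from both. Whether that second result is acceptable depends on how consecutive capitals are meant to be split.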