Python 的模式匹配性能如何?是 O(1) 吗?
How performant is Python's pattern matching? Is it O(1)?
据我们所知,Python 将支持 pattern matching in 3.10. Putting usage and syntax aside, I can't seem to find any word about performance in the PEP, or the official tutorial。
所以我想知道,它的性能如何?它是 O(1) 还是与 if..elif..else
相同?
模式匹配脱糖成链式 elif 语句和 isinsance 调用。它绝对不是 O(1) 或接近于任何类型的查找 table.
编辑:更准确地说,在 merged pull request 中,您可以看到 match
语句变成了负责处理各种模式的操作码链。这些操作码按顺序处理,很像链式 elif
语句。
单从时间复杂度来说,两种构造基本等价
但是,在考虑可读性和代码结构时,它们非常不同。作为一般规则,如果您匹配 值 或测试多个主题,if
-elif
-else
阶梯可能是最好的工作工具:
def stringify_response(response: int) -> str:
if 100 <= response < 200:
kind = "info"
elif 200 <= response < 300:
kind = "success"
elif 300 <= response < 400:
kind = "redirect"
elif 400 <= response < 500:
kind = "client error"
elif 500 <= response < 600:
kind = "server error"
else:
kind = "unexpected response"
return f"{kind} ({response})"
另一方面,如果您要匹配单个主题的 结构,那么 结构 模式匹配可能是完成这项工作的最佳工具:
from ast import *
def lint_function_def(function_def: FunctionDef) -> None:
match function_def.body:
case *_, Assign([Name(na)], e), Return(Name(nr, lineno=l)) if na == nr:
print(f"line {l}: consider returning {unparse(e)} instead of "
f"assigning to {na!r} on line {e.lineno}")
未来的改进
话虽如此,模式匹配的句法支持的好处之一是编译器和解释器拥有非凡的上下文知识。正因为如此,他们可以做出普通控制流做不到的假设。
考虑以下来自 PEP 636 的陈述(稍微简化):
match command.split():
case ["north"] | ["go", "north"]:
# Code for moving north.
case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
# Code for picking up the given object.
“当前”(CPython 3.10) 模式编译器非常幼稚,将其编译为如下内容:
_subject = command.split()
import _collections_abc
if (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 1
and _subject[0] == "north"
):
# Code for moving north.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 2
and _subject[0] == "go"
and _subject[1] == "north"
):
# Code for moving north.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 2
and _subject[0] == "get"
):
obj = _subject[1]
# Code for picking up the given object.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 3
and _subject[0] == "pick"
and _subject[1] == "up"
):
obj = _subject[2]
# Code for picking up the given object.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 3
and _subject[0] == "pick"
and _subject[2] == "up"
):
obj = _subject[1]
# Code for picking up the given object.
虽然这个例子很简单,但生成的代码充满了冗余检查。匹配命令"pick Spam up"
检查isinstance(_subject, _collections_abc.Sequence)
五次,调用len(_subject)
五次,检查_subject[0] == "pick"
两次!
但是,编译器完全有可能生成更高效的“决策树”,而不执行不必要的工作。这可能是这样的:
_subject = command.split()
import _collections_abc
if isinstance(_subject, _collections_abc.Sequence):
_len_subject = len(_subject)
if _len_subject == 1:
if _subject[0] == "north":
# Code for moving north.
elif _len_subject == 2:
_subject_0 = _subject[0]
if _subject_0 == "go":
if _subject[1] == "north":
# Code for moving north.
elif _subject_0 == "get":
obj = _subject[1]
# Code for picking up the given object.
elif _len_subject == 3:
if _subject[0] == "pick":
_subject_1 = _subject[1]
if _subject_1 == "up":
obj = _subject[2]
# Code for picking up the given object.
elif _subject[2] == "up":
obj = _subject_1
# Code for picking up the given object.
当你有守卫或嵌套模式时,它开始变得更加复杂......但我希望我们能够在 CPython 3.11 中加入这样的东西!
据我们所知,Python 将支持 pattern matching in 3.10. Putting usage and syntax aside, I can't seem to find any word about performance in the PEP, or the official tutorial。
所以我想知道,它的性能如何?它是 O(1) 还是与 if..elif..else
相同?
模式匹配脱糖成链式 elif 语句和 isinsance 调用。它绝对不是 O(1) 或接近于任何类型的查找 table.
编辑:更准确地说,在 merged pull request 中,您可以看到 match
语句变成了负责处理各种模式的操作码链。这些操作码按顺序处理,很像链式 elif
语句。
单从时间复杂度来说,两种构造基本等价
但是,在考虑可读性和代码结构时,它们非常不同。作为一般规则,如果您匹配 值 或测试多个主题,if
-elif
-else
阶梯可能是最好的工作工具:
def stringify_response(response: int) -> str:
if 100 <= response < 200:
kind = "info"
elif 200 <= response < 300:
kind = "success"
elif 300 <= response < 400:
kind = "redirect"
elif 400 <= response < 500:
kind = "client error"
elif 500 <= response < 600:
kind = "server error"
else:
kind = "unexpected response"
return f"{kind} ({response})"
另一方面,如果您要匹配单个主题的 结构,那么 结构 模式匹配可能是完成这项工作的最佳工具:
from ast import *
def lint_function_def(function_def: FunctionDef) -> None:
match function_def.body:
case *_, Assign([Name(na)], e), Return(Name(nr, lineno=l)) if na == nr:
print(f"line {l}: consider returning {unparse(e)} instead of "
f"assigning to {na!r} on line {e.lineno}")
未来的改进
话虽如此,模式匹配的句法支持的好处之一是编译器和解释器拥有非凡的上下文知识。正因为如此,他们可以做出普通控制流做不到的假设。
考虑以下来自 PEP 636 的陈述(稍微简化):
match command.split():
case ["north"] | ["go", "north"]:
# Code for moving north.
case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
# Code for picking up the given object.
“当前”(CPython 3.10) 模式编译器非常幼稚,将其编译为如下内容:
_subject = command.split()
import _collections_abc
if (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 1
and _subject[0] == "north"
):
# Code for moving north.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 2
and _subject[0] == "go"
and _subject[1] == "north"
):
# Code for moving north.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 2
and _subject[0] == "get"
):
obj = _subject[1]
# Code for picking up the given object.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 3
and _subject[0] == "pick"
and _subject[1] == "up"
):
obj = _subject[2]
# Code for picking up the given object.
elif (
isinstance(_subject, _collections_abc.Sequence)
and len(_subject) == 3
and _subject[0] == "pick"
and _subject[2] == "up"
):
obj = _subject[1]
# Code for picking up the given object.
虽然这个例子很简单,但生成的代码充满了冗余检查。匹配命令"pick Spam up"
检查isinstance(_subject, _collections_abc.Sequence)
五次,调用len(_subject)
五次,检查_subject[0] == "pick"
两次!
但是,编译器完全有可能生成更高效的“决策树”,而不执行不必要的工作。这可能是这样的:
_subject = command.split()
import _collections_abc
if isinstance(_subject, _collections_abc.Sequence):
_len_subject = len(_subject)
if _len_subject == 1:
if _subject[0] == "north":
# Code for moving north.
elif _len_subject == 2:
_subject_0 = _subject[0]
if _subject_0 == "go":
if _subject[1] == "north":
# Code for moving north.
elif _subject_0 == "get":
obj = _subject[1]
# Code for picking up the given object.
elif _len_subject == 3:
if _subject[0] == "pick":
_subject_1 = _subject[1]
if _subject_1 == "up":
obj = _subject[2]
# Code for picking up the given object.
elif _subject[2] == "up":
obj = _subject_1
# Code for picking up the given object.
当你有守卫或嵌套模式时,它开始变得更加复杂......但我希望我们能够在 CPython 3.11 中加入这样的东西!