How to express and enforce that a class has 2 modes of operation, each having some valid and invalid methods

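I'm quite new to type checking in Python. I'd like to find a way to use it to check for this common situation:

  1. A class (e.g. my DbQuery class) is instantiated in some uninitialized state. For example, I'm a DB querier, but I haven't connected to a DB yet. You could say (abstractly) that the instance is of type 'Unconnected Db Query Connector'.
  2. The user calls .connect(), which sets the class instance to connected. The instance can now be thought of as belonging to a new category (protocol?). You could say the instance is now of type 'Connected DB Query Connector'...
  3. The user calls .query() etc. to use the class. The query method is annotated to express that self must be a 'Connected DB Query Connector' in this case.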

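The incorrect usage I'd like to detect automatically: the user instantiates the DB connector and then calls query() without calling connect() first.

Is there a way to express this with annotations? Can I express that the connect() method causes 'self' to join a new type? Or is that even the right approach?

Is there some other standard mechanism in Python or mypy for expressing and detecting this?

I might be able to see how this could be expressed with inheritance... I'm not sure.

Thanks in advance!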

EDIT:

Here's what I wish I could do:

from typing import Union, Optional, NewType, Protocol, cast


class Connector:
    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, sql: str) -> str:
        return f"I ran {sql} on {self.host}"


# This is a version of class 'A' where conn is None and you can't call query()
class NoQuery(Protocol):
    conn: None


# This is a version of class 'A' where conn is initialized. You can query, but you cant call connect()
class CanQuery(Protocol):
    conn: Connector


# This class starts its life as a NoQuery. Should switch personality when connect() is called
class A(NoQuery):
    def __init__(self) -> None:
        self.conn = None

    def query(self: CanQuery, sql: str) -> str:
        return self.conn.run(sql)

    def connect(self: NoQuery, host: str):
        # Attempting to change from 'NoQuery' to 'CanQuery' like this
        # mypy complains: Incompatible types in assignment (expression has type "CanQuery", variable has type "NoQuery")
        self = cast(CanQuery, self)
        self.conn = Connector(host)


a = A()
a.connect('host.domain')
print(a.query('SELECT field FROM table'))


b = A()
# mypy should help me spot this. I'm trying to query an unconnected host. self.conn is None
print(b.query('SELECT oops'))

To me, this is a common scenario (an object that has a few distinct and very meaningful modes of operation). Is there no way to express this in mypy?

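You may be able to hack something together by making your A class a generic type, (ab)using literal enums, and annotating the self parameter, but frankly I don't think that's a good idea.

Mypy generally assumes that calling a method does not change the type of the object it is called on, and it is probably not possible to get around that without resorting to gross hacks and a bunch of casts or # type: ignore comments.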
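To make that concrete, here is a minimal sketch (the DbQuery class and its error message are illustrative assumptions, not code from the question). The attribute is declared Optional, and calling connect() does nothing to the instance's static type, so the "must be connected" rule can only be enforced with a runtime check:

```python
from typing import Optional


class Connector:
    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, sql: str) -> str:
        return f"I ran {sql} on {self.host}"


class DbQuery:  # hypothetical name, for illustration only
    def __init__(self) -> None:
        self.conn: Optional[Connector] = None

    def connect(self, host: str) -> None:
        # Mutates the instance; its static type is still just DbQuery.
        self.conn = Connector(host)

    def query(self, sql: str) -> str:
        # The checker only knows conn is Optional[Connector], so the
        # "connected" requirement has to be checked at runtime.
        if self.conn is None:
            raise RuntimeError("call connect() before query()")
        return self.conn.run(sql)


q = DbQuery()
q.connect("host.domain")
print(q.query("SELECT 1"))
```

This type checks, but so does calling DbQuery().query(...) without connecting first, which is exactly the mistake the question wants to catch statically.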

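Instead, the standard convention is to use two classes, a "connection" object and a "query" object, together with a context manager. As a side benefit, this also lets you ensure your connection is always closed once you're done using it.

For example: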

from typing import Union, Optional, Iterator
from contextlib import contextmanager


class RawConnector:
    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, sql: str) -> str:
        return f"I ran {sql} on {self.host}"

    def close(self) -> None:
        print("Closing connection!")


class Database:
    def __init__(self, host: str) -> None:
        self.host = host

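    @contextmanager
    def connect(self) -> Iterator["Connection"]:
        # "Connection" is quoted because the class is defined further down.
        conn = RawConnector(self.host)
        try:
            yield Connection(conn)
        finally:
            # Runs even if the body of the with block raises.
            conn.close()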


class Connection:
    def __init__(self, conn: RawConnector) -> None:
        self.conn = conn

    def query(self, sql: str) -> str:
        return self.conn.run(sql)

db = Database("my-host")
with db.connect() as conn:
    conn.query("some sql")
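The "always closed" guarantee depends on how the generator is written. The sketch below assumes the yield is wrapped in try/finally; without that, an exception raised inside the with body would skip close(). FakeConnector and the events list are hypothetical stand-ins used only to record what happens:

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []  # records the order of events for inspection


class FakeConnector:  # hypothetical stand-in for a real connection
    def close(self) -> None:
        events.append("closed")


@contextmanager
def connect() -> Iterator[FakeConnector]:
    conn = FakeConnector()
    try:
        yield conn
    finally:
        # Executed whether the with body finishes normally or raises.
        conn.close()


try:
    with connect() as conn:
        raise ValueError("query failed")
except ValueError:
    events.append("error handled")

print(events)
```

The connection is closed before the exception reaches the except clause, so "closed" is recorded first.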

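If you really want to combine these two new classes into one, you can do so by (ab)using literal types, generics, and self annotations, as long as you stay within the constraint that a change of personality can only be expressed by returning an instance with the new type.

For example: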

# If you are using Python 3.8+, you can import 'Literal' directly from
# typing. But if you need to support older Pythons, you'll need to
# pip-install typing_extensions and import from there.
from typing import Union, Optional, Iterator, TypeVar, Generic, cast
from typing_extensions import Literal
from contextlib import contextmanager
from enum import Enum


class RawConnector:
    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, sql: str) -> str:
        return f"I ran {sql} on {self.host}"

    def close(self) -> None:
        print("Closing connection!")

class State(Enum):
    Unconnected = 0
    Connected = 1

# Type aliases here for readability. We use an enum and Literal
# types mostly so we can give each of our states a nice name. We
# could have also created an empty 'State' class and created an
# 'Unconnected' and 'Connected' subclasses: all that matters is we
# have one distinct type per state/per "personality".
Unconnected = Literal[State.Unconnected]
Connected = Literal[State.Connected]

T = TypeVar('T', bound=State)

class Connection(Generic[T]):
    # The self annotations are quoted because Connection is not yet
    # defined while its own class body is being executed.
    def __init__(self: "Connection[Unconnected]") -> None:
        self.conn: Optional[RawConnector] = None

    def connect(self: "Connection[Unconnected]", host: str) -> "Connection[Connected]":
        self.conn = RawConnector(host)
        # Important! We *return* the new type!
        return cast("Connection[Connected]", self)

    def query(self: "Connection[Connected]", sql: str) -> str:
        assert self.conn is not None
        return self.conn.run(sql)


c1 = Connection()
c2 = c1.connect("foo")
c2.query("some-sql")

# Does not type check, since types of c1 and c2 do not match declared self types
c1.query("bad")
c2.connect("bad")

Basically, this works as long as we stick to returning new instances (even though at runtime we always return just 'self').
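The comments in the example mention that empty marker classes would work just as well as an enum with Literal types. A sketch of that variant, which needs nothing beyond the standard typing module (the marker classes are the only new names; close() is omitted for brevity):

```python
from typing import Generic, Optional, TypeVar, cast


class RawConnector:
    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, sql: str) -> str:
        return f"I ran {sql} on {self.host}"


# Empty marker classes: one distinct type per state.
class State: ...
class Unconnected(State): ...
class Connected(State): ...


T = TypeVar("T", bound=State)


class Connection(Generic[T]):
    def __init__(self: "Connection[Unconnected]") -> None:
        self.conn: Optional[RawConnector] = None

    def connect(self: "Connection[Unconnected]", host: str) -> "Connection[Connected]":
        self.conn = RawConnector(host)
        # Same object at runtime; only the static type changes.
        return cast("Connection[Connected]", self)

    def query(self: "Connection[Connected]", sql: str) -> str:
        assert self.conn is not None
        return self.conn.run(sql)


c1 = Connection()
c2 = c1.connect("foo")
print(c2.query("some-sql"))
print(c2 is c1)
```

Note that c1 and c2 are the same object, so the protection is purely static: nothing stops c1.query(...) from succeeding at runtime once connect() has been called. It is only the type checker that is expected to reject it.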

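With a bit more cleverness and a few compromises, you might even be able to get rid of the cast when transitioning from one state to another.

But to be honest, I think this kind of trick is overkill and probably inappropriate for what you seem to be trying to do. I would personally recommend the two classes + contextmanager approach.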