peewee: Python int too large to convert to SQLite INTEGER

I have the following code:

from peewee import Model, CharField, BigIntegerField, DateTimeField, SqliteDatabase
db = SqliteDatabase('samples.db')

class UnsignedIntegerField(BigIntegerField):
    field_type = 'int unsigned'

class MyBaseModel(Model):
    class Meta:
        database = db

class Sample(MyBaseModel):
    myint = UnsignedIntegerField(index=True)
    description = CharField(default="")


Sample.create_table()

sample = Sample()
sample.description = 'description'
sample.myint = '15235670141346654134'
sample.save()

The code gives the error mentioned above. After that, if I insert the value manually, I have no problem:

insert into sample (description,myint) values('description',15235670141346654134);

The schema is shown like this:

 CREATE TABLE IF NOT EXISTS "sample" ("id" INTEGER NOT NULL PRIMARY KEY, "myint" int unsigned NOT NULL, "description" VARCHAR(255) NOT NULL);
CREATE INDEX "sample_myint" ON "sample" ("myint");

For some reason, peewee cannot use SQLite's unsigned int. I did try this with the latest sqlite driver on cygwin. I am using python3.

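SQLite integers are signed 64-bit values. SQLite has no concept of unsigned types at all. What it does do is accept arbitrary strings as column types very liberally; CREATE TABLE ex(col fee fie fo fum); is valid. See the documentation for how column types translate to column affinity, and for other important details.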

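So the largest number it can hold as an integer is 9223372036854775807, which is smaller than the 15235670141346654134 you are trying to insert. You need to tell your ORM to store it as a string or blob, and to convert between that and Python's arbitrary-size integers.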
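As a rough illustration of the "string or blob" approach, here is a sketch that uses the standard-library sqlite3 module rather than peewee. The helper names u64_to_blob and blob_to_u64 are made up for this example, and it assumes the values are non-negative and fit in 64 bits. A fixed-width big-endian blob is used because SQLite compares blobs bytewise, which keeps ORDER BY consistent with numeric order for such values.

```python
import sqlite3

def u64_to_blob(value: int) -> bytes:
    # Raises OverflowError if value is negative or needs more than 8 bytes.
    return value.to_bytes(8, "big")

def blob_to_u64(blob: bytes) -> int:
    return int.from_bytes(blob, "big")

conn = sqlite3.connect(":memory:")
# BLOB affinity: SQLite stores the bytes untouched, with no numeric coercion.
conn.execute("CREATE TABLE sample (description TEXT, myint BLOB)")

big = 15235670141346654134
for description, value in [("description", big), ("small", 42)]:
    conn.execute(
        "INSERT INTO sample (description, myint) VALUES (?, ?)",
        (description, u64_to_blob(value)),
    )

rows = conn.execute(
    "SELECT description, myint FROM sample ORDER BY myint"
).fetchall()
restored = [(description, blob_to_u64(blob)) for description, blob in rows]
print(restored)
```

The trade-off is that SQL arithmetic and aggregates such as SUM no longer work on that column.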

When you do a manual

insert into sample (description,myint) values('description',15235670141346654134);

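and then look at that row in the table, you will see that the very large number was converted to a floating-point (REAL) value (any numeric literal too large for an integer is treated as one). This is unlikely to be what you want, because it loses data.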
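A small check with Python's sqlite3 module, using the schema from the question minus the id column, gives an idea of what happens to the value. The number is written into the SQL text as a literal, as in the manual insert above, rather than bound as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE sample ("myint" int unsigned NOT NULL, '
    '"description" VARCHAR(255) NOT NULL)'
)
original = 15235670141346654134
# The number is part of the SQL text, so SQLite parses the literal itself.
conn.execute(
    "INSERT INTO sample (description, myint) "
    f"VALUES ('description', {original})"
)
storage_class, stored = conn.execute(
    "SELECT typeof(myint), myint FROM sample"
).fetchone()
print(storage_class)            # storage class SQLite chose for the value
print(stored)                   # comes back as a Python float
print(int(stored) == original)  # False: the low-order digits are gone
```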

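15235670141346654134 is too large to store in a 64-bit integer. When you try to bind that value to an SQLite prepared statement (which is what the Python sqlite3 driver does), it overflows. The reason it appears to work in the shell is that SQLite is probably doing a different kind of conversion there (e.g. treating it as a float or a string).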

This is not a problem with a particular ORM but a limitation of SQLite itself. There are a few workarounds.

Reading the docs

The SQLite documentation says, under Storage Classes and Datatypes:

Each value stored in an SQLite database [...] has one of the following storage classes:

  • NULL. The value is a NULL value.
  • INTEGER. The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
  • REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number.
  • TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
  • BLOB. The value is a blob of data, stored exactly as it was input.

[...] The INTEGER storage class, for example, includes 6 different integer datatypes of different lengths. This makes a difference on disk. But as soon as INTEGER values are read off of disk and into memory for processing, they are converted to the most general datatype (8-byte signed integer).
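To see the five storage classes from Python, one option is to ask SQLite for typeof() on each stored value. This is only a sketch: the table t and its untyped column v are invented for the example, and the column is left without a declared type so that no affinity conversion gets in the way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No declared column type, so values are stored without any coercion.
conn.execute("CREATE TABLE t (v)")

values = [None, 1, 1.5, "text", b"\x00\x01"]
conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])

classes = [
    row[0] for row in conn.execute("SELECT typeof(v) FROM t ORDER BY rowid")
]
print(classes)  # ['null', 'integer', 'real', 'text', 'blob']
```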

And under Type Affinity, on SQLite's dynamically typed nature:

[...] SQLite supports the concept of "type affinity" on columns. The type affinity of a column is the recommended type for data stored in that column. The important idea here is that the type is recommended, not required. Any column can still store any type of data. It is just that some columns, given the choice, will prefer to use one storage class over another.

[...]

A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if the text is a well-formed integer or real literal, respectively. If the TEXT value is a well-formed integer literal that is too large to fit in a 64-bit signed integer, it is converted to REAL. For conversions between TEXT and REAL storage classes, only the first 15 significant decimal digits of the number are preserved. If the TEXT value is not a well-formed integer or real literal, then the value is stored as TEXT. For the purposes of this paragraph, hexadecimal integer literals are not considered well-formed and are stored as TEXT.

[...]

A column that uses INTEGER affinity behaves the same as a column with NUMERIC affinity. The difference between INTEGER and NUMERIC affinity is only evident in a CAST expression.
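The affinity rules quoted above can be observed by binding text values to a column declared INTEGER. In this sketch the table name t and the sample strings are arbitrary choices; each value is bound as a Python str, so SQLite receives it as TEXT and then applies the column's affinity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

texts = [
    "123",         # well-formed integer literal that fits in 64 bits
    str(2**63),    # well-formed integer literal that is too large
    "0x10",        # hexadecimal: not considered well-formed
    "abc",         # not numeric at all
]
conn.executemany("INSERT INTO t VALUES (?)", [(s,) for s in texts])

result = conn.execute("SELECT typeof(x), x FROM t ORDER BY rowid").fetchall()
for storage_class, value in result:
    print(storage_class, repr(value))
```

The second row is the interesting one: the oversized integer text ends up as REAL, and Python reads it back as a float.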

Let's draw some conclusions:

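  • Values greater than 2**63 - 1 do not fit in an 8-byte signed integer and must be stored some other way (i.e. as REAL or TEXT), and SQLite is fine with that.
  • SQLite will, on its own, store a well-formed integer literal that is too large to fit in a 64-bit signed integer as REAL (which is what the OP found out by hand).
  • REAL is stored as IEEE 754 binary64, whose significand is 53 bits, i.e. ~16 significant decimal digits, but math.log10(2**63) ~ 19, so the conversion is lossy.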
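The precision argument can be checked in plain Python, since a Python float is the same IEEE 754 binary64 type. This is only an illustrative sketch using the value from the question; no SQLite is involved.

```python
import math
import sys

big = 15235670141346654134  # the value from the question

print(sys.float_info.mant_dig)  # significand bits of a Python float (binary64)
print(sys.float_info.dig)       # decimal digits a float is guaranteed to preserve
print(math.log10(2**63))        # decimal digits needed near 2**63, just under 19

as_float = float(big)
round_tripped = int(as_float)
# Between 2**63 and 2**64 adjacent floats are 2**11 = 2048 apart, so most
# integers in that range cannot be represented exactly.
print(round_tripped - big)      # non-zero rounding error
print(float(2**63) == float(2**63 + 1))  # True: both round to the same float
```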

Experiment

In [1]: import sqlite3

In [2]: conn = sqlite3.connect(':memory:')

In [3]: conn.execute('CREATE TABLE test(x INTEGER)')
Out[3]: <sqlite3.Cursor at 0x7fafdbc3b570>

In [4]: conn.execute('INSERT INTO test VALUES(1)')
Out[4]: <sqlite3.Cursor at 0x7fafdbc3b490>

In [5]: conn.execute('INSERT INTO test VALUES({})'.format(2**63))
Out[5]: <sqlite3.Cursor at 0x7fafdbc3b5e0>

In [6]: conn.execute('INSERT INTO test VALUES(?)', (2**63,))
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-6-d0aa07d5aa5c> in <module>
----> 1 conn.execute('INSERT INTO test VALUES(?)', (2**63,))

OverflowError: Python int too large to convert to SQLite INTEGER

In [7]: conn.execute('SELECT * FROM test').fetchall()
Out[7]: [(1,), (9.223372036854776e+18,)]

If SQLite can store unsigned bigint values as REAL, where does the OverflowError come from? It comes from a range check in CPython's pysqlite (the sqlite3 module), used in pysqlite_statement_bind_parameter.
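To see where that bind-time check draws the line, the following sketch probes both ends of the signed 64-bit range. The helper can_bind is a name made up for this example, and it assumes no custom adapter for int has been registered in the process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (x INTEGER)")

def can_bind(value: int) -> bool:
    """Return True if sqlite3 accepts value as a bound parameter."""
    try:
        conn.execute("INSERT INTO test VALUES (?)", (value,))
    except OverflowError:
        return False
    return True

results = {
    "2**63 - 1": can_bind(2**63 - 1),      # largest signed 64-bit value
    "2**63": can_bind(2**63),              # one past the maximum
    "-2**63": can_bind(-(2**63)),          # smallest signed 64-bit value
    "-2**63 - 1": can_bind(-(2**63) - 1),  # one past the minimum
}
print(results)
```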

Workarounds

  1. If you are fine with the lossy REAL representation, convert your int to str (or tell your ORM to) and let SQLite do its thing.

  2. If you don't like the lossy representation but can sacrifice SQL arithmetic and aggregation, you can teach sqlite3 how to round-trip such values using sqlite3.register_adapter and register_converter.

     In [1]: import sqlite3
    
     In [2]: MAX_SQLITE_INT = 2 ** 63 - 1
        ...:
        ...: sqlite3.register_adapter(
        ...:     int, lambda x: hex(x) if x > MAX_SQLITE_INT else x)
        ...: sqlite3.register_converter(
        ...:     'integer', lambda b: int(b, 16 if b[:2] == b'0x' else 10))
    
     In [3]: conn = sqlite3.connect(
        ...:     ':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
    
     In [4]: conn.execute('CREATE TABLE test(x INTEGER)')
     Out[4]: <sqlite3.Cursor at 0x7f549d1c5810>
    
     In [5]: conn.execute('INSERT INTO test VALUES(?)', (1,))
     Out[5]: <sqlite3.Cursor at 0x7f549d1c57a0>
    
     In [6]: conn.execute('INSERT INTO test VALUES(?)', (2**63,))
     Out[6]: <sqlite3.Cursor at 0x7f549d1c56c0>
    
     In [7]: conn.execute('SELECT * FROM test').fetchall()
     Out[7]: [(1,), (9223372036854775808,)]
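Note that an adapter registered for int applies to every int bound anywhere in the process. A narrower variant, sketched here under assumptions, adapts only a dedicated wrapper type. The class U64 and the declared column type U64_TEXT are names invented for this example. The declared type deliberately contains "TEXT" so that the column gets TEXT affinity and the decimal string is not coerced to REAL.

```python
import sqlite3

class U64(int):
    """Marker type for integers that may exceed SQLite's signed 64-bit range."""

# Only U64 instances are adapted; plain ints keep their default handling.
sqlite3.register_adapter(U64, lambda v: str(int(v)))
# The converter is looked up by the declared column type (PARSE_DECLTYPES)
# and receives the stored value as bytes.
sqlite3.register_converter("U64_TEXT", lambda b: U64(b.decode("ascii")))

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE test (x U64_TEXT, n INTEGER)")
conn.execute("INSERT INTO test VALUES (?, ?)", (U64(15235670141346654134), 7))

row = conn.execute("SELECT x, n FROM test").fetchone()
print(row)
```

As with the hex approach, comparisons and arithmetic on that column inside SQL operate on text, not on numbers.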