Custom json serializer for JSON column in SQLAlchemy
I have the following ORM object (simplified):
import datetime as dt

from sqlalchemy import create_engine, Integer, Column, DateTime
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Metrics(Base):
    __tablename__ = 'metrics'

    id = Column(Integer, primary_key=True)
    # Pass the callable itself so the timestamp is taken at insert time,
    # not once when the class is defined.
    ts = Column(DateTime, default=dt.datetime.now)
    computed_values = Column(JSONB)
    dates = Column(JSONB)


entry = Metrics(computed_values={'foo': 12.3, 'bar': 45.6},
                dates=[dt.date.today()])

engine = create_engine('postgresql://postgres:postgres@localhost:5432/my_schema')
with Session(engine, future=True) as session:
    session.add(entry)
    session.commit()
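Each row has:
id: the primary key
ts: the timestamp at which the row was inserted
computed_values: the actual JSONB data to be stored
dates: a JSONB list of the dates for which the data was calculated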
While I have no problem with the computed_values column, the datetime.date objects in the list inside the dates column cannot be serialized by the default SQLAlchemy JSON serializer.
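The failure can be reproduced without a database. This is a minimal sketch, assuming that no custom serializer has been configured, in which case SQLAlchemy falls back to the standard library's json.dumps. The fixed date and the variable names are illustrative only.

```python
import datetime as dt
import json

# A fixed date stands in for dt.date.today() so the example is repeatable.
payload = [dt.date(2024, 1, 15)]

error_message = None
try:
    # json.dumps has no built-in encoding for datetime.date objects.
    json.dumps(payload)
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The floats and strings in computed_values are native JSON types, which is why that column works and dates does not.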
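My idea is to redefine the serializer behaviour for date objects on that specific column. To do so, I have to either define my own custom JSON serializer or use a ready-made one, such as orjson. Since I am likely to run into many other JSON serialization issues in this project, I would prefer the latter.
After digging into the JSONB class and its superclasses, I thought the following should do the trick: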
class Metrics(Base):
    __tablename__ = 'metrics'
    # ---%<--- snip ---%<---
    dates = Column(JSONB(json_serializer=lambda obj: orjson.dumps(obj, option=orjson.OPT_NAIVE_UTC)))
    # ---%<--- snip ---%<---
But it didn't:
  File "metrics.py", line 30, in Metrics
    dates = Column(JSONB(json_serializer=lambda obj: orjson.dumps(obj, option=orjson.OPT_NAIVE_UTC)))
TypeError: __init__() got an unexpected keyword argument 'json_serializer'
What am I doing wrong, and how do I properly define a custom SQLAlchemy serializer for JSON (and JSONB) columns?
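It looks like you should be able to get what you want by modifying your create_engine call. The serializer is configured on the engine rather than on the column type, which is why JSONB rejects the json_serializer keyword.
From the docstring in SQLAlchemy: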
Custom serializers and deserializers are specified at the dialect level,
that is using :func:`_sa.create_engine`. The reason for this is that when
using psycopg2, the DBAPI only allows serializers at the per-cursor
or per-connection level. E.g.::
    engine = create_engine("postgresql://scott:tiger@localhost/test",
                           json_serializer=my_serialize_fn,
                           json_deserializer=my_deserialize_fn
                           )
So the resulting code should look like this:
import datetime as dt

import orjson
from sqlalchemy import create_engine, Integer, Column, DateTime
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Metrics(Base):
    __tablename__ = 'metrics'

    id = Column(Integer, primary_key=True)
    # Pass the callable itself so the timestamp is taken at insert time,
    # not once when the class is defined.
    ts = Column(DateTime, default=dt.datetime.now)
    computed_values = Column(JSONB)
    dates = Column(JSONB)


entry = Metrics(computed_values={'foo': 12.3, 'bar': 45.6},
                dates=[dt.date.today()])


def orjson_serializer(obj):
    """
    Note that `orjson.dumps()` returns bytes, while SQLAlchemy expects a string, hence the `decode()` call.
    """
    return orjson.dumps(obj, option=orjson.OPT_SERIALIZE_NUMPY | orjson.OPT_NAIVE_UTC).decode()


engine = create_engine('postgresql://postgres:postgres@localhost:5432/my_schema',
                       json_serializer=orjson_serializer,
                       json_deserializer=orjson.loads)
with Session(engine, future=True) as session:
    session.add(entry)
    session.commit()
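If adding orjson as a dependency is not an option, the other route mentioned in the question (a hand-written serializer) can be sketched with the standard library alone. The names `_encode_extra` and `stdlib_serializer` below are hypothetical, not part of SQLAlchemy, and the sketch assumes ISO 8601 strings are an acceptable representation for dates.

```python
import datetime as dt
import json


def _encode_extra(value):
    """Fallback used by json.dumps for objects it cannot encode natively."""
    # datetime.datetime is a subclass of datetime.date, so this covers both.
    if isinstance(value, dt.date):
        return value.isoformat()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


def stdlib_serializer(obj):
    # json.dumps already returns str, so no decode() step is needed here.
    return json.dumps(obj, default=_encode_extra)


# Round trip outside the database, with a fixed date for repeatability.
encoded = stdlib_serializer([dt.date(2024, 1, 15)])
decoded = json.loads(encoded)

# JSON has no date type, so values come back as strings and must be
# converted explicitly if date objects are needed again.
restored = [dt.date.fromisoformat(item) for item in decoded]
print(encoded, restored)
```

Such a function would be passed in the same way as above, via `create_engine(..., json_serializer=stdlib_serializer)`. The same caveat about reading values back applies to the orjson version: whichever serializer is used, the dates column yields strings on load.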