Plotly Dash: How to reproduce 'content' Output of dcc.Upload? (i.e. base64 encoded string)
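I cannot reproduce the exact output of the contents property of the dcc.Upload component.
If I upload the file my_excel.xlsx to the dcc.Upload component, my callback receives a "base64 encoded string" (according to the dcc.Upload documentation). I don't know how to reproduce exactly the same string without the dcc.Upload component (I want to use the output in unit tests).
My current approach: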
import base64
import io
import pandas as pd
# This is how I try to reproduce the output of the dcc.Upload component
with open('tests/data/my_excel.xlsx', 'rb') as file:
    raw_data = file.read()
# raw_data is meant to stand in for the output I receive from the dcc.Upload component
# these steps raise no error with the real output of dcc.Upload
_, content_string = raw_data.split(',')  # this fails
decoded = base64.b64decode(content_string)
df = pd.read_excel(io.BytesIO(decoded))
I get the error TypeError: a bytes-like object is required, not 'str'.
If I add
raw_data = base64.b64encode(raw_data)
before raw_data.split(','), I get the same error.
How can I get exactly the same "base64 encoded string" without the dcc.Upload component?
I could not find a function that reproduces the contents property of dcc.Upload, but I was able to build the output of dcc.Upload manually.
From the documentation we have:
contents
is a base64 encoded string that contains the files contents
[...] Property accept
(string; optional): Allow specific types of
files. See https://github.com/okonet/attr-accept for more information.
Keep in mind that mime type determination is not reliable across
platforms. CSV files, for example, are reported as text/plain under
macOS but as application/vnd.ms-excel under Windows. In some cases
there might not be a mime type set at all.
Inspecting the contents string shows that it consists of two strings separated by a comma:
content_type, content_string = contents.split(',')
Further inspection shows:
content_type: contains the MIME type information of the file
content_string: the base64-encoded content of the file
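The two-part layout described above can be illustrated without any file or Dash component. The following is a minimal sketch, not the library's own code: the payload and the text/plain MIME type are made-up placeholders. It also shows why the attempt in the question fails: file.read() in binary mode returns bytes, and bytes.split() does not accept a str separator, whether or not the bytes were base64-encoded first.

```python
import base64

# Hypothetical payload standing in for the raw bytes of an uploaded file.
payload = b"hello, upload"

# bytes.split() needs a bytes separator; passing the str "," raises TypeError.
try:
    payload.split(",")
except TypeError as exc:
    error_message = str(exc)

# Build a string with the observed layout: "data:<mime type>;base64,<base64 text>".
# b64encode returns bytes, so it has to be decoded to str before joining.
contents = "data:text/plain;base64," + base64.b64encode(payload).decode("ascii")

# Split on the first comma only; the base64 alphabet itself contains no comma.
content_type, content_string = contents.split(",", 1)
round_trip = base64.b64decode(content_string)
```

Here round_trip equals the original payload, and content_type holds only the header part.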
import base64
import io
import pandas as pd
import magic
filepath = 'tests/data/my_excel.xlsx'
# Reproduce output of dcc.Upload Component
with open(filepath, "rb") as file:
    decoded = file.read()
content_bytes = base64.b64encode(decoded)
content_string = content_bytes.decode("utf-8")
mime = magic.Magic(mime=True)
mime_type = mime.from_file(filepath)
content_type = "".join(["data:", mime_type, ";base64"])
contents = "".join([content_type, ",", content_string])
# and now revert: convert contents to binary file stream
content_type, content_string = contents.split(",")
decoded = base64.b64decode(content_string)
df = pd.read_excel(io.BytesIO(decoded))
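For unit tests it may be convenient to wrap these steps in a small helper. The sketch below is one possible variant, not part of Dash: make_upload_contents is a hypothetical name, and it swaps the magic package for the standard library's mimetypes module, which guesses from the file extension rather than the file content. As the documentation quoted above warns, the MIME type a browser reports can differ by platform, so the header may not match a real upload character for character. The base64 part should decode back to the same file bytes either way.

```python
import base64
import mimetypes
from pathlib import Path


def make_upload_contents(path, mime_type=None):
    """Build a contents-like string for a file (hypothetical test helper)."""
    path = Path(path)
    if mime_type is None:
        # Guess from the file extension; fall back to a generic binary type.
        mime_type, _ = mimetypes.guess_type(path.name)
        mime_type = mime_type or "application/octet-stream"
    # Encode the raw file bytes and turn the result into a str.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

A test could then call make_upload_contents('tests/data/my_excel.xlsx') and pass the result to the same parsing function the callback uses. Passing mime_type explicitly pins the header when a test depends on it.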