Python error: PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x1144e9860>
I have the following script that prints the src path and size of every image on a given URL:
from requests_html import HTMLSession
from urllib.request import urlopen
from bs4 import BeautifulSoup
from PIL import Image
import requests

url = "https://example.com/"
session = HTMLSession()
r = session.get(url)
b = requests.get(url)
soup = BeautifulSoup(b.text, "lxml")
images = soup.find_all('img')
for img in images:
    if img.has_attr('src'):
        imgsize = Image.open(requests.get(img['src'], stream=True).raw)
        print(img['src'], imgsize.size)
It works fine for some URLs, but for others I get the following error:
PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x10782e900>
Is there a way to overcome this error?
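Without your specific URL, I can't look into why this happens. But you can put a try/except in there,
so that your script doesn't crash and simply continues to the next img: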
from bs4 import BeautifulSoup
from PIL import Image
import requests

url = "https://example.com/"
b = requests.get(url)
soup = BeautifulSoup(b.text, "lxml")
images = soup.find_all('img')
for img in images:
    if img.has_attr('src'):
        try:
            img_link = img['src']
            # Lazy-loaded images often keep an inline placeholder in src
            # and the real URL in data-src.
            if img_link.startswith('data:image'):
                img_link = img['data-src']
            imgsize = Image.open(requests.get(img_link, stream=True).raw)
            print(img_link, imgsize.size)
        except Exception as e:
            # Report the failure and move on to the next <img>.
            print(e)
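If you would rather skip the bad images than just print the exception, here is a minimal sketch of one way to do it. I can't confirm the cause without the URL, so these are only guesses at the usual suspects: a relative or protocol-relative src, a 403/404 response whose body is HTML rather than image data, a format Pillow does not read (SVG, for example), or a compressed response, since response.raw is not content-decoded by default. The helper names resolve_src, size_from_bytes and fetch_image_size are made up for this example.

```python
from io import BytesIO
from urllib.parse import urljoin

import requests
from PIL import Image, UnidentifiedImageError


def resolve_src(page_url, src):
    """Turn a relative or protocol-relative src into an absolute URL."""
    return urljoin(page_url, src)


def size_from_bytes(data):
    """Return (width, height), or None if Pillow cannot identify the data."""
    try:
        with Image.open(BytesIO(data)) as im:
            return im.size
    except UnidentifiedImageError:
        return None


def fetch_image_size(session, page_url, src, timeout=10):
    """Download one image and return its size, or None if it cannot be read."""
    if src.startswith("data:"):
        return None  # inline data URI, nothing to download
    response = session.get(resolve_src(page_url, src), timeout=timeout)
    if not response.ok:
        return None  # error pages (403, 404, ...) are not image data
    # response.content is the fully downloaded, content-decoded body,
    # unlike the undecoded stream exposed by response.raw.
    return size_from_bytes(response.content)
```

In the loop from the answer you would create session = requests.Session() once and print fetch_image_size(session, url, img['src']) for each img; a result of None means the image was skipped.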