How to create labeled lists from scraped data
I am trying to improve the code below. I want to label the items in the list using names taken from the same text I am scraping.
import requests
from bs4 import BeautifulSoup as bs

headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:92.0) Gecko/20100101 Firefox/92.0"}
r = requests.get('https://bscscan.com/tx/0x945d380c807137cd0b1299959bf364fc0ee2aec08fed361b71f2ead6dcfa3818', headers=headers)
soup = bs(r.content, 'lxml')
test = soup.select_one('div#rawtab textarea').text
print(test)
Current output:
Function: lockTokens(address lpToken, uint256 amount, uint256 unlockTime, address withdrawer, uint8 feePaymentMode) ***
MethodID: 0x6167aa61
[0]: 0000000000000000000000009c95aa9407611b1fef6a2b16d7b0de3d03359136
[1]: 0000000000000000000000000000000000000000000000183e55dbab04396869
[2]: 00000000000000000000000000000000000000000000000000000000625f7296
[3]: 000000000000000000000000c0ef93ad4c21053bf82d66f4e5513b0b542e329d
[4]: 0000000000000000000000000000000000000000000000000000000000000001
Desired output:
Function: lockTokens(address lpToken, uint256 amount, uint256 unlockTime, address withdrawer, uint8 feePaymentMode) ***
address lpToken: 0000000000000000000000009c95aa9407611b1fef6a2b16d7b0de3d03359136
uint256 amount: 0000000000000000000000000000000000000000000000183e55dbab04396869
uint256 unlockTime: 00000000000000000000000000000000000000000000000000000000625f7296
address withdrawer: 000000000000000000000000c0ef93ad4c21053bf82d66f4e5513b0b542e329d
uint8 feePaymentMode: 0000000000000000000000000000000000000000000000000000000000000001
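This is just a simple text search-and-replace: take the parameter names from the Function line and substitute them for the [n] indexes.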
text = """\
Function: lockTokens(address lpToken, uint256 amount, uint256 unlockTime, address withdrawer, uint8 feePaymentMode) ***
MethodID: 0x6167aa61
[0]: 0000000000000000000000009c95aa9407611b1fef6a2b16d7b0de3d03359136
[1]: 0000000000000000000000000000000000000000000000183e55dbab04396869
[2]: 00000000000000000000000000000000000000000000000000000000625f7296
[3]: 000000000000000000000000c0ef93ad4c21053bf82d66f4e5513b0b542e329d
[4]: 0000000000000000000000000000000000000000000000000000000000000001"""
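params = []
for line in text.splitlines():
    if line.startswith('Function'):
        print(line)
        # The parameter declarations sit between the parentheses.
        i = line.find('(')
        j = line.find(')')
        params = line[i+1:j].split(', ')
        print()
    elif line.startswith('['):
        # "[n]: value" -> look up parameter n (also works for multi-digit n).
        k = line.find(']')
        print(f"{params[int(line[1:k])]:20} {line[k+1:]}")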
Output:
Function: lockTokens(address lpToken, uint256 amount, uint256 unlockTime, address withdrawer, uint8 feePaymentMode) ***

address lpToken      : 0000000000000000000000009c95aa9407611b1fef6a2b16d7b0de3d03359136
uint256 amount       : 0000000000000000000000000000000000000000000000183e55dbab04396869
uint256 unlockTime   : 00000000000000000000000000000000000000000000000000000000625f7296
address withdrawer   : 000000000000000000000000c0ef93ad4c21053bf82d66f4e5513b0b542e329d
uint8 feePaymentMode : 0000000000000000000000000000000000000000000000000000000000000001
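If you would rather collect the labeled values into a list than print them directly, the same idea can be wrapped in a function. The following is only a sketch: the names `label_params` and `sample` are made up for illustration, and it assumes a flat parameter list (no nested tuple types) and the `[n]: value` line format shown in the question.

```python
import re

def label_params(text):
    """Return (function_line, [(label, value), ...]) for text in the format above."""
    labels = []
    pairs = []
    function_line = None
    for line in text.splitlines():
        if line.startswith('Function'):
            function_line = line
            # Assumption: everything between the first '(' and the last ')'
            # is a flat, comma-separated parameter list.
            inner = line[line.find('(') + 1:line.rfind(')')]
            labels = [p.strip() for p in inner.split(',') if p.strip()]
        else:
            # Match lines such as "[12]: 0000...". Other lines (MethodID) are skipped.
            m = re.match(r'\[(\d+)\]:\s*(\S+)', line)
            if m and int(m.group(1)) < len(labels):
                pairs.append((labels[int(m.group(1))], m.group(2)))
    return function_line, pairs

# Hypothetical sample input, copied from the question's current output.
sample = """\
Function: lockTokens(address lpToken, uint256 amount, uint256 unlockTime, address withdrawer, uint8 feePaymentMode) ***
MethodID: 0x6167aa61
[0]: 0000000000000000000000009c95aa9407611b1fef6a2b16d7b0de3d03359136
[1]: 0000000000000000000000000000000000000000000000183e55dbab04396869
[2]: 00000000000000000000000000000000000000000000000000000000625f7296
[3]: 000000000000000000000000c0ef93ad4c21053bf82d66f4e5513b0b542e329d
[4]: 0000000000000000000000000000000000000000000000000000000000000001"""

function_line, pairs = label_params(sample)
print(function_line)
for label, value in pairs:
    print(f"{label}: {value}")
```

In the scraping script you would pass `test` instead of `sample`. Returning a list of pairs keeps the parsing separate from the printing, so the values can be reused later, for example turned into a dict with `dict(pairs)`.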