Scrape Facebook friends with BeautifulSoup

Question

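I've done some basic web scraping with BeautifulSoup. For my next project I chose to scrape the Facebook friends list of a specified user. The problem is, Facebook lets you see friend lists of people only if you are logged in. So my question is, can I somehow bypass this? If not, can I make BeautifulSoup act as if it were logged in?

Here's my code: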

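import sys
from urllib.error import URLError
from urllib.request import urlopen

from bs4 import BeautifulSoup

url = input("enter url: ")

try:
    page = urlopen(url)
except (URLError, ValueError):
    # Stop here: without a response there is nothing to parse.
    sys.exit("Error opening the URL")

soup = BeautifulSoup(page, 'html.parser')
# "_3i9" is the class of the div expected to hold the friend list.
content = soup.find('div', {"class": "_3i9"})

# content is None when that div is missing, e.g. when not logged in.
links = content.find_all('a') if content else []
friends = ' '.join(link.text for link in links)

print(friends)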

Answer 1

The problem is, facebook lets you see friend lists of people only if you are logged in

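You can overcome this problem using Selenium. You'll need it to authenticate yourself, and then you can navigate to the user. Once you've found the user's page, you can proceed in two ways:

Get the HTML source with driver.page_source and use Beautiful Soup from there
Use the methods Selenium gives you to scrape the friends directly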
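
A minimal sketch of the first route, assuming Firefox with geckodriver is installed. The login URL and the form field names (email, pass, login) are assumptions about Facebook's login page and may not match what it serves today; the _3i9 class is taken from the question. The helper names extract_friend_names and fetch_page_source are made up for illustration.

```python
from bs4 import BeautifulSoup


def extract_friend_names(html, container_class="_3i9"):
    """Return the text of every <a> inside the friends container div."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"class": container_class})
    if container is None:
        # The div is absent, e.g. the page was served to a logged-out user.
        return []
    names = []
    for link in container.find_all("a"):
        name = link.get_text(strip=True)
        if name:
            names.append(name)
    return names


def fetch_page_source(friends_url, email, password):
    """Log in through a real browser and return the HTML of friends_url."""
    # Imported lazily so the parsing helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        # ASSUMPTION: login URL and field names are guesses, not verified.
        driver.get("https://www.facebook.com/login")
        login_url = driver.current_url
        driver.find_element(By.NAME, "email").send_keys(email)
        driver.find_element(By.NAME, "pass").send_keys(password)
        driver.find_element(By.NAME, "login").click()
        # Wait until the browser has left the login page before moving on.
        WebDriverWait(driver, 15).until(EC.url_changes(login_url))
        driver.get(friends_url)
        return driver.page_source
    finally:
        driver.quit()
```

You would then call extract_friend_names(fetch_page_source(url, email, password)). For the second route you would skip Beautiful Soup and query the page with Selenium itself, for example driver.find_elements(By.CSS_SELECTOR, "div._3i9 a"), reading .text from each element.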

Answer 2

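BeautifulSoup doesn't require that you use a URL. Instead:

Inspect the friends list in your browser
Copy the parent tag containing the list to a new file (ParentTag.html)
Open the file and pass it to BeautifulSoup()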

from bs4 import BeautifulSoup

with open("path/to/ParentTag.html", encoding="utf8") as html:
    soup = BeautifulSoup(html, "html.parser")

Then, "you make-a the soup-a."
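
From there, pulling the names out might look like the sketch below. It assumes each friend appears as an <a> tag somewhere inside the copied markup, which I can't verify against Facebook's actual page; friends_from_file is a hypothetical helper name.

```python
from bs4 import BeautifulSoup


def friends_from_file(path):
    """Collect the unique, non-empty link texts from a saved HTML fragment."""
    with open(path, encoding="utf8") as html:
        soup = BeautifulSoup(html, "html.parser")
    names = []
    for link in soup.find_all("a"):
        name = link.get_text(strip=True)
        # Skip image-only links and names that were already collected.
        if name and name not in names:
            names.append(name)
    return names
```

Links with no text are skipped and duplicates are dropped, since the same profile may be linked more than once in the copied markup.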

