Using Puppeteer to scrape Squarespace analytics
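I'm new to Puppeteer. I'm trying to scrape the analytics page of my Squarespace site so I can see how people are using it.
As a first test, I'm just trying to take a screenshot of the page I want.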
const puppeteer = require('puppeteer');
const CREDS = require('./creds');
(async () => {
  const browser = await puppeteer.launch({headless: true})
  await page.goto('https://www.squarespace.com/login');
  const USERNAME_SELECTOR = '<input class="username Input-hxTtdt ipapEE" type="email" placeholder="Email Address" name="email" autocapitalize="none" autocorrect="off">';
  const PASSWORD_SELECTOR = '<input class="password Input-hxTtdt ipapEE" type="password" placeholder="Password" name="password">';
  const BUTTON_SELECTOR = '<button class="Button-kDSBcD fATVqu" data-test="login-button"><span>Log In</span></button>';
  await page.click(USERNAME_SELECTOR);
  await page.keyboard.type(CREDS.username);
  await page.click(PASSWORD_SELECTOR);
  await page.keyboard.type(CREDS.password);
  await page.click(BUTTON_SELECTOR);
  await page.waitForNavigation();
  await page.goto('https://triangle-oarfish-hk21.squarespace.com/config/analytics#activity-log');
  await page.screenshot({path: 'example.png'});
  await browser.close();
})();
I get this error:
UnhandledPromiseRejectionWarning: ReferenceError: page is not defined
at /Users/reallymemorable/Documents/scripts.scrapers/squarespace.ip.scraper/squarespace.js:8:3
at process.internalTickCallback (internal/process/next_tick.js:77:7)
(node:16200) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:16200) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
I'm sure I'm missing something very basic here about how to define page, but it's late and I'm a bit lost. Any pointers would be much appreciated :)
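- You forgot to create the page object: const page = await browser.newPage();
- The selectors passed to page.click are malformed: they are HTML markup, not CSS selectors. Use something like const USERNAME_SELECTOR = '.username.Input-hxTtdt.ipapEE';
- Clicking the button and waiting for the navigation should be wrapped in Promise.all, so that the wait is already registered when the click triggers the navigation.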
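To see why the ordering in the last point matters, here is a minimal simulation that runs without Puppeteer. FakePage, its 'navigated' event, and the 50 ms timeout are hypothetical stand-ins invented for this sketch, not Puppeteer's API. Unlike this fake, the real page.waitForNavigation() rejects with a timeout error rather than resolving to false.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a Puppeteer page (not the real API):
// click() fires a 'navigated' event just before it resolves.
class FakePage extends EventEmitter {
  click() {
    return new Promise((resolve) => {
      setImmediate(() => {
        this.emit('navigated');
        resolve();
      });
    });
  }

  // Resolves to true on the next 'navigated' event,
  // or to false if none arrives within timeoutMs.
  waitForNavigation(timeoutMs = 50) {
    return new Promise((resolve) => {
      let timer;
      const onNavigated = () => {
        clearTimeout(timer);
        resolve(true);
      };
      timer = setTimeout(() => {
        this.off('navigated', onNavigated);
        resolve(false);
      }, timeoutMs);
      this.once('navigated', onNavigated);
    });
  }
}

// Sequential version: the listener is registered only after the
// navigation has already happened, so the wait times out.
async function clickThenWait(page) {
  await page.click();
  return page.waitForNavigation();
}

// Promise.all version: the wait is registered before the click
// that triggers the navigation, so the event is caught.
async function waitAndClickTogether(page) {
  const [navigated] = await Promise.all([
    page.waitForNavigation(),
    page.click(),
  ]);
  return navigated;
}
```

In this simulation, clickThenWait(new FakePage()) resolves to false and waitAndClickTogether(new FakePage()) resolves to true. A real login page may navigate slowly enough that the sequential version happens to work, which is what makes the bug intermittent.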
Here is your example, corrected:
const puppeteer = require('puppeteer');
const CREDS = require('./creds'); // expected to export { username, password }

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  // Create the page object that the original script was missing.
  const page = await browser.newPage();
  await page.goto('https://www.squarespace.com/login');

  // CSS class selectors instead of raw HTML markup.
  const USERNAME_SELECTOR = '.username.Input-hxTtdt.ipapEE';
  const PASSWORD_SELECTOR = '.password.Input-hxTtdt.ipapEE';
  const BUTTON_SELECTOR = '.Button-kDSBcD.fATVqu';

  // Focus each field by clicking it, then type the credentials.
  await page.click(USERNAME_SELECTOR);
  await page.keyboard.type(CREDS.username);
  await page.click(PASSWORD_SELECTOR);
  await page.keyboard.type(CREDS.password);

  // Start waiting for the navigation before the click that triggers it.
  await Promise.all([
    page.waitForNavigation(),
    page.click(BUTTON_SELECTOR),
  ]);

  await page.goto('https://triangle-oarfish-hk21.squarespace.com/config/analytics#activity-log');
  await page.screenshot({ path: 'example.png' });
  await browser.close();
})();
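As a side note on the selectors: class names such as Input-hxTtdt look auto-generated (an assumption on my part), so they may change when the site is rebuilt. The markup quoted in the question also carries name and data-test attributes, which may be more stable to target. The sketch below shows both approaches. classSelector and attrSelector are hypothetical helpers written for this answer, not part of Puppeteer, and attrSelector assumes the value contains no double quotes.

```javascript
// Turn the value of an HTML class attribute into a CSS class selector,
// e.g. 'username Input-hxTtdt ipapEE' -> '.username.Input-hxTtdt.ipapEE'.
function classSelector(classAttr) {
  const names = classAttr.trim().split(/\s+/).filter(Boolean);
  if (names.length === 0) {
    throw new Error('no class names given');
  }
  return names.map((name) => `.${name}`).join('');
}

// Build an attribute selector such as input[name="email"].
// Assumes the value contains no double quotes.
function attrSelector(tag, attr, value) {
  return `${tag}[${attr}="${value}"]`;
}

// Alternatives based on the attributes shown in the question's markup.
const USERNAME_SELECTOR = attrSelector('input', 'name', 'email');
const PASSWORD_SELECTOR = attrSelector('input', 'name', 'password');
const BUTTON_SELECTOR = attrSelector('button', 'data-test', 'login-button');
```

These strings can be passed to page.click in place of the class-based selectors. Whether they match depends on the login page still using the markup shown in the question.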