Runs on JavaScript
JavaScript is the language of the web. Apify SDK builds on popular tools like Playwright, Puppeteer and Cheerio to deliver large-scale, high-performance web scraping and crawling of any website.

Automates any web workflow
Run headless Chrome, Firefox, WebKit or other browsers, manage lists and queues of URLs to crawl, and run crawlers in parallel at maximum system capacity. Handle storage and export of results, and rotate proxies.
Works on any system
Apify SDK can be used stand-alone on your own systems, or it can run as a serverless microservice on the Apify Platform. Get started with the Apify Platform.
Easy crawling
There are three main classes that you can use to start crawling the web in no time. Need to crawl plain HTML? Use the blazing fast CheerioCrawler. For complex websites that use React, Vue or other front-end JavaScript libraries and require JavaScript execution, spawn a headless browser with PlaywrightCrawler or PuppeteerCrawler.
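As a rough sketch of the plain-HTML case, the following shows how CheerioCrawler might be wired up. It assumes the same SDK generation as the Try it out example on this page (Apify.main, a request queue, handlePageFunction). The helper names extractTitle and runCheerioCrawl are hypothetical and not part of the SDK.

```javascript
// Hypothetical helper: pulls the data we care about out of a parsed page.
// `$` is the Cheerio handle that CheerioCrawler passes to handlePageFunction.
function extractTitle({ request, $ }) {
    return { url: request.url, title: $('title').text().trim() };
}

// Hypothetical entry point. The SDK is required lazily so that the helper
// above can be exercised without the `apify` package installed.
function runCheerioCrawl(startUrl) {
    const Apify = require('apify');

    Apify.main(async () => {
        const requestQueue = await Apify.openRequestQueue();
        await requestQueue.addRequest({ url: startUrl });

        const crawler = new Apify.CheerioCrawler({
            requestQueue,
            // No browser is launched: the page is fetched over HTTP and parsed.
            handlePageFunction: async ({ request, $ }) => {
                const item = extractTitle({ request, $ });
                console.log(`Title of ${item.url}: ${item.title}`);
            },
        });

        await crawler.run();
    });
}
```

Calling runCheerioCrawl('https://www.iana.org/') from a script would start the crawl. Keeping the extraction logic in its own function makes it easy to check against a stubbed `$` before running against a live site.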


Powerful tools
All the crawlers are automatically scaled based on available system resources using the AutoscaledPool class. When you run your code on the Apify Platform, you can also take advantage of a pool of proxies to avoid detection. For data storage, you can use the Dataset, KeyValueStore and RequestQueue classes.
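To give a feel for what running tasks in a pool means, here is a deliberately simplified, dependency-free sketch of a task pool with a fixed concurrency cap. It is not the AutoscaledPool implementation: the real class also adjusts concurrency to the available system resources, which this sketch does not attempt. The name runPool and its return shape are hypothetical.

```javascript
// Simplified illustration only: runs async tasks with a fixed concurrency cap.
// `tasks` is an array of functions that each return a promise.
async function runPool(tasks, maxConcurrency) {
    const results = new Array(tasks.length);
    let next = 0;     // index of the next task to start
    let running = 0;  // tasks currently in flight
    let peak = 0;     // highest number of tasks observed in flight

    // Each worker keeps pulling the next unstarted task until none are left.
    async function worker() {
        while (next < tasks.length) {
            const index = next++;
            running++;
            peak = Math.max(peak, running);
            try {
                results[index] = await tasks[index]();
            } finally {
                running--;
            }
        }
    }

    const workerCount = Math.min(maxConcurrency, tasks.length);
    await Promise.all(Array.from({ length: workerCount }, () => worker()));
    return { results, peak };
}
```

Results are returned in task order regardless of completion order, and peak reports how many tasks were ever in flight at once, which makes the cap easy to verify.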
Try it out
Install Apify SDK into a Node.js project. You must have Node.js 10 or higher installed.
npm install apify puppeteer
Copy the following code into a file in the project, for example main.js:
const Apify = require('apify');

// Apify.main() wraps the async entry point of the script.
Apify.main(async () => {
    // Queue of URLs to crawl, seeded with the start page.
    const requestQueue = await Apify.openRequestQueue();
    await requestQueue.addRequest({ url: 'https://www.iana.org/' });

    const crawler = new Apify.PuppeteerCrawler({
        requestQueue,
        // Called once for each page the crawler opens.
        handlePageFunction: async ({ request, page }) => {
            const title = await page.title();
            console.log(`Title of ${request.url}: ${title}`);
            // Enqueue links found on the page that match the pseudo-URL.
            await Apify.utils.enqueueLinks({
                requestQueue,
                page,
                pseudoUrls: ['https://www.iana.org/[.*]'],
            });
        },
    });

    await crawler.run();
});
Execute the following command in the project's folder and watch it recursively crawl IANA with Puppeteer and Chromium.
node main.js
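The pseudoUrls entry 'https://www.iana.org/[.*]' in the example restricts which links are followed: the bracketed part is a regular expression and the rest is matched literally. The sketch below approximates that idea in plain JavaScript. It is an illustration under that assumption, not the SDK's actual parser; the name pseudoUrlToRegExp is hypothetical, and details such as case sensitivity or brackets nested inside the regular expression are not handled.

```javascript
// Hypothetical helper: converts a pseudo-URL such as
// 'https://www.iana.org/[.*]' into a RegExp matching whole URLs.
function pseudoUrlToRegExp(purl) {
    let pattern = '^';
    let i = 0;
    while (i < purl.length) {
        if (purl[i] === '[') {
            // Text between [ and ] is taken verbatim as a regular expression.
            const close = purl.indexOf(']', i);
            if (close === -1) {
                throw new Error(`Unclosed "[" in pseudo-URL: ${purl}`);
            }
            pattern += `(${purl.slice(i + 1, close)})`;
            i = close + 1;
        } else {
            // Everything else is literal, so regex metacharacters are escaped.
            pattern += purl[i].replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
            i++;
        }
    }
    return new RegExp(`${pattern}$`);
}

const ianaOnly = pseudoUrlToRegExp('https://www.iana.org/[.*]');
console.log(ianaOnly.test('https://www.iana.org/domains')); // true
console.log(ianaOnly.test('https://example.com/'));         // false
```

Because the literal part is escaped, the dots in www.iana.org match only real dots, so look-alike hosts such as wwwXiana.org are rejected.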