scrapfly/typescript-scrapfly

Scrapfly SDK

npm install scrapfly-sdk
deno add jsr:@scrapfly/scrapfly-sdk
bunx jsr add @scrapfly/scrapfly-sdk

TypeScript/JavaScript SDK for the Scrapfly.io web scraping API, which lets you:

  • Scrape the web without being blocked.
  • Use headless browsers to access JavaScript-powered page data.
  • Scale up web scraping.
  • ... and much more!

For web scraping guides, see our blog. Posts under the #scrapeguide tag show how to scrape specific targets.

The SDK is distributed through npm (scrapfly-sdk) and JSR (@scrapfly/scrapfly-sdk).

Quick Intro

  1. Register a Scrapfly account for free
  2. Get your API Key on scrapfly.io/dashboard
  3. Start scraping: 🚀
// node
import { ScrapflyClient, ScrapeConfig } from 'scrapfly-sdk';
// bun (use this import instead of the node one):
// import { ScrapflyClient, ScrapeConfig } from '@scrapfly/scrapfly-sdk';
// deno (use this import instead of the node one):
// import { ScrapflyClient, ScrapeConfig } from 'jsr:@scrapfly/scrapfly-sdk';

const key = 'YOUR SCRAPFLY KEY';
const client = new ScrapflyClient({ key });
const apiResponse = await client.scrape(
    new ScrapeConfig({
        url: 'https://web-scraping.dev/product/1',
        // optional parameters:
        // enable javascript rendering
        render_js: true,
        // set proxy country
        country: 'us',
        // enable anti-scraping protection bypass
        asp: true,
        // set residential proxies
        proxy_pool: 'public_residential_pool',
        // etc.
    }),
);
console.log(apiResponse.result.content); // html content
// Parse HTML directly with SDK (through cheerio)
console.log(apiResponse.result.selector('h3').text());

For more, see the /examples directory.
For more on the Scrapfly API, see our getting started documentation. For Python, see the Scrapfly Python SDK.
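
When many targets share the same optional parameters, it can help to keep the defaults in one place. The sketch below is not part of the SDK: `buildScrapeOptions`, `ScrapeOptions` and `DEFAULTS` are hypothetical names, and only the parameter names (`url`, `render_js`, `country`, `asp`, `proxy_pool`) come from the quick intro above. It assumes the resulting plain object can be handed to `new ScrapeConfig(...)` in the same way as the inline object literal.

```typescript
// Hypothetical helper, not an SDK API. Parameter names mirror the quick intro.
interface ScrapeOptions {
    url: string;
    render_js?: boolean;
    country?: string;
    asp?: boolean;
    proxy_pool?: string;
}

// Illustrative defaults shared by every target.
const DEFAULTS: Omit<ScrapeOptions, 'url'> = {
    render_js: true,
    country: 'us',
    asp: true,
};

// Merge the shared defaults with per-target overrides; overrides win.
function buildScrapeOptions(
    url: string,
    overrides: Partial<Omit<ScrapeOptions, 'url'>> = {},
): ScrapeOptions {
    return { ...DEFAULTS, ...overrides, url };
}

const options = buildScrapeOptions('https://web-scraping.dev/product/1', {
    proxy_pool: 'public_residential_pool',
});
console.log(options);

// With the SDK installed, the object would then be used as (assumption):
//   const apiResponse = await client.scrape(new ScrapeConfig(options));
```

Keeping the merge in a plain function means it can be unit-tested without an API key or network access.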

Debugging

To enable debug logs set Scrapfly's log level to "DEBUG":

import { log } from 'scrapfly-sdk';

log.setLevel('DEBUG');

Additionally, set debug: true in ScrapeConfig to access debug information in the Scrapfly web dashboard:

import { ScrapeConfig } from 'scrapfly-sdk';

new ScrapeConfig({
    url: 'https://web-scraping.dev/product/1',
    debug: true,
    // ^ enable debug information - this will show extra details on web dashboard
});
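
Both debug switches can be driven from a single flag so they are not left on by accident. A minimal sketch, assuming an environment variable named `SCRAPFLY_DEBUG`: that name and the `isDebugEnabled` helper are hypothetical and are not read by the SDK itself.

```typescript
// Hypothetical helper, not an SDK API. SCRAPFLY_DEBUG is an assumed variable name.
type Env = Record<string, string | undefined>;

// Treat "1" and "true" (any case, surrounding whitespace ignored) as enabled.
function isDebugEnabled(env: Env): boolean {
    const raw = (env['SCRAPFLY_DEBUG'] ?? '').trim().toLowerCase();
    return raw === '1' || raw === 'true';
}

const debug = isDebugEnabled({ SCRAPFLY_DEBUG: 'true' });
console.log(debug);

// With the SDK installed, the flag would feed both switches shown above:
//   if (debug) log.setLevel('DEBUG');
//   new ScrapeConfig({ url: 'https://web-scraping.dev/product/1', debug });
// In Node, pass process.env as the argument to isDebugEnabled.
```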

Development

This is a Deno TypeScript project that builds to NPM through DNT (Deno to Node Transform).

  • The /src directory contains all of the source code, with main.ts as the entry point.
  • The __tests__ directory contains tests for the source code.
  • deno.json contains the project metadata.
  • build.ts is the build script that builds the project into a Node.js ESM package.
  • The /npm directory is produced when build.ts is executed to build the Node package.
# make modifications and run tests
$ deno task test
# format
$ deno fmt
# lint
$ deno lint
# publish JSR:
$ deno publish
# build NPM package:
$ deno task build-npm
# publish NPM:
$ cd npm && npm publish