diff --git a/docs/guides/proxy_management.mdx b/docs/guides/proxy_management.mdx index a271a063fb..89e25469f0 100644 --- a/docs/guides/proxy_management.mdx +++ b/docs/guides/proxy_management.mdx @@ -83,35 +83,17 @@ Your crawlers will now use the selected proxies for all connections. ### IP Rotation and session management +Every call to + `proxyConfiguration.newUrl()` -allows you to pass a `sessionId` parameter. It will then be used to create a -`sessionId`-`proxyUrl` pair, and subsequent `newUrl()` calls with the same -`sessionId` will always return the same `proxyUrl`. This is extremely useful in -scraping, because you want to create the impression of a real user. See the -[session management guide](../guides/session-management) and -`SessionPool` class -for more information on how keeping a real session helps you avoid blocking. - -When no `sessionId` is provided, your proxy URLs are rotated round-robin, whereas Apify Proxy manages their rotation using black magic to get the best performance. - - +returns an independent proxy URL. For Apify Proxy that URL embeds a fresh random +session id, so consecutive calls resolve to different IP addresses; for custom +`proxyUrls` the URLs are rotated round-robin. - - -```javascript -const proxyConfiguration = await Actor.createProxyConfiguration({ - /* opts */ -}); -const sessionPool = await SessionPool.open({ - /* opts */ -}); -const session = await sessionPool.getSession(); -const proxyUrl = proxyConfiguration.newUrl(session.id); -``` - - +Session continuity (using the same IP across multiple requests, e.g. to keep a logged-in session alive) is handled one level up by Crawlee's `SessionPool`: once a `Session` is paired with a proxy URL, the crawler reuses that pairing for subsequent requests tied to the same session. See the +[session management guide](../guides/session-management) for more details. 
```javascript const proxyConfiguration = await Actor.createProxyConfiguration({ @@ -125,8 +107,6 @@ const crawler = new PuppeteerCrawler({ }); ``` - - ## Apify Proxy vs. Your own proxies The `ProxyConfiguration` class covers both Apify Proxy and custom proxy URLs so that diff --git a/docs/upgrading/upgrading_v4.md b/docs/upgrading/upgrading_v4.md new file mode 100644 index 0000000000..ff4269d1c1 --- /dev/null +++ b/docs/upgrading/upgrading_v4.md @@ -0,0 +1,88 @@ +--- +id: upgrading-to-v4 +title: Upgrading to v4 +--- + +This page summarizes the breaking changes between Apify SDK v3 and v4. Apify SDK v4 adopts the redesigned Crawlee v4 interfaces (`Configuration`, `EventManager`, `StorageClient`, `ProxyConfiguration`), so most of the changes here track the corresponding Crawlee v4 changes. + +## Configuration + +The `Configuration` class no longer exposes `.get(key)` / `.set(key, value)`. Configuration values are resolved eagerly at construction time and exposed as plain typed properties. + +Before (v3): + +```ts +import { Configuration } from 'apify'; + +const config = Configuration.getGlobalConfig(); +const token = config.get('token'); +config.set('token', 'new-token'); +``` + +After (v4): + +```ts +import { Configuration } from 'apify'; + +// Construct with overrides — Configuration is immutable. +const config = new Configuration({ token: 'new-token' }); +const token = config.token; +``` + +Resolution order (highest to lowest priority): constructor options → environment variables → `crawlee.json` → schema defaults. + +Empty-string environment variables are treated as unset (they fall through to the schema default) rather than being coerced to `0` / `''` / `false`. For example, `ACTOR_MAX_TOTAL_CHARGE_USD=""` now resolves to `undefined` instead of `0`. + +## ProxyConfiguration: `newUrl()` / `newProxyInfo()` no longer take `sessionId` + +The `sessionId` parameter has been removed from both `ProxyConfiguration.newUrl()` and `ProxyConfiguration.newProxyInfo()`. 
Each call now returns an independent URL; for Apify Proxy the SDK mints a fresh random session id internally for every URL it hands out, so consecutive calls resolve to different IPs. + +Before (v3): + +```ts +const proxyConfiguration = await Actor.createProxyConfiguration({ + groups: ['RESIDENTIAL'], +}); + +// Sticky pairing: same sessionId → same proxy URL → same IP. +const url1 = await proxyConfiguration.newUrl('mySession'); +const url2 = await proxyConfiguration.newUrl('mySession'); // === url1 +``` + +After (v4): + +```ts +const proxyConfiguration = await Actor.createProxyConfiguration({ + groups: ['RESIDENTIAL'], +}); + +// Every call returns an independent URL with its own session id. +const url1 = await proxyConfiguration.newUrl(); +const url2 = await proxyConfiguration.newUrl(); // !== url1 +``` + +Session continuity (reusing the same IP across multiple requests) is now handled one level up by Crawlee's `SessionPool`: a `Session` stores the proxy URL it was paired with and the crawler reuses that URL for subsequent requests bound to the same session. When using `CheerioCrawler`, `PlaywrightCrawler`, etc. with `useSessionPool: true`, this is automatic — no code changes are required on the consumer side. + +`ProxyInfo` no longer carries a `sessionId` field. If you used it for logging or analytics, parse the `session-` segment out of `proxyInfo.username` instead (it is included for Apify Proxy URLs). + +The `tieredProxyUrls` and `tieredProxyConfig` options on `ProxyConfigurationOptions` were dropped in Crawlee v4 ([apify/crawlee#3599](https://github.com/apify/crawlee/pull/3599)) and the SDK no longer threads them through. Migrate to named sessions via `SessionPool` if you relied on tiered rotation. + +## EventManager + +`PlatformEventManager` now extends Crawlee v4's `EventManager` and integrates with the new service locator. 
Use `Configuration.getGlobalConfig()` (or pass a `Configuration` instance explicitly) when constructing it directly — the constructor no longer accepts a `config` override via the `override` keyword pattern because Crawlee's base class manages the configuration through `serviceLocator` instead of a `config` field. + +If you only interact with events through `Actor.on()` / `Actor.off()` / `Actor.events`, no code changes are needed. + +## StorageClient + +The SDK's storage layer was adapted to the new Crawlee v4 `StorageClient` interface. The Apify platform client is wrapped via an internal `ApifyStorageClient` adapter that implements `createDatasetClient`, `createKeyValueStoreClient`, and `createRequestQueueClient`. + +`KeyValueStore.getPublicUrl()` is now asynchronous (it signs URLs server-side when running on the Apify platform). Update call sites accordingly: + +```ts +// v3 +const url = store.getPublicUrl('myKey'); + +// v4 +const url = await store.getPublicUrl('myKey'); +``` diff --git a/package-lock.json b/package-lock.json index 1b05834ad6..41603ac93a 100644 --- a/package-lock.json +++ b/package-lock.json @@ -16,9 +16,9 @@ "@apify/input_secrets": "^1.1.72", "@apify/tsconfig": "^0.1.1", "@commitlint/config-conventional": "^19.8.1", - "@crawlee/core": "^4.0.0-beta.0", - "@crawlee/types": "^4.0.0-beta.0", - "@crawlee/utils": "^4.0.0-beta.0", + "@crawlee/core": "^4.0.0-beta.56", + "@crawlee/types": "^4.0.0-beta.56", + "@crawlee/utils": "^4.0.0-beta.56", "@playwright/browser-chromium": "^1.52.0", "@types/content-type": "^1.1.8", "@types/fs-extra": "^11.0.4", @@ -27,7 +27,7 @@ "@types/tough-cookie": "^4.0.5", "@types/ws": "^8.18.1", "commitlint": "^19.8.1", - "crawlee": "^4.0.0-beta.0", + "crawlee": "^4.0.0-beta.56", "eslint": "^9.27.0", "eslint-config-prettier": "^10.1.5", "fs-extra": "^11.3.0", @@ -194,6 +194,16 @@ "node": ">=6.9.0" } }, + "node_modules/@borewit/text-codec": { + "version": "0.2.2", + "resolved": 
"https://registry.npmjs.org/@borewit/text-codec/-/text-codec-0.2.2.tgz", + "integrity": "sha512-DDaRehssg1aNrH4+2hnj1B7vnUGEjU6OIlyRdkMd0aUdIUvKXrJfXsy8LVtXAy7DRvYVluWbMspsRhz2lcW0mQ==", + "license": "MIT", + "funding": { + "type": "github", + "url": "https://github.com/sponsors/Borewit" + } + }, "node_modules/@commitlint/cli": { "version": "19.8.1", "resolved": "https://registry.npmjs.org/@commitlint/cli/-/cli-19.8.1.tgz", @@ -464,19 +474,37 @@ "node": ">=v18" } }, + "node_modules/@crawlee/cheerio": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/cheerio/-/cheerio-4.0.0-beta.56.tgz", + "integrity": "sha512-YNuNzjL9zVbbqLgmE4vHzQvYIWVYC+sdyEw2mvJh3EHyO6bHNQzA3RDssDSB0l+zLLdUXhGZStbF71Ur/5RhTw==", + "dev": true, + "license": "Apache-2.0", + "dependencies": { + "@crawlee/http": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", + "cheerio": "^1.0.0", + "htmlparser2": "^10.0.0", + "tslib": "^2.8.1" + }, + "engines": { + "node": ">=22.0.0" + } + }, "node_modules/@crawlee/cli": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/cli/-/cli-4.0.0-beta.4.tgz", - "integrity": "sha512-IgjQTitc09MM9FIiMAzX5Xx9/6AwevvBVqbyoRMsnJqA1/sPF1Xr7I7I8ImsneZTOjEfEBEq4FycNaOUaUD2PQ==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/cli/-/cli-4.0.0-beta.56.tgz", + "integrity": "sha512-LDriHIuvuBJwNk+FV9QkeMwJPAiHvfBf77MAZMCzrNVMqkGzqNb/7/ZnGjNV9+0KckmN/LrEFOPf+3EP6+Mptw==", "dev": true, "license": "Apache-2.0", "dependencies": { - "@crawlee/templates": "4.0.0-beta.4", + "@crawlee/templates": "4.0.0-beta.56", "@inquirer/prompts": "^7.5.0", "ansi-colors": "^4.1.3", "fs-extra": "^11.3.0", "tslib": "^2.8.1", - "yargs": "^17.7.2" + "yargs": "^18.0.0" }, "bin": { "crawlee": "index.js" @@ -485,10 +513,138 @@ "node": ">=22.0.0" } }, + "node_modules/@crawlee/cli/node_modules/ansi-regex": { + "version": "6.2.2", + "resolved": 
"https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.2.2.tgz", + "integrity": "sha512-Bq3SmSpyFHaWjPk8If9yc6svM8c56dB5BAtW4Qbw5jHTwwXXcTLoRMkpDJp6VL0XzlWaCHTXrkFURMYmD0sLqg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/chalk/ansi-regex?sponsor=1" + } + }, + "node_modules/@crawlee/cli/node_modules/ansi-styles": { + "version": "6.2.3", + "resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-6.2.3.tgz", + "integrity": "sha512-4Dj6M28JB+oAH8kFkTLUo+a2jwOFkuqb3yucU0CANcRRUbxS0cP0nZYCGjcc3BNXwRIsUVmDGgzawme7zvJHvg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/chalk/ansi-styles?sponsor=1" + } + }, + "node_modules/@crawlee/cli/node_modules/cliui": { + "version": "9.0.1", + "resolved": "https://registry.npmjs.org/cliui/-/cliui-9.0.1.tgz", + "integrity": "sha512-k7ndgKhwoQveBL+/1tqGJYNz097I7WOvwbmmU2AR5+magtbjPWQTS1C5vzGkBC8Ym8UWRzfKUzUUqFLypY4Q+w==", + "dev": true, + "license": "ISC", + "dependencies": { + "string-width": "^7.2.0", + "strip-ansi": "^7.1.0", + "wrap-ansi": "^9.0.0" + }, + "engines": { + "node": ">=20" + } + }, + "node_modules/@crawlee/cli/node_modules/emoji-regex": { + "version": "10.6.0", + "resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-10.6.0.tgz", + "integrity": "sha512-toUI84YS5YmxW219erniWD0CIVOo46xGKColeNQRgOzDorgBi1v4D71/OFzgD9GO2UGKIv1C3Sp8DAn0+j5w7A==", + "dev": true, + "license": "MIT" + }, + "node_modules/@crawlee/cli/node_modules/string-width": { + "version": "7.2.0", + "resolved": "https://registry.npmjs.org/string-width/-/string-width-7.2.0.tgz", + "integrity": "sha512-tsaTIkKW9b4N+AEj+SVA+WhJzV7/zMhcSu78mLKWSk7cXMOSHsBKFWUs0fWwq8QyK3MgJBQRX6Gbi4kYbdvGkQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "emoji-regex": "^10.3.0", + "get-east-asian-width": "^1.0.0", + "strip-ansi": "^7.1.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "url": 
"https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@crawlee/cli/node_modules/strip-ansi": { + "version": "7.2.0", + "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.2.0.tgz", + "integrity": "sha512-yDPMNjp4WyfYBkHnjIRLfca1i6KMyGCtsVgoKe/z1+6vukgaENdgGBZt+ZmKPc4gavvEZ5OgHfHdrazhgNyG7w==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-regex": "^6.2.2" + }, + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/chalk/strip-ansi?sponsor=1" + } + }, + "node_modules/@crawlee/cli/node_modules/wrap-ansi": { + "version": "9.0.2", + "resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-9.0.2.tgz", + "integrity": "sha512-42AtmgqjV+X1VpdOfyTGOYRi0/zsoLqtXQckTmqTeybT+BDIbM/Guxo7x3pE2vtpr1ok6xRqM9OpBe+Jyoqyww==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-styles": "^6.2.1", + "string-width": "^7.0.0", + "strip-ansi": "^7.1.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "url": "https://github.com/chalk/wrap-ansi?sponsor=1" + } + }, + "node_modules/@crawlee/cli/node_modules/yargs": { + "version": "18.0.0", + "resolved": "https://registry.npmjs.org/yargs/-/yargs-18.0.0.tgz", + "integrity": "sha512-4UEqdc2RYGHZc7Doyqkrqiln3p9X2DZVxaGbwhn2pi7MrRagKaOcIKe8L3OxYcbhXLgLFUS3zAYuQjKBQgmuNg==", + "dev": true, + "license": "MIT", + "dependencies": { + "cliui": "^9.0.1", + "escalade": "^3.1.1", + "get-caller-file": "^2.0.5", + "string-width": "^7.2.0", + "y18n": "^5.0.5", + "yargs-parser": "^22.0.0" + }, + "engines": { + "node": "^20.19.0 || ^22.12.0 || >=23" + } + }, + "node_modules/@crawlee/cli/node_modules/yargs-parser": { + "version": "22.0.0", + "resolved": "https://registry.npmjs.org/yargs-parser/-/yargs-parser-22.0.0.tgz", + "integrity": "sha512-rwu/ClNdSMpkSrUb+d6BRsSkLUq1fmfsY6TOpYzTwvwkg1/NRG85KBy3kq++A8LKQwX6lsu+aWad+2khvuXrqw==", + "dev": true, + "license": "ISC", + "engines": { + "node": "^20.19.0 || ^22.12.0 || >=23" + } + }, "node_modules/@crawlee/core": { - 
"version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/core/-/core-4.0.0-beta.4.tgz", - "integrity": "sha512-gagThRFMLQB0N/Uw8ad+XpRo8xgNEYgHXCSU5H5MVvK6vS4nxm0s2CI7LBj0v1o92rsoM5We284FPlZTe8Qn0w==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/core/-/core-4.0.0-beta.56.tgz", + "integrity": "sha512-IQa4HMHeHvqSZD9PfByNE82OCPA/BAG431+jgrxLhQArsWBnFYEgP72CjrCvC3TMvGEGJM5R1cGyPZ7peoYAtA==", "license": "Apache-2.0", "dependencies": { "@apify/consts": "^2.41.0", @@ -497,22 +653,21 @@ "@apify/pseudo_url": "^2.0.59", "@apify/timeout": "^0.3.2", "@apify/utilities": "^2.15.5", - "@crawlee/memory-storage": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", + "@crawlee/memory-storage": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", "@sapphire/async-queue": "^1.5.5", "@vladfrangu/async_event_emitter": "^2.4.6", "csv-stringify": "^6.5.2", - "fs-extra": "^11.3.0", - "got-scraping": "^4.1.1", "json5": "^2.2.3", "minimatch": "^10.0.1", "ow": "^2.0.0", "stream-json": "^1.9.1", "tldts": "^7.0.6", - "tough-cookie": "^5.1.2", + "tough-cookie": "^6.0.0", "tslib": "^2.8.1", - "type-fest": "^4.41.0" + "type-fest": "^4.41.0", + "zod": "^3.24.0 || ^4.0.0" }, "engines": { "node": ">=22.0.0" @@ -613,43 +768,52 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/linkedom": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/linkedom/-/linkedom-4.0.0-beta.4.tgz", - "integrity": "sha512-eOGSsQs2NylAvMG+OiQwe+CiBHGR5f+OtcMo0+vSKh+jno8IlnoSCSvkLQ8pYUo6lWtXe4m2VDfV06Yc9jTrSw==", + "node_modules/@crawlee/core/node_modules/tough-cookie": { + "version": "6.0.1", + "resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-6.0.1.tgz", + "integrity": "sha512-LktZQb3IeoUWB9lqR5EWTHgW/VTITCXg4D21M+lvybRVdylLrRMnqaIONLVb5mav8vM19m44HIcGq4qASeu2Qw==", + "license": "BSD-3-Clause", + "dependencies": { 
+ "tldts": "^7.0.5" + }, + "engines": { + "node": ">=16" + } + }, + "node_modules/@crawlee/got-scraping-client": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/got-scraping-client/-/got-scraping-client-4.0.0-beta.56.tgz", + "integrity": "sha512-QLa0OWd+XW7QN5fczX8ph/UdFOe9nW/vfewtW5B/+JXCGlarkuMq51joo+7XBAKWYrQ3aknWnhiqwxYaSHacPA==", "dev": true, "license": "Apache-2.0", "dependencies": { - "@apify/timeout": "^0.3.2", - "@apify/utilities": "^2.15.5", - "@crawlee/http": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "linkedom": "^0.18.10", - "ow": "^2.0.0", - "tslib": "^2.8.1" + "@crawlee/http-client": "4.0.0-beta.56", + "got-scraping": "^4.2.1" }, "engines": { "node": ">=22.0.0" } }, - "node_modules/@crawlee/linkedom/node_modules/@crawlee/basic": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/basic/-/basic-4.0.0-beta.4.tgz", - "integrity": "sha512-GnHrt8eylp/9MSLy5tXHrELwTA3Prm8YGVkzyO/8mfZZJqvz7L16uIAvO+ZWhg1x31MoHJKEQLlolaO6CDj2nA==", + "node_modules/@crawlee/http": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/http/-/http-4.0.0-beta.56.tgz", + "integrity": "sha512-sQ1oOY4cIGbg3HpZs9awy/nlVifobNTb/8o1ROGJqN2XXJws8E+QCWhu63p65I0zMscSd9arNtXFKKn6swhTyQ==", "dev": true, "license": "Apache-2.0", "dependencies": { - "@apify/log": "^2.5.18", "@apify/timeout": "^0.3.2", "@apify/utilities": "^2.15.5", - "@crawlee/core": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", - "csv-stringify": "^6.5.2", - "fs-extra": "^11.3.0", - "got-scraping": "^4.1.1", + "@crawlee/basic": "4.0.0-beta.56", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/http-client": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", + "@types/content-type": "^1.1.8", + "cheerio": "^1.0.0", + "content-type": "^1.0.5", + "iconv-lite": "^0.7.2", + "mime-types": "^3.0.1", "ow": "^2.0.0", - "tldts": "^7.0.6", "tslib": 
"^2.8.1", "type-fest": "^4.41.0" }, @@ -657,25 +821,48 @@ "node": ">=22.0.0" } }, - "node_modules/@crawlee/linkedom/node_modules/@crawlee/http": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/http/-/http-4.0.0-beta.4.tgz", - "integrity": "sha512-G0MWHpYcSUO73b+ZST5HT7XtiiYkONaMZ+cgKm3TMbtOh0zuaOx8SdcQTNjeMnJdxmSpbmLbQl2ihynXGxhQdA==", + "node_modules/@crawlee/http-client": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/http-client/-/http-client-4.0.0-beta.56.tgz", + "integrity": "sha512-9c77oxOB4VuJlQwEqyR8VzZyFyVL7mFf0/2Ny+Kb6rtW6hlZjfg/xLq3ZpuKLfS5LmfGPpFMwCnKDr1NOCkVVw==", + "license": "Apache-2.0", + "dependencies": { + "@crawlee/types": "4.0.0-beta.56", + "tough-cookie": "^6.0.0" + }, + "engines": { + "node": ">=22.0.0" + } + }, + "node_modules/@crawlee/http-client/node_modules/tough-cookie": { + "version": "6.0.1", + "resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-6.0.1.tgz", + "integrity": "sha512-LktZQb3IeoUWB9lqR5EWTHgW/VTITCXg4D21M+lvybRVdylLrRMnqaIONLVb5mav8vM19m44HIcGq4qASeu2Qw==", + "license": "BSD-3-Clause", + "dependencies": { + "tldts": "^7.0.5" + }, + "engines": { + "node": ">=16" + } + }, + "node_modules/@crawlee/http/node_modules/@crawlee/basic": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/basic/-/basic-4.0.0-beta.56.tgz", + "integrity": "sha512-lskCNOu/Wp845TQK9DeffaTfcWsb1ZvWbASp+msQ6TtGWbo33aY/+m/6j8eJPMlOJ1NT7axzJYF6CF39JH+dEQ==", "dev": true, "license": "Apache-2.0", "dependencies": { "@apify/timeout": "^0.3.2", "@apify/utilities": "^2.15.5", - "@crawlee/basic": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", - "@types/content-type": "^1.1.8", - "cheerio": "^1.0.0", - "content-type": "^1.0.5", - "got-scraping": "^4.1.1", - "iconv-lite": "^0.6.3", - "mime-types": "^3.0.1", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/got-scraping-client": "4.0.0-beta.56", + 
"@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", + "csv-stringify": "^6.5.2", + "fs-extra": "^11.3.0", "ow": "^2.0.0", + "tldts": "^7.0.6", "tslib": "^2.8.1", "type-fest": "^4.41.0" }, @@ -683,7 +870,7 @@ "node": ">=22.0.0" } }, - "node_modules/@crawlee/linkedom/node_modules/@sindresorhus/is": { + "node_modules/@crawlee/http/node_modules/@sindresorhus/is": { "version": "6.3.1", "resolved": "https://registry.npmjs.org/@sindresorhus/is/-/is-6.3.1.tgz", "integrity": "sha512-FX4MfcifwJyFOI2lPoX7PQxCqx8BG1HCho7WdiXwpEQx1Ycij0JxkfYtGK7yqNScrZGSlt6RE6sw8QYoH7eKnQ==", @@ -696,7 +883,7 @@ "url": "https://github.com/sindresorhus/is?sponsor=1" } }, - "node_modules/@crawlee/linkedom/node_modules/callsites": { + "node_modules/@crawlee/http/node_modules/callsites": { "version": "4.2.0", "resolved": "https://registry.npmjs.org/callsites/-/callsites-4.2.0.tgz", "integrity": "sha512-kfzR4zzQtAE9PC7CzZsjl3aBNbXWuXiSeOCdLcPpBfGW8YuCqQHcRPFDbr/BPVmd3EEPVpuFzLyuT/cUhPr4OQ==", @@ -709,33 +896,149 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/linkedom/node_modules/cheerio": { - "version": "1.0.0", - "resolved": "https://registry.npmjs.org/cheerio/-/cheerio-1.0.0.tgz", - "integrity": "sha512-quS9HgjQpdaXOvsZz82Oz7uxtXiy6UIsIQcpBj7HRw2M63Skasm9qlDocAM7jNuaxdhpPU7c4kJN+gA5MCu4ww==", + "node_modules/@crawlee/http/node_modules/dot-prop": { + "version": "8.0.2", + "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-8.0.2.tgz", + "integrity": "sha512-xaBe6ZT4DHPkg0k4Ytbvn5xoxgpG0jOS1dYxSOwAHPuNLjP3/OzN0gH55SrLqpx8cBfSaVt91lXYkApjb+nYdQ==", "dev": true, "license": "MIT", "dependencies": { - "cheerio-select": "^2.1.0", - "dom-serializer": "^2.0.0", - "domhandler": "^5.0.3", - "domutils": "^3.1.0", - "encoding-sniffer": "^0.2.0", - "htmlparser2": "^9.1.0", - "parse5": "^7.1.2", - "parse5-htmlparser2-tree-adapter": "^7.0.0", - "parse5-parser-stream": "^7.1.2", - "undici": "^6.19.5", - "whatwg-mimetype": "^4.0.0" + "type-fest": 
"^3.8.0" }, "engines": { - "node": ">=18.17" + "node": ">=16" }, "funding": { - "url": "https://github.com/cheeriojs/cheerio?sponsor=1" + "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/linkedom/node_modules/dot-prop": { + "node_modules/@crawlee/http/node_modules/dot-prop/node_modules/type-fest": { + "version": "3.13.1", + "resolved": "https://registry.npmjs.org/type-fest/-/type-fest-3.13.1.tgz", + "integrity": "sha512-tLq3bSNx+xSpwvAJnzrK0Ep5CLNWjvFTOp71URMaAEWBfRb9nnJiBoUe0tF8bI4ZFO3omgBR6NvnbzVUT3Ly4g==", + "dev": true, + "license": "(MIT OR CC0-1.0)", + "engines": { + "node": ">=14.16" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@crawlee/http/node_modules/iconv-lite": { + "version": "0.7.2", + "resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.7.2.tgz", + "integrity": "sha512-im9DjEDQ55s9fL4EYzOAv0yMqmMBSZp6G0VvFyTMPKWxiSBHUj9NW/qqLmXUwXrrM7AvqSlTCfvqRb0cM8yYqw==", + "dev": true, + "license": "MIT", + "dependencies": { + "safer-buffer": ">= 2.1.2 < 3.0.0" + }, + "engines": { + "node": ">=0.10.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/@crawlee/http/node_modules/mime-db": { + "version": "1.54.0", + "resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.54.0.tgz", + "integrity": "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/@crawlee/http/node_modules/mime-types": { + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-3.0.2.tgz", + "integrity": "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A==", + "dev": true, + "license": "MIT", + "dependencies": { + "mime-db": "^1.54.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "type": "opencollective", + 
"url": "https://opencollective.com/express" + } + }, + "node_modules/@crawlee/http/node_modules/ow": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/ow/-/ow-2.0.0.tgz", + "integrity": "sha512-ESUigmGrdhUZ2nQSFNkeKSl6ZRPupXzprMs3yF9DYlNVpJ8XAjM/fI9RUZxA7PI1K9HQDCCvBo1jr/GEIo9joQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@sindresorhus/is": "^6.3.0", + "callsites": "^4.1.0", + "dot-prop": "^8.0.2", + "environment": "^1.0.0", + "fast-equals": "^5.0.1", + "is-identifier": "^1.0.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@crawlee/jsdom": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/jsdom/-/jsdom-4.0.0-beta.56.tgz", + "integrity": "sha512-8wsb3w+AEdFIAkMGYSPWFJXxsGM46TaXRpcEMb8gvlPsSCxn+3P7twHx01XWDT5ykYJXQNLf+exGstoxSpUd2w==", + "dev": true, + "license": "Apache-2.0", + "dependencies": { + "@apify/timeout": "^0.3.0", + "@apify/utilities": "^2.7.10", + "@crawlee/http": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", + "@types/jsdom": "^21.1.7", + "cheerio": "^1.0.0", + "jsdom": "^26.1.0", + "ow": "^2.0.0", + "tslib": "^2.8.1" + }, + "engines": { + "node": ">=22.0.0" + } + }, + "node_modules/@crawlee/jsdom/node_modules/@sindresorhus/is": { + "version": "6.3.1", + "resolved": "https://registry.npmjs.org/@sindresorhus/is/-/is-6.3.1.tgz", + "integrity": "sha512-FX4MfcifwJyFOI2lPoX7PQxCqx8BG1HCho7WdiXwpEQx1Ycij0JxkfYtGK7yqNScrZGSlt6RE6sw8QYoH7eKnQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=16" + }, + "funding": { + "url": "https://github.com/sindresorhus/is?sponsor=1" + } + }, + "node_modules/@crawlee/jsdom/node_modules/callsites": { + "version": "4.2.0", + "resolved": "https://registry.npmjs.org/callsites/-/callsites-4.2.0.tgz", + "integrity": "sha512-kfzR4zzQtAE9PC7CzZsjl3aBNbXWuXiSeOCdLcPpBfGW8YuCqQHcRPFDbr/BPVmd3EEPVpuFzLyuT/cUhPr4OQ==", + 
"dev": true, + "license": "MIT", + "engines": { + "node": ">=12.20" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@crawlee/jsdom/node_modules/dot-prop": { "version": "8.0.2", "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-8.0.2.tgz", "integrity": "sha512-xaBe6ZT4DHPkg0k4Ytbvn5xoxgpG0jOS1dYxSOwAHPuNLjP3/OzN0gH55SrLqpx8cBfSaVt91lXYkApjb+nYdQ==", @@ -751,7 +1054,28 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/linkedom/node_modules/dot-prop/node_modules/type-fest": { + "node_modules/@crawlee/jsdom/node_modules/ow": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/ow/-/ow-2.0.0.tgz", + "integrity": "sha512-ESUigmGrdhUZ2nQSFNkeKSl6ZRPupXzprMs3yF9DYlNVpJ8XAjM/fI9RUZxA7PI1K9HQDCCvBo1jr/GEIo9joQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@sindresorhus/is": "^6.3.0", + "callsites": "^4.1.0", + "dot-prop": "^8.0.2", + "environment": "^1.0.0", + "fast-equals": "^5.0.1", + "is-identifier": "^1.0.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@crawlee/jsdom/node_modules/type-fest": { "version": "3.13.1", "resolved": "https://registry.npmjs.org/type-fest/-/type-fest-3.13.1.tgz", "integrity": "sha512-tLq3bSNx+xSpwvAJnzrK0Ep5CLNWjvFTOp71URMaAEWBfRb9nnJiBoUe0tF8bI4ZFO3omgBR6NvnbzVUT3Ly4g==", @@ -764,47 +1088,67 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/linkedom/node_modules/htmlparser2": { - "version": "9.1.0", - "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-9.1.0.tgz", - "integrity": "sha512-5zfg6mHUoaer/97TxnGpxmbR7zJtPwIYFMZ/H5ucTlPZhKvtum05yiPK3Mgai3a0DyVxv7qYqoweaEd2nrYQzQ==", + "node_modules/@crawlee/linkedom": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/linkedom/-/linkedom-4.0.0-beta.56.tgz", + "integrity": 
"sha512-+oapbX6V9leYP9ucwPe0EqLHjhxvF2M9PXxv0L9hPlorB1qg8UrE76SiDRXDROipPtOjgiq36PQ5ij33FAxBsg==", "dev": true, - "funding": [ - "https://github.com/fb55/htmlparser2?sponsor=1", - { - "type": "github", - "url": "https://github.com/sponsors/fb55" - } - ], - "license": "MIT", + "license": "Apache-2.0", "dependencies": { - "domelementtype": "^2.3.0", - "domhandler": "^5.0.3", - "domutils": "^3.1.0", - "entities": "^4.5.0" + "@apify/timeout": "^0.3.2", + "@apify/utilities": "^2.15.5", + "@crawlee/http": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", + "cheerio": "^1.0.0", + "linkedom": "^0.18.10", + "ow": "^2.0.0", + "tslib": "^2.8.1" + }, + "engines": { + "node": ">=22.0.0" } }, - "node_modules/@crawlee/linkedom/node_modules/mime-db": { - "version": "1.54.0", - "resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.54.0.tgz", - "integrity": "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ==", + "node_modules/@crawlee/linkedom/node_modules/@sindresorhus/is": { + "version": "6.3.1", + "resolved": "https://registry.npmjs.org/@sindresorhus/is/-/is-6.3.1.tgz", + "integrity": "sha512-FX4MfcifwJyFOI2lPoX7PQxCqx8BG1HCho7WdiXwpEQx1Ycij0JxkfYtGK7yqNScrZGSlt6RE6sw8QYoH7eKnQ==", "dev": true, "license": "MIT", "engines": { - "node": ">= 0.6" + "node": ">=16" + }, + "funding": { + "url": "https://github.com/sindresorhus/is?sponsor=1" } }, - "node_modules/@crawlee/linkedom/node_modules/mime-types": { - "version": "3.0.1", - "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-3.0.1.tgz", - "integrity": "sha512-xRc4oEhT6eaBpU1XF7AjpOFD+xQmXNB5OVKwp4tqCuBpHLS/ZbBDrc07mYTDqVMg6PfxUjjNp85O6Cd2Z/5HWA==", + "node_modules/@crawlee/linkedom/node_modules/callsites": { + "version": "4.2.0", + "resolved": "https://registry.npmjs.org/callsites/-/callsites-4.2.0.tgz", + "integrity": "sha512-kfzR4zzQtAE9PC7CzZsjl3aBNbXWuXiSeOCdLcPpBfGW8YuCqQHcRPFDbr/BPVmd3EEPVpuFzLyuT/cUhPr4OQ==", + "dev": 
true, + "license": "MIT", + "engines": { + "node": ">=12.20" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/@crawlee/linkedom/node_modules/dot-prop": { + "version": "8.0.2", + "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-8.0.2.tgz", + "integrity": "sha512-xaBe6ZT4DHPkg0k4Ytbvn5xoxgpG0jOS1dYxSOwAHPuNLjP3/OzN0gH55SrLqpx8cBfSaVt91lXYkApjb+nYdQ==", "dev": true, "license": "MIT", "dependencies": { - "mime-db": "^1.54.0" + "type-fest": "^3.8.0" }, "engines": { - "node": ">= 0.6" + "node": ">=16" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" } }, "node_modules/@crawlee/linkedom/node_modules/ow": { @@ -828,20 +1172,33 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/@crawlee/linkedom/node_modules/type-fest": { + "version": "3.13.1", + "resolved": "https://registry.npmjs.org/type-fest/-/type-fest-3.13.1.tgz", + "integrity": "sha512-tLq3bSNx+xSpwvAJnzrK0Ep5CLNWjvFTOp71URMaAEWBfRb9nnJiBoUe0tF8bI4ZFO3omgBR6NvnbzVUT3Ly4g==", + "dev": true, + "license": "(MIT OR CC0-1.0)", + "engines": { + "node": ">=14.16" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/@crawlee/memory-storage": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/memory-storage/-/memory-storage-4.0.0-beta.4.tgz", - "integrity": "sha512-40HxqveRwyeiwpnHw1L8cwo6BP+/UUBqehOzsODGmOZaNTFU6Nm+8hRXlbYbeqzyWwxXOn15bAf2STHsZct/gw==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/memory-storage/-/memory-storage-4.0.0-beta.56.tgz", + "integrity": "sha512-gRMYE7Wwyc34r14qU/YvY1iRjr66GQ87CUpmhqA0f5Ta+wWL3VTse+mKU3xpEVwEZXT0cv6VfJxqm4xVw7cnjw==", "license": "Apache-2.0", "dependencies": { - "@apify/log": "^2.5.18", - "@crawlee/types": "4.0.0-beta.4", + "@crawlee/types": "4.0.0-beta.56", "@sapphire/async-queue": "^1.5.5", "@sapphire/shapeshift": "^4.0.0", "content-type": "^1.0.5", "fs-extra": 
"^11.3.0", "json5": "^2.2.3", "mime-types": "^3.0.1", + "p-limit": "^6.2.0", "proper-lockfile": "^4.1.2", "tslib": "^2.8.1" }, @@ -859,106 +1216,99 @@ } }, "node_modules/@crawlee/memory-storage/node_modules/mime-types": { - "version": "3.0.1", - "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-3.0.1.tgz", - "integrity": "sha512-xRc4oEhT6eaBpU1XF7AjpOFD+xQmXNB5OVKwp4tqCuBpHLS/ZbBDrc07mYTDqVMg6PfxUjjNp85O6Cd2Z/5HWA==", + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-3.0.2.tgz", + "integrity": "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A==", "license": "MIT", "dependencies": { "mime-db": "^1.54.0" }, "engines": { - "node": ">= 0.6" - } - }, - "node_modules/@crawlee/templates": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/templates/-/templates-4.0.0-beta.4.tgz", - "integrity": "sha512-tpkotGuxSxeMlzmKnJvPfb+2RnUTn32Yl8OkO8hMGu+dPSU81clHlCXoOyfIsWHCTlCWjxggITAMOYEL/hwYKw==", - "dev": true, - "license": "Apache-2.0", - "dependencies": { - "ansi-colors": "^4.1.3", - "inquirer": "^12.6.0", - "tslib": "^2.8.1", - "yargonaut": "^1.1.4", - "yargs": "^17.7.2" + "node": ">=18" }, - "engines": { - "node": ">=22.0.0" + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" } }, - "node_modules/@crawlee/templates/node_modules/inquirer": { - "version": "12.6.1", - "resolved": "https://registry.npmjs.org/inquirer/-/inquirer-12.6.1.tgz", - "integrity": "sha512-MGFnzHVS3l3oM3cy+LWkyR7UUtVEn3D5U41CZbEY34szToWoJAvaVtCTz1mxsEzZFk/HXWyCArn0HDgloTXMDw==", - "dev": true, + "node_modules/@crawlee/memory-storage/node_modules/p-limit": { + "version": "6.2.0", + "resolved": "https://registry.npmjs.org/p-limit/-/p-limit-6.2.0.tgz", + "integrity": "sha512-kuUqqHNUqoIWp/c467RI4X6mmyuojY5jGutNU0wVTmEOOfcuwLqyMVoAi9MKi2Ak+5i9+nhmrK4ufZE8069kHA==", "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - 
"@inquirer/prompts": "^7.5.1", - "@inquirer/type": "^3.0.6", - "ansi-escapes": "^4.3.2", - "mute-stream": "^2.0.0", - "run-async": "^3.0.0", - "rxjs": "^7.8.2" + "yocto-queue": "^1.1.1" }, "engines": { "node": ">=18" }, - "peerDependencies": { - "@types/node": ">=18" - }, - "peerDependenciesMeta": { - "@types/node": { - "optional": true - } + "funding": { + "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/templates/node_modules/mute-stream": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/mute-stream/-/mute-stream-2.0.0.tgz", - "integrity": "sha512-WWdIxpyjEn+FhQJQQv9aQAYlHoNVdzIzUySNV1gHUPDSdZJ3yZn7pAAbQcV7B56Mvu881q9FZV+0Vx2xC44VWA==", - "dev": true, - "license": "ISC", + "node_modules/@crawlee/memory-storage/node_modules/yocto-queue": { + "version": "1.2.2", + "resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-1.2.2.tgz", + "integrity": "sha512-4LCcse/U2MHZ63HAJVE+v71o7yOdIe4cZ70Wpf8D/IyjDKYQLV5GD46B+hSTjJsvV5PztjvHoU580EftxjDZFQ==", + "license": "MIT", "engines": { - "node": "^18.17.0 || >=20.5.0" + "node": ">=12.20" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/templates/node_modules/run-async": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/run-async/-/run-async-3.0.0.tgz", - "integrity": "sha512-540WwVDOMxA6dN6We19EcT9sc3hkXPw5mzRNGM3FkdN/vtE9NFvj5lFAPNwUDmJjXidm3v7TC1cTE7t17Ulm1Q==", + "node_modules/@crawlee/templates": { + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/templates/-/templates-4.0.0-beta.56.tgz", + "integrity": "sha512-DAnkiNax0k6VMFjvK9Wj9H4wFXQxqHh2Wzex8cVO6SxgGnUCS3UZuglwmxv5ODxhfTYdazC38WDvVkRtWgNSlA==", "dev": true, - "license": "MIT", + "license": "Apache-2.0", + "dependencies": { + "tslib": "^2.8.1" + }, "engines": { - "node": ">=0.12.0" + "node": ">=22.0.0" } }, "node_modules/@crawlee/types": { - "version": "4.0.0-beta.4", - "resolved": 
"https://registry.npmjs.org/@crawlee/types/-/types-4.0.0-beta.4.tgz", - "integrity": "sha512-DSJRUfTa5NO4xt/ayWD8U/XkHusUry4XTaTcYdEJa9tVYwuf2SyF5xRwd6NdcEiJeeVHn5MM7GU6oknvugfezw==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/types/-/types-4.0.0-beta.56.tgz", + "integrity": "sha512-KfiBOKtO8Kk9TQRIkxzWyj6TdIAXLH+HZ+G64IFmllXKW1TDrMDZHLuT8vXgNcm5Od8JBGlc/PHqkbMCBhUF6g==", "license": "Apache-2.0", "dependencies": { + "tough-cookie": "^6.0.0", "tslib": "^2.8.1" }, "engines": { "node": ">=22.0.0" } }, + "node_modules/@crawlee/types/node_modules/tough-cookie": { + "version": "6.0.1", + "resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-6.0.1.tgz", + "integrity": "sha512-LktZQb3IeoUWB9lqR5EWTHgW/VTITCXg4D21M+lvybRVdylLrRMnqaIONLVb5mav8vM19m44HIcGq4qASeu2Qw==", + "license": "BSD-3-Clause", + "dependencies": { + "tldts": "^7.0.5" + }, + "engines": { + "node": ">=16" + } + }, "node_modules/@crawlee/utils": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/utils/-/utils-4.0.0-beta.4.tgz", - "integrity": "sha512-oZOORB+gIkEeeExEd47Cmbgkrh4pEqo+vklFY3gk55iFFq7X6W6hu4k2rvQP/FlIMhWMc3ETFJFID08OS6p23g==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/utils/-/utils-4.0.0-beta.56.tgz", + "integrity": "sha512-0ZAMa8g0AT5jz8XzEfgixCEU1OKPVSU6h0/vruIcBPDeJugbhGODKNyiMOOoe9tf6qMmVdwVxS/vj5ADu0mWgA==", "license": "Apache-2.0", "dependencies": { - "@apify/log": "^2.5.18", "@apify/ps-tree": "^1.2.0", - "@crawlee/types": "4.0.0-beta.4", + "@crawlee/http-client": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", "@types/sax": "^1.2.7", "cheerio": "^1.0.0", - "file-type": "^20.5.0", - "got-scraping": "^4.1.1", + "domhandler": "^5.0.3", + "file-type": "^21.0.0", "ow": "^2.0.0", "robots-parser": "^3.0.1", "sax": "^1.4.1", @@ -993,31 +1343,6 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/utils/node_modules/cheerio": { - "version": 
"1.0.0", - "resolved": "https://registry.npmjs.org/cheerio/-/cheerio-1.0.0.tgz", - "integrity": "sha512-quS9HgjQpdaXOvsZz82Oz7uxtXiy6UIsIQcpBj7HRw2M63Skasm9qlDocAM7jNuaxdhpPU7c4kJN+gA5MCu4ww==", - "license": "MIT", - "dependencies": { - "cheerio-select": "^2.1.0", - "dom-serializer": "^2.0.0", - "domhandler": "^5.0.3", - "domutils": "^3.1.0", - "encoding-sniffer": "^0.2.0", - "htmlparser2": "^9.1.0", - "parse5": "^7.1.2", - "parse5-htmlparser2-tree-adapter": "^7.0.0", - "parse5-parser-stream": "^7.1.2", - "undici": "^6.19.5", - "whatwg-mimetype": "^4.0.0" - }, - "engines": { - "node": ">=18.17" - }, - "funding": { - "url": "https://github.com/cheeriojs/cheerio?sponsor=1" - } - }, "node_modules/@crawlee/utils/node_modules/dot-prop": { "version": "8.0.2", "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-8.0.2.tgz", @@ -1033,25 +1358,6 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/@crawlee/utils/node_modules/htmlparser2": { - "version": "9.1.0", - "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-9.1.0.tgz", - "integrity": "sha512-5zfg6mHUoaer/97TxnGpxmbR7zJtPwIYFMZ/H5ucTlPZhKvtum05yiPK3Mgai3a0DyVxv7qYqoweaEd2nrYQzQ==", - "funding": [ - "https://github.com/fb55/htmlparser2?sponsor=1", - { - "type": "github", - "url": "https://github.com/sponsors/fb55" - } - ], - "license": "MIT", - "dependencies": { - "domelementtype": "^2.3.0", - "domhandler": "^5.0.3", - "domutils": "^3.1.0", - "entities": "^4.5.0" - } - }, "node_modules/@crawlee/utils/node_modules/ow": { "version": "2.0.0", "resolved": "https://registry.npmjs.org/ow/-/ow-2.0.0.tgz", @@ -1963,18 +2269,28 @@ "node": ">=6.9.0" } }, + "node_modules/@inquirer/ansi": { + "version": "1.0.2", + "resolved": "https://registry.npmjs.org/@inquirer/ansi/-/ansi-1.0.2.tgz", + "integrity": "sha512-S8qNSZiYzFd0wAcyG5AXCvUHC5Sr7xpZ9wZ2py9XR88jUz8wooStVx5M6dRzczbBWjic9NP7+rY0Xi7qqK/aMQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=18" + } + }, 
"node_modules/@inquirer/checkbox": { - "version": "4.1.6", - "resolved": "https://registry.npmjs.org/@inquirer/checkbox/-/checkbox-4.1.6.tgz", - "integrity": "sha512-62u896rWCtKKE43soodq5e/QcRsA22I+7/4Ov7LESWnKRO6BVo2A1DFLDmXL9e28TB0CfHc3YtkbPm7iwajqkg==", + "version": "4.3.2", + "resolved": "https://registry.npmjs.org/@inquirer/checkbox/-/checkbox-4.3.2.tgz", + "integrity": "sha512-VXukHf0RR1doGe6Sm4F0Em7SWYLTHSsbGfJdS9Ja2bX5/D5uwVOEjr07cncLROdBvmnvCATYEWlHqYmXv2IlQA==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/figures": "^1.0.11", - "@inquirer/type": "^3.0.6", - "ansi-escapes": "^4.3.2", - "yoctocolors-cjs": "^2.1.2" + "@inquirer/ansi": "^1.0.2", + "@inquirer/core": "^10.3.2", + "@inquirer/figures": "^1.0.15", + "@inquirer/type": "^3.0.10", + "yoctocolors-cjs": "^2.1.3" }, "engines": { "node": ">=18" @@ -1989,14 +2305,14 @@ } }, "node_modules/@inquirer/confirm": { - "version": "5.1.10", - "resolved": "https://registry.npmjs.org/@inquirer/confirm/-/confirm-5.1.10.tgz", - "integrity": "sha512-FxbQ9giWxUWKUk2O5XZ6PduVnH2CZ/fmMKMBkH71MHJvWr7WL5AHKevhzF1L5uYWB2P548o1RzVxrNd3dpmk6g==", + "version": "5.1.21", + "resolved": "https://registry.npmjs.org/@inquirer/confirm/-/confirm-5.1.21.tgz", + "integrity": "sha512-KR8edRkIsUayMXV+o3Gv+q4jlhENF9nMYUZs9PA2HzrXeHI8M5uDag70U7RJn9yyiMZSbtF5/UexBtAVtZGSbQ==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/type": "^3.0.6" + "@inquirer/core": "^10.3.2", + "@inquirer/type": "^3.0.10" }, "engines": { "node": ">=18" @@ -2011,20 +2327,20 @@ } }, "node_modules/@inquirer/core": { - "version": "10.1.11", - "resolved": "https://registry.npmjs.org/@inquirer/core/-/core-10.1.11.tgz", - "integrity": "sha512-BXwI/MCqdtAhzNQlBEFE7CEflhPkl/BqvAuV/aK6lW3DClIfYVDWPP/kXuXHtBWC7/EEbNqd/1BGq2BGBBnuxw==", + "version": "10.3.2", + "resolved": "https://registry.npmjs.org/@inquirer/core/-/core-10.3.2.tgz", + "integrity": 
"sha512-43RTuEbfP8MbKzedNqBrlhhNKVwoK//vUFNW3Q3vZ88BLcrs4kYpGg+B2mm5p2K/HfygoCxuKwJJiv8PbGmE0A==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/figures": "^1.0.11", - "@inquirer/type": "^3.0.6", - "ansi-escapes": "^4.3.2", + "@inquirer/ansi": "^1.0.2", + "@inquirer/figures": "^1.0.15", + "@inquirer/type": "^3.0.10", "cli-width": "^4.1.0", "mute-stream": "^2.0.0", "signal-exit": "^4.1.0", "wrap-ansi": "^6.2.0", - "yoctocolors-cjs": "^2.1.2" + "yoctocolors-cjs": "^2.1.3" }, "engines": { "node": ">=18" @@ -2072,15 +2388,15 @@ } }, "node_modules/@inquirer/editor": { - "version": "4.2.11", - "resolved": "https://registry.npmjs.org/@inquirer/editor/-/editor-4.2.11.tgz", - "integrity": "sha512-YoZr0lBnnLFPpfPSNsQ8IZyKxU47zPyVi9NLjCWtna52//M/xuL0PGPAxHxxYhdOhnvY2oBafoM+BI5w/JK7jw==", + "version": "4.2.23", + "resolved": "https://registry.npmjs.org/@inquirer/editor/-/editor-4.2.23.tgz", + "integrity": "sha512-aLSROkEwirotxZ1pBaP8tugXRFCxW94gwrQLxXfrZsKkfjOYC1aRvAZuhpJOb5cu4IBTJdsCigUlf2iCOu4ZDQ==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/type": "^3.0.6", - "external-editor": "^3.1.0" + "@inquirer/core": "^10.3.2", + "@inquirer/external-editor": "^1.0.3", + "@inquirer/type": "^3.0.10" }, "engines": { "node": ">=18" @@ -2095,15 +2411,37 @@ } }, "node_modules/@inquirer/expand": { - "version": "4.0.13", - "resolved": "https://registry.npmjs.org/@inquirer/expand/-/expand-4.0.13.tgz", - "integrity": "sha512-HgYNWuZLHX6q5y4hqKhwyytqAghmx35xikOGY3TcgNiElqXGPas24+UzNPOwGUZa5Dn32y25xJqVeUcGlTv+QQ==", + "version": "4.0.23", + "resolved": "https://registry.npmjs.org/@inquirer/expand/-/expand-4.0.23.tgz", + "integrity": "sha512-nRzdOyFYnpeYTTR2qFwEVmIWypzdAx/sIkCMeTNTcflFOovfqUk+HcFhQQVBftAh9gmGrpFj6QcGEqrDMDOiew==", + "dev": true, + "license": "MIT", + "dependencies": { + "@inquirer/core": "^10.3.2", + "@inquirer/type": "^3.0.10", + "yoctocolors-cjs": "^2.1.3" + }, + "engines": { + "node": ">=18" + }, + 
"peerDependencies": { + "@types/node": ">=18" + }, + "peerDependenciesMeta": { + "@types/node": { + "optional": true + } + } + }, + "node_modules/@inquirer/external-editor": { + "version": "1.0.3", + "resolved": "https://registry.npmjs.org/@inquirer/external-editor/-/external-editor-1.0.3.tgz", + "integrity": "sha512-RWbSrDiYmO4LbejWY7ttpxczuwQyZLBUyygsA9Nsv95hpzUWwnNTVQmAq3xuh7vNwCp07UTmE5i11XAEExx4RA==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/type": "^3.0.6", - "yoctocolors-cjs": "^2.1.2" + "chardet": "^2.1.1", + "iconv-lite": "^0.7.0" }, "engines": { "node": ">=18" @@ -2117,10 +2455,34 @@ } } }, + "node_modules/@inquirer/external-editor/node_modules/chardet": { + "version": "2.1.1", + "resolved": "https://registry.npmjs.org/chardet/-/chardet-2.1.1.tgz", + "integrity": "sha512-PsezH1rqdV9VvyNhxxOW32/d75r01NY7TQCmOqomRo15ZSOKbpTFVsfjghxo6JloQUCGnH4k1LGu0R4yCLlWQQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@inquirer/external-editor/node_modules/iconv-lite": { + "version": "0.7.2", + "resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.7.2.tgz", + "integrity": "sha512-im9DjEDQ55s9fL4EYzOAv0yMqmMBSZp6G0VvFyTMPKWxiSBHUj9NW/qqLmXUwXrrM7AvqSlTCfvqRb0cM8yYqw==", + "dev": true, + "license": "MIT", + "dependencies": { + "safer-buffer": ">= 2.1.2 < 3.0.0" + }, + "engines": { + "node": ">=0.10.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, "node_modules/@inquirer/figures": { - "version": "1.0.12", - "resolved": "https://registry.npmjs.org/@inquirer/figures/-/figures-1.0.12.tgz", - "integrity": "sha512-MJttijd8rMFcKJC8NYmprWr6hD3r9Gd9qUC0XwPNwoEPWSMVJwA2MlXxF+nhZZNMY+HXsWa+o7KY2emWYIn0jQ==", + "version": "1.0.15", + "resolved": "https://registry.npmjs.org/@inquirer/figures/-/figures-1.0.15.tgz", + "integrity": "sha512-t2IEY+unGHOzAaVM5Xx6DEWKeXlDDcNPeDyUpsRc6CUhBfU3VQOEl+Vssh7VNp1dR8MdUJBWhuObjXCsVpjN5g==", "dev": true, "license": 
"MIT", "engines": { @@ -2128,14 +2490,14 @@ } }, "node_modules/@inquirer/input": { - "version": "4.1.10", - "resolved": "https://registry.npmjs.org/@inquirer/input/-/input-4.1.10.tgz", - "integrity": "sha512-kV3BVne3wJ+j6reYQUZi/UN9NZGZLxgc/tfyjeK3mrx1QI7RXPxGp21IUTv+iVHcbP4ytZALF8vCHoxyNSC6qg==", + "version": "4.3.1", + "resolved": "https://registry.npmjs.org/@inquirer/input/-/input-4.3.1.tgz", + "integrity": "sha512-kN0pAM4yPrLjJ1XJBjDxyfDduXOuQHrBB8aLDMueuwUGn+vNpF7Gq7TvyVxx8u4SHlFFj4trmj+a2cbpG4Jn1g==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/type": "^3.0.6" + "@inquirer/core": "^10.3.2", + "@inquirer/type": "^3.0.10" }, "engines": { "node": ">=18" @@ -2150,14 +2512,14 @@ } }, "node_modules/@inquirer/number": { - "version": "3.0.13", - "resolved": "https://registry.npmjs.org/@inquirer/number/-/number-3.0.13.tgz", - "integrity": "sha512-IrLezcg/GWKS8zpKDvnJ/YTflNJdG0qSFlUM/zNFsdi4UKW/CO+gaJpbMgQ20Q58vNKDJbEzC6IebdkprwL6ew==", + "version": "3.0.23", + "resolved": "https://registry.npmjs.org/@inquirer/number/-/number-3.0.23.tgz", + "integrity": "sha512-5Smv0OK7K0KUzUfYUXDXQc9jrf8OHo4ktlEayFlelCjwMXz0299Y8OrI+lj7i4gCBY15UObk76q0QtxjzFcFcg==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/type": "^3.0.6" + "@inquirer/core": "^10.3.2", + "@inquirer/type": "^3.0.10" }, "engines": { "node": ">=18" @@ -2172,15 +2534,15 @@ } }, "node_modules/@inquirer/password": { - "version": "4.0.13", - "resolved": "https://registry.npmjs.org/@inquirer/password/-/password-4.0.13.tgz", - "integrity": "sha512-NN0S/SmdhakqOTJhDwOpeBEEr8VdcYsjmZHDb0rblSh2FcbXQOr+2IApP7JG4WE3sxIdKytDn4ed3XYwtHxmJQ==", + "version": "4.0.23", + "resolved": "https://registry.npmjs.org/@inquirer/password/-/password-4.0.23.tgz", + "integrity": "sha512-zREJHjhT5vJBMZX/IUbyI9zVtVfOLiTO66MrF/3GFZYZ7T4YILW5MSkEYHceSii/KtRk+4i3RE7E1CUXA2jHcA==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": 
"^10.1.11", - "@inquirer/type": "^3.0.6", - "ansi-escapes": "^4.3.2" + "@inquirer/ansi": "^1.0.2", + "@inquirer/core": "^10.3.2", + "@inquirer/type": "^3.0.10" }, "engines": { "node": ">=18" @@ -2195,22 +2557,22 @@ } }, "node_modules/@inquirer/prompts": { - "version": "7.5.1", - "resolved": "https://registry.npmjs.org/@inquirer/prompts/-/prompts-7.5.1.tgz", - "integrity": "sha512-5AOrZPf2/GxZ+SDRZ5WFplCA2TAQgK3OYrXCYmJL5NaTu4ECcoWFlfUZuw7Es++6Njv7iu/8vpYJhuzxUH76Vg==", + "version": "7.10.1", + "resolved": "https://registry.npmjs.org/@inquirer/prompts/-/prompts-7.10.1.tgz", + "integrity": "sha512-Dx/y9bCQcXLI5ooQ5KyvA4FTgeo2jYj/7plWfV5Ak5wDPKQZgudKez2ixyfz7tKXzcJciTxqLeK7R9HItwiByg==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/checkbox": "^4.1.6", - "@inquirer/confirm": "^5.1.10", - "@inquirer/editor": "^4.2.11", - "@inquirer/expand": "^4.0.13", - "@inquirer/input": "^4.1.10", - "@inquirer/number": "^3.0.13", - "@inquirer/password": "^4.0.13", - "@inquirer/rawlist": "^4.1.1", - "@inquirer/search": "^3.0.13", - "@inquirer/select": "^4.2.1" + "@inquirer/checkbox": "^4.3.2", + "@inquirer/confirm": "^5.1.21", + "@inquirer/editor": "^4.2.23", + "@inquirer/expand": "^4.0.23", + "@inquirer/input": "^4.3.1", + "@inquirer/number": "^3.0.23", + "@inquirer/password": "^4.0.23", + "@inquirer/rawlist": "^4.1.11", + "@inquirer/search": "^3.2.2", + "@inquirer/select": "^4.4.2" }, "engines": { "node": ">=18" @@ -2225,15 +2587,15 @@ } }, "node_modules/@inquirer/rawlist": { - "version": "4.1.1", - "resolved": "https://registry.npmjs.org/@inquirer/rawlist/-/rawlist-4.1.1.tgz", - "integrity": "sha512-VBUC0jPN2oaOq8+krwpo/mf3n/UryDUkKog3zi+oIi8/e5hykvdntgHUB9nhDM78RubiyR1ldIOfm5ue+2DeaQ==", + "version": "4.1.11", + "resolved": "https://registry.npmjs.org/@inquirer/rawlist/-/rawlist-4.1.11.tgz", + "integrity": "sha512-+LLQB8XGr3I5LZN/GuAHo+GpDJegQwuPARLChlMICNdwW7OwV2izlCSCxN6cqpL0sMXmbKbFcItJgdQq5EBXTw==", "dev": true, "license": "MIT", "dependencies": { - 
"@inquirer/core": "^10.1.11", - "@inquirer/type": "^3.0.6", - "yoctocolors-cjs": "^2.1.2" + "@inquirer/core": "^10.3.2", + "@inquirer/type": "^3.0.10", + "yoctocolors-cjs": "^2.1.3" }, "engines": { "node": ">=18" @@ -2248,16 +2610,16 @@ } }, "node_modules/@inquirer/search": { - "version": "3.0.13", - "resolved": "https://registry.npmjs.org/@inquirer/search/-/search-3.0.13.tgz", - "integrity": "sha512-9g89d2c5Izok/Gw/U7KPC3f9kfe5rA1AJ24xxNZG0st+vWekSk7tB9oE+dJv5JXd0ZSijomvW0KPMoBd8qbN4g==", + "version": "3.2.2", + "resolved": "https://registry.npmjs.org/@inquirer/search/-/search-3.2.2.tgz", + "integrity": "sha512-p2bvRfENXCZdWF/U2BXvnSI9h+tuA8iNqtUKb9UWbmLYCRQxd8WkvwWvYn+3NgYaNwdUkHytJMGG4MMLucI1kA==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/figures": "^1.0.11", - "@inquirer/type": "^3.0.6", - "yoctocolors-cjs": "^2.1.2" + "@inquirer/core": "^10.3.2", + "@inquirer/figures": "^1.0.15", + "@inquirer/type": "^3.0.10", + "yoctocolors-cjs": "^2.1.3" }, "engines": { "node": ">=18" @@ -2272,17 +2634,17 @@ } }, "node_modules/@inquirer/select": { - "version": "4.2.1", - "resolved": "https://registry.npmjs.org/@inquirer/select/-/select-4.2.1.tgz", - "integrity": "sha512-gt1Kd5XZm+/ddemcT3m23IP8aD8rC9drRckWoP/1f7OL46Yy2FGi8DSmNjEjQKtPl6SV96Kmjbl6p713KXJ/Jg==", + "version": "4.4.2", + "resolved": "https://registry.npmjs.org/@inquirer/select/-/select-4.4.2.tgz", + "integrity": "sha512-l4xMuJo55MAe+N7Qr4rX90vypFwCajSakx59qe/tMaC1aEHWLyw68wF4o0A4SLAY4E0nd+Vt+EyskeDIqu1M6w==", "dev": true, "license": "MIT", "dependencies": { - "@inquirer/core": "^10.1.11", - "@inquirer/figures": "^1.0.11", - "@inquirer/type": "^3.0.6", - "ansi-escapes": "^4.3.2", - "yoctocolors-cjs": "^2.1.2" + "@inquirer/ansi": "^1.0.2", + "@inquirer/core": "^10.3.2", + "@inquirer/figures": "^1.0.15", + "@inquirer/type": "^3.0.10", + "yoctocolors-cjs": "^2.1.3" }, "engines": { "node": ">=18" @@ -2297,9 +2659,9 @@ } }, "node_modules/@inquirer/type": { - 
"version": "3.0.6", - "resolved": "https://registry.npmjs.org/@inquirer/type/-/type-3.0.6.tgz", - "integrity": "sha512-/mKVCtVpyBu3IDarv0G+59KC4stsD5mDsGpYh+GKs1NZT88Jh52+cuoA1AtLk2Q0r/quNl+1cSUyLRHBFeD0XA==", + "version": "3.0.10", + "resolved": "https://registry.npmjs.org/@inquirer/type/-/type-3.0.10.tgz", + "integrity": "sha512-BvziSRxfz5Ov8ch0z/n3oijRSEcEsHnhggm4xFZe93DHcUCTlutlq9Ox4SVENAfcRD22UQq7T/atg9Wr3k09eA==", "dev": true, "license": "MIT", "engines": { @@ -4185,14 +4547,13 @@ } }, "node_modules/@tokenizer/inflate": { - "version": "0.2.7", - "resolved": "https://registry.npmjs.org/@tokenizer/inflate/-/inflate-0.2.7.tgz", - "integrity": "sha512-MADQgmZT1eKjp06jpI2yozxaU9uVs4GzzgSL+uEq7bVcJ9V1ZXQkeGNql1fsSI0gMy1vhvNTNbUqrx+pZfJVmg==", + "version": "0.4.1", + "resolved": "https://registry.npmjs.org/@tokenizer/inflate/-/inflate-0.4.1.tgz", + "integrity": "sha512-2mAv+8pkG6GIZiF1kNg1jAjh27IDxEPKwdGul3snfztFerfPGI1LjDezZp3i7BElXompqEtPmoPx6c2wgtWsOA==", "license": "MIT", "dependencies": { - "debug": "^4.4.0", - "fflate": "^0.8.2", - "token-types": "^6.0.0" + "debug": "^4.4.3", + "token-types": "^6.1.1" }, "engines": { "node": ">=18" @@ -4203,9 +4564,9 @@ } }, "node_modules/@tokenizer/inflate/node_modules/debug": { - "version": "4.4.1", - "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.1.tgz", - "integrity": "sha512-KcKCqiftBJcZr++7ykoDIEwSa3XWowTfNPo92BYxjXiyYEVrUQh2aLyhxBCwww+heortUFxEJYcRzosstTEBYQ==", + "version": "4.4.3", + "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz", + "integrity": "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA==", "license": "MIT", "dependencies": { "ms": "^2.1.3" @@ -5859,6 +6220,31 @@ "node": ">= 16" } }, + "node_modules/cheerio": { + "version": "1.2.0", + "resolved": "https://registry.npmjs.org/cheerio/-/cheerio-1.2.0.tgz", + "integrity": "sha512-WDrybc/gKFpTYQutKIK6UvfcuxijIZfMfXaYm8NMsPQxSYvf+13fXUJ4rztGGbJcBQ/GF55gvrZ0Bc0bj/mqvg==", + "license": "MIT", + 
"dependencies": { + "cheerio-select": "^2.1.0", + "dom-serializer": "^2.0.0", + "domhandler": "^5.0.3", + "domutils": "^3.2.2", + "encoding-sniffer": "^0.2.1", + "htmlparser2": "^10.1.0", + "parse5": "^7.3.0", + "parse5-htmlparser2-tree-adapter": "^7.1.0", + "parse5-parser-stream": "^7.1.2", + "undici": "^7.19.0", + "whatwg-mimetype": "^4.0.0" + }, + "engines": { + "node": ">=20.18.1" + }, + "funding": { + "url": "https://github.com/cheeriojs/cheerio?sponsor=1" + } + }, "node_modules/cheerio-select": { "version": "2.1.0", "resolved": "https://registry.npmjs.org/cheerio-select/-/cheerio-select-2.1.0.tgz", @@ -7407,24 +7793,24 @@ } }, "node_modules/crawlee": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/crawlee/-/crawlee-4.0.0-beta.4.tgz", - "integrity": "sha512-ez2meEG/8QNZEDGcZebV0r4VjQgDWSkihbMVLipEeOFyWr8mois/z+wvgxUiMYQI1nwCzDXYdY0OEvlHXWJwHg==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/crawlee/-/crawlee-4.0.0-beta.56.tgz", + "integrity": "sha512-paz/jcWEjnm/llVcC+GVr5ox43Z8DH4ED1BM5d9fry4JbNv/DgQzn5atqpC8CBQTMOtuOoqvRtrrR70sBpnlVA==", "dev": true, "license": "Apache-2.0", "dependencies": { - "@crawlee/basic": "4.0.0-beta.4", - "@crawlee/browser": "4.0.0-beta.4", - "@crawlee/browser-pool": "4.0.0-beta.4", - "@crawlee/cheerio": "4.0.0-beta.4", - "@crawlee/cli": "4.0.0-beta.4", - "@crawlee/core": "4.0.0-beta.4", - "@crawlee/http": "4.0.0-beta.4", - "@crawlee/jsdom": "4.0.0-beta.4", - "@crawlee/linkedom": "4.0.0-beta.4", - "@crawlee/playwright": "4.0.0-beta.4", - "@crawlee/puppeteer": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", + "@crawlee/basic": "4.0.0-beta.56", + "@crawlee/browser": "4.0.0-beta.56", + "@crawlee/browser-pool": "4.0.0-beta.56", + "@crawlee/cheerio": "4.0.0-beta.56", + "@crawlee/cli": "4.0.0-beta.56", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/http": "4.0.0-beta.56", + "@crawlee/jsdom": "4.0.0-beta.56", + "@crawlee/linkedom": "4.0.0-beta.56", + "@crawlee/playwright": 
"4.0.0-beta.56", + "@crawlee/puppeteer": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", "import-local": "^3.2.0", "tslib": "^2.8.1" }, @@ -7435,10 +7821,14 @@ "node": ">=22.0.0" }, "peerDependencies": { + "idcac-playwright": "*", "playwright": "*", "puppeteer": "*" }, "peerDependenciesMeta": { + "idcac-playwright": { + "optional": true + }, "playwright": { "optional": true }, @@ -7448,21 +7838,20 @@ } }, "node_modules/crawlee/node_modules/@crawlee/basic": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/basic/-/basic-4.0.0-beta.4.tgz", - "integrity": "sha512-GnHrt8eylp/9MSLy5tXHrELwTA3Prm8YGVkzyO/8mfZZJqvz7L16uIAvO+ZWhg1x31MoHJKEQLlolaO6CDj2nA==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/basic/-/basic-4.0.0-beta.56.tgz", + "integrity": "sha512-lskCNOu/Wp845TQK9DeffaTfcWsb1ZvWbASp+msQ6TtGWbo33aY/+m/6j8eJPMlOJ1NT7axzJYF6CF39JH+dEQ==", "dev": true, "license": "Apache-2.0", "dependencies": { - "@apify/log": "^2.5.18", "@apify/timeout": "^0.3.2", "@apify/utilities": "^2.15.5", - "@crawlee/core": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/got-scraping-client": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", "csv-stringify": "^6.5.2", "fs-extra": "^11.3.0", - "got-scraping": "^4.1.1", "ow": "^2.0.0", "tldts": "^7.0.6", "tslib": "^2.8.1", @@ -7473,17 +7862,17 @@ } }, "node_modules/crawlee/node_modules/@crawlee/browser": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/browser/-/browser-4.0.0-beta.4.tgz", - "integrity": "sha512-yyvzSVNn7O5Q63BHWdqtgkd3hDMn3DxSv/Vr3fiHXRUL4CiAeHZrzSSl/VgzzTMIuv7GXSVWvLkdXGGYZX2peA==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/browser/-/browser-4.0.0-beta.56.tgz", + "integrity": 
"sha512-vcDDKQ8w/TMcz7ytfTEhN614bs0Afh5gB2vf2wlIIWQJvN9AZx8oPJEHijnu20UneiX+CRlCytmj1j5GiS1hnA==", "dev": true, "license": "Apache-2.0", "dependencies": { "@apify/timeout": "^0.3.2", - "@crawlee/basic": "4.0.0-beta.4", - "@crawlee/browser-pool": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", + "@crawlee/basic": "4.0.0-beta.56", + "@crawlee/browser-pool": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", "ow": "^2.0.0", "tslib": "^2.8.1", "type-fest": "^4.41.0" @@ -7505,18 +7894,17 @@ } }, "node_modules/crawlee/node_modules/@crawlee/browser-pool": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/browser-pool/-/browser-pool-4.0.0-beta.4.tgz", - "integrity": "sha512-OtzcPk50a2fTTQVR3euwySuXa6tEmNjsZ3mIwb20yMNlZxXdnFvrNMt/8hrJe6CV9ccWQI0GuRp8WwltR634fg==", + "version": "4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/browser-pool/-/browser-pool-4.0.0-beta.56.tgz", + "integrity": "sha512-aipCOm+8GVTnm9QYpq85SiFB+zPZy+zAgdyN3OI3Wt2tXamALFRvytWtcTMzoCQos+Uo97UM0akHiUxVW0ABOw==", "dev": true, "license": "Apache-2.0", "dependencies": { - "@apify/log": "^2.5.18", "@apify/timeout": "^0.3.2", - "@crawlee/core": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "fingerprint-generator": "^2.1.66", - "fingerprint-injector": "^2.1.66", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "fingerprint-generator": "^2.1.68", + "fingerprint-injector": "^2.1.68", "lodash.merge": "^4.6.2", "nanoid": "^5.1.5", "ow": "^2.0.0", @@ -7542,91 +7930,25 @@ } } }, - "node_modules/crawlee/node_modules/@crawlee/cheerio": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/cheerio/-/cheerio-4.0.0-beta.4.tgz", - "integrity": "sha512-ZQsJBiAZlR05lLm6AsNuwat+qhqaA9UYiAGCY8DtwAUK3EGM0gAGB0WHd7e6tzD3W0yFMQGgG04XIitYVttVyQ==", - "dev": true, - "license": "Apache-2.0", - "dependencies": { - "@crawlee/http": 
"4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", - "cheerio": "^1.0.0", - "htmlparser2": "^10.0.0", - "tslib": "^2.8.1" - }, - "engines": { - "node": ">=22.0.0" - } - }, - "node_modules/crawlee/node_modules/@crawlee/http": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/http/-/http-4.0.0-beta.4.tgz", - "integrity": "sha512-G0MWHpYcSUO73b+ZST5HT7XtiiYkONaMZ+cgKm3TMbtOh0zuaOx8SdcQTNjeMnJdxmSpbmLbQl2ihynXGxhQdA==", - "dev": true, - "license": "Apache-2.0", - "dependencies": { - "@apify/timeout": "^0.3.2", - "@apify/utilities": "^2.15.5", - "@crawlee/basic": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", - "@types/content-type": "^1.1.8", - "cheerio": "^1.0.0", - "content-type": "^1.0.5", - "got-scraping": "^4.1.1", - "iconv-lite": "^0.6.3", - "mime-types": "^3.0.1", - "ow": "^2.0.0", - "tslib": "^2.8.1", - "type-fest": "^4.41.0" - }, - "engines": { - "node": ">=22.0.0" - } - }, - "node_modules/crawlee/node_modules/@crawlee/jsdom": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/jsdom/-/jsdom-4.0.0-beta.4.tgz", - "integrity": "sha512-nBrVYyz2kJCSF/s8yWpWf90L7cr7cpByJacnpNpQTwp8+2QSRrJmU60uddFeQEhC2nWq0WnWsuMA1Mx7cR25qw==", - "dev": true, - "license": "Apache-2.0", - "dependencies": { - "@apify/timeout": "^0.3.2", - "@apify/utilities": "^2.15.5", - "@crawlee/http": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", - "@types/jsdom": "^21.1.7", - "cheerio": "^1.0.0", - "jsdom": "^26.1.0", - "ow": "^2.0.0", - "tslib": "^2.8.1" - }, - "engines": { - "node": ">=22.0.0" - } - }, "node_modules/crawlee/node_modules/@crawlee/playwright": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/playwright/-/playwright-4.0.0-beta.4.tgz", - "integrity": "sha512-mAynANJHAoxPbOnxkyDnLIsS+Ujj40q2fApJMWgdz95Ti0Kg/HJBacjeRV+cGZpA/brB8FV7J4/ZGu26laip1g==", + "version": 
"4.0.0-beta.56", + "resolved": "https://registry.npmjs.org/@crawlee/playwright/-/playwright-4.0.0-beta.56.tgz", + "integrity": "sha512-6LriTQm9dVL/Ram2movEyZEQnCTSJuLHvBvwyMEuVKebJel7Ns79aOjmMAn0yeJYXV0K+dT61yMNOcZnLyI3Ig==", "dev": true, "license": "Apache-2.0", "dependencies": { "@apify/datastructures": "^2.0.3", - "@apify/log": "^2.5.18", "@apify/timeout": "^0.3.2", - "@crawlee/browser": "4.0.0-beta.4", - "@crawlee/browser-pool": "4.0.0-beta.4", - "@crawlee/core": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", + "@crawlee/basic": "4.0.0-beta.56", + "@crawlee/browser": "4.0.0-beta.56", + "@crawlee/browser-pool": "4.0.0-beta.56", + "@crawlee/cheerio": "4.0.0-beta.56", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", "cheerio": "^1.0.0", "idcac-playwright": "^0.1.3", "jquery": "^3.7.1", - "lodash.isequal": "^4.5.0", "ml-logistic-regression": "^2.0.0", "ml-matrix": "^6.12.1", "ow": "^2.0.0", @@ -7637,30 +7959,41 @@ "node": ">=22.0.0" }, "peerDependencies": { + "idcac-playwright": "^0.2.0", "playwright": "*" }, "peerDependenciesMeta": { + "idcac-playwright": { + "optional": true + }, "playwright": { "optional": true } } }, + "node_modules/crawlee/node_modules/@crawlee/playwright/node_modules/idcac-playwright": { + "version": "0.1.3", + "resolved": "https://registry.npmjs.org/idcac-playwright/-/idcac-playwright-0.1.3.tgz", + "integrity": "sha512-VVYQ4sv6OrUJKVzYaIP1hq0qAHd1O22HW5LnL1Wf6zkrLStQ/QEg4iJ0rllIOEpd+Rmm+635AJD59A+Vw+2PgQ==", + "dev": true, + "license": "ISC" + }, "node_modules/crawlee/node_modules/@crawlee/puppeteer": { - "version": "4.0.0-beta.4", - "resolved": "https://registry.npmjs.org/@crawlee/puppeteer/-/puppeteer-4.0.0-beta.4.tgz", - "integrity": "sha512-LiXfvOz3RXmtzVte/vk4PkKEdBzoXrmQmTOeHHtbD7z9AYMaHGI1wimEuu5Z2wYNEQ0n0yi7fkXiiajxsC54sg==", + "version": "4.0.0-beta.56", + "resolved": 
"https://registry.npmjs.org/@crawlee/puppeteer/-/puppeteer-4.0.0-beta.56.tgz", + "integrity": "sha512-I5aSgeYe8mChUw9eIiFUn3duv5cNVYQlGw94PbcAaLccAlJmsG+Q6F62MYI6qtdls+Vrvk+IjAsEVdmvjFbExA==", "dev": true, "license": "Apache-2.0", "dependencies": { "@apify/datastructures": "^2.0.3", - "@apify/log": "^2.5.18", - "@crawlee/browser": "4.0.0-beta.4", - "@crawlee/browser-pool": "4.0.0-beta.4", - "@crawlee/types": "4.0.0-beta.4", - "@crawlee/utils": "4.0.0-beta.4", + "@crawlee/browser": "4.0.0-beta.56", + "@crawlee/browser-pool": "4.0.0-beta.56", + "@crawlee/core": "4.0.0-beta.56", + "@crawlee/types": "4.0.0-beta.56", + "@crawlee/utils": "4.0.0-beta.56", "cheerio": "^1.0.0", "devtools-protocol": "*", - "idcac-playwright": "^0.1.3", + "idcac-playwright": "^0.2.0", "jquery": "^3.7.1", "ow": "^2.0.0", "tslib": "^2.8.1" @@ -7669,9 +8002,13 @@ "node": ">=22.0.0" }, "peerDependencies": { + "idcac-playwright": "^0.2.0", "puppeteer": "*" }, "peerDependenciesMeta": { + "idcac-playwright": { + "optional": true + }, "puppeteer": { "optional": true } @@ -7697,56 +8034,10 @@ "dev": true, "license": "MIT", "engines": { - "node": ">=12.20" - }, - "funding": { - "url": "https://github.com/sponsors/sindresorhus" - } - }, - "node_modules/crawlee/node_modules/cheerio": { - "version": "1.0.0", - "resolved": "https://registry.npmjs.org/cheerio/-/cheerio-1.0.0.tgz", - "integrity": "sha512-quS9HgjQpdaXOvsZz82Oz7uxtXiy6UIsIQcpBj7HRw2M63Skasm9qlDocAM7jNuaxdhpPU7c4kJN+gA5MCu4ww==", - "dev": true, - "license": "MIT", - "dependencies": { - "cheerio-select": "^2.1.0", - "dom-serializer": "^2.0.0", - "domhandler": "^5.0.3", - "domutils": "^3.1.0", - "encoding-sniffer": "^0.2.0", - "htmlparser2": "^9.1.0", - "parse5": "^7.1.2", - "parse5-htmlparser2-tree-adapter": "^7.0.0", - "parse5-parser-stream": "^7.1.2", - "undici": "^6.19.5", - "whatwg-mimetype": "^4.0.0" - }, - "engines": { - "node": ">=18.17" + "node": ">=12.20" }, "funding": { - "url": "https://github.com/cheeriojs/cheerio?sponsor=1" - } - }, 
- "node_modules/crawlee/node_modules/cheerio/node_modules/htmlparser2": { - "version": "9.1.0", - "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-9.1.0.tgz", - "integrity": "sha512-5zfg6mHUoaer/97TxnGpxmbR7zJtPwIYFMZ/H5ucTlPZhKvtum05yiPK3Mgai3a0DyVxv7qYqoweaEd2nrYQzQ==", - "dev": true, - "funding": [ - "https://github.com/fb55/htmlparser2?sponsor=1", - { - "type": "github", - "url": "https://github.com/sponsors/fb55" - } - ], - "license": "MIT", - "dependencies": { - "domelementtype": "^2.3.0", - "domhandler": "^5.0.3", - "domutils": "^3.1.0", - "entities": "^4.5.0" + "url": "https://github.com/sponsors/sindresorhus" } }, "node_modules/crawlee/node_modules/dot-prop": { @@ -7778,66 +8069,10 @@ "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/crawlee/node_modules/htmlparser2": { - "version": "10.0.0", - "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz", - "integrity": "sha512-TwAZM+zE5Tq3lrEHvOlvwgj1XLWQCtaaibSN11Q+gGBAS7Y1uZSWwXXRe4iF6OXnaq1riyQAPFOBtYc77Mxq0g==", - "dev": true, - "funding": [ - "https://github.com/fb55/htmlparser2?sponsor=1", - { - "type": "github", - "url": "https://github.com/sponsors/fb55" - } - ], - "license": "MIT", - "dependencies": { - "domelementtype": "^2.3.0", - "domhandler": "^5.0.3", - "domutils": "^3.2.1", - "entities": "^6.0.0" - } - }, - "node_modules/crawlee/node_modules/htmlparser2/node_modules/entities": { - "version": "6.0.0", - "resolved": "https://registry.npmjs.org/entities/-/entities-6.0.0.tgz", - "integrity": "sha512-aKstq2TDOndCn4diEyp9Uq/Flu2i1GlLkc6XIDQSDMuaFE3OPW5OphLCyQ5SpSJZTb4reN+kTcYru5yIfXoRPw==", - "dev": true, - "license": "BSD-2-Clause", - "engines": { - "node": ">=0.12" - }, - "funding": { - "url": "https://github.com/fb55/entities?sponsor=1" - } - }, - "node_modules/crawlee/node_modules/mime-db": { - "version": "1.54.0", - "resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.54.0.tgz", - "integrity": 
"sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">= 0.6" - } - }, - "node_modules/crawlee/node_modules/mime-types": { - "version": "3.0.1", - "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-3.0.1.tgz", - "integrity": "sha512-xRc4oEhT6eaBpU1XF7AjpOFD+xQmXNB5OVKwp4tqCuBpHLS/ZbBDrc07mYTDqVMg6PfxUjjNp85O6Cd2Z/5HWA==", - "dev": true, - "license": "MIT", - "dependencies": { - "mime-db": "^1.54.0" - }, - "engines": { - "node": ">= 0.6" - } - }, "node_modules/crawlee/node_modules/nanoid": { - "version": "5.1.5", - "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-5.1.5.tgz", - "integrity": "sha512-Ir/+ZpE9fDsNH0hQ3C68uyThDXzYcim2EqcZ8zn8Chtt1iylPT9xXJB0kPCnqzgcEGikO9RxSrh63MsmVCU7Fw==", + "version": "5.1.11", + "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-5.1.11.tgz", + "integrity": "sha512-v+KEsUv2ps74PaSKv0gHTxTCgMXOIfBEbaqa6w6ISIGC7ZsvHN4N9oJ8d4cmf0n5oTzQz2SLmThbQWhjd/8eKg==", "dev": true, "funding": [ { @@ -7891,9 +8126,9 @@ } }, "node_modules/crawlee/node_modules/quick-lru": { - "version": "7.0.1", - "resolved": "https://registry.npmjs.org/quick-lru/-/quick-lru-7.0.1.tgz", - "integrity": "sha512-kLjThirJMkWKutUKbZ8ViqFc09tDQhlbQo2MNuVeLWbRauqYP96Sm6nzlQ24F0HFjUNZ4i9+AgldJ9H6DZXi7g==", + "version": "7.3.0", + "resolved": "https://registry.npmjs.org/quick-lru/-/quick-lru-7.3.0.tgz", + "integrity": "sha512-k9lSsjl36EJdK7I06v7APZCbyGT2vMTsYSRX1Q2nbYmnkBqgUhRkAuzH08Ciotteu/PLJmIF2+tti7o3C/ts2g==", "dev": true, "license": "MIT", "engines": { @@ -7904,9 +8139,9 @@ } }, "node_modules/crawlee/node_modules/yocto-queue": { - "version": "1.2.1", - "resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-1.2.1.tgz", - "integrity": "sha512-AyeEbWOu/TAXdxlV9wmGcR0+yh2j3vYPGOECcIj2S7MkrLyC7ne+oye2BKTItt0ii2PHk4cDy+95+LshzbXnGg==", + "version": "1.2.2", + "resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-1.2.2.tgz", + 
"integrity": "sha512-4LCcse/U2MHZ63HAJVE+v71o7yOdIe4cZ70Wpf8D/IyjDKYQLV5GD46B+hSTjJsvV5PztjvHoU580EftxjDZFQ==", "dev": true, "license": "MIT", "engines": { @@ -8540,9 +8775,9 @@ } }, "node_modules/encoding-sniffer": { - "version": "0.2.0", - "resolved": "https://registry.npmjs.org/encoding-sniffer/-/encoding-sniffer-0.2.0.tgz", - "integrity": "sha512-ju7Wq1kg04I3HtiYIOrUrdfdDvkyO9s5XM8QAj/bN61Yo/Vb4vgJxy5vi4Yxk01gWHbrofpPtpxM8bKger9jhg==", + "version": "0.2.1", + "resolved": "https://registry.npmjs.org/encoding-sniffer/-/encoding-sniffer-0.2.1.tgz", + "integrity": "sha512-5gvq20T6vfpekVtqrYQsSCFZ1wEg5+wW0/QaZMWkFr6BqD3NfKs0rLCx4rrVlSWJeZb5NBJgVLswK/w2MWU+Gw==", "license": "MIT", "dependencies": { "iconv-lite": "^0.6.3", @@ -9571,25 +9806,6 @@ "pend": "~1.2.0" } }, - "node_modules/fflate": { - "version": "0.8.2", - "resolved": "https://registry.npmjs.org/fflate/-/fflate-0.8.2.tgz", - "integrity": "sha512-cPJU47OaAoCbg0pBvzsgpTPhmhqI5eJjh/JIu8tPj5q+T7iLvW/JAYUqmE7KOB4R1ZyEhzBaIQpQpardBF5z8A==", - "license": "MIT" - }, - "node_modules/figlet": { - "version": "1.8.1", - "resolved": "https://registry.npmjs.org/figlet/-/figlet-1.8.1.tgz", - "integrity": "sha512-kEC3Sme+YvA8Hkibv0NR1oClGcWia0VB2fC1SlMy027cwe795Xx40Xiv/nw/iFAwQLupymWh+uhAAErn/7hwPg==", - "dev": true, - "license": "MIT", - "bin": { - "figlet": "bin/index.js" - }, - "engines": { - "node": ">= 0.4.0" - } - }, "node_modules/figures": { "version": "3.2.0", "resolved": "https://registry.npmjs.org/figures/-/figures-3.2.0.tgz", @@ -9630,18 +9846,18 @@ } }, "node_modules/file-type": { - "version": "20.5.0", - "resolved": "https://registry.npmjs.org/file-type/-/file-type-20.5.0.tgz", - "integrity": "sha512-BfHZtG/l9iMm4Ecianu7P8HRD2tBHLtjXinm4X62XBOYzi7CYA7jyqfJzOvXHqzVrVPYqBo2/GvbARMaaJkKVg==", + "version": "21.3.4", + "resolved": "https://registry.npmjs.org/file-type/-/file-type-21.3.4.tgz", + "integrity": "sha512-Ievi/yy8DS3ygGvT47PjSfdFoX+2isQueoYP1cntFW1JLYAuS4GD7NUPGg4zv2iZfV52uDyk5w5Z0TdpRS6Q1g==", "license": 
"MIT", "dependencies": { - "@tokenizer/inflate": "^0.2.6", - "strtok3": "^10.2.0", - "token-types": "^6.0.0", + "@tokenizer/inflate": "^0.4.1", + "strtok3": "^10.3.4", + "token-types": "^6.1.1", "uint8array-extras": "^1.4.0" }, "engines": { - "node": ">=18" + "node": ">=20" }, "funding": { "url": "https://github.com/sindresorhus/file-type?sponsor=1" @@ -9712,14 +9928,14 @@ } }, "node_modules/fingerprint-generator": { - "version": "2.1.66", - "resolved": "https://registry.npmjs.org/fingerprint-generator/-/fingerprint-generator-2.1.66.tgz", - "integrity": "sha512-2CvoY+OPcCOWkoIMQim80uNH+ED1+2rM9QXIcSih7ovBMLOmyr3Sp9IOtfccd05QlGDzulU2M9Oav8jOgTlCBA==", + "version": "2.1.82", + "resolved": "https://registry.npmjs.org/fingerprint-generator/-/fingerprint-generator-2.1.82.tgz", + "integrity": "sha512-5Z/yCKW324pMyMarpIKe/QPdkrFWKNJv3ktdU+fXHri80+HAwNE6QhMvEvsMkK9Q8DeCXZlpPHV77UBa1nFb4A==", "dev": true, "license": "Apache-2.0", "dependencies": { - "generative-bayesian-network": "^2.1.66", - "header-generator": "^2.1.66", + "generative-bayesian-network": "^2.1.82", + "header-generator": "^2.1.82", "tslib": "^2.4.0" }, "engines": { @@ -9727,13 +9943,13 @@ } }, "node_modules/fingerprint-injector": { - "version": "2.1.66", - "resolved": "https://registry.npmjs.org/fingerprint-injector/-/fingerprint-injector-2.1.66.tgz", - "integrity": "sha512-h5llsoG0xoDeEo2czjzvo1niEU0xgCMwhs5/jtAxiBf7IiP/wW1Is3DJMEB+4V4PwIvqNQqLlnD07X23D7tErA==", + "version": "2.1.82", + "resolved": "https://registry.npmjs.org/fingerprint-injector/-/fingerprint-injector-2.1.82.tgz", + "integrity": "sha512-FN7W1wbhHk2PBCF6wpBEcFnmOdGUItZnbpVBtYVcQ1/iGM0skNUDqJyH1YOjmpQiqEl2Rhh7qWNXYsivjsT+tg==", "dev": true, "license": "Apache-2.0", "dependencies": { - "fingerprint-generator": "^2.1.66", + "fingerprint-generator": "^2.1.82", "tslib": "^2.4.0" }, "engines": { @@ -10029,9 +10245,9 @@ } }, "node_modules/generative-bayesian-network": { - "version": "2.1.66", - "resolved": 
"https://registry.npmjs.org/generative-bayesian-network/-/generative-bayesian-network-2.1.66.tgz", - "integrity": "sha512-gbBsyaaEJj/LHp3473TQrMDdcKiRzI8Sn2CbcG/lwONZkp0n9/ChC1mjzcbZQtHHCuqjn+JouSSbzLeepUMbuw==", + "version": "2.1.82", + "resolved": "https://registry.npmjs.org/generative-bayesian-network/-/generative-bayesian-network-2.1.82.tgz", + "integrity": "sha512-DH4NrmQheoMaJErdVv2IzaqkbOYSDQZmiZTV6UPDJYRDK2EyPpIQ88XRcYdPeFrUjS1N0Jj25H3HUywoJ1dbow==", "license": "Apache-2.0", "dependencies": { "adm-zip": "^0.5.9", @@ -10857,9 +11073,9 @@ } }, "node_modules/got-scraping": { - "version": "4.1.1", - "resolved": "https://registry.npmjs.org/got-scraping/-/got-scraping-4.1.1.tgz", - "integrity": "sha512-MbT+NMMU4VgvOg2tFIPOSIrMfH986fm0LJ17RxBLKlyTs3gh3xIMETpe+zdPaXY7tH1j6YYeqtfG0TnVMx6V2g==", + "version": "4.2.1", + "resolved": "https://registry.npmjs.org/got-scraping/-/got-scraping-4.2.1.tgz", + "integrity": "sha512-rhOlO1L4H4Cm31smHJqPtAaXOUrhSKsiTrbZSHKFQW1E/mkTDopnHHpRnXJpqzE0faj+zPsVQnyifIqO+K+cLQ==", "license": "Apache-2.0", "dependencies": { "got": "^14.2.1", @@ -11001,29 +11217,6 @@ "node": ">=6" } }, - "node_modules/has-ansi": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/has-ansi/-/has-ansi-2.0.0.tgz", - "integrity": "sha512-C8vBJ8DwUCx19vhm7urhTuUsr4/IyP6l4VzNQDv+ryHQObW3TTTp9yB68WpYgRe2bbaGuZ/se74IqFeVnMnLZg==", - "dev": true, - "license": "MIT", - "dependencies": { - "ansi-regex": "^2.0.0" - }, - "engines": { - "node": ">=0.10.0" - } - }, - "node_modules/has-ansi/node_modules/ansi-regex": { - "version": "2.1.1", - "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-2.1.1.tgz", - "integrity": "sha512-TIGnTpdo+E3+pCyAluZvtED5p5wCqLdezCyhPZzKPcxvFplEt4i+W7OONCKgeZFT3+y5NZZfOOS/Bdcanm1MYA==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=0.10.0" - } - }, "node_modules/has-bigints": { "version": "1.1.0", "resolved": "https://registry.npmjs.org/has-bigints/-/has-bigints-1.1.0.tgz", @@ -11123,13 +11316,13 @@ } 
}, "node_modules/header-generator": { - "version": "2.1.66", - "resolved": "https://registry.npmjs.org/header-generator/-/header-generator-2.1.66.tgz", - "integrity": "sha512-g0jd79o0CyzyK0Jega4pAG1eJhykhPNfBLpOnUINtX2YkToVvRSBZ+B2wtmIjqwKHXK8DNWxylKuXnZmLs1yMQ==", + "version": "2.1.82", + "resolved": "https://registry.npmjs.org/header-generator/-/header-generator-2.1.82.tgz", + "integrity": "sha512-4NjPB0+bAKjPoponSmTOkK58IEF2W22sOJA5O48k/MxbCZgOm+jrU4WVR53Z2I6xFgIPkVrQmKtt1LAbWtfqXw==", "license": "Apache-2.0", "dependencies": { "browserslist": "^4.21.1", - "generative-bayesian-network": "^2.1.66", + "generative-bayesian-network": "^2.1.82", "ow": "^0.28.1", "tslib": "^2.4.0" }, @@ -11170,6 +11363,37 @@ "dev": true, "license": "MIT" }, + "node_modules/htmlparser2": { + "version": "10.1.0", + "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.1.0.tgz", + "integrity": "sha512-VTZkM9GWRAtEpveh7MSF6SjjrpNVNNVJfFup7xTY3UpFtm67foy9HDVXneLtFVt4pMz5kZtgNcvCniNFb1hlEQ==", + "funding": [ + "https://github.com/fb55/htmlparser2?sponsor=1", + { + "type": "github", + "url": "https://github.com/sponsors/fb55" + } + ], + "license": "MIT", + "dependencies": { + "domelementtype": "^2.3.0", + "domhandler": "^5.0.3", + "domutils": "^3.2.2", + "entities": "^7.0.1" + } + }, + "node_modules/htmlparser2/node_modules/entities": { + "version": "7.0.1", + "resolved": "https://registry.npmjs.org/entities/-/entities-7.0.1.tgz", + "integrity": "sha512-TWrgLOFUQTH994YUyl1yT4uyavY5nNB5muff+RtWaqNVCAK408b5ZnnbNAUEWLTCpum9w6arT70i1XdQ4UeOPA==", + "license": "BSD-2-Clause", + "engines": { + "node": ">=0.12" + }, + "funding": { + "url": "https://github.com/fb55/entities?sponsor=1" + } + }, "node_modules/http-cache-semantics": { "version": "4.2.0", "resolved": "https://registry.npmjs.org/http-cache-semantics/-/http-cache-semantics-4.2.0.tgz", @@ -11301,11 +11525,11 @@ } }, "node_modules/idcac-playwright": { - "version": "0.1.3", - "resolved": 
"https://registry.npmjs.org/idcac-playwright/-/idcac-playwright-0.1.3.tgz", - "integrity": "sha512-VVYQ4sv6OrUJKVzYaIP1hq0qAHd1O22HW5LnL1Wf6zkrLStQ/QEg4iJ0rllIOEpd+Rmm+635AJD59A+Vw+2PgQ==", + "version": "0.2.0", + "resolved": "https://registry.npmjs.org/idcac-playwright/-/idcac-playwright-0.2.0.tgz", + "integrity": "sha512-qJH7vQgq3TKnhea/3Z3jlEJL7NC9vK9BkLClAzQHVRepBtq1fWfSI4fSuMKcPq7nDUTTlIEIS+vU+GRwwR1BXw==", "dev": true, - "license": "ISC" + "license": "GPL-3.0-only" }, "node_modules/identifier-regex": { "version": "1.0.0", @@ -11608,9 +11832,9 @@ } }, "node_modules/is-any-array": { - "version": "2.0.1", - "resolved": "https://registry.npmjs.org/is-any-array/-/is-any-array-2.0.1.tgz", - "integrity": "sha512-UtilS7hLRu++wb/WBAw9bNuP1Eg04Ivn1vERJck8zJthEvXCBEBpGR/33u/xLKWEQf95803oalHrVDptcAvFdQ==", + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/is-any-array/-/is-any-array-3.0.0.tgz", + "integrity": "sha512-o4h+tylWykC4BD1vaejp6gDxoM13bwW8FGuNs4yIKpj8xbBJcRxJx8vZpq0dCr7ZDEfeKjmsi/euolKhX6f/ww==", "dev": true, "license": "MIT" }, @@ -12964,9 +13188,9 @@ } }, "node_modules/linkedom": { - "version": "0.18.10", - "resolved": "https://registry.npmjs.org/linkedom/-/linkedom-0.18.10.tgz", - "integrity": "sha512-ESCqVAtme2GI3zZnlVRidiydByV6WmPlmKeFzFVQslADiAO2Wi+H6xL/5kr/pUOESjEoVb2Eb3cYFJ/TQhQOWA==", + "version": "0.18.12", + "resolved": "https://registry.npmjs.org/linkedom/-/linkedom-0.18.12.tgz", + "integrity": "sha512-jalJsOwIKuQJSeTvsgzPe9iJzyfVaEJiEXl+25EkKevsULHvMJzpNqwvj1jOESWdmgKDiXObyjOYwlUqG7wo1Q==", "dev": true, "license": "ISC", "dependencies": { @@ -12975,39 +13199,17 @@ "html-escaper": "^3.0.3", "htmlparser2": "^10.0.0", "uhyphen": "^0.2.0" - } - }, - "node_modules/linkedom/node_modules/entities": { - "version": "6.0.0", - "resolved": "https://registry.npmjs.org/entities/-/entities-6.0.0.tgz", - "integrity": "sha512-aKstq2TDOndCn4diEyp9Uq/Flu2i1GlLkc6XIDQSDMuaFE3OPW5OphLCyQ5SpSJZTb4reN+kTcYru5yIfXoRPw==", - "dev": true, - "license": 
"BSD-2-Clause", + }, "engines": { - "node": ">=0.12" + "node": ">=16" }, - "funding": { - "url": "https://github.com/fb55/entities?sponsor=1" - } - }, - "node_modules/linkedom/node_modules/htmlparser2": { - "version": "10.0.0", - "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-10.0.0.tgz", - "integrity": "sha512-TwAZM+zE5Tq3lrEHvOlvwgj1XLWQCtaaibSN11Q+gGBAS7Y1uZSWwXXRe4iF6OXnaq1riyQAPFOBtYc77Mxq0g==", - "dev": true, - "funding": [ - "https://github.com/fb55/htmlparser2?sponsor=1", - { - "type": "github", - "url": "https://github.com/sponsors/fb55" + "peerDependencies": { + "canvas": ">= 2" + }, + "peerDependenciesMeta": { + "canvas": { + "optional": true } - ], - "license": "MIT", - "dependencies": { - "domelementtype": "^2.3.0", - "domhandler": "^5.0.3", - "domutils": "^3.2.1", - "entities": "^6.0.0" } }, "node_modules/lint-staged": { @@ -13991,35 +14193,35 @@ } }, "node_modules/ml-array-max": { - "version": "1.2.4", - "resolved": "https://registry.npmjs.org/ml-array-max/-/ml-array-max-1.2.4.tgz", - "integrity": "sha512-BlEeg80jI0tW6WaPyGxf5Sa4sqvcyY6lbSn5Vcv44lp1I2GR6AWojfUvLnGTNsIXrZ8uqWmo8VcG1WpkI2ONMQ==", + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/ml-array-max/-/ml-array-max-2.0.0.tgz", + "integrity": "sha512-QQZ4kENwpWmyNb98UXRDFXrmtIXuXtt1+bSbda/2KA85+F+rrJP8hZk6QOkCQXM2Th9mUDYdq/PNByPdT9ID4A==", "dev": true, "license": "MIT", "dependencies": { - "is-any-array": "^2.0.0" + "is-any-array": "^3.0.0" } }, "node_modules/ml-array-min": { - "version": "1.2.3", - "resolved": "https://registry.npmjs.org/ml-array-min/-/ml-array-min-1.2.3.tgz", - "integrity": "sha512-VcZ5f3VZ1iihtrGvgfh/q0XlMobG6GQ8FsNyQXD3T+IlstDv85g8kfV0xUG1QPRO/t21aukaJowDzMTc7j5V6Q==", + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/ml-array-min/-/ml-array-min-2.0.0.tgz", + "integrity": "sha512-GRj6Ky6sW9vGL6yIjgsHmXZ9YgrdmcQ8nCxPqEGeKc6dkfYg1XDYxGFxADUjNuZyoCd5PUscWAS4N+cFaX6hFg==", "dev": true, "license": "MIT", "dependencies": { - 
"is-any-array": "^2.0.0" + "is-any-array": "^3.0.0" } }, "node_modules/ml-array-rescale": { - "version": "1.3.7", - "resolved": "https://registry.npmjs.org/ml-array-rescale/-/ml-array-rescale-1.3.7.tgz", - "integrity": "sha512-48NGChTouvEo9KBctDfHC3udWnQKNKEWN0ziELvY3KG25GR5cA8K8wNVzracsqSW1QEkAXjTNx+ycgAv06/1mQ==", + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/ml-array-rescale/-/ml-array-rescale-2.0.0.tgz", + "integrity": "sha512-2GGtKfSno94/kIloWGvpp/U5Q5vLvLrza+SAaGsLeo6Xj4mEbA6Gqx+oTfZFkxnd1grT2X007HfJNs3T5BsiVg==", "dev": true, "license": "MIT", "dependencies": { - "is-any-array": "^2.0.0", - "ml-array-max": "^1.2.4", - "ml-array-min": "^1.2.3" + "is-any-array": "^3.0.0", + "ml-array-max": "^2.0.0", + "ml-array-min": "^2.0.0" } }, "node_modules/ml-logistic-regression": { @@ -14033,14 +14235,14 @@ } }, "node_modules/ml-matrix": { - "version": "6.12.1", - "resolved": "https://registry.npmjs.org/ml-matrix/-/ml-matrix-6.12.1.tgz", - "integrity": "sha512-TJ+8eOFdp+INvzR4zAuwBQJznDUfktMtOB6g/hUcGh3rcyjxbz4Te57Pgri8Q9bhSQ7Zys4IYOGhFdnlgeB6Lw==", + "version": "6.12.2", + "resolved": "https://registry.npmjs.org/ml-matrix/-/ml-matrix-6.12.2.tgz", + "integrity": "sha512-GC+BnW+pBh8Auap8goAxY0senAmF0IEoc3HNVSfnfbvGw0buuDIYb9kAKMS1l+GiwJ1rfK2bzJ8IHhwjzATSFA==", "dev": true, "license": "MIT", "dependencies": { - "is-any-array": "^2.0.1", - "ml-array-rescale": "^1.3.7" + "is-any-array": "^3.0.0", + "ml-array-rescale": "^2.0.0" } }, "node_modules/modify-values": { @@ -15307,15 +15509,6 @@ "node": ">=6" } }, - "node_modules/parent-require": { - "version": "1.0.0", - "resolved": "https://registry.npmjs.org/parent-require/-/parent-require-1.0.0.tgz", - "integrity": "sha512-2MXDNZC4aXdkkap+rBBMv0lUsfJqvX5/2FiYYnfCnorZt3Pk06/IOR5KeaoghgS2w07MLWgjbsnyaq6PdHn2LQ==", - "dev": true, - "engines": { - "node": ">= 0.4.0" - } - }, "node_modules/parse-conflict-json": { "version": "3.0.1", "resolved": 
"https://registry.npmjs.org/parse-conflict-json/-/parse-conflict-json-3.0.1.tgz", @@ -15519,19 +15712,6 @@ "through": "~2.3" } }, - "node_modules/peek-readable": { - "version": "7.0.0", - "resolved": "https://registry.npmjs.org/peek-readable/-/peek-readable-7.0.0.tgz", - "integrity": "sha512-nri2TO5JE3/mRryik9LlHFT53cgHfRK0Lt0BAZQXku/AW3E6XLt2GaY8siWi7dvW/m1z0ecn+J+bpDa9ZN3IsQ==", - "license": "MIT", - "engines": { - "node": ">=18" - }, - "funding": { - "type": "github", - "url": "https://github.com/sponsors/Borewit" - } - }, "node_modules/pend": { "version": "1.2.0", "resolved": "https://registry.npmjs.org/pend/-/pend-1.2.0.tgz", @@ -15965,9 +16145,9 @@ } }, "node_modules/proxy-chain": { - "version": "2.5.8", - "resolved": "https://registry.npmjs.org/proxy-chain/-/proxy-chain-2.5.8.tgz", - "integrity": "sha512-TqKOYRD/1Gga/JhiwmdYHJoj0zMJkKGofQ9bHQuSm+vexczatt81fkUHTVMyci+2mWczXiTNv1Eom+2v3Da5og==", + "version": "2.7.1", + "resolved": "https://registry.npmjs.org/proxy-chain/-/proxy-chain-2.7.1.tgz", + "integrity": "sha512-LtXu0miohJYrHWJxv8wA6EoGreRcX1hxKb7qlE1pMFH+BXE7bqMvpyhzR/JvR6M5SzYKzyHFpvfmYJrZeMtwAg==", "dev": true, "license": "Apache-2.0", "dependencies": { @@ -17611,13 +17791,12 @@ } }, "node_modules/strtok3": { - "version": "10.2.2", - "resolved": "https://registry.npmjs.org/strtok3/-/strtok3-10.2.2.tgz", - "integrity": "sha512-Xt18+h4s7Z8xyZ0tmBoRmzxcop97R4BAh+dXouUDCYn+Em+1P3qpkUfI5ueWLT8ynC5hZ+q4iPEmGG1urvQGBg==", + "version": "10.3.5", + "resolved": "https://registry.npmjs.org/strtok3/-/strtok3-10.3.5.tgz", + "integrity": "sha512-ki4hZQfh5rX0QDLLkOCj+h+CVNkqmp/CMf8v8kZpkNVK6jGQooMytqzLZYUVYIZcFZ6yDB70EfD8POcFXiF5oA==", "license": "MIT", "dependencies": { - "@tokenizer/token": "^0.3.0", - "peek-readable": "^7.0.0" + "@tokenizer/token": "^0.3.0" }, "engines": { "node": ">=18" @@ -18020,11 +18199,12 @@ } }, "node_modules/token-types": { - "version": "6.0.0", - "resolved": "https://registry.npmjs.org/token-types/-/token-types-6.0.0.tgz", - "integrity": 
"sha512-lbDrTLVsHhOMljPscd0yitpozq7Ga2M5Cvez5AjGg8GASBjtt6iERCAJ93yommPmz62fb45oFIXHEZ3u9bfJEA==", + "version": "6.1.2", + "resolved": "https://registry.npmjs.org/token-types/-/token-types-6.1.2.tgz", + "integrity": "sha512-dRXchy+C0IgK8WPC6xvCHFRIWYUbqqdEIKPaKo/AcTUNzwLTK6AH7RjdLWsEZcAN/TBdtfUw3PYEgPr5VPr6ww==", "license": "MIT", "dependencies": { + "@borewit/text-codec": "^0.2.1", "@tokenizer/token": "^0.3.0", "ieee754": "^1.2.1" }, @@ -18040,6 +18220,7 @@ "version": "5.1.2", "resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-5.1.2.tgz", "integrity": "sha512-FVDYdxtnj0G6Qm/DhNPSb8Ju59ULcup3tuJxkFb5K8Bv2pUXILbf0xZWU8PX8Ov19OXljbUyveOFwRMwkXzO+A==", + "dev": true, "license": "BSD-3-Clause", "dependencies": { "tldts": "^6.1.32" @@ -18052,6 +18233,7 @@ "version": "6.1.86", "resolved": "https://registry.npmjs.org/tldts/-/tldts-6.1.86.tgz", "integrity": "sha512-WMi/OQ2axVTf/ykqCQgXiIct+mSQDFdH2fkwhPwgEwvJ1kSzZRiinb0zF2Xb8u4+OqPChmyI6MEu4EezNJz+FQ==", + "dev": true, "license": "MIT", "dependencies": { "tldts-core": "^6.1.86" @@ -18064,6 +18246,7 @@ "version": "6.1.86", "resolved": "https://registry.npmjs.org/tldts-core/-/tldts-core-6.1.86.tgz", "integrity": "sha512-Je6p7pkk+KMzMv2XXKmAE3McmolOQFdxkKw0R8EYNr7sELW46JqnNeTX8ybPiQgvg1ymCoF8LXs5fzFaZvJPTA==", + "dev": true, "license": "MIT" }, "node_modules/tr46": { @@ -18521,9 +18704,9 @@ "license": "ISC" }, "node_modules/uint8array-extras": { - "version": "1.4.0", - "resolved": "https://registry.npmjs.org/uint8array-extras/-/uint8array-extras-1.4.0.tgz", - "integrity": "sha512-ZPtzy0hu4cZjv3z5NW9gfKnNLjoz4y6uv4HlelAjDK7sY/xOkKZv9xK/WQpcsBB3jEybChz9DPC2U/+cusjJVQ==", + "version": "1.5.0", + "resolved": "https://registry.npmjs.org/uint8array-extras/-/uint8array-extras-1.5.0.tgz", + "integrity": "sha512-rvKSBiC5zqCCiDZ9kAOszZcDvdAHwwIKJG33Ykj43OKcWsnmcBRL09YTU4nOeHZ8Y2a7l1MgTd08SBe9A8Qj6A==", "license": "MIT", "engines": { "node": ">=18" @@ -18552,12 +18735,12 @@ } }, "node_modules/undici": { - "version": 
"6.21.3", - "resolved": "https://registry.npmjs.org/undici/-/undici-6.21.3.tgz", - "integrity": "sha512-gBLkYIlEnSp8pFbT64yFgGE6UIB9tAkhukC23PmMDCe5Nd+cRqKxSjw5y54MK2AZMgZfJWMaNE4nYUHgi1XEOw==", + "version": "7.25.0", + "resolved": "https://registry.npmjs.org/undici/-/undici-7.25.0.tgz", + "integrity": "sha512-xXnp4kTyor2Zq+J1FfPI6Eq3ew5h6Vl0F/8d9XU5zZQf1tX9s2Su1/3PiMmUANFULpmksxkClamIZcaUqryHsQ==", "license": "MIT", "engines": { - "node": ">=18.17" + "node": ">=20.18.1" } }, "node_modules/undici-types": { @@ -19894,88 +20077,6 @@ "node": ">= 14.6" } }, - "node_modules/yargonaut": { - "version": "1.1.4", - "resolved": "https://registry.npmjs.org/yargonaut/-/yargonaut-1.1.4.tgz", - "integrity": "sha512-rHgFmbgXAAzl+1nngqOcwEljqHGG9uUZoPjsdZEs1w5JW9RXYzrSvH/u70C1JE5qFi0qjsdhnUX/dJRpWqitSA==", - "dev": true, - "license": "Apache-2.0", - "dependencies": { - "chalk": "^1.1.1", - "figlet": "^1.1.1", - "parent-require": "^1.0.0" - } - }, - "node_modules/yargonaut/node_modules/ansi-regex": { - "version": "2.1.1", - "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-2.1.1.tgz", - "integrity": "sha512-TIGnTpdo+E3+pCyAluZvtED5p5wCqLdezCyhPZzKPcxvFplEt4i+W7OONCKgeZFT3+y5NZZfOOS/Bdcanm1MYA==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=0.10.0" - } - }, - "node_modules/yargonaut/node_modules/ansi-styles": { - "version": "2.2.1", - "resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-2.2.1.tgz", - "integrity": "sha512-kmCevFghRiWM7HB5zTPULl4r9bVFSWjz62MhqizDGUrq2NWuNMQyuv4tHHoKJHs69M/MF64lEcHdYIocrdWQYA==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=0.10.0" - } - }, - "node_modules/yargonaut/node_modules/chalk": { - "version": "1.1.3", - "resolved": "https://registry.npmjs.org/chalk/-/chalk-1.1.3.tgz", - "integrity": "sha512-U3lRVLMSlsCfjqYPbLyVv11M9CPW4I728d6TCKMAOJueEeB9/8o+eSsMnxPJD+Q+K909sdESg7C+tIkoH6on1A==", - "dev": true, - "license": "MIT", - "dependencies": { - "ansi-styles": "^2.2.1", - 
"escape-string-regexp": "^1.0.2", - "has-ansi": "^2.0.0", - "strip-ansi": "^3.0.0", - "supports-color": "^2.0.0" - }, - "engines": { - "node": ">=0.10.0" - } - }, - "node_modules/yargonaut/node_modules/escape-string-regexp": { - "version": "1.0.5", - "resolved": "https://registry.npmjs.org/escape-string-regexp/-/escape-string-regexp-1.0.5.tgz", - "integrity": "sha512-vbRorB5FUQWvla16U8R/qgaFIya2qGzwDrNmCZuYKrbdSUMG6I1ZCGQRefkRVhuOkIGVne7BQ35DSfo1qvJqFg==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=0.8.0" - } - }, - "node_modules/yargonaut/node_modules/strip-ansi": { - "version": "3.0.1", - "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-3.0.1.tgz", - "integrity": "sha512-VhumSSbBqDTP8p2ZLKj40UjBCV4+v8bUSEpUb4KjRgWk9pbqGF4REFj6KEagidb2f/M6AzC0EmFyDNGaw9OCzg==", - "dev": true, - "license": "MIT", - "dependencies": { - "ansi-regex": "^2.0.0" - }, - "engines": { - "node": ">=0.10.0" - } - }, - "node_modules/yargonaut/node_modules/supports-color": { - "version": "2.0.0", - "resolved": "https://registry.npmjs.org/supports-color/-/supports-color-2.0.0.tgz", - "integrity": "sha512-KKNVtd6pCYgPIKU4cp2733HWYCpplQhddZLBUryaAHou723x+FRzQ5Df824Fj+IyyuiQTRoub4SnIFfIcrp70g==", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=0.8.0" - } - }, "node_modules/yargs": { "version": "17.7.2", "resolved": "https://registry.npmjs.org/yargs/-/yargs-17.7.2.tgz", @@ -20030,9 +20131,9 @@ } }, "node_modules/yoctocolors-cjs": { - "version": "2.1.2", - "resolved": "https://registry.npmjs.org/yoctocolors-cjs/-/yoctocolors-cjs-2.1.2.tgz", - "integrity": "sha512-cYVsTjKl8b+FrnidjibDWskAv7UKOfcwaVZdp/it9n1s9fU3IkgDbhdIRKCW4JDsAlECJY0ytoVPT3sK6kideA==", + "version": "2.1.3", + "resolved": "https://registry.npmjs.org/yoctocolors-cjs/-/yoctocolors-cjs-2.1.3.tgz", + "integrity": "sha512-U/PBtDf35ff0D8X8D0jfdzHYEPFxAI7jJlxZXwCSez5M3190m+QobIfh+sWDWSHMCWWJN2AWamkegn6vr6YBTw==", "dev": true, "license": "MIT", "engines": { @@ -20046,7 +20147,6 @@ "version": 
"3.25.28", "resolved": "https://registry.npmjs.org/zod/-/zod-3.25.28.tgz", "integrity": "sha512-/nt/67WYKnr5by3YS7LroZJbtcCBurDKKPBPWWzaxvVCGuG/NOsiKkrjoOhI8mJ+SQUXEbUzeB3S+6XDUEEj7Q==", - "dev": true, "license": "MIT", "funding": { "url": "https://github.com/sponsors/colinhacks" @@ -20186,16 +20286,17 @@ "@apify/log": "^2.5.18", "@apify/timeout": "^0.3.2", "@apify/utilities": "^2.15.5", - "@crawlee/core": "^4.0.0-beta.0", - "@crawlee/types": "^4.0.0-beta.0", - "@crawlee/utils": "^4.0.0-beta.0", + "@crawlee/core": "^4.0.0-beta.56", + "@crawlee/types": "^4.0.0-beta.56", + "@crawlee/utils": "^4.0.0-beta.56", "apify-client": "^2.12.4", "fs-extra": "^11.3.0", "got-scraping": "^4.1.1", "ow": "^2.0.0", "semver": "^7.7.2", "tslib": "^2.8.1", - "ws": "^8.18.2" + "ws": "^8.18.2", + "zod": "^3.24.0 || ^4.0.0" }, "engines": { "node": ">=22.0.0" diff --git a/package.json b/package.json index 22528a5dce..815ed78957 100644 --- a/package.json +++ b/package.json @@ -70,6 +70,9 @@ "@apify/input_secrets": "^1.1.72", "@apify/tsconfig": "^0.1.1", "@commitlint/config-conventional": "^19.8.1", + "@crawlee/core": "^4.0.0-beta.56", + "@crawlee/types": "^4.0.0-beta.56", + "@crawlee/utils": "^4.0.0-beta.56", "@playwright/browser-chromium": "^1.52.0", "@types/content-type": "^1.1.8", "@types/fs-extra": "^11.0.4", @@ -78,10 +81,7 @@ "@types/tough-cookie": "^4.0.5", "@types/ws": "^8.18.1", "commitlint": "^19.8.1", - "crawlee": "^4.0.0-beta.0", - "@crawlee/core": "^4.0.0-beta.0", - "@crawlee/types": "^4.0.0-beta.0", - "@crawlee/utils": "^4.0.0-beta.0", + "crawlee": "^4.0.0-beta.56", "eslint": "^9.27.0", "eslint-config-prettier": "^10.1.5", "fs-extra": "^11.3.0", diff --git a/packages/apify/package.json b/packages/apify/package.json index f3ddcf0be1..cd0430ffc3 100644 --- a/packages/apify/package.json +++ b/packages/apify/package.json @@ -53,15 +53,16 @@ "@apify/log": "^2.5.18", "@apify/timeout": "^0.3.2", "@apify/utilities": "^2.15.5", - "@crawlee/core": "^4.0.0-beta.0", - "@crawlee/types": 
"^4.0.0-beta.0", - "@crawlee/utils": "^4.0.0-beta.0", + "@crawlee/core": "^4.0.0-beta.56", + "@crawlee/types": "^4.0.0-beta.56", + "@crawlee/utils": "^4.0.0-beta.56", "apify-client": "^2.12.4", "fs-extra": "^11.3.0", "got-scraping": "^4.1.1", "ow": "^2.0.0", "semver": "^7.7.2", "tslib": "^2.8.1", - "ws": "^8.18.2" + "ws": "^8.18.2", + "zod": "^3.24.0 || ^4.0.0" } } diff --git a/packages/apify/src/actor.ts b/packages/apify/src/actor.ts index 627cacaefb..859e723a31 100644 --- a/packages/apify/src/actor.ts +++ b/packages/apify/src/actor.ts @@ -1,20 +1,19 @@ import { createPrivateKey } from 'node:crypto'; import type { - ConfigurationOptions, EventManager, EventTypeName, IStorage, RecordOptions, + StorageOpenOptions, UseStateOptions, } from '@crawlee/core'; import { - Configuration as CoreConfiguration, Dataset, EventType, purgeDefaultStorages, RequestQueue, - StorageManager, + serviceLocator, } from '@crawlee/core'; import type { Awaitable, @@ -44,8 +43,10 @@ import { decryptInputSecrets } from '@apify/input_secrets'; import log from '@apify/log'; import { addTimeoutToPromise } from '@apify/timeout'; +import { ApifyStorageClient } from './apify_storage_client.js'; import type { ChargeOptions, ChargeResult } from './charging.js'; import { ChargingManager } from './charging.js'; +import type { ConfigurationOptions } from './configuration.js'; import { Configuration } from './configuration.js'; import { KeyValueStore } from './key_value_store.js'; import { PlatformEventManager } from './platform_event_manager.js'; @@ -490,19 +491,21 @@ export class Actor { printOutdatedSdkWarning(); // reset global config instance to respect APIFY_ prefixed env vars - CoreConfiguration.globalConfig = Configuration.getGlobalConfig(); + serviceLocator.setConfiguration(Configuration.getGlobalConfig()); if (this.isAtHome()) { - this.config.set('availableMemoryRatio', 1); - this.config.set('disableBrowserSandbox', true); // for browser launcher, adds `--no-sandbox` to args - 
this.config.useStorageClient(this.apifyClient); - this.config.useEventManager(this.eventManager); + // availableMemoryRatio and disableBrowserSandbox are now set via + // conditional defaults in the Configuration constructor (isAtHome check) + serviceLocator.setStorageClient( + new ApifyStorageClient(this.apifyClient), + ); + serviceLocator.setEventManager(this.eventManager); } else if (options.storage) { - this.config.useStorageClient(options.storage); + serviceLocator.setStorageClient(options.storage); } // Init the event manager the config uses - await this.config.getEventManager().init(); + await serviceLocator.getEventManager().init(); log.debug(`Events initialized`); await purgeDefaultStorages({ @@ -534,8 +537,8 @@ export class Actor { options.exit ??= true; options.exitCode ??= EXIT_CODES.SUCCESS; options.timeoutSecs ??= 30; - const client = this.config.getStorageClient(); - const events = this.config.getEventManager(); + const client = serviceLocator.getStorageClient(); + const events = serviceLocator.getEventManager(); // Close the event manager and emit the final PERSIST_STATE event await events.close(); @@ -601,14 +604,14 @@ export class Actor { * @ignore */ on(event: EventTypeName, listener: (...args: any[]) => any): void { - this.config.getEventManager().on(event, listener); + serviceLocator.getEventManager().on(event, listener); } /** * @ignore */ off(event: EventTypeName, listener?: (...args: any[]) => any): void { - this.config.getEventManager().off(event, listener); + serviceLocator.getEventManager().off(event, listener); } /** @@ -776,12 +779,10 @@ export class Actor { } const { - customAfterSleepMillis = this.config.get( - 'metamorphAfterSleepMillis', - ), + customAfterSleepMillis = this.config.metamorphAfterSleepMillis, ...metamorphOpts } = options; - const runId = this.config.get('actorRunId')!; + const runId = this.config.actorRunId!; await this.apifyClient .run(runId) .metamorph(targetActorId, input, metamorphOpts); @@ -815,27 +816,24 @@ 
export class Actor { this.isRebooting = true; // Waiting for all the listeners to finish, as `.reboot()` kills the container. + const eventManager = serviceLocator.getEventManager(); await Promise.all([ // `persistState` for individual RequestLists, RequestQueue... instances to be persisted - ...this.config - .getEventManager() + ...eventManager .listeners(EventType.PERSIST_STATE) - .map(async (x) => x()), + .map(async (x: (...args: any[]) => any) => x()), // `migrating` to pause Apify crawlers - ...this.config - .getEventManager() + ...eventManager .listeners(EventType.MIGRATING) - .map(async (x) => x()), + .map(async (x: (...args: any[]) => any) => x()), ]); - const runId = this.config.get('actorRunId')!; + const runId = this.config.actorRunId!; await this.apifyClient.run(runId).reboot(); // Wait some time for container to be stopped. const { - customAfterSleepMillis = this.config.get( - 'metamorphAfterSleepMillis', - ), + customAfterSleepMillis = this.config.metamorphAfterSleepMillis, } = options; await sleep(customAfterSleepMillis); } @@ -873,7 +871,7 @@ export class Actor { return undefined; } - const runId = this.config.get('actorRunId')!; + const runId = this.config.actorRunId!; if (!runId) { throw new Error( `Environment variable ${ACTOR_ENV_VARS.RUN_ID} is not set!`, @@ -924,7 +922,7 @@ export class Actor { break; } - const client = this.config.getStorageClient(); + const client = serviceLocator.getStorageClient(); // just to be sure, this should be fast await addTimeoutToPromise( @@ -937,7 +935,7 @@ export class Actor { 'Setting status message timed out after 1s', ).catch((e) => log.warning(e.message)); - const runId = this.config.get('actorRunId')!; + const runId = this.config.actorRunId!; if (runId) { // just to be sure, this should be fast @@ -1213,13 +1211,9 @@ export class Actor { async getInput(): Promise { this._ensureActorInit('getInput'); - const inputSecretsPrivateKeyFile = this.config.get( - 'inputSecretsPrivateKeyFile', - ); - const 
inputSecretsPrivateKeyPassphrase = this.config.get( - 'inputSecretsPrivateKeyPassphrase', - ); - const input = await this.getValue(this.config.get('inputKey')); + const { inputSecretsPrivateKeyFile } = this.config; + const { inputSecretsPrivateKeyPassphrase } = this.config; + const input = await this.getValue(this.config.inputKey); if ( ow.isValid(input, ow.object.nonEmpty) && inputSecretsPrivateKeyFile && @@ -1319,7 +1313,7 @@ export class Actor { // eslint-disable-next-line dot-notation queue['initialCount'] = - (await queue.client.get())?.totalRequestCount ?? 0; + (await queue.client.getMetadata())?.totalRequestCount ?? 0; return queue; } @@ -1476,13 +1470,12 @@ export class Actor { * @ignore */ newClient(options: ApifyClientOptions = {}): ApifyClient { - const { storageDir, ...storageClientOptions } = this.config.get( - 'storageClientOptions', - ) as Dictionary; + const { storageDir, ...storageClientOptions } = (this.config + .storageClientOptions ?? {}) as Dictionary; const { apifyVersion, crawleeVersion } = getSystemInfo(); return new ApifyClient({ - baseUrl: this.config.get('apiBaseUrl'), - token: this.config.get('token'), + baseUrl: this.config.apiBaseUrl, + token: this.config.token, userAgentSuffix: [ `SDK/${apifyVersion}`, `Crawlee/${crawleeVersion}`, @@ -2240,18 +2233,30 @@ export class Actor { return this._instance; } + /** + * Drops the cached default `Actor` instance and resets the global `Configuration` (and the + * rest of the service tree via crawlee's `serviceLocator`). Intended for tests that mutate + * `process.env` at runtime — see {@link Configuration.reset} for the underlying mechanism. 
+ * + * @internal + */ + static resetGlobalState(): void { + // eslint-disable-next-line no-underscore-dangle -- `_instance` is the lazy default-Actor cache + delete (Actor as { _instance?: Actor })._instance; + Configuration.reset(); + } + private async _openStorage( - storageClass: Constructor, + storageClass: Constructor & { + open(id?: string | null, options?: StorageOpenOptions): Promise; + }, id?: string, options: OpenStorageOptions = {}, ) { - const client = options.forceCloud ? this.apifyClient : undefined; - return StorageManager.openStorage( - storageClass, - id, - client, - this.config, - ); + const storageClient = options.forceCloud + ? new ApifyStorageClient(this.apifyClient) + : undefined; + return storageClass.open(id ?? null, { storageClient }); } private _ensureActorInit(methodCalled: string) { diff --git a/packages/apify/src/apify_storage_client.ts b/packages/apify/src/apify_storage_client.ts new file mode 100644 index 0000000000..6301411010 --- /dev/null +++ b/packages/apify/src/apify_storage_client.ts @@ -0,0 +1,68 @@ +import type { + CreateDatasetClientOptions, + CreateKeyValueStoreClientOptions, + CreateRequestQueueClientOptions, + DatasetClient, + KeyValueStoreClient, + RequestQueueClient, + StorageClient, +} from '@crawlee/types'; +import type { ApifyClient } from 'apify-client'; + +/** + * Bridges `apify-client`'s synchronous resource accessors (`dataset(id)`, + * `keyValueStore(id)`, `requestQueue(id, options?)`) to crawlee v4's + * `StorageClient` interface (async factory methods accepting either an `id` + * or a `name`). + * + * When only a `name` is provided, we resolve it to a concrete ID via the + * collection client's `getOrCreate(name)` — matching the behaviour the SDK + * relied on in v3 when storages were opened by name. 
+ */ +export class ApifyStorageClient implements StorageClient { + constructor(private readonly client: ApifyClient) {} + + async createDatasetClient( + options?: CreateDatasetClientOptions, + ): Promise<DatasetClient> { + const id = + options?.id ?? + (options?.name + ? (await this.client.datasets().getOrCreate(options.name)).id + : undefined); + // apify-client's resource clients overlap with `@crawlee/types`' shapes + // but don't yet implement the v4-added members (`getMetadata`, + // `getRecordPublicUrl`). Cast through for now; a follow-up should + // bring apify-client into structural alignment. + return this.client.dataset(id ?? '') as unknown as DatasetClient; + } + + async createKeyValueStoreClient( + options?: CreateKeyValueStoreClientOptions, + ): Promise<KeyValueStoreClient> { + const id = + options?.id ?? + (options?.name + ? (await this.client.keyValueStores().getOrCreate(options.name)) + .id + : undefined); + return this.client.keyValueStore( + id ?? '', + ) as unknown as KeyValueStoreClient; + } + + async createRequestQueueClient( + options?: CreateRequestQueueClientOptions, + ): Promise<RequestQueueClient> { + const id = + options?.id ?? + (options?.name + ? (await this.client.requestQueues().getOrCreate(options.name)) + .id + : undefined); + return this.client.requestQueue( + id ?? '', + options?.clientKey ? 
{ clientKey: options.clientKey } : undefined, + ) as unknown as RequestQueueClient; + } +} diff --git a/packages/apify/src/charging.ts b/packages/apify/src/charging.ts index 5ac7777846..44ae99e0dc 100644 --- a/packages/apify/src/charging.ts +++ b/packages/apify/src/charging.ts @@ -87,12 +87,11 @@ export class ChargingManager { private apifyClient: ApifyClient; constructor(configuration: Configuration, apifyClient: ApifyClient) { - this.maxTotalChargeUsd = - configuration.get('maxTotalChargeUsd') || Infinity; // convert `0` to `Infinity` in case the value is an empty string - this.isAtHome = configuration.get('isAtHome'); - this.actorRunId = configuration.get('actorRunId'); - this.purgeChargingLogDataset = configuration.get('purgeOnStart'); - this.useChargingLogDataset = configuration.get('useChargingLogDataset'); + this.maxTotalChargeUsd = configuration.maxTotalChargeUsd; + this.isAtHome = configuration.isAtHome; + this.actorRunId = configuration.actorRunId; + this.purgeChargingLogDataset = configuration.purgeOnStart; + this.useChargingLogDataset = configuration.useChargingLogDataset; if (this.useChargingLogDataset && this.isAtHome) { throw new Error( @@ -100,7 +99,7 @@ export class ChargingManager { ); } - if (configuration.get('testPayPerEvent')) { + if (configuration.testPayPerEvent) { if (this.isAtHome) { throw new Error( 'Using the ACTOR_TEST_PAY_PER_EVENT environment variable is only supported in a local development environment', diff --git a/packages/apify/src/configuration.ts b/packages/apify/src/configuration.ts index b8dfacd42c..9908777966 100644 --- a/packages/apify/src/configuration.ts +++ b/packages/apify/src/configuration.ts @@ -1,7 +1,16 @@ -import type { ConfigurationOptions as CoreConfigurationOptions } from '@crawlee/core'; -import { Configuration as CoreConfiguration } from '@crawlee/core'; +/* eslint-disable no-use-before-define */ +import { AsyncLocalStorage } from 'node:async_hooks'; + +import type { ConfigField, FieldsInput, FieldsOutput } 
from '@crawlee/core'; +import { + coerceBoolean, + coerceNumber, + Configuration as CoreConfiguration, + crawleeConfigFields, + field, +} from '@crawlee/core'; +import { z } from 'zod'; -import type { META_ORIGINS } from '@apify/consts'; import { ACTOR_ENV_VARS, APIFY_ENV_VARS, @@ -9,37 +18,199 @@ import { LOCAL_APIFY_ENV_VARS, } from '@apify/consts'; -export interface ConfigurationOptions extends CoreConfigurationOptions { - metamorphAfterSleepMillis?: number; - actorEventsWsUrl?: string; - token?: string; - actorId?: string; - actorRunId?: string; - actorTaskId?: string; - apiBaseUrl?: string; - // apiBaseUrl is the internal API URL, accessible only within the platform(private network), - // while apiPublicBaseUrl is the public API URL, available externally(through internet). - apiPublicBaseUrl?: string; - containerPort?: number; - containerUrl?: string; - proxyHostname?: string; - proxyPassword?: string; - proxyPort?: number; - proxyStatusUrl?: string; - /** - * @deprecated use `containerPort` instead - */ - standbyPort?: number; - standbyUrl?: string; - isAtHome?: boolean; - userId?: string; - inputSecretsPrivateKeyPassphrase?: string; - inputSecretsPrivateKeyFile?: string; - maxTotalChargeUsd?: number; - metaOrigin?: (typeof META_ORIGINS)[keyof typeof META_ORIGINS]; - testPayPerEvent?: boolean; - useChargingLogDataset?: boolean; -} +// --- isAtHome check (simple env var presence) --- +const isAtHome = !!process.env[APIFY_ENV_VARS.IS_AT_HOME]; + +// --- Apify config field definitions --- + +export const apifyConfigFields = { + // Inherit all crawlee fields, overriding env vars where the SDK supports ACTOR_/APIFY_ aliases + ...crawleeConfigFields, + + // Override crawlee fields with ACTOR_/APIFY_ env var aliases + defaultDatasetId: field( + z + .string() + .default(LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.DEFAULT_DATASET_ID]), + [ + ACTOR_ENV_VARS.DEFAULT_DATASET_ID, + APIFY_ENV_VARS.DEFAULT_DATASET_ID, + 'CRAWLEE_DEFAULT_DATASET_ID', + ], + ), + 
defaultKeyValueStoreId: field( + z + .string() + .default( + LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.DEFAULT_KEY_VALUE_STORE_ID], + ), + [ + ACTOR_ENV_VARS.DEFAULT_KEY_VALUE_STORE_ID, + APIFY_ENV_VARS.DEFAULT_KEY_VALUE_STORE_ID, + 'CRAWLEE_DEFAULT_KEY_VALUE_STORE_ID', + ], + ), + defaultRequestQueueId: field( + z + .string() + .default( + LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.DEFAULT_REQUEST_QUEUE_ID], + ), + [ + ACTOR_ENV_VARS.DEFAULT_REQUEST_QUEUE_ID, + APIFY_ENV_VARS.DEFAULT_REQUEST_QUEUE_ID, + 'CRAWLEE_DEFAULT_REQUEST_QUEUE_ID', + ], + ), + inputKey: field(z.string().default('INPUT'), [ + ACTOR_ENV_VARS.INPUT_KEY, + APIFY_ENV_VARS.INPUT_KEY, + 'CRAWLEE_INPUT_KEY', + ]), + memoryMbytes: field(coerceNumber.optional(), [ + ACTOR_ENV_VARS.MEMORY_MBYTES, + APIFY_ENV_VARS.MEMORY_MBYTES, + 'CRAWLEE_MEMORY_MBYTES', + ]), + availableMemoryRatio: field(coerceNumber.default(isAtHome ? 1 : 0.25), [ + 'CRAWLEE_AVAILABLE_MEMORY_RATIO', + 'APIFY_AVAILABLE_MEMORY_RATIO', + ]), + disableBrowserSandbox: field( + isAtHome ? 
coerceBoolean.default(true) : coerceBoolean.optional(), + ['CRAWLEE_DISABLE_BROWSER_SANDBOX', 'APIFY_DISABLE_BROWSER_SANDBOX'], + ), + persistStateIntervalMillis: field(coerceNumber.default(60_000), [ + 'CRAWLEE_PERSIST_STATE_INTERVAL_MILLIS', + APIFY_ENV_VARS.PERSIST_STATE_INTERVAL_MILLIS, + 'APIFY_TEST_PERSIST_INTERVAL_MILLIS', + ]), + headless: field(coerceBoolean.default(true), [ + 'CRAWLEE_HEADLESS', + APIFY_ENV_VARS.HEADLESS, + ]), + xvfb: field(coerceBoolean.default(false), [ + 'CRAWLEE_XVFB', + APIFY_ENV_VARS.XVFB, + ]), + chromeExecutablePath: field(z.string().optional(), [ + 'CRAWLEE_CHROME_EXECUTABLE_PATH', + APIFY_ENV_VARS.CHROME_EXECUTABLE_PATH, + ]), + defaultBrowserPath: field(z.string().optional(), [ + 'CRAWLEE_DEFAULT_BROWSER_PATH', + 'APIFY_DEFAULT_BROWSER_PATH', + ]), + purgeOnStart: field(coerceBoolean.default(true), [ + 'CRAWLEE_PURGE_ON_START', + APIFY_ENV_VARS.PURGE_ON_START, + ]), + + // Apify-specific fields + metamorphAfterSleepMillis: field( + coerceNumber.default(300_000), + APIFY_ENV_VARS.METAMORPH_AFTER_SLEEP_MILLIS, + ), + actorEventsWsUrl: field(z.string().optional(), [ + ACTOR_ENV_VARS.EVENTS_WEBSOCKET_URL, + APIFY_ENV_VARS.ACTOR_EVENTS_WS_URL, + ]), + token: field(z.string().optional(), APIFY_ENV_VARS.TOKEN), + actorId: field(z.string().optional(), [ + ACTOR_ENV_VARS.ID, + APIFY_ENV_VARS.ACTOR_ID, + ]), + actorRunId: field(z.string().optional(), [ + ACTOR_ENV_VARS.RUN_ID, + APIFY_ENV_VARS.ACTOR_RUN_ID, + ]), + actorTaskId: field(z.string().optional(), [ + ACTOR_ENV_VARS.TASK_ID, + APIFY_ENV_VARS.ACTOR_TASK_ID, + ]), + apiBaseUrl: field( + z.string().default('https://api.apify.com'), + APIFY_ENV_VARS.API_BASE_URL, + ), + apiPublicBaseUrl: field( + z.string().default('https://api.apify.com'), + APIFY_ENV_VARS.API_PUBLIC_BASE_URL, + ), + containerPort: field( + coerceNumber.default( + +LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.WEB_SERVER_PORT], + ), + [ACTOR_ENV_VARS.WEB_SERVER_PORT, APIFY_ENV_VARS.CONTAINER_PORT], + ), + containerUrl: 
field( + z.string().default(LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.WEB_SERVER_URL]), + [ACTOR_ENV_VARS.WEB_SERVER_URL, APIFY_ENV_VARS.CONTAINER_URL], + ), + proxyHostname: field( + z.string().default(LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_HOSTNAME]), + APIFY_ENV_VARS.PROXY_HOSTNAME, + ), + proxyPassword: field(z.string().optional(), APIFY_ENV_VARS.PROXY_PASSWORD), + proxyPort: field( + coerceNumber.default(+LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_PORT]), + APIFY_ENV_VARS.PROXY_PORT, + ), + proxyStatusUrl: field( + z.string().default('http://proxy.apify.com'), + APIFY_ENV_VARS.PROXY_STATUS_URL, + ), + /** @deprecated use `containerPort` instead */ + standbyPort: field( + coerceNumber.default( + +LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.STANDBY_PORT], + ), + ACTOR_ENV_VARS.STANDBY_PORT, + ), + standbyUrl: field(z.string().optional(), ACTOR_ENV_VARS.STANDBY_URL), + isAtHome: field(coerceBoolean.default(false), APIFY_ENV_VARS.IS_AT_HOME), + userId: field(z.string().optional(), APIFY_ENV_VARS.USER_ID), + inputSecretsPrivateKeyPassphrase: field( + z.string().optional(), + APIFY_ENV_VARS.INPUT_SECRETS_PRIVATE_KEY_PASSPHRASE, + ), + inputSecretsPrivateKeyFile: field( + z.string().optional(), + APIFY_ENV_VARS.INPUT_SECRETS_PRIVATE_KEY_FILE, + ), + // `0` is treated as "no limit" (mirrors the Apify platform contract). + maxTotalChargeUsd: field( + coerceNumber + .transform((val: number) => (val === 0 ? Infinity : val)) + .default(Infinity), + ACTOR_ENV_VARS.MAX_TOTAL_CHARGE_USD, + ), + metaOrigin: field(z.string().optional(), APIFY_ENV_VARS.META_ORIGIN), + testPayPerEvent: field( + coerceBoolean.default(false), + 'ACTOR_TEST_PAY_PER_EVENT', + ), + useChargingLogDataset: field( + coerceBoolean.default(false), + 'ACTOR_USE_CHARGING_LOG_DATASET', + ), + // Grab-bag of ApifyClient constructor options; the `storageDir` key is + // pulled out separately for local storage emulation, the rest is spread + // into `new ApifyClient({...})` in `Actor.newClient()`. No env var alias. 
+ storageClientOptions: field(z.record(z.string(), z.unknown()).optional()), +}; + +// --- Type utilities --- + +export type ApifyConfigurationInput = FieldsInput; +export type ApifyResolvedConfigValues = FieldsOutput; + +/** @deprecated Use {@link ApifyConfigurationInput} instead. */ +export type ConfigurationOptions = ApifyConfigurationInput; + +// --- Configuration class --- + +// eslint-disable-next-line @typescript-eslint/no-empty-object-type, @typescript-eslint/no-unsafe-declaration-merging +export interface Configuration extends ApifyResolvedConfigValues {} /** * `Configuration` is a value object holding the SDK configuration. We can use it in two ways: @@ -48,38 +219,34 @@ export interface ConfigurationOptions extends CoreConfigurationOptions { * * ```javascript * import { Actor } from 'apify'; - * import { BasicCrawler } from 'crawlee'; * * const sdk = new Actor({ token: '123' }); - * console.log(sdk.config.get('token')); // '123' - * - * const crawler = new BasicCrawler({ - * // ... crawler options - * }, sdk.config); + * console.log(sdk.config.token); // '123' * ``` * * 2. To get the global configuration (singleton instance). It will respect the environment variables. * * ```javascript - * import { BasicCrawler, Configuration } from 'crawlee'; + * import { Configuration } from 'apify'; * - * // Get the global configuration * const config = Configuration.getGlobalConfig(); - * // Set the 'persistStateIntervalMillis' option - * // of global configuration to 30 seconds - * config.set('persistStateIntervalMillis', 30_000); - * - * // No need to pass the configuration to the crawler, - * // as it's using the global configuration by default - * const crawler = new BasicCrawler(); + * console.log(config.headless); + * console.log(config.persistStateIntervalMillis); * ``` * + * Configuration is immutable — values are set via the constructor and cannot be changed afterwards. 
+ * The priority order for resolving values is (highest to lowest): + * + * ```text + * constructor options > environment variables > crawlee.json > schema defaults + * ``` + * * ## Supported Configuration Options * * Key | Environment Variable | Default Value * ---|---|--- * `memoryMbytes` | `ACTOR_MEMORY_MBYTES` | - - * `headless` | `APIFY_HEADLESS` | - + * `headless` | `APIFY_HEADLESS` | `true` * `persistStateIntervalMillis` | `APIFY_PERSIST_STATE_INTERVAL_MILLIS` | `60e3` * `token` | `APIFY_TOKEN` | - * `isAtHome` | `APIFY_IS_AT_HOME` | - @@ -112,126 +279,19 @@ export interface ConfigurationOptions extends CoreConfigurationOptions { * `chromeExecutablePath` | `APIFY_CHROME_EXECUTABLE_PATH` | - * `defaultBrowserPath` | `APIFY_DEFAULT_BROWSER_PATH` | - */ +// eslint-disable-next-line @typescript-eslint/no-unsafe-declaration-merging export class Configuration extends CoreConfiguration { - /** @inheritDoc */ - // eslint-disable-next-line no-use-before-define -- Self-reference - static override globalConfig?: Configuration; - - // maps environment variables to config keys (e.g. 
`APIFY_MEMORY_MBYTES` to `memoryMbytes`) - protected static override ENV_MAP = { - // regular crawlee env vars are also supported - ...CoreConfiguration.ENV_MAP, - - // support crawlee env vars prefixed with `APIFY_` too - APIFY_AVAILABLE_MEMORY_RATIO: 'availableMemoryRatio', - APIFY_PURGE_ON_START: 'purgeOnStart', - APIFY_MEMORY_MBYTES: 'memoryMbytes', - APIFY_DEFAULT_DATASET_ID: 'defaultDatasetId', - APIFY_DEFAULT_KEY_VALUE_STORE_ID: 'defaultKeyValueStoreId', - APIFY_DEFAULT_REQUEST_QUEUE_ID: 'defaultRequestQueueId', - APIFY_INPUT_KEY: 'inputKey', - APIFY_PERSIST_STATE_INTERVAL_MILLIS: 'persistStateIntervalMillis', - APIFY_HEADLESS: 'headless', - APIFY_XVFB: 'xvfb', - APIFY_CHROME_EXECUTABLE_PATH: 'chromeExecutablePath', - APIFY_DEFAULT_BROWSER_PATH: 'defaultBrowserPath', - APIFY_DISABLE_BROWSER_SANDBOX: 'disableBrowserSandbox', - - // as well as apify specific ones - APIFY_TOKEN: 'token', - APIFY_METAMORPH_AFTER_SLEEP_MILLIS: 'metamorphAfterSleepMillis', - APIFY_TEST_PERSIST_INTERVAL_MILLIS: 'persistStateIntervalMillis', // for BC, seems to be unused - APIFY_ACTOR_EVENTS_WS_URL: 'actorEventsWsUrl', - APIFY_ACTOR_ID: 'actorId', - APIFY_API_BASE_URL: 'apiBaseUrl', - APIFY_API_PUBLIC_BASE_URL: 'apiPublicBaseUrl', - APIFY_IS_AT_HOME: 'isAtHome', - APIFY_ACTOR_RUN_ID: 'actorRunId', - APIFY_ACTOR_TASK_ID: 'actorTaskId', - APIFY_CONTAINER_PORT: 'containerPort', - APIFY_CONTAINER_URL: 'containerUrl', - APIFY_USER_ID: 'userId', - APIFY_PROXY_HOSTNAME: 'proxyHostname', - APIFY_PROXY_PASSWORD: 'proxyPassword', - APIFY_PROXY_STATUS_URL: 'proxyStatusUrl', - APIFY_PROXY_PORT: 'proxyPort', - APIFY_INPUT_SECRETS_PRIVATE_KEY_FILE: 'inputSecretsPrivateKeyFile', - APIFY_INPUT_SECRETS_PRIVATE_KEY_PASSPHRASE: - 'inputSecretsPrivateKeyPassphrase', - APIFY_META_ORIGIN: 'metaOrigin', - - // Actor env vars - ACTOR_DEFAULT_DATASET_ID: 'defaultDatasetId', - ACTOR_DEFAULT_KEY_VALUE_STORE_ID: 'defaultKeyValueStoreId', - ACTOR_DEFAULT_REQUEST_QUEUE_ID: 'defaultRequestQueueId', - 
ACTOR_EVENTS_WEBSOCKET_URL: 'actorEventsWsUrl', - ACTOR_ID: 'actorId', - ACTOR_INPUT_KEY: 'inputKey', - ACTOR_MEMORY_MBYTES: 'memoryMbytes', - ACTOR_RUN_ID: 'actorRunId', - ACTOR_STANDBY_PORT: 'standbyPort', - ACTOR_STANDBY_URL: 'standbyUrl', - ACTOR_TASK_ID: 'actorTaskId', - ACTOR_WEB_SERVER_PORT: 'containerPort', - ACTOR_WEB_SERVER_URL: 'containerUrl', - ACTOR_MAX_TOTAL_CHARGE_USD: 'maxTotalChargeUsd', - ACTOR_TEST_PAY_PER_EVENT: 'testPayPerEvent', - ACTOR_USE_CHARGING_LOG_DATASET: 'useChargingLogDataset', - }; - - protected static override INTEGER_VARS = [ - ...CoreConfiguration.INTEGER_VARS, - 'proxyPort', - 'containerPort', - 'metamorphAfterSleepMillis', - 'maxTotalChargeUsd', - ]; - - protected static override BOOLEAN_VARS = [ - ...CoreConfiguration.BOOLEAN_VARS, - 'isAtHome', - 'testPayPerEvent', - 'useChargingLogDataset', - ]; + /** @internal */ + static storage = new AsyncLocalStorage(); - protected static override DEFAULTS = { - ...CoreConfiguration.DEFAULTS, - defaultKeyValueStoreId: - LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.DEFAULT_KEY_VALUE_STORE_ID], - defaultDatasetId: - LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.DEFAULT_DATASET_ID], - defaultRequestQueueId: - LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.DEFAULT_REQUEST_QUEUE_ID], - inputKey: 'INPUT', - apiBaseUrl: 'https://api.apify.com', - apiPublicBaseUrl: 'https://api.apify.com', - proxyStatusUrl: 'http://proxy.apify.com', - proxyHostname: LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_HOSTNAME], - proxyPort: +LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_PORT], - containerPort: +LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.WEB_SERVER_PORT], - containerUrl: LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.WEB_SERVER_URL], - standbyPort: +LOCAL_ACTOR_ENV_VARS[ACTOR_ENV_VARS.STANDBY_PORT], - metamorphAfterSleepMillis: 300e3, - persistStateIntervalMillis: 60e3, // This value is mentioned in jsdoc in `events.js`, if you update it here, update it there too. 
- testPayPerEvent: false, - useChargingLogDataset: false, - }; + /** @internal */ + static globalConfig?: Configuration; - /** - * @inheritDoc - */ - override get< - T extends keyof ConfigurationOptions, - U extends ConfigurationOptions[T], - >(key: T, defaultValue?: U): U { - return super.get(key as keyof CoreConfigurationOptions, defaultValue); - } + protected static override fields: Record<string, ConfigField> = + apifyConfigFields; - /** - * @inheritDoc - */ - override set(key: keyof ConfigurationOptions, value?: any) { - super.set(key as keyof CoreConfigurationOptions, value); + constructor(options: ApifyConfigurationInput = {}) { + super(options as any); } /** @@ -247,21 +307,19 @@ export class Configuration extends CoreConfiguration { } /** - * Resets global configuration instance. The default instance holds configuration based on env vars, - * if we want to change them, we need to first reset the global state. Used mainly for testing purposes. + * Drops the cached global configuration so the next `Configuration.getGlobalConfig()` constructs + * a fresh instance from the current environment. Intended mainly for tests that mutate + * `process.env` at runtime — values are resolved eagerly at construction, so changing env vars + * after the singleton has been cached has no effect until the singleton is dropped. + * + * Extends crawlee's `Configuration.reset()` (which drops the service locator) by also clearing + * the SDK's own `globalConfig` static and replacing the `AsyncLocalStorage` that `Actor.init()` + * writes to (replaced wholesale because `enterWith(undefined)` from a child async context does + * not propagate back up on Node 22). 
*/ - static override resetGlobalState(): void { + static override reset(): void { delete this.globalConfig; + this.storage = new AsyncLocalStorage(); + super.reset(); } } - -// monkey patch the core class so it respects the new options too -CoreConfiguration.getGlobalConfig = Configuration.getGlobalConfig; -// @ts-expect-error protected property -CoreConfiguration.ENV_MAP = Configuration.ENV_MAP; -// @ts-expect-error protected property -CoreConfiguration.INTEGER_VARS = Configuration.INTEGER_VARS; -// @ts-expect-error protected property -CoreConfiguration.BOOLEAN_VARS = Configuration.BOOLEAN_VARS; -// @ts-expect-error protected property -CoreConfiguration.DEFAULTS = Configuration.DEFAULTS; diff --git a/packages/apify/src/key_value_store.ts b/packages/apify/src/key_value_store.ts index 89f2138d2c..d186a12e06 100644 --- a/packages/apify/src/key_value_store.ts +++ b/packages/apify/src/key_value_store.ts @@ -1,12 +1,18 @@ -import type { StorageManagerOptions } from '@crawlee/core'; +import type { StorageOpenOptions } from '@crawlee/core'; import { KeyValueStore as CoreKeyValueStore } from '@crawlee/core'; +import type { KeyValueStoreInfo } from '@crawlee/types'; import { createHmacSignature } from '@apify/utilities'; import type { Configuration } from './configuration.js'; -// @ts-ignore newer crawlee versions already declare this method in core -const { getPublicUrl } = CoreKeyValueStore.prototype; +// crawlee v4 dropped the `storageObject` cache from `KeyValueStore`, so the +// per-store `urlSigningSecretKey` (which is part of the platform's metadata +// response but not declared on `@crawlee/types`' `KeyValueStoreInfo`) has to +// be fetched on demand and accessed through a structural-typed augmentation. 
+type ApifyKeyValueStoreInfo = KeyValueStoreInfo & { + urlSigningSecretKey?: string; +}; /** * @inheritDoc @@ -15,24 +21,35 @@ export class KeyValueStore extends CoreKeyValueStore { /** * Returns a URL for the given key that may be used to publicly * access the value in the remote key-value store. + * + * On the Apify platform the URL is signed with the store's + * `urlSigningSecretKey` so that anyone with the URL can read the record + * without authentication. Locally we delegate to crawlee's default + * implementation (which produces a `file://` URL or returns `undefined`). */ - override getPublicUrl(key: string): string { + override async getPublicUrl(key: string): Promise { const config = this.config as Configuration; - if (!config.get('isAtHome') && getPublicUrl) { - return getPublicUrl.call(this, key); + if (!config.isAtHome) { + return super.getPublicUrl(key); } const publicUrl = new URL( - `${config.get('apiPublicBaseUrl')}/v2/key-value-stores/${this.id}/records/${key}`, + `${config.apiPublicBaseUrl}/v2/key-value-stores/${this.id}/records/${key}`, ); - if (this.storageObject?.urlSigningSecretKey) { + // `client` is `private` on `CoreKeyValueStore`; bypass the visibility + // check to fetch the per-store secret. There is no public crawlee API + // surface for this yet — track upstream exposure as a follow-up. 
+ const metadata = (await ( + this as unknown as { + client: { getMetadata(): Promise }; + } + ).client.getMetadata()) as ApifyKeyValueStoreInfo; + + if (metadata?.urlSigningSecretKey) { publicUrl.searchParams.append( 'signature', - createHmacSignature( - this.storageObject.urlSigningSecretKey as string, - key, - ), + createHmacSignature(metadata.urlSigningSecretKey, key), ); } @@ -44,11 +61,8 @@ export class KeyValueStore extends CoreKeyValueStore { */ static override async open( storeIdOrName?: string | null, - options: StorageManagerOptions = {}, + options: StorageOpenOptions = {}, ): Promise { return super.open(storeIdOrName, options) as unknown as KeyValueStore; } } - -// @ts-ignore newer crawlee versions already declare this method in core -CoreKeyValueStore.prototype.getPublicUrl = KeyValueStore.prototype.getPublicUrl; diff --git a/packages/apify/src/platform_event_manager.ts b/packages/apify/src/platform_event_manager.ts index a71c5ee68e..0f9a7f3167 100644 --- a/packages/apify/src/platform_event_manager.ts +++ b/packages/apify/src/platform_event_manager.ts @@ -1,4 +1,4 @@ -import { EventManager, EventType } from '@crawlee/core'; +import { EventManager, EventType, serviceLocator } from '@crawlee/core'; import { WebSocket } from 'ws'; import { ACTOR_ENV_VARS, ACTOR_EVENT_NAMES } from '@apify/consts'; @@ -48,8 +48,23 @@ export class PlatformEventManager extends EventManager { /** Websocket connection to Actor events. */ private eventsWs?: WebSocket; - constructor(override readonly config = Configuration.getGlobalConfig()) { - super(); + constructor( + readonly config: Configuration = Configuration.getGlobalConfig() as Configuration, + ) { + super({ + persistStateIntervalMillis: config.persistStateIntervalMillis, + }); + } + + /** + * Creates a `PlatformEventManager` from a (resolved) Configuration, mirroring + * `LocalEventManager.fromConfig()` from crawlee. Falls back to the global + * configuration if none is provided. 
+ */ + static fromConfig(config?: Configuration): PlatformEventManager { + return new PlatformEventManager( + config ?? (serviceLocator.getConfiguration() as Configuration), + ); } /** @@ -62,7 +77,7 @@ export class PlatformEventManager extends EventManager { } await super.init(); - const eventsWsUrl = this.config.get('actorEventsWsUrl'); + const eventsWsUrl = this.config.actorEventsWsUrl; // Locally there is no web socket to connect, so just print a log message. if (!eventsWsUrl) { diff --git a/packages/apify/src/proxy_configuration.ts b/packages/apify/src/proxy_configuration.ts index 569ea84ae3..e3033d07f8 100644 --- a/packages/apify/src/proxy_configuration.ts +++ b/packages/apify/src/proxy_configuration.ts @@ -1,8 +1,6 @@ -import type { - ProxyConfigurationOptions as CoreProxyConfigurationOptions, - ProxyInfo as CoreProxyInfo, -} from '@crawlee/core'; +import type { ProxyConfigurationOptions as CoreProxyConfigurationOptions } from '@crawlee/core'; import { ProxyConfiguration as CoreProxyConfiguration } from '@crawlee/core'; +import type { ProxyInfo as CoreProxyInfo } from '@crawlee/types'; import { gotScraping } from 'got-scraping'; import ow from 'ow'; @@ -12,12 +10,17 @@ import { cryptoRandomObjectId } from '@apify/utilities'; import { Actor } from './actor.js'; import { Configuration } from './configuration.js'; -// https://docs.apify.com/proxy/datacenter-proxy#username-parameters -const MAX_SESSION_ID_LENGTH = 50; const CHECK_ACCESS_REQUEST_TIMEOUT_MILLIS = 4_000; const CHECK_ACCESS_MAX_ATTEMPTS = 2; const COUNTRY_CODE_REGEX = /^[A-Z]{2}$/; +// Apify Proxy session identifier embedded in the proxy username — opaque to +// users; a fresh one is minted for every URL the SDK hands out so that the +// returned proxy URLs are independent. 
+const SESSION_ID_LENGTH = 12; + +type NewUrlOptions = Parameters<CoreProxyConfiguration['newUrl']>[0]; + export interface ProxyConfigurationOptions extends CoreProxyConfigurationOptions { /** @@ -56,15 +59,6 @@ export interface ProxyConfigurationOptions * configurate the proxy by UI input schema. You should use the `countryCode` option in your crawler code. */ apifyProxyCountry?: string; - - /** - * Multiple different ProxyConfigurationOptions stratified into tiers. Crawlee crawlers will switch between those tiers - * based on the blocked request statistics. - */ - tieredProxyConfig?: Omit< - ProxyConfigurationOptions, - keyof CoreProxyConfigurationOptions | 'tieredProxyConfig' - >[]; } /** @@ -91,9 +85,6 @@ export interface ProxyConfigurationOptions * requestHandler({ proxyInfo }) { * // Getting used proxy URL * const proxyUrl = proxyInfo.url; - * - * // Getting ID of used Session - * const sessionIdentifier = proxyInfo.sessionId; * } * }) * @@ -104,7 +95,7 @@ export interface ProxyInfo extends CoreProxyInfo { * An array of proxy groups to be used by the [Apify Proxy](https://docs.apify.com/proxy). * If not provided, the proxy will select the groups automatically. 
*/ - groups: string[]; + groups?: string[]; /** * If set and relevant proxies are available in your Apify account, all proxied requests will @@ -193,10 +184,6 @@ export class ProxyConfiguration extends CoreProxyConfiguration { apifyProxyCountry: ow.optional.string.matches(COUNTRY_CODE_REGEX), password: ow.optional.string, - tieredProxyUrls: ow.optional.array.ofType( - ow.array.ofType(ow.string), - ), - tieredProxyConfig: ow.optional.array.ofType(ow.object), }), ); @@ -205,24 +192,13 @@ export class ProxyConfiguration extends CoreProxyConfiguration { apifyProxyGroups = [], countryCode, apifyProxyCountry, - password = config.get('proxyPassword'), - tieredProxyConfig, - tieredProxyUrls, + password = config.proxyPassword, } = options; - this.tieredProxyUrls ??= tieredProxyUrls; - - if (tieredProxyConfig) { - this.tieredProxyUrls = this._generateTieredProxyUrls( - tieredProxyConfig, - options, - ); - } - const groupsToUse = groups.length ? groups : apifyProxyGroups; const countryCodeToUse = countryCode || apifyProxyCountry; - const hostname = config.get('proxyHostname'); - const port = config.get('proxyPort'); + const hostname = config.proxyHostname; + const port = config.proxyPort; // Validation if ( @@ -241,7 +217,7 @@ export class ProxyConfiguration extends CoreProxyConfiguration { this.port = port; this.usesApifyProxy = !this.proxyUrls && !this.newUrlFunction; - if (proxyUrls && proxyUrls.some((url) => url.includes('apify.com'))) { + if (proxyUrls && proxyUrls.some((url) => url?.includes('apify.com'))) { this.log.warning( 'Some Apify proxy features may work incorrectly. Please consider setting up Apify properties instead of `proxyUrls`.\n' + 'See https://sdk.apify.com/docs/guides/proxy-management#apify-proxy-configuration', @@ -287,143 +263,65 @@ export class ProxyConfiguration extends CoreProxyConfiguration { } /** - * This function creates a new {@apilink ProxyInfo} info object. 
- * It is used by CheerioCrawler and PuppeteerCrawler to generate proxy URLs and also to allow the user to inspect - * the currently used proxy via the requestHandler parameter `proxyInfo`. - * Use it if you want to work with a rich representation of a proxy URL. - * If you need the URL string only, use {@apilink ProxyConfiguration.newUrl}. - * @param [sessionId] - * Represents the identifier of user {@apilink Session} that can be managed by the {@apilink SessionPool} or - * you can use the Apify Proxy [Session](https://docs.apify.com/proxy#sessions) identifier. - * When the provided sessionId is a number, it's converted to a string. Property sessionId of - * {@apilink ProxyInfo} is always returned as a type string. - * - * All the HTTP requests going through the proxy with the same session identifier - * will use the same target proxy server (i.e. the same IP address). - * The identifier must not be longer than 50 characters and include only the following: `0-9`, `a-z`, `A-Z`, `"."`, `"_"` and `"~"`. - * @return Represents information about used proxy and its configuration. + * Returns a new {@apilink ProxyInfo} object with a fresh proxy URL. Each call mints an + * independent URL; for Apify Proxy a random session id is embedded so consecutive + * calls resolve to different IPs. */ override async newProxyInfo( - sessionId?: string | number, - options?: Parameters[1], + options?: NewUrlOptions, ): Promise<ProxyInfo | undefined> { - if (typeof sessionId === 'number') sessionId = `${sessionId}`; - ow( - sessionId, - ow.optional.string - .maxLength(MAX_SESSION_ID_LENGTH) - .matches(APIFY_PROXY_VALUE_REGEX), - ); - - const proxyInfo = await super.newProxyInfo(sessionId, options); - if (!proxyInfo) return proxyInfo; - - const { groups, countryCode, password, port, hostname } = ( - this.usesApifyProxy ? 
this : new URL(proxyInfo.url) - ) as ProxyConfiguration; - - return { - ...proxyInfo, - sessionId, - groups, - countryCode, - // this.password is not encoded, but the password from the URL will be, we need to normalize - password: this.usesApifyProxy - ? (password ?? '') - : decodeURIComponent(password!), - hostname, - port: port!, + const url = await this.newUrl(options); + if (!url) return undefined; + + const parsed = new URL(url); + const result: ProxyInfo = { + url, + username: decodeURIComponent(parsed.username), + password: decodeURIComponent(parsed.password), + hostname: parsed.hostname, + port: parsed.port, }; + if (this.usesApifyProxy) { + result.groups = this.groups; + if (this.countryCode !== undefined) + result.countryCode = this.countryCode; + } + return result; } /** - * Returns a new proxy URL based on provided configuration options and the `sessionId` parameter. - * @param [sessionId] - * Represents the identifier of user {@apilink Session} that can be managed by the {@apilink SessionPool} or - * you can use the Apify Proxy [Session](https://docs.apify.com/proxy#sessions) identifier. - * When the provided sessionId is a number, it's converted to a string. - * - * All the HTTP requests going through the proxy with the same session identifier - * will use the same target proxy server (i.e. the same IP address). - * The identifier must not be longer than 50 characters and include only the following: `0-9`, `a-z`, `A-Z`, `"."`, `"_"` and `"~"`. - * @return A string with a proxy URL, including authentication credentials and port number. - * For example, `http://bob:password123@proxy.example.com:8000` + * Returns a new proxy URL. For Apify Proxy, each call generates a URL with a fresh + * random session id, so consecutive calls return independent URLs. For custom + * `proxyUrls`, the URLs are rotated round-robin. 
*/ override async newUrl( - sessionId?: string | number, - options?: Parameters[1], + options?: NewUrlOptions, ): Promise { - if (typeof sessionId === 'number') sessionId = `${sessionId}`; - ow( - sessionId, - ow.optional.string - .maxLength(MAX_SESSION_ID_LENGTH) - .matches(APIFY_PROXY_VALUE_REGEX), - ); - if (this.newUrlFunction) { - return ( - (await this._callNewUrlFunction(sessionId, { - request: options?.request, - })) ?? undefined - ); - } - if (this.proxyUrls) { - return this._handleCustomUrl(sessionId); - } - - if (this.tieredProxyUrls) { - return ( - this._handleTieredUrl( - sessionId ?? cryptoRandomObjectId(6), - options, - ).proxyUrl ?? undefined - ); + if (this.newUrlFunction || this.proxyUrls) { + return super.newUrl(options); } - - return this.composeDefaultUrl(sessionId); - } - - protected _generateTieredProxyUrls( - tieredProxyConfig: NonNullable< - ProxyConfigurationOptions['tieredProxyConfig'] - >, - globalOptions: ProxyConfigurationOptions, - ) { - return tieredProxyConfig.map((config) => [ - new ProxyConfiguration({ - ...globalOptions, - ...config, - tieredProxyConfig: undefined, - }).composeDefaultUrl(), - ]); + return this.composeDefaultUrl(cryptoRandomObjectId(SESSION_ID_LENGTH)); } /** * Returns proxy username. 
*/ - protected _getUsername(sessionId?: string): string { - let username; + protected _getUsername(sessionId: string): string { const { groups, countryCode } = this; const parts: string[] = []; if (groups && groups.length) { parts.push(`groups-${groups.join('+')}`); } - if (sessionId) { - parts.push(`session-${sessionId}`); - } + parts.push(`session-${sessionId}`); if (countryCode) { parts.push(`country-${countryCode}`); } - username = parts.join(','); - - if (parts.length === 0) username = 'auto'; - - return username; + return parts.join(','); } - protected composeDefaultUrl(sessionId?: string): string { + protected composeDefaultUrl(sessionId: string): string { const username = this._getUsername(sessionId); const url = new URL(`http://${this.hostname}:${this.port}`); url.username = `${username}`; @@ -438,7 +336,7 @@ export class ProxyConfiguration extends CoreProxyConfiguration { */ // TODO: Make this private protected async _setPasswordIfToken(): Promise { - const token = this.config.get('token'); + const { token } = this.config; if (!token) return; try { @@ -500,10 +398,7 @@ export class ProxyConfiguration extends CoreProxyConfiguration { } | undefined > { - const proxyStatusUrl = this.config.get( - 'proxyStatusUrl', - 'http://proxy.apify.com', - ); + const { proxyStatusUrl } = this.config; const requestOpts = { url: `${proxyStatusUrl}/?format=json`, proxyUrl: await this.newUrl(), diff --git a/test/MemoryStorageEmulator.ts b/test/MemoryStorageEmulator.ts index c5d4511236..dae4bedf00 100644 --- a/test/MemoryStorageEmulator.ts +++ b/test/MemoryStorageEmulator.ts @@ -1,9 +1,9 @@ import { rm } from 'node:fs/promises'; import { resolve } from 'node:path'; -import { StorageManager } from '@crawlee/core'; +import { serviceLocator } from '@crawlee/core'; import { MemoryStorage } from '@crawlee/memory-storage'; -import { Configuration } from 'apify'; +import { Actor } from 'apify'; import { ensureDir } from 'fs-extra'; import log from '@apify/log'; @@ -20,7 +20,10 @@ 
export class MemoryStorageEmulator { protected localStorageDirectories: string[] = []; async init(dirName = cryptoRandomObjectId(10)) { - StorageManager.clearCache(); + // crawlee v4 dropped `StorageManager.clearCache()` and + // `Configuration.useStorageClient()`; reset the service locator + // and re-register the in-memory client instead. + Actor.resetGlobalState(); const localStorageDir = resolve(LOCAL_EMULATION_DIR, dirName); this.localStorageDirectories.push(localStorageDir); await ensureDir(localStorageDir); @@ -28,7 +31,7 @@ export class MemoryStorageEmulator { const storage = new MemoryStorage({ localDataDirectory: localStorageDir, }); - Configuration.getGlobalConfig().useStorageClient(storage); + serviceLocator.setStorageClient(storage); log.debug( `Initialized emulated memory storage in folder ${localStorageDir}`, ); @@ -40,7 +43,7 @@ export class MemoryStorageEmulator { }); await Promise.all(promises); - StorageManager.clearCache(); + Actor.resetGlobalState(); } static toString() { diff --git a/test/apify/actor.test.ts b/test/apify/actor.test.ts index 47565929ff..dc820cebdd 100644 --- a/test/apify/actor.test.ts +++ b/test/apify/actor.test.ts @@ -1,9 +1,15 @@ import { createPublicKey } from 'node:crypto'; -import { Configuration, EventType, StorageManager } from '@crawlee/core'; +import { EventType, RequestQueue, serviceLocator } from '@crawlee/core'; import { sleep } from '@crawlee/utils'; import type { ApifyEnv } from 'apify'; -import { Actor, Dataset, KeyValueStore, ProxyConfiguration } from 'apify'; +import { + Actor, + Configuration, + Dataset, + KeyValueStore, + ProxyConfiguration, +} from 'apify'; import type { WebhookUpdateData } from 'apify-client'; import { ActorClient, ApifyClient, RunClient, TaskClient } from 'apify-client'; @@ -705,13 +711,14 @@ describe('Actor', () => { expect(getValueSpy).toBeCalledWith(KEY_VALUE_STORE_KEYS.INPUT); expect(val1).toBe(123); - // Uses value from config - sdk.config.set('inputKey', 'some-value'); - const val2 = 
await sdk.getInput(); - expect(getValueSpy).toBeCalledTimes(2); - expect(getValueSpy).toBeCalledWith('some-value'); + // Uses value from config - create a new Actor with custom inputKey + const sdk2 = new Actor({ inputKey: 'some-value' }); + const getValueSpy2 = vitest.spyOn(sdk2 as any, 'getValue'); + getValueSpy2.mockImplementation(async () => 123); + const val2 = await sdk2.getInput(); + expect(getValueSpy2).toBeCalledTimes(1); + expect(getValueSpy2).toBeCalledWith('some-value'); expect(val2).toBe(123); - sdk.config.set('inputKey', undefined); // restore defaults }); test('setValue()', async () => { @@ -752,19 +759,31 @@ describe('Actor', () => { test('openRequestQueue should open storage', async () => { const queueId = 'abc'; const options = { forceCloud: true }; - const openStorageSpy = vitest.spyOn( - StorageManager.prototype, - 'openStorage', - ); + const openSpy = vitest.spyOn(RequestQueue, 'open'); + // crawlee v4's `RequestQueueClient` exposes metadata via + // `getMetadata()` (the v3 `get()` was dropped). const mockRQ = { - client: { get: () => ({ totalRequestCount: 10 }) }, + client: { + getMetadata: async () => ({ totalRequestCount: 10 }), + }, }; - openStorageSpy.mockImplementationOnce(async () => mockRQ); + openSpy.mockImplementationOnce(async () => mockRQ as any); const queue = await sdk.openRequestQueue(queueId, options); - expect(openStorageSpy).toBeCalledWith(queueId, sdk.apifyClient); - expect(openStorageSpy).toBeCalledTimes(1); + // `forceCloud: true` routes through an `ApifyStorageClient` + // adapter that satisfies crawlee v4's `StorageClient` interface. 
+ expect(openSpy).toBeCalledWith( + queueId, + expect.objectContaining({ + storageClient: expect.objectContaining({ + createDatasetClient: expect.any(Function), + createKeyValueStoreClient: expect.any(Function), + createRequestQueueClient: expect.any(Function), + }), + }), + ); + expect(openSpy).toBeCalledTimes(1); // @ts-expect-error private prop expect(queue.initialCount).toBe(10); @@ -773,16 +792,19 @@ describe('Actor', () => { test('openDataset should open storage', async () => { const datasetName = 'abc'; const options = { forceCloud: true }; - const mockOpenStorage = vitest.spyOn( - StorageManager.prototype, - 'openStorage', - ); - mockOpenStorage.mockResolvedValueOnce(vitest.fn()); + const openSpy = vitest.spyOn(Dataset, 'open'); + openSpy.mockResolvedValueOnce(vitest.fn() as any); const ds = await sdk.openDataset(datasetName, options); - expect(mockOpenStorage).toBeCalledTimes(1); - expect(mockOpenStorage).toBeCalledWith( + expect(openSpy).toBeCalledTimes(1); + expect(openSpy).toBeCalledWith( datasetName, - sdk.apifyClient, + expect.objectContaining({ + storageClient: expect.objectContaining({ + createDatasetClient: expect.any(Function), + createKeyValueStoreClient: expect.any(Function), + createRequestQueueClient: expect.any(Function), + }), + }), ); }); }); @@ -1088,7 +1110,9 @@ describe('Actor', () => { const migratingSpy = vitest.fn(persistResource(50)); const persistStateSpy = vitest.fn(persistResource(50)); - const events = Configuration.getEventManager(); + // crawlee v4 removed `Configuration.getEventManager()`; the + // event manager now lives on the global service locator. + const events = serviceLocator.getEventManager(); events.on(EventType.PERSIST_STATE, persistStateSpy); events.on(EventType.MIGRATING, migratingSpy); @@ -1181,9 +1205,20 @@ describe('Actor', () => { }); describe('Actor.getInput', () => { - const TestingActor = new Actor(); + // crawlee v4's Configuration resolves env vars eagerly at construction. 
+ // `Actor.resetGlobalState()` drops the cached default instance + global Configuration; + // we additionally overwrite `actor.config` with a fresh `new Configuration()` + // so env vars set in the test body are guaranteed to be observed. + const buildActor = () => { + Actor.resetGlobalState(); + const actor = new Actor(); + (actor as unknown as { config: Configuration }).config = + new Configuration(); + return actor; + }; test('should work', async () => { + const TestingActor = buildActor(); await expect(TestingActor.getInput()).resolves.toBeNull(); await expect(TestingActor.getInputOrThrow()).rejects.toThrowError( 'Input does not exist', @@ -1196,19 +1231,21 @@ describe('Actor', () => { await TestingActor.getInput(); - // Uses value from env var. + // Uses value from env var — needs a fresh Actor to pick it up. process.env[ACTOR_ENV_VARS.INPUT_KEY] = 'some-value'; - mockGetValue.mockImplementation(async (key) => + const ActorWithInputKey = buildActor(); + const mockGetValue2 = vitest.spyOn(ActorWithInputKey, 'getValue'); + mockGetValue2.mockImplementation(async (key) => expect(key).toBe('some-value'), ); - await TestingActor.getInput(); + await ActorWithInputKey.getInput(); delete process.env[ACTOR_ENV_VARS.INPUT_KEY]; mockGetValue.mockRestore(); + mockGetValue2.mockRestore(); }); test('should work with input secrets', async () => { - const mockGetValue = vitest.spyOn(TestingActor, 'getValue'); const originalInput = { secret: 'foo', nonSecret: 'bar' }; const likeInputSchema = { properties: { secret: { type: 'string', isSecret: true } }, @@ -1223,12 +1260,15 @@ describe('Actor', () => { expect(encryptedInput.secret.startsWith('ENCRYPTED_')).toBe(true); expect(encryptedInput.nonSecret).toBe(originalInput.nonSecret); - mockGetValue.mockImplementation(async (key) => encryptedInput); - + // Set the secrets env vars *before* constructing the Actor so + // the resolved config picks them up. 
process.env[APIFY_ENV_VARS.INPUT_SECRETS_PRIVATE_KEY_FILE] = testingPrivateKeyFile; process.env[APIFY_ENV_VARS.INPUT_SECRETS_PRIVATE_KEY_PASSPHRASE] = testingPrivateKeyPassphrase; + const TestingActor = buildActor(); + const mockGetValue = vitest.spyOn(TestingActor, 'getValue'); + mockGetValue.mockImplementation(async (key) => encryptedInput); const input = await TestingActor.getInput(); expect(input).toStrictEqual(originalInput); @@ -1282,18 +1322,22 @@ describe('Actor', () => { }); describe('Actor.config and PPE', () => { - test('should work', async () => { - await Actor.init(); + test('empty string maxTotalChargeUsd falls through to the schema default of Infinity', async () => { + // crawlee v4 treats empty-string env vars as unset, so the + // resolved config falls through to the schema default + // (`Infinity`). process.env.ACTOR_MAX_TOTAL_CHARGE_USD = ''; - expect(Actor.config.get('maxTotalChargeUsd')).toBe(0); + await Actor.init(); + expect(Actor.config.maxTotalChargeUsd).toBe(Infinity); expect(Actor.getChargingManager().getMaxTotalChargeUsd()).toBe( Infinity, ); - - // the value in charging manager is cached, so we cant test that here - process.env.ACTOR_MAX_TOTAL_CHARGE_USD = '123'; - expect(Actor.config.get('maxTotalChargeUsd')).toBe(123); await Actor.exit({ exit: false }); }); + + test('numeric maxTotalChargeUsd is correctly resolved from constructor options', () => { + const sdk = new Actor({ maxTotalChargeUsd: 123 }); + expect(sdk.config.maxTotalChargeUsd).toBe(123); + }); }); }); diff --git a/test/apify/events.test.ts b/test/apify/events.test.ts index cb7804fbb7..1a001e814a 100644 --- a/test/apify/events.test.ts +++ b/test/apify/events.test.ts @@ -1,4 +1,4 @@ -import { EventType } from '@crawlee/core'; +import { EventType, serviceLocator } from '@crawlee/core'; import type { Dictionary } from '@crawlee/utils'; import { sleep } from '@crawlee/utils'; import { Actor, Configuration, PlatformEventManager } from 'apify'; @@ -8,23 +8,31 @@ import { 
ACTOR_ENV_VARS, APIFY_ENV_VARS } from '@apify/consts'; describe('events', () => { let wss: WebSocketServer = null!; - const config = Configuration.getGlobalConfig(); let events: PlatformEventManager = null!; beforeEach(() => { + // Set env vars BEFORE creating the Configuration — crawlee v4 resolves + // env-var-backed fields eagerly at construction, so a global config + // built earlier in the run wouldn't see `actorEventsWsUrl` and + // `events.init()` would silently never open the websocket. + process.env[ACTOR_ENV_VARS.EVENTS_WEBSOCKET_URL] = + 'ws://localhost:9099/someRunId'; + process.env[APIFY_ENV_VARS.TOKEN] = 'dummy'; + wss = new WebSocketServer({ port: 9099 }); + // Drop the cached Configuration so it picks up the env vars we just set. + Configuration.reset(); + const config = Configuration.getGlobalConfig(); events = new PlatformEventManager(config); - config.useEventManager(events); + serviceLocator.setEventManager(events); vitest.useFakeTimers(); - process.env[ACTOR_ENV_VARS.EVENTS_WEBSOCKET_URL] = - 'ws://localhost:9099/someRunId'; - process.env[APIFY_ENV_VARS.TOKEN] = 'dummy'; }); afterEach(async () => { vitest.useRealTimers(); delete process.env[ACTOR_ENV_VARS.EVENTS_WEBSOCKET_URL]; delete process.env[APIFY_ENV_VARS.TOKEN]; + Configuration.reset(); await new Promise((resolve) => { wss.close(resolve); }); @@ -130,7 +138,7 @@ describe('events', () => { test('should send persist state events in regular interval', async () => { const eventsReceived = []; - const interval = config.get('persistStateIntervalMillis')!; + const interval = events.config.persistStateIntervalMillis; events.on(EventType.PERSIST_STATE, (data) => eventsReceived.push(data)); await events.init(); diff --git a/test/apify/proxy_configuration.test.ts b/test/apify/proxy_configuration.test.ts index 8c61a63177..cd6e41c99e 100644 --- a/test/apify/proxy_configuration.test.ts +++ b/test/apify/proxy_configuration.test.ts @@ -1,6 +1,6 @@ import { Actor, ProxyConfiguration } from 'apify'; 
import { UserClient } from 'apify-client'; -import { type Dictionary, Request, sleep } from 'crawlee'; +import { type Dictionary } from 'crawlee'; import { gotScraping } from 'got-scraping'; import { APIFY_ENV_VARS, LOCAL_APIFY_ENV_VARS } from '@apify/consts'; @@ -10,16 +10,15 @@ const hostname = LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_HOSTNAME]; const port = Number(LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_PORT]); const password = 'test12345'; const countryCode = 'CZ'; -const sessionId = 538909250932; const basicOpts = { groups, countryCode, password, }; -const basicOptsProxyUrl = - 'http://groups-GROUP1+GROUP2,session-538909250932,country-CZ:test12345@proxy.apify.com:8000'; -const proxyUrlNoSession = - 'http://groups-GROUP1+GROUP2,country-CZ:test12345@proxy.apify.com:8000'; +// Apify Proxy URLs always carry a fresh random `session-XXXX` segment; tests +// match against this pattern rather than a hard-coded session id. +const apifyProxyUrlPattern = + /^http:\/\/groups-GROUP1\+GROUP2,session-[A-Za-z0-9]+,country-CZ:test12345@proxy\.apify\.com:8000$/; vitest.mock('got-scraping', async () => { return { @@ -54,48 +53,45 @@ describe('ProxyConfiguration', () => { expect(proxyConfiguration.port).toBe(port); }); - test('newUrl() should return proxy URL', async () => { + test('newUrl() returns an Apify Proxy URL with a random session id', async () => { const proxyConfiguration = new ProxyConfiguration(basicOpts); - expect(await proxyConfiguration.newUrl(sessionId)).toBe( - basicOptsProxyUrl, - ); + const url1 = await proxyConfiguration.newUrl(); + const url2 = await proxyConfiguration.newUrl(); + + expect(url1).toMatch(apifyProxyUrlPattern); + expect(url2).toMatch(apifyProxyUrlPattern); + // Consecutive calls must produce independent URLs. 
+ expect(url1).not.toBe(url2); }); - test('newProxyInfo() should return ProxyInfo object', async () => { + test('newProxyInfo() returns a ProxyInfo object with a fresh URL', async () => { const proxyConfiguration = new ProxyConfiguration(basicOpts); - const url = basicOptsProxyUrl; - const proxyInfo = { - sessionId: `${sessionId}`, - url, - groups, - countryCode, - password, - hostname, - port, - username: 'groups-GROUP1+GROUP2,session-538909250932,country-CZ', - }; - expect(await proxyConfiguration.newProxyInfo(sessionId)).toEqual( - proxyInfo, + const info = await proxyConfiguration.newProxyInfo(); + expect(info).toBeDefined(); + expect(info!.url).toMatch(apifyProxyUrlPattern); + expect(info!.groups).toEqual(groups); + expect(info!.countryCode).toBe(countryCode); + expect(info!.password).toBe(password); + expect(info!.hostname).toBe(hostname); + expect(info!.port).toBe(String(port)); + expect(info!.username).toMatch( + /^groups-GROUP1\+GROUP2,session-[A-Za-z0-9]+,country-CZ$/, ); }); - test('newProxyInfo() works with special characters', async () => { + test('newProxyInfo() works with custom proxyUrls and special characters', async () => { const url = 'http://user%40name:pass%40word@proxy.com:1111'; const proxyConfiguration = new ProxyConfiguration({ proxyUrls: [url] }); - const proxyInfo = { - sessionId: `${sessionId}`, + expect(await proxyConfiguration.newProxyInfo()).toEqual({ url, username: 'user@name', password: 'pass@word', hostname: 'proxy.com', port: '1111', - }; - expect(await proxyConfiguration.newProxyInfo(sessionId)).toEqual( - proxyInfo, - ); + }); }); test('actor UI input schema should work', () => { @@ -168,37 +164,6 @@ describe('ProxyConfiguration', () => { expect(() => new ProxyConfiguration({ countryCode: 1111 })).toThrow(); }); - test('newUrl() should throw on invalid session argument', async () => { - const proxyConfiguration = new ProxyConfiguration(); - await Promise.all([ - expect(async () => - proxyConfiguration.newUrl('a-b'), - 
).rejects.toThrow(), - expect(proxyConfiguration.newUrl('a$b')).rejects.toThrow(), - // @ts-expect-error invalid input - expect(proxyConfiguration.newUrl({})).rejects.toThrow(), - // @ts-expect-error invalid input - expect(proxyConfiguration.newUrl(new Date())).rejects.toThrow(), - expect( - proxyConfiguration.newUrl(Array(51).fill('x').join('')), - ).rejects.toThrow(), - - expect(proxyConfiguration.newUrl('a_b')).resolves.not.toThrow(), - expect( - proxyConfiguration.newUrl('0.34252352'), - ).resolves.not.toThrow(), - expect(proxyConfiguration.newUrl('aaa~BBB')).resolves.not.toThrow(), - expect(proxyConfiguration.newUrl('a_1_b')).resolves.not.toThrow(), - expect(proxyConfiguration.newUrl('a_2')).resolves.not.toThrow(), - expect(proxyConfiguration.newUrl('a')).resolves.not.toThrow(), - expect(proxyConfiguration.newUrl('1')).resolves.not.toThrow(), - expect(proxyConfiguration.newUrl(123456)).resolves.not.toThrow(), - expect( - proxyConfiguration.newUrl(Array(50).fill('x').join('')), - ).resolves.not.toThrow(), - ]); - }); - test('should throw on invalid newUrlFunction', async () => { const newUrlFunction = () => { return 'http://proxy.com:1111*invalid_url'; @@ -243,7 +208,6 @@ describe('ProxyConfiguration', () => { 'http://proxy.com:4444', ); - // TODO enable strictNullChecks in tests // through newProxyInfo() expect((await proxyConfiguration.newProxyInfo())?.url).toEqual( 'http://proxy.com:3333', @@ -256,46 +220,6 @@ describe('ProxyConfiguration', () => { ); }); - test('async newUrlFunction should work correctly', async () => { - const customUrls = [ - 'http://proxy.com:1111', - 'http://proxy.com:2222', - 'http://proxy.com:3333', - 'http://proxy.com:4444', - 'http://proxy.com:5555', - 'http://proxy.com:6666', - ]; - const newUrlFunction = async () => { - await sleep(5); - return customUrls.pop() ?? 
null; - }; - const proxyConfiguration = new ProxyConfiguration({ - newUrlFunction, - }); - - // through newUrl() - expect(await proxyConfiguration.newUrl()).toEqual( - 'http://proxy.com:6666', - ); - expect(await proxyConfiguration.newUrl()).toEqual( - 'http://proxy.com:5555', - ); - expect(await proxyConfiguration.newUrl()).toEqual( - 'http://proxy.com:4444', - ); - - // through newProxyInfo() - expect((await proxyConfiguration.newProxyInfo())!.url).toEqual( - 'http://proxy.com:3333', - ); - expect((await proxyConfiguration.newProxyInfo())!.url).toEqual( - 'http://proxy.com:2222', - ); - expect((await proxyConfiguration.newProxyInfo())!.url).toEqual( - 'http://proxy.com:1111', - ); - }); - describe('With proxyUrls options', () => { test('should rotate custom URLs correctly', async () => { const proxyConfiguration = new ProxyConfiguration({ @@ -347,62 +271,6 @@ describe('ProxyConfiguration', () => { ); }); - test('should rotate custom URLs with sessions correctly', async () => { - const sessions = [ - 'sesssion_01', - 'sesssion_02', - 'sesssion_03', - 'sesssion_04', - 'sesssion_05', - 'sesssion_06', - ]; - const proxyConfiguration = new ProxyConfiguration({ - proxyUrls: [ - 'http://proxy.com:1111', - 'http://proxy.com:2222', - 'http://proxy.com:3333', - ], - }); - - // @ts-expect-error TODO private property? 
- const { proxyUrls } = proxyConfiguration; - // should use same proxy URL - expect(await proxyConfiguration.newUrl(sessions[0])).toEqual( - proxyUrls![0], - ); - expect(await proxyConfiguration.newUrl(sessions[0])).toEqual( - proxyUrls![0], - ); - expect(await proxyConfiguration.newUrl(sessions[0])).toEqual( - proxyUrls![0], - ); - - // should rotate different proxies - expect(await proxyConfiguration.newUrl(sessions[1])).toEqual( - proxyUrls![1], - ); - expect(await proxyConfiguration.newUrl(sessions[2])).toEqual( - proxyUrls![2], - ); - expect(await proxyConfiguration.newUrl(sessions[3])).toEqual( - proxyUrls![0], - ); - expect(await proxyConfiguration.newUrl(sessions[4])).toEqual( - proxyUrls![1], - ); - expect(await proxyConfiguration.newUrl(sessions[5])).toEqual( - proxyUrls![2], - ); - - // should remember already used session - expect(await proxyConfiguration.newUrl(sessions[1])).toEqual( - proxyUrls![1], - ); - expect(await proxyConfiguration.newUrl(sessions[3])).toEqual( - proxyUrls![0], - ); - }); - test('should throw cannot combine custom proxies with Apify Proxy', async () => { const proxyUrls = [ 'http://proxy.com:1111', @@ -485,81 +353,17 @@ describe('ProxyConfiguration', () => { } }); }); - - describe('With tieredProxyUrls', () => { - test('proxy configuration accepts the tiered urls (Crawlee style)', async () => { - const proxyConfiguration = new ProxyConfiguration({ - tieredProxyUrls: [ - ['http://proxy.com:1111'], - ['http://proxy.com:2222'], - ['http://proxy.com:3333'], - ['http://proxy.com:4444'], - ], - }); - - // through newUrl() - expect( - await proxyConfiguration.newUrl('abc', { - request: new Request({ url: 'http://example.com' }) as any, - }), - ).toEqual('http://proxy.com:1111'); - - // through newProxyInfo() - expect( - (await proxyConfiguration.newProxyInfo('abc', { - request: new Request({ - url: 'http://example.com', - }) as any, - }))!.url, - ).toEqual('http://proxy.com:1111'); - }); - - test('shorthand tieredProxyConfig gets 
correctly expanded', async () => { - const proxyConfiguration = new ProxyConfiguration({ - password: 'password', - countryCode: 'DE', - tieredProxyConfig: [ - { - groups: ['GROUP1'], - countryCode: 'CZ', - }, - { - groups: ['GROUP2'], - countryCode: 'US', - }, - { - groups: ['GROUP3', 'GROUP4'], - }, - { - groups: ['GROUP3', 'GROUP4'], - countryCode: undefined, - }, - ], - }); - - // eslint-disable-next-line dot-notation - expect(proxyConfiguration['tieredProxyUrls']).toEqual([ - [ - 'http://groups-GROUP1,country-CZ:password@proxy.apify.com:8000', - ], - [ - 'http://groups-GROUP2,country-US:password@proxy.apify.com:8000', - ], - [ - 'http://groups-GROUP3+GROUP4,country-DE:password@proxy.apify.com:8000', - ], - ['http://groups-GROUP3+GROUP4:password@proxy.apify.com:8000'], - ]); - }); - }); }); describe('Actor.createProxyConfiguration()', () => { const userData = { proxy: { password } }; + beforeEach(() => { + Actor.resetGlobalState(); + }); + test('should work with all options', async () => { const status = { connected: true }; - const proxyUrl = proxyUrlNoSession; const url = 'http://proxy.apify.com/?format=json'; gotScrapingSpy.mockResolvedValueOnce({ body: status } as any); @@ -580,7 +384,7 @@ describe('Actor.createProxyConfiguration()', () => { expect(gotScrapingSpy).toBeCalledWith({ url, - proxyUrl, + proxyUrl: expect.stringMatching(apifyProxyUrlPattern), timeout: { request: 4000 }, responseType: 'json', }); @@ -704,7 +508,11 @@ describe('Actor.createProxyConfiguration()', () => { await Actor.createProxyConfiguration(); expect(gotScrapingSpy).toBeCalledWith({ url: `${process.env.APIFY_PROXY_STATUS_URL}/?format=json`, - proxyUrl: `http://auto:${password}@${process.env.APIFY_PROXY_HOSTNAME}:8000`, + proxyUrl: expect.stringMatching( + new RegExp( + `^http://session-[A-Za-z0-9]+:${password}@${process.env.APIFY_PROXY_HOSTNAME}:8000$`, + ), + ), responseType: 'json', timeout: { request: 4000, @@ -713,71 +521,4 @@ describe('Actor.createProxyConfiguration()', () => { 
gotScrapingSpy.mockRestore(); }); - - describe('With tieredProxyUrls', () => { - test('proxy configuration accepts the tiered urls (Crawlee style)', async () => { - const proxyConfiguration = await Actor.createProxyConfiguration({ - tieredProxyUrls: [ - ['http://proxy.com:1111'], - ['http://proxy.com:2222'], - ['http://proxy.com:3333'], - ['http://proxy.com:4444'], - ], - }); - - // through newUrl() - expect( - await proxyConfiguration!.newUrl('abc', { - request: new Request({ url: 'http://example.com' }) as any, - }), - ).toEqual('http://proxy.com:1111'); - - // through newProxyInfo() - expect( - (await proxyConfiguration!.newProxyInfo('abc', { - request: new Request({ - url: 'http://example.com', - }) as any, - }))!.url, - ).toEqual('http://proxy.com:1111'); - }); - - test('shorthand tieredProxyConfig gets correctly expanded', async () => { - const proxyConfiguration = await Actor.createProxyConfiguration({ - password: 'password', - countryCode: 'DE', - tieredProxyConfig: [ - { - groups: ['GROUP1'], - countryCode: 'CZ', - }, - { - groups: ['GROUP2'], - countryCode: 'US', - }, - { - groups: ['GROUP3', 'GROUP4'], - }, - { - groups: ['GROUP3', 'GROUP4'], - countryCode: undefined, - }, - ], - }); - - // eslint-disable-next-line dot-notation - expect(proxyConfiguration!['tieredProxyUrls']).toEqual([ - [ - 'http://groups-GROUP1,country-CZ:password@proxy.apify.com:8000', - ], - [ - 'http://groups-GROUP2,country-US:password@proxy.apify.com:8000', - ], - [ - 'http://groups-GROUP3+GROUP4,country-DE:password@proxy.apify.com:8000', - ], - ['http://groups-GROUP3+GROUP4:password@proxy.apify.com:8000'], - ]); - }); - }); }); diff --git a/test/apify/utils.test.ts b/test/apify/utils.test.ts index acff5f2324..f09b030972 100644 --- a/test/apify/utils.test.ts +++ b/test/apify/utils.test.ts @@ -21,6 +21,13 @@ describe('Actor.isAtHome()', () => { }); describe('Actor.newClient()', () => { + // crawlee v4's `Configuration` resolves env vars eagerly at construction. 
+ // Reset the cached config + Actor singleton so each test observes the env + // it just wrote. + beforeEach(() => { + Actor.resetGlobalState(); + }); + test('reads environment variables correctly', () => { process.env[APIFY_ENV_VARS.API_BASE_URL] = 'http://www.example.com:1234/path';