feedstock

Configuration

BrowserConfig and CrawlerRunConfig reference.

Feedstock uses two configuration objects: BrowserConfig for browser-level settings, and CrawlerRunConfig for per-crawl behavior.

BrowserConfig

Controls the browser instance. Set once when creating the WebCrawler.

import { createBrowserConfig } from "feedstock";

const config = createBrowserConfig({
  browserType: "chromium",   // "chromium" | "firefox" | "webkit"
  headless: true,
  viewport: { width: 1920, height: 1080 },
  userAgent: "my-bot/1.0",
  proxy: { server: "http://proxy:8080" },
  backend: { kind: "playwright" },
});

Option             Type                               Default                 Description
browserType        "chromium" | "firefox" | "webkit"  "chromium"              Browser engine
headless           boolean                            true                    Run without a visible UI
viewport           { width, height }                  1920x1080               Page viewport size
userAgent          string | null                      null                    Custom user agent
proxy              ProxyConfig | null                 null                    Proxy server
ignoreHttpsErrors  boolean                            true                    Ignore HTTPS certificate errors
javaEnabled        boolean                            true                    Enable JavaScript
extraArgs          string[]                           []                      Extra browser launch arguments
textMode           boolean                            false                   Text-only mode
backend            BrowserBackend                     { kind: "playwright" }  Browser backend
verbose            boolean                            false                   Enable verbose logging
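
As a rough illustration of how partial options could combine with the defaults listed above, the sketch below merges overrides onto a defaults object. It does not import feedstock and is not its implementation: `BrowserSettings`, `BROWSER_DEFAULTS`, and `withBrowserDefaults` are hypothetical local names, and only a subset of the documented options is modeled.

```typescript
// Local stand-in types: they mirror part of the BrowserConfig table and are
// NOT imported from feedstock. All names here are hypothetical.
type BrowserType = "chromium" | "firefox" | "webkit";

interface BrowserSettings {
  browserType: BrowserType;
  headless: boolean;
  viewport: { width: number; height: number };
  userAgent: string | null;
  ignoreHttpsErrors: boolean;
  extraArgs: string[];
  textMode: boolean;
  verbose: boolean;
}

// Default values copied from the documented table.
const BROWSER_DEFAULTS: BrowserSettings = {
  browserType: "chromium",
  headless: true,
  viewport: { width: 1920, height: 1080 },
  userAgent: null,
  ignoreHttpsErrors: true,
  extraArgs: [],
  textMode: false,
  verbose: false,
};

// Shallow-merge overrides onto the defaults. The viewport object and the
// extraArgs array are copied so callers never share state with the defaults.
function withBrowserDefaults(
  overrides: Partial<BrowserSettings> = {},
): BrowserSettings {
  return {
    ...BROWSER_DEFAULTS,
    ...overrides,
    viewport: { ...BROWSER_DEFAULTS.viewport, ...(overrides.viewport ?? {}) },
    extraArgs: [...(overrides.extraArgs ?? BROWSER_DEFAULTS.extraArgs)],
  };
}
```

Calling `withBrowserDefaults({ headless: false })` in this sketch would change only `headless` and leave every other field at its documented default.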

CrawlerRunConfig

Controls per-crawl behavior. Pass to crawl() or crawlMany().

import { createCrawlerRunConfig, CacheMode } from "feedstock";

const config = createCrawlerRunConfig({
  cacheMode: CacheMode.Bypass,
  waitFor: { kind: "selector", value: "#loaded" },
  screenshot: true,
  excludeTags: ["nav", "footer", "aside"],
  cssSelector: "article",
});

Content Options

Option                 Type           Default  Description
wordCountThreshold     number         10       Minimum word count for content
excludeTags            string[]       []       HTML tags to strip
includeTags            string[]       []       Only keep these tags
removeOverlayElements  boolean        false    Remove modals/popups
cssSelector            string | null  null     Extract only matching elements
generateMarkdown       boolean        true     Generate markdown output
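
To give a feel for what a word-count threshold does, here is a minimal sketch that drops short text blocks. It assumes the threshold applies per text block and that the comparison is inclusive; neither detail is confirmed by this page. `countWords` and `filterByWordCount` are hypothetical helpers, not feedstock functions.

```typescript
// Hypothetical helper: count whitespace-separated words in a text block.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Keep only blocks with at least `threshold` words.
// The default of 10 matches the documented wordCountThreshold default;
// treating the comparison as inclusive (>=) is an assumption.
function filterByWordCount(blocks: string[], threshold = 10): string[] {
  return blocks.filter((block) => countWords(block) >= threshold);
}
```

Under these assumptions, a short navigation label such as "Home" would be dropped, while a full sentence of ten or more words would be kept.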

Browser Behavior

Option         Type                      Default  Description
jsCode         string | string[] | null  null     JavaScript to execute
waitFor        WaitForType | null        null     Wait condition
waitAfterLoad  number                    0        Additional wait (ms)
pageTimeout    number                    60000    Navigation timeout (ms)
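
Because `jsCode` accepts a single string, an array of strings, or null, code that consumes it usually needs one uniform shape. The sketch below shows one way to normalize the value into an array of snippets; `normalizeJsCode` is a hypothetical helper, and how feedstock handles this internally is not stated on this page.

```typescript
// Hypothetical helper: turn the three accepted jsCode shapes into a list of
// non-empty snippets that could be run one after another.
function normalizeJsCode(jsCode: string | string[] | null): string[] {
  if (jsCode === null) {
    return [];
  }
  const snippets = Array.isArray(jsCode) ? jsCode : [jsCode];
  // Trim whitespace and drop snippets that are empty after trimming.
  return snippets
    .map((snippet) => snippet.trim())
    .filter((snippet) => snippet.length > 0);
}
```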

Wait Conditions

// Wait for a CSS selector
{ kind: "selector", value: "#content", timeout: 5000 }

// Wait for network idle
{ kind: "networkIdle" }

// Wait a fixed delay
{ kind: "delay", ms: 2000 }

// Wait for a JS function to return truthy
{ kind: "function", fn: "() => document.readyState === 'complete'" }
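
The four shapes above share a `kind` field, so they can be modeled as a discriminated union. The sketch below defines a local `WaitCondition` type (a stand-in for illustration, not feedstock's exported `WaitForType`) and a hypothetical `describeWait` helper that narrows on `kind`. Treating `timeout` as optional is an assumption based on the selector example.

```typescript
// Local stand-in for the documented wait conditions. This is NOT the type
// exported by feedstock; field optionality is assumed from the examples.
type WaitCondition =
  | { kind: "selector"; value: string; timeout?: number }
  | { kind: "networkIdle" }
  | { kind: "delay"; ms: number }
  | { kind: "function"; fn: string };

// Hypothetical helper: produce a human-readable summary of a wait condition.
function describeWait(wait: WaitCondition | null): string {
  if (wait === null) {
    return "no explicit wait";
  }
  switch (wait.kind) {
    case "selector":
      return wait.timeout === undefined
        ? `wait for selector ${wait.value}`
        : `wait up to ${wait.timeout} ms for selector ${wait.value}`;
    case "networkIdle":
      return "wait for network idle";
    case "delay":
      return `wait ${wait.ms} ms`;
    case "function":
      return `wait until ${wait.fn} returns a truthy value`;
    default: {
      // Exhaustiveness check: compilation fails if a new kind is added
      // to the union without a matching case here.
      const unreachable: never = wait;
      return unreachable;
    }
  }
}
```

Switching on `kind` lets the compiler check that each branch only reads fields that exist for that condition, for example `ms` on a delay but not on a selector.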

Capture Options

Option                  Type     Default  Description
screenshot              boolean  false    Capture full-page screenshot
pdf                     boolean  false    Capture page as PDF
captureNetworkRequests  boolean  false    Log network requests
captureConsoleMessages  boolean  false    Log console output