Introduction
High-performance web crawler and scraper for TypeScript, powered by Bun and Playwright.
Feedstock is a TypeScript web crawling library built for speed and developer experience. It runs on Bun and uses Playwright for browser automation, giving you the full power of a real browser with the ergonomics of TypeScript.
Why Feedstock?
- Native TypeScript — no build step, no wrappers. Playwright's API was designed for TypeScript.
- Bun-powered — native SQLite caching, fast test runner, instant startup.
- Strategy pattern — swap out scraping, extraction, and markdown generation strategies.
- Deep crawling — BFS, DFS, and BestFirst traversal with filters and scorers.
- Multiple backends — Playwright (Chromium/Firefox/WebKit) or Lightpanda (local/cloud).
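The difference between BFS and DFS traversal, and the idea of swapping a strategy without touching the crawl loop, can be illustrated without any library at all. The sketch below does not use Feedstock's API: `LinkGraph`, `Frontier`, `QueueFrontier`, `StackFrontier`, and `traverse` are hypothetical names invented for illustration, and the "site" is an in-memory object rather than real pages.

```typescript
// Hypothetical in-memory link graph standing in for a real site.
type LinkGraph = Record<string, string[]>;

// A frontier decides which discovered URL is visited next.
// Swapping the frontier changes the traversal order (strategy pattern).
interface Frontier {
  push(url: string): void;
  pop(): string | undefined;
}

// FIFO queue: yields breadth-first order.
class QueueFrontier implements Frontier {
  private items: string[] = [];
  push(url: string): void {
    this.items.push(url);
  }
  pop(): string | undefined {
    return this.items.shift();
  }
}

// LIFO stack: yields depth-first order.
class StackFrontier implements Frontier {
  private items: string[] = [];
  push(url: string): void {
    this.items.push(url);
  }
  pop(): string | undefined {
    return this.items.pop();
  }
}

// Visit every reachable URL once, in the order the frontier dictates.
function traverse(graph: LinkGraph, start: string, frontier: Frontier): string[] {
  const visited = new Set<string>();
  const order: string[] = [];
  frontier.push(start);
  let url: string | undefined;
  while ((url = frontier.pop()) !== undefined) {
    if (visited.has(url)) continue;
    visited.add(url);
    order.push(url);
    for (const next of graph[url] ?? []) {
      if (!visited.has(next)) frontier.push(next);
    }
  }
  return order;
}

const site: LinkGraph = {
  "/": ["/docs", "/blog"],
  "/docs": ["/docs/install"],
  "/blog": ["/blog/post-1"],
  "/docs/install": [],
  "/blog/post-1": [],
};

const bfsOrder = traverse(site, "/", new QueueFrontier());
const dfsOrder = traverse(site, "/", new StackFrontier());
console.log(bfsOrder); // level by level
console.log(dfsOrder); // one branch to the bottom before the next
```

A best-first variant would follow the same shape, with a frontier that pops the highest-scoring URL instead of the oldest or newest one.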
Quick Example
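import { WebCrawler, CacheMode } from "feedstock";

const crawler = new WebCrawler();

// Crawl a single page, bypassing the cache.
const result = await crawler.crawl("https://example.com", {
  cacheMode: CacheMode.Bypass,
});

console.log(result.markdown?.rawMarkdown); // page content as Markdown
console.log(result.links.internal); // internal links
console.log(result.media.images); // images found on the page

// Release the crawler's resources when done.
await crawler.close();
What You Get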
Every crawl returns a CrawlResult with:
- html: raw page HTML
- cleanedHtml: scripts, styles, and noise removed
- markdown: converted to Markdown with citations
- links: internal and external, classified automatically
- media: images, videos, and audio with scoring
- metadata: title, description, OG tags, canonical URL
- extractedContent: structured data via CSS or regex strategies
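As a rough illustration of consuming these fields, the sketch below defines a stand-in type and a small summary helper. `CrawlResultSketch`, `CrawlSummary`, and `summarize` are hypothetical names, not part of Feedstock. The top-level field names follow the list above, but the nested names other than `rawMarkdown`, `internal`, and `images`, along with every type shown, are assumptions. The library's actual `CrawlResult` type is the authority.

```typescript
// Hypothetical stand-in for a crawl result. Field types are guesses;
// the nested names `external`, `videos`, and `audio` are assumed.
interface CrawlResultSketch {
  html: string;
  cleanedHtml?: string;
  markdown?: { rawMarkdown: string };
  links: { internal: unknown[]; external: unknown[] };
  media: { images: unknown[]; videos: unknown[]; audio: unknown[] };
  metadata?: Record<string, unknown>;
  extractedContent?: string;
}

interface CrawlSummary {
  markdownChars: number;
  internalLinks: number;
  externalLinks: number;
  images: number;
}

// Reduce a result to a few counts, tolerating a missing markdown field.
function summarize(result: CrawlResultSketch): CrawlSummary {
  return {
    markdownChars: result.markdown?.rawMarkdown.length ?? 0,
    internalLinks: result.links.internal.length,
    externalLinks: result.links.external.length,
    images: result.media.images.length,
  };
}

// Hand-written sample data, not output from a real crawl.
const sample: CrawlResultSketch = {
  html: "<html><body><h1>Hi</h1></body></html>",
  markdown: { rawMarkdown: "# Hi" },
  links: { internal: ["/about"], external: [] },
  media: { images: [], videos: [], audio: [] },
};

const summary = summarize(sample);
console.log(summary);
```

Optional chaining on `markdown` mirrors the `result.markdown?.rawMarkdown` access in the Quick Example, which suggests that field may be absent.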
Getting Started
Install feedstock and crawl your first page in under 2 minutes.
Deep Crawling
Recursively crawl entire sites with BFS, DFS, or BestFirst strategies.
Extraction
Extract structured data using CSS selectors or regex patterns.
Browser Backends
Use Playwright or Lightpanda for browser automation.