Search results

198 packages found

A professional library for processing, cleaning, filtering, and converting HTML content to Markdown. Features advanced customization options, presets, plugin support, fluent API, and TypeScript integration for reliable content extraction.

published version 1.0.5, 7 days ago, 0 dependents, licensed under MIT
218

Package for Apify/Crawlee that allows storing encrypted text values in the Storages

published version 1.0.10, 10 months ago, 0 dependents, licensed under MIT
179

Model Context Protocol server for WebScraping.AI API. Provides LLM-powered web scraping tools with Chromium JavaScript rendering, rotating proxies, and HTML parsing.

published version 1.0.2, a month ago, 0 dependents, licensed under MIT
187

Lightfeed API Client for Node.js

published version 0.1.5, 7 days ago, 0 dependents, licensed under MIT
176

Advanced web scraping framework built on Puppeteer designed to bypass rate limits with smart proxy rotation and browser fingerprinting protection

published version 1.0.1, a month ago, 0 dependents, licensed under ISC
146

Model Context Protocol (MCP) integration for Scraper.is - A web scraping tool for AI assistants

published version 0.1.22, 3 months ago, 0 dependents, licensed under MIT
164

MCP server for extracting content from web pages

published version 1.0.2, a month ago, 0 dependents, licensed under MIT
134

An MCP-based tool for fetching web page content that supports multiple modes and formats and can be integrated with AI assistants such as Claude

published version 1.0.0, 3 months ago, 0 dependents, licensed under MIT
122

MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, batch processing, structured data extraction, and LLM-powered content analysis.

published version 1.9.0, a month ago, 0 dependents, licensed under MIT
116

A wrapper around cURL-impersonate, a binary that can be used to bypass TLS fingerprinting.

published version 1.5.4, 7 months ago, 0 dependents, licensed under ISC
125

DeepSearch MCP Server with Brave Search API and Puppeteer content extraction

published version 0.0.1, 23 days ago, 0 dependents, licensed under MIT
119

Nemo-webminer is a Node.js toolkit for scraping content from any website.

published version 1.0.1, a month ago, 0 dependents, licensed under MIT
137

Utility for web scraping that fetches the HTML from a URL or uses Puppeteer to interact with the page. getHtml tries various strategies in a 'waterfall' approach to get the content of the URL, depending on priorities such as stealth, speed, and freshness.

published version 1.0.11, 4 months ago, 0 dependents, licensed under MIT
121

A Model Context Protocol (MCP) server for WaterCrawl, enabling AI systems to perform web crawling and search operations

published version 1.0.1, 24 days ago, 0 dependents, licensed under ISC
134

Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs

published version 1.0.3, 2 months ago, 0 dependents, licensed under MIT
116

Model Context Protocol (MCP) server for Firecrawl Simple - provides web scraping and crawling capabilities to LLMs

published version 1.0.2, 2 months ago, 0 dependents, licensed under MIT
109

A minimal TypeScript library for fetching and parsing Google Scholar pages.

published version 3.3.0, 10 months ago, 1 dependent, licensed under MIT
107

Google parser is a lightweight yet powerful Google Search result scraper/parser, built on an HTTP client, that sends browser-like requests out of the box. This is essential in web scraping for blending in with regular website traffic.

published version 2.3.0, 2 years ago, 0 dependents, licensed under MIT
116

A tool for extracting structured content from web pages with customizable selectors and crawling options

published version 0.0.25, 3 months ago, 0 dependents, licensed under MIT
112

Unofficial high-performance API for SIGAA IFSC using web scraping.

published version 1.0.34, 3 years ago, 0 dependents, licensed under MIT
109