A Model Context Protocol (MCP) integration for Scraper.is - A powerful web scraping tool for AI assistants.
This package lets AI assistants scrape web content through MCP, giving them access to up-to-date information from the web.
- 🌐 Web Scraping: Extract content from any website
- 📸 Screenshots: Capture visual representations of web pages
- 📄 Multiple Formats: Get content in markdown, HTML, or JSON
- 🔄 Progress Updates: Real-time progress reporting during scraping operations
- 🔌 MCP Integration: Seamless integration with MCP-compatible AI assistants
npm install -g scraperis-mcp
Or with yarn:
yarn global add scraperis-mcp
You need a Scraper.is API key to use this package.
- Sign up or log in at scraper.is
- Navigate to the API Keys section in your dashboard: https://www.scraper.is/dashboard/apikeys
- Create a new API key or copy your existing key
- Store this key securely as you'll need it to use this package
Create a .env file with your Scraper.is API key:
SCRAPERIS_API_KEY=your_api_key_here
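How the package itself reads this key is not described here. As a minimal sketch, assuming the key ends up in an environment variable named SCRAPERIS_API_KEY, a startup check could fail fast when the key is absent or still set to the placeholder. The helper name `requireApiKey` is hypothetical and not part of scraperis-mcp.

```typescript
// Hypothetical helper for illustration; not part of the scraperis-mcp API.
// Takes an environment-like map so it can be checked without touching the real environment.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["SCRAPERIS_API_KEY"]?.trim();
  // Reject a missing key and the placeholder value copied from the .env example.
  if (!key || key === "your_api_key_here") {
    throw new Error("SCRAPERIS_API_KEY is missing or still set to the placeholder value");
  }
  return key;
}
```

In a Node.js program this would typically be called as `requireApiKey(process.env)` before any scraping request is made.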
To use this package with Claude Desktop:
- Install the package globally:
  npm install -g scraperis-mcp
- Add the following configuration to your claude_desktop_config.json file:
  {
    "mcpServers": {
      "scraperis_scraper": {
        "command": "scraperis-mcp",
        "args": [],
        "env": {
          "SCRAPERIS_API_KEY": "your-api-key-here",
          "DEBUG": "*"
        }
      }
    }
  }
- Replace your-api-key-here with your actual Scraper.is API key.
- Restart Claude Desktop to apply the changes.
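If claude_desktop_config.json already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, mirroring the configuration shown above; the function name `withScraperis` and the two interfaces are hypothetical, and reading or writing the file itself is left out.

```typescript
// Hypothetical types describing only the parts of the config used here.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Returns a copy of the config with the scraperis_scraper entry added,
// keeping any other servers and top-level settings untouched.
function withScraperis(config: ClaudeDesktopConfig, apiKey: string): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      scraperis_scraper: {
        command: "scraperis-mcp",
        args: [],
        env: { SCRAPERIS_API_KEY: apiKey, DEBUG: "*" },
      },
    },
  };
}
```

The result can be serialized with `JSON.stringify(config, null, 2)` and written back to the config file before restarting Claude Desktop.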
For development and testing, you can use the MCP Inspector:
npx @modelcontextprotocol/inspector scraperis-mcp
This package is designed to be used with AI assistants that support the Model Context Protocol (MCP). When properly configured, the AI assistant can use the following tools:
The scrape tool allows the AI to extract content from websites. It supports various formats:
- markdown: Returns the content in markdown format
- html: Returns the content in HTML format
- screenshot: Returns a screenshot of the webpage
- json: Returns structured data in JSON format
Example prompt for the AI:
Can you scrape the latest news from techcrunch.com and summarize it for me?
Scrapes content from a webpage based on a prompt.
Parameters:
- prompt (string): The prompt describing what to scrape, including the URL
- format (string): The format to return the content in (markdown, html, screenshot, json, quick)
Example:
{
"prompt": "Get me the top 10 products from producthunt.com",
"format": "markdown"
}
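A client-side check of these two parameters can catch malformed calls before they reach the server. The sketch below validates the arguments against the format values listed above and wraps them in an MCP `tools/call` JSON-RPC request. The helper names are hypothetical, the tool name `scrape` is assumed from the description above, and whether the server treats `format` as required is an assumption.

```typescript
// Hypothetical validation sketch; not taken from the scraperis-mcp source.
const SCRAPE_FORMATS = ["markdown", "html", "screenshot", "json", "quick"] as const;
type ScrapeFormat = (typeof SCRAPE_FORMATS)[number];

interface ScrapeArgs {
  prompt: string;
  format: ScrapeFormat;
}

// Type guard: true only for one of the documented format strings.
function isScrapeFormat(value: unknown): value is ScrapeFormat {
  return SCRAPE_FORMATS.some((f) => f === value);
}

// Validates untrusted input and returns typed arguments, or throws.
// Assumption: both fields are treated as required here.
function parseScrapeArgs(input: unknown): ScrapeArgs {
  if (typeof input !== "object" || input === null) {
    throw new Error("arguments must be an object");
  }
  const { prompt, format } = input as Record<string, unknown>;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt must be a non-empty string");
  }
  if (!isScrapeFormat(format)) {
    throw new Error(`format must be one of: ${SCRAPE_FORMATS.join(", ")}`);
  }
  return { prompt, format };
}

// Wraps validated arguments in an MCP tools/call request (JSON-RPC 2.0).
function buildScrapeCall(id: number, args: ScrapeArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "scrape", arguments: args },
  };
}
```

In practice an MCP client library builds the `tools/call` envelope itself, so only the validation step would normally be written by hand.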
- Clone the repository:
  git clone https://github.com/Ai-Quill/scraperis-mcp.git
  cd scraperis-mcp
- Install dependencies:
  npm install
- Build the project:
  npm run build
- npm run build: Build the project
- npm run watch: Watch for changes and rebuild
- npm run dev: Run with MCP Inspector for development
- npm run test: Run tests
- npm run lint: Run ESLint
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.