ScrapeGraph AI
ScrapeGraph AI is a service that provides AI-powered web scraping capabilities. It offers tools for extracting structured data, converting webpages to markdown, and processing local HTML content using natural language prompts.
Installation and Setup
Install the required package:
pip install langchain-scrapegraph
Set up your API key:
export SGAI_API_KEY="your-scrapegraph-api-key"
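Where a shell export is inconvenient, for example in a notebook, the key can also be set from Python before any tool is created. The following is a minimal sketch using only the standard library. The helper name `ensure_api_key` and its `env` and `ask` parameters are hypothetical and not part of the package; only the variable name `SGAI_API_KEY` comes from the setup step above.

```python
import getpass
import os


def ensure_api_key(env=os.environ, ask=getpass.getpass):
    """Return the ScrapeGraph AI key, prompting for it only if it is not set.

    `env` and `ask` are injectable (hypothetical parameters) so the helper
    can be exercised without touching the real environment or a terminal.
    """
    key = env.get("SGAI_API_KEY")
    if not key:
        # getpass reads the key without echoing it to the terminal.
        key = ask("ScrapeGraph AI API key: ")
        env["SGAI_API_KEY"] = key
    return key
```

Calling `ensure_api_key()` once at the top of a script or notebook leaves an existing key untouched and prompts only when none is found.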
Tools
See a usage example.
There are four tools available:
from langchain_scrapegraph.tools import (
    SmartScraperTool,    # Extract structured data from websites
    SmartCrawlerTool,    # Extract data from multiple pages with crawling
    MarkdownifyTool,     # Convert webpages to markdown
    GetCreditsTool,      # Check remaining API credits
)
Each tool serves a specific purpose:
- SmartScraperTool: Extract structured data from websites given a URL, a prompt, and an optional output schema
- SmartCrawlerTool: Extract data from multiple pages with advanced crawling options like depth control, page limits, and domain restrictions
- MarkdownifyTool: Convert any webpage to clean markdown format
- GetCreditsTool: Check your remaining ScrapeGraph AI credits
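As a rough illustration of how three of these tools might be called, the sketch below builds the input dictionaries separately from the network calls. It assumes the tools follow the usual LangChain pattern of `tool.invoke(input_dict)`, and that the input keys are `website_url` and `user_prompt`; these key names, the helper functions, and the example URL are assumptions to check against the package documentation. `SmartCrawlerTool` is left out because its input fields are not described here.

```python
def scrape_request(url: str, prompt: str) -> dict:
    """Build the input for SmartScraperTool (key names are assumed)."""
    return {"website_url": url, "user_prompt": prompt}


def markdownify_request(url: str) -> dict:
    """Build the input for MarkdownifyTool (key name is assumed)."""
    return {"website_url": url}


def run_examples(url: str = "https://example.com") -> dict:
    """Call three tools once each; needs the package, SGAI_API_KEY, and network access."""
    # Imported lazily so the helpers above work even without the package installed.
    from langchain_scrapegraph.tools import (
        GetCreditsTool,
        MarkdownifyTool,
        SmartScraperTool,
    )

    return {
        # Structured data extracted according to the natural-language prompt.
        "data": SmartScraperTool().invoke(
            scrape_request(url, "Extract the page title and main heading")
        ),
        # The same page converted to markdown.
        "markdown": MarkdownifyTool().invoke(markdownify_request(url)),
        # Remaining credits; assumed to take an empty input.
        "credits": GetCreditsTool().invoke({}),
    }
```

Keeping the request builders separate from `run_examples` makes the payloads easy to inspect or log before any credits are spent.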