Prerequisites

The following examples require the scrapegraph-py library.
pip install -U scrapegraph-py
Optionally, if your ScrapeGraph configuration or specific models require an API key, set the SGAI_API_KEY environment variable:
export SGAI_API_KEY="YOUR_SGAI_API_KEY"
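
A script can confirm that a key is available before it builds the toolkit. The sketch below mirrors the fallback described for the api_key parameter: an explicit value wins, otherwise SGAI_API_KEY is read from the environment. The helper name resolve_sgai_api_key is hypothetical and is not part of scrapegraph-py or agno.

```python
import os
from typing import Optional


def resolve_sgai_api_key(explicit_key: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: explicit key first, then SGAI_API_KEY, else None."""
    if explicit_key:
        return explicit_key
    # Treat an empty variable the same as an unset one.
    return os.environ.get("SGAI_API_KEY") or None


if __name__ == "__main__":
    key = resolve_sgai_api_key()
    if key is None:
        print("SGAI_API_KEY is not set; export it or pass api_key explicitly.")
    else:
        # Report only the length so the secret itself is never printed.
        print(f"Found an API key of length {len(key)}.")
```

The resolved value could then be passed as api_key when constructing ScrapeGraphTools, or omitted so the toolkit reads the environment variable itself.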

Example

The following agent uses ScrapeGraphTools to extract specific information from a webpage using the smartscraper functionality.
from agno.agent import Agent
from agno.tools.scrapegraph import ScrapeGraphTools

agent = Agent(
    # Flag name as listed under Toolkit Params.
    tools=[ScrapeGraphTools(enable_smartscraper=True)],
)

agent.print_response(
    """
    Use smartscraper to extract the following from https://www.wired.com/category/science/:
    - News articles
    - Headlines
    - Images
    - Links
    - Author
    """,
)

Toolkit Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | Optional[str] | None | ScrapeGraph API key. If not provided, uses SGAI_API_KEY environment variable. |
| enable_smartscraper | bool | True | Enable the smartscraper function for LLM-powered data extraction. |
| enable_markdownify | bool | False | Enable the markdownify function for webpage to markdown conversion. |
| enable_crawl | bool | False | Enable the crawl function for website crawling and data extraction. |
| enable_searchscraper | bool | False | Enable the searchscraper function for web search and information extraction. |
| enable_agentic_crawler | bool | False | Enable the agentic_crawler function for automated browser actions and AI extraction. |
| all | bool | False | Enable all available functions. When True, all enable flags are ignored. |
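
The way the flags combine can be modelled in a few lines. The following is a minimal sketch of the semantics stated in the table, not the toolkit's actual implementation; the function enabled_functions and the FUNCTION_NAMES list are hypothetical names introduced for illustration, and the real toolkit may apply additional rules.

```python
from typing import List

# Function names as listed in this document.
FUNCTION_NAMES = [
    "smartscraper",
    "markdownify",
    "crawl",
    "searchscraper",
    "agentic_crawler",
]


def enabled_functions(
    enable_smartscraper: bool = True,
    enable_markdownify: bool = False,
    enable_crawl: bool = False,
    enable_searchscraper: bool = False,
    enable_agentic_crawler: bool = False,
    all: bool = False,  # shadows the builtin only inside this function
) -> List[str]:
    """Illustrative model: which functions the documented flags would enable."""
    if all:
        # Per the table, the individual enable flags are ignored here.
        return list(FUNCTION_NAMES)
    flags = [
        enable_smartscraper,
        enable_markdownify,
        enable_crawl,
        enable_searchscraper,
        enable_agentic_crawler,
    ]
    return [name for name, on in zip(FUNCTION_NAMES, flags) if on]


if __name__ == "__main__":
    print(enabled_functions())
    print(enabled_functions(enable_smartscraper=False, enable_crawl=True))
    print(enabled_functions(enable_smartscraper=False, all=True))
```

Under this model the defaults expose only smartscraper, so any other function has to be switched on explicitly or through all.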

Toolkit Functions

| Function | Description |
| --- | --- |
| smartscraper | Extract structured data from a webpage using LLM and natural language prompt. Parameters: url (str), prompt (str). |
| markdownify | Convert a webpage to markdown format. Parameters: url (str). |
| crawl | Crawl a website and extract structured data. Parameters: url (str), prompt (str), schema (dict), cache_website (bool), depth (int), max_pages (int), same_domain_only (bool), batch_size (int). |
| searchscraper | Search the web and extract information. Parameters: prompt (str). |
| agentic_crawler | Perform automated browser actions with optional AI extraction. Parameters: url (str), steps (List[str]), use_session (bool), user_prompt (Optional[str]), output_schema (Optional[dict]), ai_extraction (bool). |
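
Because crawl takes the most parameters, it may help to see one possible argument set written out. The sketch below only builds and type-checks a dictionary against the parameter list above; it does not call the toolkit. The checker check_crawl_args is a hypothetical helper, the schema is assumed to be a JSON Schema style dict, and every value other than the URL is illustrative rather than a documented default.

```python
from typing import Any, Dict

# Parameter names and types as listed for the crawl function.
CRAWL_PARAM_TYPES: Dict[str, type] = {
    "url": str,
    "prompt": str,
    "schema": dict,
    "cache_website": bool,
    "depth": int,
    "max_pages": int,
    "same_domain_only": bool,
    "batch_size": int,
}


def check_crawl_args(args: Dict[str, Any]) -> None:
    """Hypothetical check: raise if a name or type differs from the listed parameters."""
    for name, value in args.items():
        if name not in CRAWL_PARAM_TYPES:
            raise KeyError(f"unknown crawl parameter: {name}")
        expected = CRAWL_PARAM_TYPES[name]
        # bool is a subclass of int, so reject booleans for integer parameters.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{name} must be int, got bool")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")


# Illustrative values; the schema layout is an assumption.
crawl_args: Dict[str, Any] = {
    "url": "https://www.wired.com/category/science/",
    "prompt": "Extract the headline and author of each article.",
    "schema": {
        "type": "object",
        "properties": {
            "headline": {"type": "string"},
            "author": {"type": "string"},
        },
        "required": ["headline"],
    },
    "cache_website": True,
    "depth": 1,
    "max_pages": 5,
    "same_domain_only": True,
    "batch_size": 1,
}

if __name__ == "__main__":
    check_crawl_args(crawl_args)
    print("crawl arguments match the documented names and types")
```

In an agent, these values would normally be chosen by the model from the prompt; writing them out is mainly useful for reviewing what a crawl request is expected to contain.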

Developer Resources