ScrapeGraphTools enable an Agno Agent to extract structured data from webpages and to convert webpages to markdown for LLMs.
The toolkit requires the `scrapegraph-py` library.
Unless an `api_key` is passed explicitly, the toolkit reads the ScrapeGraph API key from the `SGAI_API_KEY` environment variable.
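The key lookup can be pictured with a minimal sketch. The helper `resolve_api_key` is hypothetical and not part of Agno or `scrapegraph-py`; it only mirrors the documented fallback (an explicit `api_key` first, then `SGAI_API_KEY`) so that a missing key fails early with a clear message.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit key if given, otherwise fall back to SGAI_API_KEY.

    Hypothetical helper for illustration; the real toolkit does its own lookup.
    """
    key = api_key or env.get("SGAI_API_KEY")
    if not key:
        raise RuntimeError(
            "No ScrapeGraph API key found: pass api_key or set SGAI_API_KEY."
        )
    return key
```

Calling `resolve_api_key()` before building the toolkit is one way to surface a configuration problem before any agent run starts.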
An agent can use `ScrapeGraphTools` to extract specific information from a webpage using the `smartscraper` functionality.
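A minimal sketch of such an agent is shown here. It assumes the toolkit is importable from `agno.tools.scrapegraph`, that `scrapegraph-py` is installed, and that both `SGAI_API_KEY` and credentials for the agent's model are configured. The helper names `build_extraction_prompt` and `run_agent`, and the example URL and fields, are illustrative only.

```python
from typing import List


def build_extraction_prompt(url: str, fields: List[str]) -> str:
    """Compose a natural-language instruction that asks the agent to call smartscraper."""
    bullet_list = "\n".join(f"- {field}" for field in fields)
    return f"Use smartscraper to extract the following from {url}:\n{bullet_list}"


def run_agent(url: str, fields: List[str]) -> None:
    """Build an agent with the toolkit attached and print its response."""
    # Imported lazily so the prompt helper can be used without agno installed.
    from agno.agent import Agent
    from agno.tools.scrapegraph import ScrapeGraphTools

    # smartscraper is enabled by default; the flag is spelled out for clarity.
    agent = Agent(tools=[ScrapeGraphTools(enable_smartscraper=True)], markdown=True)
    agent.print_response(build_extraction_prompt(url, fields))


# Requires network access and valid API keys, so it is left commented out:
# run_agent("https://example.com", ["Page title", "Main heading"])
```

Keeping the prompt construction separate from the agent call makes the instruction text easy to inspect and reuse across pages.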

Toolkit parameters:

Parameter | Type | Default | Description |
---|---|---|---|
api_key | Optional[str] | None | ScrapeGraph API key. If not provided, uses SGAI_API_KEY environment variable. |
enable_smartscraper | bool | True | Enable the smartscraper function for LLM-powered data extraction. |
enable_markdownify | bool | False | Enable the markdownify function for webpage to markdown conversion. |
enable_crawl | bool | False | Enable the crawl function for website crawling and data extraction. |
enable_searchscraper | bool | False | Enable the searchscraper function for web search and information extraction. |
enable_agentic_crawler | bool | False | Enable the agentic_crawler function for automated browser actions and AI extraction. |
all | bool | False | Enable all available functions. When True, all enable flags are ignored. |
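The enable flags in the table can be set together when the toolkit is constructed. The sketch below assumes the flag names are exactly those listed above; `enable_flags` and `make_toolkit` are hypothetical helpers, not part of Agno. Note that this helper explicitly turns off every function that is not requested, including `smartscraper`, and that it does not use the `all` parameter.

```python
from typing import Dict, Optional

# Function names taken from the enable_* parameters in the table above.
KNOWN_FUNCTIONS = (
    "smartscraper",
    "markdownify",
    "crawl",
    "searchscraper",
    "agentic_crawler",
)


def enable_flags(*functions: str) -> Dict[str, bool]:
    """Map requested function names to a complete set of enable_* keyword arguments."""
    unknown = set(functions) - set(KNOWN_FUNCTIONS)
    if unknown:
        raise ValueError(f"Unknown ScrapeGraph functions: {sorted(unknown)}")
    return {f"enable_{name}": name in functions for name in KNOWN_FUNCTIONS}


def make_toolkit(*functions: str, api_key: Optional[str] = None):
    """Construct the toolkit with only the requested functions enabled."""
    # Imported lazily so enable_flags can be used without agno installed.
    from agno.tools.scrapegraph import ScrapeGraphTools

    return ScrapeGraphTools(api_key=api_key, **enable_flags(*functions))
```

For example, `make_toolkit("markdownify", "crawl")` would request a toolkit that exposes only those two functions to the agent.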

Toolkit functions:

Function | Description |
---|---|
smartscraper | Extract structured data from a webpage using LLM and natural language prompt. Parameters: url (str), prompt (str). |
markdownify | Convert a webpage to markdown format. Parameters: url (str). |
crawl | Crawl a website and extract structured data. Parameters: url (str), prompt (str), schema (dict), cache_website (bool), depth (int), max_pages (int), same_domain_only (bool), batch_size (int). |
searchscraper | Search the web and extract information. Parameters: prompt (str). |
agentic_crawler | Perform automated browser actions with optional AI extraction. Parameters: url (str), steps (List[str]), use_session (bool), user_prompt (Optional[str]), output_schema (Optional[dict]), ai_extraction (bool). |
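Of the functions above, `crawl` takes the most parameters, so a sketch of one complete argument set may help. It assumes that `schema` is a JSON-Schema-style dictionary, which the table does not state. The helper names and the values chosen for `cache_website`, `depth`, `max_pages`, `same_domain_only`, and `batch_size` are illustrative, not documented defaults.

```python
from typing import Any, Dict


def object_schema(fields: Dict[str, str]) -> Dict[str, Any]:
    """Build a small JSON-Schema-style object description from name -> type pairs."""
    return {
        "type": "object",
        "properties": {name: {"type": json_type} for name, json_type in fields.items()},
        "required": list(fields),
    }


def crawl_arguments(url: str, prompt: str, fields: Dict[str, str]) -> Dict[str, Any]:
    """Collect the eight crawl parameters listed in the table into one dict."""
    return {
        "url": url,
        "prompt": prompt,
        "schema": object_schema(fields),  # assumed to be JSON-Schema-style
        "cache_website": True,            # illustrative value
        "depth": 2,                       # illustrative value
        "max_pages": 5,                   # illustrative value
        "same_domain_only": True,         # illustrative value
        "batch_size": 1,                  # illustrative value
    }
```

An agent with `enable_crawl` set would normally supply these arguments itself when it decides to call `crawl`; the dictionary only illustrates the shape of a call.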