Workflows
Blog Post Generator
Examples
- Introduction
- Getting Started
- Agents
- Workflows
- Applications
Agent Concepts
- Multimodal
- RAG
- Knowledge
- Memory
- Teams
- Async
- Hybrid Search
- Storage
- Tools
- Vector Databases
- Embedders
Models
- Anthropic
- AWS Bedrock Claude
- Azure OpenAI
- Cohere
- DeepSeek
- Fireworks
- Gemini
- Groq
- Hugging Face
- Mistral
- NVIDIA
- Ollama
- OpenAI
- Together
- Vertex AI
- xAI
Workflows
Blog Post Generator
This advanced example demonstrates how to build a sophisticated blog post generator that combines web research capabilities with professional writing expertise. The workflow uses a multi-stage approach:
- Intelligent web research and source gathering
- Content extraction and processing
- Professional blog post writing with proper citations
Key capabilities:
- Advanced web research and source evaluation
- Content scraping and processing
- Professional writing with SEO optimization
- Automatic content caching for efficiency
- Source attribution and fact verification
Example blog topics to try:
- “The Rise of Artificial General Intelligence: Latest Breakthroughs”
- “How Quantum Computing is Revolutionizing Cybersecurity”
- “Sustainable Living in 2024: Practical Tips for Reducing Carbon Footprint”
- “The Future of Work: AI and Human Collaboration”
- “Space Tourism: From Science Fiction to Reality”
- “Mindfulness and Mental Health in the Digital Age”
- “The Evolution of Electric Vehicles: Current State and Future Trends”
Code
blog_post_generator.py
import json
from textwrap import dedent
from typing import Dict, Iterator, Optional
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.storage.workflow.sqlite import SqliteWorkflowStorage
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.tools.newspaper4k import Newspaper4kTools
from agno.utils.log import logger
from agno.utils.pprint import pprint_run_response
from agno.workflow import RunEvent, RunResponse, Workflow
from pydantic import BaseModel, Field
class NewsArticle(BaseModel):
title: str = Field(..., description="Title of the article.")
url: str = Field(..., description="Link to the article.")
summary: Optional[str] = Field(
..., description="Summary of the article if available."
)
class SearchResults(BaseModel):
articles: list[NewsArticle]
class ScrapedArticle(BaseModel):
title: str = Field(..., description="Title of the article.")
url: str = Field(..., description="Link to the article.")
summary: Optional[str] = Field(
..., description="Summary of the article if available."
)
content: Optional[str] = Field(
...,
description="Full article content in markdown format. None if content is unavailable.",
)
class BlogPostGenerator(Workflow):
"""Advanced workflow for generating professional blog posts with proper research and citations."""
description: str = dedent("""\
An intelligent blog post generator that creates engaging, well-researched content.
This workflow orchestrates multiple AI agents to research, analyze, and craft
compelling blog posts that combine journalistic rigor with engaging storytelling.
The system excels at creating content that is both informative and optimized for
digital consumption.
""")
# Search Agent: Handles intelligent web searching and source gathering
searcher: Agent = Agent(
model=OpenAIChat(id="gpt-4o-mini"),
tools=[DuckDuckGoTools()],
description=dedent("""\
You are BlogResearch-X, an elite research assistant specializing in discovering
high-quality sources for compelling blog content. Your expertise includes:
- Finding authoritative and trending sources
- Evaluating content credibility and relevance
- Identifying diverse perspectives and expert opinions
- Discovering unique angles and insights
- Ensuring comprehensive topic coverage\
"""),
instructions=(
"1. Search Strategy 🔍\n"
" - Find 10-15 relevant sources and select the 5-7 best ones\n"
" - Prioritize recent, authoritative content\n"
" - Look for unique angles and expert insights\n"
"2. Source Evaluation 📊\n"
" - Verify source credibility and expertise\n"
" - Check publication dates for timeliness\n"
" - Assess content depth and uniqueness\n"
"3. Diversity of Perspectives 🌐\n"
" - Include different viewpoints\n"
" - Gather both mainstream and expert opinions\n"
" - Find supporting data and statistics"
),
response_model=SearchResults,
structured_outputs=True,
)
# Content Scraper: Extracts and processes article content
article_scraper: Agent = Agent(
model=OpenAIChat(id="gpt-4o-mini"),
tools=[Newspaper4kTools()],
description=dedent("""\
You are ContentBot-X, a specialist in extracting and processing digital content
for blog creation. Your expertise includes:
- Efficient content extraction
- Smart formatting and structuring
- Key information identification
- Quote and statistic preservation
- Maintaining source attribution\
"""),
instructions=(
"1. Content Extraction 📑\n"
" - Extract content from the article\n"
" - Preserve important quotes and statistics\n"
" - Maintain proper attribution\n"
" - Handle paywalls gracefully\n"
"2. Content Processing 🔄\n"
" - Format text in clean markdown\n"
" - Preserve key information\n"
" - Structure content logically\n"
"3. Quality Control ✅\n"
" - Verify content relevance\n"
" - Ensure accurate extraction\n"
" - Maintain readability"
),
response_model=ScrapedArticle,
structured_outputs=True,
)
# Content Writer Agent: Crafts engaging blog posts from research
writer: Agent = Agent(
model=OpenAIChat(id="gpt-4o"),
description=dedent("""\
You are BlogMaster-X, an elite content creator combining journalistic excellence
with digital marketing expertise. Your strengths include:
- Crafting viral-worthy headlines
- Writing engaging introductions
- Structuring content for digital consumption
- Incorporating research seamlessly
- Optimizing for SEO while maintaining quality
- Creating shareable conclusions\
"""),
instructions=(
"1. Content Strategy 📝\n"
" - Craft attention-grabbing headlines\n"
" - Write compelling introductions\n"
" - Structure content for engagement\n"
" - Include relevant subheadings\n"
"2. Writing Excellence ✍️\n"
" - Balance expertise with accessibility\n"
" - Use clear, engaging language\n"
" - Include relevant examples\n"
" - Incorporate statistics naturally\n"
"3. Source Integration 🔍\n"
" - Cite sources properly\n"
" - Include expert quotes\n"
" - Maintain factual accuracy\n"
"4. Digital Optimization 💻\n"
" - Structure for scanability\n"
" - Include shareable takeaways\n"
" - Optimize for SEO\n"
" - Add engaging subheadings"
),
expected_output=dedent("""\
# {Viral-Worthy Headline}
## Introduction
{Engaging hook and context}
## {Compelling Section 1}
{Key insights and analysis}
{Expert quotes and statistics}
## {Engaging Section 2}
{Deeper exploration}
{Real-world examples}
## {Practical Section 3}
{Actionable insights}
{Expert recommendations}
## Key Takeaways
- {Shareable insight 1}
- {Practical takeaway 2}
- {Notable finding 3}
## Sources
{Properly attributed sources with links}
"""),
markdown=True,
)
def run(
self,
topic: str,
use_search_cache: bool = True,
use_scrape_cache: bool = True,
use_cached_report: bool = True,
) -> Iterator[RunResponse]:
logger.info(f"Generating a blog post on: {topic}")
# Use the cached blog post if use_cache is True
if use_cached_report:
cached_blog_post = self.get_cached_blog_post(topic)
if cached_blog_post:
yield RunResponse(
content=cached_blog_post, event=RunEvent.workflow_completed
)
return
# Search the web for articles on the topic
search_results: Optional[SearchResults] = self.get_search_results(
topic, use_search_cache
)
# If no search_results are found for the topic, end the workflow
if search_results is None or len(search_results.articles) == 0:
yield RunResponse(
event=RunEvent.workflow_completed,
content=f"Sorry, could not find any articles on the topic: {topic}",
)
return
# Scrape the search results
scraped_articles: Dict[str, ScrapedArticle] = self.scrape_articles(
search_results, use_scrape_cache
)
# Write a blog post
yield from self.write_blog_post(topic, scraped_articles)
def get_cached_blog_post(self, topic: str) -> Optional[str]:
logger.info("Checking if cached blog post exists")
return self.session_state.get("blog_posts", {}).get(topic)
def add_blog_post_to_cache(self, topic: str, blog_post: str):
logger.info(f"Saving blog post for topic: {topic}")
self.session_state.setdefault("blog_posts", {})
self.session_state["blog_posts"][topic] = blog_post
# Save the blog post to the storage
self.write_to_storage()
def get_cached_search_results(self, topic: str) -> Optional[SearchResults]:
logger.info("Checking if cached search results exist")
return self.session_state.get("search_results", {}).get(topic)
def add_search_results_to_cache(self, topic: str, search_results: SearchResults):
logger.info(f"Saving search results for topic: {topic}")
self.session_state.setdefault("search_results", {})
self.session_state["search_results"][topic] = search_results.model_dump()
# Save the search results to the storage
self.write_to_storage()
def get_cached_scraped_articles(
self, topic: str
) -> Optional[Dict[str, ScrapedArticle]]:
logger.info("Checking if cached scraped articles exist")
return self.session_state.get("scraped_articles", {}).get(topic)
def add_scraped_articles_to_cache(
self, topic: str, scraped_articles: Dict[str, ScrapedArticle]
):
logger.info(f"Saving scraped articles for topic: {topic}")
self.session_state.setdefault("scraped_articles", {})
self.session_state["scraped_articles"][topic] = scraped_articles
# Save the scraped articles to the storage
self.write_to_storage()
def get_search_results(
self, topic: str, use_search_cache: bool, num_attempts: int = 3
) -> Optional[SearchResults]:
# Get cached search_results from the session state if use_search_cache is True
if use_search_cache:
try:
search_results_from_cache = self.get_cached_search_results(topic)
if search_results_from_cache is not None:
search_results = SearchResults.model_validate(
search_results_from_cache
)
logger.info(
f"Found {len(search_results.articles)} articles in cache."
)
return search_results
except Exception as e:
logger.warning(f"Could not read search results from cache: {e}")
# If there are no cached search_results, use the searcher to find the latest articles
for attempt in range(num_attempts):
try:
searcher_response: RunResponse = self.searcher.run(topic)
if (
searcher_response is not None
and searcher_response.content is not None
and isinstance(searcher_response.content, SearchResults)
):
article_count = len(searcher_response.content.articles)
logger.info(
f"Found {article_count} articles on attempt {attempt + 1}"
)
# Cache the search results
self.add_search_results_to_cache(topic, searcher_response.content)
return searcher_response.content
else:
logger.warning(
f"Attempt {attempt + 1}/{num_attempts} failed: Invalid response type"
)
except Exception as e:
logger.warning(f"Attempt {attempt + 1}/{num_attempts} failed: {str(e)}")
logger.error(f"Failed to get search results after {num_attempts} attempts")
return None
def scrape_articles(
self, search_results: SearchResults, use_scrape_cache: bool
) -> Dict[str, ScrapedArticle]:
scraped_articles: Dict[str, ScrapedArticle] = {}
# Get cached scraped_articles from the session state if use_scrape_cache is True
if use_scrape_cache:
try:
scraped_articles_from_cache = self.get_cached_scraped_articles(topic)
if scraped_articles_from_cache is not None:
scraped_articles = scraped_articles_from_cache
logger.info(
f"Found {len(scraped_articles)} scraped articles in cache."
)
return scraped_articles
except Exception as e:
logger.warning(f"Could not read scraped articles from cache: {e}")
# Scrape the articles that are not in the cache
for article in search_results.articles:
if article.url in scraped_articles:
logger.info(f"Found scraped article in cache: {article.url}")
continue
article_scraper_response: RunResponse = self.article_scraper.run(
article.url
)
if (
article_scraper_response is not None
and article_scraper_response.content is not None
and isinstance(article_scraper_response.content, ScrapedArticle)
):
scraped_articles[article_scraper_response.content.url] = (
article_scraper_response.content
)
logger.info(f"Scraped article: {article_scraper_response.content.url}")
# Save the scraped articles in the session state
self.add_scraped_articles_to_cache(topic, scraped_articles)
return scraped_articles
def write_blog_post(
self, topic: str, scraped_articles: Dict[str, ScrapedArticle]
) -> Iterator[RunResponse]:
logger.info("Writing blog post")
# Prepare the input for the writer
writer_input = {
"topic": topic,
"articles": [v.model_dump() for v in scraped_articles.values()],
}
# Run the writer and yield the response
yield from self.writer.run(json.dumps(writer_input, indent=4), stream=True)
# Save the blog post in the cache
self.add_blog_post_to_cache(topic, self.writer.run_response.content)
# Run the workflow if the script is executed directly
if __name__ == "__main__":
import random
from rich.prompt import Prompt
# Fun example prompts to showcase the generator's versatility
example_prompts = [
"Why Cats Secretly Run the Internet",
"The Science Behind Why Pizza Tastes Better at 2 AM",
"Time Travelers' Guide to Modern Social Media",
"How Rubber Ducks Revolutionized Software Development",
"The Secret Society of Office Plants: A Survival Guide",
"Why Dogs Think We're Bad at Smelling Things",
"The Underground Economy of Coffee Shop WiFi Passwords",
"A Historical Analysis of Dad Jokes Through the Ages",
]
# Get topic from user
topic = Prompt.ask(
"[bold]Enter a blog post topic[/bold] (or press Enter for a random example)\n✨",
default=random.choice(example_prompts),
)
# Convert the topic to a URL-safe string for use in session_id
url_safe_topic = topic.lower().replace(" ", "-")
# Initialize the blog post generator workflow
# - Creates a unique session ID based on the topic
# - Sets up SQLite storage for caching results
generate_blog_post = BlogPostGenerator(
session_id=f"generate-blog-post-on-{url_safe_topic}",
storage=SqliteWorkflowStorage(
table_name="generate_blog_post_workflows",
db_file="tmp/workflows.db",
),
)
# Execute the workflow with caching enabled
# Returns an iterator of RunResponse objects containing the generated content
blog_post: Iterator[RunResponse] = generate_blog_post.run(
topic=topic,
use_search_cache=True,
use_scrape_cache=True,
use_cached_report=True,
)
# Print the response
pprint_run_response(blog_post, markdown=True)
Usage
1
Create a virtual environment
Open the Terminal
and create a python virtual environment.
2
Install libraries
openai duckduckgo-search newspaper4k lxml_html_clean sqlalchemy agno
3
Run the workflow
python blog_post_generator.py