Use the Keboola MCP server to create an agent that can query data, manage transformations, and orchestrate jobs in your Keboola project.

Overview

Keboola MCP Server is an open-source bridge between your Keboola project and modern AI tools. It turns Keboola features—like storage access, SQL transformations, and job triggers—into callable tools for MCP-compatible clients and AI frameworks.

Features

  • Storage: Query tables directly and manage table or bucket descriptions
  • Components: Create, list, and inspect extractors, writers, data apps, and transformation configurations
  • SQL: Create SQL transformations with natural language
  • Jobs: Run components and transformations, and retrieve job execution details
  • Metadata: Search, read, and update project documentation and object metadata using natural language

Setup

Before setting up the MCP server, you need three key pieces of information:

1. KBC_STORAGE_TOKEN

This is your authentication token for Keboola. For instructions on how to create and manage Storage API tokens, refer to the official Keboola documentation. Note: Use a custom storage token for limited access, or a master token for full project access.

2. KBC_WORKSPACE_SCHEMA

This identifies your workspace in Keboola and is required for SQL queries. Follow the Keboola guide to find your KBC_WORKSPACE_SCHEMA. Note: Check the “Grant read-only access to all Project data” option when creating the workspace.

3. Keboola Region

Your Keboola API URL depends on your deployment region:
Region               API URL
AWS North America    https://connection.keboola.com
AWS Europe           https://connection.eu-central-1.keboola.com
Google Cloud EU      https://connection.europe-west3.gcp.keboola.com
Google Cloud US      https://connection.us-east4.gcp.keboola.com
Azure EU             https://connection.north-europe.azure.keboola.com
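If you script your setup, the table above can be captured as a simple lookup; a minimal sketch (the region labels and variable names are illustrative, the URLs come from the table):

```python
# Keboola API URL per deployment region, as listed in the table above.
KEBOOLA_API_URLS = {
    "AWS North America": "https://connection.keboola.com",
    "AWS Europe": "https://connection.eu-central-1.keboola.com",
    "Google Cloud EU": "https://connection.europe-west3.gcp.keboola.com",
    "Google Cloud US": "https://connection.us-east4.gcp.keboola.com",
    "Azure EU": "https://connection.north-europe.azure.keboola.com",
}

# Example: the KBC_API_URL value for an AWS North America project.
api_url = KEBOOLA_API_URLS["AWS North America"]
```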

BigQuery-Specific Setup

If your Keboola project uses BigQuery backend:
  1. Go to your Keboola BigQuery workspace and display its credentials (click Connect button)
  2. Download the credentials JSON file to your local disk
  3. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the full path of the downloaded file
  4. Use the Dataset Name shown in your BigQuery workspace as your KBC_WORKSPACE_SCHEMA
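Before launching the server, it can help to sanity-check that the credentials file from the steps above is readable; a minimal sketch (the check_bigquery_credentials helper is illustrative, not part of the Keboola tooling):

```python
import json
import os


def check_bigquery_credentials(path):
    """Return the service-account project ID, or None if the file is unusable."""
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        creds = json.load(f)
    return creds.get("project_id")


# The env var should hold the full path set in step 3; empty if not configured.
project_id = check_bigquery_credentials(os.getenv("GOOGLE_APPLICATION_CREDENTIALS", ""))
```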

Usage

"""
Keboola MCP Agent - Manages your data platform

This example shows how to use the Agno MCP tools to interact with your Keboola project.

1. Get your Keboola Storage API token from your project settings
2. Create a workspace and get your workspace schema
3. Set environment variables:
   export KBC_STORAGE_TOKEN=your_keboola_storage_token
   export KBC_WORKSPACE_SCHEMA=your_workspace_schema
   export KBC_API_URL=https://connection.YOUR_REGION.keboola.com

Dependencies: pip install agno mcp openai
"""

import asyncio
import os
from textwrap import dedent

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.mcp import MCPTools
from mcp import StdioServerParameters


async def run_agent():
    storage_token = os.getenv("KBC_STORAGE_TOKEN")
    workspace_schema = os.getenv("KBC_WORKSPACE_SCHEMA")
    api_url = os.getenv("KBC_API_URL", "https://connection.keboola.com")
    
    if not storage_token:
        raise ValueError(
            "Missing Keboola Storage API token: set KBC_STORAGE_TOKEN environment variable"
        )
    
    if not workspace_schema:
        raise ValueError(
            "Missing Keboola workspace schema: set KBC_WORKSPACE_SCHEMA environment variable"
        )

    command = "uvx"
    args = ["keboola_mcp_server", "--api-url", api_url]
    env = {
        "KBC_STORAGE_TOKEN": storage_token,
        "KBC_WORKSPACE_SCHEMA": workspace_schema,
    }
    
    # Add BigQuery credentials if available
    google_creds = os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
    if google_creds:
        env["GOOGLE_APPLICATION_CREDENTIALS"] = google_creds
    
    server_params = StdioServerParameters(command=command, args=args, env=env)

    async with MCPTools(server_params=server_params) as mcp_tools:
        agent = Agent(
            name="KeboolaDataAgent",
            model=OpenAIChat(id="gpt-4o"),
            tools=[mcp_tools],
            description="Agent to query and manage Keboola data platform via MCP",
            instructions=dedent("""\
                You have access to Keboola data platform through MCP tools.
                - Use tools to query tables, manage transformations, and run jobs.
                - Confirm with the user before making modifications or running jobs.
                - Always use proper SQL syntax based on the workspace dialect (Snowflake or BigQuery).
                - When querying tables, use fully qualified table names from table details.
            """),
            markdown=True,
            show_tool_calls=True,
        )

        await agent.acli_app(
            message="I'm a data platform assistant with access to your Keboola project. I can help you query data, create transformations, manage components, and run jobs. What would you like to do?",
            stream=True,
            markdown=True,
            exit_on=["exit", "quit"],
        )


if __name__ == "__main__":
    asyncio.run(run_agent()) 

Supported Tools

The Keboola MCP Server provides comprehensive tools across different categories:

Storage Tools

  • retrieve_buckets - Lists all storage buckets in your Keboola project
  • get_bucket_detail - Retrieves detailed information about a specific bucket
  • retrieve_bucket_tables - Returns all tables within a specific bucket
  • get_table_detail - Provides detailed information for a specific table
  • update_bucket_description - Updates the description of a bucket
  • update_table_description - Updates the description of a table
  • update_column_description - Updates the description for a given column in a table

SQL Tools

  • query_table - Executes custom SQL queries against your data
  • get_sql_dialect - Identifies whether your workspace uses Snowflake or BigQuery SQL dialect
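The agent's instructions above call for dialect-aware, fully qualified table names; a hedged sketch of what that qualification could look like (the qualify helper and the sample identifiers are illustrative, not an MCP tool):

```python
def qualify(parts, dialect):
    """Join schema/table identifiers with dialect-appropriate quoting.

    Snowflake quotes identifiers with double quotes; BigQuery uses backticks.
    """
    quote = '"' if dialect == "Snowflake" else "`"
    return ".".join(f"{quote}{p}{quote}" for p in parts)


snowflake_name = qualify(["WORKSPACE_123", "in.c-sales", "orders"], "Snowflake")
bigquery_name = qualify(["my-project", "my_dataset", "orders"], "BigQuery")
```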

Component Management

  • create_component_root_configuration - Creates a component configuration with custom parameters
  • create_component_row_configuration - Creates a component configuration row with custom parameters
  • create_sql_transformation - Creates an SQL transformation with custom queries
  • find_component_id - Returns a list of component IDs that match the given query
  • get_component - Gets information about a specific component given its ID
  • get_component_configuration - Gets information about a specific component/transformation configuration
  • retrieve_component_configurations - Retrieves configurations of components present in the project
  • retrieve_transformations - Retrieves transformation configurations in the project
  • update_component_root_configuration - Updates a specific component configuration
  • update_sql_transformation_configuration - Updates an existing SQL transformation configuration

Job Management

  • retrieve_jobs - Lists and filters jobs by status, component, or configuration
  • get_job_detail - Returns comprehensive details about a specific job
  • start_job - Triggers a component or transformation job to run

Documentation

  • docs_query - Searches Keboola documentation based on natural language queries

Example Queries

Once configured, you can start querying your Keboola data:

Data Exploration:
  • “What buckets and tables are in my Keboola project?”
  • “What tables contain customer information?”
  • “Run a query to find the top 10 customers by revenue”
Data Analysis:
  • “Analyze my sales data by region for the last quarter”
  • “Find correlations between customer age and purchase frequency”
Data Pipelines:
  • “Create a SQL transformation that joins customer and order tables”
  • “Start the data extraction job for my Salesforce component”

Troubleshooting

Issue                   Solution
Authentication Errors   Verify KBC_STORAGE_TOKEN is valid
Workspace Issues        Confirm KBC_WORKSPACE_SCHEMA is correct
Connection Timeout      Check network connectivity and API URL region
BigQuery Access         Ensure GOOGLE_APPLICATION_CREDENTIALS path is correct
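Most of the issues above reduce to missing or wrong environment variables, so a pre-flight check before starting the agent can save a debugging round-trip; a minimal sketch using only the variable names from this guide:

```python
import os


def missing_required_vars(env):
    """Return the names of required Keboola variables that are unset or empty."""
    required = ["KBC_STORAGE_TOKEN", "KBC_WORKSPACE_SCHEMA"]
    return [name for name in required if not env.get(name)]


problems = missing_required_vars(dict(os.environ))
if problems:
    print(f"Set these before launching the MCP server: {', '.join(problems)}")
```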

Resources