Vanna converts natural language questions into SQL queries. You feed it DDL, documentation, and example question/SQL pairs as training data; it retrieves the most relevant context via embeddings and passes that context to an LLM to generate the query.
Why I starred it
Most text-to-SQL tools just prompt the LLM with the schema and hope for the best. Vanna's core idea is different: it maintains a vector store of your training examples and retrieves similar past queries alongside the DDL before generating SQL. The result is that query quality improves as you add corrections — it learns your specific naming conventions, common joins, and business logic. That feedback loop is the part worth understanding.
Version 2.0 also added something I hadn't seen other projects wire up this cleanly: a user-aware agent runtime where identity flows from the HTTP request into every tool execution and row-level security filter, not just the authentication layer.
How it works
The original architecture lives in src/vanna/legacy/base/base.py. The generate_sql method chains five calls:
def generate_sql(self, question: str, allow_llm_to_see_data=False, **kwargs) -> str:
    # 1. get_similar_question_sql — vector search over past Q/SQL pairs
    # 2. get_related_ddl — vector search over schema definitions
    # 3. get_related_documentation — vector search over free-form docs
    # 4. get_sql_prompt — assembles context into a prompt
    # 5. submit_prompt — sends to LLM, returns SQL string
Each step is an abstract method, which makes the whole thing composable: you mix and match LLM backends (OpenAI, Anthropic, Ollama, Gemini, Bedrock) with vector stores (ChromaDB, Pinecone, Qdrant, pgvector, FAISS) by subclassing. There are 25+ integrations in src/vanna/legacy/.
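To make the composition pattern concrete, here's a toy sketch with stand-in classes (FakeVannaBase, InMemoryVectorStore, EchoLlm are my illustrative names, not Vanna's real ones): the base declares the abstract steps, each backend mixin supplies its slice, and multiple inheritance snaps them together.

```python
from abc import ABC, abstractmethod

class FakeVannaBase(ABC):
    # base declares the abstract steps of the pipeline
    @abstractmethod
    def get_similar_question_sql(self, question): ...

    @abstractmethod
    def submit_prompt(self, prompt): ...

    def generate_sql(self, question):
        # the concrete pipeline chains whatever the mixins provide
        examples = self.get_similar_question_sql(question)
        prompt = f"Examples: {examples}\nQuestion: {question}"
        return self.submit_prompt(prompt)

class InMemoryVectorStore:
    # stands in for a ChromaDB/Pinecone/pgvector integration
    def get_similar_question_sql(self, question):
        return [("What are monthly sales?", "SELECT ...")]

class EchoLlm:
    # stands in for an OpenAI/Anthropic/Ollama integration
    def submit_prompt(self, prompt):
        return "SELECT 1"

class MyVanna(InMemoryVectorStore, EchoLlm, FakeVannaBase):
    pass  # all abstract methods satisfied via the mixins

vn = MyVanna()
print(vn.generate_sql("top customers?"))  # → SELECT 1
```

Swapping the vector store or LLM means swapping one base class; the pipeline in generate_sql never changes.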
The 2.0 rewrite in src/vanna/core/ keeps that retrieval idea but restructures around an Agent class with a ToolRegistry. Tools are typed with Pydantic generics:
class Tool(ABC, Generic[T]):
    @property
    def access_groups(self) -> List[str]:
        return []  # empty = accessible to all

    @abstractmethod
    def get_args_schema(self) -> Type[T]: ...

    @abstractmethod
    async def execute(self, context: ToolContext, args: T) -> ToolResult: ...

    def get_schema(self) -> ToolSchema:
        # auto-generates a JSON schema from the Pydantic args model for LLM tool-calling
        args_model = self.get_args_schema()
        schema = args_model.model_json_schema()
        return ToolSchema(name=self.name, description=self.description,
                          parameters=schema, access_groups=self.access_groups)
The access_groups property is where permissions enter. When you register RunSqlTool with access_groups=["read_sales"], the ToolRegistry.get_schemas(user) call filters tools based on the user's group_memberships. The LLM never sees tools the user can't execute — it's not a runtime check after the fact, it's excluded from the tool list entirely.
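A minimal sketch of that filtering logic, using hypothetical stand-in classes rather than Vanna's actual ones: a tool with empty access_groups is visible to everyone, otherwise the user needs at least one overlapping group.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FakeTool:
    name: str
    access_groups: List[str] = field(default_factory=list)

@dataclass
class FakeUser:
    group_memberships: List[str] = field(default_factory=list)

class FakeRegistry:
    def __init__(self):
        self.tools: List[FakeTool] = []

    def register(self, tool: FakeTool):
        self.tools.append(tool)

    def get_schemas(self, user: FakeUser) -> List[str]:
        # the LLM only ever sees tool names that survive this filter
        return [t.name for t in self.tools
                if not t.access_groups
                or set(t.access_groups) & set(user.group_memberships)]

reg = FakeRegistry()
reg.register(FakeTool("run_sql", access_groups=["read_sales"]))
reg.register(FakeTool("visualize_data"))  # no groups: visible to all

analyst = FakeUser(group_memberships=["read_sales"])
viewer = FakeUser()
print(reg.get_schemas(analyst))  # → ['run_sql', 'visualize_data']
print(reg.get_schemas(viewer))   # → ['visualize_data']
```

Filtering at schema-generation time rather than execution time is the key design choice: an unauthorized tool call can't even be attempted, because the model never learns the tool exists.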
The RunSqlTool in src/vanna/tools/run_sql.py does something subtle with result handling: for large SELECT results it truncates the CSV preview to 1000 characters and appends an all-caps instruction to the LLM to skip summarizing and call VISUALIZE_DATA instead. A bit of prompt engineering baked into the tool return value, which keeps the agent from generating verbose text about data it can't fully see.
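The shape of that trick, sketched with assumed constants and wording (the real run_sql.py phrasing and limit handling may differ):

```python
PREVIEW_LIMIT = 1000  # assumed truncation threshold from the description above

def build_tool_result(csv_text: str) -> str:
    """Return the CSV as-is if small, else a truncated preview plus a
    steering instruction for the LLM (illustrative wording)."""
    if len(csv_text) <= PREVIEW_LIMIT:
        return csv_text
    preview = csv_text[:PREVIEW_LIMIT]
    return (preview
            + "\n[TRUNCATED] DO NOT SUMMARIZE THESE ROWS. "
              "CALL VISUALIZE_DATA TO PRESENT THE FULL RESULT.")

short = "id,total\n1,9.99"
long = "id,total\n" + "\n".join(f"{i},{i * 1.5}" for i in range(500))
print(build_tool_result(short) == short)             # → True
print("VISUALIZE_DATA" in build_tool_result(long))   # → True
```

Because the instruction travels inside the tool's return value, it lands in the exact conversational turn where the model decides what to do next, with no extra system-prompt plumbing.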
The LegacyVannaAdapter in src/vanna/legacy/adapter.py wraps old VannaBase instances as a ToolRegistry, so 0.x setups get the new web UI without a full migration. The migration path is pragmatic: it bridges both worlds rather than forcing a rewrite.
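The adapter pattern here is worth a toy sketch (class and method names below are illustrative, not the real LegacyVannaAdapter API): wrap a legacy object's generate_sql so the new agent can invoke the old RAG pipeline as just another tool.

```python
class LegacyVanna:
    # stands in for a 0.x VannaBase subclass with its RAG pipeline intact
    def generate_sql(self, question: str) -> str:
        return f"SELECT * FROM orders /* for: {question} */"

class LegacyAdapterTool:
    name = "ask_legacy_vanna"

    def __init__(self, legacy: LegacyVanna):
        self.legacy = legacy

    def execute(self, question: str) -> str:
        # delegate to the old pipeline unchanged; the agent just sees a tool
        return self.legacy.generate_sql(question)

tool = LegacyAdapterTool(LegacyVanna())
print(tool.execute("monthly sales"))
```

The payoff is that a 0.x deployment keeps its trained vector store and retrieval behavior while the agent runtime, permissions, and web UI are layered on top.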
Using it
The minimal FastAPI setup:
from fastapi import FastAPI
from vanna import Agent
from vanna.servers.fastapi.routes import register_chat_routes
from vanna.integrations.anthropic import AnthropicLlmService
from vanna.tools import RunSqlTool
from vanna.integrations.sqlite import SqliteRunner
from vanna.core.registry import ToolRegistry

app = FastAPI()
llm = AnthropicLlmService(model="claude-sonnet-4-5")
tools = ToolRegistry()
tools.register(RunSqlTool(sql_runner=SqliteRunner("./data.db")))
agent = Agent(llm_service=llm, tool_registry=tools)
register_chat_routes(app, agent)
Then drop the web component anywhere:
<script src="https://img.vanna.ai/vanna-components.js"></script>
<vanna-chat sse-endpoint="/api/vanna/v2/chat_sse" theme="dark"></vanna-chat>
The SSE endpoint streams structured UI components — tables, charts, status updates — not just text. The web component renders them.
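I haven't checked the actual wire format, but a stream of structured components over SSE plausibly looks something like this sketch (event shapes are my guesses; the real protocol is whatever vanna-components.js parses):

```python
import json

def sse_frame(payload: dict) -> str:
    # Server-Sent Events framing: a "data:" line terminated by a blank line
    return f"data: {json.dumps(payload)}\n\n"

# hypothetical component events: status update, table, then chart
events = [
    {"type": "status", "text": "Running SQL..."},
    {"type": "table", "columns": ["month", "sales"], "rows": [["Jan", 120]]},
    {"type": "chart", "spec": {"kind": "bar", "x": "month", "y": "sales"}},
]

stream = "".join(sse_frame(e) for e in events)
print(stream.count("data: "))  # → 3
```

Streaming typed events rather than raw text is what lets the client render a live table or chart mid-response instead of waiting for a final blob.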
For the legacy RAG-based workflow, training still works via:
vn.train(ddl="CREATE TABLE orders (...)")
vn.train(question="What are monthly sales?", sql="SELECT ...")
vn.ask("What are the top customers by revenue?")
Rough edges
The 2.0 README is a complete rewrite with no mention of how the RAG retrieval integrates into the new agent architecture. It's unclear whether AgentMemory in src/vanna/capabilities/agent_memory.py replaces the vector store retrieval or sits alongside it. The original VannaBase.generate_sql pipeline is entirely absent from the new docs.
Test coverage is uneven. The tests/ directory has 14 files, but most target specific integrations (ChromaDB, Gemini, Ollama) rather than the core agent loop. test_agents.py and test_workflow.py exist but aren't comprehensive end-to-end tests.
The pyproject.toml [all] extra pulls in 20+ packages, including heavy ML dependencies like transformers and pymilvus[model]. The base install is lean (pydantic, pandas, httpx), but newcomers who pip install vanna[all] will wait a while.
The v2 branch exists separately from main in the repo — as of starring, the 2.0 README is on main, but the underlying code structure suggests migration work is still ongoing.
Bottom line
If you're building a data Q&A interface and want something with real permission semantics rather than a thin wrapper around an LLM prompt, Vanna 2.0's tool-registry approach is worth the integration cost. The legacy 0.x path still works for quick prototypes where you just want vn.ask() against an existing database.
