Getting Started
Welcome to TOON Converter! This guide will help you get up and running with the library.
What is TOON?
TOON (Token-Optimized Object Notation) is a data serialization format designed specifically to reduce token usage in Large Language Model (LLM) applications. It achieves 30-60% token savings compared to JSON while maintaining readability and type safety.
Why Use TOON Converter?
100% Spec Compliance
TOON Converter is the only Python library with 100% TOON v2.0 specification compliance. All 26 official specification tests pass, ensuring correct behavior across all edge cases.
✅ All three root forms (Object, Array, Primitive)
✅ All three array forms (Inline, Tabular, List)
✅ Number canonical form (no exponents, no trailing zeros)
✅ String quoting rules (10+ edge cases)
✅ All delimiters (Comma, Tab, Pipe)
✅ Escape sequences (5 types)
Comprehensive Integrations
Built-in support for 10 popular frameworks:
Data Science: Pandas, SQLAlchemy
AI/LLM: LangChain, LlamaIndex, Haystack, DSPy, Instructor
Web: FastAPI, Pydantic
Protocols: Model Context Protocol (MCP)
Production Ready
50+ test files with 95%+ coverage
100% type hints with mypy strict mode
High performance (<100ms for typical datasets)
Comprehensive documentation and examples
Token Savings Examples
Simple Object
data = {"name": "Alice", "age": 30, "city": "NYC"}
# JSON: 42 characters
# {"name":"Alice","age":30,"city":"NYC"}
# TOON: 28 characters (33% savings)
# name: Alice
# age: 30
# city: NYC
Tabular Data
data = {
"users": [
{"name": "Alice", "age": 30},
{"name": "Bob", "age": 25},
{"name": "Charlie", "age": 35}
]
}
# JSON: ~120 characters
# TOON: ~60 characters (50% savings)
# users[3]{name,age}:
# Alice,30
# Bob,25
# Charlie,35
Use Cases
RAG Systems
Reduce vector database storage and improve retrieval efficiency:
from langchain.schema import Document
from toonverter.integrations import langchain_to_toon
doc = Document(
page_content="Important context...",
metadata={"source": "doc.pdf", "page": 1}
)
toon_str = langchain_to_toon(doc)
# Store in vector DB with 30-60% less space
LLM Prompts
Minimize token usage in context windows:
import toonverter as toon
# Large dataset for LLM prompt
data = get_large_dataset()
# Convert to TOON for minimal tokens
toon_str = toon.encode(data)
prompt = f"Analyze this data:\\n{toon_str}"
API Responses
Efficient data transfer with FastAPI:
from fastapi import FastAPI
from toonverter.integrations import TOONResponse
app = FastAPI()
@app.get("/data", response_class=TOONResponse)
async def get_data():
return {"users": [...], "count": 100}
Data Pipelines
Convert between formats in ETL workflows:
import toonverter as toon
# Load from various formats
data = toon.load('input.json', format='json')
# Process data
processed = process_data(data)
# Save in optimal format
toon.save(processed, 'output.toon', format='toon')
Next Steps
Installation - Install TOON Converter with your preferred integrations
Quick Start - Learn the basic API in 5 minutes
TOON Format Specification v2.0 - Understand the TOON format specification
Basic Usage - See practical examples