- Status: Proposed
- Date: 2026-01-20
- Decision Makers: Ruvector Architecture Team
- Technical Area: LLM Capabilities / Agent Framework Integration
RuvLLM currently provides text generation capabilities but lacks structured function calling (tool use) support, which is essential for integration with modern agent frameworks like LangChain, LlamaIndex, CrewAI, and AutoGPT. Function calling enables models to interact with external tools, APIs, and databases in a structured, type-safe manner.
RuvLLM's generation API currently has the following limitations:
- Text-in, text-out generation only
- No structured output parsing
- No tool/function definition support
- Manual prompt engineering required for tool interactions
- No support for multi-turn tool conversations
- Agent Framework Integration: Popular frameworks expect OpenAI-compatible function calling APIs
- Structured Outputs: Models need to generate valid JSON function calls, not freeform text
- Multi-Turn Conversations: Tool results must be fed back to the model for reasoning
- Parallel Tool Calls: Efficient agents need to call multiple tools simultaneously
- Model Format Compatibility: Different models (Llama, Mistral, Qwen) use different tool calling formats
- Tool Definitions: JSON Schema-based function signatures
- Tool Choice Control: Auto, none, required, or specific function selection
- Parallel Calls: Multiple function calls in a single response
- Result Integration: Feeding tool outputs back to the model
- Type Safety: Validate function arguments against schemas
- OpenAI API Compatible: Drop-in replacement for OpenAI function calling
- Anthropic Tool Use: Map to Anthropic's tool_use format
- Framework Integration: Direct support for LangChain, LlamaIndex, CrewAI
- Model Agnostic: Work across Llama 3.1+, Mistral, Qwen, custom models
- Constrained Generation: Force valid JSON output via logit masking
- Low Latency: <10ms overhead for tool call parsing
- Streaming Support: Stream tool calls as they're generated
- Batching: Process multiple tool calls efficiently
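The Type Safety requirement above can be illustrated with a minimal sketch of argument validation. The `Arg`, `Kind`, `ParamSpec`, and `validate_args` names are hypothetical simplifications introduced for illustration (a real implementation would validate `serde_json::Value` arguments against full JSON Schema); the sketch only checks required parameters, primitive types, and unknown names.

```rust
use std::collections::HashMap;

/// Simplified argument value (assumption: stand-in for a full JSON value).
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Str(String),
    Num(f64),
    Bool(bool),
}

/// Simplified parameter type (assumption: stand-in for a JSON Schema `type`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Str,
    Num,
    Bool,
}

/// Hypothetical per-parameter specification.
struct ParamSpec {
    kind: Kind,
    required: bool,
}

fn kind_of(arg: &Arg) -> Kind {
    match arg {
        Arg::Str(_) => Kind::Str,
        Arg::Num(_) => Kind::Num,
        Arg::Bool(_) => Kind::Bool,
    }
}

/// Check arguments against the spec and return every problem found,
/// sorted so the output order is stable.
fn validate_args(
    spec: &HashMap<String, ParamSpec>,
    args: &HashMap<String, Arg>,
) -> Result<(), Vec<String>> {
    let mut errors = Vec::new();
    for (name, param) in spec {
        match args.get(name) {
            None if param.required => {
                errors.push(format!("missing required parameter: {name}"))
            }
            Some(arg) if kind_of(arg) != param.kind => {
                errors.push(format!("wrong type for parameter: {name}"))
            }
            _ => {}
        }
    }
    for name in args.keys() {
        if !spec.contains_key(name) {
            errors.push(format!("unknown parameter: {name}"));
        }
    }
    errors.sort();
    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}
```

Collecting all errors rather than stopping at the first one is a design choice here: it would let an agent loop report every problem back to the model in a single tool-result message.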
Option A (Prompt Engineering Only): Use structured prompts to request tool calls in JSON format, then parse the output with regex or JSON parsers.
Pros:
- No core changes to generation logic
- Works with any model
- Simple implementation
Cons:
- Unreliable: models may generate invalid JSON
- No type safety guarantees
- Poor support for parallel tool calls
- Requires extensive prompt tuning per model
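To make the reliability concern concrete, the parsing half of this option might look like the sketch below. `extract_first_json_object` is a hypothetical helper, not an existing RuvLLM function. It finds the first brace-balanced span in freeform model output and nothing more, so the returned text may still be invalid JSON or may not match any tool schema.

```rust
/// Hypothetical helper (not part of RuvLLM): return the first balanced
/// `{ ... }` span in freeform model output, or None if there is none.
/// Only brace balance is checked; the span may still be invalid JSON.
fn extract_first_json_object(output: &str) -> Option<&str> {
    let start = output.find('{')?;
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (offset, ch) in output[start..].char_indices() {
        if in_string {
            // Inside a string literal, braces do not count.
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    // `}` is one byte wide, so the span ends at offset + 1.
                    return Some(&output[start..start + offset + 1]);
                }
            }
            _ => {}
        }
    }
    // Unbalanced braces: the model stopped mid-object.
    None
}
```

A truncated generation such as `{"name": "get_weather"` yields `None`, which is the failure mode that prompt-only approaches have to retry around.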
Option B (Constrained Generation): Implement constrained decoding using formal grammars (GBNF, JSON Schema) to force valid tool calls.
Pros:
- Guarantees valid JSON output
- Type-safe by construction
- Works across model architectures
- Best reliability for production
Cons:
- Complex implementation (logit masking)
- Requires grammar compiler
- Potential performance overhead
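The core of logit masking can be sketched in miniature. This is an illustration under strong simplifying assumptions, not the proposed implementation: the "grammar" is just a fixed list of allowed tool names, and the vocabulary is a handful of string tokens. The helper names `token_is_valid`, `mask_logits`, and `argmax` are hypothetical. A token survives only if appending it keeps the output a prefix of some allowed string.

```rust
/// Return true if `partial + token` is still a prefix of some allowed string.
fn token_is_valid(partial: &str, token: &str, allowed: &[&str]) -> bool {
    let candidate = format!("{partial}{token}");
    allowed.iter().any(|full| full.starts_with(candidate.as_str()))
}

/// Mask logits in place: tokens that would leave the allowed set get -inf.
fn mask_logits(logits: &mut [f32], vocab: &[&str], partial: &str, allowed: &[&str]) {
    for (logit, token) in logits.iter_mut().zip(vocab) {
        if !token_is_valid(partial, token, allowed) {
            *logit = f32::NEG_INFINITY;
        }
    }
}

/// Greedy pick of the highest logit that was not masked out.
fn argmax(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .filter(|(_, l)| l.is_finite())
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

With vocabulary `["get_", "weather", "search", "_flights", "hello"]` and allowed names `get_weather` and `search_flights`, the token `hello` is masked at every step even if the model assigns it the highest raw logit. A real grammar engine replaces the prefix check with a parser state, but the masking step has the same shape.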
Option C (Chat Templates): Leverage each model family's native tool calling format via chat templates.
Pros:
- Optimal for models with native tool support (Llama 3.1+, Mistral)
- Minimal overhead
- Leverages model training
Cons:
- Fragmented implementation across models
- No support for models without native tool calling
- Template maintenance burden
Chosen Option: Hybrid Approach - Option B (Constrained Generation) + Option C (Chat Templates)
Implement constrained generation with grammar-based validation as the foundation, with chat template optimizations for models with native tool calling support.
- Reliability First: Constrained generation guarantees valid outputs for critical production use cases
- Performance Optimization: Chat templates optimize for models with native support (Llama 3.1+, Mistral)
- Universal Compatibility: Fallback to constrained generation for any model
- Future-Proof: New models can be added via chat templates without core changes
use serde::{Deserialize, Serialize};
/// Tool/function definition for function calling
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ToolDefinition {
/// Function name (must be valid identifier)
pub name: String,
/// Human-readable description for the model
pub description: String,
/// JSON Schema for function parameters
pub parameters: JsonSchema,
/// Required parameter names
#[serde(default)]
pub required: Vec<String>,
}
/// JSON Schema representation
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct JsonSchema {
#[serde(rename = "type")]
pub schema_type: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub properties: Option<std::collections::HashMap<String, JsonSchema>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub items: Option<Box<JsonSchema>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub description: Option<String>,
/// Allowed values; serialized under the JSON Schema keyword "enum"
#[serde(rename = "enum", skip_serializing_if = "Option::is_none")]
pub enum_values: Option<Vec<String>>,
}
/// Tool choice mode for generation
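#[derive(Debug, Clone, Default, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum ToolChoice {
/// Model decides whether to call tools (default, needed for #[serde(default)])
#[default]
Auto,
/// Model must not call any tools
None,
/// Model must call at least one tool
Required,
/// Model must call this specific function.
/// The OpenAI wire format expresses this as
/// {"type": "function", "function": {"name": "..."}}, so an
/// OpenAI-compatible endpoint needs a custom (de)serializer for this variant.
Specific(String),
}
/// Request with tool calling support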
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ToolCallRequest {
/// User message/prompt
pub messages: Vec<ChatMessage>,
/// Available tools/functions
#[serde(default)]
pub tools: Vec<ToolDefinition>,
/// Tool choice mode
#[serde(default)]
pub tool_choice: ToolChoice,
/// Enable parallel tool calls (default: true)
#[serde(default = "default_true")]
pub parallel_tool_calls: bool,
/// Standard generation parameters
#[serde(flatten)]
pub params: GenerateParams,
}
/// Tool call in model response
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ToolCall {
/// Unique identifier for this tool call
pub id: String,
/// Type (always "function" for now)
#[serde(rename = "type")]
pub call_type: String,
/// Function call details
pub function: FunctionCall,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FunctionCall {
/// Function name (must match a tool definition)
pub name: String,
/// Function arguments as a parsed JSON value
/// (the OpenAI wire format carries them as a JSON-encoded string)
pub arguments: serde_json::Value,
}
/// Chat message with tool call support
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ChatMessage {
/// Role: system, user, assistant, tool
pub role: String,
/// Text content
#[serde(skip_serializing_if = "Option::is_none")]
pub content: Option<String>,
/// Tool calls (for assistant messages)
#[serde(skip_serializing_if = "Option::is_none")]
pub tool_calls: Option<Vec<ToolCall>>,
/// Tool call ID (for tool result messages)
#[serde(skip_serializing_if = "Option::is_none")]
pub tool_call_id: Option<String>,
}
fn default_true() -> bool { true }

Different models require different formatting for tool calling:
/// Chat template for tool calling
pub trait ToolCallingTemplate {
/// Format messages with tool definitions
fn format_with_tools(
&self,
messages: &[ChatMessage],
tools: &[ToolDefinition],
tool_choice: &ToolChoice,
) -> Result<String>;
/// Parse tool calls from model output
fn parse_tool_calls(&self, output: &str) -> Result<Vec<ToolCall>>;
/// Check if model has native tool calling support
fn has_native_support(&self) -> bool;
}
/// Llama 3.1+ tool calling format
pub struct Llama31ToolTemplate;
impl ToolCallingTemplate for Llama31ToolTemplate {
fn format_with_tools(
&self,
messages: &[ChatMessage],
tools: &[ToolDefinition],
tool_choice: &ToolChoice,
) -> Result<String> {
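// Llama 3.1 ends each prompt message with <|eot_id|>; <|python_tag|> and
// <|eom_id|> appear in model output around tool calls, not in tool definitions
let mut prompt = String::from("<|begin_of_text|>");
// Add tool definitions as JSON inside the system message
prompt.push_str("<|start_header_id|>system<|end_header_id|>\n\n");
prompt.push_str("Available tools:\n");
for tool in tools {
prompt.push_str(&serde_json::to_string_pretty(tool)?);
prompt.push('\n');
}
prompt.push_str("<|eot_id|>");
// Add conversation history (Llama 3.1 uses the ipython role for tool results)
for msg in messages {
let role = if msg.role == "tool" { "ipython" } else { msg.role.as_str() };
prompt.push_str(&format!(
"<|start_header_id|>{}<|end_header_id|>\n\n{}<|eot_id|>",
role,
msg.content.as_deref().unwrap_or("")
));
}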
// Start assistant response
prompt.push_str("<|start_header_id|>assistant<|end_header_id|>\n\n");
Ok(prompt)
}
fn parse_tool_calls(&self, output: &str) -> Result<Vec<ToolCall>> {
// Parse <|python_tag|>{"name": "...", "arguments": {...}}<|eom_id|>
// Implementation details omitted for brevity
todo!("Parse Llama 3.1 tool call format")
}
fn has_native_support(&self) -> bool { true }
}
/// Mistral tool calling format
pub struct MistralToolTemplate;
impl ToolCallingTemplate for MistralToolTemplate {
fn format_with_tools(
&self,
messages: &[ChatMessage],
tools: &[ToolDefinition],
tool_choice: &ToolChoice,
) -> Result<String> {
// Mistral uses [AVAILABLE_TOOLS] and [/AVAILABLE_TOOLS] markers
let mut prompt = String::new();
prompt.push_str("[AVAILABLE_TOOLS]\n");
prompt.push_str(&serde_json::to_string(tools)?);
prompt.push_str("\n[/AVAILABLE_TOOLS]\n\n");
// Add conversation
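for msg in messages {
let content = msg.content.as_deref().unwrap_or("");
match msg.role.as_str() {
// Only user turns are wrapped in instruction markers
"user" => prompt.push_str(&format!("[INST] {} [/INST]\n", content)),
_ => prompt.push_str(&format!("{}\n", content)),
}
}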
Ok(prompt)
}
fn parse_tool_calls(&self, output: &str) -> Result<Vec<ToolCall>> {
// Parse the JSON array of calls that follows the [TOOL_CALLS] marker
todo!("Parse Mistral tool call format")
}
fn has_native_support(&self) -> bool { true }
}
/// Qwen tool calling format
pub struct QwenToolTemplate;
/// Generic XML-based format for models without native support
pub struct GenericXmlToolTemplate;
impl ToolCallingTemplate for GenericXmlToolTemplate {
fn format_with_tools(
&self,
messages: &[ChatMessage],
tools: &[ToolDefinition],
tool_choice: &ToolChoice,
) -> Result<String> {
// Generic format using XML tags
let mut prompt = String::from(
"You have access to the following tools. To use a tool, respond with:\n\
<tool_call>\n\
<name>function_name</name>\n\
<arguments>{\"arg1\": \"value1\"}</arguments>\n\
</tool_call>\n\n"
);
prompt.push_str("Available tools:\n");
for tool in tools {
prompt.push_str(&format!("- {}: {}\n", tool.name, tool.description));
prompt.push_str(&format!(" Parameters: {}\n",
serde_json::to_string(&tool.parameters)?));
}
prompt.push_str("\n");
// Add conversation
for msg in messages {
prompt.push_str(&format!("{}: {}\n", msg.role, msg.content.as_deref().unwrap_or("")));
}
Ok(prompt)
}
fn parse_tool_calls(&self, output: &str) -> Result<Vec<ToolCall>> {
// Parse <tool_call>...</tool_call> blocks
use regex::Regex;
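// (?s) lets `.` span newlines; the lazy `.*?` keeps JSON arguments that contain `<` intact
let re = Regex::new(
r"(?s)<tool_call>\s*<name>\s*([^<\s]+)\s*</name>\s*<arguments>(.*?)</arguments>\s*</tool_call>"
)?;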
let mut calls = Vec::new();
for cap in re.captures_iter(output) {
calls.push(ToolCall {
id: uuid::Uuid::new_v4().to_string(),
call_type: "function".to_string(),
function: FunctionCall {
name: cap[1].to_string(),
arguments: serde_json::from_str(&cap[2])?,
},
});
}
Ok(calls)
}
fn has_native_support(&self) -> bool { false }
}

For guaranteed valid JSON output, implement constrained decoding:
use serde_json::Value as JsonValue;
/// Constrained generation for tool calls
pub struct ConstrainedToolGenerator {
/// JSON Schema grammar compiler
grammar_compiler: GrammarCompiler,
/// Logit processor for constraint enforcement
logit_processor: LogitProcessor,
}
impl ConstrainedToolGenerator {
/// Generate tool calls with grammar constraints
pub fn generate_tool_calls(
&self,
model: &dyn LlmBackend,
prompt: &str,
tools: &[ToolDefinition],
params: GenerateParams,
) -> Result<Vec<ToolCall>> {
// Compile JSON Schema to GBNF grammar
let grammar = self.compile_tool_grammar(tools)?;
// Generate with logit masking to enforce grammar
let output = model.generate_constrained(prompt, &grammar, params)?;
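// Parse the grammar-constrained JSON: an array of {"name", "arguments"} objects
let functions: Vec<FunctionCall> = serde_json::from_str(&output)?;
// Attach an id and type to each function call to build the response objects
let calls = functions
.into_iter()
.map(|function| ToolCall {
id: uuid::Uuid::new_v4().to_string(),
call_type: "function".to_string(),
function,
})
.collect();
Ok(calls)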
}
/// Compile JSON Schema into GBNF grammar
fn compile_tool_grammar(&self, tools: &[ToolDefinition]) -> Result<Grammar> {
// Build grammar that only allows valid tool calls
// Example: tool_call ::= "{" ws "\"name\"" ws ":" ws name ws "," ws "\"arguments\"" ws ":" ws arguments ws "}"
// name ::= "\"tool1\"" | "\"tool2\"" | ...
// arguments ::= { schema-specific grammar }
self.grammar_compiler.compile_tool_schema(tools)
}
}
/// GBNF (GGML BNF) grammar for constrained generation
#[derive(Debug, Clone)]
pub struct Grammar {
/// Grammar rules in GBNF format
pub rules: String,
}
/// Logit processor for grammar enforcement
pub struct LogitProcessor {
/// Current parse state
state: ParseState,
}
impl LogitProcessor {
/// Mask logits to only allow valid next tokens
pub fn process_logits(
&mut self,
logits: &mut [f32],
grammar: &Grammar,
tokenizer: &Tokenizer,
) -> Result<()> {
// Get valid next tokens from grammar state
let valid_tokens = self.state.get_valid_next_tokens(grammar)?;
// Mask out invalid tokens (set logit to -inf)
for (token_id, logit) in logits.iter_mut().enumerate() {
if !valid_tokens.contains(&(token_id as u32)) {
*logit = f32::NEG_INFINITY;
}
}
Ok(())
}
}
#[derive(Debug)]
struct ParseState {
/// Current position in grammar
position: usize,
/// Parse stack for nested structures
stack: Vec<String>,
}

Support iterative tool use:
/// Multi-turn conversation with tool calls
pub struct ToolConversation {
/// Conversation history
messages: Vec<ChatMessage>,
/// Available tools
tools: Vec<ToolDefinition>,
/// Backend for generation
backend: Box<dyn LlmBackend>,
}
impl ToolConversation {
/// Add user message and generate response (may include tool calls)
pub fn send_message(&mut self, content: &str) -> Result<ConversationTurn> {
// Add user message
self.messages.push(ChatMessage {
role: "user".to_string(),
content: Some(content.to_string()),
tool_calls: None,
tool_call_id: None,
});
self.generate_turn()
}
/// Submit tool results and continue conversation
pub fn submit_tool_results(&mut self, results: Vec<ToolResult>) -> Result<ConversationTurn> {
// Add tool result messages
for result in results {
self.messages.push(ChatMessage {
role: "tool".to_string(),
content: Some(result.output),
tool_calls: None,
tool_call_id: Some(result.tool_call_id),
});
}
// Continue from the tool results without appending an empty user message
self.generate_turn()
}
/// Generate the next assistant turn from the current history
fn generate_turn(&mut self) -> Result<ConversationTurn> {
let request = ToolCallRequest {
messages: self.messages.clone(),
tools: self.tools.clone(),
tool_choice: ToolChoice::Auto,
parallel_tool_calls: true,
params: GenerateParams::default(),
};
let response = self.backend.generate_with_tools(request)?;
// Add assistant response to history
self.messages.push(ChatMessage {
role: "assistant".to_string(),
content: response.content.clone(),
tool_calls: response.tool_calls.clone(),
tool_call_id: None,
});
Ok(ConversationTurn {
content: response.content,
tool_calls: response.tool_calls,
})
}
}
#[derive(Debug, Clone)]
pub struct ConversationTurn {
/// Text content
pub content: Option<String>,
/// Tool calls (if any)
pub tool_calls: Option<Vec<ToolCall>>,
}
#[derive(Debug, Clone)]
pub struct ToolResult {
/// Tool call ID this result corresponds to
pub tool_call_id: String,
/// Tool output (JSON or text)
pub output: String,
}

- Define Tool Schema Types
  - Implement ToolDefinition, ToolCall, ToolChoice types
  - Add JSON Schema validation
  - Create builder APIs for ergonomic tool definitions
- Chat Template Integration
  - Implement ToolCallingTemplate trait
  - Add Llama 3.1, Mistral, Qwen templates
  - Create generic XML fallback template
- Request/Response API
  - Extend LlmBackend with generate_with_tools method
  - Add tool call parsing logic
  - Implement OpenAI-compatible API surface
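A builder API for tool definitions could take roughly the following shape. This is a sketch only: `ToolDef` and `ParamType` are simplified, hypothetical stand-ins for the ToolDefinition and JsonSchema types, and the `build` step with its validation is an assumption rather than part of the proposed API.

```rust
use std::collections::BTreeMap;

/// Simplified parameter type (assumption: stand-in for the JsonSchema type).
#[derive(Debug, Clone, PartialEq)]
enum ParamType {
    Text,
    OneOf(Vec<String>),
}

/// Simplified tool definition (assumption: stand-in for ToolDefinition).
#[derive(Debug, Clone, PartialEq)]
struct ToolDef {
    name: String,
    description: String,
    parameters: BTreeMap<String, ParamType>,
    required: Vec<String>,
}

impl ToolDef {
    fn new(name: &str) -> Self {
        ToolDef {
            name: name.to_string(),
            description: String::new(),
            parameters: BTreeMap::new(),
            required: Vec::new(),
        }
    }

    fn description(mut self, text: &str) -> Self {
        self.description = text.to_string();
        self
    }

    fn parameter(mut self, name: &str, ty: ParamType) -> Self {
        self.parameters.insert(name.to_string(), ty);
        self
    }

    fn required(mut self, names: &[&str]) -> Self {
        self.required = names.iter().map(|n| n.to_string()).collect();
        self
    }

    /// Validate the definition: the name must be non-empty and every
    /// required parameter must have been declared.
    fn build(self) -> Result<Self, String> {
        if self.name.is_empty() {
            return Err("tool name must not be empty".to_string());
        }
        for name in &self.required {
            if !self.parameters.contains_key(name) {
                return Err(format!("required parameter `{name}` is not declared"));
            }
        }
        Ok(self)
    }
}
```

Deferring validation to a final `build` call keeps the chained setters infallible, so a misspelled required parameter surfaces once, at construction time, rather than at generation time.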
Deliverables:
// User-facing API
let tools = vec![
ToolDefinition::new("get_weather")
.description("Get current weather for a location")
.parameter("location", JsonSchema::string())
.parameter("units", JsonSchema::enum_values(&["celsius", "fahrenheit"]))
.required(&["location"])
];
let request = ToolCallRequest {
messages: vec![
ChatMessage::user("What's the weather in San Francisco?")
],
tools,
tool_choice: ToolChoice::Auto,
parallel_tool_calls: true,
params: GenerateParams::default(),
};
let response = backend.generate_with_tools(request)?;
for call in response.tool_calls.unwrap_or_default() {
println!("Tool: {}, Args: {}", call.function.name, call.function.arguments);
}

- Grammar Compiler
  - Implement JSON Schema to GBNF compiler
  - Support nested objects, arrays, enums
  - Add grammar caching for performance
- Logit Processor
  - Implement parse state machine
  - Add logit masking for valid tokens
  - Optimize for streaming generation
- Integration
  - Wire constrained generation to LlmBackend
  - Add fallback logic (native template → constrained generation)
  - Benchmark performance impact
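As a small illustration of the grammar compiler's job, the sketch below emits a single GBNF rule that matches exactly one of a set of tool names as a JSON string. `enum_rule` is a hypothetical helper; to sidestep escaping, it assumes names are plain identifiers (ASCII letters, digits, `_`, `-`) and returns `None` otherwise. A full compiler would also handle objects, arrays, and string escaping.

```rust
/// Build a GBNF rule matching exactly one of `values` as a JSON string
/// literal. Returns None for an empty list or for values that are not
/// simple identifiers (assumption made here to avoid escaping logic).
fn enum_rule(rule_name: &str, values: &[&str]) -> Option<String> {
    if values.is_empty() {
        return None;
    }
    let mut alternatives = Vec::with_capacity(values.len());
    for value in values {
        let simple = !value.is_empty()
            && value
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
        if !simple {
            return None;
        }
        // In GBNF, the JSON text "get_weather" (quotes included) is
        // written as the literal "\"get_weather\"".
        alternatives.push(format!("\"\\\"{value}\\\"\""));
    }
    Some(format!("{rule_name} ::= {}", alternatives.join(" | ")))
}
```

For the tools `get_weather` and `search_flights` this produces the rule `name ::= "\"get_weather\"" | "\"search_flights\""`, so the model cannot emit a tool name that was never defined.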
Deliverables:
// Constrained generation ensures valid JSON
let generator = ConstrainedToolGenerator::new();
let calls = generator.generate_tool_calls(
&backend,
&prompt,
&tools,
params,
)?;
// Guaranteed to parse successfully
assert!(calls.iter().all(|c| tools.iter().any(|t| t.name == c.function.name)));

- Conversation Manager
  - Implement ToolConversation for stateful interactions
  - Add automatic tool result integration
  - Support parallel tool call orchestration
- Agent Framework Integration
  - LangChain adapter
  - LlamaIndex integration
  - CrewAI support
- Examples and Documentation
  - Multi-turn conversation examples
  - Agent framework integration guides
  - Performance tuning documentation
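Parallel tool call orchestration could be sketched with scoped threads from the standard library. The `PendingCall` and `CallResult` types and the `run_parallel` function are hypothetical simplifications of ToolCall and ToolResult; an async runtime or a thread pool may be a better fit in practice, and this sketch spawns one thread per call with no concurrency limit.

```rust
use std::thread;

/// Simplified pending tool call (assumption: stand-in for ToolCall).
#[derive(Debug, Clone)]
struct PendingCall {
    id: String,
    name: String,
    arguments: String,
}

/// Simplified tool result (assumption: stand-in for ToolResult).
#[derive(Debug, Clone, PartialEq)]
struct CallResult {
    tool_call_id: String,
    output: String,
}

/// Run every call on its own scoped thread and return the results in the
/// same order as the input calls.
fn run_parallel<F>(calls: &[PendingCall], execute: F) -> Vec<CallResult>
where
    F: Fn(&PendingCall) -> String + Sync,
{
    thread::scope(|scope| {
        // Spawn all threads first so the calls actually overlap.
        let handles: Vec<_> = calls
            .iter()
            .map(|call| {
                let execute = &execute;
                scope.spawn(move || CallResult {
                    tool_call_id: call.id.clone(),
                    output: execute(call),
                })
            })
            .collect();
        // Joining in spawn order preserves the input order of results.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("tool thread panicked"))
            .collect()
    })
}
```

Each result keeps its `tool_call_id`, so the outputs can be submitted back to the conversation in one batch and matched to the calls that produced them.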
Deliverables:
// Multi-turn conversation with tool use
let mut conv = ToolConversation::new(backend, tools);
let turn1 = conv.send_message("Book a flight to NYC")?;
// Model calls search_flights(destination="NYC")
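// tool_calls is an Option, so take the first call explicitly
let call = turn1
.tool_calls
.as_ref()
.and_then(|calls| calls.first())
.expect("model should request a tool call");
let results = vec![ToolResult {
tool_call_id: call.id.clone(),
output: r#"{"flights": [{"price": 250, "time": "10am"}]}"#.to_string(),
}];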
let turn2 = conv.submit_tool_results(results)?;
// Model responds with flight options

| API Style | Planned RuvLLM Support | Notes |
|---|---|---|
| OpenAI Function Calling | ✅ Full | Drop-in replacement for functions and tools parameters |
| Anthropic Tool Use | ✅ Full | Map tool_use blocks to OpenAI format |
| LangChain Tools | ✅ Full | Direct integration via BaseTool adapter |
| LlamaIndex Tools | ✅ Full | Implement BaseToolSpec interface |
| CrewAI Tools | ✅ Full | Compatible with Tool decorator |
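The Anthropic mapping in the table above can be sketched as follows. The struct and function names are hypothetical, and JSON is built by hand only to keep the example dependency-free (the real implementation would presumably use serde). The field layout follows the two public wire formats: an Anthropic `tool_use` block carries `id`, `name`, and an `input` object, while an OpenAI tool call nests `name` and a JSON-encoded `arguments` string under `function`.

```rust
/// Anthropic-style block: {"type":"tool_use","id":..,"name":..,"input":{..}}
struct AnthropicToolUse {
    id: String,
    name: String,
    /// The `input` object, kept as raw JSON text in this sketch.
    input_json: String,
}

/// OpenAI-style call: {"id":..,"type":"function","function":{"name":..,"arguments":".."}}
struct OpenAiToolCall {
    id: String,
    name: String,
    /// OpenAI carries arguments as a JSON-encoded string.
    arguments_json: String,
}

fn to_openai(block: AnthropicToolUse) -> OpenAiToolCall {
    OpenAiToolCall {
        id: block.id,
        name: block.name,
        arguments_json: block.input_json,
    }
}

/// Quote and escape a string as a JSON string literal.
fn json_quote(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Render the call in the OpenAI wire shape.
fn render_openai(call: &OpenAiToolCall) -> String {
    format!(
        "{{\"id\":{},\"type\":\"function\",\"function\":{{\"name\":{},\"arguments\":{}}}}}",
        json_quote(&call.id),
        json_quote(&call.name),
        json_quote(&call.arguments_json)
    )
}
```

The main structural difference the adapter has to absorb is that `input` is a JSON object on the Anthropic side and a string containing JSON on the OpenAI side, which is why `render_openai` quotes the arguments a second time.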
| Model Family | Native Support | Template | Constrained Fallback |
|---|---|---|---|
| Llama 3.1+ | ✅ Yes | Llama31ToolTemplate | ✅ |
| Llama 3.0 and earlier | ❌ No | GenericXmlToolTemplate | ✅ |
| Mistral 7B+ | ✅ Yes | MistralToolTemplate | ✅ |
| Qwen 2.5+ | ✅ Yes | QwenToolTemplate | ✅ |
| CodeLlama | ❌ No | GenericXmlToolTemplate | ✅ |
| Custom Models | ❌ No | GenericXmlToolTemplate | ✅ |
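The matrix above can be read as a lookup from model family and version to a template. The sketch below is one hedged way to encode it: the `Template` enum and `select_template` function are hypothetical names, passing the version as a `(major, minor)` pair is an assumption, and Qwen versions below 2.5 are routed to the generic template because the matrix does not list them.

```rust
/// Hypothetical template selector mirroring the support matrix.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Template {
    Llama31,
    Mistral,
    Qwen,
    GenericXml,
}

impl Template {
    /// Only the generic XML template targets models without native support.
    fn has_native_support(self) -> bool {
        self != Template::GenericXml
    }
}

/// Pick a template from a family name and a (major, minor) version.
fn select_template(family: &str, version: (u32, u32)) -> Template {
    match family.to_ascii_lowercase().as_str() {
        "llama" if version >= (3, 1) => Template::Llama31,
        "mistral" => Template::Mistral,
        "qwen" if version >= (2, 5) => Template::Qwen,
        // Llama 3.0 and earlier, CodeLlama, and custom models fall through.
        _ => Template::GenericXml,
    }
}
```

Because every row of the matrix lists a constrained fallback, a caller could use `has_native_support` to decide whether constrained generation is the primary path or only a safety net.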
// LangChain integration example
use langchain_rs::{Tool, ToolInput, ToolOutput};
struct RuvLlmTool {
definition: ToolDefinition,
executor: Box<dyn Fn(JsonValue) -> Result<String>>,
}
impl Tool for RuvLlmTool {
fn name(&self) -> &str {
&self.definition.name
}
fn description(&self) -> &str {
&self.definition.description
}
fn run(&self, input: ToolInput) -> Result<ToolOutput> {
let args = serde_json::to_value(input)?;
let output = (self.executor)(args)?;
Ok(ToolOutput::Text(output))
}
}

| Component | Latency | Notes |
|---|---|---|
| Tool schema compilation | <1ms | Cached after first use |
| Grammar compilation | 5-10ms | Cached per tool set |
| Logit processing (per token) | <0.1ms | Minimal impact on generation |
| JSON parsing | <1ms | Standard serde_json |
| Total overhead | <10ms | Amortized across conversation |
| Component | Memory | Notes |
|---|---|---|
| Tool definitions | ~1KB per tool | Scales with number of tools |
| Grammar cache | ~10KB per tool set | One-time cost |
| Parse state | ~1KB per request | Freed after generation |
| Total overhead | ~10KB + 1KB/tool | Negligible for typical use |
| Method | Tools/sec | Reliability | Use Case |
|---|---|---|---|
| Prompt engineering only | 1000+ | 70-80% | Development/testing |
| Chat template (native) | 800-1000 | 90-95% | Production (supported models) |
| Constrained generation | 200-500 | 99.9%+ | Production (all models), critical systems |
- Agent Framework Integration: Direct compatibility with LangChain, LlamaIndex, CrewAI enables rich agent ecosystems
- Type Safety: JSON Schema validation prevents invalid tool calls at generation time
- Reliability: Constrained generation guarantees valid outputs for production systems
- OpenAI Compatibility: Drop-in replacement for OpenAI API reduces migration friction
- Multi-Tool Agents: Foundation for RAG, web search, database access, and API integration
- Parallel Execution: Multiple tool calls enable efficient multi-step reasoning
- Complexity: Grammar compilation and constrained generation add implementation complexity
- Performance Impact: Logit processing adds 5-10% latency for constrained generation
- Model Requirements: Best performance requires models with native tool calling support
- Testing Burden: Must validate across multiple model families and templates
- Template Maintenance: Each new model family may require new chat template
- Schema Limitations: Complex schemas (recursive types, unions) may be challenging to constrain
- Backward Compatibility: Existing text generation API is unchanged; tool calling is additive
| Risk | Mitigation |
|---|---|
| Invalid JSON output | Constrained generation with grammar enforcement |
| Template incompatibility | Generic XML fallback for unsupported models |
| Performance regression | Benchmark suite, caching, optional constrained mode |
| Schema complexity | Comprehensive test suite with edge cases |
| Framework API changes | Version pinning, adapter pattern for isolation |
Use prompt engineering with regex/JSON parsing.
- Rejected: Unreliable for production; 20-30% failure rate for complex schemas
- Consideration: Useful for prototyping and development
Integrate vLLM or Outlines Python libraries via FFI.
- Rejected: Cross-language complexity, deployment burden, latency overhead
- Consideration: Reference implementation for grammar compilation logic
Create a Rust macro-based DSL for tool definitions.
- Rejected: JSON Schema is industry standard, better tooling support
- Consideration: Could add as syntactic sugar on top of JSON Schema
- ADR-002: RuvLLM Integration with Ruvector (foundation for tool-enhanced RAG)
- ADR-008: mistral-rs Integration (backend for high-performance tool calling)
- ADR-009: Streaming Architecture (streaming tool calls in progress)
- OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
  - Industry-standard API for tool use
  - functions parameter (deprecated) and tools parameter
  - Parallel tool calls and tool choice modes
- Anthropic Tool Use: https://docs.anthropic.com/claude/docs/tool-use
  - Alternative API design with tool_use blocks
  - Computer use (bash, editor) as specialized tools
  - Multi-step tool orchestration patterns
- LangChain Tool Documentation: https://python.langchain.com/docs/modules/agents/tools/
  - Agent framework integration patterns
  - BaseTool interface and tool decorators
  - Tool result schemas
- LlamaIndex Tools: https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/tools/
  - BaseToolSpec interface
  - Function tools and query engine tools
- Constrained Decoding:
  - GBNF (GGML BNF) grammar: https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
  - Outlines (Python): https://github.com/outlines-dev/outlines
  - Guidance (Microsoft): https://github.com/guidance-ai/guidance
- Model-Specific Tool Formats:
  - Llama 3.1 tool use: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1
  - Mistral function calling: https://docs.mistral.ai/capabilities/function_calling/
  - Qwen tools: https://qwen.readthedocs.io/en/latest/framework/function_call.html
| Component | Status | Notes |
|---|---|---|
| Tool schema types | Pending | Define ToolDefinition, ToolCall, ToolChoice |
| JSON Schema validation | Pending | Integrate schemars crate |
| Chat templates | Pending | Llama 3.1, Mistral, Qwen, Generic XML |
| Request/Response API | Pending | generate_with_tools method on LlmBackend |
| Grammar compiler | Pending | JSON Schema → GBNF compiler |
| Logit processor | Pending | Parse state machine and masking logic |
| Constrained generation | Pending | Integration with backend |
| Multi-turn conversations | Pending | ToolConversation manager |
| LangChain integration | Pending | BaseTool adapter |
| LlamaIndex integration | Pending | BaseToolSpec implementation |
| CrewAI support | Pending | Tool decorator compatibility |
| OpenAI API compatibility | Pending | /v1/chat/completions endpoint |
| Anthropic format mapping | Pending | tool_use block conversion |
| Streaming tool calls | Pending | Stream partial JSON as generated |
| Parallel tool execution | Pending | Concurrent tool call orchestration |
| Documentation | Pending | API docs, examples, integration guides |
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-01-20 | Ruvector Architecture Team | Initial proposal |