Agent-CoreX Documentation

Reduce Token Usage & Optimize Costs

Agent-CoreX typically cuts LLM costs by 30–70% by keeping unnecessary tool schemas out of your LLM context, and well-tuned setups can save more. This guide shows how to maximize those savings.

The Cost Problem

Traditional AI agents include ALL tools in every request:

Agent thinking → LLM prompt + 100 tool schemas (50k tokens) → LLM response
Cost: $0.75 per request

With Agent-CoreX:
Agent thinking → Query Agent-CoreX → Get 5 relevant tools (5k tokens) → LLM response
Cost: $0.20 per request (73% savings!)

How Agent-CoreX Saves Tokens

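Dynamic Selection - Only the tools relevant to the current task are placed in context
Semantic Matching - Finds the right tools from a natural-language description, with no exact tool names required
Caching - Reuses previously retrieved tool lists when possible
Batching - Combines similar queries into a single retrieval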
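
To make dynamic selection concrete, here is a toy sketch of picking the top-k tools for a query by text similarity. It is not Agent-CoreX's actual matching algorithm, which this guide does not describe. The `tools` catalog, `tokenize`, `score`, and `selectTools` are hypothetical names, and simple word overlap stands in for real semantic matching.

```javascript
// Toy illustration only: word-overlap scoring stands in for semantic matching.
// The tool catalog and all function names here are hypothetical.
const tools = [
  { name: "github_create_pr", description: "Create a pull request on GitHub" },
  { name: "aws_deploy_lambda", description: "Deploy a function to AWS Lambda" },
  { name: "slack_send_message", description: "Send a notification message to Slack" },
];

// Split text into a set of lowercase words.
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity between the query words and the description words.
function score(query, description) {
  const q = tokenize(query);
  const d = tokenize(description);
  let shared = 0;
  for (const word of q) if (d.has(word)) shared++;
  const union = q.size + d.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Return the names of the topK best-matching tools, best first.
function selectTools(query, catalog, topK) {
  return catalog
    .map((tool) => ({ tool, score: score(query, tool.description) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.tool.name);
}
```

Only the selected tools' schemas would then be sent to the LLM, which is where the token savings come from.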

Cost Optimization Strategies

1. Use Specific Queries

// ❌ Vague (uses more tokens)
await agent.retrieveTools({
  query: "do something with GitHub"
});

// ✅ Specific (fewer tokens)
await agent.retrieveTools({
  query: "Create a pull request on GitHub with code changes"
});

Savings: 15-20% per query

2. Set Lower top_k

// ❌ Requests up to 50 tools, far more than needed (200 tokens)
await agent.retrieveTools({
  query: "Deploy",
  topK: 50
});

// ✅ Requests only the few tools you need (50 tokens)
await agent.retrieveTools({
  query: "Deploy",
  topK: 3
});

Savings: 30-40% per query
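
Before choosing a value, it can help to estimate how much context each extra tool adds. The sketch below uses the common rough heuristic of about four characters per token, which is only an approximation. `estimateTokens`, `makeSchema`, `estimateContextTokens`, and the schema shape are hypothetical stand-ins, not Agent-CoreX output.

```javascript
// Rough heuristic: about 4 characters per token for English/JSON text.
// Use your model's real tokenizer when you need exact counts.
function estimateTokens(value) {
  return Math.ceil(JSON.stringify(value).length / 4);
}

// Hypothetical tool schema, shaped like a typical function-calling definition.
function makeSchema(i) {
  return {
    name: `tool_${i}`,
    description: "Example tool description used only for size estimation.",
    parameters: { type: "object", properties: { target: { type: "string" } } },
  };
}

// Estimate the context tokens consumed by `toolCount` schemas.
function estimateContextTokens(toolCount) {
  const schemas = Array.from({ length: toolCount }, (_, i) => makeSchema(i));
  return estimateTokens(schemas);
}
```

Comparing `estimateContextTokens(3)` with `estimateContextTokens(50)` on your own schemas gives a quick sense of what a lower setting saves.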

3. Use Server Filters

// ❌ Search all 100+ servers (higher cost)
await agent.retrieveTools({
  query: "Deploy"
});

// ✅ Filter to AWS only (lower cost)
await agent.retrieveTools({
  query: "Deploy",
  filter: {
    server: "aws-mcp"
  }
});

Savings: 20-25% per query

4. Cache Tool Lists

const toolCache = new Map();
const CACHE_TTL_MS = 60 * 60 * 1000; // 1 hour

async function getCachedTools(query) {
  // Return the cached tools if the entry is still fresh
  const cached = toolCache.get(query);
  if (cached && Date.now() - cached.time < CACHE_TTL_MS) {
    return cached.tools;
  }

  // Missing or expired: fetch and store with a timestamp
  const tools = await agent.retrieveTools({ query });
  toolCache.set(query, { tools, time: Date.now() });
  return tools;
}

Savings: 60-80% for repeated queries
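
The cache above is keyed on the raw query string, so "Deploy" and " deploy " would be stored as separate entries, and requests that differ only in `topK` or server filter would collide. One possible refinement, sketched here under the assumption that the only options in play are the `query`, `topK`, and `filter.server` fields used in this guide, is to build a normalized key. `cacheKey` is a hypothetical helper, not part of the Agent-CoreX SDK, and lowercasing assumes that letter case does not change which tools you want.

```javascript
// Hypothetical helper: build a stable cache key from the request options.
// Assumes only query, topK, and filter.server affect the result.
function cacheKey({ query, topK = null, filter = {} }) {
  // Collapse whitespace and case so trivially different queries share an entry.
  const normalizedQuery = query.trim().toLowerCase().replace(/\s+/g, " ");
  return JSON.stringify([normalizedQuery, topK, filter.server ?? null]);
}
```

In `getCachedTools`, the `Map` would then be keyed on `cacheKey(options)` rather than on the query string alone.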

5. Batch Similar Requests

// ❌ Three separate queries
const deployTools = await agent.retrieveTools({
  query: "Deploy"
});
const monitorTools = await agent.retrieveTools({
  query: "Monitor"
});
const notifyTools = await agent.retrieveTools({
  query: "Notify"
});

// ✅ One combined query
const allTools = await agent.retrieveTools({
  query: "Deploy, monitor, and notify"
});

Savings: 50-70% on this operation

Cost Calculation

Before Agent-CoreX

Scenario: ChatGPT agent with 100 tools

Per request:
- Prompt (2k tokens): $0.01
- 100 tool schemas (48k tokens): $0.72
- Completion (1k tokens): $0.03
Total: $0.76

1000 requests/month: $760

With Agent-CoreX

Per request:
- Prompt (2k tokens): $0.01
- Retrieve tools (200 tokens): $0.002
- 5 tool schemas (3k tokens): $0.04
- Completion (1k tokens): $0.03
Total: $0.082

1000 requests/month: $82
Savings: 89%!
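
The arithmetic behind these two breakdowns can be checked with a few lines of code. This is a minimal sketch that copies the per-request figures above. The function names are illustrative, and you would substitute your own token counts and model prices.

```javascript
// Per-request costs in USD, copied from the two breakdowns above.
const before = { prompt: 0.01, toolSchemas: 0.72, completion: 0.03 };
const withCoreX = { prompt: 0.01, retrieveTools: 0.002, toolSchemas: 0.04, completion: 0.03 };

// Sum every line item of a breakdown.
function totalCost(lineItems) {
  return Object.values(lineItems).reduce((sum, cost) => sum + cost, 0);
}

// Percentage saved relative to the baseline cost.
function savingsPercent(baseline, optimized) {
  return ((baseline - optimized) / baseline) * 100;
}

const requestsPerMonth = 1000;
const beforeTotal = totalCost(before);
const afterTotal = totalCost(withCoreX);

console.log(`Before: $${(beforeTotal * requestsPerMonth).toFixed(0)}/month`);           // Before: $760/month
console.log(`With Agent-CoreX: $${(afterTotal * requestsPerMonth).toFixed(0)}/month`);  // With Agent-CoreX: $82/month
console.log(`Savings: ${savingsPercent(beforeTotal, afterTotal).toFixed(0)}%`);         // Savings: 89%
```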

Monitoring Your Costs

Go to Dashboard → Usage
View costs by server
Track trends over time
Set budget alerts

Cost Optimization Checklist

Quick Wins

✅ Be specific in queries
✅ Reduce top_k to 3-5
✅ Cache tool lists
✅ Filter by server

Advanced

✅ Batch requests
✅ Implement rate limiting
✅ Use scheduled jobs
✅ Monitor trends
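
Rate limiting is listed above without an example, so here is one minimal way to cap retrieval calls on the client side. It is a sketch of a simple fixed-window limiter, not an Agent-CoreX feature. `createRateLimiter` and its parameters are hypothetical, and the clock is injectable so the behavior can be tested.

```javascript
// Minimal fixed-window rate limiter (hypothetical helper).
// Allows at most `maxCalls` per `windowMs` milliseconds.
function createRateLimiter(maxCalls, windowMs, now = Date.now) {
  let windowStart = now();
  let calls = 0;
  return function tryAcquire() {
    const current = now();
    if (current - windowStart >= windowMs) {
      windowStart = current; // start a new window
      calls = 0;
    }
    if (calls < maxCalls) {
      calls++;
      return true; // call is within budget
    }
    return false; // over budget for this window
  };
}
```

A caller might create `const allow = createRateLimiter(60, 60000)` and invoke `agent.retrieveTools(...)` only when `allow()` returns `true`, falling back to cached tools otherwise.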

Real-World Example

A customer reduced costs from $2,400/month to $280/month:

Implemented caching (40% savings)
Reduced top_k from 20 to 5 (30% savings)
Used server filters (15% savings)
Combined similar queries (20% savings)

Total: about 88% cost reduction ($2,400 → $280 per month) ✅

Next Step: Authentication Guide →