Text Models Overview

AnyAPI provides access to the most advanced language models from leading AI providers. Generate human-like text, engage in conversations, write code, and solve complex problems.

Available Models

OpenAI Models

  • GPT-4o: Latest multimodal model with advanced reasoning
  • GPT-4o-mini: Fast and cost-effective version of GPT-4o
  • GPT-4 Turbo: High-performance model with large context window
  • GPT-3.5 Turbo: Reliable and efficient for most tasks

Anthropic Models

  • Claude 3.5 Sonnet: Most capable model with excellent reasoning
  • Claude 3.5 Haiku: Fast and economical for simple tasks
  • Claude 3 Opus: Powerful model for complex analysis

Google Models

  • Gemini 2.5 Pro: Advanced model with 2M token context
  • Gemini 1.5 Pro: Multimodal with long context capabilities
  • Gemini 1.5 Flash: Fast model optimized for speed

Meta Models

  • Llama 3.3: Latest open-source model from Meta
  • Llama 3.1: High-performance open model
  • Code Llama: Specialized for code generation

Model Capabilities

Chat Completion

Natural conversations and Q&A

Text Generation

Creative writing and content creation

Code Generation

Programming assistance and debugging

Analysis

Text analysis and summarization

Chat Completions API

The primary endpoint for text generation:
POST /v1/chat/completions

Basic Example

curl -X POST "https://api.anyapi.ai/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ]
  }'

Message Roles

  • system: Sets the behavior and context for the assistant
  • user: Messages from the human user
  • assistant: Previous responses from the AI model
{
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to sort a list"},
    {"role": "assistant", "content": "Here's a Python function to sort a list..."},
    {"role": "user", "content": "Now make it work with custom comparisons"}
  ]
}
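Because the API is stateless, the full message history must be resent with every request. A minimal helper for accumulating turns might look like this (`makeConversation` is just an illustrative name, not part of the API):

```javascript
// Minimal conversation buffer: append each turn and replay the
// whole history on every request, since the API keeps no state.
function makeConversation(systemPrompt) {
  const messages = [{role: 'system', content: systemPrompt}];
  return {
    // role is 'user' or 'assistant'
    add(role, content) { messages.push({role, content}); },
    get messages() { return messages; }
  };
}
```

Each call to the API would then send `conversation.messages`, appending the assistant's reply before the next user turn.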

Advanced Features

Streaming Responses

Get real-time responses as they’re generated:
const response = await fetch('/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [{role: 'user', content: 'Tell me a story'}],
    stream: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const {value, done} = await reader.read();
  if (done) break;

  // stream: true keeps multi-byte characters intact across chunk boundaries
  const chunk = decoder.decode(value, {stream: true});
  console.log(chunk);
}
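The stream arrives as server-sent events. Assuming AnyAPI follows the common OpenAI-style format, where each event is a `data:` line carrying a JSON chunk and the stream ends with `data: [DONE]`, a parser sketch might look like:

```javascript
// Extract content deltas from a raw SSE chunk.
// Assumes OpenAI-style streaming: lines of `data: {json}`
// terminated by a final `data: [DONE]`.
function extractDeltas(chunk) {
  const deltas = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;      // skip blanks/comments
    const payload = trimmed.slice(5).trim();
    if (payload === '[DONE]') break;                 // end of stream
    const parsed = JSON.parse(payload);
    const content = parsed.choices?.[0]?.delta?.content;
    if (content) deltas.push(content);
  }
  return deltas;
}
```

Joining the returned deltas reconstructs the text generated so far; in production you would also buffer partial lines that straddle chunk boundaries.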

Function Calling

Enable models to call external functions:
{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"}
  ],
  "functions": [
    {
      "name": "get_weather",
      "description": "Get the current weather in a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City and state, e.g. San Francisco, CA"
          }
        },
        "required": ["location"]
      }
    }
  ]
}
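When the model decides to call a function, the assistant message carries a function call with a `name` and a JSON-encoded `arguments` string; your code runs the function and sends the result back in a follow-up request. A sketch of the dispatch step, with a hypothetical `get_weather` stub standing in for a real implementation:

```javascript
// Map function names the model may call to local implementations.
// get_weather here is a stub for illustration only.
const registry = {
  get_weather: ({location}) => `Sunny in ${location}`
};

// Run the function the model requested. `arguments` arrives as a
// JSON-encoded string, not a parsed object.
function dispatchFunctionCall(functionCall) {
  const handler = registry[functionCall.name];
  if (!handler) throw new Error(`Unknown function: ${functionCall.name}`);
  const args = JSON.parse(functionCall.arguments);
  return handler(args);
}
```

The returned value would then be sent back to the model as a new message so it can compose its final answer.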

JSON Mode

Force the model to return valid JSON:
{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "Extract the key information from this text as JSON"}
  ],
  "response_format": {"type": "json_object"}
}
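In JSON mode the message content is itself a JSON string, so it still needs to be parsed on the client. A small sketch, assuming the standard `choices[0].message.content` response layout:

```javascript
// Pull the structured payload out of a JSON-mode completion.
// With response_format json_object, message.content is a JSON string.
function parseJsonResponse(completion) {
  const content = completion.choices[0].message.content;
  return JSON.parse(content);
}
```

Wrapping the `JSON.parse` in a try/catch is still wise, since malformed output is rare but not impossible.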

Model Comparison

Model               Context Window   Strengths               Best For
GPT-4o              128K             Multimodal, reasoning   Complex tasks, analysis
Claude 3.5 Sonnet   200K             Reasoning, coding       Programming, writing
Gemini 2.5 Pro      2M               Long context            Document analysis
Llama 3.3           128K             Open source             Custom deployments

Pricing

Text models are priced per token:
  • Input tokens: Text you send to the model
  • Output tokens: Text the model generates
  • Different rates: output tokens typically cost more than input tokens, and rates vary by model
Use our token calculator to estimate costs.
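As a rough sketch of the arithmetic, with placeholder per-million-token rates (these numbers are illustrative only; check the pricing page for real rates):

```javascript
// USD per 1M tokens -- placeholder values for illustration.
const RATES = {
  'gpt-4o': {input: 2.50, output: 10.00}
};

// Estimate the cost of one request in USD.
function estimateCost(model, inputTokens, outputTokens) {
  const r = RATES[model];
  if (!r) throw new Error(`No rate data for ${model}`);
  return (inputTokens * r.input + outputTokens * r.output) / 1_000_000;
}
```

For example, a request with 1,000 input tokens and 500 output tokens at these rates would cost (1,000 × $2.50 + 500 × $10.00) / 1,000,000 = $0.0075.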

Best Practices

Performance Optimization

  • Use system messages to set context once
  • Keep conversations focused and relevant
  • Choose the right model for your use case
  • Implement proper error handling
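For the error-handling point above, one common pattern is retrying failed requests with exponential backoff. A sketch, where `requestFn` is any async function that would wrap the fetch call shown earlier:

```javascript
// Retry an async request with exponential backoff.
// Delays grow as 250ms, 500ms, 1000ms, ... between attempts.
async function withRetries(requestFn, maxAttempts = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await requestFn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await new Promise(r => setTimeout(r, 2 ** attempt * 250));
      }
    }
  }
  throw lastError;
}
```

In practice you would retry only transient failures (network errors, 429s, 5xx) and surface client errors like 400 immediately.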

Cost Optimization

  • Use smaller models for simple tasks
  • Implement response caching
  • Set reasonable max_tokens limits
  • Monitor usage with our dashboard

Quality Improvement

  • Provide clear, specific instructions
  • Use examples in your prompts
  • Iterate on prompt design
  • Test with different models

Rate Limits

Text models have the following limits:
Plan         Requests/Min   Tokens/Min
Free         100            50,000
Pro          1,000          500,000
Enterprise   Custom         Custom
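To stay under a requests-per-minute budget client-side, a sliding-window throttle can prevent 429 responses before they happen. A sketch (`RequestBudget` is illustrative, not part of the API):

```javascript
// Client-side throttle for a requests-per-minute budget.
// tryAcquire returns true if a request may be sent now.
class RequestBudget {
  constructor(requestsPerMin) {
    this.limit = requestsPerMin;
    this.timestamps = [];
  }
  tryAcquire(now = Date.now()) {
    // Drop timestamps older than the 60-second window.
    this.timestamps = this.timestamps.filter(t => now - t < 60_000);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

When `tryAcquire` returns false, the caller can queue the request or sleep until the oldest timestamp ages out of the window.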

Common Use Cases

Getting Started