Image Models Overview

Generate stunning images from text descriptions with state-of-the-art AI image models. AnyAPI provides access to both dedicated image generation models and multimodal chat models with image output capabilities.

Available Models

OpenAI Models

  • GPT-5 Image: Multimodal model with high-quality image generation, reasoning, and text understanding
  • GPT-5 Image Mini: Lightweight version for faster and more affordable image generation

Google Models

  • Gemini 2.5 Flash Image: Fast multimodal model with image generation, vision, reasoning, and PDF support

Amazon Models

  • Nova Canvas: Professional-grade image generation model
  • Titan Image Generator: Affordable image generation for common use cases

Stability Models

  • SD 3.5 Large: High-quality image generation from Stability AI
  • Stable Image Ultra: Premium image generation with the highest quality output

Model Capabilities

Text-to-Image

Generate images from text descriptions

Multimodal Chat

Generate images within a chat conversation context

Vision & Analysis

Understand and describe image content

Reasoning

Combine reasoning capabilities with image generation

Image Generation API

Generate images from text prompts:
POST /v1/images/generations

Basic Example

curl -X POST "https://api.anyapi.ai/v1/images/generations" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5-image",
    "prompt": "A serene mountain landscape at sunset",
    "size": "1024x1024",
    "quality": "hd",
    "n": 1
  }'

Response Format

{
  "created": 1589478378,
  "data": [
    {
      "url": "https://cdn.anyapi.ai/images/generated/abc123.png",
      "revised_prompt": "A detailed mountain landscape at sunset with dramatic lighting and cloud formations"
    }
  ]
}
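
The request and response shown above can be handled from Python along these lines. This is a minimal sketch using only the standard library, assuming every response has the `data` list shape shown in the sample; the helper names `build_generation_payload`, `extract_image_urls`, and `generate_image` are hypothetical and not part of any AnyAPI SDK.

```python
import json
import urllib.request

API_URL = "https://api.anyapi.ai/v1/images/generations"


def build_generation_payload(prompt, model="openai/gpt-5-image",
                             size="1024x1024", quality="hd", n=1):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {"model": model, "prompt": prompt, "size": size,
            "quality": quality, "n": n}


def extract_image_urls(response_json):
    """Collect the `url` of each entry in `data`, skipping entries without one."""
    return [item["url"] for item in response_json.get("data", []) if "url" in item]


def generate_image(api_key, prompt):
    """Send the request and return the image URLs (performs a network call)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_generation_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_image_urls(json.load(response))
```

Keeping payload construction and response parsing in separate functions lets both be checked without calling the API; only `generate_image` touches the network.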

Multimodal Image Generation

Models like GPT-5 Image and Gemini 2.5 Flash Image also support image generation through the chat completions endpoint, allowing you to combine text conversation with image output:
POST /v1/chat/completions
Python
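import requests

# Ask a multimodal chat model for an image through the chat completions endpoint.
response = requests.post(
    "https://api.anyapi.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer your_api_key",
        "Content-Type": "application/json"
    },
    json={
        "model": "google/gemini-2.5-flash-image",
        "messages": [
            {
                "role": "user",
                "content": "Generate an image of a cat in the style of Van Gogh's Starry Night"
            }
        ]
    },
    timeout=60,  # image generation can take a while; do not wait indefinitely
)

# Fail loudly on 4xx/5xx responses instead of printing an error body.
response.raise_for_status()
print(response.json())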

Model Comparison

Model                  | Provider  | Strengths                             | Access
GPT-5 Image            | OpenAI    | Quality, multimodal, reasoning        | Basic
GPT-5 Image Mini       | OpenAI    | Speed, affordability, multimodal      | Basic
Gemini 2.5 Flash Image | Google    | Speed, vision, reasoning, PDF support | Basic
Nova Canvas            | Amazon    | Professional image generation         | Premium
Titan Image Generator  | Amazon    | Affordable image generation           | Basic
SD 3.5 Large           | Stability | High-quality image generation         | Premium
Stable Image Ultra     | Stability | Highest quality output                | Premium

Advanced Features

Quality Settings

  • Standard: Default quality, faster generation
  • HD: Higher quality, more detailed images (requested with "quality": "hd", as in the basic example)
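
A small guard can catch a mistyped quality value before a request is sent. This sketch assumes the two settings above map to the lowercase strings `standard` and `hd` in the `quality` field; only `hd` appears in the request example, so `standard` is an assumption, as is the helper name `normalize_quality`.

```python
# Assumed wire values; only "hd" is confirmed by the request example.
ALLOWED_QUALITIES = ("standard", "hd")


def normalize_quality(value=None):
    """Return a lowercase quality string, defaulting to "standard"."""
    if value is None:
        return "standard"
    quality = value.strip().lower()
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(
            f"unsupported quality {value!r}; expected one of {ALLOWED_QUALITIES}"
        )
    return quality
```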

Prompt Engineering

Best Practices

  1. Be specific: Include details about style, lighting, composition
  2. Use descriptive adjectives: “vibrant”, “moody”, “minimalist”
  3. Specify camera settings: “shot with 85mm lens”, “shallow depth of field”
  4. Include style references: “in the style of…”, “photorealistic”

Example Prompts

“A professional headshot of a confident businesswoman in a modern office, shot with 85mm lens, natural lighting, shallow depth of field, high resolution”
“A mystical forest scene at dawn with ethereal lighting, painted in the style of romantic era landscape paintings, with soft brushstrokes and dreamlike atmosphere”
“A sleek smartphone on a white background, studio lighting, product photography style, clean and minimal, high resolution, commercial quality”
“A cute cartoon cat wearing a space helmet, floating in space with colorful nebulae in the background, digital illustration style, vibrant colors”
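
The practices and example prompts above share a pattern: a subject followed by comma-separated qualifiers. A minimal sketch of assembling such a prompt follows; `build_prompt` and its parameter names are hypothetical helpers for illustration, not part of the API.

```python
def build_prompt(subject, details=(), adjectives=(), camera=(), style=None):
    """Join prompt parts into one comma-separated description."""
    parts = [subject]
    parts.extend(details)      # 1. specifics such as lighting and composition
    parts.extend(adjectives)   # 2. descriptive adjectives
    parts.extend(camera)       # 3. camera settings
    if style:
        parts.append(style)    # 4. style reference
    # Drop empty entries and stray whitespace before joining.
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_prompt(
    "A sleek smartphone on a white background",
    details=["studio lighting"],
    adjectives=["clean and minimal"],
    camera=["shallow depth of field"],
    style="product photography style",
)
print(prompt)
```

Building prompts from named parts makes it easy to vary one element, such as the style reference, while holding the rest constant when comparing results.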

Content Policy

All generated images must comply with our content policy:
  • No harmful, offensive, or inappropriate content
  • Respect copyright and intellectual property
  • No images of real people without their consent
  • Commercial use allowed with proper licensing

Common Use Cases

Marketing & Advertising

Product mockups, campaign visuals, social media content

Content Creation

Blog illustrations, thumbnails, creative assets

Product Design

Concept art, prototypes, design variations

E-commerce

Product photography, lifestyle images, backgrounds

Image Formats

Supported input/output formats:
  • Input: PNG, JPEG, WebP (for editing/variations)
  • Output: PNG (default), JPEG available for some models
  • Maximum file size: 20MB for uploads
  • Recommended: PNG for best quality
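
The input limits above can be checked locally before uploading a file for editing or variations. This is a sketch under stated assumptions: "20MB" is read as 20 × 1024 × 1024 bytes, formats are detected by file extension only, and `check_upload` is a hypothetical helper name.

```python
from pathlib import Path

# Extensions for the input formats listed above (PNG, JPEG, WebP).
ALLOWED_INPUT_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # assumption: "20MB" means 20 MiB


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_INPUT_EXTENSIONS:
        problems.append(f"unsupported format {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems
```

For a file on disk, the size can be read with `Path(filename).stat().st_size` and passed as `size_bytes`.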

Getting Started

Quick Start

Generate your first image

SDKs

Use our libraries