AI Models

Explore 71+ AI models for text, image, video, audio, and 3D generation. Compare capabilities and pricing tiers to find the right model for your needs.

Showing 24 of 71 models

Advanced Face Swap
face-swap Standard

Easel AI (via Fal.ai)

Swap faces between images with preserved lighting, texture, and natural blending.

Natural face swapping

Aura Flow
Image Standard

Fal.ai

Fast and efficient image generation model.

Beatoven Music
Audio Standard

Beatoven (via Fal.ai)

Generate royalty-free instrumental music for any project.

Beatoven SFX
Audio Standard

Beatoven (via Fal.ai)

Generate sound effects for videos, games, and multimedia.

BiRefNet
segmentation Standard

Fal.ai

High-resolution dichotomous image segmentation for precise object extraction.

Crystal Upscaler
Upscale Standard

Clarity AI (via Fal.ai)

AI image upscaler that preserves fidelity, color, and detail.

DDColor
colorization Standard

piddnad (via Replicate)

Automatic colorization of black-and-white photos with realistic colors.

Colorize B&W photos

Demucs
music-separation Standard

Meta (via Replicate)

Separate music into stems: vocals, drums, bass, and other instruments.

Split music into stems

Demucs 6-Stem
music-separation Standard

Meta (via Replicate)

Six-stem version of Demucs that separates vocals, drums, bass, guitar, piano, and other instruments.

Depth Anything V2
depth-estimation Standard

Fal.ai

State-of-the-art monocular depth estimation. Generate accurate depth maps from single images.

Accurate depth maps

Donut
ocr Standard

willywongi (via Replicate)

Extract structured data from receipts, invoices, and forms as JSON.

Receipt & invoice OCR

DWPose
pose-estimation Standard

Fal.ai

Detect human poses including body, hands, and face keypoints.

Full-body pose detection

ElevenLabs Turbo V2.5
TTS Standard

ElevenLabs

Fast text-to-speech optimized for low latency with good quality.

Flux 2 Flex
Image Standard

Black Forest Labs (via Fal.ai)

Flexible Flux 2 variant optimized for versatile image generation and editing.

Flux Dev
Image Standard

Black Forest Labs (via Replicate)

Development version of Flux that offers a good balance of quality and cost.

Flux Realism
Image Standard

Black Forest Labs (via Fal.ai)

Flux model fine-tuned for photorealistic output.

FLUX.1 Canny Dev
controlnet Standard

Black Forest Labs (via Replicate)

Open-weight edge-guided FLUX model for development.

FLUX.1 Depth Dev
controlnet Standard

Black Forest Labs (via Replicate)

Open-weight depth-guided FLUX model for development.

FLUX.1 Fill Dev
inpainting Standard

Black Forest Labs (via Fal.ai)

Open-weight FLUX inpainting model for development and fine-tuning.

Fooocus Inpaint
inpainting Standard

Fal.ai

Multi-mode inpainting: fill areas, improve details (face/hands), or modify content.

Gemini 2.0 Flash
Text Standard

Google

Fast and efficient Gemini model. Multimodal, with a very large context window.

Gemini 2.5 Flash
Text Standard

Google

Stable, production-ready Gemini model with an excellent balance of speed and quality. Great for high-volume applications.

Gemini 3 Flash
Text Standard

Google

Google's latest and most intelligent flash model. Features enhanced reasoning, improved multimodal understanding, and faster response times.

Google's latest AI

GOT-OCR 2.0
ocr Standard

Fal.ai

Universal OCR for documents, scene text, tables, math formulas, sheet music, and more.

Universal OCR engine