AI Models

Explore 141+ AI models for text, image, video, audio, and 3D generation. Compare capabilities and pricing tiers to find the right model for your needs.

Showing 21 of 141 models

Speech 02 HD
TTS Premium

MiniMax (via Replicate)

High-definition text-to-speech with premium voice quality.

Speech 02 Turbo
TTS Standard

MiniMax (via Replicate)

Text-to-speech variant of Speech 02 optimized for speed.

Stable Diffusion 3
Image Standard

Stability AI (via Replicate)

Stable Diffusion 3 with improved text rendering and composition.

Stable Diffusion 3.5 Large
Image Premium

Stability AI (via Replicate)

Latest Stable Diffusion with 8B parameters. Superior quality and prompt understanding.

SwinIR
Image Restoration Standard

jingyunliang (via Replicate)

General image restoration: denoising, deblurring, and super-resolution.

Sync Lipsync V2
Video Standard

Fal.ai

Advanced lipsync technology for realistic talking videos.

Text Extract OCR
OCR Budget

abiruyt (via Replicate)

Simple, versatile text extraction from any image.

Trellis
3D Standard

Microsoft (via Fal.ai)

Native 3D generative model using Structured LATents (SLAT) for versatile, high-quality 3D asset creation from images.

SLAT-based 3D generation

Tripo3D
3D Budget

Tripo (via Fal.ai)

Fast, affordable image-to-3D with clean meshes and PBR texture support.

Fast & affordable 3D

Veo 3
Video Premium

Google (via Replicate)

Google's official Veo 3 model for high-fidelity video generation with strong motion realism.

Veo 3.1
Video Premium

Google (via Fal.ai)

Google's latest video generation model. Produces high-fidelity videos with excellent understanding of physics and motion.

Google's advanced video AI

Veo 3.1 Fast
Video Standard

Google (via Fal.ai)

Faster version of Veo 3.1 optimized for quick video generation with good quality.

Veo 3.1 Image-to-Video
Video Premium

Google (via Fal.ai)

Convert images to video using Google's Veo 3.1. Animate still images with natural motion.

Wan 2.2 Image-to-Video
Video Standard

Alibaba (via Fal.ai)

Alibaba's image-to-video model. Efficient and reliable video generation from images.

Wan 2.5 Image-to-Video
Video Standard

Wan Video (via Replicate)

Animate images into videos with natural motion and high fidelity.

Wan 2.5 Text-to-Video
Video Standard

Wan Video (via Replicate)

High-quality text-to-video generation with smooth motion.

Whisper
Speech-to-Text Standard

OpenAI (via Fal.ai)

OpenAI Whisper large v3 for accurate speech transcription and translation. Supports 99+ languages.

Accurate speech transcription

Wizper
Speech-to-Text Standard

Fal.ai

Whisper v3 optimized by Fal.ai: same accuracy, 2x faster.

2x faster Whisper

Wonder3D
3D Standard

adirik (via Replicate)

Image-to-3D with realistic mesh generation. Outputs textured .glb files.

XTTS-v2
Voice Cloning Standard

Coqui (via Replicate)

Clone any voice with just 6 seconds of audio. Supports 17 languages.

Clone voices in seconds

Z-Image Turbo
Image Budget

Tongyi (via Replicate)

Very fast 6B-parameter text-to-image model with LoRA support.