Llama API
Log in

Get started

Overview
Quickstart

Essentials

Models
API keys
SDKs & libraries
Rate limits

Features

Chat completion
Image understanding
Structured output
Tool calling
OpenAI compatibility
Moderation
Fine-tuning & evaluation

Guides

Chat & conversation
Tool calling
Moderation & security
Best practices

API reference

Chat completion
Models
Moderations

Resources

Data commitments
Legal

Llama API overview

Llama API is currently available as a preview release, with ongoing changes to API endpoints, parameters and models. Join the waitlist.
Llama API is a Meta-hosted API service that helps you integrate Llama models into your applications quickly and efficiently.Llama API provides access to Llama models through a simple API interface, with inference provided by Meta, so you can focus on building AI-powered solutions without managing your own inference infrastructure.With Llama API, you get access to state-of-the-art AI capabilities through a developer-friendly interface designed for simplicity and performance.
Meta does not use your content, including API inputs (prompts) or API outputs (model responses), for training our models. See data commitments for more information.

Llama API features

Llama API exposes the capabilities of the latest Llama models via convenient API endpoints, including chat completion, image understanding, and tool calling.
  • •Chat completion: Generate text from a prompt, or build a chat-based AI assistant using multi-modal input (text, images) and text-based outputs.
  • •Image understanding: Process and analyze visual data to extract insights, interpret charts, and more.
  • •JSON structured output: Generate responses that follow pre-defined JSON schemas.
  • •Tool calling: Integrate with your existing tools by defining tools that can be called when generating responses.
  • •OpenAI compatibility: Use OpenAI clients with Llama API using the compatibility endpoint.
  • •Moderation: Use sophisticated safety models to check user and model text for problematic content.
  • •Fine-tuning and evaluation: Fine-tune a pre-trained Llama model on specialized datasets to improve performance for specific use cases.

Using Llama API

Llama API offers endpoints in a REST-like interface that makes it easy to make API calls directly from most programming languages.Meta maintains SDKs for Llama API in multiple languages, including Python and TypeScript. See SDKs and libraries for more information on official libraries for Llama API.Llama API is compatible with OpenAI-based libraries. See OpenAI compatibility for more information on OpenAI-based library support.

Data commitments

Meta does not use your content, including API inputs (prompts) or API outputs (model responses), for training our models.
  • •No training
  • •Encryption at rest and in transit
  • •Data not used for ads
  • •Separation in storage
  • •Strict access control
  • •Compliance & vulnerability management
See data commitments for more information.

Other ways to use Llama

Llama API is a great way to use Llama models in your application, but it is just one of many ways to use Llama.

Llama cloud providers

Meta partners with cloud providers to offer Llama models and cloud inference services at competitive prices. See Meta Llama in the Cloud for a detailed list of cloud providers that offer Llama models.

Llama self-hosted

To host and run Llama models on your own infrastructure, take a look at the Llama Everywhere guide that shows you how to run on common desktop operating systems and Linux-based infrastructure.

Llama Stack

Similar to Llama API, Llama Stack offers a REST-like interface to Llama models, with both server and client implementations, making it easy to host your own API layer with Meta models or your own finetuned models.

Help & support

Find frequently asked questions, get support and assistance, and share your feedback with Meta in the Help Center, or report any problematic content generated by a Llama model.If you encounter a technical issue with a Llama model, file a GitHub issue in the llama-models repository on GitHub.Report security concerns at facebook.com/whitehat/infoReport violations of the Acceptable Use Policy or unlicensed uses of Llama at LlamaUseReport@meta.com

Next steps

Quickstart Learn to make your first API call with Llama API
Chat and conversation Learn about using Llama API to build chat-based AI applications
Best practices Best practices for using language models and Llama API
Was this page helpful?
Llama API features
Using Llama API
Data commitments
Other ways to use Llama
Help & support
Next steps