This guide helps you get started with AI/ML API chat models. For detailed documentation of all ChatAimlapi features and configurations, head to the API reference. AI/ML API provides unified access to hundreds of hosted foundation models with high availability and throughput.

Overview

Integration details

Class | Package | Local | Serializable | JS support | Downloads | Version
ChatAimlapi | langchain-aimlapi | | beta | | PyPI - Downloads | PyPI - Version

Model features

Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs

Setup

To access AI/ML API models you’ll need to create an account, get an API key, and install the langchain-aimlapi integration package.

Credentials

Head to aimlapi.com to sign up and generate an API key. Once you’ve done this, set the AIMLAPI_API_KEY environment variable:
import getpass
import os

if not os.getenv("AIMLAPI_API_KEY"):
    os.environ["AIMLAPI_API_KEY"] = getpass.getpass("Enter your AI/ML API key: ")
To enable automated tracing of your model calls, set your LangSmith API key:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
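The getpass prompt above suits notebooks, but a script running in CI or a container usually has no terminal to prompt on. A minimal sketch of a fail-fast check for that case follows. The helper name `require_api_key` is hypothetical and is not part of langchain-aimlapi; only the AIMLAPI_API_KEY variable name comes from this guide.

```python
import os
from typing import Mapping


def require_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "AIMLAPI_API_KEY",
) -> str:
    """Return the API key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not provided by langchain-aimlapi.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the model.")
    return key


# Example with an explicit mapping, so no real environment variable is needed.
print(require_api_key({"AIMLAPI_API_KEY": "example-key"}))
```

Calling `require_api_key()` with no arguments reads from the real process environment, so a missing key surfaces as a clear error at startup rather than on the first model call.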

Installation

The LangChain AI/ML API integration lives in the langchain-aimlapi package:
%pip install -qU langchain-aimlapi

Instantiation

Now we can instantiate our model object and generate chat completions:
from langchain_aimlapi import ChatAimlapi

llm = ChatAimlapi(
    model="meta-llama/Llama-3-70b-chat-hf",
    temperature=0.7,
    max_tokens=512,
    timeout=30,
    max_retries=3,
)

Invocation

messages = [
    ("system", "You are a helpful assistant that translates English to French."),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 23, 'total_tokens': 32}, 'model_name': 'meta-llama/Llama-3-70b-chat-hf'}, id='run-...')
print(ai_msg.content)
J'adore la programmation.
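The response above also carries token counts in `response_metadata`. As a rough sketch of how they might be read, the hypothetical helper `token_usage` below pulls the three counters out of a metadata dictionary shaped like the one in the sample output. That shape is taken from the output shown here and is an assumption; other models or package versions may report usage differently.

```python
def token_usage(response_metadata: dict) -> dict:
    """Extract prompt, completion, and total token counts, defaulting to 0.

    Hypothetical helper; assumes the `token_usage` layout shown in the
    sample output above.
    """
    usage = response_metadata.get("token_usage") or {}
    return {
        "prompt": usage.get("prompt_tokens", 0),
        "completion": usage.get("completion_tokens", 0),
        "total": usage.get("total_tokens", 0),
    }


# Metadata copied from the sample AIMessage above, so this runs without an API call.
sample_metadata = {
    "token_usage": {"completion_tokens": 9, "prompt_tokens": 23, "total_tokens": 32},
    "model_name": "meta-llama/Llama-3-70b-chat-hf",
}
usage = token_usage(sample_metadata)
print(usage)  # {'prompt': 23, 'completion': 9, 'total': 32}
```

With a live response, the same call would be `token_usage(ai_msg.response_metadata)`.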

Streaming invocation

You can also stream responses token-by-token:
for chunk in llm.stream("List top 5 programming languages in 2025 with reasons."):
    print(chunk.content, end="", flush=True)
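The loop above prints each chunk and then discards it. If the full reply is also needed afterwards (for logging or post-processing, say), the chunks can be accumulated while they are printed. The sketch below assumes each chunk exposes a string `content` attribute, as in the loop above. The helper `collect_stream` is hypothetical, and the stand-in chunks exist only so the example runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable, echo: bool = True) -> str:
    """Optionally print each chunk as it arrives and return the joined text.

    Hypothetical helper; assumes every chunk has a string `content` attribute.
    """
    parts = []
    for chunk in chunks:
        if echo:
            print(chunk.content, end="", flush=True)
        parts.append(chunk.content)
    return "".join(parts)


# Stand-in chunks that mimic the shape of streamed message chunks.
fake_chunks = [
    SimpleNamespace(content=text)
    for text in ["J'", "adore ", "la ", "programmation."]
]
full_text = collect_stream(fake_chunks, echo=False)
print(full_text)  # J'adore la programmation.
```

Against the real model, the call would be `collect_stream(llm.stream("..."))`, which keeps the token-by-token display and returns the complete text at the end.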

API reference

For detailed documentation of all ChatAimlapi features and configurations, head to the API reference.