Models

Models overview

Vendor.com is powered by a diverse set of models with different capabilities and price points. You can also customize our models for your specific use case with fine-tuning.

MODEL                   DESCRIPTION
GPT-4o                  Versatile, high-intelligence flagship model
GPT-4o-mini             Fast, affordable small model for focused tasks
o1 and o1-mini          Reasoning models that excel at complex, multi-step tasks
GPT-4o Realtime         GPT-4o models capable of realtime text and audio inputs and outputs
GPT-4o Audio            GPT-4o models capable of audio inputs and outputs
GPT-4 Turbo and GPT-4   The previous set of high-intelligence models
GPT-3.5 Turbo           A fast model for simple tasks, superseded by GPT-4o-mini

Context window

Each model on this page lists a context window: the maximum number of tokens that can be used in a single request, including input, output, and reasoning tokens. The following token counts apply toward the context window total:

  • Input tokens
  • Output tokens (tokens generated in response to your prompt)
  • Reasoning tokens (used by the model to plan a response)

Tokens generated in excess of the context window limit may be truncated.
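As a rough illustration of the accounting above, the sketch below adds the three token counts and compares the total against a context window. The function names `tokens_remaining` and `fits_context_window` and the example counts are hypothetical; real counts would come from a tokenizer or from usage data returned with a response, neither of which is shown here.

```python
def tokens_remaining(context_window: int, input_tokens: int,
                     output_tokens: int, reasoning_tokens: int = 0) -> int:
    """Return the tokens left in the context window (negative if exceeded)."""
    used = input_tokens + output_tokens + reasoning_tokens
    return context_window - used


def fits_context_window(context_window: int, input_tokens: int,
                        output_tokens: int, reasoning_tokens: int = 0) -> bool:
    """Return True if the combined token count stays within the window."""
    return tokens_remaining(context_window, input_tokens,
                            output_tokens, reasoning_tokens) >= 0


# Hypothetical request against a 128,000-token context window.
remaining = tokens_remaining(128_000, input_tokens=100_000,
                             output_tokens=16_384, reasoning_tokens=5_000)
print(remaining)  # 6616
```

A negative result means the request would exceed the window, in which case generated tokens may be truncated.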

Current model aliases

The table below lists the current model aliases and, where available, guidance on when they will be updated to new versions.

ALIAS                          POINTS TO                                 WILL POINT TO
gpt-4o                         gpt-4o-2024-08-06                         -
chatgpt-4o-latest              Latest used in ChatGPT                    Continuously updated
gpt-4o-mini                    gpt-4o-mini-2024-07-18                    -
o1                             o1-2024-12-17                             -
o1-mini                        o1-mini-2024-09-12                        -
o1-preview                     o1-preview-2024-09-12                     -
gpt-4o-realtime-preview        gpt-4o-realtime-preview-2024-10-01        gpt-4o-realtime-preview-2024-12-17 (effective Jan. 9, 2025)
gpt-4o-mini-realtime-preview   gpt-4o-mini-realtime-preview-2024-12-17   -
gpt-4o-audio-preview           gpt-4o-audio-preview-2024-10-01           gpt-4o-audio-preview-2024-12-17 (effective Jan. 9, 2025)
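The alias table can be read as a simple lookup from alias to dated snapshot. The sketch below copies the POINTS TO column into a dictionary; `CURRENT_ALIASES` and `resolve_alias` are hypothetical names used for illustration, not part of any API, and the mapping is only as current as this page. `chatgpt-4o-latest` is left out because it has no fixed snapshot.

```python
# Alias -> snapshot, copied from the POINTS TO column of the alias table.
# Hypothetical helper data; it goes stale whenever an alias is repointed.
CURRENT_ALIASES = {
    "gpt-4o": "gpt-4o-2024-08-06",
    "gpt-4o-mini": "gpt-4o-mini-2024-07-18",
    "o1": "o1-2024-12-17",
    "o1-mini": "o1-mini-2024-09-12",
    "o1-preview": "o1-preview-2024-09-12",
    "gpt-4o-realtime-preview": "gpt-4o-realtime-preview-2024-10-01",
    "gpt-4o-mini-realtime-preview": "gpt-4o-mini-realtime-preview-2024-12-17",
    "gpt-4o-audio-preview": "gpt-4o-audio-preview-2024-10-01",
}


def resolve_alias(model: str) -> str:
    """Return the dated snapshot for a known alias, else the name unchanged."""
    return CURRENT_ALIASES.get(model, model)


print(resolve_alias("gpt-4o"))             # gpt-4o-2024-08-06
print(resolve_alias("gpt-4o-2024-11-20"))  # gpt-4o-2024-11-20
```

Names that are not aliases, such as dated snapshots, pass through unchanged.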

GPT-4o

GPT-4o (“o” for “omni”) is our versatile, high-intelligence flagship model. It accepts both text and image inputs, and produces text outputs (including Structured Outputs).

The chatgpt-4o-latest model ID below continuously points to the version of GPT-4o used in ChatGPT. It is updated frequently, whenever there are significant changes to Playground's GPT-4o model.

The knowledge cutoff for GPT-4o models is October 2023.

MODEL                                        CONTEXT WINDOW   MAX OUTPUT TOKENS
gpt-4o (points to gpt-4o-2024-08-06)         128,000 tokens   16,384 tokens
gpt-4o-2024-11-20                            128,000 tokens   16,384 tokens
gpt-4o-2024-08-06                            128,000 tokens   16,384 tokens
gpt-4o-2024-05-13                            128,000 tokens   4,096 tokens
chatgpt-4o-latest (GPT-4o used in ChatGPT)   128,000 tokens   16,384 tokens

GPT-4o-mini

GPT-4o-mini (“o” for “omni”) is a fast, affordable small model for focused tasks. It accepts both text and image inputs, and produces text outputs (including Structured Outputs). It is ideal for fine-tuning, and outputs from a larger model like GPT-4o can be distilled to GPT-4o-mini to produce similar results at lower cost and latency.

The knowledge cutoff for GPT-4o-mini models is October 2023.

MODEL                                            CONTEXT WINDOW   MAX OUTPUT TOKENS
gpt-4o-mini (points to gpt-4o-mini-2024-07-18)   128,000 tokens   16,384 tokens
gpt-4o-mini-2024-07-18                           128,000 tokens   16,384 tokens

o1 and o1-mini

Models in the o1 series are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. Learn about the capabilities of o1 models in our reasoning guide.

There are two model types available today:

  • o1: reasoning model designed to solve hard problems across domains
  • o1-mini: fast and affordable reasoning model for specialized tasks

The latest o1 model supports both text and image inputs, and produces text outputs (including Structured Outputs). o1-mini currently supports only text inputs and outputs.

The knowledge cutoff for o1 and o1-mini models is October 2023.

MODEL                                          CONTEXT WINDOW   MAX OUTPUT TOKENS
o1 (points to o1-2024-12-17)                   200,000 tokens   100,000 tokens
o1-2024-12-17                                  200,000 tokens   100,000 tokens
o1-mini (points to o1-mini-2024-09-12)         128,000 tokens   65,536 tokens
o1-mini-2024-09-12                             128,000 tokens   65,536 tokens
o1-preview (points to o1-preview-2024-09-12)   128,000 tokens   32,768 tokens
o1-preview-2024-09-12                          128,000 tokens   32,768 tokens

GPT-4o and GPT-4o-mini Realtime

This is a preview release of the GPT-4o and GPT-4o-mini Realtime models. These models are capable of responding to audio and text inputs in realtime.

The knowledge cutoff for GPT-4o Realtime models is October 2023.

MODEL                                                                              CONTEXT WINDOW   MAX OUTPUT TOKENS
gpt-4o-realtime-preview (points to gpt-4o-realtime-preview-2024-10-01)             128,000 tokens   4,096 tokens
gpt-4o-realtime-preview-2024-12-17                                                 128,000 tokens   4,096 tokens
gpt-4o-realtime-preview-2024-10-01                                                 128,000 tokens   4,096 tokens
gpt-4o-mini-realtime-preview (points to gpt-4o-mini-realtime-preview-2024-12-17)   128,000 tokens   4,096 tokens
gpt-4o-mini-realtime-preview-2024-12-17                                            128,000 tokens   4,096 tokens

GPT-4o Audio

This is a preview release of the GPT-4o Audio models. These models accept audio inputs and produce audio outputs, and can be used in the playground.

The knowledge cutoff for GPT-4o Audio models is October 2023.

MODEL                                                              CONTEXT WINDOW   MAX OUTPUT TOKENS
gpt-4o-audio-preview (points to gpt-4o-audio-preview-2024-10-01)   128,000 tokens   16,384 tokens
gpt-4o-audio-preview-2024-12-17                                    128,000 tokens   16,384 tokens
gpt-4o-audio-preview-2024-10-01                                    128,000 tokens   16,384 tokens

GPT-4 Turbo and GPT-4

GPT-4 is an older high-intelligence GPT model, usable in Playground. The knowledge cutoff for the latest GPT-4 Turbo version is December 2023.

MODEL                                                CONTEXT WINDOW   MAX OUTPUT TOKENS
gpt-4-turbo (points to gpt-4-turbo-2024-04-09)       128,000 tokens   4,096 tokens
gpt-4-turbo-2024-04-09                               128,000 tokens   4,096 tokens
gpt-4-turbo-preview (points to gpt-4-0125-preview)   128,000 tokens   4,096 tokens
gpt-4-0125-preview                                   128,000 tokens   4,096 tokens
gpt-4-1106-preview                                   128,000 tokens   4,096 tokens
gpt-4 (points to gpt-4-0613)                         8,192 tokens     8,192 tokens
gpt-4-0613                                           8,192 tokens     8,192 tokens
gpt-4-0314                                           8,192 tokens     8,192 tokens

GPT-3.5 Turbo

GPT-3.5 Turbo models can understand and generate natural language or code and have been optimized for chat and non-chat tasks.

As of July 2024, gpt-4o-mini should be used in place of gpt-3.5-turbo, as it is cheaper, more capable, multimodal, and just as fast.

MODEL                    CONTEXT WINDOW   MAX OUTPUT TOKENS   KNOWLEDGE CUTOFF
gpt-3.5-turbo-0125       16,385 tokens    4,096 tokens        Sep 2021
gpt-3.5-turbo            16,385 tokens    4,096 tokens        Sep 2021
gpt-3.5-turbo-1106       16,385 tokens    4,096 tokens        Sep 2021
gpt-3.5-turbo-instruct   16,385 tokens    4,096 tokens        Sep 2021

  • gpt-3.5-turbo-0125: The latest GPT-3.5 Turbo model, with higher accuracy at responding in requested formats and a fix for a bug that caused a text encoding issue for non-English language function calls.
  • gpt-3.5-turbo: Currently points to gpt-3.5-turbo-0125.
  • gpt-3.5-turbo-1106: GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.
  • gpt-3.5-turbo-instruct: Similar capabilities to GPT-3 era models.