What’s a token?
You can think of tokens as pieces of words used for natural language processing. For English
text, 1 token is approximately 4 characters or 0.75 words. As a point of reference, the collected
works of Shakespeare are about 900,000 words or 1.2M tokens.
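As a rough illustration of the rule of thumb above, the following sketch estimates token counts from character or word counts. The function names are hypothetical and the ratios only approximate English text; the Tokenizer tool gives exact counts.

```python
import math

# Rules of thumb quoted above; approximate, and only for English text.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate tokens as characters / 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Estimate tokens as words / 0.75, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


# The Shakespeare figure above: about 900,000 words is about 1.2M tokens.
shakespeare_tokens = estimate_tokens_from_words(900_000)
```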
To learn more about how tokens work and to estimate your usage, you can experiment with our
interactive Tokenizer tool, or log in to your account and enter text into the Playground. The
counter in the Playground footer will display how many tokens are in your text.
Which model should I use?
We generally recommend that developers use either gpt-4 or gpt-3.5-turbo, depending on
the complexity of their tasks. gpt-4 generally performs better on a
wide range of evaluations, while gpt-3.5-turbo returns outputs with lower latency and costs
much less per token. We recommend experimenting with these models in Playground to
investigate which model provides the best price-performance trade-off for your usage. A
common design pattern is to define several distinct query types, each of which is dispatched to
the model best suited to handle it.
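The dispatch pattern described above could be sketched as a simple lookup from query type to model name. The query-type names and the choice of default are assumptions made for illustration; which tasks actually need gpt-4 is something to determine by experimenting.

```python
# Hypothetical query types mapped to models; the categories are illustrative only.
MODEL_BY_QUERY_TYPE = {
    "classification": "gpt-3.5-turbo",
    "summarization": "gpt-3.5-turbo",
    "multi_step_reasoning": "gpt-4",
}

# Assumption: fall back to the cheaper, lower-latency model for unknown types.
DEFAULT_MODEL = "gpt-3.5-turbo"


def choose_model(query_type: str) -> str:
    """Return the model name to use for a given query type."""
    return MODEL_BY_QUERY_TYPE.get(query_type, DEFAULT_MODEL)
```

The returned name would then be passed as the model for that request, so each query type pays only for the capability it needs.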
How will I know how many tokens I’ve used each month?
Log in to your account to view your usage tracking dashboard. This page will show you how
many tokens you’ve used during the current and past billing cycles.
How can I manage my spending?
You can set a monthly budget in your billing settings. Once that budget is reached, we’ll stop
serving your requests. There may be a delay in enforcing the limit, and you are responsible for
any overage incurred. You can also configure a notification threshold to receive an email alert
once you cross that threshold each month. We recommend checking your usage tracking dashboard
regularly to monitor your spend.
Is the ChatGPT API included in the ChatGPT Plus subscription?
No, the ChatGPT API and ChatGPT Plus subscription are billed separately. The API has its own
pricing, which can be found at [Link]. The ChatGPT Plus subscription
covers usage on [Link] only and costs $20/month.
Does Playground usage count against my quota?
Yes, we treat Playground usage the same as regular API usage.
How is pricing calculated for Completions?
Chat completion requests are billed based on the number of input tokens sent plus the
number of tokens in the output(s) returned by the API.
Your request may use up to num_tokens(input) + [max_tokens * max(n, best_of)] tokens,
which will be billed at the per-engine rates outlined at the top of this page.
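The upper bound above can be written as a small helper. This is a minimal sketch; the function name is hypothetical, and the parameters mirror the request settings named in the formula.

```python
def max_billable_tokens(input_tokens: int, max_tokens: int,
                        n: int = 1, best_of: int = 1) -> int:
    """Upper bound on tokens a request may use:
    num_tokens(input) + max_tokens * max(n, best_of)."""
    return input_tokens + max_tokens * max(n, best_of)


# A 200-token prompt with max_tokens=900 and a single completion.
single = max_billable_tokens(200, 900)

# The same prompt with n=2 and best_of=3: the larger of the two multiplies max_tokens.
multiple = max_billable_tokens(200, 900, n=2, best_of=3)
```

This is a ceiling, not the actual bill: completions that stop before reaching max_tokens use fewer tokens.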
In the simplest case, if your prompt contains 200 tokens and you request a single 900-token
completion from the gpt-3.5-turbo-1106 API, your request will use 1,100 tokens and will cost
[(200 * 0.001) + (900 * 0.002)] / 1000 = $0.002, where 0.001 and 0.002 are the input and
output prices in dollars per 1,000 tokens.
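The same arithmetic can be checked in code. In this sketch the rates are the per-1,000-token figures taken from the worked example above and may not match current pricing; the function name is hypothetical. Decimal is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

# Rates in dollars per 1,000 tokens, taken from the worked example above.
# Assumption: check the pricing page for current values before relying on these.
INPUT_RATE_PER_1K = Decimal("0.001")
OUTPUT_RATE_PER_1K = Decimal("0.002")


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in dollars: input and output tokens are billed at separate rates."""
    total = input_tokens * INPUT_RATE_PER_1K + output_tokens * OUTPUT_RATE_PER_1K
    return total / 1000


# 200 prompt tokens plus a 900-token completion, as in the example above ($0.002).
cost = request_cost(200, 900)
```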
You can limit costs by reducing prompt length or maximum response length, limiting usage of
best_of/n , adding appropriate stop sequences, or using engines with lower per-token costs.
How is pricing calculated for Fine-tuning?
There are two components to fine-tuning pricing: training and usage.
When training a fine-tuned model, the total tokens used will be billed according to our training
rates. Note that the number of training tokens depends on the number of tokens in your
training dataset and your chosen number of training epochs. The default number of epochs is
4.
(Tokens in your training file * Number of training epochs) = Total training tokens
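The training formula above can be sketched as follows. The function names are hypothetical, and the training rate is deliberately a required argument because this section does not state it.

```python
from decimal import Decimal

DEFAULT_EPOCHS = 4  # default number of training epochs stated above


def total_training_tokens(file_tokens: int, epochs: int = DEFAULT_EPOCHS) -> int:
    """Tokens in the training file multiplied by the number of epochs."""
    return file_tokens * epochs


def training_cost(file_tokens: int, rate_per_1k: Decimal,
                  epochs: int = DEFAULT_EPOCHS) -> Decimal:
    """Training cost in dollars, given a training rate per 1,000 tokens.

    rate_per_1k must be supplied by the caller from the pricing page.
    """
    return total_training_tokens(file_tokens, epochs) * rate_per_1k / 1000


# A 100,000-token training file at the default 4 epochs.
billed_tokens = total_training_tokens(100_000)
```

Because the epoch count multiplies the whole file, lowering it reduces training cost proportionally.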
Once you fine-tune a model, you’ll be billed only for the tokens you use. Requests sent to fine-
tuned models are billed at our usage rates.