
Start using GPT‑5 through OpenAI's API, today
Introduction to GPT-5
GPT-5 is OpenAI’s flagship model for the second half of 2025. It’s built for deeper reasoning, better coding, and agentic workflows, and it adds two controls that matter in practice: verbosity and reasoning effort. It runs with a 400,000-token total context and can emit up to 128,000 tokens per response.
If you’re new to large language models, skim my plain-English explainer on how GPT-style LLMs work. You’ll prompt better afterward.
Ready? Let’s ship your first GPT-5 request.
Create an account to get your GPT-5 API key
- Create an account or sign in.
- Confirm your email address.
- Log in.
- Open the Billing overview page and add credit or a payment method so your keys work right away. (The free-credit program ended mid-2024.)
- Generate your first API key for GPT-5. Keys are shown once; paste it into a password manager immediately.
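A quick setup sketch for macOS/Linux shells (the key value is a placeholder, not a real key):

```shell
# Persist the key across sessions by adding this line to ~/.bashrc or ~/.zshrc.
# Replace the placeholder with the key you just generated (it is shown only once).
export OPENAI_API_KEY="sk-proj-your-key-here"

# Sanity check that the variable is set, without printing the secret itself.
if [ -n "$OPENAI_API_KEY" ]; then echo "OPENAI_API_KEY is set"; fi
```

Every curl example below reads the key from this environment variable, so you never paste it into a command line or a script.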
Got your key? Great. Time to hit the API.
How to make your first request to GPT-5
OpenAI’s Responses API is the modern endpoint. Chat Completions still works, but start with Responses unless you have a hard reason not to.
macOS and Linux (Responses API):
curl -s https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "input": [
      { "role": "user", "content": [{ "type": "input_text", "text": "Hello!" }] }
    ],
    "text": { "verbosity": "medium" },
    "reasoning": { "effort": "minimal" },
    "max_output_tokens": 200
  }'
Windows (one-liner, Chat Completions still fine):
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer %OPENAI_API_KEY%" https://api.openai.com/v1/chat/completions -d "{ \"model\": \"gpt-5\", \"messages\": [{\"role\":\"user\",\"content\":\"Hello!\"}], \"verbosity\":\"medium\", \"reasoning_effort\":\"minimal\", \"max_completion_tokens\":200 }"
Pro tip: Use `gpt-5` to track the latest GPT-5 snapshot. If you need strict reproducibility, pin a dated snapshot in your stack.
Token budget: a single call supports up to 400,000 tokens total (input plus output combined), with a maximum of 128,000 output tokens. Your rate-limit tier must allow that much TPM; check your org’s quotas before sending long prompts.
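Before sending a very long prompt, it helps to sanity-check how much output headroom remains. A rough sketch using the limits above (the input token count is a hypothetical estimate, not something the shell can measure):

```shell
TOTAL_CONTEXT=400000   # total tokens per call (input + output)
MAX_OUTPUT=128000      # hard cap on output tokens per response

input_tokens=350000    # hypothetical estimate for a very long prompt

# Headroom is whatever the context window leaves after the input,
# clamped to the model's output cap.
headroom=$((TOTAL_CONTEXT - input_tokens))
if [ "$headroom" -gt "$MAX_OUTPUT" ]; then headroom=$MAX_OUTPUT; fi
echo "safe max_output_tokens: $headroom"
```

Setting `max_output_tokens` at or below this headroom avoids requests that cannot physically complete.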
Rock-solid JSON with Structured Outputs (Responses API)
In the Responses API, structured-output settings live under `text.format`. Sending `response_format` here, as you would with Chat Completions, triggers an unknown-parameter error. Use this shape:
curl -s https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "input": [
      { "role": "system", "content": [{ "type": "input_text", "text": "Return compact JSON only." }] },
      { "role": "user", "content": [{ "type": "input_text", "text": "Solve 8x + 31 = 2." }] }
    ],
    "text": {
      "format": {
        "type": "json_schema",
        "name": "equation_solution",
        "schema": {
          "type": "object",
          "properties": {
            "steps": { "type": "array", "items": { "type": "string" } },
            "final_answer": { "type": "string" }
          },
          "required": ["steps", "final_answer"],
          "additionalProperties": false
        },
        "strict": true
      }
    }
  }'
That’s the Responses-API-correct way to enforce a schema. For Chat Completions, you still use `response_format`.
Vision and multimodal (quick-start)
GPT-5 accepts text and images in one request. With the Responses API, send image parts as `{ "type": "input_image", "image_url": "<url or data URL>" }`, then put your text after the image for better results.
Supported image formats: PNG, JPEG/JPG, WEBP, non-animated GIF. Size limits: Up to 50 MB total payload per request for image bytes across parts.
Image URL example:
curl -s https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "input": [
      { "role": "user", "content": [
        { "type": "input_image", "image_url": "https://cdn.example.com/slide.jpg" },
        { "type": "input_text", "text": "Describe this slide in 5 bullets." }
      ] }
    ],
    "max_output_tokens": 250
  }'
Base64 option:
{ "type": "input_image", "image_url": "data:image/jpeg;base64,...." }
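To build that data URL from a local file, the shell’s base64 utility is enough. A sketch (the stand-in bytes keep it self-contained; point it at your real image instead):

```shell
# Stand-in bytes so the example runs anywhere; replace with your real image file.
printf 'not-a-real-jpeg' > slide.jpg

# Encode and strip newlines (GNU base64 wraps long output; macOS does not).
b64=$(base64 < slide.jpg | tr -d '\n')
data_url="data:image/jpeg;base64,${b64}"

# The result drops straight into the input_image part shown above.
echo "${data_url:0:23}"
```

Mind the 50 MB payload cap from above: Base64 inflates the bytes by roughly a third, so large images are better served from a URL.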
Pro tips
- One image per content part unless you’re explicitly comparing; caption each if multiple.
- Prefer URLs in long threads to avoid resending Base64.
- Always cap `max_output_tokens` so multimodal answers don’t run wild.
Verbosity (new)
What it does: constrains how compact or expansive the answer is without rewriting the prompt.
Values: `"low"`, `"medium"` (default), `"high"`. Set it deliberately.
When to use low: terse assistants, tool-first UX, status replies. When to use high: audits, code reviews, pedagogical explanations.
In Chat Completions, pass it top-level; in the Responses API, nest it under text:

"verbosity": "low"                (Chat Completions)
"text": { "verbosity": "low" }    (Responses API)
Reasoning effort (new)
What it does: controls how much internal reasoning the model does before responding.
Values: `"minimal"`, `"low"`, `"medium"` (default), `"high"`. `"minimal"` is new and fast for simple tasks.
- Use “minimal” for retrieval, formatting, simple transforms, low-latency UX.
- Use “high” for complex planning, multi-step refactors, ambiguous tradeoffs.
In Chat Completions, pass it top-level; in the Responses API, nest it under reasoning:

"reasoning_effort": "minimal"           (Chat Completions)
"reasoning": { "effort": "minimal" }    (Responses API)
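Putting the two controls together: a sketch that composes a Responses API body without making a network call (the prompt text is a placeholder; pipe the result to curl -d @- when you’re ready):

```shell
# In the Responses API, verbosity nests under "text" and effort under
# "reasoning"; Chat Completions takes both as top-level parameters instead.
payload=$(cat <<'EOF'
{
  "model": "gpt-5",
  "input": [
    { "role": "user", "content": [{ "type": "input_text", "text": "Summarize in one line." }] }
  ],
  "text": { "verbosity": "low" },
  "reasoning": { "effort": "minimal" },
  "max_output_tokens": 100
}
EOF
)
echo "$payload"
```

Low verbosity plus minimal effort is the fast, terse end of the dial; turn either up independently as the task demands.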
GPT-5 pricing
Model | Input (per 1M tokens) | Output (per 1M tokens)
---|---|---
gpt-5 (400K context) | $1.25 | $10.00
gpt-5-mini (400K context) | $0.25 | $2.00
gpt-5-nano (400K context) | $0.05 | $0.40
gpt-4.1 (1M context) | $2.00 | $8.00
gpt-4.1-mini (1M context) | $0.40 | $1.60
gpt-4.1-nano (1M context) | $0.10 | $0.40
Prompt-cached input is cheaper; check the official pricing and your model page for cached-input rates.
Output limits: GPT-5 can emit up to 128K tokens per call; GPT-4.1’s max output is ~32K with a ~1.0–1.05M context. If you need the absolute longest context, GPT-4.1 still has the edge; otherwise default to GPT-5.
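For a back-of-the-envelope cost check, multiply token counts by the per-million rates in the table. A sketch using the gpt-5 list prices (uncached input; the token counts are hypothetical):

```shell
input_tokens=200000
output_tokens=8000

# gpt-5 list prices: $1.25 per 1M input tokens, $10.00 per 1M output tokens.
cost=$(awk -v i="$input_tokens" -v o="$output_tokens" \
  'BEGIN { printf "%.4f", i / 1e6 * 1.25 + o / 1e6 * 10.00 }')
echo "estimated cost: \$$cost"
```

Swap in the mini or nano rates to see how far the same budget stretches on the smaller tiers.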
GPT-5 (full), mini, or nano?
- GPT-5 (full): flagship quality for deep reasoning, complex coding, long-context analysis.
- GPT-5 mini: cost-sensitive apps with crisp prompts.
- GPT-5 nano: ultra-low latency and volume workloads.
There’s also `gpt-5-chat-latest` if you want a non-reasoning chat flavor.
Did you like this article? Then, keep learning:
- How to build a ChatGPT plugin using Laravel, complementing GPT-5 API use
- Step-by-step guide to start using GPT-3.5 Turbo API for beginners
- Learn how to access and use GPT-4 Turbo's API effectively
- Understand how GPT-style AIs like GPT actually work
- Learn to use PHP with OpenAI's API and GPT for practical projects