Native Gemini Format
Call GravitexAI using Google Gemini native format
POST /v1beta/models/{model}:generateContent
Gemini Native (Text)
curl --request POST \
  --url 'https://api.gravitex.ai/v1beta/models/{model}:generateContent' \
  --header 'Content-Type: application/json' \
  --data '
{
  "contents": [
    {}
  ],
  "generationConfig": {},
  "systemInstruction": {},
  "safetySettings": [
    {}
  ],
  "tools": [
    {}
  ],
  "toolConfig": {},
  "cachedContent": "<string>"
}
'
Introduction
The Gemini Native API uses Google Gemini's request and response format. It is suitable for official Google clients (e.g. the google-generativeai SDK) or when you need to work directly with Gemini data structures. The API follows the Gemini specification and supports thinking mode, multimodal input, tool calling, Google Search (Grounding), context caching, image generation, and other capabilities.
If you use an OpenAI-compatible client (e.g. OpenAI SDK), use the /v1/chat/completions endpoint instead.
Difference from OpenAI format
| Aspect | Gemini Native | OpenAI-compatible (/v1/chat/completions) |
|---|---|---|
| Message structure | contents[].parts[] (text / inlineData / fileData) | messages[].content |
| Roles | user / model | user / assistant / system |
| System prompt | systemInstruction.parts | messages with role=system |
| Streaming | streamGenerateContent?alt=sse | stream: true |
| Thinking mode | generationConfig.thinkingConfig or model suffix | Model suffix (e.g. -thinking) |
API endpoints
| Feature | Method | Path |
|---|---|---|
| Text generation (non-streaming) | POST | /v1beta/models/{model}:generateContent |
| Text generation (streaming) | POST | /v1beta/models/{model}:streamGenerateContent?alt=sse |
| Single Embedding | POST | /v1beta/models/{model}:embedContent |
| Batch Embedding | POST | /v1beta/models/{model}:batchEmbedContents |
Replace {model} in the path with the actual model ID, e.g. gemini-2.5-pro, gemini-3-pro-preview.
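The paths in the endpoint table differ only in the model ID and the action after the colon, so a client can assemble them with one small helper. The following is a minimal sketch; the names `endpoint_url`, `BASE_URL`, and `ACTIONS` are illustrative assumptions and not part of any SDK.

```python
from urllib.parse import quote

BASE_URL = "https://api.gravitex.ai"  # host used by the examples on this page

# Actions listed in the endpoint table.
ACTIONS = {
    "generateContent",
    "streamGenerateContent",
    "embedContent",
    "batchEmbedContents",
}


def endpoint_url(model: str, action: str = "generateContent") -> str:
    """Return the full URL for a model ID and action (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    url = f"{BASE_URL}/v1beta/models/{quote(model, safe='')}:{action}"
    # Streaming responses are requested as server-sent events via alt=sse.
    if action == "streamGenerateContent":
        url += "?alt=sse"
    return url
```

Calling `endpoint_url("gemini-2.5-pro")` yields the non-streaming text generation URL used in the cURL examples.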
Authentication
Any of the following is supported:
- Bearer token: Authorization: Bearer sk-xxxxxxxxxx (recommended, consistent with other GravitexAI endpoints)
- Google-style API key: the header x-goog-api-key: sk-xxxxxxxxxx, or the query parameter ?key=sk-xxxxxxxxxx.
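As an illustration of the two header-based options, the sketch below builds a POST request with Python's standard library without sending it. The helper names `auth_headers` and `make_request` and the `style` values are assumptions made for this example.

```python
import json
import urllib.request


def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return request headers for one of the header-based auth styles."""
    headers = {"Content-Type": "application/json"}
    if style == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "goog":
        headers["x-goog-api-key"] = api_key
    else:
        raise ValueError(f"unknown auth style: {style}")
    return headers


def make_request(url: str, body: dict, api_key: str,
                 style: str = "bearer") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers=auth_headers(api_key, style), method="POST"
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call. The query-parameter variant would instead append the key to the URL and omit the auth header.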
Request parameters
generateContent / streamGenerateContent
- contents: List of conversation contents. Each item has role (user or model) and parts. Each part can be: {"text": "..."}, {"inlineData": {"mimeType": "...", "data": "base64..."}}, or {"fileData": {"mimeType": "...", "fileUri": "gs://..."}}.
- generationConfig: Generation config.
  - temperature: 0 to 2, randomness
  - topP: nucleus sampling
  - topK: top-K sampling
  - maxOutputTokens: max output tokens
  - stopSequences: stop sequences
  - responseMimeType: e.g. text/plain
  - responseModalities: e.g. ["TEXT"] or ["IMAGE"]
  - thinkingConfig: thinking mode (see below)
  - imageConfig: image generation config (see below)
- systemInstruction: System instruction: {"parts": [{"text": "..."}]}.
- safetySettings: Safety levels, e.g. [{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "OFF"}].
- tools: Tool declarations (function calling), see advanced features.
- toolConfig: Tool config, e.g. functionCallingConfig.mode: AUTO / ANY / NONE.
- cachedContent: Context caching ID returned by the API; used to reuse cached context.
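To show how these fields fit together, here is a minimal sketch that assembles a request body from a prompt, an optional system instruction, and two generation settings. The function name `build_body` and its parameters are hypothetical; only the JSON field names come from this page.

```python
from typing import Optional


def build_body(prompt: str,
               system: Optional[str] = None,
               temperature: Optional[float] = None,
               max_output_tokens: Optional[int] = None) -> dict:
    """Assemble a generateContent request body (illustrative helper)."""
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    if system is not None:
        # The system prompt is a top-level field, not a message with a role.
        body["systemInstruction"] = {"parts": [{"text": system}]}
    generation_config = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        generation_config["temperature"] = temperature
    if max_output_tokens is not None:
        generation_config["maxOutputTokens"] = max_output_tokens
    if generation_config:
        body["generationConfig"] = generation_config
    return body
```

Optional fields are left out entirely when unset, which keeps the payload close to the minimal examples elsewhere on this page.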
Response format
Non-streaming generateContent returns JSON:
{
  "candidates": [
    {
      "content": {
        "parts": [{"text": "Response text"}],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": []
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 20,
    "totalTokenCount": 30,
    "thoughtsTokenCount": 0,
    "cachedContentTokenCount": 0
  },
  "modelVersion": "gemini-2.5-pro",
  "createTime": "2025-01-01T00:00:00Z"
}
In streaming mode, each SSE line starts with data: and contains a JSON fragment (e.g. candidates[].content.parts).
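Reading a non-streaming response mostly means walking candidates[].content.parts and usageMetadata. The sketch below shows one way to do that on an already-decoded JSON object; `extract_text` and `token_usage` are hypothetical helper names, and taking only the first candidate is a simplifying assumption.

```python
def extract_text(response: dict) -> str:
    """Concatenate the text parts of the first candidate, or return ''."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Non-text parts (e.g. inlineData) are skipped.
    return "".join(part["text"] for part in parts if "text" in part)


def token_usage(response: dict) -> tuple:
    """Return (prompt, candidates, total) token counts, defaulting to 0."""
    usage = response.get("usageMetadata", {})
    return (
        usage.get("promptTokenCount", 0),
        usage.get("candidatesTokenCount", 0),
        usage.get("totalTokenCount", 0),
    )
```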
Basic examples
cURL (non-streaming):
curl -X POST "https://api.gravitex.ai/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Describe AI in one sentence"}]}
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1024
    }
  }'

cURL (streaming):
curl -N -X POST "https://api.gravitex.ai/v1beta/models/gemini-2.5-pro:streamGenerateContent?alt=sse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Describe AI in one sentence"}]}
    ],
    "generationConfig": {"maxOutputTokens": 1024}
  }'

Python (google-generativeai):
import google.generativeai as genai

# Point the SDK at GravitexAI instead of Google's default endpoint.
genai.configure(
    api_key="sk-xxxxxxxxxx",
    transport="rest",
    client_options={"api_endpoint": "https://api.gravitex.ai"}
)
model = genai.GenerativeModel("gemini-2.5-pro")
response = model.generate_content("Describe AI in one sentence")
print(response.text)

Node.js:
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI("sk-xxxxxxxxxx");

async function main() {
  // Recent SDK versions accept a custom baseUrl in the request options
  // (second argument); check your SDK version if this has no effect.
  const model = genAI.getGenerativeModel(
    { model: "gemini-2.5-pro" },
    { baseUrl: "https://api.gravitex.ai" }
  );
  const result = await model.generateContent("Describe AI in one sentence");
  console.log(result.response.text());
}

main().catch(console.error);
By default, google-generativeai calls Google's API. To use GravitexAI, set api_endpoint to https://api.gravitex.ai via client_options or environment variables. See your SDK docs for details.
Advanced features
Thinking mode
Supported in three ways:
- generationConfig.thinkingConfig (Gemini 2.5 Pro): use thinkingBudget (token count)
- thinkingConfig.thinkingLevel (Gemini 3 Pro): use LOW / HIGH
- Model suffix: -thinking, -thinking-8192, -nothinking, -thinking-low, -thinking-high

thinkingBudget (2.5 Pro):
{
  "contents": [{"role": "user", "parts": [{"text": "Give a geometry problem and solve it step by step"}]}],
  "generationConfig": {
    "maxOutputTokens": 8192,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingBudget": 8192
    }
  }
}

thinkingLevel (3 Pro):
{
  "contents": [{"role": "user", "parts": [{"text": "Give a geometry problem and solve it step by step"}]}],
  "generationConfig": {
    "maxOutputTokens": 8192,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingLevel": "HIGH"
    }
  }
}
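The two thinkingConfig variants above differ only in whether a token budget or a level is supplied. A small sketch of a builder follows; the name `thinking_config` is hypothetical, and treating thinkingBudget and thinkingLevel as mutually exclusive is an assumption based on each being tied to a different model family here.

```python
from typing import Optional


def thinking_config(budget: Optional[int] = None,
                    level: Optional[str] = None,
                    include_thoughts: bool = True) -> dict:
    """Build a thinkingConfig object from either a budget or a level."""
    # Assumption: exactly one of the two settings is sent per request.
    if (budget is None) == (level is None):
        raise ValueError("pass exactly one of budget or level")
    config = {"includeThoughts": include_thoughts}
    if budget is not None:
        if budget < 0:
            raise ValueError("budget must be non-negative")
        config["thinkingBudget"] = budget  # token count (2.5 Pro style)
    else:
        if level not in ("LOW", "HIGH"):
            raise ValueError("level must be LOW or HIGH")
        config["thinkingLevel"] = level  # 3 Pro style
    return config
```

The result goes under generationConfig.thinkingConfig in the request body.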
Multimodal input
Mix text and media in contents[].parts:
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {"text": "Describe this image"},
        {
          "inlineData": {
            "mimeType": "image/jpeg",
            "data": "/9j/4AAQSkZJRg..."
          }
        }
      ]
    }
  ]
}
- Image: inlineData with base64 data, or fileData with fileUri (e.g. gs://...)
- Audio: inlineData with mimeType such as audio/mp3
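The data field of inlineData carries the raw bytes encoded as base64. The sketch below shows one way to build such a part from bytes already in memory; `inline_data_part` and `image_message` are hypothetical helper names.

```python
import base64


def inline_data_part(data: bytes, mime_type: str) -> dict:
    """Wrap raw bytes as an inlineData part with base64-encoded data."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def image_message(prompt: str, image_bytes: bytes,
                  mime_type: str = "image/jpeg") -> dict:
    """Build one user turn that mixes a text part and an image part."""
    return {
        "role": "user",
        "parts": [{"text": prompt}, inline_data_part(image_bytes, mime_type)],
    }
```

In practice the bytes would typically come from reading a file in binary mode, and the same `inline_data_part` helper can be reused for audio by passing a MIME type such as audio/mp3.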
Tool calling (Function Calling)
{
  "contents": [{"role": "user", "parts": [{"text": "What is the weather in Shanghai today?"}]}],
  "tools": [
    {
      "functionDeclarations": [
        {
          "name": "get_weather",
          "description": "Get weather for a city",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      ]
    }
  ],
  "toolConfig": {
    "functionCallingConfig": {
      "mode": "AUTO",
      "allowedFunctionNames": []
    }
  }
}
When the model decides to call a function, the response contains a functionCall part; include the corresponding functionResponse in the next contents and send another request.
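The round trip for a function call can be sketched as follows: find functionCall parts in the response, run the function locally, then append both the model turn and a functionResponse turn to contents. The helper names are hypothetical. The shapes assumed here (functionCall with name and args, functionResponse with name and response, sent with role user) follow the upstream Gemini format and are assumed to pass through unchanged.

```python
def find_function_calls(response: dict) -> list:
    """Return the functionCall objects from the first candidate."""
    candidates = response.get("candidates") or []
    if not candidates:
        return []
    parts = candidates[0].get("content", {}).get("parts", [])
    return [part["functionCall"] for part in parts if "functionCall" in part]


def append_function_result(contents: list, response: dict,
                           name: str, result: dict) -> list:
    """Return a new contents list extended with the call and its result."""
    new_contents = list(contents)  # leave the caller's list untouched
    # Echo the model turn that requested the call.
    new_contents.append(response["candidates"][0]["content"])
    # Assumed shape of the reply turn carrying the function output.
    new_contents.append({
        "role": "user",
        "parts": [{"functionResponse": {"name": name, "response": result}}],
    })
    return new_contents
```

The returned list is sent as contents in the follow-up request, together with the same tools declaration.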
Google Search (Grounding)
When enabled, the model can use real-time web search to improve answers (e.g. weather, news). Add googleSearch to tools:
{
  "contents": [{"role": "user", "parts": [{"text": "What is the weather in Beijing today?"}]}],
  "tools": [
    {
      "googleSearch": {}
    }
  ],
  "toolConfig": {
    "functionCallingConfig": {
      "mode": "AUTO"
    }
  }
}
To combine search with function calling, include googleSearch: {} and functionDeclarations as separate elements in the same tools array. Responses may include retrieval metadata (e.g. groundingMetadata).
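A tools array that keeps googleSearch and functionDeclarations as separate elements could be assembled as in this sketch. The function name `build_tools` and its parameters are assumptions for illustration.

```python
from typing import Optional


def build_tools(function_declarations: Optional[list] = None,
                google_search: bool = False) -> list:
    """Build a tools array with search and functions as separate entries."""
    tools = []
    if google_search:
        tools.append({"googleSearch": {}})
    if function_declarations:
        # All declarations share a single functionDeclarations element.
        tools.append({"functionDeclarations": list(function_declarations)})
    return tools
```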
Streaming
Use: POST /v1beta/models/{model}:streamGenerateContent?alt=sse. The request body is the same as for generateContent. The response is SSE; each data: line is a JSON chunk.
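On the client side, consuming the stream means reading lines, keeping those that start with data:, and decoding the JSON after the prefix. The sketch below works on any iterable of text lines; the helper names are hypothetical, and skipping a `[DONE]` sentinel is a defensive assumption rather than documented behavior.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON object from each SSE data: line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields are ignored
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":  # assumed sentinel, may never occur
            continue
        yield json.loads(payload)


def stream_text(lines: Iterable[str]) -> str:
    """Join the text parts of the first candidate across all chunks."""
    pieces = []
    for chunk in iter_sse_chunks(lines):
        for candidate in chunk.get("candidates", [])[:1]:
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    pieces.append(part["text"])
    return "".join(pieces)
```

With an HTTP client that exposes the response body line by line, the decoded lines can be passed straight to `iter_sse_chunks`.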
Context caching
The first request does not include cachedContent. If the server returns a cache ID, subsequent requests can send:
{
  "cachedContent": "cached-content-id",
  "contents": [{"role": "user", "parts": [{"text": "Continue from the context above"}]}]
}
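A client that talks to the same context repeatedly can keep the cache ID and attach it only when one is known. This is a minimal sketch with a hypothetical helper name; how the ID is obtained from the response is not covered here and is left to the caller.

```python
from typing import Optional


def with_cached_content(body: dict, cache_id: Optional[str] = None) -> dict:
    """Return a copy of body with cachedContent set or removed."""
    new_body = dict(body)  # shallow copy so the original stays unchanged
    if cache_id:
        new_body["cachedContent"] = cache_id
    else:
        # First request: no cache ID yet, so the field is omitted.
        new_body.pop("cachedContent", None)
    return new_body
```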
Image generation (e.g. Gemini 2.5 Flash)
When the model supports image output, set the following in generationConfig:
{
  "contents": [{"role": "user", "parts": [{"text": "Draw a cat"}]}],
  "generationConfig": {
    "responseModalities": ["IMAGE"],
    "imageConfig": {
      "aspectRatio": "1:1",
      "imageSize": "1K",
      "imageOutputOptions": {"mimeType": "image/png"}
    }
  }
}
In the response, candidates[].content.parts may include inlineData (e.g. a base64 image).
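Generated images arrive as base64 inside inlineData parts, so a client has to decode them before saving. The sketch below collects them from a decoded response; `extract_images` is a hypothetical name, and the assumption is that image parts mirror the inlineData shape used for input (mimeType plus data).

```python
import base64


def extract_images(response: dict) -> list:
    """Return a list of (mime_type, raw_bytes) for every inlineData part."""
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                raw = base64.b64decode(inline["data"])
                images.append((inline.get("mimeType", ""), raw))
    return images
```

Each returned byte string can be written to disk in binary mode with an extension matching its MIME type.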
Embedding API
Single: embedContent
Endpoint: POST https://api.gravitex.ai/v1beta/models/{model}:embedContent
Request body example:
{
  "model": "text-embedding-004",
  "content": {
    "parts": [{"text": "Text to embed"}]
  }
}
Alternatively, put the model in the path: /v1beta/models/text-embedding-004:embedContent, with the body containing only content.
Batch: batchEmbedContents
Endpoint: POST https://api.gravitex.ai/v1beta/models/{model}:batchEmbedContents
Request body example:
{
  "requests": [
    {"content": {"parts": [{"text": "First text"}]}},
    {"content": {"parts": [{"text": "Second text"}]}}
  ]
}
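Both embedding request bodies can be generated from plain strings, as in this sketch. The helper names `embed_body` and `batch_embed_body` are hypothetical; the JSON shapes are the ones shown in the examples above.

```python
from typing import Iterable


def embed_body(text: str) -> dict:
    """Body for embedContent when the model is given in the path."""
    return {"content": {"parts": [{"text": text}]}}


def batch_embed_body(texts: Iterable[str]) -> dict:
    """Body for batchEmbedContents: one request entry per input text."""
    return {"requests": [embed_body(text) for text in texts]}
```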
Error handling
Errors are returned as an HTTP status code and a JSON body, for example:
{
  "error": {
    "code": 400,
    "message": "Invalid request: ...",
    "status": "INVALID_ARGUMENT"
  }
}
| Status | Meaning |
|---|---|
| 400 | Invalid request (e.g. missing contents, unsupported parameter) |
| 401 | Authentication failed (invalid or missing API key) |
| 404 | Model not found or wrong path |
| 429 | Rate limited; retry later |
| 500 | Server error |
Check error.message in your client and handle retries or user messaging accordingly.
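One possible retry policy is sketched below: retry on 429 and 500 with exponential backoff, and fail immediately on other error codes with the server's error.message. The function names, the choice to also retry 500, and the backoff schedule are assumptions for illustration. The `send` callable stands in for whatever HTTP client is used and must return a (status, decoded JSON body) pair.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {429, 500}  # assumed policy: rate limits and server errors


def error_message(body: dict) -> str:
    """Pull error.message out of an error body, or return ''."""
    return body.get("error", {}).get("message", "")


def call_with_retries(send: Callable[[], Tuple[int, dict]],
                      max_attempts: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send() until it succeeds or retries are exhausted."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts:
            raise RuntimeError(f"HTTP {status}: {error_message(body)}")
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.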
Comparison with OpenAI format
| Item | Gemini Native | OpenAI (/v1/chat/completions) |
|---|---|---|
| Base path | /v1beta/models/{model}:generateContent | /v1/chat/completions |
| Auth | Authorization: Bearer sk-xxx or x-goog-api-key | Authorization: Bearer sk-xxx |
| Message format | contents[].parts[] (text/inlineData/fileData) | messages[].content (string or array) |
| System prompt | systemInstruction.parts | messages with role: "system" |
| Streaming | streamGenerateContent?alt=sse | stream: true |
| Thinking | thinkingConfig or model suffix | Model suffix (e.g. -thinking) |
| Tools | tools[].functionDeclarations | tools[].function (OpenAI shape) |
| Typical clients | Google SDK, custom HTTP client | OpenAI SDK, OpenAI-compatible clients |
Use the Gemini native format when you need Gemini-specific features (e.g. thinkingConfig, native multimodal parts). Use /v1/chat/completions when you want to stay within the OpenAI ecosystem.