/chat/completions - Sionic Opengateway

Request

Authorization

Provide your bearer token in the Authorization header when making requests to protected resources.

Example:

Authorization: Bearer ********************

Header Params

x-opengateway-failover-sequence

string <json>

required

Specifies the failover order.

Default:

["openai","azure"]

x-opengateway-vllm-endpoint

string

optional

Endpoint to use when calling vLLM.

x-opengateway-vllm-api-key

string

optional

API key to use when calling vLLM.

x-opengateway-timeout

string

optional

Timeout for receiving the first chunk after the API call is made (unit: ms).
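A minimal sketch of assembling these headers in Python. The helper name `build_headers` and its arguments are hypothetical; only the header names and the default failover sequence come from the list above. It assumes the failover sequence is sent as a JSON-encoded string (the field is typed `string <json>`) and the timeout as a decimal string of milliseconds.

```python
import json

def build_headers(token, failover=("openai", "azure"), timeout_ms=None,
                  vllm_endpoint=None, vllm_api_key=None):
    """Assemble request headers using only the documented header names."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Typed `string <json>` above, so the list is JSON-encoded (assumption).
        "x-opengateway-failover-sequence": json.dumps(
            list(failover), separators=(",", ":")),
    }
    # Optional headers are added only when a value is supplied.
    if timeout_ms is not None:
        headers["x-opengateway-timeout"] = str(int(timeout_ms))
    if vllm_endpoint is not None:
        headers["x-opengateway-vllm-endpoint"] = vllm_endpoint
    if vllm_api_key is not None:
        headers["x-opengateway-vllm-api-key"] = vllm_api_key
    return headers

headers = build_headers("YOUR_TOKEN", timeout_ms=5000)
```

Leaving optional headers out entirely, rather than sending them empty, keeps the request free of values the gateway would have to interpret.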

Body Params application/json

model

string

required

ID of the model to use. See the compatibility table for details on which models work.

messages

array [object {5}]

required

The list of messages that make up the conversation so far.

role

enum<string>

required

Following OpenAI, the role can be system, user, assistant, or tool.

Allowed values:

system, user, assistant, tool

content

string

optional

Following OpenAI, required when role is system, user, or tool; optional when role is assistant.

name

string

optional

Following OpenAI, optional when role is system, user, or assistant.

tool_calls

string

optional

Following OpenAI, optional when role is assistant.

tool_call_id

string

optional

Following OpenAI, required when role is tool.
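The per-role rules for message fields can be checked on the client before a request is sent. The sketch below is illustrative only: `validate_message` is a hypothetical helper, and it encodes just the required-field rules stated above, not whatever validation the gateway itself performs.

```python
ROLES = ("system", "user", "assistant", "tool")

def validate_message(msg):
    """Return a list of problems for one message, per the role rules above."""
    role = msg.get("role")
    if role not in ROLES:
        return [f"unknown role: {role!r}"]
    problems = []
    # content is required for system, user, and tool; optional for assistant.
    if role in ("system", "user", "tool") and "content" not in msg:
        problems.append(f"content is required for role {role}")
    # tool_call_id is required for tool messages.
    if role == "tool" and "tool_call_id" not in msg:
        problems.append("tool_call_id is required for role tool")
    return problems

print(validate_message({"role": "tool", "content": "42"}))
```

An assistant message without `content` passes, since that field is optional for the assistant role.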

frequency_penalty

integer <int32>

optional

A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim.

logit_bias

string <json>

optional

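Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect varies per model, but values between -1 and 1 should decrease or increase the likelihood of selection, while values like -100 or 100 should result in a ban or exclusive selection of the relevant token.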
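To make the "bias is added to the logits" statement concrete, here is a small numeric sketch. The token IDs and logit values are made up, and `apply_logit_bias` is a hypothetical helper; it shows the arithmetic only, not how any backend implements it or how the mapping is serialized in the request.

```python
import math

def softmax(logits):
    """Convert a token -> logit mapping into probabilities."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def apply_logit_bias(logits, logit_bias):
    """Add each bias to the matching token's logit; keys are token IDs."""
    for bias in logit_bias.values():
        if not -100 <= bias <= 100:
            raise ValueError("bias must be between -100 and 100")
    return {t: v + logit_bias.get(t, 0) for t, v in logits.items()}

logits = {"101": 2.0, "102": 1.0, "103": 0.5}  # made-up token IDs and logits
before = softmax(logits)
banned = softmax(apply_logit_bias(logits, {"101": -100}))
print(before["101"], banned["101"])
```

With a bias of -100, token `101` goes from the most likely choice to a probability that is effectively zero, which is the "ban" behaviour described above.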

logprobs

boolean

optional

Whether to return log probabilities of the output tokens. If true, returns the log probability of each output token returned in the message content.

top_logprobs

integer <int32>

optional

An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
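The dependency between `top_logprobs` and `logprobs` is easy to miss, so a client-side guard can help. This is a sketch under the rules stated above; `check_logprobs` is a hypothetical name, and the gateway's own error behaviour is not described here.

```python
def check_logprobs(params):
    """Raise ValueError if top_logprobs is out of range or used without logprobs."""
    top = params.get("top_logprobs")
    if top is None:
        return
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(top, bool) or not isinstance(top, int) or not 0 <= top <= 20:
        raise ValueError("top_logprobs must be an integer between 0 and 20")
    if params.get("logprobs") is not True:
        raise ValueError("top_logprobs requires logprobs to be true")

check_logprobs({"logprobs": True, "top_logprobs": 5})  # passes silently
```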

max_tokens

integer <int32>

optional

The maximum number of tokens that can be generated in the completion.

n

integer <int32>

optional

How many chat completion choices to generate for each input message. You are charged based on the number of generated tokens across all choices. Keep n at 1 to minimize cost.

presence_penalty

integer <int32>

optional

A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood of talking about new topics.
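The difference between `frequency_penalty` and `presence_penalty` is easiest to see numerically. The sketch below follows the adjustment described in OpenAI's documentation (a per-occurrence penalty plus a one-time penalty for any token already seen); whether every backend behind the gateway applies exactly this formula is an assumption, and the tokens and logits are made up.

```python
from collections import Counter

def penalize(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logit of each token according to how it was used so far."""
    counts = Counter(generated)
    return {
        tok: logit
        - counts[tok] * frequency_penalty                       # scales with count
        - (1.0 if counts[tok] > 0 else 0.0) * presence_penalty  # applied once
        for tok, logit in logits.items()
    }

adjusted = penalize(
    {"the": 3.0, "cat": 2.0, "sat": 1.0},
    generated=["the", "the", "cat"],
    frequency_penalty=0.5,
    presence_penalty=1.0,
)
print(adjusted)
```

Here "the" (seen twice) loses 2.0, "cat" (seen once) loses 1.5, and "sat" (not yet seen) is unchanged.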

response_format

string <json>

optional

An object specifying the format that the model must output.

seed

integer <int32>

optional

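This feature is in beta. If specified, the system makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed; refer to the system_fingerprint response parameter to monitor changes in the backend.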

service_tier

string

optional

Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service.

stop

array[string]

optional

Up to 4 sequences at which the API stops generating further tokens.

stream

boolean

optional

If set, partial message deltas are sent, as in ChatGPT.

stream_options

string <json>

optional

Options for the streaming response. Set this only when stream is true.
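When `stream` is set, the client has to reassemble the partial deltas into a full message. The sketch below assumes the chunks follow OpenAI's streaming shape (`choices[].delta.content`); this page does not document the chunk format, so treat that shape, the sample chunks, and the helper name `join_deltas` as assumptions.

```python
def join_deltas(chunks):
    """Concatenate the content fragments of choice 0 across streamed chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # this sketch only follows the first choice
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)

# Hand-written chunks in the assumed OpenAI-style shape.
sample_chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant"}}]},
    {"choices": [{"index": 0, "delta": {"content": "Hel"}}]},
    {"choices": [{"index": 0, "delta": {"content": "lo"}}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
]
print(join_deltas(sample_chunks))
```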

temperature

integer <int32>

optional

A value between 0 and 2. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic.
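The effect of `temperature` can be illustrated with the standard temperature-scaled softmax, in which logits are divided by the temperature before normalization. This is a textbook sketch with made-up logits, not a description of how any particular backend samples.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return probabilities after dividing each logit by the temperature."""
    if temperature <= 0:
        raise ValueError("this sketch needs temperature > 0")
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up values
focused = softmax_with_temperature(logits, 0.2)
spread = softmax_with_temperature(logits, 0.8)
print(focused[0], spread[0])
```

At 0.2 nearly all probability lands on the top token; at 0.8 the lower-ranked tokens keep a noticeable share.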

top_p

integer <int32>

optional

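An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.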
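Nucleus sampling can be sketched as "keep the most likely tokens until their cumulative probability reaches `top_p`, then renormalize". The function name `nucleus` and the probabilities below are made up for illustration; real backends may differ in tie-breaking and edge cases.

```python
def nucleus(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    kept, mass = {}, 0.0
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    for tok, p in ranked:
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {tok: p / mass for tok, p in kept.items()}

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(nucleus(probs, 0.1))
print(sorted(nucleus(probs, 0.7)))
```

With `top_p` at 0.1 only the single most likely token survives; at 0.7 the two most likely tokens are kept.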

tools

array [object {2}]

optional

A list of tools the model may call.

type

string

required

The type of the tool. Currently, only function is supported.

function

object

required
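A sketch of building one `tools` entry in Python. The nested field names (`description`, `name`, `parameters`, `strict`) are taken from the request example on this page; the helper `function_tool`, the `get_weather` tool, and its `city` argument are hypothetical, and treating `parameters` as a JSON Schema object is an assumption based on OpenAI's convention.

```python
import json

def function_tool(name, description, properties, required=()):
    """Build one `tools` entry; `type` is fixed to "function"."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(required),
                "additionalProperties": False,
            },
            "strict": False,
        },
    }

# `get_weather` and its `city` argument are made up for illustration.
tool = function_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {"city": {"type": "string"}},
    required=["city"],
)
body_fragment = json.dumps({"tools": [tool]}, indent=2)
print(body_fragment)
```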

Example

{
    "model": "string",
    "messages": [
        {
            "role": "system",
            "content": "string",
            "name": "string",
            "tool_calls": "string",
            "tool_call_id": "string"
        }
    ],
    "frequency_penalty": 0,
    "logit_bias": "string",
    "logprobs": true,
    "top_logprobs": 0,
    "max_tokens": 0,
    "n": 0,
    "presence_penalty": 0,
    "response_format": "string",
    "seed": 0,
    "service_tier": "string",
    "stop": [
        "string"
    ],
    "stream": true,
    "stream_options": "string",
    "temperature": 0,
    "top_p": 0,
    "tools": [
        {
            "type": "string",
            "function": {
                "description": "string",
                "name": "string",
                "parameters": {
                    "type": "string",
                    "properties": {},
                    "required": [
                        "string"
                    ],
                    "additionalProperties": true
                },
                "strict": false
            }
        }
    ]
}

Request samples

Shell

curl --location --request POST '/chat/completions' \
--header 'Authorization: Bearer ********************' \
--header 'x-opengateway-failover-sequence: ["openai","azure"]' \
--header 'x-opengateway-vllm-endpoint;' \
--header 'x-opengateway-vllm-api-key;' \
--header 'x-opengateway-timeout;' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "string",
    "messages": [
        {
            "role": "system",
            "content": "string",
            "name": "string",
            "tool_calls": "string",
            "tool_call_id": "string"
        }
    ],
    "frequency_penalty": 0,
    "logit_bias": "string",
    "logprobs": true,
    "top_logprobs": 0,
    "max_tokens": 0,
    "n": 0,
    "presence_penalty": 0,
    "response_format": "string",
    "seed": 0,
    "service_tier": "string",
    "stop": [
        "string"
    ],
    "stream": true,
    "stream_options": "string",
    "temperature": 0,
    "top_p": 0,
    "tools": [
        {
            "type": "string",
            "function": {
                "description": "string",
                "name": "string",
                "parameters": {
                    "type": "string",
                    "properties": {},
                    "required": [
                        "string"
                    ],
                    "additionalProperties": true
                },
                "strict": false
            }
        }
    ]
}'
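For comparison, the same request can be prepared with Python's standard library. This is a sketch, not an official client: `BASE_URL`, `YOUR_TOKEN`, and `YOUR_MODEL_ID` are placeholders because the host and model names are not given on this page, and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://gateway.example.com"  # placeholder; the real host is not given here

def build_request(token, body, failover=("openai", "azure")):
    """Prepare a POST to /chat/completions without sending it."""
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "x-opengateway-failover-sequence": json.dumps(list(failover)),
        },
    )

req = build_request("YOUR_TOKEN", {
    "model": "YOUR_MODEL_ID",
    "messages": [{"role": "user", "content": "Hello"}],
})
# Sending is left commented out because BASE_URL is a placeholder:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.status, json.load(resp))
```

Only `model` and `messages` are included in the body, since they are the two required body parameters.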

Responses

🟢 201 Created

application/json

201

Body

object {0}

Example

{}

🟠 400 Bad Request