Request Headers
The authorization token (required).
Request Path
The author of the App.
The unique identifier of the App.
The version of the App.
Request Body
A JSON object containing the arguments to pass to the App.
A token that can be used to retry the request from the point of failure.
If specified, inference will sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed for some models.
Specifies the processing type used for serving the request.
Variants
If set to true, the model response data will be streamed to the client via server-sent events as it is generated.
Options for streaming response.
Properties
If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, as well as the cost, if requested.
OpenRouter accounting configuration.
Properties
Whether to include Cost in the response usage.
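Taken together, the request options above can be sketched as a JSON body. All field names here are assumptions for illustration; the actual schema defines the exact keys.

```python
import json

# Sketch of a request body. Field names ("arguments", "seed", "stream",
# "stream_options", "accounting") are hypothetical -- consult the schema
# for the real keys.
request_body = {
    "arguments": {"query": "What is 2 + 2?"},  # arguments passed to the App
    "seed": 42,                 # best-effort deterministic sampling
    "stream": True,             # stream the response via server-sent events
    "stream_options": {
        "include_usage": True,  # extra usage chunk before the data: [DONE] message
    },
    "accounting": {
        "include_cost": True,   # include Cost in the response usage
    },
}

payload = json.dumps(request_body)
```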
Response Body (Unary)
Completions returned by the App.
Items
A Multichat completion.
Properties
A unique identifier for the chat completion.
An array of choices returned by the Query Tool.
Items
The message generated by the model for this choice.
Properties
The content of the message generated by the model.
The refusal information if the model refused to generate a message.
The role of the message, which is always assistant for model-generated messages.
The annotations added by the model in this message.
Items
Properties
Properties
The end index of the citation in the message content.
The start index of the citation in the message content.
The title of the cited webpage.
The URL of the cited webpage.
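The start and end indices above delimit the cited span within the message content. A minimal sketch of extracting that span, with hypothetical field names and illustrative content:

```python
# Hypothetical citation annotation; "start_index"/"end_index" slice the
# message content to recover the cited text.
content = "Water boils at 100 C at sea level."
annotation = {
    "start_index": 15,
    "end_index": 20,
    "title": "Boiling point",
    "url": "https://example.com/boiling-point",
}

# Python slicing is end-exclusive, matching a [start, end) index convention.
cited_span = content[annotation["start_index"]:annotation["end_index"]]
```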
The audio generated by the model in this message.
Properties
The tool calls made by the model in this message.
Items
The tool call ID.
Properties
The name of the function being called.
The arguments passed to the function.
The reasoning text generated by the model in this message.
The reason why the model finished generating the response.
Variants
The model finished generating because it reached a natural stopping point.
The model finished generating because it reached the maximum token limit.
The model finished generating because it made one or more tool calls.
The model finished generating because it triggered a content filter.
The model finished generating because an error occurred.
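A client typically branches on the finish-reason variants above. A sketch, assuming the variant strings are "stop", "length", "tool_calls", "content_filter", and "error" (the actual enum values may differ):

```python
# Map each assumed finish-reason variant to a follow-up action.
def describe_finish(reason: str) -> str:
    actions = {
        "stop": "natural stopping point; response is complete",
        "length": "hit the maximum token limit; consider raising it",
        "tool_calls": "model requested tool calls; execute them and continue",
        "content_filter": "content filter triggered; revise the prompt",
        "error": "an error occurred; inspect the error object",
    }
    return actions.get(reason, "unknown finish reason")
```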
The index of the choice in the list of choices.
The log probabilities of the tokens in the message.
Properties
An array of log probabilities for each token in the content.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
An array of log probabilities for each token in the refusal.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
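The log probabilities above are natural logarithms, so exponentiating one recovers the token's probability, and summing across tokens before exponentiating gives the probability of the whole sequence. A sketch with illustrative values:

```python
import math

# Illustrative token/logprob pairs, not real model output.
token_logprobs = [
    {"token": "Hello", "logprob": -0.02},
    {"token": "!", "logprob": -1.10},
]

# Per-token probability: exp(logprob).
probs = [math.exp(t["logprob"]) for t in token_logprobs]

# Sequence probability: product of per-token probabilities,
# i.e. exp of the summed logprobs.
sequence_logprob = sum(t["logprob"] for t in token_logprobs)
sequence_prob = math.exp(sequence_logprob)
```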
If an error occurred while generating this choice, the error object.
Properties
The HTTP status code for the error.
A JSON message describing the error. Typically, either a string or an object.
The base62 22-character unique identifier for the LLM that produced this choice.
The index of the LLM in the Multichat Model that produced this choice.
Details about the chat completion which produced this choice.
Properties
A unique identifier for the chat completion.
The Unix timestamp (in seconds) when the chat completion was created.
The model used for the chat completion.
The service tier used for the chat completion.
Variants
A fingerprint representing the system configuration used for the chat completion.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
The upstream LLM provider (or that provider's own upstream provider) used for the chat completion.
The Unix timestamp (in seconds) when the chat completion was created.
The 22-character unique identifier for the Multichat Model which generated the completion.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
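The usage statistics above are related by a simple invariant: the total token count is the sum of the prompt and completion counts. A sketch of checking it, with hypothetical field names:

```python
# Illustrative usage object; "prompt_tokens", "completion_tokens",
# "total_tokens", and the nested cost keys are assumed names.
usage = {
    "prompt_tokens": 30,
    "completion_tokens": 12,
    "total_tokens": 42,
    "cost": {
        "upstream_cost": 0.0015,            # Credits charged by the upstream provider
        "upstream_upstream_cost": 0.0012,   # Credits charged further upstream
    },
}

# total = prompt + completion
consistent = usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```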
A Score completion.
Properties
A unique identifier for the chat completion.
An array of choices returned by the Score Model.
Items
The message generated by the model for this choice.
Properties
The content of the message generated by the model.
The refusal information if the model refused to generate a message.
The role of the message, which is always assistant for model-generated messages.
The annotations added by the model in this message.
Items
Properties
Properties
The end index of the citation in the message content.
The start index of the citation in the message content.
The title of the cited webpage.
The URL of the cited webpage.
The audio generated by the model in this message.
Properties
The tool calls made by the model in this message.
Items
The tool call ID.
Properties
The name of the function being called.
The arguments passed to the function.
The reasoning text generated by the model in this message.
The images generated by the model in this message.
Items
Properties
Properties
The reason why the model finished generating the response.
Variants
The model finished generating because it reached a natural stopping point.
The model finished generating because it reached the maximum token limit.
The model finished generating because it made one or more tool calls.
The model finished generating because it triggered a content filter.
The model finished generating because an error occurred.
The index of the choice in the list of choices.
The log probabilities of the tokens in the message.
Properties
An array of log probabilities for each token in the content.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
An array of log probabilities for each token in the refusal.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
The weight of the LLM that produced this choice.
The Confidence Score of the choice.
If an error occurred while generating this choice, the error object.
Properties
The HTTP status code for the error.
A JSON message describing the error. Typically, either a string or an object.
The base62 22-character unique identifier for the LLM that produced this choice.
The index of the LLM in the Score Model that produced this choice.
Details about the chat completion which produced this choice.
Properties
A unique identifier for the chat completion.
The Unix timestamp (in seconds) when the chat completion was created.
The model used for the chat completion.
The service tier used for the chat completion.
Variants
A fingerprint representing the system configuration used for the chat completion.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
The upstream LLM provider (or that provider's own upstream provider) used for the chat completion.
The Unix timestamp (in seconds) when the chat completion was created.
The 22-character unique identifier for the Score Model which generated the completion.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
Details about how the weights were computed for the Score Model.
Variants
Indicates that static weights were used for the Score Model.
Properties
Indicates that training table weights were used for the Score Model.
Properties
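Each Score choice carries a weight and a Confidence Score, as described above. A simple client might combine the two to pick a winning choice; a sketch, with hypothetical field names and illustrative values:

```python
# Illustrative Score choices; "weight" and "score" are assumed field names
# for the per-choice weight and Confidence Score.
choices = [
    {"index": 0, "weight": 0.7, "score": 0.62},
    {"index": 1, "weight": 0.3, "score": 0.95},
]

# One plausible selection rule: maximize weight * Confidence Score.
best = max(choices, key=lambda c: c["weight"] * c["score"])
```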
Properties
An array of embedding objects.
Items
An embedding vector.
Properties
The embedding vector as an array of floats.
Items
A float in the embedding vector.
The name of the model used to generate the embeddings.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
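The embedding vectors described above can be compared with cosine similarity, for example to rank candidate documents against a query embedding. The vectors here are illustrative, not real model output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = [0.1, 0.3, 0.5]
doc_vecs = [
    [0.1, 0.29, 0.52],   # nearly parallel to the query
    [0.9, -0.2, 0.1],    # mostly orthogonal
]

# Index of the most similar document vector.
best = max(range(len(doc_vecs)), key=lambda i: cosine_similarity(query_vec, doc_vecs[i]))
```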
The final output of the App.
A token that can be used to retry the request from the point of failure.
An error object containing details about any error that occurred during the request.
Properties
The HTTP status code for the error.
A JSON message describing the error. Typically, either a string or an object.
Indicates whether the app has been successfully published. Present only if requested.
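A unary response, then, carries either the App's final output or an error object, plus a retry token for resuming from the point of failure. A sketch of handling it, with hypothetical field names ("output", "error", "retry_token"):

```python
# Check the response for an error; if one occurred, return the retry token so
# a follow-up request can resume from the point of failure.
def handle_response(body: dict):
    if body.get("error") is not None:
        return ("retry", body.get("retry_token"))
    return ("ok", body.get("output"))

status, value = handle_response({"output": "42", "retry_token": "tok-abc"})
```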
Response Body (Streaming)
Completion chunks streamed by the App.
Items
A chunk of a streaming Multichat completion.
Properties
A unique identifier for the chat completion.
An array of choices returned by the Query Tool.
Items
An object containing the incremental updates to the chat message.
Properties
The content of the message delta.
The refusal reason if the model refused to generate a response.
The role of the message delta.
The tool calls made by the model in this delta.
Items
The index of the tool call in the message.
The tool call ID.
Properties
The name of the function being called.
The arguments passed to the function.
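Streamed tool calls arrive as fragments keyed by the index described above, so a client accumulates the argument text per index until the call is complete. A sketch, with field names assumed from the descriptions:

```python
# Two illustrative tool-call deltas for the same call (index 0); the function
# arguments arrive split across chunks.
deltas = [
    {"index": 0, "id": "call_1",
     "function": {"name": "get_weather", "arguments": '{"ci'}},
    {"index": 0, "function": {"arguments": 'ty": "Paris"}'}},
]

calls = {}
for d in deltas:
    call = calls.setdefault(d["index"], {"id": None, "name": None, "arguments": ""})
    if "id" in d:
        call["id"] = d["id"]
    fn = d.get("function", {})
    if fn.get("name"):
        call["name"] = fn["name"]
    call["arguments"] += fn.get("arguments", "")
```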
The reasoning text generated by the model in this delta.
The images generated by the model in this delta.
Items
Properties
Properties
The reason why the model finished generating the response.
Variants
The model finished generating because it reached a natural stopping point.
The model finished generating because it reached the maximum token limit.
The model finished generating because it made one or more tool calls.
The model finished generating because it triggered a content filter.
The model finished generating because an error occurred.
The index of the choice in the list of choices.
The log probabilities of the tokens in the delta.
Properties
An array of log probabilities for each token in the content.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
An array of log probabilities for each token in the refusal.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
If an error occurred while generating this choice, the error object.
Properties
The HTTP status code for the error.
A JSON message describing the error. Typically, either a string or an object.
The base62 22-character unique identifier for the LLM that produced this choice.
The index of the LLM in the Multichat Model that produced this choice.
Details about the chat completion which produced this choice.
Properties
A unique identifier for the chat completion.
The Unix timestamp (in seconds) when the first chat completion chunk was created.
The model used for the chat completion.
The service tier used for the chat completion chunk.
Variants
A fingerprint representing the system configuration used for the chat completion chunk.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
The upstream LLM provider (or that provider's own upstream provider) used for the chat completion chunk.
The Unix timestamp (in seconds) when the first chat completion chunk was created.
The 22-character unique identifier for the Multichat Model which generated the completion.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
A chunk of a streaming Score completion.
Properties
A unique identifier for the chat completion.
An array of choices returned by the Score Model.
Items
An object containing the incremental updates to the chat message.
Properties
The content of the message delta.
The refusal reason if the model refused to generate a response.
The role of the message delta.
The tool calls made by the model in this delta.
Items
The index of the tool call in the message.
The tool call ID.
Properties
The name of the function being called.
The arguments passed to the function.
The reasoning text generated by the model in this delta.
The images generated by the model in this delta.
Items
Properties
Properties
The reason why the model finished generating the response.
Variants
The model finished generating because it reached a natural stopping point.
The model finished generating because it reached the maximum token limit.
The model finished generating because it made one or more tool calls.
The model finished generating because it triggered a content filter.
The model finished generating because an error occurred.
The index of the choice in the list of choices.
The log probabilities of the tokens in the delta.
Properties
An array of log probabilities for each token in the content.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
An array of log probabilities for each token in the refusal.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
Items
Properties
The token text.
The byte representation of the token.
Items
A byte in the token's byte representation.
The log probability of the token.
The weight of the LLM that produced this choice.
The Confidence Score of the choice.
If an error occurred while generating this choice, the error object.
Properties
The HTTP status code for the error.
A JSON message describing the error. Typically, either a string or an object.
The base62 22-character unique identifier for the LLM that produced this choice.
The index of the LLM in the Score Model that produced this choice.
Details about the chat completion which produced this choice.
Properties
A unique identifier for the chat completion.
The Unix timestamp (in seconds) when the first chat completion chunk was created.
The model used for the chat completion.
The service tier used for the chat completion chunk.
Variants
A fingerprint representing the system configuration used for the chat completion chunk.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
The upstream LLM provider (or that provider's own upstream provider) used for the chat completion chunk.
The Unix timestamp (in seconds) when the first chat completion chunk was created.
The 22-character unique identifier for the Score Model which generated the completion.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
Details about how the weights were computed for the Score Model.
Variants
Indicates that static weights were used for the Score Model.
Properties
Indicates that training table weights were used for the Score Model.
Properties
Properties
An array of embedding objects.
Items
An embedding vector.
Properties
The embedding vector as an array of floats.
Items
A float in the embedding vector.
The name of the model used to generate the embeddings.
An object containing token usage statistics for the chat completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the input prompt.
The total number of tokens used (prompt + completion).
Properties
The number of audio tokens generated.
The number of reasoning tokens generated.
Properties
The number of audio tokens in the input prompt.
The number of cached tokens in the input prompt.
The cost incurred for this chat completion, in Credits.
Properties
The cost charged by the upstream LLM provider, in Credits.
The cost charged by the upstream LLM provider's own upstream LLM provider, in Credits.
The final output of the App.
A token that can be used to retry the request from the point of failure.
An error object containing details about any error that occurred during the request.
Properties
The HTTP status code for the error.
A JSON message describing the error. Typically, either a string or an object.
Indicates whether the app has been successfully published. Present only if requested.
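In the streaming case, each server-sent event carries a data: line holding a JSON chunk, and the stream terminates with the data: [DONE] message; the optional final usage chunk arrives just before it. A sketch of consuming such a stream, with an illustrative chunk shape (the real chunk fields follow the schema above):

```python
import json

# Sample SSE lines standing in for a live stream; the "delta"/"usage" keys
# are assumed names for illustration.
sample_stream = [
    'data: {"delta": {"content": "Hel"}}',
    'data: {"delta": {"content": "lo"}}',
    'data: {"usage": {"total_tokens": 7}}',  # final usage chunk, if requested
    'data: [DONE]',
]

pieces = []
usage = None
for line in sample_stream:
    payload = line[len("data: "):]
    if payload == "[DONE]":      # end-of-stream sentinel
        break
    chunk = json.loads(payload)
    if "usage" in chunk:
        usage = chunk["usage"]   # whole-request token statistics
    else:
        pieces.append(chunk["delta"]["content"])

text = "".join(pieces)
```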