Completion Models

Completion models are designed for single-turn tasks that generate text from a given prompt and don't require maintaining conversational history. These models excel in applications like text summarization, code generation, and translation, where the focus is on producing accurate, relevant content in one pass rather than engaging in back-and-forth dialogue. In contrast, Chat models are optimized for interactive, multi-turn conversations and are better at understanding and generating nuanced responses within a conversational context. While both types of models can generate text, Completion models are generally better suited for tasks that don't need conversational state and context.
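
The difference shows up directly in calling shape: a completion call is stateless, while a chat session carries history between turns. As an illustration only (this page documents parameters, not a specific client), here is a minimal sketch using Google's Vertex AI Python SDK; the project and location values are placeholders.

```python
import vertexai
from vertexai.language_models import TextGenerationModel, ChatModel

# Placeholder project/location for illustration.
vertexai.init(project="my-project", location="us-central1")

# Completion: one prompt in, one response out; no state is kept between calls.
completion_model = TextGenerationModel.from_pretrained("text-bison")
print(completion_model.predict("Translate 'hello' to French.").text)

# Chat: the session object accumulates conversational history across turns.
chat_model = ChatModel.from_pretrained("chat-bison")
chat = chat_model.start_chat()
print(chat.send_message("Translate 'hello' to French.").text)
print(chat.send_message("Now to German.").text)  # relies on the prior turn
```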

Google

text-bison

Parameters

| Parameter Name | Streaming Supported | Type | Description | Min | Max | Default |
| --- | --- | --- | --- | --- | --- | --- |
| content | True | string \| string[] | Text input used to generate the model response. Prompts can include a preamble, questions, suggestions, instructions, or examples, and should be encoded as an array of strings. | — | — | — |
| temperature | True | float | Used for sampling during response generation. Controls the degree of randomness in token selection: lower temperatures result in less randomness; higher temperatures can lead to more diverse or creative results. | 0 | 1 | 0.2 |
| maxOutputTokens | True | int | Maximum number of tokens that can be generated in the response. Specify a lower value for shorter responses and a higher value for longer responses. The maximum value may be lower for certain models. | 1 | 2048 | 1024 |
| topK | True | int | Changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. | 1 | 40 | 40 |
| candidateCount | False | int | The number of response variations to return. | 1 | 8 | 1 |
| stopSequences | True | array of strings | A list of strings that tells the model to stop generating text if one of them is encountered in the response. Strings are case-sensitive. | — | — | — |
| topP | True | float | Changes how the model selects tokens for output. Tokens are selected from most probable to least probable (among the topK candidates) until the sum of their probabilities equals the top-p value. | 0 | 1 | 0.95 |
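
To make the table concrete, the sketch below passes these parameters to text-bison through the Vertex AI Python SDK (google-cloud-aiplatform). This is one illustrative client, not necessarily the one this documentation assumes; the prompt, project, and location are placeholders, and candidate_count availability can depend on the SDK version.

```python
import vertexai
from vertexai.language_models import TextGenerationModel

# Placeholder project/location for illustration.
vertexai.init(project="my-project", location="us-central1")
model = TextGenerationModel.from_pretrained("text-bison")

# Single-turn completion using the parameters documented above.
response = model.predict(
    "Summarize the following text: ...",
    temperature=0.2,          # 0..1, default 0.2
    max_output_tokens=1024,   # 1..2048, default 1024
    top_k=40,                 # 1..40, default 40
    top_p=0.95,               # 0..1, default 0.95
    stop_sequences=["\n\n"],  # case-sensitive stop strings
)
print(response.text)

# candidateCount does not support streaming (see the table), so it is omitted
# here; the other parameters above can also be passed to the streaming call.
for chunk in model.predict_streaming("Summarize the following text: ..."):
    print(chunk.text, end="")
```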