Interface for the fields used to instantiate ChatHeroku. Extends BaseChatModelParams and includes the Heroku-specific parameters listed below.

model (optional): The model ID to use for completion (e.g., "gpt-oss-120b"), as listed in the Heroku API documentation. If not provided, defaults to process.env.INFERENCE_MODEL_ID.
temperature (optional): Controls randomness; lower values make responses more focused. Heroku API parameter.
maxTokens (optional): Maximum number of tokens the model may generate. Maps to max_tokens in the Heroku API.
stop (optional): List of strings that stop generation when encountered. Heroku API parameter.
stream (optional): Whether to stream responses. Even when true, invoke() still returns a complete response; this flag is used by stream(). Heroku API parameter.
topP (optional): Proportion of tokens to consider, by cumulative probability. Maps to top_p in the Heroku API.
apiKey (optional): Heroku Inference API key (INFERENCE_KEY), used for authentication. If not provided, the library checks the INFERENCE_KEY environment variable.
apiUrl (optional): Heroku Inference API base URL (INFERENCE_URL). If not provided, the library checks the INFERENCE_URL environment variable or falls back to a sensible Heroku default. The endpoint path is /v1/chat/completions.
maxRetries (optional): Maximum number of retries for failed requests. Standard LangChain parameter for resilience.
timeout (optional): Timeout for API requests, in milliseconds. Standard LangChain parameter for controlling request duration.
streaming (optional): Alias for stream, kept for consistency with other LangChain chat models; sets the default streaming behavior of the internal _generate method. See the streaming sketch at the end of this section.
additionalKwargs (optional): Passes other Heroku-specific parameters not explicitly defined here (e.g., extended_thinking), providing flexibility for future Heroku API additions or less common parameters. See the instantiation sketch below.
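A minimal instantiation sketch, assuming the package is published as "heroku-langchain" with a ChatHeroku export (names inferred from this page, not verified here); the extended_thinking value shown is purely illustrative.

```typescript
// Minimal sketch, not a definitive usage guide. Assumes the package name
// "heroku-langchain" and its ChatHeroku export; field names follow the list above.
import { ChatHeroku } from "heroku-langchain";

const model = new ChatHeroku({
  model: "gpt-oss-120b", // falls back to process.env.INFERENCE_MODEL_ID if omitted
  temperature: 0.2, // lower values give more focused output
  maxTokens: 512, // maps to max_tokens in the Heroku API
  stop: ["\n\n"], // generation stops at any of these strings
  topP: 0.9, // maps to top_p
  // apiKey and apiUrl may be omitted; the library then reads the
  // INFERENCE_KEY and INFERENCE_URL environment variables.
  apiKey: process.env.INFERENCE_KEY,
  apiUrl: process.env.INFERENCE_URL,
  maxRetries: 2,
  timeout: 30_000, // milliseconds
  additionalKwargs: { extended_thinking: true }, // illustrative extra Heroku parameter
});

const res = await model.invoke("Summarize what a Heroku dyno is in one sentence.");
console.log(res.content);
```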
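And a streaming sketch for the stream/streaming fields, using the standard LangChain .stream() method on the model instance from the sketch above:

```typescript
// stream() yields message chunks as they arrive; invoke() would still
// return one complete response even with stream/streaming enabled.
const chunks = await model.stream("Write a haiku about dynos.");
for await (const chunk of chunks) {
  process.stdout.write(String(chunk.content));
}
```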