openai-textgen
Modify records using OpenAI models
Description
openai-textgen is a Conduit processor that transforms a record based on a given prompt using an OpenAI model.
Configuration parameters
```yaml
version: 2.2
pipelines:
  - id: example
    status: running
    connectors:
      # define source and destination ...
    processors:
      - id: example
        plugin: "openai-textgen"
        settings:
          # APIKey is the OpenAI API key. Required.
          # Type: string
          api_key: ""
          # BackoffFactor is the factor by which the backoff increases. Defaults
          # to 2.0.
          # Type: float
          backoff_factor: "2.0"
          # DeveloperMessage is the system message that guides the model's
          # behavior. Required.
          # Type: string
          developer_message: ""
          # Field is the reference to the field to process. Defaults to
          # ".Payload.After".
          # Type: string
          field: ".Payload.After"
          # FrequencyPenalty penalizes new tokens based on frequency in text.
          # Type: float
          frequency_penalty: ""
          # InitialBackoff is the initial backoff duration in milliseconds.
          # Defaults to 1000ms (1s).
          # Type: int
          initial_backoff: "1000"
          # LogProbs is whether to return log probabilities of output tokens.
          # Type: bool
          log_probs: ""
          # LogitBias modifies the likelihood of specified tokens appearing.
          # Type: int
          logit_bias.*: ""
          # MaxBackoff is the maximum backoff duration in milliseconds. Defaults
          # to 30000ms (30s).
          # Type: int
          max_backoff: "30000"
          # MaxCompletionTokens is the maximum number of tokens for completion.
          # Type: int
          max_completion_tokens: ""
          # MaxRetries is the maximum number of retries for API calls. Defaults
          # to 3.
          # Type: int
          max_retries: "3"
          # MaxTokens is the maximum number of tokens to generate.
          # Type: int
          max_tokens: ""
          # Metadata is additional metadata to include with the request.
          # Type: string
          metadata.*: ""
          # Model is the OpenAI model to use (e.g., gpt-4o-mini). Required.
          # Type: string
          model: ""
          # N is the number of completions to generate.
          # Type: int
          n: ""
          # PresencePenalty penalizes new tokens based on presence in text.
          # Type: float
          presence_penalty: ""
          # ReasoningEffort controls the amount of reasoning in the response.
          # Type: string
          reasoning_effort: ""
          # Whether to decode the record key using its corresponding schema from
          # the schema registry.
          # Type: bool
          sdk.schema.decode.key.enabled: "true"
          # Whether to decode the record payload using its corresponding schema
          # from the schema registry.
          # Type: bool
          sdk.schema.decode.payload.enabled: "true"
          # Whether to encode the record key using its corresponding schema from
          # the schema registry.
          # Type: bool
          sdk.schema.encode.key.enabled: "true"
          # Whether to encode the record payload using its corresponding schema
          # from the schema registry.
          # Type: bool
          sdk.schema.encode.payload.enabled: "true"
          # Seed is the seed for deterministic results.
          # Type: int
          seed: ""
          # Stop are sequences where the API will stop generating.
          # Type: string
          stop: ""
          # Store is whether to store the conversation in OpenAI.
          # Type: bool
          store: ""
          # Stream is whether to stream the results or not. Not used for now.
          # Type: bool
          stream: ""
          # StrictOutput enforces strict output format. Defaults to false.
          # Type: bool
          strict_output: "false"
          # Temperature controls randomness (0-2, lower is more deterministic).
          # Type: float
          temperature: ""
          # TopLogProbs is the number of most likely tokens to return
          # probabilities for.
          # Type: int
          top_log_probs: ""
          # TopP controls diversity via nucleus sampling.
          # Type: float
          top_p: ""
          # User is the user identifier for OpenAI API.
          # Type: string
          user: ""
```
Name | Type | Default | Description |
---|---|---|---|
api_key | string | null | APIKey is the OpenAI API key. Required. |
backoff_factor | float | 2.0 | BackoffFactor is the factor by which the backoff increases. Defaults to 2.0 |
developer_message | string | null | DeveloperMessage is the system message that guides the model's behavior. Required. |
field | string | .Payload.After | Field is the reference to the field to process. Defaults to ".Payload.After". |
frequency_penalty | float | null | FrequencyPenalty penalizes new tokens based on frequency in text. |
initial_backoff | int | 1000 | InitialBackoff is the initial backoff duration in milliseconds. Defaults to 1000ms (1s). |
log_probs | bool | null | LogProbs is whether to return log probabilities of output tokens. |
logit_bias.* | int | null | LogitBias modifies the likelihood of specified tokens appearing. |
max_backoff | int | 30000 | MaxBackoff is the maximum backoff duration in milliseconds. Defaults to 30000ms (30s). |
max_completion_tokens | int | null | MaxCompletionTokens is the maximum number of tokens for completion. |
max_retries | int | 3 | MaxRetries is the maximum number of retries for API calls. Defaults to 3. |
max_tokens | int | null | MaxTokens is the maximum number of tokens to generate. |
metadata.* | string | null | Metadata is additional metadata to include with the request. |
model | string | null | Model is the OpenAI model to use (e.g., gpt-4o-mini). Required. |
n | int | null | N is the number of completions to generate. |
presence_penalty | float | null | PresencePenalty penalizes new tokens based on presence in text. |
reasoning_effort | string | null | ReasoningEffort controls the amount of reasoning in the response. |
sdk.schema.decode.key.enabled | bool | true | Whether to decode the record key using its corresponding schema from the schema registry. |
sdk.schema.decode.payload.enabled | bool | true | Whether to decode the record payload using its corresponding schema from the schema registry. |
sdk.schema.encode.key.enabled | bool | true | Whether to encode the record key using its corresponding schema from the schema registry. |
sdk.schema.encode.payload.enabled | bool | true | Whether to encode the record payload using its corresponding schema from the schema registry. |
seed | int | null | Seed is the seed for deterministic results. |
stop | string | null | Stop are sequences where the API will stop generating. |
store | bool | null | Store is whether to store the conversation in OpenAI. |
stream | bool | null | Stream is whether to stream the results or not. Not used for now. |
strict_output | bool | false | StrictOutput enforces strict output format. Defaults to false. |
temperature | float | null | Temperature controls randomness (0-2, lower is more deterministic). |
top_log_probs | int | null | TopLogProbs is the number of most likely tokens to return probabilities for. |
top_p | float | null | TopP controls diversity via nucleus sampling. |
user | string | null | User is the user identifier for OpenAI API. |
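The retry settings describe an exponential backoff for failed API calls: the delay starts at initial_backoff milliseconds, grows by backoff_factor after each attempt, is capped at max_backoff, and at most max_retries retries are made. The Go sketch below only illustrates the schedule these defaults imply; it is not the plugin's actual retry code, and the helper name is invented for this example.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// backoffSchedule is a hypothetical helper showing how an exponential backoff
// schedule can be derived from the processor's retry settings.
func backoffSchedule(initialMs, maxMs int, factor float64, maxRetries int) []time.Duration {
	delays := make([]time.Duration, 0, maxRetries)
	for attempt := 0; attempt < maxRetries; attempt++ {
		// Grow the delay by `factor` on each attempt, capped at maxMs.
		d := float64(initialMs) * math.Pow(factor, float64(attempt))
		d = math.Min(d, float64(maxMs))
		delays = append(delays, time.Duration(d)*time.Millisecond)
	}
	return delays
}

func main() {
	// Defaults: initial_backoff=1000, max_backoff=30000, backoff_factor=2.0, max_retries=3.
	fmt.Println(backoffSchedule(1000, 30000, 2.0, 3)) // [1s 2s 4s]
}
```

With the defaults, the waits come out to 1s, 2s, and 4s, well under the 30s cap set by max_backoff.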
Examples
Transform text using OpenAI models
This example shows how to use the OpenAI text generation processor to transform a record's .Payload.After field using an OpenAI model. The processor sends the content of the field to OpenAI and replaces it with the response. In this example, the system message instructs the model to convert the input text to uppercase.
Configuration parameters
```yaml
version: 2.2
pipelines:
  - id: example
    status: running
    connectors:
      # define source and destination ...
    processors:
      - id: example
        plugin: "openai-textgen"
        settings:
          api_key: "fake-api-key"
          backoff_factor: "2.0"
          developer_message: "You will receive a payload. Your task is to output back the payload in uppercase."
          field: ".Payload.After"
          initial_backoff: "1000"
          max_backoff: "30000"
          max_retries: "3"
          model: "gpt-4o-mini"
          strict_output: "false"
          temperature: "0"
```
Name | Value |
---|---|
api_key | fake-api-key |
backoff_factor | 2.0 |
developer_message | You will receive a payload. Your task is to output back the payload in uppercase. |
field | .Payload.After |
initial_backoff | 1000 |
max_backoff | 30000 |
max_retries | 3 |
model | gpt-4o-mini |
strict_output | false |
temperature | 0 |
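Conceptually, the processor builds one chat completion request per record: developer_message provides the system instruction and the content of the configured field becomes the user message. The Go sketch below shows what an equivalent request looks like when made directly against OpenAI's /v1/chat/completions HTTP endpoint with this example's settings; it is an illustration under that assumption, not the plugin's own code, and error handling is kept minimal.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

func main() {
	payload := "hello world" // stands in for the record's .Payload.After content

	// Build a chat completion request equivalent to this example's settings.
	body, _ := json.Marshal(map[string]any{
		"model": "gpt-4o-mini",
		"messages": []map[string]string{
			// developer_message guides the model's behavior.
			{"role": "system", "content": "You will receive a payload. Your task is to output back the payload in uppercase."},
			// The configured field's content is sent as the user message.
			{"role": "user", "content": payload},
		},
		"temperature": 0,
	})

	req, _ := http.NewRequest("POST", "https://api.openai.com/v1/chat/completions", bytes.NewReader(body))
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY")) // api_key setting
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	if len(out.Choices) == 0 {
		panic("no completion returned")
	}
	// The processor replaces the field with the model's response.
	fmt.Println(out.Choices[0].Message.Content)
}
```

Run against the payload "hello world", this should print something like "HELLO WORLD", which is what the record difference below shows.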
Record difference
```diff
 {
   "position": "cG9zLTE=",
   "operation": "create",
   "metadata": null,
   "key": null,
   "payload": {
     "before": null,
-    "after": "hello world"
+    "after": "HELLO WORLD"
   }
 }
```