openai-textgen

Modify records using OpenAI models

Description

openai-textgen is a Conduit processor that transforms a record based on a given prompt, using an OpenAI text generation model.

Configuration parameters

version: 2.2
pipelines:
  - id: example
    status: running
    connectors:
      # define source and destination ...
    processors:
      - id: example
        plugin: "openai-textgen"
        settings:
          # APIKey is the OpenAI API key. Required.
          # Type: string
          api_key: ""
          # BackoffFactor is the factor by which the backoff increases. Defaults
          # to 2.0
          # Type: float
          backoff_factor: "2.0"
          # DeveloperMessage is the system message that guides the model's
          # behavior. Required.
          # Type: string
          developer_message: ""
          # Field is the reference to the field to process. Defaults to
          # ".Payload.After".
          # Type: string
          field: ".Payload.After"
          # FrequencyPenalty penalizes new tokens based on frequency in text.
          # Type: float
          frequency_penalty: ""
          # InitialBackoff is the initial backoff duration in milliseconds.
          # Defaults to 1000ms (1s).
          # Type: int
          initial_backoff: "1000"
          # LogProbs is whether to return log probabilities of output tokens.
          # Type: bool
          log_probs: ""
          # LogitBias modifies the likelihood of specified tokens appearing.
          # Type: int
          logit_bias.*: ""
          # MaxBackoff is the maximum backoff duration in milliseconds. Defaults
          # to 30000ms (30s).
          # Type: int
          max_backoff: "30000"
          # MaxCompletionTokens is the maximum number of tokens for completion.
          # Type: int
          max_completion_tokens: ""
          # MaxRetries is the maximum number of retries for API calls. Defaults
          # to 3.
          # Type: int
          max_retries: "3"
          # MaxTokens is the maximum number of tokens to generate.
          # Type: int
          max_tokens: ""
          # Metadata is additional metadata to include with the request.
          # Type: string
          metadata.*: ""
          # Model is the OpenAI model to use (e.g., gpt-4o-mini). Required.
          # Type: string
          model: ""
          # N is the number of completions to generate.
          # Type: int
          n: ""
          # PresencePenalty penalizes new tokens based on presence in text.
          # Type: float
          presence_penalty: ""
          # ReasoningEffort controls the amount of reasoning in the response.
          # Type: string
          reasoning_effort: ""
          # Whether to decode the record key using its corresponding schema from
          # the schema registry.
          # Type: bool
          sdk.schema.decode.key.enabled: "true"
          # Whether to decode the record payload using its corresponding schema
          # from the schema registry.
          # Type: bool
          sdk.schema.decode.payload.enabled: "true"
          # Whether to encode the record key using its corresponding schema from
          # the schema registry.
          # Type: bool
          sdk.schema.encode.key.enabled: "true"
          # Whether to encode the record payload using its corresponding schema
          # from the schema registry.
          # Type: bool
          sdk.schema.encode.payload.enabled: "true"
          # Seed is the seed for deterministic results.
          # Type: int
          seed: ""
          # Stop are sequences where the API will stop generating.
          # Type: string
          stop: ""
          # Store is whether to store the conversation in OpenAI.
          # Type: bool
          store: ""
          # Stream is whether to stream the results or not. Not used for now.
          # Type: bool
          stream: ""
          # StrictOutput enforces strict output format. Defaults to false.
          # Type: bool
          strict_output: "false"
          # Temperature controls randomness (0-2, lower is more deterministic).
          # Type: float
          temperature: ""
          # TopLogProbs is the number of most likely tokens to return
          # probabilities for.
          # Type: int
          top_log_probs: ""
          # TopP controls diversity via nucleus sampling.
          # Type: float
          top_p: ""
          # User is the user identifier for OpenAI API.
          # Type: string
          user: ""
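
Only api_key, developer_message, and model are marked as required; the remaining settings either have defaults or can be left unset. A minimal configuration could therefore look like the sketch below (placeholder API key, hypothetical pipeline and processor IDs):

version: 2.2
pipelines:
  - id: uppercase-pipeline
    status: running
    connectors:
      # define source and destination ...
    processors:
      - id: textgen
        plugin: "openai-textgen"
        settings:
          api_key: "sk-placeholder"   # placeholder; never commit a real key
          developer_message: "Rewrite the payload in uppercase."
          model: "gpt-4o-mini"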

Examples

Transform text using OpenAI models

This example shows how to use the OpenAI text generation processor to transform a record's .Payload.After field using an OpenAI model. The processor will send the content of the field to OpenAI and replace it with the response.

In this example, we're using a system message that instructs the model to convert the input text to uppercase.

Configuration parameters

version: 2.2
pipelines:
  - id: example
    status: running
    connectors:
      # define source and destination ...
    processors:
      - id: example
        plugin: "openai-textgen"
        settings:
          api_key: "fake-api-key"
          backoff_factor: "2.0"
          developer_message: "You will receive a payload. Your task is to output back the payload in uppercase."
          field: ".Payload.After"
          initial_backoff: "1000"
          max_backoff: "30000"
          max_retries: "3"
          model: "gpt-4o-mini"
          strict_output: "false"
          temperature: "0"

Record difference

--- before
+++ after
 {
   "position": "cG9zLTE=",
   "operation": "create",
   "metadata": null,
   "key": null,
   "payload": {
     "before": null,
-    "after": "hello world"
+    "after": "HELLO WORLD"
   }
 }
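
The example above replaces the entire raw payload. If the payload is structured (for example, after being decoded via the sdk.schema.decode.payload.enabled setting), the field reference can in principle target a nested value instead of the whole payload. The snippet below is only a sketch using a hypothetical description field; adjust the reference to match your own record shape.

    processors:
      - id: summarize
        plugin: "openai-textgen"
        settings:
          api_key: "fake-api-key"
          developer_message: "Summarize the given text in one sentence."
          field: ".Payload.After.description"   # hypothetical nested field
          model: "gpt-4o-mini"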
