OllamaCLI
Execute Ollama commands to interact with LLM models.
type: "io.kestra.plugin.ollama.cli.OllamaCLI"
Pull a model and generate text completion
id: ollama_flow
namespace: company.team

tasks:
  - id: ollama_cli
    type: io.kestra.plugin.ollama.cli.OllamaCLI
    commands:
      - ollama pull llama2
      - ollama run llama2 "Tell me a joke about AI" > completion.txt
    outputFiles:
      - completion.txt
List available models and output as JSON
id: ollama_list_models
namespace: company.team

tasks:
  - id: list_models
    type: io.kestra.plugin.ollama.cli.OllamaCLI
    commands:
      - ollama list --format json > models.json
    outputFiles:
      - models.json
The commands to run.
ollama/ollama
The task runner container image.
Defaults to 'ollama/ollama' for Ollama operations.
Additional environment variables for the current process.
The files to create on the local filesystem. It can be a map or a JSON object.
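A small sketch combining the properties above; the property names containerImage, env, and inputFiles are assumed from the descriptions and may differ in your plugin version, and the commands are assumed to run through a shell:

  - id: ollama_with_inputs
    type: io.kestra.plugin.ollama.cli.OllamaCLI
    containerImage: ollama/ollama            # default image
    env:
      OLLAMA_HOST: "127.0.0.1:11434"         # illustrative variable only
    inputFiles:
      prompt.txt: |
        Summarize the following text in one sentence.
    commands:
      - ollama pull llama2
      - ollama run llama2 "$(cat prompt.txt)" > summary.txt
    outputFiles:
      - summary.txt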
Inject namespace files.
Inject namespace files to this task. When enabled, it will, by default, load all namespace files into the working directory. However, you can use the include or exclude properties to limit which namespace files will be injected.
The files from the local filesystem to send to Kestra's internal storage.
Must be a list of glob expressions relative to the current working directory. Some examples: my-dir/**, my-dir/*/**, or my-dir/my-file.txt.
The task runner to use.
Task runners are provided by plugins, and each task runner has its own set of properties.
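As an illustration, here is a hedged sketch of overriding the task runner on this task with the Docker task runner (type io.kestra.plugin.scripts.runner.docker.Docker, referenced later in this document); the pullPolicy and cpu property names are assumptions based on the descriptions below:

  - id: ollama_cli
    type: io.kestra.plugin.ollama.cli.OllamaCLI
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      pullPolicy: IF_NOT_PRESENT   # assumed property name, see the pull policy description below
      cpu:
        cpus: 2                    # assumed nesting, see the CPU limit description below
    commands:
      - ollama pull llama2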
0
The exit code of the entire flow execution.
The output files' URIs in Kestra's internal storage.
The value extracted from the output of the executed commands.
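A hedged sketch of how a downstream task could reference these outputs; the Log task type io.kestra.plugin.core.log.Log and the exact output expressions are assumptions based on common Kestra conventions:

  - id: print_completion
    type: io.kestra.plugin.core.log.Log
    message: "Completion stored at {{ outputs.ollama_cli.outputFiles['completion.txt'] }}"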
busybox
The image used for the file sidecar container.
The maximum amount of CPU resources a container can use.
Make sure to set it to a numeric value, e.g. cpus: "1.5" or cpus: "4". For instance, if the host machine has two CPUs and you set cpus: "1.5", the container is guaranteed at most one and a half of the CPUs.
The registry authentication.
The auth field is a base64-encoded authentication string of username:password or a token.
The identity token.
The registry password.
The registry URL.
If not defined, the registry will be extracted from the image name.
The registry token.
The registry username.
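For example, a hedged sketch of registry authentication on the Docker task runner; the credentials property name and its nesting are assumptions based on the fields described above:

    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      credentials:                       # assumed property name
        registry: registry.example.com   # hypothetical private registry
        username: "{{ secret('REGISTRY_USERNAME') }}"
        password: "{{ secret('REGISTRY_PASSWORD') }}"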
The ARM resource ID of the user assigned identity.
Extra boot disk size for each task.
The milliCPU count.
Defines the amount of CPU resources per task in milliCPU units. For example, 1000 corresponds to 1 vCPU per task. If undefined, the default value is 2000.

If you also define the VM's machine type using the machineType property in the InstancePolicy field or inside the instanceTemplate in the InstancePolicyOrTemplate field, make sure the CPU resources for both fields are compatible with each other and with how many tasks you want to allow to run on the same VM at the same time.

For example, if you specify the n2-standard-2 machine type, which has 2 vCPUs, you can set the cpu to no more than 2000. Alternatively, you can run two tasks on the same VM if you set the cpu to 1000 or less.
Memory in MiB.
Defines the amount of memory per task in MiB units. If undefined, the default value is 2048.

If you also define the VM's machine type using the machineType property in the InstancePolicy field or inside the instanceTemplate in the InstancePolicyOrTemplate field, make sure the memory resources for both fields are compatible with each other and with how many tasks you want to allow to run on the same VM at the same time.

For example, if you specify the n2-standard-2 machine type, which has 8 GiB of memory, you can set the memory to no more than 8192.
true
Whether to enable namespace files to be loaded into the working directory. If explicitly set to true in a task, it will load all Namespace Files into the task's working directory. Note that this property is set to true by default, so you can specify only the include and exclude properties to filter the files to load without having to explicitly set enabled to true.
A list of filters to exclude matching glob patterns. This allows you to exclude a subset of the Namespace Files from being downloaded at runtime. You can combine this property together with include to only inject a subset of files that you need into the task's working directory.
OVERWRITE
OVERWRITE
FAIL
WARN
IGNORE
Behaviour of the task if a file already exists in the working directory.
A list of filters to include only matching glob patterns. This allows you to only load a subset of the Namespace Files into the working directory.
["{{flow.namespace}}"]
A list of namespaces in which to search for files. The files are loaded in namespace order, and only the latest version of a file is kept. This means that if a file is present in both the first and the second namespace, only the file from the second namespace will be loaded.
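A hedged sketch of the namespaceFiles property using the include, exclude, and namespaces filters described above; the file patterns are purely illustrative:

  - id: ollama_cli
    type: io.kestra.plugin.ollama.cli.OllamaCLI
    namespaceFiles:
      enabled: true
      include:
        - prompts/**            # hypothetical folder of prompt files
      exclude:
        - prompts/drafts/**
      namespaces:
        - "{{ flow.namespace }}"
    commands:
      - ollama run llama2 "$(cat prompts/joke.txt)"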
default
The namespace where the pod will be created.
true
Whether to reconnect to the current pod if it already exists.
PT5S
duration
The additional duration to wait for logs to arrive after pod completion.
As logs are not retrieved in real time, we cannot guarantee that all logs have been fetched when the pod completes; therefore, we wait for a fixed amount of time to fetch late logs.
PT10M
duration
The maximum duration to wait until the pod is created.
This timeout is the maximum time that the Kubernetes scheduler can take to:
- schedule the pod,
- pull the pod image,
- and start the pod.
The configuration of the target Kubernetes cluster.
Additional YAML spec for the container.
true
Whether the pod should be deleted upon completion.
Additional YAML spec for the sidecar container.
{
  "image": "busybox"
}
The configuration of the file sidecar container that handles download and upload of files.
The pod custom labels
Kestra will add default labels to the pod with execution and flow identifiers.
Node selector for pod scheduling
Kestra will assign the pod to the nodes you want (see Assign Pod Nodes)
Additional YAML spec for the pod.
ALWAYS
IF_NOT_PRESENT
ALWAYS
NEVER
The image pull policy for a container image and the tag of the image, which affect when Docker attempts to pull (download) the specified image.
The pod custom resources
The name of the service account.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
PT1H
duration
The maximum duration to wait for the pod completion unless the task timeout property is set, which will take precedence over this property.
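A hedged sketch of a Kubernetes task runner configuration; the type io.kestra.plugin.ee.kubernetes.runner.Kubernetes and the exact property names are assumptions derived from the descriptions above:

    taskRunner:
      type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes   # assumed type identifier
      namespace: default           # the namespace where the pod will be created
      pullPolicy: IF_NOT_PRESENT
      delete: true                 # delete the pod upon completion
      resume: true                 # reconnect to an existing pod if it already exists
      waitUntilRunning: PT10M      # assumed property name for the pod-creation timeout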
The Batch access key.
The Batch account name.
The blob service endpoint.
Id of the pool on which to run the job.
true
Whether to reconnect to the current job if it already exists.
PT5S
duration
Determines how often Kestra should poll the container for completion. By default, the task runner checks every 5 seconds whether the job is completed. You can set this to a lower value (e.g. PT0.1S = every 100 milliseconds) for quick jobs and to a higher value (e.g. PT1M = every minute) for long-running jobs. Setting this property to a higher value will reduce the number of API calls Kestra makes to the remote service; keep that in mind in case you see API rate limit errors.
true
Whether the job should be deleted upon completion.
Warning: if the job is not deleted, a retry of the task could resume an old failed attempt of the job.
The private registry which contains the container image.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
PT1H
duration
The maximum duration to wait for the job completion unless the task timeout property is set, which will take precedence over this property.
Azure Batch will automatically time out the job upon reaching this duration, and the task will fail.
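A hedged sketch of an Azure Batch task runner configuration; the type io.kestra.plugin.ee.azure.runner.Batch and the property names are assumptions derived from the descriptions above:

    taskRunner:
      type: io.kestra.plugin.ee.azure.runner.Batch   # assumed type identifier
      account: my-batch-account                      # hypothetical Batch account name
      accessKey: "{{ secret('AZURE_BATCH_ACCESS_KEY') }}"
      endpoint: https://my-batch-account.westeurope.batch.azure.com   # hypothetical Batch endpoint
      poolId: my-pool                                # hypothetical pool id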
Exit codes of a task execution.
If more than one exit code is provided, the condition is met when the task exits with any of the exit codes in the list, and the action will be executed.
The GCP region.
true
Whether to reconnect to the current job if it already exists.
Google Cloud Storage Bucket to use to upload (inputFiles and namespaceFiles) and download (outputFiles) files. It's mandatory to provide a bucket if you want to use such properties.
PT5S
duration
Determines how often Kestra should poll the container for completion. By default, the task runner checks every 5 seconds whether the job is completed. You can set this to a lower value (e.g. PT0.1S = every 100 milliseconds) for quick jobs and to a higher value (e.g. PT1M = every minute) for long-running jobs. Setting this property to a higher value will reduce the number of API calls Kestra makes to the remote service; keep that in mind in case you see API rate limit errors.
true
Whether the job should be deleted upon completion.
The GCP project ID.
["https://www.googleapis.com/auth/cloud-platform"]
The GCP scopes to be used.
The GCP service account key.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
PT5S
duration
Additional time after the job ends to wait for late logs.
PT1H
duration
The maximum duration to wait for the job completion unless the task timeout property is set, which will take precedence over this property.
Google Cloud Run will automatically time out the job upon reaching this duration, and the task will fail.
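A hedged sketch of a Google Cloud Run task runner configuration; the type io.kestra.plugin.ee.gcp.runner.CloudRun and the property names are assumptions derived from the descriptions above:

    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.CloudRun   # assumed type identifier
      projectId: my-gcp-project                       # hypothetical project
      region: europe-west1
      bucket: my-kestra-bucket                        # required when using inputFiles, namespaceFiles or outputFiles
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_KEY') }}"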
ACTION_UNSPECIFIED
RETRY_TASK
FAIL_TASK
UNRECOGNIZED
Action on task failures based on different conditions.
Conditions for actions to deal with task failures.
Network identifier with the format projects/HOST_PROJECT_ID/global/networks/NETWORK.
Subnetwork identifier in the format projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNET.
v1
The API version
CA certificate as data
CA certificate as file path
Client certificate as data
Client certificate as a file path
RSA
Client key encryption algorithm; the default is RSA.
Client key as data
Client key as a file path
Client key passphrase
Disable hostname verification
Key store file
Key store passphrase
https://kubernetes.default.svc
The URL to the Kubernetes API.
The namespace used
OAuth token
OAuth token provider
Password
Trust all certificates
Truststore file
Truststore passphrase
Username
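A hedged sketch of the Kubernetes connection configuration described above; the property names (masterUrl, namespace, oauthToken, caCertData, trustCerts) are assumptions based on the field descriptions and common Kubernetes client conventions:

      config:
        masterUrl: https://kubernetes.default.svc    # the URL to the Kubernetes API
        namespace: default
        oauthToken: "{{ secret('K8S_OAUTH_TOKEN') }}"
        caCertData: "{{ secret('K8S_CA_CERT') }}"    # CA certificate as data
        trustCerts: false                            # assumed name for "trust all certificates"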
The URL of the blob container the compute node should use.
Mandatory if you want to use the namespaceFiles, inputFiles or outputFiles properties.
Connection string of the Storage Account.
The blob service endpoint.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
e2-medium
The GCP machine type.
true
Whether to reconnect to the current job if it already exists.
Google Cloud Storage Bucket to use to upload (inputFiles and namespaceFiles) and download (outputFiles) files. It's mandatory to provide a bucket if you want to use such properties.
PT5S
duration
Determines how often Kestra should poll the container for completion. By default, the task runner checks every 5 seconds whether the job is completed. You can set this to a lower value (e.g. PT0.1S = every 100 milliseconds) for quick jobs and to a higher value (e.g. PT1M = every minute) for long-running jobs. Setting this property to a higher value will reduce the number of API calls Kestra makes to the remote service; keep that in mind in case you see API rate limit errors.
Compute resource requirements.
ComputeResource defines the amount of resources required for each task. Make sure your tasks have enough compute resources to successfully run. If you also define the types of resources for a job to use with the InstancePolicyOrTemplate field, make sure both fields are compatible with each other.
true
Whether the job should be deleted upon completion.
Warning: if the job is not deleted, a retry of the task could resume an old failed attempt of the job.
Container entrypoint to use.
Lifecycle management schema when any task in a task group fails.
Currently, only one lifecycle policy is supported. When the lifecycle policy condition is met, the action in the policy will execute. If the task execution result does not match the defined lifecycle policy, the default policy applies: if the exit code is 0, the task exits; if the task ends with a non-zero exit code, the task is retried up to max_retry_count times.
2
>= 0
<= 10
Maximum number of retries on failures. The default is 0, which means never retry.
The GCP project ID.
The GCP region.
Compute reservation.
["https://www.googleapis.com/auth/cloud-platform"]
The GCP scopes to be used.
The GCP service account key.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
PT5S
duration
Additional time after the job ends to wait for late logs.
PT1H
duration
The maximum duration to wait for the job completion unless the task timeout property is set, which will take precedence over this property.
Google Cloud Batch will automatically time out the job upon reaching this duration, and the task will fail.
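A hedged sketch of a Google Cloud Batch task runner configuration; the type io.kestra.plugin.ee.gcp.runner.Batch and the property names (including the resources nesting) are assumptions derived from the descriptions above:

    taskRunner:
      type: io.kestra.plugin.ee.gcp.runner.Batch   # assumed type identifier
      projectId: my-gcp-project                    # hypothetical project
      region: europe-west1
      bucket: my-kestra-bucket                     # required when using inputFiles, namespaceFiles or outputFiles
      machineType: e2-medium
      resources:                                   # assumed property name for ComputeResource
        cpu: 2000                                  # milliCPU, i.e. 2 vCPUs
        memory: 4096                               # MiB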
The maximum amount of kernel memory the container can use.
The minimum allowed value is 4MB. Because kernel memory cannot be swapped out, a container which is starved of kernel memory may block host machine resources, which can have side effects on the host machine and on other containers. See the kernel-memory docs for more details.
The maximum amount of memory resources the container can use.
Make sure to use the format number + unit (regardless of the case) without any spaces.
The unit can be KB (kilobytes), MB (megabytes), GB (gigabytes), etc.
Given that it's case-insensitive, the following values are equivalent:
- "512MB"
- "512Mb"
- "512mb"
- "512000KB"
- "0.5GB"
It is recommended that you allocate at least 6MB.
Allows you to specify a soft limit smaller than memory which is activated when Docker detects contention or low memory on the host machine.
If you use memoryReservation, it must be set lower than memory for it to take precedence. Because it is a soft limit, it does not guarantee that the container doesn't exceed the limit.
The total amount of memory and swap that can be used by a container.
If memory and memorySwap are set to the same value, this prevents containers from using any swap. This is because memorySwap includes both the physical memory and swap space, while memory is only the amount of physical memory that can be used.
A setting which controls the likelihood of the kernel to swap memory pages.
By default, the host kernel can swap out a percentage of anonymous pages used by a container. You can set memorySwappiness to a value between 0 and 100 to tune this percentage.
By default, if an out-of-memory (OOM) error occurs, the kernel kills processes in a container.
To change this behavior, use the oomKillDisable option. Only disable the OOM killer on containers where you have also set the memory option. If the memory flag is not set, the host can run out of memory, and the kernel may need to kill the host system's processes to free the memory.
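A hedged sketch of memory limits on the Docker task runner; the memory property name and its nested fields are assumptions derived from the descriptions above:

    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      memory:                      # assumed property name for the memory limits object
        memory: "512MB"            # hard limit
        memoryReservation: "256MB" # soft limit, must be lower than memory
        memorySwap: "512MB"        # equal to memory, so the container cannot use swap
        oomKillDisable: false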
The reference to the user assigned identity to use to access the Azure Container Registry instead of username and password.
The password to log into the registry server.
The registry server URL.
If omitted, the default is "docker.io".
The user name to log into the registry server.
VOLUME
MOUNT
VOLUME
File handling strategy.
How to handle local files (input files, output files, namespace files, ...).
By default, we create a volume and copy the file into the volume bind path.
Configuring it to MOUNT will mount the working directory instead.
Docker configuration file.
Docker configuration file that can set access credentials to private container registries. Usually located in ~/.docker/config.json.
Limits the CPU usage to a given maximum threshold value.
By default, each container’s access to the host machine’s CPU cycles is unlimited. You can set various constraints to limit a given container’s access to the host machine’s CPU cycles.
true
Whether the container should be deleted upon completion.
[""]
Docker entrypoint to use.
Extra hostname mappings to the container network interface configuration.
Docker API URI.
PT0S
duration
When a task is killed, this property sets the grace period before killing the container.
By default, the container is killed immediately when a task is killed. Optionally, you can configure a grace period so the container is stopped gracefully instead.
Limits memory usage to a given maximum threshold value.
Docker can enforce hard memory limits, which allow the container to use no more than a given amount of user or system memory, or soft limits, which allow the container to use as much memory as it needs unless certain conditions are met, such as when the kernel detects low memory or contention on the host machine. Some of these options have different effects when used alone or when more than one option is set.
Docker network mode to use, e.g. host, none, etc.
List of port bindings.
Corresponds to the --publish (-p) option of the docker run CLI command, using the format ip:dockerHostPort:containerPort/protocol.
Possible examples:
- 8080:80/udp
- 127.0.0.1:8080:80
- 127.0.0.1:8080:80/udp
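A hedged sketch of port bindings on the Docker task runner, for instance to expose the Ollama API port; the portBindings property name is an assumption derived from the description above:

    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      portBindings:                 # assumed property name
        - "11434:11434"             # host port 11434 mapped to the Ollama container port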
Give extended privileges to this container.
IF_NOT_PRESENT
IF_NOT_PRESENT
ALWAYS
NEVER
The pull policy for a container image.
Use the IF_NOT_PRESENT pull policy to avoid pulling already existing images. Use the ALWAYS pull policy to pull the latest version of an image even if an image with the same tag already exists.
Size of /dev/shm in bytes.
The size must be greater than 0. If omitted, the system uses 64MB.
User in the Docker container.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
List of volumes to mount.
Make sure to provide a map of a local path to a container path in the format: /home/local/path:/app/container/path.
Volume mounts are disabled by default for security reasons. If you are sure you want to use them, enable that feature in the plugin configuration by setting volume-enabled to true.
Here is how you can add that setting to your Kestra configuration:

kestra:
  plugins:
    configurations:
      - type: io.kestra.plugin.scripts.runner.docker.Docker
        values:
          volume-enabled: true
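Once enabled, a hedged sketch of mounting a host directory into the container, for example to persist pulled Ollama models between runs; the volumes property name and the host path are assumptions:

    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      volumes:
        - /data/ollama:/root/.ollama   # hypothetical host path; /root/.ollama is where Ollama stores models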
true
Whether to wait for the container to exit.
A list of capabilities; an OR list of AND lists of capabilities.
Driver-specific options, specified as key/value pairs.
These options are passed directly to the driver.
Compute environment in which to run the job.
AWS region with which the SDK should communicate.
{
  "request": {
    "memory": "2048",
    "cpu": "1"
  }
}
Custom resources for the ECS Fargate container.
See the AWS documentation for more details.
Access Key Id in order to connect to AWS.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
S3 Bucket to upload (inputFiles and namespaceFiles) and download (outputFiles) files. It's mandatory to provide a bucket if you want to use such properties.
PT5S
duration
Determines how often Kestra should poll the container for completion. By default, the task runner checks every 5 seconds whether the job is completed. You can set this to a lower value (e.g. PT0.1S = every 100 milliseconds) for quick jobs and to a higher value (e.g. PT1M = every minute) for long-running jobs. Setting this property to a higher value will reduce the number of API calls Kestra makes to the remote service; keep that in mind in case you see API rate limit errors.
true
Whether the job should be deleted upon completion.
Warning: if the job is not deleted, a retry of the task could resume an old failed attempt of the job.
The endpoint with which the SDK should communicate.
This property allows you to use a different S3 compatible storage backend.
Execution role for the AWS Batch job.
Mandatory if the compute environment is ECS Fargate. See the AWS documentation for more details.
Job queue to use to submit jobs (ARN). If not specified, the task runner will create a job queue — keep in mind that this can lead to a longer execution.
true
Whether to reconnect to the current job if it already exists.
Secret Key Id in order to connect to AWS.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
AWS session token, retrieved from an AWS token service, used for authenticating that this user has received temporary permissions to access a given resource.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The AWS STS endpoint with which the SDKClient should communicate.
AWS STS Role.
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
AWS STS External Id.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
PT15M
duration
AWS STS Session duration.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
AWS STS Session name.
This property is only used when an stsRoleArn is defined.
Task role to use within the container.
Needed if you want to authenticate with AWS CLI within your container.
\d+\.\d+\.\d+(-[a-zA-Z0-9-]+)?|([a-zA-Z0-9]+)
The version of the plugin to use.
PT1H
duration
The maximum duration to wait for the job completion unless the task timeout property is set, which will take precedence over this property.
AWS Batch will automatically time out the job upon reaching that duration, and the task will be marked as failed.
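A hedged sketch of an AWS Batch task runner configuration; the type io.kestra.plugin.ee.aws.runner.Batch and the property names are assumptions derived from the descriptions above:

    taskRunner:
      type: io.kestra.plugin.ee.aws.runner.Batch       # assumed type identifier
      region: eu-west-1
      computeEnvironmentArn: arn:aws:batch:eu-west-1:123456789012:compute-environment/my-env   # hypothetical ARN
      executionRoleArn: arn:aws:iam::123456789012:role/ecsTaskExecutionRole                    # required for ECS Fargate
      taskRoleArn: arn:aws:iam::123456789012:role/my-task-role
      bucket: my-kestra-bucket                         # required when using inputFiles, namespaceFiles or outputFiles
      resources:
        request:
          memory: "2048"
          cpu: "1"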