sftpin Processor

sftpin is a controller processor used to retrieve (receive) files via the sftp protocol.


Common Settings

Setting               Description                                               Data Type   Default
service.type=sftpin   indicates the type of processor                           string      n/a
schedule              how often to check (poll) for new files                   string      2m
use.file.pointers     deliver a pointer file instead of the actual data file    boolean     false
max.files             max number of files to download (and process) at a time   integer     1

FTP Server Info

Setting                  Description
ftp.server.name          the target ftp server’s hostname
ftp.server.port          the target ftp server’s port number
ftp.server.public.key    configure the sftp server’s public key either directly here (file name) or in known_hosts
ftp.server.time.offset   date/time offset if different than where TEDI runs
ftp.server.dir           the directory to search for files to download (assumes / if empty)

Mutual Exclusion / Locking

If you run multiple threads or TEDI processes, you’ll need to use global locking to ensure each file is handled independently and in isolation.

Setting     Description
use.locks   acquire a lock prior to processing a file
kvs.name    which key value store to use as a global lock (configured in nats_kvs.properties)
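
The locking idea can be sketched as follows. This is a minimal illustration, not TEDI's implementation: InMemoryKvs is a hypothetical in-process stand-in for the configured key value store, and process_with_lock, worker_id and handler are names invented for the example. The only property it relies on is an atomic "create only if absent" operation.

```python
import threading


class InMemoryKvs:
    """Hypothetical stand-in for a key value store with an atomic create."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def create(self, key, value):
        """Store value only if key is absent; return True on success."""
        with self._guard:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._guard:
            self._data.pop(key, None)


def process_with_lock(kvs, filename, worker_id, handler):
    """Process filename only if this worker wins the lock for it."""
    if not kvs.create(filename, worker_id):
        return False  # another worker already holds the lock
    try:
        handler(filename)
        return True
    finally:
        kvs.delete(filename)  # always release, even if the handler fails


kvs = InMemoryKvs()
handled = []
first = process_with_lock(kvs, "orders_001.csv", "worker-1", handled.append)
kvs.create("orders_002.csv", "worker-2")  # worker-2 already holds this file
second = process_with_lock(kvs, "orders_002.csv", "worker-1", handled.append)
print(first, second, handled)  # True False ['orders_001.csv']
```

Keying the lock on the file name means two workers can run in parallel on different files while never touching the same one.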

File Detection

These settings are used to detect new files to download (GET) and process.

Setting                    Description
filename.include.regex     required - a list of regular expressions to detect new files to receive (uses OR logic)
filename.exclude.regex     optional - a list of regular expressions to filter out files from being received (uses AND and NOT logic with includes)
local.filename.formatter   rename files after download (formatter)
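
One reading of the include/exclude logic is sketched below: a file is selected when it matches at least one include expression and none of the exclude expressions. The function name select_files, the sample file names, and the use of unanchored matching (re.search) are assumptions made for illustration; TEDI's actual matching rules may differ.

```python
import re


def select_files(names, include_patterns, exclude_patterns=()):
    """Return names matching at least one include and no exclude pattern."""
    includes = [re.compile(p) for p in include_patterns]
    excludes = [re.compile(p) for p in exclude_patterns]
    selected = []
    for name in names:
        included = any(rx.search(name) for rx in includes)  # OR across includes
        excluded = any(rx.search(name) for rx in excludes)  # any exclude vetoes
        if included and not excluded:                       # include AND NOT exclude
            selected.append(name)
    return selected


listing = ["orders_001.csv", "orders_002.csv.tmp", "invoice.xml", "readme.txt"]
picked = select_files(listing, [r"\.csv", r"\.xml$"], [r"\.tmp$"])
print(picked)  # ['orders_001.csv', 'invoice.xml']
```

With an empty include list nothing is selected, which is consistent with the include expressions being required.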

State Management

State management is used to track the state of a single file as it is being processed by TEDI.

Its fundamental purpose is to ensure that TEDI doesn’t begin reading a file before the system writing it has finished.

There are two methods to keep state:

  • by renaming the file
  • by using a status file

Setting                                Description
Renaming
file.state.rename.enabled              use the file name itself to manage state
file.state.rename.formatter.starting   how to rename the file to indicate it’s actively being processed (i.e. mark in-process)
file.state.rename.formatter.finished   how to rename the file after processing is finished (i.e. mark the file as done)
file.state.rename.formatter.failed     rename the file to indicate processing failed
Status Files
file.state.status.file.enabled         use a status file to manage state
file.state.status.file.formatter       used to identify actual file name relative to status file name
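
The two methods can be illustrated with a small sketch. The suffixes used here (.processing, .done, .failed, .ready) are hypothetical; in TEDI the real names come from the configured formatters, and the functions mark and ready_files are invented for this example.

```python
# Hypothetical markers; the real names are produced by the configured formatters.
STATE_SUFFIX = {"starting": ".processing", "finished": ".done", "failed": ".failed"}
STATUS_SUFFIX = ".ready"


def mark(name, state):
    """Rename-based state: return the new name that records the given state."""
    return name + STATE_SUFFIX[state]


def ready_files(listing, status_suffix=STATUS_SUFFIX):
    """Status-file state: return data files whose companion status file exists."""
    names = set(listing)
    ready = []
    for name in sorted(names):
        if name.endswith(status_suffix):
            continue  # status files themselves are never treated as data
        if name + status_suffix in names:
            ready.append(name)
    return ready


listing = ["a.csv", "a.csv.ready", "b.csv", "c.csv.ready"]
print(ready_files(listing))       # ['a.csv']
print(mark("a.csv", "starting"))  # a.csv.processing
```

In this sketch b.csv is skipped because its writer has not yet produced a status file, which is the situation state management is meant to guard against.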

File Purging

You can configure how TEDI detects and deletes files that have reached a user-defined maximum age.

Setting                        Description
purge.enabled                  enable purging
purge.kvs.name                 the kvs to use for global locking
purge.file.max.age             delete files successfully received after duration N (set to 0 to delete after pickup)
purge.schedule                 frequency at which to check for files to delete
purge.filename.include.regex   required - regex for identifying files to delete
purge.filename.exclude.regex   optional - regex for files to exclude

Abandon Check: Monitoring for Abandoned Files

The Abandon Check detects files whose processing began but, for one reason or another, never completed, leaving them in the ‘in process’ state.

You can use the Abandon Check to find and fix those files to re-attempt processing.

Setting                            Description
abandon.check.enabled              enable the monitor
abandon.check.kvs.name             the kvs to use for global locking
abandon.check.schedule             how often to run the check
abandon.check.max.age              how long the file has been in the processing state (> max.age triggers an alert)
abandon.check.rename.enabled       enables renaming the file so it will be picked up again
abandon.check.include.regex        how to identify abandoned files
abandon.check.filename.formatter   how to rename the file
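
A rough sketch of the check follows. It assumes that a file's last-modified time approximates when it entered the processing state and that in-process files carry a hypothetical .processing suffix; find_abandoned and requeue_name are illustrative names, not TEDI functions.

```python
import re
from datetime import datetime, timedelta


def find_abandoned(entries, include_regex, max_age, now):
    """Return names that match include_regex and are older than max_age.

    entries is an iterable of (name, last_modified) pairs from a listing.
    """
    rx = re.compile(include_regex)
    return [name for name, modified in entries
            if rx.search(name) and now - modified > max_age]


def requeue_name(name, in_process_suffix=".processing"):
    """Strip the hypothetical in-process marker so the file is picked up again."""
    if name.endswith(in_process_suffix):
        return name[:-len(in_process_suffix)]
    return name


now = datetime(2021, 6, 14, 12, 0, 0)
entries = [
    ("a.csv.processing", now - timedelta(minutes=45)),  # stuck too long
    ("b.csv.processing", now - timedelta(minutes=5)),   # still within max age
    ("c.csv", now - timedelta(hours=3)),                # not in process at all
]
abandoned = find_abandoned(entries, r"\.processing$", timedelta(minutes=30), now)
print(abandoned)                             # ['a.csv.processing']
print([requeue_name(n) for n in abandoned])  # ['a.csv']
```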

Authentication

TEDI supports the following methods for authentication with sftp servers:

  • basic auth / username password
  • username / key

Setting                       Description
sftp.client.auth              [ basic | ssh-key ]
Basic Auth
basic.auth.username           client username
basic.auth.password           client password
basic.auth.system.key.alias   the system key used to encrypt the password
Key Auth
key.auth.username             client username
key.auth.client.private.key   private key file name; note: rsa is the only supported key type (you must convert openssh format to pem)

SSH Settings

TEDI supports customizing the following SSH settings.

Setting                        Description
ssh.validate.server.host.key   true or false. note: false is insecure (no trust verification)
ssh.key.exchange.algos         comma-delimited array or leave empty to accept Golang defaults
ssh.ciphers                    comma-delimited array or leave empty to accept Golang defaults
ssh.macs                       comma-delimited array or leave empty to accept Golang defaults
ssh.host.key.algos             comma-delimited array or leave empty to accept Golang defaults
ssh.connection.timeout         max time to wait to establish a connection (e.g. 1m, 30s)
ssh.debug                      true or false. set to enable verbose connection logging (for troubleshooting)

# -----------------------------------------------------------------------------
# - ssh.key.exchange.algos: curve25519-sha256
#                           curve25519-sha256@libssh.org
#                           ecdh-sha2-nistp256
#                           ecdh-sha2-nistp384
#                           ecdh-sha2-nistp521
#                           diffie-hellman-group14-sha256
#                           diffie-hellman-group14-sha1
# - ssh.ciphers:            aes128-gcm@openssh.com
#                           chacha20-poly1305@openssh.com
#                           aes128-ctr
#                           aes192-ctr
#                           aes256-ctr
# - ssh.macs:               hmac-sha2-256-etm@openssh.com
#                           hmac-sha2-256
#                           hmac-sha1
#                           hmac-sha1-96
# - ssh.host.key.algos:     rsa-sha2-512-cert-v01@openssh.com
#                           rsa-sha2-256-cert-v01@openssh.com
#                           ssh-rsa-cert-v01@openssh.com
#                           ssh-dss-cert-v01@openssh.com
#                           ecdsa-sha2-nistp256-cert-v01@openssh.com
#                           ecdsa-sha2-nistp384-cert-v01@openssh.com
#                           ecdsa-sha2-nistp521-cert-v01@openssh.com
#                           ssh-ed25519-cert-v01@openssh.com
#                           ecdsa-sha2-nistp256
#                           ecdsa-sha2-nistp384
#                           ecdsa-sha2-nistp521
#                           rsa-sha2-512
#                           rsa-sha2-256
#                           ssh-rsa
#                           ssh-dss
#                           ssh-ed25519
#
#  leave empty to accept defaults (above)
# -----------------------------------------------------------------------------

File Name Formatter

File Name Formatter is used to perform a set of operations on a file name.

You would employ these operations when you need to transform a file name to meet a specific format required by the destination system you are integrating with.

Operator             Description
{COPY}               no-op - copy the full name of the input file including extension (but not the path if there is one)
{PATH}               copy the path of the file (if there is one)
{BASENAME}           copy the basename of the file excluding the extension
{UUID}               add a UUID to the filename (likely for uniqueness)
{DATE:date-format}   add a date to the file (format must be specified) - spaces are converted to underscores
{EXT:COPY}           copy the original extension
{EXT:REMOVE}         remove the extension after formatting is complete
{SUBSTRING:[N:N]}    substring a portion of the file name

Example

input:     /tmp/files/inbound/xfiles.txt
formatter: {PATH}{BASENAME}_{UUID}_{DATE:2006-01-02 00:00:00}_in.{EXT:COPY}.new
output:    /tmp/files/inbound/xfiles_550e8400-e29b-41d4-a716-446655440000_2021-06-14_12:00:01_in.txt.new
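
To make the operators concrete, here is a minimal sketch of a formatter that handles {COPY}, {PATH}, {BASENAME}, {UUID}, {DATE:...}, {EXT:COPY} and {EXT:REMOVE}. It is not TEDI's implementation. It assumes the date format follows Go's reference-time layout (which the 2006-01-02 in the example suggests), that {PATH} includes the trailing slash, and that {EXT:COPY} yields the extension without its dot. {SUBSTRING:[N:N]} is left out because its index semantics are not specified here. The names format_name, format_date, when and new_uuid are invented for the sketch.

```python
import posixpath
import re
import uuid
from datetime import datetime

# Assumed mapping from Go reference-time tokens to strftime directives.
GO_TO_STRFTIME = {"2006": "%Y", "01": "%m", "02": "%d",
                  "15": "%H", "04": "%M", "05": "%S"}


def format_date(layout, when):
    """Render when using a Go-style layout; spaces become underscores."""
    pattern = re.sub(r"2006|01|02|15|04|05",
                     lambda m: GO_TO_STRFTIME[m.group()], layout)
    return when.strftime(pattern).replace(" ", "_")


def format_name(input_path, formatter, when=None, new_uuid=None):
    """Expand the {OPERATOR} tokens in formatter for the given input path."""
    when = when or datetime.now()
    directory, full_name = posixpath.split(input_path)
    base, ext = posixpath.splitext(full_name)  # ext keeps its leading dot
    remove_ext = False

    def expand(match):
        nonlocal remove_ext
        op, _, arg = match.group(1).partition(":")
        if op == "COPY":
            return full_name
        if op == "PATH":
            if not directory:
                return ""
            return directory if directory.endswith("/") else directory + "/"
        if op == "BASENAME":
            return base
        if op == "UUID":
            return new_uuid or str(uuid.uuid4())
        if op == "DATE":
            return format_date(arg, when)
        if op == "EXT" and arg == "COPY":
            return ext.lstrip(".")
        if op == "EXT" and arg == "REMOVE":
            remove_ext = True  # applied once all other operators are expanded
            return ""
        raise ValueError(f"unsupported operator: {match.group(0)}")

    result = re.sub(r"\{([^{}]+)\}", expand, formatter)
    if remove_ext:
        head, tail = posixpath.split(result)
        result = posixpath.join(head, posixpath.splitext(tail)[0])
    return result


out = format_name(
    "/tmp/files/inbound/xfiles.txt",
    "{PATH}{BASENAME}_{UUID}_{DATE:2006-01-02 15:04:05}_in.{EXT:COPY}.new",
    when=datetime(2021, 6, 14, 12, 0, 1),
    new_uuid="550e8400-e29b-41d4-a716-446655440000",
)
print(out)
```

The usage adapts the example above: the time portion is written as 15:04:05 because, under the Go-layout assumption, a literal 00:00:00 would be emitted unchanged rather than replaced by the current time. The UUID and timestamp are passed in only to make the result reproducible.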