Zengines Transformation Language
Our transformation language is a set of functions, following familiar SQL syntax and patterns.
Zengines custom transformations allow you to easily evaluate and modify source data to match the requirements of your target field and business needs. They are an extended, but limited subset of SQL.
- Working with Strings
- Working with Numbers
- Working with Dates
- Evaluating Values
- Other Functions
- Comments
- Window Functions
- Lookups and Returning Values for Other Data Sets
- Field Data Types and Auto Conformity
- Date Format Specifiers
Working with Strings
CONCAT() and ||
The CONCAT(expression1, expression2, expression3,...) function adds two or more expressions (strings) together.
The shorthand version (double pipes) || can also be used.
Examples:
CONCAT('string1', 'string2', 'string3')
'string1' || 'string2' || 'string3'
Result:
'string1string2string3'
INSTR()
The INSTR(string, search_string) function returns the position of the first occurrence of a string in another string.
This function performs a case-sensitive search.
Examples:
INSTR('Hello World', 'W')
7
LENGTH()
The LENGTH(string) function returns the length of a string.
Examples:
LENGTH('Hello World')
Result:
11
LOWER()
The LOWER(string) function converts a string to lower-case.
Examples:
LOWER('Hello World')
Result:
'hello world'
LPAD()
The LPAD(string, count, character) function pads the string with the character on the left until its length matches the count. Truncates the string on the right if it has more than count characters.
Examples:
LPAD('Hello', 10, 'x')
LPAD('Hello', 2, 'x')
Result:
'xxxxxHello''He'
LTRIM()
The LTRIM(string, characters) function removes any occurrences of any of the characters (optional) from the left side of the string. Characters defaults to space.
Examples:
LTRIM(' Hello')
LTRIM(' Hello', ' H')
Result:
'Hello''ello'REPLACE()
The REPLACE(string, substring, new_string) function replaces all occurrences of a substring within a string, with a new substring.
Examples:
REPLACE('Hello World', 'Hello', 'Howdy')
Result:
'Howdy world'
RPAD()
The RPAD(string, count, character) function pads the string with the character on the right until its length matches the count. Truncates the string on the right if it has more than count characters.
Examples:
RPAD('Hello', 2, 'x')
Result:
'Helloxxxxx'
'He'
RTRIM()
The RTRIM(string, characters) function removes any occurrences of any of the characters (optional) from the right side of the string. Characters defaults to space.
Examples:
RTRIM('Hello ', 'o ')
Result:
'Hello'
'Hell'
SPLIT_PART()
The SPLIT_PART(string, separator, index) function splits the string along the separator and returns the data at the (1-based) index of the list. If the index is outside the bounds of the list, returns an empty string.
Examples:
SPLIT_PART('Hello world ', ' ', 2)
Result:
'world'
SUBSTR()
The SUBSTR(string, start, length) function extracts a substring from a string (starting at any position).
Note: The position of the first character in the string is 1.
Note: The position of the last character in the string is -1.
Examples:
SUBSTR('Hello World', 1, 5)
SUBSTR('Hello World', -11, 5)
Result:
'Hello'
TRIM()
The TRIM(string) function removes leading and trailing spaces from a string.
Examples:
TRIM(' Hello World ')
Result:
'Hello World'
UPPER()
The UPPER(string) function converts a string to upper-case.
Examples:
UPPER('Hello World')
Result:
'HELLO WORLD'
Working with Numbers
ABS()
The ABS(number) function returns the absolute (positive) value of a number.
Examples:
ABS(-243.5)
Result:
243.5
CEIL()
The CEIL(number) function returns the smallest integer value that is bigger than or equal to a number.
Examples:
CEIL(298.6789)
Result:
299
FLOOR()
The FLOOR(number) function returns the largest integer value that is smaller than or equal to a number.
Examples:
FLOOR(298.6789)
Result:
298RANDOM()
The RANDOM() function returns a random decimal number (no seed value - so it returns a completely random number >= 0 and <1)
Examples:
RANDOM()
Result:
0.22131628033045322
ROUND()
The ROUND(number, decimals) function rounds a number to a specified number of decimal places.
Examples:
ROUND(10.13879)
ROUND(10.13879, 2)
Results:
10
10.14
Working with Dates
CURRENT_DATE
The CURRENT_DATE function returns the current date.
Note: The date is returned as "YYYY-MM-DD"
Examples:
CURRENT_DATE
Results:
2025-03-04
CURRENT_TIME
The CURRENT_TIME function returns the current time.
Note: The time is returned as "HH:MM:SS.uuuuuu±HH:MM" (includes microseconds and timezone offset)
Examples:
CURRENT_TIME
Results:
22:32:43.439000+00:00
CURRENT_TIMESTAMP
The CURRENT_TIMESTAMP function returns the current date and time in ISO format.
Note: The time is returned as "YYYY-MM-DD HH:MM:SS.uuuuuu±HH:MM"
Examples:
CURRENT_TIMESTAMP
Results:
2025-03-04 22:35:33.223000+00:00
DATE_ADD()
The DATE_ADD(date, INTERVAL value addunit) function adds a time/date interval to a date and then returns the date.
Examples:
DATE_ADD(CAST('2011-11-09 09:10:00' AS DATETIME), INTERVAL 3 MONTH)
Results:
2012-02-09 09:10:00
DATE_DIFF()
The DATE_DIFF(part, startdate, enddate) function returns the number of date parts (including partials) specified between two date values.
Examples:
DATE_DIFF(
'MONTH',
CAST('2022-04-30' AS DATE),
CAST('2024-11-03' AS DATE)
)
Results:
31
DAYNAME()
The DAYNAME(date) function returns the day of the week (English).
Examples:
DAYNAME(
CAST('2022-04-30 12:45:45' AS DATETIME)
)
Results:
Saturday
EPOCH()
The EPOCH(date) function returns the number seconds since 1970-01-01.
Examples:
EPOCH(
CAST('2022-04-30 12:45:45' AS DATETIME)
)
Results:
1651322745
MONTHNAME()
The MONTHNAME(date) function returns the month (English).
Examples:
MONTHNAME(
CAST('2022-04-30 12:45:45' AS DATETIME)
)
Results:
April
QUARTER()
The QUARTER(date) function returns the numeric quarter.
Examples:
QUARTER(
CAST('2022-04-30 12:45:45' AS DATETIME)
)
Results:
2
STRFTIME()
The STRFTIME(timestamp, output-format) function converts a date to a string according to the format string.
Examples:
DATE_DIFF(
CAST('2011-11-09 09:10:00' AS DATETIME),
'%A, %B %-d, %Y'
)
Results:
Saturday, November 12, 2011
STRPTIME()
The STRPTIME(string, input-format) function converts the string text to timestamp according to the format string.
Examples:
STRPTIME( '7/16/2013', '%-m/%-d/%Y' )
Results:
2013-07-16 00:00:00
WEEK()
The WEEK(date) function returns the numeric week in the year.
Examples:
WEEK(
CAST('2022-04-30 12:45:45' AS DATETIME)
)
Results:
17
WEEKDAY()
The WEEKDAY(date) function returns the numeric weekday synonym (Sunday = 0, Saturday = 6).
Examples:
WEEKDAY(
CAST('2022-04-30 12:45:45' AS DATETIME)
)
Results:
6
Evaluating values
CASE
The CASE statement goes through conditions and return a value when the first condition is met (like an IF-THEN-ELSE statement). So, once a condition is true, it will stop reading and return the result. CASE statements can be nested.
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
WHEN conditionN THEN resultN
ELSE result
END
Examples:
schema.table.fieldvalue = 56
CASE WHEN schema.table.fieldvalue > 80 THEN 'Above 80' WHEN schema.table.fieldvalue < 20 THEN 'Below 20' ELSE 'Between 20 and 80' END
Results:
'Between 20 and 80'
COALESCE()
The COALESCE(value1, value2, …. value_n) function returns the first non-null value in a list.
Examples:
schema.table.fieldvalue1 = NULL
schema.table.fieldvalue2 = 'HELLO'
COALESCE( schema.table.fieldvalue1, schema.table.fieldvalue2, 'No Results' )
Results:
'HELLO'
IF()
The IF(condition, value_if_true, value_if_false) function returns a value if a condition is TRUE, or another value if a condition is FALSE.
Examples:
schema.table.fieldvalue = 56
IF( schema.table.fieldvalue > 50, 'Top 50', 'Bottom 50' )
Results:
'Top 50'
IFNULL()
The IFNULL(expression, alternative-value) function is a two-argument version of coalesce.
Examples:
schema.table.fieldvalue = NULL
IFNULL( schema.table.fieldvalue, 'default_string' )
Results:
'default_string'
NULLIF()
The NULLIF(expression1, expression2) function returns NULL if two expressions are equal, otherwise it returns the first expression.
Examples:
schema.table.fieldvalue = 'Hello'
schema.table.fieldvalue, 'Hello' )
Results:
NULL
String evaluators
LIKE evaluate if a value is like another eg: IF('Maybe' LIKE '%May%', 'TRUE', 'FALSE')
BETWEEN evaluate if a number is in in a range eg: IF(100 BETWEEN 10 AND 60, 'TRUE', 'FALSE')
SIMILAR TO can be used to evaluate if parts or ranges of characters are in a string eg:
IF(
schema.tableA.fieldA NOT SIMILAR TO '[a-zA-Z0-9.-]*',
'INVALID',
schema.tableA.fieldA
)
As with other forms of SQL NOT can be used to check if a strings is not compliant to a pattern.
Other
CAST()
The CAST(expression AS datatype(length, precision etc)) function attempts to convert a value (of any type) into a specified datatype.
Examples:
CAST('123.56987' AS DECIMAL(18,2))
Results:
123.57
TRY_CAST()
TRY_CAST(expression AS datatype(length, precision etc)) can be used as an alternative to CAST() when the preferred behavior is not to throw an error, it will instead return NULL if a cast is not possible.
Examples:
TRY_CAST('ABCDE' AS DECIMAL(18,2))
Results:
NULL
Comments
Commenting is supported in the transformations editor.
Examples:
/* for multi-line comments */ -- for single line comments
Window functions
These enable the use of multiple rows to calculate a value for each row, they are resource intensive.
ROW_NUMBER()
Generate a row_number column to enumerate rows. This requires the OVER() function to specify the set of rows to enumerate
Examples:
-- enumerate the CUSTOMER to create a sequential CUSTID ROW_NUMBER() OVER ()
Results:
CUSTOMER | STATE | CUSTID abc | CA | 1 ddd | CA | 2 xyy | CA | 3 ghj | CA | 4 abc | MA | 5
xyz | MA | 6 abc | WA | 7
OVER()
OVER(PARTITION BY field ORDER BY field) defines the grouping and order of a window action. The OVER() function is required for ROW_NUMBER but can also be used in other aggregation scenarios like, MAX, MIN, AVG, COUNT, SUM
Examples:
-- enumerate the CUSTOMER to create a sequential CUSTID per STATE and order by CUSTOMER ROW_NUMBER() OVER (PARTITION BY STATE ORDER CUSTOMER)
Results:
CUSTOMER | STATE | CUSTID abc | CA | 1 ddd | CA | 2 ghj | CA | 3 xyy | CA | 4 abc | MA | 1 xyz | MA | 2 abc | WA | 1
Lookups and returning values for other data sets
This is a nested SELECT statement used to look up a value in source table using a value from a different source table(when not relying on table to table keys)
Example
-- INPUTS - 3 tables, 3 source schema, The value you want is in SCHEMA_C.TABLE_C.field1 SCHEMA_C.TABLE_C.field1 = 'GOLD' SCHEMA_C.TABLE_C.field2 = 'A1' SCHEMA_A.TABLE_A.field4 = '1' SCHEMA_B.TABLE_B.field3 = 'A'
(SELECT SCHEMA_C.TABLE_C.field1 FROM SCHEMA_C.TABLE_C WHERE SCHEMA_C.TABLE_C.field2 = SCHEMA_B.TABLE_B.field3 || SCHEMA_A.TABLE_A.field4)
Here we concat values from SCHEMA_A.TABLE_A and SCHEMA_B.TABLE_B tables to find the correct value in the ‘reference’ table SCHEMA_C.TABLE_C.field1, using SCHEMA_C.TABLE_C.field2 as a key
Field data types and auto conformity
Please note that where field data types are set to auto conform the preview may show a different precision or scale for the result than what may be expected, for example:
CEIL(298.6789) -- Result for a field set to a data type of INTEGER: 299 -- Result for a field set to a data type of DECIMAL(18,2): 299.00
All external data system data types are converted internally to the following:
-
BLOB - any/all binary data
-
BOOLEAN - true/false values
-
DATE - ISO date YYYY-MM-DD
-
DATETIME - ISO date or timestamp YYYY-MM-DD hh:mm:ss.ffffff
-
DECIMAL - with precision and scale set, defaults to DECIMAL(18,4)
-
INTEGER - supports variety of byte sizes TINYINT, SMALLINT, INT, BIGINT, HUGEINT
-
STRING - max length and fixed length (VARCHAR, CHAR - UTF8)
-
TIME - ISO hh:mm:ss.ffffff
Date Format specifiers
|
specifier |
Description |
Example |
|---|---|---|
|
|
Abbreviated weekday name. |
Sun, Mon, … |
|
|
Full weekday name. |
Sunday, Monday, … |
|
|
Abbreviated month name. |
Jan, Feb, …, Dec |
|
|
Full month name. |
January, February, … |
|
|
ISO date and time representation |
1992-03-02 10:30:20 |
|
|
Day of the month as a zero-padded decimal. |
01, 02, …, 31 |
|
|
Day of the month as a decimal number. |
1, 2, …, 30 |
|
|
Microsecond as a decimal number, zero-padded on the left. |
000000 - 999999 |
|
|
Millisecond as a decimal number, zero-padded on the left. |
000 - 999 |
|
|
ISO 8601 year with century representing the year that contains the greater part of the ISO week (see |
0001, 0002, …, 2013, 2014, …, 9998, 9999 |
|
|
Hour (24-hour clock) as a zero-padded decimal number. |
00, 01, …, 23 |
|
|
Hour (24-hour clock) as a decimal number. |
0, 1, …, 23 |
|
|
Hour (12-hour clock) as a zero-padded decimal number. |
01, 02, …, 12 |
|
|
Hour (12-hour clock) as a decimal number. |
1, 2, … 12 |
|
|
Day of the year as a zero-padded decimal number. |
001, 002, …, 366 |
|
|
Day of the year as a decimal number. |
1, 2, …, 366 |
|
|
Month as a zero-padded decimal number. |
01, 02, …, 12 |
|
|
Month as a decimal number. |
1, 2, …, 12 |
|
|
Minute as a zero-padded decimal number. |
00, 01, …, 59 |
|
|
Minute as a decimal number. |
0, 1, …, 59 |
|
|
Nanosecond as a decimal number, zero-padded on the left. |
000000000 - 999999999 |
|
|
Locale's AM or PM. |
AM, PM |
|
|
Second as a zero-padded decimal number. |
00, 01, …, 59 |
|
|
Second as a decimal number. |
0, 1, …, 59 |
|
|
ISO 8601 weekday as a decimal number where 1 is Monday. |
1, 2, …, 7 |
|
|
Week number of the year. Week 01 starts on the first Sunday of the year, so there can be week 00. Note that this is not compliant with the week date standard in ISO-8601. |
00, 01, …, 53 |
|
|
ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4. |
01, …, 53 |
|
|
Weekday as a decimal number. |
0, 1, …, 6 |
|
|
Week number of the year. Week 01 starts on the first Monday of the year, so there can be week 00. Note that this is not compliant with the week date standard in ISO-8601. |
00, 01, …, 53 |
|
|
ISO date representation |
1992-03-02 |
|
|
ISO time representation |
10:30:20 |
|
|
Year without century as a zero-padded decimal number. |
00, 01, …, 99 |
|
|
Year without century as a decimal number. |
0, 1, …, 99 |
|
|
Year with century as a decimal number. |
2013, 2019 etc. |
|
|
Time offset from UTC in the form ±HH:MM, ±HHMM, or ±HH. |
-0700 |
|
|
Time zone name. |
Europe/Amsterdam |
|
|
A literal |
|