$ZSUBSTR()

Returns a properly encoded string from a sequence of bytes.

$ZSUB[STR](expr,intexpr1[,intexpr2])

Examples of $ZSUBSTR()

Example:

GTM>write $ZCHSET
M
GTM>set char1="a" ; one-byte character
GTM>set char2="ç" ; two-byte character
GTM>set char3="新" ; three-byte character
GTM>set y=char1_char2_char3
GTM>write $zsubstr(y,1,3)=$zsubstr(y,1,5)
0

With the M character set, every byte is treated as a character, so $ZSUBSTR(y,1,5) returns five bytes while $ZSUBSTR(y,1,3) returns only three. The two results differ, and the expression $ZSUBSTR(y,1,3)=$ZSUBSTR(y,1,5) evaluates to 0 (false).

Example:

GTM>write $zchset
UTF-8
GTM>set char1="a" ; one-byte character
GTM>set char2="ç" ; two-byte character
GTM>set char3="新" ; three-byte character
GTM>set y=char1_char2_char3
GTM>write $zsubstr(y,1,3)=$zsubstr(y,1,5)
1

For a process started in UTF-8 mode, the expression $ZSUBSTR(y,1,3)=$ZSUBSTR(y,1,5) evaluates to 1 (true). Both calls return the string made up of char1 and char2: the three-byte char3 does not fit completely within the specified byte length of five, so $ZSUBSTR(y,1,5) excludes it.

In many ways, the $ZSUBSTR() function is similar to the $ZEXTRACT() function. For example, $ZSUBSTR(expr,intexpr1) is equivalent to $ZEXTRACT(expr,intexpr1,$L(expr)). Note that this means when using the M character set, $ZSUBSTR() behaves identically to $EXTRACT() and $ZEXTRACT(). The differences are as follows:

  • $ZSUBSTR() cannot appear on the left-hand side of the equal sign in a SET command, whereas $ZEXTRACT() can.

  • In both modes, the third expression of $ZSUBSTR() is measured in bytes, rather than characters, within the first expression.

  • $EXTRACT() operates on characters, irrespective of byte length.

  • $ZEXTRACT() operates on bytes, irrespective of multi-byte character boundaries.

  • $ZSUBSTR() is the only way to extract valid UTF-8 encoded characters from a byte string containing mixed UTF-8 and non-UTF-8 data. It returns only complete Unicode® characters, and its result does not exceed the given byte length.
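
The differences listed above can be seen by applying each function to the same string with the same arguments. The following is an illustrative sketch, not an excerpt from a recorded session. It assumes a process started in UTF-8 mode and a terminal that enters "ç" as two bytes and "新" as three bytes; the variable name y and the choice of characters are for illustration only. $ZLENGTH() reports the byte length of each result, because the $ZEXTRACT() result ends in the middle of a character and may not display cleanly.

Example:

GTM>write $zchset
UTF-8
GTM>set y="aç新" ; 1 + 2 + 3 = 6 bytes, 3 characters
GTM>write $length(y)," ",$zlength(y)
3 6
GTM>write $zlength($extract(y,1,2)) ; characters 1 through 2: "aç"
3
GTM>write $zlength($zextract(y,1,2)) ; bytes 1 through 2: "a" plus the first byte of "ç"
2
GTM>write $zlength($zsubstr(y,1,2)) ; at most 2 bytes starting at byte 1, complete characters only: "a"
1

$EXTRACT() returns both characters (three bytes), $ZEXTRACT() returns exactly two bytes and splits "ç", and $ZSUBSTR() returns only "a" because including the second byte would leave "ç" incomplete.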