String Variables

A-Shell BASIC strings are similar to strings in other languages, although a few quirks are worth special attention:

•   ANSI: Strings use one byte per character, unlike, say, Unicode or wide strings. Typically the Latin1 encoding is used: standard ASCII for the low 7 bits, with the upper range used for certain accented characters and symbols common in the most popular Western and Latin-derived languages. But nothing prevents you from using other encodings such as UTF8, or XML encoding, provided there are no embedded null bytes, and provided that you handle the necessary conversions for printing and other uses.

•   Fixed vs Variable Length: String variables may either be declared as fixed length (e.g. S,25) or variable length by specifying length zero (S,0). Fixed length strings are not only fixed in length but also in position relative to other variables, which is particularly important for mapped structures mirroring data records made up of fixed length fields. Variable length (aka dynamic) strings will move around in memory as they shrink and grow, so should not be used within fixed structures. See Dynamically-Sized Variables for more details.

•   Termination: String variables are either terminated logically by the first null byte, or physically, by filling up the allotted space in a fixed size variable. Conceptually it shouldn't matter, as the terminator is effectively added as needed when performing operations. But when passing string variables to subroutines embedded within A-Shell (written in C) or to external libraries (using DYNLIB), you may be responsible for providing your own explicit null terminators if the variable was full and is immediately followed by another in memory, as within a structure. Most standard subroutines handle this detail internally—check the documentation if uncertain—but virtually no external libraries will, because in most other languages, the assumption is that strings are always explicitly terminated by a null byte.

•   Logical vs Physical Length: As just noted, when a fixed length string contains an explicit null terminator, the logical length of the string variable, as measured by the LEN() function, will be shorter than the physical length, as measured by the SIZEOF() function. The difference between the logical and physical length is generally not of interest for most operations, except when making assignments directly to a sub-string. See Sub-String Operator for more details. This is not applicable to variable length strings.

•   Trailing Blanks: Trailing blanks are significant when it comes to display, printing, concatenation, sub-string operations, etc. The one glaring exception is that they are not considered significant when comparing two strings, whether variables or expressions. This exception applies only to blanks, i.e. ASCII 32, and not to any other non-printing characters such as tabs, carriage returns, etc. If you want to distinguish between two strings that differ only in the number of trailing blanks, you have to take the string length into account.

•   X vs S variables: The X (unformatted) data type behaves similarly to S (string), and a string variable can always be copied to an X variable and back again without any loss. Note that the reverse is not true. The main difference between the two types is that for strings, a null byte marks the logical end of the data, whereas for X variables, null bytes are handled just like any other bytes. The logical end of the data is determined solely by the physical length of the variable. In the case of dynamic X variables, the logical and physical lengths are always the same.
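
The termination, length, and trailing-blank rules above can be modeled in a few lines. The sketch below is written in Python rather than A-Shell BASIC, and it is only an illustration of the behavior described in this section, not a statement of how A-Shell implements it. All function names (logical_len, physical_len, to_c_string, strings_equal) are hypothetical, and a fixed-length string field is assumed to be representable as a Python bytes value of the field's full size.

```python
# Hypothetical model of the string rules described above; this is not A-Shell code.
NUL = b"\x00"
SPACE = b" "


def logical_len(field: bytes) -> int:
    """Length up to the first null byte, or the whole field if it has none."""
    pos = field.find(NUL)
    return len(field) if pos < 0 else pos


def physical_len(field: bytes) -> int:
    """Allotted size of the field, regardless of its contents."""
    return len(field)


def to_c_string(field: bytes) -> bytes:
    """Copy the logical contents and append the explicit terminator
    that an external C library would expect."""
    return field[:logical_len(field)] + NUL


def strings_equal(a: bytes, b: bytes) -> bool:
    """Compare logical contents, ignoring trailing blanks (ASCII 32 only)."""
    left = a[:logical_len(a)].rstrip(SPACE)
    right = b[:logical_len(b)].rstrip(SPACE)
    return left == right


if __name__ == "__main__":
    name = b"ABC" + NUL * 7   # 10-byte field holding "ABC"
    full = b"ABCDEFGHIJ"      # 10-byte field filled completely, no terminator

    print(logical_len(name), physical_len(name))  # 3 10
    print(logical_len(full), physical_len(full))  # 10 10
    print(to_c_string(full))                      # b'ABCDEFGHIJ\x00'
    print(strings_equal(b"ABC", b"ABC   "))       # True
    print(strings_equal(b"ABC", b"ABC\t"))        # False
```

The case to watch is the completely filled field: its logical and physical lengths are equal, so nothing inside the field marks its end, and a copy with an added terminator (as in to_c_string) is what a null-terminated consumer would need.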