Suppose you have formatted strings of the form R=nn,nn,nn where nn are spans of decimal digits, but the precise number of digits can vary. The strings are to be converted into the form Bxhhhhhh, where each h is a hex digit. (You may recognize this as the RGB and BGR color palette codes from SPFLite’s automatic colorization files.)

First, we need to find spans of characters that look like the ones we want. There are several ways to do this, but the simplest is to look for the R= prefix followed by one or more digits and commas. This is not a precise definition, but it is close enough for our purposes. Our Regular Expression find string would thus look like R'R=[0-9,]+'.

  • Note:  Letters in a Regular Expression are matched to data as case-insensitive by default, unless your R string begins with a \c escape code. In our example, the data itself is case-insensitive, so we are fine with allowing this assumption to be used.

  • Note:  If you are unsure how “good” your Regular Expression string actually is, the best way to find out is to exclude your whole file, and then use the Regular Expression string in a FIND ALL command. If you like what “pops out”, you can continue with using it in a CHANGE; otherwise you may have to fix your Regular Expression definition.
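The find step can also be tried outside the editor. The following is a minimal Python sketch, not SPFLite code: it assumes Python's re module as a stand-in for the SPFLite Regular Expression engine, uses re.IGNORECASE to mimic the case-insensitive default described above, and runs against sample lines that were invented for illustration.

```python
import re

# Python analogue of the SPFLite find string R'R=[0-9,]+'.
# re.IGNORECASE mimics the case-insensitive default described above.
PATTERN = re.compile(r"R=[0-9,]+", re.IGNORECASE)

# Hypothetical sample lines, invented for illustration only.
sample_lines = [
    "keyword  R=153,85,15",
    "comment  r=0,128,255  trailing text",
    "no color code on this line",
]

# Collect every span the pattern finds, line by line.
matches = [m.group(0) for line in sample_lines for m in PATTERN.finditer(line)]
print(matches)
```

Like FIND ALL on an excluded file, this only shows what the pattern "pops out"; it changes nothing.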

The result string requires a constant string of Bx to be included in the result. We do that with an inner quoted string of `Bx`.

Next, we have to extract the decimal values, one at a time, and emit them as hex numbers with a fixed length of 2 digits each. Our scan will proceed left-to-right, and we will be extracting the first, second and third decimal values found in that scan. So, each item is of the form  nDX2F1+. The parts of each item have the following meaning:

  • n is the relative location of the desired field. Because n appears on the left side of the command code, a left-to-right scan will be performed. There will be three mapping items, and because the result string has to have these values reversed, the n value used will be 3, 2 and 1 in that order.

  • DX is the mapping conversion code, requesting Decimal to Hex conversion.

  • 2F requests that the converted value be stored in a fixed field of two hex digits. By default, DX will render hex digits larger than 9 as capital A-F. You can use a case conversion code of < or one of the case alteration codes like LC to change that default. Our example will accept the default.

  • The column reference operand 1+ selects the entire value found by the Regular Expression of R'R=[0-9,]+'. As you will see, this same value is referenced three times, since three separate scans are required to locate the first, second and third values from the original source string. Because the numbers could be anywhere in the source string, the entire value has to be scanned three times.
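The effect of a single mapping item such as 3DX2F can be approximated in Python. This is only a sketch of the behavior described in the list above, not of how SPFLite implements it: the helper name nth_number_as_hex is hypothetical, and it assumes that any non-digit character separates one span of digits from the next.

```python
import re

def nth_number_as_hex(found: str, n: int, width: int = 2) -> str:
    """Return the n-th span of digits in `found` (1-based, counted from
    the left) as uppercase hex, zero-padded to `width` digits."""
    # Non-digit characters act as separators between spans of digits.
    spans = re.findall(r"[0-9]+", found)
    value = int(spans[n - 1])
    # "X" gives capital A-F; "0{width}" pads short values with zeros.
    return format(value, f"0{width}X")

found = "R=153,85,15"
# One scan of the whole found string per item, in the order 3, 2, 1.
parts = [nth_number_as_hex(found, n) for n in (3, 2, 1)]
print(parts)
```

The sketch does not truncate values that need more than two hex digits (256 and above); what SPFLite does in that case is not covered here.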

Combining this all together gives us the following CHANGE command:

CHANGE R'R=[0-9,]+' WORD M"`Bx` 3DX2F1+ 2DX2F1+ 1DX2F1+"

Because the DX command is a copying command that will auto-reference, the 1+ reference can be omitted, which simplifies the CHANGE command to:

CHANGE R'R=[0-9,]+' WORD M"`Bx` 3DX2F 2DX2F 1DX2F"

Example:  A string  R=153,85,15  will be converted into  Bx0F5599. The order of the numeric values has been reversed, the values were converted from decimal to hex, each with a fixed field size of 2, the prefix  Bx has been inserted, and the remainder of the original source data (the R= part) has been ignored. The entire found field was scanned three times to locate the values, because of the 1+ column reference that was used (or implied) three times.
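To check the expected result of the example independently of the editor, the whole conversion can be mimicked in a few lines of Python. This is a hedged sketch under stated assumptions: the function name convert_color_codes is hypothetical, the WORD operand of the CHANGE command is not modeled, and every matched string is assumed to contain at least three numbers.

```python
import re

def convert_color_codes(text: str) -> str:
    """Replace each R=nn,nn,nn span with Bxhhhhhh, reversing the values."""
    def convert(match: re.Match) -> str:
        # Pull out the spans of digits; 'R', '=' and ',' are skipped.
        numbers = [int(s) for s in re.findall(r"[0-9]+", match.group(0))]
        # Assumption: at least three numbers are present in the match.
        first, second, third = numbers[:3]
        # Emit third, second, first as two-digit uppercase hex after Bx.
        return "Bx" + "".join(f"{v:02X}" for v in (third, second, first))
    return re.sub(r"R=[0-9,]+", convert, text, flags=re.IGNORECASE)

result = convert_color_codes("R=153,85,15")
print(result)
```

For the input R=153,85,15 the sketch yields Bx0F5599, matching the example above.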

Since the n values (3, 2, 1) precede the DX command codes (3DX, 2DX, 1DX), the scan is performed left-to-right (finding the third value from the left, second value from the left, then first value from the left).

If these had been written as (D1X, D2X, D3X) the scan would have been performed right-to-left.

Extra credit:  Did you notice that we made no attempt to bypass the 'R=' part of the source string?  Why does this work?  When your source string value is being scanned for numbers, characters like 'R', '=' and ',' are not recognized as digits, so they are treated as separators between spans of digits, and otherwise are simply ignored.
