JOIN - Join Lines Using Find/Change Strings

Contents of Article


Syntax

Operands

Description

Performing joins when a search Picture of P'[' or P']' is desired

Simulating a JOIN with a search string of P'[' or P']'

Simulating a JOIN with a search string of P'[]'

Suppose you really want to JOIN and not just simulate it?

Performing joins more complex than JOIN can support

Line joining and line exclusion

Simulating CHANGE operands that are not available on JOIN


Syntax


JOIN

P'from-string'  |  R'from-string'  

[ to-string ]

[ FIRST | LAST | NEXT | PREV | ALL ]

[ PREFIX | SUFFIX | WORD | CHAR ]

[ line-control-range ]

[ color-selection-criteria ]

[ X | NX ]

[ U | NU ]

[ TOP ]


Operands


from-string

The search string you want to look for.  The from-string must be defined as either a P-type Picture string or an R-type RegEx string, with a very restricted format that is discussed below.


to-string

The string that replaces from-string.  This may be any standard change string, including P-type Picture strings and F-type Format strings.  The to-string is optional; if omitted, it is treated the same as P'!'.  That is, when to-string is omitted, the replacement string is the same as the value found by the from-string.  Note that the alignment Picture codes [ and ] do not correspond to data values, so they are not part of the value represented by P'!'.


FIRST

Starts at the top of the data and searches ahead to find the first occurrence of from-string.


LAST

Starts at the bottom of the data and searches backward to find the last occurrence of from-string.  See Release Note below.


NEXT

Starts at the first position after the current cursor location and searches ahead to find the next occurrence of from-string.  NEXT is the default.


PREV

Starts at the current cursor location and searches backward to find the previous occurrence of from-string.  See Release Note below.


ALL

Starts at the top of the data and searches ahead to find all occurrences of from-string.


PREFIX

Locates from-string at the beginning of a word.


WORD

Locates from-string when it is delimited on both sides by blanks or other non-alphanumeric characters.


CHAR

Locates from-string regardless of what precedes or follows it.


SUFFIX

Locates from-string at the end of a word.


line-control-range

The range of lines which are to be processed by the command.  Line control ranges provide a powerful tool to customize the range of lines to be processed.   The full syntax and allowable operands which make up a line control range are discussed in "Line Control Range Specification".  Refer to that section of the documentation for details.


color-selection-criteria

A request for selection based on the highlight color of the from-string. Color requests provide another powerful tool to control search selection.   The full syntax and allowable operands which make up a color-selection-criteria  are discussed in "Color Selection Criteria Specification".   Refer to that section of the documentation for details.


X | NX

Specifies a subset of the line range to be processed.   X requests that only excluded lines be examined; NX requests that only non-excluded lines be examined.   If neither X nor NX is specified, all lines in the range will be examined.


U | NU

Specifies a subset of the line range to be processed.   U requests that only User lines be processed; NU requests that only non-User lines be processed.   If neither U nor NU is specified, all lines in the range will be processed.


TOP

Normally, at the completion of the command, the first (or only) line processed is highlighted if it is on the current screen; if it is not, the screen is scrolled so that the line appears as the 2nd screen line (as ISPF does).  If TOP is coded, the line is always positioned as the top line of the screen, regardless of its current location.



Release Note for LAST and PREV keywords


The SPLIT and JOIN commands are a new technology feature; their development is quite involved, and is ongoing.  As of the writing of this Help documentation, the LAST and PREV keywords on JOIN may not work correctly, or may not be supported at all, initially.  As this situation changes, we will keep you updated via the SPFLite web site about any new capabilities that are made available.

Abbreviations and Aliases


PREFIX can also be spelled as PRE or PFX

SUFFIX can also be spelled as SUF or SFX

WORD can also be spelled as WORDS

CHAR can also be spelled as CHARS

Description


The JOIN edit primary command is used to selectively combine lines of text based on a search string.  After a join occurs, a line on which the from-string is found will be combined either with the line before it, or with the line after it.  So, each time a join takes place, two lines of text will become one line of text.


Notes:


The from-string describes the join point where the joining operation takes place.  Because a join operation joins a given line to the line that precedes it or to the line that follows it, a join can take place at only one of two join points: at the beginning of a line, or at the end of a line.


The from-string may be specified as a Regular Expression, in addition to a Picture string.


The from-string must be either:


P'[string'

JOIN is being asked to perform a left-join operation.  The value of string must appear on the left side of lines within the line range in order to be joined; otherwise any lines not beginning with string will be ignored for JOIN purposes.  When a [ Picture code appears in the from-string of a JOIN command, it must appear in the left-most position of the Picture string, and nowhere else.


P'string]'

JOIN is being asked to perform a right-join operation.  The value of string must appear on the right side of lines within the line range in order to be joined; otherwise any lines not ending with string will be ignored for JOIN purposes.  When a ] Picture code appears in the from-string of a JOIN command, it must appear in the right-most position of the Picture string, and nowhere else.



R'^expression'

JOIN is being asked to perform a left-join operation.  The regular expression must start with the ^ directive to indicate the left-hand edge of the line.  The remaining expression may be any valid RegEx expression.


R'expression$'

JOIN is being asked to perform a right-join operation.  The regular expression must end with the $ directive to indicate the right-hand edge of the line.  The remaining expression may be any valid RegEx expression.


It is not possible to directly join one line to both the line that precedes it and the line that follows it, at the same time.  That is to say, a from-string of a JOIN command cannot be specified in the form of P'[string]' or R'^expression$'.  For any given line, only one JOIN, on one side of a line, is allowed.  JOIN cannot perform a left-join and a right-join at the same time, since this would imply converting three lines of data into one line in a single join operation, an action that SPFLite does not support.  See the discussion below for ways to simulate such complex joins.


Note: Because the from-string must be a Picture or RegEx, there could be cases where you are trying to find data containing characters that are already defined as special-purpose Picture or RegEx codes.  If you need such characters treated as ordinary data rather than as Picture codes, you can escape those characters by preceding them with a \ backslash.  See Specifying a Picture or Format String for more information.  See Specifying a Regular Expression for escaping within a Regular Expression.  


Note: In a Picture string, any character can be escaped with a backslash, whereas in a Regular Expression, only a limited number of "special" codes can be escaped.  If you try to escape a Regular Expression character that is not eligible for being escaped, your Regular Expression will not properly match strings in the way you expected.  This behavior is caused by the design of the Regular Expression "engine" and is outside the control of SPFLite.  Read the Regular Expression documentation carefully on this point.


The from-string must be in exactly one of the formats listed above: a left-join form or a right-join form.  The string value cannot be omitted.  That is, you cannot have a JOIN from-string appearing literally as P'[' or P']'.  The main reason for this is that the [ and ] codes represent edges of a line, but do not represent actual data values.  So, a picture of P'[' or P']' would actually represent a zero-length string, and the string-search engine in SPFLite will not "find" strings of zero length, because there is literally nothing to find.  For the same reason, a picture of P'[]' is not allowed either.  See the discussion below for ways to simulate such joins.


The to-string is optional.  If you omit it, the found-string is copied as is (ignoring the [ or ] Picture code), the same as if a to-string of P'!' was specified.  That means the following JOIN commands all work the same way:


       JOIN P'[ABC' 'ABC'        

       JOIN P'[ABC' P'!'

       JOIN P'[ABC'


It is not illegal to ask JOIN to left-join line 1 of a file, or to right-join the last line of a file.  Since there is no data to join such lines to, the JOIN request is simply ignored for those lines.


Note:  While the to-string can be a Format string, you will find that using a Picture change string here should address most of your JOIN requirements when the to-string isn't a "simple" change string.  Where a Format change string comes in handy is when the from-string is a Picture (the usual case for JOIN, unless you use a RegEx), and you need one or more codes of = in the to-string that don't match the corresponding character positions in the from-string.  See Specifying a Picture or Format String for more information.  When you have an editing requirement of this kind, you will get an error message if an = code is misplaced in a Picture change string; that is your 'clue' that you need a Format string instead.  (These comments about the use of the = codes in Pictures also apply to the related < > and ~ codes, which operate in a similar way and have similar rules.)


See Working with SPLIT and JOIN Commands for example usage of the JOIN command.
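
As a rough illustration of the left-join and right-join rules described above, the following Python sketch models JOIN with ALL on a list of lines.  It is not SPFLite's implementation.  The function name join_all and its parameters are hypothetical, the match is treated as a plain literal rather than a full Picture or RegEx, and it assumes that joined lines are concatenated directly with nothing inserted between them.

```python
def join_all(lines, match, to_string=None, side="left"):
    """Rough model of JOIN ... ALL for a literal, non-empty match string.

    side="left"  models a from-string of P'[match' (join to the line before).
    side="right" models a from-string of P'match]' (join to the line after).
    to_string=None models an omitted to-string, where the match is kept as is.
    """
    if not match:
        # Mirrors the rule that P'[' and P']' alone are not accepted.
        raise ValueError("match must not be empty")
    replacement = match if to_string is None else to_string
    result = []
    if side == "left":
        for line in lines:
            # The first line has no line before it, so the request is ignored there.
            if line.startswith(match) and result:
                result[-1] += replacement + line[len(match):]
            else:
                result.append(line)
        return result
    carry = ""  # text from right-joined lines, waiting for the line after them
    last = len(lines) - 1
    for index, line in enumerate(lines):
        # The last line has no line after it, so the request is ignored there.
        if line.endswith(match) and index < last:
            carry += line[:len(line) - len(match)] + replacement
        else:
            result.append(carry + line)
            carry = ""
    return result


# Left-join with an empty to-string: the "+" mark is removed.
print(join_all(["alpha", "+beta", "gamma"], "+", ""))
# Right-join with a to-string of one blank: the trailing "\" becomes a blank.
print(join_all(["alpha\\", "beta"], "\\", " ", side="right"))
# To-string omitted: the found string is kept, as with P'!'.
print(join_all(["ABC one", "ABC two"], "ABC"))
```

In this model, each successful join turns two lines into one, and a request on the first line (left-join) or last line (right-join) leaves that line unchanged.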


Performing joins when a search Picture of P'[' or P']' is desired


As noted above, JOIN will not accept a search Picture of P'[' or P']' or P'[]'.  These strings are illegal in JOIN, because they all represent zero-length strings, and P'[]' (if legal) would represent a zero-length line; the string-search engine in SPFLite will not find such zero-length strings or lines, because there is literally nothing to find.  Suppose, though, you had some reason for doing such joins anyway.  How could you go about it?


Simulating a JOIN with a search string of P'[' or P']'


A join of this type implies that you want to join a line to another line, such that the contents of the line itself are not the deciding factor.  You probably are not (intentionally) trying to find zero-length lines, but (a) you don't care about the exact contents of the line, (b) you are almost certainly using F5 and F6 (or other keys mapped to RFIND/RLOCFIND and RCHANGE) to selectively join lines because you have to manually inspect each one to decide whether to JOIN it or not, and (c) in case you do run into a zero-length line while doing this, you don't want to be stopped from what you are planning on doing.


Suppose you wanted to selectively find lines, perhaps with the FIND command or by some other means, and when you find such lines (even zero-length lines), you want to join them to the line before or after.


Because the line command G will perform a "physical join" of the line it is on to the line that follows it, it basically performs a function similar to what a right-joining command of JOIN P']' '' would do, if such a command were legal.  Let's say you issued some kind of FIND command, and you keep pressing F5 to RFIND the desired line(s).  You now want to right join the line - whatever it contains, even a zero-length line - to the line after it.  For convenience, let's map the line command G to the Ctrl Shift F5 key.  The KEYMAP string would appear as {G}.  Then, when you find a desired line, you would right-join it by pressing Ctrl Shift F5.


Performing a function similar to what a left-joining command of JOIN P'[' '' would do works basically the same way.  However, since the join/glue line commands always perform operations that amount to joining on the right, you have to turn a right join into a left join by moving the cursor up one line and then issuing the G line command.  Let's map the cursor movement and the line command G to the Ctrl Alt F5 key.  The KEYMAP string would appear as (Up){G}.  Then, when you find a desired line, you would left-join it by pressing Ctrl Alt F5.  So, instead of joining the "current" line to the "previous" line, we move the cursor up one line, and then join the "previous" line to the "current" one - but the result is the correct one.


This technique will also work when one of the lines being joined is a zero-length line.  In such cases, the zero-length line is effectively deleted.


Simulating a JOIN with a search string of P'[]'


A join of this type implies that you want to find lines of zero length and "join" them to an adjacent line.  However, by definition, since the line is of zero-length, "joining" such a line - whether to a preceding line or to a subsequent line - is the same thing as deleting it.  So, you are actually trying to delete zero-length lines by using a JOIN to do it.


You cannot directly "find" strings of zero-length, because as noted above, SPFLite's string search engine will not find zero-length strings, as there is literally nothing to find.  However, you can find lines in which no characters are present.  This relies on the NFIND command.  If you issue a command of the form:


       NFIND P'='


you are asking SPFLite to find lines in which not even a single character (of length one) matching a Picture of P'=' is found.  Since a Picture of P'=' represents any character, the NFIND command finds any lines where there are no characters found at all, because the test for finding characters that match the pattern of P'=' has failed.  The only lines that can possibly meet this condition are lines of zero length.  This command is repeatable by using the RFIND or RLOCFIND command (traditionally mapped to F5).


Once you find the desired zero-length lines, just delete them.  You could map a key to the {D} delete line command for this purpose, or perhaps you might wish to "mark" such zero-length lines with a special character (like the ? character described below) and then go back and delete all of them with a primary command such as:


       DELETE '?' ALL


If your goal is simply to delete all zero-length lines from a file, the easiest way to do this is as follows, which relies on the NEXCLUDE command, abbreviated as NX.  (This is not the only way to do it, but it's a good general example.)


RESET

NX P'=' ALL

DELETE ALL X
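
The effect of this three-command sequence can be pictured with a small Python sketch.  This is only a model of the outcome under the description given above, not how SPFLite works internally, and the function name delete_zero_length_lines is hypothetical.

```python
def delete_zero_length_lines(lines):
    # RESET: start with no lines excluded.
    excluded = [False] * len(lines)
    # NX P'=' ALL: exclude every line on which no character at all is found,
    # which can only be a line of zero length.
    for index, line in enumerate(lines):
        if len(line) == 0:
            excluded[index] = True
    # DELETE ALL X: delete the excluded lines and keep everything else.
    return [line for line, is_excluded in zip(lines, excluded) if not is_excluded]


sample = ["first", "", " ", "", "last"]
print(delete_zero_length_lines(sample))
```

Note that a line holding a single blank is kept in this model, because a blank is still a character and so the line is not of zero length.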


Suppose you really want to JOIN and not just simulate it?


The main problem with doing this is dealing with zero-length lines.  The easiest way to overcome this is to put some data on those lines so they aren't zero-length any more.  One method of doing this is to APPEND a blank to such lines.  Appending a blank to a zero-length line will make it a line of length 1.  For most JOIN purposes, that should be good enough.  Here is a sequence that will 'fix' all the zero-length lines in a file:


RESET

NX P'=' ALL

APPEND ' ' ALL X


Alternatively, the Pad to Length line command PL can take a / or \ modifier.  Putting PL/ on line 1 of a file will ensure that every line of the file is at least one character long.


Performing joins more complex than JOIN can support

You may encounter cases where the JOIN command won't do everything you want.  You might want to join a line to the line before it and to the line after it, at the same time.  You may have text already colored by highlighting pens and you need precise control over how the text colors are affected by joining.  You may need special features supported by CHANGE, such as TRUNC, MX, DX, etc.  These and other cases may require a different approach.


Keeping in mind that a JOIN is a type of "change" to a line, you can perform more-complex joining by doing this in two stages.  First, use a CHANGE command to insert "user-defined join points" into your data, and then go back and use one or more JOIN commands to actually combine the lines.  This technique has the nice feature that after the first stage, you can manually inspect all of the user-defined join points you just put in and verify they are all where you want them, adding or removing a few before the second stage if you have special cases.  Because the CHANGE command, and any manual editing of your own, will have placed these user-defined join points exactly where you need them, only a simple form of JOIN will be required to combine the lines.


To do this, you might want to map some special ANSI character that you rarely use in your own data, and use that to represent your user-defined join points.  If necessary, you can use the (Ansi) function to get any ANSI character into the clipboard, and then use it as a value for KEYMAP.


Suppose you had a string "AB-CD-EF"  appearing between lines .ONE and .TWO.  In some cases, AB-CD-EF is the only thing on a line, neither preceded nor followed by any characters (even spaces), and in other cases it may appear on some lines next to other data.  You want to take the lines where AB-CD-EF is the only thing on a line, and join that line both to the line before it and to the line after it, at the same time.  You can't do that directly with JOIN, but (assuming that ? does not appear in your data) you could do the following instead.  The idea here is that you would use the ? character to temporarily mark a user-defined join point, which you go back and process with subsequent JOIN commands, which also remove the temporary join marks.  The same mark is used on both sides, so that when you do the JOIN commands, you will be sure that you just change lines in which AB-CD-EF is the only thing on a line, rather than in places where it happens to just begin or end a line.  That is, you want to make sure you join the correct lines, and nothing else.


In the commands shown below, note that CHANGE can have a find Picture containing both [ and ], whereas JOIN allows only one of these codes (but not both).  We use a change Picture of P'?!?' in the CHANGE command for conciseness.  You could also have used a simple string of '?AB-CD-EF?' instead.  Note how the JOIN command change-strings effectively delete the ? character by not including it in the characters appearing in the change-string.


As the JOIN commands below show, the to-string of the JOIN command can be a zero-length (null) string.  The net effect is that the ? character defines where the join takes place, and then that ? character is removed by being replaced with ''.


If you want to do this, you have to explicitly specify an empty to-string of '', because if you just omit the second string operand, it's assumed to be P'!', which copies the original found-string rather than deleting it, and that is not what you wanted in this example here.


The first JOIN joins lines on the left side, and the second joins on the right side; the user-defined join-point character then disappears.


       CHANGE P'[AB-CD-EF]' P'?!?' ALL .ONE .TWO

       JOIN P'[?' '' ALL .ONE .TWO

       JOIN P'?]' '' ALL .ONE .TWO

By the way, if you don't feel like using the ? character, pick anything you like that's convenient and not already in your data.
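
To make the two-stage technique concrete, here is a Python sketch that mimics the CHANGE step and the two JOIN steps on a list of lines.  It is a model only: the helper names are hypothetical, the ? mark is treated as a plain literal, and it assumes joined lines are concatenated directly with nothing inserted between them.

```python
MARK = "?"
TARGET = "AB-CD-EF"


def mark_whole_lines(lines):
    # Models CHANGE P'[AB-CD-EF]' P'?!?' ALL: only lines that consist of
    # TARGET and nothing else get a mark on both sides.
    return [MARK + line + MARK if line == TARGET else line for line in lines]


def left_join(lines, match, to_string):
    # Models JOIN P'[?' '' ALL: a line starting with the mark is joined to
    # the line before it, and the mark is replaced by to_string.
    result = []
    for line in lines:
        if line.startswith(match) and result:
            result[-1] += to_string + line[len(match):]
        else:
            result.append(line)
    return result


def right_join(lines, match, to_string):
    # Models JOIN P'?]' '' ALL: a line ending with the mark is joined to
    # the line after it, and the mark is replaced by to_string.
    result, carry = [], ""
    for index, line in enumerate(lines):
        if line.endswith(match) and index < len(lines) - 1:
            carry += line[:len(line) - len(match)] + to_string
        else:
            result.append(carry + line)
            carry = ""
    return result


lines = ["first", "AB-CD-EF", "last", "AB-CD-EF trailing"]
marked = mark_whole_lines(lines)
after_left = left_join(marked, MARK, "")
after_right = right_join(after_left, MARK, "")
print(after_right)
```

In this model the line holding only AB-CD-EF ends up combined with both of its neighbors, while the line where AB-CD-EF merely begins other data is never marked and so is left alone.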


Line joining and line exclusion


The JOIN command supports the X and NX keywords, to allow you to limit your line-range selection to only excluded lines (X), or only non-excluded lines (NX), if you wish.  Regardless of the use of the X or NX keywords, when a line is joined-to, it is considered a "change" to that line.



Any joined-to line that was excluded at the time the join is done will be unexcluded.


The JOIN command does not support the MX and DX keywords at this time.


Simulating CHANGE operands that are not available on JOIN


As noted above, JOIN does not allow MX and DX.  JOIN also does not presently support a column range.  Suppose you wanted such features for your JOIN processing; how could you accomplish it?


The main way is to use other commands in a "pre-join preparatory step" and then do your JOIN.


To achieve the effect of MX or DX, it may be possible to use a FIND or CHANGE with MX or DX first.


Another approach is to use the TAG command, tagging lines where you want the JOIN to take place, then going back and using JOIN with the tag name you set.  See TAG - Alter Tag Status of a Range of Lines for more information on using this command.


For example, suppose you want to do a "left join" on all lines where the string ABC appears in columns 21 to 29.  But ABC might be anywhere in those columns, and it might be preceded by any arbitrary text, so there is no easy way to do this with JOIN alone.  Here is an example of how you could do this:


First, tag all the lines that meet your criteria.  Let's use a tag of :J for "join", and we will use the SET option of TAG so any prior :J tags will be cleared.


Then, we can do the join on the tagged lines.  Because the prior TAG command only tagged lines having the ABC string, none of those lines would ever be of zero length, so the JOIN command with the arguments of  P'[='  P'='  will always 'find' every line that we just tagged with :J.


Finally, if you wish, we can clear out the :J tags since they are no longer needed.


Here are the commands you would use:


TAG :J SET 'ABC' 21 29 ALL

JOIN P'[=' P'=' ALL :J

RESET TAG :J
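
The tag-then-join idea can also be pictured in Python.  The sketch below is a model under stated assumptions, not SPFLite's implementation: the function names are hypothetical, columns are taken to be 1-based and inclusive, the string is assumed to have to lie entirely within the column range, and joined lines are assumed to be concatenated directly.

```python
def tag_lines(lines, text, first_col, last_col):
    # Models TAG :J SET 'ABC' 21 29 ALL: a line is tagged when text lies
    # wholly inside columns first_col..last_col (1-based, inclusive).
    return [text in line[first_col - 1:last_col] for line in lines]


def left_join_tagged(lines, tagged):
    # Models JOIN P'[=' P'=' ALL :J: every tagged line with at least one
    # character is joined to the line before it, and its text is kept as is.
    result = []
    for line, is_tagged in zip(lines, tagged):
        if is_tagged and line and result:
            result[-1] += line
        else:
            result.append(line)
    return result


lines = [
    "header",
    " " * 22 + "ABC tail",   # ABC in columns 23-25: tagged
    "ABC at column one",     # ABC outside columns 21-29: not tagged
    "x" * 26 + "ABC",        # ABC in columns 27-29: tagged
]
tags = tag_lines(lines, "ABC", 21, 29)
joined = left_join_tagged(lines, tags)
print(tags)
print(len(joined))
```

The point of the design is that the column test happens in the tagging step, so the join step only needs the simplest possible from-string.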

