Finds a subtext within a text structure.
Options
| Option | Description |
|---|---|
| CASE = string token | Whether to treat the case of letters (small or capital) as significant when searching for the SUBTEXT within the TEXT (significant, ignored); default sign |
| REVERSE = string token | Whether to reverse the search to work from the end of the TEXT (yes, no); default no |
| MULTISPACES = string token | Whether to treat differences between multiple spaces and single spaces as significant, or to treat them all like a single space (significant, ignored); default sign |
| DISTINCT = string tokens | Whether to require the SUBTEXT to have one or more separators to its left or right within the TEXT (left, right); default * |
| SEPARATOR = string | Characters to use as separators; default ' ,;:.' |
| SAMELINE = string token | Whether to ignore matches in the TEXT where the SUBTEXT is not all on the same line (yes, no); default no |
Parameters
| Parameter | Description |
|---|---|
| TEXT = texts | Texts to be searched |
| SUBTEXT = texts | Text to look for in each TEXT |
| COLUMN = scalars | Position of the column within TEXT where the first character of SUBTEXT has been found |
| LINE = scalars | Number of the line within TEXT where the first character of SUBTEXT has been found |
| ICOLUMN = scalars | Column within TEXT at which to start the search |
| ILINE = scalars | Line within TEXT at which to start the search |
| ENDCOLUMN = scalars | Position of the column within TEXT where the last character of SUBTEXT has been found |
| ENDLINE = scalars | Number of the line within TEXT where the last character of SUBTEXT has been found |
Description
The TXFIND directive looks for a Genstat text structure within another text structure. The text to search is specified by the TEXT parameter, and the SUBTEXT parameter specifies the text to be found. The search treats the two texts as if they were paragraphs of characters: that is, it takes no account of the line breaks within the two text structures, replacing each one with a space. The COLUMN parameter saves the column within the TEXT where the first character of the SUBTEXT is found, and the LINE parameter saves its line within the TEXT. These are both set to zero if SUBTEXT is not found. Similarly the ENDCOLUMN and ENDLINE parameters save the position of the last character of the SUBTEXT. You can use the ICOLUMN and ILINE parameters to specify a starting column and line for the search. So you can search for the next occurrence of SUBTEXT by setting ILINE to the saved value of LINE, and ICOLUMN to the saved value of COLUMN plus one.
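As a minimal sketch of the position parameters described above, the following commands find the first occurrence of a word, save both its start and end positions, and then restart the search one column further on. The text structure Rhyme and the scalars col, lin, ecol and elin are hypothetical names chosen for illustration; they are not part of the directive.

```
" Hypothetical two-line text used only for illustration "
TEXT [VALUES='one fish two fish','red fish blue fish'] Rhyme
" First occurrence: save the positions of the first and last characters "
TXFIND Rhyme; SUBTEXT='fish'; COLUMN=col; LINE=lin;\
  ENDCOLUMN=ecol; ENDLINE=elin
PRINT lin,col,elin,ecol; DECIMALS=0
" Next occurrence: start again one column beyond the previous match "
TXFIND Rhyme; SUBTEXT='fish'; COLUMN=col; LINE=lin;\
  ICOLUMN=col+1; ILINE=lin
PRINT lin,col; DECIMALS=0
```

Once no further occurrence exists, COLUMN and LINE should both come back as zero, which is a convenient test for ending a loop.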
TXFIND usually takes account of the case of letters (small or capital) when looking for the SUBTEXT within the TEXT. So, for example, 'Genstat' would not match 'genstat'. However, you can set option CASE=ignored to ignore differences in case. It will usually also treat multiple spaces as significant, but you can set option MULTISPACES=ignored to treat them all like a single space.
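The effect of the CASE and MULTISPACES options can be sketched as follows. The text structure Notes and the scalars that receive the results are hypothetical names used only for this illustration.

```
" Hypothetical text: capitals on line 1, a run of three spaces on line 2 "
TEXT [VALUES='GENSTAT reads data','from   many sources'] Notes
" Default search: the case of letters is significant "
TXFIND Notes; SUBTEXT='Genstat'; COLUMN=col1; LINE=line1
" Same search, ignoring differences in case "
TXFIND [CASE=ignored] Notes; SUBTEXT='Genstat'; COLUMN=col2; LINE=line2
" Treat runs of spaces in the TEXT like a single space "
TXFIND [MULTISPACES=ignored] Notes; SUBTEXT='from many';\
  COLUMN=col3; LINE=line3
PRINT line1,col1,line2,col2,line3,col3; DECIMALS=0
```

Following the description above, the first search should leave col1 and line1 at zero, while the other two should report a match.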
Option DISTINCT is useful if you are looking for distinct words or phrases. The left setting requires the SUBTEXT to begin either at the start of the TEXT, or to be preceded in the TEXT by a separator (such as a space or comma). Similarly, the right setting requires the SUBTEXT to end within the TEXT with a separator (or to be at the end of the TEXT). The separators are specified by the SEPARATOR option.
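A short sketch of a whole-word search, assuming a hypothetical text structure Phrase in which the letters of the SUBTEXT also occur inside longer words. The scalar names are likewise illustrative, and the SEPARATOR setting in the last command is an arbitrary choice made to show the option.

```
" Hypothetical text in which the letters t-h-e occur inside other words "
TEXT [VALUES='other theory; the end'] Phrase
" Unrestricted search: a match inside a longer word is accepted "
TXFIND Phrase; SUBTEXT='the'; COLUMN=colany
" Whole-word search: separator (or an end of the TEXT) needed on both sides "
TXFIND [DISTINCT=left,right] Phrase; SUBTEXT='the'; COLUMN=colword
" Illustrative separator set: only space and semicolon act as separators "
TXFIND [DISTINCT=left,right; SEPARATOR=' ;'] Phrase; SUBTEXT='theory';\
  COLUMN=colcustom
PRINT colany,colword,colcustom; DECIMALS=0
```

The unrestricted search should report an earlier column than the whole-word search, because the latter skips the occurrences embedded in other words.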
By default, the SUBTEXT can be split over several lines of the TEXT, but you can set option SAMELINE=yes to ensure that it will be recognised only if it is all on a single line.
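The difference can be sketched with a phrase that straddles a line break. The text structure Twolines and the scalar names below are hypothetical and serve only as an illustration.

```
" Hypothetical text with the phrase split across lines 1 and 2 "
TEXT [VALUES='analysis of','variance tables'] Twolines
" Default: the line break is treated as a space, so the phrase can match "
TXFIND Twolines; SUBTEXT='of variance'; COLUMN=cola; LINE=linea;\
  ENDLINE=enda
" SAMELINE=yes: only matches lying entirely on one line are recognised "
TXFIND [SAMELINE=yes] Twolines; SUBTEXT='of variance';\
  COLUMN=colb; LINE=lineb
PRINT linea,cola,enda,lineb,colb; DECIMALS=0
```

In the default search LINE and ENDLINE should differ, showing that the match spans two lines; with SAMELINE=yes the same SUBTEXT should not be found, so colb and lineb should be zero.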
Options: CASE, REVERSE, MULTISPACES, DISTINCT, SEPARATOR, SAMELINE.
Parameters: TEXT, SUBTEXT, COLUMN, LINE, ICOLUMN, ILINE, ENDCOLUMN, ENDLINE.
Action with RESTRICT
Any restrictions are ignored.
See also
Directives: TEXT, CONCATENATE, EDIT, GETLOCATIONS, TXBREAK, TXCONSTRUCT, TXPOSITION, TXREPLACE.
Functions: CHARACTERS, GETFIRST, GETLAST, GETPOSITION, POSITION.
Commands for: Calculations and manipulation.
Example
" Example 1:4.7.3, 1:4.7.4 and 1:4.7.6"
TEXT Intro6; VALUES=!t(\
'Genstat has very comprehensive facilities for Analysis of Variance.',\
'Almost all of these can be accessed using custom menus. In this',\
'chapter, we start with the simplest design, a one-way completely',\
'randomized experiment, before introducing factorial experiments,',\
'which have more than one treatment or fixed effect. We use an',\
'experiment with a randomized block design to show how to deal with',\
'blocks, which involve more than one stratum or source of error in',\
'the analysis, and extend this idea by analysing a split-plot design.',\
'Many other types of design can also be analysed by Genstat, and',\
'details are available in Chapter 4 of Part 2 of the Guide to',\
'Genstat. We also introduce some of Genstat''s extensive facilities',\
'for creating designed experiments, available from the Design option',\
'of the Stats menu.')
TXPOSITION Intro6; SUBTEXT='Genstat'; POSITION=Where
TXPOSITION Intro6; SUBTEXT='Genstat'; POSITION=Next; SKIP=Where
PRINT Where,Next; DECIMALS=0
TXFIND [DISTINCT=left,right] Intro6; SUBTEXT='the';\
  COLUMN=column; LINE=line
PRINT [SQUASH=yes] line,column & Intro6$[line] & '!'; FIELD=column
FOR [NTIMES=999]
  TXFIND [DISTINCT=left,right] Intro6; SUBTEXT='the';\
    COLUMN=column; LINE=line; ICOLUMN=column+1; ILINE=line
  EXIT line .EQ. 0
  PRINT [SQUASH=yes] line,column & Intro6$[line] & '!'; FIELD=column
ENDFOR
TXBREAK Intro6; WORDS=Words
GROUP [CASE=ignored; REDEFINE=yes] Words
TABULATE [PRINT=count; classification=Words]