NUMAT()

Number of occurrences of a sequence in a string

Syntax

       NUMAT( <cStringToMatch>, <cString>, [<nIgnore>] ) --> nCount

Arguments

<cStringToMatch> The search string.

<cString> The string to search.

<nIgnore> The number of characters that are excluded from the search. The default value ignores none.

Returns

The function returns a value that specifies how frequently the <cStringToMatch> sequence was found in the <cString>.

Description

NUMAT() determines how often a particular <cStringToMatch> appears within <cString>. When you use <nIgnore> you can lock out a number of characters at the beginning of the <cString> and keep them out of the search. The setting for CSETATMUPA() impacts your results. The character string is searched from the left for each occurrence of the <cStringToMatch> string. If CSETATMUPA() is .F., then the search continues after the last character of the found sequence. If CSETATMUPA() is .T., then the search continues after the first character of the found sequence.
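The counting rule described above can be modelled in a few lines. The following is only a Python sketch of the documented behaviour, not the Harbour implementation: the name num_at and the multipass parameter (standing in for the CSETATMUPA() switch) are hypothetical, and SETATLIKE() wildcards are not modelled.

```python
def num_at(needle, haystack, ignore=0, multipass=False):
    """Count occurrences of needle in haystack, skipping the first `ignore` characters.

    multipass=False models CSETATMUPA(.F.): resume after the last character of a match.
    multipass=True  models CSETATMUPA(.T.): resume after the first character of a match.
    """
    if not needle:
        return 0
    count = 0
    pos = max(ignore, 0)          # 0-based index where the search starts
    while True:
        idx = haystack.find(needle, pos)
        if idx < 0:
            return count
        count += 1
        pos = idx + (1 if multipass else len(needle))


print(num_at("ab", "abcdeabc"))                 # 2
print(num_at("ab", "abcdeabc", ignore=1))       # 1
print(num_at("aa", "aaaab", multipass=False))   # 2
print(num_at("aa", "aaaab", multipass=True))    # 3
```

The four calls mirror the documented NUMAT() examples, so the printed counts can be compared directly with the results listed under Examples.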

Note

By using SETATLIKE(), wildcard characters can be used within the search expression.

Examples

       .  Count from the first position:
              ? NUMAT("ab", "abcdeabc")            // Result: 2
       .  Count from the second position.  <nIgnore> specifies that one
          character is to be skipped:
              ? NUMAT("ab", "abcdeabc", 1)         // Result: 1
       .  This example shows the impact of CSETATMUPA() when counting
          the string "aa" within the <cString> string:
              CSETATMUPA(.F.)                      // Off
              ? NUMAT("aa", "aaaab")               // Result: 2
              CSETATMUPA(.T.)                      // On
              ? NUMAT("aa", "aaaab")               // Result: 3
       .  Examples for the use of SETATLIKE() can be found under the
          corresponding function description.

Compliance

NUMAT() is compatible with CT3’s NUMAT().

Platforms

All

Files

Source is numat.c, library is libct.

Seealso

CSETATMUPA(), SETATLIKE()

StrDiff()

Evaluate the “Edit (Levenshtein) Distance” of two strings

Syntax

      StrDiff( <cString1>, <cString2>, [<nReplacementPenalty>], 
               [<nDeletionPenalty>], [<nInsertionPenalty>] ) 
               -> <nDistance>

Arguments

<cString1> string at the “starting point” of the transformation process, default is “”

<cString2> string at the “end point” of the transformation process, default is “”

<nReplacementPenalty> penalty points for a replacement of one character, default is 3

<nDeletionPenalty> penalty points for a deletion of one character, default is 6

<nInsertionPenalty> penalty points for an insertion of one character, default is 1

Returns

<nDistance> penalty point sum of all operations needed to transform <cString1> to <cString2>

Description

The StrDiff() function calculates the so-called “Edit” or “Levenshtein” distance of two strings. This distance is a measure of the number of single-character replace/insert/delete operations (so-called “point mutations”) required to transform <cString1> into <cString2>, and its value is the smallest sum of the penalty points of the required operations.
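As an illustration of how such a penalty-weighted distance can be computed, here is a minimal Python sketch using the textbook dynamic-programming table. It is an assumption that this matches the internal structure of strdiff.c; the name str_diff is hypothetical, the default penalties are taken from the Arguments section, and SETATLIKE() wildcards are not modelled.

```python
def str_diff(s1, s2, replace_penalty=3, delete_penalty=6, insert_penalty=1):
    """Smallest penalty sum needed to transform s1 into s2 (weighted edit distance)."""
    rows, cols = len(s1) + 1, len(s2) + 1
    # dist[i][j] = cheapest way to turn the first i chars of s1 into the first j chars of s2
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * delete_penalty      # delete everything
    for j in range(1, cols):
        dist[0][j] = j * insert_penalty      # insert everything
    for i in range(1, rows):
        for j in range(1, cols):
            same = s1[i - 1] == s2[j - 1]
            dist[i][j] = min(
                dist[i - 1][j - 1] + (0 if same else replace_penalty),  # keep or replace
                dist[i - 1][j] + delete_penalty,                         # delete from s1
                dist[i][j - 1] + insert_penalty,                         # insert into s1
            )
    return dist[-1][-1]


print(str_diff("ABC", "ADC"))    # 3
print(str_diff("CBA", "ABC"))    # 6
print(str_diff("ABC", "AXBC"))   # 1
print(str_diff("AXBC", "ABC"))   # 6
print(str_diff("AXBC", "ADC"))   # 9
```

The table has (len(s1)+1) * (len(s2)+1) entries, which is where the time and memory costs of this kind of algorithm come from.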

Be aware that this function is demanding in both time, O(len(cString1)*len(cString2)), and memory, O((len(cString1)+1)*(len(cString2)+1)*sizeof(int)), so keep the strings as short as possible. For example, on common 32-bit systems (sizeof(int) == 4), calling StrDiff() with two strings of 1024 bytes each consumes about 4 MB of memory. To avoid imposing unneeded restrictions, the function only checks whether (len(cString1)+1)*(len(cString2)+1)*sizeof(int) <= UINT_MAX, although allocating UINT_MAX bytes will not work on most systems. If this simple check fails, -1 is returned.
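The memory figure and the size check can be reproduced with simple arithmetic. The Python sketch below assumes sizeof(int) == 4 and a 32-bit UINT_MAX; both helper names are hypothetical and exist only for this illustration.

```python
UINT_MAX = 2**32 - 1   # assumption: unsigned int is 32 bits wide


def strdiff_table_bytes(len1, len2, int_size=4):
    """Bytes needed for a (len1+1) x (len2+1) table of int_size-byte integers."""
    return (len1 + 1) * (len2 + 1) * int_size


def passes_size_check(len1, len2, int_size=4):
    """Model of the documented check: table size must not exceed UINT_MAX."""
    return strdiff_table_bytes(len1, len2, int_size) <= UINT_MAX


print(strdiff_table_bytes(1024, 1024))    # 4202500, roughly 4 MB
print(passes_size_check(1024, 1024))      # True
print(passes_size_check(40000, 40000))    # False
```

Under these assumptions, two strings of 40000 bytes each already exceed the limit, which is the situation in which the documentation says -1 is returned.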

Also be aware that an overflow can occur when the penalty points are summed up. Assuming that the number of transformation operations is on the order of max(len(cString1), len(cString2)), the penalty point sum, which is stored internally in an “int” variable, is on the order of max(len(cString1), len(cString2)) * max(nReplacementPenalty, nDeletionPenalty, nInsertionPenalty). StrDiff() does not perform an overflow check, for performance reasons. Future versions of StrDiff() could use a type other than “int” to store the penalty point sum, either to save memory or to avoid overflows.

The function honors the settings made by SETATLIKE(); that is, the wildcard character is considered equal to ALL characters.

Examples

      ? StrDiff( "ABC", "ADC" ) // 3, one character replaced
      ? StrDiff( "ABC", "AEC" ) // 3, ditto
      ? StrDiff( "CBA", "ABC" ) // 6, two characters replaced
      ? StrDiff( "ABC", "AXBC" ) // 1, one character inserted
      ? StrDiff( "AXBC", "ABC" ) // 6, one character removed
      ? StrDiff( "AXBC", "ADC" ) // 9, one character removed and one replaced

Tests

      StrDiff( "ABC", "ADC" ) == 3
      StrDiff( "ABC", "AEC" ) == 3
      StrDiff( "CBA", "ABC" ) == 6
      StrDiff( "ABC", "AXBC" ) == 1
      StrDiff( "AXBC", "ABC" ) == 6
      StrDiff( "AXBC", "ADC" ) == 9

Compliance

StrDiff() is compatible with CT3’s StrDiff().

Platforms

All

Files

Source is strdiff.c, library is libct.

Seealso

SETATLIKE()

AtNum()

Returns the start position of the nth occurrence of a substring in a string

Syntax

      AtNum( <cStringToMatch>, <cString>, [<nCounter>],
             [<nIgnore>] ) --> nPosition

Arguments

<cStringToMatch> is the substring to scan for

<cString> is the string to be scanned

[<nCounter>] specifies which occurrence of <cStringToMatch> in <cString> is searched for. Default: the last occurrence

[<nIgnore>] specifies how many characters from the start are ignored in the search. Default: 0

Returns

<nPosition> the position of the <nCounter>th occurrence of <cStringToMatch> in <cString>. If such an occurrence does not exist, 0 is returned.

Description

This function scans <cString> for <cStringToMatch>. After the <nCounter>th match (or the last one, depending on the value of <nCounter>) has been found, the position of that match is returned. If there aren’t enough matches or there is no last match, 0 is returned. After a match has been found, the function continues scanning after that match if the CSETATMUPA() switch is turned off; otherwise it continues with the second character of the matched substring. The function also honors the settings of SETATLIKE().
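The scan described above can be sketched as follows. This is a Python model of the documented behaviour only; at_num, match_positions and the multipass parameter (standing in for CSETATMUPA()) are hypothetical names, counter=None stands for an omitted <nCounter>, and SETATLIKE() wildcards are not modelled.

```python
def match_positions(needle, haystack, ignore=0, multipass=False):
    """Return the 1-based start positions of needle in haystack."""
    positions = []
    if not needle:
        return positions
    pos = max(ignore, 0)
    while True:
        idx = haystack.find(needle, pos)
        if idx < 0:
            return positions
        positions.append(idx + 1)
        # multipass: continue with the second character of the match
        pos = idx + (1 if multipass else len(needle))


def at_num(needle, haystack, counter=None, ignore=0, multipass=False):
    """1-based position of the counter-th match (last match if counter is None), else 0."""
    positions = match_positions(needle, haystack, ignore, multipass)
    if counter is None:
        return positions[-1] if positions else 0
    if 1 <= counter <= len(positions):
        return positions[counter - 1]
    return 0


print(at_num("!", "What is the answer ? 4 ! 5 !"))    # 28
print(at_num("..", "..This..is..a..test!", 2))        # 7
print(at_num("..", "..This..is..a..test!", 2, 2))     # 11
```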

Examples

      ? AtNum( "!", "What is the answer ? 4 ! 5 !" ) // -> 28
      ? AtNum( "!", "What is the answer ? 4 ! 5 ?" ) // -> 24

Tests

      AtNum( "..", "..This..is..a..test!" ) == 14
      AtNum( "..", "..This..is..a..test!", 2 ) == 7
      AtNum( "..", "..This..is..a..test!", 2, 2 ) == 11

Compliance

AtNum() is compatible with CT3’s AtNum().

Platforms

All

Files

Source is atnum.c, library is libct.

Seealso

BeforAtNum(), AfterAtNum(), CSETATMUPA(), SETATLIKE()

AtAdjust()

Adjusts a sequence within a string to a specified position

Syntax

      AtAdjust( <cStringToMatch>, <cString>, <nAdjustPosition>,
                [<nCounter>], [<nIgnore>],
                [<nFillChar|cFillChar>] ) -> cString

Arguments

<cStringToMatch> is the sequence to be adjusted within <cString>

<cString> is the string that contains <cStringToMatch>

<nAdjustPosition> specifies the position to which <cStringToMatch> will be adjusted

[<nCounter>] specifies which occurrence of <cStringToMatch> in <cString> is to be adjusted. Default: last occurrence

[<nIgnore>] specifies how many characters should be omitted in the scan

[<nFillChar|cFillChar>] specifies the character that is used for the adjustment

Returns

<cString> the changed string

Description

The function first looks for <cStringToMatch> within <cString>. From this point, the rest of <cString> is moved (adjusted) by either inserting or removing blanks until <nAdjustPosition> is reached. In lieu of blanks, <nFillChar|cFillChar> can be used as a fill character.

Additionally, you can specify that the nth occurrence of <cStringToMatch> be used, and whether a specific number of characters at the beginning of <cString> is excluded from the search.
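The adjustment step can be illustrated with a small Python model. It is a sketch under several assumptions rather than the Harbour implementation: at_adjust, match_positions and multipass (standing in for CSETATMUPA()) are hypothetical names, the string is returned unchanged when the sequence is not found, a leftward adjustment simply drops the characters directly in front of the match, and SETATLIKE() wildcards are not modelled.

```python
def match_positions(needle, haystack, ignore=0, multipass=False):
    """Return the 1-based start positions of needle in haystack."""
    positions = []
    if not needle:
        return positions
    pos = max(ignore, 0)
    while True:
        idx = haystack.find(needle, pos)
        if idx < 0:
            return positions
        positions.append(idx + 1)
        pos = idx + (1 if multipass else len(needle))


def at_adjust(needle, haystack, adjust_pos, counter=None, ignore=0,
              fill=" ", multipass=False):
    """Move the chosen match so that it starts at the 1-based position adjust_pos."""
    positions = match_positions(needle, haystack, ignore, multipass)
    if counter is None:
        target = positions[-1] if positions else 0
    else:
        target = positions[counter - 1] if 1 <= counter <= len(positions) else 0
    if target == 0:
        return haystack                  # assumption: unchanged when not found
    start = target - 1                   # 0-based index of the match
    shift = adjust_pos - target
    if shift >= 0:
        # Move right: insert fill characters in front of the match.
        return haystack[:start] + fill * shift + haystack[start:]
    # Move left: drop the characters directly in front of the match (assumption).
    cut_from = max(start + shift, 0)
    return haystack[:cut_from] + haystack[start:]


print(at_adjust("AA", "123AAABBB", 7, 2, multipass=True))     # 123A  AABBB
print(at_adjust(".", "WINDOW.DBF", 10).replace(".", ""))      # WINDOW   DBF
```

The first call corresponds to the documented CSETATMUPA(.T.) example; the second models the CHARREM()/AtAdjust() combination used for the file name list.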

Note

Using CSETATMUPA() can influence how the search is performed.

Using SETATLIKE() permits the use of wild cards within the search sequence.

Examples

     .  Align comments at column 60.  The search is for the first
        occurrence of "//".  Since there is usually at least one space before
        each "//", search for " //":

        ? AtAdjust(" //", Line, 60, 1)

     .  Move the extensions for the following list of file names to
        position 10 and eliminate the ".":

        WINDOW.DBF
        PLZ.DBF
        BACK.DBF
        HELP.DBF
        LOG.DBF

        CHARREM(".", AtAdjust(".", File, 10))

        WINDOW   DBF
        PLZ      DBF
        BACK     DBF
        HELP     DBF
        LOG      DBF

Use AtAdjust() with CSETATMUPA(). It is ambiguous whether “AA” occurs once or twice in “AAA”. Depending on CSETATMUPA(), the search continues either after the last character of a located sequence (.F.) or after its first character (.T.):

        CSETATMUPA(.F.)
        ? AtAdjust("AA", "123AAABBB", 7, 2)       // Sequence not found

        CSETATMUPA(.T.)
        ? AtAdjust("AA", "123AAABBB", 7, 2)       // "123A  AABBB"

Compliance

AtAdjust() works like CT3’s AtAdjust().

Platforms

All

Files

Source is atadjust.c, library is ct3.

Seealso

SETATLIKE(), CSETATMUPA()

AfterAtNum()

Returns the string portion after the nth occurrence of a substring

Syntax

      AfterAtNum( <cStringToMatch>, <cString>, [<nCounter>],
                  [<nIgnore>] ) --> cRestString

Arguments

<cStringToMatch> is the substring to scan for

<cString> is the string to be scanned

[<nCounter>] specifies which occurrence of <cStringToMatch> in <cString> is searched for. Default: the last occurrence

[<nIgnore>] specifies how many characters from the start are ignored in the search. Default: 0

Returns

<cRestString> the portion of <cString> after the <nCounter>th occurrence of <cStringToMatch> in <cString>. If such a rest does not exist, an empty string is returned.

Description

This function scans <cString> for <cStringToMatch>. After the <nCounter>th match (or the last one, depending on the value of <nCounter>) has been found, the portion of <cString> after that match is returned. If there aren’t enough matches or the last match ends at the end of <cString>, an empty string is returned. After a match has been found, the function continues scanning after that match if the CSETATMUPA() switch is turned off; otherwise it continues with the second character of the matched substring. The function also honors the settings of SETATLIKE().
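A short Python model may help to see what "the portion after that match" means in practice. As with any such sketch, it only mirrors the documented behaviour: after_at_num, match_positions and multipass (standing in for CSETATMUPA()) are hypothetical names, counter=None stands for an omitted <nCounter>, and SETATLIKE() wildcards are not modelled.

```python
def match_positions(needle, haystack, ignore=0, multipass=False):
    """Return the 1-based start positions of needle in haystack."""
    positions = []
    if not needle:
        return positions
    pos = max(ignore, 0)
    while True:
        idx = haystack.find(needle, pos)
        if idx < 0:
            return positions
        positions.append(idx + 1)
        pos = idx + (1 if multipass else len(needle))


def after_at_num(needle, haystack, counter=None, ignore=0, multipass=False):
    """Return the part of haystack that follows the chosen match, or "" if none."""
    positions = match_positions(needle, haystack, ignore, multipass)
    if counter is None:
        target = positions[-1] if positions else 0
    else:
        target = positions[counter - 1] if 1 <= counter <= len(positions) else 0
    if target == 0:
        return ""
    # target is 1-based, so the rest starts at index target - 1 + len(needle)
    return haystack[target - 1 + len(needle):]


print(repr(after_at_num("!", "What is the answer ? 4 ! 5 ?")))    # ' 5 ?'
print(repr(after_at_num("..", "..This..is..a..test!", 2)))        # 'is..a..test!'
print(repr(after_at_num("..", "..This..is..a..test!", 2, 2)))     # 'a..test!'
```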

Examples

      ? AfterAtNum( "!", "What is the answer ? 4 ! 5 !" ) // -> ""
      ? AfterAtNum( "!", "What is the answer ? 4 ! 5 ?" ) // -> " 5 ?"

Tests

      AfterAtNum( "..", "..This..is..a..test!" ) == "test!"
      AfterAtNum( "..", "..This..is..a..test!", 2 ) == "is..a..test!"
      AfterAtNum( "..", "..This..is..a..test!", 2, 2 ) == "a..test!"

Compliance

AfterAtNum() is compatible with CT3’s AfterAtNum().

Platforms

All

Files

Source is atnum.c, library is libct.

Seealso

ATNUM(), BEFORATNUM(), CSETATMUPA(), SETATLIKE()

Harbour All Functions – S

SaveToken

SayScreen

Seconds
Secs

Select

Set

SetAtLike

SetDate

SetKey

SetMode

SetPrec

SetTime

SetTypeahead

Sign

Sin

SinH

Space

Sqrt

Str

StrDiff

StrFormat

StrSwap
StrTran
StrZero
SubStr

String Functions

AddASCII

AfterAtNum

AllTrim
Asc

ASCIISum

ASCPos
At

AtAdjust

AtNum
AtRepl
AtToken

BeforAtNum

Chr

CharAdd
CharAnd
CharEven
CharHist
CharList
CharMirr
CharMix
CharNoList
CharNot
CharOdd
CharOne
CharOnly
CharOr
CharPix
CharRela
CharRelRep
CharRem
CharRepl
CharRLL
CharRLR
CharSHL
CharSHR
CharSList
CharSort
CharSub
CharSwap
CharWin
CharXOR

CountLeft
CountRight
Descend
Empty
hb_At
hb_RAt
hb_ValToStr
IsAlpha
IsDigit
IsLower
IsUpper

JustLeft
JustRight

Left
Len
Lower
LTrim

NumAt
NumToken
PadLeft
PadRight

PadC
PadL
PadR

POSALPHA
POSCHAR
POSDEL
POSDIFF
POSEQUAL
POSINS
POSLOWER
POSRANGE
POSREPL
POSUPPER

RangeRem
RangeRepl

RAt

RemAll

RemLeft
RemRight
ReplAll

Replicate

ReplLeft

ReplRight

RestToken

Right
RTrim

SaveToken

SetAtLike
Space
Str

StrDiff

StrFormat

StrSwap

StrTran
StrZero
SubStr

TabExpand
TabPack

Token

TokenAt
TokenEnd
TokenExit
TokenInit
TokenLower
TokenNext
TokenNum
TokenSep
TokenUpper

Transform
Trim
Upper
Val

ValPos
WordOne
WordOnly
WordRem
WordRepl
WordSwap

WordToChar