TokenNum()

TokenNum()

Get the total number of tokens in a token environment

Syntax

      TokenNum( [<@cTokenEnvironment>] ) -> nNumberofTokens

Arguments

<@cTokenEnvironment> a token environment

Returns

<nNumberofTokens> number of tokens in the token environment

Description

The TokenNum() function can be used to retrieve the total number of tokens in a token environment. If the parameter <@cTokenEnvironment> is supplied (must be by reference), the information from this token environment is used, otherwise the global token environment is used.

Examples

      tokeninit( "a.b.c.d", ".", 1 )  // initialize global token environment
      ? TokenNum()  // --> 4
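
A sketch of the same query against a local token environment; this assumes the environment is kept in a local variable via the optional 4th tokeninit() parameter (see RESTTOKEN()):

      LOCAL cTE := ""                       // hypothetical local variable
      tokeninit( "a.b.c.d", ".", 1, @cTE )  // store the environment in cTE
      ? TokenNum( @cTE )                    // --> 4, read from the local environment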

Compliance

TokenNum() is a new function in Harbour’s CT3 library.

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENEXIT(), TOKENNEXT(), TOKENAT(), SAVETOKEN(), RESTTOKEN(), TOKENEND()

TokenNext()

TokenNext()

Successively obtains tokens from a string

Syntax

      TokenNext( <[@]cString>, [<nToken>],
                 [<@cTokenEnvironment>] ) -> cToken

Arguments

<[@]cString> the processed string

<nToken> a token number

<@cTokenEnvironment> a token environment

Returns

<cToken> a token from <cString>

Description

With TokenNext(), the tokens determined with the TOKENINIT() function can be retrieved. To do this, TokenNext() uses the information stored in either the global token environment or the local one supplied by <cTokenEnvironment>. Note that, if supplied, this 3rd parameter must always be passed by reference.

If the 2nd parameter, <nToken>, is given, TokenNext() simply returns the <nToken>th token without manipulating the TE counter. Otherwise, the token pointed to by the TE counter is returned and the counter is incremented by one. This way, a simple loop with TOKENEND() can be used to retrieve all tokens of a string successively.

Note that <cString> does not have to be the same string used in TOKENINIT(), so one can do a “correlational tokenization”, i.e. tokenize a string as if it were another. E.g. using TOKENINIT() with the string “AA, BBB” but calling TokenNext() with “CCCEE” would give first “CC” and then “EE” (because “CCCEE” is not long enough).
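
A minimal sketch of such a correlational tokenization; the returned tokens are cut from the second string at the positions determined from the first one (variable names are only illustrative):

      LOCAL cMask := "AA, BBB"
      tokeninit( cMask )          // token positions are taken from cMask
      DO WHILE ! tokenend()
         ? TokenNext( "CCCEE" )   // but the characters come from "CCCEE"
      ENDDO
      tokenexit()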

Examples

      // default behaviour
      tokeninit( cString ) // initialize a token environment
      DO WHILE ! tokenend()
         ? TokenNext( cString )  // get all tokens successively
      ENDDO
      ? TokenNext( cString, 3 )  // get the 3rd token, counter will remain 
                                 // the same
      tokenexit()                // free the memory used for the global 
                                 // token environment

Compliance

TokenNext() is compatible with CT3’s TokenNext(), but there are two additional parameters featuring local token environments and optional access to tokens.

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENEXIT(), TOKENNUM(), TOKENAT(), SAVETOKEN(), RESTTOKEN(), TOKENEND()

TokenExit()

TokenExit()

Release global token environment

Syntax

      TokenExit() -> lStaticEnvironmentReleased

Returns

<lStaticEnvironmentReleased> .T., if global token environment is successfully released

Description

The TokenExit() function releases the memory associated with the global token environment. It should be called once for every tokeninit() that uses the global token environment. Additionally, TokenExit() is implicitly called from CTEXIT() to free the memory at library shutdown.

Examples

      tokeninit( cString ) // initialize a token environment
      DO WHILE ! tokenend()
         ? tokennext( cString )  // get all tokens successively
      ENDDO
      ? tokennext( cString, 3 )  // get the 3rd token, counter 
                                 // will remain the same
      TokenExit()                // free the memory used for the 
                                 // global token environment

Compliance

TokenExit() is a new function in Harbour’s CT3 library.

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENNEXT(), TOKENNUM(), TOKENAT(), SAVETOKEN(), RESTTOKEN(), TOKENEND()

TokenEnd()

TokenEnd()

Check whether additional tokens are available with TOKENNEXT()

Syntax

      TokenEnd( [<@cTokenEnvironment>] ) -> lTokenEnd

Arguments

<@cTokenEnvironment> a token environment

Returns

<lTokenEnd> .T., if no additional tokens are available

Description

The TokenEnd() function can be used to check whether the next call to TOKENNEXT() would return a new token. This cannot be decided with TOKENNEXT() alone, since an empty token cannot be distinguished from “no more tokens”.

If the parameter <@cTokenEnvironment> is supplied (must be by reference), the information from this token environment is used, otherwise the global TE is used.

With a combination of TokenEnd() and TOKENNEXT(), all tokens from a string can be retrieved successively (see example).

Examples

      tokeninit( "a.b.c.d", ".", 1 )  // initialize global token environment
      DO WHILE ! TokenEnd()
         ? tokennext( "a.b.c.d" )     // get all tokens successivly
      ENDDO
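
The same loop with a local token environment might look like this (a sketch; it assumes the environment is held in a local variable via the 4th tokeninit() parameter):

      LOCAL cTE := ""
      tokeninit( "a.b.c.d", ".", 1, @cTE )  // local token environment
      DO WHILE ! TokenEnd( @cTE )
         ? tokennext( "a.b.c.d",, @cTE )    // tokens taken from the local environment
      ENDDO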

Compliance

TokenEnd() is compatible with CT3’s TokenEnd(), but there is an additional parameter featuring local token environments.

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENEXIT(), TOKENNEXT(), TOKENNUM(), TOKENAT(), SAVETOKEN(), RESTTOKEN()

TokenAt()

TokenAt()

Get start and end positions of tokens in a token environment

Syntax

      TOKENAT( [<lSeparatorPositionBehindToken>], [<nToken>],
               [<@cTokenEnvironment>] ) -> nPosition

Arguments

<lSeparatorPositionBehindToken> .T., if TOKENAT() should return the position of the separator character BEHIND the token. Default: .F., return start position of a token.

<nToken> a token number

<@cTokenEnvironment> a token environment

Returns

<nPosition> See description

Description

The TOKENAT() function is used to retrieve the start and end positions of the tokens in a token environment. Note, however, that the position of the last character of a token is given by TOKENAT( .T. ) - 1.

If the 2nd parameter, <nToken>, is given, TOKENAT() returns the positions of the <nToken>th token. Otherwise, the token pointed to by the TE counter is used, i.e. the token that will be retrieved by the next call to TOKENNEXT().

If the parameter <@cTokenEnvironment> is supplied (must be by reference), the information from this token environment is used, otherwise the global TE is used.

Tests

      tokeninit( cString ) // initialize a token environment
      DO WHILE ! tokenend()
         ? "From", tokenat(), "to", tokenat( .T. ) - 1
         ? tokennext( cString )  // get all tokens successively
      ENDDO
      ? tokennext( cString, 3 )  // get the 3rd token, counter will remain
                                 // the same
      tokenexit()                // free the memory used for the global
                                 // token environment
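
Direct access to the positions of a specific token via the optional <nToken> parameter could be sketched as follows (it reuses the sample string from the TOKENNUM() example; the values shown follow from that setup):

      tokeninit( "a.b.c.d", ".", 1 )   // global token environment
      ? TOKENAT( .F., 2 )              // --> 3, start of the 2nd token ("b")
      ? TOKENAT( .T., 2 ) - 1          // --> 3, position of its last character
      tokenexit()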

Compliance

TOKENAT() is compatible with CT3’s TOKENAT(), but there are two additional parameters featuring local token environments and optional access to tokens.

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENEXIT(), TOKENNEXT(), TOKENNUM(), SAVETOKEN(), RESTTOKEN(), TOKENEND()

SaveToken()

SaveToken()

Save the global token environment

Syntax

      SaveToken() -> cStaticTokenEnvironment

Returns

<cStaticTokenEnvironment> a binary string encoding the global token environment

Description

The SaveToken() function can be used to store the global token environment for future use or when two or more incremental tokenizers must be nested. Note, however, that the latter can now also be solved with locally stored token environments.
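
A minimal sketch of nesting two incremental tokenizers by saving and restoring the global token environment (the sample strings are purely illustrative):

      LOCAL cSaved
      tokeninit( "a.b", "." )      // outer tokenizer (global environment)
      ? tokennext( "a.b" )         // --> "a"
      cSaved := SaveToken()        // save the outer environment
      tokeninit( "1,2", "," )      // inner tokenizer reuses the global environment
      ? tokennext( "1,2" )         // --> "1"
      RestToken( cSaved )          // back to the outer environment
      ? tokennext( "a.b" )         // --> "b"
      tokenexit()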

Compliance

SaveToken() is compatible with CT3’s SaveToken().

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENEXIT(), TOKENNEXT(), TOKENNUM(), TOKENAT(), RESTTOKEN(), TOKENEND()

RestToken()

RestToken()

Restore global token environment

Syntax

      RestToken( <cStaticTokenEnvironment> ) -> cOldStaticEnvironment

Arguments

<cStaticTokenEnvironment> a binary string encoding a token environment

Returns

<cOldStaticEnvironment> a string encoding the old global token environment

Description

The RESTTOKEN() function restores the global token environment to the one encoded in <cStaticTokenEnvironment>. This can either be the return value of SAVETOKEN() or the value stored in the 4th parameter in a TOKENINIT() call.
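
A sketch of the second variant: the environment created by tokeninit() is kept in a local variable (4th parameter) and later reinstated as the global environment (parameter usage as described above; the sample string is only illustrative):

      LOCAL cTE := ""
      tokeninit( "x;y", ";",, @cTE )   // environment is stored in cTE
      RestToken( cTE )                 // make cTE the global token environment
      ? tokennext( "x;y" )             // --> "x"
      ? tokennext( "x;y" )             // --> "y"
      tokenexit()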

Compliance

RestToken() is compatible with CT3’s RestToken().

Platforms

All

Files

Source is token2.c, library is libct.

Seealso

TOKENINIT(), TOKENEXIT(), TOKENNEXT(), TOKENNUM(), TOKENAT(), SAVETOKEN(), TOKENEND()

CT_TOKENINIT

 TOKENINIT()
 Initializes a string for TOKENNEXT()
------------------------------------------------------------------------------
 Syntax

     TOKENINIT([<cString>,[<cDelimiter>],
        [<nSkipDistance>]]) --> lStatus

 Arguments

     <cString>  [@]  Designates a character string for which tokenizing
     is initialized.  This must be passed by reference!

     <cDelimiter>  Designates a list (character string) of individual
     delimiters for tokenizing.

     <nSkipDistance>  Designates the number of delimiting
     characters/sequences after which a value, or a null string (if
     necessary), is returned.  The default value indicates that these
     characters/sequences are not counted.

     ()  When called with no parameters, TOKENNEXT() is set to begin at the
     start of the string.

 Returns

     The function returns .F. when the string variable cannot be initialized.
     For example, the function returns .F. if the string variable was not
     passed by reference.

 Description

     When used in conjunction with the TOKENNEXT() function, an extremely
     versatile incremental tokenizer is available to you.  Specific
     separation processes can be implemented much more quickly than with the
     group of functions around TOKEN().  The speed increase is achieved in
     two ways.

     TOKENINIT() exchanges all delimiting characters for the first one in the
     delimiter list.  This means the entire delimiter list does not have to
     be searched every time.  The second advantage is that TOKENNEXT() does
     not always begin its search for the token that is extracted at the
     beginning of the string (see the function description for TOKENNEXT()).
     However, in contrast to TOKEN(), TOKENNEXT() is unable to extract a
     specific token.

     You can also use the third parameter, a skip distance for the delimiter
     characters.  This allows recognition of empty lines within a text.  In
     this case, an empty line is indicated by a CrLfCrLf sequence.
     Since TOKENINIT() takes all designated delimiting characters and
     exchanges them for something uniform, this sequence is changed to
     CrCrCrCr.  A skip distance of 2 means that every two delimiters (in
     this case, Cr) return a token.  Since nothing lies between the two
     middle delimiters, TOKENNEXT() returns a null string.

     The function uses the following list of delimiters as standard:

     CHR 32, 0, 9, 10, 12, 26, 32, 138, 141

     and the characters ,.;:!?/\<>()^#&%+-*

     This list can be replaced by your own delimiter list, <cDelimiter>.
     Here are some examples of meaningful delimiting characters:

     Table 4-5: Recommended Delimiting Sequences
     ------------------------------------------------------------------------
     Description         <cDelimiter>
     ------------------------------------------------------------------------
     Pages               CHR(12) (Form Feed)
     Sentence            ".!?"
     File Name           ":\."
     Numeric strings     ",."
     Date strings        "/."
     Time strings        ":."
     ------------------------------------------------------------------------

 Notes

     .  When using a skip value, you must use the TOKENEND() function
        as a loop condition.

     .  When TOKENINIT() exchanges all delimiting characters for a new
        one, the first delimiter on the list is always used.  This ensures
        that no character contained in the token becomes a delimiter.

     .  When you use the TOKENINIT() or TOKENNEXT(), you cannot use
        the TOKENSEP() function.  The required information can be determined
        using TOKENAT() in conjunction with the original string (status
        before TOKENINIT()).

            To determine the position of the delimiter before the last
            token, use TOKENAT() - 1.  To determine the position of the
            delimiter after the last token, use TOKENAT(.T.) (see the
            sketch below).
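
            A hedged sketch of this technique (the token-number argument to
            TOKENAT() is the Harbour extension documented earlier; the sample
            string is only illustrative):

            cOrig   := "A.B-C"
            cString := cOrig
            TOKENINIT(@cString, ".-")            // cString becomes "A.B.C"
            ? SUBSTR(cOrig, TOKENAT(.T., 2), 1)  // delimiter behind the 2nd token: "-"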

 Examples

     .  Break a string into words.  First the string must be
        initialized:

        cDelim   :=  "!?.,-"
        cString   :=  "A.B-C,D!E??"

        TOKENINIT(@cString, cDelim)         // "A!B!C!D!E!!"

        Do While .NOT. TOKENEND()
           cWord  :=  TOKENNEXT(cString)
           ? cWord
        ENDDO

     .  Break text into lines.  Take blank lines into account using a
        skip distance of 2:

        nCounter  :=  0

         TOKENINIT(@cTextString, CHR(13) + CHR(10), 2)

        DO WHILE .NOT. TOKENEND()
           nLine  :=  TOKENNEXT(cTextString)
           ++ nCounter
        ENDDO

        ? nCounter

See Also: SAVETOKEN() RESTTOKEN() TOKENNEXT() TOKENEND()