Ficl Parse Steps

Overview

Ficl 2.05 and later include an extensible parser chain. Ficl feeds every incoming token (a chunk of text with no internal whitespace) to each step in the parse chain in turn. The first parse step that successfully matches the token applies semantics to it and returns a TRUE flag, ending the sequence. If all parse steps fire without a match, ficl prints an error message and resets the virtual machine. Parse steps can be written in precompiled code or in ficl itself, and can be appended to the chain at run-time if you like.

More detail:

  • If compiling and local variable support is enabled, attempt to find the token in the local variable dictionary. If found, execute the token's compilation semantics and return
  • Attempt to find the token in the system dictionary. If found, execute the token's semantics (may be different when compiling than when interpreting) and return
  • If prefix support is enabled (compile-time constant FICL_WANT_PREFIX in sysdep.h is non-zero), attempt to match the beginning of the token to the list of known prefixes. If there's a match, execute the associated prefix method and return
  • Attempt to convert the token to a number in the present BASE. If successful, push the value onto the stack if interpreting, compile it if compiling. Return
  • All previous parse steps failed to recognize the token. Print the token followed by " not found" and abort
You can add steps to the parse chain, and you can add prefixes.
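
The first-match-wins walk described above can be sketched in standalone C. This is a simplified model and not ficl's own code: the names demo_step, demo_lookup, demo_number and demo_parse are hypothetical, the "dictionary" knows only two words, and the number step always uses base 10.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parse step: returns 1 if it handled the token. */
typedef int (*demo_step)(const char *token);

static long demo_last_number;   /* stands in for the data stack */

/* Stand-in for the dictionary lookup step: knows two words only. */
static int demo_lookup(const char *token)
{
    return strcmp(token, "dup") == 0 || strcmp(token, "swap") == 0;
}

/* Stand-in for the number step: accepts a whole-token decimal integer. */
static int demo_number(const char *token)
{
    char *end;
    long value = strtol(token, &end, 10);
    if (end == token || *end != '\0')
        return 0;               /* empty or trailing junk: not a number */
    demo_last_number = value;
    return 1;
}

static const demo_step demo_chain[] = { demo_lookup, demo_number };

/* Offer the token to each step in order; the first match ends the walk.
   Returns the index of the matching step, or -1 if no step matched. */
static int demo_parse(const char *token)
{
    size_t count = sizeof demo_chain / sizeof demo_chain[0];
    for (size_t i = 0; i < count; i++) {
        if (demo_chain[i](token))
            return (int)i;
    }
    printf("%s not found\n", token);
    return -1;
}
```

In this model, demo_parse("dup") stops at the lookup step, demo_parse("42") falls through to the number step, and an unknown token reaches the end of the chain and is reported as not found.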

Adding Parse Steps

You can add a parse step in two ways. The first is to write a ficl word that has the correct stack signature for a parse step:
my-parse-step   ( c-addr u -- ??? flag )
Where c-addr u are the address and length of the incoming token, and flag is true if the parse step recognizes the token and false otherwise.
Install the parse step using add-parse-step. A trivial example:
: ?silly   ( c-addr u -- flag )
   ." Oh no! Not another  " type cr  true ;
' ?silly add-parse-step
parse-order

The other way to add a parse step is by writing it in C, and inserting it into the parse chain with:

void ficlAddPrecompiledParseStep(FICL_SYSTEM *pSys, char *name, FICL_PARSE_STEP pStep);
Where name is the display name of the parse step in the parse chain (as revealed by parse-order). Parameter pStep is a pointer to the code for the parse step itself, and must match the following declaration:
typedef int (*FICL_PARSE_STEP)(FICL_VM *pVM, STRINGINFO si);

Upon entry to the parse step, si describes the incoming token. The parse step must return FICL_TRUE if it succeeds in matching the token, and FICL_FALSE otherwise. If it succeeds in matching a token, the parse step applies semantics to it before returning. See ficlParseNumber() in words.c for an example.
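
As a rough illustration of that contract, here is a self-contained sketch of a parse step that recognizes character literals of the form 'A'. It does not include ficl.h: DEMO_VM, DEMO_STRINGINFO, DEMO_TRUE and DEMO_FALSE are hypothetical stand-ins for the real types and constants, and the function name is invented for this example. It only models the interpreting case; a real step would also need to compile the value when the VM is compiling.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for illustration only; the real types come from ficl.h. */
typedef struct { size_t count; const char *cp; } DEMO_STRINGINFO;
typedef struct { long stack[8]; int depth; } DEMO_VM;

#define DEMO_TRUE  1
#define DEMO_FALSE 0

/* Recognize a token of the form 'c' and push the character code. */
static int demoParseCharLiteral(DEMO_VM *pVM, DEMO_STRINGINFO si)
{
    if (si.count != 3 || si.cp[0] != '\'' || si.cp[2] != '\'')
        return DEMO_FALSE;          /* not ours: let the next step try */

    assert(pVM->depth < 8);         /* demo stack has a fixed size */
    pVM->stack[pVM->depth++] = (unsigned char)si.cp[1];
    return DEMO_TRUE;               /* matched: semantics already applied */
}
```

The shape to note is that the step leaves the VM untouched when it declines a token, and has already applied its semantics by the time it reports a match.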

Adding Prefixes

What's a prefix, anyway? A prefix (contributed by Larry Hastings) is a token that's recognized as the beginning of another token. Its presence modifies the semantics of the rest of the token. An example is 0x, which causes digits following it to be converted to hex regardless of the current value of BASE.

Caveat: Prefixes are matched in sequence, so the more of them there are, the slower the interpreter gets. On the other hand, because the prefix parse step occurs immediately after the dictionary lookup step, if you have a prefix for a particular purpose, using it may save time since it stops the parse process.

Each prefix is a ficl word stored in a special wordlist called <prefixes>. When the prefix parse step (?prefix AKA ficlParsePrefix()) fires, it searches each word in <prefixes> in turn, comparing it with the initial characters of the incoming token. If a prefix matches, the parse step returns the remainder of the token to the input stream and executes the code associated with the prefix. This code can be anything you like, but it would typically do something with the remainder of the token. If the prefix code does not consume the rest of the token, it will go through the parse process again (which may be what you want).
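
The matching loop described above might look roughly like this in standalone C. This is a sketch under assumptions, not ficlParsePrefix() itself: the prefix list is a fixed array instead of a wordlist, the comparison is case-sensitive, and demo_match_prefix is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix list standing in for the <prefixes> wordlist. */
static const char *const demo_prefixes[] = { "0x", "0b", ".(" };

/* Try each prefix in turn against the start of the token.
   On a match, return a pointer to the remainder of the token and store
   the prefix index in *which; return NULL if no prefix matches. */
static const char *demo_match_prefix(const char *token, size_t *which)
{
    size_t count = sizeof demo_prefixes / sizeof demo_prefixes[0];
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(demo_prefixes[i]);
        if (strncmp(token, demo_prefixes[i], len) == 0) {
            *which = i;
            return token + len;     /* remainder goes back to the input */
        }
    }
    return NULL;
}
```

For the token 0xFF this returns the remainder FF with the index of the 0x prefix, which is the point at which the prefix's own code would take over.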

Prefixes are defined in prefix.c and in softwords/prefix.fr. The easiest way to add a new prefix is to insert it into prefix.fr and rebuild the system. You can also add prefixes interactively by bracketing prefix definitions as follows (see prefix.fr):

start-prefixes  ( defined in prefix.fr )
\ make dot-paren a prefix (create an alias for it in the prefixes list)
: .(  .( ;
: 0b  2 __tempbase ; immediate
end-prefixes

The precompiled word __tempbase is a helper for prefixes that specify a temporary value of BASE.
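
One plausible way to picture a temporary BASE is: save the current base, convert the rest of the token in the requested base, then restore the saved value. The standalone C sketch below models that idea only; demo_base, demo_to_number and demo_tempbase are hypothetical names, and the conversion uses strtol in place of ficl's number parser.

```c
#include <assert.h>
#include <stdlib.h>

static int demo_base = 10;      /* stands in for BASE */

/* Convert the whole of text using the current demo_base.
   Returns 1 and stores the value on success, 0 otherwise. */
static int demo_to_number(const char *text, long *value)
{
    char *end;
    long result = strtol(text, &end, demo_base);
    if (end == text || *end != '\0')
        return 0;
    *value = result;
    return 1;
}

/* Convert text in temp_base, restoring demo_base afterwards
   whether or not the conversion succeeds. */
static int demo_tempbase(const char *text, int temp_base, long *value)
{
    int saved = demo_base;
    demo_base = temp_base;
    int ok = demo_to_number(text, value);
    demo_base = saved;
    return ok;
}
```

With this model, the remainder 1010 of a token such as 0b1010 converts to ten in base 2 while demo_base stays at 10 for everything that follows.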

Constant FICL_EXTENDED_PREFIX controls the inclusion of a set of additional prefix definitions. It is turned off in the default build because several of these prefixes alter standard behavior, but you might like them.

Notes

Prefixes and parser extensions are non-standard, although with the exception of prefix support, ficl's default parse order follows the standard. Inserting parse steps in some other order will almost certainly break standard behavior.

The number of parse steps that can be added to the system is limited by the value of FICL_MAX_PARSE_STEPS (defined in sysdep.h unless you define it first), which defaults to 8. More parse steps mean slower average interpret and compile performance, so be sparing. The same applies to the number of prefixes defined for the system, since each one has to be tried in turn before it can be proven that no prefix matches. On the other hand, if prefixes are defined, use them when possible: since they are matched early in the parse order, a prefix match short-circuits the parse process, saving time relative to (for example) using a number builder parse step at the end of the parse chain.
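
A fixed-capacity chain of this kind can be modeled as an array plus a counter. The sketch below is standalone C with hypothetical names (DEMO_MAX_PARSE_STEPS, demo_add_step, demo_never); it shows only the general idea of a bounded chain and makes no claim about how ficl itself reports a full chain.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PARSE_STEPS 8   /* mirrors the documented default of 8 */

typedef int (*demo_step_fn)(const char *token);

static demo_step_fn demo_steps[DEMO_MAX_PARSE_STEPS];
static size_t demo_step_count;

/* Append a step; returns 1 on success, 0 if the chain is already full. */
static int demo_add_step(demo_step_fn step)
{
    if (demo_step_count >= DEMO_MAX_PARSE_STEPS)
        return 0;
    demo_steps[demo_step_count++] = step;
    return 1;
}

/* A step that never matches, used only to fill the chain. */
static int demo_never(const char *token)
{
    (void)token;
    return 0;
}
```

Every token that no step recognizes has to visit each occupied slot before the failure is known, which is why a longer chain costs time on every miss.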

Compile time constant FICL_EXTENDED_PREFIX enables several more prefix definitions in prefix.c and prefix.fr. Please note that this will slow average compile and interpret speed in most cases.

Parser Glossary

parse-order ( -- )
Prints the list of parse steps in the order in which they are evaluated. Each step is the name of a ficl word with the following signature:
parse-step   ( c-addr u -- ??? flag )
A parse step consumes a string (c-addr u, the incoming token) from the stack, and exits leaving a flag on top of the stack (it may also leave other parameters as side effects). The flag is true if the parse step succeeded at recognizing the token, false otherwise.
add-parse-step ( xt -- )
Appends a parse step to the parse chain. XT is the address (execution token) of a ficl word to use as the parse step. The word must have the following signature:
parse-step   ( c-addr u -- ??? flag )
A parse step consumes a string (c-addr u, the incoming token) from the stack, and exits leaving a flag on top of the stack (it may also leave other parameters as side effects). The flag is true if the parse step succeeded at recognizing the token, false otherwise.
show-prefixes ( -- )
Defined in softwords/prefix.fr. Prints the list of all prefixes. Each prefix is a ficl word that is executed if its name is found at the beginning of a token. See softwords/prefix.fr and prefix.c for examples.
start-prefixes ( -- )
Defined in softwords/prefix.fr. Declares the beginning of one or more prefix definitions (it just switches the compile wordlist to <prefixes>).
end-prefixes ( -- )
Defined in softwords/prefix.fr. Restores the compilation wordlist that was in effect before the last invocation of start-prefixes. Note: the prior wordlist ID is stored in a Ficl variable, so attempts to nest start-prefixes end-prefixes blocks will result in mildly silly side effects.
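
To see why nesting misbehaves when only one variable holds the prior wordlist, consider this standalone C model. It is an illustration built from the description above, with hypothetical names throughout, and is not a transcription of prefix.fr.

```c
#include <assert.h>

enum demo_wordlist { DEMO_FORTH, DEMO_PREFIXES };

static enum demo_wordlist demo_current = DEMO_FORTH;  /* compile wordlist */
static enum demo_wordlist demo_saved   = DEMO_FORTH;  /* the single save slot */

/* Remember the current compile wordlist, then switch to the prefix list. */
static void demo_start_prefixes(void)
{
    demo_saved = demo_current;      /* overwrites any earlier saved value */
    demo_current = DEMO_PREFIXES;
}

/* Restore whatever the most recent demo_start_prefixes() saved. */
static void demo_end_prefixes(void)
{
    demo_current = demo_saved;
}
```

In this model, calling demo_start_prefixes() twice and then demo_end_prefixes() twice leaves demo_current at DEMO_PREFIXES, because the inner call overwrote the only record of the original wordlist.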