Strus query analyzer configuration

Language grammar

The grammar for query analysis differs slightly from the grammar for document analysis.

EBNF

IDENTIFIER     : [A-Za-z][A-Za-z0-9_]*
STRING         : <single or double quoted string with backslash escaping>
PRIORITY       : <integer specifying the priority assigned to a feature>
PRGFILENAME    : <Name of a program to load (e.g. a program with patterns to match)>
MODULEID       : <Identifier or string identifying a module to use>
config         = configsection config
               |
               ;
configsection  = "[" "Priority" "]" prioritydeflist
               | "[" "Element" "]" featdeflist
               | "[" "PatternLexem" "]" lexemdeflist
               | "[" "PatternMatch" MODULEID "]" prgdeflist
               ;
prioritydeflist= prioritydef prioritydeflist
               |
               ;
prioritydef    = type "=" PRIORITY ";"
               ;
featdeflist    = featdef featdeflist
               |
               ;
featdef        = type "=" normalizer tokenizer fieldname ";"
               ;
type           = IDENTIFIER ;
prgdef         = type "=" PRGFILENAME ";"
               ;
prgdeflist     = prgdef prgdeflist
               |
               ;
normalizer     = functioncall ":" normalizer
               | functioncall
               ;
tokenizer      = functioncall
               ;
functioncall   = functionname "(" argumentlist ")"
               | functionname
               ;
functionname   = IDENTIFIER ;
argumentlist   = argument "," argumentlist
               | argument
               |
               ;
argument       = IDENTIFIER
               | STRING
               ;
fieldname      = IDENTIFIER
               ;
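
The featdef rule can be illustrated with a small parser. The following is a minimal Python sketch, not the Strus implementation: the function names parse_call and parse_featdef are hypothetical, fieldname is assumed to be a plain identifier, an argument list is accepted without a trailing comma (as in the example at the end of this page), and quoted strings containing commas, parentheses or backslash escapes are not handled.

```python
import re

IDENT = r'[A-Za-z][A-Za-z0-9_]*'
# One functioncall: a name with an optional parenthesized argument list.
CALL_RE = re.compile(r'\s*(' + IDENT + r')\s*(?:\(([^()]*)\))?')
COLON_RE = re.compile(r'\s*:')


def parse_call(text, pos=0):
    """Parse a functioncall starting at text[pos]; return ((name, args), new_pos)."""
    m = CALL_RE.match(text, pos)
    if not m:
        raise ValueError(f"function call expected at offset {pos}")
    raw_args = m.group(2)
    args = []
    if raw_args and raw_args.strip():
        # Simplification: split on commas and drop surrounding quotes.
        args = [a.strip().strip('"\'') for a in raw_args.split(",")]
    return (m.group(1), args), m.end()


def parse_featdef(line):
    """Parse 'type = normalizer tokenizer fieldname ;' into a dict."""
    head, sep, rest = line.partition("=")
    if not sep:
        raise ValueError("'=' expected")
    feat_type = head.strip()
    if not re.fullmatch(IDENT, feat_type):
        raise ValueError("type must be an identifier")
    rest = rest.strip()
    if not rest.endswith(";"):
        raise ValueError("';' expected at end of declaration")
    rest = rest[:-1]

    # normalizer = functioncall { ":" functioncall }
    call, pos = parse_call(rest)
    normalizer = [call]
    while True:
        m = COLON_RE.match(rest, pos)
        if not m:
            break
        call, pos = parse_call(rest, m.end())
        normalizer.append(call)

    # tokenizer = functioncall, followed by the field name
    tokenizer, pos = parse_call(rest, pos)
    fieldname = rest[pos:].strip()
    if not re.fullmatch(IDENT, fieldname):
        raise ValueError("field name expected")
    return {
        "type": feat_type,
        "normalizer": normalizer,
        "tokenizer": tokenizer,
        "fieldname": fieldname,
    }


if __name__ == "__main__":
    print(parse_featdef("word = lc:convdia(en):stem(en) word word;"))
```

The normalizer chain is returned in the order it is written, which makes the ":" separated sequence of function calls explicit.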

Meaning of the sections

Priority

Definitions of query term priorities. A term with a higher priority ousts every term with a lower priority that it covers completely.
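
One way to read the priority rule is sketched below in Python. This is an illustration under assumptions, not the Strus implementation: the Term class, its fields pos and length, and the function eliminate_covered are hypothetical names, and a term is taken to "cover" another when its position range fully contains the other's range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    type: str      # feature type as declared in the configuration
    pos: int       # start position of the term in the query (assumed)
    length: int    # number of positions the term spans (assumed)
    priority: int  # value assigned in the [Priority] section


def covers(a, b):
    """True if the position range of a fully contains the range of b."""
    return a.pos <= b.pos and b.pos + b.length <= a.pos + a.length


def eliminate_covered(terms):
    """Drop every term completely covered by a term of higher priority."""
    return [
        t for t in terms
        if not any(o.priority > t.priority and covers(o, t) for o in terms)
    ]


if __name__ == "__main__":
    terms = [
        Term("name", 0, 2, 2),  # spans positions 0 and 1
        Term("word", 0, 1, 1),  # covered by "name": removed
        Term("word", 1, 1, 1),  # covered by "name": removed
        Term("word", 2, 1, 1),  # outside "name": kept
    ]
    for t in eliminate_covered(terms):
        print(t)
```

A lower-priority term that only partially overlaps a higher-priority term is kept in this sketch, since the rule requires complete coverage.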

Element

The declarations in this section are query term definitions.

PatternLexem

The declarations in this section define lexems that are not inserted into the index. They are only used to feed post-processing pattern matchers with lexems.

PatternMatch

The declarations in this section define pattern matching programs. The pattern matcher module is selected with the MODULEID argument of the section header. The core does not provide a pattern matcher. The standard pattern matcher, named "std", is implemented in the module "analyzer_pattern" of the project strusPattern.

Example

[Element]
	word = lc:convdia(en):stem(en) word word;
[PatternLexem]
	token = orig word word;
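
To show how the section headers structure a configuration, the following minimal Python sketch splits the example above into its sections. It is an illustration only: the function split_sections is a hypothetical name, and the sketch assumes one declaration per line, which the grammar itself does not require.

```python
import re

# "[" name [ moduleid ] "]", e.g. "[Element]" or a PatternMatch header
# carrying a module identifier.
SECTION_RE = re.compile(r'^\[\s*([A-Za-z]+)(?:\s+([^\]\s]+))?\s*\]$')


def split_sections(text):
    """Return a list of sections with their name, module id and declarations."""
    sections = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = SECTION_RE.match(line)
        if m:
            current = {
                "name": m.group(1),
                "moduleid": m.group(2),  # None unless the header has an argument
                "declarations": [],
            }
            sections.append(current)
        elif current is None:
            raise ValueError("declaration outside of a section: " + line)
        else:
            current["declarations"].append(line)
    return sections


EXAMPLE = """\
[Element]
    word = lc:convdia(en):stem(en) word word;
[PatternLexem]
    token = orig word word;
"""

if __name__ == "__main__":
    for section in split_sections(EXAMPLE):
        print(section["name"], section["declarations"])
```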