An operator represents logic to be applied to a search element. This logic defines the qualifications a document must meet to be retrieved. Operator types are as follows:
- Wildcards
- Evidence operators
- Proximity operators
- Relational operators
- Concept operators
- Score operators
Ordinarily, you use operators in EXPLICIT searches. See CFX_SEARCH for more information about the EXPLICIT search type. Operators are used in the following manner:
"<operator>search_string"
The following table shows all operators available for conducting searches of Cold Fusion Verity collections.
| Search Operators | | |
|---|---|---|
| < | CONTAINS | PHRASE |
| <= | ENDS | SENTENCE |
| = | MATCHES | STARTS |
| > | NEAR | STEM |
| >= | NEAR/N | SUBSTRING |
| ACCRUE | OR | WILDCARD |
| AND | PARAGRAPH | WORD |
Query expressions are passed to the search engine in the CRITERIA attribute of the CFX_SEARCH tag. Expressions are assembled from a combination of search words, operators, and modifiers.
A number of characters are handled in particular ways by the search engine.
| Characters | Description |
|---|---|
| , ( ) [ | These characters end a text token. |
| = > < ! | These characters also end a text token. They are terminated by an associated end character. |
| ' @ ` < { [ ! | These characters signify the start of a delimited token. They are terminated by an associated end character. |
A backslash (\) removes special meaning from whatever character follows it. To enter a literal backslash in a query, use two in succession. Examples:
<FREETEXT>("\"Hello\", said Packard.")
"backslash (\\)"
The following rules apply for composing search expressions.
Precedence rules
While an expression is read from left to right, some operators carry more weight than others. AND operators, for example, take precedence over OR operators. To ensure that an OR operator is interpreted before an AND operator, enclose the OR expression in parentheses:
(a OR b) AND c
Parentheses indicate the order in which the directions are to be carried out. Information enclosed in parentheses is read first.
There must be at least one space between operators and words used in the expression.
When the search engine encounters nested parentheses, it starts with the innermost level:
(a AND (b OR c)) OR d
This expression means: Look for documents that contain b or c as well as a, or that contain d.
Search strings that use any operator except evidence operators can be defined in prefix notation or infix notation.
Prefix notation specifies that the operator comes before the search string:
AND (a,b)
When prefix notation is used, precedence is handled explicitly within the expression. The following example means: Look for documents that contain both b and c, or documents that contain a. The inner AND is evaluated first:
OR (a, AND (b,c))
Infix notation specifies that the operator appears between each element within the expression. The following example means: Look for documents that contain both a and b, or documents that contain c:
a AND b OR c
When infix notation is used, precedence is implicit within the expression. For example, the AND operator takes precedence over the OR operator.
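The relationship between infix precedence and prefix notation can be sketched in a few lines of Python. This is a hypothetical illustration, not part of Cold Fusion or Verity: the function name `to_prefix` is invented here, and the sketch assumes expressions built only from plain words, parentheses, and the AND and OR operators. It rewrites an infix expression in prefix form so that the implicit grouping becomes explicit.

```python
import re


def to_prefix(expression):
    """Rewrite an infix AND/OR expression in prefix notation (illustrative only)."""
    # Tokens are parentheses or runs of characters without spaces or parentheses.
    tokens = re.findall(r"[()]|[^\s()]+", expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        return token

    def parse_operand():
        token = take()
        if token == "(":
            inner = parse_or()  # parenthesized content is read first
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return inner
        return token

    def parse_chain(operator, parse_next):
        operands = [parse_next()]
        while peek() is not None and peek().upper() == operator:
            take()
            operands.append(parse_next())
        if len(operands) == 1:
            return operands[0]
        return f"{operator} ({', '.join(operands)})"

    def parse_and():
        # AND binds more tightly than OR, so it is parsed at the lower level.
        return parse_chain("AND", parse_operand)

    def parse_or():
        return parse_chain("OR", parse_and)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


if __name__ == "__main__":
    print(to_prefix("a AND b OR c"))
    print(to_prefix("(a OR b) AND c"))
```

Under these assumptions, `a AND b OR c` groups a and b first, while `(a OR b) AND c` shows the parentheses overriding the default precedence.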
If an expression includes two or more search terms within parentheses, a comma is required as a separator between each element. The following example means: Look for documents that contain a, b, or both. Note that in this example, angle brackets are used with the OR operator.
<OR> (a, b)
Angle brackets < >, double quotation marks " ", and backslashes \ are used to delimit various elements in a query expression.
Angle brackets for operators
Left and right angle brackets < > are reserved for designating operators and modifiers. They are optional for the AND, OR, and NOT operators, but required for all other operators.
Double quotation marks in expressions
You use double quotation marks to search for a word that is otherwise reserved as an operator, such as AND, OR, and NOT.
Backslashes in expressions
To include a backslash \ in a search, insert two backslashes for each backslash character you want to search for:
C:\\CFUSION\\BIN
The following wildcard characters are available for searching Verity collections:
| Wildcard | Description |
|---|---|
| ? | Question mark. Specifies any single alphanumeric character. |
| * | Asterisk. Specifies zero or more alphanumeric characters. Avoid using the asterisk as the first character in a search string. Asterisk is ignored in a set, [ ] or an alternative pattern { }. |
| [ ] | Square brackets. Specifies one of any character in a set, as in "sl[iau]m" which locates "slim," "slam," and "slum." Square brackets indicate an implied OR. |
| { } | Curly braces. Specifies one of each pattern separated by a comma, as in "hoist{s, ing, ed}" which locates "hoists," "hoisting," and "hoisted." Curly braces indicate an implied AND. |
| ^ | Caret. Specifies one of any character not in the set as in "sl[^ia]m" which locates "slum" but not "slim" or "slam." |
| - | Hyphen. Specifies a range of characters in a set as in "c[a-r]t" which locates every word beginning with "c," ending with "t" and containing any letter from "a" to "r." |
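The wildcard rules in the table can be approximated with Python regular expressions. The sketch below is an illustration under stated assumptions, not the search engine's own matching code: the names `wildcard_to_regex` and `wildcard_match` are hypothetical, "alphanumeric" is taken to mean ASCII letters and digits, matching is applied to one whole word at a time, and backslash escaping is not handled.

```python
import re

ALNUM = "[A-Za-z0-9]"  # assumption: "alphanumeric" means ASCII letters and digits


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression (sketch)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            parts.append(ALNUM)            # exactly one alphanumeric character
        elif ch == "*":
            parts.append(ALNUM + "*")      # zero or more alphanumeric characters
        elif ch == "[":
            end = pattern.index("]", i)
            body = pattern[i + 1:end].replace("*", "")  # asterisk is ignored in a set
            negated = body.startswith("^")
            if negated:
                body = body[1:]
            # Keep hyphens so that ranges such as a-r still work.
            members = "".join(c if c == "-" else re.escape(c) for c in body)
            parts.append("[" + ("^" if negated else "") + members + "]")
            i = end
        elif ch == "{":
            end = pattern.index("}", i)
            # Asterisk is ignored in an alternative pattern; spaces around items are dropped.
            options = [opt.strip().replace("*", "") for opt in pattern[i + 1:end].split(",")]
            parts.append("(?:" + "|".join(re.escape(opt) for opt in options) + ")")
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


def wildcard_match(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None
```

For instance, `wildcard_match("sl[^ia]m", "slum")` is true while `wildcard_match("sl[^ia]m", "slim")` is false, mirroring the caret row of the table.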
To search for a wildcard character in your collection, you need to escape the character with a backslash (\). For example:
- To match a literal asterisk, you precede the * with two backslashes: "a\\*"
- To match a question mark or other wildcard character, precede the character with one backslash: "Checkers\?"
The following non-alphanumeric characters must be preceded by a backslash character (\) in a search string:
- comma ( , )
- left and right parentheses ( )
- double quotation mark ( " )
- backslash ( \ )
- at sign ( @ )
- left curly brace ( { )
- left bracket ( [ )
- less than sign ( < )
- backquote ( ` )
In addition to the backslash character, you can surround a string with paired backquotes (`) so that special characters in it are interpreted as literals. For example, to search for the wildcard string "a{b" you can surround the string with backquotes, as follows:
`a{b`
To search for a wildcard string that includes the literal backquote character (`) you must use two backquotes together and surround the whole string in backquotes:
`*n``t`
Note that you can use either paired backquotes or the backslash character to escape special characters. There is no functional difference between the two. For example, you can query for the term <DDA> in either of the following ways:
\<DDA\> or `<DDA>`
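When query strings are assembled programmatically, the two escaping styles described above can be applied by small helper functions. The following Python sketch is hypothetical (the helper names are invented here and are not part of Cold Fusion or Verity). It assumes the set of characters to escape is exactly the list of non-alphanumeric characters given above, so wildcard characters such as the asterisk are left alone.

```python
# Characters that the list above says must be preceded by a backslash.
SPECIAL_CHARACTERS = set(',()"\\@{[<`')


def escape_with_backslashes(text):
    """Precede each listed special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARACTERS else ch for ch in text)


def quote_with_backquotes(text):
    """Surround text with backquotes, doubling any backquote inside it."""
    return "`" + text.replace("`", "``") + "`"


if __name__ == "__main__":
    print(escape_with_backslashes("C:\\CFUSION\\BIN"))  # each backslash is doubled
    print(escape_with_backslashes("a{b"))
    print(quote_with_backquotes("<DDA>"))
    print(quote_with_backquotes("*n`t"))
```

Either helper alone is enough for a given string; applying both to the same string would escape it twice.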
Evidence operators can be used to specify either a basic word search or an intelligent word search. A basic word search finds documents that contain only the word or words specified in the query. An intelligent word search expands the query terms to create an expanded word list so that the search returns documents that contain variations of the query terms.
Documents retrieved using evidence operators are not ranked by relevance unless you use the MANY modifier.
| Operator Name | Description |
|---|---|
| STEM | Expands the search to include the word you enter and its variations. The STEM operator is automatically implied in any SIMPLE query. For example, the EXPLICIT query <STEM>believe yields the following matches: "believe," "believing," "believer," etc. |
| WILDCARD | Matches wildcard characters included in search strings. Certain characters automatically indicate a wildcard specification, such as * and ?. For example, a single wildcard query expression can yield the following matches: "spam," "spammer," "spamming." |
| WORD | Performs a basic word search, selecting documents that include one or more instances of the specific word you enter. The WORD operator is automatically implied in any SIMPLE query. |
Proximity operators specify the relative location of specific words in the document. Specified words must be in the same phrase, paragraph, or sentence for a document to be retrieved. In the case of NEAR and NEAR/N operators, retrieved documents are ranked by relevance based on the proximity of the specified words. Proximity operators can be nested; phrases or words can appear within SENTENCE or PARAGRAPH operators, and SENTENCE operators can appear within PARAGRAPH operators. The following table describes each operator.
| Operator Name | Description |
|---|---|
| NEAR | Selects documents containing specified search terms. The closer the search terms are to one another within a document, the higher the document's score. The document with the smallest possible region containing all search terms always receives the highest score. Documents whose search terms are not within 1000 words of each other are not selected. |
| NEAR/N | Selects documents containing two or more search terms within N words of each other, where N is an integer between 1 and 1024. NEAR/1 searches for two words that are next to each other. The closer the search terms are within a document, the higher the document's score. You can specify multiple search terms using multiple instances of NEAR/N as long as the value of N is the same: commute <NEAR/10> bicycle <NEAR/10> train <NEAR/10> |
| PARAGRAPH | Selects documents that include all of the words you specify within the same paragraph. To search for three or more words or phrases, you must use the PARAGRAPH operator between each word or phrase. |
| PHRASE | Selects documents that include a phrase you specify. A phrase is a grouping of two or more words that occur in a specific order. Examples of phrases: mission oak; "mission oak"; mission <PHRASE> oak; <PARAGRAPH> (mission, oak) |
| SENTENCE | Selects documents that include all of the words you specify within the same sentence. Examples: jazz <SENTENCE> musician; <SENTENCE> (jazz, musician) |
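The selection rule behind NEAR/N can be sketched as a word-distance check. The Python below is an illustration only, with a hypothetical function name. It assumes that "within N words of each other" means the word positions differ by at most N, so that NEAR/1 accepts adjacent words, and it does not attempt to reproduce the engine's relevance scoring.

```python
import re


def within_n_words(text, first, second, n):
    """Return True if `first` and `second` occur at most n word positions apart."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        i != j and abs(i - j) <= n
        for i in first_positions
        for j in second_positions
    )


if __name__ == "__main__":
    sample = "I commute to work by bicycle or by train"
    print(within_n_words(sample, "commute", "bicycle", 10))
    print(within_n_words(sample, "commute", "bicycle", 3))
```

In the sample sentence, "commute" and "bicycle" are four positions apart, so the check succeeds for N = 10 and fails for N = 3.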
Relational operators search document fields that have been defined in the collection. Documents containing specified field values are returned. Documents retrieved using relational operators are not ranked by relevance, and you cannot use the MANY modifier with relational operators.
There are two types of relational operators. Numeric and date operators perform numeric and date comparisons. Text comparison operators match words and parts of words.
The following operators are used for numeric and date comparisons.
| Operator | Description |
|---|---|
| = | Equals |
| > | Greater than |
| >= | Greater than or equal to |
| < | Less than |
| <= | Less than or equal to |
| Operator Name | Description |
|---|---|
| CONTAINS | Selects documents by matching the word or phrase you specify with the values stored in a specific document field. Documents are selected only if the search elements specified appear in the same sequential and contiguous order in the field value. For example, specifying "god" will match "God in heaven," "a god among men," or "good god" but not "godliness," or "gods." |
| MATCHES | Selects documents by matching the query string with values stored in a specific document field. Documents are selected only if the search elements specified match the field value exactly. If a partial match is found, a document is not selected. For example, specifying "god" will match a document field containing only "god" and will not match "gods," "godliness," or "a god among men." |
| STARTS | Selects documents by matching the character string you specify with the starting characters of the values stored in a specific document field. |
| ENDS | Selects documents by matching the character string you specify with the ending characters of the values stored in a specific document field. |
| SUBSTRING | Selects documents by matching the query string you specify with any portion of the strings in a specific document field. For example, specifying "god" will match "godliness," "a god among men," "godforsaken," etc. |
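The differences among the text comparison operators are easiest to see side by side. The Python functions below are a hypothetical sketch of the behavior described above, not the engine's implementation. They assume case-insensitive comparison throughout (the description shows this only for CONTAINS, where "god" matches "God in heaven") and treat a word as a run of letters, digits, or underscores.

```python
import re


def _words(text):
    """Split text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def contains(field_value, query):
    """Whole words of the query appear in the same contiguous order in the field."""
    field_words, query_words = _words(field_value), _words(query)
    n = len(query_words)
    return n > 0 and any(
        field_words[i:i + n] == query_words
        for i in range(len(field_words) - n + 1)
    )


def matches(field_value, query):
    """The query equals the entire field value."""
    return field_value.lower() == query.lower()


def starts(field_value, query):
    """The field value begins with the query characters."""
    return field_value.lower().startswith(query.lower())


def ends(field_value, query):
    """The field value ends with the query characters."""
    return field_value.lower().endswith(query.lower())


def substring(field_value, query):
    """The query characters appear anywhere in the field value."""
    return query.lower() in field_value.lower()
```

With these definitions, the query "god" satisfies `contains` for "a god among men" but not for "godliness", whereas `substring` accepts both.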
You can use the SUBSTRING operator to match a character string with data stored in a specific document field. In the following example, a datasource called TEST1 contains the table YearPlaceText, which itself contains three columns: Year, Place, and Text. Year and Place make up the primary key. This is what the table looks like:
Table name: YearPlaceText

| Year | Place | Text |
|---|---|---|
| 1990 | Utah | Text about Utah 1990 |
| 1990 | Oregon | Text about Oregon 1990 |
| 1991 | Utah | Text about Utah 1991 |
| 1991 | Oregon | Text about Oregon 1991 |
| 1992 | Utah | Text about Utah 1992 |

The template shown below matches records that have 1990 in the TEXT column and are in the Place Utah. The search is performed against the collection that contains the TEXT column and then is narrowed further by searching for the string "Utah" in the CF_TITLE document field. Recall that document fields are defaults defined in every collection corresponding to the values you define for URL, TITLE, and KEY in the CFX_INDEX tag.
<CFQUERY NAME="GetText" DATASOURCE="TEST1">
    Select Year+Place AS Identifier, text
    from YearPlaceText
</CFQUERY>

<CFX_INDEX COLLECTION="testcollection"
    ACTION="UPDATE"
    TYPE="CUSTOM"
    TITLE="Identifier"
    KEY="Identifier"
    BODY="TEXT"
    QUERY="GetText">

<CFX_SEARCH NAME="GetText_Search"
    COLLECTION="testcollection"
    TYPE="EXPLICIT"
    CRITERIA="1990 and CF_TITLE <SUBSTRING> Utah">

<CFOUTPUT>
    Record Counts: <br>
    #GetText.RecordCount# <BR>
    #GetText_Search.RecordCount# <BR>
</CFOUTPUT>

<CFOUTPUT>
    Query Results --- Should be 5 rows <BR>
</CFOUTPUT>
<CFOUTPUT QUERY="GetText">
    #Identifier# <BR>
</CFOUTPUT>

<CFOUTPUT>
    Search Results -- should be 1 row <BR>
</CFOUTPUT>
<CFOUTPUT QUERY="GetText_Search">
    #GetText_Search.TITLE# <BR>
</CFOUTPUT>
There are three document fields predefined in any Cold Fusion Verity collection you create and populate:
- CF_KEY -- Defined in the KEY attribute of the CFX_INDEX tag
- CF_TITLE -- Defined in the TITLE attribute of the CFX_INDEX tag
- CF_URL -- Defined in the URLPATH attribute of the CFX_INDEX tag (where applicable)
Document fields are referenced in text comparison operators. These fields can contain alphanumeric characters. You define these fields when you generate a Verity collection using the CFX_INDEX tag.
Concept operators combine the meaning of search elements to identify a concept in a document. Documents retrieved using concept operators are ranked by relevance. The following table describes each concept operator.
| Operator Name | Description |
|---|---|
| AND | Selects documents that contain all of the search elements you specify. |
| OR | Selects documents that show evidence of at least one of the search elements you specify. |
| ACCRUE | Selects documents that include at least one of the search elements you specify. Documents are ranked based on the number of search elements found. |
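The three concept operators can be contrasted with a small Python sketch. It is a hypothetical illustration rather than the engine's algorithm: documents are reduced to sets of lowercase words, and for ACCRUE the number of search elements found stands in for the relevance score, whose actual formula is not given here.

```python
import re


def _word_set(text):
    """Reduce a document to its set of lowercase words."""
    return set(re.findall(r"\w+", text.lower()))


def select_and(documents, terms):
    """Names of documents that contain all of the terms."""
    return [name for name, text in documents.items()
            if all(t.lower() in _word_set(text) for t in terms)]


def select_or(documents, terms):
    """Names of documents that contain at least one of the terms."""
    return [name for name, text in documents.items()
            if any(t.lower() in _word_set(text) for t in terms)]


def rank_accrue(documents, terms):
    """Names of documents with at least one term, most terms found first."""
    counts = {name: sum(t.lower() in _word_set(text) for t in terms)
              for name, text in documents.items()}
    found = [name for name in documents if counts[name] > 0]
    return sorted(found, key=lambda name: -counts[name])


if __name__ == "__main__":
    docs = {
        "d1": "computers and laptops",
        "d2": "laptops only",
        "d3": "nothing relevant",
    }
    print(select_and(docs, ["computers", "laptops"]))
    print(select_or(docs, ["computers", "laptops"]))
    print(rank_accrue(docs, ["computers", "laptops"]))
```

In this sample, AND selects only d1, OR selects d1 and d2, and ACCRUE selects the same two documents but places d1 first because it contains both search elements.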
Score operators govern how the search engine calculates scores for retrieved documents. The maximum score a returned search element can have is 1. When a score operator is used, the search engine first calculates a separate score for each search element found in a document, and then performs a mathematical operation on the individual element scores to arrive at the final score for each document.
Note that the document's score is available as a result column. The SCORE result column can be referenced to trap the relevancy score of any document retrieved. For example:
<CFOUTPUT>
    <A HREF="#Search1.URL#">#Search1.Title#</A><BR>
    Document Score = #Search1.SCORE#<BR>
</CFOUTPUT>
| Operator Name | Description |
|---|---|
| YESNO | Forces the score of an element to 1 if the element's score is non-zero. For example, in the query <YESNO>mainframe, if the retrieval result of the search on "mainframe" is 0.75, the YESNO operator forces the result to 1. |
| PRODUCT | Multiplies the scores for documents matching a query. To arrive at a document's score, the search engine calculates a score for each search element and multiplies these scores together. For example, in the query <PRODUCT>computers, laptops the resulting scores for each document are multiplied together. |
| SUM | Adds together the scores for documents matching a query, up to a maximum value of 1. For example, in the query <SUM>computers, laptops the resulting scores are added together. |
| COMPLEMENT | Calculates scores for documents matching a query by taking the complement (subtracting from 1) of the scores for the query's search elements. The new score is 1 minus the search element's original score. For example, in the query <COMPLEMENT>computers, if the search element's original score is .785, the COMPLEMENT operator recalculates the score as .215. |
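The arithmetic behind the four score operators can be written out directly. The Python functions below are a minimal sketch with hypothetical names. They operate on element scores that are assumed to have been computed already, each between 0 and 1, and do not model how the search engine derives those scores.

```python
import math


def yesno(score):
    """Force any non-zero score to 1."""
    return 1.0 if score != 0 else 0.0


def product(scores):
    """Multiply the element scores together."""
    return math.prod(scores)


def capped_sum(scores):
    """Add the element scores together, up to a maximum value of 1."""
    return min(1.0, sum(scores))


def complement(score):
    """Subtract the element score from 1."""
    return 1.0 - score


if __name__ == "__main__":
    print(yesno(0.75))
    print(product([0.5, 0.4]))
    print(capped_sum([0.75, 0.5]))
    print(complement(0.785))
```

Because these are floating-point calculations, a result such as the complement of .785 may print with a small rounding error rather than as exactly .215.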