Legacy Search Syntax
Canopy recommends that you familiarize yourself with how we index data before using search capabilities.
Isabella the hiker
Isabella's tree house
Isabell sleeps on a cot.
Isabella is 25 years old.
Isabella is a tree trimmer.
Isabella and Marcus are sawing a big tree.
Isabella trimmed big trees
Marcus saw
Marcus was trimming a big tree.
Marcus's SS# is 078-05-1120.
Marcus is getting a cat from the tree.
Marcus and Isabella are hiking.
hike
organ
cut
burn
tree
wood
saw
I am going to trim a big tree.
oak
birch
maple
hiking
organization
sawing
cutting
burning
woods
trees
trimmer
trimmings
trimmed
trimming big tree
Isabella's special characters
special`
special~
special!
special@
special#
special$
special%
special*
special:
special;
special"
special'
special,
special.
special\
special/
special?
special(
special)
special-
special_
special`character
special~character
special!character
Not Special
-----------
special1character
specialAcharacter
Red Maple
Silver Maple
Sugar Maple
Black Oak
Chestnut Oak
Northern Red Oak
Scarlet Oak
White Oak
Black Birch
Gray Birch
Paper Birch
Yellow Birch
The data shown above is used in the examples on this page. You can download it and process it into a test project.
When searching for a single word, all documents containing a variant of that word will be returned.
For example, searching
Trim
will return documents containing the words trim, trimming, trimmed, or trimmer:
- hike.txt
- hiking.txt
- isabella.txt
- marcus.txt
When searching for a single word in its possessive form, the possessive ending is ignored.
For example, searching
Isabella's
will return documents containing the name isabella:
- isabella.txt
- marcus.txt
- special.txt
When searching for a phrase, the order of the words and the position of each word must match.
For example, searching
Trimmed big tree
will return documents containing the phrases trimming big tree or trimmed big trees:
- hiking.txt
- isabella.txt
When constructing searches, consider the presence of stop words, since they affect the position of the keywords in the phrase.
For example, searching
Trim that big tree
will return documents containing Trim a big tree, trimming a big tree, or trimmed trimming big tree:
- hike.txt
- hiking.txt
- marcus.txt
These results may be unexpected at first. To understand them, we need to examine how we analyze the text. When we analyze Trim that big tree, we create three tokens:
- Trim with position 1
- Big with position 3
- Tree with position 4
When we search for the phrase Trim that big tree we are searching for documents that have a phrase where:
- Trim is in the first position
- Trim is followed by big
- Trim and big are separated by 1 position
- Big is followed by tree
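Because "that" is a stop word, it is removed during analysis, but it still occupies position 2. Any document with exactly one word between trim and big therefore matches, which explains the query results.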
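The position bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not Canopy's actual analyzer: the analyze function and the STOP_WORDS list are hypothetical, and stemming is left out so that only the effect of stop words on positions is visible.

```python
# Hypothetical stop-word list; the list used by the real index is not given on this page.
STOP_WORDS = {"a", "an", "that", "the"}


def analyze(text):
    """Return (token, position) pairs, dropping stop words but keeping their positions."""
    tokens = []
    for position, word in enumerate(text.lower().split(), start=1):
        if word not in STOP_WORDS:
            tokens.append((word, position))
    return tokens


print(analyze("Trim that big tree"))  # [('trim', 1), ('big', 3), ('tree', 4)]
print(analyze("Trim a big tree"))     # [('trim', 1), ('big', 3), ('tree', 4)]
print(analyze("Trim big tree"))       # [('trim', 1), ('big', 2), ('tree', 3)]
```

The first two phrases produce identical positions, which is why a search containing one stop word can match text containing a different word in that slot. The third phrase has no gap between trim and big, so its positions differ.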
This search command is a beta feature. The syntax and operations may change in the future.
If your query needs to contain any of the reserved characters literally (and not as operators), preface the search with the beta: command and escape each reserved character with a leading backslash.
For example, searching
beta:special\?character
will return documents containing the string special?character.
Reserved characters include: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
Failing to escape these special characters correctly could lead to unpredictable results.
< and >
The characters < and > can’t be escaped.
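If you build queries in a script, the escaping rule can be applied automatically. The sketch below is only an illustration: escape_term and beta_query are hypothetical helper names, not part of Canopy, and escaping every single & and | is an assumption about how the two-character operators && and || should be handled.

```python
# Reserved characters from the list above. < and > are excluded because they cannot be escaped.
# Assumption: each & and | is escaped individually to cover the && and || operators.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_term(term):
    """Prefix every reserved character in term with a backslash."""
    if "<" in term or ">" in term:
        raise ValueError("< and > cannot be escaped")
    return "".join("\\" + ch if ch in RESERVED else ch for ch in term)


def beta_query(term):
    """Build a beta: query that treats term literally."""
    return "beta:" + escape_term(term)


print(beta_query("special?character"))  # beta:special\?character
```

Rejecting < and > up front is a design choice for this sketch; it surfaces the limitation early instead of sending a query that may behave unpredictably.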
Canopy supports AND, OR, and AND NOT boolean operators. Canopy also supports the alternative ANY, ALL, and NONE operators.
For example, searching
Isabella and saw
will return documents that contain both Isabella and a variant of saw, such as saw or sawing:
- isabella.txt
- marcus.txt
For example, searching
Isabella or saw
will return documents that contain Isabella, saw, or sawing:
- hike.txt
- hiking.txt
- isabella.txt
- marcus.txt
- special.txt
For example, searching
Isabella and not saw
will return documents that contain the word Isabella but no variant of saw:
- special.txt
As an alternative to the or operator, type any: followed by the list of desired search words.
For example, searching
any:oak,birch,maple
will return the documents:
- hike.txt
- the_birches.txt
- the_maples.txt
- the_oaks.txt
Space matters
The search will only work without spaces after the colon!
As an alternative to the and operator, type all: followed by the list of desired search words.
For example, searching
all:oak,birch,maple
will return documents containing all three words; oak, birch, and maple:
- hike.txt
To exclude documents containing certain words, type none: followed by the list of words to exclude.
For example, searching
none:oak,birch,maple
returns all documents that contain none of the words oak, birch, or maple:
- hiking.txt
- isabella.txt
- marcus.txt
- special.txt
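The any:, all:, and none: operators behave like set operations on the words in each document. The following sketch illustrates that logic under stated assumptions: the DOCS mapping is a hypothetical, reduced stand-in for the sample data (only the three tree names are kept), the matches and search functions are invented for this example, and stemming is ignored.

```python
# Hypothetical sample: the tree names each document contains, inferred from the results above.
DOCS = {
    "hike.txt": {"oak", "birch", "maple"},
    "hiking.txt": set(),
    "the_birches.txt": {"birch"},
    "the_maples.txt": {"maple"},
    "the_oaks.txt": {"oak"},
}


def matches(operator, words, doc_words):
    """Apply any/all/none semantics to one document's word set."""
    wanted = set(words)
    found = wanted & doc_words
    if operator == "any":
        return bool(found)
    if operator == "all":
        return found == wanted
    if operator == "none":
        return not found
    raise ValueError(f"unknown operator: {operator}")


def search(query):
    """Parse a query such as any:oak,birch,maple and return matching file names."""
    operator, _, word_list = query.partition(":")
    words = word_list.split(",")
    return sorted(name for name, doc_words in DOCS.items()
                  if matches(operator, words, doc_words))


print(search("any:oak,birch,maple"))
# ['hike.txt', 'the_birches.txt', 'the_maples.txt', 'the_oaks.txt']
print(search("all:oak,birch,maple"))   # ['hike.txt']
print(search("none:oak,birch,maple"))  # ['hiking.txt']
```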
Type the word followed by the fuzzy operator ~ and the maximum number of single-character changes allowed to turn the search word into a matching word. Valid values are 0, 1, or 2.
For example, searching
hyker~2
will return documents containing Hike, Hiker, or Hiking:
- hike.txt
- hiking.txt
- isabella.txt
- marcus.txt
Returning Hiking may not be obvious at first, because more than 2 changes are required to create Hiking from hyker. With the fuzzy operator, however, hyker matches hike, and therefore all of the variants of hike, which include hiking.
Case matters
A change in capitalization counts as a character change!
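To see how character changes are counted, the sketch below computes the classic Levenshtein edit distance (insertions, deletions, and substitutions each count as one change). It is an illustration only: this page does not state which distance measure Canopy uses, so treating it as plain Levenshtein distance is an assumption, and edit_distance is a name chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character edits from a to b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


print(edit_distance("hyker", "hiker"))   # 1
print(edit_distance("hyker", "hike"))    # 2
print(edit_distance("hyker", "hiking"))  # 4
print(edit_distance("Hike", "hike"))     # 1
```

Under this measure, hiker and hike fall within the limit of 2 while hiking does not, which is consistent with hiking being returned only as a stemmed variant of hike. The last line shows a capitalization change counting as one edit.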
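To find words that appear near each other, place the w/N operator between them, where N is the maximum number of words allowed between the two terms.
For example, searching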
Trim w/2 tree
will return documents containing trim and tree with up to 2 words between them:
- hike.txt
- hiking.txt
- isabella.txt
- marcus.txt
Another example, searching
isabella w/5 (big tree)
will return documents containing isabella and (big tree) with up to 5 words between them.
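The "up to N words between them" rule can be sketched as a position check. This is a simplified illustration with several assumptions that this page does not confirm: the two terms may appear in either order, every word (including stop words) counts toward the gap, and stemming and parenthesized phrases such as (big tree) are not handled. The within function is hypothetical.

```python
def within(text, first, second, max_between):
    """Return True if first and second occur with at most max_between words between them."""
    words = text.lower().replace(".", "").split()
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(i != j and abs(i - j) - 1 <= max_between
               for i in first_positions
               for j in second_positions)


sentence = "I am going to trim a big tree."
print(within(sentence, "trim", "tree", 2))  # True: "a" and "big" lie between them
print(within(sentence, "trim", "tree", 1))  # False: two words exceed the limit of 1
```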
This search command is a beta feature. The syntax and operations may change in the future.
While the phrase search Trim a big tree expects all terms in exactly the same order and position, a proximity query allows the specified words to be further apart or in a different order. Just as fuzzy queries specify a maximum edit distance for characters in a word, a proximity search specifies a maximum edit distance for words in a phrase.
For example, searching
beta:"Big Tree Trim"~5
will return documents containing the phrases Trim a big tree, Trimming a big tree, and Trimmed big trees:
- hike.txt
- hiking.txt
- isabella.txt
- marcus.txt
Wildcard searches can be run on individual terms, using ? to replace a single character and * to replace zero or more characters.
Use the * wildcard to match consecutive characters in the same word.
Wildcard searches should only be used with root words
When using the wildcard operator, the search will only be reliable using root words. For example,
*ing
will not return any results, as the specified search is too broad.
In another example,
w*ds
will not return any results, as it is a plural version of the root word, wood.
For example, using * at the end of a word,
cut*
will find cut and cutting in the following documents:
- hike.txt
- hiking.txt
For example, using * wildcard in the middle of a root word,
w*d
will return documents containing wood and woods.
For example, using * at the beginning of a root word
*od
will return documents containing wood and woods.
Use the ? wildcard to match any single character in the same position.
For example,
c?t
will find cut, cot, cat, and cutting in the following documents:
- hike.txt
- hiking.txt
- isabella.txt
- marcus.txt
Expect Stemming
Since the index is configured for stemming, c?t, which returns cut, will also find cutting.
Special characters and the ? wildcard.
Searching with the ? wildcard works for a special character at the end of a word but not in the middle.
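The interaction between wildcards and stemming described in this section can be illustrated with Python's fnmatch module, whose ? and * have the same single-character and zero-or-more meaning. This is a sketch, not Canopy's implementation: the STEMS mapping is a hypothetical miniature index, wildcard_search is a name invented here, and fnmatch also gives [ and ] a special meaning that the search syntax on this page does not share.

```python
from fnmatch import fnmatchcase

# Hypothetical stemmed index: each root word maps to the variants it stands for.
STEMS = {
    "cut": ["cut", "cutting"],
    "cot": ["cot"],
    "cat": ["cat"],
    "wood": ["wood", "woods"],
}


def wildcard_search(pattern):
    """Match the pattern against root words only, then report every variant of each hit."""
    hits = []
    for stem in sorted(STEMS):
        if fnmatchcase(stem, pattern.lower()):
            hits.extend(STEMS[stem])
    return hits


print(wildcard_search("c?t"))   # ['cat', 'cot', 'cut', 'cutting']
print(wildcard_search("cut*"))  # ['cut', 'cutting']
print(wildcard_search("w*d"))   # ['wood', 'woods']
print(wildcard_search("w*ds"))  # []
```

Because the pattern is compared with root words, w*ds finds nothing even though woods appears in the data, while c?t reaches cutting through the root cut.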