Marshall Islands Document Collection
Verity's powerful, full-text search features allow you to search
with great accuracy. There are two basic query parsers that can be
used to perform different types of searches: full-text and query-by-example.
A full-text search looks through the entire text of documents based
on the query you provide. This query can contain one or more words and
phrases, or it can be a query expression that uses Verity query language
elements, such as operators and modifiers. A query-by-example search looks for
documents whose content is similar to a block of text you provide.
There are many ways to ask a question using the full-text parser.
The simplest way is to enter one or more words and phrases separated by
commas. Using Verity operators and modifiers, you can apply logic to search
terms. Using pre-defined query objects called topics, you can search for
a subject encapsulated in a topic by entering the topic name. The
information below tells you about some of the ways you can ask questions.
Searching for Words and Phrases
If you select full-text and enter words and phrases separated by commas,
each comma represents the ACCRUE operator, which is a fuzzy OR: it means
"the more of these the better." By default, words and phrases in the query
are stemmed, meaning the search is broadened to include the stemmed
variations of these words. The effect of the ACCRUE operator is to assign
a score to each document that matches the query. The score
assigned to a document is based on the number of word matches the document
contains and the density of those matches.
The query below will search for the phrase "desktop publisher" and
stemmed variations of the word "editor":
- desktop publisher, editor
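To make the idea of ACCRUE scoring concrete, here is a minimal Python sketch. It is not Verity's actual algorithm: the function name `accrue_score`, the equal weighting of coverage and density, and the lack of stemming are all assumptions made for illustration.

```python
import re

def accrue_score(document: str, terms: list[str]) -> float:
    """Toy ACCRUE-style score (hypothetical, not Verity's formula).

    A document scores higher when it matches more of the query terms
    and when those matches are denser relative to its length.
    """
    words = re.findall(r"[a-z0-9]+", document.lower())
    if not words or not terms:
        return 0.0
    text = " ".join(words)
    matched_terms = 0
    total_hits = 0
    for term in terms:
        # Normalize the term like the document so phrases match word for word.
        needle = " ".join(re.findall(r"[a-z0-9]+", term.lower()))
        if not needle:
            continue
        hits = len(re.findall(rf"\b{re.escape(needle)}\b", text))
        if hits:
            matched_terms += 1
            total_hits += hits
    coverage = matched_terms / len(terms)        # share of query terms found
    density = min(1.0, total_hits / len(words))  # matches per document word
    return round(0.5 * coverage + 0.5 * density, 4)
```

With the terms "desktop publisher" and "editor", a document that contains both receives a higher score than one that contains only "editor", and a document that contains neither scores zero.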
The search engine automatically recognizes a topic name as a Verity
topic in a query expression and will treat a word as a topic if it matches
the name of an existing topic. For example, the following query will search
for the topic named "HTML" and stemmed variations of the word "editor"
because "HTML" is the name of a valid topic in this Topic Internet Server:
- HTML, editor
You may want the search engine to treat the word HTML as a word instead
of a topic. Also, you may want to search for the word "editor" and not
the word along with all of its stemmed variations. To do this, you just
delimit the search term in double-quotation marks. For example, the
following query will search for the word "HTML" and the word "editor":
- "HTML", "editor"
Note that searches are not case-sensitive by default. This means you
can use "HTML" or "html" in the above examples and get the same search
results.
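The rules in this section (commas separate terms, a topic name is treated as a topic, double quotes force a literal word) can be sketched in Python as follows. This is only an illustration: the function `parse_simple_query`, the labels it returns, and the case-insensitive topic lookup are assumptions, and the naive split does not handle commas inside quotation marks.

```python
def parse_simple_query(query: str, topic_names: set[str]) -> list[tuple[str, str]]:
    """Classify each comma-separated term of a simple query (hypothetical helper).

    Returns (kind, text) pairs, where kind is one of:
      "literal" - term was in double quotes, so it is matched as a plain word
      "topic"   - term matches the name of a known topic
      "stemmed" - any other term, matched along with its stemmed variations
    """
    known = {name.lower() for name in topic_names}  # assumed case-insensitive
    parsed = []
    for raw in query.split(","):
        term = raw.strip()
        if not term:
            continue
        if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
            parsed.append(("literal", term[1:-1]))
        elif term.lower() in known:
            parsed.append(("topic", term))
        else:
            parsed.append(("stemmed", term))
    return parsed
```

For the query `HTML, editor` with a topic named "HTML", the first term is classified as a topic and the second as a stemmed word; for `"HTML", "editor"`, both are classified as literal words.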
Using Verity Query Language
You can use operators and modifiers to apply logic to your query and
pinpoint the exact information you are interested in. Commonly used operators
are AND, OR, ACCRUE, and NEAR. A modifier can be used with an operator to
further define your question for the search engine. Frequently used
modifiers are MANY and NOT. By default, the words "and," "or," and "not" are
interpreted as Verity query language; all other query language elements,
such as the NEAR operator, are interpreted as words unless surrounded by
angle brackets. Sample query expressions using query language are below.
The AND operator selects documents that contain all of the search elements
you specify. To find documents that contain both evidence of the topic named
"HTML" and at least one stemmed variation of the word "editor," you can use
the following query:
- HTML and editor
The OR operator selects documents that show evidence of at least one of
the search elements. To find documents that contain either evidence of
the topic named "HTML" or at least one stemmed variation of the word
"editor," you can use the following query:
- HTML or editor
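Leaving scoring aside, the selection behavior of AND and OR can be pictured as set intersection and set union over the documents that match each search element. The Python sketch below assumes a hypothetical in-memory index that maps each term to a set of document IDs; it does not reflect how the search engine stores or ranks documents.

```python
def match_ids(index: dict[str, set[int]], term: str) -> set[int]:
    """Return the IDs of documents matching one term (empty set if none)."""
    return index.get(term.lower(), set())

def and_query(index: dict[str, set[int]], terms: list[str]) -> set[int]:
    """AND: keep only documents that match every search element."""
    sets = [match_ids(index, t) for t in terms]
    return set.intersection(*sets) if sets else set()

def or_query(index: dict[str, set[int]], terms: list[str]) -> set[int]:
    """OR: keep documents that match at least one search element."""
    return set().union(*(match_ids(index, t) for t in terms))

# Hypothetical index: document IDs that match each term.
index = {"html": {1, 2}, "editor": {2, 3}}
both = and_query(index, ["HTML", "editor"])   # documents matching both terms
either = or_query(index, ["HTML", "editor"])  # documents matching either term
```

Here only document 2 satisfies the AND query, while documents 1, 2, and 3 all satisfy the OR query.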
The MANY modifier is applied to words and phrases for a full-text search
by default. This modifier affects how documents are scored and tells the
search engine to give the highest scores to documents with the highest
density of word matches. This modifier can be used explicitly with many
operators, but not with AND, OR, or ACCRUE. When you enter a word
such as "editor" as a query, the search engine interprets it as:
- <MANY> <STEM>editor
The <STEM> operator tells the search engine to search for the stemmed
variations of this word. The <STEM> operator and the related <WORD> operator
can be used with other modifiers, such as CASE (for case-sensitive searches)
and NOT (to exclude information from searches).
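The default interpretation of a single word can be written out as a small Python helper. The helper name `expand_default` is hypothetical, and it covers only the single-word case described above; phrases, quoted terms, and terms that already carry explicit query language are returned unchanged because their expansion is not described here.

```python
def expand_default(term: str) -> str:
    """Spell out the default interpretation of a bare single word.

    Only the case described in the text is handled: a bare word such as
    editor becomes <MANY> <STEM>editor. Anything else is left as is.
    """
    term = term.strip()
    is_bare_word = bool(term) and term.isalnum()
    if not is_bare_word:
        return term  # quoted, multi-word, or already explicit: leave untouched
    return f"<MANY> <STEM>{term}"

explicit = expand_default("editor")
```

The value of `explicit` is the same explicit form shown above for the word "editor".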
Proximity Search Methods
There are several search methods for doing proximity searches. A proximity
search looks for documents containing search terms within close proximity
of each other. The following operators enable proximity search methods:
NEAR, PHRASE, SENTENCE, PARAGRAPH.
The NEAR operator selects documents containing specified search terms
within close proximity to each other. Document scores are calculated
based on the relative number of words between search terms; the closer
the search terms, the higher the score. To find documents that contain
the word "HTML" and stemmed variations of the word "publishing" within
close proximity to each other, you can use this query:
- "HTML"<NEAR>publishing
The SENTENCE and PARAGRAPH operators are used to specify a search within
a sentence or paragraph. The syntax for using these operators is similar.
To find documents that contain the word "HTML" and stemmed variations of
the word "publishing" within the same paragraph, you can use this query:
- "HTML"<PARAGRAPH>publishing
Want to exclude something from a search? That's what the NOT modifier
does. For example, to find documents containing stemmed variations of the
words "server" and "configuration" in close proximity to each other,
but not stemmed variations of the word "firewall", you enter this query:
- server<NEAR>configuration<AND> <NOT>firewall
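The combination of a proximity test and an exclusion can be sketched in Python as follows. This is a toy model, not the engine's scoring: the function names, the use of 1 divided by the word distance as the score, and the exact-word matching without stemming are assumptions made for illustration.

```python
import re

def tokens(text: str) -> list[str]:
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near_score(text: str, first: str, second: str) -> float:
    """Toy NEAR score: the closer the two words, the higher the score."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == first.lower()]
    pos_b = [i for i, w in enumerate(words) if w == second.lower()]
    if not pos_a or not pos_b:
        return 0.0  # one of the words is missing, so there is no match
    gap = min(abs(a - b) for a in pos_a for b in pos_b)
    return 1.0 / max(gap, 1)  # adjacent words get the top score of 1.0

def near_and_not(text: str, first: str, second: str, excluded: str) -> float:
    """Score a NEAR match, but reject documents containing the excluded word."""
    if excluded.lower() in tokens(text):
        return 0.0
    return near_score(text, first, second)
```

Under this model, a document in which "server" and "configuration" are adjacent scores higher than one in which they are several words apart, and any document that mentions "firewall" is rejected outright.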
You can search in any named HTML zone, such as <TITLE> or <H1>. The following
query will find documents whose titles contain stemmed variations of
the words "web" and "security":
- (web, security)<IN>title
An HTML zone name corresponds to an HTML tag name.
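Because a zone name corresponds to an HTML tag name, a zone search can be pictured as collecting the text inside that tag and matching terms against it alone. The Python sketch below uses the standard library `html.parser` module; the class `ZoneText` and the functions `zone_words` and `count_in_zone` are hypothetical names, and the matching is on exact words rather than stemmed variations.

```python
import re
from html.parser import HTMLParser

class ZoneText(HTMLParser):
    """Collect the text found inside one named HTML tag (the zone)."""

    def __init__(self, zone: str):
        super().__init__()
        self.zone = zone.lower()
        self.depth = 0       # greater than zero while inside the zone
        self.chunks = []     # text pieces seen inside the zone

    def handle_starttag(self, tag, attrs):
        if tag == self.zone:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.zone and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

def zone_words(html: str, zone: str) -> set[str]:
    """Return the set of lowercase words that appear inside the zone."""
    parser = ZoneText(zone)
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower()))

def count_in_zone(html: str, zone: str, terms: list[str]) -> int:
    """Count how many of the terms occur inside the zone."""
    words = zone_words(html, zone)
    return sum(1 for term in terms if term.lower() in words)

page = ("<html><head><title>Web Security Basics</title></head>"
        "<body><h1>Intro</h1><p>security</p></body></html>")
title_hits = count_in_zone(page, "title", ["web", "security"])
h1_hits = count_in_zone(page, "h1", ["web", "security"])
```

In this example both terms are found in the title zone and neither is found in the H1 zone, even though "security" appears elsewhere in the page body.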