-
Notifications
You must be signed in to change notification settings - Fork 9
/
indexing.txt
36 lines (30 loc) · 4.33 KB
/
indexing.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
Folio indexing codes explanation
TODO?: Build a dictionary of field names to analyzer arrays from the definition file. Use this to populate the MultiFieldWrapper.
Code Explanation
TF Term Field indexing. Individual terms in the field are indexed separately in the Field's index.
PF Phrase Field indexing. The entire contents of the field are indexed as a phrase in the Field's index.
TE Term Enclosing indexing. Individual terms in the field are indexed separately in the infobase's index.
NO Not indexed. Terms in the field will not be found if searched for in the infobase. (This option may not be used in conjunction with other options.)
PR Proximity. All terms in a field primitive must be found within a single application of the field to register a hit. Most often used with TE and TF.
DT Used only with Date fields. Controls how dates specified with a 2-number abbreviation (95, 96, 97, etc.) are handled. If this option is used, dates less than or equal to 49 are assumed to be after the year 2000; dates greater than or equal to 50 are assumed to be before the year 2000 (such as 1995). If this option is not used, all dates are assumed to be in the 1900s.
FP Fast Phrase. Enables the field for fast phrase searching.
SW Stop Words. Uses stop words to reduce the size of the index for fields when fast phrase is enabled.
Bug Notice:
Indexing Options
There are two sets of indexing options. One set applies to text fields only. The other set applies to date fields only.
Text Fields
Indexing options allow you to specify how information in a text field may be searched in the finished infobase. (The indexing options extend the functionality of the IX+ and IX- options provided in the 3.1 version of flat file.)
There are basically three types of indexing options: Field Indexing, Enclosing Indexing, and No Indexing. Field Indexing adds the terms in the field application to the field's index. This allows the terms to be found in a field search (such as Field X:dog). Enclosing Indexing adds the terms in the field application to the infobase index. This allows the terms to be found during a normal query (such as a search for the word dog). No Indexing turns indexing off completely; the terms in the field application cannot be found in any search.
In addition, you may specify how terms are indexed in the Field Indexing and Field Enclosing options. Terms may be indexed as individual terms and searched as individual terms (this is what is described in the preceding paragraph). Terms within a field application may also be indexed as a phrase. When terms are indexed as a phrase, then all of the terms in the field application are considered to be one word for searching purposes. For example, if International Business Machines is indexed as a phrase, the single term International Business Machines appears in the Word List in the Advanced Query dialog (rather than three separate terms).
Finally, text fields may also set a proximity for searches. If you use the Proximity (PR) option, then queries for terms in the field require that all of the terms exist in the same application of the field for a hit to register. For example, a search for [Field Presidents: Washington Adams Lincoln] would require that all three of these terms be found in the same field application. See Notes on the Proximity (PR) Indexing Option for more information.
These options may be used in conjunction for different effects. The default is Term Field (TF) and Term Enclosing (TE) (if no options are specified, these are used by default). This combination allows the terms in the field to be found both within the field and within the infobase as a whole.
Use all of the options (except the No Indexing (NO) option) to index terms both as phrases and as individual terms.
Use Term Field (TF) or Phrase Field (PF) alone to force the user to search in the field to find the term.
Using Term Enclosing (TE) alone does not make much sense but may be done. Using this alone allows users to find the information in the field in a general search of the infobase, but not when they search within the field. This option is normally used in conjunction with PF.
Note: While any combination of these options may be specified, the following are those allowed to be set from within Folio Views:
TF,TE Normal
PF,TE Phrase
No Unindexed
TF Field Only
PF Field Only as Phrase
If no indexing options are set, the Normal is used (TF,TE).