copyright | lastupdated | subcollection
---|---|---
 | 2021-03-12 | discovery-data
{:shortdesc: .shortdesc} {:external: target="_blank" .external} {:tip: .tip} {:note: .note} {:pre: .pre} {:important: .important} {:deprecated: .deprecated} {:codeblock: .codeblock} {:screen: .screen} {:download: .download} {:hide-dashboard: .hide-dashboard} {:apikey: data-credential-placeholder='apikey'} {:url: data-credential-placeholder='url'} {:curl: .ph data-hd-programlang='curl'} {:javascript: .ph data-hd-programlang='javascript'} {:java: .ph data-hd-programlang='java'} {:python: .ph data-hd-programlang='python'} {:ruby: .ph data-hd-programlang='ruby'} {:swift: .ph data-hd-programlang='swift'} {:go: .ph data-hd-programlang='go'}
{: #improve}
You can use the Improve and Customize page in {{site.data.keyword.discoveryfull}} to try out queries, and then add and test customizations that improve the query results for your project.
{: shortdesc}
To access the Improve and Customize page, select the Improve and customize icon on the navigation panel.
To improve and customize your Document Retrieval project:
- Enter a natural language query in the query box.
- Review the displayed query results. Depending on the settings, you can view the source document and additional information for each result by selecting View passage in document, View document, or View table in document. Passages are enabled by default for all project types except Content Mining. For more information about default project settings, see Default query settings. For information about how passages are identified in natural language queries, see Passages.
Tabs available when you view the source document (the tabs displayed depend on the project settings):
- Document: Preview of the source document, with passages highlighted if enabled.
- JSON: The JSON output of the query, which includes the `document_id`, `metadata`, `enriched_text`, and more.
- Contract data ({{site.data.keyword.discoveryshort}} for Content Intelligence only): Displays the contract elements. You can filter them by entity, or click any of the highlighted elements to view details about that element.
- Configure the desired improvement tools.
- For some of the tools, after you apply the improvement, a recrawl or reprocess of the collections in your project starts automatically. To start a recrawl or reprocess manually, open the Activity page of each collection.
- Retry the query.
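The query-and-review loop above can also be tried programmatically. The following is a minimal sketch, assuming the {{site.data.keyword.discoveryshort}} v2 query API and its `natural_language_query`, `count`, and `passages` request parameters, and assuming that passages are returned in a `document_passages` array on each result. The sample result is hypothetical data made up for illustration, and no request is sent.

```python
import json


def build_query_body(question, count=5):
    """Build a JSON-serializable body for a natural language query.

    Assumption: the field names follow the Discovery v2 query API.
    """
    return {
        "natural_language_query": question,
        "count": count,
        "passages": {"enabled": True},
    }


def summarize_result(result):
    """Pull out the fields that you would look for in the JSON tab."""
    passages = result.get("document_passages", [])
    return {
        "document_id": result.get("document_id"),
        "metadata_keys": sorted(result.get("metadata", {})),
        "first_passage": passages[0]["passage_text"] if passages else None,
    }


# Hypothetical result, shaped like one entry of a query response.
sample_result = {
    "document_id": "doc-123",
    "metadata": {"source": "example"},
    "enriched_text": [],
    "document_passages": [{"passage_text": "Example passage.", "field": "text"}],
}

body = build_query_body("How do I reset my password?", count=3)
print(json.dumps(body, indent=2))
print(summarize_result(sample_result))
```

After you apply an improvement, sending the same body again and comparing the summaries is one way to check whether the results changed.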
To improve and customize your Conversational Search project:
- Enter a question in the chat box.
- Review the query results displayed.
- Configure the desired improvement tools.
- For some of the tools, after you apply the improvement, a recrawl or reprocess of the collections in your project starts automatically. To start a recrawl or reprocess manually, open the Activity page of each collection.
- Retry the question.
To improve and customize your Content Mining project:
- Explore the facets. For more information, see Facets in content mining projects.
- Apply filters and add facets.
- Configure the desired improvement tools.
- For some of the tools, after you apply the improvement, a recrawl or reprocess of the collections in your project starts automatically. To start a recrawl or reprocess manually, open the Activity page of each collection.
- Review the facets.
You can open your application by choosing Launch application. For more information about the application, see Using the Content Mining application.
{: #improvement-tools}
The improvement tools that are available vary depending on the project type that you selected.
Customize display
- Facets - Create hierarchical categories within your data. For more information, see Facets.
- Search bar
  Options:
  - Autocomplete - Suggests completions for queries as they are typed. For more information, see the API reference{: external}.
  - Spelling suggestions - Offers corrections for likely typos in search queries.
- Search results
  Options:
  - Passages - A relevant passage is returned as a query result. You can specify the number of passages returned per document (Passages per document) and the Maximum characters in a passage. For more information about passage retrieval, see Passages.
  - Field - The specified field is used as the title of a query result (the title appears under each query result, along with the collection name).
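The search bar and search result options above have rough counterparts in the query API. The following is a minimal sketch, assuming the v2 query parameters `spelling_suggestions` and `passages` (with `per_document`, `max_per_document`, and `characters`) and the v2 autocompletion endpoint. The service URL, project ID, and version date are placeholder assumptions, and the code only builds the request pieces without sending them.

```python
from urllib.parse import urlencode

# Assumption: a version date that the v2 API accepts.
API_VERSION = "2020-08-30"


def build_query_options(spelling_suggestions, passages_per_document, max_characters):
    """Map the Spelling suggestions and Passages settings to query parameters."""
    return {
        "spelling_suggestions": spelling_suggestions,
        "passages": {
            "enabled": True,
            "per_document": True,
            "max_per_document": passages_per_document,
            "characters": max_characters,
        },
    }


def autocompletion_url(service_url, project_id, prefix, count=5):
    """Build the URL of a GET request for autocompletion suggestions."""
    params = urlencode({"version": API_VERSION, "prefix": prefix, "count": count})
    return f"{service_url}/v2/projects/{project_id}/autocompletion?{params}"


options = build_query_options(True, passages_per_document=2, max_characters=300)
# "https://example.test" and "my-project" are hypothetical placeholders.
url = autocompletion_url("https://example.test", "my-project", "ho", count=3)
print(options)
print(url)
```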
Extract meaning
For more information about each of the following enrichments, see Extracting meaning.
- Entities
- Parts of speech
- Keywords
- Sentiment of documents
{{site.data.keyword.icp4dfull_notm}}: The {{site.data.keyword.discoveryshort}} for Content Intelligence enrichments (`Contracts`, `Invoices`, and `Purchase orders`) are available only if you install {{site.data.keyword.discoveryshort}} for Content Intelligence and choose the Project type of Document retrieval.
{{site.data.keyword.cloud_notm}}: On {{site.data.keyword.cloud_notm}} Premium plans, the {{site.data.keyword.discoveryshort}} for Content Intelligence `Contracts` enrichment is available if you choose the Project type of Document retrieval, then select the Apply contracts enrichment checkbox.
Teach domain concepts
- Dictionaries - Dictionaries allow you to enrich document fields in your collection. The enrichment terms can be synonyms (car, automotive, auto), or words in the same category (carburetor, piston, valves). For more information, see Dictionary enrichments.
- Classifiers - The Classifier enrichment uses the labels and text examples that you specify in a `.csv` file to predict the categories of the documents in your collection. For more information, see Classifier enrichment.
- Regular expressions - The Regular expressions enrichment uses regular expressions to identify and extract information from fields in your collection. For more information, see Regular expressions enrichment.
- Machine learning - This enrichment uses models created in {{site.data.keyword.knowledgestudiofull}} for {{site.data.keyword.icp4dfull}} or Watson Explorer Content Analytics Studio to enrich your collection. For more information, see Machine Learning enrichments.
- Advanced rule models - This enrichment uses a text extraction model created and exported from the Advanced rule editor of {{site.data.keyword.knowledgestudiofull}} for {{site.data.keyword.icp4dfull}}. For more information, see Advanced rule models enrichment.
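To make the dictionary and regular expression enrichments above more concrete, the following is a minimal local sketch of the idea: terms from a dictionary and matches of a pattern are located in a field's text. It illustrates the concept only and is not how the service implements enrichments; the dictionary entries reuse the examples above, and the `PN-12345` part-number pattern is a hypothetical example.

```python
import re

# Hypothetical dictionary: each label maps to the terms that should match it.
DICTIONARY = {
    "car": ["car", "automotive", "auto"],
    "engine_part": ["carburetor", "piston", "valves"],
}

# Hypothetical pattern: part numbers such as "PN-12345".
PART_NUMBER = re.compile(r"\bPN-\d{5}\b")


def dictionary_matches(text):
    """Return (label, matched term, offset) for every dictionary term found."""
    found = []
    for label, terms in DICTIONARY.items():
        for term in terms:
            pattern = rf"\b{re.escape(term)}\b"
            for match in re.finditer(pattern, text, re.IGNORECASE):
                found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


def regex_matches(text):
    """Return (matched text, offset) for every part number found."""
    return [(m.group(0), m.start()) for m in PART_NUMBER.finditer(text)]


sample_text = "The auto shop replaced a piston, order PN-12345."
print(dictionary_matches(sample_text))
print(regex_matches(sample_text))
```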
Define structure
- New fields - Annotate fields within your documents to train a custom conversion model. As you annotate, Watson learns and starts to predict annotations. For more information, see Identify fields.
- Hidden fields - This option allows you to choose which fields should be included in the index for this collection. You can switch off any fields you do not want to index. For more information, see Managing fields.
- Document splitting - This option allows you to split your documents into segments based on a field name. Once split, each segment is a separate document that will be enriched, indexed, and returned as a separate query result. For more information, see Managing fields.
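The effect of document splitting can be pictured with a small local sketch. It assumes a document that is already broken into `(field, text)` elements and starts a new segment at each element whose field matches the split field. The segment ID format and the `parent_document_id` key are hypothetical names chosen for illustration, not the service's actual output.

```python
def split_document(document_id, elements, split_field):
    """Split (field, text) elements into segments at each split_field element.

    Each returned segment stands for a separate document that would be
    enriched, indexed, and returned as its own query result.
    """
    segments = []
    current = []
    for field, text in elements:
        # A split-field element closes the segment collected so far.
        if field == split_field and current:
            segments.append(current)
            current = []
        current.append((field, text))
    if current:
        segments.append(current)
    return [
        {
            "id": f"{document_id}_{index}",  # hypothetical ID scheme
            "parent_document_id": document_id,
            "text": " ".join(text for _, text in segment),
        }
        for index, segment in enumerate(segments)
    ]


elements = [
    ("title", "Guide"),
    ("subtitle", "Install"),
    ("text", "Run the installer."),
    ("subtitle", "Configure"),
    ("text", "Edit the settings."),
]
for segment in split_document("doc", elements, "subtitle"):
    print(segment)
```

In this sketch, content that appears before the first split field becomes its own segment.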
Improve relevance
- Synonyms - You can expand the scope of a query beyond exact matches by uploading a list of synonyms. For example, you can expand a query for "ibm" to include "international business machines" and "big blue". For more information, see Implementing synonyms.
- Stopwords - Stopwords are filtered out of queries because they are common terms that are not useful in a search. For more information, see Defining stopwords.
- Data management - Add more data to your collections. For more information, see Creating and managing collections.
- Relevancy training - You can improve the relevance of natural language query results in {{site.data.keyword.discoveryfull}} with training. For more information, see Improving result relevance with training.
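The way synonyms and stopwords change a query can be sketched locally. The following is a conceptual illustration only, not the service's implementation: stopwords are dropped from the query, and each remaining term is expanded to its synonym group. The synonym group reuses the "ibm" example above, and the stopword list is a hypothetical sample.

```python
# Synonym groups: every term in a group expands to the whole group.
SYNONYMS = [
    {"ibm", "international business machines", "big blue"},
]

# Hypothetical stopword list for illustration.
STOPWORDS = {"a", "an", "the", "of", "is", "what"}


def remove_stopwords(query):
    """Lowercase the query, split on whitespace, and drop stopwords."""
    return [term for term in query.lower().split() if term not in STOPWORDS]


def expand_terms(terms):
    """Replace each term with the sorted list of its synonyms (or itself)."""
    expanded = []
    for term in terms:
        group = next((g for g in SYNONYMS if term in g), {term})
        expanded.append(sorted(group))
    return expanded


terms = remove_stopwords("What is the history of IBM")
print(terms)
print(expand_terms(terms))
```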