---

copyright:
lastupdated: "2021-03-12"

subcollection: discovery-data

---
{:shortdesc: .shortdesc}
{:external: target="_blank" .external}
{:tip: .tip}
{:note: .note}
{:pre: .pre}
{:important: .important}
{:deprecated: .deprecated}
{:codeblock: .codeblock}
{:screen: .screen}
{:download: .download}
{:hide-dashboard: .hide-dashboard}
{:apikey: data-credential-placeholder='apikey'}
{:url: data-credential-placeholder='url'}
{:curl: .ph data-hd-programlang='curl'}
{:javascript: .ph data-hd-programlang='javascript'}
{:java: .ph data-hd-programlang='java'}
{:python: .ph data-hd-programlang='python'}
{:ruby: .ph data-hd-programlang='ruby'}
{:swift: .ph data-hd-programlang='swift'}
{:go: .ph data-hd-programlang='go'}
{:xml: .ph data-hd-programlang='xml'}
{:properties: .ph data-hd-programlang='properties'}
# Configuring and running a custom connector
{: #ccs-tooling}
After you build and deploy a custom connector, you can configure and run it in the {{site.data.keyword.discoveryshort}} tool to create a collection.
{: shortdesc}
**{{site.data.keyword.icp4dfull_notm}} only**

This information applies only to installed deployments.
{: note}
You create and manage a collection as described in _Creating and managing collections_. You can use a successfully deployed custom connector during this process as follows. These instructions enable you to use a custom connector instead of one of the pre-built connectors that are listed in _Configuring Cloud Pak for Data data sources_.
- After you create a new project, including a name and project type, look on the **Select data source** page for your custom connector. Select the custom connector and click **Next**. The **Configure collection** page opens.
  The following steps apply specifically to the example custom connector that is shipped with the `custom-crawler-docs.zip` file.
  {: note}

- Enter values for the following fields on the **Configure collection** page. If a field is already populated with a value, verify the value and change it if needed. A prepopulated value indicates that the value was specified in the custom connector's `template.xml` or `message.properties` file.
- **General**
- **Collection name**
- **Collection language**
- **Crawler properties**
- **Crawler name**
- **Crawler description**
- **Time to wait between retrieval requests (milliseconds)** (default `0`)
- **Maximum number of active crawler threads** (default `10`)
- **Maximum number of documents to crawl** (default `2000000000`)
- **Maximum document size (KB)** (default `32768`)
- **When the crawler session is started**
Selections include:
- **Start crawling updates (look for new, modified, and deleted content)**
- **Start crawling new and modified content**
- **Start a full crawl**
- **`{connector_name}`_GENERAL_SETTINGS_CUSTOM_CONFIG_CLASS_LABEL** (default `com.ibm.es.ama.custom.crawler.sample.sftp.SftpCrawler`)
- **`{connector_name}`_GENERAL_SETTINGS_CUSTOM_CRAWLER_CLASS_LABEL** (default `com.ibm.es.ama.custom.crawler.sample.sftp.SftpCrawler`)
- **`{connector_name}`_GENERAL_SETTINGS_CUSTOM_SECURITY_CLASS_LABEL** (default `com.ibm.es.ama.custom.crawler.sample.sftp.SftpCrawler`)
- **`{connector_name}`_GENERAL_SETTINGS_DOCUMENT_LEVEL_SECURITY_SUPPORTED_LABEL** (default `On`)
- **`{connector_name}`_GENERAL_SETTINGS_DOCUMENT_LEVEL_SECURITY_SSO_ENABLED_LABEL** (default `Off`)
- **`{connector_name}`_GENERAL_SETTINGS_DOCUMENT_LEVEL_SECURITY_SCOPE_LABEL** (default `{host}:{:port}`)
- **Data Source Properties**
- **Host name** (default `localhost`)
- **Port** (default `22`)
- **User name**
- **Use key file (or input password)** (default `On`)
- **Key file location**
    - **Passphrase**
- **Password**
- **Space filter**
- **Path filter** (default `/`)
- **Level** (default `1`)
- **Crawl space Properties**
- Click **Finish** to create the collection with the custom connector and the values that you specified for it.
- Continue with the instructions in _Creating and managing collections_ and its subsequent topics.
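The raw keys that are shown for some fields, such as `{connector_name}_GENERAL_SETTINGS_CUSTOM_CONFIG_CLASS_LABEL`, are message keys for which no display text was found. As an illustrative sketch only (the key names below are assumptions that are inferred from the labels shown in the tool, not a documented schema), a connector's `message.properties` file could supply display text for such fields like this:

```properties
# Illustrative only: display text for connector configuration fields.
# Key names are assumptions inferred from the raw labels shown in the tool.
GENERAL_SETTINGS_CUSTOM_CONFIG_CLASS_LABEL=Custom configuration class
GENERAL_SETTINGS_CUSTOM_CRAWLER_CLASS_LABEL=Custom crawler class
GENERAL_SETTINGS_CUSTOM_SECURITY_CLASS_LABEL=Custom security class
GENERAL_SETTINGS_DOCUMENT_LEVEL_SECURITY_SUPPORTED_LABEL=Enable document-level security
GENERAL_SETTINGS_DOCUMENT_LEVEL_SECURITY_SSO_ENABLED_LABEL=Enable single sign-on
GENERAL_SETTINGS_DOCUMENT_LEVEL_SECURITY_SCOPE_LABEL=Security scope
```
{: codeblock}

If a key is defined in the file, the tool displays its value as the field label; otherwise, the raw key is shown, as in the example connector above.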