ElasticSearch Configuration

This configuration applies only to Guice wiring.

Consult elasticsearch.properties to get some examples and hints.

Connection to a cluster:
elasticsearch.masterHost
IP address (or host name) of the ElasticSearch master.
elasticsearch.port
Port of the ElasticSearch master.
Alternatively, you can connect to a cluster with:
elasticsearch.hosts
Comma-separated list of hosts. A host is composed of an address and a port separated by a ':'. Example: elasticsearch.hosts=host1:9200,host2:9200
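As an illustration of the two connection styles, a fragment of elasticsearch.properties might look like the sketch below. The host names (es-master.example.com, host1, host2) are placeholders, and port 9200 is simply reused from the example above; adjust both to your cluster. Only the first style is active here, and the alternative is left commented out.

```properties
# Option 1: single master host and port (placeholder host name)
elasticsearch.masterHost=es-master.example.com
elasticsearch.port=9200

# Option 2: comma-separated list of host:port pairs (placeholder host names)
# elasticsearch.hosts=host1:9200,host2:9200
```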
Other options include:
elasticsearch.clusterName
Is the name of the cluster used by James.
elasticsearch.nb.shards
Number of shards for the index provisioned by James.
elasticsearch.nb.replica
Number of replicas for the index provisioned by James (default: 0).
elasticsearch.index.mailbox.name
Name of the mailbox index backed by the alias. It will be created if missing.
elasticsearch.index.name
Deprecated: use elasticsearch.index.mailbox.name instead.
Name of the mailbox index backed by the alias. It will be created if missing.
elasticsearch.alias.read.mailbox.name
Name of the alias used by Apache James for mailbox reads. It will be created if missing. The target of the alias is the index name configured above.
elasticsearch.alias.read.name
Deprecated: use elasticsearch.alias.read.mailbox.name instead.
Name of the alias used by Apache James for mailbox reads. It will be created if missing. The target of the alias is the index name configured above.
elasticsearch.alias.write.mailbox.name
Name of the alias used by Apache James for mailbox writes. It will be created if missing. The target of the alias is the index name configured above.
elasticsearch.alias.write.name
Deprecated: use elasticsearch.alias.write.mailbox.name instead.
Name of the alias used by Apache James for mailbox writes. It will be created if missing. The target of the alias is the index name configured above.
elasticsearch.retryConnection.maxRetries
Number of retries when connecting to the cluster.
elasticsearch.retryConnection.minDelay
Minimum delay between connection attempts
elasticsearch.indexAttachments
Whether attachments should be indexed (default: true).
elasticsearch.index.quota.ratio.name
Name of the ElasticSearch index used for quotas.
elasticsearch.alias.read.quota.ratio.name
Name of the ElasticSearch alias used for reading quotas.
elasticsearch.alias.write.quota.ratio.name
Name of the ElasticSearch alias used for writing quotas.
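A possible index and alias setup, using the non-deprecated property names, is sketched below. All names and numbers are illustrative assumptions rather than documented defaults, except the replica count of 0 and attachment indexing set to true, which mirror the defaults stated above. The unit of elasticsearch.retryConnection.minDelay is not stated in this document, so the value shown is only a placeholder.

```properties
# Illustrative values: cluster, index and alias names are assumptions
elasticsearch.clusterName=my-cluster
elasticsearch.nb.shards=5
elasticsearch.nb.replica=0

# Mailbox index and the aliases pointing to it
elasticsearch.index.mailbox.name=mailbox_v1
elasticsearch.alias.read.mailbox.name=mailbox_read_alias
elasticsearch.alias.write.mailbox.name=mailbox_write_alias

# Connection retries (placeholder values; check elasticsearch.properties for the unit of minDelay)
elasticsearch.retryConnection.maxRetries=7
elasticsearch.retryConnection.minDelay=3000

elasticsearch.indexAttachments=true

# Quota ratio index and aliases
elasticsearch.index.quota.ratio.name=quota_ratio_v1
elasticsearch.alias.read.quota.ratio.name=quota_ratio_read_alias
elasticsearch.alias.write.quota.ratio.name=quota_ratio_write_alias
```

Keeping separate read and write aliases makes it possible to point them at different indices later, for example while reindexing, without changing the names James uses.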
To configure metrics reporting on ElasticSearch:
elasticsearch.http.host
Host to report metrics on. Defaults to master host.
elasticsearch.http.port
HTTP port to use for publishing metrics.
elasticsearch.metrics.reports.enabled
Boolean value. Enables metrics reporting.
elasticsearch.metrics.reports.period
Period between metric reports, in seconds.
elasticsearch.metrics.reports.index
Index to publish metrics on
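The metrics options above could be combined as in the following sketch. The host name, port, period and index name are assumptions chosen for illustration; none of them is a documented default.

```properties
# Illustrative values only
elasticsearch.http.host=es-master.example.com
elasticsearch.http.port=9200
elasticsearch.metrics.reports.enabled=true
# Period between reports, in seconds
elasticsearch.metrics.reports.period=30
elasticsearch.metrics.reports.index=james-metrics
```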

For more details about ElasticSearch configuration, see the dedicated documentation.

Tika Configuration

When using ElasticSearch, you can configure an external Tika server for extracting and indexing text from attachments. This can significantly improve the user experience for text searches.

Note that this feature requires the Guice wiring, built with ElasticSearch.

Consult tika.properties to get some examples and hints.

Here are the different properties:
tika.enabled
Should the Tika text extractor be used?
If true, the TikaTextExtractor will be used behind a cache.
If false, the DefaultTextExtractor will be used (naive implementation only supporting text).
Defaults to false.
tika.host
IP or domain name of your Tika server. The default value is 127.0.0.1.
tika.port
Port of your Tika server. The default value is 9998.
tika.timeoutInMillis
Timeout, in milliseconds, when issuing requests to the Tika server. The default value is 3 seconds (3000 ms).
tika.cache.eviction.period
A cache is used to avoid, when possible, querying Tika multiple times for the same attachment.
This entry determines how long after the last read an entry is evicted.
Units are supported (ms - millisecond, s - second, m - minute, h - hour, d - day). The default unit is seconds.
The default value is 1 day.
tika.cache.enabled
Should the cache be used? False by default
tika.cache.weight.max
Maximum weight of the cache.
A value of 0 disables the cache
Units are supported (K for KB, M for MB, G for GB). Without a unit, the value is interpreted as bytes.
Default value is 100 MB.
tika.contentType.blacklist
Blacklist of content types known to fail with Tika. Specify the list as comma-separated values.
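Putting these properties together, a tika.properties file might resemble the sketch below. The host, port, timeout, eviction period and cache weight repeat the defaults listed above. Writing the unit suffix directly after the number (1d, 100M) is an assumption about the accepted syntax, and the blacklisted content types are placeholders, not a list of types known to fail.

```properties
tika.enabled=true
tika.host=127.0.0.1
tika.port=9998
# 3 seconds, expressed in milliseconds
tika.timeoutInMillis=3000

tika.cache.enabled=true
# Evict an entry one day after its last read (unit suffix syntax assumed)
tika.cache.eviction.period=1d
# 100 MB (unit suffix syntax assumed)
tika.cache.weight.max=100M

# Placeholder content types
tika.contentType.blacklist=application/ics,application/zip
```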
Note: You can launch a Tika server using this command line:
docker run --name tika logicalspark/docker-tikaserver:1.20