Cassandra Configuration

Note: Cassandra is only available with Guice wiring (cassandra-guice and cassandra-guice-ldap).

Consult cassandra.properties to get some examples and hints.

cassandra.nodes
List of some nodes of the Cassandra cluster, each in the format host:port or host. If the port is not specified, 9042 is used.
cassandra.keyspace
Name of the keyspace used by James.
cassandra.replication.factor
Replication factor. Should be 1, as clusters are not yet supported.
cassandra.query.logger.constant.threshold
Optional. If specified, all queries that take longer than the given integer, in milliseconds, are considered slow and logged. If not specified, a DynamicThresholdQueryLogger is used by default (see below).
cassandra.query.slow.query.latency.threshold.percentile
Default is com.datastax.driver.core.QueryLogger.DEFAULT_SLOW_QUERY_THRESHOLD_PERCENTILE. The latency percentile beyond which queries are considered 'slow' and will be logged. If you specify cassandra.query.logger.constant.threshold, you should not specify this property
cassandra.query.logger.max.query.string.length
Default is com.datastax.driver.core.QueryLogger.DEFAULT_MAX_QUERY_STRING_LENGTH. The maximum length of a CQL query string that can be logged verbatim by the Cassandra driver.
cassandra.query.logger.max.logged.parameters
Default is com.datastax.driver.core.QueryLogger.DEFAULT_MAX_LOGGED_PARAMETERS. The maximum number of query parameters that can be logged by the cassandra driver
cassandra.query.logger.max.parameter.value.length
Default is com.datastax.driver.core.QueryLogger.DEFAULT_MAX_PARAMETER_VALUE_LENGTH. The maximum length of a query parameter value that can be logged by the Cassandra driver.
cassandra.readTimeoutMillis
Optional. If specified, defines the Cassandra driver read timeout, in milliseconds.
See com.datastax.driver.core.PoolingOptions for the default values of the pooling options below. Any pooling option left unset falls back to the driver's default value.
cassandra.pooling.local.max.connections
Optional. Defaults to 8.
If specified, defines the maximum number of Cassandra connections to hosts (remote and local).
cassandra.pooling.local.max.requests
Optional. Defaults to 128.
If specified, defines the maximum number of concurrent requests per Cassandra connection.
cassandra.pooling.timeout
Optional. Defaults to 5000 (ms).
If specified, defines the timeout for waiting in the Cassandra connection pool queue. Should be higher than the socket read timeout.
cassandra.pooling.heartbeat.timeout
Optional. Defaults to 30 (s).
If specified, defines the Cassandra heartbeat timeout.
cassandra.pooling.max.queue.size
Optional. Defaults to 256.
If specified, defines the maximum size of the Cassandra connection pool queue.
mailbox.max.retry.acl
Optional. Defaults to 1000.
Controls the number of retries upon Cassandra ACL updates.
mailbox.max.retry.modseq
Optional. Defaults to 100000.
Controls the number of retries upon Cassandra ModSeq generation.
mailbox.max.retry.uid
Optional. Defaults to 100000.
Controls the number of retries upon Cassandra Uid generation.
mailbox.max.retry.message.flags.update
Optional. Defaults to 1000.
Controls the number of retries upon Cassandra flags update, in MessageMapper.
mailbox.max.retry.message.id.flags.update
Optional. Defaults to 1000.
Controls the number of retries upon Cassandra flags update, in MessageIdMapper.
fetch.advance.row.count
Optional. Defaults to 1000.
Controls how many rows may remain in the current page before the next page is prefetched when paging.
chunk.size.message.read
Optional. Defaults to 100.
Controls the number of messages to be retrieved in parallel.
chunk.size.expunge
Optional. Defaults to 50.
Controls the number of messages to be expunged in parallel.
mailbox.blob.part.size
Optional. Defaults to 102400 (100KB).
Controls the size of blob parts used to store messages.
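
As an illustration of the options above, a minimal cassandra.properties could look like the sketch below. The host names, the comma used to separate nodes, the keyspace name and the 100 ms slow-query threshold are assumptions chosen for this example, not values taken from a James distribution. The pooling and blob values restate the defaults listed above.

```properties
# Hypothetical hosts. The second one omits the port, so 9042 is used.
# Separating nodes with a comma is an assumption of this sketch.
cassandra.nodes=cassandra-1.example.com:9042,cassandra-2.example.com
# Hypothetical keyspace name
cassandra.keyspace=apache_james
cassandra.replication.factor=1

# Log every query slower than 100 ms (illustrative value).
# cassandra.query.slow.query.latency.threshold.percentile is left unset on
# purpose, since it should not be combined with a constant threshold.
cassandra.query.logger.constant.threshold=100

# Pooling options, written out with their documented defaults
cassandra.pooling.local.max.connections=8
cassandra.pooling.local.max.requests=128
# In ms
cassandra.pooling.timeout=5000
# In seconds
cassandra.pooling.heartbeat.timeout=30
cassandra.pooling.max.queue.size=256

# Blob part size: 100 * 1024 = 102400
mailbox.blob.part.size=102400
```

Any property omitted from such a file keeps the default described in its entry above.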

For more details about Cassandra configuration, see the dedicated documentation.

Cassandra migration process

Cassandra schema upgrades imply the creation of new tables. James must therefore be restarted, as new tables are created on restart.

Once done, we ship code that tries to read from the new tables and, if that is not possible, falls back to the old tables. You can thus safely run James without running additional migrations.

On-the-fly migration can be enabled. However, one might want to force the migration in a controlled fashion, and automatically update the current schema version in use (asserting that the old versions are no longer used in the database, as the corresponding tables are empty). Note that this process is safe: we ensure that the migration service is not running concurrently on this James instance, that it does not bump the version upon partial failures, that concurrent version upgrades are idempotent, etc.

These schema updates can be triggered through webadmin when using the Cassandra backend.

Note that progress can currently be tracked through the logs.

Here are the implemented migrations:

From V1 to V2

Migration tag on git repository: cassandra_migration_v1_to_v2

The goal is to create a messageV2 table that replaces the message table. The message table stores both message metadata and blobs, which has proven inefficient. Version 2 instead chunks message blobs and stores them in other tables. The migration process involves moving all messages from the message table to the messageV2 table (which contains only metadata) and to the blobs / blobParts tables.

Read more about this migration here.

Summary of available options for this migration:

migration.v1.v2.on.the.fly
Only available on tag cassandra_migration_v1_to_v2. Optional. Defaults to false.
Controls whether the v1 to v2 migration should be run on the fly.
migration.v1.v2.thread.count
Only available on tag cassandra_migration_v1_to_v2. Optional. Defaults to 2.
Controls the number of threads used to asynchronously migrate from v1 to v2.
migration.v1.v2.queue.length
Only available on tag cassandra_migration_v1_to_v2. Optional. Defaults to 1000.
Controls the queue size of the v1 to v2 migration task. New entries are dropped when the queue is full.
migration.v1.read.fetch.size
Only available on tag cassandra_migration_v1_to_v2. Optional. Defaults to 10.
Controls the fetch size of the request to retrieve all messages stored in V1 during the migration process.
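
A sketch of how these options could be combined to enable on-the-fly migration. It assumes that James runs from the cassandra_migration_v1_to_v2 tag and that the options are set in cassandra.properties like the other options on this page (this section does not name the file). Every value except the first restates a documented default.

```properties
# Enable on-the-fly migration (documented default: false)
migration.v1.v2.on.the.fly=true
# Documented defaults, written out explicitly
migration.v1.v2.thread.count=2
migration.v1.v2.queue.length=1000
migration.v1.read.fetch.size=10
```

Since the queue drops entries when full, raising migration.v1.v2.queue.length or migration.v1.v2.thread.count may be worth considering on busy instances. Suitable values depend on the deployment and are not given here.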

From V2 to V3

Migration tag on git repository: cassandra_migration_v2_to_v3

The goal is to drop the message table. After this migration, this table can be deleted manually.

From V3 to V4

Migration tag on git repository: cassandra_migration_v3_to_v4

Goal is to store attachments in the blob tables.

Summary of available options for this migration:

attachment.v2.migration.read.timeout
Optional. Defaults to one day.
Controls the time, in milliseconds, before a read on attachment v1 times out.
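
For reference, one day expressed in milliseconds is 24 * 60 * 60 * 1000 = 86400000. A sketch of shortening the timeout to one hour, assuming the option is set in cassandra.properties like the other options on this page (the one-hour value is purely illustrative):

```properties
# One hour in milliseconds: 60 * 60 * 1000 = 3600000
# (the documented one-day default corresponds to 86400000)
attachment.v2.migration.read.timeout=3600000
```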

From V4 to V5

Migration tag on git repository: cassandra_migration_v4_to_v5

The goal is to store attachment ids in the separate AttachmentMessageId table.

Summary of available options for this migration:

message.attachmentids.read.timeout
Optional. Defaults to one day.
Controls the time, in milliseconds, before a read of the attachment ids on a message times out.
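
Written out explicitly, the one-day default corresponds to the line below. This is a sketch that assumes the value is expressed in milliseconds, as the description states, and that the option is set in cassandra.properties like the other options on this page.

```properties
# One day in milliseconds: 24 * 60 * 60 * 1000 = 86400000
message.attachmentids.read.timeout=86400000
```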