Apache James Project – Apache James Server 3

Cassandra Configuration

This configuration file allows setting some configuration properties in conjunction with the Cassandra driver's native configuration.

Consult cassandra.properties to get some examples and hints.

Consult cassandra-driver.conf to get some examples and hints regarding native driver configuration.

Cassandra native configuration allows configuring SSL, timeouts, logs and metrics as well as execution profiles.

Note: Cassandra is only available with Guice wiring (cassandra-guice and cassandra-guice-ldap).

cassandra.nodes: List of some nodes of the Cassandra cluster, in the format host:port or host. If the port is not specified, 9042 is used.
cassandra.keyspace.create: Indicates whether the keyspace should be created by James. Optional, default value: false
If set to true James will attempt to create the keyspace when starting up.
cassandra.keyspace: Is the name of the keyspace used by James. Optional, default value: apache_james
cassandra.user: Username used as a credential for contacting the Cassandra cluster. Optional, default is absent, required if cassandra.password is supplied
cassandra.password: Password used as a credential for contacting the Cassandra cluster. Optional, default is absent, required if cassandra.user is supplied
cassandra.replication.factor: Is the replication factor used upon keyspace creation. Modifying this property while the keyspace already exists will have no effect. Optional. Default value 1.
mailbox.read.repair.chance: Optional. Defaults to 0.1 (10% chance).
Must be between 0 and 1 (inclusive). Controls the probability of doing a read-repair upon mailbox read.
mailbox.counters.read.repair.chance.max: Optional. Defaults to 0.1 (10% chance).
Must be between 0 and 1 (inclusive). Upper bound for the probability of doing a read-repair upon mailbox counters read.
Formula: read_repair_chance = min(mailbox.counters.read.repair.chance.max, (100/unseens)*mailbox.counters.read.repair.chance.one.hundred)
mailbox.counters.read.repair.chance.one.hundred: Optional. Defaults to 0.01 (1% chance).
Must be between 0 and 1 (inclusive). Probability of doing a read-repair upon mailbox counters read when unseens equals 100; the chance scales inversely with unseens and is capped by the maximum above.
Formula: read_repair_chance = min(mailbox.counters.read.repair.chance.max, (100/unseens)*mailbox.counters.read.repair.chance.one.hundred)
mailbox.max.retry.acl: Optional. Defaults to 1000.
Controls the number of retries upon Cassandra ACL updates.
mailbox.max.retry.modseq: Optional. Defaults to 100000.
Controls the number of retries upon Cassandra ModSeq generation.
mailbox.max.retry.uid: Optional. Defaults to 100000.
Controls the number of retries upon Cassandra Uid generation.
mailbox.max.retry.message.flags.update: Optional. Defaults to 1000.
Controls the number of retries upon Cassandra flags update, in MessageMapper.
mailbox.max.retry.message.id.flags.update: Optional. Defaults to 1000.
Controls the number of retries upon Cassandra flags update, in MessageIdMapper.
chunk.size.message.read: Optional. Defaults to 100.
Controls the number of messages to be retrieved in parallel.
chunk.size.expunge: Optional. Defaults to 50.
Controls the number of messages to be expunged in parallel.
mailbox.blob.part.size: Optional. Defaults to 102400 (100KB).
Controls the size of blob parts used to store messages.
mailbox.read.strong.consistency: Optional. Boolean, defaults to true. Disabling should be considered experimental. If disabled, the regular consistency level is used for mailbox read transactions. Doing so might result in stale reads, as the system.paxos table will not be checked for the latest updates. Better performance is expected by turning it off. Note that reads performed as part of write transactions are always performed with strong consistency.
message.read.strong.consistency: Optional. Boolean, defaults to true. Disabling should be considered experimental. If disabled, the regular consistency level is used for message read transactions. Doing so might result in stale reads, as the system.paxos table will not be checked for the latest updates. Better performance is expected by turning it off. Note that reads performed as part of write transactions are always performed with strong consistency.
message.write.strong.consistency.unsafe: Optional. Boolean, defaults to true. Disabling should be considered experimental and unsafe. If disabled, lightweight transactions will no longer be used for message operations (table `imapUidTable`). As message flag updates so far rely on a read-before-write model, this exposes you to data races that can lead to lost updates. Better performance is expected by turning it off. Reads performed as part of write transactions are then also performed with relaxed consistency.
cassandra.local.dc: Optional. Allows specifying the local DC as part of the load balancing policy. Specifying it results in the use of new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().withLocalDc(value).build()) as the LoadBalancingPolicy. This value is useful in a multi-DC Cassandra setup. Be aware of the limitations of multi-DC setups for James. Not specifying this value results in the driver's default load balancing policy being used.
optimistic.consistency.level.enabled: Optional. Defaults to false. Allows specifying consistency level ONE for reads in Cassandra BlobStore. Falls back to default read consistency level if the blob is missing.
mailrepository.strong.consistency: Optional. Defaults to true. Allows disabling lightweight transactions in CassandraMailRepository. If disabled, an idempotent behaviour is implemented instead (duplicates are overridden, missing entries upon deletes are ignored).
acl.enabled: Optional. Boolean, defaults to true. Allows disabling ACLs: if set to false, delegation will fail and users will only have access to the mailboxes they own. ACLs can represent a high volume of requests. If you do not offer mailbox sharing features to your users, you can consider disabling them in order to improve performance.
email.change.ttl: Optional. Duration, defaults to 60 days. Cassandra time-to-live for Email change records. Setting the time-to-live to zero disables time-to-live on email changes.
mailbox.change.ttl: Optional. Duration, defaults to 60 days. Cassandra time-to-live for Mailbox change records. Setting the time-to-live to zero disables time-to-live on mailbox changes.
uid.modseq.increment: Optional, defaults to 0. Defensive value added to generated uids and modseqs. This can be used as a heuristic to maintain consistency even when the consensus of lightweight transactions is broken, for example during a disaster recovery process.
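
The formula given for the mailbox counters read-repair chance can be sketched as a small function. This is only an illustration of the arithmetic, not James code: the function and parameter names are made up for the example, and the handling of a zero count of unseens is an assumption, since the formula itself divides by that value.

```python
def counters_read_repair_chance(unseens, chance_max=0.1, chance_one_hundred=0.01):
    """Evaluate min(chance.max, (100 / unseens) * chance.one.hundred).

    The defaults mirror the documented defaults of
    mailbox.counters.read.repair.chance.max and
    mailbox.counters.read.repair.chance.one.hundred.
    """
    if unseens <= 0:
        # Assumption of this sketch: the formula is undefined for 0,
        # so fall back to the configured maximum.
        return chance_max
    return min(chance_max, (100 / unseens) * chance_one_hundred)


if __name__ == "__main__":
    for unseens in (1, 100, 1000):
        print(unseens, counters_read_repair_chance(unseens))
```

With the default values, the chance stays capped at the maximum for small counts and decreases once the scaled term drops below that cap.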

If you want more explanation about Cassandra configuration, you should visit the dedicated documentation.

Cassandra migration process

A Cassandra upgrade implies the creation of new tables. Restarting James is thus needed, as new tables are created on restart.

Once done, we ship code that tries to read from the new tables and, if that is not possible, falls back to the old tables. You can thus safely run without running additional migrations.
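
The read path described above can be pictured as follows. This is a minimal sketch in which plain dictionaries stand in for the new and old Cassandra tables; the function name and the data are hypothetical and do not come from the James code base.

```python
def read_with_fallback(key, new_table, old_table):
    """Read from the new table first; fall back to the old table if the entry is absent."""
    value = new_table.get(key)
    if value is not None:
        return value
    # Not migrated yet: the entry may still live in the old table.
    return old_table.get(key)


if __name__ == "__main__":
    old_table = {"msg-1": "legacy row", "msg-2": "legacy row"}
    new_table = {"msg-2": "migrated row"}
    print(read_with_fallback("msg-1", new_table, old_table))
    print(read_with_fallback("msg-2", new_table, old_table))
```

Entries already present in the new table win, which is why the service can keep running while a migration is only partially done.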

On-the-fly migration can be enabled. However, one might want to force the migration in a controlled fashion and automatically update the current schema version in use (asserting in the database that old versions are no longer used, as the corresponding tables are empty). Note that this process is safe: we ensure the service is not running concurrently on this James instance, that it does not bump the version upon partial failures, that race conditions in version upgrades are idempotent, etc.
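
One of the safety properties above, not bumping the version upon partial failures, can be sketched like this. It is a simplified model under stated assumptions: each migration is a hypothetical callable returning True on full success, and store_version stands in for persisting the schema version. The concurrency guard and the idempotency of racing upgrades are not modelled.

```python
def upgrade(current_version, target_version, migrations, store_version):
    """Run migrations in order, recording a version only after its migration fully succeeded."""
    version = current_version
    while version < target_version:
        migrate = migrations[version]  # migrates from `version` to `version + 1`
        if not migrate():
            # Partial failure: stop here and leave the stored version untouched.
            return version
        version += 1
        store_version(version)
    return version


if __name__ == "__main__":
    recorded = []
    steps = {1: lambda: True, 2: lambda: False, 3: lambda: True}
    print(upgrade(1, 4, steps, recorded.append), recorded)
```

Because the stored version only moves forward after a successful step, re-running the upgrade after a failure resumes from the last completed migration.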

These schema updates can be triggered by webadmin using the Cassandra backend.

Note that currently the progress can be tracked by logs.

Here are the implemented migrations:

From V1 to V2

Last supported in release 3.5.0

Migration tag on git repository: cassandra_migration_v1_to_v2

Goal is to create a messageV2 table that aims at replacing the message table. The message table stores both message metadata and blobs, which has proven inefficient. Instead, version 2 chunks message blobs and stores them in another table. The migration process involves moving all messages from the message table to the messageV2 table (which contains only metadata) and to the blobs / blobParts tables.
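
The idea of chunking a message blob into parts can be illustrated as below. This is a sketch only: the function name is hypothetical, and the part size simply borrows the documented default of mailbox.blob.part.size (102400 bytes) for illustration, without claiming the migration reads that property.

```python
from typing import List


def split_into_parts(blob: bytes, part_size: int = 102400) -> List[bytes]:
    """Split a blob into consecutive parts of at most `part_size` bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [blob[i:i + part_size] for i in range(0, len(blob), part_size)]


if __name__ == "__main__":
    message = b"x" * 250000
    parts = split_into_parts(message)
    print(len(parts), [len(part) for part in parts])
```

Concatenating the parts in order gives back the original blob, so the metadata table only needs to reference the blob rather than embed it.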

From V2 to V3

Last supported in release 3.5.0

Migration tag on git repository: cassandra_migration_v2_to_v3

Goal is to drop message table. After this migration, one can manually delete this table.

From V3 to V4

Last supported in release 3.5.0

Migration tag on git repository: cassandra_migration_v3_to_v4

Goal is to store attachments in the blob tables.

Summary of available options for this migration:

attachment.v2.migration.read.timeout: Optional. Defaults to one day.
Controls how many milliseconds elapse before a read on attachment v1 times out.

From V4 to V5

Last supported in release 3.5.0

Migration tag on git repository: cassandra_migration_v4_to_v5

Goal is to store attachment ids in a separate AttachmentMessageId table.

Summary of available options for this migration:

message.attachmentids.read.timeout: Optional. Defaults to one day.
Controls how many milliseconds elapse before a read of attachment ids on a message times out.

From V5 to V6

Last supported in releases 3.6.x

Goal is to no longer rely on a UDT partition key for mailboxPath tables. Entries will be migrated to the mailboxPathV2 table, which relies on a composite primary key.

From V6 to V7

Last supported in releases 3.6.x

Goal is to populate the mapping_sources projection table. This table allows finding the sources of a given redirection, which is handy for things like mail aliases (for example, listing the aliases that rewrite to bob). Without this projection table being available (i.e. when relying on schema version 6 or less), such information is obtained through an unoptimized full table scan. From schema version 7 onwards, the optimized projection can safely be used.
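
The difference between a full scan and a projection lookup can be sketched with in-memory structures. The addresses, the dictionary layout and the function names below are all made up for the example and do not reflect the actual table schema.

```python
from collections import defaultdict


def sources_by_full_scan(mappings, destination):
    """Scan every mapping to find the sources that rewrite to `destination`."""
    return sorted(src for src, dests in mappings.items() if destination in dests)


def build_sources_projection(mappings):
    """Build a reverse index (destination -> sources), the role a projection table plays."""
    projection = defaultdict(set)
    for source, destinations in mappings.items():
        for destination in destinations:
            projection[destination].add(source)
    return projection


if __name__ == "__main__":
    mappings = {
        "sales@example.org": ["bob@example.org"],
        "support@example.org": ["alice@example.org", "bob@example.org"],
        "info@example.org": ["alice@example.org"],
    }
    projection = build_sources_projection(mappings)
    print(sources_by_full_scan(mappings, "bob@example.org"))
    print(sorted(projection["bob@example.org"]))
```

Both approaches return the same sources; the projection answers with a single keyed lookup, at the cost of having to be populated first, which is what this migration does.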

From V7 to V8

Last supported in releases 3.6.x

Add UID_VALIDITY to mailboxPath table in order not to mandate mailbox table reads.

From V8 to V9

Adopt a more compact representation for message properties.

From V9 to V10

Handles Mailbox ACL transactionality with event-sourcing. We got rid of SERIAL consistency upon reads, thus unlocking a major performance enhancement.

Adding threadId column to message metadata tables

Add threadId column to messageIdTable and imapUidTable in order to get a message's threadId.
