Apache Cassandra versions 3.x and below take an all-or-nothing approach to the datacenter user authorization security model. That is, a user has access to all datacenters in the cluster or to none of them. This has changed to something a little more fine-grained for versions 4.0 and above, all thanks to Blake Eggleston and the work he has done on CASSANDRA-13985.
The Cassandra 4.0 feature added via CASSANDRA-13985 allows an operator to restrict the access of a Cassandra role to specific datacenters. This new shiny feature is effectively datacenter authorization for roles and will help provide better security and protection for multi-datacenter clusters.
Consider the example scenario where a cluster has two datacenters: dc1 and dc2. In this scenario datacenter dc1 backs a web server application that performs Online Transaction Processing, and datacenter dc2 backs an analytics application. The web server application could be restricted via a role to access dc1 only and similarly, the analytics application could be restricted via a role to access dc2 only. The advantage here is that it minimises the reach that each application has to the cluster. If the analytics application was configured incorrectly to connect to dc1 it would fail, rather than quietly running and increasing the load on the dc1 nodes.
The behaviour of the new datacenter authorization feature can be controlled via the cassandra.yaml file using the new setting named network_authorizer. Out of the box it can be set to one of two values:
- AllowAllNetworkAuthorizer - allows any role to access any datacenter, effectively disabling datacenter authorization. This matches the behaviour of Cassandra 3.x and below.
- CassandraNetworkAuthorizer - stores permissions that restrict role access to specific datacenters.
A few points to note about this setting:
- For the network_authorizer setting to work when set to CassandraNetworkAuthorizer, the authenticator setting must be set to PasswordAuthenticator. Otherwise, the node will fail to start.
- When enabling any authorization feature in Cassandra, including this one, always increase the system_auth keyspace replication factor. Failure to do this may result in being locked out of the cluster!
- Further values can be added for custom behaviour by implementing the INetworkAuthorizer interface.
- Apache Cassandra 4.0 will ship with network_authorizer set to a value of AllowAllNetworkAuthorizer in the cassandra.yaml file. This is similar to the existing authorizer setting in Cassandra where no authorization restrictions are applied by default.
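The dependency between the two settings can be checked before a node is started. The following is a minimal sketch only: the function names yaml_setting and check_network_authorizer are hypothetical helpers, not part of Cassandra or ccm, and the sketch assumes the settings appear as uncommented top-level "key: value" lines using the short class names shown above.

```shell
#!/bin/bash
# Hypothetical helper: prints the value of an uncommented, top-level
# "key: value" setting from the given cassandra.yaml file.
yaml_setting() {
    local file="$1" key="$2"
    # Only lines that start with the key are matched, so commented-out
    # or indented entries are ignored. The first match wins.
    sed -n "s/^${key}:[[:space:]]*//p" "${file}" | head -n 1
}

# Hypothetical helper: returns 1 when network_authorizer is set to
# CassandraNetworkAuthorizer without authenticator being set to
# PasswordAuthenticator, and 0 otherwise.
check_network_authorizer() {
    local file="$1"
    local network_authorizer authenticator
    network_authorizer="$(yaml_setting "${file}" network_authorizer)"
    authenticator="$(yaml_setting "${file}" authenticator)"

    if [ "${network_authorizer}" = "CassandraNetworkAuthorizer" ] \
        && [ "${authenticator}" != "PasswordAuthenticator" ]
    then
        echo "network_authorizer requires authenticator: PasswordAuthenticator" >&2
        return 1
    fi
    return 0
}
```

For example, check_network_authorizer conf/cassandra.yaml could be run as a guard in a deployment script. Fully qualified class names and values with trailing comments are not handled by this sketch.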
When network_authorizer is set to CassandraNetworkAuthorizer, CQL can be used to set the datacenter access for a role in a cluster. To help with the setting of permissions, the CQL keyword vocabulary has been extended to include the clauses ACCESS TO ALL DATACENTERS and ACCESS TO DATACENTERS. Both clauses can be added to CQL ROLE statements when either creating or altering a role.
To create a role that has access to all datacenters in a cluster, use the ACCESS TO ALL DATACENTERS clause. For example:
CREATE ROLE foo WITH PASSWORD = '...' AND LOGIN = true AND ACCESS TO ALL DATACENTERS;
Similarly, a role can be altered to have access to all datacenters. For example:
ALTER ROLE foo WITH ACCESS TO ALL DATACENTERS;
To create a role that is restricted to specific datacenters, use the clause ACCESS TO DATACENTERS followed by a set containing the datacenters the role is authorized to access. The datacenter names are string literals, i.e. quoted and comma separated. For example, use the following CQL to restrict the access of a role to datacenters dc1 and dc3 only:
CREATE ROLE foo WITH PASSWORD = '...' AND LOGIN = true
AND ACCESS TO DATACENTERS {'dc1', 'dc3'};
Similarly, a role can be altered to have restricted access. For example:
ALTER ROLE foo WITH ACCESS TO DATACENTERS {'dc1', 'dc3'};
If the ACCESS TO DATACENTERS {...} clause is omitted from a CREATE ROLE command, then the new role will have access to all datacenters in the cluster. In this specific case, it is equivalent to adding the ACCESS TO ALL DATACENTERS clause to the CREATE ROLE command.
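When roles are created from a script, the two clauses can be generated from a list of datacenter names. The following is a minimal sketch; dc_access_clause and create_role_cql are hypothetical helper names introduced here for illustration, and the sketch assumes role names, passwords and datacenter names contain no quote characters, as no escaping is done.

```shell
#!/bin/bash
# Hypothetical helper: prints the datacenter access clause for a role.
# With no arguments the role is unrestricted; otherwise each argument
# is treated as a datacenter name and added to the quoted set.
dc_access_clause() {
    if [ "$#" -eq 0 ]
    then
        echo "ACCESS TO ALL DATACENTERS"
        return 0
    fi
    local clause="" dc
    for dc in "$@"
    do
        # Prefix a comma separator for every name after the first.
        clause="${clause:+${clause}, }'${dc}'"
    done
    echo "ACCESS TO DATACENTERS {${clause}}"
}

# Hypothetical helper: prints a CREATE ROLE statement for a login role.
# Usage: create_role_cql <role> <password> [datacenter ...]
create_role_cql() {
    local role="$1" password="$2"
    shift 2
    echo "CREATE ROLE ${role} WITH PASSWORD = '${password}' AND LOGIN = true AND $(dc_access_clause "$@");"
}
```

The printed statement can be pasted into a CQL session or passed to cqlsh with its -e option. Calling create_role_cql with no datacenter names emits the explicit ACCESS TO ALL DATACENTERS form rather than relying on the default.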
Here is a quick demo of the feature in action. The following demo uses ccm to launch a cluster running a trunk build of Apache Cassandra at commit 2fe4b9d. The cluster will have two datacenters with a single node in each, and the network_authorizer feature will be enabled on each node. The scripts to set up ccm and the cluster are included inline as well.
Set up ccm to use the local build of commit 2fe4b9d for the Cassandra libraries by running the following script.
#!/bin/bash
set -e
# The path to a locally built Apache Cassandra repository is required.
if [ -z "${1}" ]
then
    echo "Apache Cassandra repository path required."
    exit 1
fi
CCM_CASSANDRA_VERSION="4.0.0"
CCM_CASSANDRA_REPOSITORY_PATH=".ccm/repository/${CCM_CASSANDRA_VERSION}"
CASSANDRA_DIR_PATH=${1}
CASSANDRA_SUB_DIR_LIST="bin build conf lib pylib tools"
echo "Building CCM ${CCM_CASSANDRA_VERSION} repository"
mkdir -p ~/${CCM_CASSANDRA_REPOSITORY_PATH}
# CCM reads the version of a repository entry from this file.
echo ${CCM_CASSANDRA_VERSION} > ~/${CCM_CASSANDRA_REPOSITORY_PATH}/0.version.txt
# Copy each required directory from the local build into the CCM repository.
for dir_name in ${CASSANDRA_SUB_DIR_LIST}
do
    echo "Copying directory ${CASSANDRA_DIR_PATH}/${dir_name} to CCM ${CCM_CASSANDRA_VERSION} repository"
    mkdir -p ~/${CCM_CASSANDRA_REPOSITORY_PATH}/${dir_name}
    cp -r ${CASSANDRA_DIR_PATH}/${dir_name}/* ~/${CCM_CASSANDRA_REPOSITORY_PATH}/${dir_name}
done
Create the ccm cluster which uses the libraries from commit 2fe4b9d by running the following script.
#!/bin/bash
set -e
CLUSTER_NAME="${1:-dc-security-demo}"
# Remove any previous cluster with the same name. The "|| true" stops
# "set -e" from aborting the script if the removal fails, e.g. on a first run.
ccm remove ${CLUSTER_NAME} || true
echo "Creating cluster '${CLUSTER_NAME}'"
ccm create ${CLUSTER_NAME} -v 4.0.0
# Modifies the configuration of a node in the CCM cluster.
function update_node_config {
    CASSANDRA_YAML_SETTINGS="authenticator:PasswordAuthenticator \
                            endpoint_snitch:GossipingPropertyFileSnitch \
                            network_authorizer:CassandraNetworkAuthorizer \
                            num_tokens:32 \
                            seeds:127.0.0.1,127.0.0.2"
    for key_value_setting in ${CASSANDRA_YAML_SETTINGS}
    do
        setting_key=$(echo ${key_value_setting} | cut -d':' -f1)
        setting_val=$(echo ${key_value_setting} | cut -d':' -f2)
        sed -ie "s/${setting_key}\:\ .*/${setting_key}:\ ${setting_val}/g" \
            ~/.ccm/${CLUSTER_NAME}/node${1}/conf/cassandra.yaml
    done
    # Place each node in its own datacenter: node1 in dc1, node2 in dc2.
    sed -ie "s/dc=.*/dc=dc${1}/g" \
        ~/.ccm/${CLUSTER_NAME}/node${1}/conf/cassandra-rackdc.properties
    sed -ie 's/\#MAX_HEAP_SIZE=\"4G\"/MAX_HEAP_SIZE=\"1G\"/g' \
        ~/.ccm/${CLUSTER_NAME}/node${1}/conf/cassandra-env.sh
    sed -ie 's/\#HEAP_NEWSIZE=\"800M\"/HEAP_NEWSIZE=\"250M\"/g' \
        ~/.ccm/${CLUSTER_NAME}/node${1}/conf/cassandra-env.sh
}
NUMBER_NODES=2
for node_num in $(seq ${NUMBER_NODES})
do
    echo "Adding 'node${node_num}'"
    ccm add node${node_num} \
        -i 127.0.0.${node_num} \
        -j 7${node_num}00 \
        -r 0 \
        -b \
        -s
    update_node_config ${node_num}
    # Localhost aliases
    echo "ifconfig lo0 alias 127.0.0.${node_num} up"
    sudo ifconfig lo0 alias 127.0.0.${node_num} up
done
sed -ie 's/use_vnodes\:\ false/use_vnodes:\ true/g' \
    ~/.ccm/${CLUSTER_NAME}/cluster.conf
Check the cluster nodes were created.
anthony@Anthonys-MacBook-Pro ~/ > ccm status
Cluster: 'dc-security-demo'
---------------------------
node1: DOWN (Not initialized)
node2: DOWN (Not initialized)
Start the nodes in the cluster.
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 start
anthony@Anthonys-MacBook-Pro ~/ > ccm node2 start
Check that the cluster is up and running as expected.
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 115.46 KiB 32 100.0% 7dafff97-e2c5-4e70-a6a9-523f5594671b rack1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.2 67.05 KiB 32 100.0% 437e3bca-d0b7-4102-bc56-201b96856f01 rack1
Start a CQL session with the cluster and increase the system_auth replication.
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 cqlsh -u cassandra -p cassandra
Connected to dc-security-demo at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 4.0-SNAPSHOT | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
cassandra@cqlsh> ALTER KEYSPACE system_auth
WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 1, 'dc2' : 1};
Warnings :
When increasing replication factor you need to run a full (-full) repair to distribute the data.
cassandra@cqlsh> exit;
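For clusters with more datacenters, the ALTER KEYSPACE statement above can be generated rather than typed by hand. The following is a minimal sketch; system_auth_replication_cql is a hypothetical helper name, and it assumes each argument is a name:replication_factor pair with no quote characters in the datacenter name.

```shell
#!/bin/bash
# Hypothetical helper: prints an ALTER KEYSPACE statement that sets the
# system_auth replication using NetworkTopologyStrategy.
# Usage: system_auth_replication_cql <dc>:<rf> [<dc>:<rf> ...]
system_auth_replication_cql() {
    local map="'class' : 'NetworkTopologyStrategy'" pair dc rf
    for pair in "$@"
    do
        dc="${pair%%:*}"   # text before the first colon
        rf="${pair##*:}"   # text after the last colon
        map="${map}, '${dc}' : ${rf}"
    done
    echo "ALTER KEYSPACE system_auth WITH REPLICATION = {${map}};"
}
```

Running system_auth_replication_cql dc1:1 dc2:1 prints the same statement used in the session above. The replication factors themselves remain an operational decision; the sketch only formats them.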
Repair the system_auth keyspace on both nodes.
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 nodetool repair system_auth
...
anthony@Anthonys-MacBook-Pro ~/ > ccm node2 nodetool repair system_auth
...
Start another CQL session and create a few roles with different datacenter restrictions.
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 cqlsh -u cassandra -p cassandra
Connected to dc-security-demo at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 4.0-SNAPSHOT | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
cassandra@cqlsh> CREATE ROLE foo WITH PASSWORD = 'foo' AND LOGIN = true
AND ACCESS TO DATACENTERS {'dc1'};
cassandra@cqlsh> CREATE ROLE bar WITH PASSWORD = 'bar' AND LOGIN = true
AND ACCESS TO DATACENTERS {'dc2'};
cassandra@cqlsh> SELECT * FROM system_auth.network_permissions;
role | dcs
-----------+---------
roles/foo | {'dc1'}
roles/bar | {'dc2'}
(2 rows)
cassandra@cqlsh> exit;
Test the datacenter access for the newly created roles.
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 cqlsh -u foo -p foo
Connected to dc-security-demo at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 4.0-SNAPSHOT | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
foo@cqlsh> exit;
anthony@Anthonys-MacBook-Pro ~/ > ccm node2 cqlsh -u foo -p foo
Connection error: ('Unable to connect to any servers', {'127.0.0.2': Unauthorized('Error from server: code=2100 [Unauthorized] message="You do not have access to this datacenter"',)})
anthony@Anthonys-MacBook-Pro ~/ > ccm node1 cqlsh -u bar -p bar
Connection error: ('Unable to connect to any servers', {'127.0.0.1': Unauthorized('Error from server: code=2100 [Unauthorized] message="You do not have access to this datacenter"',)})
anthony@Anthonys-MacBook-Pro ~/ > ccm node2 cqlsh -u bar -p bar
Connected to dc-security-demo at 127.0.0.2:9042.
[cqlsh 5.0.1 | Cassandra 4.0-SNAPSHOT | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
bar@cqlsh>
As can be seen from the output above, a role is unable to establish a CQL session on a node in a particular datacenter unless it has been granted permissions to do so.
Apache Cassandra 4.0 is definitely shaping up to be an exciting new release of the database! The datacenter authorization feature is useful for hardening the security of a cluster by limiting the reach of roles and applications talking to the cluster. It is designed to be used in conjunction with other authorization features to create roles that have specific purposes in a cluster. Stay tuned as we post more new features and updates that will be part of Cassandra 4.0.