Cassandra.Link

1/12/2021

Reading time: 4 min

How to install the Apache Cassandra NoSQL database server on Ubuntu 20.04

by Jack Wallen

Published in Big Data on November 5, 2020, 8:49 AM PST

Want to up your database game? If so, you probably need a NoSQL database. Jack Wallen shows you how to install Apache Cassandra to scratch that itch.

Big business requires big data. For your company to handle such massive amounts of data, you'll need a NoSQL database. Which of the available databases is best suited to your task? With quite a few to consider, the choice can be daunting. Fortunately, a number of those options are open source and easy to deploy, so you can quickly kick the tires and see whether one fits your needs.

One such NoSQL database is Apache Cassandra, which offers linear scalability, high availability, and no single point of failure, and runs on commodity hardware. Cassandra is used by a number of very large companies (including Reddit, Netflix, and GitHub).

I want to walk you through the process of installing Apache Cassandra on Ubuntu Server 20.04.

What you'll need

  • A running instance of Ubuntu Server 20.04

  • A user with sudo privileges

How to install Java

Apache Cassandra requires Java. The 3.11 release installed below needs Java 8, so we install OpenJDK 8 from the standard repository.

Log in to your Ubuntu Server and install Java with the command:

sudo apt install openjdk-8-jdk -y
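
Ubuntu 20.04 can have several Java runtimes installed side by side, so it may be worth confirming that the default java is the 8 release before continuing. The following is a minimal sketch and not part of the original steps: is_java8 is a hypothetical helper name, and it assumes OpenJDK 8 reports its version in the 1.8.x form.

```shell
# Hypothetical helper: succeeds if a `java -version` banner reports a 1.8 (Java 8) runtime.
is_java8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it before taking the first line.
  banner="$(java -version 2>&1 | head -n 1)"
  if is_java8 "$banner"; then
    echo "Java 8 detected: $banner"
  else
    echo "Java found, but not version 8: $banner"
  fi
else
  echo "java is not on the PATH"
fi
```

If a different version is reported, sudo update-alternatives --config java lets you choose which installed runtime is the default.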

When that installation completes, you're ready to install Apache Cassandra.

How to install Apache Cassandra

Apache Cassandra can't be installed from the standard repositories, so we need to add the official repository. To let apt fetch packages over HTTPS, first install apt-transport-https with the command:

sudo apt install apt-transport-https -y

Once that dependency is taken care of, download and install the necessary GPG key with the command:

wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -

Create an apt list file for Apache Cassandra with the command:

sudo nano /etc/apt/sources.list.d/cassandra.list

In that file, add the following:

deb http://www.apache.org/dist/cassandra/debian 311x main

Save and close the file.
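
If you would rather not open an editor, the same line can be produced from the shell. This is a minimal sketch: cassandra_repo_line is a hypothetical helper that only prints the repository line used above, taking the release series (311x here) as its argument.

```shell
# Hypothetical helper: prints the apt source line for a given Cassandra release series.
cassandra_repo_line() {
  # $1 = release series, e.g. 311x for the 3.11 line used in this article
  printf 'deb http://www.apache.org/dist/cassandra/debian %s main\n' "$1"
}

# Print the line for review before writing it anywhere.
cassandra_repo_line 311x
```

Piping that output into sudo tee /etc/apt/sources.list.d/cassandra.list creates the same file as the nano step.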

Update apt with the command:

sudo apt-get update

Finally, install Apache Cassandra with the command:

sudo apt-get install cassandra -y

When the installation completes, verify it with the command:

nodetool status

You should see the database server is running on datacenter1 (Figure A).

Figure A: Apache Cassandra is up and running on Ubuntu Server 20.04.
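
In the nodetool status output, each node line begins with a two-letter state code, and UN means Up and Normal. Here is a small sketch for checking this from a script. The name count_up_normal is a hypothetical helper, and the guard simply skips the check when nodetool is not installed.

```shell
# Hypothetical helper: reads `nodetool status` output on stdin and prints
# how many node lines start with UN (Up/Normal).
count_up_normal() {
  # grep -c prints 0 but exits non-zero when nothing matches; `|| true` hides that status.
  grep -c '^UN' || true
}

if command -v nodetool >/dev/null 2>&1; then
  up="$(nodetool status 2>/dev/null | count_up_normal)"
  echo "Nodes in Up/Normal state: $up"
else
  echo "nodetool not found; is Cassandra installed?"
fi
```

Cassandra can take a little while to start, so a count of 0 right after installation may only mean the node is still coming up.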

How to change the name of your database cluster

By default, your Apache Cassandra cluster will be named "Test Cluster." Let's rename that. To do so, you must first access the Apache Cassandra console with the command:

cqlsh

At the console, rename the cluster with the command:

UPDATE system.local SET cluster_name = 'NAME' WHERE KEY = 'local';

Where NAME is the new name of your cluster. 

Exit the console with the command:

exit
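
The same statement can be run without entering the interactive console, because cqlsh accepts a statement through its -e option. In this minimal sketch, rename_cluster_cql is a hypothetical helper that only builds the statement, and it assumes the name contains no single quotes.

```shell
# Hypothetical helper: builds the rename statement for a given cluster name.
rename_cluster_cql() {
  # $1 = new cluster name (must not contain a single quote)
  printf "UPDATE system.local SET cluster_name = '%s' WHERE KEY = 'local';\n" "$1"
}

# Print the statement for review before running it.
rename_cluster_cql 'NAME'
```

To apply it, pass the output to cqlsh: cqlsh -e "$(rename_cluster_cql 'NAME')". The name stored here and the one in cassandra.yaml must match, because Cassandra refuses to start when the saved cluster name differs from the configured one.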

Now we need to change the name in the Apache Cassandra configuration file. Open the file for editing with the command:

sudo nano /etc/cassandra/cassandra.yaml

Look for the line starting with cluster_name and change it to the same name you used in the UPDATE command (Figure B).

Figure B: The Apache Cassandra configuration file is where you configure the cluster name.
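
The edit can also be made with sed instead of an editor. This is a sketch under stated assumptions: set_cluster_name is a hypothetical helper, it relies on GNU sed as shipped with Ubuntu, and it is demonstrated on a scratch file with sample contents rather than on the live configuration.

```shell
# Hypothetical helper: rewrites the cluster_name line of a cassandra.yaml file in place.
set_cluster_name() {
  # $1 = path to a cassandra.yaml file, $2 = new cluster name
  sed -i "s/^cluster_name:.*/cluster_name: '$2'/" "$1"
}

# Demonstrate on a scratch file (sample contents, not the real configuration).
scratch="$(mktemp)"
printf "cluster_name: 'Test Cluster'\nnum_tokens: 256\n" > "$scratch"
set_cluster_name "$scratch" 'NAME'
grep '^cluster_name:' "$scratch"
rm -f "$scratch"
```

Against the real file, the same expression would be run as sudo sed -i "s/^cluster_name:.*/cluster_name: 'NAME'/" /etc/cassandra/cassandra.yaml. Names containing / or & would need escaping first.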

Flush the system keyspace to disk, so the new name is persisted, with the command:

nodetool flush system

Restart Apache Cassandra with the command:

sudo systemctl restart cassandra

Now, when you log in to the Apache Cassandra console, you should see the new name of your cluster is in effect (Figure C).

Figure C: We've renamed our cluster.
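
The new name can also be confirmed from the shell. nodetool describecluster prints a Name: line, and cqlsh -e "SELECT cluster_name FROM system.local;" reads the value stored in the system table. The sketch below extracts the name from the nodetool output. The name cluster_name_from_describe is a hypothetical helper.

```shell
# Hypothetical helper: reads `nodetool describecluster` output on stdin
# and prints the value of its Name: line.
cluster_name_from_describe() {
  sed -n 's/^[[:space:]]*Name:[[:space:]]*//p' | head -n 1
}

if command -v nodetool >/dev/null 2>&1; then
  echo "Cluster name: $(nodetool describecluster 2>/dev/null | cluster_name_from_describe)"
else
  echo "nodetool not found; is Cassandra installed?"
fi
```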

How to open Apache Cassandra for remote connection

By default, Apache Cassandra listens only for local connections. That's fine if the applications that use the database are hosted on the same machine. If you need to use Apache Cassandra as a remote database, you must enable remote connections. To do this, open the configuration file again with the command:

sudo nano /etc/cassandra/cassandra.yaml

In that file, you must change three lines. The first is:

start_rpc: false

Change the above to:

start_rpc: true

The next line is:

rpc_address: localhost

Change the above line to:

rpc_address: 0.0.0.0

The next line needs to be uncommented (by removing the # character) and changed to:

broadcast_rpc_address: SERVER_IP

Where SERVER_IP is the IP address of your hosting server.
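
Before restarting, it can help to list the three settings as Cassandra will read them. In this minimal sketch, show_rpc_settings is a hypothetical helper, and the path is the one used throughout this article.

```shell
# Hypothetical helper: prints only the active (uncommented) lines for the three settings.
show_rpc_settings() {
  # $1 = path to a cassandra.yaml file
  grep -E '^(start_rpc|rpc_address|broadcast_rpc_address):' "$1"
}

CONF=/etc/cassandra/cassandra.yaml
if [ -r "$CONF" ]; then
  show_rpc_settings "$CONF" || echo "none of the three settings are active in $CONF"
else
  echo "$CONF not found or not readable"
fi
```

A line that is still commented out with # does not appear in the output, which makes a forgotten broadcast_rpc_address easy to spot.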

Save and close the file. Flush and restart Apache Cassandra with these commands:

nodetool flush system
sudo systemctl restart cassandra

You should now be able to connect to Apache Cassandra from remote servers.
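
Before pointing an application at the server, it can help to confirm from the remote machine that the port is reachable. CQL clients such as cqlsh connect to the native transport port, which is 9042 by default. This minimal sketch assumes bash and the coreutils timeout command are available on the client. The name cql_port_open is a hypothetical helper, and SERVER_IP is the same placeholder as above.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
cql_port_open() {
  # $1 = host, $2 = port. Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly;
  # inside the inner command, $0 is the host and $1 is the port.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# SERVER_IP is a placeholder; replace it with the address of your Cassandra host.
if cql_port_open SERVER_IP 9042; then
  echo "port 9042 is reachable"
else
  echo "port 9042 is not reachable"
fi
```

If the port is reachable, cqlsh SERVER_IP 9042 opens a console against the remote node.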

And that's how you can easily get this powerful, open source NoSQL database server installed. Enjoy your newfound ability to scale your data to meet enterprise-level needs.

Image: Andrey Popov
