What is Cassandra?
Cassandra is a distributed database for managing large amounts of structured data. It offers horizontal scalability and high availability; because of its decentralized design, it has no single point of failure. Some key points about it:
- Scalable: Cassandra supports horizontal scalability. Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.
- Highly available:
  - Decentralized: There is no single point of failure. Every node in the cluster is identical (no master/slave notion).
  - Fault tolerant: Data is automatically replicated to multiple nodes for fault tolerance. Failed nodes can be replaced without any downtime. Replication across multiple data centers is supported.
Set up a multi-node cluster on Ubuntu 16.04:
Prerequisites
- Three machines running Ubuntu 16.04.
- The machines should be able to communicate with one another over the network.
NOTE: Repeat the steps below on each machine.
1. Installing the Oracle JVM:
- sudo add-apt-repository ppa:webupd8team/java
- sudo apt-get update
- sudo apt-get install oracle-java8-set-default
- java -version
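The last command above prints the installed JVM version. A minimal sketch of checking that output from a script is shown below; `is_java8` is a hypothetical helper name (not part of Java or Cassandra), and it assumes the version string contains `version "1.8.`, which is how Java 8 runtimes typically report themselves.

```shell
# Hypothetical helper (not from the original post): succeeds when the given
# `java -version` output reports a 1.8.x (Java 8) runtime.
is_java8() {
    # $1: text printed by `java -version`
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# `java -version` writes to stderr, so redirect it before capturing.
if command -v java >/dev/null 2>&1; then
    if is_java8 "$(java -version 2>&1)"; then
        echo "Java 8 detected"
    else
        echo "A JVM was found, but it does not report version 1.8" >&2
    fi
else
    echo "java is not on the PATH" >&2
fi
```

Running this on each machine before installing Cassandra is a quick way to confirm that all three nodes have the same Java version.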
2. Installing Cassandra:
- echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
- curl -L https://debian.datastax.com/debian/repo_key | sudo apt-key add -
- sudo apt-get update
- sudo apt-get install dsc30
- sudo apt-get install cassandra-tools
3. Connecting to the cluster:
- sudo nodetool status
- cqlsh
You should be able to see the cqlsh prompt.
4. Create a ring -- Deleting default data
- sudo service cassandra stop
- sudo rm -rf /var/lib/cassandra/data/system/*
5. Create a ring -- Configuring the cluster
- Modify /etc/cassandra/cassandra.yaml so that it contains the following settings:
cluster_name: 'cassan'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "<server1 ip>,<server2 ip>"
listen_address: <local server ip>
rpc_address: <local server ip>
auto_bootstrap: false
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
endpoint_snitch: SimpleSnitch
start_native_transport: true
native_transport_port: 9042
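Since the same file has to be edited on three machines, the per-node values can also be applied from a script. The following is only a sketch: `set_yaml_key` and `set_seeds` are hypothetical helper names, they do plain line-based text substitution rather than real YAML parsing, and they assume the values contain no `|` or `&` characters (IP addresses and simple names are fine).

```shell
# set_yaml_key FILE KEY VALUE
# Replaces a top-level "KEY: ..." line in FILE, or appends one if it is absent.
set_yaml_key() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}:" "$file"; then
        # Write through a temp file, then copy back so FILE keeps its permissions.
        sed "s|^${key}:.*|${key}: ${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" && rm -f "$file.tmp"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}

# set_seeds FILE "ip1,ip2"
# Rewrites the indented '- seeds: "..."' line under seed_provider.
set_seeds() {
    file=$1 seeds=$2
    sed "s|^\([[:space:]]*- seeds:\).*|\1 \"${seeds}\"|" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}

# Example usage on a scratch copy (the IP addresses are placeholders):
#   cp /etc/cassandra/cassandra.yaml /tmp/cassandra.yaml
#   set_yaml_key /tmp/cassandra.yaml cluster_name "'cassan'"
#   set_yaml_key /tmp/cassandra.yaml listen_address 10.10.0.32
#   set_yaml_key /tmp/cassandra.yaml rpc_address 10.10.0.32
#   set_yaml_key /tmp/cassandra.yaml auto_bootstrap false
#   set_seeds /tmp/cassandra.yaml "10.10.0.32,10.10.0.102"
```

Trying the helpers on a copy first and comparing the result with the original using diff is safer than editing /etc/cassandra/cassandra.yaml directly.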
6. Create a ring -- Configuring the firewall
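To allow communication between the nodes, we need to open two network ports on each node: 7000 (inter-node cluster communication) and 9042 (the native transport port used by CQL clients such as cqlsh).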
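- sudo apt-get install -y iptables-persistent
- Add the following rule to /etc/iptables/rules.v4, once for each of the other servers:
  -A INPUT -p tcp -s <your_other_server_ip> -m multiport --dports 7000,9042 -m state --state NEW,ESTABLISHED -j ACCEPT
- sudo service netfilter-persistent restart (on Ubuntu 16.04 the iptables-persistent package is managed through the netfilter-persistent service)
- sudo service cassandra start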
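Each node needs one such rule per peer, so with three machines every node gets two rules. A small sketch that prints them is shown below; `cassandra_rules` is a hypothetical helper name and the addresses are placeholders. The printed lines still have to be pasted into /etc/iptables/rules.v4 by hand, inside the *filter section and before its COMMIT line.

```shell
# Hypothetical helper: prints one ACCEPT rule per peer IP given as an argument.
cassandra_rules() {
    for peer in "$@"; do
        printf '%s\n' "-A INPUT -p tcp -s ${peer} -m multiport --dports 7000,9042 -m state --state NEW,ESTABLISHED -j ACCEPT"
    done
}

# Placeholder addresses: substitute the IPs of the other two nodes.
cassandra_rules 10.10.0.102 10.10.0.4
```

The function only prints text and does not touch the firewall, so it is safe to run before deciding where the rules should go.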
Check the cluster status:
- sudo nodetool status
You should see something like the following:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.10.0.32 123.93 KB 1 74.7% 6fa993f1-07f7-4368-8ee5-c52cedae3843 rack1
UN 10.10.0.102 152.18 KB 1 12.4% 64c4c449-3949-4c83-a0a7-86b084a58d5c rack1
UN 10.10.0.4 229.86 KB 1 12.9% 83cd40ec-3e64-43ea-87a9-65bc8a90bd1d rack1
- You should also be able to reach the cqlsh prompt on any node:
cqlsh <serverip> 9042
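The status check can also be scripted. In the nodetool status output, the first column of each node line holds the status and state letters, and UN means Up and Normal. The sketch below counts those lines; `count_up_normal` is a hypothetical helper name, and the expected count of 3 is simply the number of machines used in this walkthrough.

```shell
# count_up_normal: reads `nodetool status` output on stdin and prints how many
# node lines start with UN (Up/Normal).
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Only query the cluster when nodetool is actually installed.
if command -v nodetool >/dev/null 2>&1; then
    up=$(nodetool status 2>/dev/null | count_up_normal)
    if [ "$up" -eq 3 ]; then
        echo "All 3 nodes are Up/Normal"
    else
        echo "Only $up of 3 nodes are Up/Normal" >&2
    fi
fi
```

If fewer nodes are reported than expected, the firewall rules and the seeds list in cassandra.yaml are the first things to re-check.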
Congratulations! You now have a multi-node Cassandra cluster running.
References:
https://docs.datastax.com/en/cassandra/2.0/cassandra/initialize/initializeSingleDS.html
https://www.digitalocean.com/community/tutorials/how-to-run-a-multi-node-cluster-database-with-cassandra-on-ubuntu-14-04