Cassandra Insert column example
Cassandra Inserting columns into ColumnFamily
Tutorial Details:
Program: Cassandra
Difficulty: Moderate
Estimated Completion Time: 15 minutes
Welcome back to learning how to develop Cassandra-based applications in the Java programming language. In the previous Cassandra tutorial we demonstrated how to delete columns from a Cassandra column family. In this tutorial we explore how to insert a new column into an existing Cassandra column family row.
You can also watch a Screencast of this tutorial (compact version)
The version of Cassandra used in this tutorial is 0.7.3 with Thrift 0.5.
We won’t get into the details of Cassandra’s data model, for example, what is a keyspace, column family etc… Instead we will jump right into the Thrift API and start walking down the code required to do a column insert. If you’re a Cassandra newbie and haven’t fully got your head around the Cassandra data model, then I would recommend you consult the Apache Cassandra documentation and spend some time learning more about Cassandra’s data model.
As a brief reminder, keep in mind that one key difference between Cassandra and a traditional relational database is that Cassandra supports a variable number of columns per row in any given column family.
In this tutorial we will take an existing column family and append a new column to an existing row. Our column family is of type User and contains one row with the id “100” and two columns: username and password. We will explore how to add a new column “description” with the value “I’m a nice guy”.
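To make the "variable number of columns per row" idea concrete, here is a minimal in-memory sketch of a column family as a map of rows, where each row is its own sorted map of columns. This is only a toy model for illustration and not how Cassandra stores data; the class name ColumnFamilySketch, the row id "200" and the user "jdoe" are made up for this example.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy model of a column family: row id -> (column name -> value). */
final class ColumnFamilySketch {
    // Each row owns its own sorted set of columns, so rows may differ in shape.
    private final Map<String, SortedMap<String, String>> rows = new TreeMap<>();

    /** Adds or overwrites one column in the given row, creating the row if needed. */
    void insert(String rowId, String columnName, String value) {
        rows.computeIfAbsent(rowId, k -> new TreeMap<>()).put(columnName, value);
    }

    /** Returns the columns of a row, or an empty map if the row does not exist. */
    SortedMap<String, String> get(String rowId) {
        return rows.getOrDefault(rowId, new TreeMap<>());
    }

    public static void main(String[] args) {
        ColumnFamilySketch user = new ColumnFamilySketch();
        user.insert("100", "username", "abelperez");
        user.insert("100", "password", "drowssap");
        // Adding a third column to row 100 needs no schema change.
        user.insert("100", "description", "I'm a nice guy");
        // A hypothetical second row with a single column.
        user.insert("200", "username", "jdoe");

        System.out.println(user.get("100").keySet()); // prints [description, password, username]
        System.out.println(user.get("200").keySet()); // prints [username]
    }
}
```

Row "100" ends up with three columns while row "200" has only one, which is exactly the flexibility the rest of this tutorial relies on.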
Before we start coding, let's set up our test environment. We will use a test keyspace named "tutorials" and a column family User with two columns: username and password. Using your favorite text editor, save the following commands in a file named cassandra-insert-tutorial.txt in your Cassandra installation directory or whatever directory you feel comfortable with.
create keyspace tutorials;
use tutorials;
create column family User with
comparator = UTF8Type and
column_metadata =
[
{column_name: username, validation_class: UTF8Type},
{column_name: password, validation_class: UTF8Type}
];
Now run the following command to execute the schema definition in our cassandra-insert-tutorial.txt file. Then start the command line interface again and add our dummy data row to the User column family.
> bin/cassandra-cli -h localhost -p 9160 -f cassandra-insert-tutorial.txt
> bin/cassandra-cli -h localhost -p 9160
use tutorials;
set User['100']['username'] = 'abelperez';
set User['100']['password'] = 'drowssap';
get User['100'];
Your output should show two columns for the User column family.
Now that the environment has been set up, the first thing we need to do is get a connection to Cassandra via the Thrift Client API. We need to define a transport with details about a Cassandra node. The details include the hostname and port that Cassandra is running on. Once we define the transport and a binary protocol, we can instantiate a Cassandra.Client instance with the protocol specified in the constructor and open the connection through the transport object.
TTransport transport = new TFramedTransport(new TSocket(host, port));
TProtocol protocol = new TBinaryProtocol(transport);
Cassandra.Client client = new Cassandra.Client(protocol);
transport.open();
Once our connection is open we need to specify the keyspace we want to use in this session. A keyspace is the namespace for column families in our Cassandra instance. The keyspace is assigned via the set_keyspace() function of the Cassandra.Client API. We set the keyspace to "tutorials".
client.set_keyspace("tutorials");
The next thing we need to do is define a ColumnParent with the name of the column family to which we plan to add a new column. A column parent is used when working with groups of columns that belong to the same column family.
ColumnParent parent = new ColumnParent("User");
Once we define the column parent, we need to specify the id of the row we want to add a new column to. Row ids are stored in binary format, so we need to convert the value of the string “100” into a Java byte buffer.
ByteBuffer rowid = ByteBuffer.wrap("100".getBytes());
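Note that getBytes() without an argument uses the platform default character set. If you want the encoding to be the same on every machine, one option is a small helper like the sketch below. The class and method names are hypothetical, and StandardCharsets assumes Java 7 or later (on Java 6 you could use Charset.forName("UTF-8") instead).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for converting between strings and byte buffers. */
final class ByteBufferKeys {
    /** Encodes a string as UTF-8 and wraps the bytes in a ByteBuffer. */
    static ByteBuffer toByteBuffer(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes a buffer back into a string without moving the caller's position. */
    static String fromByteBuffer(ByteBuffer buffer) {
        // duplicate() shares the bytes but has its own position and limit.
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer rowid = toByteBuffer("100");
        System.out.println(rowid.remaining());     // prints 3
        System.out.println(fromByteBuffer(rowid)); // prints 100
    }
}
```

With this helper the row id line could be written as ByteBuffer rowid = ByteBufferKeys.toByteBuffer("100");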
Next we define the column we wish to add to our column family and set its name, value and timestamp. Note: the timestamp is used for conflict resolution, which is outside the scope of this tutorial, so you can ignore it for now. Both name and value are stored in binary format, so we convert each string to bytes.
Column description = new Column();
description.setName("description".getBytes());
description.setValue("I'm a nice guy".getBytes());
description.setTimestamp(System.currentTimeMillis());
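Although conflict resolution is outside the scope of this tutorial, the basic idea behind the timestamp can be sketched in a few lines: when two writes target the same column, the one with the higher timestamp wins. The toy model below only illustrates that rule. It is not Cassandra code, the class names are made up, and it does not model how ties are handled.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of timestamp-based conflict resolution for the columns of one row. */
final class LastWriteWinsSketch {
    /** A column value together with the client-supplied timestamp. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Cell> columns = new HashMap<>();

    /** Keeps the incoming write only if its timestamp is strictly newer than the stored one. */
    void write(String name, String value, long timestamp) {
        Cell current = columns.get(name);
        if (current == null || timestamp > current.timestamp) {
            columns.put(name, new Cell(value, timestamp));
        }
    }

    /** Returns the current value of a column, or null if it was never written. */
    String read(String name) {
        Cell cell = columns.get(name);
        return cell == null ? null : cell.value;
    }
}
```

In this sketch a write with an older timestamp is simply dropped, even if it arrives later, which is why all clients writing to the same column should agree on how they generate timestamps.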
The last thing we need to set before we invoke the insert operation is the consistency level for our write operation. In this example, we will set the consistency level to ONE. In the context of write operations, consistency level ONE means that the write operation will not respond until the data has been written to the commit log and memory table of at least one replica.
ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;
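To put ONE in context, the sketch below computes how many replicas must acknowledge a write for three common consistency levels, given a keyspace's replication factor. The enum here is an illustrative stand-in and not the Thrift ConsistencyLevel class, and the method name requiredAcks is made up for this example.

```java
/** Illustrative calculation of replica acknowledgements per consistency level. */
final class ConsistencySketch {
    /** Stand-in for a subset of consistency levels; not org.apache.cassandra.thrift.ConsistencyLevel. */
    enum Level { ONE, QUORUM, ALL }

    /** Number of replica acknowledgements a write waits for at the given level. */
    static int requiredAcks(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                // A majority of replicas, using integer division.
                return replicationFactor / 2 + 1;
            case ALL:
                return replicationFactor;
            default:
                throw new IllegalArgumentException("unknown level: " + level);
        }
    }
}
```

With a replication factor of 3, for example, ONE waits for a single replica, QUORUM for two, and ALL for all three, so ONE trades some durability guarantees for lower write latency.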
Now we get to the part where we invoke the insert operation of the Cassandra.Client API. This operation executes the insert against the Cassandra node specified in our connection. We don’t illustrate the exceptions that can occur here; we’ll leave that as an exercise for the reader.
client.insert(rowid, parent, description, consistencyLevel);
Lastly, we flush and close our connection to release all resources associated with our session.
transport.flush();
transport.close();
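If the insert throws an exception, the two lines above never run and the connection stays open. One common way to guard against that is a try/finally block. The sketch below shows the pattern with stand-in interfaces so that it compiles without the Thrift libraries; Transport, Work and withTransport are hypothetical names, and in the real program you would wrap the TTransport and Cassandra.Client calls the same way.

```java
/** Sketch of an open / work / always-close lifecycle using stand-in types. */
final class CleanupSketch {
    /** Hypothetical stand-in for the open, flush and close calls of a Thrift transport. */
    interface Transport {
        void open() throws Exception;
        void flush() throws Exception;
        void close();
    }

    /** Hypothetical unit of work, for example set_keyspace followed by insert. */
    interface Work {
        void run() throws Exception;
    }

    /** Opens the transport, runs the work, and always closes the transport afterwards. */
    static void withTransport(Transport transport, Work work) throws Exception {
        transport.open();
        try {
            work.run();
            transport.flush();
        } finally {
            // Runs even if the work or the flush throws.
            transport.close();
        }
    }
}
```

The exception still propagates to the caller, so you can decide there whether to retry, log, or give up.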
Now let's verify our insert operation worked as expected. From the Cassandra command line interface run the following command to verify that our User column family now contains the new column for the row id "100".
get User['100'];
Your output should now reflect three columns (username, password, description) in the User column family for row id "100".
That concludes this tutorial. You should now be equipped with the knowledge required to insert new columns into existing rows of Cassandra column families. In the next part of this series we examine the different ways to insert multiple columns in one batch operation.
Here is the full example wrapped in a simple Java class with a main function:
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import java.nio.ByteBuffer;
/**
 * Cassandra Insert example.
 */
public class InsertExample
{
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        transport.open();
        client.set_keyspace("tutorials");
        // define column parent
        ColumnParent parent = new ColumnParent("User");
        // define row id
        ByteBuffer rowid = ByteBuffer.wrap("100".getBytes());
        // define column to add
        Column description = new Column();
        description.setName("description".getBytes());
        description.setValue("I'm a nice guy".getBytes());
        description.setTimestamp(System.currentTimeMillis());
        // define consistency level
        ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;
        // execute insert
        client.insert(rowid, parent, description, consistencyLevel);
        // release resources
        transport.flush();
        transport.close();
    }
}