
Apache Cassandra - Simplified Client API

Apache Cassandra's Java API is fairly low level. No matter how you slice and dice it, you end up with very verbose code. A few third-party libraries try to alleviate that pain, but if you're like me, you might not be excited about adding yet another set of jars to the countless jars already in your application's lib directory. So why not write your own little wrapper that presents a cleaner, easier-to-use API? Here's a quick example of a wrapper that literally took minutes to write. First, let's compare the simple wrapper with Cassandra's out-of-the-box Thrift API.

Our example wrapper is the CassandraGateway class, which has a simple method for inserting columns into a column family for a specific row. This example inserts a new row into the User column family, with columns for username, password, and email.

CassandraGateway gateway = new CassandraGateway("mindplex");
gateway.insert("User", "100",
    new Pair[] {
        new Pair("username", "vitalic"),
        new Pair("password", "poisonlips"),
        new Pair("email", "vitalic@mindplexmedia.com")
    });

I think you will probably agree that this code is pretty much self-explanatory and the API is very simple. We basically have an insert method that accepts the name of a column family, the row id, and an array of pairs that represent columns.
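
The Pair type used in that call is not shown in this post. A minimal sketch of what it could look like follows; the getKey() and getVal() accessors match the names the gateway code calls later, while the immutability and the null checks are assumptions made for illustration.

```java
import java.util.Objects;

/**
 * Hypothetical name/value holder for a single column.
 * Only the constructor shape and the getKey()/getVal() names come from
 * the post; everything else here is an assumption.
 */
final class Pair {
    private final String key;
    private final String val;

    Pair(String key, String val) {
        // Reject nulls early so a bad pair fails here rather than inside Thrift.
        this.key = Objects.requireNonNull(key, "key");
        this.val = Objects.requireNonNull(val, "val");
    }

    /** Column name. */
    String getKey() {
        return key;
    }

    /** Column value. */
    String getVal() {
        return val;
    }
}
```

Keeping the pair immutable means the same instance can be handed to the gateway without worrying that it changes between construction and insert.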

Now let's take a step back and see what the same operation looks like in the native Cassandra Thrift API.

TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
TProtocol protocol = new TBinaryProtocol(transport);
transport.open();
Cassandra.Client client = new Cassandra.Client(protocol);

client.set_keyspace("mindplex");
String entity = "User";

// Encode every string with an explicit charset (java.nio.charset.StandardCharsets)
ByteBuffer userid = ByteBuffer.wrap("100".getBytes(StandardCharsets.UTF_8));
ColumnParent parent = new ColumnParent(entity);
long timestamp = System.currentTimeMillis();

Column username = new Column(
    ByteBuffer.wrap("username".getBytes(StandardCharsets.UTF_8)),
    ByteBuffer.wrap("vitalic".getBytes(StandardCharsets.UTF_8)),
    timestamp
);
Column password = new Column(
    ByteBuffer.wrap("password".getBytes(StandardCharsets.UTF_8)),
    ByteBuffer.wrap("poisonlips".getBytes(StandardCharsets.UTF_8)),
    timestamp
);
Column email = new Column(
    ByteBuffer.wrap("email".getBytes(StandardCharsets.UTF_8)),
    ByteBuffer.wrap("vitalic@mindplexmedia.com".getBytes(StandardCharsets.UTF_8)),
    timestamp
);

// insert data
client.insert(userid, parent, username, ConsistencyLevel.ONE);
client.insert(userid, parent, password, ConsistencyLevel.ONE);
client.insert(userid, parent, email, ConsistencyLevel.ONE);

transport.close();

As you can see, the native version takes far more code than the simple wrapper. All this "byte" buffer code is making me want to "bite" my fingers off. =:0
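
Much of that noise is string-to-ByteBuffer conversion. One way to contain it is a tiny helper like the sketch below. The class and method names (Utf8Bytes, encode, decode) are made up for illustration and are not part of Cassandra or Thrift. The helper also pins the charset to UTF-8, since a bare getBytes() uses the platform default and can produce different bytes on different machines.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that keeps string/ByteBuffer conversion in one place. */
final class Utf8Bytes {
    private Utf8Bytes() {
        // static helpers only
    }

    /** Wraps the UTF-8 bytes of the given text in a ByteBuffer. */
    static ByteBuffer encode(String text) {
        return ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes the remaining bytes of the buffer as UTF-8 text. */
    static String decode(ByteBuffer buffer) {
        // Work on a duplicate so the caller's buffer position is left untouched.
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }
}
```

With a helper like this, a row key becomes Utf8Bytes.encode("100") instead of a ByteBuffer.wrap(...getBytes(...)) chain, and values read back from Cassandra can be decoded the same way.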

Now that we have seen the difference between the CassandraGateway and the thrift API version, let's take a look at how the insert method of the gateway is implemented. The purpose here is simply to demonstrate how easy it is to build a thin wrapper over thrift.

public void insert(String entity, String id, Pair[] pairs) throws CassandraGatewayException {
  for (Pair pair : pairs) {
    insert(entity, id, pair);
  }
}

This method is nothing more than an overloaded batch version of the insert method, which inserts one column at a time. We simply loop through the array of pairs and delegate each one to the single-column insert.

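public void insert(String entity, String id, Pair pair) throws CassandraGatewayException {
  try {
    Cassandra.Client client = getClient();
    client.insert(getRowId(id), new ColumnParent(entity), getColumn(pair), getConsistencyLevel());
  } catch (Exception e) {
    // Assumes CassandraGatewayException offers a (String, Throwable) constructor.
    throw new CassandraGatewayException("Insert into " + entity + " failed", e);
  } finally {
    close();
  }
}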

In this method we actually execute the Cassandra insert. As you can see, it is a template method that breaks the insert into individual steps, each of which can be overridden for custom behavior. The getClient() method returns a client connection to Cassandra. The getRowId(id) method returns a byte buffer representation of the row id. The getColumn(pair) method hides the details of constructing a Column object and setting its required fields: name, value, and timestamp. Lastly, the getConsistencyLevel() method returns the default consistency level defined in the gateway. For illustration purposes, here are the implementations of these methods:

Note: The keyspace, host, port, and consistency level would be defined when the gateway is constructed.

/**
 * Creates a Cassandra client connection.  This version is overly simplified for 
 * illustration purposes only.  In a production application the use of a connection
 * pool would be preferred.
 */
protected Cassandra.Client getClient() throws Exception {
  TTransport transport = new TFramedTransport(new TSocket(host, port));
  TProtocol protocol = new TBinaryProtocol(transport);
  Cassandra.Client client = new Cassandra.Client(protocol);        
  transport.open();
  client.set_keyspace(keyspace);
  this.transport = transport;  
  return client;
}

/** 
 * Gets a byte buffer representation of the specified row id.
 */
protected ByteBuffer getRowId(String rowid) {
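  // Explicit charset (java.nio.charset.StandardCharsets) instead of the platform default
  return ByteBuffer.wrap(rowid.getBytes(StandardCharsets.UTF_8));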
}
 
/**
 * Gets a Column with its key, value and timestamp set.
 */
protected Column getColumn(Pair pair) {
  return new Column()
    .setName(pair.getKey().getBytes(StandardCharsets.UTF_8))
    .setValue(pair.getVal().getBytes(StandardCharsets.UTF_8))
    .setTimestamp(System.currentTimeMillis());
}

/**
 * Get whatever the default consistency level is.
 */
protected ConsistencyLevel getConsistencyLevel() {
  return defaultConsistencyLevel;    
}

/**
 * Overly simplified non-threadsafe way to teardown the connection to Cassandra.
 * Some sort of connection pooling mechanism would be ideal instead of manually 
 * dealing with single connections.
 */
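protected void close() {
  if (transport == null) return;  // nothing was opened
  try {
    transport.flush();
  } catch (TTransportException e) { /* flush failed; still close the socket */ }
  transport.close();
}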
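
The note above says that the keyspace, host, port, and consistency level are set when the gateway is constructed, but that part is not shown. Here is a rough sketch of how that state could be held. GatewaySettings and its Consistency enum are hypothetical names; the enum is only a stand-in for Cassandra's ConsistencyLevel so the sketch compiles without the Cassandra jars. The defaults (localhost, 9160, ONE) are taken from the native example earlier in this post and are assumptions about what the gateway would default to.

```java
/** Hypothetical holder for the connection settings a gateway would be built with. */
final class GatewaySettings {
    /** Stand-in for org.apache.cassandra.thrift.ConsistencyLevel (assumption, for illustration). */
    enum Consistency { ONE, QUORUM, ALL }

    final String keyspace;
    final String host;
    final int port;
    final Consistency defaultConsistency;

    /** Uses the values from the native example as defaults. */
    GatewaySettings(String keyspace) {
        this(keyspace, "localhost", 9160, Consistency.ONE);
    }

    GatewaySettings(String keyspace, String host, int port, Consistency defaultConsistency) {
        // Validate up front so a bad setting fails here rather than on the first insert.
        if (keyspace == null || keyspace.isEmpty()) {
            throw new IllegalArgumentException("keyspace is required");
        }
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host is required");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        if (defaultConsistency == null) {
            throw new IllegalArgumentException("defaultConsistency is required");
        }
        this.keyspace = keyspace;
        this.host = host;
        this.port = port;
        this.defaultConsistency = defaultConsistency;
    }
}
```

A one-argument constructor mirrors the new CassandraGateway("mindplex") call from the first example, while the full constructor leaves room to point the gateway at another host or raise the consistency level.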

And there you have it! A super simple wrapper over the Cassandra Thrift API that makes life easier and, most importantly, makes your code readable. I actually have a full CassandraGateway implementation that covers all the Cassandra operations. I will be adding the gateway to my GitHub repo soon for those who want to play with it.

That concludes this short example. Hopefully you're now inspired to go write cool Cassandra applications.
