Illustration Image

Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

2/12/2020

Reading time:4 min

UDTs in Cassandra, Simplified!

by Kunal Sethi

In most programming languages, i.e. Scala or Java, we can play with object constructs and create our own classes and instances out of it. A similar construct is also provided by Cassandra: UDT, which stands for User Defined Type.User-defined types (UDTs) can attach multiple data fields, each named and typed, to a single column. The fields that are used to create user-defined type may be any valid datatype, including collection or other UDTs. Once a UDT is created it can be used to define a column in a table.Syntax to define UDT:CREATE TYPE student.basic_info ( first_name text, last_name text, nationality text, roll_no int, address text);student is the keyspace name here and we are creating basic_info. This type contains five fields (first_name, last_name, nationality, roll_no, address) with a type of each. We can use the basic_info type to define a column in a table. In simple words, we can say that we use UDT to store the value as an object in Cassandra that contains some fields within itself.So, we declared a table with name student_stats with three columns (id, grade, basics) with a datatype of each. The last column is UDT itself because the datatype of basics is what we declared as type basic_info.How to insert data in a table with UDT:CREATE TABLE student.student_stats ( id int PRIMARY KEY, grade text, basics FROZEN<basic_info>); This is how we insert data in a UDT column. It looks like we are giving a key-value pair to specify which field value we are giving.Here, I just want to share one more thing about the insert statement: in Cassandra, we can also insert the data in JSON format as below.INSERT INTO student.student_stats JSON'{"id":3, "basics":{"first_name":"Abhinav", "last_name":"Sinha", "nationality":"Indian", "roll_no":103, "address":"Delhi"}, "grade":"Tenth"}'Let's take another case in which we will not insert one of the field values of the UDT. The question is, will the value of that field be inserted?INSERT INTO student.student_stats (id, grade, basics) VALUES (2,'SIXTH', {first_name: 'Anshul', last_name: 'shivhare', nationality: 'Indian',roll_no: 102});In this insert command, we are not passing the value of address field here. So the question arises of how Cassandra will handle this. Well, it will insert this value as a normal value but it will take the address field value as null. Every field value, except the primary key that we do not pass at the time of insertion, Cassandra will take it as null.As you can see here, the address field value is null.Now, we will go through with one more example, where we will see how to fetch data from a UDT field with Java code example:public class ReadData { public static void main(String args[]) throws Exception {//queries String query = "SELECT * FROM student_stats";//Creating Cluster object Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").withPort(9042).build(); Session session = cluster.connect("folio_viewer"); List udtValList = new ArrayList<>(); List basicInfoList = new ArrayList<>(); ResultSet result = session.execute(query); result.forEach(row -> { UDTValue udt = row.get("basics", UDTValue.class); udtValList.add(udt); }); udtValList.stream().forEach(val -> { BasicInfo basicInfo = BasicInfo.builder().firstName(val.get(0, String.class) != null ? val.get(0, String.class) : "") .lastName(val.get(1, String.class) != null ? val.get(1, String.class) : "") .nationality(val.get(2, String.class) != null ? val.get(2, String.class) : "") .rollNo(val.get(3, Integer.class) != null ? val.get(3, Integer.class) : 0) .address(val.get(4, String.class) != null ? val.get(4, String.class) : "").build(); basicInfoList.add(basicInfo); } ); basicInfoList.stream().forEach(val -> { System.out.println("_______________________________________________"); System.out.println("FirstName :- " + val.getFirstName()); System.out.println("LastName :- " + val.getLastName()); System.out.println("Nationality :- " + val.getNationality()); System.out.println("Roll Number :- " + val.getRollNo()); System.out.println("Address :- " + val.getAddress()); System.out.println("_______________________________________________"); }); }}In the result object, we are getting a ResultSet and then we are performing iterations with the help of foreach and in each iteration, we get one row each, out of which we extract the UDT column basics and then casting that value into a UDTValue object.UDTValue stores the fields in a sequential manner in the order they are present in a UDT column itself. To retrieve values from the UDTValue object, we just need to give the index number of the corresponding field, i.e. val.get(3, Integer.class).As you can notice from the UDT definition, rollno is the fourth field; hence, we are using the index number 3 and the type is int, so we are typecasting that particular field using Integer.class.This is how we can get the data from UDT fields. One more thing to notice in this example is that we used the lombok builder() method to create objects.I hope this article will reduce your efforts in implementing UDTs.This article was originally posted on the Knoldus blog.

Illustration Image

In most programming languages, i.e. Scala or Java, we can play with object constructs and create our own classes and instances out of it. A similar construct is also provided by Cassandra: UDT, which stands for User Defined Type.

User-defined types (UDTs) can attach multiple data fields, each named and typed, to a single column. The fields that are used to create user-defined type may be any valid datatype, including collection or other UDTs. Once a UDT is created it can be used to define a column in a table.

Syntax to define UDT:

CREATE TYPE student.basic_info (
  first_name text,
  last_name text, 
  nationality text,
  roll_no int,
  address text
);

student is the keyspace name here and we are creating basic_info. This type contains five fields (first_name, last_name, nationality, roll_no, address) with a type of each. We can use the basic_info type to define a column in a table. In simple words, we can say that we use UDT to store the value as an object in Cassandra that contains some fields within itself.

So, we declared a table with name student_stats with three columns (id, grade, basics) with a datatype of each. The last column is UDT itself because the datatype of basics is what we declared as type basic_info.

How to insert data in a table with UDT:

CREATE TABLE student.student_stats 
( id int PRIMARY KEY, grade text, basics FROZEN<basic_info>); 

This is how we insert data in a UDT column. It looks like we are giving a key-value pair to specify which field value we are giving.

image

Here, I just want to share one more thing about the insert statement: in Cassandra, we can also insert the data in JSON format as below.

INSERT INTO student.student_stats JSON'{"id":3, "basics":{"first_name":"Abhinav", "last_name":"Sinha", "nationality":"Indian", "roll_no":103, "address":"Delhi"}, "grade":"Tenth"}'

Let's take another case in which we will not insert one of the field values of the UDT. The question is, will the value of that field be inserted?

INSERT INTO student.student_stats (id, grade, basics) VALUES (2,'SIXTH', {first_name: 'Anshul', last_name: 'shivhare', nationality: 'Indian',roll_no: 102});

In this insert command, we are not passing the value of address field here. So the question arises of how Cassandra will handle this. Well, it will insert this value as a normal value but it will take the address field value as null. Every field value, except the primary key that we do not pass at the time of insertion, Cassandra will take it as null.

As you can see here, the address field value is null.

image

Now, we will go through with one more example, where we will see how to fetch data from a UDT field with Java code example:

public class ReadData {
    public static void main(String args[]) throws Exception {
//queries
        String query = "SELECT * FROM student_stats";
//Creating Cluster object
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").withPort(9042).build();
        Session session = cluster.connect("folio_viewer");
        List udtValList = new ArrayList<>();
        List basicInfoList = new ArrayList<>();
        ResultSet result = session.execute(query);
        result.forEach(row -> {
            UDTValue udt = row.get("basics", UDTValue.class);
            udtValList.add(udt);
        });
        udtValList.stream().forEach(val -> {
                    BasicInfo basicInfo = BasicInfo.builder().firstName(val.get(0, String.class) != null ?
                            val.get(0, String.class) : "")
                            .lastName(val.get(1, String.class) != null ? val.get(1, String.class) : "")
                            .nationality(val.get(2, String.class) != null ? val.get(2, String.class) : "")
                            .rollNo(val.get(3, Integer.class) != null ? val.get(3, Integer.class) : 0)
                            .address(val.get(4, String.class) != null ? val.get(4, String.class) : "").build();
                    basicInfoList.add(basicInfo);
                }
        );
        basicInfoList.stream().forEach(val -> {
            System.out.println("_______________________________________________");
            System.out.println("FirstName :- " + val.getFirstName());
            System.out.println("LastName :- " + val.getLastName());
            System.out.println("Nationality :- " + val.getNationality());
            System.out.println("Roll Number :- " + val.getRollNo());
            System.out.println("Address :- " + val.getAddress());
            System.out.println("_______________________________________________");
        });
    }
}

In the result object, we are getting a ResultSet and then we are performing iterations with the help of foreach and in each iteration, we get one row each, out of which we extract the UDT column basics and then casting that value into a UDTValue object.

UDTValue stores the fields in a sequential manner in the order they are present in a UDT column itself. To retrieve values from the UDTValue object, we just need to give the index number of the corresponding field, i.e. val.get(3, Integer.class).

As you can notice from the UDT definition, rollno is the fourth field; hence, we are using the index number 3 and the type is int, so we are typecasting that particular field using Integer.class.

This is how we can get the data from UDT fields. One more thing to notice in this example is that we used the lombok builder() method to create objects.

I hope this article will reduce your efforts in implementing UDTs.

This article was originally posted on the Knoldus blog.

image

Related Articles

cluster
troubleshooting
datastax

GitHub - arodrime/Montecristo: Datastax Cluster Health Check Tooling

arodrime

4/3/2024

Checkout Planet Cassandra

Claim Your Free Planet Cassandra Contributor T-shirt!

Make your contribution and score a FREE Planet Cassandra Contributor T-Shirt! 
We value our incredible Cassandra community, and we want to express our gratitude by sending an exclusive Planet Cassandra Contributor T-Shirt you can wear with pride.

Join Our Newsletter!

Sign up below to receive email updates and see what's going on with our company

Explore Related Topics

AllKafkaSparkScyllaSStableKubernetesApiGithubGraphQl

Explore Further

cassandra