Which is better: a simple SELECT query through the Node.js cassandra-driver, or a Cassandra view with GROUP BY? (Data model suggestion) [closed]

Author: Manzoor

Originally Sourced from: https://stackoverflow.com/questions/60407603/which-is-better-using-node-cassandra-driver-simple-select-query-or-cassandra-vie

I have a table that receives sensor data every 5 seconds. When a request comes in, I take its datetime, derive the year, month, week, day, hour and the other time fields from it, and insert them into this table using Node.js. My table schema is shown below.

CREATE TABLE sensorrawdata (
    sensorid text,
    year int,
    month int,
    week int,
    day int,
    hour int,
    min30Data int,
    min15Data int,
    min5Data int,
    datetime timestamp,
    temp float,
    voltage float,
    power float,
    PRIMARY KEY (sensorid, year, month, week, day, hour, min30Data, min15Data, min5Data, datetime)
);
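To make the time columns of this schema concrete, here is a minimal sketch of how they might be derived from a datetime in Node.js. The helper name `deriveTimeBuckets` is hypothetical, and the bucket definitions are assumptions rather than something stated in the question: fields are taken in UTC, `week` is the week of the month, and each `minNData` column is the index of the N-minute slot within the hour.

```javascript
// Hypothetical helper: derive the time columns of sensorrawdata from a Date.
// Assumptions (not from the question): UTC fields, week = week of the month,
// minNData = zero-based index of the N-minute slot within the hour.
function deriveTimeBuckets(datetime) {
  const minute = datetime.getUTCMinutes();
  const day = datetime.getUTCDate();
  return {
    year: datetime.getUTCFullYear(),
    month: datetime.getUTCMonth() + 1,   // getUTCMonth() is zero-based
    week: Math.ceil(day / 7),            // days 1-7 -> 1, 8-14 -> 2, ...
    day,
    hour: datetime.getUTCHours(),
    min30Data: Math.floor(minute / 30),  // 0 or 1
    min15Data: Math.floor(minute / 15),  // 0 to 3
    min5Data: Math.floor(minute / 5),    // 0 to 11
    datetime,
  };
}

const buckets = deriveTimeBuckets(new Date('2020-02-26T10:37:05Z'));
console.log(buckets);
```

If the real table numbers weeks by year or uses local time, only the corresponding lines need to change.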

I need to retrieve the data (temp, voltage and so on) at various granularities: monthly, weekly, daily, hourly, and per 30, 15 and 5 minutes. The results are used to plot graphs.

For each granularity I want the average of the values in the table.

For example, to retrieve the weekly data I used the query below.

select avg(temp), avg(voltage) from sensorrawdata group by sensorid,year,month,week;

But sometimes I need a GROUP BY whose column order differs from the order declared in the table's primary key.

For example, to get the 15-minute data:

select avg(temp), avg(voltage) from sensorrawdata group by sensorid,year,month,day,hour,min15Data;

Error : InvalidRequest: Error from server: code=2200 [Invalid query] message="Group by currently only support groups of columns following their declared order in the PRIMARY KEY"
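The rule behind this error can be sketched as a prefix check: the GROUP BY list has to follow the declared primary key order from the start. The function name `isPrimaryKeyPrefix` is hypothetical, and the check is a simplification. It ignores that, per the CQL documentation, a primary key column restricted by equality in the WHERE clause may be left out of GROUP BY.

```javascript
// Primary key columns of sensorrawdata in declared order
// (lowercase, as Cassandra folds unquoted identifiers).
const PRIMARY_KEY = [
  'sensorid', 'year', 'month', 'week', 'day', 'hour',
  'min30data', 'min15data', 'min5data', 'datetime',
];

// Simplified check: true when groupBy is a leading prefix of primaryKey.
function isPrimaryKeyPrefix(primaryKey, groupBy) {
  if (groupBy.length > primaryKey.length) return false;
  return groupBy.every(
    (col, i) => col.toLowerCase() === primaryKey[i].toLowerCase()
  );
}

// The weekly query follows the declared order.
const weekly = isPrimaryKeyPrefix(
  PRIMARY_KEY, ['sensorid', 'year', 'month', 'week']);

// The 15-minute query skips `week`, so `day` lands where `week` is expected.
const quarterHour = isPrimaryKeyPrefix(
  PRIMARY_KEY, ['sensorid', 'year', 'month', 'day', 'hour', 'min15Data']);

console.log({ weekly, quarterHour });
```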

Approach 1: Change the order of the primary key in a view and leverage the aggregate functions provided by Cassandra.

After some research I found that I can create a materialized view. The view's primary key must include every primary key column of the base table, but the order of those columns can be changed.

CREATE MATERIALIZED VIEW sensor_by_15min AS
    SELECT sensorid, year, month, week, min15Data, datetime, temp, voltage
    FROM sensorrawdata
    WHERE sensorid IS NOT NULL AND year IS NOT NULL AND month IS NOT NULL
      AND week IS NOT NULL AND day IS NOT NULL AND hour IS NOT NULL
      AND min30Data IS NOT NULL AND min15Data IS NOT NULL
      AND min5Data IS NOT NULL AND datetime IS NOT NULL
    PRIMARY KEY (sensorid, year, month, day, hour, min15Data, min5Data, week, min30Data, datetime);

select avg(temp), avg(voltage) from sensor_by_15min group by sensorid,year,month,day,hour,min15Data;

With this modification I am able to retrieve the data using GROUP BY. However, Cassandra warns that materialized views are experimental and should not be used in production. So should I create multiple tables instead of a view?

Approach 2: Run a simple query from Node.js with the Cassandra driver, then stream the rows and aggregate them in the application. The driver's stream method is capable of handling large result sets.

 exports.sensor_query_db_using_stream = function sensor_query_db_using_stream(sensorid, duration, from_criteria, to_criteria) {
   // `client` is assumed to be an already-connected cassandra-driver Client.
   // `duration` is not used yet; it is meant to select the aggregation granularity.
   return new Promise(function (resolve, reject) {
     const data = new Map();
     const params = [sensorid, from_criteria, to_criteria];
     const query = 'SELECT sensorid,datetime,temp,voltage,min5data FROM keyspace.sensorrawdata WHERE sensorid=? and datetime>=? and datetime<=?  ALLOW FILTERING';
     client.stream(query, params, { prepare: true })
       .on('readable', function () {
         // `this` is the result stream; read() returns null once the buffer is drained.
         let row;
         while ((row = this.read())) {
           // Process the rows
         }
       })
       .on('end', function () {
         // All rows have been consumed.
         resolve(data);
       })
       .on('error', function (err) {
         reject(err);
       });
   });
 };
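As an illustration of what the row-processing step in Approach 2 could look like, here is a minimal sketch of a running-average aggregator keyed by time bucket. The names `createBucketAverager`, `add` and `averages` are hypothetical. It assumes each row exposes `datetime` as a JavaScript `Date` and `temp` and `voltage` as numbers, and that buckets are aligned to UTC.

```javascript
// Hypothetical aggregator: averages temp and voltage per fixed-size time bucket.
function createBucketAverager(bucketMinutes) {
  const bucketMs = bucketMinutes * 60 * 1000;
  const sums = new Map(); // bucket start (epoch ms) -> running sums

  return {
    // Fold one row into the running sums of its bucket.
    add(row) {
      const key = Math.floor(row.datetime.getTime() / bucketMs) * bucketMs;
      let s = sums.get(key);
      if (!s) {
        s = { count: 0, temp: 0, voltage: 0 };
        sums.set(key, s);
      }
      s.count += 1;
      s.temp += row.temp;
      s.voltage += row.voltage;
    },
    // Return a Map of ISO bucket start -> averages.
    averages() {
      const out = new Map();
      for (const [key, s] of sums) {
        out.set(new Date(key).toISOString(), {
          avgTemp: s.temp / s.count,
          avgVoltage: s.voltage / s.count,
        });
      }
      return out;
    },
  };
}

// Usage with made-up rows; in the stream handler this would be averager.add(row).
const averager = createBucketAverager(15);
[
  { datetime: new Date('2020-02-26T10:01:00Z'), temp: 20, voltage: 3 },
  { datetime: new Date('2020-02-26T10:14:55Z'), temp: 22, voltage: 5 },
  { datetime: new Date('2020-02-26T10:15:00Z'), temp: 30, voltage: 4 },
].forEach((row) => averager.add(row));
const result = averager.averages();
console.log(result);
```

Only one small record per bucket is kept in memory, so the memory used grows with the number of buckets rather than the number of rows streamed.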

Which approach is better? Any other suggestions on the data model are also welcome.