DataStax has opened up ‘early access’ to its DataStax Change Data Capture (CDC) Connector for Apache Kafka, the open source stream-processing software platform. Stream processing lets applications spread work across multiple computational units, in a similar way to parallel processing.
As a company, DataStax offers a commercially supported ‘enterprise-robust’ database built on open source Apache Cassandra.
Stream processing is all about speed and cadence, so the DataStax CDC Connector for Apache Kafka gives developers ‘bidirectional data movement’ between DataStax, Cassandra and Kafka clusters.
In live deployment, CDC is designed to capture and forward the insert, update and delete activity applied to tables (column families).
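To make the idea concrete, here is a minimal sketch of what capturing and replaying table changes can look like. The `ChangeEvent` record and the `apply_event` and `replay` helpers are hypothetical names invented for illustration; they are not the connector's actual event format or API. The only Cassandra behaviour assumed is that inserts and updates both act as upserts.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]
Key = Tuple[Any, ...]


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record; not the connector's real wire format."""
    table: str
    op: str                       # "insert", "update" or "delete"
    key: Key                      # primary key of the affected row
    values: Optional[Row] = None  # changed columns; None for deletes


def apply_event(replica: Dict[Key, Row], event: ChangeEvent) -> None:
    """Apply one captured change to a downstream copy of the table."""
    if event.op == "delete":
        replica.pop(event.key, None)
    elif event.op in ("insert", "update"):
        # Cassandra writes are upserts, so both operations merge columns into the row.
        replica.setdefault(event.key, {}).update(event.values or {})
    else:
        raise ValueError(f"unknown operation: {event.op}")


def replay(events: List[ChangeEvent]) -> Dict[Key, Row]:
    """Rebuild a table copy by applying events in the order they were captured."""
    replica: Dict[Key, Row] = {}
    for event in events:
        apply_event(replica, event)
    return replica
```

Replaying the events in capture order is what lets a downstream consumer end up with the same rows as the source table.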
Bidirectional dexterity
So what does bidirectional data dexterity bring forward? It makes it easier for developers to build globally synchronised microservices.
Apache Cassandra already offers a core cross-datacentre replication capability for data movement between microservices. This, then, is an augmentation of that replication power.
The connector enables bidirectional data flow between Cassandra and Kafka, ensuring that data committed to a developer’s chosen system of record database can be forwarded to microservices through Kafka.
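The forwarding pattern described here can be sketched with an in-memory stand-in for a Kafka topic. Everything below is an assumption made for illustration: `Topic` and `SystemOfRecord` are hypothetical classes, no real Kafka or Cassandra client is involved, and the sketch publishes inside the write path purely for simplicity, which says nothing about how the actual connector captures changes.

```python
from typing import Dict, List


class Topic:
    """In-memory stand-in for a topic: an append-only log with per-consumer offsets."""

    def __init__(self) -> None:
        self._log: List[dict] = []
        self._offsets: Dict[str, int] = {}

    def publish(self, record: dict) -> None:
        self._log.append(record)

    def poll(self, consumer: str) -> List[dict]:
        """Return the records this consumer has not yet seen and advance its offset."""
        start = self._offsets.get(consumer, 0)
        records = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return records


class SystemOfRecord:
    """Toy database that forwards every committed write to a topic."""

    def __init__(self, topic: Topic) -> None:
        self.rows: Dict[int, dict] = {}
        self._topic = topic

    def commit(self, key: int, row: dict) -> None:
        self.rows[key] = row                           # the write lands in the database first
        self._topic.publish({"key": key, "row": row})  # then it is forwarded downstream


topic = Topic()
db = SystemOfRecord(topic)
db.commit(1, {"status": "ordered"})
db.commit(1, {"status": "shipped"})

# Each microservice reads at its own pace from the same log.
billing_batch = topic.poll("billing")
search_batch = topic.poll("search")
```

Because each consumer tracks its own offset, one microservice falling behind does not hold up the others.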
Developers are invited to use the early release of the CDC source connector in DataStax Labs in conjunction with any Kafka offering and provide feedback before the product is finalised.
Kathryn Erickson, senior director of strategic partnerships at DataStax, says that her firm surveys its customers every year and that, currently, more than 60% of respondents are using Kafka with DataStax built on Cassandra.
“Any time you see a modern architecture design promoted, whether it be SMACK, Lambda, or general microservices, you see Cassandra and Kafka. This CDC connector is an important technical achievement for our customers. They can now work with the utmost confidence that they’re getting high quality and proven high performance with Cassandra and Kafka,” said Erickson.
Gold plated connectors
As an additional note, the DataStax Apache Kafka Connector recently earned Verified Gold status in Confluent’s Verified Integrations Programme.
This distinction is intended to assure users that connectors meet the technical and functional requirements of the Apache Kafka Connect API, an open source component of Kafka that provides the framework for connecting Kafka with external systems.
According to DataStax, “By adhering to the Connect API, customers can expect a better user experience, scalability and integration with the Confluent Platform. The initial DataStax Apache Kafka Connector enables developers to capture data from Kafka and store it in DataStax and Cassandra for further processing and management, offering customers high-throughput rates.”
Why it all matters
The questions you should perhaps now be asking are: does any of this matter, has DataStax done a good thing… and is stream processing a hugely important part of the still-emerging world of Complex Event Processing (CEP) and the allied worlds of streaming analytics and modern fast-moving cloud native infrastructures?
The answer is: look at your smartphone, laptop, nearest kiosk computer, IoT sensor or (if you happen to be sat in a datacentre) mainframe or other machine — the reality of modern IT is a world with a perpetual stream of data events all happening in real time. If we sit back and store data through aggregation and batch processes, then we will miss the opportunity to spot trends, act and apply forward-thinking AI & ML based processes upon our data workloads.
DataStax obviously knows this and the company appears to be pushing Kafka advancements and augmentations forward to serve the needs of the always-on data workloads that we’re now working with.
Developers can test the early release of the DataStax CDC Connector for Apache Kafka immediately; the connector is available through DataStax Labs.