Successfully reported this slideshow.
GumGum: Multi-Region Cassandra in AWS
Upcoming SlideShare
Loading in …5
×
- 1. CASSANDRA SUMMIT 2015CASSANDRA SUMMIT 2015 Mario Lazaro September 24th 2015 #CassandraSummit2015 MULTI-REGION CASSANDRA IN AWSMULTI-REGION CASSANDRA IN AWS
- 2. WHOAMIWHOAMI Mario Cerdan Lazaro Big Data Engineer Born and raised in Spain Joined GumGum 18 months ago About a year and a half experience with Cassandra
- 3. #5 Ad Platform in the U.S 10B impressions / month 2,000 brand-safe premium publisher partners 1B+ global unique visitors Daily inventory Impressions processed - 213M Monthly Image Impressions processed - 2.6B 123 employees in seven offices
- 4. AGENDAAGENDA Old cluster International Expansion Challenges Testing Modus Operandi Tips Questions & Answers
- 5. 25 Classic nodes cluster 1 Region / 1 Rack hosted in AWS EC2 US East Version 2.0.8 Datastax CQL driver GumGum's metadata including visitors, images, pages, and ad performance Usage: realtime data access and analytics (MR jobs) OLD C* CLUSTER - MARCH 2015OLD C* CLUSTER - MARCH 2015
- 6. OLD C* CLUSTER - REALTIME USE CASEOLD C* CLUSTER - REALTIME USE CASE Billions of rows Heavy read workload 60/40 TTLs everywhere - Tombstones Heavy and critical use of counters RTB - Read Latency constraints (total execution time ~50 ms)
- 7. OLD C* CLUSTER - ANALYTICS USE CASEOLD C* CLUSTER - ANALYTICS USE CASE Daily ETL jobs to extract / join data from C* Hadoop MR jobs AdHoc queries with Presto
- 8. INTERNATIONAL EXPANSIONINTERNATIONAL EXPANSION
- 9. FIRST STEPSFIRST STEPS Start C* test datacenters in US East & EU West and test how C* multi region works in AWS Run capacity/performance tests. We expect 3x times more traffic in 2015 Q4 FIRST THOUGHTSFIRST THOUGHTS Use AWS Virtual Private Cloud (VPC) Cassandra & VPC present some connectivity issues challenges Replicate entire data with same number of replicas
- 10. TOO GOOD TO BE TRUE ...TOO GOOD TO BE TRUE ...
- 11. CHALLENGESCHALLENGES Problems between Cassandra in EC2 Classic / VPC and Datastax Java driver EC2MultiRegionSnitch uses public IPs. EC2 instances do not have an interface with public IP address - Cannot connect between instances in the same region using Public IPs /** * Implementation of {@link AddressTranslater} used by the driver that * translate external IPs to internal IPs. * @author Mario <mario@gumgum.com> */ public class Ec2ClassicTranslater implements AddressTranslater { private static final Logger LOGGER = LoggerFactory.getLogger(Ec2Cla private ClusterService clusterService; private Cluster cluster; private List<Instance> publicDnss; @PostConstruct public void build() { publicDnss = clusterService.getInstances(cluster); } /** * Translates a Cassandra {@code rpc_address} to another address if * <p> * * @param address the address of a node as returned by Cassandra. * @return {@code address} translated IP address of the source. */ public InetSocketAddress translate(InetSocketAddress address) { for (final Instance server : publicDnss) { if (server.getPublicIpAddress().equals(address.getHostStrin LOGGER.info("IP address: {} translated to {}", address. return new InetSocketAddress(server.getPrivateIpAddress } } return null; } public void setClusterService(ClusterService clusterService) { this.clusterService = clusterService; } public void setCluster(Cluster cluster) { this.cluster = cluster; } } Problems between Cassandra in EC2 Classic / VPC and Datastax Java driver EC2MultiRegionSnitch uses public IPs. EC2 instances do not have an interface with public IP address - Cannot connect between instances in the same region using Public IPs Region to Region connectivity will use public IPs - Trust those IPs or use software/hardware VPN Problems between Cassandra in EC2 Classic / VPC and Datastax Java driver EC2MultiRegionSnitch uses public IPs. EC2 instances do not have an interface with public IP address - Cannot connect between instances in the same region using Public IPs Region to Region connectivity will use public IPs - Trust those IPs or use software/hardware VPN Your application needs to connect to C* using private IPs - Custom EC2 translator
- 12. Datastax Java Driver Load Balancing Multiple choices DCAware + TokenAware Datastax Java Driver Load Balancing Multiple choices DCAware + TokenAware + ? Datastax Java Driver Load Balancing Multiple choices CHALLENGES “ Clients in one AZ attempt to always communicate with C* nodes in the same AZ. We call this zone-aware connections. This feature is built into , Netflix’s C* Java client library.Astyanax
- 13. Zone Aware Connection: Webapps in 3 different AZs: 1A, 1B, and 1C C* datacenter spanning 3 AZs with 3 replicas CHALLENGESCHALLENGES 1A 1B 1C 1B1B
- 14. We added it! - Rack/AZ awareness to TokenAware Policy CHALLENGESCHALLENGES
- 15. CHALLENGESCHALLENGES Third Datacenter: Analytics Do not impact realtime data access Spark on top of Cassandra Spark-Cassandra Datastax connector Replicate specific keyspaces Less nodes with larger disk space Settings are different Ex: Bloom filter chance
- 16. CHALLENGESCHALLENGES Third Datacenter: Analytics Cassandra Only DC Realtime Cassandra + Spark DC Analytics
- 17. CHALLENGESCHALLENGES Upgrade from 2.0.8 to 2.1.5 Counters implementation is buggy in pre-2.1 versions “ My code never has bugs. It just develops random unexpected features
- 18. CHALLENGESCHALLENGES “ To choose, or not to choose VNodes. That is the question. (M. Lazaro, 1990 - 2500) Previous DC using Classic Nodes Works with MR jobs Complexity for adding/removing nodes Manual manage token ranges New DCs will use VNodes Apache Spark + Spark Cassandra Datastax connector Easy to add/remove new nodes as traffic increases
- 19. TESTINGTESTING
- 20. TESTINGTESTING Testing requires creating and modifying many C* nodes Create and configuring a C* cluster is time-consuming / repetitive task Create fully automated process for creating/modifying/destroying Cassandra clusters with Ansible # Ansible settings for provisioning the EC2 instance --- ec2_instance_type: r3.2xlarge ec2_count: - 0 # How many in us-east-1a ? - 7 # How many in us-east-1b ? ec2_vpc_subnet: - undefined - subnet-c51241b2 - undefined - subnet-80f085d9 - subnet-f9138cd2 ec2_sg: - va-ops - va-cassandra-realtime-private
- 21. TESTING - PERFORMANCETESTING - PERFORMANCE Performance tests using new Cassandra 2.1 Stress Tool: Recreate GumGum metadata / schemas Recreate workload and make it 3 times bigger Try to find limits / Saturate clients # Keyspace Name keyspace: stresscql #keyspace_definition: | # CREATE KEYSPACE stresscql WITH replication = {'class': #' ### Column Distribution Specifications ### columnspec: - name: visitor_id size: gaussian(32..32) #domain names are relatively population: uniform(1..999M) #10M possible domains to pic - name: bidder_code cluster: fixed(5) - name: bluekai_category_id - name: bidder_custom size: fixed(32) - name: bidder_id size: fixed(32) - name: bluekai_id size: fixed(32) - name: dt_pd - name: rt_exp_dt - name: rt_opt_out ### Batch Ratio Distribution Specifications ### insert: partitions: fixed(1) # Our partition key is the v select: fixed(1)/5 # We have 5 bidder_code per dom batchtype: UNLOGGED # Unlogged batches # # A list of queries you wish to run against the schema #
- 22. TESTING - PERFORMANCETESTING - PERFORMANCE Main worry: Latency and replication overseas Use LOCAL_X consistency levels in your client Only one C* node will contact only one C* node in a different DC for sending replicas/mutations
- 23. TESTING - PERFORMANCETESTING - PERFORMANCE Main worries: Latency
- 24. TESTING - INSTANCE TYPETESTING - INSTANCE TYPE Test all kind of instance types. We decided to go with r3.2xlarge machines for our cluster: 60 GB RAM 8 Cores 160GB Ephemeral SSD Storage for commit logs and saved caches RAID 0 over 4 SSD EBS Volumes for data Performance / Cost and GumGum use case makes r3.2xlarge the best option Disclosure: I2 instance family is the best if you can afford it
- 25. TESTING - UPGRADETESTING - UPGRADE Upgrade C* Datacenter from 2.0.8 to 2.1.5 Both versions can cohabit in the same DC New settings and features tried DateTieredCompactionStrategy: Compaction for Time Series Data Incremental repairs Counters new architecture
- 26. MODUS OPERANDIMODUS OPERANDI
- 27. MODUS OPERANDIMODUS OPERANDI Sum up From: One cluster / One DC in US East To: One cluster / Two DCs in US East and one DC in EU West
- 28. MODUS OPERANDIMODUS OPERANDI First step: Upgrade old cluster snitch from EC2Snitch to EC2MultiRegionSnitch Upgrade clients to handle it (aka translators) Make sure your clients do not lose connection to upgraded C* nodes (JIRA DataStax - )JAVA-809
- 29. MODUS OPERANDIMODUS OPERANDI Second step: Upgrade old datacenter from 2.0.8 to 2.1.5 nodetool upgradesstables (multiple nodes at a time) Not possible to rebuild a 2.1.X C* node from a 2.0.X C* datacenter. rebuild WARN [Thread-12683] 2015-06-17 10:17:22,845 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=XXX
- 30. MODUS OPERANDIMODUS OPERANDI Third step: Start EU West and new US East DCs within the same cluster Replication factor in new DCs: 0 Use dc_suffix to differentiate new Virginia DC from old one Clients do not talk to new DCs. Only C* knows they exist Replication factor to 3 on all except analytics 1 Start receiving new data Nodetool rebuild <old-datacenter> Old data
- 31. MODUS OPERANDIMODUS OPERANDI RF 3 Clients RF 3:0:0:0RF 3:3:3:1 US East Realtime EU West Realtime US East Analytics Rebuild Rebuild Rebuild
- 32. From 39d8f76d9cae11b4db405f5a002e2a4f6f764b1d Mon Sep 17 00:00:00 2001 From: mario <mario@gumgum.com> Date: Wed, 17 Jun 2015 14:21:32 -0700 Subject: [PATCH] AT-3576 Start using new Cassandra realtime cluster --- src/main/java/com/gumgum/cassandra/Client.java | 30 ++++------------------ .../com/gumgum/cassandra/Ec2ClassicTranslater.java | 30 ++++++++++++++-------- src/main/java/com/gumgum/cluster/Cluster.java | 3 ++- .../resources/applicationContext-cassandra.xml | 13 ++++------ src/main/resources/dev.properties | 2 +- src/main/resources/eu-west-1.prod.properties | 3 +++ src/main/resources/prod.properties | 3 +-- src/main/resources/us-east-1.prod.properties | 3 +++ .../CassandraAdPerformanceDaoImplTest.java | 2 -- .../asset/cassandra/CassandraImageDaoImplTest.java | 2 -- .../CassandraExactDuplicatesDaoTest.java | 2 -- .../com/gumgum/page/CassandraPageDoaImplTest.java | 2 -- .../cassandra/CassandraVisitorDaoImplTest.java | 2 -- 13 files changed, 39 insertions(+), 58 deletions(-) MODUS OPERANDIMODUS OPERANDI Start using new Cassandra DCs
- 33. RF 3:3:3:1 MODUS OPERANDIMODUS OPERANDI Clients US East Realtime EU West Realtime US East Analytics RF 0:3:3:1
- 34. MODUS OPERANDIMODUS OPERANDI Clients US East Realtime EU West Realtime US East Analytics RF 0:3:3:1 RF 3:3:1Decomission
- 35. TIPSTIPS
- 36. TIPS - AUTOMATED MAINTENANCETIPS - AUTOMATED MAINTENANCE Maintenance in a multi-region C* cluster: Ansible + Cassandra maintenance keyspace + email report = zero human intervention! CREATE TABLE maintenance.history ( dc text, op text, ts timestamp, ip text, PRIMARY KEY ((dc, op), ts) ) WITH CLUSTERING ORDER BY (ts ASC) AND bloom_filter_fp_chance=0.010000 AND caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment='' AND dclocal_read_repair_chance=0.100000 AND gc_grace_seconds=864000 AND read_repair_chance=0.000000 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; CREATE INDEX history_kscf_idx ON maintenance.history (kscf); 3-133-65:/opt/scripts/production/groovy$ groovy CassandraMaintenanceCheck.groovy -dc us-east-va-realtime -op compaction -e mari
- 37. TIPS - SPARKTIPS - SPARK Number of workers above number of total C* nodes in analytics Each worker uses: 1/4 number of cores of each instance 1/3 total available RAM of each instance Cassandra-Spark connector SpanBy .joinWithCassandraTable(:x, :y) Spark.cassandra.output.batch.size.bytes Spark.cassandra.output.concurrent.writes
- 38. val conf = new SparkConf() .set("spark.cassandra.connection.host", cassandraNodes) .set("spark.cassandra.connection.local_dc", "us-east-va-analytics") .set("spark.cassandra.connection.factory", "com.gumgum.spark.bluekai.DirectLinkConnectionFactory") .set("spark.driver.memory","4g") .setAppName("Cassandra presidential candidates app") TIPS - SPARKTIPS - SPARK Create "translator" if using EC2MultiRegionSnitch Spark.cassandra.connection.factory
- 39. SINCE C* IN EU WEST ...SINCE C* IN EU WEST ... US West Datacenter! EU West DC US East DC Analytics DC US West DC
- 40. Q&AQ&A GumGum is hiring! http://gumgum.com/careers
Public clipboards featuring this slide
No public clipboards found for this slide