Cassandra Error Too many pending remote requests

Author: ZeroPoints

Nodes will frequently be marked down with the following message:

ERROR [MessagingService-Incoming-/192.168.1.34] 2019-09-12 10:52:54,054  MessagingService.java:823 - java.util.concurrent.RejectedExecutionException while receiving WRITES.WRITE from /192.168.1.34, caused by: Too many pending remote requests!

I have tried narrowing the problem down and I think it's memtable related, because when I run nodetool tpstats it shows TPC/all/WRITE_MEMTABLE_FULL with approximately 4000 active tasks.
The OpsCenter graphs show the same picture: TP: Write Memtable Full Active is at approximately 4000, TP: Gossip Tasks Pending keeps increasing, and OS: CPU User sits at 80%.
I have also noticed that the commit log directory exceeds the 16GB limit we set; on one node I have seen 7911 files (248GB).

The nodes were previously configured as follows:
cassandra-env.sh

MAX_HEAP_SIZE="64G"
HEAP_NEWSIZE="16G"

cassandra.yaml

commitlog_segment_size_in_mb: 32
memtable_heap_space_in_mb: 4096
memtable_offheap_space_in_mb: 4096
memtable_cleanup_threshold: 0.5
commitlog_total_space_in_mb: 16384
memtable_flush_writers: 2
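
As a rough sanity check on the commit log numbers, the sketch below compares the 7911 files seen on one node with the configured commitlog_total_space_in_mb. It assumes every file in the directory is a full 32 MB segment (which may not hold exactly), and the function name commitlog_usage is only illustrative.

```python
def commitlog_usage(file_count, segment_size_mb, total_space_mb):
    """Estimate commit log disk usage relative to the configured cap.

    Assumption: every file is a full segment of segment_size_mb.
    """
    used_mb = file_count * segment_size_mb
    return {
        "used_mb": used_mb,
        "used_gib": used_mb / 1024,
        # Number of segments that would fit under the configured cap.
        "expected_max_segments": total_space_mb // segment_size_mb,
        # How many times larger the directory is than the cap.
        "overshoot_factor": used_mb / total_space_mb,
    }


# Values taken from the post: 7911 files, 32 MB segments, 16384 MB cap.
usage = commitlog_usage(7911, 32, 16384)
for key, value in usage.items():
    print(f"{key}: {value}")
```

Under that assumption the directory holds roughly 247 GiB, which is consistent with the 248GB observed and is about 15 times the configured limit. That suggests segments are not being recycled, which would fit with memtable flushes falling behind.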

I am currently trying these new settings on one of the nodes to see if they fix the server's performance:

memtable_heap_space_in_mb: 10240
memtable_offheap_space_in_mb: 2048
memtable_cleanup_threshold: 0.11
memtable_flush_writers: 8
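
For context on the new memtable_cleanup_threshold value: the cassandra.yaml documentation gives its default as 1 / (memtable_flush_writers + 1). The sketch below works that out for both sets of settings and estimates the on-heap memtable size at which a flush would be triggered. The flush estimate assumes the threshold is applied as a simple fraction of memtable_heap_space_in_mb, which is a simplification, and the function names are only illustrative.

```python
def default_cleanup_threshold(flush_writers):
    """Documented default: 1 / (memtable_flush_writers + 1)."""
    return 1.0 / (flush_writers + 1)


def approx_flush_trigger_mb(space_mb, cleanup_threshold):
    """Assumption: a flush starts once occupied memtable space
    exceeds cleanup_threshold * configured space."""
    return space_mb * cleanup_threshold


# (flush writers, configured threshold, on-heap space in MB) from the post.
settings = {
    "old": (2, 0.5, 4096),
    "new": (8, 0.11, 10240),
}

for name, (writers, threshold, heap_mb) in settings.items():
    default = default_cleanup_threshold(writers)
    trigger = approx_flush_trigger_mb(heap_mb, threshold)
    print(f"{name}: default threshold {default:.3f}, "
          f"configured {threshold}, flush at ~{trigger:.0f} MB on-heap")
```

Read this way, the old threshold of 0.5 was higher than the default of about 0.33 for 2 flush writers, so flushes started later than they would have by default. The new value of 0.11 matches the default for 8 writers.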

Does anyone have any ideas on what else I can look at?
Also, how do I view the MemtablePool.BlockedOnAllocation metric that is mentioned in the memtable_flush_writers section of http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html

Originally Sourced from: https://stackoverflow.com/questions/57898557/cassandra-error-too-many-pending-remote-requests