Cassandra Writes Slowing Down Cluster

A few weeks ago at work, one of our Cassandra clusters slowed down pretty dramatically in response to a large number of writes. The primary cause was that we had set concurrent_compactors in the cassandra.yaml file too high. The chain of events:

  1. The spike in write requests caused Cassandra to flush more SSTables.
  2. More SSTables triggered more compaction.
  3. With concurrent_compactors set so high, so many compactions ran at once that Cassandra's threads were tied up compacting instead of actually handling requests.
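As a rough guard against repeating this, here is a minimal sketch that pulls concurrent_compactors out of the text of a cassandra.yaml and flags it when it exceeds the number of CPU cores. The "more compactors than cores" threshold is my own assumption for illustration, not an official rule, and the function names are made up; the right value depends on your disks and workload.

```python
import os
import re

# Matches an uncommented "concurrent_compactors: N" line.
# A commented-out "#concurrent_compactors: 1" is deliberately ignored.
SETTING = re.compile(
    r"^\s*concurrent_compactors\s*:\s*(\d+)\s*(?:#.*)?$", re.MULTILINE
)


def read_concurrent_compactors(yaml_text):
    """Return the configured value, or None if the setting is absent or commented out."""
    match = SETTING.search(yaml_text)
    return int(match.group(1)) if match else None


def compactors_look_too_high(yaml_text, cores=None):
    """Heuristic (an assumption, not a Cassandra rule): flag values above the core count."""
    cores = cores or os.cpu_count() or 1
    configured = read_concurrent_compactors(yaml_text)
    return configured is not None and configured > cores
```

For example, compactors_look_too_high("concurrent_compactors: 16\n", cores=8) flags the setting, while a file that leaves the line commented out is not flagged, because Cassandra then picks its own default.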

In the future, this can be discovered with:

It took us a little while to figure this out, although a coworker pointed out that iostat probably could've given us some more insight as well. We also found out that the relevant log line to grep for is StatusLogger.java:65, which is Cassandra's generic dump of the thread pool status: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/StatusLogger.java#L65.
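The grep can be taken one step further with a small script that picks out the thread pools with pending work. This is only a sketch: it assumes each pool row in the log looks like "StatusLogger.java:NN - PoolName Active Pending Completed Blocked AllTimeBlocked" with five numeric columns, which may differ between Cassandra versions. The sample lines and the busy_pools name are made up for illustration.

```python
import re
from typing import Iterable, List, NamedTuple


class PoolStatus(NamedTuple):
    name: str
    active: int
    pending: int


# Assumed row shape: pool name followed by five integer columns
# (Active, Pending, Completed, Blocked, All Time Blocked).
# Header rows and non-pool rows do not match and are skipped.
POOL_ROW = re.compile(
    r"StatusLogger\.java:\d+ - (\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def busy_pools(lines: Iterable[str], min_pending: int = 1) -> List[PoolStatus]:
    """Return the pools whose Pending count is at least min_pending."""
    found = []
    for line in lines:
        match = POOL_ROW.search(line)
        if match and int(match.group(3)) >= min_pending:
            found.append(
                PoolStatus(match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return found


# Made-up sample lines in the assumed format, not real output from our cluster.
SAMPLE = [
    "INFO  [Service Thread] 2020-06-01 12:00:00,000 StatusLogger.java:47 - Pool Name    Active   Pending   Completed   Blocked  All Time Blocked",
    "INFO  [Service Thread] 2020-06-01 12:00:00,001 StatusLogger.java:51 - MutationStage    0   0   1234567   0   0",
    "INFO  [Service Thread] 2020-06-01 12:00:00,002 StatusLogger.java:51 - CompactionExecutor    8   120   98765   0   0",
]

if __name__ == "__main__":
    for pool in busy_pools(SAMPLE):
        print(pool.name, "active:", pool.active, "pending:", pool.pending)
```

To run it against a real node, pass an open log file instead of SAMPLE, for example busy_pools(open(path)) where path points at your node's system.log. The regex matches any StatusLogger line number rather than only 65, since the line number is likely to shift between releases.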

Posted: 2020-06-07
Filed Under: tech