Answer
What are the Key Advantages of running with a higher replication factor value?
Aerospike keeps as many copies of each record as the value of replication-factor, distributed across different nodes, which ensures high availability of all data in the cluster. This redundancy is beneficial when nodes fail. For example, with a replication-factor of 3, Aerospike keeps three copies of the data. This allows full data availability if two nodes fail simultaneously (or if a node fails while migrations triggered by a previous node failure are still in progress).
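The relationship between replication-factor and tolerated failures can be sketched in a few lines of Python. This is a simplified model, not Aerospike code: the function name `tolerated_node_failures` is hypothetical, and the model ignores migrations in progress, rack awareness, and strong-consistency rules. It only counts how many nodes can be lost while at least one copy of every record remains.

```python
def tolerated_node_failures(replication_factor: int, cluster_size: int) -> int:
    """Simultaneous node failures that still leave one copy of every record.

    Simplified model (assumption): each record has one copy on each of
    `replication_factor` distinct nodes, capped by the number of nodes.
    """
    if replication_factor < 1 or cluster_size < 1:
        raise ValueError("replication_factor and cluster_size must be >= 1")
    copies = min(replication_factor, cluster_size)
    return copies - 1


# With three copies, two nodes can fail at the same time.
print(tolerated_node_failures(replication_factor=3, cluster_size=5))  # 2
```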
What are the Key Disadvantages of running with a higher replication factor value?
Capacity needs increase with the replication factor value, which increases the total cost of the cluster. For example, with a replication-factor of 3, the Aerospike cluster requires three times the storage capacity of a cluster with a replication-factor of 1. Changing from a replication-factor of 2 to 3 therefore increases storage by 50%, and the IO incurred by write transactions also increases by 50%.
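The arithmetic above can be checked with a short Python sketch. The helper names and the 400 GB figure are hypothetical, and the calculation only scales raw data size by the number of copies; real sizing also has to account for overhead such as indexes, defragmentation headroom, and high-water marks.

```python
def raw_capacity_needed(unique_data_gb: float, replication_factor: int) -> float:
    """Raw storage needed to hold every copy of the data (overhead ignored)."""
    return unique_data_gb * replication_factor


def relative_increase(old_rf: int, new_rf: int) -> float:
    """Fractional growth in storage (and replica write IO) when changing RF."""
    return (new_rf - old_rf) / old_rf


# Hypothetical data set of 400 GB of unique records.
print(raw_capacity_needed(400, 1))  # 400
print(raw_capacity_needed(400, 3))  # 1200
print(relative_increase(2, 3))      # 0.5
```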
Performance also suffers when increasing the replication-factor value.
It is suggested to benchmark the performance drop before attempting this change on a production cluster.
The penalty is primarily due to the additional intra-cluster communication incurred by synchronous replication: both the fabric traffic and the SSD write load increase. Raising replication-factor always costs some write performance, because every additional replica has to be written. The penalty applies to writes only, not to reads.
Overall write performance decreases because each additional replica means an additional data write on another node. Ideally, the replica writes are done in parallel, but the replies may come back at different times, and the write completes only after every replica node has replied, so it is as slow as the slowest replica.
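A rough way to picture why parallel replica writes still slow a transaction down is the Python sketch below. It is a toy model under stated assumptions, not a description of Aerospike internals: the function name is hypothetical, the latencies are made-up illustrative values in microseconds, and the model simply assumes the master waits for the slowest replica acknowledgement.

```python
def replicated_write_latency_us(master_us: int, replica_round_trips_us: list[int]) -> int:
    """Toy model: local write time plus the slowest parallel replica reply."""
    # Replicas are written in parallel, so only the slowest one matters.
    return master_us + max(replica_round_trips_us, default=0)


# Illustrative numbers only (assumed, not measured).
print(replicated_write_latency_us(300, []))          # replication-factor 1
print(replicated_write_latency_us(300, [500]))       # replication-factor 2
print(replicated_write_latency_us(300, [500, 700]))  # replication-factor 3
```

In this model, each extra replica can only keep the latency the same or make it worse, which is why benchmarking with realistic traffic is the safer way to estimate the actual drop.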
Having a replication factor higher than the number of nodes
Setting a replication factor higher than the number of nodes in an AP namespace lets the number of replicas grow with the number of nodes in the cluster, which is useful if you need every node in the cluster to hold a copy of the data. There is no direct configuration such as replication-factor all to specify all nodes in the cluster. In the extreme case where you want every node in the cluster to hold a replica, set replication-factor to 128. The value of 128 assumes that your cluster is smaller than 128 nodes. Up to AS 4.9, cluster size was limited to 128 nodes; from AS 5 onward, the limit is 256.
In an SC namespace, setting the replication factor higher than the number of nodes in the cluster results in the following error: error 11 (AEROSPIKE_ERR_CLUSTER)
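The AP behaviour described above amounts to capping the configured value at the cluster size. The Python sketch below illustrates that reading; the function name and the `strong_consistency` flag are hypothetical, and the `ValueError` is only a stand-in for the error 11 (AEROSPIKE_ERR_CLUSTER) condition reported for SC namespaces.

```python
def effective_replication_factor(configured_rf: int, cluster_size: int,
                                 strong_consistency: bool = False) -> int:
    """Number of copies actually kept, under the simplified reading above."""
    if strong_consistency and configured_rf > cluster_size:
        # Stand-in for error 11 (AEROSPIKE_ERR_CLUSTER) in an SC namespace.
        raise ValueError("replication-factor exceeds cluster size in SC namespace")
    # AP namespace: copies cannot outnumber the nodes that hold them.
    return min(configured_rf, cluster_size)


# With replication-factor 128, replicas track the cluster size as it grows.
print(effective_replication_factor(128, 5))  # 5
print(effective_replication_factor(128, 8))  # 8
```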
Notes
For AP namespaces in server version 6.0 and later, replication-factor can be changed dynamically. A recluster command is required for the change to take effect. For server versions prior to 6.0, changes to replication-factor require a full cluster restart.