
Duplicate events in Kafka coming from the Aerospike connector

Problem Description

After a connector configuration change, duplicate record events are observed in Kafka, even for records that have not been updated.

In the connector logs, warnings or errors similar to any of the following may appear:

2023-06-14 13:28:47.001 GMT WARN NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 4 : {myset=UNKNOWN_TOPIC_OR_PARTITION}
...
2023-06-14 13:29:46.498 GMT ERROR ErrorRegistry - Error stack trace
org.apache.kafka.common.errors.TimeoutException: Topic myset not present in metadata after 60000 ms.
...
2023-06-14 13:40:53.949 GMT ERROR ErrorRegistry - Error stack trace
com.aerospike.connect.outbound.ratelimiter.RateLimitExceeded: Max requests queued 32768 exceeded for dispatcher

Explanation

When using the Aerospike Kafka outbound connector with `routing.mode` configured to route records by namespace, set name, or bin value, the connector determines the Kafka topic dynamically based on the record contents.

In this configuration, the connector expects the destination Kafka topic to already exist.

If the topic does not exist and the Kafka broker has the setting `auto.create.topics.enable=false`, the Kafka producer cannot publish the message and returns a retryable error such as:

UNKNOWN_TOPIC_OR_PARTITION
-or-
Topic <topic_name> not present in metadata

Because these errors are considered retryable, the connector repeatedly retries sending the same event.
As a result, duplicate events may appear once the topic is eventually created or once the retries succeed.

If retries continue to accumulate, the connector may also log queue overflow errors such as:

Max requests queued 32768 exceeded for dispatcher

This indicates that the connector’s retry queue has reached its configured limit due to repeated send failures.
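To find out which topics the connector is failing to reach, the connector log can be scanned for the two message shapes quoted above. The following is a minimal sketch rather than a supported tool: the function name `missing_topics` is hypothetical, and the regular expressions assume the log lines look like the excerpts in this article.

```python
import re

# Patterns based on the two message shapes quoted in this article.
# They are assumptions about the log format, not a documented contract.
UNKNOWN_TOPIC_RE = re.compile(r"([^\s{},=]+)=UNKNOWN_TOPIC_OR_PARTITION")
NOT_IN_METADATA_RE = re.compile(r"Topic (\S+) not present in metadata")


def missing_topics(log_lines):
    """Return the sorted, de-duplicated topic names found in missing-topic errors."""
    found = set()
    for line in log_lines:
        found.update(UNKNOWN_TOPIC_RE.findall(line))
        found.update(NOT_IN_METADATA_RE.findall(line))
    return sorted(found)


# Sample lines copied from the log excerpts in the Problem Description.
sample = [
    "2023-06-14 13:28:47.001 GMT WARN NetworkClient - [Producer clientId=producer-1] "
    "Error while fetching metadata with correlation id 4 : {myset=UNKNOWN_TOPIC_OR_PARTITION}",
    "org.apache.kafka.common.errors.TimeoutException: "
    "Topic myset not present in metadata after 60000 ms.",
    "com.aerospike.connect.outbound.ratelimiter.RateLimitExceeded: "
    "Max requests queued 32768 exceeded for dispatcher",
]

print(missing_topics(sample))
```

With a real log, the same function can be called on an open file object, for example `missing_topics(open(path))`, where `path` points to wherever your deployment writes the connector log.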


Solution

Ensure that all Kafka topics required by the connector's routing configuration exist before starting the connector.

For example, if routing by set name is used, create the corresponding topics in Kafka:

kafka-topics.sh --create --topic myset --bootstrap-server <broker> --partitions <n> --replication-factor <n>
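When several sets are routed, the same command has to be repeated once per topic. The sketch below generates those commands from a list of names. It assumes the topic name equals the set name, as in the `myset` example above. The set names, broker address, partition count, and replication factor are placeholder values, and the helper `create_topic_command` is hypothetical.

```python
import shlex


def create_topic_command(topic, bootstrap_server, partitions, replication_factor):
    """Build the kafka-topics.sh argument list shown above for one topic."""
    return [
        "kafka-topics.sh", "--create",
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


# Placeholder values: replace with your own set names and cluster settings.
set_names = ["myset", "users"]
for name in set_names:
    # shlex.join renders the argument list as a shell-safe command line.
    print(shlex.join(create_topic_command(name, "localhost:9092", 3, 1)))
```

The printed lines can be reviewed and then run by hand, which keeps partition and replication choices explicit for each topic.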

Alternatively, Kafka can be configured to automatically create topics by enabling:

auto.create.topics.enable=true

This approach has trade-offs.
Automatically created topics will use Kafka’s default configuration values for partition count and replication factor, which may not match your production requirements.

For this reason, many deployments prefer to explicitly create topics with the desired configuration before running the connector.

After creating the missing topics, verify that the connector logs no longer show metadata or routing errors.
Once the topics exist, the retry loop will stop and duplicate event generation should cease.
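As an additional check before restarting the connector, the topics required by the routing configuration can be compared against the topics that exist on the cluster. The sketch below assumes the existing topic names have been captured as text, one per line, for example from `kafka-topics.sh --list --bootstrap-server <broker>`. The function name and the sample values are hypothetical.

```python
def missing_required_topics(required, list_output):
    """Return the required topics that do not appear in the captured topic listing."""
    existing = {line.strip() for line in list_output.splitlines() if line.strip()}
    return sorted(set(required) - existing)


# Placeholder data: one topic name per line, as captured from the cluster.
list_output = "__consumer_offsets\nmyset\n"
required = ["myset", "users"]

# Any name printed here still needs to be created before the connector starts.
print(missing_required_topics(required, list_output))
```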


Applies To Earliest Version

Current Version

Applies To Latest Version

Current Version