Description
We have an internal Python API that fetches all ACLs from a Kafka cluster.
Previously, we used a ZooKeeper-based API for this purpose. As ZooKeeper is being deprecated in favor of KRaft mode, we are migrating to the Confluent Kafka Python API.
However, when using confluent_kafka.admin.AdminClient.describe_acls(acl_filter), we observe that:
- The call is significantly slower than both the Java AdminClient API and our earlier ZooKeeper-based Python API.
- On the same cluster and ACL dataset, the Java AdminClient or the ZooKeeper API returns results in under 1 second, whereas the Confluent API takes over 5 seconds.
- When the ACL count exceeds ~55,000 entries, the Confluent API call frequently terminates with a segmentation fault.
Reproducer code (note: my Kafka cluster has around 250,000 (2.5 lakh) ACLs, and nothing suspicious was found in the broker logs):
from confluent_kafka.admin import (
    AdminClient,
    AclBindingFilter,
    ResourcePatternType,
    ResourceType,
    AclOperation,
    AclPermissionType,
)
import time

# Replace with your Kafka bootstrap servers
BOOTSTRAP_SERVERS = "localhost:9092"

client = AdminClient({
    'bootstrap.servers': BOOTSTRAP_SERVERS,
    'security.protocol': 'SASL_PLAINTEXT',
    'sasl.mechanism': 'GSSAPI',
    'sasl.kerberos.service.name': 'kafka',
})

# Match-everything filter: every field is ANY or None
acl_filter = AclBindingFilter(
    restype=ResourceType.ANY,
    name=None,
    resource_pattern_type=ResourcePatternType.ANY,
    principal=None,
    host=None,
    operation=AclOperation.ANY,
    permission_type=AclPermissionType.ANY
)

start = time.time()
try:
    # Blocks until the DescribeAcls response has been received and converted
    acls = client.describe_acls(acl_filter).result()
    print(f"Fetched {len(acls)} ACLs")
except Exception as e:
    print(f"Error fetching ACLs: {e}")
end = time.time()
print(f"Time taken: {end - start:.2f} seconds")
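A segmentation fault kills the whole interpreter, so the `except Exception` clause in the reproducer never gets a chance to report it. One way to capture the crash from a surviving parent process is to run the script in a child interpreter with `faulthandler` enabled. The following is a minimal sketch using only the standard library; the helper names `run_isolated` and `describe_exit` are made up for illustration, and the signal decoding assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 120.0) -> tuple[int, str]:
    """Run `code` in a child interpreter and return (returncode, stderr).

    `-X faulthandler` asks the child to dump a Python traceback to stderr
    if it receives a fatal signal such as SIGSEGV.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe_exit(returncode: int) -> str:
    """Translate a child's return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # the signal with that number.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

For example, if the reproducer were saved to a file (say a hypothetical `repro.py`), `run_isolated(open("repro.py").read())` would return the exit code and whatever traceback `faulthandler` managed to write, which could then be attached to this report.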
- Confluent Kafka Version: 2.8.0
- Librdkafka Version: 2.2.0
- Kafka Cluster Version: 3.9.0
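To help narrow down whether the slowdown and crash depend on response size, the match-everything query could be split into one `describe_acls` call per resource type. The sketch below is a diagnostic idea, not a confirmed workaround: whether smaller responses avoid the segmentation fault is an assumption that would need testing. `count_acls_by_type` and `make_any_filter` are hypothetical helper names; the filter fields mirror the reproducer above, and `request_timeout` is passed as a keyword argument to `describe_acls`.

```python
from collections import Counter


def count_acls_by_type(client, resource_types, make_filter, timeout=60.0):
    """Issue one describe_acls call per resource type and count the results.

    `client` is expected to behave like confluent_kafka.admin.AdminClient;
    `make_filter` builds the filter object for a single resource type.
    """
    counts = Counter()
    for restype in resource_types:
        future = client.describe_acls(make_filter(restype), request_timeout=timeout)
        # result() blocks until this (smaller) response has been processed.
        counts[restype] = len(future.result())
    return counts


def make_any_filter(restype):
    """Build a match-everything filter restricted to one resource type."""
    # Imported lazily so count_acls_by_type stays usable without the client installed.
    from confluent_kafka.admin import (
        AclBindingFilter,
        AclOperation,
        AclPermissionType,
        ResourcePatternType,
    )
    return AclBindingFilter(
        restype=restype,
        name=None,
        resource_pattern_type=ResourcePatternType.ANY,
        principal=None,
        host=None,
        operation=AclOperation.ANY,
        permission_type=AclPermissionType.ANY,
    )
```

With the `client` from the reproducer, a call such as `count_acls_by_type(client, [ResourceType.TOPIC, ResourceType.GROUP, ResourceType.BROKER], make_any_filter)` would show how many ACLs each type contributes and which call, if any, is slow or crashes.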