ISSUE: OBM/OpsB node performance data not populating in the OPTIC Data Lake

 

Issue Description

When using Containerized Operations Bridge (OpsB), performance data may not populate in the OPTIC Data Lake, which causes BVD reports to show no data or fail to generate.

This issue is commonly caused by a Pulsar topic backlog or an exceeded backlog quota in the postload pipeline.
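If you want to confirm the backlog condition before changing anything, you can inspect the topic statistics and the namespace backlog quota from the Pulsar bastion pod (Step 2 below shows how to open a shell there). This is only an optional diagnostic sketch; the topic and namespace names are the ones used later in this article, so adjust them if your deployment differs.

# Per-subscription backlog for the event topic (look for a large msgBacklog value)
/pulsar/bin/pulsar-admin topics stats persistent://public/default/opr_event-partition-0

# Configured backlog quota for the namespace
/pulsar/bin/pulsar-admin namespaces get-backlog-quotas public/default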

Impact

·         No performance metrics in OPTIC Data Lake

·         BVD dashboards/reports show empty or missing data

·         Postload tasks stuck or failing

High-Level Fix

1.      Stop postload pods

2.      Clear Pulsar backlog

3.      Delete postload topics

4.      Restart postload pods

5.      Validate data flow

Step-by-Step Resolution Procedure

Step 1: Stop Postload Pods

Why:
Stops new messages from entering Pulsar while we clean up backlog and topics.

Command:

kubectl scale deployment itom-di-postload-taskcontroller --replicas=0 -n opsb-helm
 
kubectl scale deployment itom-di-postload-taskexecutor --replicas=0 -n opsb-helm

What to check:

kubectl get pods -n opsb-helm | grep postload

Pods should move to Terminating and then disappear.

📸 Screenshot to capture:

·         Output showing replicas scaled to 0/0
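As an optional extra check (not part of the original procedure), you can also confirm the replica counts on the deployments themselves rather than grepping the pod list:

kubectl get deployment itom-di-postload-taskcontroller itom-di-postload-taskexecutor -n opsb-helm

Both deployments should report 0/0 ready replicas.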

Step 2: Access Pulsar Bastion Pod

Why:
All Pulsar admin commands must be executed from the Pulsar bastion container.

Command:

kubectl exec -ti itomdipulsar-bastion-0 -n opsb-helm -c pulsar -- bash

Expected Result:
You should land inside the container shell:

pulsar@itomdipulsar-bastion-0:/pulsar>

📸 Screenshot to capture:

·         Successful login to bastion pod
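Optionally, you can run a quick sanity check from inside the bastion shell before making any changes; this is an extra assumption on our part, not a documented requirement:

# Confirm the admin CLI can reach a healthy broker
bin/pulsar-admin brokers healthcheck

The command should complete without errors.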

Step 3: Clear Pulsar Backlog (Quota Exceeded)

Why:
When the backlog quota is exceeded, Pulsar stops accepting new messages on the topic, which halts performance data ingestion; clearing the backlog frees quota space so the data flow can resume.

Command:

/pulsar/bin/pulsar-admin topics clear-backlog \
persistent://public/default/opr_event-partition-0 \
-s 0_itom_di_scheduler_provider_default_opr_event

What this does:

·         Clears pending messages for the specified subscription

·         Frees Pulsar backlog space

📸 Screenshot to capture:

·         Command execution with no error output
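To confirm the backlog was actually cleared (an optional check), re-run the topic statistics and verify that msgBacklog for the 0_itom_di_scheduler_provider_default_opr_event subscription is now 0 or steadily decreasing:

/pulsar/bin/pulsar-admin topics stats persistent://public/default/opr_event-partition-0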

Step 4: Delete Postload Topics

Why:
Corrupted or stuck topics prevent postload processing.
Deleting them allows OpsB to recreate fresh topics automatically.

Run all commands from the bastion pod:

bin/pulsar-admin topics delete-partitioned-topic -f \
persistent://public/itomdipostload/di_internal_postload_state
 
bin/pulsar-admin topics delete-partitioned-topic -f \
persistent://public/itomdipostload/di_postload_task_status_topic
 
bin/pulsar-admin topics delete-partitioned-topic -f \
persistent://public/itomdipostload/di_postload_task_topic

Expected Result:

·         No errors

·         Topics deleted successfully

📸 Screenshot to capture:

·         Each delete command output
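As an optional verification, list the partitioned topics in the postload namespace; the three topics deleted above should no longer appear in the output:

bin/pulsar-admin topics list-partitioned-topics public/itomdipostload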

Step 5: Start Postload Pods

Why:
Restarting postload components recreates topics and resumes data flow.

Command:

kubectl scale deployment itom-di-postload-taskcontroller --replicas=2 -n opsb-helm
kubectl scale deployment itom-di-postload-taskexecutor --replicas=2 -n opsb-helm

Verify:

kubectl get pods -n opsb-helm | grep postload

Pods should be in Running state.
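Instead of polling the pod list, you can also wait for each rollout to finish; this is an optional convenience, not a required step:

kubectl rollout status deployment/itom-di-postload-taskcontroller -n opsb-helm
kubectl rollout status deployment/itom-di-postload-taskexecutor -n opsb-helm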

Step 6: Validation Checks (Important)

Pods Status

kubectl get pods -n opsb-helm

Pulsar Topics Recreated

Topics should auto-recreate once pods start.
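As an optional check, you can list the partitioned topics from the bastion pod; the topics deleted in Step 4 should reappear once the postload pods are running:

bin/pulsar-admin topics list-partitioned-topics public/itomdipostload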

OPTIC Data Lake

·         Wait 10–20 minutes

·         Confirm new performance data ingestion

BVD Reports

·         Refresh dashboard

·         Confirm metrics are now visible
