[SUPPLY BANK] We are currently experiencing a performance degradation issue affecting the platform.

Incident Report for Monkey Exchange

Resolved

We experienced instability in Kubernetes CoreDNS that caused internal name resolution to fail. As a consequence, our alerting pipeline was also impacted, because alerts rely on webhooks that could not be delivered while name resolution was failing. As improvement actions, we will deploy CoreDNS as a DaemonSet so that a DNS instance runs on every node, improving resilience and availability across the cluster. We will also add an internal DNS probe within the cluster to monitor DNS-related failures and provide early detection even when CoreDNS is degraded.
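
As an illustration of the kind of internal DNS probe described above, here is a minimal sketch in Python. It is not our production implementation: the names `probe_dns`, `ProbeResult`, and `should_alert`, the three-failure threshold, and the probed hostname are all assumptions made for this example.

```python
import socket
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProbeResult:
    hostname: str
    ok: bool
    latency_s: float
    error: Optional[str] = None


def probe_dns(hostname: str, resolver: Callable = socket.getaddrinfo) -> ProbeResult:
    """Resolve one hostname and record success, latency, and any error.

    The resolver is injectable so the probe can be exercised without a network.
    """
    start = time.monotonic()
    try:
        resolver(hostname, None)
        return ProbeResult(hostname, True, time.monotonic() - start)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return ProbeResult(hostname, False, time.monotonic() - start, str(exc))


def should_alert(results: List[ProbeResult], threshold: int = 3) -> bool:
    """Return True when the most recent `threshold` probes all failed.

    The threshold of 3 is an assumed value chosen to ignore one-off failures.
    """
    recent = results[-threshold:]
    return len(recent) == threshold and all(not r.ok for r in recent)


if __name__ == "__main__":
    # Hypothetical target: the in-cluster name of the Kubernetes API service.
    print(probe_dns("kubernetes.default.svc.cluster.local"))
```

A probe like this would run on a schedule inside the cluster and feed its results to `should_alert`. How the alert is delivered is left out of the sketch; given this incident, the delivery path should not itself depend on in-cluster name resolution.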

Posted Dec 08, 2025 - 12:27 GMT-03:00

Update

Communication between our services and the workload manager has been restored, and the environment is stable. We continue to monitor the situation and will update this page soon with more information about the incident.

Posted Dec 08, 2025 - 12:05 GMT-03:00

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Dec 08, 2025 - 11:59 GMT-03:00

Update

We have identified a communication failure between some of our servers and our workload manager, which affected the application distribution process.

Posted Dec 08, 2025 - 11:38 GMT-03:00

Identified

The issue has been identified and a fix is being implemented.

Posted Dec 08, 2025 - 11:22 GMT-03:00

Investigating

We are currently investigating this issue.

Posted Dec 08, 2025 - 11:07 GMT-03:00

This incident affected: PORTAL, OPERATIONS, API, INTEGRATION, WEBHOOKS, NOTIFICATION, and REPORT.