Recent Cloud Issue Analysis

Issue explanation


The message queue service on ecCLOUD was overloaded, which caused slow activity reports and slow configuration changes.

Solution


1. Increase hardware resources for the statistics service to raise the capacity of the message queue service.
2. Adjust the internal timeout mechanism so that timeouts no longer trigger service restarts.

Affected services:


1. Device/Site/Cloud level statistics reports
   Devices send statistics messages periodically. The previous resource allocation in ecCLOUD could not handle a large spike of messages, which caused statistics to be reported slowly.
2. Connections and actions between devices and ecCLOUD
   The service responsible for connections and actions between ecCLOUD and devices was also impacted, so some devices became unreachable in ecCLOUD.

Impacted start date: 2022/6/21 18:00 UTC

Impacted end date: 2022/6/23 12:00 UTC

We will continue to track and monitor the status of the ecCLOUD service.

Thank you.