Sentinel (Germain UX's self-monitoring)
Feature
Sentinel is a monitoring system, spun off from Germain UX, that monitors Germain UX itself to ensure its high availability and performance. The Sentinel script performs the following tasks:
Monitoring features:
Connect to Germain infrastructure services (Zookeeper, ActiveMQ) and record availability and outages, as well as metrics (e.g., queue usage).
Monitor a configurable list of OS services and report on their availability, CPU usage, and memory usage.
Monitor a configurable list of log files: check the last write time, and check for errors or warnings against a configurable threshold.
Monitor configurable HTTP endpoints and report their status and response.
Summarize all findings in a single email report. The conditions under which the report email is sent are configurable.
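The log file check described above can be illustrated with a minimal sketch. This is not the Sentinel implementation: the function name check_log, the parameters max_idle_seconds and max_errors, and the ERROR / WARN markers it searches for are all assumptions made for illustration.

```python
import os
import re
import time
from dataclasses import dataclass


@dataclass
class LogCheckResult:
    path: str
    seconds_since_write: float
    error_count: int
    warning_count: int
    stale: bool            # True if the log has not been written to recently
    too_many_errors: bool  # True if errors exceed the configured threshold


def check_log(path, max_idle_seconds=600, max_errors=0, now=None):
    """Check a log file's last write time and count error/warning lines.

    Hypothetical helper: the thresholds and the ERROR/WARN markers are
    assumptions, not Sentinel's actual configuration keys.
    """
    now = time.time() if now is None else now
    seconds_since_write = now - os.path.getmtime(path)

    error_count = 0
    warning_count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if re.search(r"\bERROR\b", line):
                error_count += 1
            elif re.search(r"\bWARN(?:ING)?\b", line):
                warning_count += 1

    return LogCheckResult(
        path=path,
        seconds_since_write=seconds_since_write,
        error_count=error_count,
        warning_count=warning_count,
        stale=seconds_since_write > max_idle_seconds,
        too_many_errors=error_count > max_errors,
    )
```

A caller would run check_log once per configured log file and pass the results on to the report. The optional now argument exists only to make the check deterministic in tests.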
The report categorizes the status of each software feature as follows:
RED: Indicates a software feature that is broken, failing, or unavailable.
ORANGE: Indicates a software feature that is slow or experiencing errors.
GREEN: Indicates a software feature that is available and fast.
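As a rough illustration of how check results could map onto these three categories, and how a "send the report only under certain conditions" rule might be expressed, consider the sketch below. The names Status, classify, overall_status, and should_send_report, and the thresholds max_errors and slow_seconds, are hypothetical; Sentinel's actual rules and configuration options may differ.

```python
from enum import IntEnum


class Status(IntEnum):
    # Ordered by severity so that max() picks the worst status.
    GREEN = 0
    ORANGE = 1
    RED = 2


def classify(available, error_count=0, response_seconds=0.0,
             max_errors=0, slow_seconds=2.0):
    """Map one check result to a status (thresholds are assumptions)."""
    if not available:
        return Status.RED      # broken, failing, or unavailable
    if error_count > max_errors or response_seconds > slow_seconds:
        return Status.ORANGE   # slow or experiencing errors
    return Status.GREEN        # available and fast


def overall_status(statuses):
    """The report is as bad as its worst individual check."""
    return max(statuses, default=Status.GREEN)


def should_send_report(statuses, send_at_or_above=Status.ORANGE):
    """Send the email only when the worst status reaches the configured level."""
    return overall_status(statuses) >= send_at_or_above
```

With send_at_or_above set to Status.GREEN the report is always sent; with Status.RED it is sent only when at least one check is failing.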
The attached example provides a sample report sent by the Sentinel script.
Please note that the specific details and format of the report may vary based on the implementation and configuration of the Sentinel system.
Example of a report sent by the Sentinel script:
Status | Germain Service | Check | Info |
GermainEngineManager-apsep03050 | LogActivity | | |
ActiveMQ | AvailabilityCheck | Status: Running, PID(1296 9708) | |
ActiveMQ | BrokerStats | localhost | Temp Percent: 0 | MemoryPercent: 0 | StorePercent: 0 | |
ActiveMQ | QueueStats | apm.action | QueueSize: 0 | ConsumerCount: 1 | EnqueueCount: 28656 | DequeueCount: 28656 | |
ActiveMQ | QueueStats | apm.analytics | QueueSize: 0 | ConsumerCount: 1 | EnqueueCount: 15729162 | DequeueCount: 15729162 | |
ActiveMQ | QueueStats | apm.session | QueueSize: 0 | ConsumerCount: 1 | EnqueueCount: 0 | DequeueCount: 0 | |
ActiveMQ | QueueStats | apm.storage | QueueSize: 0 | ConsumerCount: 2 | EnqueueCount: 7480961 | DequeueCount: 7480961 | |
ActiveMQ | QueueStats | apm.storage.analytics | QueueSize: 0 | ConsumerCount: 2 | EnqueueCount: 189809 | DequeueCount: 189809 | |
GermainActionServices | AvailabilityCheck | Status: Running, PID(1636) | |
GermainActionServices | LogActivity | | |
GermainAggregatorServices | AvailabilityCheck | Status: Running, PID(6720) | |
GermainAggregatorServices | LogActivity | | |
GermainAnalyticsServices | AvailabilityCheck | Status: Running, PID(1624) | |
GermainAnalyticsServices | LogActivity | | |
GermainAPMConfigServices | EndpointAvailability | Rest Endpoint Response Code: 200 | |
GermainAPMConfigServices | EndpointAvailability | Rest Endpoint Response Code: 200 | |
GermainAPMConfigServices-apsep02522 | LogActivity | | |
GermainAPMConfigServices-apsep02523 | LogActivity | | |
GermainAPMEnginesProd | AvailabilityCheck | Status: Running, PID(16580 4704) | |
GermainAPMIngestionServices-apsep02522 | LogActivity | | |
GermainAPMIngestionServices-apsep02523 | LogActivity | | |
GermainAPMQueryServices | EndpointAvailability | Rest Endpoint Response Code: 200 | |
GermainAPMQueryServices | EndpointAvailability | Rest Endpoint Response Code: 200 | |
GermainAPMQueryServices-apsep02522 | LogActivity | | |
GermainAPMQueryServices-apsep02523 | LogActivity | | |
GermainEngineManager-apsep03069 | LogActivity | | |
GermainEngineManagerProd | AvailabilityCheck | Status: Running, PID(121228) | |
GermainSessionTrackingServices | AvailabilityCheck | Status: Running, PID(1060) | |
GermainSessionTrackingServices | LogActivity | | |
GermainStorageServices | AvailabilityCheck | Status: Running, PID(1648) | |
GermainStorageServices | LogActivity | | |
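For readers who want to post-process such a report, the QueueStats rows above can be read mechanically. The following sketch assumes the pipe-separated layout shown in the sample (service, check, queue name, then "Name: value" pairs). The helper names parse_queue_stats and queue_looks_healthy and the max_queue_size threshold are hypothetical and not part of Sentinel.

```python
def parse_queue_stats(row):
    """Split a QueueStats report row into its service, check, queue, and metrics."""
    # Drop empty cells such as the trailing "| |" in the sample report.
    cells = [cell.strip() for cell in row.split("|")]
    cells = [cell for cell in cells if cell]

    service, check, queue = cells[0], cells[1], cells[2]
    metrics = {}
    for cell in cells[3:]:
        name, _, value = cell.partition(":")
        metrics[name.strip()] = int(value)
    return {"service": service, "check": check, "queue": queue, "metrics": metrics}


def queue_looks_healthy(metrics, max_queue_size=1000):
    """Assumed rule of thumb: at least one consumer and a small pending queue."""
    return metrics["ConsumerCount"] > 0 and metrics["QueueSize"] <= max_queue_size


row = ("ActiveMQ | QueueStats | apm.action | QueueSize: 0 | ConsumerCount: 1 "
       "| EnqueueCount: 28656 | DequeueCount: 28656 | |")
parsed = parse_queue_stats(row)
print(parsed["queue"], queue_looks_healthy(parsed["metrics"]))
```

In the sample report every queue has a QueueSize of 0 and at least one consumer, with EnqueueCount equal to DequeueCount, which is consistent with messages being consumed as fast as they arrive.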
Service: Enterprise
Feature Availability: 2023.1