Monitoring & Analytics - Real-time Metrics
Germain’s previous query services were good at answering historical questions for a given time range, but were not geared towards answering questions about the current state of a monitored entity.
Example questions that this new Real-time Metric service can address:
What is the current state of server X or database Y?
How many users are currently active on our prod environment?
How many users are currently impacted by the crash of component Z?
How many JS errors were there in the last minute?
Which engines are currently collecting data?
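A question such as "How many JS errors were there in the last minute?" amounts to counting events inside a sliding time window. The following is a minimal sketch of that idea, not Germain's implementation: the class name SlidingWindowCounter, its methods, and the sample timestamps are all hypothetical, and events are assumed to arrive in non-decreasing timestamp order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events whose timestamps fall within the last `window_seconds`.

    Hypothetical helper for illustration only; not part of any Germain API.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        # Assumption: timestamps arrive in non-decreasing order.
        self._timestamps.append(timestamp)

    def count(self, now):
        # Evict events at or before the window cutoff, then count what is left.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


# Made-up JS error timestamps, in seconds.
js_errors = SlidingWindowCounter(window_seconds=60)
for t in (0.0, 10.0, 45.0, 70.0):
    js_errors.record(t)

errors_last_minute = js_errors.count(now=75.0)
print(errors_last_minute)  # 2
```

Because old timestamps are evicted as the window advances, memory use stays proportional to the number of events in the window rather than to the full history.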
In other words, this new Real-time Metric service can provide:
Status of Monitored Infrastructure
Show the availability of a server over the last 30 s, 1 min, or 5 min in the configuration / management view
KPI Data Counts
Show the KPIs for which data is being collected, with a count for each
Real-time Gauge Portlet
Gauge control that is updated every few seconds to show the current average (e.g., number of concurrent users)
Dynamic Top-n Lists
Dynamically updating list of data grouped by some dimension; the ranking updates in real time as positions and average response times change
List of slowest user sessions in the past 5 minutes
List of processes on a server by CPU usage (updated every 5 seconds, similar to the Linux top command)
Live Network Map
Map of network requests that updates in real-time as traffic patterns change
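To make the dynamic top-n idea concrete, the sketch below ranks user sessions by average response time over the past 5 minutes and shows the ranking changing as the window moves. It is an illustration under stated assumptions, not the service's actual logic: the function slowest_sessions, the tuple layout (timestamp, session_id, response_time_ms), and the sample data are all hypothetical.

```python
from collections import defaultdict


def slowest_sessions(samples, now, window_seconds=300.0, n=3):
    """Return the n sessions with the highest average response time in the window.

    `samples` is an iterable of (timestamp, session_id, response_time_ms) tuples.
    Hypothetical helper for illustration only.
    """
    totals = defaultdict(lambda: [0.0, 0])  # session_id -> [sum, count]
    for timestamp, session_id, response_ms in samples:
        # Keep only samples inside the window (now - window_seconds, now].
        if now - window_seconds < timestamp <= now:
            totals[session_id][0] += response_ms
            totals[session_id][1] += 1
    averages = {sid: total / count for sid, (total, count) in totals.items()}
    # Highest average first; ties are broken by session id for a stable ranking.
    return sorted(averages.items(), key=lambda item: (-item[1], item[0]))[:n]


# Made-up samples: (timestamp in seconds, session id, response time in ms).
samples = [
    (10.0, "s1", 200.0),
    (20.0, "s2", 900.0),
    (30.0, "s1", 400.0),
    (40.0, "s3", 150.0),
    (400.0, "s2", 100.0),
]

ranking_early = slowest_sessions(samples, now=60.0, n=2)
ranking_late = slowest_sessions(samples, now=420.0, n=2)
print(ranking_early)  # [('s2', 900.0), ('s1', 300.0)]
print(ranking_late)   # [('s2', 100.0)]
```

Calling the function again with a later `now` is what would drive a refresh of the list: sessions whose samples fall out of the window drop off, and averages shift as new samples arrive.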