Elasticsearch - OnPremise (Preferred)
Feature
Elasticsearch is GermainUX’s preferred datastore. The details below describe how to set up Elasticsearch on-premise. You also have the option to use Elasticsearch as SaaS.
Percentiles or Std Deviation Require Raw Data
When it comes to calculating percentiles or standard deviation measures, Germain relies on the availability of raw data. Elastic's roll-up mechanism does not support percentiles or standard deviation; it only supports calculations such as min, max, avg, sum, and count. However, Germain/Elastic can calculate percentiles or standard deviation as long as the raw data is present.
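As an illustration of the kind of request that only works against raw documents, the following sketch builds an Elasticsearch search body using the standard `percentiles` and `extended_stats` aggregations. The field name `duration` and the aggregation names are assumptions for the example; this is not necessarily the query Germain issues internally.

```python
import json


def percentile_query(field, percents=(50, 95, 99)):
    """Build a search body that computes percentiles and standard
    deviation over raw documents (not over rolled-up summaries)."""
    return {
        "size": 0,  # return aggregation results only, no individual hits
        "aggs": {
            "value_percentiles": {
                "percentiles": {"field": field, "percents": list(percents)}
            },
            "value_stats": {
                # extended_stats reports std_deviation among other metrics
                "extended_stats": {"field": field}
            },
        },
    }


if __name__ == "__main__":
    # "duration" is a hypothetical field name
    print(json.dumps(percentile_query("duration"), indent=2))
```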
Roll-ups
In Germain, the concept of "aggregation" is similar to Elastic's "roll-up." Both aim to compress data over time by reducing the level of detail. Germain provides built-in support for roll-ups, allowing data to be aggregated into variable time windows. Currently, Germain rolls up data into an hourly index, which can be kept beyond the retention period of raw data.
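A minimal sketch of what an hourly roll-up does conceptually, assuming raw data points are simple (timestamp, value) pairs. The function name and data shape are hypothetical and do not reflect Germain's implementation. Note that once only min, max, sum, count, and avg are kept per hour, the individual values are gone, which is why percentiles cannot be recovered from rolled-up data.

```python
from collections import defaultdict
from datetime import datetime, timezone


def roll_up_hourly(points):
    """Group (timestamp, value) pairs by hour and keep summary metrics only."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the timestamp to the start of its hour
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "min": min(vals),
            "max": max(vals),
            "sum": sum(vals),
            "count": len(vals),
            "avg": sum(vals) / len(vals),
        }
        for hour, vals in buckets.items()
    }


if __name__ == "__main__":
    sample = [
        (datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc), 2.0),
        (datetime(2024, 1, 1, 10, 50, tzinfo=timezone.utc), 4.0),
        (datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc), 10.0),
    ]
    for hour, summary in roll_up_hourly(sample).items():
        print(hour.isoformat(), summary)
```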
Single API Call for Querying Raw and Aggregated Data
Germain offers the convenience of making a single API call to query both raw and aggregated indexes. The results from these indexes are automatically merged, providing a unified view of the data.
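To make the idea of a merged view concrete, here is a small sketch that combines hourly summaries from an aggregated index with summaries computed from raw data. The data shapes and the rule that raw-derived values win on overlap are assumptions for illustration; the source does not describe how Germain resolves overlaps.

```python
def merge_hourly(aggregated, from_raw):
    """Combine two {hour: summary} mappings into one time-ordered view.

    Assumption: where both sources cover the same hour, the summary
    derived from raw data is preferred.
    """
    merged = dict(aggregated)
    merged.update(from_raw)  # raw-derived entries overwrite aggregated ones
    return dict(sorted(merged.items()))


if __name__ == "__main__":
    aggregated = {"2024-01-01T09:00": {"count": 5}, "2024-01-01T10:00": {"count": 7}}
    from_raw = {"2024-01-01T10:00": {"count": 8}, "2024-01-01T11:00": {"count": 3}}
    for hour, summary in merge_hourly(aggregated, from_raw).items():
        print(hour, summary)
```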
Timeseries Indexes
Timeseries datastreams in Germain are typically designed for appending data and are considered write-once. To update data, you can either use the Elastic API or directly push updates to the underlying index.
For fact datastreams, Germain keeps data in "hot" storage for one day, after which it is moved to read-only "cold" storage. The data remains in "cold" storage for the configured retention period for raw data. This allows you to store and access historical data efficiently.
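The hot/cold lifecycle described above can be sketched as a simple age-based rule. This is an illustration only: the function name is hypothetical, and it assumes the raw-data retention period is measured from the time the data was written, which the source does not state.

```python
from datetime import timedelta

HOT_PERIOD = timedelta(days=1)  # from the text: data stays "hot" for one day


def storage_tier(age, raw_retention):
    """Return the tier a fact document of the given age would be in.

    Assumption: raw_retention is counted from write time, so cold
    storage covers ages from one day up to raw_retention.
    """
    if age < HOT_PERIOD:
        return "hot"
    if age < raw_retention:
        return "cold"  # read-only
    return "expired"  # past raw retention; only rolled-up data may remain


if __name__ == "__main__":
    retention = timedelta(days=30)  # hypothetical retention setting
    for hours in (2, 48, 24 * 45):
        print(hours, "hours old:", storage_tier(timedelta(hours=hours), retention))
```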
Configuration
The basic high-level process for setting up Elasticsearch as the datastore for Germain is outlined below. Please contact us for the scripts and more details.
Prepare
Get host for ES and Germain
Get DB for Germain config (SQL DB)
Export config json from Stage Germain
Create a Temporary Germain Server
It will be used to set up the Elasticsearch database before data migration.
It will need a SQL database for the config
In common.properties:
Change the queue names to have new prefixes so that it can use the same ActiveMQ without interfering
Change germain.elastic.properties.indexPrefix to be more meaningful (Stage)
Configure the connection to ES
No Services will be needed
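The indexPrefix change in common.properties can be scripted if the temporary server is rebuilt often. The helper below is a minimal sketch that rewrites a single `key=value` entry in properties text; it only handles the `=` separator, and the value `stage` is an assumed example, not a required name.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    Replaces an existing `key=...` line, or appends one if the key is
    absent. Comment lines (starting with # or !) are left untouched.
    """
    out, found = [], False
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        if not is_comment and "=" in stripped and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = "# Elastic settings\ngermain.elastic.properties.indexPrefix=old\n"
    # "stage" is an assumed example value
    print(set_property(original, "germain.elastic.properties.indexPrefix", "stage"))
```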
Setup Elasticsearch DB
Make sure the Elastic DB host is in the same timezone as the data source (Central Time)
Import the search indexes from servicesDistro/install/indexer/es
Prepare the DB
Start Germain
Verify that it connects to the DB successfully
Use the special REST endpoint that applies the rest of the Germain config to the ES DB
Data Migration Script
Fill in the details in the migration tool's .yml file
Run the script once per data type. Arguments are shown in the examples (same folder as the tool)
Multiple can be run in parallel
A script/tool could be adapted to automate these runs
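One way to automate the per-data-type runs is a small wrapper that launches them in parallel. The sketch below is an assumption-heavy illustration: the command-line shape in `build_command` (`--config`, `--type`), the tool path, and the data type names are all hypothetical placeholders. Use the argument examples shipped with the migration tool for the real invocation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(tool, config_path, data_type):
    # Hypothetical CLI shape; replace with the arguments from the
    # examples in the migration tool's folder.
    return [tool, "--config", config_path, "--type", data_type]


def run_migrations(data_types, tool, config_path, runner=subprocess.run, max_workers=4):
    """Run the migration once per data type, several at a time.

    Returns {data_type: return code}. `runner` is injectable so the
    wrapper can be exercised without the real tool.
    """
    def run_one(data_type):
        result = runner(build_command(tool, config_path, data_type))
        return data_type, result.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, data_types))


if __name__ == "__main__":
    # Dry run with a stand-in runner that only prints the command
    class _Done:
        returncode = 0

    def dry_run(cmd):
        print("would run:", " ".join(cmd))
        return _Done()

    print(run_migrations(["typeA", "typeB"], "./migrate", "migration.yml", runner=dry_run))
```

Threads are sufficient here because each worker only waits on an external process; a non-zero return code in the result marks a data type that needs to be rerun.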
Query Validation for Elastic
Germain supports validation of Elasticsearch queries.
Go to Germain Workspace > Left Menu > System > Engine Settings > Component Types
Click the plus (+) icon
Select Elasticsearch Query Monitor Component