Elasticsearch - OnPremise (Preferred)

Feature

Elasticsearch is GermainUX’s preferred datastore. The sections below explain how to set up Elasticsearch “on-premise”. You also have the option to use Elasticsearch as “SaaS”.

Percentiles or Std Deviation Require Raw Data

When it comes to calculating percentiles or standard deviation measures, Germain relies on the availability of raw data. Elastic's roll-up mechanism does not support percentiles or standard deviation; it only supports calculations such as min, max, avg, sum, and count. However, Germain/Elastic can calculate percentiles or standard deviation as long as the raw data is present.
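To illustrate why raw data is required, here is a minimal Python sketch (not Germain code; the two series are made-up values). It shows two different data sets that produce identical min, max, avg, sum, and count summaries, yet have different medians and standard deviations. Once only the summary is kept, the percentile information cannot be recovered.

```python
import statistics


def rollup(values):
    """Summarize values the way a roll-up does: min, max, avg, sum, count only."""
    return {
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
        "count": len(values),
    }


# Two hypothetical raw series (illustrative numbers only).
series_a = [1, 2, 3, 4, 10]
series_b = [1, 1, 4, 4, 10]

# The roll-up summaries are indistinguishable...
summaries_match = rollup(series_a) == rollup(series_b)

# ...but the median (50th percentile) and standard deviation differ,
# so they can only be computed while the raw values are still available.
median_a = statistics.median(series_a)
median_b = statistics.median(series_b)
stdev_a = statistics.pstdev(series_a)
stdev_b = statistics.pstdev(series_b)

if __name__ == "__main__":
    print("same roll-up:", summaries_match)
    print("medians:", median_a, median_b)
    print("std deviations:", stdev_a, stdev_b)
```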

Roll-ups

In Germain, the concept of "aggregation" is similar to Elastic's "roll-up." Both aim to compress data over time by reducing the level of detail. Germain provides built-in support for roll-ups, allowing data to be aggregated into variable time windows. Currently, Germain rolls up data into an hourly index, which can be kept beyond the retention period of raw data.
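The idea of an hourly roll-up can be sketched in a few lines of Python. This is an illustration of the concept only, not Germain's implementation; the function name `hourly_rollup` and the sample points are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def hourly_rollup(points):
    """Group (timestamp, value) pairs by hour and keep only summary values."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "min": min(vals),
            "max": max(vals),
            "avg": sum(vals) / len(vals),
            "sum": sum(vals),
            "count": len(vals),
        }
        for hour, vals in sorted(buckets.items())
    }


if __name__ == "__main__":
    # Hypothetical raw measurements.
    sample = [
        (datetime(2024, 1, 1, 10, 5), 100),
        (datetime(2024, 1, 1, 10, 55), 300),
        (datetime(2024, 1, 1, 11, 10), 50),
    ]
    for hour, summary in hourly_rollup(sample).items():
        print(hour.isoformat(), summary)
```

Each hour collapses to a single summary record regardless of how many raw points it contained, which is why the rolled-up index can be kept longer than the raw data.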

Single API Call for Querying Raw and Aggregated Data

Germain offers the convenience of making a single API call to query both raw and aggregated indexes. The results from these indexes are automatically merged, providing a unified view of the data.
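Germain performs this merge for you, and its internal logic is not documented here. Purely as a conceptual sketch, the Python below shows one way results from an aggregated index and a raw index could be combined into a single timeline. The function name, the hour-string keys, and the rule that raw-derived values win on overlap are all assumptions made for illustration.

```python
def merge_results(aggregated, raw_hourly):
    """Combine two {hour: value} mappings into one time-ordered mapping.

    Assumption for this sketch: when both sources cover the same hour,
    the value derived from raw data takes precedence.
    """
    merged = dict(aggregated)
    merged.update(raw_hourly)
    # ISO-8601 hour strings sort chronologically as plain strings.
    return dict(sorted(merged.items()))


if __name__ == "__main__":
    # Hypothetical results: older hours come from the aggregated index,
    # recent hours from the raw index.
    aggregated = {"2024-01-01T08:00": 120.0, "2024-01-01T09:00": 130.0}
    raw_hourly = {"2024-01-01T09:00": 131.5, "2024-01-01T10:00": 140.0}
    print(merge_results(aggregated, raw_hourly))
```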

Timeseries Indexes

Timeseries datastreams in Germain are typically designed for appending data and are considered write-once. To update data, you can either use the Elastic API or directly push updates to the underlying index.

For fact datastreams, Germain keeps data in "hot" storage for one day and then moves it to read-only "cold" storage. The data remains in "cold" storage until the configured retention period for raw data expires. This allows you to store and access historical data efficiently.
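In Elasticsearch terms, a hot-then-cold lifecycle like this is commonly expressed as an index lifecycle management (ILM) policy. The Python sketch below builds such a policy body as a plain dictionary. It is an assumption about what an equivalent policy could look like, not the policy Germain actually installs, and the 30-day retention used in the usage example is a placeholder value.

```python
import json


def build_lifecycle_policy(retention_days):
    """Build an ILM policy body: hot for one day, then read-only cold, then delete."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    # Roll over to a new backing index once it is a day old.
                    "actions": {"rollover": {"max_age": "1d"}},
                },
                "cold": {
                    # min_age is counted from rollover, so "0d" moves the
                    # index to cold as soon as it has been rolled over.
                    "min_age": "0d",
                    "actions": {"readonly": {}},
                },
                "delete": {
                    # Placeholder retention; use your raw-data retention period.
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_policy(retention_days=30), indent=2))
```

The printed JSON has the shape accepted by the Elasticsearch ILM policy API; check it against the Elasticsearch version you run before applying it.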

Configuration

The steps below outline the high-level process for setting up Elasticsearch as the datastore for Germain. Please contact us for the scripts and more details.

Prepare

Get a host for ES and for Germain
Get a SQL DB for the Germain config
Export the config JSON from Stage Germain

Create a Temporary Germain Server

This server is used to set up the Elasticsearch database before the data migration.

It needs a SQL database for the config
In common.properties:
1. Change the queue names to use new prefixes so that the server can share the same ActiveMQ without interfering with the existing one
2. Change germain.elastic.properties.indexPrefix to something more meaningful (e.g., Stage)
3. Configure the connection to ES
No services are needed

Setup Elasticsearch DB

Make sure the Elastic DB host is in the same timezone as the data source (Central Time)
1. Import the search indexes from servicesDistro/install/indexer/es
Prepare the DB
1. Start Germain
  1. Verify that it connects to the DB successfully
2. Use the special REST endpoint that applies the rest of the Germain config to the ES DB

Data Migration Script

Fill in the details in the migration tool's .yml file
Run the script once per data type, with arguments as shown in the examples (in the same folder as the tool)
1. Multiple runs can execute in parallel
2. You may want to adapt a script/tool to automate this
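One way to automate the per-data-type runs is a small wrapper that launches them in parallel. The Python sketch below is a generic pattern, not part of the Germain tooling: the data type names and the placeholder command are hypothetical, so substitute the real migration tool invocation and arguments from the examples shipped with the tool.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_migrations(data_types, build_command, max_workers=4):
    """Run one command per data type in parallel; return {data_type: exit_code}."""

    def run_one(data_type):
        completed = subprocess.run(build_command(data_type))
        return data_type, completed.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, data_types))


def placeholder_command(data_type):
    """Hypothetical stand-in: replace with the real migration tool command line."""
    return [sys.executable, "-c", f"print('would migrate {data_type}')"]


if __name__ == "__main__":
    # Hypothetical data type names, for illustration only.
    results = run_migrations(["type_a", "type_b"], placeholder_command)
    failed = [name for name, code in results.items() if code != 0]
    print("failed:", failed if failed else "none")
```

Collecting exit codes per data type makes it easy to rerun only the migrations that failed.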

Query Validation for Elastic

Germain supports the validation of Elasticsearch queries.

Go to Germain Workspace > Left Menu > System > Engine Settings > Component Types
Click the plus (+) icon
Select Elasticsearch Query Monitor Component

Create new component on Component Types page

Elasticsearch Query Monitor Component Wizard