Analytics - Elastic


AllanGray, American Airlines, General Electric, Volvo


A number of our clients have now transitioned to Elasticsearch as the datastore for Germain. Germain stores billions (not millions) of data analyses and/or executed transactions per day in Elastic. Things to know about it:

  • Percentiles or Std Deviation need raw data

    • Elastic’s roll-up mechanism does not support percentile or standard deviation measures; it supports only min, max, avg, sum, and count. Germain/Elastic can still compute percentiles and standard deviation as long as the raw data is still available

  • Roll-ups

    • Germain’s "aggregation" equates to "roll-up" in Elastic: the same basic idea of compressing data over time by reducing its level of detail

    • Built-in support for rolling up at a configurable time, into time windows of variable size

    • For now, Germain rolls up into an hourly index that can be kept past the raw data retention window

    • A single API call queries both the raw and the aggregated indexes; the results are merged automatically

  • Timeseries Indexes

    • Timeseries datastreams are generally append-only / write-once; to update data, we either use the Elastic API or push the update directly to the underlying index

    • For our fact datastreams, we use 1-day "hot" storage, followed by read-only "cold" storage for as long as raw retention is configured
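
The roll-up points above can be illustrated with a small, self-contained sketch. It is not Germain's or Elastic's implementation: the function names (`hour_bucket`, `roll_up`, `merge`, `samples`) and the sample values are hypothetical, and the data lives in plain Python structures rather than in an index. It shows how raw samples collapse into an hourly summary of min, max, sum, count and avg, how two summaries for the same window can be merged, and why percentiles and standard deviation cannot be recovered once only the roll-up remains.

```python
from datetime import datetime
from statistics import median, pstdev


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour (the roll-up window)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def roll_up(raw):
    """Collapse (timestamp, value) samples into one summary per hour.

    Only min, max, sum, count and avg are kept, mirroring the measures
    listed above as supported by roll-ups.
    """
    buckets = {}
    for ts, value in raw:
        key = hour_bucket(ts)
        b = buckets.get(key)
        if b is None:
            buckets[key] = {"min": value, "max": value, "sum": value, "count": 1}
        else:
            b["min"] = min(b["min"], value)
            b["max"] = max(b["max"], value)
            b["sum"] += value
            b["count"] += 1
    for b in buckets.values():
        b["avg"] = b["sum"] / b["count"]
    return buckets


def merge(a, b):
    """Combine two summaries for the same window (e.g. raw-derived and rolled-up)."""
    total, count = a["sum"] + b["sum"], a["count"] + b["count"]
    return {
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
        "sum": total,
        "count": count,
        "avg": total / count,
    }


def samples(values):
    """Hypothetical raw samples, one per minute within a single hour."""
    return [(datetime(2024, 1, 1, 10, i), v) for i, v in enumerate(values)]


series_a = [0, 4, 5, 6, 10]
series_b = [0, 2, 3, 10, 10]

rolled_a = roll_up(samples(series_a))
rolled_b = roll_up(samples(series_b))

# Both series produce exactly the same hourly summary...
same_summary = rolled_a == rolled_b
# ...yet their medians (50th percentile) and standard deviations differ,
# so neither measure can be derived from the roll-up alone.
medians = (median(series_a), median(series_b))
stdevs_differ = pstdev(series_a) != pstdev(series_b)
```

Because min, max, sum and count combine cleanly across buckets, a merged result over raw and rolled-up data stays exact for those measures; percentiles and standard deviation depend on the individual values, which is why they need the raw data.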


Other NoSQL datastores can be supported; please let us know.

