Data Indexing and Data Discovery
Features
Data indexing and data discovery mechanisms can be integrated with Germain to enable data science at scale.
Currently, Apache Solr and Elasticsearch are integrated with Germain out of the box.
Configure
Prerequisite
Make sure that your Germain Datamart schema contains the DIMENSION_INDEX table. If it does not, create it using the datamart-internal.sql
DDL script, which is available in the Germain Service distribution file under install/databases/YOUR_DB_VENDOR.
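The prerequisite check above could be scripted roughly as follows. This is a minimal sketch, not part of the Germain distribution: the helper name `table_exists` is hypothetical, it assumes you already hold a Python DB-API connection to the Datamart schema, and the demo uses an in-memory SQLite database with a placeholder column only so the snippet runs on its own.

```python
import sqlite3


def table_exists(conn, table_name: str) -> bool:
    """Return True if table_name can be queried through this DB-API connection.

    Hypothetical helper: it probes the table with a query that returns no rows.
    Any database error (including a missing table) is reported as False.
    """
    # Guard against unexpected characters, since the name is placed in the SQL text.
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT 1 FROM {table_name} WHERE 1 = 0")
        return True
    except Exception:
        # Some databases leave the transaction in a failed state after an error.
        conn.rollback()
        return False
    finally:
        cur.close()


# Demo against an in-memory SQLite database (a stand-in for the real Datamart).
conn = sqlite3.connect(":memory:")
before = table_exists(conn, "DIMENSION_INDEX")
# Placeholder column only; the real definition comes from datamart-internal.sql.
conn.execute("CREATE TABLE DIMENSION_INDEX (ID INTEGER)")
after = table_exists(conn, "DIMENSION_INDEX")
print(before, after)
```

If the check returns False against the real Datamart, run the datamart-internal.sql script for your database vendor as described above.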
Apache Solr
Installation
Installation steps are available in the Germain Service distribution file under install/indexer/solr.
Integration
Once Solr is running, it must be integrated with the Germain Storage and Query services.
Storage Service
In your Germain Services root folder, go to the config folder and edit the storage-services.yml
file so that it looks similar to this:
# General service properties
storage:
  .........
  # Search engine indexer
  indexer:
    vendor: SOLR
    # update URL where Solr instance is available at
    url: http://localhost:8983/solr/
    # main index name
    indexName: apm
    # Index refresh schedule (every 1h)
    refreshSchedule: 0 0 * * * *
  .........
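Before restarting the services, the indexer settings can be sanity-checked. The sketch below is an illustration under stated assumptions: the function `check_indexer` is hypothetical, the settings are passed as a plain dictionary (loading the YAML file is left out), and the only vendors accepted are the two named on this page.

```python
from urllib.parse import urlparse

# The two vendor values shown in the samples on this page.
KNOWN_VENDORS = {"SOLR", "ELASTIC_SEARCH"}


def check_indexer(settings: dict, needs_schedule: bool = True) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if settings.get("vendor") not in KNOWN_VENDORS:
        problems.append("vendor must be one of " + ", ".join(sorted(KNOWN_VENDORS)))
    parsed = urlparse(str(settings.get("url") or ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url must be an absolute http(s) URL")
    if not settings.get("indexName"):
        problems.append("indexName is missing")
    # The Query service sample has no refreshSchedule, so the check is optional.
    if needs_schedule and not settings.get("refreshSchedule"):
        problems.append("refreshSchedule is missing")
    return problems


storage_indexer = {
    "vendor": "SOLR",
    "url": "http://localhost:8983/solr/",
    "indexName": "apm",
    "refreshSchedule": "0 0 * * * *",
}
query_indexer = {k: v for k, v in storage_indexer.items() if k != "refreshSchedule"}

ok_storage = check_indexer(storage_indexer)
ok_query = check_indexer(query_indexer, needs_schedule=False)
bad = check_indexer({"vendor": "solr", "url": "localhost:8983"})
print(ok_storage, ok_query, bad)
```

The check only covers the keys that appear in the samples; it does not confirm that the search engine is actually reachable.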
Query Service
In your Germain Server (Tomcat), go to the conf folder and edit the query-service.yml
file so that it looks similar to this:
# Query service properties
query:
  .........
  # Search engine indexer
  indexer:
    vendor: SOLR
    # update URL where Solr instance is available at
    url: http://localhost:8983/solr/
    # main index name
    indexName: apm
  ..........
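To confirm that the configured Solr URL and index name respond, a small probe can be run from the host where the Germain services live. This is a sketch only: it assumes the indexName value corresponds to a Solr core or collection of the same name and that Solr's standard ping handler is enabled for it. The helper names are hypothetical.

```python
import json
import urllib.request


def solr_ping_url(base_url: str, index_name: str) -> str:
    """Build the ping URL for one core/collection under the configured base URL."""
    return f"{base_url.rstrip('/')}/{index_name}/admin/ping?wt=json"


def solr_is_up(base_url: str, index_name: str, timeout: float = 5.0) -> bool:
    """Call the ping handler and report whether it answers with status OK.

    Network errors (for example connection refused) are raised to the caller.
    """
    with urllib.request.urlopen(solr_ping_url(base_url, index_name), timeout=timeout) as resp:
        return json.load(resp).get("status") == "OK"


# Values taken from the sample configuration above.
ping_url = solr_ping_url("http://localhost:8983/solr/", "apm")
print(ping_url)
```

Calling `solr_is_up("http://localhost:8983/solr/", "apm")` performs the actual request; it is left out of the demo so the snippet runs without a live Solr instance.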
Elasticsearch
Installation
Installation steps are available in the Germain Service distribution file under install/indexer/es.
Integration
Once Elasticsearch is installed and running, it must be integrated with the Germain Storage and Query services.
Storage Service
In your Germain Services root folder, go to the config folder and edit the storage-services.yml
file so that it looks similar to this:
# General service properties
storage:
  .........
  # Search engine indexer
  indexer:
    vendor: ELASTIC_SEARCH
    # update URL where ES instance is available at
    url: http://localhost:9200/
    # main index name
    indexName: apm
    # Index refresh schedule (every 1h)
    refreshSchedule: 0 0 * * * *
  .........
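The refreshSchedule value in the sample has six space-separated fields. The sketch below labels them assuming a seconds-first cron layout (second, minute, hour, day of month, month, day of week), which is consistent with the "every 1h" comment but is an assumption about how Germain reads the value. The helper `label_schedule` is hypothetical.

```python
# Assumed field order for a six-field, seconds-first cron expression.
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def label_schedule(expr: str) -> dict:
    """Pair each field of a six-field cron expression with its assumed name."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


labeled = label_schedule("0 0 * * * *")
for name, value in labeled.items():
    print(f"{name:>12}: {value}")
```

Under that reading, second and minute are fixed at 0 and every other field is a wildcard, so the refresh fires at the top of each hour.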
Query Service
In your Germain Server (Tomcat), go to the conf folder and edit the query-service.yml
file so that it looks similar to this:
# Query service properties
query:
  .........
  # Search engine indexer
  indexer:
    vendor: ELASTIC_SEARCH
    # update URL where ES instance is available at
    url: http://localhost:9200/
    # main index name
    indexName: apm
  ..........
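To check that the configured Elasticsearch URL answers and that the index exists, a HEAD request against the index can be used. This is a minimal sketch with hypothetical helper names. It assumes an unsecured local instance as in the sample; with authentication enabled the request would fail with an HTTP error instead of returning a result.

```python
import urllib.error
import urllib.request


def es_index_url(base_url: str, index_name: str) -> str:
    """Build the URL of one index under the configured base URL."""
    return base_url.rstrip("/") + "/" + index_name


def es_index_exists(base_url: str, index_name: str, timeout: float = 5.0) -> bool:
    """Send HEAD to the index URL: 200 means it exists, 404 means it does not.

    Other HTTP errors and network failures are raised to the caller.
    """
    req = urllib.request.Request(es_index_url(base_url, index_name), method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


# Values taken from the sample configuration above.
index_url = es_index_url("http://localhost:9200/", "apm")
print(index_url)
```

Calling `es_index_exists("http://localhost:9200/", "apm")` performs the actual request. A False result right after setup may simply mean that the index has not been created yet.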