The new Data Search capability we introduced last month needed a server-side data indexing and discovery mechanism to enable fast searching of data and insights. We benchmarked SolR and ElasticSearch and found SolR significantly faster and more robust in the area of NLP/ML, so we went ahead and integrated SolR into Germain.

Benchmark

  • Indexed 1 million rows in less than 4 minutes on our own 3-year-old desktop, without any tuning
  • Searches across these 1 million rows take at most 500 ms
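As a rough illustration of how a search latency figure like the one above could be measured, here is a minimal Python sketch against Solr's standard /select request handler. It is not the harness we used for the benchmark. The helper names (build_select_url, http_fetch, time_search) are made up for this example, and the fetch function is injectable so the timing logic can be tried without a running Solr instance.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, collection, query, rows=10):
    """Build a URL for Solr's standard /select request handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/solr/{collection}/select?{params}"


def http_fetch(url):
    """Fetch a URL and return the parsed JSON body (needs a running Solr)."""
    with urlopen(url) as resp:
        return json.load(resp)


def time_search(url, fetch=http_fetch):
    """Return (wall-clock ms, Solr-reported QTime ms, hit count) for one query."""
    start = time.perf_counter()
    body = fetch(url)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # QTime is the server-side query time in milliseconds reported by Solr.
    qtime_ms = body["responseHeader"]["QTime"]
    num_found = body["response"]["numFound"]
    return elapsed_ms, qtime_ms, num_found
```

With a local Solr instance, calling time_search(build_select_url("http://localhost:8983", "my_collection", "timeout error")) would return both the wall-clock time and the QTime for a single query. The host, collection name, and query string here are placeholders. Wall-clock time includes network and JSON parsing overhead, so it will normally be higher than QTime.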


Comparison of Apache SolR & ElasticSearch



Feature | Solr | ElasticSearch
--- | --- | ---
Index speed, 1 million rows (out of the box) | ~4 min | ~22 min
Index speed, 1 million rows (with simple optimizations) | not tested | ~8 min
Index size | ~500 MB | ~750 MB
Requires additional tool/software to pull from DB and insert into search platform | No | Yes (Logstash)
Simple query API | Yes | Yes
Built-in scheduler for updates | No | Yes (Logstash)
Returns entire document as search result | Yes | Yes
Full-text search features (misspelling, synonyms, ...) | Yes (very advanced) | Yes
Overall application | Text search | Analytical querying, filtering, and grouping
Nested documents support | No | Yes