Elasticsearch

Mappings, analyzers, the Query DSL, aggregations, and running a production cluster — for when Meilisearch isn't enough.

Elasticsearch · mappings · analyzers · Query DSL · aggregations · cluster

Real-World Analogy

A university research library vs a bookstore: the bookstore (Meilisearch) is fast, well-organized for browsing, good for most searches. The research library (Elasticsearch) has specialized cataloguing, subject librarians, cross-referencing, complex aggregations, and handles the full academic corpus. More powerful, more complex to operate, more to configure correctly.

When to Use Elasticsearch

  • Full-text search on billions of documents
  • Complex aggregations (histograms, geo-distance, nested objects)
  • Log and event analytics (Elastic stack: ELK)
  • Multi-language search with custom analyzers
  • Complex relevance tuning (function scoring, scripted scoring)
  • Percolator (reverse search: match documents against stored queries)

Rule of thumb: use Meilisearch for product search, and Elasticsearch for analytics and petabyte-scale data.

Running Elasticsearch

# docker-compose.yml — single node for development
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false    # disable for dev; MUST enable in production
      - ES_JAVA_OPTS=-Xms2g -Xmx2g
    ports:
      - "9200:9200"
    volumes:
      - es_data:/usr/share/elasticsearch/data
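    ulimits:
      memlock:
        soft: -1
        hard: -1

volumes:
  es_data:    # the named volume used above must be declared at the top level

# Verify (quote the URL so the shell does not try to expand the "?")
curl "http://localhost:9200/_cluster/health?pretty"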
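
The health response reports a `status` of green, yellow, or red. On a single-node dev cluster, any index created with `number_of_replicas: 1` will sit at yellow, because a replica is never placed on the same node as its primary. A minimal sketch of how that status could be interpreted; the `ClusterHealth` interface lists only a few fields of the real response, and `describeHealth` is a hypothetical helper, not part of any client library:

```typescript
// Subset of the fields returned by GET /_cluster/health.
type HealthStatus = 'green' | 'yellow' | 'red';

interface ClusterHealth {
  status: HealthStatus;
  number_of_nodes: number;
  unassigned_shards: number;
}

// Hypothetical helper: turn the status into a human-readable explanation.
function describeHealth(health: ClusterHealth): string {
  if (health.status === 'green') {
    return 'all primary and replica shards are allocated';
  }
  if (health.status === 'yellow') {
    return health.number_of_nodes === 1
      ? 'primaries are allocated; replicas cannot be placed on a single node (expected in dev)'
      : 'primaries are allocated; some replicas are unassigned';
  }
  return 'at least one primary shard is unassigned; some data is unavailable';
}
```

For local development, yellow is usually fine; in production it is a signal to investigate.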

Mappings (Schema)

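Elasticsearch can infer mappings dynamically, but for production use you should define them explicitly. Dynamic mapping (auto-detection) can cause type conflicts between documents, and it indexes every string as both text and keyword, which wastes disk space and indexing time.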

import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

// Create index with mappings
await es.indices.create({
  index: 'products',
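  body: {  // still accepted by the 8.x client but deprecated; settings and mappings can also be passed at the top level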
    settings: {
      number_of_shards: 3,
      number_of_replicas: 1,
      analysis: {
        analyzer: {
          product_analyzer: {
            type: 'custom',
            tokenizer: 'standard',
            filter: ['lowercase', 'asciifolding', 'english_stop', 'english_stemmer'],
          },
          autocomplete_analyzer: {
            type: 'custom',
            tokenizer: 'standard',
            filter: ['lowercase', 'autocomplete_filter'],
          },
          autocomplete_search_analyzer: {
            type: 'custom',
            tokenizer: 'standard',
            filter: ['lowercase'],
          },
        },
        filter: {
          english_stop: { type: 'stop', stopwords: '_english_' },
          english_stemmer: { type: 'stemmer', language: 'english' },
          autocomplete_filter: {
            type: 'edge_ngram',    // prefix n-grams for autocomplete
            min_gram: 2,
            max_gram: 20,
          },
        },
      },
    },
    mappings: {
      properties: {
        id:          { type: 'keyword' },        // exact match, no analysis
        name: {
          type: 'text',
          analyzer: 'product_analyzer',
          fields: {
            autocomplete: {                       // sub-field for autocomplete
              type: 'text',
              analyzer: 'autocomplete_analyzer',
              search_analyzer: 'autocomplete_search_analyzer',
            },
            keyword: { type: 'keyword' },         // sub-field for exact/sort
          },
        },
        brand:       { type: 'keyword' },
        category:    { type: 'keyword' },
        description: { type: 'text', analyzer: 'product_analyzer' },
        tags:        { type: 'keyword' },
        price_cents: { type: 'integer' },
        in_stock:    { type: 'boolean' },
        popularity:  { type: 'integer' },
        created_at:  { type: 'date' },
        location: {
          type: 'geo_point',                      // geographic coordinates
        },
      },
    },
  },
});
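
To see why `name.autocomplete` uses one analyzer at index time and a different one at search time, it helps to look at the tokens each side produces. The sketch below is a rough emulation in plain TypeScript, not Elasticsearch's actual implementation: `tokenize` approximates the `standard` tokenizer plus `lowercase` with a simple ASCII split, and `edgeNgrams` mimics the `edge_ngram` filter with the same `min_gram: 2` and `max_gram: 20` as the settings above. All function names are made up for illustration.

```typescript
// Rough stand-in for the 'standard' tokenizer plus the 'lowercase' filter:
// lowercase, then split on anything that is not an ASCII letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Emulates the edge_ngram token filter for a single token.
// Tokens shorter than minGram produce no n-grams.
function edgeNgrams(token: string, minGram: number = 2, maxGram: number = 20): string[] {
  const grams: string[] = [];
  const longest = Math.min(maxGram, token.length);
  for (let len = minGram; len <= longest; len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

// Index time (autocomplete_analyzer): every token is expanded into its prefixes.
function indexTimeTokens(text: string): string[] {
  const out: string[] = [];
  for (const token of tokenize(text)) {
    for (const gram of edgeNgrams(token)) out.push(gram);
  }
  return out;
}

// Search time (autocomplete_search_analyzer): tokens are lowercased, never expanded.
// With operator 'and', every query token must appear among the indexed tokens.
function matchesAutocomplete(docText: string, query: string): boolean {
  const indexed = indexTimeTokens(docText);
  const queryTokens = tokenize(query);
  return queryTokens.length > 0 && queryTokens.every((q) => indexed.indexOf(q) !== -1);
}
```

Under this emulation, `indexTimeTokens('MacBook Pro')` yields `ma, mac, macb, macbo, macboo, macbook, pr, pro`, so the query `macb` matches while `book` does not. If the search side also applied `edge_ngram`, the query `macb` would expand to `ma` and `mac` as well and match far too many products.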

Indexing Documents

// Single document
await es.index({
  index: 'products',
  id: 'prod-123',
  document: {
    id: 'prod-123',
    name: 'MacBook Pro 14"',
    brand: 'Apple',
    category: 'laptops',
    description: 'M3 chip, 18GB RAM, 512GB SSD',
    tags: ['laptop', 'apple', 'pro'],
    price_cents: 199900,
    in_stock: true,
    popularity: 9500,
    created_at: '2024-01-01T00:00:00Z',
  },
});

// Bulk indexing (preferred for large imports)
const operations = documents.flatMap(doc => [
  { index: { _index: 'products', _id: doc.id } },
  doc,
]);

const { errors, items } = await es.bulk({ operations });
if (errors) {
  const failed = items.filter(i => i.index?.error);
  console.error('Bulk index errors:', failed);
}
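
For large imports, a single bulk request with every document can exceed the HTTP request size limit or overload the write queue, so imports are usually split into batches. A minimal sketch of the batching and error-triage logic, kept free of the client so it runs on its own; `chunk`, `toBulkOperations`, and `retryableIds` are hypothetical helper names, and `BulkItem` models only the few response fields used here:

```typescript
interface Product {
  id: string;
  [field: string]: unknown;
}

type BulkAction = { index: { _index: string; _id: string } };

// Split a large import into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be at least 1');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build the alternating action/document array expected by the bulk API.
function toBulkOperations(index: string, docs: Product[]): Array<BulkAction | Product> {
  const operations: Array<BulkAction | Product> = [];
  for (const doc of docs) {
    operations.push({ index: { _index: index, _id: doc.id } });
    operations.push(doc);
  }
  return operations;
}

// Minimal shape of one entry in the bulk response's `items` array.
interface BulkItem {
  index?: {
    _id?: string | null;
    status: number;
    error?: { type: string; reason?: string | null };
  };
}

// Items rejected with HTTP 429 (write queue full) are worth retrying;
// other errors, such as mapping conflicts, will fail again unchanged.
function retryableIds(items: BulkItem[]): string[] {
  const ids: string[] = [];
  for (const item of items) {
    const result = item.index;
    if (result && result.error && result.status === 429 && result._id) {
      ids.push(result._id);
    }
  }
  return ids;
}
```

Each batch from `chunk(documents, 1000)` could then be passed through `toBulkOperations('products', batch)` and sent with `es.bulk({ operations })`. The batch size of 1000 is an arbitrary starting point to tune against document size.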

Query DSL

// Multi-field text search
const results = await es.search({
  index: 'products',
  query: {
    multi_match: {
      query: 'macbook pro',
      fields: ['name^3', 'brand^2', 'description'],  // ^N = boost
      type: 'best_fields',
      fuzziness: 'AUTO',   // typo tolerance
    },
  },
  from: 0,
  size: 20,
});

// Boolean query with filters
const results = await es.search({
  index: 'products',
  query: {
    bool: {
      must: [
        {
          multi_match: {
            query: 'laptop',
            fields: ['name^3', 'description'],
          },
        },
      ],
      filter: [
        { term: { category: 'laptops' } },
        { term: { in_stock: true } },
        { range: { price_cents: { lte: 200000 } } },
      ],
      should: [
        { term: { tags: 'pro' } },  // boost if "pro" tag matches
      ],
      boost: 1.0,
    },
  },
  sort: [
    { _score: 'desc' },
    { popularity: 'desc' },
  ],
});

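console.log(results.hits.hits.map(h => h._source));
// hits.total is typed as a number or { value, relation }. By default the count
// stops at 10,000; pass track_total_hits: true to get an exact total.
const total = results.hits.total;
console.log(typeof total === 'number' ? total : total?.value);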
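
In an application, the clauses of the boolean query above usually depend on which filters the user has selected. The sketch below assembles the request from optional parameters; scoring clauses go in `must`, exact-match constraints go in `filter`. `buildProductSearch` and `ProductSearchParams` are hypothetical names, and the page size of 20 is an arbitrary default. The guard on `from + size` reflects the default `index.max_result_window` of 10,000.

```typescript
interface ProductSearchParams {
  text?: string;
  category?: string;
  maxPriceCents?: number;
  inStockOnly?: boolean;
  page?: number;      // 1-based page number
  pageSize?: number;
}

type Clause = Record<string, unknown>;

// Default value of index.max_result_window.
const MAX_RESULT_WINDOW = 10000;

function buildProductSearch(params: ProductSearchParams) {
  const page = params.page === undefined ? 1 : params.page;
  const size = params.pageSize === undefined ? 20 : params.pageSize;
  const from = (page - 1) * size;
  if (from + size > MAX_RESULT_WINDOW) {
    throw new Error('from + size exceeds max_result_window; use search_after for deep paging');
  }

  // Scoring clause: fall back to match_all when there is no search text.
  const must: Clause[] = params.text
    ? [{ multi_match: { query: params.text, fields: ['name^3', 'description'] } }]
    : [{ match_all: {} }];

  // Non-scoring constraints: only added when the caller asked for them.
  const filter: Clause[] = [];
  if (params.category) filter.push({ term: { category: params.category } });
  if (params.inStockOnly) filter.push({ term: { in_stock: true } });
  if (params.maxPriceCents !== undefined) {
    filter.push({ range: { price_cents: { lte: params.maxPriceCents } } });
  }

  return { index: 'products', from, size, query: { bool: { must, filter } } };
}
```

The returned object has the same shape as the argument to `es.search` used above, so it could be passed as `es.search(buildProductSearch({ text: 'laptop', category: 'laptops' }))`.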

Aggregations

The killer feature over Meilisearch: complex analytics computed alongside the search results.

const results = await es.search({
  index: 'products',
  query: {
    bool: {
      // category is a keyword field, so use an exact term filter (cacheable, no scoring)
      filter: [
        { term: { category: 'laptops' } },
        { term: { in_stock: true } },
      ],
    },
  },
  aggs: {
    // Facet counts
    brands: {
      terms: { field: 'brand', size: 20 },
    },

    // Price buckets (range aggregation: `from` is inclusive, `to` is exclusive)
    price_ranges: {
      range: {
        field: 'price_cents',
        ranges: [
          { to: 50000, key: 'under_500' },
          { from: 50000, to: 100000, key: '500_to_1000' },
          { from: 100000, to: 200000, key: '1000_to_2000' },
          { from: 200000, key: 'over_2000' },
        ],
      },
    },

    // Statistics
    price_stats: {
      stats: { field: 'price_cents' },
    },

    // Date histogram for charts: products created per month
    created_over_time: {
      date_histogram: {
        field: 'created_at',
        calendar_interval: 'month',
      },
    },
  },
  size: 20,  // still return documents too
});

console.log(results.aggregations?.brands);
// { buckets: [{ key: 'Apple', doc_count: 45 }, { key: 'Dell', doc_count: 23 }] }
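
The bucket list is rarely what a UI wants directly. A small sketch of converting a `terms` aggregation result into a facet lookup for rendering filter checkboxes; `TermsAggResult` models only the fields shown in the comment above, and `toFacetCounts` is a hypothetical helper. With the official client, `results.aggregations.brands` is typed as a broad union, so a cast to this shape would likely be needed.

```typescript
interface TermsBucket {
  key: string;
  doc_count: number;
}

interface TermsAggResult {
  buckets: TermsBucket[];
}

// Convert terms buckets into a { value: count } lookup.
// Returns an empty object when the aggregation is missing from the response.
function toFacetCounts(agg: TermsAggResult | undefined): Record<string, number> {
  const counts: Record<string, number> = {};
  if (!agg) return counts;
  for (const bucket of agg.buckets) {
    counts[bucket.key] = bucket.doc_count;
  }
  return counts;
}
```

For the sample output above, this produces `{ Apple: 45, Dell: 23 }`.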

Autocomplete

// Using the edge_ngram field configured in mappings
const suggestions = await es.search({
  index: 'products',
  query: {
    match: {
      'name.autocomplete': {
        query: 'macb',     // prefix
        operator: 'and',
      },
    },
  },
  _source: ['name', 'brand'],
  size: 10,
});

// Or using completion suggester (faster, purpose-built for autocomplete)
// Requires 'completion' field type in mapping
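
A minimal sketch of what the completion suggester variant might look like. It assumes an extra mapping entry such as `name_suggest: { type: 'completion' }`, which is not part of the mapping defined earlier; `name_suggest`, `product_suggest`, and both helper functions are hypothetical names. The request and response shapes follow the suggest section of the search API, with `SuggestResponse` modelling only the fields read here.

```typescript
// Assumed mapping addition (not in the index created earlier):
//   name_suggest: { type: 'completion' }
const SUGGEST_FIELD = 'name_suggest';
const SUGGEST_NAME = 'product_suggest';

// Build a search request that asks only for completion suggestions.
function buildSuggestRequest(prefix: string, size: number = 10) {
  return {
    index: 'products',
    suggest: {
      [SUGGEST_NAME]: {
        prefix,
        completion: { field: SUGGEST_FIELD, size, skip_duplicates: true },
      },
    },
  };
}

interface SuggestOption {
  text: string;
}

interface SuggestResponse {
  suggest?: Record<string, Array<{ options: SuggestOption[] }>>;
}

// Pull the suggested strings out of the response.
function extractSuggestions(response: SuggestResponse): string[] {
  const entries = (response.suggest && response.suggest[SUGGEST_NAME]) || [];
  const texts: string[] = [];
  for (const entry of entries) {
    for (const option of entry.options) texts.push(option.text);
  }
  return texts;
}
```

The trade-off is flexibility: the completion suggester matches from the start of the stored input only, while the edge_ngram approach matches a prefix of any word in the name and can be combined with filters in a normal query.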

Production Cluster

# 3-node cluster configuration (each node's elasticsearch.yml)
cluster.name: production-search

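# Node 1 (nodes 2 and 3 use the same file with their own node.name and network.host):
node.name: es-node-1
node.roles: [master, data]
network.host: 10.0.0.10
discovery.seed_hosts: ["10.0.0.10", "10.0.0.11", "10.0.0.12"]
# Needed only when bootstrapping a brand-new cluster; remove once the cluster has formed
cluster.initial_master_nodes: ["es-node-1", "es-node-2", "es-node-3"]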

# Enable security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

Sizing:

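  • Each data node: 32GB RAM is a sensible starting point for production (give about 50% to the JVM heap and leave the rest for the filesystem cache)
  • JVM heap: -Xms16g -Xmx16g on a 32GB node; keep the heap below roughly 31GB, because above that the JVM can no longer use compressed object pointers and effective heap capacity drops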
  • SSD storage: ES is I/O intensive
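
The heap rule in the bullets above reduces to simple arithmetic: half of RAM, capped just under the compressed-pointer threshold. A small sketch, where the 31GB cap is a commonly used conservative value and both function names are hypothetical:

```typescript
// Conservative cap that keeps the JVM using compressed object pointers.
const HEAP_CAP_GB = 31;

// Half of the node's RAM, rounded down, never above the cap.
function recommendedHeapGb(ramGb: number): number {
  return Math.min(Math.floor(ramGb / 2), HEAP_CAP_GB);
}

// Format the result as an ES_JAVA_OPTS value with equal min and max heap.
function jvmOptions(ramGb: number): string {
  const heap = recommendedHeapGb(ramGb);
  return `-Xms${heap}g -Xmx${heap}g`;
}
```

A 32GB node gets `-Xms16g -Xmx16g`, while 64GB and 128GB nodes both get a 31GB heap; the remaining memory is still used by the operating system's file cache, which Lucene relies on heavily.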

Index lifecycle management (ILM) for log indices:

// Automatically move indices through hot → warm → cold → delete
await es.ilm.putLifecycle({
  name: 'logs-policy',
  policy: {
    phases: {
      hot: {
        actions: {
          rollover: { max_size: '50gb', max_age: '7d' },
        },
      },
      warm: {
        min_age: '7d',
        actions: {
          shrink: { number_of_shards: 1 },
          forcemerge: { max_num_segments: 1 },
          allocate: { require: { box_type: 'warm' } },
        },
      },
      cold: {
        min_age: '30d',
        actions: { readonly: {} },  // 'freeze' is deprecated and does nothing in 8.x; readonly blocks writes instead
      },
      delete: {
        min_age: '90d',
        actions: { delete: {} },
      },
    },
  },
});
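
A policy does nothing until indices reference it. With alias-based rollover, that usually means an index template that attaches the policy to new indices, plus a first index that owns the write alias. The sketch below builds both request bodies as plain objects; the `logs` alias, the `logs-*` pattern, and the helper names are assumptions for illustration, while `index.lifecycle.name` and `index.lifecycle.rollover_alias` are the actual setting names.

```typescript
// Body for es.indices.putIndexTemplate: new indices matching the pattern
// pick up the lifecycle policy and know which alias to roll over.
function buildLogsTemplate(policyName: string, alias: string) {
  return {
    name: `${alias}-template`,
    index_patterns: [`${alias}-*`],
    template: {
      settings: {
        'index.lifecycle.name': policyName,
        'index.lifecycle.rollover_alias': alias,
      },
    },
  };
}

// Body for es.indices.create: the first index in the series, marked as the
// write index for the alias. Rollover then creates -000002, -000003, and so on.
function buildBootstrapIndex(alias: string) {
  return {
    index: `${alias}-000001`,
    aliases: { [alias]: { is_write_index: true } },
  };
}
```

Applications then write to the `logs` alias rather than to a concrete index name, and ILM swaps the write index behind it at each rollover.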

Choosing Between Meilisearch and Elasticsearch

Use Meilisearch when:
  ✓ Product/content search with typo tolerance
  ✓ Faceted navigation
  ✓ < 100M documents
  ✓ Simple ops (single binary)
  ✓ Fast setup

Use Elasticsearch when:
  ✓ Log analytics (ELK stack)
  ✓ Billions of documents
  ✓ Complex aggregations (geo, date histograms, nested)
  ✓ Custom analyzers per language
  ✓ Percolator / reverse search
  ✓ Multi-index searches