AI search for Magento is the architecture pattern that replaces or augments Magento's default OpenSearch BM25 retrieval with vector embeddings (dense numerical representations of products and queries), so that a shopper typing "red velvet cake mix" also matches "scarlet baking blend" and "crimson sponge powder" on a Magento 2.4.4-2.4.9 store in 2026. The decision is not whether to add semantic search but which of three stacks to add it through, and the answer changes with catalog size, engineering budget, and how much personalization is worth to the merchant. This article benchmarks all three on the same catalog and ends with the hybrid configuration that beat each of them individually.

Keyword matching alone leaves 18% of intent on the floor.

Magento's default OpenSearch index uses BM25: a tuned variant of TF-IDF that scores tokens by frequency and inverse-document weight. It is fast, deterministic, and falls over the moment a customer types something the catalog does not literally contain. On the test catalog (50,000 SKUs across food, beverage, and baking categories), a query log audit over 30 days surfaced the gap.

14,200 unique queries served.
11,640 returned at least one relevant result on BM25 alone (82%).
2,560 (18%) returned zero or off-topic results: synonyms, misspellings, descriptive phrases, and intent queries ("gift for someone gluten free").

The 18% gap is the addressable problem. The three stacks below close it in different ways.

BM25 is a tokenizer pretending to be a search engine. Embeddings let you search on meaning, not spelling.

1. OpenSearch k-NN: self-hosted, free, runs alongside Magento

Magento 2.4.4-2.4.9 already ships with OpenSearch as the default catalog engine. The k-NN plugin is bundled with the standard OpenSearch distribution (the Lucene engine used in the mapping below arrived in the 2.x line), which means the same cluster that indexes your products for catalog search can also index dense vectors. No second engine, no second host.^[1]

The index mapping

{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "sku":         { "type": "keyword" },
      "name":        { "type": "text" },
      "description": { "type": "text" },
      "price":       { "type": "float" },
      "in_stock":    { "type": "boolean" },
      "name_vector": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "lucene",
          "parameters": { "ef_construction": 256, "m": 16 }
        }
      }
    }
  }
}

Generating embeddings: two paths

The cost equation hinges on which embedding model produces name_vector. Two options worth wiring up.

OpenAI text-embedding-3-small: 1536 dimensions, $0.02 per million input tokens.^[2] A typical Magento product (name + short description + attributes) is ~150 tokens. Re-embedding the full 50,000-SKU catalog costs $0.15. Incremental updates on attribute changes cost fractions of a cent per day.

Open-source sentence-transformers/all-MiniLM-L6-v2: 384 dimensions, zero cost, runs on the same box as Magento via a tiny Python sidecar. Lower quality than the OpenAI model on long-tail queries but indistinguishable on short product names. Choose this when the merchant cannot route data through OpenAI for compliance reasons.
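
The $0.15 figure is easy to re-derive for a different catalog. A minimal sketch of the arithmetic, using only the numbers quoted above (50,000 SKUs, ~150 tokens each, $0.02 per million tokens); the function name is made up for illustration and the rate should be checked against current OpenAI pricing before budgeting.

```python
def embedding_cost_usd(num_products: int, tokens_per_product: int,
                       usd_per_million_tokens: float) -> float:
    """Cost of embedding every product once, in US dollars."""
    total_tokens = num_products * tokens_per_product
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Figures from the article: 50,000 SKUs, ~150 tokens each, $0.02 per 1M tokens.
full_reembed = embedding_cost_usd(50_000, 150, 0.02)
print(f"Full catalog re-embed: ${full_reembed:.2f}")
```

Swap in your own SKU count and average token length; the cost scales linearly with both.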

# app/code/Panth/AiSearch/sidecar/embed.py
import os
from openai import OpenAI
from opensearchpy import OpenSearch, helpers

INDEX = "catalog_search_v2"
BATCH_SIZE = 100

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
os_client = OpenSearch(hosts=["https://opensearch:9200"], http_auth=("admin", os.environ["OS_PASS"]))

def embed(texts):
    # One embeddings call per batch of product texts.
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    return [d.embedding for d in resp.data]

def stream_products():
    # Pull from Magento via REST: /rest/V1/products?searchCriteria[pageSize]=500
    # Yield dicts of {sku, name, description, price, in_stock}
    ...

def index_batch(batch):
    # Embed "name. description" per product; tolerate a missing description.
    vectors = embed([f"{p['name']}. {p.get('description') or ''}" for p in batch])
    actions = [
        {"_index": INDEX, "_id": p["sku"],
         "_source": {**p, "name_vector": v}}
        for p, v in zip(batch, vectors)
    ]
    helpers.bulk(os_client, actions)

batch = []
for product in stream_products():
    batch.append(product)
    if len(batch) == BATCH_SIZE:
        index_batch(batch)
        batch = []
if batch:
    # Flush the final partial batch so the last products are not dropped.
    index_batch(batch)
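
The sidecar above only covers the OpenAI path. For the compliance-locked path, a minimal sketch of the local alternative could look like the following. It assumes the sentence-transformers package is installed on the sidecar host; load_minilm_encoder and embed_products are illustrative names, not part of any Magento or OpenSearch API. The encoder is passed in as a plain callable so the indexing logic can be exercised without downloading the model.

```python
from typing import Callable, Dict, List, Sequence

Encoder = Callable[[Sequence[str]], List[List[float]]]


def load_minilm_encoder() -> Encoder:
    """Load all-MiniLM-L6-v2 locally (assumes sentence-transformers is installed)."""
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    def encode(texts: Sequence[str]) -> List[List[float]]:
        # normalize_embeddings=True returns unit-length vectors.
        return model.encode(list(texts), normalize_embeddings=True).tolist()

    return encode


def embed_products(products: List[Dict], encode: Encoder,
                   expected_dim: int = 384) -> List[Dict]:
    """Attach a name_vector to each product dict, checking the dimension."""
    texts = [f"{p['name']}. {p.get('description') or ''}".strip() for p in products]
    vectors = encode(texts)
    for vector in vectors:
        if len(vector) != expected_dim:
            raise ValueError(f"expected {expected_dim} dims, got {len(vector)}")
    return [{**p, "name_vector": v} for p, v in zip(products, vectors)]


# Usage on the sidecar host (downloads the model on first run):
#   docs = embed_products(batch, load_minilm_encoder())
```

If you take this path, the name_vector field in the index mapping needs "dimension": 384 instead of 1536, and the query vector must come from the same model.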

The query

{
  "size": 24,
  "query": {
    "knn": {
      "name_vector": {
        "vector": [0.0123, -0.0456, 0.0789, "... 1533 more floats"],
        "k": 50
      }
    }
  },
  "post_filter": {
    "bool": {
      "must": [
        { "term":  { "in_stock": true } },
        { "range": { "price": { "gte": 0, "lte": 100 } } }
      ]
    }
  }
}
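
One caveat with the query above: post_filter runs after the k nearest neighbours have been collected, so with k at 50 a restrictive stock or price filter can leave fewer than the 24 requested hits. The OpenSearch k-NN query also accepts a filter clause inside the knn block when the Lucene engine is used, which applies the filter during the vector search. A sketch of building that request body in the Python sidecar follows; build_filtered_knn_query is a hypothetical helper, and filter support should be confirmed against the OpenSearch version your cluster runs.

```python
from typing import Any, Dict, List


def build_filtered_knn_query(vector: List[float], max_price: float,
                             k: int = 50, size: int = 24) -> Dict[str, Any]:
    """Build a k-NN search body with the stock/price filter inside the knn clause."""
    return {
        "size": size,
        "query": {
            "knn": {
                "name_vector": {
                    "vector": vector,
                    "k": k,
                    # Filter is evaluated as part of the vector search,
                    # not after the top-k have already been chosen.
                    "filter": {
                        "bool": {
                            "must": [
                                {"term": {"in_stock": True}},
                                {"range": {"price": {"gte": 0, "lte": max_price}}},
                            ]
                        }
                    },
                }
            }
        },
    }


# The query vector must come from the same model that produced name_vector.
body = build_filtered_knn_query([0.0] * 1536, max_price=100)
```

The resulting dict can be passed as the body of a search call against catalog_search_v2.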

Operational numbers

Index size: ~3 GB for 50,000 products at 1536d (float32). Drops to ~750 MB at 384d.
Query latency: 80-120 ms on a 4-vCPU OpenSearch node co-located with Magento. HNSW is the bottleneck, not network.
Reindex time: ~12 minutes on a quiet node for a full catalog re-embed plus bulk index.
Total monthly cost: $0 if the box already runs OpenSearch for Magento. ~$0.45 / month in OpenAI calls for daily incremental embeddings.
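
For capacity planning it helps to separate the raw vector payload from everything else on disk. The sketch below computes only the float32 payload, which is a lower bound: the reported index size presumably also includes the stored _source, text fields, the HNSW graph, and any replicas, none of which are modelled here. The function name is illustrative.

```python
def raw_vector_bytes(num_vectors: int, dimension: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for the vectors alone (float32 = 4 bytes per value)."""
    return num_vectors * dimension * bytes_per_value


size_1536 = raw_vector_bytes(50_000, 1536)
size_384 = raw_vector_bytes(50_000, 384)

print(f"1536d payload: {size_1536 / 1e6:.1f} MB")
print(f"384d payload:  {size_384 / 1e6:.1f} MB")
print(f"ratio: {size_1536 / size_384:.0f}x")
```

The 4x ratio between the two dimensions matches the drop from ~3 GB to ~750 MB reported above; the absolute numbers on your cluster depend on the overheads listed in the lead-in, so measure them with the index stats API before sizing disks.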

2. Algolia: managed, fast to ship, billed per search

Algolia does not require an embedding model. The platform indexes plain product attributes (name, description, brand, categories) and applies typo-tolerance, language stemming, synonym dictionaries, and prefix matching at query time.^[3] Semantic search via Algolia NeuralSearch is an add-on that uses Algolia's own embeddings: you do not write any embedding code.

Magento integration

composer require algolia/algoliasearch-magento-2
bin/magento module:enable Algolia_AlgoliaSearch
bin/magento setup:upgrade
bin/magento setup:di:compile
bin/magento config:set algoliasearch_credentials/credentials/application_id YOUR_APP_ID
bin/magento config:set algoliasearch_credentials/credentials/api_key YOUR_ADMIN_KEY --lock-env
bin/magento indexer:reindex algolia_products algolia_categories

The Algolia Magento module wires into the standard catalogsearch/result/index route and replaces the default OpenSearch query path. From the storefront side, nothing changes: search forms, autocomplete, faceting all use the existing Hyvä templates.

Operational numbers

Engineering migration cost: $1.5k-$3k. Mostly attribute mapping (which Magento fields become Algolia searchableAttributes, attributesForFaceting, custom ranking).
Index time: Algolia indexes via API. Full 50k catalog: ~8 minutes. Incremental updates near-instant via Magento's catalogsearch_fulltext indexer hook.
Per-search cost: $0.50 per 1,000 searches on the standard plan, dropping to ~$0.25 at scale. A store doing 500,000 searches per month pays $250.
Latency: 25-45 ms from Algolia's nearest edge. Faster than self-hosted OpenSearch, but the edge is the network, not the algorithm.

3. Coveo: enterprise, personalization, dashboards

Coveo sits in a different price tier and a different feature tier. Where Algolia ships fast search, Coveo ships a full machine-learning relevance pipeline: per-user reranking, A/B testing on relevance rules, intent classification, and an analytics UI that merchandisers actually open.

What Coveo does that the other two do not

Per-user reranking: every shopper's search results are reordered based on their browsing history, past purchases, and segment. The first 12 results on a 50k catalog are different for two shoppers typing the same query.
Native A/B testing: relevance rules ship as experiments with statistical significance gating. A merchandiser can promote a new product line behind a 10% rollout without touching code.
Headless API + Hyvä: Coveo's headless SDK plugs into Hyvä's search components. No core Magento module to maintain.

Operational numbers

License: $20k-$60k / year depending on query volume and seats. Six-figure for enterprise tiers.
Implementation: 4-8 weeks. Coveo Cloud configuration plus Hyvä integration plus merchandiser training.
Latency: 60-90 ms global. Slower than Algolia's 25-45 ms edge, faster than the 80-120 ms of the co-located OpenSearch node.
Precision on the test catalog: measurably higher than Algolia on personalized queries, indistinguishable on cold-start queries.

4. The hybrid that beat all three individually

The winning configuration on the test catalog was not any of the three pure stacks. It was OpenSearch k-NN running both BM25 and vector retrieval in parallel, with scores blended at query time. The blend is the whole trick.

The query

{
  "size": 24,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": {
              "query": "red velvet cake mix",
              "boost": 0.6
            }
          }
        },
        {
          "knn": {
            "name_vector": {
              "vector": [0.0123, -0.0456, 0.0789, "..."],
              "k": 50,
              "boost": 0.4
            }
          }
        }
      ]
    }
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "score_mode": "multiply",
      "rescore_query": {
        "function_score": {
          "functions": [
            { "filter": { "term": { "in_stock": true } }, "weight": 1.2 },
            { "gauss":  { "created_at": { "origin": "now", "scale": "30d" } } }
          ]
        }
      }
    }
  }
}
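
A note on what the boosts do: inside a bool/should clause they multiply each clause's raw score, and BM25 scores are unbounded while cosine k-NN scores are bounded, so 0.6 and 0.4 are nominal weights, not a literal 60/40 split. One way to make the split literal is to fetch the two result lists separately, min-max normalise each, and blend in the sidecar. The sketch below shows that idea; it is an alternative to the single-request query above, not what the benchmark ran, and the SKUs and scores are invented for illustration. Newer OpenSearch releases also offer a hybrid query with a score-normalization search pipeline, which is not shown here.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a single distinct score maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {sku: 1.0 for sku in scores}
    return {sku: (s - lo) / (hi - lo) for sku, s in scores.items()}


def blend(bm25: Dict[str, float], knn: Dict[str, float],
          w_bm25: float = 0.6, w_knn: float = 0.4) -> List[Tuple[str, float]]:
    """Weighted sum of normalised scores; ties broken by SKU for a stable order."""
    nb, nk = min_max(bm25), min_max(knn)
    combined = {
        sku: w_bm25 * nb.get(sku, 0.0) + w_knn * nk.get(sku, 0.0)
        for sku in set(nb) | set(nk)
    }
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for "red velvet cake mix".
bm25_hits = {"red-velvet-cake-mix": 12.4, "red-food-dye": 3.1}
knn_hits = {"scarlet-sponge-mix": 0.93, "red-velvet-cake-mix": 0.91,
            "crimson-baking-blend": 0.88}
ranked = blend(bm25_hits, knn_hits)
```

With these inputs the exact name match leads and the closest semantic neighbour follows, which is the ordering the hybrid is meant to produce. The SKU tie-break also keeps the order identical across runs.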

Real queries on the test catalog

Two queries from the live log, run on the hybrid versus each pure stack.

Query: red velvet cake mix

BM25 only: returned 14 exact-token matches, all relevant. Missed scarlet sponge mix and crimson baking blend that customers were buying as substitutes.
k-NN only: returned scarlet sponge mix at rank 1, red velvet cake mix at rank 3. Correct semantically, but the customer literally typed the exact product name and expected it first.
Hybrid 60/40: exact name match at rank 1, semantic substitutes at ranks 2-4, related-flavor items at ranks 5-24.

Query: chocolate baking supplies

BM25 only: 6 results, all containing both tokens. Missed cocoa powder, ganache mix, brownie kits.
k-NN only: 24 results spanning the whole chocolate-baking category. Order was reasonable but inconsistent across runs (HNSW is approximate).
Hybrid 60/40: 24 results. Items with exact tokens float to the top, semantically related items fill the rest. Order is stable.

Decision table: which stack wins when

Stack	Annual cost (50k SKUs, 500k searches / mo)	Latency	Precision	Self-host	Best for
OpenSearch k-NN (pure)	~$5 OpenAI + existing infra	80-120 ms	High on semantic, low on exact match	Yes	Compliance-locked, budget-locked merchants
Algolia	~$3,000	25-45 ms	High on exact + typo, no semantic without NeuralSearch add-on	No	Fast time-to-ship, no ML team
Coveo	$20k-$60k	60-90 ms	Highest with personalization	No	Enterprise, merchandiser-driven catalogs
Hybrid (OpenSearch BM25 + k-NN, 60/40)	~$5 OpenAI + existing infra	120-160 ms	Highest on the queries customers actually type	Yes	The default recommendation for Magento 2.4.4-2.4.9
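
The annual cost column follows from the per-unit numbers in the earlier sections. A small sketch of that arithmetic, so it can be rerun with your own search volume; the function names are illustrative, infrastructure and engineering time are excluded as in the table, and Coveo is left out because its license is quoted as a range.

```python
def algolia_annual_usd(searches_per_month: int, usd_per_thousand: float = 0.50) -> float:
    """Per-search billing, annualised."""
    return searches_per_month / 1000 * usd_per_thousand * 12


def opensearch_annual_usd(openai_usd_per_month: float = 0.45) -> float:
    """Only the embedding API spend; assumes the cluster already exists."""
    return openai_usd_per_month * 12


algolia = algolia_annual_usd(500_000)
opensearch = opensearch_annual_usd()

print(f"Algolia:    ${algolia:,.0f} / year")
print(f"OpenSearch: ${opensearch:,.2f} / year")
```

Passing usd_per_thousand=0.25 gives the at-scale Algolia rate mentioned in the Algolia section.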

Operational gotchas worth knowing

Vector drift: re-embed any product whose name or description changes by more than 10% of tokens. Stale vectors degrade silently.
HNSW determinism: k-NN is approximate. Identical queries can return slightly different result orders. Always re-rank with a deterministic function score before display.
Embedding model lock-in: vectors from text-embedding-3-small are not comparable to vectors from all-MiniLM-L6-v2. Switching models requires re-embedding the entire catalog.
Token logging: log every OpenAI embedding call's token count to a Magento custom table. Without it, embedding costs creep on attribute-update storms.
Cache the vector, not the query: Magento's FPC will cache the rendered results page for anonymous users. Make sure the vector is generated against the canonical query string after lowercasing and stop-word stripping, otherwise cache hit rates tank.
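
Two of these gotchas reduce to small text utilities in the sidecar. The sketch below is one possible reading of each rule: "changes by more than 10% of tokens" is interpreted as the larger of removed or added whitespace tokens divided by the longer text's token count, and the stop-word list is a placeholder to replace with your own. All names here are hypothetical.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "the", "for", "of", "and", "with"}  # illustrative only


def token_change_ratio(old_text: str, new_text: str) -> float:
    """Share of tokens that differ between two versions of a product text."""
    old = Counter(old_text.lower().split())
    new = Counter(new_text.lower().split())
    removed = sum((old - new).values())
    added = sum((new - old).values())
    longest = max(sum(old.values()), sum(new.values()), 1)
    return max(removed, added) / longest


def needs_reembed(old_text: str, new_text: str, threshold: float = 0.10) -> bool:
    """True when the edit is large enough to justify a new embedding call."""
    return token_change_ratio(old_text, new_text) > threshold


def canonical_query(raw: str) -> str:
    """Lowercase, split on non-alphanumerics, drop stop words, single-space join."""
    tokens = re.findall(r"[a-z0-9]+", raw.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)
```

Embedding canonical_query(q) rather than q means "The Red Velvet cake-mix" and "red velvet cake mix" share one vector and one cache entry.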

What this is not

None of the three stacks is "an AI agent that answers product questions in natural language": that is a different pattern (conversational commerce) covered in the ChatGPT integrations article. Semantic search returns ranked products; conversational search returns ranked products plus a paragraph of generated text. The two are often built together but they are not the same problem.

FAQ

Does adding k-NN to OpenSearch break my existing Magento catalog search?

No. The k-NN plugin runs alongside the standard text indices. The default catalog search keeps using BM25 unless the storefront is wired to a custom query. Both indices live in the same OpenSearch cluster with no version pinning required for Magento 2.4.4-2.4.9.

Which embedding model should I start with?

OpenAI text-embedding-3-small at 1536 dimensions. It is cheap ($0.02 / 1M tokens), well-documented, and the quality jump versus the 384-dimension open-source models is meaningful on long-tail queries. Switch to all-MiniLM-L6-v2 only if compliance forbids routing product text through OpenAI.

How big does my catalog need to be before semantic search is worth it?

The cost curve crosses around 5,000 SKUs. Below that, the BM25 default and a hand-curated synonym list close most of the intent gap. Above 10,000 SKUs, the synonym list becomes unmaintainable and embeddings win on engineering hours alone.

Can I use Algolia for keyword and OpenSearch k-NN for semantic at the same time?

Technically yes: run Algolia for autocomplete and instant search, OpenSearch k-NN for the "you might also like" rail. We do not recommend it. Two indices to maintain, two costs to track, two analytics streams to reconcile. Pick a single primary stack.

How often do I re-embed the catalog?

Trigger re-embed on Magento's catalog_product_save_after observer for any product whose name, description, or primary category changes. A full re-embed nightly is overkill unless the catalog is reseeded daily.

What is the simplest A/B test to validate the upgrade?

Split traffic 50/50 on the storefront search route, log clicked-result rank, and measure the median rank of the clicked result. If the hybrid median drops below the BM25 median over a 7-day window with statistical significance, ship it.
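
The "statistical significance" part can be checked with nothing more than the standard library. The sketch below runs a one-sided permutation test on the difference of medians; it is one reasonable choice of test, not the only one, and the function name, iteration count, and sample data are assumptions for illustration.

```python
import random
from statistics import median
from typing import Sequence, Tuple


def median_rank_permutation_test(control: Sequence[int], treatment: Sequence[int],
                                 iterations: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Return (observed median improvement, one-sided p-value).

    Improvement is control median minus treatment median, so a positive
    value means clicks on the treatment arm land higher up the page.
    """
    observed = median(control) - median(treatment)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        diff = median(pooled[:n_control]) - median(pooled[n_control:])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (at_least_as_large + 1) / (iterations + 1)
    return observed, p_value


# Clicked ranks (1 = top result) logged per arm; values are made up.
bm25_ranks = [5, 6, 7, 8, 9] * 20
hybrid_ranks = [1, 2, 3, 2, 1] * 20
improvement, p = median_rank_permutation_test(bm25_ranks, hybrid_ranks)
```

Ship when improvement is positive and p is below the threshold you committed to before the test started, for example 0.05.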

Does this work with Hyvä?

Yes. Hyvä's search components are template-only: they call the same Magento search action regardless of which engine sits behind it. The OpenSearch k-NN upgrade is server-side and Hyvä-transparent. Algolia ships with its own Hyvä-compatible module.

How do I monitor relevance over time?

Log query, top-5 SKUs, clicked SKU, and clicked rank to a custom table. Run a weekly query to compute mean reciprocal rank. Any drop of more than 5% week-over-week is worth investigating: usually a stale vector batch or a new attribute mapping.
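
The weekly check is a few lines once the click log is exported. A minimal sketch, assuming each logged search carries a 1-based clicked rank or None when nothing was clicked; the function names are illustrative and searches without a click are counted as a reciprocal rank of zero, which is a modelling choice you may want to change.

```python
from typing import Iterable, Optional


def mean_reciprocal_rank(clicked_ranks: Iterable[Optional[int]]) -> float:
    """Average of 1/rank over all searches; no click contributes 0."""
    ranks = list(clicked_ranks)
    if not ranks:
        return 0.0
    return sum(1.0 / rank for rank in ranks if rank) / len(ranks)


def mrr_dropped(last_week: float, this_week: float, threshold: float = 0.05) -> bool:
    """True when MRR fell by more than the threshold, relative to last week."""
    if last_week <= 0:
        return False
    return (last_week - this_week) / last_week > threshold


# Four logged searches: clicks at ranks 1, 2, 4, and one search with no click.
weekly_mrr = mean_reciprocal_rank([1, 2, 4, None])
```

Run mrr_dropped on consecutive weekly values and alert when it returns True.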

Where this is heading next

Three patterns we are watching for the next 12 months on kishansavaliya.com client work.

Multimodal product search: embed product images alongside text. A customer uploads a photo and the catalog returns visually similar SKUs. CLIP-style models are good enough now.
Personalized re-ranking on self-hosted stacks: Coveo's killer feature replicated on OpenSearch with a lightweight ranking model trained on click logs. Engineering cost is real but no longer a moonshot.
Query understanding via small LLMs: a 3B-parameter local LLM that rewrites a query into a structured search (intent, category, attributes) before retrieval. Closes the remaining gap on intent queries like "gift for someone gluten free".

Citations
