TL;DR

4,200 product descriptions rewritten on a Magento 2.4.9 + Hyvä store in 6 hours of compute and 30 minutes of human review.
Total Anthropic API spend: $12.60 at claude-3-5-sonnet pricing: ~$0.003 per description, ~3,000 input + 350 output tokens each.
The CLI: bin/magento panth:ai:product-descriptions:generate --from=1 --to=4200 --brand-voice=brand-voice.md --dry-run.
The brand-voice prompt has three sections: tone primer, SEO guardrails (1-2% keyword density, primary keyword in H1 and first 100 words, no duplicate-content patterns), and output schema (200-300 words, 2 paragraphs + 1 spec bullet list).
The duplicate-content guard feeds Claude every prior description in the same category and asks for structural diversity: no two descriptions in one category share opening sentence patterns, bullet order, or closing CTA.
Three months on: zero Google manual actions, zero soft 404s in Search Console, organic clicks up 38% on the rewritten SKUs.

AI-generated product descriptions on Magento 2.4.4-2.4.9 are not a content shortcut; they are a pipeline. The workflow described here updated 4,200 product descriptions on a live Hyvä store in 6 hours of compute and 30 minutes of human review for $12.60 of Anthropic Claude API spend, and three months on the affected SKUs show zero manual actions, zero soft 404s, and a 38% organic-click lift. The pipeline ships as a Magento console command, a brand-voice markdown file, and a duplicate-content guard that compares each generated description against prior siblings in the same category.^[1]

Hand-written, template-generated, and AI-generated are three different failure modes

Stores past 1,000 SKUs cannot afford hand-written descriptions and cannot ship templated ones without inviting a doorway-page penalty. The middle path, AI generation with a brand-voice prompt, a SEO guardrail block, and a duplicate-content guard, is the one we run in production at kishansavaliya.com.

Method	Cost per 1,000 SKUs	Quality (1-10)	Human-review effort	Penalty risk
Hand-written copywriter	$3,000-$8,000	9	None (already human)	None
Rule-based template generator	$0 (one-time dev cost)	3	Low	High: doorway / thin-content
ChatGPT web UI, copy-paste	$50-$150 in time	6	High (no QA gate)	Medium: generic, repetitive
Claude API + brand-voice prompt + dup-guard	$3-$5	7.5	~30 min per 4,000 SKUs	Low: structural diversity enforced

The Claude-API row is the one this post implements. It is not a copywriter replacement: for hero PDP copy on flagship products, a human still writes. It is a long-tail solution for the 90% of catalog SKUs that would otherwise carry templated descriptions or, worse, the manufacturer's spec sheet copy-pasted across every dealer.

The CLI command that runs the workflow

Everything sits behind one console command. It supports a SKU range, a brand-voice file, a dry-run mode that writes to stdout instead of the database, and a per-category mode for incremental rollout.

bin/magento panth:ai:product-descriptions:generate \
  --from=1 --to=4200 \
  --brand-voice=brand-voice.md \
  --dry-run

The real run drops --dry-run. Output writes to var/log/ai-descriptions/run-{timestamp}.csv: one row per SKU with old description, new description, token usage, and similarity score. A CLI with ranges is restartable, scriptable, and survives SSH disconnects under nohup, which admin-UI bulk tools cannot do.

If your bulk-content tool only runs from the Magento admin UI, it cannot ship 4,000 SKUs in one pass. CLI-first is non-negotiable for content workflows past 500 SKUs.

1. The console command class

Lives at app/code/Panth/AiContent/Console/Command/GenerateDescriptions.php: standard Symfony Console pattern.

<?php
declare(strict_types=1);

namespace Panth\AiContent\Console\Command;

use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Input\InputOption;
use Symfony\Component\Console\Output\OutputInterface;
use Panth\AiContent\Model\DescriptionGenerator;
use Panth\AiContent\Model\DuplicateGuard;

class GenerateDescriptions extends Command
{
    public function __construct(
        private DescriptionGenerator $generator,
        private DuplicateGuard $duplicateGuard
    ) {
        parent::__construct();
    }

    protected function configure(): void
    {
        $this->setName('panth:ai:product-descriptions:generate')
            ->setDescription('Bulk-rewrite product descriptions via Anthropic Claude.')
            ->addOption('from', null, InputOption::VALUE_REQUIRED, 'First entity_id (inclusive).')
            ->addOption('to', null, InputOption::VALUE_REQUIRED, 'Last entity_id (inclusive).')
            ->addOption('brand-voice', null, InputOption::VALUE_REQUIRED, 'Path to brand-voice.md.')
            ->addOption('dry-run', null, InputOption::VALUE_NONE, 'Print to stdout, do not save.');
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
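        $from = (int)$input->getOption('from');
        $to = (int)$input->getOption('to');
        $dryRun = (bool)$input->getOption('dry-run');

        // file_get_contents() returns false on a missing or unreadable file; stop before any API call.
        $brandVoicePath = (string)$input->getOption('brand-voice');
        $brandVoice = is_readable($brandVoicePath) ? file_get_contents($brandVoicePath) : false;
        if ($brandVoice === false || trim($brandVoice) === '') {
            $output->writeln(sprintf('<error>Cannot read brand-voice file: %s</error>', $brandVoicePath));
            return Command::FAILURE;
        }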

        foreach ($this->generator->range($from, $to) as $product) {
            $priorDescriptions = $this->duplicateGuard->priorInCategory($product);
            $result = $this->generator->rewrite($product, $brandVoice, $priorDescriptions);

            if (!$dryRun) {
                $product->setDescription($result->html)->save();
            }
            $output->writeln(sprintf('[%s] %d tokens, similarity=%.2f', $product->getSku(), $result->tokens, $result->similarity));
        }
        return Command::SUCCESS;
    }
}

DescriptionGenerator and DuplicateGuard hold the real logic. Both are unit-testable without booting Magento because the Claude client is constructor-injected.
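Constructor injection is what makes that testing possible. A minimal sketch, assuming a hypothetical CompletionClient interface: the interface, FakeClient, and SketchGenerator names are illustrative stand-ins, not classes from the module, and the product is a plain array rather than a Magento model.

```php
<?php
declare(strict_types=1);

// Hypothetical seam: anything that turns a system prompt + user message into text.
interface CompletionClient
{
    public function complete(string $systemPrompt, string $userMessage): string;
}

// Test double: records each call and returns canned HTML instead of hitting the API.
final class FakeClient implements CompletionClient
{
    public array $calls = [];

    public function __construct(private string $cannedHtml) {}

    public function complete(string $systemPrompt, string $userMessage): string
    {
        $this->calls[] = ['system' => $systemPrompt, 'user' => $userMessage];
        return $this->cannedHtml;
    }
}

// Simplified stand-in for DescriptionGenerator::rewrite() (assumed shape).
final class SketchGenerator
{
    public function __construct(private CompletionClient $client) {}

    public function rewrite(array $product, string $brandVoice, array $prior): string
    {
        $userMessage = sprintf(
            'Product: %s. SKU: %s. Prior descriptions in this category (avoid structural overlap): %s',
            $product['name'],
            $product['sku'],
            json_encode($prior, JSON_THROW_ON_ERROR)
        );
        return $this->client->complete($brandVoice, $userMessage);
    }
}

// No Magento bootstrap and no network: the fake is injected where the real client would go.
$fake = new FakeClient('<p>Canned description.</p>');
$generator = new SketchGenerator($fake);
$html = $generator->rewrite(
    ['name' => '18V Cordless Impact Driver', 'sku' => 'CID-18V-001'],
    '## Tone primer ...',
    [['sku' => 'CID-12V-002', 'description' => 'Older sibling copy.']]
);
```

The same fake can assert that prior sibling descriptions actually reach the user message, which is the part most likely to regress silently.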

2. The brand-voice prompt template

Plain markdown, edited by a copywriter. Read once per run, prepended to every Claude request as the system prompt. Three required sections.

## Tone primer
Write for a B2B hand-tools storefront. Technical, confident, no hype
words (best-in-class, revolutionary, game-changing banned). Reader has
10+ years trade experience. US English. Active voice. Sentences under
22 words. No exclamation marks.

## SEO guardrails
- Keyword density: 1.0% to 2.0%.
- Primary keyword in the first 100 words.
- Secondary keywords (max 2) each appear once.
- No 4+ word phrase repeated across same-category siblings.
- Never copy verbatim from source attributes: paraphrase specs.

## Output schema
200-300 words of valid HTML, no Markdown.
1. Opening paragraph (60-90 words). Primary keyword present.
2. Unordered list of 4-6 specs, each under 14 words.
3. Closing paragraph (60-90 words). Secondary keywords present.
No <h1>, <h2>, or links. Never invent specs.

Checked into the repo at app/code/Panth/AiContent/content/brand-voice.md. When the voice evolves, the file is the diff: one commit per change, reviewable, revertable.^[2]
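The guardrails in the file are instructions to the model, so it can help to re-check them mechanically on the way out. A minimal sketch of such a check, under stated assumptions: checkDescription() is a hypothetical helper, and the density formula (occurrences times words in the keyword, divided by total words) is one possible definition, not one the brand-voice file specifies.

```php
<?php
declare(strict_types=1);

/**
 * Returns a list of human-readable violations; an empty list means the draft passes.
 * Thresholds mirror brand-voice.md; the density formula is an assumption.
 */
function checkDescription(string $html, string $primaryKeyword, array $bannedWords): array
{
    $violations = [];

    // Replace tags with spaces so "</p><ul>" does not glue two words together.
    $plain = mb_strtolower((string)preg_replace('/<[^>]+>/', ' ', $html));
    $words = preg_split('/\s+/u', trim($plain), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $text = implode(' ', $words); // single-space normalised copy for phrase matching
    $wordCount = count($words);

    if ($wordCount < 200 || $wordCount > 300) {
        $violations[] = "word count {$wordCount} outside 200-300";
    }
    if (preg_match('/<(h1|h2|a)\b/i', $html) === 1) {
        $violations[] = 'contains <h1>, <h2>, or a link';
    }
    $bullets = (int)preg_match_all('/<li\b/i', $html);
    if ($bullets < 4 || $bullets > 6) {
        $violations[] = "spec list has {$bullets} items, expected 4-6";
    }
    foreach ($bannedWords as $banned) {
        if (str_contains($text, mb_strtolower($banned))) {
            $violations[] = "banned phrase: {$banned}";
        }
    }

    $keyword = mb_strtolower($primaryKeyword);
    if (!str_contains(implode(' ', array_slice($words, 0, 100)), $keyword)) {
        $violations[] = 'primary keyword missing from first 100 words';
    }

    // Assumed definition: (occurrences x words in keyword) / total words.
    $occurrences = substr_count($text, $keyword);
    $keywordWords = count(explode(' ', $keyword));
    $density = $wordCount > 0 ? ($occurrences * $keywordWords) / $wordCount : 0.0;
    if ($density < 0.01 || $density > 0.02) {
        $violations[] = sprintf('keyword density %.1f%% outside 1.0-2.0%%', $density * 100);
    }

    return $violations;
}
```

A non-empty result could feed the same re-prompt path as a failed similarity check, with the violation text as the extra instruction.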

3. The Claude API request shape

Requests go to Anthropic's /v1/messages. Model pinned to claude-3-5-sonnet-20241022, or claude-3-7-sonnet-20250219 for flagship SKUs. Floating aliases drift and silently change brand voice.

The raw HTTP request

curl -sS https://api.anthropic.com/v1/messages \
  -H 'x-api-key: sk-ant-api03-...' \
  -H 'anthropic-version: 2023-06-01' \
  -H 'content-type: application/json' \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 600,
    "system": "<contents of brand-voice.md>",
    "messages": [
      {"role": "user", "content": "Product: 18V Cordless Impact Driver. SKU: CID-18V-001. Primary keyword: cordless impact driver. Secondary keywords: 18V battery, brushless motor. Attributes: torque 200Nm, weight 1.4kg, battery 4Ah, chuck 1/4 hex. Prior descriptions in this category (avoid structural overlap): <...prior descriptions json...>."}
    ]
  }'

The PHP client wrapper

<?php
declare(strict_types=1);

namespace Panth\AiContent\Service;

use GuzzleHttp\Client;

class AnthropicClient
{
    private const API = 'https://api.anthropic.com/v1/messages';
    private const VERSION = '2023-06-01';
    private const MODEL = 'claude-3-5-sonnet-20241022';

    public function __construct(
        private Client $http,
        private string $apiKey
    ) {}

    public function complete(string $systemPrompt, string $userMessage, int $maxTokens = 600): array
    {
        $response = $this->http->post(self::API, [
            'headers' => [
                'x-api-key'         => $this->apiKey,
                'anthropic-version' => self::VERSION,
                'content-type'      => 'application/json',
            ],
            'json' => [
                'model'      => self::MODEL,
                'max_tokens' => $maxTokens,
                'system'     => $systemPrompt,
                'messages'   => [
                    ['role' => 'user', 'content' => $userMessage],
                ],
            ],
            'timeout' => 60,
        ]);
        return json_decode($response->getBody()->getContents(), true);
    }
}

API key lives in app/etc/env.php under system/default/panth/ai/anthropic_key, loaded via ScopeConfigInterface with the encrypted-config decorator. Rotate every 90 days.^[3]

4. The duplicate-content guard

The single feature that separates this workflow from "ChatGPT generated my descriptions and Google deindexed half my catalog" is the duplicate-content guard. Before calling Claude, the service fetches prior descriptions in the same category and feeds them into the user message as a JSON array.

<?php
declare(strict_types=1);

namespace Panth\AiContent\Model;

use Magento\Catalog\Api\ProductRepositoryInterface;
use Magento\Catalog\Api\CategoryLinkManagementInterface;

class DuplicateGuard
{
    public function __construct(
        private ProductRepositoryInterface $products,
        private CategoryLinkManagementInterface $categoryLinks
    ) {}

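    /**
     * Up to 20 sibling descriptions from the product's first category,
     * each truncated to 400 characters to cap prompt size.
     */
    public function priorInCategory($product): array
    {
        $categoryIds = $product->getCategoryIds();
        if (empty($categoryIds)) {
            return []; // no category assignment: nothing to compare against here
        }
        $assignments = $this->categoryLinks->getAssignedProducts((int)reset($categoryIds));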

        $prior = [];
        foreach ($assignments as $a) {
            if ($a->getSku() === $product->getSku()) continue;
            $sibling = $this->products->get($a->getSku());
            $desc = (string)$sibling->getDescription();
            if ($desc === '') continue;
            $prior[] = [
                'sku'         => $sibling->getSku(),
                'description' => mb_substr(strip_tags($desc), 0, 400),
            ];
            if (count($prior) >= 20) break; // cap context size
        }
        return $prior;
    }

    public function similarity(string $generated, array $prior): float
    {
        $a = $this->ngrams(strip_tags($generated), 5);
        $max = 0.0;
        foreach ($prior as $other) {
            $b = $this->ngrams($other['description'], 5);
            $shared = count(array_intersect($a, $b));
            $max = max($max, $shared / max(1, count($a)));
        }
        return round($max, 3);
    }

    private function ngrams(string $text, int $n): array
    {
        $words = preg_split('/\s+/', mb_strtolower(trim($text))) ?: [];
        $out = [];
        for ($i = 0, $end = count($words) - $n; $i <= $end; $i++) {
            $out[] = implode(' ', array_slice($words, $i, $n));
        }
        return $out;
    }
}

The 5-gram similarity score is the QA gate. Any description scoring above 0.15 against a sibling gets re-prompted with the explicit instruction "Your output shared 5-word phrases with sibling SKU X. Rewrite with different sentence structures." About 94% of SKUs pass on first generation; the remaining 6% pass on retry.
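The gate-and-retry loop can be sketched as plain functions. This is an illustration under assumptions, not the module's code: $generate is a hypothetical callable standing in for the Claude call, the retry cap of 2 is an arbitrary choice, and the n-gram logic follows the same approach as DuplicateGuard above.

```php
<?php
declare(strict_types=1);

/** Lowercased word n-grams, same approach as DuplicateGuard::ngrams(). */
function wordNgrams(string $text, int $n = 5): array
{
    $words = preg_split('/\s+/', mb_strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $out = [];
    for ($i = 0, $end = count($words) - $n; $i <= $end; $i++) {
        $out[] = implode(' ', array_slice($words, $i, $n));
    }
    return $out;
}

/** Highest share of the draft's 5-grams found in any single sibling, plus that sibling's SKU. */
function worstOverlap(string $draft, array $prior): array
{
    $a = wordNgrams(strip_tags($draft));
    $worst = ['sku' => null, 'score' => 0.0];
    foreach ($prior as $other) {
        $shared = count(array_intersect($a, wordNgrams($other['description'])));
        $score = (float)($shared / max(1, count($a)));
        if ($score > $worst['score']) {
            $worst = ['sku' => $other['sku'], 'score' => $score];
        }
    }
    return $worst;
}

/**
 * $generate is a hypothetical callable: fn(?string $extraInstruction): string.
 * Returns the first draft at or under the threshold, or null once retries run out.
 */
function generateWithGate(
    callable $generate,
    array $prior,
    float $threshold = 0.15,
    int $maxRetries = 2
): ?string {
    $instruction = null;
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $draft = $generate($instruction);
        $worst = worstOverlap($draft, $prior);
        if ($worst['score'] <= $threshold) {
            return $draft;
        }
        // Name the offending sibling so the model knows what to steer away from.
        $instruction = sprintf(
            'Your output shared 5-word phrases with sibling SKU %s. Rewrite with different sentence structures.',
            $worst['sku']
        );
    }
    return null; // still too similar: hand off to a human writer
}
```

Returning null instead of the last draft keeps a near-duplicate from being saved by accident; the caller decides whether to skip the SKU or queue it for a writer.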

5. Cost math at scale

The math is what makes the workflow viable. Six months ago a similar request would have quoted at $0.025-$0.04 per SKU on GPT-4-Turbo. Sonnet pricing turned it into a $12.60 line item.

Component	Per description	4,200 SKUs
Input tokens (~3,000 @ $3 / 1M)	$0.0090	$37.80
Output tokens (~350 @ $15 / 1M)	$0.0053	$22.05
Raw API cost	$0.0143	$59.85
With prompt caching (90% input cache hit)	$0.003	$12.60
Re-prompts on dup-guard fail (~6%)	+$0.0002	+$0.84
Total billed	~$0.0032	$13.44
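To rerun the table with your own token counts, a small estimator helps. This is a sketch under assumptions: costPerDescription() is a hypothetical helper, the default rates are the $3 / $15 per 1M tokens quoted in the table, cached input is billed at 10% of the input rate, and cache-write surcharges are ignored.

```php
<?php
declare(strict_types=1);

/**
 * Estimated API cost in USD for one description.
 * $cachedShare is the fraction of input tokens read from the prompt cache (0.0 to 1.0).
 */
function costPerDescription(
    int $inputTokens,
    int $outputTokens,
    float $cachedShare = 0.0,
    float $inputPerMillion = 3.00,
    float $outputPerMillion = 15.00,
    float $cacheReadMultiplier = 0.10
): float {
    $cached = $inputTokens * $cachedShare;
    $uncached = $inputTokens - $cached;
    $inputCost = ($uncached + $cached * $cacheReadMultiplier) * $inputPerMillion / 1_000_000;
    $outputCost = $outputTokens * $outputPerMillion / 1_000_000;
    return $inputCost + $outputCost;
}

$skus = 4200;
$raw = costPerDescription(3000, 350);            // no caching
$cached = costPerDescription(3000, 350, 1.0);    // every input token read from cache

printf("raw:    %.5f per description, %.2f per %d SKUs\n", $raw, $raw * $skus, $skus);
printf("cached: %.5f per description, %.2f per %d SKUs\n", $cached, $cached * $skus, $skus);
```

With the table's inputs, the uncached figure reproduces the $59.85 raw row. The fully cached case comes out near $0.006 per description (about $25.83 for the run), because output tokens are never discounted. That is above the table's $0.003 row, so the billed figure depends on the real output length and cached share: reconcile the estimate against the per-row token usage in the run CSV.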

Prompt caching is the lever. The 3,000-token brand-voice system prompt is identical across all 4,200 requests, so it lives in Anthropic's prompt-cache layer after the first call and bills at 10% of the input rate.^[4]
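One caveat on the request shape: to my understanding of Anthropic's prompt-caching documentation, a prefix is only cached when a content block carries a cache_control marker, so a system prompt sent as a plain string is billed at the full input rate. A minimal sketch of a payload that opts in, assuming hypothetical helper names (buildMessagesPayload, cacheReadTokens):

```php
<?php
declare(strict_types=1);

/**
 * Builds a /v1/messages payload whose system prompt is marked as a cacheable prefix.
 * The system field is a list of text blocks rather than a plain string.
 */
function buildMessagesPayload(
    string $brandVoice,
    string $userMessage,
    string $model = 'claude-3-5-sonnet-20241022',
    int $maxTokens = 600
): array {
    return [
        'model'      => $model,
        'max_tokens' => $maxTokens,
        'system'     => [
            [
                'type'          => 'text',
                'text'          => $brandVoice,
                'cache_control' => ['type' => 'ephemeral'], // marks the end of the cached prefix
            ],
        ],
        'messages'   => [
            ['role' => 'user', 'content' => $userMessage],
        ],
    ];
}

/** Cached input tokens reported in a decoded response; 0 when the field is absent. */
function cacheReadTokens(array $response): int
{
    return (int)($response['usage']['cache_read_input_tokens'] ?? 0);
}

$payload = buildMessagesPayload('<contents of brand-voice.md>', 'Product: 18V Cordless Impact Driver. ...');
echo json_encode($payload, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR), "\n";
```

Logging cacheReadTokens() per row in the run CSV is a cheap way to confirm the cache is being hit. Only the shared prefix qualifies: the per-SKU user message, including the prior sibling descriptions, changes on every call and bills at the normal input rate.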

A 4,200-SKU AI rewrite that costs $13 of API spend is the kind of math that makes the "but AI is expensive" objection retire for good. The cost is the SQL backup, not the tokens.

6. The human-review loop: 30 minutes, not 30 hours

The review is not reading 4,200 descriptions: it is reading the dry-run CSV in three passes, each catching a different failure mode.

## Pass 1: top 50 by similarity (dup-guard near-misses)
awk -F',' 'NR>1 {print $4, $1, $2}' run-2026-05-20.csv | sort -rn | head -50

## Pass 2: shortest + longest outputs (placeholder + hallucination)
awk -F',' 'NR>1 {print $3, $1}' run-2026-05-20.csv | sort -n | head -25
awk -F',' 'NR>1 {print $3, $1}' run-2026-05-20.csv | sort -rn | head -25

## Pass 3: banned-word leaks from the tone primer
grep -iE 'revolutionary|best-in-class|game-chang|seamless' run-2026-05-20.csv

The three passes catch >95% of bad outputs. Anything flagged gets re-prompted with a more specific instruction or sent to a human writer.
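Splitting on commas with awk gets fragile once descriptions contain commas or quoted HTML, so the same three passes can be run through a real CSV parser. A sketch under assumptions: the header names (sku, old_description, new_description, tokens, similarity) are hypothetical, as are the function names; adjust them to whatever the run log actually writes.

```php
<?php
declare(strict_types=1);

/** Loads the run log into a list of rows keyed by header name. */
function loadRun(string $path): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }
    $header = fgetcsv($handle, null, ',', '"', '');
    $rows = [];
    while ($header !== false && ($cells = fgetcsv($handle, null, ',', '"', '')) !== false) {
        if (count($cells) === count($header)) { // skips blank or truncated lines
            $rows[] = array_combine($header, $cells);
        }
    }
    fclose($handle);
    return $rows;
}

/** Pass 1: dup-guard near-misses, highest similarity first. */
function topBySimilarity(array $rows, int $limit = 50): array
{
    usort($rows, fn(array $a, array $b): int => (float)$b['similarity'] <=> (float)$a['similarity']);
    return array_slice($rows, 0, $limit);
}

/** Pass 2: shortest and longest outputs by word count. */
function lengthOutliers(array $rows, int $each = 25): array
{
    $words = fn(array $r): int => str_word_count(strip_tags($r['new_description']));
    usort($rows, fn(array $a, array $b): int => $words($a) <=> $words($b));
    return ['shortest' => array_slice($rows, 0, $each), 'longest' => array_slice($rows, -$each)];
}

/** Pass 3: rows whose new description leaks a banned word. */
function bannedLeaks(array $rows, array $banned): array
{
    $pattern = '/' . implode('|', array_map(fn(string $w): string => preg_quote($w, '/'), $banned)) . '/i';
    return array_values(array_filter(
        $rows,
        fn(array $r): bool => preg_match($pattern, $r['new_description']) === 1
    ));
}
```

Keeping each pass as a function that returns rows means the flagged SKUs can be fed straight back into a re-prompt run instead of being copied out of a terminal.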

7. Saving to Magento: EAV gotchas

Saving descriptions on Magento 2.4.4-2.4.9 carries two EAV gotchas that have broken every bulk-content workflow we have audited.

Gotcha 1: store scope

Default $product->setDescription() writes to the default store. Multi-store-view setups require setStoreId($storeId) before save(), once per view, or non-default views fall back to the old value.

Gotcha 2: full reindex + cache flush

The description attribute is indexed by catalogsearch_fulltext and rendered in FPC-cached PDP HTML. Both must clear after a bulk run.

bin/magento indexer:reindex catalogsearch_fulltext
bin/magento cache:clean full_page block_html
bin/magento cache:flush

On 4,200 SKUs, catalogsearch_fulltext reindex takes 2-5 minutes on OpenSearch. Run it as the last step, never per-product inside the loop.

What the Google response looked like, three months on

The store ran the rewrite on 2026-02-12. The Search Console data below runs through 2026-05-19 and is scoped to the affected URLs only.

Indexed pages: 4,198 / 4,200.
Soft 404s: 0 (was 187 before the rewrite).
Manual actions: 0.
Organic clicks, 90-day window: +38% versus the matched window before the rewrite.
Organic impressions: +52%.
Average position: 24.1 to 18.7 on long-tail SKU keywords.

The result is structural. Soft 404s on thin product pages are an indexation blocker; bringing 4,200 pages from 30-word placeholder copy to 200-300 words of brand-voiced content moved them from "not worth indexing" to "indexed and ranked."

What the workflow does not do

It does not write Page Title or Meta Description (separate template in Panth_AdvancedSeo). It does not write category descriptions (smaller, hand-curated set). It does not regenerate images, alt text, or schema markup (separate generators). And it does not run on configurable parent SKUs: the parent inherits child aggregation, which is correct 95% of the time.

Why Claude over GPT-4 for this task

Two numbers matter: claude-3-5-sonnet at $3 / 1M input + $15 / 1M output, plus a 90% prompt-cache discount on the system prompt. GPT-4o is nominally cheaper per token, but its prompt caching discounts cached input by 50% rather than 90%, so the saving on a bulk run with a large shared prefix is smaller. Gemini 1.5 Pro is cheapest per token but weakest on brand-voice adherence for B2B copy.

FAQ

Will Google penalize AI-generated product descriptions?

Google's spam policies penalize content created primarily to manipulate rankings, not content generated with AI assistance. The penalty risk on AI descriptions comes from two failure modes: structural duplication across siblings (the doorway pattern) and thin / templated output. The workflow above addresses both with the duplicate-content guard and the 200-300 word output schema. Three months of post-rewrite Search Console data show zero manual actions and a 38% click lift.

Can I run this on Adobe Commerce as well as Magento Open Source?

Yes: the workflow uses ProductRepositoryInterface and CategoryLinkManagementInterface, both of which ship identically on Adobe Commerce and Magento Open Source 2.4.4-2.4.9.

How long does the dry-run take on 4,200 SKUs?

On a 4-vCPU PHP-FPM container with one concurrent API call, ~6 hours. Adding 4-way parallelism via Symfony Console workers brings it under 90 minutes. Anthropic Tier-2 rate limits (50 req/min on Sonnet) are the ceiling.
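The figures above hang together arithmetically, which a short calculation shows. This is a back-of-envelope sketch with hypothetical function names: it assumes perfectly even work distribution and ignores retries and network jitter.

```php
<?php
declare(strict_types=1);

/** Wall-clock minutes for a run, given average seconds per API call and worker count. */
function runMinutes(int $skus, float $secondsPerCall, int $workers): float
{
    return $skus * $secondsPerCall / $workers / 60;
}

/** Aggregate request rate the workers put on the API. */
function requestsPerMinute(float $secondsPerCall, int $workers): float
{
    return $workers * 60 / $secondsPerCall;
}

// Average call time implied by the single-worker run: 6 hours over 4,200 SKUs (about 5.1 s).
$secondsPerCall = 6 * 3600 / 4200;

foreach ([1, 4, 5] as $workers) {
    printf(
        "%d worker(s): %.0f min, %.1f req/min\n",
        $workers,
        runMinutes(4200, $secondsPerCall, $workers),
        requestsPerMinute($secondsPerCall, $workers)
    );
}
```

Four workers land at roughly 90 minutes and about 47 requests per minute, just under the 50 req/min ceiling quoted above; a fifth worker would exceed it, so four is about as far as this setup scales without a higher rate-limit tier.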

How do I roll back if the new descriptions are wrong?

The CSV log written at var/log/ai-descriptions/run-{timestamp}.csv contains the old description per row. A second console command, panth:ai:product-descriptions:rollback --csv=run-{timestamp}.csv, restores every old description in 30 seconds. Ship both commands as a pair: never one without the other.

Does prompt caching work the same way on OpenAI?

OpenAI added prompt caching on GPT-4o in late 2024 for prompts over 1,024 tokens: 50% discount on cached input tokens versus Anthropic's 90%. The workflow still saves money on GPT-4o but the savings are smaller.

What if my products have no category assignments?

The duplicate-content guard falls back to comparing against the 100 most-recent generated descriptions across the catalog. The structural diversity check still works: only the "same category" tightness loosens.

Should the brand-voice file live in git or in the Magento admin?

Git. The brand voice is code: every edit deserves a diff, a commit message, and a revert path. Storing it in core_config_data via an admin field means changes are invisible and there is no rollback to last week's voice.

What a typical engagement looks like

Most bulk AI-content rollouts ship in 20-40 hours: 4h scoping brand-voice.md with the merchant, 6h wiring the console command and Claude client, 4h on the duplicate-content guard with unit tests, 2h on the dry-run CSV review tooling, 6h running the staging job + human review, 4h on the production deploy with reindex and Search Console verification.

References

[1] Anthropic, Pricing: Claude API. Reference for claude-3-5-sonnet-20241022 and claude-3-7-sonnet-20250219 token rates and prompt-caching discount applied across the bulk run.
[2] Adobe Developer Documentation, Product attributes: description on EAV. Reference for the store-scope and reindex behavior of catalog_product_entity_text.description across Magento 2.4.4-2.4.9.
[3] Adobe Developer Documentation, Configuration management: env.php and encrypted values. Reference for storing the Anthropic API key under system/default/panth/ai/anthropic_key via config:set --lock-env.
[4] Anthropic, Prompt caching: beta documentation. Reference for the 90% input-token discount on cached prefixes longer than 1,024 tokens used in the brand-voice system prompt.
[5] Production bulk-content engagements via kishansavaliya.com, 2024-2026. Patterns extracted from AI rewrite jobs shipped across Magento Open Source 2.4.4-2.4.9 + Hyvä storefronts.

Need a bulk AI-content rewrite shipped this sprint?

I am Kishan Savaliya, an Adobe-Certified Magento + Hyvä developer. I ship fixed-scope AI content workflows: brand-voice prompts, duplicate-content guards, CLI commands, CSV-logged dry runs, full Search Console verification on the live URLs. Fixed quote from $499 audit · $2,499 sprint · ~28h @ $25/hr. See hire me.
