{"id":1720,"date":"2026-04-06T00:06:00","date_gmt":"2026-04-06T00:06:00","guid":{"rendered":"https:\/\/inphronesys.com\/?p=1720"},"modified":"2026-04-06T00:06:00","modified_gmt":"2026-04-06T00:06:00","slug":"anthropic-just-killed-one-of-its-own-prompting-tricks-heres-what-that-means","status":"publish","type":"post","link":"https:\/\/inphronesys.com\/?p=1720","title":{"rendered":"Anthropic Just Killed One of Its Own Prompting Tricks \u2014 Here&#8217;s What That Means"},"content":{"rendered":"<p>Last month, a procurement director at a Fortune 500 manufacturer killed a $2.3M forecasting deal in 20 minutes. The vendor&#8217;s demo used prompts her own team could have written over lunch. &#8222;If that&#8217;s the moat,&#8220; she said, &#8222;there is no moat.&#8220; She&#8217;s half right. The moat isn&#8217;t the prompt \u2014 the moat is knowing which of six years of prompting ideas still apply in 2026, and which ones the vendors quietly abandoned in 2024.<\/p>\n<p>In July 2020, OpenAI shipped a 75-page paper called &#8222;Language Models are Few-Shot Learners.&#8220; Buried inside was an accident that would, six years later, make or break global logistics AI strategy: you don&#8217;t fine-tune these models \u2014 you <em>prompt<\/em> them.<\/p>\n<p>This post is the map.<\/p>\n<h2>What Prompting Is, and Why It Works<\/h2>\n<p>A prompt is the text you hand a language model to condition what it produces next. That is the entire definition. Few-shot examples, chain-of-thought, XML tags, retrieved documents, tool schemas \u2014 everything you&#8217;ve ever read about prompting is just different things to stuff into that text.<\/p>\n<p>To understand why it works, strip the hype away. A language model is trained on one job: given text, predict the next token. Repeat. It does not know what &#8222;translation&#8220; or &#8222;supplier risk scoring&#8220; is \u2014 it only knows which token is most likely to come next. 
When you type <em>&#8222;Translate English to French: cheese \u2192&#8220;<\/em> you are not asking a question. You are setting up a pattern and letting next-token prediction complete it. <em>Fromage.<\/em> You got a translation without training a translator.<\/p>\n<p>Before 2020, the default way to bend a language model to a task was to fine-tune it: collect labeled examples, run gradient descent, ship a new checkpoint. Prompting replaced that with a conditioning trick that requires no training, no labels, no checkpoint \u2014 which is why it became the default interface to LLMs. It collapsed the distance between &#8222;I have an idea&#8220; and &#8222;I have a working prototype&#8220; from weeks to minutes. Prompting is a conditioning technique, not a programming language. Most prompt-engineering pain comes from teams treating it as one.<\/p>\n<h2>Origins: The GPT-3 Accident (July 2020)<\/h2>\n<p>The founding document of modern prompting is Brown et al. (2020), &#8222;Language Models are Few-Shot Learners.&#8220; Thirty-one authors at OpenAI, 75 pages, and the quiet sentence that started everything:<\/p>\n<blockquote><p><em>&#8222;Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.&#8220;<\/em><\/p><\/blockquote>\n<p>Read it twice. The authors are reporting a <em>surprise<\/em>: scale the model to 175 billion parameters, simply ask it to do things with no gradient updates, and it starts doing them. Translation, arithmetic, unscrambling letters, simple reasoning. The model had picked up a latent skill for following instructions from nothing but next-token prediction on the open web.<\/p>\n<p>No one in 2020 called this &#8222;prompt engineering.&#8220; The technique had no name because it was barely recognized as a technique. 
The paper leans on &#8222;demonstrations&#8220; rather than &#8222;prompt.&#8220; The community would spend the next year figuring out what to call the thing they had just been handed.<\/p>\n<p>That is why 2020 matters: prompting was not invented. It was discovered \u2014 as a side effect of making the model bigger.<\/p>\n<h2>Evolution: A 6-Year Timeline (2020\u20132026)<\/h2>\n<blockquote><p><strong>Two months ago, Anthropic retired one of its own signature prompting tricks.<\/strong> Read that again: the vendor who invented prefill stopped recommending it. That single line tells you more about where prompting is in 2026 than any taxonomy can. Keep it in mind as you scan the next 15 milestones \u2014 most of them are one Claude release away from the same fate.<\/p><\/blockquote>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_timeline_2020_2026.png\" alt=\"The evolution of prompt engineering milestones from 2020 to 2026\" \/><\/p>\n<p>Six years, four acts, roughly fifteen papers and vendor documents that every working team should recognize on sight.<\/p>\n<p><strong>2020 \u2014 Discovery.<\/strong> Brown et al. publish GPT-3. Few-shot learning works. No one has a name for it.<\/p>\n<p><strong>January 2022 \u2014 Chain-of-Thought.<\/strong> Jason Wei and colleagues at Google Research publish &#8222;Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.&#8220; Their finding:<\/p>\n<blockquote><p><em>&#8222;We explore how generating a chain of thought \u2014 a series of intermediate reasoning steps \u2014 significantly improves the ability of large language models to perform complex reasoning.&#8220;<\/em><\/p><\/blockquote>\n<p>CoT becomes the most-cited prompting technique in history.<\/p>\n<p><strong>March 2022 \u2014 InstructGPT \/ RLHF.<\/strong> Ouyang et al. publish the InstructGPT paper. They fine-tune GPT-3 with human feedback (RLHF) so it follows instructions directly. 
The jaw-dropping finding:<\/p>\n<blockquote><p><em>&#8222;Outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters.&#8220;<\/em><\/p><\/blockquote>\n<p>This is the hinge. It made natural-language prompts feel natural. Eight months later, ChatGPT ships.<\/p>\n<p><strong>March 2022 \u2014 Self-Consistency.<\/strong> Wang et al. replace CoT&#8217;s greedy decoding with sampling multiple reasoning paths and taking a majority vote. GSM8K accuracy jumps by 17.9 percentage points.<\/p>\n<p><strong>May 2022 \u2014 Zero-Shot CoT.<\/strong> Kojima et al. publish the shortest magic spell in AI history: <em>&#8222;Let&#8217;s think step by step.&#8220;<\/em> That single phrase raises MultiArith accuracy from 17.7% to 78.7% on InstructGPT. Five words, no examples, no fine-tuning.<\/p>\n<p><strong>October 2022 \u2014 ReAct.<\/strong> Shunyu Yao et al. interleave reasoning with actions (tool calls). The paper becomes the template for every modern AI agent, from LangChain to Claude Code.<\/p>\n<p><strong>December 2022 \u2014 Constitutional AI.<\/strong> Bai et al. at Anthropic publish Constitutional AI: train a model to critique and revise its own outputs against a written list of principles. It is technically a training method, but the idea \u2014 that explicit written principles can steer behavior \u2014 will shape every system prompt Anthropic writes afterwards.<\/p>\n<p><strong>May 2023 \u2014 Tree of Thoughts.<\/strong> Yao et al. again. Generalize CoT into a search tree with branching, self-evaluation, and backtracking. On the Game of 24 benchmark, GPT-4 jumps from 4% (CoT) to 74% (ToT).<\/p>\n<p><strong>October 2023 \u2014 DSPy.<\/strong> Omar Khattab and the Stanford team publish DSPy: compile declarative programs into optimized prompts. 
The paper&#8217;s opening shot:<\/p>\n<blockquote><p><em>&#8222;Existing LM pipelines are typically implemented using hard-coded &#8218;prompt templates&#8216;, i.e. lengthy strings discovered via trial and error.&#8220;<\/em><\/p><\/blockquote>\n<p>DSPy is the first serious attempt to treat the prompt as a compile target instead of source code.<\/p>\n<p><strong>December 2023 \u2014 OpenAI&#8217;s six strategies.<\/strong> OpenAI publishes the canonical &#8222;Prompt Engineering Guide&#8220; with its six-strategy framework. The structure will remain stable for years.<\/p>\n<p><strong>May 2024 \u2014 Model Spec.<\/strong> OpenAI publishes its Model Spec, formalizing the instruction hierarchy (platform &gt; developer &gt; user).<\/p>\n<p><strong>June 2024 \u2014 The Prompt Report.<\/strong> Sander Schulhoff et al. publish a 1,500-paper meta-survey cataloguing 58 prompting techniques and 33 standardized vocabulary terms. The field gets a textbook.<\/p>\n<p><strong>August 2024 \u2014 Structured Outputs.<\/strong> OpenAI ships JSON Schema-based structured outputs. An entire category of &#8222;return your answer as JSON like this\u2026&#8220; prompts becomes obsolete overnight.<\/p>\n<p><strong>September 2024 (internal) \/ November 2024 (public) \u2014 Boonstra whitepaper.<\/strong> Google&#8217;s Lee Boonstra publishes a 68-page whitepaper titled simply &#8222;Prompt Engineering,&#8220; distributed through Kaggle. This is the single most comprehensive prompting document any vendor has ever published. Almost nobody in enterprise AI has read it.<\/p>\n<p><strong>2025 \u2014 GPT-4.1 prompting guide.<\/strong> OpenAI publishes a new cookbook entry warning developers that GPT-4.1 follows instructions <em>more literally<\/em> than its predecessors. Prompts that worked on GPT-3.5 by implying intent now fail because the model does exactly what you say. 
A new template for agentic prompting (Persistence \/ Tool-Calling \/ Planning) shows up.<\/p>\n<p><strong>June 25, 2025 \u2014 The Karpathy moment.<\/strong> Andrej Karpathy posts on X:<\/p>\n<blockquote><p><em>&#8222;+1 for &#8218;context engineering&#8216; over &#8218;prompt engineering&#8216;. [\u2026] context engineering is the delicate art and science of filling the context window with just the right information for the next step.&#8220;<\/em><\/p><\/blockquote>\n<p>Simon Willison adopts the phrase two days later. The art-vs-science debate gets reframed in a single tweet.<\/p>\n<p><strong>2026 \u2014 Anthropic consolidates and retires prefill.<\/strong> Alongside the Claude 4.6 release, Anthropic collapses eight separate prompting subpages into a single &#8222;living reference&#8220; \u2014 and officially deprecates prefill, one of its signature tricks. In their own words:<\/p>\n<blockquote><p><em>&#8222;Starting with Claude 4.6 models, prefilled responses on the last assistant turn are no longer supported. Model intelligence and instruction following has advanced such that most use cases of prefill no longer require it.&#8220;<\/em><\/p><\/blockquote>\n<p>When the inventor deprecates its own invention, your in-house &#8222;prompting playbook&#8220; from 2023 is ticking toward zero.<\/p>\n<h2>The Major Vendor Frameworks<\/h2>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_evolution_infographic.png\" alt=\"The evolution of prompting ideas from 2020 to 2026 \u2014 a NotebookLM-generated overview\" \/><\/p>\n<p>If you only read one section of this post, read this one. The three major model vendors have each published an official framework for how to prompt their models. Nobody in enterprise reads all three. 
Your vendors are counting on it.<\/p>\n<h3>Anthropic \u2014 &#8222;The Brilliant New Employee&#8220;<\/h3>\n<p>Anthropic&#8217;s prompting philosophy is, above all, a management analogy.<\/p>\n<blockquote><p><em>&#8222;Think of Claude as a brilliant but new employee who lacks context on your norms and workflows. The more precisely you explain what you want, the better the result.&#8220;<\/em><\/p><\/blockquote>\n<p>This one sentence explains every other Anthropic recommendation. You are not writing a query. You are writing an onboarding document.<\/p>\n<p>The <strong>Golden Rule of Claude prompting<\/strong> follows directly:<\/p>\n<blockquote><p><em>&#8222;Show your prompt to a colleague with minimal context on the task and ask them to follow it. If they&#8217;d be confused, Claude will be too.&#8220;<\/em><\/p><\/blockquote>\n<p>Anthropic&#8217;s signature technical move is <strong>XML tags<\/strong>. Every official example wraps documents in <code>&lt;document&gt;<\/code>, examples in <code>&lt;example&gt;<\/code>, and instructions in <code>&lt;instructions&gt;<\/code>. The reason is concrete:<\/p>\n<blockquote><p><em>&#8222;XML tags help Claude parse complex prompts unambiguously, especially when your prompt mixes instructions, context, examples, and variable inputs.&#8220;<\/em><\/p><\/blockquote>\n<p>For long-context prompting, Anthropic publishes the only specific, testable number in the entire vendor literature:<\/p>\n<blockquote><p><em>&#8222;Queries at the end can improve response quality by up to 30% in tests, especially with complex, multi-document inputs.&#8220;<\/em><\/p><\/blockquote>\n<p>That means: put the 200-page contract at the top of the prompt, and the question you want answered at the bottom. For a supplier risk report, document first, query last. If you invert it, you pay a measurable accuracy tax.<\/p>\n<p>Then there is the 2026 story. 
Anthropic <strong>deprecated prefill<\/strong> in Claude 4.6 and consolidated its eight prompting subpages into a single living reference. Prefill was the technique where you pre-write the beginning of Claude&#8217;s response (&#8222;Sure, here is the JSON:&#8220;) to force a format or skip a preamble. It was distinctive enough that it had its own documentation page for years. Now it&#8217;s gone \u2014 replaced by Structured Outputs for formatting and direct instructions for preamble removal. The retirement is a signal: the vendor who invented the trick has outgrown it.<\/p>\n<h3>OpenAI \u2014 &#8222;The Six Strategies, Now With Literal Instruction-Following&#8220;<\/h3>\n<p>OpenAI&#8217;s canonical framework is the <strong>six strategies<\/strong>, published in December 2023 and stable ever since:<\/p>\n<table style=\"border-collapse: collapse; width: 100%; margin: 1.5em 0; font-size: 0.95em; line-height: 1.5;\">\n<thead>\n<tr>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">#<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Strategy<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">The actual leverage<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">1<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Write clear instructions<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Delimiters, personas, explicit format requirements, desired length<\/td>\n<\/tr>\n<tr style=\"background: #ffffff;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">2<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Provide 
reference text<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">&#8222;A sheet of notes can help a student do better on a test&#8220;<\/td>\n<\/tr>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">3<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Split complex tasks into simpler subtasks<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Intent classification, recursive summarization, pipelines over monoliths<\/td>\n<\/tr>\n<tr style=\"background: #ffffff;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">4<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Give the model time to &#8222;think&#8220;<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">CoT, inner monologue, work out the answer before stating it<\/td>\n<\/tr>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">5<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Use external tools<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Retrieval, code execution, function calling<\/td>\n<\/tr>\n<tr style=\"background: #ffffff;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">6<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Test changes systematically<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Evals with gold-standard answers \u2014 the sixth strategy is the eval-driven mindset<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Strategy 2 deserves the direct quote because it is the most compressed summary of RAG ever written:<\/p>\n<blockquote><p><em>&#8222;In the same way that a sheet of notes can help a student do better on a test, providing 
reference text to GPTs can help in answering with fewer fabrications.&#8220;<\/em><\/p><\/blockquote>\n<p>The bigger story is what happened <em>after<\/em> the six strategies. With <strong>GPT-4.1<\/strong> (2025), OpenAI quietly told developers the rules had changed:<\/p>\n<blockquote><p><em>&#8222;GPT-4.1 is trained to follow instructions more closely and more literally than its predecessors.&#8220;<\/em><\/p><\/blockquote>\n<p>Prompts that used to work by <em>implying<\/em> intent now fail. The model does exactly what you say. The 2023-era habit of dropping in vague instructions and hoping the model would fill in the blanks stopped working in production.<\/p>\n<p>The fix is brutal in its simplicity:<\/p>\n<blockquote><p><em>&#8222;If model behavior is different from what you expect, a single sentence firmly and unequivocally clarifying your desired behavior is almost always sufficient.&#8220;<\/em><\/p><\/blockquote>\n<p>The GPT-4.1 guide also formalized the <strong>three-component agentic prompt<\/strong> every serious agentic system has adopted since:<\/p>\n<ol>\n<li><strong>Persistence:<\/strong> <em>&#8222;You are an agent \u2014 keep going until the user&#8217;s query is completely resolved, before ending your turn.&#8220;<\/em><\/li>\n<li><strong>Tool-Calling:<\/strong> <em>&#8222;If unsure about content, use tools to read files; do NOT guess or make up answers.&#8220;<\/em><\/li>\n<li><strong>Planning:<\/strong> <em>&#8222;Plan extensively before each function call and reflect on outcomes; do NOT chain tool calls only.&#8220;<\/em><\/li>\n<\/ol>\n<p>If your SCM vendor&#8217;s agent does not include these three components in its system prompt, it is skipping the 2025 playbook.<\/p>\n<p>Then came the <strong>Model Spec<\/strong> (May 2024, updated 2025), which formalized OpenAI&#8217;s instruction hierarchy: platform &gt; developer &gt; user. 
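<\/p>
<p>As a sketch of what that hierarchy looks like on the wire, here is a Chat Completions-style message list with the two tiers an application controls. The helper name and the prompt text are illustrative, not taken from the Model Spec:<\/p>

```python
# A sketch (not OpenAI's own code) of the instruction hierarchy as it
# appears in a Chat Completions-style message list: developer messages
# outrank user messages; platform rules sit above both, inside the model.
def build_messages(developer_rules: str, user_query: str) -> list[dict]:
    return [
        # "developer" is the higher-privilege, application-owned tier
        {"role": "developer", "content": developer_rules},
        # "user" is the lower-privilege, end-user tier
        {"role": "user", "content": user_query},
    ]

messages = build_messages(
    "Answer only from the provided ERP extract. Never invent supplier names.",
    "Why did the forecast for SKU 4711 drop next month?",
)
```

<p>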
OpenAI also renamed &#8222;system messages&#8220; to &#8222;developer messages&#8220; to match the hierarchy \u2014 a small terminological divergence from Anthropic&#8217;s &#8222;system prompt.&#8220; And in August 2024, <strong>Structured Outputs<\/strong> replaced an entire category of prompt-engineered JSON extraction with a schema-enforced API. The most common use case of prompting in 2023 \u2014 &#8222;give me JSON in this shape&#8220; \u2014 became a checkbox in 2024.<\/p>\n<h3>Google \u2014 &#8222;The Textbook Nobody Read&#8220;<\/h3>\n<p><strong>Contrarian take #2:<\/strong> <em>Lee Boonstra&#8217;s 68-page Google whitepaper is the best single prompting document ever published, and almost nobody in enterprise AI has read it.<\/em><\/p>\n<p>I realize this is the kind of claim you are supposed to soften in a blog post. I am not softening it. Anthropic publishes short, opinionated pages. OpenAI publishes a mix of cookbook notebooks and strategy guides. Google, via Lee Boonstra, published a single 68-page treatise in September 2024 (internal) and November 2024 (public) that catalogues twelve techniques, gives specific numerical starting values for temperature \/ top-K \/ top-P, and devotes an entire chapter to code prompting \u2014 all in one document. It is the closest thing the field has to a printable reference manual. It is distributed free on Kaggle. And unless you follow a handful of AI researchers on LinkedIn, you have probably never seen it.<\/p>\n<p>Boonstra&#8217;s taxonomy covers zero-shot, one-\/few-shot, system, role, contextual, step-back, chain-of-thought, self-consistency, tree of thoughts, ReAct, automatic prompt engineering, and code prompting. Two of these are Google-distinctive:<\/p>\n<p><strong>Step-back prompting<\/strong> originates from Google DeepMind research. Instead of asking the narrow question directly, you first ask a broader, higher-level question, then feed the answer back into the narrow question. 
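<\/p>
<p>Mechanically, step-back prompting is nothing more than two model calls chained together. A minimal sketch, with a stubbed <code>call_model<\/code> so it runs offline \u2014 the stub and function names are mine, not Boonstra&#8217;s:<\/p>

```python
# Step-back prompting as two chained calls. `call_model` is a stub so the
# flow runs offline; a real version would call an LLM API instead.
def call_model(prompt: str) -> str:
    return f"[model answer to: {prompt[:40]}...]"  # stub: echoes the prompt head

def step_back(broad_question: str, narrow_question: str) -> str:
    # Step 1: ask the broader question to surface background knowledge.
    background = call_model(broad_question)
    # Step 2: feed that answer back in, ahead of the narrow question.
    final_prompt = (
        f"Background:\n{background}\n\n"
        f"Using the background above, answer:\n{narrow_question}"
    )
    return call_model(final_prompt)

answer = step_back(
    "What factors determine safety stock in a service-level-driven policy?",
    "What safety stock should I hold for SKU 4711?",
)
```

<p>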
For supply chain: instead of <em>&#8222;what safety stock should I hold for SKU 4711?&#8220;<\/em> you first ask <em>&#8222;what factors determine safety stock in a service-level-driven inventory policy?&#8220;<\/em> The step-back answer activates the model&#8217;s background knowledge about service levels, demand variance, and lead-time distributions \u2014 material it might otherwise skip in a single-shot answer.<\/p>\n<p><strong>Automatic Prompt Engineering (APE)<\/strong> is Google&#8217;s philosophy distilled:<\/p>\n<blockquote><p><em>&#8222;Automatic Prompt Engineering not only alleviates the need for human input but also enhances the model&#8217;s performance in various tasks.&#8220;<\/em><\/p><\/blockquote>\n<p>Let the model write its own prompts. Generate N candidates, score them against a held-out eval set, keep the winner. This is the Google-flavored version of DSPy.<\/p>\n<p>But the single most valuable page of the whitepaper, the one your team should print and tape to the wall, is the <strong>sampling parameters table<\/strong>. No other major vendor publishes specific numerical starting points for <code>temperature<\/code>, <code>top-K<\/code>, and <code>top-P<\/code>. 
Boonstra does:<\/p>\n<table style=\"border-collapse: collapse; width: 100%; margin: 1.5em 0; font-size: 0.95em; line-height: 1.5;\">\n<thead>\n<tr>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Task profile<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Temperature<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Top-P<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Top-K<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>Balanced<\/strong> (most tasks)<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0.2<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0.95<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">30<\/td>\n<\/tr>\n<tr style=\"background: #ffffff;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>Creative<\/strong> (marketing, ideation)<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0.9<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0.99<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">40<\/td>\n<\/tr>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>Factual<\/strong> (analysis, Q&amp;A)<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0.1<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0.9<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 
14px; text-align: left;\">20<\/td>\n<\/tr>\n<tr style=\"background: #ffffff;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>Single correct answer<\/strong> (math, extraction)<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">0<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">\u2014<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">\u2014<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This is the only &#8222;try this first&#8220; table in the entire vendor prompting literature. For a forecasting or procurement audience that tunes hyperparameters for a living, it is the single most immediately actionable piece of advice any vendor has shipped.<\/p>\n<h3>The Vendor Matrix at a Glance<\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_vendor_matrix.png\" alt=\"Vendor \u00d7 technique matrix: how Anthropic, OpenAI, and Google handle the core prompting techniques\" \/><\/p>\n<p>Three vendors, three philosophies, measurable convergence by 2026. OpenAI and Anthropic both recommend XML for document structuring now. Anthropic and OpenAI treat evals as mandatory; Google folds them into its iterative-engineering framing. All three have published agentic-prompting guidance. The differences are no longer about <em>what<\/em> to do \u2014 they are about <em>which model&#8217;s quirks<\/em> you are working around on any given day.<\/p>\n<h2>Art or Science? (Both \u2014 And Here&#8217;s the Split)<\/h2>\n<p>For two years, trade press wrote prompting&#8217;s obituary (&#8222;AI Prompt Engineering Is Dead,&#8220; IEEE Spectrum, March 2024). For those same two years, practitioners kept iterating on prompts. 
The contradiction had nowhere to go \u2014 until June 25, 2025, when Andrej Karpathy posted:<\/p>\n<blockquote><p><em>&#8222;+1 for &#8218;context engineering&#8216; over &#8218;prompt engineering&#8216;. [\u2026] context engineering is the delicate art and science of filling the context window with just the right information for the next step.&#8220;<\/em><\/p><\/blockquote>\n<p>Karpathy is one of the few voices every camp listens to: ex-OpenAI founding member, ex-Tesla AI director, author of the most-watched LLM tutorials on YouTube. When he called it &#8222;the delicate art and science,&#8220; the binary collapsed. By June 27, Simon Willison was writing: <em>&#8222;I think &#8218;context engineering&#8216; is going to stick \u2014 it has an inferred definition that&#8217;s much closer to the actual work involved than &#8218;prompt engineering&#8216; does.&#8220;<\/em><\/p>\n<p>Five recognizable camps formed around the new consensus: the &#8222;prompting is dead&#8220; populists (IEEE Spectrum, Salesforce Ben), the eval-driven pragmatists (Hamel Husain, Jason Liu \u2014 <em>&#8222;Many people focus exclusively on [prompt engineering], which prevents them from improving their LLM products beyond a demo,&#8220;<\/em> Husain, March 2024), the DSPy &#8222;compile-don&#8217;t-prompt&#8220; researchers (Khattab et al.), the context-engineering school (Karpathy, Willison, Addy Osmani), and a smaller residual of craft defenders (Mollick, half of Willison) arguing that intuition still matters at the frontier. In 2026 the eval-driven and context-engineering camps are winning in enterprise. 
The pure-art camp has lost the production conversation.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_interest_trend.png\" alt=\"Search interest in &quot;prompt engineering&quot; from 2020 to 2026, with annotated milestones\" \/><\/p>\n<p><strong>Contrarian take #3:<\/strong> <em>The art-vs-science debate was settled in 2024 by evals, and the vendors quietly agreed.<\/em> Anthropic&#8217;s prompt engineering overview now leads with &#8222;before prompt engineering, you need success criteria, empirical tests, and a first draft prompt.&#8220; OpenAI&#8217;s sixth strategy is &#8222;Test changes systematically.&#8220; Google&#8217;s Boonstra whitepaper makes iterative evaluation its closing chapter. Every major vendor now treats evals as the prerequisite, not the afterthought. The art won the creative-task conversation (marketing copy, brand voice, ideation). The science won the production-pipeline conversation (forecasting, risk scoring, contract extraction). If your SCM application is in the second category, the eval-driven camp is the one you live in.<\/p>\n<h2>What This Means for Supply Chain Teams<\/h2>\n<p>Everything above is background. Here is the operational section. Five concrete SCM scenarios, one per row, where the vendor you pick and the prompting philosophy you apply change the outcome you get.<\/p>\n<p><strong>1. Demand forecast reasoning explanations.<\/strong> You want the model to explain <em>why<\/em> the forecast dropped for SKU 4711 next month. Use Anthropic&#8217;s Claude with a <code>&lt;thinking&gt;<\/code> or adaptive-thinking prompt and ground the model in quoted time-series data. Put the historical data at the top of the context window, the question at the bottom \u2014 you will gain up to 30% in answer quality per Anthropic&#8217;s own testing. 
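<\/p>
<p>In code, that ordering is nothing more than where you concatenate the pieces. A minimal sketch \u2014 the tag names follow the XML-tag convention and the numbers are invented for illustration:<\/p>

```python
# "Documents first, query last", reduced to string order. Tag names
# (<time_series>, <question>) follow the XML-tag convention; the data
# is made up for illustration.
def build_long_context_prompt(document: str, question: str) -> str:
    return (
        f"<time_series>\n{document}\n</time_series>\n\n"  # long content on top
        f"<question>\n{question}\n</question>"            # the ask at the very end
    )

prompt = build_long_context_prompt(
    "2025-10: 1200 units\n2025-11: 1150 units\n2025-12: 860 units",
    "Why did the forecast for SKU 4711 drop?",
)
```

<p>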
Use the &#8222;brilliant new employee&#8220; framing: tell the model what S&amp;OP meeting this is for, who will read the explanation, and what the stakes are.<\/p>\n<p><strong>2. Supplier risk scoring narratives.<\/strong> You have a structured risk score (financial, delivery, geopolitical) and you want a short narrative justifying it. Use OpenAI GPT-4.1 with a literal, explicit instruction set \u2014 <em>&#8222;summarize the top three risk drivers in \u2264120 words, cite the source field names in brackets, do not speculate about drivers not present in the data.&#8220;<\/em> GPT-4.1&#8217;s literal-instruction-following is a feature here, not a bug. Pair it with Structured Outputs if you need machine-readable fields alongside the narrative.<\/p>\n<p><strong>3. Contract review with citations.<\/strong> Load a 60-page supplier agreement and ask the model to flag every auto-renewal, uncapped-liability, and unilateral-price-change clause, with exact quoted text for each flag. This is Anthropic&#8217;s home turf: XML-wrapped documents at the top, the extraction query at the bottom, and an explicit instruction to extract quotes <em>before<\/em> reasoning over them. Anthropic&#8217;s grounding-in-quotes pattern is the single most hallucination-resistant workflow in any vendor&#8217;s playbook.<\/p>\n<p><strong>4. S&amp;OP meeting summaries.<\/strong> You have a 90-minute meeting transcript and want a structured summary: decisions, action items, open risks, unresolved questions. Use OpenAI&#8217;s recursive summarization pattern (Strategy 3 \u2014 split complex tasks): chunk the transcript into 10-minute segments, summarize each, then summarize the summaries. The three-component agentic prompt (Persistence \/ Tool-Calling \/ Planning) becomes relevant if you also want the model to cross-check action items against prior meeting minutes retrieved from your knowledge base.<\/p>\n<p><strong>5. 
Inventory policy explainers for non-technical stakeholders.<\/strong> You want to explain why the safety stock for a seasonal SKU changed after the latest demand review, for an audience that does not know what \u03c3 or z-scores are. This is Google Boonstra&#8217;s step-back prompting scenario: first prompt the model to answer the general question <em>&#8222;what factors determine safety stock in a service-level-driven inventory policy?&#8220;<\/em> Then feed that answer, plus the specific SKU&#8217;s numbers, into the explanation prompt. The two-step method outperforms the one-shot ask noticeably on long-tail items where the model would otherwise skip the conceptual scaffolding.<\/p>\n<p>These are not hypothetical. Every one of them maps to a concrete vendor feature shipped in the last 18 months. If your vendor cannot say which of the five patterns their product uses for your use case, you are the eval harness.<\/p>\n<h2>Practical Best Practices for 2026<\/h2>\n<p>Ten techniques that survived the 2020\u20132026 churn and still matter. Each comes from the vendor docs above, each has an SCM application, each is boring in exactly the way that ships products.<\/p>\n<ol>\n<li><strong>Explicit delimiters (XML tags or pipe-delimited).<\/strong> <em>Anthropic; OpenAI GPT-4.1 guide.<\/em> Wrap every document, example, and instruction in its own tag. SCM example: <code>&lt;po_data&gt;...&lt;\/po_data&gt;<\/code>, <code>&lt;exception_rules&gt;...&lt;\/exception_rules&gt;<\/code>, <code>&lt;question&gt;...&lt;\/question&gt;<\/code>.<\/li>\n<li><strong>3\u20135 diverse, structured examples.<\/strong> <em>Anthropic; Boonstra whitepaper.<\/em> More examples beat fewer examples only up to a point; diversity beats volume. 
SCM example: for a supplier-risk narrative task, include one high-risk, one medium-risk, and one low-risk example \u2014 not three high-risk ones.<\/li>\n<li><strong>Long documents first, queries last.<\/strong> <em>Anthropic.<\/em> The 30% quality claim is vendor-tested. Flip this at your own cost. SCM example: for contract review, contract text at the top of the prompt, extraction instructions at the bottom.<\/li>\n<li><strong>Ground in quotes before reasoning.<\/strong> <em>Anthropic.<\/em> Ask the model to extract the relevant passages first, wrap them in <code>&lt;quotes&gt;<\/code>, then reason over them. Cuts hallucination more reliably than any other single pattern. SCM example: in a supplier contract analysis, extract all payment-terms clauses verbatim before generating a summary.<\/li>\n<li><strong>Provide reference text to cut hallucinations.<\/strong> <em>OpenAI Strategy 2.<\/em> <em>&#8222;A sheet of notes can help a student do better on a test.&#8220;<\/em> For anything touching prices, dates, SKU IDs, or supplier names, never rely on model memory \u2014 pass the authoritative data in the prompt. SCM example: supplier master data fields must come from the ERP in the context, never from the model&#8217;s training.<\/li>\n<li><strong>Decompose complex tasks into pipelines.<\/strong> <em>OpenAI Strategy 3.<\/em> Single complex prompts have higher error rates than chains of simpler prompts. SCM example: contract review \u2192 (extract clauses) \u2192 (classify clauses) \u2192 (summarize risk). Three prompts, not one.<\/li>\n<li><strong>The three-component agentic prompt: Persistence, Tool-Calling, Planning.<\/strong> <em>OpenAI GPT-4.1 guide.<\/em> Every agent system prompt should include all three. 
SCM example: an S&amp;OP assistant that pulls data from the ERP needs <em>&#8222;do not guess values \u2014 use the ERP tool&#8220;<\/em> and <em>&#8222;plan your retrieval before calling tools.&#8220;<\/em><\/li>\n<li><strong>Structured Outputs instead of prompt-engineered JSON.<\/strong> <em>OpenAI 2024; equivalents from Anthropic and Google since.<\/em> If you are still writing <em>&#8222;return your answer as JSON like this\u2026&#8220;<\/em> in 2026, you are paying a reliability tax for no reason. SCM example: forecast confidence intervals, risk scores, exception categories \u2014 all go through schema-enforced outputs, not prompt-engineered formatting.<\/li>\n<li><strong>Eval-driven iteration, not vibes.<\/strong> <em>OpenAI Strategy 6; Anthropic overview; Husain 2024.<\/em> Before changing a prompt, run it against a representative eval set and measure the delta. SCM example: build a 30-example eval set of supplier emails with labeled exception categories before you touch the classification prompt again.<\/li>\n<li><strong>Write the prompt for a brilliant new hire, not a search engine.<\/strong> <em>Anthropic.<\/em> Explain <em>why<\/em>, not just <em>what<\/em>. Explain the stakes, the audience, and the consequences of errors. This is the single biggest mindset shift from 2022-era prompting. SCM example: <em>&#8222;You are helping a category manager who is preparing a supplier negotiation for next Tuesday. Errors in this summary will be seen by a director before the meeting \u2014 flag low-confidence items explicitly.&#8220;<\/em><\/li>\n<\/ol>\n<p>If your team practices all ten, you are past the prompting conversation. 
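<\/p>
<p>As a minimal sketch of how practices 1, 6, and 9 combine, the snippet below wraps every input in its own delimiter, splits contract review into three simple prompts instead of one complex prompt, and asserts a structural invariant before anything ships. The stage wording, helper names, and the placeholder convention are illustrative, not taken from any vendor guide.<\/p>

```python
# Minimal sketch of practices 1, 6, and 9: explicit delimiters, task
# decomposition, and an invariant check. Stage wording, helper names,
# and the {placeholder} convention are illustrative, not vendor APIs.

def wrap(tag: str, text: str) -> str:
    """Practice 1: give every input its own explicit delimiter."""
    return f"<{tag}>\n{text}\n</{tag}>"

def contract_review_prompts(contract_text: str) -> list[str]:
    """Practice 6: three simple prompts chained, not one complex prompt."""
    extract = ("Extract every auto-renewal, liability, and price-change "
               "clause verbatim from the contract.\n"
               + wrap("contract", contract_text))
    classify = ("Classify each clause below by risk type.\n"
                + wrap("clauses", "{extracted_clauses}"))
    summarize = ("Summarize overall contract risk using only the "
                 "classified clauses.\n"
                 + wrap("classified_clauses", "{classified_clauses}"))
    return [extract, classify, summarize]

prompts = contract_review_prompts(
    "Section 12.1: This agreement renews automatically for successive "
    "one-year terms unless terminated in writing."
)

# Practice 9 in miniature: assert invariants before a prompt change ships.
for stage in prompts:
    assert stage.count("</") == 1, "each stage carries exactly one delimited input"
```

<p>In production, each stage&#8217;s output fills the next stage&#8217;s placeholder; the payoff is that every prompt stays simple enough to eval in isolation.<\/p>
<p>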
If your team practices two of them, you are one procurement RFP away from an uncomfortable meeting.<\/p>\n<h2>Interactive Dashboard<\/h2>\n<p>Explore six years of prompting milestones yourself \u2014 filter by vendor, year, and task type, then grab copy-paste templates for your own work.<\/p>\n<div class=\"dashboard-link\" style=\"margin: 2em 0; padding: 1.5em; background: #f8f9fa; border-left: 4px solid #0073aa; border-radius: 4px;\">\n<p style=\"margin: 0 0 0.5em 0; font-size: 1.1em;\"><strong>Interactive Dashboard<\/strong><\/p>\n<p style=\"margin: 0 0 1em 0;\">Explore the data yourself \u2014 adjust parameters and see the results update in real time.<\/p>\n<p><a style=\"display: inline-block; padding: 0.6em 1.2em; background: #0073aa; color: #fff; text-decoration: none; border-radius: 4px; font-weight: bold;\" href=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/2026-04-05_Evolution_of_Prompting_dashboard-1.html\" target=\"_blank\" rel=\"noopener\">Open Interactive Dashboard \u2192<\/a><\/p>\n<\/div>\n<h2>The Bridge \u2014 What Comes After Prompting<\/h2>\n<p>All of this is already being replaced.<\/p>\n<p>By 2025, the serious practitioners had stopped calling it &#8222;prompt engineering&#8220; at all. Karpathy called it &#8222;context engineering.&#8220; Nate B Jones argued that prompting had split into four fundamentally different skills \u2014 <strong>Prompt Craft<\/strong>, <strong>Context Engineering<\/strong>, <strong>Intent Engineering<\/strong>, and <strong>Specification Engineering<\/strong> \u2014 and that most of us are still only practicing the first one. By the time an agent executes autonomously for 35 minutes after you hit send, the 2022-era habit of tweaking a sentence and re-running is already two paradigms behind.<\/p>\n<p>If this post is the history, the companion post on the <a href=\"\/context-engineering-4-skills\/\">Four Skills of AI Prompting<\/a> is the roadmap forward. 
Read this one first if you want to understand where the frameworks came from. Read that one next if you want to understand where your job is going.<\/p>\n<h2>Your Next Steps<\/h2>\n<p>Three concrete actions for this week. Not next quarter.<\/p>\n<ol>\n<li><strong>Pick one vendor&#8217;s guide and read it cover to cover by Friday.<\/strong> If you only have time for one, make it Lee Boonstra&#8217;s Google whitepaper on Kaggle \u2014 68 pages, free, and by far the most comprehensive. If you use Claude daily, read Anthropic&#8217;s consolidated <code>claude-prompting-best-practices<\/code> page instead. If you use GPT daily, read the OpenAI GPT-4.1 prompting guide in the developer cookbook. Pick one. Finish it.<\/li>\n<li><strong>Audit your team&#8217;s top 5 prompts against the 10-practice list above.<\/strong> Pull up the five prompts your team runs most often this quarter. Score each one against the list. Any prompt that fails three or more items is technical debt \u2014 rewrite it before you ship the next feature.<\/li>\n<li><strong>Build a 10-example eval set <em>before<\/em> your next prompt change.<\/strong> Ten labeled examples is the minimum viable eval harness. Do it before you tune the prompt, not after. Every subsequent prompt change gets measured against the same ten examples. This is the single habit that separates the teams that ship from the teams that demo.<\/li>\n<\/ol>\n<p>If you do these three things, you will leave the &#8222;prompt engineers&#8220; in your industry behind within a month. If you do none of them, you will remain the person the AI vendor builds a pitch deck around.<\/p>\n<details>\n<summary><strong>Show R Code: Generating the Prompting Evolution Images<\/strong><\/summary>\n<p>All three visualizations in this post \u2014 the 2020\u20132026 milestone timeline, the vendor-technique matrix, and the illustrative interest trend \u2014 were generated with the R script below. 
It sources the shared <code>theme_inphronesys.R<\/code> (Inter font, brand palette) so the charts match every other post on the site. Run it from the project root with <code>Rscript Scripts\/generate_prompt_evolution_images.R<\/code>. One caveat: as published, the <code>ggsave()<\/code> calls point at the site&#8217;s upload URLs, which <code>ggsave()<\/code> cannot write to; substitute local paths (the filenames listed in the script header) before running.<\/p>\n<pre><code class=\"language-r\"># =============================================================================\n# Prompting Evolution \u2014 Image Generation\n# =============================================================================\n# Generates 3 visualizations for the \"How Prompting Evolved 2020-2026\" blog\n# post.\n#\n#   1. prompt_timeline_2020_2026.png \u2014 milestone scatter timeline (800x600)\n#   2. prompt_vendor_matrix.png      \u2014 technique x vendor heatmap  (800x700)\n#   3. prompt_interest_trend.png     \u2014 illustrative interest trend (800x500)\n#\n# Run from project root:\n#   Rscript Scripts\/generate_prompt_evolution_images.R\n# =============================================================================\n\nsource(\"Scripts\/theme_inphronesys.R\")\n\nlibrary(ggplot2)\nlibrary(dplyr)\nlibrary(tidyr)\nlibrary(scales)\nlibrary(ggrepel)\n\n# =============================================================================\n# IMAGE 1 \u2014 Milestone Scatter Timeline (2020-01 to 2026-04)\n# =============================================================================\n\ntimeline &lt;- tibble::tribble(\n  ~date,         ~category,     ~label,                                     ~highlight,\n  \"2020-07-01\", \"Academic\",    \"GPT-3 (Brown et al., few-shot)\",           FALSE,\n  \"2022-01-01\", \"Academic\",    \"Chain-of-Thought (Wei)\",                   FALSE,\n  \"2022-03-04\", \"OpenAI\",      \"InstructGPT \/ RLHF (Ouyang)\",              FALSE,\n  \"2022-03-21\", \"Academic\",    \"Self-Consistency (Wang)\",                  FALSE,\n  \"2022-05-01\", \"Academic\",    \"Zero-shot CoT (Kojima)\",                   FALSE,\n  \"2022-10-01\", \"Academic\",    \"ReAct (Yao)\",                              
FALSE,\n  \"2022-11-01\", \"OpenAI\",      \"ChatGPT launch\",                           FALSE,\n  \"2022-12-15\", \"Anthropic\",   \"Constitutional AI (Bai)\",                  FALSE,\n  \"2023-05-01\", \"Academic\",    \"Tree of Thoughts (Yao)\",                   FALSE,\n  \"2023-10-01\", \"Open Source\", \"DSPy (Khattab)\",                           FALSE,\n  \"2023-12-01\", \"OpenAI\",      \"Six strategies guide published\",           FALSE,\n  \"2024-03-01\", \"Open Source\", \"IEEE \\\"Prompt Engineering Is Dead\\\"\",      FALSE,\n  \"2024-05-01\", \"OpenAI\",      \"Model Spec published\",                     FALSE,\n  \"2024-06-01\", \"Academic\",    \"Prompt Report (Schulhoff)\",                FALSE,\n  \"2024-08-01\", \"OpenAI\",      \"Structured Outputs\",                       FALSE,\n  \"2024-09-01\", \"Google\",      \"Boonstra whitepaper (internal)\",           FALSE,\n  \"2024-11-01\", \"Google\",      \"Boonstra whitepaper (Kaggle public)\",      FALSE,\n  \"2025-04-01\", \"OpenAI\",      \"GPT-4.1 prompting guide\",                  FALSE,\n  \"2025-06-25\", \"Open Source\", \"Karpathy: \\\"context engineering\\\"\",        TRUE,\n  \"2026-02-01\", \"Anthropic\",   \"Claude 4.6 + docs consolidation\",          TRUE,\n  \"2026-02-15\", \"Anthropic\",   \"Prefill deprecated\",                       TRUE\n)\n\ntimeline &lt;- timeline |&gt;\n  mutate(\n    date = as.Date(date),\n    category = factor(category,\n                      levels = c(\"Academic\", \"Anthropic\", \"OpenAI\",\n                                 \"Google\", \"Open Source\"))\n  )\n\ncat_colors &lt;- iph_palette(5)\nnames(cat_colors) &lt;- levels(timeline$category)\n\np_timeline &lt;- ggplot(timeline, aes(x = date, y = category, color = category)) +\n  geom_point(data = filter(timeline, !highlight),\n             size = 3.2, alpha = 0.85) +\n  geom_point(data = filter(timeline, highlight),\n             size = 5, alpha = 0.95) +\n  geom_text_repel(\n    data = 
filter(timeline, !highlight),\n    aes(label = label),\n    size = 2.9, family = \"Inter\", color = iph_colors$dark,\n    segment.color = iph_colors$lightgrey, segment.size = 0.3,\n    box.padding = 0.45, point.padding = 0.3,\n    min.segment.length = 0, max.overlaps = Inf, seed = 42\n  ) +\n  geom_text_repel(\n    data = filter(timeline, highlight),\n    aes(label = label),\n    size = 3.1, family = \"Inter\", fontface = \"bold\",\n    color = iph_colors$dark,\n    segment.color = iph_colors$grey, segment.size = 0.4,\n    box.padding = 0.6, point.padding = 0.35,\n    min.segment.length = 0, max.overlaps = Inf, seed = 42\n  ) +\n  scale_color_manual(values = cat_colors, name = NULL) +\n  scale_x_date(\n    limits = as.Date(c(\"2020-01-01\", \"2026-06-30\")),\n    date_breaks = \"1 year\", date_labels = \"%Y\",\n    expand = expansion(mult = c(0.02, 0.04))\n  ) +\n  scale_y_discrete(limits = rev) +\n  labs(\n    title = \"How prompting evolved: 2020-2026\",\n    subtitle = \"A decade of breakthroughs compressed into six years \\u2014 and the moment the field renamed itself.\",\n    x = NULL, y = NULL,\n    caption = \"Sources: arxiv.org, platform.claude.com, developers.openai.com, kaggle.com\/whitepaper-prompt-engineering \\u00b7 Captured 2026-04-05\"\n  ) +\n  theme_inphronesys(grid = \"x\") +\n  theme(\n    legend.position = \"top\",\n    legend.justification = \"left\",\n    axis.text.y = element_text(face = \"bold\", color = iph_colors$dark, size = 11)\n  )\n\nggsave(\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_timeline_2020_2026.png\", plot = p_timeline,\n       width = 8, height = 6, dpi = 100, bg = \"white\")\n\n# =============================================================================\n# IMAGE 2 \u2014 Technique x Vendor Tile Heatmap\n# =============================================================================\n\ntechnique_levels &lt;- c(\n  \"XML tags \/ delimiters\",\n  \"Few-shot examples (3-5)\",\n  \"System prompts \/ 
role\",\n  \"Chain of Thought\",\n  \"Self-Consistency\",\n  \"Tree of Thoughts\",\n  \"ReAct (tool use + reasoning)\",\n  \"Step-back prompting\",\n  \"Automatic Prompt Engineering (APE)\",\n  \"Prefill responses\",\n  \"Structured outputs (feature, not prompt)\",\n  \"Long-context ordering (docs first)\",\n  \"Evals \/ systematic testing\",\n  \"Reference text \/ grounding\",\n  \"Decompose complex tasks\",\n  \"Explicit literal instructions\",\n  \"Instruction hierarchy (dev\/user)\",\n  \"Temperature\/top-K\/top-P guidance\",\n  \"Code prompting chapter\",\n  \"Constitutional principles\"\n)\n\n# \"Emphasized\"  = named, prominent part of the vendor's guide\n# \"Mentioned\"   = appears but is not foregrounded\n# \"Not covered\" = not included in official guidance\nmatrix_data &lt;- tibble::tribble(\n  ~technique,                                 ~Anthropic,    ~OpenAI,       ~Google,\n  \"XML tags \/ delimiters\",                    \"Emphasized\",  \"Emphasized\",  \"Mentioned\",\n  \"Few-shot examples (3-5)\",                  \"Emphasized\",  \"Emphasized\",  \"Emphasized\",\n  \"System prompts \/ role\",                    \"Emphasized\",  \"Emphasized\",  \"Emphasized\",\n  \"Chain of Thought\",                         \"Mentioned\",   \"Emphasized\",  \"Emphasized\",\n  \"Self-Consistency\",                         \"Not covered\", \"Mentioned\",   \"Emphasized\",\n  \"Tree of Thoughts\",                         \"Not covered\", \"Not covered\", \"Emphasized\",\n  \"ReAct (tool use + reasoning)\",             \"Emphasized\",  \"Emphasized\",  \"Emphasized\",\n  \"Step-back prompting\",                      \"Mentioned\",   \"Not covered\", \"Emphasized\",\n  \"Automatic Prompt Engineering (APE)\",       \"Not covered\", \"Not covered\", \"Emphasized\",\n  \"Prefill responses\",                        \"Mentioned\",   \"Not covered\", \"Not covered\",\n  \"Structured outputs (feature, not prompt)\", \"Emphasized\",  \"Emphasized\",  \"Emphasized\",\n  
\"Long-context ordering (docs first)\",       \"Emphasized\",  \"Emphasized\",  \"Mentioned\",\n  \"Evals \/ systematic testing\",               \"Emphasized\",  \"Emphasized\",  \"Mentioned\",\n  \"Reference text \/ grounding\",               \"Emphasized\",  \"Emphasized\",  \"Mentioned\",\n  \"Decompose complex tasks\",                  \"Mentioned\",   \"Emphasized\",  \"Mentioned\",\n  \"Explicit literal instructions\",            \"Emphasized\",  \"Emphasized\",  \"Mentioned\",\n  \"Instruction hierarchy (dev\/user)\",         \"Mentioned\",   \"Emphasized\",  \"Not covered\",\n  \"Temperature\/top-K\/top-P guidance\",         \"Mentioned\",   \"Mentioned\",   \"Emphasized\",\n  \"Code prompting chapter\",                   \"Mentioned\",   \"Mentioned\",   \"Emphasized\",\n  \"Constitutional principles\",                \"Emphasized\",  \"Not covered\", \"Not covered\"\n)\n\nmatrix_long &lt;- matrix_data |&gt;\n  pivot_longer(cols = c(Anthropic, OpenAI, Google),\n               names_to = \"vendor\", values_to = \"emphasis\") |&gt;\n  mutate(\n    technique = factor(technique, levels = rev(technique_levels)),\n    vendor = factor(vendor, levels = c(\"Anthropic\", \"OpenAI\", \"Google\")),\n    emphasis = factor(emphasis,\n                      levels = c(\"Emphasized\", \"Mentioned\", \"Not covered\"))\n  )\n\nemphasis_fill &lt;- c(\n  \"Emphasized\"  = iph_colors$blue,\n  \"Mentioned\"   = iph_colors$lightgrey,\n  \"Not covered\" = \"#fafbfc\"\n)\n\np_matrix &lt;- ggplot(matrix_long,\n                   aes(x = vendor, y = technique, fill = emphasis)) +\n  geom_tile(color = \"white\", linewidth = 1.2) +\n  geom_tile(data = filter(matrix_long, emphasis == \"Not covered\"),\n            color = iph_colors$lightgrey, fill = \"#fafbfc\", linewidth = 0.6) +\n  geom_text(aes(label = emphasis, color = emphasis),\n            size = 3.1, family = \"Inter\", fontface = \"bold\") +\n  scale_fill_manual(values = emphasis_fill, guide = \"none\") +\n  
scale_color_manual(\n    values = c(\"Emphasized\"  = \"white\",\n               \"Mentioned\"   = iph_colors$dark,\n               \"Not covered\" = iph_colors$grey),\n    guide = \"none\"\n  ) +\n  scale_x_discrete(position = \"top\") +\n  labs(\n    title = \"Three vendors, twenty techniques, one honest map\",\n    subtitle = \"What each vendor officially recommends (April 2026).\",\n    x = NULL, y = NULL,\n    caption = \"Sources: Anthropic Prompting Best Practices, OpenAI GPT-4.1 Guide + Six Strategies, Google Boonstra Whitepaper \\u00b7 Captured 2026-04-05\"\n  ) +\n  theme_inphronesys(grid = \"none\") +\n  theme(\n    axis.text.x = element_text(face = \"bold\", size = 12, color = iph_colors$dark),\n    axis.text.y = element_text(size = 10, color = iph_colors$dark, hjust = 1),\n    panel.grid = element_blank(),\n    plot.margin = margin(15, 20, 10, 10)\n  )\n\nggsave(\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_vendor_matrix.png\", plot = p_matrix,\n       width = 8, height = 7, dpi = 100, bg = \"white\")\n\n# =============================================================================\n# IMAGE 3 \u2014 Illustrative \"prompt engineering\" interest trend\n# =============================================================================\n# NOTE: Synthetic curve, NOT real Google Trends data. 
It reflects the known\n# trajectory of public interest based on the milestones in Image 1.\n\nanchors &lt;- tibble::tribble(\n  ~date,         ~interest,\n  \"2020-01-01\",  1,\n  \"2020-07-01\",  2,\n  \"2021-01-01\",  3,\n  \"2021-07-01\",  4,\n  \"2022-01-01\",  5,\n  \"2022-07-01\",  7,\n  \"2022-11-01\", 15,\n  \"2023-01-01\", 30,\n  \"2023-03-01\", 65,\n  \"2023-06-01\", 85,\n  \"2023-09-01\", 88,\n  \"2023-12-01\", 92,\n  \"2024-03-01\", 85,\n  \"2024-06-01\", 75,\n  \"2024-09-01\", 68,\n  \"2024-12-01\", 60,\n  \"2025-03-01\", 56,\n  \"2025-06-01\", 55,\n  \"2025-09-01\", 52,\n  \"2025-12-01\", 50,\n  \"2026-04-01\", 48\n) |&gt;\n  mutate(date = as.Date(date))\n\nmonths &lt;- seq(as.Date(\"2020-01-01\"), as.Date(\"2026-04-01\"), by = \"1 month\")\ninterest_df &lt;- tibble::tibble(\n  date = months,\n  interest = approx(\n    x = as.numeric(anchors$date),\n    y = anchors$interest,\n    xout = as.numeric(months)\n  )$y\n)\n\nevents &lt;- tibble::tribble(\n  ~date,         ~label,                               ~y_label, ~is_karpathy,\n  \"2020-07-01\", \"GPT-3\\npaper\",                        102,      FALSE,\n  \"2022-11-01\", \"ChatGPT\\nlaunch\",                     102,      FALSE,\n  \"2023-03-01\", \"GPT-4\\nlaunch\",                       40,       FALSE,\n  \"2023-12-01\", \"OpenAI\\nsix-strategies\\nguide\",       102,      FALSE,\n  \"2025-06-25\", \"Karpathy:\\n\\\"context\\nengineering\\\"\", 102,      TRUE\n) |&gt;\n  mutate(date = as.Date(date))\n\nevents &lt;- events |&gt;\n  mutate(\n    interest = approx(\n      x = as.numeric(interest_df$date),\n      y = interest_df$interest,\n      xout = as.numeric(date)\n    )$y\n  )\n\np_trend &lt;- ggplot(interest_df, aes(x = date, y = interest)) +\n  geom_hline(yintercept = 0, color = iph_colors$lightgrey, linewidth = 0.4) +\n  geom_vline(\n    data = filter(events, !is_karpathy),\n    aes(xintercept = date),\n    color = iph_colors$lightgrey, linetype = \"dashed\", linewidth = 0.4\n  ) +\n  
geom_vline(\n    data = filter(events, is_karpathy),\n    aes(xintercept = date),\n    color = iph_colors$red, linetype = \"dashed\", linewidth = 0.6\n  ) +\n  geom_line(color = iph_colors$blue, linewidth = 1.2) +\n  geom_point(\n    data = filter(events, !is_karpathy),\n    aes(x = date, y = interest),\n    color = iph_colors$blue, size = 3.2\n  ) +\n  geom_point(\n    data = filter(events, is_karpathy),\n    aes(x = date, y = interest),\n    color = iph_colors$red, size = 4\n  ) +\n  geom_text(\n    data = filter(events, !is_karpathy),\n    aes(x = date, y = y_label, label = label),\n    family = \"Inter\", size = 2.9, color = iph_colors$grey,\n    vjust = 1, hjust = -0.05, lineheight = 0.95\n  ) +\n  geom_text(\n    data = filter(events, is_karpathy),\n    aes(x = date, y = y_label, label = label),\n    family = \"Inter\", fontface = \"bold\",\n    size = 3.0, color = iph_colors$red,\n    vjust = 1, hjust = 1.05, lineheight = 0.95\n  ) +\n  scale_x_date(\n    limits = as.Date(c(\"2020-01-01\", \"2026-06-30\")),\n    date_breaks = \"1 year\", date_labels = \"%Y\",\n    expand = expansion(mult = c(0.02, 0.02))\n  ) +\n  scale_y_continuous(\n    limits = c(0, 105),\n    breaks = c(0, 25, 50, 75, 100)\n  ) +\n  labs(\n    title = \"\\\"Prompt engineering\\\" interest peaked, then plateaued\",\n    subtitle = \"Illustrative trajectory \\u2014 based on publicly known events, not live Google Trends data.\",\n    x = NULL,\n    y = \"Relative interest (0-100)\",\n    caption = \"Illustrative data \\u00b7 Timeline events confirmed from primary sources \\u00b7 Captured 2026-04-05\"\n  ) +\n  theme_inphronesys(grid = \"y\")\n\nggsave(\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/prompt_interest_trend.png\", plot = p_trend,\n       width = 8, height = 5, dpi = 100, bg = \"white\")\n<\/code><\/pre>\n<\/details>\n<h2>References<\/h2>\n<h3>Academic Papers<\/h3>\n<ol>\n<li>Brown, T. B. et al. (2020). 
<em>Language Models are Few-Shot Learners.<\/em> arXiv:2005.14165. https:\/\/arxiv.org\/abs\/2005.14165 (captured 2026-04-05)<\/li>\n<li>Liu, P. et al. (2021). <em>Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in NLP.<\/em> arXiv:2107.13586. https:\/\/arxiv.org\/abs\/2107.13586 (captured 2026-04-05)<\/li>\n<li>Wei, J. et al. (2022). <em>Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.<\/em> arXiv:2201.11903. https:\/\/arxiv.org\/abs\/2201.11903 (captured 2026-04-05)<\/li>\n<li>Wang, X. et al. (2022). <em>Self-Consistency Improves Chain of Thought Reasoning in Language Models.<\/em> arXiv:2203.11171. https:\/\/arxiv.org\/abs\/2203.11171 (captured 2026-04-05)<\/li>\n<li>Kojima, T. et al. (2022). <em>Large Language Models are Zero-Shot Reasoners.<\/em> arXiv:2205.11916. https:\/\/arxiv.org\/abs\/2205.11916 (captured 2026-04-05)<\/li>\n<li>Ouyang, L. et al. (2022). <em>Training Language Models to Follow Instructions with Human Feedback<\/em> (InstructGPT). arXiv:2203.02155. https:\/\/arxiv.org\/abs\/2203.02155 (captured 2026-04-05)<\/li>\n<li>Yao, S. et al. (2022). <em>ReAct: Synergizing Reasoning and Acting in Language Models.<\/em> arXiv:2210.03629. https:\/\/arxiv.org\/abs\/2210.03629 (captured 2026-04-05)<\/li>\n<li>Bai, Y. et al. (2022). <em>Constitutional AI: Harmlessness from AI Feedback.<\/em> arXiv:2212.08073. https:\/\/arxiv.org\/abs\/2212.08073 (captured 2026-04-05)<\/li>\n<li>Yao, S. et al. (2023). <em>Tree of Thoughts: Deliberate Problem Solving with Large Language Models.<\/em> arXiv:2305.10601. https:\/\/arxiv.org\/abs\/2305.10601 (captured 2026-04-05)<\/li>\n<li>Khattab, O. et al. (2023). <em>DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines.<\/em> arXiv:2310.03714. https:\/\/arxiv.org\/abs\/2310.03714 (captured 2026-04-05)<\/li>\n<li>Zheng, H. et al. (2023). <em>Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models.<\/em> arXiv:2310.06117. 
https:\/\/arxiv.org\/abs\/2310.06117 (captured 2026-04-05)<\/li>\n<li>Schulhoff, S. et al. (2024). <em>The Prompt Report: A Systematic Survey of Prompt Engineering Techniques.<\/em> arXiv:2406.06608. https:\/\/arxiv.org\/abs\/2406.06608 (captured 2026-04-05)<\/li>\n<\/ol>\n<h3>Anthropic<\/h3>\n<ol start=\"13\">\n<li>Anthropic. <em>Prompt engineering overview.<\/em> https:\/\/platform.claude.com\/docs\/en\/docs\/build-with-claude\/prompt-engineering\/overview (captured 2026-04-05)<\/li>\n<li>Anthropic. <em>Claude prompting best practices<\/em> (living reference, Claude 4.6). https:\/\/platform.claude.com\/docs\/en\/docs\/build-with-claude\/prompt-engineering\/claude-prompting-best-practices (captured 2026-04-05)<\/li>\n<li>Anthropic. <em>Migrating away from prefilled responses<\/em> (section within the claude-prompting-best-practices page). https:\/\/platform.claude.com\/docs\/en\/docs\/build-with-claude\/prompt-engineering\/claude-prompting-best-practices#migrating-away-from-prefilled-responses (captured 2026-04-05)<\/li>\n<li>Anthropic. <em>Long context prompting tips.<\/em> Same URL as above, &#8222;Long context prompting&#8220; section. (captured 2026-04-05)<\/li>\n<li>Anthropic. <em>Use XML tags.<\/em> Same URL, &#8222;Structure prompts with XML tags&#8220; section. (captured 2026-04-05)<\/li>\n<li>Anthropic. <em>Be clear and direct<\/em> (Golden Rule). Same URL, &#8222;Be clear and direct&#8220; section. (captured 2026-04-05)<\/li>\n<\/ol>\n<h3>OpenAI<\/h3>\n<ol start=\"19\">\n<li>OpenAI (December 2023). <em>Prompt engineering<\/em> (Six Strategies guide). https:\/\/platform.openai.com\/docs\/guides\/prompt-engineering (captured 2026-04-05)<\/li>\n<li>OpenAI. <em>GPT-4.1 Prompting Guide<\/em> (Cookbook). https:\/\/developers.openai.com\/cookbook\/examples\/gpt4-1_prompting_guide (captured 2026-04-05)<\/li>\n<li>OpenAI (2023, pre-platform page). <em>Best practices for prompt engineering with OpenAI API<\/em> (help-center version). 
https:\/\/help.openai.com\/en\/articles\/6654000-best-practices-for-prompt-engineering-with-openai-api (captured 2026-04-05)<\/li>\n<li>OpenAI (May 2024, updated 2025). <em>Model Spec.<\/em> https:\/\/model-spec.openai.com\/ (captured 2026-04-05)<\/li>\n<li>OpenAI (August 2024). <em>Structured Outputs guide.<\/em> https:\/\/platform.openai.com\/docs\/guides\/structured-outputs (captured 2026-04-05)<\/li>\n<\/ol>\n<h3>Google<\/h3>\n<ol start=\"24\">\n<li>Boonstra, L. (September 2024 internal \/ November 2024 public). <em>Prompt Engineering<\/em> (68-page whitepaper, Google, distributed on Kaggle). https:\/\/www.kaggle.com\/whitepaper-prompt-engineering (captured 2026-04-05)<\/li>\n<li>Google. <em>Vertex AI prompt design strategies.<\/em> https:\/\/cloud.google.com\/vertex-ai\/generative-ai\/docs\/learn\/prompts\/prompt-design-strategies (captured 2026-04-05)<\/li>\n<li>Google. <em>Gemini API prompting strategies.<\/em> https:\/\/ai.google.dev\/gemini-api\/docs\/prompting-strategies (captured 2026-04-05)<\/li>\n<li>Google. <em>Introduction to prompt design.<\/em> https:\/\/cloud.google.com\/vertex-ai\/generative-ai\/docs\/learn\/prompts\/introduction-prompt-design (captured 2026-04-05)<\/li>\n<\/ol>\n<h3>Commentary &amp; Debate<\/h3>\n<ol start=\"28\">\n<li>Karpathy, A. (June 25, 2025). <em>&#8222;+1 for &#8218;context engineering&#8216; over &#8218;prompt engineering&#8217;\u2026&#8220;<\/em> X post. https:\/\/x.com\/karpathy\/status\/1937902205765607626 (captured 2026-04-05)<\/li>\n<li>Willison, S. (June 27, 2025). <em>Context engineering.<\/em> https:\/\/simonwillison.net\/2025\/jun\/27\/context-engineering\/ (captured 2026-04-05)<\/li>\n<li>Husain, H. (March 29, 2024). <em>Your AI Product Needs Evals.<\/em> https:\/\/hamel.dev\/blog\/posts\/evals\/ (captured 2026-04-05)<\/li>\n<li>Willison, S. (March 31, 2024). 
<em>Coverage of Your AI Product Needs Evals.<\/em> https:\/\/simonwillison.net\/2024\/Mar\/31\/your-ai-product-needs-evals\/ (captured 2026-04-05)<\/li>\n<li>Liu, J. (May 19, 2025). <em>There Are Only 6 RAG Evals.<\/em> https:\/\/jxnl.co\/writing\/2025\/05\/19\/there-are-only-6-rag-evals\/ (captured 2026-04-05)<\/li>\n<li>Genkina, D. (March 6, 2024). <em>AI Prompt Engineering Is Dead.<\/em> IEEE Spectrum. https:\/\/spectrum.ieee.org\/prompt-engineering-is-dead (captured 2026-04-05)<\/li>\n<li>Salesforce Ben (2025). <em>Prompt Engineering Jobs Are Obsolete in 2025 \u2014 Here&#8217;s Why.<\/em> https:\/\/www.salesforceben.com\/prompt-engineering-jobs-are-obsolete-in-2025-heres-why\/ (captured 2026-04-05)<\/li>\n<li>Kimai, D. (2025). <em>Context-Engineering handbook.<\/em> GitHub. https:\/\/github.com\/davidkimai\/Context-Engineering (captured 2026-04-05)<\/li>\n<li>Willison, S. <em>Prompt engineering tag archive<\/em> (2023\u20132026). https:\/\/simonwillison.net\/tags\/prompt-engineering\/ (captured 2026-04-05)<\/li>\n<li>Menlo Ventures (2025). <em>2025: The State of Generative AI in the Enterprise.<\/em> https:\/\/menlovc.com\/perspective\/2025-the-state-of-generative-ai-in-the-enterprise\/ (captured 2026-04-05)<\/li>\n<li>Latent Space podcast archive (swyx \/ Alessio), 2025. https:\/\/www.latent.space\/podcast (captured 2026-04-05)<\/li>\n<li>AI Business. <em>OpenAI&#8217;s six steps to improving your prompts to get better results.<\/em> https:\/\/aibusiness.com\/nlp\/openai-s-six-steps-to-improving-your-prompts-to-get-better-results (captured 2026-04-05)<\/li>\n<li>InfoQ (December 2023). <em>OpenAI Publishes Prompt Engineering Guide.<\/em> https:\/\/www.infoq.com\/news\/2023\/12\/openai-prompt-engineering\/ (captured 2026-04-05)<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>Six years ago prompting was a happy accident inside a GPT-3 paper. Today it&#8217;s the single skill separating AI winners from losers. 
Here&#8217;s the complete history \u2014 what Anthropic, OpenAI and Google actually published, and what still matters in 2026.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[139,162],"tags":[256,257,260,156,262,259,60,258,157,261,149],"class_list":["post-1720","post","type-post","status-publish","format-standard","hentry","category-artificial-intelligence","category-technology","tag-ai-history","tag-anthropic","tag-chain-of-thought","tag-context-engineering","tag-dspy","tag-google","tag-llm","tag-openai","tag-prompt-engineering-2","tag-react","tag-supply-chain-ai"],"_links":{"self":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1720","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1720"}],"version-history":[{"count":1,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1720\/revisions"}],"predecessor-version":[{"id":1731,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1720\/revisions\/1731"}],"wp:attachment":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1720"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1720"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1720"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}