{"id":1916,"date":"2026-04-26T09:14:35","date_gmt":"2026-04-26T09:14:35","guid":{"rendered":"https:\/\/inphronesys.com\/?p=1916"},"modified":"2026-04-26T09:42:14","modified_gmt":"2026-04-26T09:42:14","slug":"why-var-beat-googles-timesfm-and-how-to-build-one-in-r","status":"publish","type":"post","link":"https:\/\/inphronesys.com\/?p=1916","title":{"rendered":"Why VAR Beat Google&#8217;s TimesFM \u2014 and How to Build One in R"},"content":{"rendered":"<p>In December 2025, researchers at UC San Diego published a peer-reviewed receipt on something the forecasting community has been tiptoeing around for two years: Google&#8217;s shiny new time-series foundation model, <strong>TimesFM<\/strong>, got beaten by a model older than the first iPhone. By a lot. The winner \u2014 <strong>vector autoregression (VAR)<\/strong> \u2014 has been kicking around since 1980, when Christopher Sims (the eventual 2011 economics Nobel laureate) wrote the paper that made it famous. The score was not close.<\/p>\n<p>At the La Jolla emergency department, forecasting how many patients would be stuck &#8222;boarding&#8220; (admitted but not yet in a ward bed), VAR scored an RMSE of <strong>8.76<\/strong> at the one-day-ahead horizon where it was actually deployed. Google&#8217;s TimesFM, at the same T+1 horizon, scored <strong>14.54<\/strong> \u2014 <strong>66% worse than VAR<\/strong>. The 2-week moving average baseline sat at <strong>14.85<\/strong>, which means VAR cut error by <strong>41% versus the naive benchmark<\/strong> \u2014 while TimesFM failed to beat that benchmark at <em>any<\/em> horizon, getting worse as the forecast stretched (RMSE 16.88 at T+4). Out of six candidate models, VAR was the only one the hospital put into live production, quietly emailing forecasts to Mission Control every morning at 7:10.[^1]<\/p>\n<p>If you run a supply chain, that result should stop you mid-coffee. 
Because the reason VAR won in a hospital is exactly the reason it tends to win in demand planning too \u2014 and almost nobody in SCM is using it. This post walks through what VAR actually is (without the matrix algebra), why supply chain data is practically designed for it, and how to build one in R in about thirty lines of code.<\/p>\n<h2>What VAR Actually Is<\/h2>\n<p>Here&#8217;s the one-sentence version: <strong>VAR is ARIMA with friends.<\/strong><\/p>\n<p>An ARIMA model stares intensely at one time series \u2014 weekly demand, for example \u2014 and tries to predict its next value from its own past. It&#8217;s a monk. Deeply introspective. Ignores everything else in the room.<\/p>\n<p>A VAR model, by contrast, watches a small <strong>group<\/strong> of related time series at the same time and lets them <em>gossip<\/em>. Each series predicts its next value using its own past AND the past of every other series in the group. Weekly demand depends on last week&#8217;s demand, <em>and<\/em> on last week&#8217;s price, <em>and<\/em> on last week&#8217;s promotion intensity. Price depends on its own past and on demand. Promotions depend on both. Everyone tells on everyone else.<\/p>\n<p>That&#8217;s the whole idea. No matrices, no eigenvalues, no doctoral-defence nightmares. Just: if you&#8217;ve got three or four series that you suspect talk to each other, VAR lets them actually talk.<\/p>\n<p>The quiet genius of Sims&#8217;s 1980 paper <em>&#8220;Macroeconomics and Reality&#8221;<\/em>[^2] was philosophical. He argued that economists were imposing too much theoretical structure on their models \u2014 deciding in advance that &#8220;interest rates cause inflation&#8221; and baking that into the equations. His counter-proposal: <strong>let the data decide who causes whom<\/strong>. Treat every variable symmetrically. Regress each one on lagged values of itself <em>and<\/em> all the others. Read the tea leaves afterwards. 
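<\/p>\n<p>Two series are enough to see the mechanics. A hand-rolled VAR(1) is literally two <code>lm()<\/code> calls: each variable regressed on one lag of itself <em>and<\/em> one lag of the other. (A toy simulation, for illustration only; the walkthrough below uses <code>fable::VAR()<\/code> rather than hand-rolled regressions.)<\/p>\n<pre><code class=\"language-r\">set.seed(1)\nn &lt;- 60\ndemand &lt;- numeric(n); price &lt;- numeric(n)\nfor (t in 2:n) {\n  demand[t] &lt;- 0.5 * demand[t - 1] - 0.3 * price[t - 1] + rnorm(1)\n  price[t]  &lt;- 0.1 * demand[t - 1] + 0.4 * price[t - 1] + rnorm(1)\n}\nd &lt;- data.frame(demand = demand[-1], price = price[-1],\n                demand_lag = demand[-n], price_lag = price[-n])\neq_demand &lt;- lm(demand ~ demand_lag + price_lag, data = d)  # demand listens to lagged price\neq_price  &lt;- lm(price ~ demand_lag + price_lag, data = d)   # and vice versa\ncoef(eq_demand); coef(eq_price)\n<\/code><\/pre>\n<p>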
Central banks \u2014 the Fed, the ECB, the Bank of England \u2014 still use VARs and their descendants as their workhorse short-term forecasting tools. Not because they&#8217;re glamorous. Because they work.<\/p>\n<h2>Why Supply Chain Data Is Natively Multivariate<\/h2>\n<p>Here&#8217;s the awkward truth about classical demand forecasting: we pretend demand exists in a vacuum. You look at a weekly sales series, fit an ETS or ARIMA, and call it a forecast. But nobody in purchasing actually believes the sales number moves on its own. What moves it?<\/p>\n<ul>\n<li><strong>Your own price.<\/strong> Raise it, demand drops. Drop it, demand jumps.<\/li>\n<li><strong>Promotion intensity.<\/strong> A +20% TV spend window lifts volume \u2014 and usually pulls some of next month&#8217;s volume into this month.<\/li>\n<li><strong>Competitor price.<\/strong> When the competitor runs a deal, your demand softens.<\/li>\n<li><strong>Supplier lead time.<\/strong> When lead times stretch, safety stock goes up, and so do replenishment orders \u2014 a self-reinforcing demand signal that the univariate model cannot see.<\/li>\n<li><strong>Weather and seasonality.<\/strong> A hot April sells fans; a cold one doesn&#8217;t.<\/li>\n<\/ul>\n<p>Univariate models \u2014 ETS, ARIMA, SNAIVE, Holt-Winters \u2014 throw all of that out. They&#8217;re the forecasting equivalent of trying to predict a tennis match by only watching one player. Sometimes the other side of the court matters.<\/p>\n<p>VAR is the cheapest way to stop throwing that information away. You pick the two to five series you believe actually drive each other, fit one model, and you&#8217;re done. 
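<\/p>\n<p>The input format is nothing exotic, either: one weekly table, one column per series. A minimal sketch of the shape the <code>fable<\/code> workflow below expects (a <code>tsibble<\/code> with a <code>yearweek<\/code> index; the column names match the walkthrough, the numbers are made up):<\/p>\n<pre><code class=\"language-r\">library(dplyr); library(tsibble)\ntsbl &lt;- tibble(\n  week   = yearweek(as.Date(\"2024-01-01\") + 0:9 * 7),\n  demand = c(101, 98, 104, 110, 97, 103, 99, 107, 112, 105),\n  price  = c(10.1, 10.0, 9.6, 9.2, 10.2, 9.9, 10.0, 9.5, 9.1, 9.8),\n  promo  = c(0, 0, 1, 1, 0, 0, 0, 1, 1, 0)\n) |&gt; as_tsibble(index = week)\n<\/code><\/pre>\n<p>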
It&#8217;s not the only multivariate method (XGBoost, LSTMs, and foundation models all accept covariates too), but it is the simplest one that a supply chain analyst can build, interpret, and debug at 3 AM when the weekly refresh breaks.<\/p>\n<h2>The Receipts<\/h2>\n<p>The UC San Diego study isn&#8217;t the first time VAR has quietly humiliated a fancier model, but it is the cleanest recent example. Authors Poursoltan and colleagues ran a <strong>prospective<\/strong> comparison \u2014 meaning the forecasts were generated daily, live, and measured against reality as it happened, not in a retrospective backtest. 7,111 boarding patients. Two years of training data. Four months of live validation. Six candidate models, from a 2-week moving average at the bottom to Google&#8217;s TimesFM at the top.<\/p>\n<p>Their conclusion \u2014 quoting the paper directly \u2014 was that VAR was deployed &#8220;for its interpretability, stable performance, and capacity to model multivariate dependencies.&#8221;[^1] The hybrid VAR+XGBoost model won slightly more horizons (5 of 8 across both hospitals), but the authors themselves wrote that &#8220;its improvement over VAR alone was modest.&#8221; In operations, modest improvements are worth less than single-model simplicity.<\/p>\n<p>This isn&#8217;t a hospital fluke. The M5 forecasting competition (Walmart, 2020) reached the same structural conclusion: <strong>simple multivariate methods beat fancy ones on noisy operational data.<\/strong> Only 35.8% of the 7,092 M5 teams beat Seasonal Naive. If you&#8217;ve been reading this blog for a while, the pattern should feel familiar \u2014 we covered it in <a href=\"https:\/\/www.inphronesys.com\/the-m5-lesson-why-simple-still-beats-fancy-in-supply-chain-forecasting\/\">The M5 Lesson<\/a> and in last Friday&#8217;s <em>Foundation Model Reality Check<\/em> LinkedIn carousel. The receipts keep piling up.<\/p>\n<h2>Building a VAR in R: A Short Walkthrough<\/h2>\n<p>Enough theory. 
Let&#8217;s build one. I&#8217;ve put together a 104-week (two-year) simulated supply-chain dataset with three series that any category manager will recognise:<\/p>\n<ol>\n<li><strong>Weekly demand<\/strong> (units sold)<\/li>\n<li><strong>Own price<\/strong> ($ per unit, mean \u2248 $9.15)<\/li>\n<li><strong>Promotion intensity<\/strong> (0\u20131 scale, where 1 = full-page flyer week)<\/li>\n<\/ol>\n<p>The three series aren&#8217;t independent. The simulated data-generating process, matching what the R code actually does: promotions lift demand the same week they fire <em>and<\/em> knock price down by about $0.80; last week&#8217;s price then drags on this week&#8217;s demand (the classic price-elasticity channel, one-week-lagged); demand has a mild self-autocorrelation, a gentle upward trend, and a yearly seasonal wave. In other words, the sort of mess you&#8217;d find in any half-decent retail dataset if you actually looked.<\/p>\n<p>Here&#8217;s the full 104-week series, with the last 14 weeks held out as a 1-step-ahead rolling test window \u2014 the realistic weekly-replanning scenario where each Friday you forecast next Friday using everything you know so far. Before we even look at the metrics, one reassuring diagnostic: the fitted VAR(2)&#8217;s 80% prediction interval covered <strong>12 of 14<\/strong> actual values on the holdout. That&#8217;s well-calibrated for an 80% nominal level \u2014 the model is not just accurate, its uncertainty estimates are honest.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/var_forecast_vs_actual.png\" alt=\"VAR(2) 1-step-ahead rolling forecast against actual weekly demand on the 14-week holdout\" \/><\/p>\n<p>The modelling workflow, in the <code>fable<\/code> ecosystem, is absurdly compact. 
The business end is a single family of one-liners \u2014 one per candidate lag order, with a deterministic trend added so the AR coefficients don&#8217;t have to absorb the gentle upward drift in demand:<\/p>\n<pre><code class=\"language-r\">fits &lt;- train |&gt;\n  model(\n    lag1 = VAR(vars(demand, price, promo) ~ AR(1) + trend()),\n    lag2 = VAR(vars(demand, price, promo) ~ AR(2) + trend()),\n    # ... through lag8\n  )\n<\/code><\/pre>\n<p>You hand <code>VAR()<\/code> the three series, fit orders 1 through 8, and let an information criterion pick the winner. Here&#8217;s where a small modelling choice genuinely matters: <strong>AIC and BIC disagree.<\/strong> AIC bottoms out at lag 3 (364.15); BIC bottoms out at lag 2 (459.97), with lag 3 a close second (460.32). I went with <strong>BIC \u2014 p = 2 \u2014 for three reasons<\/strong>, and they&#8217;re worth spelling out because this is exactly the lag-selection dilemma you&#8217;ll hit on your own data:<\/p>\n<ol>\n<li><strong>Sample size.<\/strong> With 90 training weeks and three lagged regressors per variable per equation (9 total at p = 3, plus an intercept and trend term), AIC notoriously under-penalises complexity. BIC&#8217;s heavier penalty is the standard recommendation for VAR lag selection in small samples.<\/li>\n<li><strong>Stationarity.<\/strong> At p = 2, the model&#8217;s internal dynamics decay cleanly \u2014 a shock works its way through the system and then fades, rather than building up forever. (The technical check is the largest eigenvalue of the VAR companion matrix: it comes out to <strong>0.85<\/strong>, comfortably below 1. Higher lag orders nudge it closer to 1, and an almost-unit-root VAR produces IRFs that drift instead of decay.)<\/li>\n<li><strong>Parsimony for interpretability.<\/strong> A VAR(2) has 2 \u00d7 3\u00b2 = 18 AR coefficients to interpret. A VAR(3) has 27. 
If you&#8217;re showing this to a commercial director, 18 is hard enough.<\/li>\n<\/ol>\n<h3>VAR vs Univariate ETS \u2014 The Covariate Dividend<\/h3>\n<p>To measure whether those covariates actually earn their keep, I fit a plain univariate ETS on weekly demand alone \u2014 the default demand-planning baseline. Same 14-week holdout. Same 1-step-ahead rolling protocol.<\/p>\n<p>One upfront note: this is a simulated teaching example. Real hospital or retail data comes with messier external shocks than any simulation can fake. What it <em>can<\/em> show is where the covariate dividend comes from and how sensitive it is to lag-selection choices \u2014 the transferable part. Here&#8217;s the side-by-side:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/var_vs_univariate.png\" alt=\"VAR(2) vs univariate ETS: 1-step-ahead rolling forecast on 14-week holdout\" \/><\/p>\n<p>The numbers, across every loss function you&#8217;d sensibly measure:<\/p>\n<table style=\"border-collapse: collapse; width: 100%; margin: 1.5em 0; font-size: 0.95em; line-height: 1.5;\">\n<thead>\n<tr>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Model<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">RMSE<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">MAPE<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">MAE<\/th>\n<th style=\"border: 1px solid #ddd; padding: 10px 14px; background: #0073aa; color: #fff; font-weight: 600; text-align: left;\">Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Univariate ETS (demand 
only)<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">12.92<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">7.11%<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">9.34<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">No covariates. The demand-planning default.<\/td>\n<\/tr>\n<tr style=\"background: #ffffff;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">VAR(2) (demand + price + promo)<\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>11.44<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>4.79%<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>6.56<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">Same demand series, plus two covariates.<\/td>\n<\/tr>\n<tr style=\"background: #f8f9fa;\">\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>Improvement<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>11.5%<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>32.6%<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\"><strong>29.7%<\/strong><\/td>\n<td style=\"border: 1px solid #ddd; padding: 9px 14px; text-align: left;\">From covariates alone, same algorithm family.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>No feature engineering. No neural networks. No foundation models. 
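<\/p>\n<p>The improvement row is plain arithmetic on the two metric rows above, and worth recomputing before trusting any comparison table. (Recomputing from the rounded display values lands within a tenth of a point of the table, which was presumably computed before rounding.)<\/p>\n<pre><code class=\"language-r\"># Recompute the improvement row from the displayed metrics\nets_m &lt;- c(rmse = 12.92, mape = 7.11, mae = 9.34)\nvar_m &lt;- c(rmse = 11.44, mape = 4.79, mae = 6.56)\nround(100 * (1 - var_m \/ ets_m), 1)\n#&gt;  rmse  mape   mae \n#&gt;  11.5  32.6  29.8\n<\/code><\/pre>\n<p>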
Just: <em>tell the demand model what the price and promotion were doing<\/em>, and forecast error drops by roughly <strong>12% on RMSE, 30% on MAE, and a full third on MAPE.<\/strong> The MAPE gap is the one that would make a demand planner&#8217;s week \u2014 dropping average percentage error from 7% to under 5% is genuinely category-changing.<\/p>\n<p>Why the big gap? Because our simulated data has real cross-dependencies (promo lifts demand, last week&#8217;s price drags it down) that the univariate model cannot see. The UC San Diego hospital data behaved similarly for the same structural reason \u2014 hospital census and surgical activity genuinely drive boarding patients, and a model that sees them beats a model that doesn&#8217;t. When the covariates are real and material, VAR dominates. When they&#8217;re not, it won&#8217;t. This is an empirical question you should always test on your own data.<\/p>\n<h2>Reading the Output Without Eigenvalues<\/h2>\n<p>Every VAR output looks like a wall of coefficients at first. Forget the table. What you actually want is the <strong>impulse response function (IRF)<\/strong> \u2014 a plot that answers one concrete, executive-friendly question: <em>&#8220;If I shock price by +20% right now, what does demand do over the next 8 weeks?&#8221;<\/em><\/p>\n<p>This is the part that makes VAR uniquely useful for supply chain. You can simulate a price hike of 20% (about +$1.83 on the mean price of $9.15 in our dataset), watch the cascade propagate through the system, and read the decision straight off the chart.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/var_cross_dependency.png\" alt=\"Impulse response: weekly demand response to a one-time +20% price shock over 8 weeks\" \/><\/p>\n<p>Read it left-to-right. Week 0 is the shock. Week 1: demand softens by roughly 2.9 units. 
Week 2 is the deepest dip, about <strong>\u22123.6 units below baseline<\/strong> \u2014 on a series whose mean runs around 125 units a week, that&#8217;s just under a 3% demand hit. That&#8217;s the structural price-drag signal the model learned during training. From week 3 onwards, the demand response turns positive and rises \u2014 but that&#8217;s not &#8220;higher prices lifting demand later.&#8221; That&#8217;s the VAR&#8217;s own feedback dynamics producing a transient overshoot as the estimated system oscillates back to equilibrium. With the largest eigenvalue of the companion matrix at 0.85, the overshoot eventually decays to zero; it just takes longer than the 8 weeks this chart shows. Bottom line: trust the first two weeks as a pricing signal, treat the positive overshoot as model mechanics.<\/p>\n<p><strong>Why this matters for your P&amp;L.<\/strong> A 3% demand hit is not academic. Scale the IRF to a category doing 10,000 units a week: the week-1 dip is about 230 units, the week-2 dip about 290 \u2014 call it 500 units of missed sell-through across the two weeks following a price move. Depending on your business, that&#8217;s a markdown clearance event, a storage cost spike, or \u2014 on the other side of the same coin \u2014 a stock-out cascade if the price move was in the other direction and you didn&#8217;t replenish to meet the lift. These are the phone calls from commercial you don&#8217;t want to be taking on a Monday morning. The IRF gives you the week-by-week number you need to plan for them <em>before<\/em> the pricing decision gets made.<\/p>\n<p>That is the unique value IRFs add over a black-box model. You can draw the picture, show it to a commercial director, quantify the two-week demand-and-inventory swing the pricing move will cost, and walk out with a decision you can defend by Friday.<\/p>\n<h2>Where VAR Breaks<\/h2>\n<p>I&#8217;d be selling you fairy dust if I didn&#8217;t list the limits. Three real ones:<\/p>\n<p><strong>1. 
Stationarity assumption.<\/strong> Classical VAR wants every series to be stationary \u2014 meaning its statistical properties (mean, variance) don&#8217;t drift over time. Demand rarely obliges. In practice you difference the series \u2014 forecast week-over-week <em>changes<\/em> rather than levels \u2014 or include a deterministic trend term (as we did here) to soak up the drift. <code>fable::VAR()<\/code> handles a lot of this for you, but it&#8217;s not magic: feed it a wildly trending series without any of this care and it will still misbehave.<\/p>\n<p><strong>2. Curse of dimensionality.<\/strong> A VAR with <em>k<\/em> series and <em>p<\/em> lags has <em>k\u00b2p<\/em> coefficients to estimate. Three series with four lags is 36 coefficients \u2014 fine. Ten series with eight lags is 800 coefficients and you&#8217;ll overfit a desert. VAR is for small, theory-informed groups of series. For 500 SKUs, use hierarchical reconciliation or a global ML model. This is why the hospital study used three covariates, not thirty.<\/p>\n<p><strong>3. Lag-order selection matters \u2014 a lot.<\/strong> Pick <em>p<\/em> too small and you miss the cross-dependency. Too big and you overfit. This is why <code>AIC<\/code>\/<code>BIC<\/code> information criteria exist, and why you should always cross-validate. The convenient bit is that <code>fable<\/code>&#8217;s <code>VAR()<\/code> will pick the lag order by information criterion on its own if you leave the <code>AR()<\/code> order unspecified. The inconvenient bit is that if AIC and BIC disagree, the honest answer is usually &#8220;try both, and trust the one that wins on held-out data.&#8221;<\/p>\n<h2>When to Use What \u2014 A Decision Cheat Sheet<\/h2>\n<p>Put everything on a single shelf:<\/p>\n<ul>\n<li><strong>Univariate ETS or Holt-Winters<\/strong> \u2014 one stable demand series, no usable covariates, and you need a forecast by lunch. 
The correct default 80% of the time.<\/li>\n<li><strong>VAR<\/strong> \u2014 2\u20136 related series (demand + price + promo + maybe competitor or weather) that you believe drive each other, and you want interpretability for the S&amp;OP meeting. The sweet spot for category-level planning.<\/li>\n<li><strong>XGBoost or global ML<\/strong> \u2014 a hierarchy, many series (50+), and rich features (hundreds of SKUs \u00d7 weather \u00d7 holidays \u00d7 promotions). The M5-winning pattern. See <a href=\"https:\/\/www.inphronesys.com\/global-forecasting-with-xgboost-in-r-a-walmart-weekly-walkthrough\/\">our XGBoost walkthrough<\/a>.<\/li>\n<li><strong>Foundation models (TimesFM, Chronos, Lag-Llama)<\/strong> \u2014 fascinating research, not yet operational. Revisit in 2027. Or when you see a peer-reviewed prospective study where one actually wins. (We&#8217;re still waiting.)<\/li>\n<\/ul>\n<p>The UC San Diego paper is not a sweeping condemnation of foundation models. It&#8217;s a reminder that benchmarks are cheap and deployments are expensive. Until the foundation model beats your baseline on <em>your<\/em> data, in <em>your<\/em> regime, the boring multivariate model is the honest default.<\/p>\n<h2>Interactive Dashboard<\/h2>\n<p>Before you scroll to the code, spend three minutes with the interactive explorer below. Three KPI cards up top give you the 3-second takeaway: VAR&#8217;s week-ahead forecast, the ETS baseline&#8217;s week-ahead forecast, and the <strong>+11.5% RMSE gain<\/strong> from adding price and promo covariates. From there, slide the lag order from 1 to 8 and watch AIC and BIC move across the candidate models \u2014 the exact disagreement (AIC prefers p = 3, BIC prefers p = 2) described in the walkthrough above. Fire a \u00b120% price shock and watch the IRF plot redraw live. Compare VAR against the ETS baseline at any horizon from 1 to 12 weeks. 
Every number in the dashboard comes straight from the R script below \u2014 nothing fabricated, nothing rounded.<\/p>\n<div class=\"dashboard-link\" style=\"margin: 2em 0; padding: 1.5em; background: #f8f9fa; border-left: 4px solid #0073aa; border-radius: 4px;\">\n<p style=\"margin: 0 0 0.5em 0; font-size: 1.1em;\"><strong>Interactive Dashboard<\/strong><\/p>\n<p style=\"margin: 0 0 1em 0;\">Explore the data yourself \u2014 adjust parameters and see the results update in real time.<\/p>\n<p><a style=\"display: inline-block; padding: 0.6em 1.2em; background: #0073aa; color: #fff; text-decoration: none; border-radius: 4px; font-weight: bold;\" href=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/04\/2026-04-28_VAR_Vector_Autoregression_Supply_Chain_dashboard.html\" target=\"_blank\" rel=\"noopener\">Open Interactive Dashboard \u2192<\/a><\/p>\n<\/div>\n<p>If you made it this far: the fact that you read a 2,000-word blog post about a 45-year-old econometric technique already tells me you&#8217;re the right kind of person for this newsletter. 
Welcome.<\/p>\n<details>\n<summary><strong>Show R Code<\/strong><\/summary>\n<pre><code class=\"language-r\"># =============================================================================\n# generate_var_images.R \u2014 Vector Autoregression (VAR) Blog Post\n# =============================================================================\n# Fits fable::VAR() across lag orders 1-8 (with a deterministic trend), selects\n# the best by BIC, benchmarks against a univariate ETS baseline using\n# 1-step-ahead rolling forecasts on the 14-week holdout, and produces three\n# 800-px static charts.\n#\n# Run from project root:  Rscript Scripts\/generate_var_images.R\n# =============================================================================\n\nsource(\"Scripts\/theme_inphronesys.R\")\n\nsuppressPackageStartupMessages({\n  library(fpp3)\n  library(ggplot2); library(dplyr); library(tidyr); library(scales)\n  library(patchwork); library(jsonlite)\n})\n\nset.seed(42)\nn &lt;- 104\n\n# -- 1. Simulate a realistic weekly SCM dataset (104 weeks = 2 years) ----------\npromo &lt;- rbinom(n, size = 1, prob = 0.18)\n\nprice &lt;- numeric(n); price[1] &lt;- 10\nfor (i in 2:n) {\n  price[i] &lt;- 0.80 * price[i - 1] + 0.20 * 10 +\n              rnorm(1, 0, 0.10) - 0.80 * promo[i]\n  price[i] &lt;- max(7.5, price[i])\n}\n\ndemand &lt;- numeric(n); demand[1] &lt;- 100; demand[2] &lt;- 100\nfor (i in 3:n) {\n  trend_c &lt;- 0.15 * i\n  seas_c  &lt;- 6 * sin(2 * pi * i \/ 52 - pi \/ 2)\n  price_c &lt;- -8.0 * (price[i - 1] - 10)\n  promo_c &lt;- 30 * promo[i]\n  ar_c    &lt;- 0.30 * (demand[i - 1] - (100 + 0.15 * (i - 1)))\n  demand[i] &lt;- 100 + trend_c + seas_c + price_c + promo_c + ar_c +\n               rnorm(1, 0, 1.8)\n}\n\ndates &lt;- as.Date(\"2024-01-01\") + (0:(n - 1)) * 7\ntsbl &lt;- as_tsibble(\n  tibble::tibble(\n    week = yearweek(dates),\n    demand = round(demand, 1), price = round(price, 3),\n    promo_intensity = as.numeric(promo)\n  ),\n  index = week\n)\n\n# -- 2. 
Train \/ test split (90 \/ 14) -------------------------------------------\nn_train &lt;- 90\ntrain &lt;- tsbl %&gt;% slice_head(n = n_train)\ntest  &lt;- tsbl %&gt;% slice_tail(n = n - n_train)\n\n# -- 3. Fit VAR at lag orders 1-8 (with deterministic trend), pick min-BIC -----\n# The trend() term keeps the AR coefficients from absorbing the demand drift \u2014\n# crucial for a stable IRF. BIC is the standard lag-selection criterion for VAR\n# in small samples (AIC under-penalises complexity; AICc becomes unreliable\n# when parameter count approaches n).\nfits_var &lt;- train %&gt;%\n  model(\n    lag1 = VAR(vars(demand, price, promo_intensity) ~ AR(1) + trend()),\n    lag2 = VAR(vars(demand, price, promo_intensity) ~ AR(2) + trend()),\n    lag3 = VAR(vars(demand, price, promo_intensity) ~ AR(3) + trend()),\n    lag4 = VAR(vars(demand, price, promo_intensity) ~ AR(4) + trend()),\n    lag5 = VAR(vars(demand, price, promo_intensity) ~ AR(5) + trend()),\n    lag6 = VAR(vars(demand, price, promo_intensity) ~ AR(6) + trend()),\n    lag7 = VAR(vars(demand, price, promo_intensity) ~ AR(7) + trend()),\n    lag8 = VAR(vars(demand, price, promo_intensity) ~ AR(8) + trend())\n  )\n\ngl &lt;- glance(fits_var) %&gt;% arrange(.model)\nchosen_lag &lt;- as.integer(sub(\"lag\", \"\", gl$.model[which.min(gl$BIC)]))\n#&gt; AIC by lag: 437.15, 385.65, 364.15, 366.89, 378.09, 380.46, 385.68, 387.05\n#&gt; BIC by lag: 496.87, 459.97, 460.32, 484.70, 517.32, 540.90, 567.09, 589.21\n#&gt; AIC minimised at p = 3 (364.15); BIC minimised at p = 2 (459.97), p = 3 close second (460.32)\n#&gt; chosen_lag: 2\n\n# -- 4. Univariate ETS baseline on demand --------------------------------------\nfit_ets &lt;- train %&gt;% model(ets = ETS(demand))\n\n# -- 5. 
Manual VAR fit (equation-by-equation OLS) for IRF + rolling forecasts --\n# Design matrix includes constant + linear trend + p*3 lag columns, matching\n# the fable spec so the closed-form coefficient matrices A_1..A_p are directly\n# usable for iterating the IRF companion system.\nY &lt;- as.matrix(train %&gt;% as_tibble() %&gt;%\n                 select(demand, price, promo_intensity))\np &lt;- chosen_lag\nT_obs &lt;- nrow(Y)\nX_reg &lt;- matrix(0, nrow = T_obs - p, ncol = 2 + p * 3)\nX_reg[, 1] &lt;- 1\nX_reg[, 2] &lt;- (p + 1):T_obs            # deterministic trend\nfor (lag_i in 1:p) {\n  X_reg[, (3 + (lag_i - 1) * 3):(2 + lag_i * 3)] &lt;-\n    Y[(p - lag_i + 1):(T_obs - lag_i), ]\n}\nY_reg &lt;- Y[(p + 1):T_obs, ]\nB &lt;- solve(crossprod(X_reg), crossprod(X_reg, Y_reg))\nA_mats &lt;- lapply(1:p,\n  function(i) t(B[(3 + (i - 1) * 3):(2 + i * 3), ]))\n\n# Sanity-check stationarity: largest-modulus eigenvalue of the VAR companion\n# matrix must be &lt; 1 for a stable (trend-stationary) fit.\nk &lt;- 3\ncompanion &lt;- matrix(0, nrow = k * p, ncol = k * p)\nfor (lag_i in 1:p) {\n  companion[1:k, ((lag_i - 1) * k + 1):(lag_i * k)] &lt;- A_mats[[lag_i]]\n}\nif (p &gt; 1) companion[(k + 1):(k * p), 1:(k * (p - 1))] &lt;- diag(1, k * (p - 1))\nmax_mod_root &lt;- max(Mod(eigen(companion)$values))   # should be &lt; 1\n\n# -- 6. 1-step-ahead rolling forecasts on the 14-week holdout ------------------\nY_all &lt;- as.matrix(tsbl %&gt;% as_tibble() %&gt;%\n                     select(demand, price, promo_intensity))\nn_test &lt;- nrow(test)\nvar_1step &lt;- matrix(NA_real_, nrow = n_test, ncol = 3)\nfor (k in 1:n_test) {\n  t_k &lt;- n_train + k\n  x &lt;- c(1, t_k)\n  for (lag_i in 1:p) x &lt;- c(x, Y_all[t_k - lag_i, ])\n  var_1step[k, ] &lt;- x %*% B\n}\n\nets_full &lt;- fit_ets %&gt;% refit(tsbl, reestimate = FALSE)\nets_pred &lt;- augment(ets_full) %&gt;% slice_tail(n = n_test) %&gt;% pull(.fitted)\n\n# -- 7. 
Holdout accuracy (1-step rolling) --------------------------------------\nactual &lt;- test$demand\nrmse &lt;- function(y, yhat) sqrt(mean((y - yhat)^2))\nmape &lt;- function(y, yhat) mean(abs((y - yhat) \/ y)) * 100\nmae  &lt;- function(y, yhat) mean(abs(y - yhat))\n\ncat(sprintf(\"VAR(%d)  RMSE = %.2f   MAPE = %.2f%%   MAE = %.2f\\n\",\n            chosen_lag, rmse(actual, var_1step[, 1]),\n            mape(actual, var_1step[, 1]), mae(actual, var_1step[, 1])))\ncat(sprintf(\"ETS     RMSE = %.2f   MAPE = %.2f%%   MAE = %.2f\\n\",\n            rmse(actual, ets_pred), mape(actual, ets_pred),\n            mae(actual, ets_pred)))\n#&gt; VAR(2)  RMSE = 11.44   MAPE = 4.79%   MAE = 6.56\n#&gt; ETS     RMSE = 12.92   MAPE = 7.11%   MAE = 9.34\n\n# -- 8. Impulse response: +20% price shock -------------------------------------\nH_IRF &lt;- 8\nprice_mean &lt;- mean(Y[, \"price\"])\nshock_size &lt;- 0.20 * price_mean   # \u2248 +$1.83\n\ny_hist &lt;- matrix(0, nrow = H_IRF + p, ncol = 3)\ncolnames(y_hist) &lt;- c(\"demand\", \"price\", \"promo_intensity\")\nshock_row &lt;- p\ny_hist[shock_row, \"price\"] &lt;- shock_size\n\nirf &lt;- numeric(H_IRF + 1)\nfor (h in 1:H_IRF) {\n  y_next &lt;- numeric(3)\n  for (lag_i in 1:p) {\n    y_next &lt;- y_next + A_mats[[lag_i]] %*% y_hist[shock_row + h - lag_i, ]\n  }\n  y_hist[shock_row + h, ] &lt;- y_next\n  irf[h + 1] &lt;- y_next[1]\n}\n#&gt; demand response at t+1..t+8 (units):\n#&gt;   -2.86, -3.62, +0.71, +3.38, +5.25, +6.49, +7.21, +7.55\n\n# Charts 1-3 (forecast-vs-actual, VAR vs ETS, IRF) follow \u2014 standard ggplot\n# using theme_inphronesys() for the brand styling.\n<\/code><\/pre>\n<\/details>\n<h2>References<\/h2>\n<p>[^1]: Poursoltan, L. et al. (2025). <em>Prospective comparison of econometric, machine learning, and foundation models for forecasting emergency department boarding patients.<\/em> npj Health Systems, vol. 2, article 49. 
DOI: <a href=\"https:\/\/doi.org\/10.1038\/s44401-025-00054-z\">10.1038\/s44401-025-00054-z<\/a>. Open access.<br \/>\n[^2]: Sims, C. A. (1980). <em>Macroeconomics and Reality.<\/em> Econometrica, 48(1), 1\u201348.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A peer-reviewed 2025 study put Google&#8217;s TimesFM foundation model head-to-head with vector autoregression on real hospital data. Spoiler: the 1980s econometric model won. Here&#8217;s what VAR is, why it works for supply chain, and how to build one in R.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[206,301],"tags":[220,8,15,93,302,303],"class_list":["post-1916","post","type-post","status-publish","format-standard","hentry","category-forecasting","category-r-for-supply-chain","tag-fable","tag-forecasting","tag-r","tag-supply-chain-2","tag-var","tag-vector-autoregression"],"_links":{"self":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1916","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1916"}],"version-history":[{"count":2,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1916\/revisions"}],"predecessor-version":[{"id":1919,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1916\/revisions\/1919"}],"wp:attachment":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1916"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_
route=%2Fwp%2Fv2%2Fcategories&post=1916"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1916"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}