{"id":1019,"date":"2026-02-21T20:05:04","date_gmt":"2026-02-21T20:05:04","guid":{"rendered":"https:\/\/inphronesys.com\/?p=1019"},"modified":"2026-02-21T20:12:13","modified_gmt":"2026-02-21T20:12:13","slug":"bupar-the-process-mining-toolkit-that-shows-you-how-your-factory-actually-runs","status":"publish","type":"post","link":"https:\/\/inphronesys.com\/?p=1019","title":{"rendered":"bupaR: The Process Mining Toolkit That Shows You How Your Factory Actually Runs"},"content":{"rendered":"<h2>The Problem Every Operations Manager Knows<\/h2>\n<p>You designed your production process. You documented it. You trained your team on it. And yet, when you walk the shop floor, things look different. Orders take detours. Some workstations are idle while others have queues. Rework happens in places it should not. Exceptions become the norm.<\/p>\n<p>The gap between your designed process and your actual process is where efficiency dies. The challenge is that this gap is invisible in standard reporting. Your ERP system tracks transactions \u2014 when an order was created, when a goods receipt was posted, when an invoice was paid. But it does not show you the <em>flow<\/em>. It does not show you the path each case actually took, where it waited, where it looped back, or which resource handled it.<\/p>\n<p>This is exactly what process mining does. And <strong>bupaR<\/strong> is how you do it in R.<\/p>\n<h2>What Is bupaR?<\/h2>\n<p><a href=\"https:\/\/bupar.net\/\">bupaR<\/a> (Business Process Analysis in R) is an open-source ecosystem of R packages developed at Hasselt University. It takes timestamped event data \u2014 the kind your ERP, MES, or workflow system already generates \u2014 and reconstructs the actual process flow.<\/p>\n<p>Think of it as an X-ray for your operations. 
Instead of seeing a static org chart or a process flow diagram drawn in Visio, you see what actually happened: every case, every step, every delay, every deviation.<\/p>\n<p>The ecosystem is installed with a single command, after which you load the packages you need:<\/p>\n<pre><code class=\"language-r\">install.packages(\"bupaverse\")\nlibrary(bupaR)\nlibrary(edeaR)\nlibrary(processmapR)\n<\/code><\/pre>\n<p>The core packages give you everything you need:<\/p>\n<table>\n<tbody>\n<tr><td><strong>bupaR<\/strong><\/td><td>Event log data structures and manipulation<\/td><\/tr>\n<tr><td><strong>edeaR<\/strong><\/td><td>Metrics \u2014 throughput time, processing time, activity frequency, trace coverage<\/td><\/tr>\n<tr><td><strong>processmapR<\/strong><\/td><td>Visual process discovery \u2014 process maps, dotted charts, trace explorers<\/td><\/tr>\n<tr><td><strong>processcheckR<\/strong><\/td><td>Conformance checking \u2014 did the process follow the rules?<\/td><\/tr>\n<tr><td><strong>processanimateR<\/strong><\/td><td>Animated process maps that show cases flowing through the system<\/td><\/tr>\n<\/tbody>\n<\/table>\n<h2>Seeing Your Process for the First Time<\/h2>\n<p>Let us start with the <code>patients<\/code> dataset from the companion <code>eventdataR<\/code> package \u2014 500 emergency department cases with 7 activities. This is the kind of data any operation generates: cases flowing through a sequence of steps with timestamps and resource assignments.<\/p>\n<pre><code class=\"language-r\">library(eventdataR)\npatients %&gt;% process_map(type = frequency(\"absolute\"))\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_process_map-1.png\" alt=\"Process Map \u2014 Frequency\" \/><\/p>\n<p>This single line of code produces a complete process map. Each node is an activity with its execution count. Each arrow shows how often work flowed from one activity to the next. The layout is automatically determined from the data \u2014 you did not draw this diagram, your data did.<\/p>\n<p>For an operations manager, this is the first moment of truth. 
You can immediately see:<\/p>\n<ul>\n<li><strong>The main flow<\/strong> \u2014 Registration \u2192 Triage \u2192 Clinical Assessment \u2192 Treatment \u2192 Discharge. This is your designed process.<\/li>\n<li><strong>The exceptions<\/strong> \u2014 Cases that skip steps, loop back, or take unexpected paths. These are the deviations you need to investigate.<\/li>\n<li><strong>The volume distribution<\/strong> \u2014 Which activities handle the most cases? Where does the flow split?<\/li>\n<\/ul>\n<h2>How Long Does It Actually Take?<\/h2>\n<p>Throughput time \u2014 the total time from a case entering the system to leaving it \u2014 is the metric operations managers care about most. bupaR computes it directly from the event log:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% throughput_time(\"log\") %&gt;% plot()\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_throughput_time-1.png\" alt=\"Throughput Time Distribution\" \/><\/p>\n<p>This distribution tells you what no average ever could. You do not just see that the mean throughput is X days \u2014 you see the <em>shape<\/em> of the distribution. A long right tail means some cases get stuck. A bimodal distribution means you have two fundamentally different process paths. Both of these patterns are invisible in a KPI dashboard that only shows averages.<\/p>\n<h2>Where Does the Time Go?<\/h2>\n<p>Knowing the total throughput is useful. Knowing <em>where<\/em> the time is spent is actionable. bupaR breaks processing time down by activity:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% processing_time(\"activity\") %&gt;% plot()\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_processing_time-1.png\" alt=\"Processing Time by Activity\" \/><\/p>\n<p>This immediately identifies your bottleneck. 
The activity with the longest processing time \u2014 or the highest variance \u2014 is where you should focus improvement efforts first. In a factory context, this could be a particular assembly step, a testing station, or a quality gate.<\/p>\n<p>The distinction between <strong>processing time<\/strong> (time actively working) and <strong>throughput time<\/strong> (total elapsed time including waiting) is critical. If throughput time is high but processing time is low, your problem is not capacity \u2014 it is queuing. You have cases sitting in buffers waiting for the next station. That is a scheduling problem, not a staffing problem.<\/p>\n<h2>Which Paths Do Cases Actually Take?<\/h2>\n<p>In a well-running operation, most cases should follow the same path. In reality, they do not. bupaR&#8217;s trace explorer shows you the most frequent unique paths (each called a &#8220;trace variant&#8221;) and how often each one occurs:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% trace_explorer(n_traces = 7)\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_trace_explorer-1.png\" alt=\"Trace Explorer \u2014 Process Variants\" \/><\/p>\n<p>Each row is a distinct process path. The colored blocks represent activities in sequence. The percentage tells you what share of cases followed that exact path.<\/p>\n<p>This is where rework becomes visible. If your designed process has 6 steps but a trace variant shows 8 or 9 blocks, cases are looping back through activities they already completed. Every loop is wasted capacity \u2014 material, labor, and machine time consumed without producing output.<\/p>\n<p>For a factory manager, the trace explorer answers a direct question: <strong>what percentage of my production follows the happy path?<\/strong> If it is 70%, you have a process control problem. 
If it is 95%, you have a stable process with occasional exceptions that can be managed individually.<\/p>\n<h2>How Much of the Variation Matters?<\/h2>\n<p>Not every deviation is a problem. Some variation is inherent in complex operations. Trace coverage tells you how many distinct paths you need to cover a given percentage of your cases:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% trace_coverage(\"trace\") %&gt;% plot()\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_trace_coverage-1.png\" alt=\"Trace Coverage\" \/><\/p>\n<p>If 3 trace variants cover 90% of your cases, your process is relatively stable \u2014 focus your improvement on those 3 paths. If you need 50 variants to cover 90%, your process is chaotic and needs structural intervention before optimization makes sense.<\/p>\n<h2>Which Activities Drive the Volume?<\/h2>\n<p>Activity frequency shows how often each step executes. In a linear process, every activity should have roughly the same count. Deviations from that baseline reveal rework:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% activity_frequency(\"activity\") %&gt;% plot()\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_activity_frequency-1.png\" alt=\"Activity Frequency\" \/><\/p>\n<p>If your quality control step executes 600 times but your assembly step only executes 500 times, 100 cases went through QC twice. That is a 20% rework rate at that station. 
You now know where to look and roughly how much capacity you are losing.<\/p>\n<h2>Who Is Doing the Work?<\/h2>\n<p>Resource analysis shows how work is distributed across people, machines, or workstations:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% resource_frequency(\"resource\") %&gt;% plot()\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_resource_frequency-1.png\" alt=\"Resource Frequency\" \/><\/p>\n<p>Uneven resource utilization creates bottlenecks even when total capacity is sufficient. If one operator handles 40% of cases while three others split the remaining 60%, you have a single point of failure. When that operator is absent, throughput drops by 40%.<\/p>\n<h2>Watching Every Case on a Timeline<\/h2>\n<p>The dotted chart is one of bupaR&#8217;s most powerful visualizations. Each row is a case. Each dot is an activity. The horizontal axis is time:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% dotted_chart(x = \"relative\")\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_dotted_chart-1.png\" alt=\"Dotted Chart \u2014 Relative Timeline\" \/><\/p>\n<p>Patterns that are invisible in summary statistics become obvious here:<\/p>\n<ul>\n<li><strong>Consistent spacing between dots<\/strong> = stable processing rhythm<\/li>\n<li><strong>Large gaps between dots<\/strong> = cases waiting in queues<\/li>\n<li><strong>Dots stacking vertically at the same x-position<\/strong> = batch processing or shift-start effects<\/li>\n<li><strong>Cases with many more dots than others<\/strong> = rework or exceptions<\/li>\n<\/ul>\n<p>A factory manager looking at this chart can immediately identify which cases had problems, when delays occurred, and whether the issues are systematic or random.<\/p>\n<h2>From Frequency to Performance<\/h2>\n<p>The same process map can show time instead of counts. 
Replace <code>frequency()<\/code> with <code>performance()<\/code> and you see average processing time on each activity and average waiting time on each arrow:<\/p>\n<pre><code class=\"language-r\">patients %&gt;% process_map(type = performance())\n<\/code><\/pre>\n<p><img decoding=\"async\" src=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/intro_performance_map-1.png\" alt=\"Performance Process Map\" \/><\/p>\n<p>This is your bottleneck map. Red nodes are slow activities. Long edge times mean cases are waiting between steps. The combination tells you exactly where your process is losing time and whether the problem is at the station (processing) or between stations (transport, queuing, scheduling).<\/p>\n<h2>Why This Matters for Your Factory<\/h2>\n<p>Process mining with bupaR answers the questions that keep operations managers awake at night:<\/p>\n<p><strong>&#8220;Where are my bottlenecks?&#8221;<\/strong> \u2014 Processing time analysis and performance maps show which stations constrain throughput. You stop guessing and start measuring.<\/p>\n<p><strong>&#8220;How much rework is happening?&#8221;<\/strong> \u2014 Trace analysis reveals every loop, every repeated activity, every case that deviated from the designed path. You can quantify rework as a percentage of total capacity \u2014 and set reduction targets.<\/p>\n<p><strong>&#8220;Are my resources balanced?&#8221;<\/strong> \u2014 Resource frequency shows whether work is distributed evenly or concentrated on a few overloaded stations. Rebalancing often costs nothing but delivers immediate throughput improvement.<\/p>\n<p><strong>&#8220;What does my process actually look like?&#8221;<\/strong> \u2014 The process map replaces the Visio diagram you drew three years ago with reality. The designed process is your intention. 
The mined process is your truth.<\/p>\n<p><strong>&#8220;Which cases need attention?&#8221;<\/strong> \u2014 The dotted chart highlights outliers \u2014 cases that took too long, had too many steps, or followed unusual paths. Instead of reviewing every case, you review the exceptions.<\/p>\n<h2>Getting Your Data Into bupaR<\/h2>\n<p>Most ERP and MES systems can export the data bupaR needs. You need four kinds of columns:<\/p>\n<table>\n<tbody>\n<tr><td><strong>Case ID<\/strong><\/td><td>Production order number, work order, batch ID<\/td><\/tr>\n<tr><td><strong>Activity<\/strong><\/td><td>Operation name, process step, workstation<\/td><\/tr>\n<tr><td><strong>Timestamp<\/strong><\/td><td>Start time and\/or completion time of each step<\/td><\/tr>\n<tr><td><strong>Resource<\/strong><\/td><td>Operator, machine, workstation ID<\/td><\/tr>\n<\/tbody>\n<\/table>\n<pre><code class=\"language-r\"># Load your own data (file and column names are placeholders)\nevent_data &lt;- read.csv(\"your_mes_export.csv\")\n\n# activitylog() expects POSIXct timestamp columns named after lifecycle\n# states such as \"start\" and \"complete\"; adjust the parsing to your export\nevent_data$start    &lt;- as.POSIXct(event_data$start_time, tz = \"UTC\")\nevent_data$complete &lt;- as.POSIXct(event_data$end_time, tz = \"UTC\")\n\n# Convert to a bupaR activity log\nmy_log &lt;- activitylog(\n  event_data,\n  case_id     = \"order_number\",\n  activity_id = \"operation\",\n  resource_id = \"workstation\",\n  timestamps  = c(\"start\", \"complete\")\n)\n\n# Start analyzing\nmy_log %&gt;% process_map()\nmy_log %&gt;% throughput_time(\"log\") %&gt;% plot()\nmy_log %&gt;% trace_explorer(n_traces = 10)\n<\/code><\/pre>\n<p>That is it. Three lines of analysis code and you have a process map, throughput distribution, and trace analysis for your factory.<\/p>\n<h2>Where to Go From Here<\/h2>\n<p>This post covers the fundamentals. For deeper dives into specific bupaR capabilities, see:<\/p>\n<ul>\n<li><strong><a href=\"https:\/\/inphronesys.com\/wp-content\/uploads\/2026\/02\/bupar_process_mining_dashboard.html\">Interactive Process Mining Dashboard<\/a><\/strong> \u2014 A standalone dashboard demonstrating process mining KPIs following Stephen Few&#8217;s visualization principles.<\/li>\n<\/ul>\n<p>Your factory generates event data every second. bupaR turns that data into operational intelligence. 
Install it, point it at your ERP export, and see your process for the first time.<\/p>\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Your factory has a designed process and an actual process. bupaR \u2014 the open-source process mining suite for R \u2014 shows you the difference, and that difference is where your efficiency gains are hiding.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13,12],"tags":[105,14,104,99,103,97,15],"class_list":["post-1019","post","type-post","status-publish","format-standard","hentry","category-data-science","category-process-mining","tag-bottleneck-analysis","tag-bupar","tag-efficiency","tag-manufacturing","tag-operations-management","tag-process-mining-2","tag-r"],"_links":{"self":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1019","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1019"}],"version-history":[{"count":5,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1019\/revisions"}],"predecessor-version":[{"id":1024,"href":"https:\/\/inphronesys.com\/index.php?rest_route=\/wp\/v2\/posts\/1019\/revisions\/1024"}],"wp:attachment":[{"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1019"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1019"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/inphronesys.com\/index.php?rest
_route=%2Fwp%2Fv2%2Ftags&post=1019"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}