<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Baldock's Blueprints]]></title><description><![CDATA[Baldock’s Blueprints: Practical guides on Microsoft 365, cloud tech, cybersecurity, AI, and self-hosted LLMs—real-world solutions to help you innovate, secure, and scale with confidence.]]></description><link>https://blog.brianbaldock.net</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1738510214561/9fce7b73-9500-423f-9caa-590a445ecda3.png</url><title>Baldock&apos;s Blueprints</title><link>https://blog.brianbaldock.net</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 10 Apr 2026 15:31:35 GMT</lastBuildDate><atom:link href="https://blog.brianbaldock.net/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Learning Fine-Tuning on Nvidia DGX Spark - Part 2]]></title><description><![CDATA[In Part 1 of this series, I got NVIDIA AI Workbench to accept a custom PyTorch base image on DGX Spark. Got the container validated, the GPU was visible, and CUDA worked.

🔗
Click here to read part 1 of this series


By most checklists, the environm...]]></description><link>https://blog.brianbaldock.net/learning-fine-tuning-on-nvidia-dgx-spark-part-2</link><guid isPermaLink="true">https://blog.brianbaldock.net/learning-fine-tuning-on-nvidia-dgx-spark-part-2</guid><category><![CDATA[DGXSpark]]></category><category><![CDATA[#fine Tuning models]]></category><category><![CDATA[pytorch]]></category><category><![CDATA[inference]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Tue, 03 Feb 2026 06:01:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770096843023/f9236764-26c5-400b-bc92-23f43b38e2a4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Part 1 of this series, I got NVIDIA AI Workbench to accept a custom PyTorch base image on DGX Spark. Got the container validated, the GPU was visible, and CUDA worked.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🔗</div>
<div data-node-type="callout-text"><a target="_self" href="https://blog.brianbaldock.net/learning-fine-tuning-on-nvidia-dgx-spark-part-1">Click here</a> to read part 1 of this series</div>
</div>

<p>By most checklists, the environment was “working”, except it wasn’t.</p>
<p>This article documents what happened next. No new directions, no future training steps. Just the troubleshooting journey required to turn a technically valid Workbench project into something I could actually experiment in.</p>
<h2 id="heading-starting-from-a-working-container-that-wasnt">Starting From a “Working” Container That Wasn’t</h2>
<p>After creating the Workbench project:</p>
<ul>
<li><p>The custom base container validated and pulled successfully <sup>✅</sup></p>
</li>
<li><p>The GPU was visible ✅</p>
</li>
<li><p>CUDA was available inside the container ✅</p>
</li>
</ul>
<p>And yet:</p>
<ul>
<li><p>No JupyterLab</p>
</li>
<li><p>No VS Code</p>
</li>
<li><p>No obvious interactive entry point</p>
</li>
<li><p>The Packages UI was completely empty</p>
</li>
</ul>
<p>At the infrastructure level, everything looked correct. At the usability level, the project was effectively dead on arrival.</p>
<h2 id="heading-the-first-wrong-assumption-packages-and-applications">The First Wrong Assumption: Packages and Applications</h2>
<p>The obvious next step was to add JupyterLab through the Workbench UI.</p>
<p>I clicked <strong>Add JupyterLab</strong>.</p>
<p>The UI accepted the click.</p>
<p><strong>Nothing</strong> happened.</p>
<p>No error. No feedback. No application created.</p>
<p>That’s when I realized:</p>
<blockquote>
<p>In AI Workbench, “Packages” and “Applications” are <strong>not</strong> the same thing.</p>
</blockquote>
<p>Applications require two things:</p>
<ul>
<li><p>The runtime dependencies must already exist in the container</p>
</li>
<li><p>Metadata must be present to tell Workbench how to manage them</p>
</li>
</ul>
<p>Without both, the UI quietly does nothing. Great.</p>
<h2 id="heading-why-jupyterlab-could-not-be-added">Why JupyterLab Could Not Be Added</h2>
<p>So why couldn’t it be added? It’s pretty simple: the custom GHCR-based base image did not include JupyterLab, and AI Workbench does <strong><mark>not install applications into custom containers</mark></strong>.</p>
<p>If the binary does not exist, the <strong>Add JupyterLab</strong> action fails silently.</p>
<p>At this point, I had a choice:</p>
<ul>
<li><p>Install JupyterLab manually inside the running container (not the best choice)</p>
</li>
<li><p>Or fix the base image properly (went with this one)</p>
</li>
</ul>
<p>Ad-hoc installs would work temporarily, but they would not be reproducible and they would fight the way Workbench is designed to operate. It’s better to have a correct base image to build on.</p>
<h2 id="heading-fix-the-base-image-not-the-project">Fix the Base Image, Not the Project</h2>
<p>A pattern was already emerging through my testing. The base image had required wrapping to satisfy Workbench metadata in Part 1, and tooling needed the same treatment. Rather than layering fixes inside a single project, I decided to rebuild the GHCR base image. It saves time in the long run, keeps things reproducible (fewer moving parts), and is reusable (it scales for future projects too).</p>
<h2 id="heading-how-does-ai-workbench-actually-discover-capabilities">How Does AI Workbench Actually Discover Capabilities?</h2>
<p>NVIDIA’s documentation confirmed something important: <strong>AI Workbench does not infer or scan container contents.</strong> It discovers capabilities entirely through labels.</p>
<p>For package management, <strong>two</strong> labels are required:</p>
<ul>
<li><p><code>com.nvidia.workbench.package-manager.apt.binary</code></p>
</li>
<li><p><code>com.nvidia.workbench.package-manager.pip.binary</code></p>
</li>
</ul>
<p>This explained everything:</p>
<ul>
<li><p>Why the Packages UI was empty</p>
</li>
<li><p>Why dependency management was unavailable</p>
</li>
</ul>
<p>Workbench had no declared way to manage packages!</p>
<h2 id="heading-verifying-package-manager-paths">Verifying Package Manager Paths</h2>
<p>We can’t go guessing the paths; that’s not good enough. I verified the actual binary locations inside the NGC PyTorch base image:</p>
<ul>
<li><p><code>apt</code> at <code>/usr/bin/apt</code></p>
</li>
<li><p><code>pip</code> at <code>/usr/local/bin/pip</code></p>
</li>
</ul>
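<p>One way to confirm these is a quick check against the NGC image directly (a sketch; <code>bash</code> and <code>command -v</code> are assumed available in the image, which holds for Ubuntu-based NGC containers):</p>
<pre><code class="lang-bash">docker run --rm nvcr.io/nvidia/pytorch:25.12-py3 bash -c 'command -v apt; command -v pip'
# Expected output: /usr/bin/apt and /usr/local/bin/pip
</code></pre>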
<p>The nuance here matters:</p>
<blockquote>
<p><mark>If the label path does not match the real binary exactly, Workbench fails silently.</mark></p>
</blockquote>
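<p>With the paths confirmed, the two labels can go into the wrapper Dockerfile from Part 1. A minimal sketch, appending via heredoc (the file name assumes you are sitting in the wrapper build context):</p>
<pre><code class="lang-bash"># Append the package manager labels to the wrapper Dockerfile
cat &gt;&gt; Dockerfile &lt;&lt;'EOF'
LABEL com.nvidia.workbench.package-manager.apt.binary="/usr/bin/apt" \
      com.nvidia.workbench.package-manager.pip.binary="/usr/local/bin/pip"
EOF
</code></pre>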
<h2 id="heading-baking-jupyterlab-into-the-base-image">Baking JupyterLab Into the Base Image</h2>
<p>With the model clear, JupyterLab belonged in the base image.</p>
<p>During the build:</p>
<ul>
<li><p>Installed JupyterLab</p>
</li>
<li><p>Installed <code>ipykernel</code></p>
</li>
</ul>
<p>No runtime installs. No state drift. JupyterLab now exists as a first-class capability of the environment.</p>
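<p>For reference, a minimal sketch of how that looks in the same wrapper Dockerfile (versions left unpinned here for brevity; pin whatever you validate):</p>
<pre><code class="lang-bash"># Bake the tooling into the image at build time, not at runtime
cat &gt;&gt; Dockerfile &lt;&lt;'EOF'
RUN pip install --no-cache-dir jupyterlab ipykernel
EOF
</code></pre>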
<h2 id="heading-rebuilding-the-wrapper-image-correctly">Rebuilding the Wrapper Image Correctly</h2>
<p>The updated Dockerfile now included:</p>
<ul>
<li><p>Required Workbench OS labels</p>
</li>
<li><p>CUDA metadata</p>
</li>
<li><p>Package manager labels</p>
</li>
<li><p>JupyterLab installation</p>
</li>
</ul>
<p>The image was rebuilt using <code>--no-cache</code> to avoid stale layers.</p>
<h2 id="heading-tagging-pushing-and-relearning-the-same-lesson">Tagging, Pushing, and Relearning the Same Lesson</h2>
<p>A new pinned tag was used. The image was pushed to GHCR. Labels were verified on the pulled image, not just the local build. Once again, the lesson held:</p>
<blockquote>
<p>AI Workbench validates remote metadata, and <code>latest</code> should <strong>never</strong> be used for base environments.</p>
</blockquote>
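<p>To check the registry rather than the local cache, pull the pinned tag and inspect what actually came back (the tag below is a placeholder; substitute whatever you pushed):</p>
<pre><code class="lang-bash">IMG=ghcr.io/brianbaldock/nvwb-pytorch-25.12:25.12.2   # hypothetical pinned tag
docker pull "$IMG"
docker inspect "$IMG" \
  --format '{{range $k,$v := .Config.Labels}}{{println $k "=" $v}}{{end}}' \
  | grep 'com.nvidia.workbench'
</code></pre>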
<h2 id="heading-repointing-the-workbench-project">Repointing the Workbench Project</h2>
<p>The Workbench project was updated to reference the new pinned image tag.</p>
<p>This time:</p>
<ul>
<li><p>OS metadata validated</p>
</li>
<li><p>Package managers were recognized</p>
</li>
<li><p>The Packages UI appeared</p>
</li>
<li><p>JupyterLab functioned correctly</p>
</li>
</ul>
<p>The project finally functioned like a true development environment, similar to the default one included with NVIDIA Workbench.</p>
<h2 id="heading-final-sanity-check">Final Sanity Check</h2>
<p>A dedicated GPU validation script was run; a minimal sketch follows the checklist below.</p>
<p>Confirmed:</p>
<ul>
<li><p>CUDA availability</p>
</li>
<li><p>GPU detected</p>
</li>
<li><p>BF16 support</p>
</li>
<li><p>Tensor operations executed on GPU successfully</p>
</li>
</ul>
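<p>Here is a minimal sketch of what such a validation script can look like, run inside the container (PyTorch is assumed present; <code>torch.cuda.is_bf16_supported()</code> is a standard PyTorch call):</p>
<pre><code class="lang-bash">python - &lt;&lt;'EOF'
import torch

assert torch.cuda.is_available(), "CUDA not available"
print("device:", torch.cuda.get_device_name(0))
print("bf16 supported:", torch.cuda.is_bf16_supported())

# Run a real tensor op on the GPU, not just a capability query
x = torch.randn(1024, 1024, device="cuda")
y = x @ x
torch.cuda.synchronize()
print("matmul ok:", tuple(y.shape))
EOF
</code></pre>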
<p>This was the first point where the environment wasn’t just “working”. It was actually usable.</p>
<h2 id="heading-core-lessons-from-this-phase">Core Lessons From This Phase</h2>
<ul>
<li><p>AI Workbench is metadata-driven, not convenience-driven</p>
</li>
<li><p>Custom containers must <strong>explicitly</strong> declare capabilities</p>
</li>
<li><p>Silent failures usually indicate <strong>missing</strong> <strong>metadata</strong>, not runtime errors</p>
</li>
<li><p>Fixing the base image once is cheaper than fixing every project</p>
</li>
<li><p><code>latest</code> is actively harmful in Workbench workflows</p>
</li>
</ul>
<h2 id="heading-closing-the-loop">Closing the Loop</h2>
<p>This article intentionally closes the loop on environment readiness.</p>
<p>From here on, the platform is stable. Training, fine-tuning, and experimentation can finally begin.</p>
<p>Stay tuned for Part 3, where that work begins in earnest.</p>
]]></content:encoded></item><item><title><![CDATA[Learning Fine-Tuning on Nvidia DGX Spark - Part 1]]></title><description><![CDATA[If you are working with NVIDIA DGX Spark (or your own rig) and trying to do anything beyond toy experimentation, you will very quickly run into challenges.

👉
Setting up your own rig? See my previous article to get started Deploying Local AI Inferen...]]></description><link>https://blog.brianbaldock.net/learning-fine-tuning-on-nvidia-dgx-spark-part-1</link><guid isPermaLink="true">https://blog.brianbaldock.net/learning-fine-tuning-on-nvidia-dgx-spark-part-1</guid><category><![CDATA[DGXSpark]]></category><category><![CDATA[finetuning]]></category><category><![CDATA[pytorch]]></category><category><![CDATA[inference]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Sat, 24 Jan 2026 05:38:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1769233000568/97e46a8f-7d96-4246-8d94-44717faf8150.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you are working with <strong>NVIDIA DGX Spark (or your own rig)</strong> and trying to do anything beyond toy experimentation, you will very quickly run into challenges.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">👉</div>
<div data-node-type="callout-text"><strong>Setting up your own rig? See my previous article to get started </strong><a target="_self" href="https://blog.brianbaldock.net/deploying-local-ai-inference-with-vllm-and-chatui-in-docker"><strong>Deploying Local AI Inference with vLLM and ChatUI in Docker</strong></a></div>
</div>

<p>This article documents the steps, decisions, and some of the failures I hit while getting <strong>NVIDIA AI Workbench</strong> running cleanly on DGX Spark with a <strong>Blackwell-capable PyTorch base image.</strong></p>
<p>By the end of this article, you will have:</p>
<ul>
<li><p>AI Workbench running on DGX Spark (This is pretty straightforward)</p>
</li>
<li><p>A CUDA-enabled PyTorch environment that supports <strong>sm_121 (This is where I had some issues)</strong></p>
</li>
<li><p>A clean, reproducible base image</p>
</li>
<li><p>A foundation suitable for fine-tuning modern models</p>
</li>
</ul>
<h2 id="heading-what-i-was-trying-to-accomplish">What I Was Trying to Accomplish</h2>
<p>The requirements were non-negotiable:</p>
<ul>
<li><p>Use <strong>NVIDIA AI Workbench</strong> as the development control plane</p>
</li>
<li><p>Run on <strong>DGX Spark (GB10, Blackwell,</strong> <code>sm_121</code>)</p>
</li>
<li><p>Fine-tune modern models (starting with <strong>Phi-4</strong>)</p>
</li>
<li><p>Use <strong>CUDA-enabled PyTorch</strong>, not CPU-only fallbacks</p>
</li>
<li><p>Avoid silent GPU architecture incompatibilities</p>
</li>
</ul>
<h2 id="heading-failure-1-generic-python-base-images">Failure #1: Generic Python Base Images</h2>
<p>Starting from a standard Python base image fails exactly how you would expect:</p>
<ul>
<li><p>PyTorch installs as <code>+cpu</code></p>
</li>
<li><p><a target="_blank" href="http://torch.cuda.is"><code>torch.cuda.is</code></a><code>_available()</code> returns <code>False</code></p>
</li>
</ul>
<p>AI Workbench does not magically make PyTorch CUDA-aware. The base image determines everything.</p>
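<p>A quick way to spot this failure mode from inside the container:</p>
<pre><code class="lang-bash">python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# A CPU-only wheel prints a version string ending in +cpu, followed by False
</code></pre>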
<h2 id="heading-failure-2-built-in-workbench-pytorch-images">Failure #2: Built-in Workbench PyTorch Images</h2>
<p>The next logical step was to use NVIDIA-provided PyTorch base images directly from AI Workbench.</p>
<p>Symptoms:</p>
<ul>
<li><p>CUDA appears available</p>
</li>
<li><p>GPU is visible</p>
</li>
<li><p>Warnings about unsupported architecture: <code>sm_121</code></p>
</li>
</ul>
<p>Root cause:</p>
<ul>
<li><p>These images were compiled for <code>sm_80</code>, <code>sm_86</code>, and <code>sm_90</code></p>
</li>
<li><p>Blackwell (<code>sm_120</code> / <code>sm_121</code>) support was not present</p>
</li>
</ul>
<p>At this point the issue was not Docker or Workbench; it was the <strong>PyTorch build target</strong>.</p>
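<p>You can confirm the build targets directly; <code>torch.cuda.get_arch_list()</code> reports what the installed wheel was compiled for:</p>
<pre><code class="lang-bash">python -c "import torch; print(torch.cuda.get_arch_list())"
# e.g. ['sm_80', 'sm_86', 'sm_90', ...]
# If sm_120 / sm_121 is absent, the build has no Blackwell kernels
</code></pre>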
<h2 id="heading-choosing-a-pytorch-base-that-supports-blackwell">Choosing a PyTorch Base That Supports Blackwell</h2>
<p>Once the constraint was clear, the solution was straightforward.</p>
<h3 id="heading-nvidia-ngc-pytorch-container">NVIDIA NGC PyTorch Container</h3>
<p>The working base image:</p>
<ul>
<li><a target="_blank" href="http://nvcr.io/nvidia/pytorch:25.12-py3"><code>nvcr.io/nvidia/pytorch:25.12-py3</code></a></li>
</ul>
<p>Why this image:</p>
<ul>
<li><p>Blackwell-capable (<code>sm_120</code>, compatible with <code>sm_121</code>)</p>
</li>
<li><p>CUDA 13.x user-space</p>
</li>
<li><p>NVIDIA-validated for DGX-class systems</p>
</li>
</ul>
<p>From a GPU and framework standpoint, this is the correct foundation.</p>
<p>Unfortunately, AI Workbench adds another constraint, which led to some trial and error.</p>
<h2 id="heading-why-ngc-images-fail-in-ai-workbench-by-default">Why NGC Images Fail in AI Workbench by Default</h2>
<p>NGC images are valid Docker images. They are <strong><em>not</em></strong> valid AI Workbench base environments.</p>
<p>AI Workbench validates <strong>image metadata before pulling layers</strong>. If required labels are missing, the image is rejected immediately with errors such as:</p>
<ul>
<li><p><code>invalid base environment (invalid OS)</code></p>
</li>
<li><p><code>no OSDistro set</code></p>
</li>
<li><p><code>no OSDistroRelease set</code></p>
</li>
</ul>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🔑</div>
<div data-node-type="callout-text"><strong>Key point: </strong><em>NGC containers are not automatically Workbench-compatible. Lesson learned.</em></div>
</div>

<p>You must explicitly provide the metadata Workbench expects.</p>
<h2 id="heading-the-fix-a-minimal-wrapper-image">The Fix: A Minimal Wrapper Image</h2>
<p>This is the cleanest solution and the one that scales. No recompiles. No rebuilding PyTorch. Just metadata.</p>
<h3 id="heading-inspect-the-base-os">Inspect the Base OS</h3>
<p>Inside the NGC container:</p>
<ul>
<li><p>OS: <code>linux</code></p>
</li>
<li><p>Distro: <code>ubuntu</code></p>
</li>
<li><p>Release: <code>24.04</code></p>
</li>
</ul>
<p>Verified via <code>/etc/os-release</code>.</p>
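<p>A one-liner to check this yourself, without an interactive shell:</p>
<pre><code class="lang-bash">docker run --rm nvcr.io/nvidia/pytorch:25.12-py3 cat /etc/os-release
# Look for ID=ubuntu and VERSION_ID="24.04"
</code></pre>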
<h3 id="heading-wrapper-dockerfile-with-workbench-metadata">Wrapper Dockerfile with Workbench Metadata</h3>
<p>The wrapper image does one thing only; it adds the labels required by AI Workbench.</p>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">FROM</span> nvcr.io/nvidia/pytorch:<span class="hljs-number">25.12</span>-py3

<span class="hljs-keyword">LABEL</span><span class="bash"> com.nvidia.workbench.schema-version=<span class="hljs-string">"v2"</span> \
      com.nvidia.workbench.name=<span class="hljs-string">"NGC PyTorch 25.12 (Workbench)"</span> \
      com.nvidia.workbench.description=<span class="hljs-string">"Wrapper for nvcr.io/nvidia/pytorch:25.12-py3 with Workbench metadata"</span> \
      com.nvidia.workbench.image-version=<span class="hljs-string">"25.12.1"</span> \
      com.nvidia.workbench.cuda-version=<span class="hljs-string">"13.0"</span> \
      com.nvidia.workbench.os=<span class="hljs-string">"linux"</span> \
      com.nvidia.workbench.os-distro=<span class="hljs-string">"ubuntu"</span> \
      com.nvidia.workbench.os-distro-release=<span class="hljs-string">"24.04"</span> \
      com.nvidia.workbench.programming-languages=<span class="hljs-string">"python3"</span></span>
</code></pre>
<p><em>No software changes are introduced.</em></p>
<h2 id="heading-building-and-verifying-the-wrapper-image">Building and Verifying the Wrapper Image</h2>
<pre><code class="lang-bash">docker build --no-cache -t nvwb-pytorch-25.12:latest .
</code></pre>
<p>Verify that the labels are present:</p>
<pre><code class="lang-bash">docker inspect nvwb-pytorch-25.12:latest \
  --format <span class="hljs-string">'{{range $k,$v := .Config.Labels}}{{println $k "=" $v}}{{end}}'</span> \
  | grep <span class="hljs-string">'com.nvidia.workbench'</span>
</code></pre>
<p>If these labels are missing or incorrect, AI Workbench will reject the image.</p>
<h2 id="heading-publishing-the-image-to-github-container-registry-ghcr">Publishing the Image to GitHub Container Registry (GHCR)</h2>
<h3 id="heading-why-ghcr">Why GHCR</h3>
<ul>
<li><p>AI Workbench needs a container URL that can be pulled, and I chose GHCR for this purpose. You can choose any option you prefer, as long as the image can be pulled from a valid URL.</p>
</li>
<li><p><strong>Local images are ignored</strong></p>
</li>
<li><p>GHCR works reliably for personal and public images</p>
</li>
</ul>
<h3 id="heading-authentication-requirements">Authentication Requirements</h3>
<p>I used a <strong>classic GitHub Personal Access Token</strong> with two scopes. A fine-grained token might be better, but since this is a local development setup, I didn’t think it was necessary to work out the exact permissions and kept it basic:</p>
<ul>
<li><p><code>read:packages</code></p>
</li>
<li><p><code>write:packages</code></p>
</li>
</ul>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"<span class="hljs-variable">$GH_TOKEN</span>"</span> | docker login ghcr.io -u &lt;username&gt; --password-stdin
</code></pre>
<h3 id="heading-docker-credential-helper-pitfall">Docker Credential Helper Pitfall</h3>
<p>My Docker config initially contained:</p>
<pre><code class="lang-json"><span class="hljs-string">"credHelpers"</span>: {
  <span class="hljs-attr">"ghcr.io"</span>: <span class="hljs-string">"workbench"</span>
}
</code></pre>
<p>This silently overrides standard Docker authentication and causes:</p>
<ul>
<li><p>Failed pushes</p>
</li>
<li><p><code>wb-svc</code> errors</p>
</li>
<li><p>AI Workbench unable to read image metadata</p>
</li>
</ul>
<p><strong>Fix:</strong></p>
<ul>
<li><p>Remove the <code>credHelpers</code> entry for <code>ghcr.io</code></p>
</li>
<li><p>Allow Docker to use <code>auths</code> normally</p>
</li>
</ul>
<p>This is subtle and easy to miss.</p>
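<p>You can edit <code>~/.docker/config.json</code> by hand, or script the removal. A sketch, assuming <code>jq</code> is installed:</p>
<pre><code class="lang-bash"># Back up the config, drop the ghcr.io helper entry, then re-authenticate
cp ~/.docker/config.json ~/.docker/config.json.bak
jq 'del(.credHelpers["ghcr.io"])' ~/.docker/config.json.bak &gt; ~/.docker/config.json
docker login ghcr.io   # writes a normal auths entry this time
</code></pre>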
<h3 id="heading-tagging-and-pushing-pinned-tags-only">Tagging and Pushing (Pinned Tags Only)</h3>
<p>Avoid <code>latest</code>.</p>
<pre><code class="lang-bash">docker tag nvwb-pytorch-25.12:latest ghcr.io/brianbaldock/nvwb-pytorch-25.12:25.12.1
docker push ghcr.io/brianbaldock/nvwb-pytorch-25.12:25.12.1
</code></pre>
<hr />
<h2 id="heading-why-latest-causes-workbench-validation-failures">Why <code>latest</code> Causes Workbench Validation Failures</h2>
<p>AI Workbench validates metadata <em>before</em> pulling layers.</p>
<p>With <code>latest</code>:</p>
<ul>
<li><p>Multiple digests</p>
</li>
<li><p>Registry-side caching</p>
</li>
<li><p>Inconsistent metadata resolution</p>
</li>
</ul>
<p>Pinned tags remove ambiguity and resolve validation failures immediately.</p>
<h2 id="heading-creating-the-ai-workbench-project">Creating the AI Workbench Project</h2>
<p>In AI Workbench:</p>
<ul>
<li><p>New Project → Custom Container</p>
</li>
<li><p>Container URL:</p>
</li>
</ul>
<pre><code class="lang-plaintext">ghcr.io/&lt;username&gt;/nvwb-pytorch-25.12:25.12.1
</code></pre>
<p>Result:</p>
<ul>
<li><p>Base environment validated</p>
</li>
<li><p>Project created successfully</p>
</li>
<li><p>GPU visible inside the container</p>
</li>
</ul>
<hr />
<h2 id="heading-final-verification">Final Verification</h2>
<p>Inside the container:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> torch
print(torch.__version__)
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(<span class="hljs-number">0</span>))
</code></pre>
<p>Expected output:</p>
<ul>
<li><p>CUDA-enabled PyTorch</p>
</li>
<li><p><code>NVIDIA GB10</code></p>
</li>
<li><p>No architecture warnings</p>
</li>
</ul>
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<ul>
<li><p>AI Workbench is <strong>metadata-driven</strong>, not just Docker-driven</p>
</li>
<li><p>NGC containers require wrapper labels</p>
</li>
<li><p>Blackwell requires the correct PyTorch build target</p>
</li>
<li><p>GHCR authentication is not Git authentication</p>
</li>
<li><p>Docker <code>credHelpers</code> can silently override auth</p>
</li>
<li><p><code>latest</code> is unsafe for Workbench base images</p>
</li>
<li><p>Wrapper images are the cleanest long-term approach</p>
</li>
</ul>
<hr />
<h2 id="heading-whats-next">What’s Next</h2>
<p>This article establishes a clean, reproducible base.</p>
<p>Next articles in this series will build on it:</p>
<ul>
<li><p>Fine-tuning <strong>Phi-4</strong> on DGX Spark</p>
</li>
<li><p>LoRA vs QLoRA on Blackwell</p>
</li>
<li><p>Serving models with <strong>vLLM</strong></p>
</li>
<li><p>Multi-container AI Workbench workflows</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[My Simple Setup for Sanity in Python Development on Windows]]></title><description><![CDATA[Dropping a quick post about how I keep my Python versions under control on Windows. Nothing fancy, just something that has saved me enough headaches that it feels worth sharing.
Python environments have burned me more times than I want to admit. Diff...]]></description><link>https://blog.brianbaldock.net/python-version-control</link><guid isPermaLink="true">https://blog.brianbaldock.net/python-version-control</guid><category><![CDATA[Python]]></category><category><![CDATA[pyenv]]></category><category><![CDATA[Windows]]></category><category><![CDATA[pyenv-win]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Tue, 02 Dec 2025 06:23:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764656478339/f2704d0f-711f-4f69-a4f3-a70adbf4832b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Dropping a quick post about how I keep my Python versions under control on Windows. Nothing fancy, just something that has saved me enough headaches that it feels worth sharing.</p>
<p>Python environments have burned me more times than I want to admit. Different projects, different versions, weird PATH collisions. The usual mess. PyEnv cleaned that up for me on macOS and Linux, so when I found the Windows version, I was all in.</p>
<p>PyEnv for macOS and Linux: <a target="_blank" href="https://github.com/pyenv/pyenv">https://github.com/pyenv/pyenv</a><br />PyEnv for Windows: <a target="_blank" href="https://github.com/pyenv-win/pyenv-win">https://github.com/pyenv-win/pyenv-win</a></p>
<p>My workflow lives in VS Code, and I want things to just work. If I open a repo that needs a specific Python version, I want that version active immediately. No guessing. No scrolling through interpreter lists. So I wired up a small PowerShell trick to keep everything in sync.</p>
<p>Here is the flow.</p>
<p>I open a repo. I know it needs Python 3.10. I run:</p>
<pre><code class="lang-bash">pyenv <span class="hljs-built_in">local</span> 3.10
</code></pre>
<p>PyEnv drops a <code>.python-version</code> file into the folder. The only job left is making sure my shell respects it.</p>
<p>This line in my PowerShell profile does exactly that. It puts the PyEnv shims at the front of PATH:</p>
<pre><code class="lang-powershell"><span class="hljs-variable">$env:Path</span> = <span class="hljs-string">"<span class="hljs-variable">$</span>(<span class="hljs-variable">$env:USERPROFILE</span>)\.pyenv\pyenv-win\shims;<span class="hljs-variable">$</span>(<span class="hljs-variable">$env:USERPROFILE</span>)\.pyenv\pyenv-win\bin;"</span> + <span class="hljs-variable">$env:Path</span>
</code></pre>
<p>I keep this line in two places so everything behaves the same way:</p>
<pre><code class="lang-bash">%userprofile%\Documents\PowerShell\Microsoft.PowerShell_profile.ps1
%userprofile%\Documents\PowerShell\Microsoft.VSCode_profile.ps1
</code></pre>
<p>Why put it in the profile instead of editing the global Windows PATH?</p>
<p>For me, it is safer and more predictable. The change applies only to the current shell session, and PyEnv’s shims always load first. Nothing else on the system silently overrides them.</p>
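<p>A quick sanity check that the shim actually wins, run from inside the repo folder:</p>
<pre><code class="lang-bash">pyenv version     # should report 3.10.x, sourced from the .python-version file
python --version  # should match; if it does not, PATH order is wrong
</code></pre>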
<p>In the end, it is one small line that keeps my Python setup clean and my VS Code projects consistent on Windows. No more surprises.</p>
<p>Like this post if you find it useful!</p>
]]></content:encoded></item><item><title><![CDATA[LightningCopilot - Integrating Microsoft Copilot Studio into Salesforce Lightning (LWC) with Entra ID SSO]]></title><description><![CDATA[Intro
If you’ve ever tried to embed a Microsoft Copilot Studio agent inside Salesforce, you’ve probably learned the hard way that it’s not as simple as dropping in an iframe. Between Salesforce Locker restrictions, Entra ID redirect rules, and Copilo...]]></description><link>https://blog.brianbaldock.net/lightningcopilot-salesforce-meets-copilotstudio</link><guid isPermaLink="true">https://blog.brianbaldock.net/lightningcopilot-salesforce-meets-copilotstudio</guid><category><![CDATA[entra id sso]]></category><category><![CDATA[Salesforce]]></category><category><![CDATA[lwc]]></category><category><![CDATA[copilotstudio]]></category><category><![CDATA[agentic AI]]></category><category><![CDATA[Entra ID]]></category><category><![CDATA[SSO]]></category><category><![CDATA[copilot agents]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Wed, 29 Oct 2025 10:00:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1761707251118/9b1550d5-e295-47a3-8bc0-32a8696f2d5e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-intro">Intro</h2>
<p>If you’ve ever tried to embed a Microsoft Copilot Studio agent inside Salesforce, you’ve probably learned the hard way that it’s not as simple as dropping in an iframe. Between Salesforce Locker restrictions, Entra ID redirect rules, and Copilot Studio’s federated auth model, it takes some serious tinkering to get everything to line up.</p>
<p>That’s why I built <strong>LightningCopilot</strong>. It’s a Lightning Web Component that brings a Copilot Studio agent to life inside Salesforce with full Entra ID SSO, MSAL-based auth, and clean token flow from start to finish. The chat itself runs through a lightweight BotFramework WebChat-based experience with Adaptive Cards, all without breaking Locker.</p>
<p>This post walks through the whole thing so you can wire it up yourself across Entra ID, Copilot Studio, and Salesforce.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">This guidance is provided as is with no garuantees and I highly recommend testing anything in a dev or sandbox environment before trying this in production.</div>
</div>

<p>The repository is published here: <a target="_blank" href="https://github.com/brianbaldock/LightningCopilot">github.com/brianbaldock/LightningCopilot</a></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🗒</div>
<div data-node-type="callout-text">Be aware that utilizing Copilot Studio Agents outside of the Microsoft 365 Copilot UI incurs Copilot Credit consumption. <a target="_self" href="https://learn.microsoft.com/en-us/microsoft-copilot-studio/billing-licensing">Copilot Studio licensing | Microsoft Learn</a></div>
</div>

<h2 id="heading-what-youll-build">What you’ll build</h2>
<ul>
<li><p>A Salesforce Lightning Web Component that securely hosts a Copilot Studio Agent</p>
</li>
<li><p>Seamless Entra ID SSO with MSAL.js</p>
</li>
<li><p>Properly scoped Salesforce Static Resources, Custom Labels and CSP Trusted URLs</p>
</li>
</ul>
<h2 id="heading-architecture-overview">Architecture Overview</h2>
<p>Everything starts in the Lightning Web Component. It loads the required libraries from Salesforce Static Resources, signs in users through Entra ID using MSAL, and passes authentication to the Copilot Studio agent.</p>
<p>At runtime, the component connects to Microsoft’s Power Platform APIs and Bot Framework endpoints to enable real-time conversations.</p>
<p>Using <strong>Static Resources</strong> was intentional. In Salesforce, scripts loaded directly from CDNs can break under Locker restrictions or CSP rules. By packaging them as Static Resources, you control versioning, avoid runtime issues, and keep the entire bundle self-contained, with no external script calls needed. I chose this method for simplicity, but feel free to test with script loading from a CDN if you prefer.</p>
<p><strong>Custom Labels</strong> follow the same logic. Instead of hardcoding client IDs, region URLs, or agent endpoints in your JavaScript, labels let you manage those values from Salesforce Setup. It’s cleaner, secure, and easier to update between environments without touching the code.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761361565838/260e8f72-9d36-462a-9ea3-8b942b17fc7e.png" alt="Flowchart depicting the integration of Salesforce Lightning Copilot with Microsoft services. It shows two main flows: one for loading scripts with static resources and another for acquiring tokens using MSAL. It includes Microsoft Entra ID, Copilot Studio Agent, and API connections to Power Platform and DirectLine." class="image--center mx-auto" /></p>
<p>Let’s get started!</p>
<h2 id="heading-prerequisites-im-assuming-you-have-this-already">Prerequisites (I’m assuming you have this already)</h2>
<ul>
<li><p>A Salesforce org (and know how to create a utility bar component)</p>
</li>
<li><p>A <strong>published</strong> Copilot Studio Agent</p>
</li>
<li><p>Access to Microsoft Entra ID (Application Administrator is sufficient RBAC on the app registration itself)</p>
</li>
<li><p>Access to Azure Cloud Shell to enable the Copilot SPN.</p>
</li>
<li><p>Notepad - for copy-pasting information from this guide as well as from the different platforms outlined below.</p>
</li>
</ul>
<h2 id="heading-prepare-the-agent">Prepare the Agent</h2>
<p>Chances are you have a Copilot Studio Agent already. If not, create one.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">ℹ</div>
<div data-node-type="callout-text">Creating agents is outside of the scope of this article, just search for a guide and you’ll find one.</div>
</div>

<p>Assuming you already have a published agent, open it in <a target="_blank" href="https://copilotstudio.microsoft.com">https://copilotstudio.microsoft.com</a>. Depending on your Power Platform environment type, some authentication or configuration options might not appear. I’d recommend checking that these settings are available; if they’re missing, you’ll likely need your IT team to spin up a new Power Platform environment for Copilot Studio, ideally using a <strong>Dev</strong> or <strong>Sandbox</strong> environment type. See the screenshot below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761401845015/7f0508aa-4057-42f3-959c-dbf4ab13af50.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p>Once able, select “<strong>Authenticate manually</strong>” and click “<strong>Save</strong>”.</p>
<p> <strong><em>Note:</em></strong> <em>Much of this form is filled out by default and you need only note down the values.</em></p>
</li>
<li><p>Add the “<strong>Redirect URL</strong>” to your notes.</p>
</li>
<li><p>Select “<strong>Microsoft Entra ID V2 with federated credentials</strong>” from the drop down.</p>
</li>
<li><p>Add the “<strong>Federated credential issuer</strong>” to your notes.</p>
</li>
<li><p>Add the “<strong>Federated credential value</strong>” to your notes.</p>
</li>
<li><p>Add the “<strong>Client ID</strong>” to your notes.</p>
</li>
<li><p><strong><em>Note:</em></strong> <strong><em>The “Scopes” will likely only contain “profile” and “openid”; this is okay for now. We’ll come back to this later.</em></strong></p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761408032098/6eca53f8-0c57-44ee-8081-359f8db829f2.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-prepare-the-entra-id-app-registration">Prepare the Entra ID App Registration</h2>
<ul>
<li><p>Access the <a target="_blank" href="https://entra.microsoft.com">Entra Admin Center</a>.</p>
</li>
<li><p>Click “App Registrations”.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761609128418/8b98c35e-caa5-4bdb-b9f6-9aae492f57ed.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Select your agent from the list. You should see your agent name with the (Microsoft Copilot Studio) suffix.</p>
</li>
<li><p>Note down your Application (Client) ID, Directory (Tenant) ID.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761610778868/2d34bfb6-2b3b-4eb4-b117-1f62eac3be6b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-1-authentication-settings">1) Authentication settings</h3>
<ul>
<li><p>Click “<strong>Authentication</strong>”, and click “<strong>Add platform</strong>”</p>
</li>
<li><p>Select “<strong>Web</strong>”</p>
<ul>
<li>Enter the following value: <strong>https://token.botframework.com/.auth/web/redirect</strong></li>
</ul>
</li>
<li><p>Click “Add platform” again, this time selecting “<strong>Single-page application</strong>” (<strong>SPA</strong>)</p>
</li>
<li><p>Add the following SPAs:</p>
<ul>
<li><p><strong>https://api.powerplatform.com/CopilotStudio.Copilots.Invoke</strong></p>
</li>
<li><p><strong>https://&lt;YOUR-ORG&gt;.&lt;develop,sandbox, etc&gt;.lightning.force.com</strong></p>
</li>
<li><p><strong>https://&lt;YOUR-ORG&gt;.&lt;develop,sandbox, etc&gt;.lightning.force.com/ <mark>← note the trailing slash</mark></strong></p>
</li>
<li><p><strong>https://&lt;YOUR-ORG&gt;.&lt;develop,sandbox, etc&gt;.lightning.force.com/lightning/page/home</strong></p>
</li>
</ul>
</li>
<li><p>Select “<strong>ID tokens (used for implicit and hybrid flows)</strong>”</p>
</li>
<li><p>Should look something like this when you’re done:</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761610381246/55e77876-f464-46d6-bac8-fe38c2900666.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-2-certificates-amp-secrets">2) Certificates &amp; Secrets</h3>
<ol>
<li><p>Click “<strong>Federated credentials</strong>”; there are likely already two credentials here.</p>
</li>
<li><p>Click “<strong>Add credential”.</strong></p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761611179217/87ff8cbd-ae45-44d9-a98e-007d9f60f0eb.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p>From the “<strong>Federated credential scenario</strong>” dropdown select “<strong>Other issuer</strong>”.</p>
</li>
<li><p>The “<strong>Issuer</strong>” in this context is the following URL, with your tenant ID filled in: <strong>https://login.microsoftonline.com/&lt;TENANT ID&gt;/v2.0</strong></p>
<p> Leave the “<strong>Type</strong>” as is: “<strong>Explicit subject identifier</strong>”.</p>
</li>
<li><p>In “<strong>Value</strong>”, drop in the “<strong>/eid1/…</strong>” value generated when you created the manual authentication settings in your agent. It should already be populated in your agent’s “<strong>Federated credential value</strong>” field.</p>
</li>
<li><p>Name this credential something clear and easily identifiable like, “<strong>FED_ID_AGENTNAME</strong>”.</p>
</li>
<li><p>Provide a description, example: “<strong>Federated credential for &lt;AGENTNAME&gt; for Salesforce SSO flow</strong>“</p>
</li>
<li><p>Click “<strong>Add</strong>” to save the settings</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761679642873/577df253-7e7d-46da-a44b-612a2e16a4f1.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-3-api-permissions">3) API Permissions</h3>
<p>For this section we need to create the Power Platform API service principal for the tenant. The easiest and quickest way to do this is via the Azure Cloud Shell. Open a new tab in your browser and access the Cloud Shell at the following link: <a target="_blank" href="https://shell.azure.com">https://shell.azure.com</a></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">ℹ</div>
<div data-node-type="callout-text">Requirements for accessing the Azure Cloud Shell are not outlined in this blog post. Check out the following link for more information: <a target="_self" href="https://learn.microsoft.com/en-us/azure/cloud-shell/get-started/classic?tabs=azurecli">https://learn.microsoft.com/en-us/azure/cloud-shell/get-started/classic?tabs=azurecli</a></div>
</div>

<p>Later we’ll be adding the <strong>CopilotStudio.Copilots.Invoke</strong> delegated permission in Entra ID; it shows up as part of the <strong>Power Platform API</strong>. That’s expected: Copilot Studio is built on top of the Power Platform service layer. The documentation under <a target="_blank" href="https://learn.microsoft.com/en-us/power-platform/admin/">Power Platform Admin API</a> covers this same backend service.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">You will need at least the “<strong>Application Administrator</strong>” role to run this command. Don’t worry, it will tell you if you don’t have the correct role.</div>
</div>

<pre><code class="lang-bash">az ad sp create --id 8578e004-a5c6-46e7-913e-12f58912df43
</code></pre>
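<p>To confirm the service principal now exists in your tenant (same well-known Power Platform API application ID):</p>
<pre><code class="lang-bash">az ad sp show --id 8578e004-a5c6-46e7-913e-12f58912df43 --query displayName
</code></pre>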
<p>With the SPN enabled we can now go back to Entra ID and add additional API Permissions. Open up your Agent’s App Registration again and click on the “<strong>API Permissions</strong>” pane.</p>
<ol>
<li>Under API Permissions select “<strong>Add Permission</strong>”.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761683012650/8f30f820-6ee3-453b-8702-96942f43080b.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p>Select "<strong>APIs my organization uses</strong>”.</p>
</li>
<li><p>Search for “<strong>Power Platform API</strong>” and select it.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761683038733/1b43b913-a1b8-471e-9f5b-a8044f8d2e44.png" alt class="image--center mx-auto" /></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">⚠</div>
<div data-node-type="callout-text">If you do not see “Power Platform API” in this list then go back to the start of this section and add the Power Platform API SPN to your Tenant.</div>
</div>

<ol>
<li><p>Search for “<strong>CopilotStudio.Copilots.Invoke</strong>”.</p>
</li>
<li><p>Check the “<strong>CopilotStudio.Copilots.Invoke</strong>” option.</p>
</li>
<li><p>Click “<strong>Add permissions</strong>”.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761683527840/cd3995ea-72fd-4151-8f6c-f8a6bbe6ab13.png" alt class="image--center mx-auto" /></p>
<ol>
<li>Click “<strong>Grant admin consent for &lt;Your Org Name&gt;</strong>”.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761683749477/da19c989-da34-4e0d-bab1-2b90d44a0992.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-4-expose-an-api">4) Expose an API</h3>
<ol>
<li><p>Select “<strong>Expose an API</strong>”.</p>
</li>
<li><p>Click “<strong>Add</strong>" next to “<strong>Application ID URI</strong>”</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761683994439/1dc18b72-190a-410a-9c59-a17e39a6f739.png" alt class="image--center mx-auto" /></p>
<ol>
<li>Click “<strong>Save</strong>” (the generated URI is sufficient for our purposes)</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761684321569/2de72b8a-baac-40be-ae46-02605e2553b3.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p>Next, click “<strong>Add a Scope</strong>”.</p>
</li>
<li><p>For "Scope name” enter: "<strong>Chat.Invoke</strong>”.</p>
</li>
<li><p>Who can consent?: “<strong>Admins and users</strong>”.</p>
</li>
<li><p>Admin consent display name: “<strong>Copilot Invocation</strong>”.</p>
</li>
<li><p>Admin consent description: “<strong>Invoke Copilot Agent</strong>”.</p>
</li>
<li><p>User consent display name: “<strong>Copilot Invocation</strong>”.</p>
</li>
<li><p>User consent description: “<strong>Invoke Copilot Agent</strong>”.</p>
</li>
<li><p>State: “<strong>Enabled</strong>”.</p>
</li>
<li><p>Click “Add scope”.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761684445730/ce539d59-6465-4f71-823d-37c1de35c130.png" alt class="image--center mx-auto" /></p>
<p>When you’re done it should look something like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761685041961/40831e61-f119-42f6-9be8-eb5a8d74c4d5.png" alt class="image--center mx-auto" /></p>
<p><strong>Update the Scopes in your Copilot Agent</strong></p>
<ul>
<li><p>Now copy the full scope URI from under “Scopes” (see the clipboard button next to the api:// address).</p>
</li>
<li><p>Go back to Copilot Studio and select your Agent, click Settings, Security, Authentication, and paste the full <strong>api://&lt;GUID&gt;/Chat.Invoke</strong> at the end of the “Scopes” text field, leaving a space after the existing scopes.</p>
</li>
<li><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761705639749/9d598a92-1db9-4f33-bade-a8581858713d.png" alt class="image--center mx-auto" /></p>
</li>
</ul>
<h2 id="heading-building-the-lwc">Building the LWC</h2>
<ol>
<li><p>Clone the GitHub repository here: <a target="_blank" href="https://github.com/brianbaldock/LightningCopilot">https://github.com/brianbaldock/LightningCopilot</a></p>
</li>
<li><p>At the root of the project, type the following commands in order</p>
</li>
</ol>
<pre><code class="lang-bash"><span class="hljs-comment"># Install node modules:</span>
npm install
</code></pre>
<p>I’ve added a build-static-resources.mjs script, which pulls the latest components required for Adaptive Cards, MSAL, and the Copilot Studio Client from various modules and SDKs. The script is here: <strong>scripts/build-static-resources.mjs</strong></p>
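<p>Run it from the project root (I’m invoking it directly with node here; check package.json in case the repo wraps it in an npm script):</p>
<pre><code class="lang-bash">node scripts/build-static-resources.mjs
</code></pre>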
<p>You should see a result like this.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761697370228/8c91048c-7342-4411-a9a7-36aaa7ba68cc.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-rebranding">Rebranding</h3>
<p>This is a good time to rename your agent project because chances are you’re not calling it <strong>LightningCopilot</strong> (though it is a pretty cool name huh 😉)</p>
<p><strong>To publish under a different brand:</strong></p>
<ul>
<li><p>Rename this folder: lightningCopilot/main/default/lwc/<strong>lightningCopilotAuth</strong></p>
</li>
<li><p>Rename classes/export: Update exported class <strong>LightningCopilotAuth</strong> to new name.</p>
</li>
<li><p>Event names: Replace: <strong>lightningcopilotauthsignin</strong>, <strong>lightningcopilotauthsignout</strong>, <strong>lightningcopilotautherror</strong>.</p>
</li>
<li><p>HTML labels: Change “<strong>Sign in to Lightning Copilot</strong>” to the new branding text.</p>
</li>
<li><p>CSS namespace: Root class <strong>.lightning-copilot-shell</strong> → new scoped class.</p>
</li>
<li><p>Logging prefix: Update <strong>[LightningCopilotAuth]</strong> occurrences.</p>
</li>
<li><p><strong>sfdx-project.json</strong>: Adjust package directory path if source folder renamed.</p>
</li>
<li><p>README: Replace occurrences of LightningCopilot with new brand. Keep naming internally consistent to avoid broken imports.</p>
</li>
</ul>
<h3 id="heading-pushing-the-project-to-salesforce">Pushing the project to SalesForce</h3>
<p>At the root of the project folder do the run the following commands.</p>
<pre><code class="lang-bash">sf org login web --<span class="hljs-built_in">alias</span> MyOrg --instance-url https://login.salesforce.com
sf config <span class="hljs-built_in">set</span> target-org MyOrg --global
sf project deploy start
</code></pre>
<h3 id="heading-creating-the-static-resources">Creating the Static Resources</h3>
<p><strong>Static Resources</strong> in Salesforce are essentially packaged files (JavaScript, CSS, images, or libraries) that you upload once and reference securely inside Lightning Web Components or Aura apps. Instead of linking to external CDNs or hardcoding paths, Static Resources let you bundle everything you need (like msalBrowser, copilotStudioClient, and adaptiveCards) directly into your org. This keeps your component self-contained, avoids Locker and CSP violations, and gives you full control over versioning and deployment between environments.</p>
<p>Let’s create the static resources:</p>
<ol>
<li><p>In Salesforce, select the “<strong>Setup</strong>” gear.</p>
</li>
<li><p>Search for “<strong>Static Resources</strong>” and enter the Static Resource screen.</p>
</li>
<li><p>Create three new Static Resources with the following details:</p>
</li>
</ol>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Name</strong></td><td><strong>File</strong></td></tr>
</thead>
<tbody>
<tr>
<td>msalBrowser</td><td>static-resources-build/msalbrowser/dist/msal-browser.min.js</td></tr>
<tr>
<td>adaptiveCard</td><td>static-resources-build/adaptiveCards/dist/adaptivecards.js</td></tr>
<tr>
<td>copilotStudioClient</td><td>static-resources-build/copilotStudioClient/dist/copilotStudioClient.js</td></tr>
</tbody>
</table>
</div><div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">The naming convention is static and case sensitive, do not modify these names or you’ll have to update the LWC code as well.</div>
</div>

<p>When you’re done, it should look something like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761700662149/96f08684-d83b-44fb-854b-d3f943485396.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-creating-the-trustedurls">Creating the TrustedURLs</h3>
<p>Trusted URLs in Salesforce define which external domains your org is allowed to load resources or make network calls to. Because of Salesforce’s strict Content Security Policy (CSP), anything not explicitly trusted will be blocked, including scripts, iframes, or API requests from your Lightning Web Component. By adding these Microsoft endpoints as <strong>Trusted URLs</strong>, you’re telling Salesforce that communication with Copilot Studio, Power Platform, and Entra ID is safe and intentional. This step ensures that token exchanges, Adaptive Card rendering, and chat functionality all work correctly within Locker’s sandboxed environment.</p>
<ol>
<li><p>In Salesforce, select the “<strong>Setup</strong>” gear.</p>
</li>
<li><p>Search for “<strong>CSP</strong>” and select “<strong>Trusted URLs</strong>”</p>
</li>
<li><p>Click “<strong>New Trusted URL</strong>”</p>
</li>
<li><p>All of the Trusted URLs listed below follow the same recipe; the only differences are the URL and the API Name.</p>
<ul>
<li><strong>Trusted URL Recipe:</strong></li>
</ul>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761701189098/13069aa2-f1f8-46cd-95cc-ba825566294d.png" alt class="image--center mx-auto" /></p>
<p>Here is the list of URLs:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>API Name</strong></td><td><strong>URL</strong></td><td><strong>About</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>API_BAP_MSFT</strong></td><td>https://api.bap.microsoft.com</td><td>Power Platform backend used by Copilot Studio for configuration and policy management.</td></tr>
<tr>
<td><strong>BOTFRAMEWORK</strong></td><td>https://cdn.botframework.com</td><td>Hosts core Bot Framework scripts and assets used for chat functionality.</td></tr>
<tr>
<td><strong>COPILOTSTUDIO</strong></td><td>https://copilotstudio.microsoft.com</td><td>Main Copilot Studio web interface and runtime services for agents.</td></tr>
<tr>
<td><strong>DIRECTLINE</strong></td><td>https://directline.botframework.com</td><td>Handles authenticated chat sessions and message transport between Salesforce and Copilot.</td></tr>
<tr>
<td><strong>HTTPS_POWER_PLATFORM_API</strong></td><td>https://&lt;GUID&gt;.f1.environment.api.powerplatform.com</td><td>Environment-specific Power Platform API endpoint that Copilot uses for execution and data exchange.</td></tr>
<tr>
<td><strong>M365</strong></td><td>https://login.microsoftonline.com</td><td>Microsoft Entra ID endpoint for authentication and token issuance.</td></tr>
<tr>
<td><strong>MDCA</strong></td><td>https://abtcyber-net.access.mcas.ms</td><td>Defender for Cloud Apps endpoint providing conditional access and session control. <strong><mark>NOTE: You may not need this if you don’t use Microsoft Defender for Cloud Apps.</mark></strong></td></tr>
<tr>
<td><strong>WSS_POWER_PLATFORM_API</strong></td><td>wss://&lt;GUID&gt;.f1.environment.api.powerplatform.com</td><td>WebSocket endpoint for real-time Copilot Studio and Power Platform communications. Environment specific.</td></tr>
</tbody>
</table>
</div><div data-node-type="callout">
<div data-node-type="callout-emoji">ℹ</div>
<div data-node-type="callout-text"><strong>Note:</strong> The wss:// (WebSocket) endpoint enables real-time chat and Adaptive Card updates. Without it, you might see delays or dropped connections inside Salesforce.</div>
</div>

<p>When you’re done, it should look something like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761700977687/613cce7d-d43e-4f3e-89e3-8aa5ffca440e.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-creating-the-custom-labels">Creating the Custom Labels</h3>
<p><strong>Custom Labels</strong> in Salesforce let you store configurable values, like URLs, client IDs, and scopes, that your Lightning Web Component can reference at runtime. Instead of hardcoding these details into your code, labels make it easier to manage environments (Dev, Test, Prod) without redeploying the component. It’s also more secure and keeps sensitive identifiers out of your source files. Think of them like environment variables.</p>
<ol>
<li><p>In Salesforce, select the “<strong>Setup</strong>” gear.</p>
</li>
<li><p>Search for “<strong>Custom Labels</strong>” and select “<strong>Custom Labels</strong>”</p>
</li>
<li><p>Click “<strong>New Custom Label</strong>”</p>
</li>
<li><p>You’ll need custom labels for all of the values listed below. This activity will require you to lookup values in Entra ID, Copilot Studio, and do a bit of discovery using Azure App Insights.</p>
</li>
</ol>
<p>List of Custom Labels:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Label Name</strong></td><td><strong>Value</strong></td><td><strong>About</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>COPILOT_AgentUrl</strong></td><td>https://copilotstudio.microsoft.com/environments/&lt;guid&gt;/bots/&lt;name&gt;/webchat?__version__=2</td><td>Direct URL to the Copilot Studio agent (used to embed or call your agent from Salesforce).</td></tr>
<tr>
<td><strong>COPILOT_EmbedUrl</strong></td><td>https://api.bap.microsoft.com/providers/Microsoft.BusinessAppPlatform/environments/&lt;Environment GUID&gt;/copilotAgents?api-version=2023-10-01-preview</td><td>Base API endpoint for invoking the Copilot Agent through Power Platform.</td></tr>
<tr>
<td><strong>MSAL_ClientId</strong></td><td><strong>aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa</strong></td><td>The Entra ID app registration’s client ID used by MSAL for authentication.</td></tr>
<tr>
<td><strong>MSAL_RedirectUri</strong></td><td>https://&lt;orgname&gt;.&lt;sandbox etc.&gt;.lightning.force.com/lightning/page/home</td><td>The redirect URI registered in Entra ID, pointing back to Salesforce for token return after login.</td></tr>
<tr>
<td><strong>MSAL_Scopes</strong></td><td>api://&lt;GUID&gt;/Chat.Invoke https://api.powerplatform.com/CopilotStudio.Copilots.Invoke openid profile email</td><td>Defines the OAuth scopes that the MSAL client requests when authenticating with Entra ID. Separated by spaces.</td></tr>
<tr>
<td><strong>MSAL_TenantId</strong></td><td><strong>aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa</strong></td><td>Your Microsoft Entra ID (Azure AD) tenant ID, used to authenticate against the correct directory.</td></tr>
</tbody>
</table>
</div><p>Great list right? Where do we find all the things? Let’s start with the easy ones:</p>
<ul>
<li><p><strong>MSAL_ClientId -</strong> Your Copilot Agent’s App Registration Application (Client) ID - Find this in Entra ID.</p>
</li>
<li><p><strong>MSAL_RedirectUri -</strong> Your Salesforce page where the component will be running.</p>
</li>
<li><p><strong>MSAL_Scopes -</strong> These are what we defined (with some additions) to the App Registration earlier in this article. List these out in the custom label with spaces separating each one. - Find this in Entra ID.</p>
</li>
<li><p><strong>MSAL_TenantId -</strong> Your Entra ID Tenant ID - Find this in Entra ID.</p>
</li>
<li><p><strong>COPILOT_AgentUrl</strong></p>
<ol>
<li><p>Go to <a target="_blank" href="https://copilotstudio.microsoft.com">https://copilotstudio.microsoft.com.</a></p>
</li>
<li><p>Open your agent.</p>
</li>
<li><p>Click Channels, select Demo Website</p>
</li>
<li><p>Copy the URL listed in “Share the URL” to your browser’s address bar:</p>
</li>
<li><p>When the page opens notice how the URL changed. Copy this new URL to your notes.</p>
</li>
<li><p>Update the URL the following way:</p>
<ul>
<li><p>Remove the highlighted parts in the example below:</p>
<ul>
<li>https://copilotstudio.microsoft.com/environments/&lt;ENVIRONMENTGUID&gt;/bots/&lt;NAME&gt;/<mark>canvas?__version__=2&amp;enableFileAttachment=true</mark></li>
</ul>
</li>
<li><p>Replace with the following (ensure to remove any trailing or doubled slashes)</p>
<ul>
<li><strong>/webchat?__version__=2</strong></li>
</ul>
</li>
<li><p>It should look like this now:</p>
<ul>
<li>https://copilotstudio.microsoft.com/environments/&lt;ENVIRONMENTGUID&gt;/bots/&lt;NAME&gt;<strong><mark>/webchat?__version__=2</mark></strong></li>
</ul>
</li>
</ul>
</li>
</ol>
</li>
<li><p><strong>COPILOT_EmbedUrl</strong></p>
<ol>
<li><p>This one is really tricky to find, but luckily you don’t have to do the network tracing, CSP monitoring, and trial and error that I went through to lock this one in. Just take the Environment GUID from the COPILOT_AgentUrl above and drop it into this preconstructed link:</p>
<ul>
<li>https://api.bap.microsoft.com/providers/Microsoft.BusinessAppPlatform/environments/&lt;ENVIRONMENTGUID&gt;/copilotAgents?api-version=2023-10-01-preview</li>
</ul>
</li>
</ol>
</li>
</ul>
<p>When you’re finished, it should look something like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1761704643698/1c3c9cfc-4e31-4d81-b085-f45da576c321.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-troubleshooting">Troubleshooting</h2>
<p><strong>Agent keeps asking to sign in</strong></p>
<ul>
<li><p>Verify every Salesforce SPA redirect URI exists in Entra ID. Include the trailing slash and /lightning/page/home.</p>
</li>
<li><p>Confirm <a target="_blank" href="https://token.botframework.com/.auth/web/redirect">https://token.botframework.com/.auth/web/redirect</a> is listed under <strong>Web</strong>.</p>
</li>
<li><p>Check the federated credential issuer and value. Issuer must be <a target="_blank" href="https://login.microsoftonline.com/%3Ctenant-id%3E/v2.0">https://login.microsoftonline.com/&lt;tenant-id&gt;/v2.0</a>. Value should match the /eid1/... from Copilot Studio.</p>
</li>
<li><p>If MSAL silent auth fails inside Salesforce, set cacheLocation: "localStorage" and storeAuthStateInCookie: true in your MSAL config.</p>
</li>
</ul>
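<p>To take the guesswork out of the first two checks, here is a small Microsoft Graph PowerShell sketch that prints the app registration’s SPA and Web redirect URIs so you can compare them against your Salesforce domains. The object ID placeholder is yours to fill in:</p>
<pre><code class="lang-powershell"># Minimal sketch: list SPA and Web redirect URIs for the app registration.
# Assumes the Microsoft.Graph module and an account that can read applications.
Connect-MgGraph -Scopes "Application.Read.All"
$app = Get-MgApplication -ApplicationId "&lt;APP-OBJECT-ID&gt;"   # object ID, not client ID

"SPA redirect URIs:"
$app.Spa.RedirectUris

"Web redirect URIs (should include the Bot Framework redirect):"
$app.Web.RedirectUris
</code></pre>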
<p><strong>Power Platform API not found in Entra ID</strong></p>
<ul>
<li><p>Create the service principal first (see the section earlier in the article, or the sketch after this list).</p>
</li>
<li><p>Reopen API permissions and search for Power Platform API. Add CopilotStudio.Copilots.Invoke and grant admin consent.</p>
</li>
</ul>
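<p>Creating that service principal is a one-liner once you know the Power Platform API’s first-party application ID. The sketch below uses the app ID published in Microsoft’s Power Platform API documentation; verify it against the current docs before running:</p>
<pre><code class="lang-powershell"># Hedged sketch: create the Power Platform API service principal in your tenant.
# App ID per Microsoft's published docs - double-check before use.
Connect-MgGraph -Scopes "Application.ReadWrite.All"
New-MgServicePrincipal -AppId "8578e004-a5c6-46e7-913e-12f58912df43"
</code></pre>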
<p><strong>Adaptive Cards show [object Object]</strong></p>
<ul>
<li><p>Use the non-minified Adaptive Cards build (this should already be the case if you followed the article this far). This is mainly noted for posterity.</p>
</li>
<li><p>Confirm the Static Resource name matches exactly what the LWC imports expect.</p>
</li>
</ul>
<p><strong>MSAL redirect loop or stale cache</strong></p>
<ul>
<li><p>Clear browser storage for your Salesforce domain.</p>
</li>
<li><p>Verify the SPA redirect URIs exactly match the domain you are testing on.</p>
</li>
<li><p>In sandboxes, double-check you are not mixing lightning.force.com and my.salesforce.com origins.</p>
</li>
</ul>
<h3 id="heading-known-gotchas-by-environment"><strong>Known gotchas by environment</strong></h3>
<ul>
<li><p><strong>Locker</strong>: avoid dynamic eval or libraries that assume window globals.</p>
</li>
<li><p><strong>Static Resources</strong>: names are case sensitive. Change a name and your LWC fails to load.</p>
</li>
<li><p><strong>Multiple Salesforce domains</strong>: Entra SPAs must exist for each domain you use to test.</p>
</li>
<li><p><strong>Region drift</strong>: moving the agent to another Power Platform environment without updating labels breaks calls silently.</p>
</li>
</ul>
<h3 id="heading-security-notes"><strong>Security notes</strong></h3>
<ul>
<li><p>The minimum role to create the Power Platform API SPN is <strong>Application Administrator</strong>. Cloud Application Administrator, Privileged Role Administrator, or Global Administrator also work.</p>
</li>
<li><p>Keep client IDs, tenant IDs, and environment GUIDs in <strong>Custom Labels</strong>. Do not hardcode in JS.</p>
</li>
<li><p>Pin your Static Resource versions with your build script to avoid unplanned upgrades.</p>
</li>
</ul>
<h3 id="heading-quick-validation-checklist"><strong>Quick validation checklist</strong></h3>
<ul>
<li><p>Copilot Studio agent is published in the expected environment.</p>
</li>
<li><p>Entra ID app has Web and SPA redirects configured.</p>
</li>
<li><p>Power Platform API with CopilotStudio.Copilots.Invoke is granted and consented.</p>
</li>
<li><p>Chat.Invoke scope exposed and referenced in <strong>MSAL_Scopes</strong> custom label.</p>
</li>
<li><p>Static Resources uploaded and named exactly: msalBrowser, adaptiveCard, copilotStudioClient.</p>
</li>
<li><p>Trusted URLs added for login.microsoftonline.com, api.bap.microsoft.com, copilotstudio.microsoft.com, directline.botframework.com, your environment-specific HTTPS and WSS endpoints.</p>
</li>
<li><p>Custom Labels populated and referenced by the LWC.</p>
</li>
</ul>
<h2 id="heading-wrap-up"><strong>Wrap up</strong></h2>
<p>You should now have a Copilot Studio agent running inside Salesforce with Entra ID SSO, MSAL handling the token flow, and a lightweight Bot Framework chat that behaves under Locker. The setup is opinionated for reliability in Salesforce: Static Resources for scripts, Custom Labels for environment config, and a short list of Trusted URLs to satisfy CSP. From here you can extend the component with richer telemetry, swap the hosted chat for a custom Direct Line client, or light it up inside other Salesforce workspaces.</p>
<p>Repo is here if you want to clone or open a PR: <a target="_blank" href="http://github.com/brianbaldock/LightningCopilot"><strong>github.com/brianbaldock/LightningCopilot</strong></a></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🧡</div>
<div data-node-type="callout-text">If you found this post helpful, please like it below and share it out!</div>
</div>]]></content:encoded></item><item><title><![CDATA[When "Ethical AI" cites ghosts]]></title><description><![CDATA[If you want a real-world example of why AI hallucinations and human overreliance matter, check out the story about an education “ethical AI” report with more than 15 fake sources. Ouch. That is exactly what recent research has been warning us about: ...]]></description><link>https://blog.brianbaldock.net/when-ethical-ai-cites-ghosts</link><guid isPermaLink="true">https://blog.brianbaldock.net/when-ethical-ai-cites-ghosts</guid><category><![CDATA[Ethical AI]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Fri, 19 Sep 2025 04:44:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1758256349368/55a87850-3625-4118-917f-4b46dd324278.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you want a real-world example of why AI hallucinations and human overreliance matter, check out the story about an education “ethical AI” report with more than 15 fake sources. Ouch. That is exactly what recent research has been warning us about: when people co-create with AI, plausible nonsense can slip in, and our brains are a little too happy to let it slide.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🔗</div>
<div data-node-type="callout-text"><a target="_self" href="https://arstechnica.com/ai/2025/09/education-report-calling-for-ethical-ai-use-contains-over-15-fake-sources/">Ars Technica - Education report calling for ethical AI use contains over 15 fake sources</a></div>
</div>

<h2 id="heading-the-research-minus-the-lab-coat">The research, minus the lab coat</h2>
<p>An IBM team studied how people use large language models to write content that is supposed to be tied to real documents. Picture a simple setup: folks are asked to answer questions using a source doc, and an AI offers “helpful” suggestions along the way. Sometimes the AI’s answer is faithful to the source, sometimes it quietly makes stuff up. They also tried a few speed bumps to slow people down and make them think first:</p>
<ul>
<li><p><strong>Answer first</strong>; write your own response before seeing the AI.</p>
</li>
<li><p><strong>Read first</strong>; skim the source doc before you start.</p>
</li>
<li><p><strong>Highlight</strong>; show which parts of the AI’s answer actually line up with the source.</p>
</li>
</ul>
<p>What shook out matches what most of us have seen in the wild:</p>
<ul>
<li><p><strong>When the AI invents facts, quality drops.</strong> No surprise; if the suggestion is off, the end result gets worse.</p>
</li>
<li><p><strong>People still lean on bad AI, often by “topping up” their own answer with AI text.</strong> I’ve done this, you’ve done this: you write something correct, then paste in a smart-sounding line that quietly disagrees with you. Now you have a confident, wrong paragraph.</p>
</li>
<li><p><strong>Those speed bumps help mindset, but they are not magic.</strong> They nudge people to think, they do not erase bad inputs.</p>
</li>
<li><p><strong>Three simple checks matter most: faithfulness to the source, factual accuracy, and completeness.</strong> If your response fails any one of those, it should not ship.</p>
</li>
</ul>
<p><strong>Bottom line</strong>: when the model makes things up, humans can make it worse by trusting or blending the output. The Ars Technica piece is the public version of that lab result.</p>
<h2 id="heading-the-public-faceplant">The public face‑plant</h2>
<p>That education report called for “ethical AI”, yet the bibliography had ghosts: more than a dozen citations that do not exist. Eighteen months of work, still shipped with phantom sources. That is the “paste the smart sentence and keep moving” problem, just at report scale. Nobody stopped to ask the only question that matters: “where did this claim come from, and can I <strong>open</strong> it?”</p>
<h2 id="heading-why-we-fall-for-it">Why we fall for it</h2>
<p>A few human things get in the way:</p>
<ul>
<li><p><strong>Anchoring, and saving brain cycles.</strong> If you see the AI answer first, it <em>frames your thinking</em>. Even if you write first, you may paste in a clever line to save time.</p>
</li>
<li><p><strong>Plausibility beats provenance.</strong> Smooth, on-topic text feels accurate at a glance. Without a hard source check, fluent hallucinations slide through. And, <strong>“I used less AI on this one”</strong> is not the same as, <strong>“I verified the facts.”</strong></p>
</li>
</ul>
<h2 id="heading-a-practical-playbook-so-ethical-ai-stops-citing-ghosts">A practical playbook so “ethical AI” stops citing ghosts</h2>
<p>Given that report, everything around it, and the research, it’s time for a playbook. Here are some ideas:</p>
<ol>
<li><p><strong>No source, no ship.</strong> If a claim has no link or citation you can actually open and verify, it does not publish. Use three dials to review: faithfulness, accuracy, completeness. Fail one, it fails all.</p>
</li>
<li><p><strong>Answer first, compare second, justify third.</strong> Capture a human draft before <strong>any</strong> AI suggestion, then show the AI response side-by-side <strong>with</strong> citation checks. Add one required sentence: “what changed after seeing AI, and why.” Not every org can build this workflow today, but even a lightweight version in Word or Copilot helps (maybe an Agent?).</p>
</li>
<li><p><strong>Automate the boring stuff.</strong></p>
<ul>
<li><p>Resolve every reference; flag dead links and junky journals (a minimal link-checker sketch follows this playbook).</p>
</li>
<li><p>Run a basic overlap check between claims and sources; even a highlight-style view helps reviewers see what is actually grounded.</p>
</li>
</ul>
</li>
<li><p><strong>Block on grounding when confidence is low.</strong> If the system cannot point to real evidence, kick it to a human. Do not allow publish.</p>
</li>
<li><p><strong>Catch the “append the AI” bug.</strong> Lint for contradictions between the human draft and pasted AI text. If they disagree, stop and reconcile.</p>
</li>
<li><p><strong>Person-in-the-Loop for real.</strong> Make escalation a main path, not a side door. If a source cannot be verified, a human decides: fix it or cut it. Log that decision.</p>
</li>
<li><p><strong>Spot-check after publish.</strong> Sample a few items weekly against the same rubric. Celebrate “caught in review” saves, not just volume shipped.</p>
</li>
</ol>
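<p>For the reference-resolution piece, even a ten-line script beats nothing. Here is a minimal PowerShell sketch that walks a plain-text list of citation URLs and flags the dead ones; the file name is illustrative:</p>
<pre><code class="lang-powershell"># Minimal sketch: resolve every reference URL and flag dead links.
# Assumes citations.txt holds one URL per line (file name is illustrative).
foreach ($url in (Get-Content .\citations.txt)) {
    try {
        $resp = Invoke-WebRequest -Uri $url -Method Head -TimeoutSec 15 -ErrorAction Stop
        Write-Output "OK   $($resp.StatusCode)  $url"
    }
    catch {
        Write-Warning "DEAD  $url  ($($_.Exception.Message))"
    }
}
</code></pre>
<p>It won’t catch junky journals or hallucinated-but-plausible papers, but it kills the “cannot even open it” class of ghost citations in one pass.</p>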
<h2 id="heading-bring-it-home">Bring it home</h2>
<p>The IBM study shows how easily hallucinations and overreliance can drag quality down in AI-assisted writing. The “ethical AI” report shows the reputational blast radius when that slips into production. If we want AI to help us move faster without burning trust, we need hard gates for provenance, simple checks humans can actually use, and workflows that make the right thing the easy thing. Otherwise we will keep shipping polished documents that cite ghosts 👻.</p>
<h3 id="heading-references">References</h3>
<ul>
<li><p>Ashktorab, Desmond, Pan, Johnson, Brachman, Dugan, Danilevsky, Geyer. <em>Emerging Reliance Behaviors in Human‑AI Content Grounded Data Generation: The Role of Cognitive Forcing Functions and Hallucinations</em>. CHIWORK ’25. Key findings summarized above; see <em>Figure 5</em>, <em>Table 1</em>, <em>Table 4</em>, and <em>Table 5</em> for the results I reference. <a target="_blank" href="https://arxiv.org/pdf/2409.08937v2">https://arxiv.org/pdf/2409.08937v2</a></p>
</li>
<li><p>Edwards, Ars Technica. <em>Education report calling for ethical AI use contains over 15 fake sources</em>. Summary and context for the Newfoundland and Labrador report. <a target="_blank" href="https://arstechnica.com/ai/2025/09/education-report-calling-for-ethical-ai-use-contains-over-15-fake-sources/">https://arstechnica.com/ai/2025/09/education-report-calling-for-ethical-ai-use-contains-over-15-fake-sources/</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Bite-sized Microsoft Entra ID Feature Configuration Videos]]></title><description><![CDATA[Looking for quick, bite-sized videos on Microsoft Entra ID?
Whether you’re setting up Internet Access, enabling Private Access, or exploring the benefits of Token Protection, we’ve got you covered.
FastTrack partnered with Product Marketing, GTM, WWL...]]></description><link>https://blog.brianbaldock.net/bite-sized-entra-id</link><guid isPermaLink="true">https://blog.brianbaldock.net/bite-sized-entra-id</guid><category><![CDATA[entra features]]></category><category><![CDATA[security service edge]]></category><category><![CDATA[Entra ID]]></category><category><![CDATA[microsoft-entra-id]]></category><category><![CDATA[identity-management]]></category><category><![CDATA[pim]]></category><category><![CDATA[privileged-identity-management]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Mon, 15 Sep 2025 00:42:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1757896830632/2578cafd-979d-44b4-9daa-33448d4923d6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-looking-for-quick-bite-sized-videos-on-microsoft-entra-id">Looking for quick, bite-sized videos on Microsoft Entra ID?</h2>
<p>Whether you’re setting up Internet Access, enabling Private Access, or exploring the benefits of Token Protection, we’ve got you covered.</p>
<p>FastTrack partnered with Product Marketing, GTM, WWL Studios, and Microsoft Learn to build a library of short deployment videos that show you how to get Entra features running quickly.</p>
<p>Each video is hosted on the <a target="_blank" href="https://m365accelerator.microsoft.com/">M365Accelerator</a> site, a hub for FastTrack resources, playbooks, and tools to help IT pros move from planning to deployment with confidence. Bookmark it as your go-to resource for accelerating Microsoft 365 adoption.</p>
<p><a target="_blank" href="https://m365accelerator.microsoft.com/videos">With these videos</a>, you can go from learning to deploying in minutes.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Video Title</strong></td><td><strong>Description</strong></td><td><strong>Link</strong></td><td><strong>Presenter</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Deploy Private Access</strong></td><td>Step-by-step demo of deploying Microsoft Entra Private Access, including connector setup, traffic forwarding, app publishing, Quick Access, per-app segments, and testing with the Global Secure Access client. Secures access with conditional access policies.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/deploy-private-access">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/charles-lewis-66b28861/">Charles Lewis - Principal Engineer, Identity &amp; Network Access</a></td></tr>
<tr>
<td><strong>Deploy Internet Access</strong></td><td>Walkthrough of deploying Entra Internet Access. Covers GSA client, traffic profiles, content filtering, security profiles, and conditional access to protect users and devices from internet threats.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/deploy-internet-access">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/charles-lewis-66b28861/">Charles Lewis - Principal Engineer, Identity &amp; Network Access</a></td></tr>
<tr>
<td><strong>Deploy Internet Access for Microsoft Services</strong></td><td>How to secure Microsoft 365 traffic with Entra Internet Access. Includes traffic profiles, adaptive access, source IP restoration, and conditional access for compliant network checks.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/deploy-internet-access-for-microsoft-services">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/charles-lewis-66b28861/">Charles Lewis - Principal Engineer, Identity &amp; Network Access</a></td></tr>
<tr>
<td><strong>Deploy Conditional Access Policies</strong></td><td>Rapid deployment of conditional access policies using templates and the Advanced Deployment Guide. Shows best practice scenarios, report-only mode, and analytics for Zero Trust enforcement.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/deploy-conditional-access-policies">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/charles-lewis-66b28861/">Charles Lewis - Principal Engineer, Identity &amp; Network Access</a></td></tr>
<tr>
<td><strong>Migrate from AD FS to Entra ID</strong></td><td>Guidance on moving from AD FS to Entra ID. Covers planning, application inventory, staged rollout, decommissioning AD FS, and benefits of modern authentication.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/migrate-from-ad-fs-to-entra-id">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/josegluna/">Jose G. Luna, CISSP - Escalation Engineer @ Microsoft | Identity Access Management &amp; Security</a></td></tr>
<tr>
<td><strong>Copilot in Entra: Guided Walkthrough</strong></td><td>Demo of Microsoft Security Copilot for Entra. Shows how AI assists with risky sign-ins, audit logs, lifecycle workflows, and enterprise app risk analysis.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/copilot-in-entra-guided-walkthrough">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/milena-todorovi%C4%87-1ba32910/">Milena Todorović - FastTrack Identity &amp; Access Management Subject Matter Expert at Microsoft</a></td></tr>
<tr>
<td><strong>Automate ID Governance with Lifecycle Workflows</strong></td><td>How to automate onboarding, transitions, and offboarding with Entra Identity Governance. Includes task setup, notifications, and workflow reviews.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/automate-id-governance-with-lifecycle-workflows">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/malakmoussa/">Malak (Mickey) Moussa - Microsoft Senior Technical Support Engineer | Identity &amp; Defender Expert</a></td></tr>
<tr>
<td><strong>Configure Hybrid Identity with Entra Connect</strong></td><td>Intro to Entra Connect. Covers install, configuration, sync features, Cloud Sync comparison, and hybrid identity best practices.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/configure-hybrid-identity-with-entra-connect">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/malakmoussa/">Malak (Mickey) Moussa - Microsoft Senior Technical Support Engineer | Identity &amp; Defender Expert</a></td></tr>
<tr>
<td><strong>Enable ID Protection with Risk-based Conditional Access</strong></td><td>Enable user and sign-in risk policies with conditional access. Includes secure password reset and automated remediation for high-risk users.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/enable-id-protection-with-risk-based-conditional-access">Link</a></td><td>Sunesh E. Surendran</td></tr>
<tr>
<td><strong>Configure B2B Collaboration</strong></td><td>How to invite and manage external users with Entra B2B. Covers invitation methods, identity providers, conditional access, and guest experiences.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/configure-b2b-collaboration">Link</a></td><td>Sunesh E. Surendran</td></tr>
<tr>
<td><strong>Configure Multitenant Organization Capabilities</strong></td><td>Setup of multi-tenant organizations using Cross-Tenant Sync. Includes trust setup, user sync, Teams integration, and shared apps.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/configure-multitenant-organization-capabilities">Link</a></td><td>Sunesh E. Surendran</td></tr>
<tr>
<td><strong>Enable Multifactor Authentication (MFA) and Self-Service Password Reset (SSPR)</strong></td><td>How to configure MFA and self-service password reset. Includes policy setup, migration to the new portal, and adoption tools for a modern authentication experience.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/enable-multi-factor-authentication-and-self-service-password-reset">Link</a></td><td>Scotty Pucket</td></tr>
<tr>
<td><strong>Deploy Entitlement Management</strong></td><td>Automating access to groups, apps, and sites with Entra Entitlement Management. Includes catalogs, packages, workflows, and expirations.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/deploy-entitlement-management">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/beatricefarcas/">Beatrice Farcas - Identity Subject Matter Expert at Microsoft</a></td></tr>
<tr>
<td><strong>Configure Privileged Identity Management (PIM)</strong></td><td>Manage privileged roles with Entra PIM. Includes proof of concept setup, role assignment, approval workflows, and just-in-time permissions.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/configure-privileged-identity-management">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/ahmed-habib-1603a3161/">Ahmed Habib - Identity and Access Management SME at Microsoft</a></td></tr>
<tr>
<td><strong>Automate Access Governance with Access Reviews</strong></td><td>How to create and manage access reviews for groups, apps, and privileged roles. Includes scope setup, recurrence, and automated enforcement.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/automate-access-governance-with-access-reviews">Link</a></td><td>Cindy Wang</td></tr>
<tr>
<td><strong>Benefits of Token Protection</strong></td><td>Demo of Entra Token Protection to block token replay and bind tokens to devices. Includes configuration steps and best practices.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/benefits-of-token-protection">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/bbaldock/overlay/about-this-profile/">Brian Baldock - Senior Program Manager @ Microsoft</a></td></tr>
<tr>
<td><strong>ID Governance Hero Scenarios</strong></td><td>Full onboarding scenario with Entra governance. Includes access packages, lifecycle workflows, license assignment, and Teams setup.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/id-governance-hero-scenarios">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/malakmoussa/">Malak (Mickey) Moussa - Microsoft Senior Technical Support Engineer | Identity &amp; Defender Expert</a></td></tr>
<tr>
<td><strong>Internet Access Deep Dive</strong></td><td>Deep dive into Entra Internet Access architecture and deployment. Includes TLS inspection, content filtering, custom policies, and roadmap features.</td><td><a target="_blank" href="https://m365accelerator.microsoft.com/videos/internet-access-deep-dive">Link</a></td><td><a target="_blank" href="https://www.linkedin.com/in/gayanrandeny/">Gayan Randeny - Senior Program Manager at Microsoft</a></td></tr>
</tbody>
</table>
</div><p>All of these and more are available on the <a target="_blank" href="https://m365accelerator.microsoft.com/">M365Accelerator</a> website. Start exploring today and deploy with confidence.</p>
]]></content:encoded></item><item><title><![CDATA[Admin Guide: Controlling Copilot in Viva Engage]]></title><description><![CDATA[As a Senior Program Manager at Microsoft, one of the things I love about my job (and tinkering in my lab) is spotting gaps where automation can make life easier. This month’s gap is, managing access policies for Copilot and AI-Powered Summarization i...]]></description><link>https://blog.brianbaldock.net/admin-guide-controlling-copilot-in-viva-engage</link><guid isPermaLink="true">https://blog.brianbaldock.net/admin-guide-controlling-copilot-in-viva-engage</guid><category><![CDATA[M365 Copilot]]></category><category><![CDATA[Microsoft Viva Engage]]></category><category><![CDATA[copilot]]></category><category><![CDATA[viva engage]]></category><category><![CDATA[Microsoft Viva ]]></category><category><![CDATA[Powershell]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Thu, 04 Sep 2025 15:40:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1757000277415/f6ff9ec8-e722-40bf-a526-ad24b2f36eae.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a Senior Program Manager at Microsoft, one of the things I love about my job (and tinkering in my lab) is spotting gaps where automation can make life easier. This month’s gap is, <strong>managing access policies for Copilot and AI-Powered Summarization in Viva Engage</strong>.</p>
<h2 id="heading-why-i-built-a-mini-powershell-module">Why I built a mini-PowerShell Module</h2>
<p>The <a target="_blank" href="https://learn.microsoft.com/en-us/viva/manage-access-policies">official documentation</a> explains that you can have more than one access policy and the most <strong>restrictive</strong> one <strong>always wins</strong>. To clarify:</p>
<ul>
<li><p><strong>Direct user or group assignments take precedence over org-wide.</strong></p>
</li>
<li><p>If a user lands in multiple policies, <strong><em>Disabled</em></strong> beats <em>Enabled</em>.</p>
</li>
</ul>
<p>Given that the only way to create or manage these policies is through the cmdlets called out in the documentation, I wanted to create some light automation that handles this in an idempotent way (create or update if needed, no duplicates, safe re-runs) and stays silent by default (use -Verbose to see the noise).</p>
<h2 id="heading-the-module">The module</h2>
<p>Repo is here: <a target="_blank" href="https://github.com/brianbaldock/Set-CopilotForEngage">Set-CopilotForEngage</a></p>
<p>The core cmdlet is <code>Set-EngageFeatureAccess</code>. It does the heavy lifting:</p>
<ul>
<li><p>Ensures the Exchange Online module is installed/updated.</p>
</li>
<li><p>Connects to Exchange Online if needed.</p>
</li>
<li><p>Resolves the Viva Engage feature IDs.</p>
</li>
<li><p>Creates or updates policies for Copilot and/or AI Summarization.</p>
</li>
<li><p>Handles scope (-Everyone, -GroupIds, -UserIds).</p>
</li>
<li><p>Supports user controls (-UserOptInByDefault).</p>
</li>
</ul>
<h2 id="heading-recommended-approach">Recommended approach</h2>
<h3 id="heading-keep-it-simple">Keep it simple:</h3>
<ul>
<li><p>Create an org-wide disabled policy for each feature (baseline).</p>
</li>
<li><p>Create targeted enable policies scoped to security groups.</p>
</li>
<li><p>Don’t bother with “disable groups” unless you want carve-outs from a permissive baseline.</p>
</li>
</ul>
<h3 id="heading-that-way">That way:</h3>
<ul>
<li><p>Default is deny (no one gets access unless explicitly added).</p>
</li>
<li><p>Membership in an Enable group grants access.</p>
</li>
<li><p>Most restrictive still applies, but you’re not juggling extra disable layers.</p>
</li>
</ul>
<h2 id="heading-quick-start">Quick start</h2>
<blockquote>
<p>Requires: PowerShell 5.1+, <code>ExchangeOnlineManagement</code> 3.9.0+ (the module helper can auto‑install/update with switches; see step 2a)</p>
</blockquote>
<pre><code class="lang-powershell"><span class="hljs-comment"># 1) Dot source the mini module</span>
. .\<span class="hljs-built_in">Set-CopilotForEngage</span>.ps1

<span class="hljs-comment"># 2a) Baseline: Use this to disable both features org-wide and install/update the to the latest version of Exchange Online Managment PowerShell Module</span>
<span class="hljs-built_in">Set-EngageFeatureAccess</span> <span class="hljs-literal">-Mode</span> Disable <span class="hljs-literal">-Copilot</span> <span class="hljs-literal">-AISummarization</span> <span class="hljs-literal">-Everyone</span> <span class="hljs-literal">-PolicyNamePrefix</span> <span class="hljs-string">"All"</span> <span class="hljs-literal">-AutoInstallEXO</span> <span class="hljs-literal">-AutoUpdateEXO</span> <span class="hljs-literal">-Confirm</span>:<span class="hljs-variable">$false</span> <span class="hljs-literal">-Verbose</span>

<span class="hljs-comment"># 2b) Baseline: Disable both features org‑wide </span>
<span class="hljs-built_in">Set-EngageFeatureAccess</span> <span class="hljs-literal">-Mode</span> Disable <span class="hljs-literal">-Copilot</span> <span class="hljs-literal">-AISummarization</span> `
  <span class="hljs-literal">-Everyone</span> <span class="hljs-literal">-PolicyNamePrefix</span> <span class="hljs-string">"All"</span> <span class="hljs-literal">-Confirm</span>:<span class="hljs-variable">$false</span> <span class="hljs-literal">-Verbose</span>

<span class="hljs-comment"># 2) Enable Copilot for one or more groups</span>
<span class="hljs-built_in">Set-EngageFeatureAccess</span> <span class="hljs-literal">-Mode</span> Enable <span class="hljs-literal">-Copilot</span> `
  <span class="hljs-literal">-GroupIds</span> <span class="hljs-string">"GROUP GUID HERE"</span> `
  <span class="hljs-literal">-PolicyNamePrefix</span> <span class="hljs-string">"Enable"</span> <span class="hljs-literal">-Confirm</span>:<span class="hljs-variable">$false</span> <span class="hljs-literal">-Verbose</span>

<span class="hljs-comment"># 3) Enable AI Summarization for one or more groups</span>
<span class="hljs-built_in">Set-EngageFeatureAccess</span> <span class="hljs-literal">-Mode</span> Enable <span class="hljs-literal">-AISummarization</span> `
  <span class="hljs-literal">-GroupIds</span> <span class="hljs-string">"GROUP GUID HERE"</span> `
  <span class="hljs-literal">-PolicyNamePrefix</span> <span class="hljs-string">"Enable"</span> <span class="hljs-literal">-Confirm</span>:<span class="hljs-variable">$false</span> <span class="hljs-literal">-Verbose</span>

<span class="hljs-comment"># 4) Verify policy layout</span>
<span class="hljs-built_in">Get-VivaModuleFeaturePolicy</span> <span class="hljs-literal">-ModuleId</span> VivaEngage
</code></pre>
<h2 id="heading-what-good-looks-like"><strong>What “good” looks like</strong></h2>
<p>The view below shows <strong>two org‑wide block policies</strong> (one per feature) and <strong>two group‑targeted enable policies</strong>. With this layout, anyone <strong>not</strong> in an enable group stays blocked; group members are enabled.</p>
<p><img src="https://github.com/brianbaldock/Set-CopilotForEngage/raw/main/Images/Policy%20Layout.png" alt="Policy layout screenshot" /></p>
<p>So that’s a short blog article for September: a tight little module that solves a real admin pain point and makes policy management repeatable and predictable.</p>
<p>Code’s up on GitHub if you want to try it: <a target="_blank" href="https://github.com/brianbaldock/Set-CopilotForEngage">Set-CopilotForEngage</a></p>
]]></content:encoded></item><item><title><![CDATA[Wipe the line, raise the score]]></title><description><![CDATA[I used to work in restaurants and we had a saying, “there is always something to clean.” If you had a spare minute, you were cleaning the line, sweeping, mopping etc. Not glamorous work; absolutely essential. Security is the same. There is always som...]]></description><link>https://blog.brianbaldock.net/securescore</link><guid isPermaLink="true">https://blog.brianbaldock.net/securescore</guid><category><![CDATA[Microsoft Secure Score]]></category><category><![CDATA[security kitchen]]></category><category><![CDATA[zero-trust]]></category><category><![CDATA[zero trust security]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Mon, 11 Aug 2025 21:17:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754946776140/5980c958-fc4f-4d08-bd6e-6131a250ae2f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I used to work in restaurants and we had a saying, “<strong>there is always something to clean.</strong>” If you had a spare minute, you were cleaning the line, sweeping, mopping etc. Not glamorous work; absolutely essential. Security is the same. There is always something to improve; otherwise entropy wins. Microsoft Secure Score gives you the list. Your job is building a habit of using it.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💬</div>
<div data-node-type="callout-text"><em>There is </em><strong><em>always</em></strong><em> something to clean.</em></div>
</div>

<h2 id="heading-what-secure-score-actually-is">What Secure Score actually is</h2>
<p>Secure Score pulls together recommended security improvements across Microsoft 365 into one place with a point-based system. Each recommendation tells you what to do, why it matters, and how many points you’ll earn by completing it. Think of it as a prioritized sanitation checklist for identity, devices, apps, and data.</p>
<p>The number of points isn’t the goal. The real value is in the visible breadcrumb trail that proves you’re making continuous improvements. It’s your “<em>cleaning the line.</em>”</p>
<p>Check it out here (<em>you will need the right role to view it)</em>: <a target="_blank" href="https://security.microsoft.com/exposure-secure-score?viewid=overview">Microsoft Secure Score</a></p>
<h2 id="heading-turn-secure-score-into-a-living-backlog">Turn Secure Score into a living backlog</h2>
<p>Treat every Secure Score recommendation like a task in your security backlog.</p>
<ul>
<li><p>Pull items weekly into your work queue (a Graph-based sketch follows this list).</p>
</li>
<li><p>Filter them by category: Identity, Devices, Apps, Data</p>
</li>
<li><p>Label them as Quick Clean, Hot Clean, Deep Clean:</p>
<ul>
<li><p><strong>Quick Clean:</strong> low-risk configuration fixes; like wiping down a station on the line during a lull.</p>
</li>
<li><p><strong>Hot Clean:</strong> high impact controls that attackers target first; like sanitizing the cutting boards, washing the hoods.</p>
</li>
<li><p><strong>Deep Clean:</strong> structural changes that require planning; like pulling out appliances to clean behind them.</p>
</li>
</ul>
</li>
<li><p>Use your existing work management tool. If you are Microsoft-heavy, a Planner or Azure Boards view works well so Product, Infra, and SecOps can see the same queue.</p>
</li>
</ul>
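<p>If you want to script the weekly pull instead of copying from the portal, the Secure Score recommendations are exposed through Microsoft Graph. A hedged sketch using the Microsoft.Graph.Security module (property names follow the secureScoreControlProfile resource; verify against current docs):</p>
<pre><code class="lang-powershell"># Hedged sketch: pull Secure Score improvement actions for the weekly triage.
# Assumes the Microsoft.Graph.Security module and SecurityEvents.Read.All consent.
Connect-MgGraph -Scopes "SecurityEvents.Read.All"

Get-MgSecuritySecureScoreControlProfile -All |
    Select-Object Title, ControlCategory, MaxScore |
    Sort-Object MaxScore -Descending |
    Select-Object -First 10   # this week's candidates for the backlog
</code></pre>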
<h2 id="heading-prioritize-like-a-pro">Prioritize like a pro</h2>
<p>Do not chase easy points first. Use a simple triage that fits enterprise reality.</p>
<ul>
<li><p><strong>Impact:</strong> How much real risk it cuts given how attackers operate today. Use exposure management and define critical devices to map attack paths. (<a target="_blank" href="https://security.microsoft.com/attack-paths">Microsoft Security Exposure Management (MSEM) Attack Paths</a>)</p>
</li>
<li><p><strong>Effort:</strong> People hours, complexity, approvals.</p>
</li>
<li><p><strong>Dependencies:</strong> Does something else need to happen first?</p>
</li>
<li><p><strong>Blast radius:</strong> Who could be affected if it goes wrong.</p>
</li>
<li><p><strong>Regulatory tie-in:</strong> Does it map to controls that auditors will ask about.</p>
</li>
</ul>
<p>Score each on a 1 to 3 scale. Hot Clean items with high impact and low to medium effort go to the top.</p>
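<p>To make the rubric concrete, here is a tiny PowerShell sketch of the tally; item names and scores are illustrative, and 3 always means “favorable” (high impact, low effort, few dependencies, small blast radius, strong regulatory tie-in):</p>
<pre><code class="lang-powershell"># Illustrative sketch: tally the 1-3 rubric and sort the backlog.
$items = @(
    @{ Name = 'Require phishing-resistant MFA for admins'; Impact = 3; Effort = 2; Dependencies = 3; BlastRadius = 2; Regulatory = 3 }
    @{ Name = 'Disable legacy auth';                       Impact = 3; Effort = 2; Dependencies = 2; BlastRadius = 2; Regulatory = 2 }
)
$items | ForEach-Object {
    [pscustomobject]@{
        Name  = $_.Name
        Score = $_.Impact + $_.Effort + $_.Dependencies + $_.BlastRadius + $_.Regulatory
    }
} | Sort-Object Score -Descending
</code></pre>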
<p>Examples that usually pay off early:</p>
<ul>
<li><p>Require phishing-resistant MFA for admins; enforce conditional access for high risk sign-ins.</p>
</li>
<li><p>Disable legacy auth where feasible.</p>
</li>
<li><p>Safe Links and Safe Attachments policies.</p>
</li>
<li><p>Attack surface reduction rules in audit, then enforce.</p>
</li>
<li><p>Device compliance policies with a minimum OS and encryption.</p>
</li>
</ul>
<h2 id="heading-assign-owners-set-a-cadence-show-progress">Assign owners; set a cadence; show progress</h2>
<p>This is where teams succeed or stall.</p>
<ul>
<li><p><strong>Owner per control:</strong> Security writes the “why”, the platform owner runs the change, and the helpdesk knows the support path.</p>
</li>
<li><p><strong>Weekly working sessions:</strong> 30 to 45 minutes. Review last week’s moves, pull two to five new items, unblock dependencies.</p>
</li>
<li><p><strong>Monthly showcase:</strong> One slide that tells a story. Current score, what changed, what risk you removed, and what is next.</p>
</li>
</ul>
<p>Tie it to culture. Using Microsoft’s cultural attributes as an example (<a target="_blank" href="https://careers.microsoft.com/v2/global/en/culture">Microsoft cultural attributes</a>): Growth Mindset means you’re iterating; Customer Obsession means you protect users while uplifting controls; One Microsoft means SecOps, Identity, Endpoint and Applications ship together.</p>
<h2 id="heading-automate-the-easy-wins">Automate the easy wins</h2>
<p>Use policy and configuration at scale so hygiene stays done.</p>
<ul>
<li><p>Where possible use baseline policies. Conditional access templates in Entra ID. Security Baselines in Intune.</p>
</li>
<li><p>Configuration as code for repeatability where possible; even simple scripts tracked in a repo beat one-off portal clicks.</p>
</li>
<li><p>Alerting for drift. If a control falls out of compliance, it reopens in your backlog automatically (a small drift-check sketch follows this list).</p>
</li>
<li><p>Document exceptions with an owner and an expiry date. Exceptions without an end date become permanent debt.</p>
</li>
</ul>
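<p>Drift alerting does not need to start fancy. Here is a hedged sketch that compares the tenant’s latest Secure Score against a locally stored baseline (cmdlet from Microsoft.Graph.Security; the baseline file name is illustrative):</p>
<pre><code class="lang-powershell"># Hedged sketch: warn when the Secure Score drops below the last recorded baseline.
$latest   = Get-MgSecuritySecureScore -Top 1 | Select-Object -First 1
$baseline = Get-Content .\securescore-baseline.txt -ErrorAction SilentlyContinue

if ($baseline -and [double]$latest.CurrentScore -lt [double]$baseline) {
    Write-Warning "Secure Score dropped from $baseline to $($latest.CurrentScore) - reopen the backlog items."
}

$latest.CurrentScore | Set-Content .\securescore-baseline.txt
</code></pre>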
<h2 id="heading-report-upwards-with-context-not-just-a-number">Report upwards with context, not just a number</h2>
<p>Executives want to know if the business is safer this month than last.</p>
<ul>
<li><p>Show trend: last quarter, last month, today.</p>
</li>
<li><p>Translate actions to attack friction. Use wording like: <em>“Blocking legacy auth removed password spray success paths for X% of accounts.”</em></p>
</li>
<li><p>Map to frameworks and programs: <a target="_blank" href="https://learn.microsoft.com/en-us/security/zero-trust/zero-trust-overview">What is Zero Trust?</a> | <a target="_blank" href="https://microsoft.github.io/zerotrustassessment/">Zero Trust Assessment &amp; Workshop</a></p>
</li>
<li><p>Keep a small “narrative log” of notable changes and why you chose them.</p>
</li>
</ul>
<h2 id="heading-common-pitfalls">Common pitfalls</h2>
<ul>
<li><p><strong>Point chasing:</strong> Hitting a target score without reducing meaningful risk.</p>
</li>
<li><p><strong>Big-bang changes:</strong> Shipping ten identity policies at once and flooding the helpdesk.</p>
</li>
<li><p><strong>No rollback plan:</strong> Always know how to revert a control safely.</p>
</li>
<li><p><strong>Unowned exceptions:</strong> If everyone owns it, no one owns it.</p>
</li>
<li><p><strong>Stale backlog:</strong> If recommendations sit untouched for 90 days, re-evaluate or explicitly accept the risk (at this point you already have).</p>
</li>
</ul>
<h2 id="heading-a-simple-microsoft-secure-score-30-60-90-starter-plan">A simple Microsoft Secure Score 30-60-90 starter plan</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754946032422/58b1c956-cacf-4b62-95cf-bb85bd9c5f4e.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-days-1-30-set-the-system">Days 1-30 “Set the system”</h2>
<ul>
<li><p>Stand up one visible backlog for Secure Score in Planner or Azure Boards; tag items as Quick Clean, Hot Clean, or Deep Clean.</p>
</li>
<li><p>Define your triage rubric: Impact, Effort, Dependencies, Blast radius, Regulatory tie-in; simple scoring.</p>
</li>
<li><p>Lock the cadence: a weekly 30 to 45 minute working session with clear entry and exit criteria; one owner per item with a rollback checklist.</p>
</li>
<li><p>Baseline metrics and rules: score by pillar, average days to close, exception policy with owner and expiry.</p>
</li>
</ul>
<h2 id="heading-days-31-60-run-the-loop-make-it-boring">Days 31-60 “Run the loop, make it boring”</h2>
<ul>
<li><p>Work the cadence; move a steady batch from proposed to validated to done.</p>
</li>
<li><p>Apply the rubric before pulling anything; no point chasing, document user impact and the helpdesk path.</p>
</li>
<li><p>Add drift checks and auto-reopen on deviation; track rollbacks and lessons learned.</p>
</li>
<li><p>Publish a monthly showcase with trend, risks reduced, and next moves; close or explicitly accept items that sit for 90 days.</p>
</li>
</ul>
<h2 id="heading-days-61-90-scale-it-and-age-it">Days 61-90 “Scale it and age it”</h2>
<ul>
<li><p>Break Deep Clean items into small, testable steps; add a simple RACI so accountability is clear.</p>
</li>
<li><p>Map changes to Zero Trust pillars and Secure Future Initiative outcomes; include this view in the showcase.</p>
</li>
<li><p>Formalize exception reviews; auto-notify owners 2 weeks before expiry, renew or remove with justification.</p>
</li>
<li><p>Ship v1.0 of the playbook: intake, triage, change, validate, rollback, report; set quarterly objectives for the backlog.</p>
</li>
</ul>
<h2 id="heading-closing">Closing</h2>
<p>In the kitchen, you never stand still because there is always something to clean. In security, there is always something to <strong>improve</strong>. Secure Score is the list on the cooler door. Work it daily, celebrate small wins, and keep moving!</p>
]]></content:encoded></item><item><title><![CDATA[Recipe: Designing Identity for Agentic AI]]></title><description><![CDATA[The first time I watched an AI agent chain multiple tool calls and hit customer data faster than I could finish a coffee, I realized something: these aren’t apps anymore. They’re users. They do real work, they touch real data, and if we don't manage ...]]></description><link>https://blog.brianbaldock.net/recipe-designing-identity-for-agentic-ai</link><guid isPermaLink="true">https://blog.brianbaldock.net/recipe-designing-identity-for-agentic-ai</guid><category><![CDATA[Identity for agentic ai]]></category><category><![CDATA[agentic ai identity]]></category><category><![CDATA[sql mcp]]></category><category><![CDATA[Entra ID]]></category><category><![CDATA[agentic ai development]]></category><category><![CDATA[mcp server]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Tue, 22 Jul 2025 03:56:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1753156772068/4727f395-2a39-43f2-98c9-67b20492b3cb.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The first time I watched an AI agent chain multiple tool calls and hit customer data faster than I could finish a coffee, I realized something: these aren’t apps anymore. They’re users. They do real work, they touch real data, and if we don't manage their identity properly, we’re in for a mess of ghost accounts, hardcoded secrets, and zero accountability.</p>
<p>This blog is for security architects and identity engineers trying to build scalable agentic systems that don’t compromise your entire environment, especially if your agents need to talk to APIs, legacy databases, and each other.</p>
<h2 id="heading-why-agent-identity-is-a-different-beast">Why agent identity is a different beast</h2>
<p>Agentic workloads don’t follow the old patterns. They:</p>
<ul>
<li><p>Spin up and disappear, or maybe not.</p>
</li>
<li><p>Call services in parallel.</p>
</li>
<li><p>Act on behalf of users (or other agents).</p>
</li>
<li><p>Break assumptions about static infrastructure.</p>
</li>
</ul>
<p>We can’t just treat them like apps or users. We need first-class identity that’s dynamic, scoped, and trackable. Microsoft Entra Agent ID helps, but architecture matters more than tooling.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753122976704/23532119-5bbf-42a9-bb25-7faa8ceef04e.png" alt="Diagram comparing &quot;Human Identity&quot; and &quot;Agent Identity.&quot; Human Identity features include logins and passwords, long-lived accounts, independent actions, and static nature. Agent Identity features include token-based authentication, short-lived instances, autonomous actions, and dynamic nature. There are simple icons of a person and a robot for each category." class="image--center mx-auto" /></p>
<h2 id="heading-the-core-protocols-and-what-they-expect-from-identity">The core protocols (and what they expect from identity)</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Protocol</strong></td><td><strong>Description</strong></td></tr>
</thead>
<tbody>
<tr>
<td>MCP (Model Context Protocol)</td><td>For calling tools and fetching data. Assumes OAuth2.</td></tr>
<tr>
<td>A2A (Agent-to-Agent)</td><td>Agents delegate scoped authority to each other using signed Agent Cards.</td></tr>
<tr>
<td>ACP (Agent Communication Protocol)</td><td>Structured messaging with optional DID support. Trust and authorization must be verified.</td></tr>
</tbody>
</table>
</div><p>All of them assume identity is solved; our job is to make that true.</p>
<h1 id="heading-the-agent-identity-architecture-recipe">The Agent Identity Architecture Recipe</h1>
<p>Here’s a simple truth: you can’t bolt on identity after you’ve built an agentic system. It needs to be part of the <strong>foundation</strong>. I threw together a recipe you can use to secure agents in real-world environments.</p>
<ol>
<li><p><strong>Give every agent an identity</strong></p>
<ul>
<li><p>Agents shouldn’t be invisible, they should show up in your Entra ID tenant like any other principal.</p>
</li>
<li><p>Entra Agent ID or app registrations, no shared accounts. No mystery agents.</p>
</li>
</ul>
</li>
<li><p><strong>Use modern auth only</strong></p>
<ul>
<li><p>Use the OAuth2 client credentials flow or managed identity for agents calling APIs (a token-request sketch follows the callout below).</p>
</li>
<li><p>Prefer federated identity for agents outside Azure (GitHub, K8s, etc.).</p>
</li>
<li><p>Support on-behalf of flows when an agent acts on behalf of a user.</p>
</li>
</ul>
</li>
<li><p><strong>Wrap legacy systems with identity-aware proxies</strong></p>
</li>
</ol>
<ul>
<li><p>Put an <strong>API layer</strong> or <strong>proxy</strong> in front of legacy tools.</p>
</li>
<li><p>Let the agent call the API with a token; the API handles legacy auth using credentials from a vault.</p>
</li>
</ul>
<div data-node-type="callout">
<div data-node-type="callout-emoji">📢</div>
<div data-node-type="callout-text"><strong><em>Bonus:</em></strong><em> this also gives you a place to enforce RBAC and log activity.</em></div>
</div>
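<p>To ground step 2 (and the vaulting in step 4 below), here is a hedged PowerShell sketch of the client credentials flow with the secret pulled from Azure Key Vault at call time. Tenant, client, and vault names are placeholders; in Azure proper you would prefer a managed identity and skip the secret entirely:</p>
<pre><code class="lang-powershell"># Hedged sketch: OAuth2 client credentials flow for an agent identity.
# Assumes Az.KeyVault is installed; all GUIDs and names below are placeholders.
$tenantId = "&lt;TENANT-GUID&gt;"

$body = @{
    client_id     = "&lt;AGENT-CLIENT-ID&gt;"
    client_secret = Get-AzKeyVaultSecret -VaultName "&lt;VAULT-NAME&gt;" -Name "agent-secret" -AsPlainText
    scope         = "https://graph.microsoft.com/.default"   # scope to the API the agent calls
    grant_type    = "client_credentials"
}

$token = Invoke-RestMethod -Method Post -Body $body `
    -Uri "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token"

$token.access_token   # short-lived; use it, never persist it
</code></pre>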

<ol start="4">
<li><p><strong>Vault every secret</strong></p>
<ul>
<li><p>Use <strong>Azure Key Vault</strong> (or HashiCorp Vault) to store DB passwords, tokens, and certs.</p>
</li>
<li><p>Grant agents minimal read access scoped to what they need.</p>
</li>
<li><p>Rotate secrets on a schedule. Better: make them ephemeral where possible.</p>
</li>
</ul>
</li>
<li><p><strong>Enforce Zero Trust</strong></p>
<ul>
<li><p>Assign <strong>scoped roles</strong> and use <strong>Conditional Access</strong> for workload identities.</p>
</li>
<li><p>Leverage <strong>Agent Cards</strong> or scoped JWTs for delegation (especially with A2A).</p>
</li>
<li><p>Never assume an agent is “safe” just because it’s internal. Validate everything.</p>
</li>
</ul>
</li>
<li><p><strong>Monitor and decommission</strong></p>
<ul>
<li><p>Use <strong>access reviews</strong> and <strong>audit logs</strong> to track usage.</p>
</li>
<li><p>Deprovision stale agents. Revoke access. Rotate keys.</p>
</li>
<li><p>Feed everything into Sentinel or your SIEM so you can catch weird behavior early.</p>
</li>
</ul>
</li>
</ol>
<h2 id="heading-quick-scan-checklist">Quick-scan checklist</h2>
<ul>
<li><p><strong>◻️ Every agent has a unique identity in Entra ID</strong> (Agent ID / service principal)</p>
</li>
<li><p><strong>◻️ No raw secrets in code or memory</strong>—everything is pulled from Key Vault</p>
</li>
<li><p><strong>◻️ Service-to-service auth</strong> uses OAuth 2.0, mTLS, or federated identity</p>
</li>
<li><p><strong>◻️ Scoped permissions only</strong>—never more than what’s needed</p>
</li>
<li><p><strong>◻️ Legacy systems</strong> are abstracted behind modern identity-aware proxies</p>
</li>
<li><p><strong>◻️ Conditional Access &amp; role policies</strong> apply to agents like any other principal</p>
</li>
<li><p><strong>◻️ Secrets are rotated regularly</strong> (or eliminated via managed identity)</p>
</li>
<li><p><strong>◻️ Full audit trail</strong> from user → agent → resource</p>
</li>
<li><p><strong>◻️ Monitoring</strong> for abnormal behaviour or usage patterns</p>
</li>
<li><p><strong>◻️ Lifecycle automation</strong>—agents are created, monitored, and decommissioned cleanly</p>
</li>
</ul>
<h2 id="heading-example-end-to-end-identity-flow-from-user-to-sql-via-mcp">Example: End-to-end Identity Flow from user to SQL via MCP</h2>
<p>Let’s try and make this real. Using the previous recipe, let’s outline what it would look like for a user to trigger an action via an agent, which then queries SQL Server via an MCP server. The MCP server could just be used as a wrapper for another API in this case.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753153939385/dff0c1c7-3f46-4baf-8d79-2fdebf4aa268.png" alt class="image--center mx-auto" /></p>
<p>Here’s a matrix view of the different flows:</p>
<div class="hn-table">
<table><tbody><tr><td><p></p></td><td><p><strong>On-Behalf-Of (user-impersonation)</strong></p></td><td><p><strong>Service-credential (app-identity)</strong></p></td></tr><tr><td><p><strong>Purpose</strong></p></td><td><p>Preserve the human’s identity end-to-end for fine-grained auditing</p></td><td><p>Let the MCP act as a trusted service when user context isn’t required (or the back-end can’t handle it)</p></td></tr><tr><td><p><strong>Credentials MCP receives from the agent</strong></p></td><td><p>Delegated Entra ID access token containing scopes like sql.read, sql.write</p></td><td><p>Same delegated token (used only for authorization)</p></td></tr><tr><td><p><strong>What MCP does first</strong></p></td><td><p>Validates JWT, checks the required scope</p></td><td><p>Same validation and scope check</p></td></tr><tr><td><p><strong>How MCP logs into SQL</strong></p></td><td><p>Backend API called by MCP uses a Windows / gMSA account to do Kerberos S4U2Self and gets a TGT for the user - Trades it via S4U2Proxy for a service ticket to MSSQLSvc…</p></td><td><p>Opens Key Vault with managed identity - Pulls either a DB password or requests an Entra ID token scoped to Azure SQL</p></td></tr><tr><td><p><strong>Identity SQL sees</strong></p></td><td><p>The user’s UPN/SID - looks exactly like they logged into the DB directly</p></td><td><p>MCP or API Service Principal (Windows login, managed identity or Entra ID App)</p></td></tr><tr><td><p><strong>Secret material traces</strong></p></td><td><p>Only a short-lived local Kerberos ticket</p></td><td><p>Either a short-lived DB access token or a vault-fetched password that you can auto-rotate.</p></td></tr><tr><td><p><strong>Audit Trail</strong></p></td><td><p>Entra Sign-in - MCP Logs (User OID + request ID) - DC Kerberos Events - SQL audit rows for the user, row level security</p></td><td><p>Entra ID sign-in - MCP Logs - Key Vault “Get Secret” audit - SQL audit rows for the service</p></td></tr><tr><td><p><strong>Typical Uses</strong></p></td><td><p>Reporting tools, data explorers, anything where row level security or per-user quotas matter</p></td><td><p>ETL jobs, schema migrations, bulk admin, legacy DBs that can’t accept Kerberos delegation</p></td></tr></tbody></table>
</div>

<h2 id="heading-wrap-up">Wrap up</h2>
<p>Agentic AI isn’t the future, it’s already live in your environment. I guarantee someone in your org is trying to build an agent right now. The question is: do you know what it’s doing?</p>
<p>The good news is you don’t need to start over. You just need to start being deliberate.</p>
<p>Treat your agents like non-human teammates. Secure them with the same Zero Trust mindset, enforce guardrails like you would for any user, and apply governance from day one. If you’re already using Microsoft Entra ID, you’re not far off. You’ve got the foundation, now it’s just about wiring it up the right way.</p>
<p>Start small. Register your agents. Vault your secrets. Lock down what they can do.</p>
<p>Because when that next agent spins up and hits your data layer, you want to know exactly who kicked it off, what it accessed, and why.</p>
<h2 id="heading-references">References</h2>
<ul>
<li><p><strong>Microsoft Tech Community</strong> – <a target="_blank" href="https://techcommunity.microsoft.com/t5/microsoft-entra-azure-ad-blog/announcing-microsoft-entra-agent-id-secure-and-manage-your-ai/ba-p/4157518">Announcing Microsoft Entra Agent ID</a></p>
</li>
<li><p><strong>Model Context Protocol</strong> – <a target="_blank" href="https://modelcontextprotocol.io/introduction">Anthropic MCP Overview</a></p>
</li>
<li><p><strong>Agent Card &amp; A2A Protocol</strong> – <a target="_blank" href="https://a2a-protocol.org/latest/specification/">Agent-to-Agent</a></p>
</li>
<li><p><strong>Azure Identity Platform Docs</strong> – <a target="_blank" href="https://learn.microsoft.com/en-us/entra/workload-id/workload-identity-federation">Workload identity federation</a></p>
</li>
<li><p><strong>Azure Key Vault</strong> – <a target="_blank" href="https://learn.microsoft.com/en-us/azure/key-vault/general/best-practices">Best practices for secrets management</a></p>
</li>
<li><p><strong>Britive Blog</strong> – <a target="_blank" href="https://www.britive.com/resource/blog/agentic-ai-redefining-identity-security-cloud">Agentic AI Is Redefining Identity Security in the Cloud</a></p>
</li>
<li><p><strong>Techolution Report</strong> – <a target="_blank" href="https://www.techolution.com/blog/how-legacy-systems-are-quietly-sabotaging-agentic-ai-across-enterprises/">How Legacy Systems Sabotage Agentic AI</a></p>
</li>
<li><p><a target="_blank" href="https://www.microsoft.com/en-us/security/blog/2023/12/19/how-strata-identity-and-microsoft-entra-id-solve-identity-challenges-in-mergers-and-acquisitions/?msockid=05efbb07d8d16f5a36a1ad2bd9076e99"><strong>How Strata Identity and Microsoft Entra ID solve identity challenges in mergers and acquisitions</strong></a></p>
</li>
<li><p><a target="_blank" href="https://www.strata.io/blog/agentic-identity/why-agentic-ai-demands-more-from-oauth-6a/"><strong>The Identity Problem at AI Scale: Why Agentic AI Demands More From OAuth</strong></a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[How breaking in made me a better defender]]></title><description><![CDATA[Why I started pen testing

📖
TL;DR: After years as a defender, switching to offensive security taught me crucial lessons: attackers set the tempo, simplicity often trumps complexity, and thinking sideways reveals hidden gaps. Every defender benefits...]]></description><link>https://blog.brianbaldock.net/a-better-defender</link><guid isPermaLink="true">https://blog.brianbaldock.net/a-better-defender</guid><category><![CDATA[#cybersecurity]]></category><category><![CDATA[pentesting]]></category><category><![CDATA[hacking]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Mon, 07 Jul 2025 21:37:39 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1751920353253/527e2932-9ed8-4b60-9a91-7e23e6fc9102.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-why-i-started-pen-testing">Why I started pen testing</h2>
<div data-node-type="callout">
<div data-node-type="callout-emoji">📖</div>
<div data-node-type="callout-text"><strong>TL;DR:</strong> After years as a defender, switching to offensive security taught me crucial lessons: attackers set the tempo, simplicity often trumps complexity, and thinking sideways reveals hidden gaps. Every defender benefits from thinking (and safely acting) like an attacker.</div>
</div>

<p>I’ve spent over a decade helping customers harden environments, deploy secure solutions, and chase down alerts. But it wasn’t until I started trying to break into systems myself that I really began to understand how attackers think, and honestly, it gave me a headache.</p>
<p>I’m not trying to be some elite super hacker. At best, I’m a proud script kiddie still learning the ropes. The real pros out there see the world sideways; they think in crooked lines and find gaps I wouldn’t have considered. It’s like that classic Abraham Wald story: the places without bullet holes are what really bring the plane down. That analogy fits security a little too well.</p>
<p>I’ve been using my homelab as a safe playground to simulate attacks and sharpen my thinking. There’s something humbling about exploiting your own environment and realizing how many assumptions you’ve made as a defender. I’m also a huge fan of the <a target="_blank" href="https://hackthebox.com">Hack The Box</a> platform. Shoutout to those folks for keeping it real and <strong>challenging</strong>.</p>
<p>So, while I’m still early in my offensive journey, I’ve already picked up a few lessons that completely changed how I think about defense. Here are some of the biggest ones so far.</p>
<h2 id="heading-lesson-1-that-moment-i-got-my-first-reverse-shell">Lesson 1: That moment I got my first reverse shell</h2>
<p>The first time I landed a reverse shell honestly blew my mind. Before that moment, I'd spent my entire career locking down systems, hardening environments, and chasing security alerts. Flipping roles and becoming the attacker (even ethically in a controlled setting) completely changed my perspective.</p>
<p>I still remember it clearly. I set up my listener, launched a PowerShell payload on the target, and almost instantly saw a new prompt appear. Just like that, I was inside the Windows machine.</p>
<p>Seeing the command prompt switch and reveal the target’s hostname was eye-opening. It felt like discovering a hidden world beneath a familiar surface, similar to the first time you realize there's an entire community dedicated to geocaching right under your nose. I'd spent years strengthening barriers without fully appreciating the creativity attackers use to bypass them.</p>
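<p>For the curious, the listener side of that exchange is nothing exotic. Here’s a minimal sketch of the kind of setup I used, strictly in an isolated lab (the port number is arbitrary):</p>
<pre><code class="lang-bash"># On the attacking machine: wait for the payload to call back.
# -l listen, -v verbose, -n skip DNS lookups, -p local port
nc -lvnp 4444
</code></pre>
<p>Once the payload on the target connects back, that terminal becomes a remote prompt on the victim - which is exactly the moment described above.</p>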
<p>Two big insights came from that experience:</p>
<ol>
<li><p>Attackers always have the initiative. Defenders usually react, plugging gaps as threats appear. Tools like <a target="_blank" href="https://learn.microsoft.com/en-us/security-exposure-management/microsoft-security-exposure-management">Microsoft Security Exposure Management</a> help shift defenders to a more proactive stance, but attackers still control the tempo. Recognizing this fundamentally changed my approach to security.</p>
</li>
<li><p>Simple techniques remain highly effective. Before getting hands-on with ethical hacking, I thought attackers mainly relied on sophisticated exploits that grab headlines. While advanced threats exist, basic attacks (like phishing emails that deliver a reverse shell payload) are incredibly common. Solid foundational security measures are as crucial as ever in 2025.</p>
</li>
</ol>
<p>Breaking into my first system, even ethically, taught me more in a few hours than months of defensive and sysadmin work ever did. To defend effectively, you have to deeply understand how attackers think, operate, and exploit vulnerabilities.</p>
<h2 id="heading-lesson-2-overlooking-the-obvious">Lesson 2: Overlooking the obvious</h2>
<p>Early on in my offensive journey, one moment stands out clearly. I was deep into a Hack The Box challenge, convinced there had to be some complex exploit hidden beneath the surface. Hours later, I realized I'd completely overlooked checking basic file permissions.</p>
<p>When I finally looped back, the vulnerability was painfully obvious—misconfigured permissions on a sensitive directory handed me immediate access. Oops.</p>
<p>Trust your process, cover the basics thoroughly, and <strong>don’t assume</strong> complexity. Skipping the obvious steps only <strong>costs you time</strong>.</p>
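<p>If you want a concrete starting point, this is roughly the basic permission sweep I now run first in a lab before hunting for anything clever (standard GNU find flags; adjust the paths to your target):</p>
<pre><code class="lang-bash"># World-writable files and directories: access nobody meant to grant
find / -xdev -perm -0002 \( -type f -o -type d \) 2&gt;/dev/null

# SUID binaries: another "obvious" check that is easy to skip
find / -xdev -type f -perm -4000 2&gt;/dev/null
</code></pre>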
<h2 id="heading-lesson-3-thinking-sideways-attackers-dont-play-by-your-rules">Lesson 3: Thinking sideways (Attackers don’t play by your rules)</h2>
<p>One of my favorite "aha" moments came from a box that initially seemed impossible. I was convinced the vulnerability had to be web-based because everything pointed in that direction. After exhausting every possible web exploit, I took a break, frustrated. It turned out the entry point wasn't web at all; it was an outdated, seemingly harmless print service that I'd dismissed early on.</p>
<p>That pivot taught me something critical: attackers don’t care how things “should” work. They exploit how they “actually” work. If defenders only look at what's typical or documented, we leave blind spots wide open.</p>
<p>So, forget your assumptions. Security is about expecting the unexpected. To be effective, defenders must learn to look at their environments with fresh eyes (and think sideways) just like the attackers do.</p>
<h2 id="heading-bonus-what-i-changed-in-my-own-environment">Bonus: What I changed in my own environment</h2>
<p>After spending time trying to break into my own systems, I quickly saw where my blind spots were, and immediately got to work plugging them. Here are some practical changes I've made to strengthen my own security posture:</p>
<ul>
<li><p><strong>Firmware Updates</strong>: Let's be honest, keeping firmware up to date isn't fun or flashy, but outdated firmware can be a goldmine for attackers. Now, I make it a routine to regularly check and update firmware on everything from routers and switches to IP cameras and printers.</p>
</li>
<li><p><strong>Isolating IoT Devices</strong>: IoT devices are notoriously sketchy when it comes to security. My smart home gear is now isolated on its own VLAN, keeping these potentially vulnerable devices separated from my main network.</p>
</li>
<li><p><strong>VLAN Segmentation</strong>: Beyond IoT, I’ve segmented my network into different VLANs, creating clear boundaries between sensitive data, regular browsing, and guest access. It might feel a little paranoid at first, but compartmentalizing access drastically reduces lateral movement opportunities for attackers.</p>
</li>
<li><p><strong>Zero Trust Mindset</strong>: Probably the biggest mindset shift has been embracing Zero Trust principles. Instead of assuming internal traffic is safe, I now operate as if every device and user is potentially compromised. It takes discipline, but it's worth it.</p>
</li>
</ul>
<p>Taking these steps not only improved my security but gave me greater peace of mind. Nothing sharpens your defensive instincts quite like actively testing your own limits.</p>
<h2 id="heading-final-thoughts-why-every-defender-should-break-in-safely">Final Thoughts: Why every defender should break in (safely)</h2>
<p>If you've spent your career on the defensive side like I did, switching perspectives can feel intimidating, but it's essential. Understanding attackers requires thinking like one. And there's no substitute for hands-on experience.</p>
<p>By safely and ethically exploring offensive security techniques, you will:</p>
<ul>
<li><p>Gain practical insights that theory alone can't teach.</p>
</li>
<li><p>Recognize weak spots in your environment that you might have overlooked.</p>
</li>
<li><p>Develop empathy and respect for the creativity and persistence attackers bring to their craft.</p>
</li>
</ul>
<p>Even basic, controlled exercises (like those available on platforms such as <a target="_blank" href="https://www.hackthebox.com/">Hack The Box</a> or <a target="_blank" href="https://tryhackme.com/">TryHackMe</a>) can profoundly change your approach. You don’t need to become a world-class pen tester, but spending some time on the other side can sharpen your defensive instincts and make your security efforts more effective.</p>
<p><strong>Every defender benefits from knowing how to break in safely. So give it a try, your environment (and your skills) will thank you.</strong></p>
]]></content:encoded></item><item><title><![CDATA[MFA Beyond Push Notifications]]></title><description><![CDATA[Yes, yet another blog article by me about multifactor authentication. In this one I want to focus on the different MFA methods and call out their differences and give you some ammunition to argue moving beyond standard push notifications! I’m going t...]]></description><link>https://blog.brianbaldock.net/mfa-beyond-push-notifications</link><guid isPermaLink="true">https://blog.brianbaldock.net/mfa-beyond-push-notifications</guid><category><![CDATA[MFA]]></category><category><![CDATA[microsoft authenticator app]]></category><category><![CDATA[Entra ID]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Wed, 09 Apr 2025 00:42:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1744159129762/a47b3075-566f-4cf0-912a-e1d9e95298b5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Yes, yet another blog article by me about multifactor authentication. In this one I want to focus on the different MFA methods, call out their differences, and give you some ammunition to argue for moving beyond standard push notifications! I’ll cover the following topics:</p>
<ul>
<li><p>Where basic push-notification MFA falls short.</p>
</li>
<li><p>Why improvements like number matching help, but don’t fully solve the problem.</p>
</li>
<li><p>Which methods are actually “phishing-resistant”.</p>
</li>
<li><p>How Microsoft Entra MFA ties into device ‘trust’, and why “hybrid join” doesn’t automatically mean “secure device”.</p>
</li>
</ul>
<h2 id="heading-the-problem-with-push-mfa">The problem with push MFA</h2>
<h3 id="heading-the-problem-with-push-mfa-1"><strong>The Problem with Push MFA</strong></h3>
<p>Push-based MFA is that nice, convenient <em>method</em> where you get a notification on your phone: “Was this you signing in?” You tap “Approve” or “Deny,” and you’re in. It uses services like <strong>Firebase Cloud Messaging</strong> (Android) and <strong>APNs</strong> (iOS) to pop up that request in your authenticator app.</p>
<p>People tend to like this method because it’s fast and easy. No fuss.</p>
<p>Attackers also tend to like this method because they’ll just keep spamming you at all hours of the night until you tap “Approve” and they’re in. Or, they’ll do a little social engineering: “Hey, I’m so-and-so from IT, can you accept the prompt you’re getting please?” Let’s face it, when you’re in a hurry, you might just tap <strong>Approve</strong> by mistake or out of pure annoyance.</p>
<p><strong>Real-World Breach Examples</strong></p>
<ul>
<li><p><strong>Cisco (2022)</strong>: Attackers hammered an employee’s phone with MFA prompts and impersonated IT on a phone call. Eventually, the user caved and hit <em>Approve</em>. <em>Boom!</em> Network compromised. [Reference: <a target="_blank" href="https://blog.talosintelligence.com/recent-cyber-attack/">https://blog.talosintelligence.com/recent-cyber-attack/</a>]</p>
</li>
<li><p><strong>Uber (2022)</strong>: Same deal. An attacker texted the user, spammed them with prompts, eventually wearing them down. [Reference: <a target="_blank" href="https://www.wired.com/story/uber-hack-mfa-phishing/">https://www.wired.com/story/uber-hack-mfa-phishing/</a>]</p>
</li>
</ul>
<p>Push MFA is convenient, but way too reliant on <em>human judgment.</em> If someone can just pester or trick you into tapping <em>Approve</em>, they win.</p>
<h2 id="heading-number-matching-a-partial-fix">Number matching - A partial fix</h2>
<p>Number matching closes part of that gap: instead of a blind “Approve” button, the sign-in screen displays a two-digit number that the user has to type into the authenticator app before the approval goes through. That kills mindless tap-to-approve spam, but a crafty phisher can still call up the user, do a bit of social engineering, and say something like, “Hey, please approve the following request, the confirmation code is 56.” Basically, <em>number matching</em> reduces accidental approvals but still can’t stop a determined social engineer.</p>
<h3 id="heading-phishing-resistant-mfa-fido2-passkeys-and-certificates"><strong>Phishing-Resistant MFA</strong>: FIDO2, Passkeys, and Certificates</h3>
<p>A truly phishing-resistant MFA method doesn’t rely on the user handing over a code or tapping a random approval prompt. Instead, it uses cryptographic tricks bound to the <em>real</em> site or service, so a fake site can’t fool it. Let’s check out the big kids in the neighborhood:</p>
<ul>
<li><p><strong>FIDO2 Security Keys</strong>: USB or NFC “keys” that do public-key cryptography with each site. They’re awesome because they will <em>not</em> authenticate you with the wrong domain. So even if a user stumbles onto a phishing page, the key sees it’s not the real site and refuses to sign.</p>
<ul>
<li><em>SMB Reality Check:</em> Physical keys cost money and require some admin overhead. But many businesses find them worth it, especially for high-risk accounts. Employees can also use built-in authenticators on devices (like Windows Hello for Business) to skip buying physical keys.</li>
</ul>
</li>
<li><p><strong>Passkeys</strong>: These wrap all the goodness of FIDO2 but sync across devices, so it’s more consumer-friendly. Microsoft, Google, Apple, and Amazon are jumping on the bandwagon. Adoption is climbing fast—people are tired of dealing with passwords. There <em>is</em> a bit of friction in setup, but once it’s running, people notice the convenience: no password, just a quick biometric or PIN.</p>
</li>
<li><p><strong>Certificate-Based Authentication</strong>: The user or device has a certificate that never leaves the system. The server checks it via a cryptographic challenge. This is phishing-resistant because you can’t phish someone into reading off a certificate. It’s all behind the scenes—no one-time codes to intercept. The downside? PKI can be complicated. Usually larger enterprises or government organizations go this route, but it’s an option for smaller businesses who need an ironclad approach (and can handle the overhead).</p>
</li>
</ul>
<h2 id="heading-but-wait-you-need-more-than-just-mfa">But wait, you need more than just MFA</h2>
<p>Defense in depth is the best way to approach authentication. A layered approach helps you secure your environment, which is why phishing-resistant MFA is so important, and why the compliant-device signal matters just as much. Let’s explore a scenario:</p>
<ul>
<li><p>An attacker using a proxy phishes a user and provides a link to what appears to the user as the M365 login page.</p>
</li>
<li><p>The user performs the login and gets the MFA prompt. Whether they’re using push MFA, TOTP, or number matching, the proxied page relays the same prompt and the user enters the details.</p>
<ul>
<li>Meanwhile the attacker has grabbed a copy of the token.</li>
</ul>
</li>
<li><p>The user is redirected to the actual M365 login page or passed through to M365.</p>
</li>
<li><p>Now the attacker takes the token and replays it on their device, allowing them to log in as the user.</p>
</li>
</ul>
<p>How do we stop this? Simple: add the <em>Require Compliant Device</em> checkbox to the Conditional Access policy, or at least the <em>Require Hybrid Joined Device</em> checkbox. This doesn’t allow any access to your tenant resources if the device is not explicitly enrolled with your tenant. Maybe the attacker can capture the user’s token, but they can’t do anything with it unless they own <em>one of your devices</em>, which means they’re already in your network, which brings on a whole other set of problems you’d hopefully detect.</p>
<ul>
<li><p><strong>Hybrid Azure AD Joined</strong>: Means a Windows device is both on-prem domain-joined <em>and</em> registered with Azure AD. Great for single sign-on and identity, but doesn’t guarantee the device is secure—it’s more of a static “corporate device” check.</p>
</li>
<li><p><strong>Compliant Device (MDM)</strong>: Means the device is enrolled in Intune (or another MDM) and meets your security rules (encryption, OS updates, no jailbreak, etc.). Intune continuously verifies compliance via a certificate-based attestation.</p>
</li>
</ul>
<p><strong>Key Point</strong>: <em>Hybrid join</em> alone won’t catch a laptop that’s loaded with malware or missing patches. <em>Compliance</em> via MDM is the stronger signal because it’s an active, policy-based check.</p>
<p><strong>Recommendation</strong>: For sensitive access, don’t rely solely on <em>hybrid join.</em> Combine it with “require compliant device” so you know the machine is actually up to policy.</p>
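<p>To make that actionable, here’s a rough sketch of creating such a policy through Microsoft Graph. This assumes you already hold a Graph access token with the <em>Policy.ReadWrite.ConditionalAccess</em> permission (the <code>$TOKEN</code> variable below is a placeholder), and it deliberately starts in report-only mode so you can scope and test before enforcing anything:</p>
<pre><code class="lang-bash"># Create a report-only Conditional Access policy requiring a
# compliant or hybrid joined device for all users and apps.
curl -X POST "https://graph.microsoft.com/v1.0/identity/conditionalAccess/policies" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "displayName": "Require compliant or hybrid joined device",
    "state": "enabledForReportingButNotEnforced",
    "conditions": {
      "users": { "includeUsers": ["All"] },
      "applications": { "includeApplications": ["All"] }
    },
    "grantControls": {
      "operator": "OR",
      "builtInControls": ["compliantDevice", "domainJoinedDevice"]
    }
  }'
</code></pre>
<p>Flip <code>state</code> to <code>enabled</code> only after reviewing the report-only sign-in logs, and remember to exclude a break-glass account.</p>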
<h2 id="heading-recommendations">Recommendations</h2>
<ol>
<li><p><strong>Don’t over-rely on Push MFA</strong></p>
<p> It’s convenient, but social engineering and phishing can break it. Number matching, which is enabled by default for all Microsoft Authenticator users, reduces accidental approvals, but realize that it’s not foolproof.</p>
</li>
<li><p><strong>Use phishing-resistant MFA where it matters most</strong></p>
<p> Admins, executives, and anyone with access to critical data should have FIDO2 or passkeys. Even if you roll it out gradually, it’s worth the peace of mind.</p>
</li>
<li><p><strong>Consider going passwordless</strong></p>
<p> Tools like Microsoft Authenticator phone sign-in or Windows Hello for Business are leaps ahead of memorized passwords. You’ll cut out password-based phishes entirely. Just remember that phone sign-in still relies on user caution; FIDO2 is more bulletproof.</p>
</li>
<li><p><strong>Embrace “Compliant Device”</strong></p>
<p> Require that devices are enrolled and meet your security policies. That way, even if an attacker steals someone’s creds, they can’t just log in from any random device.</p>
</li>
<li><p><strong>Learn from the breaches</strong></p>
<p> Set up user education to prevent “push fatigue” approvals, and get rid of weaker MFA factors (SMS, voice calls) as soon as you can.</p>
</li>
<li><p><strong>Keep watching the future</strong></p>
<p> Passkeys are on a fast upward trend, and user acceptance is growing. It might feel daunting to jump on new tech right away, but these days it’s often less painful than you’d think. The payoff: better security <em>and</em> fewer password headaches.</p>
</li>
</ol>
<p>Phishing-resistant MFA isn’t just for Fortune 500s or government agencies anymore. If you’re running a business, these modern authentication methods could be the difference between a small incident and a massive breach. I hope this rundown helps you navigate the next steps in your MFA journey. Have questions or want to share your own experience? Let me know in the comments.</p>
]]></content:encoded></item><item><title><![CDATA[Making IT simpler]]></title><description><![CDATA[If you’re an IT admin or business owner looking to streamline your Microsoft 365 onboarding and configuration, you need to see this!
This tool is a game-changer for setting up secure and efficient environments in no time. Whether you’re new to M365 o...]]></description><link>https://blog.brianbaldock.net/making-it-simpler-1</link><guid isPermaLink="true">https://blog.brianbaldock.net/making-it-simpler-1</guid><category><![CDATA[setup expert]]></category><category><![CDATA[M365 deployment]]></category><category><![CDATA[deploying m365]]></category><category><![CDATA[Microsoft 365]]></category><category><![CDATA[#ai-tools]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Thu, 13 Mar 2025 13:28:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1742585339446/01f61037-a96d-4c7e-aa0e-fb7c4b9c7466.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<iframe src="https://www.linkedin.com/embed/feed/update/urn:li:ugcPost:7286153079555567616?compact=1" height="650" width="750"></iframe>

<p>If you’re an IT admin or business owner looking to streamline your Microsoft 365 onboarding and configuration, you need to see this!</p>
<p>This tool is a game-changer for setting up secure and efficient environments in no time. Whether you’re new to M365 or a seasoned pro, the setup expert AI is an epic resource for streamlining your deployments.</p>
<p>Check out the video to see how it works and let me know what you think!</p>
]]></content:encoded></item><item><title><![CDATA[Introducing Leave A Gripe]]></title><description><![CDATA[📢
Ever had something you just needed to say—no filters, no accounts, no algorithms deciding what gets seen? That’s why I built LeaveA.Gripe.


It’s a simple concept: an infinite pinup board for text-only messages. No replies, no moderation, no track...]]></description><link>https://blog.brianbaldock.net/leaveagripe</link><guid isPermaLink="true">https://blog.brianbaldock.net/leaveagripe</guid><category><![CDATA[gripes]]></category><category><![CDATA[griping]]></category><category><![CDATA[leave a gripe]]></category><category><![CDATA[Pinup]]></category><category><![CDATA[social media]]></category><category><![CDATA[anonymity]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Thu, 13 Mar 2025 02:22:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741832108587/c15bcbf2-e5a0-48cb-9ede-ef3b878a53b6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div data-node-type="callout">
<div data-node-type="callout-emoji">📢</div>
<div data-node-type="callout-text">Ever had something you just needed to say—no filters, no accounts, no algorithms deciding what gets seen? That’s why I built <strong>LeaveA.Gripe</strong>.</div>
</div>

<p>It’s a simple concept: an <strong>infinite pinup board</strong> for text-only messages. No replies, no moderation, no tracking; just pure, unfiltered expression, the way I remember the internet of the late ’90s and early ’00s. Drop a note anywhere on the board, and it stays. Every user starts in a random location, and if you want to post, the app will nudge you to the nearest open space. That’s it. No noise, no engagement metrics, just a raw stream of whatever people feel like saying.</p>
<p>Why? Because every platform these days tries to shape what we see. LeaveA.Gripe doesn’t. There are no hidden rules, no boosted posts, no content policing. It’s the <strong>wild west of anonymous text</strong>, where people can say whatever they want, and it just exists.</p>
<p>It’s live now at <a target="_blank" href="https://leavea.gripe">LeaveA.Gripe</a>. Check it out, leave your mark, and let the gripes flow!</p>
]]></content:encoded></item><item><title><![CDATA[Disconnected Environments Revisited]]></title><description><![CDATA[Back in 2023, I wrote about deploying Microsoft Defender for Endpoint (MDE) (Link) in disconnected environments, covering why proxies were necessary and how to make them work. Fast forward to 2025, and the core message hasn't changed: Defender for En...]]></description><link>https://blog.brianbaldock.net/mde-proxies-2025</link><guid isPermaLink="true">https://blog.brianbaldock.net/mde-proxies-2025</guid><category><![CDATA[Defender for Endpoint]]></category><category><![CDATA[MDE]]></category><category><![CDATA[proxies]]></category><category><![CDATA[disconnected networks]]></category><category><![CDATA[DefenderPortal]]></category><category><![CDATA[xdr]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Mon, 10 Mar 2025 21:32:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1741642159655/bee62bcd-4da8-4ea2-af8b-b0dd14f462ec.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Back in 2023, I wrote about deploying Microsoft Defender for Endpoint (MDE) (<a target="_blank" href="https://aka.ms/mde-proxies">Link</a>) in disconnected environments, covering why proxies were necessary and how to make them work. Fast forward to 2025, and the core message hasn't changed: <strong>Defender for Endpoint is a cloud-powered security solution, and you need to give it a way to reach the cloud</strong> if you want the best protection. The good news? Microsoft has made connectivity easier with <a target="_blank" href="https://learn.microsoft.com/en-us/defender-endpoint/configure-device-connectivity">Streamlined Connectivity</a>, but proxies are still a key tool in getting Defender working in restricted networks. Let’s break it down.</p>
<h2 id="heading-whats-changed">What’s Changed?</h2>
<p>Since 2023, Microsoft has dramatically <strong>reduced the number of URLs</strong> needed for allow-listing, consolidating Defender for Endpoint’s cloud endpoints into a much smaller set. Instead of dealing with a long list of domains, most organizations now only need to allow <code>*.endpoint.security.microsoft.com</code> and a few others. Microsoft also introduced <strong>static IP ranges and Azure service tags</strong>, making firewall configurations much more manageable.</p>
<p>For organizations with disconnected networks, these changes mean fewer headaches when setting up proxy rules. But even with these improvements, <strong>you still need a path to the cloud</strong>—and that’s where proxies remain essential.</p>
<h2 id="heading-why-proxies-still-matter">Why Proxies Still Matter</h2>
<p>Many organizations don’t allow direct internet access from endpoints, especially in high-security environments. A proxy allows MDE to connect to Microsoft’s cloud while maintaining network control. This isn’t a security compromise; it’s a smart way to ensure MDE can leverage AI-driven protection and real-time threat intelligence without opening the floodgates.</p>
<p>To make it work, you need to:</p>
<ul>
<li><p><strong>Use a system-wide proxy configuration</strong> (WinHTTP) so Defender can always communicate, even when no user is logged in.</p>
</li>
<li><p>Allow required Microsoft endpoints <strong><mark>without SSL inspection</mark></strong>; Defender uses <strong>certificate pinning</strong>, and inspecting traffic will break its connection.</p>
</li>
<li><p><strong>Ensure outbound connections don’t require user authentication</strong>, since Defender’s telemetry is sent by the system, not a logged-in user.</p>
</li>
</ul>
<p>With a properly configured proxy, <strong>you get full cloud protection without sacrificing security</strong>.</p>
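<p>For reference, here’s what the system-wide piece looks like on Windows - a minimal sketch using <code>netsh</code>, with placeholder values for the proxy host, port, and bypass list:</p>
<pre><code>:: Set the machine-wide WinHTTP proxy that system services use
netsh winhttp set proxy proxy-server="proxy.corp.example:8080" bypass-list="*.corp.example"

:: Confirm the current WinHTTP proxy configuration
netsh winhttp show proxy
</code></pre>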
<h2 id="heading-airgapped-environments-need-a-different-approach">Airgapped Environments Need a Different Approach</h2>
<p>If your environment is fully <strong>airgapped</strong> (no internet at all), then cloud-based protection just isn’t an option. Defender for Endpoint isn’t designed for fully offline use, and while you can keep Defender Antivirus running with offline signature updates, <strong>you lose out on AI-driven threat detection, EDR, and cloud analytics</strong>.</p>
<p>For true airgap scenarios, your focus should be on <strong>offline update mechanisms</strong> (WSUS, Configuration Manager) and <strong>strict network segmentation</strong> to prevent lateral movement. But if there’s even a <strong>tiny</strong> opportunity to establish controlled, intermittent connectivity (say, syncing telemetry weekly), <strong><mark>it’s worth doing</mark></strong>.</p>
<h2 id="heading-lets-talk-about-trust">Let’s Talk About Trust</h2>
<p>One of the biggest pushbacks I still hear is trust. Some organizations hesitate to open a proxy for Defender’s cloud security, despite <strong>already trusting Microsoft with their emails (Exchange Online), files (SharePoint and OneDrive), and collaboration (Teams)</strong>. If your business-critical data already lives in Microsoft’s cloud, why would you suddenly draw the line at security telemetry?</p>
<p>Defender for Endpoint sends <strong>security signals</strong>, not sensitive business data. It’s about identifying threats, improving detection, and keeping your environment safe. If your security model is still based on “we don’t trust cloud security,” it might be time to rethink that stance.</p>
<h2 id="heading-best-practices-for-2025">Best Practices for 2025</h2>
<p>If you’re operating in a disconnected or hybrid network, here’s what you should be doing:</p>
<ul>
<li><p><strong>Use the new streamlined allow-list</strong> instead of managing dozens of URLs.</p>
</li>
<li><p><strong>Disable SSL inspection for Defender traffic</strong>; it’ll break functionality.</p>
</li>
<li><p><strong>Use a dedicated proxy configuration</strong> so Defender always has cloud access.</p>
</li>
<li><p><strong>Regularly check connectivity</strong> using the client analyzer tool (see the example after this list).</p>
</li>
<li><p><strong>Educate security teams</strong>; this isn’t about opening everything, it’s about controlled access to a trusted security cloud.</p>
</li>
</ul>
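<p>For that connectivity check, Microsoft’s MDE Client Analyzer is the tool of choice. A minimal sketch of a run on Windows (grab the analyzer package from Microsoft’s documentation and extract it first):</p>
<pre><code>:: From an elevated command prompt, inside the extracted folder:
MDEClientAnalyzer.cmd
</code></pre>
<p>The default run tests connectivity to the required Defender cloud URLs and writes its findings to a local report you can review.</p>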
<h2 id="heading-references"><strong>References:</strong></h2>
<p><a target="_blank" href="https://techcommunity.microsoft.com/blog/microsoftdefenderatpblog/announcing-a-streamlined-device-connectivity-experience-for-microsoft-defender-f/3956236">Announcing a streamlined device connectivity experience for Microsoft Defender for Endpoint</a></p>
<p><a target="_blank" href="https://aka.ms/mde-proxies">Disconnected environments, proxies and Microsoft Defender for Endpoint</a></p>
<p><a target="_blank" href="https://techcommunity.microsoft.com/blog/microsoftdefenderatpblog/defender-for-endpoint-and-disconnected-environments-cloud-centric-networking-dec/3786540">Defender for Endpoint and disconnected environments. Cloud-centric networking decisions</a></p>
]]></content:encoded></item><item><title><![CDATA[Defender for Endpoint & "The Internet"]]></title><description><![CDATA["Defender for Endpoint only works when you're connected to the internet." If I had a dollar for every time I heard that, I'd be set for life. This statement makes security pros wince because it fundamentally misunderstands how modern cloud security o...]]></description><link>https://blog.brianbaldock.net/defender-for-endpoint-and-the-internet</link><guid isPermaLink="true">https://blog.brianbaldock.net/defender-for-endpoint-and-the-internet</guid><category><![CDATA[MDE]]></category><category><![CDATA[The Internet]]></category><category><![CDATA[Air-gapped networks]]></category><category><![CDATA[disconnected networks]]></category><category><![CDATA[Defender for Endpoint]]></category><category><![CDATA[encryption]]></category><category><![CDATA[https]]></category><category><![CDATA[TLS]]></category><category><![CDATA[vpn]]></category><category><![CDATA[airgap]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Fri, 14 Feb 2025 05:57:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739512453321/0c0c31f2-15b0-499b-9460-bad7d5cf329b.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>"Defender for Endpoint only works when you're connected to the internet." If I had a dollar for every time I heard that, I'd be set for life. This statement makes security pros wince because it fundamentally misunderstands how modern cloud security operates.</p>
<p>Here's the reality: just because something requires an internet connection doesn't mean it's exposed to the "open internet." When you set up a tightly controlled, encrypted tunnel between your endpoint and a specific service, you're not throwing data into the wild. You're essentially creating a “VPN”, whether you call it that or not.</p>
<p>With HTTPS/TLS, your data is encrypted, directed to a <strong>single</strong>, <strong>defined endpoint</strong>, and <strong>protected</strong> from interception; just like a VPN tunnel. The difference? One is <strong>application-specific</strong>, and the other typically routes all traffic. This distinction is at the heart of why the "Defender needs the internet" argument is flawed. Defender uses an HTTPS/TLS connection to send telemetry and signals to its cloud service. But before we dive too deep, let's talk about why encryption became the backbone of modern security in the first place.</p>
<h1 id="heading-the-internet-goes-dark"><strong>The Internet "Goes Dark"</strong></h1>
<p>Before 2013, encryption was primarily used for sensitive data like banking transactions and login credentials, while most internet traffic remained unencrypted. The Snowden leaks in 2013 exposed extensive global surveillance and the exploitation of unencrypted data, triggering a shift to an "encrypt everything" mentality.</p>
<p>In 2016, Let's Encrypt revolutionized the scene by offering free, automated SSL/TLS certificates, making HTTPS the norm overnight. Today, encryption is standard for almost all online activities; but it also poses challenges for law enforcement, as robust encryption can hinder traditional surveillance methods and complicate data access during investigations.</p>
<h1 id="heading-the-cultural-shift"><strong>The Cultural Shift</strong></h1>
<p>The industry transitioned from "only encrypt sensitive data" to "encrypt everything by default." Today, whether you're checking email, using a SaaS app, or running a cloud-based AV/EDR like Defender for Endpoint, encryption is the norm. This means that when an endpoint communicates with the cloud, it's through a controlled, encrypted tunnel; not just general web surfing.</p>
<h3 id="heading-key-takeaways-from-the-encryption-revolution"><strong>Key Takeaways from the Encryption Revolution:</strong></h3>
<ul>
<li><p><strong>Pre-2013:</strong> Encryption was limited to specific use cases.</p>
</li>
<li><p><strong>Post-Snowden (2013+):</strong> "Encrypt everything" became the standard (it was a slow process).</p>
</li>
<li><p><strong>Google’s HTTPS Push (2014-2018):</strong> Encouraged <strong>HTTPS</strong> adoption by ranking HTTPS sites higher in search results and marking HTTP sites as insecure in Chrome.</p>
</li>
<li><p><strong>2016 and beyond:</strong> Let's Encrypt removed cost barriers, making HTTPS ubiquitous.</p>
</li>
<li><p><strong>Regulatory Changes (e.g., GDPR, PCI DSS v3.2):</strong> Strengthened encryption requirements for compliance.</p>
</li>
</ul>
<h1 id="heading-the-basics-of-secure-tunnels"><strong>The Basics of Secure Tunnels</strong></h1>
<p>Now, let's break down what a VPN does versus what HTTPS/TLS does.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>What a VPN Typically Does:</strong></td><td><strong>What HTTPS/TLS Typically Does:</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Encrypts all device traffic and routes it through a secure tunnel.</td><td>Encrypts traffic at the application layer (specific services, not all device traffic).</td></tr>
<tr>
<td>Often changes the IP/location of your traffic.</td><td>Does not change the user's IP but still prevents interception.</td></tr>
<tr>
<td>Requires a VPN client for setup.</td><td>Ensures end-to-end security between an application and a specific endpoint.</td></tr>
</tbody>
</table>
</div><p>At their core, both VPNs and HTTPS/TLS tunnel encrypted traffic to a specific destination. The main difference is that a VPN typically works at the network level, whereas HTTPS secures individual services.</p>
<h3 id="heading-how-httpstls-mimics-a-vpn"><strong>How HTTPS/TLS ‘Mimics’ a VPN</strong></h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>HTTPS/TLS (Targeted)</strong></td><td><strong>VPNs (Broad)</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Secures specific traffic (e.g., Defender services communicating with Microsoft's Defender cloud endpoints, not your entire network session).</td><td>Route all traffic through a secure tunnel.</td></tr>
</tbody>
</table>
</div><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739512064750/a4b249d1-d08b-48ec-bc7a-ba83bef372c6.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-pinning-and-allowlisting"><strong>Pinning and Allowlisting</strong></h3>
<ul>
<li><p>Many security teams allowlist only the endpoints that services like Defender for Endpoint require.</p>
</li>
<li><p>If a device can only communicate with Microsoft's secure Defender services, labeling it as "internet-exposed" is misleading (the quick check below shows how to verify where a connection actually terminates).</p>
</li>
</ul>
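<p>A quick way to see this in practice is to inspect exactly which certificate a given endpoint presents. Here’s a sketch using <code>openssl</code> (the hostname is a placeholder for whichever service endpoint you’ve allowlisted):</p>
<pre><code class="lang-bash"># Print the subject and issuer of the certificate an endpoint serves
openssl s_client -connect example.endpoint.security.microsoft.com:443 \
  -servername example.endpoint.security.microsoft.com &lt;/dev/null 2&gt;/dev/null \
  | openssl x509 -noout -subject -issuer
</code></pre>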
<h3 id="heading-equivalent-security"><strong>Equivalent Security</strong></h3>
<ul>
<li>A properly pinned HTTPS connection to a cloud service offers the same level of encryption as a VPN, but at the application level rather than the entire network.</li>
</ul>
<h1 id="heading-addressing-the-ot-concerns"><strong>Addressing the OT concerns</strong></h1>
<h3 id="heading-lets-talk-about-air-gapped"><strong>Let’s talk about “air-gapped”</strong></h3>
<ul>
<li><p>A true air-gapped system has zero external connectivity. These are rare and <strong>not</strong> foolproof.</p>
</li>
<li><p>In reality, most environments have controlled exceptions, such as telemetry, updates, or security tools.</p>
</li>
</ul>
<p><strong>Challenges with Air-Gapped Networks</strong></p>
<ul>
<li><p><strong>Human Factors:</strong> Air-gapped systems may require data transfer via physical media like USB drives. This necessity introduces a vector for malware.</p>
</li>
<li><p><strong>Insider Threats:</strong> Employees or contractors with legitimate access can inadvertently or maliciously introduce malware into the system.</p>
</li>
<li><p><strong>Advanced Persistent Threats (APTs):</strong> Sophisticated adversaries develop methods to breach air-gapped systems, such as exploiting electromagnetic emissions or using compromised hardware.</p>
</li>
<li><p><strong>Maintenance and Updates:</strong> Keeping air-gapped systems updated is challenging, often leading to outdated software that is vulnerable to exploits.</p>
</li>
</ul>
<p>While air-gapped networks add a layer of security, they should not be the sole defense mechanism. Implementing strict access controls, regular security audits, and comprehensive monitoring is essential. Isolation can lead to complacency. Just look at the damage Stuxnet caused.</p>
<h1 id="heading-httpstls-as-a-micro-vpn"><strong>HTTPS/TLS as a "Micro-VPN"</strong></h1>
<p>By pinning HTTPS/TLS traffic to a specific set of endpoints, you create a tunnel just as restricted as a VPN.</p>
<h3 id="heading-benefits-of-a-strict-allowlist"><strong>Benefits of a Strict Allowlist</strong></h3>
<p>If a network only allows outbound HTTPS traffic to a handful of approved domains, then that traffic is as "air-gapped" as possible while still allowing required security functions.</p>
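<p>As a concrete illustration, here’s a minimal sketch of what that kind of allowlist might look like in a Squid proxy configuration. The domain set is trimmed to Defender for Endpoint’s streamlined endpoint for brevity; your actual list should come from Microsoft’s documentation:</p>
<pre><code># Permit CONNECT tunnels only to the Defender streamlined endpoints
acl defender_endpoints dstdomain .endpoint.security.microsoft.com
acl tls_port port 443

http_access allow defender_endpoints tls_port
http_access deny all
</code></pre>
<p>Note what’s absent: no SSL interception. As covered in the pinning discussion above, the proxy has to pass that TLS traffic through untouched.</p>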
<h1 id="heading-final-thoughts"><strong>Final thoughts</strong></h1>
<p>"Connecting to the internet" does not equate to exposing your network. HTTPS/TLS is a secure tunnel, just like a VPN; just scoped differently.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Review your own organization’s ‘air-gap assumptions’ and see if you’re missing out on the benefits of controlled cloud connectivity!</div>
</div>

<h3 id="heading-sources">Sources</h3>
<p><a target="_blank" href="https://www.reuters.com/article/world/five-eyes-security-alliance-calls-for-access-to-encrypted-material-idUSKCN1UP19C/">'Five Eyes' security alliance calls for access to encrypted material</a></p>
<p><a target="_blank" href="https://www.publicsafety.gc.ca/cnt/rsrcs/pblctns/ntnl-cbr-scrt-strtg-2025/index-en.aspx">Canada's National Cyber Security Strategy</a></p>
<p><a target="_blank" href="https://www.politico.com/news/2022/06/29/canada-national-police-spyware-phones-00043092">Canada’s national police force admits use of spyware to hack phones</a></p>
<p><a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/intelligence-reports/microsoft-digital-defense-report-2024?msockid=0704d7dfc247627b2dd4c259c391636f">Microsoft Digital Defense Report 2024</a></p>
<p><a target="_blank" href="https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/">An Unprecedented Look at Stuxnet, the World's First Digital Weapon</a></p>
<p><a target="_blank" href="https://www.gao.gov/blog/what-are-biggest-challenges-federal-cybersecurity-high-risk-update">What are the Biggest Challenges to Federal Cybersecurity? (High Risk Update)</a></p>
<p><a target="_blank" href="https://www.gao.gov/products/gao-21-288">High-Risk Series:Federal Government Needs to Urgently Pursue Critical Actions to Address Major Cybersecurity Challenges</a></p>
<p><a target="_blank" href="https://www.paloaltonetworks.com/cybersecurity-perspectives/the-air-gap-is-dead">The Air Gap Is Dead. It’s Time for Industrial Organisations to Embrace the Cloud</a></p>
<p><a target="_blank" href="https://darktrace.com/blog/why-the-air-gap-is-not-enough">Securing OT Systems: The Limits of the Air Gap Approach</a></p>
<p><a target="_blank" href="https://www.tofinosecurity.com/blog/scada-security-air-gap-debate-over">SCADA Security: Is the Air Gap Debate Over?</a></p>
<p><a target="_blank" href="https://arstechnica.com/information-technology/2013/12/scientist-developed-malware-covertly-jumps-air-gaps-using-inaudible-sound/">Scientist-developed malware prototype covertly jumps air gaps using inaudible sound</a></p>
]]></content:encoded></item><item><title><![CDATA[Deploying Local AI Inference with vLLM and ChatUI in Docker]]></title><description><![CDATA[Why I Built This
I’ve always been fascinated by AI and self-hosted solutions, so with my home lab setup, I figured - why not experiment with AI and containers?
Since I already had the hardware, building a local inference server seemed like a natural ...]]></description><link>https://blog.brianbaldock.net/deploying-local-ai-inference-with-vllm-and-chatui-in-docker</link><guid isPermaLink="true">https://blog.brianbaldock.net/deploying-local-ai-inference-with-vllm-and-chatui-in-docker</guid><category><![CDATA[ChatUI]]></category><category><![CDATA[vLLM]]></category><category><![CDATA[Docker]]></category><category><![CDATA[Docker compose]]></category><category><![CDATA[debian]]></category><category><![CDATA[Homelab]]></category><category><![CDATA[AI]]></category><category><![CDATA[Local LLM]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Sat, 01 Feb 2025 03:57:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738382153760/c3421c66-a6cd-4f07-a410-3877c2a21da0.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-why-i-built-this">Why I Built This</h1>
<p>I’ve always been fascinated by AI and self-hosted solutions, so with my home lab setup, I figured - why not experiment with AI and containers?</p>
<p>Since I already had the hardware, building a local inference server seemed like a natural next step.</p>
<p>I started researching different options and, as a longtime Hugging Face lurker, decided to try out some of their tools. That’s when I came across <strong>vLLM</strong>, a fast, OpenAI-compatible model server, and <strong>ChatUI</strong>, a clean, customizable frontend. It looked like a straightforward setup.</p>
<p>Yeah, it wasn’t.</p>
<p>Between networking issues, container misconfigurations, and a handful of other headaches, getting everything running was far more involved than I expected. But after plenty of troubleshooting, rebuilding, and debugging, I got it working.</p>
<p>This article walks through that process - what I learned, the challenges I ran into, and the resources that helped along the way. I’ve also included the final working configuration for anyone looking to set up a <strong>local AI inference server</strong> using Docker, NVIDIA GPUs, and open-source tools.</p>
<h2 id="heading-the-goal">The Goal</h2>
<p>I wanted to deploy a <strong>self-hosted AI chatbot</strong> with the following components:</p>
<ul>
<li><p><strong>vLLM</strong> serving as the AI model (replacing cloud APIs like Hugging Face or OpenAI)</p>
</li>
<li><p><strong>ChatUI</strong> providing the frontend (so I wouldn’t have to build a UI from scratch)</p>
</li>
<li><p><strong>MongoDB</strong> storing chat history for persistence</p>
</li>
<li><p><strong>NGINX</strong> handling reverse proxying and TLS (SSL) termination</p>
</li>
<li><p><strong>Docker</strong> managing deployment for a consistent and reproducible setup</p>
</li>
<li><p><strong>GPU acceleration</strong> ensuring snappy responses</p>
</li>
</ul>
<h2 id="heading-the-final-architecture">The Final Architecture</h2>
<p>Here’s what I ended up building and using:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Component</strong></td><td><strong>Tool Used</strong></td><td><strong>Documentation</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Model Server</strong></td><td>vLLM (OpenAI-compatible)</td><td><a target="_blank" href="https://docs.vllm.ai/en/latest/">vLLM Docs</a></td></tr>
<tr>
<td><strong>Frontend</strong></td><td>ChatUI (Hugging Face’s clean UI)</td><td><a target="_blank" href="https://huggingface.co/docs/chat-ui/installation/local">ChatUI Docs</a></td></tr>
<tr>
<td><strong>Database</strong></td><td>MongoDB</td><td><a target="_blank" href="https://www.mongodb.com/docs/manual/tutorial/install-mongodb-community-with-docker/?msockid=14ef65ebd2d4698b35e470b1d3f3683a">MongoDB &amp; Docker</a></td></tr>
<tr>
<td><strong>Reverse Proxy</strong></td><td>NGINX</td><td><a target="_blank" href="https://www.docker.com/blog/how-to-use-the-official-nginx-docker-image/">NGINX &amp; Docker</a></td></tr>
<tr>
<td><strong>Deployment</strong></td><td>Docker &amp; Docker Compose</td><td><a target="_blank" href="https://docs.docker.com/compose/">Docker &amp; Docker Compose</a></td></tr>
<tr>
<td><strong>GPU Acceleration</strong></td><td>NVIDIA RTX 3080 (10GB VRAM &amp; 272 Tensor Cores)</td><td><a target="_blank" href="https://docs.nvidia.com/ai-enterprise/deployment/bare-metal/latest/docker.html">Nvidia &amp; Docker</a></td></tr>
</tbody>
</table>
</div><hr />
<h1 id="heading-getting-the-prerequisites-ready">Getting the prerequisites ready</h1>
<p>With the physical setup complete, the next step was choosing an operating system. I wanted something lightweight with minimal overhead since I’d be managing everything remotely over SSH using Visual Studio Code. A streamlined OS also helps keep configuration and maintenance simple.</p>
<p>I went with <strong>Debian</strong>, a solid and reliable choice.</p>
<h2 id="heading-setting-up-the-environment"><strong>Setting Up the Environment</strong></h2>
<p>Once Debian was installed, I ran the following commands to update the system, install essential packages, and set up Docker.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Baseline updates and setup</span>
sudo apt update
sudo apt upgrade -y
sudo apt install ca-certificates curl gcc git git-lfs wget

<span class="hljs-comment"># Docker specific prerequisite setup</span>
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [arch=<span class="hljs-subst">$(dpkg --print-architecture)</span> signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian <span class="hljs-subst">$(. /etc/os-release &amp;&amp; echo <span class="hljs-string">"<span class="hljs-variable">$VERSION_CODENAME</span>"</span>)</span> stable"</span> | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null
sudo apt update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
</code></pre>
<h2 id="heading-installing-nvidia-drivers-amp-cuda"><strong>Installing NVIDIA Drivers &amp; CUDA</strong></h2>
<p>For GPU acceleration, I needed <strong>CUDA</strong> and the latest <strong>NVIDIA drivers</strong>. The best way to ensure compatibility is by downloading the appropriate CUDA version directly from <a target="_blank" href="https://developer.nvidia.com/cuda-downloads?target_os=Linux&amp;target_arch=x86_64&amp;Distribution=Debian&amp;target_version=12&amp;target_type=deb_network">NVIDIA’s site</a>. The site dynamically generates the correct install commands based on your OS version - always check for updates before running these commands.</p>
<p>As of writing, CUDA <strong>12.8</strong> was the latest version. Here’s how I installed it:</p>
<pre><code class="lang-bash">
<span class="hljs-comment"># Installing Nvidia Drivers and Cuda</span>
wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8
sudo apt-get install -y nvidia-open
</code></pre>
<h2 id="heading-setting-up-the-nvidia-container-toolkit"><strong>Setting Up the NVIDIA Container Toolkit</strong></h2>
<p>To enable GPU access inside Docker containers, I installed the <strong>NVIDIA Container Toolkit</strong> following <a target="_blank" href="https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-the-nvidia-container-toolkit">NVIDIA’s documentation.</a></p>
<pre><code class="lang-bash"><span class="hljs-comment"># Installing the Nvidia Container Toolkit</span>
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg &amp;&amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed <span class="hljs-string">'s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'</span> | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
</code></pre>
<h2 id="heading-verifying-everything-works"><strong>Verifying Everything Works</strong></h2>
<p>After installation, I rebooted the system and verified that the <strong>GPU</strong> and <strong>container toolkit</strong> were working properly.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Reboot (start fresh)</span>
sudo reboot

<span class="hljs-comment"># Now check prereqs</span>
nvidia-smi <span class="hljs-comment"># This should output driver information</span>

<span class="hljs-comment"># Test that the container toolkit is working properly with Docker and that containers can access the GPU</span>
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
</code></pre>
<p>The <strong>nvidia-smi</strong> output should display your GPU details. Running the same command inside a Docker container should produce identical output. If anything looks off, retrace the setup steps to catch any missed configurations.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738207463575/c99f89e7-157f-4fbe-bb42-d4bf0fb68585.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-setting-up-the-working-directory"><strong>Setting Up the Working Directory</strong></h2>
<p>This is where I stored all <strong>Docker Compose files, models, and configuration files</strong> for the various containers. Most directories will be generated later, but a few need to be created manually.</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> ~
mkdir working
mkdir working/chat-ui
mkdir working/nginx
</code></pre>
<hr />
<h2 id="heading-full-setup-commands-for-reference"><strong>Full Setup Commands (For Reference)</strong></h2>
<p>If you want to set everything up in one go, here’s a consolidated version.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">❗</div>
<div data-node-type="callout-text"><strong><em>The following commands include a reboot.</em></strong></div>
</div>

<pre><code class="lang-bash"><span class="hljs-comment"># Baseline updates and setup</span>
sudo apt update
sudo apt upgrade -y
sudo apt install ca-certificates curl gcc git git-lfs wget

<span class="hljs-comment"># Docker specific prerequisite setup</span>
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [arch=<span class="hljs-subst">$(dpkg --print-architecture)</span> signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian <span class="hljs-subst">$(. /etc/os-release &amp;&amp; echo <span class="hljs-string">"<span class="hljs-variable">$VERSION_CODENAME</span>"</span>)</span> stable"</span> | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null
sudo apt update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

<span class="hljs-comment"># Installing Nvidia Drivers and Cuda</span>
wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8
sudo apt-get install -y nvidia-open

<span class="hljs-comment"># Installing the Nvidia Container Toolkit</span>
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg &amp;&amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed <span class="hljs-string">'s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'</span> | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker

<span class="hljs-comment"># Reboot host</span>
sudo reboot

<span class="hljs-comment"># Now check prereqs</span>
nvidia-smi <span class="hljs-comment"># This should output driver information</span>

<span class="hljs-comment"># Test that the container toolkit is working properly with Docker and that containers can access the GPU</span>
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

<span class="hljs-comment"># Prep the working folder(s)</span>
<span class="hljs-built_in">cd</span> ~
mkdir working
mkdir working/chat-ui
mkdir working/nginx
</code></pre>
<hr />
<h1 id="heading-why-vllm"><strong>Why vLLM?</strong></h1>
<p>After setting up the environment, the next step was choosing an <strong>inference engine</strong> to serve the AI model. With so many options available - like Hugging Face’s <strong>Text Generation Inference (TGI)</strong> - I wanted something fast, <strong>compatible with OpenAI’s API</strong>, and optimized for <strong>high-throughput inference</strong>.</p>
<p>I came across an excellent blog post by <a target="_blank" href="https://www.inovex.de/de/blog/author/mbuettner/">Malte Büttner at inovex.de</a>. While the article focused on TGI rather than vLLM, it helped frame my approach to deploying this solution efficiently.</p>
<h2 id="heading-why-vllm-over-other-solutions"><strong>Why vLLM Over Other Solutions?</strong></h2>
<p>Inference engines evolve rapidly, but I chose <strong>vLLM</strong> for its balance of <strong>speed, efficiency, and support for modern models</strong>. Here’s a quick comparison:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Feature</strong></td><td><strong>TGI (Hugging Face)</strong></td><td><strong>vLLM</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Developer</strong></td><td>Hugging Face</td><td>UC Berkeley</td></tr>
<tr>
<td><strong>Optimization</strong></td><td>Best for HF transformer models</td><td>High-throughput LLM inference</td></tr>
<tr>
<td><strong>Quantization Support</strong></td><td>❌ Limited (4-bit/8-bit only)</td><td>✅ GPTQ &amp; AWQ (better for modern models)</td></tr>
<tr>
<td><strong>Memory Efficiency</strong></td><td>⚠️ Statically allocated KV cache</td><td>✅ PagedAttention (better memory usage)</td></tr>
<tr>
<td><strong>Batching</strong></td><td>✅ Continuous batching</td><td>✅ More optimized for multi-user workloads</td></tr>
<tr>
<td><strong>Long-Context Support</strong></td><td>⚠️ Decent, but struggles with newer models</td><td>✅ Better KV cache management</td></tr>
<tr>
<td><strong>Multi-GPU Support</strong></td><td>✅ Yes (DeepSpeed-based)</td><td>✅ Yes</td></tr>
<tr>
<td><strong>OpenAI API Compatibility</strong></td><td>❌ No</td><td>✅ Yes (drop-in replacement for OpenAI API)</td></tr>
</tbody>
</table>
</div><p>vLLM’s OpenAI-compatible API made it easy to integrate <strong>without modifying other components</strong> like ChatUI. It also supports <strong>efficient batching and memory handling</strong>, which is crucial when running inference on consumer-grade GPUs.</p>
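<p>That compatibility is easy to sanity-check once the server is running. A minimal sketch, assuming vLLM is listening on its default port 8000 and that the <code>model</code> field matches whatever model the server was launched with:</p>
<pre><code class="lang-bash"># Query vLLM through the OpenAI-compatible chat completions route
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/Phi-3-mini-4k-instruct",
    "messages": [{"role": "user", "content": "Say hello in five words."}]
  }'
</code></pre>
<p>Any OpenAI-style client library pointed at that base URL works the same way, which is exactly what makes vLLM a drop-in replacement.</p>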
<hr />
<h1 id="heading-choosing-the-right-model"><strong>Choosing the Right Model</strong></h1>
<p>Selecting the right model wasn’t just about grabbing the biggest, most powerful LLM available. It had to fit within <strong>hardware constraints</strong> while still performing well in <strong>chat-based interactions</strong>.</p>
<p>The <strong>RTX 3080 (10GB VRAM)</strong> is a solid GPU, but large models like <strong>LLaMA 3 8B</strong> or <strong>Mistral 7B</strong> would have pushed its memory limits, making them impractical for real-time chat. I needed a model that was:</p>
<ul>
<li><p><strong>Compact enough</strong> to run efficiently on my GPU</p>
</li>
<li><p><strong>Strong in conversation</strong></p>
</li>
<li><p><strong>Fully compatible with vLLM</strong> for seamless inference</p>
</li>
</ul>
<h2 id="heading-why-phi-3">Why Phi-3?</h2>
<p>Phi-3 stood out because it <strong>delivers strong chat performance</strong> despite its smaller size. At just <strong>3.8B parameters</strong>, it competes with (and even outperforms) some <strong>7B models</strong> while maintaining <strong>efficiency</strong>.</p>
<p><strong>Key Factors in My Decision</strong></p>
<ol>
<li><p><strong>✅ Performance vs. Hardware Constraints</strong></p>
<ul>
<li><p>The RTX 3080 has limited VRAM (10GB), so full-precision LLMs were out.</p>
</li>
<li><p>Phi-3 runs efficiently without excessive slowdowns.</p>
</li>
<li><p>Larger models like <strong>Mistral 7B</strong> would have been <strong>too memory-intensive</strong> for real-time use.</p>
</li>
</ul>
</li>
<li><p><strong>✅ Optimized for Chat</strong></p>
<ul>
<li><p>Some models excel at <strong>code generation</strong>, but <strong>fall short in conversation</strong>.</p>
</li>
<li><p>Phi-3 is specifically <strong>fine-tuned for instruction-following</strong>, making it ideal for a chatbot.</p>
</li>
<li><p>Benchmarks show it <strong>outperforms some larger models</strong> in reasoning tasks.</p>
</li>
</ul>
</li>
<li><p><strong>✅ Seamless vLLM Integration</strong></p>
<ul>
<li><p>Since vLLM <strong>supports OpenAI’s API</strong>, I needed a model that played well with that framework.</p>
</li>
<li><p>Phi-3 works <strong>out of the box</strong> with vLLM, eliminating compatibility headaches.</p>
</li>
</ul>
</li>
</ol>
<h2 id="heading-alternative-models-i-considered">Alternative Models I Considered</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Model</strong></td><td><strong>Size</strong></td><td><strong>Pros</strong></td><td><strong>Cons</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>LLaMA 3 8B</strong></td><td>8B</td><td>✅ Stronger performance</td><td><strong>❌ Too large for 10GB VRAM</strong></td></tr>
<tr>
<td><strong>Mistral 7B</strong></td><td>7B</td><td>✅ Great general reasoning</td><td>❌ <strong>Pushes memory limits</strong></td></tr>
<tr>
<td><strong>Gemma 2B</strong></td><td>2B</td><td>✅ Extremely lightweight</td><td><strong>❌ Inferior reasoning skills</strong></td></tr>
</tbody>
</table>
</div><h2 id="heading-why-phi-3-was-the-best-fit"><strong>Why Phi-3 Was the Best Fit</strong></h2>
<p>Phi-3 struck the right balance between <strong>size, performance, and efficiency</strong>, making it the <strong>best option for local inference on a 3080</strong>.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">📌</div>
<div data-node-type="callout-text"><a target="_self" href="https://arxiv.org/abs/2404.14219">Phi-3 Technical Report</a></div>
</div>

<ul>
<li><p>✅ <strong>Fast</strong></p>
</li>
<li><p>✅ <strong>Lightweight</strong></p>
</li>
<li><p>✅ <strong>Strong chat performance</strong></p>
</li>
<li><p>✅ <strong>Easily deployable with vLLM</strong></p>
</li>
</ul>
<p>Starting with <strong>one solid model</strong> made sense, but ChatUI supports <strong>multiple models</strong>, so I can expand later.</p>
<hr />
<h1 id="heading-preparing-phi-3-for-vllm"><strong>Preparing Phi-3 for vLLM</strong></h1>
<p>Since this is a <strong>self-hosted</strong> setup, the model <strong>must be stored locally</strong>. There are several ways to download models, but I opted for <strong>git</strong> to keep things simple.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">📌</div>
<div data-node-type="callout-text"><strong><em>If storage is limited</em></strong><em>, using the Hugging Face Hub CLI is an alternative, it downloads models </em><strong><em>without full repository history</em></strong><em> to save space.</em></div>
</div>
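<p>For what it’s worth, here’s a minimal sketch of that approach, assuming the <strong>huggingface_hub</strong> package and its CLI are installed (the target directory is just an example):</p>
<pre><code class="lang-bash"># Install the CLI (skip if already present)
pip install -U "huggingface_hub[cli]"

# Download just the model files - no git history
huggingface-cli download microsoft/Phi-3-mini-4k-instruct --local-dir ~/working/Phi-3-mini-4k-instruct
</code></pre>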

<h2 id="heading-downloading-phi-3"><strong>Downloading Phi-3</strong></h2>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> ~/working
git lfs install
git <span class="hljs-built_in">clone</span> https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
</code></pre>
<p>After running this, the folder structure should look like this:</p>
<pre><code class="lang-bash">/home/username
 |-- working
 │    |-- nginx
 │    |-- Phi-3-mini-4k-instruct
 │    |-- chat-ui
</code></pre>
<h1 id="heading-breaking-down-the-container-configuration">Breaking Down the Container Configuration</h1>
<p>Each container in this setup plays a specific role, and getting them to work together required <strong>deliberate configuration choices</strong>. Below, I’ll walk through:</p>
<ul>
<li><p><strong>Design Decisions</strong> – Why I configured it this way and the trade-offs involved.</p>
</li>
<li><p><strong>Configuration Details</strong> – Key settings and how they impact functionality.</p>
</li>
<li><p><strong>Troubleshooting Steps</strong> – How I validated that each container was running correctly.</p>
</li>
</ul>
<h2 id="heading-understanding-the-configuration"><strong>Understanding the Configuration</strong></h2>
<p>This is <strong><mark>not</mark></strong> a plug-and-play setup. You <strong><mark>can’t</mark></strong> just copy and paste the full docker-compose.yml and expect it to work out of the box. Certain containers require additional configuration files and environment variables that I’ll cover in their respective sections.</p>
<p>If you’re following along, <strong>make sure to review the highlighted sections</strong>, those require extra setup.</p>
<h3 id="heading-container-overview">Container Overview</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Component</strong></td><td><strong>Purpose</strong></td><td><strong>Requires Additional Configuration?</strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>vLLM</strong></td><td>Model inference server</td><td>No (defaults work)</td></tr>
<tr>
<td><strong>ChatUI</strong></td><td>Web frontend for chat</td><td><strong>➡️ Yes</strong> (requires .env.local file)</td></tr>
<tr>
<td><strong>MongoDB</strong></td><td>Stores chat history</td><td>No (basic setup)</td></tr>
<tr>
<td><strong>NGINX</strong></td><td>Reverse proxy for secure access</td><td><strong>➡️ Yes</strong> (requires custom nginx.conf)</td></tr>
<tr>
<td><strong>CertBot</strong></td><td>Automates SSL certificates</td><td><strong>➡️ Yes</strong> (custom API-based setup for my domain provider)</td></tr>
</tbody>
</table>
</div><p>In the following sections, I’ll break down each container’s configuration, highlight potential pitfalls, and explain the reasoning behind key decisions.</p>
<hr />
<h1 id="heading-vllm-the-model-inference-server">vLLM: The Model Inference Server</h1>
<p>At the core of this self-hosted AI chatbot is <strong>vLLM</strong>, a highly optimized inference framework designed for efficient text generation. My goal was to maximize performance on an <strong>RTX 3080</strong> while ensuring <strong>compatibility with ChatUI</strong>, which integrates easily using an <strong>OpenAI-compatible API</strong>.</p>
<p>I previously covered why I chose vLLM over TGI, but to reiterate, vLLM offers <strong>out-of-the-box compatibility with OpenAI’s API</strong>, making it a drop-in replacement for services like OpenAI or Hugging Face APIs. This meant <strong>seamless integration</strong> with other components in my setup without needing to rewrite API interactions.</p>
<h2 id="heading-vllm-configuration">vLLM Configuration</h2>
<p>Here’s the docker-compose.yml configuration for vLLM:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">text-generation:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">vllm/vllm-openai:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8080:8000"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./Phi-3-mini-4k-instruct:/data/model/Phi-3-mini-4k-instruct</span> 
    <span class="hljs-attr">command:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--model"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"/data/model/Phi-3-mini-4k-instruct"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--dtype"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"bfloat16"</span> 
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--tensor-parallel-size"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"1"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--gpu-memory-utilization"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"0.9"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--max-model-len"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"3264"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--max-num-batched-tokens"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"3264"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--trust-remote-code"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">NVIDIA_VISIBLE_DEVICES:</span> <span class="hljs-string">all</span>
    <span class="hljs-attr">deploy:</span>
      <span class="hljs-attr">resources:</span>
        <span class="hljs-attr">reservations:</span>
          <span class="hljs-attr">devices:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">driver:</span> <span class="hljs-string">nvidia</span>
              <span class="hljs-attr">count:</span> <span class="hljs-string">all</span>
              <span class="hljs-attr">capabilities:</span> [<span class="hljs-string">gpu</span>]
    <span class="hljs-attr">runtime:</span> <span class="hljs-string">nvidia</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">text-generation</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>
</code></pre>
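<p>As a rough sanity check on these numbers: Phi-3 Mini has ~3.8B parameters, and at bfloat16 (2 bytes per parameter) the weights alone take roughly 7.6GB. On a 10GB card that leaves only a couple of gigabytes for the KV cache and CUDA overhead, which is why the context length here is capped at 3264 tokens rather than the model’s full 4k.</p>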
<h3 id="heading-vllm-key-configuration-choices">vLLM - Key configuration choices</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Parameter</strong></td><td><strong>Reasoning</strong></td></tr>
</thead>
<tbody>
<tr>
<td>--dtype bfloat16</td><td>Reduces memory usage while maintaining precision. Best for RTX 3080.</td></tr>
<tr>
<td>--tensor-parallel-size 1</td><td>Since I only have one GPU in this setup, parallelism isn’t needed.</td></tr>
<tr>
<td>--gpu-memory-utilization 0.9</td><td>Maximizes VRAM usage while avoiding out-of-memory (OOM) errors.</td></tr>
<tr>
<td>--max-model-len 3264</td><td>Allows longer prompts while fitting within VRAM constraints.</td></tr>
<tr>
<td>--trust-remote-code</td><td>Required for some Hugging Face models that need additional scripts.</td></tr>
</tbody>
</table>
</div><hr />
<h1 id="heading-troubleshooting-vllm">Troubleshooting vLLM</h1>
<p>vLLM <strong>wasn’t plug-and-play</strong>, and I hit a few roadblocks along the way. Here’s how I validated that it was working correctly and debugged common issues.</p>
<h2 id="heading-1-ensuring-vllm-is-running-properly">1️⃣ Ensuring vLLM is running properly</h2>
<p>After deployment, I tested whether vLLM was active by running:</p>
<pre><code class="lang-bash">curl -X GET http://localhost:8080/v1/models
</code></pre>
<h3 id="heading-expected-response-formatted-for-readability"><strong>✅ Expected response (formatted for readability)</strong></h3>
<pre><code class="lang-json">{
    <span class="hljs-attr">"object"</span>: <span class="hljs-string">"list"</span>,
    <span class="hljs-attr">"data"</span>: [
        {
            <span class="hljs-attr">"id"</span>: <span class="hljs-string">"/data/model/Phi-3-mini-4k-instruct"</span>,
            <span class="hljs-attr">"object"</span>: <span class="hljs-string">"model"</span>,
            <span class="hljs-attr">"created"</span>: <span class="hljs-number">1738295351</span>,
            <span class="hljs-attr">"owned_by"</span>: <span class="hljs-string">"vllm"</span>,
            <span class="hljs-attr">"root"</span>: <span class="hljs-string">"/data/model/Phi-3-mini-4k-instruct"</span>,
            <span class="hljs-attr">"parent"</span>: <span class="hljs-literal">null</span>,
            <span class="hljs-attr">"max_model_len"</span>: <span class="hljs-number">3264</span>,
            <span class="hljs-attr">"permission"</span>: [
                {
                    <span class="hljs-attr">"id"</span>: <span class="hljs-string">"modelperm-18926d767c1346399251c729e0cd251a"</span>,
                    <span class="hljs-attr">"object"</span>: <span class="hljs-string">"model_permission"</span>,
                    <span class="hljs-attr">"created"</span>: <span class="hljs-number">1738295351</span>,
                    <span class="hljs-attr">"allow_create_engine"</span>: <span class="hljs-literal">false</span>,
                    <span class="hljs-attr">"allow_sampling"</span>: <span class="hljs-literal">true</span>,
                    <span class="hljs-attr">"allow_logprobs"</span>: <span class="hljs-literal">true</span>,
                    <span class="hljs-attr">"allow_search_indices"</span>: <span class="hljs-literal">false</span>,
                    <span class="hljs-attr">"allow_view"</span>: <span class="hljs-literal">true</span>,
                    <span class="hljs-attr">"allow_fine_tuning"</span>: <span class="hljs-literal">false</span>,
                    <span class="hljs-attr">"organization"</span>: <span class="hljs-string">"*"</span>,
                    <span class="hljs-attr">"group"</span>: <span class="hljs-literal">null</span>,
                    <span class="hljs-attr">"is_blocking"</span>: <span class="hljs-literal">false</span>
                }
            ]
        }
    ]
}
</code></pre>
<p>If this request <strong>fails</strong>, check the logs for errors:</p>
<pre><code class="lang-bash">sudo docker logs text-generation
</code></pre>
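<p>To narrow things down faster, filtering the logs for the usual suspects helps (a simple sketch):</p>
<pre><code class="lang-bash"># Surface only the interesting lines from the vLLM container logs
sudo docker logs text-generation 2&gt;&amp;1 | grep -iE "error|cuda|out of memory"
</code></pre>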
<h2 id="heading-2-common-issues-you-may-encounter"><strong>2️⃣ Common issues you may encounter</strong></h2>
<h3 id="heading-model-path-mismatch"><strong>Model Path Mismatch</strong></h3>
<p>Ensure that the <strong>volume mounts are correct</strong>. The host directory ./Phi-3-mini-4k-instruct should be properly mapped to /data/model/Phi-3-mini-4k-instruct in the container.</p>
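<p>A quick way to confirm the mount from inside the running container - you should see config.json, the tokenizer files, and the safetensors shards:</p>
<pre><code class="lang-bash">sudo docker exec -it text-generation ls /data/model/Phi-3-mini-4k-instruct
</code></pre>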
<h3 id="heading-gpu-visibility-issues"><strong>GPU Visibility Issues</strong></h3>
<p>If vLLM can’t access the GPU, check nvidia-smi:</p>
<pre><code class="lang-bash">sudo docker <span class="hljs-built_in">exec</span> -it text-generation nvidia-smi
</code></pre>
<p>If the GPU <strong>isn’t detected</strong>, the issue is likely related to missing drivers, an incorrect runtime setting in Docker, or the NVIDIA Container Toolkit not being configured correctly.</p>
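<p>It’s also worth confirming that Docker itself knows about the NVIDIA runtime. If the check below doesn’t list nvidia among the runtimes, the NVIDIA Container Toolkit likely isn’t registered:</p>
<pre><code class="lang-bash"># The "Runtimes:" line should include nvidia
sudo docker info | grep -i runtime
</code></pre>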
<h2 id="heading-3-validating-inference-works-via-api"><strong>3️⃣ Validating Inference Works via API</strong></h2>
<p>To ensure text generation was functioning properly, I ran a test request directly from the Docker host:</p>
<pre><code class="lang-bash">curl -X POST http://localhost:8080/v1/chat/completions -H <span class="hljs-string">"Authorization: Bearer sk-fake-key"</span> -H <span class="hljs-string">"Content-Type: application/json"</span> -d <span class="hljs-string">'{
    "model": "/data/model/Phi-3-mini-4k-instruct",
    "messages": [{"role": "user", "content": "Hello, what is AI?"}]
  }'</span>
</code></pre>
<h3 id="heading-expected-response-formatted-for-readability-1"><strong>✅ Expected response (formatted for readability)</strong></h3>
<pre><code class="lang-json">{
    <span class="hljs-attr">"id"</span>: <span class="hljs-string">"chatcmpl-9375af51aa3f479db7c3053e33eb753d"</span>,
    <span class="hljs-attr">"object"</span>: <span class="hljs-string">"chat.completion"</span>,
    <span class="hljs-attr">"created"</span>: <span class="hljs-number">1738296405</span>,
    <span class="hljs-attr">"model"</span>: <span class="hljs-string">"/data/model/Phi-3-mini-4k-instruct"</span>,
    <span class="hljs-attr">"choices"</span>: [
        {
            <span class="hljs-attr">"index"</span>: <span class="hljs-number">0</span>,
            <span class="hljs-attr">"message"</span>: {
                <span class="hljs-attr">"role"</span>: <span class="hljs-string">"assistant"</span>,
                <span class="hljs-attr">"content"</span>: <span class="hljs-string">" Artificial Intelligence (AI) .... blah blah blah"</span>,
                <span class="hljs-attr">"tool_calls"</span>: []
            },
            <span class="hljs-attr">"logprobs"</span>: <span class="hljs-literal">null</span>,
            <span class="hljs-attr">"finish_reason"</span>: <span class="hljs-string">"stop"</span>,
            <span class="hljs-attr">"stop_reason"</span>: <span class="hljs-number">32007</span>
        }
    ],
    <span class="hljs-attr">"usage"</span>: {
        <span class="hljs-attr">"prompt_tokens"</span>: <span class="hljs-number">10</span>,
        <span class="hljs-attr">"total_tokens"</span>: <span class="hljs-number">179</span>,
        <span class="hljs-attr">"completion_tokens"</span>: <span class="hljs-number">169</span>,
        <span class="hljs-attr">"prompt_tokens_details"</span>: <span class="hljs-literal">null</span>
    },
    <span class="hljs-attr">"prompt_logprobs"</span>: <span class="hljs-literal">null</span>
}
</code></pre>
<h2 id="heading-debugging-networking-issues">Debugging networking issues</h2>
<p>One of the trickiest parts of setting up vLLM was <strong>getting the container networking right</strong>. Since containers handle networking behind the scenes, misconfigurations can cause connection failures between ChatUI and vLLM.</p>
<h2 id="heading-4-testing-connectivity-from-another-container"><strong>4️⃣ Testing Connectivity from Another Container</strong></h2>
<p>I confirmed that ChatUI could talk to vLLM by running a cURL test <strong>inside the ChatUI container</strong>:</p>
<pre><code class="lang-bash">sudo docker <span class="hljs-built_in">exec</span> -it chatui bash
curl -X POST http://text-generation:8000/v1/chat/completions \
-H <span class="hljs-string">"Authorization: Bearer sk-fake-key"</span> \
-H <span class="hljs-string">"Content-Type: application/json"</span> \
-d <span class="hljs-string">'{
    "model": "/data/model/Phi-3-mini-4k-instruct",
    "messages": [{"role": "user", "content": "Hello, what is AI?"}]
}'</span>
</code></pre>
<p>If this request <strong>fails</strong>, the issue is likely <strong>container networking</strong>. I resolved it by explicitly defining a <strong>dedicated Docker network</strong> for this deployment in my final configuration.</p>
<hr />
<h1 id="heading-thoughts-on-the-vllm-deployment">Thoughts on the vLLM deployment</h1>
<p>Deploying vLLM with Phi-3 Mini was an <strong>exercise in balancing performance and efficiency</strong>. The <strong>RTX 3080 handles it well</strong>, but it’s clear that <strong>more powerful models</strong> (LLaMA 3, Mistral, etc.) would need more VRAM.</p>
<h3 id="heading-would-i-recommend-vllm-for-local-inference"><strong>Would I recommend vLLM for local inference?</strong></h3>
<p>Absolutely. If you need solid performance and <strong>compatibility with OpenAI-style APIs</strong>, vLLM is a <strong>great choice</strong>.</p>
<h3 id="heading-next-steps-for-my-setup"><strong>Next Steps for My Setup</strong></h3>
<ul>
<li><p><strong>Deploy multiple models</strong> for <strong>on-the-fly model switching</strong> in ChatUI.</p>
</li>
<li><p><strong>Optimize memory usage</strong> for longer prompts.</p>
</li>
<li><p><strong>Enhance functionality</strong> with <strong>web search</strong> and <strong>retrieval-augmented generation (RAG)</strong>.</p>
</li>
</ul>
<hr />
<h1 id="heading-final-thoughts-vllm">Final Thoughts - vLLM</h1>
<p>Setting up vLLM <strong>wasn’t as easy as I expected</strong>, but once everything was dialed in, it worked <strong>flawlessly</strong>. The biggest hurdles were <strong>container networking</strong> and <strong>GPU resource management</strong>, but <strong>debugging logs and running manual API tests</strong> made troubleshooting straightforward.</p>
<p>If you’re looking to self-host an AI chatbot, <strong>vLLM is a powerful, flexible option</strong> - and one that <strong>scales well for local deployments</strong>.</p>
<hr />
<h1 id="heading-chatui-the-frontend-interface">ChatUI: The Frontend Interface</h1>
<p>With <strong>vLLM handling inference</strong>, the next step was setting up <strong>ChatUI</strong> - a <strong>web-based chat interface</strong> that makes interacting with the model easy. ChatUI provides an OpenAI-style frontend, so I didn’t need to build my own UI from scratch.</p>
<h3 id="heading-why-chatui"><strong>Why ChatUI?</strong></h3>
<p>I briefly mentioned ChatUI earlier, but let’s break down why I chose it:</p>
<ul>
<li><p><strong>OpenAI-compatible</strong> – Works seamlessly with vLLM <strong>without major modifications</strong>.</p>
</li>
<li><p><strong>Lightweight and simple</strong> – Minimal dependencies, making deployment straightforward.</p>
</li>
<li><p><strong>Multi-model support</strong> – Allows loading multiple models and switching between them in the UI.</p>
</li>
<li><p><strong>Customizable API endpoints</strong> – Ensures all requests are routed to <strong>my self-hosted AI</strong> instead of external APIs.</p>
</li>
</ul>
<h3 id="heading-chatui-docker-configuration">ChatUI Docker Configuration</h3>
<p>Here’s how I set up ChatUI using <strong>Docker Compose</strong>, connecting it to <strong>MongoDB</strong> for chat history and <strong>vLLM</strong> for inference:</p>
<pre><code class="lang-yaml">  <span class="hljs-attr">chat-ui:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">ghcr.io/huggingface/chat-ui:latest</span> <span class="hljs-comment"># Chat UI container</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"3000:3000"</span> <span class="hljs-comment"># Expose Chat UI on port 3000</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">MONGODB_URL:</span> <span class="hljs-string">mongodb://mongo-chatui:27017</span> <span class="hljs-comment"># MongoDB URL for frontend</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./chat-ui/.env.local:/app/.env.local</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">chatui</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>
</code></pre>
<p>This configuration ensures ChatUI:</p>
<ul>
<li><p>✅ Connects to <strong>MongoDB</strong> for storing conversations.</p>
</li>
<li><p>✅ Runs on <strong>port 3000</strong> for easy web access.</p>
</li>
<li><p>✅ Reads settings from <strong>.env.local</strong>, making it easy to tweak configurations.</p>
</li>
</ul>
<hr />
<h1 id="heading-creating-the-chatui-configuration-file"><strong>Creating the ChatUI Configuration File</strong></h1>
<p>The .env.local file controls ChatUI’s settings, including which <strong>AI models</strong> are available and where to send requests.</p>
<h3 id="heading-1-create-the-configuration-file"><strong>1️⃣ Create the Configuration File</strong></h3>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> ~/working/chat-ui
nano .env.local
</code></pre>
<h3 id="heading-2-add-configuration-details"><strong>2️⃣ Add Configuration Details</strong></h3>
<pre><code class="lang-plaintext">### Models ###
MODELS=`[
    {
      "name": "/data/model/Phi-3-mini-4k-instruct",
      "description": "Microsoft's Phi-3 Model",
      "promptExamples": [
        {
          "title": "Write an email from bullet list",
          "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
        }, {
          "title": "Code a snake game",
          "prompt": "Code a basic snake game in python, give explanations for each step."
        }, {
          "title": "Assist in a task",
          "prompt": "How do I make a delicious grilled cheese sandwich?"
        }
      ],
      "endpoints": [
      {
        "type": "openai",
        "baseURL": "http://text-generation:8000/v1"
      }
      ]
    }
]`

### MongoDB ###
MONGODB_URL=mongodb://mongo-chatui:27017
USE_HF_API=false
USE_OPENAI_API=false

## Removed models, useful for migrating conversations
# { name: string, displayName?: string, id?: string, transferTo?: string }`
OLD_MODELS=`[]`

### Task model ###
# name of the model used for tasks such as summarizing title, creating query, etc.
# if not set, the first model in MODELS will be used
TASK_MODEL="/data/model/Phi-3-mini-4k-instruct"

### Authentication ###

# if it's defined, only these emails will be allowed to use login
ALLOWED_USER_EMAILS=`[]` 
# If it's defined, users with emails matching these domains will also be allowed to use login
ALLOWED_USER_DOMAINS=`[]`
# valid alternative redirect URLs for OAuth, used for HuggingChat apps
ALTERNATIVE_REDIRECT_URLS=`[]` 
### Cookies
# name of the cookie used to store the session
COOKIE_NAME=hf-chat

## Websearch configuration
PLAYWRIGHT_ADBLOCKER=true
WEBSEARCH_ALLOWLIST=`[]` # if it's defined, allow websites from only this list.
WEBSEARCH_BLOCKLIST=`[]` # if it's defined, block websites from this list.
WEBSEARCH_JAVASCRIPT=true # CPU usage reduces by 60% on average by disabling javascript. Enable to improve website compatibility
WEBSEARCH_TIMEOUT=3500 # in milliseconds, determines how long to wait to load a page before timing out
ENABLE_LOCAL_FETCH=false # set to true to allow fetches on the local network. /!\ Only enable this if you have the proper firewall rules to prevent SSRF attacks and understand the implications.

## Public app configuration ##
PUBLIC_APP_GUEST_MESSAGE= #a message to the guest user. If not set, no message will be shown. Only used if you have authentication enabled.
PUBLIC_APP_NAME="Bri-Chat" # name used as title throughout the app
PUBLIC_APP_ASSETS=chatui # used to find logos &amp; favicons in static/$PUBLIC_APP_ASSETS
PUBLIC_ANNOUNCEMENT_BANNERS=`[
    {
    "title": "Welcome to ChatUI",
    "linkTitle": "Check out Brian's blog",
    "linkHref": "https://blog.brianbaldock.net"
  }
]`
PUBLIC_SMOOTH_UPDATES=false # set to true to enable smoothing of messages client-side, can be CPU intensive

### Feature Flags ###
LLM_SUMMARIZATION=false # generate conversation titles with LLMs
ENABLE_ASSISTANTS=false #set to true to enable assistants feature
ENABLE_ASSISTANTS_RAG=false # /!\ This will let users specify arbitrary URLs that the server will then request. Make sure you have the proper firewall rules in place. 
REQUIRE_FEATURED_ASSISTANTS=false # require featured assistants to show in the list
COMMUNITY_TOOLS=false # set to true to enable community tools
EXPOSE_API=true # make the /api routes available
ALLOW_IFRAME=true # Allow the app to be embedded in an iframe

### Tools ###
# Check out public config in `chart/env/prod.yaml` for more details
TOOLS=`[]` 

USAGE_LIMITS=`{}`

### HuggingFace specific ###
# Let user authenticate with their HF token in the /api routes. This is only useful if you have OAuth configured with huggingface.
USE_HF_TOKEN_IN_API=false

### Metrics ###
METRICS_ENABLED=false
METRICS_PORT=5565
LOG_LEVEL=info

# Remove or leave blank any unused API keys
OPENAI_API_KEY=
</code></pre>
<p>The file configures:</p>
<ul>
<li><p>✅ <strong>Model selection</strong> (Phi-3 Mini)</p>
</li>
<li><p>✅ <strong>MongoDB connection</strong> for chat history</p>
</li>
<li><p>✅ <strong>Frontend branding</strong> (custom name, banners, etc.)</p>
</li>
<li><p>✅ <strong>API exposure</strong></p>
</li>
</ul>
<hr />
<h1 id="heading-troubleshooting-chatui">Troubleshooting ChatUI</h1>
<p>Like vLLM, ChatUI <strong>wasn’t plug-and-play</strong>. Here’s how I debugged the common issues.</p>
<h2 id="heading-1-making-sure-chatui-talks-to-vllm"><strong>1️⃣ Making sure ChatUI talks to vLLM</strong></h2>
<p>One of the biggest problems I faced was <strong>ChatUI trying to connect to OpenAI instead of my local vLLM server</strong>. To check where requests were going, I used <strong>browser dev tools</strong>:</p>
<ol>
<li><p>Open <strong>ChatUI</strong> in a browser (<a target="_blank" href="http://localhost:3000">http://&lt;docker host IP&gt;:3000</a>).</p>
</li>
<li><p>Press <strong>F12</strong> (or right-click → Inspect) to open <strong>Developer Tools</strong>.</p>
</li>
<li><p>Go to the <strong>Network</strong> tab and send a test message.</p>
</li>
<li><p>Look for requests to /v1/chat/completions.</p>
</li>
</ol>
<h3 id="heading-expected-request-url">✅ <strong>Expected request URL</strong></h3>
<pre><code class="lang-bash">http://text-generation:8000/v1/chat/completions
</code></pre>
<p>❌ <strong>If the request goes to api.openai.com, then:</strong></p>
<ul>
<li><p>The <em>.env.local</em> file is incorrect.</p>
</li>
<li><p>The baseURL value might be missing or incorrectly set.</p>
</li>
</ul>
<p>💡 <strong>Fix: Update .env.local and restart ChatUI:</strong></p>
<pre><code class="lang-bash">sudo docker restart chatui
</code></pre>
<h2 id="heading-2-running-a-direct-api-test-from-inside-the-chatui-container"><strong>2️⃣ Running a Direct API Test from Inside the ChatUI Container</strong></h2>
<p>If the frontend <strong>isn’t returning responses</strong>, I tested the API connection manually inside the <strong>ChatUI container</strong>:</p>
<pre><code class="lang-bash">sudo docker <span class="hljs-built_in">exec</span> -it chatui bash

<span class="hljs-comment"># Inside the ChatUI container now</span>
curl -X POST http://text-generation:8000/v1/chat/completions -H <span class="hljs-string">"Authorization: Bearer sk-fake-key"</span> -H <span class="hljs-string">"Content-Type: application/json"</span> -d <span class="hljs-string">'{
    "model": "/data/model/Phi-3-mini-4k-instruct",
    "messages": [{"role": "user", "content": "Hello, what is AI?"}]
  }'</span>
</code></pre>
<h3 id="heading-expected-response-formatted-for-readability-2"><strong>✅ Expected response (formatted for readability):</strong></h3>
<pre><code class="lang-yaml">{
    <span class="hljs-attr">"id":</span> <span class="hljs-string">"chatcmpl-9375af51aa3f479db7c3053e33eb753d"</span>,
    <span class="hljs-attr">"object":</span> <span class="hljs-string">"chat.completion"</span>,
    <span class="hljs-attr">"created":</span> <span class="hljs-number">1738296405</span>,
    <span class="hljs-attr">"model":</span> <span class="hljs-string">"/data/model/Phi-3-mini-4k-instruct"</span>,
    <span class="hljs-attr">"choices":</span> [
        {
            <span class="hljs-attr">"index":</span> <span class="hljs-number">0</span>,
            <span class="hljs-attr">"message":</span> {
                <span class="hljs-attr">"role":</span> <span class="hljs-string">"assistant"</span>,
                <span class="hljs-attr">"content":</span> <span class="hljs-string">" Artificial Intelligence (AI) .... blah blah blah"</span>,
                <span class="hljs-attr">"tool_calls":</span> []
            },
            <span class="hljs-attr">"logprobs":</span> <span class="hljs-literal">null</span>,
            <span class="hljs-attr">"finish_reason":</span> <span class="hljs-string">"stop"</span>,
            <span class="hljs-attr">"stop_reason":</span> <span class="hljs-number">32007</span>
        }
    ],
    <span class="hljs-attr">"usage":</span> {
        <span class="hljs-attr">"prompt_tokens":</span> <span class="hljs-number">10</span>,
        <span class="hljs-attr">"total_tokens":</span> <span class="hljs-number">179</span>,
        <span class="hljs-attr">"completion_tokens":</span> <span class="hljs-number">169</span>,
        <span class="hljs-attr">"prompt_tokens_details":</span> <span class="hljs-literal">null</span>
    },
    <span class="hljs-attr">"prompt_logprobs":</span> <span class="hljs-literal">null</span>
}
</code></pre>
<p>❌ <strong>If this request fails,</strong> there’s likely a <strong>networking issue</strong> between ChatUI and vLLM.</p>
<h2 id="heading-3-debugging-networking-issues"><strong>3️⃣ Debugging Networking Issues</strong></h2>
<p>One of the most frustrating parts of the setup was <strong>container networking</strong>.</p>
<p>To check whether ChatUI could reach vLLM, I ran a simple connectivity test inside the ChatUI container:</p>
<pre><code class="lang-bash">sudo docker <span class="hljs-built_in">exec</span> -it chatui curl -I http://text-generation:8000/v1/models
</code></pre>
<h3 id="heading-expected-output">✅ <strong>Expected output:</strong></h3>
<pre><code class="lang-bash">HTTP/1.1 200 OK
</code></pre>
<p>❌ <strong>If the request fails:</strong></p>
<ul>
<li><p>The Docker network isn’t configured properly.</p>
</li>
<li><p>vLLM isn’t running (docker ps to check).</p>
</li>
</ul>
<p>💡 <strong>Fix:</strong> Explicitly define a <strong>Docker network</strong> in docker-compose.yml:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">networks:</span>
  <span class="hljs-attr">chat-network:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">chat-network</span>
    <span class="hljs-attr">driver:</span> <span class="hljs-string">bridge</span>
</code></pre>
<p>Then <strong>restart everything</strong>:</p>
<pre><code class="lang-bash">sudo docker-compose down &amp;&amp; sudo docker-compose up -d
</code></pre>
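<p>Defining the network is only half the fix - each service also needs a matching networks: entry, otherwise Compose just attaches everything to its default network. Once restarted, you can confirm the containers actually landed on the same network with something like this:</p>
<pre><code class="lang-bash"># List which containers are attached to chat-network
sudo docker network inspect chat-network --format '{{range .Containers}}{{.Name}} {{end}}'
</code></pre>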
<hr />
<h1 id="heading-final-thoughts-on-chatui"><strong>Final Thoughts on ChatUI</strong></h1>
<p>Once ChatUI was configured properly, it <strong>worked exactly as expected</strong>:</p>
<ul>
<li><p><strong>✅ Low-latency responses</strong> even on an RTX 3080</p>
</li>
<li><p>✅ <strong>Fully private</strong> – no external API calls</p>
</li>
<li><p><strong>✅ Scalable</strong> – can support multiple models later</p>
</li>
</ul>
<p>Next Steps:</p>
<ul>
<li><p><strong>Enable authentication</strong> for secure access</p>
</li>
<li><p><strong>Deploy more models</strong> and allow switching in the UI</p>
</li>
<li><p><strong>Improve caching &amp; performance</strong></p>
</li>
</ul>
<p>With ChatUI in place, I now had a <strong>functional, self-hosted AI assistant</strong> running entirely on my own hardware.</p>
<hr />
<h1 id="heading-mongodb-storing-chat-history-and-context">MongoDB: Storing Chat History and Context</h1>
<p>With <strong>vLLM handling inference</strong> and <strong>ChatUI providing the frontend</strong>, the next piece was <strong>storing chat history</strong>. MongoDB made the most sense here since <strong>ChatUI requires it for storing past conversations</strong>.</p>
<p>This setup ensures:</p>
<ul>
<li><p>✅ Users can <strong>continue conversations</strong> even after closing ChatUI.</p>
</li>
<li><p>✅ The system maintains <strong>chat history</strong> for context-aware responses.</p>
</li>
<li><p>✅ Queries remain <strong>fast and efficient</strong> without unnecessary complexity.</p>
</li>
</ul>
<h2 id="heading-why-mongodb">Why MongoDB?</h2>
<p>MongoDB wasn’t a choice I actively made; it’s <strong>required by ChatUI</strong>. That said, it fits well for this use case:</p>
<ul>
<li><p><strong>Native ChatUI Support</strong> – ChatUI is <strong>built to use MongoDB</strong>, making setup seamless.</p>
</li>
<li><p><strong>Flexible Schema</strong> – Since chat logs <strong>don’t have a fixed structure</strong>, MongoDB handles this well.</p>
</li>
<li><p><strong>Lightweight</strong> – Minimal resource overhead, making it <strong>ideal for containerized environments</strong>.</p>
</li>
<li><p><strong>Persistent Storage</strong> – Conversations <strong>aren’t lost</strong> between restarts, improving usability.</p>
</li>
</ul>
<h2 id="heading-mongodb-docker-configuration">MongoDB Docker Configuration</h2>
<p>Here’s the <strong>Docker Compose</strong> configuration for <strong>MongoDB</strong>, ensuring it integrates smoothly with ChatUI:</p>
<pre><code class="lang-yaml">  <span class="hljs-comment"># MongoDB for storing history/context</span>
  <span class="hljs-attr">mongo-chatui:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">mongo:latest</span> <span class="hljs-comment"># MongoDB container</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"27017:27017"</span> <span class="hljs-comment"># Expose MongoDB on the default port</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./mongo-data:/data/db</span> <span class="hljs-comment"># Persist MongoDB data</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">mongo-chatui</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>
</code></pre>
<h3 id="heading-key-configuration-choices">Key Configuration Choices</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Parameter</strong></td><td><strong>Reasoning</strong></td></tr>
</thead>
<tbody>
<tr>
<td>27017:27017</td><td>Exposes the MongoDB service on its default port.</td></tr>
<tr>
<td>./mongo-data:/data/db</td><td>Ensures chat history is persisted even if the container restarts.</td></tr>
<tr>
<td>restart: always</td><td>Ensures MongoDB automatically restarts if it crashes.</td></tr>
</tbody>
</table>
</div><h1 id="heading-troubleshooting-mongodb"><strong>Troubleshooting MongoDB</strong></h1>
<p>This container worked as expected <strong>right out of the box</strong>, so I didn’t run into major issues. However, here are a few quick verification steps if something goes wrong:</p>
<h2 id="heading-1-check-if-the-container-is-running"><strong>1️⃣ Check if the container is running</strong></h2>
<pre><code class="lang-bash">sudo docker ps | grep mongo-chatui
</code></pre>
<h3 id="heading-expected-output-1">✅ Expected Output:</h3>
<p>The container should be <strong>running</strong> with a <strong>stable uptime</strong>.</p>
<h2 id="heading-2-verify-mongodb-is-accepting-connections"><strong>2️⃣ Verify MongoDB is accepting connections</strong></h2>
<pre><code class="lang-bash">mongo --host localhost --port 27017 --<span class="hljs-built_in">eval</span> <span class="hljs-string">"db.stats()"</span>
</code></pre>
<h3 id="heading-expected-output-2">✅ Expected Output:</h3>
<p>A JSON response with <strong>database statistics</strong>, confirming the DB is accessible.</p>
<hr />
<h1 id="heading-final-thoughts-on-mongodb">Final Thoughts on MongoDB</h1>
<p>With MongoDB properly configured, <strong>ChatUI now maintains session history</strong>, but there are areas to improve:</p>
<ul>
<li><p><strong>✅ Implement indexing</strong> – To speed up query times (see the sketch after this list).</p>
</li>
<li><p><strong>✅ Optimize storage</strong> – Automate <strong>log cleanup</strong> over time.</p>
</li>
<li><p><strong>✅ Monitor performance</strong> – As usage scales, keeping track of <strong>resource consumption</strong> will be important.</p>
</li>
</ul>
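<p>On the indexing point, here’s a hedged sketch of what that could look like. Note that the database and collection names (chat-ui, conversations) and the indexed fields are illustrative assumptions - inspect your own instance first, since ChatUI’s schema may differ between versions:</p>
<pre><code class="lang-bash"># Peek at the actual collections before creating anything
sudo docker exec -it mongo-chatui mongosh chat-ui --eval "db.getCollectionNames()"

# Example: index conversations by session and recency (field names are assumptions)
sudo docker exec -it mongo-chatui mongosh chat-ui --eval 'db.conversations.createIndex({ sessionId: 1, updatedAt: -1 })'
</code></pre>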
<p>This setup provides a <strong>simple, persistent chat history solution</strong>, laying the groundwork for <strong>future optimizations</strong>.</p>
<hr />
<h1 id="heading-nginx-reverse-proxy-for-secure-access">NGINX: Reverse Proxy for Secure Access</h1>
<p>Once <strong>vLLM</strong>, <strong>ChatUI</strong>, and <strong>MongoDB</strong> were up and running, I needed a way to make the deployment secure and accessible via a domain. While I haven’t exposed this setup to the internet, I’m using a public domain for <strong>SSL certificates</strong>. This is where <strong>NGINX</strong> comes in, acting as a reverse proxy, handling <strong>SSL termination</strong>, routing, and ensuring everything is accessible from a single domain.</p>
<h2 id="heading-why-use-nginx">Why use NGINX?</h2>
<ul>
<li><p><strong>Reverse Proxy:</strong> Routes traffic from a <strong>single domain</strong> to multiple internal services.</p>
</li>
<li><p><strong>SSL Termination:</strong> Handles <strong>HTTPS encryption</strong> via <strong>Certbot and Let’s Encrypt</strong>.</p>
</li>
<li><p><strong>Performance Boost:</strong> Enables <strong>caching</strong> and optimizes request handling for faster responses.</p>
</li>
<li><p><strong>Security:</strong> Hides internal containers from direct exposure, securing ports like <strong>8080, 3000, and 27017</strong>.</p>
</li>
</ul>
<h2 id="heading-nginx-docker-configuration">NGINX Docker Configuration</h2>
<p>Here’s the <strong>Docker Compose</strong> configuration for NGINX:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># NGINX Reverse Proxy</span>
  <span class="hljs-attr">nginx:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">nginx:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"80:80"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"443:443"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/nginx.conf:/etc/nginx/nginx.conf:ro</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/certs:/etc/letsencrypt:ro</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">nginx</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">chat-ui</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">text-generation</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>
</code></pre>
<h2 id="heading-key-configuration-choices-1">Key Configuration Choices</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Parameter</strong></td><td><strong>Reasoning</strong></td></tr>
</thead>
<tbody>
<tr>
<td>80:80, 443:443</td><td>Exposes NGINX on standard HTTP/HTTPS ports.</td></tr>
<tr>
<td>./nginx/nginx.conf:/etc/nginx/nginx.conf:ro</td><td>Mounts a custom NGINX config for routing requests.</td></tr>
<tr>
<td>./nginx/certs:/etc/letsencrypt:ro</td><td>Ensures SSL certificates from Certbot are used.</td></tr>
<tr>
<td>depends_on</td><td>Ensures ChatUI and vLLM start before NGINX.</td></tr>
</tbody>
</table>
</div><h2 id="heading-the-nginx-configuration-nginxconf">The NGINX Configuration (nginx.conf)</h2>
<p>Navigate to the NGINX folder to create the configuration file:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> ~/working/nginx
nano nginx.conf
</code></pre>
<p>Paste the following configuration:</p>
<pre><code class="lang-bash">events {
    worker_connections 1024;
}

http {
    server_tokens off;
    charset utf-8;

    <span class="hljs-comment"># General HTTP server for nginx status or redirects</span>
    server {
        listen 80 default_server;
        listen [::]:80 default_server;

        location /nginx_status {
            stub_status on;
            allow 127.0.0.1; <span class="hljs-comment"># Allow only localhost to access this</span>
            deny all;
        }
    }

    <span class="hljs-comment"># Frontend (Chat UI)</span>
    server {
        listen 443 ssl http2;
        listen [::]:443 ssl http2;
        server_name &lt;YOURDOMAINNAME&gt;;

        ssl_certificate /etc/letsencrypt/live/&lt;YOURDOMAINNAME&gt;/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/&lt;YOURDOMAINNAME&gt;/privkey.pem;

        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_prefer_server_ciphers on;
        ssl_ciphers HIGH:!aNULL:!MD5;

        client_max_body_size 15G;

        location / {
            proxy_pass http://chat-ui:3000;
            proxy_set_header Host <span class="hljs-variable">$host</span>;
            proxy_set_header X-Real-IP <span class="hljs-variable">$remote_addr</span>;
            proxy_set_header X-Forwarded-For <span class="hljs-variable">$proxy_add_x_forwarded_for</span>;
            proxy_set_header X-Forwarded-Proto <span class="hljs-variable">$scheme</span>;

            <span class="hljs-comment"># WebSocket support (optional, depending on your frontend)</span>
            proxy_http_version 1.1;
            proxy_set_header Upgrade <span class="hljs-variable">$http_upgrade</span>;
            proxy_set_header Connection <span class="hljs-string">"upgrade"</span>;
        }
    }

    <span class="hljs-comment"># Backend (Text Generation API)</span>
    server {
        listen 443 ssl http2;
        listen [::]:443 ssl http2;
        server_name api.&lt;YOURDOMAINNAME&gt;;

        ssl_certificate /etc/letsencrypt/live/&lt;YOURDOMAINNAME&gt;/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/&lt;YOURDOMAINNAME&gt;/privkey.pem;

        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_prefer_server_ciphers on;
        ssl_ciphers HIGH:!aNULL:!MD5;

        client_max_body_size 15G;

        location / {
            proxy_pass http://text-generation:8000/v1/;
            proxy_set_header Host <span class="hljs-variable">$host</span>;
            proxy_set_header X-Real-IP <span class="hljs-variable">$remote_addr</span>;
            proxy_set_header X-Forwarded-For <span class="hljs-variable">$proxy_add_x_forwarded_for</span>;
            proxy_set_header X-Forwarded-Proto <span class="hljs-variable">$scheme</span>;
        }
    }

    <span class="hljs-comment"># HTTP Redirect (Redirect HTTP to HTTPS)</span>
    server {
        listen 80;
        listen [::]:80;
        server_name &lt;YOURDOMAINNAME&gt; api.&lt;YOURDOMAINNAME&gt;;

        <span class="hljs-built_in">return</span> 301 https://<span class="hljs-variable">$host</span><span class="hljs-variable">$request_uri</span>;
    }
}
</code></pre>
<h2 id="heading-what-this-configuration-does">What this configuration does</h2>
<ul>
<li><p><strong>Redirects HTTP to HTTPS</strong> (forces encrypted connections).</p>
</li>
<li><p><strong>Routes &lt;YOURDOMAINNAME&gt; to ChatUI</strong>.</p>
</li>
<li><p><strong>Proxies API calls to vLLM</strong> (/v1/ endpoint).</p>
</li>
<li><p><strong>Handles SSL Certificates</strong> via Certbot.</p>
</li>
<li><p><strong>Supports HTTP/2</strong> for better performance.</p>
</li>
</ul>
<h2 id="heading-what-you-need-to-modify">⚠️ What You Need to Modify</h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Setting</strong></td><td><strong>Update This</strong></td></tr>
</thead>
<tbody>
<tr>
<td>&lt;YOURDOMAINNAME&gt;</td><td>Replace with your actual domain name (e.g., chat.example.com)</td></tr>
<tr>
<td>SSL Certificate Paths</td><td>Ensure Certbot paths match your domain.</td></tr>
<tr>
<td>proxy_pass http://chat-ui:3000;</td><td>Update if ChatUI uses a different container name or port.</td></tr>
<tr>
<td>proxy_pass http://text-generation:8000/v1/;</td><td>Modify if vLLM’s container name or port changes.</td></tr>
</tbody>
</table>
</div><hr />
<h1 id="heading-troubleshooting-nginx">Troubleshooting NGINX</h1>
<h2 id="heading-1-check-if-nginx-is-running">1️⃣ Check if NGINX is running</h2>
<pre><code class="lang-bash">sudo docker ps | grep nginx
</code></pre>
<p>If NGINX isn’t running or keeps restarting:</p>
<pre><code class="lang-bash">sudo docker logs nginx
</code></pre>
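<p>If the logs point at the configuration itself, NGINX has a built-in syntax check (assuming the container stays up long enough to exec into it):</p>
<pre><code class="lang-bash">sudo docker exec -it nginx nginx -t
</code></pre>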
<h2 id="heading-2-test-chatui-access-via-nginx">2️⃣ Test ChatUI access via NGINX</h2>
<p>From the Docker host, verify that ChatUI is accessible through NGINX:</p>
<pre><code class="lang-bash">curl -I https://&lt;YOURDOMAIN&gt;
</code></pre>
<h3 id="heading-expected-output-3"><strong>✅ Expected output</strong></h3>
<pre><code class="lang-bash">HTTP/2 200 
server: nginx
date: <span class="hljs-string">""</span>
content-type: text/html; charset=utf-8
</code></pre>
<hr />
<h1 id="heading-final-thoughts-on-the-nginx-deployment">Final Thoughts on the NGINX deployment</h1>
<p>With NGINX properly configured:</p>
<ul>
<li><p><strong>ChatUI</strong> is now accessible via HTTPS.</p>
</li>
<li><p><strong>All traffic</strong> is proxied through a single, controlled entry point.</p>
</li>
<li><p><strong>TLS encryption</strong> keeps your data secure while improving performance with HTTP/2.</p>
</li>
</ul>
<p>This setup adds a solid security layer while keeping everything manageable and scalable.</p>
<hr />
<h1 id="heading-certbot-automating-ssl-certificates-for-secure-connections">Certbot: Automating SSL Certificates for Secure Connections</h1>
<p>With <strong>NGINX</strong> acting as the reverse proxy, the next step was securing all connections with <strong>TLS</strong>. Instead of manually managing certificates, I used <strong>Certbot</strong>, an automated tool for obtaining <strong>Let’s Encrypt SSL certificates</strong> and handling renewals.</p>
<p>While many DNS registrars offer APIs to automate Let’s Encrypt certificate creation, the process often requires some tweaking. Fortunately, GitHub is packed with scripts, plugins, and Docker container builds for different registrars.</p>
<p>In my case, I customized a Certbot container to support <strong>Porkbun</strong> (who are awesome, by the way, <a target="_blank" href="https://porkbun.com">https://porkbun.com</a>). It was a straightforward configuration but required some trial and error to get right.</p>
<h2 id="heading-why-use-dns-based-validation"><strong>Why Use DNS-Based Validation?</strong></h2>
<p>I chose <strong>DNS-based validation</strong> over the traditional HTTP method because:</p>
<ul>
<li><p><strong>No Need to Expose Port 80:</strong> I didn’t want to open HTTP ports on my Docker host just for certificate validation.</p>
</li>
<li><p><strong>API Integration:</strong> Let’s Encrypt can communicate directly with my registrar via API, making the process seamless.</p>
</li>
</ul>
<p>If you’re using a different registrar, you’ll need to adjust the process accordingly.</p>
<hr />
<h1 id="heading-certbot-configuration-and-customizations">Certbot configuration and customizations</h1>
<h2 id="heading-1-pull-the-latest-certbot-image">1️⃣ Pull the Latest Certbot Image</h2>
<pre><code class="lang-bash"><span class="hljs-comment"># Pull down the default image for Certbot</span>
sudo docker pull certbot/certbot:latest
</code></pre>
<h2 id="heading-2-set-up-the-working-directory">2️⃣ Set Up the Working Directory</h2>
<pre><code class="lang-bash">mkdir ~/working/certbot-porkbun
mkdir ~/.secrets
</code></pre>
<p>The .secrets folder keeps sensitive API credentials out of your bash history.</p>
<h2 id="heading-3-create-the-custom-dockerfile">3️⃣ Create the Custom Dockerfile</h2>
<p>Navigate to the certbot-porkbun directory and create a <strong>Dockerfile:</strong></p>
<pre><code class="lang-bash"><span class="hljs-built_in">cd</span> ~/working/certbot-porkbun
nano Dockerfile
</code></pre>
<pre><code class="lang-bash">FROM certbot/certbot:latest

<span class="hljs-comment"># Install pip if not already installed</span>
RUN apk add --no-cache python3 py3-pip

<span class="hljs-comment"># Install the certbot_dns_porkbun plugin</span>
RUN pip3 install certbot-dns-porkbun
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text"><strong><em>Don’t forget to create a new API key and enable your domain for API Access (in the case of Porkbun)</em></strong></div>
</div>

<h2 id="heading-4-add-your-api-credentials">4️⃣ Add Your API Credentials</h2>
<pre><code class="lang-bash">nano ~/.secrets/porkbun.ini
</code></pre>
<p>Insert your API key and secret:</p>
<pre><code class="lang-bash">dns_porkbun_key=&lt;keyid&gt;
dns_porkbun_secret=&lt;secretid&gt;
</code></pre>
<p>Ensure the file has secure permissions:</p>
<pre><code class="lang-bash">chmod 600 ~/.secrets/porkbun.ini
</code></pre>
<h2 id="heading-5-build-the-custom-certbot-image">5️⃣ Build the Custom Certbot Image</h2>
<pre><code class="lang-bash">docker build -t certbot-porkbun .
</code></pre>
<h2 id="heading-6-request-your-ssl-certificate">6️⃣ Request your SSL Certificate</h2>
<p>Run the following command to spin up the new container image and generate your certificate:</p>
<pre><code class="lang-bash">sudo docker run --rm -it \
    -v ~/working/nginx/certs:/etc/letsencrypt \
    -v ~/.secrets/porkbun.ini:/etc/letsencrypt/porkbun.ini \
    certbot-porkbun certonly \
    --non-interactive \
    --agree-tos \
    --email your@email.com \
    --preferred-challenges dns \
    --authenticator dns-porkbun \
    --dns-porkbun-credentials /etc/letsencrypt/porkbun.ini \
    --dns-porkbun-propagation-seconds 600 \
    -d &lt;YOURDOMAINNAME&gt; \
    -d api.&lt;YOURDOMAINNAME&gt;
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Porkbun’s default TTL is 600 seconds, which can slow down propagation. You <em>might</em> get away with 60 seconds, but Certbot will likely throw a warning.</div>
</div>
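<p>If the run succeeds, the certificates should land in the mounted volume - a quick check from the Docker host:</p>
<pre><code class="lang-bash">sudo ls -l ~/working/nginx/certs/live/&lt;YOURDOMAINNAME&gt;/
# Expect symlinks for cert.pem, chain.pem, fullchain.pem and privkey.pem
</code></pre>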

<h2 id="heading-7-automate-certificate-renewal-with-docker-compose">7️⃣ Automate Certificate Renewal with Docker Compose</h2>
<p>Add this to your docker-compose.yml file:</p>
<pre><code class="lang-bash">  <span class="hljs-comment"># Certbot for SSL Certificates</span>
  certbot:
    image: certbot-porkbun
    volumes:
      - ./nginx/certs:/etc/letsencrypt
    entrypoint: /bin/sh -c <span class="hljs-string">"trap exit TERM; while :; do sleep 6h &amp; wait $<span class="hljs-variable">${!}</span>; certbot renew; done"</span>
    container_name: certbot-porkbun
    restart: unless-stopped
</code></pre>
<ul>
<li><p>The container checks for certificate renewals every <strong>6 hours</strong>.</p>
</li>
<li><p>The certbot renew command ensures certificates stay up-to-date without manual intervention.</p>
</li>
</ul>
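<p>One caveat worth knowing: NGINX only reads certificates at startup, so a renewed certificate isn’t picked up until NGINX reloads. A simple way to handle that after a renewal (this could also be wired into a host-side cron job):</p>
<pre><code class="lang-bash">sudo docker exec nginx nginx -s reload
</code></pre>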
<hr />
<h1 id="heading-thoughts-on-the-certbot-deployment">Thoughts on the Certbot Deployment</h1>
<p>With the Certbot container properly configured:</p>
<ul>
<li><p><strong>SSL/TLS certificates</strong> are issued and renewed automatically.</p>
</li>
<li><p><strong>ChatUI</strong> and <strong>vLLM</strong> are securely accessible via HTTPS.</p>
</li>
<li><p>All data in transit is <strong>encrypted</strong> for end-to-end security.</p>
</li>
</ul>
<p>This setup makes SSL management effortless; I don’t have to think about manual renewals.</p>
<hr />
<h1 id="heading-full-docker-composeyml-bringing-everything-together">Full docker-compose.yml: Bringing Everything Together</h1>
<p>After configuring <strong>vLLM</strong>, <strong>ChatUI</strong>, <strong>MongoDB</strong>, <strong>NGINX</strong>, and <strong>Certbot</strong>, it’s time to bring everything together into a unified docker-compose.yml file. This file defines the entire deployment, ensuring all services work seamlessly.</p>
<p>If you’ve followed the prerequisites and configuration instructions for <strong>NGINX</strong>, <strong>ChatUI</strong>, and <strong>Certbot</strong>, you should be able to copy this file into your working directory, run one command, and spin up your own chatbot. <em>(Notice the networks section at the end, this is important for container communication.)</em></p>
<hr />
<h2 id="heading-final-docker-composeyml-configuration">Final docker-compose.yml configuration</h2>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-comment"># vLLM Backend</span>
  <span class="hljs-attr">text-generation:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">vllm/vllm-openai:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"8080:8000"</span>  <span class="hljs-comment"># vLLM API Server</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./Phi-3-mini-4k-instruct:/data/model/Phi-3-mini-4k-instruct</span> 
    <span class="hljs-attr">command:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--model"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"/data/model/Phi-3-mini-4k-instruct"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--dtype"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"bfloat16"</span> 
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--tensor-parallel-size"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"1"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--gpu-memory-utilization"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"0.9"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--max-model-len"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"3264"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--max-num-batched-tokens"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"3264"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"--trust-remote-code"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">NVIDIA_VISIBLE_DEVICES:</span> <span class="hljs-string">all</span>
    <span class="hljs-attr">deploy:</span>
      <span class="hljs-attr">resources:</span>
        <span class="hljs-attr">reservations:</span>
          <span class="hljs-attr">devices:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">driver:</span> <span class="hljs-string">nvidia</span>
              <span class="hljs-attr">count:</span> <span class="hljs-string">all</span>
              <span class="hljs-attr">capabilities:</span> [<span class="hljs-string">gpu</span>]
    <span class="hljs-attr">runtime:</span> <span class="hljs-string">nvidia</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">text-generation</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>

  <span class="hljs-comment"># The frontend</span>
  <span class="hljs-attr">chat-ui:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">ghcr.io/huggingface/chat-ui:latest</span> <span class="hljs-comment"># Chat UI container</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"3000:3000"</span> <span class="hljs-comment"># Expose Chat UI on port 3000</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">MONGODB_URL:</span> <span class="hljs-string">mongodb://mongo-chatui:27017</span> <span class="hljs-comment"># MongoDB URL for frontend</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./chat-ui/.env.local:/app/.env.local</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">chatui</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>

  <span class="hljs-comment"># MongoDB for storing history/context</span>
  <span class="hljs-attr">mongo-chatui:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">mongo:latest</span> <span class="hljs-comment"># MongoDB container</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"27017:27017"</span> <span class="hljs-comment"># Expose MongoDB on the default port</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./mongo-data:/data/db</span> <span class="hljs-comment"># Persist MongoDB data</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">mongo-chatui</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>

  <span class="hljs-comment"># NGINX Reverse Proxy</span>
  <span class="hljs-attr">nginx:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">nginx:latest</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"80:80"</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"443:443"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/nginx.conf:/etc/nginx/nginx.conf:ro</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/certs:/etc/letsencrypt:ro</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">nginx</span>
    <span class="hljs-attr">depends_on:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">chat-ui</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">text-generation</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">always</span>

  <span class="hljs-comment"># Certbot for SSL Certificates</span>
  <span class="hljs-attr">certbot:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">certbot-porkbun</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">./nginx/certs:/etc/letsencrypt</span>
    <span class="hljs-attr">entrypoint:</span> <span class="hljs-string">/bin/sh</span> <span class="hljs-string">-c</span> <span class="hljs-string">"trap exit TERM; while :; do sleep 6h &amp; wait $${!}; certbot renew; done"</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">certbot-porkbun</span>
    <span class="hljs-attr">restart:</span> <span class="hljs-string">unless-stopped</span>

<span class="hljs-attr">networks:</span>
  <span class="hljs-attr">chat-network:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">chat-network</span>
    <span class="hljs-attr">driver:</span> <span class="hljs-string">bridge</span>
</code></pre>
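<p>Before bringing anything up, it’s worth letting Compose validate the file and print the fully resolved configuration; this catches indentation and mapping mistakes early:</p>
<pre><code class="lang-bash"># Validate docker-compose.yml and print the resolved config (fails loudly on syntax errors)
sudo docker-compose config
</code></pre>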
<hr />
<h1 id="heading-how-to-deploy-everything">How to deploy everything</h1>
<p>Once the docker-compose.yml file is ready, deployment is as simple as:</p>
<pre><code class="lang-bash">sudo docker-compose up -d
</code></pre>
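<p>The first startup can take a while, since vLLM has to load the model weights into GPU memory. Tailing the backend’s logs is the easiest way to watch its progress:</p>
<pre><code class="lang-bash"># Follow the vLLM container logs until you see the API server come up
sudo docker-compose logs -f text-generation
</code></pre>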
<h2 id="heading-verifying-deployment"><strong>✅ Verifying Deployment</strong></h2>
<p>To check if everything’s running smoothly:</p>
<pre><code class="lang-bash">sudo docker ps
</code></pre>
<ul>
<li><p>All containers should show a similar uptime.</p>
</li>
<li><p>If a container is restarting, check its logs:</p>
</li>
</ul>
<pre><code class="lang-bash">sudo docker logs &lt;container_name&gt;
</code></pre>
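<p>If you want to confirm the backend independently of the UI, a couple of quick cURL calls against vLLM’s OpenAI-compatible API make a good sanity check (using the host port mapping and model path from the compose file above):</p>
<pre><code class="lang-bash"># List the model(s) vLLM has loaded (host port 8080 maps to 8000 in the container)
curl http://localhost:8080/v1/models

# Send a tiny completion request to confirm inference works end to end
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "/data/model/Phi-3-mini-4k-instruct", "prompt": "Hello", "max_tokens": 16}'
</code></pre>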
<p>You should now be able to access ChatUI and start chatting with your model!</p>
<p>Example: I think Phi-3 is happy to be featured in this blog post.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738361805650/3c3639a5-81fe-452b-a358-2cbca4c6f72b.png" alt class="image--center mx-auto" /></p>
<hr />
<h1 id="heading-lessons-learned">Lessons learned</h1>
<ol>
<li><p><strong>Use a dedicated docker network</strong></p>
<ul>
<li><p>Avoids conflicts between containers.</p>
</li>
<li><p>Prevents duplicate networks causing routing issues.</p>
</li>
</ul>
</li>
<li><p><strong>Double-check environment variables</strong></p>
<ul>
<li>Ensure the right endpoints are set.</li>
</ul>
</li>
<li><p><strong>Verify connectivity between containers</strong></p>
<ul>
<li>Running cURL inside the containers helped troubleshoot API connection issues (see the sketch after this list).</li>
</ul>
</li>
<li><p><strong>Monitor logs for debugging</strong></p>
<ul>
<li>Don’t be afraid to get your hands dirty and check out the container logs.</li>
</ul>
</li>
</ol>
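<p>Here’s the kind of cURL test I mean, as a minimal sketch using the service and network names from the compose file above (the frontend image may not ship with cURL, hence the second variant):</p>
<pre><code class="lang-bash"># Exec into the frontend container and hit the vLLM API by service name
# (inside the Docker network, vLLM listens on 8000, not the published 8080)
sudo docker exec chatui curl -s http://text-generation:8000/v1/models

# If the image doesn't include curl, run a throwaway container on the same network instead
sudo docker run --rm --network chat-network curlimages/curl -s http://text-generation:8000/v1/models
</code></pre>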
<hr />
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<p>This self-hosted AI assistant should now be fully operational with:</p>
<ul>
<li><p>✅ A local LLM inference server</p>
</li>
<li><p>✅ A secure web interface</p>
</li>
<li><p>✅ Full HTTPS support</p>
</li>
<li><p>✅ Persistent chat history</p>
</li>
</ul>
<p>The deployment is scalable, secure, and future-ready.</p>
<hr />
<h1 id="heading-wrapping-it-up-the-journey-the-lessons-and-whats-next">Wrapping It Up: The Journey, The Lessons, and What’s Next!</h1>
<p>What started as a simple curiosity project quickly turned into a deep dive into self-hosted AI, containerized deployments, and debugging Docker networking nightmares. Along the way, I learned a lot. Some of it the hard way.</p>
<p>At the end of the day, this setup works. It’s fast, secure, and completely under my control. No API costs, no privacy concerns, just a local AI assistant running on my own hardware.</p>
<h2 id="heading-lessons-that-stuck-with-me"><strong>Lessons That Stuck With Me</strong></h2>
<ul>
<li><p><strong>Nothing is truly plug-and-play</strong> – Even with solid documentation, real-world deployments always require tweaks, troubleshooting, and some extensive searching.</p>
</li>
<li><p><strong>Container networking can be a pain</strong> – Debugging why one container can’t talk to another inside Docker’s virtual network was probably the biggest headache. Using a dedicated network and running cURL tests from within containers made all the difference.</p>
</li>
<li><p><strong>Self-hosting an LLM is totally doable</strong> – Even on a consumer-grade GPU like the RTX 3080, inference speeds are surprisingly snappy with Phi-3 and vLLM.</p>
</li>
<li><p><strong>Certbot with DNS validation is the way to go</strong> – No need to expose port 80, no manual renewal, just hands-off, automated SSL certs.</p>
</li>
<li><p><strong>Future-proofing matters</strong> – I designed this setup to be scalable, allowing for multiple models, additional tools, and even more powerful hardware down the line.</p>
</li>
</ul>
<h2 id="heading-whats-next"><strong>What’s Next?</strong></h2>
<p>This was only the beginning. Some of the things I’d love to explore next include:</p>
<ul>
<li><p><strong>Multi-Model Support</strong> – Deploying multiple LLMs and being able to switch between them on the fly in ChatUI.</p>
</li>
<li><p><strong>Integrating RAG (Retrieval-Augmented Generation)</strong> – Adding a local document index so the model can search and reference private documents.</p>
</li>
<li><p><strong>Expanding the Infrastructure</strong> – Maybe a multi-GPU setup or upgrading to a 4090 to push inference speeds even further.</p>
</li>
<li><p><strong>Web-Connected AI</strong> – Implementing a secure web search tool to allow the model to fetch real-time information.</p>
</li>
</ul>
<p>Self-hosting AI is a game-changer. The more control we have over our own tools, the better.</p>
<h2 id="heading-want-to-try-this-yourself"><strong>Want to Try This Yourself?</strong></h2>
<p>Everything in this article is reproducible, and I’d love to hear how it works for you! If you have any questions, improvements, or want to share your own builds, let’s connect. Drop a comment, or hit me up on <a target="_blank" href="https://www.linkedin.com/in/brianbaldock">LinkedIn</a>!</p>
<p>Here’s to building cool things, breaking them, fixing them again, and learning along the way!</p>
]]></content:encoded></item><item><title><![CDATA[Making IT Simpler]]></title><description><![CDATA[Check out this quick 2-minute video showcasing the Setup Expert on setup.cloud.microsoft. If you’re an IT admin or business owner looking to streamline your Microsoft 365 onboarding and configuration, you need to see this!This tool is a game-changer ...]]></description><link>https://blog.brianbaldock.net/making-it-simpler</link><guid isPermaLink="true">https://blog.brianbaldock.net/making-it-simpler</guid><category><![CDATA[setupexpert]]></category><category><![CDATA[setup.cloud.microsoft]]></category><category><![CDATA[microsoft 365 setup]]></category><category><![CDATA[Microsoft365]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Sat, 18 Jan 2025 15:32:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1737214252674/2d6d8414-173f-4f21-a5fd-2d4d2871ec04.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Check out this quick 2-minute video showcasing the Setup Expert on <a target="_blank" href="https://setup.cloud.microsoft/">setup.cloud.microsoft</a>. If you’re an IT admin or business owner looking to streamline your Microsoft 365 onboarding and configuration, you need to see this!<br />This tool is a game-changer for setting up secure and efficient environments in no time. Whether you’re new to M365 or a seasoned pro, the Setup Expert AI is an epic resource to complement your expertise.<br />Watch the video on LinkedIn to see how it works and let me know what you think! 👇</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.linkedin.com/feed/update/urn:li:activity:7286153240411295744/">https://www.linkedin.com/feed/update/urn:li:activity:7286153240411295744/</a></div>
]]></content:encoded></item><item><title><![CDATA[How a User Surge Broke Fabrikam's 'All Company' Group]]></title><description><![CDATA[When it comes to multi-tenant organizations, you think you’ve seen it all, until a new twist reminds you that there is always something complex beneath the surface. Recently, I encountered a fascinating issue that perfectly illustrates this point. Le...]]></description><link>https://blog.brianbaldock.net/fabrikam-allcompany-broke</link><guid isPermaLink="true">https://blog.brianbaldock.net/fabrikam-allcompany-broke</guid><category><![CDATA[All Company group]]></category><category><![CDATA[Microsoft 365 Multi-Tenant Organization]]></category><category><![CDATA[Microsoft Cross-Tenant Sync]]></category><category><![CDATA[M365 Cross-Tenant Sync]]></category><category><![CDATA[Entra ID Cross-Tenant Sync]]></category><category><![CDATA[Cross-Tenant Sync]]></category><category><![CDATA[microsoft-entra-id]]></category><category><![CDATA[Entra ID]]></category><category><![CDATA[M365 Admin]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Tue, 08 Oct 2024 04:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728483588754/03a8de9b-52b8-4366-a0ed-ac926b984d20.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When it comes to multi-tenant organizations, you think you’ve seen it all, until a new twist reminds you that there is always something complex beneath the surface. Recently, I encountered a fascinating issue that perfectly illustrates this point. Let’s dive into the adventure involving our two organizations: Contoso (the giant corp with 30,000 users) and Fabrikam (the nimble subsidiary with 3,000 users).</p>
<h2 id="heading-the-grand-setup">The grand setup</h2>
<p>On July 31st, Contoso’s users were initially synced to Fabrikam via Microsoft’s Multi-Tenant Organization (MTO) and Cross-Tenant Sync (CTS). Everything was humming along nicely; users appeared where they should.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728477048734/f7ebee70-a5f2-47dd-ab7b-9d70a78ce0d1.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-the-unexpected-twist">The unexpected twist</h2>
<p>The influx of over 10,000 users from Contoso into Fabrikam caused Fabrikam’s org-wide “All Company” team to hit its membership limit (an All Company team is created by default if your tenant has fewer than 5,000 users). Remember, org-wide teams max out at a total of 10,000 users (as of 2024-10) <a target="_blank" href="https://learn.microsoft.com/en-us/microsoftteams/create-an-org-wide-team#:~:text=Organization%2Dwide%20teams%20are%20limited%20to%20organizations%20with%20no%20more%20than%2010%2C000%20users.%20You%20can%20have%20up%20to%20five%20organization%2Dwide%20teams">[Reference]</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728477268647/f2b28f5a-f2f9-47b6-b62c-1b33693eaf83.png" alt class="image--center mx-auto" /></p>
<p>Fabrikam hadn’t anticipated the sheer volume of users syncing over, and the automatic group membership expansion threw a wrench into their communications and permissions setup. When an org-wide team expands beyond 10,000 users, it is automatically converted to a public group (a team is essentially a front end for its underlying Microsoft 365 group).</p>
<h2 id="heading-the-quick-fix-or-so-they-thought">The quick fix - Or so they thought</h2>
<p>Realizing the issue, Fabrikam acted swiftly. On August 4th, four days after the initial sync, they applied a dynamic membership rule to remove the Contoso members from the now-converted org-wide group. Crisis averted, or so it seemed.</p>
<pre><code class="lang-plaintext">Dynamic Rule Example:
user.userPrincipalName -notContains "_contoso.com#EXT#@fabrikam.com" and user.UserType -eq "Member"
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728484739629/74ebabb8-2786-4c4d-ab6b-b3d6048e382e.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-the-hidden-ripple-effect">The hidden ripple effect</h2>
<p>This is where things got interesting. Between the initial sync on July 31st and the cleanup on August 4th, Contoso had disabled some users (as a large organization, turnover is high, so not a rare scenario). When an external tenant disables (soft deletes) a user in their own tenant, the corresponding synced user in the partner tenant (in this case Fabrikam) is put into a soft-deleted state. After August 4th, some of these users were re-enabled on Contoso’s side (again, large org). Remember, before the group was cleaned up, these users were members of the org-wide team (now converted to a public group), <em>and soft-deleted users</em> <strong><em>will not display as members of a team.</em></strong> So, due to the restore process in cross-tenant sync, the “removed” users were effectively resurrected in Fabrikam and, unexpectedly but logically, re-added as members of the “All Company” group/team, even though Fabrikam had previously cleaned up the membership.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728478058413/952e3893-d9de-460e-b38e-439d244405b7.png" alt class="image--center mx-auto" /></p>
<p>This process meant that every time a user was disabled and then re-enabled on Contoso’s side, they could potentially pop up in Fabrikam’s “All Company” team/group, adding to communication compliance concerns. After some investigative work we pinpointed a robust solution, and it’s surprisingly easy: just permanently delete any soft-deleted users in the partner tenant. In this example, the fix was having Fabrikam delete all soft-deleted users in their tenant that contain <strong>_contoso.com#EXT#@fabrikam.com</strong> in their UPN. This ensures that if a deleted user in Contoso’s tenant gets re-enabled, the cross-tenant sync process will recreate the user with a new ObjectID, ignoring any previous group memberships. This prevents the user from being “restored” to the “All Company” group/team.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728478394333/c4166405-da6f-4aa3-b3d7-7fcf9f4d6ded.png" alt class="image--center mx-auto" /></p>
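<p>If you want to script that cleanup, here’s a rough sketch against Microsoft Graph. Hedged heavily: it assumes a <code>$TOKEN</code> variable holding an access token with <code>User.ReadWrite.All</code>, <code>jq</code> installed on the box, and it doesn’t handle paging. Hard deletes cannot be undone, so dry-run the filter first.</p>
<pre><code class="lang-bash"># List soft-deleted users, keep only the synced Contoso accounts, then permanently delete them.
# DELETE on a /directory/deletedItems entry is a permanent (hard) delete.
# Paging is not handled here; large tenants will need to follow @odata.nextLink.
curl -s -H "Authorization: Bearer $TOKEN" \
  "https://graph.microsoft.com/v1.0/directory/deletedItems/microsoft.graph.user?\$select=id,userPrincipalName" \
  | jq -r '.value[] | select(.userPrincipalName | contains("_contoso.com#EXT#@fabrikam.com")) | .id' \
  | while read -r id; do
      curl -s -X DELETE -H "Authorization: Bearer $TOKEN" \
        "https://graph.microsoft.com/v1.0/directory/deletedItems/$id"
    done
</code></pre>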
<p>Hopefully you do not run into a similar situation in your organization(s), but if you do, and you’re seeing similar oddities, double-check your soft-deleted users and do a cleanup. Complex systems can sometimes have unintended consequences.</p>
]]></content:encoded></item><item><title><![CDATA[Avoid These 10 Cybersecurity Blunders]]></title><description><![CDATA[In today's digital world, small and medium businesses often think they're too small to be targets of cybercrime, but that's a dangerous myth. The truth is that cybercriminals see smaller businesses as easy prey because they often lack the robust defe...]]></description><link>https://blog.brianbaldock.net/top-10-cybersecurity-blunders</link><guid isPermaLink="true">https://blog.brianbaldock.net/top-10-cybersecurity-blunders</guid><category><![CDATA[security blunders]]></category><category><![CDATA[top 10 cybersecurity blunders]]></category><category><![CDATA[#cybersecurity]]></category><category><![CDATA[top10]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Wed, 07 Aug 2024 05:07:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1722657219783/e42ee7bb-8424-4b5f-adc6-d5d056f90122.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today's digital world, small and medium businesses often think they're too small to be targets of cybercrime, but that's a dangerous myth. The truth is that cybercriminals see smaller businesses as easy prey because they often lack the robust defenses of larger enterprises. With ransomware attacks up by over 200% and phishing scams becoming more sophisticated [<a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023</code></a>], it's never been more critical to get your cybersecurity game on point. Below is a list of common mistakes many businesses make and how you can avoid them to keep your own business safe.</p>
<h2 id="heading-tldr">TLDR</h2>
<ol>
<li><p>Train your employees on cybersecurity basics</p>
</li>
<li><p>Enforce strong password policies and go passwordless when able</p>
</li>
<li><p>Do not ignore software updates; rip that band-aid off as soon as possible</p>
</li>
<li><p>Test your backups; have backups if you don't already, but test them</p>
</li>
<li><p>Don't forget about mobile devices; they need protection too</p>
</li>
<li><p>Beef up your network security: segment, segment, segment</p>
</li>
<li><p>Monitor security alerts and make sure you stay on top of them</p>
</li>
<li><p>Upgrade your software; bite the bullet and get it done, because maintaining the old will hurt you in the long run</p>
</li>
<li><p>Lock your doors and windows; that server running in the closet where people put their winter boots is not a good idea, especially if it's running your crown jewels</p>
</li>
<li><p>Have plans for any incidents and make sure your employees and teams know what to do when something bad happens</p>
</li>
</ol>
<p>If you're worried about the cost, ask yourself this: would you rather invest in these preventive measures now or face the potentially crippling expenses of a cyber breach? Imagine losing everything and being unable to restore production because your backups failed, and no one knows how to respond. The initial cost of proper security measures pales in comparison to the financial and reputational damage of a full-scale breach.</p>
<h2 id="heading-common-cybersecurity-mistakes-and-solutions">Common Cybersecurity Mistakes and Solutions</h2>
<h3 id="heading-1-lack-of-employee-training">1) Lack of Employee Training</h3>
<ul>
<li><p><strong>Problem</strong>: Sure, we can throw tech solutions at problems all day, but let's be real—teaching humans is the hard part. Cybercriminals know this and target people as the primary entry point for attacks. Social engineering tactics like phishing, smishing, and vishing are getting more sophisticated, making them the top human risks.</p>
</li>
<li><p><strong>Solution</strong>: You need a solid security awareness program that's more than just a checkbox exercise for compliance. This means regular, engaging training sessions that go beyond the basics and keep employees up to date on the latest threats. The goal is to create a security-conscious culture where everyone knows how to spot and avoid these scams. It's not just about a yearly meeting; it’s about continuous learning and behavior change.</p>
</li>
<li><p>Check out this article I wrote that talks about options for end user cybersecurity training: <a target="_blank" href="https://blog.brianbaldock.net/cybersecurity-training-programs-empower-your-employees"><code>Cybersecurity Training Programs: Empower Your Employee</code></a></p>
</li>
</ul>
<hr />
<h3 id="heading-2-weak-password-policies"><strong>2) Weak Password Policies</strong></h3>
<ul>
<li><p><strong>Problem</strong>: Weak or reused passwords are still a huge problem for many businesses. Even with all the tech advancements, a lot of organizations stick with outdated password habits, making them easy targets. The <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023 (MDDR)</code></a> points out that <strong>password-related breaches are still a major issue</strong>, with cybercriminals taking advantage of simple or compromised passwords to break in. The <a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/"><code>SANS Security Awareness Report 2024</code></a> also highlights that human errors, like poor password practices, are a big risk factor.</p>
</li>
<li><p><strong>Solution:</strong> To up your security game, start by implementing Multi-Factor Authentication. It's a game-changer that adds an extra layer of protection and can block over 99% of account compromise attacks. Next, use a password manager like <a target="_blank" href="https://bitwarden.com/"><code>Bitwarden</code></a> or <a target="_blank" href="https://1password.com/"><code>1Password</code></a> to generate and securely store strong, unique passwords, helping you avoid the bad habit of reusing passwords. Get an Enterprise version if you have a team so that you can collaborate and share access. Finally, make sure your team knows why strong passwords matter (refer to point 1) and enforce policies requiring complex passwords and regular updates. These steps are crucial for preventing unauthorized access and strengthening your overall security.</p>
</li>
</ul>
<hr />
<h3 id="heading-3-ignoring-software-updates"><strong>3) Ignoring Software Updates</strong></h3>
<ul>
<li><p><strong>Problem:</strong> Skipping software updates is like leaving the front door unlocked. It’s an open invitation for cybercriminals to exploit known vulnerabilities. When you don't update your systems, you're missing out on critical security patches that protect against threats. Cyber attackers are constantly evolving, and outdated software is a prime target for their exploits. The <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023</code></a> emphasizes that unpatched vulnerabilities are one of the most common ways systems get compromised​. Failing to update can lead to serious consequences, from data breaches to ransomware attacks, costing your business time, money, and reputation.</p>
</li>
<li><p><strong>Solution:</strong> Keep your systems and software up to date, <strong>plain and simple</strong>. This means enabling automatic updates wherever possible and regularly checking for updates on systems that require manual intervention. Make sure you're leveraging advanced security features like <a target="_blank" href="https://www.microsoft.com/en-us/security/blog/2022/09/20/new-windows-11-security-features-are-designed-for-hybrid-work/?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Hypervisor-Protected Code Integrity (HVCI)</code></a>, which helps protect against malware, and <a target="_blank" href="https://support.microsoft.com/en-us/topic/what-is-smart-app-control-285ea03d-fa88-4d56-882e-6698afdb7003"><code>Smart App Control</code></a>, which blocks unauthorized applications. These features are part of the latest Windows updates and offer an extra layer of security to prevent unauthorized access and potential threats. Staying current with updates isn't just about getting the latest features—it's a crucial step in defending your business against cyber threats. For Linux servers, see the sketch after this list.</p>
</li>
</ul>
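<p>As one concrete example of “enable automatic updates wherever possible”, here’s a minimal sketch for Debian/Ubuntu servers; this is an illustration, not the only mechanism, and on Windows you’d reach for Windows Update for Business policies instead:</p>
<pre><code class="lang-bash"># Install and enable unattended security updates on Debian/Ubuntu
sudo apt-get update
sudo apt-get install -y unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades  # writes /etc/apt/apt.conf.d/20auto-upgrades
</code></pre>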
<hr />
<h3 id="heading-4-insufficient-and-untested-data-backup"><strong>4) Insufficient and Untested Data Backup</strong></h3>
<ul>
<li><p><strong>Problem:</strong> Not having a solid backup plan is like playing with fire. If your data isn't properly backed up, you're just one cyberattack away from losing everything. The <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023 (MDDR)</code></a> highlights the rise of double extortion tactics, where attackers not only encrypt your data but also threaten to leak it unless you pay up​. It's not just cyberattacks you need to worry about, hardware failures, accidental deletions, and natural disasters can also wipe out your critical information. The <a target="_blank" href="https://techcommunity.microsoft.com/t5/windows-it-pro-blog/windows-resiliency-best-practices-and-the-path-forward/ba-p/4201550"><code>Windows Resiliency Best Practices</code></a> article emphasizes that without a robust backup and recovery strategy, you're leaving yourself wide open to these risks​.</p>
</li>
<li><p><strong>Solution:</strong> To avoid disaster, you need a comprehensive backup strategy that covers all your bases. Start by scheduling regular backups of all critical data, and don't just rely on one type of storage. Use a mix of on-site and off-site solutions, like cloud storage, to protect against physical and cyber threats. Make sure all your backups are encrypted, both when they're stored and when they're being transferred. It's crucial to <strong>regularly</strong> test your backups and recovery processes so you're not caught off guard in an emergency. Make sure your backup strategy is integrated into your <strong>overall</strong> business continuity and disaster recovery plan, <strong>so everyone knows what to do when things go south</strong>. By taking these steps, you'll be better equipped to bounce back quickly from any data loss incident and keep your business running smoothly. A minimal verification sketch follows this list.</p>
</li>
</ul>
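<p>“Test your backups” can start as simply as proving last night’s archive is intact and readable. Here’s a minimal sketch with hypothetical file names; a real test should go further and restore into a scratch environment to confirm the application actually comes back up:</p>
<pre><code class="lang-bash"># Confirm the archive matches its recorded checksum and can be read end to end
sha256sum -c backups/2024-08-01.tar.gz.sha256
tar -tzf backups/2024-08-01.tar.gz > /dev/null &amp;&amp; echo "archive OK"
</code></pre>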
<hr />
<h3 id="heading-5-overlooking-mobile-device-security"><strong>5) Overlooking Mobile Device Security</strong></h3>
<ul>
<li><p><strong>Problem:</strong> In today's work environment, mobile devices are everywhere, making work more convenient but also introducing significant security risks. Many businesses drop the ball on securing these devices, leaving them wide open to malware, data breaches, and unauthorized access. According to the <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023 (MDDR)</code></a>, unmanaged devices are a major target for ransomware and other attacks​​. With employees often using personal devices for work tasks, the risk of data leakage skyrockets, especially if these devices aren't secured properly. <a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/"><code>SANS Security Awareness Report 2024</code></a> also highlights the importance of having comprehensive security policies that cover all devices accessing corporate data, not just the typical desktops and laptops​.</p>
</li>
<li><p><strong>Solution:</strong> To lock down mobile devices, a multi-layered security approach is essential. Start by deploying an MDM solution like <a target="_blank" href="https://learn.microsoft.com/en-us/mem/intune/fundamentals/what-is-intune"><code>Microsoft Intune</code></a> or <a target="_blank" href="https://www.vmware.com/docs/vmware-workspace-one-uslet-web"><code>VMware Workspace One</code></a>, which allows for enforcing security policies, managing app permissions, and controlling access to sensitive company data. Make sure all devices are encrypted to protect data at rest and in transit; this is crucial if a device gets lost or stolen. Keep mobile operating systems and apps <strong>up to date</strong>, as outdated software can become a backdoor for cyberattacks, see point 3. And don't forget to implement strong authentication methods like MFA to prevent unauthorized access.</p>
</li>
</ul>
<hr />
<h3 id="heading-6-inadequate-network-security"><strong>6) Inadequate Network Security</strong></h3>
<ul>
<li><p><strong>Problem:</strong> Network security often gets overlooked, especially in smaller businesses that might lack the resources or expertise to manage a complex setup. This oversight can leave your business wide open to attacks like malware, unauthorized access, and data breaches. The <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023 (MDDR)</code></a> points out that poor network configurations and weak security measures are common ways attackers get in​. Many companies don't segment their networks properly or set up advanced firewalls, which creates security gaps. The <a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/"><code>SANS Security Awareness Report 2024</code></a> also emphasizes the importance of continuous monitoring and a quick response to detect and deal with threats in real-time​.</p>
</li>
<li><p><strong>Solution:</strong> To boost your network security, start by adopting a Zero Trust model. This approach assumes that threats could be lurking both inside and outside your network, so it requires constant verification of user identities and device security. Next, make sure to implement network segmentation. By dividing your network into different segments, you can contain breaches and limit their impact. Advanced firewalls with/and Intrusion Detection Systems or Intrusion Prevention Systems are also key; they help monitor and block malicious traffic. Finally, regularly audit your network configurations and keep them updated to close any security holes.</p>
</li>
</ul>
<div data-node-type="callout">
<div data-node-type="callout-emoji">ℹ</div>
<div data-node-type="callout-text">In a Zero Trust Security setup, the principle of "least privileged access" is crucial. It's not just about who has admin rights; it's also about how your network is configured. When you segment your network, you're essentially limiting each part's access to only what's necessary for it to function properly. This means even within your own systems; you're restricting access to the bare minimum needed to get the job done. By doing this, you reduce the risk of unauthorized access and make it harder for potential attackers to move laterally across your network. It's all about minimizing exposure and ensuring that every component stays in its lane, keeping your overall security posture tight. <a target="_blank" href="https://www.microsoft.com/en-us/security/business/zero-trust"><code>Embrace proactive security with Zero Trust</code></a></div>
</div>

<hr />
<h3 id="heading-7-not-monitoring-security-alerts"><strong>7) Not Monitoring Security Alerts</strong></h3>
<ul>
<li><p><strong>Problem:</strong> A major pitfall for many businesses is neglecting to properly monitor and respond to security alerts. It's not uncommon for alerts to get ignored, delayed, or lost in the noise, especially if the organization lacks a dedicated security team or relies on basic, limited tools. The <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023 (MDDR)</code></a> emphasizes that quick detection and response are crucial for minimizing damage from security incidents. Yet, many SMBs either don't have the resources or the expertise to handle this effectively; consider finding a third party to assist with resourcing requirements, as there are many excellent partners that provide SOC services. The <a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/"><code>SANS Security Awareness Report 2024</code></a> points out that many breaches could have been prevented or at least mitigated if alerts had been handled appropriately and promptly. Ignoring these alerts can lead to extended dwell times, giving attackers more opportunity to cause significant damage.</p>
</li>
<li><p><strong>Solution:</strong> To stay on top of security alerts, businesses need to set up a comprehensive Security Information and Event Management (SIEM) system like Microsoft Sentinel or Splunk. This tool can aggregate and analyze data from multiple sources, giving you a full picture of what's happening in your environment. It's essential to have a dedicated team or at least a designated person responsible for monitoring and responding to these alerts, ensuring that potential threats don't slip through the cracks. Regularly fine-tuning your alert settings is also important to minimize false positives, so your team can focus on real issues. Lastly, integrating automated response mechanisms can help speed up threat containment and mitigation, reducing the time attackers have to cause harm. By implementing these strategies, you can significantly improve your incident response capabilities and better protect your business from potential breaches. Hire a SOC provider to assist where resources are limited.</p>
</li>
</ul>
<hr />
<h3 id="heading-8-using-unsupported-or-outdated-software"><strong>8) Using Unsupported or Outdated Software</strong></h3>
<ul>
<li><p><strong>Problem:</strong> Relying on unsupported or outdated software is like leaving your backdoor unlocked for cybercriminals. Many businesses stick with old systems because they think they "get the job done," but this can be a risky move. Outdated software often lacks essential security updates, making it an easy target for attackers. Without support, software doesn't receive patches for new vulnerabilities, leaving your systems wide open for exploitation. Keeping your software up to date is one of the simplest and most effective ways to boost your security posture. Ignoring this not only increases the risk of data breaches but also puts you at odds with compliance requirements, especially as governments ramp up regulations. It's better to get ahead of the curve now and avoid the potential fallout later.</p>
</li>
<li><p><strong>Solution:</strong> To mitigate these risks, businesses <strong>must</strong> prioritize regular updates and timely upgrades. Start by ensuring that all systems and applications are running on supported versions that receive regular security updates. Implement a software asset management system to keep track of all software licenses and their support status. This system can help in identifying which software needs to be updated or replaced. Also, consider transitioning to cloud-based solutions where possible, as they often provide continuous updates and security enhancements. Finally, train employees on the importance of using updated software and the risks associated with unsupported systems. By adopting these practices, businesses can significantly reduce the risk of cyber-attacks, ensuring their systems are secure and compliant with industry standards.</p>
</li>
</ul>
<hr />
<h3 id="heading-9-inadequate-physical-security"><strong>9) Inadequate Physical Security</strong></h3>
<ul>
<li><p><strong>Problem:</strong> In the digital age, it's easy to focus solely on cyber threats and forget about physical security, but the two are closely intertwined. Inadequate physical security measures can lead to unauthorized access to critical infrastructure, resulting in data breaches and other security incidents. The <a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6"><code>Microsoft Digital Defense Report 2023 (MDDR)</code></a> highlights that physical breaches often go hand-in-hand with cyber-attacks, as physical access can enable malicious actors to tamper with hardware, install malware, or steal sensitive information. The <a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/"><code>SANS Security Awareness Report 2024</code></a> notes that businesses often overlook physical security, especially in smaller offices or remote locations, making them easy targets for insider threats or external attackers​. Without proper physical security measures, companies are at risk of significant data loss and operational disruption.</p>
</li>
<li><p><strong>Solution:</strong> To address these risks, businesses should implement a comprehensive physical security strategy. Start by securing all entry points with measures like keycard access, biometric scanners, or even traditional locks and alarms. It's essential to monitor these access points with surveillance cameras and motion detectors, providing a real-time view of who is entering and exiting the premises. Additionally, ensure that sensitive areas, such as server rooms, are restricted to authorized personnel only, with strict access controls in place. Regular security audits and drills should be conducted to identify and address potential vulnerabilities, keeping everyone prepared for emergency situations. By integrating these physical security measures with your overall cybersecurity strategy, you can create a more secure and resilient environment that protects both digital and physical assets.</p>
</li>
</ul>
<hr />
<h3 id="heading-10-neglecting-incident-response-planning"><strong>10) Neglecting Incident Response Planning</strong></h3>
<ul>
<li><p><strong>Problem:</strong> Many businesses overlook the importance of having a well-defined incident response plan, thinking they'll deal with issues as they arise. This lack of preparation can lead to chaos and confusion during a cyber incident, resulting in delayed responses and greater damage. Without a solid incident response strategy, organizations struggle to contain and mitigate the effects of a security breach​. Quick and coordinated action is crucial to minimizing the impact of incidents. Having a pre-defined plan helps ensure that all team members know their roles and responsibilities during a crisis​. The <a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/"><code>SANS Security Awareness Report 2024</code></a> points out that businesses that conduct regular incident response drills are significantly better prepared to handle real-world attacks​. The absence of a well-practiced plan can result in increased recovery time, higher costs, and a greater likelihood of data loss or regulatory penalties.</p>
</li>
<li><p><strong>Solution:</strong> To effectively manage security incidents, businesses must develop and maintain a comprehensive incident response plan. Start by clearly defining the roles and responsibilities of all team members, ensuring that everyone knows their part in the event of an incident. Regularly conduct incident response drills to practice these roles and improve the team's ability to respond swiftly and efficiently. It’s also crucial to establish communication protocols, both internally and externally, to keep all stakeholders informed during an incident. Additionally, document and review all incidents to learn from each event and refine your response plan. By implementing these measures, businesses can ensure a coordinated and efficient response to security incidents, minimizing damage and facilitating a quicker recovery.</p>
</li>
</ul>
<hr />
<h3 id="heading-references">References</h3>
<ul>
<li><p><a target="_blank" href="https://techcommunity.microsoft.com/t5/windows-it-pro-blog/windows-resiliency-best-practices-and-the-path-forward/ba-p/4201550">Windows resiliency: Best practices and the path forward - Microsoft Community Hub</a></p>
</li>
<li><p><a target="_blank" href="https://techcommunity.microsoft.com/t5/windows-os-platform-blog/securely-design-your-applications-and-protect-your-sensitive/ba-p/4179543">Securely design your applications and protect your sensitive data with VBS enclaves - Microsoft Community Hub</a></p>
</li>
<li><p><a target="_blank" href="https://www.microsoft.com/en-us/security/blog/2024/05/20/new-windows-11-features-strengthen-security-to-address-evolving-cyberthreat-landscape/">New Windows 11 features strengthen security to address evolving cyberthreat landscape | Microsoft Security Blog</a></p>
</li>
<li><p><a target="_blank" href="https://www.microsoft.com/en-us/security/blog/2024/07/27/windows-security-best-practices-for-integrating-and-managing-security-tools/">Windows Security best practices for integrating and managing security tools | Microsoft Security Blog</a></p>
</li>
<li><p><a target="_blank" href="https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2023?msockid=00fed3910aae6aef0354c7590b956bd6">Microsoft Digital Defense Report 2023 (MDDR)</a></p>
</li>
<li><p><a target="_blank" href="https://www.sans.org/mlp/ssa-2024-security-awareness-report/">2024 Security Awareness Report | SANS Institute</a></p>
</li>
<li><p><a target="_blank" href="https://www.amazon.ca/Phoenix-Project-DevOps-Helping-Business/dp/0988262592">The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win: Kim, Gene, Behr, Kevin, Spafford, George: 9780988262591: Books - Amazon.ca</a> *** While not directly referenced, many of the concepts discussed align with those covered in the book, which also happens to be a fantastic read.</p>
</li>
<li><p>Bonus: <a target="_blank" href="https://www.rsaconference.com/library/presentation/usa/2024/youre%20doing%20it%20wrong%20common%20security%20antipatterns">You’re Doing It Wrong! Common Security Anti Patterns | RSA Conference</a> *** Not referenced but a really great session that I highly recommend!</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Cybersecurity Training Programs: Empower Your Employees]]></title><description><![CDATA[In today's digital world, SMBs are increasingly becoming targets for cyberattacks. It's crucial to have a solid security awareness program in place to protect sensitive information and maintain customer trust. As part of our ongoing series on cyberse...]]></description><link>https://blog.brianbaldock.net/cybersecurity-training-programs-empower-your-employees</link><guid isPermaLink="true">https://blog.brianbaldock.net/cybersecurity-training-programs-empower-your-employees</guid><category><![CDATA[#cybersecurity]]></category><category><![CDATA[CybersecurityAwareness]]></category><category><![CDATA[training]]></category><dc:creator><![CDATA[Brian Baldock]]></dc:creator><pubDate>Tue, 30 Jul 2024 03:19:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1722310182250/fd2dc443-c4e4-4705-a141-cafe25f6959c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today's digital world, SMBs are increasingly becoming targets for cyberattacks. It's crucial to have a solid security awareness program in place to protect sensitive information and maintain customer trust. As part of our ongoing series on cybersecurity for SMBs, let’s dive into some effective tools and platforms that can help you get started.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🗒</div>
<div data-node-type="callout-text">This list is provided as is, without any specific ranking, and doesn't endorse one platform over another. I just wanted to give you a quick rundown of options you can explore. This is not a review.</div>
</div>

<h3 id="heading-knowbe4">KnowBe4</h3>
<p>A comprehensive platform that offers security awareness training and simulated phishing. They’ve got a ton of interactive training modules, videos, and games to educate your team about various cybersecurity threats. <a target="_blank" href="https://www.knowbe4.com/products/security-awareness-training"><code>More info</code></a></p>
<h3 id="heading-proofpoint-security-awareness-training">Proofpoint Security Awareness Training</h3>
<p>Proofpoint Security Awareness Training provides targeted training modules based on user behavior and the current threat landscape. It’s a solid option if you're looking for phishing simulations and other interactive content to engage your team. <a target="_blank" href="https://www.proofpoint.com/us/products/security-awareness-training/platform"><code>More info</code></a></p>
<h3 id="heading-cofense-phishme">Cofense PhishMe</h3>
<p>Cofense PhishMe focuses specifically on phishing awareness and response training. They offer real-world phishing simulations and educational content to help users spot and avoid phishing attacks. <a target="_blank" href="https://cofense.com/phishing-security-awareness-training/"><code>More info</code></a></p>
<h3 id="heading-sans-security-awareness-training">SANS Security Awareness Training</h3>
<p>SANS is well-known for its high-quality training courses. They cover a wide range of topics, including phishing, social engineering, and data protection. Check out the SANS Security Awareness Report for more insights on building a strong security culture. <a target="_blank" href="https://www.sans.org/security-awareness-training/products/security-awareness-solutions/end-user/"><code>More info</code></a></p>
<h3 id="heading-microsoft-defender-for-office-365-p2-included-in-e5">Microsoft Defender for Office 365 P2 (included in E5)</h3>
<p>MDO P2 includes a great attack simulation training platform that lets you run regular campaigns and provide training directly. It’s a fantastic way to keep your employees on their toes and educate them about the latest threats. You can simulate real-world attacks and give your team practical experience in recognizing and responding to these threats. It's all about making sure your people are prepared and know exactly what to do when faced with potential security issues. <a target="_blank" href="https://learn.microsoft.com/en-us/defender-office-365/attack-simulation-training-get-started"><code>More info</code></a></p>
<h3 id="heading-terranova-security">Terranova Security</h3>
<p>Terranova Security provides a full suite of security awareness training, including phishing simulations, interactive modules, and compliance training. They’ve got a good variety of content to keep things fresh and engaging. <a target="_blank" href="https://www.terranovasecurity.com/"><code>More info</code></a></p>
<h3 id="heading-cybersecurity-amp-infrastructure-security-agency-cisa">Cybersecurity &amp; Infrastructure Security Agency (CISA)</h3>
<p>CISA offers a bunch of free resources and training materials to help businesses build a security-conscious culture. Their materials are especially useful if you're on a tight budget but may require some work to integrate into any campaigns you're running. <a target="_blank" href="https://www.cisa.gov/resources-tools/training?f%5B0%5D=training_delivery%3A84&amp;f%5B1%5D=training_topic%3A68&amp;f%5B2%5D=training_topic%3A238"><code>More info</code></a></p>
<h3 id="heading-internal-workshops-and-seminars">Internal Workshops and Seminars</h3>
<p>Don't underestimate the power of in-house training sessions. Regular workshops, possibly with guest speakers from the cybersecurity industry, can keep your team up to date on the latest threats and best practices.</p>
<h3 id="heading-get-started">Get started</h3>
<p>There are plenty of options available, as you can see, but the most important thing is to start implementing a security awareness program now rather than waiting until it's too late. Being proactive is key; addressing security risks before they become issues is always better than trying to fix problems after they've occurred. And let's face it, the human problem is the toughest one to deal with. People are often the weakest link in cybersecurity, so educating your team is crucial to staying ahead of potential threats.</p>
]]></content:encoded></item></channel></rss>