blog.krauza.com

# Evolving Squid to MITM HTTPS with HashiCorp Vault PKI

TLS bump in Squid shows you the SNI hostname of every outbound connection. In a world of supply chain attacks and container CVEs, that’s not enough. This is how I extended my Squid deployment to full HTTPS inspection, with a signing certificate issued by HashiCorp Vault through an ADCS-rooted PKI chain.

Knowing that a connection went to downloads.example.com is useful. Knowing that it went to downloads.example.com/software/package-3.2.1.tar.gz?token=aHR0cHM6Ly8... is a different category of useful entirely. The difference between those two pieces of information is the difference between observing a knock on the door and actually hearing the conversation.

For a long time, TLS bump in Squid was enough for my purposes. I could see every outbound HTTPS destination by hostname, log it, alert on unexpected domains, and build some basic behavioral baselines. But as supply chain attacks have become more sophisticated, and as CVE feeds for the Linux kernel and container runtimes have been coming in at a pace that would embarrass a security team with a real patching budget, hostname visibility alone started to feel inadequate. A threat actor who has compromised a legitimate CDN, or a malicious package quietly inserted into a trusted artifact store, is going to look completely normal at the SNI layer. They are using the right hostname. The malice is in the path.

This post covers how I evolved my Squid deployment from TLS bump to full HTTPS MITM inspection, and the PKI architecture that makes it possible: an Active Directory Certificate Services root, a HashiCorp Vault intermediate, and a Vault-issued sub-intermediate that Squid uses as its signing certificate. I want to be direct about something upfront: I know PKI well. I have been working with certificates, chains, and trust anchors long enough that I can read an X.509 extension dump and tell you what’s wrong without googling. HashiCorp Vault’s PKI secrets engine, however, was new territory for me, and that is where Claude made a genuine difference in how quickly this came together.

## What Squid Is and Why It Belongs in a Security Stack

Squid is a forward proxy, which is one of those phrases that sounds more complicated than it is. When a client makes an HTTPS request through Squid, Squid is the intermediary: the client connects to Squid, tells it (via the CONNECT method) where it wants to go, and Squid establishes the outbound connection on the client’s behalf. The client never directly touches the destination server.

That position in the middle is what makes Squid useful for security. Everything leaving your network, at least everything you’ve configured to route through the proxy, passes through a single choke point where you can log it, inspect it, filter it, and alert on it. In an enterprise environment, forward proxies are standard infrastructure. In a homelab, running one means you have the same visibility tooling that a real SOC would expect.

Squid supports two fundamentally different modes of TLS handling:

Splice (pass-through): Squid observes the TLS handshake and logs the SNI hostname, but does not terminate or re-encrypt the connection. The end-to-end TLS session remains between the client and the destination server. This is the least invasive mode and the one I was running before.

Bump (MITM inspection): Squid terminates the client’s TLS connection, decrypts the traffic, inspects it, then re-encrypts and forwards it to the destination. To do this without triggering certificate errors in the client, Squid must dynamically generate a certificate for each destination, signed by a CA that the client trusts. This is what we built.

The tradeoff is real and worth naming: bump mode means Squid can read all HTTPS content passing through it, including credentials, session tokens, and anything else riding inside that TLS session. For my homelab, where I control every client and every service, that tradeoff is acceptable. In a shared or enterprise environment it has legal, ethical, and compliance dimensions that require careful consideration.

My current Squid setup runs as a Docker container on Ubuntu 22.04 and serves four distinct ports to cover different authentication and inspection requirements:

| Port  | Auth  | TLS Mode              | Use Case                                 |
|-------|-------|-----------------------|------------------------------------------|
| 8443  | None  | Splice (pass-through) | Allowlisted hosts, no inspection         |
| 9443  | Basic | Splice (pass-through) | Authenticated users, no inspection       |
| 11443 | None  | Bump (inspect)        | Allowlisted hosts, full TLS inspection   |
| 12443 | Basic | Bump (inspect)        | Authenticated users, full TLS inspection |

The splice ports give me hostname-level logging for traffic I’ve explicitly decided not to inspect (things like WireGuard traffic, software update services, and CDN endpoints where path-level inspection adds noise without value). The bump ports give me full URL visibility on everything else.
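As a rough illustration of how that split is expressed (a sketch, not my exact config: the `tls-cert`/`tls-key` syntax assumes Squid 4+ built with `--with-openssl` and `--enable-ssl-crtd`, and the ACL names are mine), the splice/bump decision hangs off `localport` ACLs in squid.conf:

```text
# Sketch only: one splice port and one bump port, abridged.
http_port 8443  ssl-bump tls-cert=/etc/squid/ssl_cert/squid.pem tls-key=/etc/squid/ssl_cert/squid.key generate-host-certificates=on
http_port 11443 ssl-bump tls-cert=/etc/squid/ssl_cert/squid.pem tls-key=/etc/squid/ssl_cert/squid.key generate-host-certificates=on

acl step1        at_step   SslBump1
acl bump_ports   localport 11443 12443
acl splice_ports localport 8443 9443

ssl_bump peek step1            # read the SNI on every connection first
ssl_bump splice splice_ports   # pass-through: hostname logging only
ssl_bump bump bump_ports       # terminate, inspect, re-encrypt
```

The `peek step1` line matters: Squid has to observe the ClientHello before it can decide per-port whether to splice or bump.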

## Where We Came From: SSL Bump Without MITM

Before this work, Squid was configured for ssl_bump peek, which is exactly what it sounds like: Squid peeks at the TLS handshake to extract the SNI hostname, logs it, then either splices or steps out of the way. The result is log entries that look something like this:

```text
1747382400.000     15 192.168.1.50 TCP_TUNNEL/200 4382 CONNECT registry-1.docker.io:443 - HIER_DIRECT/52.72.232.213 -
```

That entry tells me a client connected to registry-1.docker.io. It does not tell me which image was pulled, which tag, which layer hashes were downloaded, or whether the manifest pointed at a recently-pushed image that replaced the expected one. In a world where xz-utils can be backdoored through a multi-month supply chain compromise, the hostname is table stakes. The path is the signal.

With MITM inspection enabled, the same transaction produces entries that include the full URL:

```text
1747382400.000    423 192.168.1.50 TCP_MISS/200 8192 GET https://registry-1.docker.io/v2/library/python/manifests/3.12-slim - HIER_DIRECT/52.72.232.213 application/vnd.docker.distribution.manifest.v2+json
```

Now I can see which image, which tag, which manifest endpoint. That data feeds into alerting rules that flag tag-based pulls (latest, slim, version strings without a digest pin) and correlate pull activity against deployment pipelines. A container image pulled at 3am outside of any known CI job has a very different risk profile from the same pull initiated by a GitLab runner with a traceable pipeline ID.
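As a toy illustration of that alerting idea (a sketch, not my production rule; the log line is the sample entry from above, and field 7 of Squid's default log format is the URL):

```shell
# Sketch: flag manifest pulls that reference a mutable tag rather than a
# pinned sha256 digest.
log='1747382400.000 423 192.168.1.50 TCP_MISS/200 8192 GET https://registry-1.docker.io/v2/library/python/manifests/3.12-slim - HIER_DIRECT/52.72.232.213 application/vnd.docker.distribution.manifest.v2+json'

# Pull out the URL field from the access log entry.
url=$(printf '%s\n' "$log" | awk '{print $7}')

# Digest-pinned pulls are fine; tag-based pulls get flagged.
case "$url" in
  */manifests/sha256:*) verdict="digest-pinned pull" ;;
  */manifests/*)        verdict="ALERT: tag-based pull" ;;
  *)                    verdict="not a manifest request" ;;
esac
echo "$verdict: $url"
```

None of this is possible at the splice layer, where field 7 is just `registry-1.docker.io:443`.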

## The PKI Architecture

Getting MITM inspection to work requires Squid to dynamically generate certificates for every destination it intercepts. Those certificates have to be trusted by every client routing traffic through the proxy, which means they have to be signed by a CA that’s in the client trust stores. The cleanest way to handle this, without distributing a static signing certificate that could be extracted and abused, is to use a proper PKI hierarchy where Squid’s signing material comes from an intermediate CA.

My PKI chain looks like this:

```mermaid
flowchart TD
    ADCS["Active Directory\nCertificate Services\nRoot CA\nSelf-signed, offline when possible"]
    Vault["HashiCorp Vault\nCA 2025\nIntermediate\nIssued by ADCS"]
    SquidSub["Squid Signing\nSub-Intermediate\nIssued by Vault\nat deploy time"]
    DynCert["Per-connection\nLeaf Certificates\nGenerated by Squid\non the fly"]

    ADCS -->|Signs intermediate CSR| Vault
    Vault -->|sign-intermediate| SquidSub
    SquidSub -->|Runtime signing| DynCert
```

The ADCS root is the trust anchor. It lives in the Windows Certificate Store and is distributed to every machine in my domain via Group Policy or ca-certs. Every certificate in this chain ultimately chains back to it, which means every client trusts every certificate Squid generates, as long as the chain is correct and complete.

Vault CA 2025 is the first intermediate. ADCS issued it a certificate with CA:TRUE and CertSign in the key usage, which means Vault can issue its own certificates and subordinate CAs. Vault manages this through its PKI secrets engine, which is the piece of infrastructure that actually does the cryptographic work of signing CSRs.

The Squid signing sub-intermediate is what’s new here. Rather than having Squid use the Vault intermediate directly (which would mean Squid’s signing key has the same trust scope as the entire Vault CA), we issue a subordinate intermediate specifically for Squid. Squid’s certificate is scoped to this purpose. If it were ever compromised, revoking it and issuing a new one does not require touching the Vault CA or anything else in the chain.

## HashiCorp Vault PKI Configuration

This is where I want to be honest about the division of labor. I had a clear picture of what the PKI needed to look like: a sub-intermediate issued by Vault, with the right extensions, deployed to Squid at container start. What I did not have was fluency in Vault’s PKI secrets engine API. I had used Vault for secret storage and dynamic credentials before, but PKI was new, and Vault’s PKI surface has some sharp edges that are not obvious from the documentation alone.

Claude helped me navigate those edges. I described the PKI goal and what I knew about certificate extensions, and over a debugging session that would otherwise have taken me considerably longer, we worked through the Terraform role configuration and the signing path together. I’ll detail what we found below, because the specific failure modes are the valuable part.

### The Terraform Role

The Vault PKI role defines what kinds of certificates a given mount is allowed to issue. For a signing certificate, the role needs to be configured precisely, and several of the defaults are wrong for this use case:

```hcl
resource "vault_pki_secret_backend_role" "squid_intermediate_ca" {
  name    = "squid-ca"
  backend = vault_mount.vault_ca.path

  allow_any_name = true

  key_usage     = ["CertSign", "CRLSign"]
  ext_key_usage = []

  server_flag           = false
  client_flag           = false
  code_signing_flag     = false
  email_protection_flag = false

  key_type = "rsa"
  key_bits = 2048
  use_pss  = false

  ttl     = "8760h"
  max_ttl = "8760h"
}
```

The critical fields here are ext_key_usage = [] and the four _flag = false settings. When ext_key_usage is not explicitly set, Vault’s role defaults automatically add serverAuth and clientAuth extended key usages to every issued certificate. That sounds innocuous until you understand what it means for a CA certificate: a certificate that carries serverAuth or clientAuth in its EKU (Extended Key Usage) is flagged by TLS libraries as a leaf certificate, not a CA. Browsers and curl and every other TLS client will refuse to treat it as a signing authority, regardless of what the basic constraints say. Setting ext_key_usage = [] explicitly removes those extended key usages and produces a clean CA certificate.

The server_flag, client_flag, code_signing_flag, and email_protection_flag settings are the equivalent controls for EKU at the Vault role level. Setting them all to false ensures that the role itself cannot add those usages even if the defaults change.

### The Two-Step Signing Process

The standard way to issue a certificate from a Vault PKI mount is vault write pki/issue/<role>. That endpoint issues leaf certificates. It is designed for leaf certificates. It does not matter what you put in key_usage or what flags you set on the role: the issue endpoint will never produce a certificate with CA:TRUE in its basic constraints. This is by design, and it is not documented as prominently as it should be.

To get a genuine intermediate CA certificate out of Vault, you need a two-step process. First, generate the key pair and CSR inside Vault:

```bash
vault write vault-ca/intermediate/generate/exported \
  common_name="squid-proxy-$(date +%s)" \
  key_type=rsa \
  key_bits=2048 \
  organization=Krauza \
  ou="Squid Proxy"
```

This produces a CSR and a private key. The private key stays on the host (it’s marked exported so we can deploy it). The CSR goes to the next step:

```bash
vault write vault-ca/issuer/default/sign-intermediate \
  csr=<csr_from_step_1> \
  ttl=8760h \
  use_csr_values=true
```

This path, issuer/default/sign-intermediate, is what produces a certificate with CA:TRUE and pathLen:0 in the basic constraints. Vault is being explicit here: it is signing this CSR as an intermediate CA, not as a leaf. The resulting certificate is what gets deployed to Squid.
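Gluing the two steps together in a script looks roughly like this. The response below is a canned stand-in so the extraction is visible end to end (the real `vault` invocations are shown as comments); the shape assumes the vault CLI's `-format=json` output:

```shell
# In the real playbook this would be:
#   resp=$(vault write -format=json vault-ca/intermediate/generate/exported ...)
# Canned stand-in so the extraction step is self-contained:
resp='{"data":{"csr":"-----BEGIN CERTIFICATE REQUEST-----\nMIIC...\n-----END CERTIFICATE REQUEST-----"}}'

# Pull the CSR out of the JSON response; it then feeds the second call:
#   vault write vault-ca/issuer/default/sign-intermediate csr="$csr" ttl=8760h use_csr_values=true
csr=$(printf '%s' "$resp" | python3 -c 'import sys, json; print(json.load(sys.stdin)["data"]["csr"])')
echo "$csr" | head -n 1
```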

The Ansible playbook runs both steps at deploy time and writes the signed certificate plus private key to /etc/squid/ssl_cert/:

```text
/etc/squid/ssl_cert/squid.pem   # signed cert + full ca_chain
/etc/squid/ssl_cert/squid.key   # private key
```

The squid.pem file contains the chain, not just the signed certificate. More on why that matters in the troubleshooting section.
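A cheap sanity check after writing the bundle (sketched here with a synthetic stand-in for squid.pem) is simply counting the certificates in the file. For this chain a complete bundle holds three; a count of two is the classic symptom of having written issuing_ca instead of ca_chain:

```shell
# Synthetic stand-in for /etc/squid/ssl_cert/squid.pem: three PEM blocks
# (Squid sub-intermediate, Vault CA 2025, ADCS root).
bundle='-----BEGIN CERTIFICATE-----
(squid sub-intermediate)
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
(Vault CA 2025)
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
(ad-krauza-CA root)
-----END CERTIFICATE-----'

# Count the PEM blocks; for this chain, anything other than 3 is wrong.
count=$(printf '%s\n' "$bundle" | grep -c 'BEGIN CERTIFICATE')
echo "certificates in bundle: $count"
```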

### The `ROTATE_CERT` Variable

One thing I specifically did not want was Vault issuing a new signing certificate on every deploy. Each issued certificate is an entry in Vault’s certificate serial index. Issuing hundreds of certificates over months of deploys, when most of those certificates are identical in purpose and valid for a year, creates PKI sprawl: a long tail of active serial numbers that never get revoked, increasing your CRL size and adding noise to any audit of what certificates are actually in use.

The GitLab CI pipeline handles this with a pipeline variable:

```yaml
variables:
  ROTATE_CERT:
    value: "false"
    description: "Set to 'true' to rotate the Squid signing certificate from Vault PKI on this deploy."
    options:
      - "false"
      - "true"
```

All four certificate tasks in the Ansible playbook are gated on when: rotate_cert | default('false') | bool. A normal deploy does not touch the signing certificate. When I actually need to rotate (the certificate is approaching expiry, or there’s a suspected compromise), I trigger the pipeline with ROTATE_CERT=true and the rotation happens cleanly. Every other deploy just re-uses the existing certificate that’s already on disk.
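The gating itself is just a `when:` clause on each certificate task. A trimmed sketch of the pattern (task name and registered variable are illustrative, not my exact playbook):

```yaml
- name: Generate Squid signing key pair and CSR in Vault
  ansible.builtin.command: >
    vault write -format=json vault-ca/intermediate/generate/exported
    common_name="squid-proxy-{{ ansible_date_time.epoch }}"
  register: vault_csr_response
  when: rotate_cert | default('false') | bool
```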

This is a small configuration detail but it maps to a real operational principle: every certificate issuance is an event. Minimize events you don’t need.

## Troubleshooting: Four Certificate Errors and a BGP Surprise

This section is the one I want to spend the most time on, because it represents the actual work. The architecture section describes what was built. This section describes everything that was wrong before it worked.

### Finding One: Wrong Extended Key Usage

The first error the browser returned, after deploying what I thought was a correctly configured certificate, was SEC_ERROR_CA_CERT_INVALID. Firefox’s error code for “this certificate is structurally wrong as a CA.” Not “untrusted.” Not “expired.” Wrong at the structural level.

Running the certificate through openssl x509 -text -noout showed the problem immediately:

```text
X509v3 Extended Key Usage:
    TLS Web Server Authentication, TLS Web Client Authentication
```

There they were: serverAuth and clientAuth, added automatically by Vault’s role defaults. A certificate carrying those EKUs is a leaf certificate by TLS library convention. No client will use it to validate other certificates, regardless of whether CA:TRUE is set. The fix, as described above, was ext_key_usage = [] in the Terraform role plus the four _flag = false settings.
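The checks that would have caught this before deployment are quick to script. This sketch generates a throwaway CA so the block is self-contained (assumes OpenSSL 1.1.1+ for `-addext`); in practice you would point the `openssl x509` call at the candidate squid.pem instead:

```shell
# Generate a throwaway CA certificate carrying the extensions a signing
# certificate needs: CA:TRUE basic constraints, keyCertSign, and no EKU.
cert=$(mktemp); key=$(mktemp)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout "$key" -out "$cert" -subj "/CN=throwaway-ca" \
  -addext "basicConstraints=critical,CA:TRUE,pathlen:0" \
  -addext "keyUsage=critical,keyCertSign,cRLSign" 2>/dev/null

dump=$(openssl x509 -in "$cert" -noout -text)

# Check 1: basic constraints must say CA:TRUE.
printf '%s\n' "$dump" | grep -q 'CA:TRUE' && echo "basicConstraints: OK"

# Check 2: no Extended Key Usage block at all; serverAuth/clientAuth
# would make TLS libraries treat the cert as a leaf.
printf '%s\n' "$dump" | grep -q 'Extended Key Usage' || echo "EKU: clean"

rm -f "$cert" "$key"
```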

### Finding Two: The `issue` Endpoint Does Not Produce CA Certificates

After fixing the EKU problem, the next iteration still failed. SEC_ERROR_CA_CERT_INVALID again. Back to the openssl output:

```text
X509v3 Basic Constraints: critical
    CA:FALSE
```

This was the issue endpoint problem. Even with CertSign in key_usage and clean EKU settings, the vault write pki/issue/<role> path sets CA:FALSE. It is a leaf certificate endpoint. The Vault documentation for this endpoint is correct, but the implication that CertSign in key usage is insufficient without CA:TRUE in basic constraints is not stated as plainly as it should be.

Switching to the intermediate/generate/exported plus issuer/default/sign-intermediate two-step resolved this. The sign-intermediate endpoint is specifically designed to produce intermediate CA certificates and sets CA:TRUE correctly.

### Finding Three: The `ca_chain` vs. `issuing_ca` Distinction

With a structurally correct certificate finally in hand, Squid could start, but some clients still showed certificate errors. Not all of them, and not on all destinations. The pattern pointed to chain validation: clients that had recently refreshed their trust stores worked; clients that hadn’t seen the intermediate certificates before didn’t.

To understand why, it helps to be precise about how TLS chain validation actually works. Clients, whether browsers, curl, or any other TLS library, do not trust intermediate certificates. They trust root certificates, and only root certificates, anchored in a local trust store. When a client receives a certificate chain during a TLS handshake, it walks up the chain looking for a certificate it recognizes in that trust store. If it hits a gap, a certificate whose issuer it doesn’t have, validation fails. Intermediates are not trusted implicitly; they are trusted only because a root you already trust signed them.

In my PKI chain, the root is ad-krauza-CA, issued and managed by Active Directory Certificate Services, and entirely outside of HashiCorp Vault. Vault knows about it as the trust anchor for the chain, but it did not issue it and it is not part of Vault’s own certificate inventory. Clients trust it because it was distributed via Group Policy or ca-certs into every machine’s trust store. Everything below it (Vault CA 2025 and the Squid sub-intermediate) is trusted only because the chain traces back to that root.

The Vault signing response returns two fields that look superficially similar: issuing_ca (the direct parent certificate, Vault CA 2025 in this case) and ca_chain (an array of every intermediate in the chain from the issued certificate up to, and including, the root). Writing certificate + issuing_ca to squid.pem gives Squid only two certificates to present: the Squid sub-intermediate and its immediate parent. That is not the full chain. A client that doesn’t already have Vault CA 2025 cached from a previous connection has no way to continue walking up to ad-krauza-CA. It sees an issuer it doesn’t recognize and fails validation, even though the root it needs is already in its trust store.

The full chain that needs to be in squid.pem is: the Squid sub-intermediate, Vault CA 2025, any other intermediates, and the root CA itself. Clients only trust roots; they do not and should not implicitly trust intermediates, so the root must be present for the chain to be verifiable. Every certificate between the leaf and the root must be present without gaps, because clients do not fetch missing certificates; they fail.

Vault’s ca_chain field returns the ordered intermediates, and because ad-krauza-CA issued the Vault intermediate, the root is included in ca_chain as well; issuing_ca, by contrast, holds only the Squid sub-intermediate’s direct issuer:

```yaml
# Before: only hosts that already trust the intermediate can validate the certificate:
content: "{{ (vault_sign_response.stdout | from_json).data.certificate }}\n{{ (vault_sign_response.stdout | from_json).data.issuing_ca }}\n"

# After: full chain including the ADCS root; every client can validate to the trusted root:
content: "{{ (vault_sign_response.stdout | from_json).data.certificate }}\n{{ (vault_sign_response.stdout | from_json).data.ca_chain | join('\n') }}\n"
```

### Finding Four: Vault 1.15 Moved the `sign-intermediate` Path

At one point during development, an attempt to sign the intermediate CSR returned:

```json
{
  "path": "vault-ca/sign-intermediate",
  "error": "1 error occurred:\n\t* unsupported path"
}
```

This is a Vault version issue. Prior to Vault 1.11, pki/sign-intermediate was the correct path. Starting in 1.11, Vault’s PKI secrets engine introduced issuer-scoped paths, and in 1.15 the legacy path was removed entirely. The current correct path is issuer-scoped:

```text
vault-ca/issuer/default/sign-intermediate
```

If you’re working from older documentation or tutorials, this is going to bite you. The error message is not especially helpful in pointing to a version incompatibility; it just says “unsupported path,” which initially suggested a permissions problem.

### Finding Five: BGP Failover, DynamicDNS, and WireGuard

This one had nothing to do with certificates. I mention it because it was the most surprising failure mode and the one most likely to affect anyone running a similar setup.

My Squid proxies sit behind BGP anycast routing: traffic routes to the nearest healthy proxy host, and if one fails, the BGP failover redirects traffic to another. The proxies are transparent to the clients; the routing change is invisible. That’s the intent.

The problem is that my DynamicDNS update traffic also routes through the proxy. When a DDNS update request goes through Squid, the DDNS provider sees the proxy’s egress IP as the source, not the actual WireGuard endpoint IP I’m trying to register. Every BGP failover caused the proxy host’s IP to be written into my DDNS record, pointed at the wrong host, and broke the WireGuard tunnel until I noticed and corrected it manually in Route53.

## What Full URL Visibility Actually Enables

Moving from hostname to full URL in proxy logs changes what you can build on top of them.

At the SNI layer, you can answer: “what external services is this host communicating with?” At the URL layer, you can answer questions like these. When a container pulls from a registry, are the pulls pinning specific digest hashes, or are they pulling mutable tags? If a host starts requesting paths under /.well-known/acme-challenge/, that’s either legitimate certificate renewal or a misconfigured service. If something in the environment starts hitting artifact storage paths that don’t correspond to any known deployment, that’s worth investigating.

The logging feeds into a broader detection pipeline: proxy logs go to Loki, Loki feeds alert rules, and anything that pattern-matches against known supply chain attack indicators (unexpected registry paths, newly registered domains used for artifact delivery, tag-based pulls for production-critical images) generates a notification. The hostname alone would not have been sufficient to build any of those rules.
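As one concrete sketch of such a rule (a Loki ruler fragment; the job label, threshold, and window are assumptions about my pipeline rather than a drop-in config):

```yaml
groups:
  - name: squid-supply-chain
    rules:
      - alert: TagBasedImagePull
        # Manifest requests whose path carries a mutable tag instead of a
        # sha256 digest pin.
        expr: |
          sum(count_over_time({job="squid"} |= `/manifests/` != `sha256:` [15m])) > 0
        labels:
          severity: warning
        annotations:
          summary: "Container manifest pulled by mutable tag through the proxy"
```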

The certificate infrastructure is also now in a much better place operationally. The Vault intermediate is the signing authority for the proxy. If it’s compromised or the certificate expires, rotating it does not touch anything else in the PKI hierarchy. The ADCS root remains offline and untouched. The Vault CA remains valid. Only the Squid sub-intermediate changes.

## On Using Claude for the Vault Configuration

I said upfront that I know PKI well. I want to be specific about where that knowledge ended and where Claude’s contribution began, because the framing of “AI helped me” can mean a lot of different things and I’d rather be precise.

The PKI architecture itself, the decision to use a sub-intermediate rather than the Vault CA directly, the chain structure, the extension requirements for a signing certificate: all of that came from existing knowledge. I knew what the certificate needed to look like. What I did not know was how to express that correctly in Vault’s PKI API, specifically which Vault endpoints, which Terraform resource arguments, and which defaults would silently produce the wrong result.

That’s the thing about being a “pseudo-expert” in an adjacent system: you know enough to know what you want, but not enough to know the exact path that gets you there, and you definitely don’t know which defaults are wrong for your use case. The ext_key_usage = [] insight, the distinction between issue and sign-intermediate, the ca_chain vs. issuing_ca difference: those came out of debugging sessions with Claude where I could describe the certificate problem in PKI terms and get back answers grounded in Vault’s specific API behavior.

What would have taken me several weeks of documentation reading, forum searching, and trial-and-error (I know this because I have spent several weeks doing exactly that kind of work on other Squid configurations in the past) compressed into a working deployment over a few hours. That is a real, specific, measurable difference. I don’t say that as a promotional claim about AI; I say it as an honest account of how the work actually went.

## Some final thoughts

Supply chain attacks are not theoretical. The xz-utils backdoor, the SolarWinds supply chain compromise, the steady drumbeat of malicious packages published to PyPI and npm: these are real incidents that relied on trust in distribution channels that appeared legitimate at the surface. A Squid proxy configured for HTTPS inspection is not a complete defense against any of them, but it is a layer of visibility that didn’t exist before.

What I have now is a proxy that logs full URLs, feeds those logs into structured alerting, and does so using a signing certificate that’s properly rooted in my existing PKI hierarchy. The certificate management is automated, the rotation is deliberate rather than accidental, and the blast radius of a compromised signing certificate is limited to one sub-intermediate that can be revoked and replaced without touching anything else.

The homelab has a tendency to accumulate complexity faster than it accumulates coherence. This project is one of the cases where the complexity feels justified: the operational overhead of the PKI chain and the Vault integration is real, but so is the visibility it produces. Knowing which Docker manifest endpoint was queried, which Go module proxy path was hit, which artifact storage path lit up at an unexpected time: that’s the kind of visibility that makes the complexity worthwhile.
