This is exactly why I don't understand the libertarian right in the US. Large corporations are just as capable of the unaccountable abuses that large governments are.
That hyper-focus on the "right" is the purposeful result of corporate propaganda that moderates and neutralizes the philosophy, even making it positively conducive to corporate control. The same kind of corruption occurs with more popular political philosophies like progressivism and conservatism, but it's less starkly pronounced there because many more people are defending those philosophies' outcomes.
This seems like a really easy problem to solve. Just don't give the LLM access to any prod credentials[1]. If you can't repro a problem locally or in staging/dev environments, you need to update your deployment infra so it more closely matches prod. If you can't scope permissions tightly enough to distinguish between environments, update your permissions system to support that. I've never had anything even vaguely resembling the problems you are describing because I follow this approach.
[1] except perhaps read-only credentials to help diagnose problems, but even then I would only issue it an extremely short-lived token in case it leaks it somehow
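To sketch what "extremely short-lived token" can look like in practice, here's a minimal toy implementation of an HMAC-signed bearer token with a scope and an expiry (the key name and scope string are hypothetical; real systems would use an existing mechanism like STS or OAuth rather than rolling their own):

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"rotate-me-often"  # hypothetical key, kept out of the LLM's reach

def issue_token(scope: str, ttl_seconds: int = 300) -> str:
    """Mint a short-lived scoped token; a leak is only useful until expiry."""
    payload = json.dumps({"scope": scope, "exp": time.time() + ttl_seconds}).encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig

def check_token(token: str) -> bool:
    """Reject tokens that were tampered with or have expired."""
    payload_b64, sig = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch: token was altered
    return json.loads(payload)["exp"] > time.time()  # expired tokens fail
```

The point is the shape of the mitigation, not the crypto details: even if the agent leaks the token into a log or a transcript, the blast radius is bounded by the scope and the TTL.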
A few weeks ago, I ran into a bug with Cloudflare's DNS server not detecting when I updated the records with the registrar. The bug was 100% on their end, entirely unsolvable by me, yet they have made it literally impossible to contact them to file a bug report. Their standard user help workflow dead-ended by forcing me to talk to their absolutely useless AI help chatbot, which proceeded to regurgitate their FAQ (inaccurately, uselessly), then referred me to a phone number that was disconnected/not in service, then gave me an email address that auto-replied it was no longer in use, then just looped back to the FAQ. There was no way for me to even send them an email to let them know they have a major bug.
I immediately pulled all my sites off of Cloudflare and I will never use that godawful nightmare of a company for anything ever again. If they can't even host a generic help bot without screwing it up that badly, why would I ever use them for anything at all, never mind an AI platform?
That had previously been my experience with CF too. In this case, I was migrating my domain over from the registrar, and updated the nameservers to point to CF as per the standard practice, then waited for CF to detect the updated DNS records. Two days later (well after DNS should have propagated) CF was still displaying an error saying the update to the DNS record for the domain hadn't been detected.
There's not a lot of UI surface area that a user can touch that can even theoretically affect the NS detection process because that process happens in CF entirely "under the hood" as it were. You more or less just have to wait for CF to detect the DNS changes. That said, I tried everything I could think of to try to trigger their detector to reset, including deleting and recreating the site from scratch in CF. After another few days of combing through CF docs and forums, and after changing and reverting every setting I possibly could, I concluded there was no workaround available to me as a user and tried to reach CF as I described above.
Having done this many times before, I am quite certain that I set the nameservers correctly. I even had two other very experienced engineers review what I had done, to make sure I wasn't falling victim to some mental blind spot that prevented me from recognizing the problem. Every SWE has had the experience of spending an enormous amount of time debugging a problem only to realize they mistyped a magic string somewhere that, for whatever reason, their brain just straight up refused to recognize. Unfortunately, that was not the case here: the other engineers saw what I saw and were also unable to fix the problem.
I was subsequently able to set up DNS on Vercel without any trouble at all. Bottom line: the issue was almost certainly a bug in Cloudflare's code, which indicates a code quality problem to me. Combine that with the reckless incompetence of trying to automate customer support with a chatbot that doesn't even have accurate information about their own processes and basic contact information, never mind a reasonable escape hatch to actual human support for unusual cases (even for a paying customer), and I simply no longer trust them to deliver a reasonable quality product.
They didn't even maintain any mechanism for reporting bugs to them, which is just insanity because it means there is no way to inform them even in extreme cases like a critical security bug. I get that they want to cut costs by reducing the employees needed to deal with customer service complaints, but it costs practically nothing to have a little feedback form somewhere, especially now that an LLM can handle most user feedback processing. Or failing that, a functioning support email address or phone number. But they can't even clear that incredibly low bar.
All of these issues could have been avoided with a very limited application of ordinary common sense and foresight. Whoever programmed their chatbot did not take the time to set up a decent RAG system with up-to-date information about their support processes and contact channels, even though that is an obvious requirement for a tech support chatbot. They should also have recognized the business risk of exposing customers to a system with no escape hatch for outlier cases requiring actual human support. That risks alienating customers like me by forcing us to jump through Kafkaesque bureaucratic hoops just to get simple problems addressed and, even worse, making it impossible to resolve those problems even after jumping through all the hoops. The team implementing this chatbot didn't even think to include a contact form as a last-resort method for reporting problems when the chatbot gets in over its head.
Most people hate this kind of LLM-provided customer support without any human escalation options, because the bots often end up uselessly looping through some debugging steps that simply do not work for the customer's specific issue for whatever reason, which feels like slamming your head against a wall repeatedly. It's a truly infuriating user experience and is practically guaranteed to destroy the business's public goodwill and reputation.
All of which suggests they gutted their customer service department through a process that lacked these very basic insights, which screams mismanagement to me.
I'm not exactly a huge customer, but between my personal and business sites I plowed $45k into CF last year, and I will not spend another penny on them this year, or ever again. Maybe that's not huge spend in the grand scheme of the tech industry, but at a minimum that amount of money should entitle me to some human-provided support. My annual spend alone could provide the budget for multiple offshored CSRs. If I am spending enough money to buy a car, the least they can do is let me send them an email when I have a problem instead of just throwing me to the wolves.
Ultimately, they have a much weaker moat now than at any point in the past, because LLMs make it so much easier to build out critical functionality in-house that previously would have been worth paying someone else to manage via a SaaS. And while I may not be a big enough customer for them to worry about in and of myself, I am also not the only person affected by these business practices. Every affected person increases the reputational harms suffered by Cloudflare, with another alienated customer like me bashing CF in posts like this or in conversations with their friends and colleagues in the industry. Those harms should be very concerning to CF's management because it is extremely difficult to recover lost goodwill.
There are a few reasons DoD PKI is a shitshow which make it somewhat more understandable (although only somewhat).
First, the issues you describe affect only unclassified public-facing web services, not the internal DoD services used for actual military operations. DoD has its own CA, whose public keys are not installed on any OS by default, but anyone can find and install the certs from DISA easily enough. That means the affected sites and services are almost entirely ones not used by members of the military for operational purposes. This approach works for internal DoD sites and services, where you can expect people to jump through a couple of extra hoops for security, but it is not acceptable for the general public, who aren't going to figure out how to install custom certs on their machines to deal with untrusted-cert errors in their browsers. Most DoD web infra is therefore built around this custom PKI, which makes it inappropriate for hosting public sites. Thus anyone operating a public DoD site is in a weird position: they deviate from general DoD standards, but also can't follow commercial best practices without getting approval for an exception like the one you linked to. Bureaucratically, that can be a real nightmare to navigate, even for experienced DoD website operators, because you are way off the happy path for DoD web security standards.
Second, many DoD sites need to support mTLS for CAC (DoD-issued smartcards) authentication. That requires the site to use the aforementioned non-standard DoD CA certs to validate the client cert from the CAC, which in turn requires that the server's TLS cert be issued by a CA in the same trust chain, which means the entire site will not work for anyone who hasn't jumped through the hoops to install the DoD CA certs. Meaning, any public-facing site has to be entirely segregated from the standard DoD PKI system. For now, that means using commercial certs, which in turn requires a vendor that meets DoD supply chain security requirements.
Third, most of these sites and services run on highly customized, isolated DoD networks that are physically isolated from the internet. There's NIPR (unclassified FOUO), SIPR (classified secret), and JWICS (classified top secret). NIPR can connect to the regular internet, but does so through a limited number of isolated nodes, and SIPR/JWICS are entirely isolated from the public internet. DoD cloud services are often not able to use standard commercial products as a result of the compatibility problems this isolation causes. That puts a heavy burden on the engineers working these problems, because they can't just use whatever standard commercial solutions exist.
Fourth, the DoD has only shifted from traditional old-school on-prem Windows Server hosting for websites to cloud hosting over the past few years. That has required tons of upskilling and retraining for DoD SREs, which has not happened consistently across the enterprise. It has also made it much harder to keep up with private-sector standards as support for on-prem has faded, while the assumptions about cloud environments baked into many private-sector solutions don't hold true for DoD.
Fifth, even with the move to cloud services, the working conditions can be so extraordinarily burdensome and the DoD-specific restrictions so unusual, obscure, poorly documented, and difficult to debug that it dramatically slows down all software development. e.g., engineers may have to log into a jump box via a VDI to then use Jenkins to run a Groovy script to use Terraform to deploy containers to a highly customized version of AWS.
Ultimately, the sites this affects are ones which are lower priority for DoD because they are not operationally relevant, and setting up PKI that can easily serve both their internal mTLS requirements and compatibility with commercial standards for public-facing sites and services is not totally straightforward. That said, it is an inexcusable shitshow. Having run CAC-authenticated websites, I can tell you it's insane how much dev time is wasted trying to deal with obscure CAC-related problems, which are extremely difficult to resolve for a variety of technical and bureaucratic reasons.
Good write up. Not to mention the other big HR constraints on DoD engineers: they almost always have to be a “US person.”
Anyone who has gotten a CAC working on a personal computer has dealt with this all too much. The DoD root certs are not part of the trust stores that commonly ship with browsers.
lol I very nearly included a rant about that but decided it was too far off topic. Not being able to smoke weed may be more of an obstacle these days though.
> engineers may have to log into a jump box via a VDI to then use Jenkins to run a Groovy script to use Terraform to deploy containers to a highly customized version of AWS.
This hits too close to home. I'm sending you my therapist's bill for this month.
> That requires the site to use the aforementioned non-standard DoD CA certs to validate the client cert from the CAC, which in turn requires that the server's TLS cert be issued by a CA in the same trust chain, which means the entire site will not work for anyone who hasn't jumped through the hoops to install the DoD CA certs. Meaning, any public-facing site has to be entirely segregated from the standard DoD PKI system. For now, that means using commercial certs, which in turn requires a vendor that meets DoD supply chain security requirements.
Is this actually all the way technically correct? As far as I know, there is no requirement that the trust chains for server certificates and client certificates are in any way related. It seems to me that it would be perfectly possible for the DoD to use its own entirely private client certificate infrastructure but to still have the server certificate use something resembling an ordinary root certificate.
This is not to say that this would actually be all that worthwhile.
> Is this actually all the way technically correct? As far as I know, there is no requirement that the trust chains for server certificates and client certificates are in any way related. It seems to me that it would be perfectly possible for the DoD to use its own entirely private client certificate infrastructure but to still have the server certificate use something resembling an ordinary root certificate.
I think you're right that it's possible in principle for a Web server to enforce use of DoD CAC (enforcing the client cert being in the DoD PKI) without itself using a DoD PKI cert on the server side.
That said, there's little benefit to it: users who haven't jumped through the hoops to install the DoD root CA certs typically won't be able to get their browsers to present their client certs to the remote server in the first place, and if we're willing to jump through those hoops anyway, there's no good reason for the DoD server not to use a DoD PKI cert itself.
I've never used one of the DoD smartcards, but I can certainly imagine the DoD wanting a user of one of these smartcards to be able to use it with a COTS client device to authenticate themselves.
Sure, people do that all the time. After they run "InstallRoot" to install the DoD root certs on their COTS device, that is. I'm honestly not sure any major browser will let you use a client smartcard unless the smartcard's certificate chains to a root in the trust store used by the browser, so this part seems unavoidable.
FWIW I just tested it and yes you can run a web server using a commercial server cert that enforces client PKI tied to the client having a DoD PKI cert. It works just fine.
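For what it's worth, the split is expressible directly in ordinary server configuration. A minimal nginx sketch (file paths and server name hypothetical) in which the server presents a commercially issued certificate while validating client certificates against a DoD CA bundle only:

```nginx
server {
    listen 443 ssl;
    server_name example.mil;  # hypothetical

    # Server identity: commercial cert, trusted by stock browsers
    ssl_certificate     /etc/nginx/tls/commercial-fullchain.pem;
    ssl_certificate_key /etc/nginx/tls/commercial-key.pem;

    # Client auth: validated only against the DoD root/intermediate bundle
    ssl_client_certificate /etc/nginx/tls/dod-ca-bundle.pem;
    ssl_verify_client on;
    ssl_verify_depth 3;
}
```

The two trust decisions are controlled by independent directives (`ssl_certificate` vs `ssl_client_certificate`), which is why the chains don't have to be related.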
> I'm honestly not sure any major browser will allow you to use a client smartcard without having the smartcard's certificate chain to the trust store used by the browser so this part seems unavoidable.
It’s been a while, but I’ve used file-backed client certs issued by a private CA in an ordinary browser without installing anything into the trust store, and it worked fine. I don’t see why a client cert using PKCS11 or any other store would work any differently. Why would the browser want to verify a certificate chain at all?
I'm really just talking about the browser trusting the user cert itself. I've done the softcert thing myself before, I forget if it used commercial root CA or not but it did work.
I guess you could flag the leaf (user) cert as ultimately trusted and that should be fine, but if the browser doesn't see that trust notation, and does see an intermediate CA, it's going to try to pull that back to a trusted root.
One way or the other the user will have to fiddle with browser settings to make a CAC work, either to tell the browser to trust their cert explicitly, or to have the browser trust DoD certs.
No, unfortunately it is not correct. You can verify client certs against a different CA from the one that issued the server certificate presented in the handshake. There's no need for them to be related at all.
Critically, you probably want to use a custom CA for client certs. The usual implementation logic in servers is "is this client cert signed by a CA I trust?" If that trusted CA is, say, Let's Encrypt, then an awful lot of certificates will pass that check.
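To make that concrete, here's a toy Python sketch (certs modeled as plain dicts, all names hypothetical; real servers do cryptographic chain validation, but the trust logic is the same) of why "signed by a CA I trust" is too permissive when the trusted CA is a public one:

```python
PUBLIC_CA = "Let's Encrypt"      # issues certs to anyone who controls a domain
PRIVATE_CA = "Example Corp CA"   # hypothetical private CA, issues only to our clients

def naive_check(cert: dict, trusted_ca: str) -> bool:
    """The usual server-side client-auth logic: signed by a CA I trust?"""
    return cert["issuer"] == trusted_ca

attacker_cert = {"issuer": PUBLIC_CA, "subject": "attacker.example"}
employee_cert = {"issuer": PRIVATE_CA, "subject": "alice@example.com"}

# Trusting a public CA for client auth lets any of its certs through:
assert naive_check(attacker_cert, PUBLIC_CA)

# Trusting a private CA narrows the set to certs you actually issued:
assert naive_check(employee_cert, PRIVATE_CA)
assert not naive_check(attacker_cert, PRIVATE_CA)
```

Pinning a private CA (or additionally checking the cert's subject against an allowlist) is the usual way to close the gap.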
Except for people like me who struggle to wake up before dawn. And whether people prefer light after work doesn't change the available scientific evidence which suggests there are significant negative health effects of waking up too early relative to sunrise, but no significant health benefits from having sunlight hours after work. People's preferences in this case are generally only mildly held and typically are not well informed by the science. I suspect if more people were aware of the deleterious health effects, their stated preferences would change.
I've worked on mission planning software for parachute systems, and the precision we can already achieve is extremely high. Given how poorly this seems to scale, the only use case that makes any sense to me would be something like sensor drops, which are among the only payloads small enough for these chutes. Or potentially drogues on multi-stage systems, but I'm not sure they'd even be useful there, because a fast descent is usually part of the appeal of a drogued payload, and not just to reduce time exposed to wind drift (e.g., to reduce the time it is vulnerable to enemy fire).
Our inflammation responses evolved in part to help us fight off pathogens, but people in modern society are exposed to far, far fewer pathogens than even our immediate ancestors were as recently as 70 years ago when diseases like polio and mumps were still common. As a result many people have an overactive inflammation response relative to the pathogen load to which they are regularly exposed.
In extreme cases, that can manifest as autoimmune disease, when overly strong inflammation or other immune responses end up attacking not just foreign pathogens but the person's body itself. As another poster said, inflammation is a blunt instrument. It's a knob that can only be turned up or down, across the entire body. If you turn it down too far, you risk infectious illness. And if you turn it up too far, you risk damage to your organs.
Interestingly, there was a substantial increase in the incidence of autoimmune diseases in Europe in the generations following the Black Death, probably because people with excessively strong immune responses were more likely to survive exposure to plague bacteria. Celiac disease or MS will kill someone much, much more slowly than bubonic plague will, so a disproportionate number of people with those or similar autoimmune disorders survived to pass on their genes.