July 29th, 2024

DigiCert Revocation Incident (CNAME Domain Validation)

DigiCert reported a certificate revocation incident affecting 0.4% of domain validations due to improper Domain Control Verification. Customers must replace affected certificates promptly and follow reissue procedures.


DigiCert has announced a certificate revocation incident affecting approximately 0.4% of its domain validations due to improper Domain Control Verification (DCV). The issue arose from the omission of an underscore prefix in DNS CNAME records used for validation, which is required to prevent potential collisions with actual domain names. Because this does not comply with the CA/Browser Forum (CABF) rules, the affected certificates must be revoked within 24 hours of discovery. Customers impacted by this incident have been notified and must replace their certificates promptly by logging into their DigiCert accounts.

The root cause of the issue was traced back to a modernization effort in DigiCert's validation systems, which inadvertently removed the automatic addition of the underscore prefix during the validation process. Although regression testing was conducted, it did not catch this specific change in functionality. DigiCert has since implemented a solution to ensure that the underscore prefix is automatically included in future random value generations, regardless of the validation method chosen.
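As a rough illustration of the fix described above, the sketch below always prepends the underscore when a random validation label is generated and refuses to build a record name without it. The function names, the 16-byte length, and the hex encoding are assumptions made for this example, not DigiCert's actual implementation.

```python
import secrets


def make_dcv_label(nbytes: int = 16) -> str:
    """Return a hypothetical validation label: '_' followed by random hex."""
    # The underscore is added here, at generation time, so every
    # validation method downstream receives an already-prefixed value.
    return "_" + secrets.token_hex(nbytes)


def dcv_record_name(domain: str, label: str) -> str:
    """Build the DNS name the customer would publish a CNAME record at."""
    if not label.startswith("_"):
        raise ValueError("validation label must begin with an underscore")
    return f"{label}.{domain}"


label = make_dcv_label()
print(dcv_record_name("example.com", label))
```

Putting the prefix in the generator rather than in each validation workflow is one way to make the property hold "regardless of the validation method chosen".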

Customers are advised to generate new Certificate Signing Requests (CSRs) and follow the reissue process outlined in their accounts. DigiCert has expressed regret for any business disruptions caused by this incident and is prepared to assist customers in validating their domains and issuing replacement certificates.

Related

Entrust certificates will not be trusted in Chrome 127+

The Chrome Root Program Policy is updating trust for Entrust CAs due to compliance issues. Entrust must show improvement to maintain trust. Chrome will oversee changes to safeguard users and the web.

Cloudflare 1.1.1.1 incident on June 27, 2024

Cloudflare faced a global incident on June 27, 2024, with its 1.1.1.1 DNS resolver due to BGP hijacking and a route leak. The incident affected some users, and Cloudflare responded by disabling peering locations and engaging with network operators to resolve the issue.

Telekom Security: Revocation delay for TLS certificates

Telekom Security experienced a delay in revoking TLS certificates, affecting 336 certificates due to basicConstraints not marked as critical. Efforts were made to prompt customers for replacement within 5 days. Lessons included the need for customer sensitization and faster certificate replacement procedures. Automation via protocols like ACME was considered for future processes. Stakeholders questioned the delay, but Telekom Security defended its decision based on low security risk and impact on critical infrastructures. The incident underscored challenges faced by CAs in ensuring timely revocation and the importance of continuous improvement for industry standards and trust.

Deutsche Telekom issued invalid certificates, hasn't revoked them since 6 months

Telekom Security faced delays in revoking TLS certificates, impacting critical infrastructures. Efforts were made to replace 336 certificates within 5 days, highlighting the need for faster procedures and customer sensitization. Mozilla raised concerns about the response, emphasizing the importance of compliance with industry standards.

CrowdStrike admits faulty content update wasn't tested on a real machine

CrowdStrike acknowledged a bug in its software that caused 8.5 million Windows machines to crash due to a faulty update. The company plans to enhance testing protocols and update validation processes.

9 comments
By @agwa - 6 months
> The underscore prefix ensures that the random value cannot collide with an actual domain name that uses the same random value. While the odds of that happening are practically negligible, the validation is still deemed as non-compliant if it does not include the underscore prefix.

That's not the rationale for mandating the underscore prefix. The actual reason is so services that allow users to create DNS records at subdomains (e.g. dynamic DNS services) can block users from registering subdomains starting with an underscore. It serves the same purpose that /.well-known does.

For example, if an attacker requests a certificate for dyndns.example and DigiCert gives them a record without an underscore prefix like da39a3ee5e6b4b0d3255bfef95601890afd80709.dyndns.example, they can register that subdomain with the dynamic DNS provider, publish the required record, and get the certificate for dyndns.example. It doesn't matter how much entropy DigiCert put in the record name.

I definitely commend DigiCert for pledging to revoke the certificates within 24 hours and not having a delayed revocation or trying to language lawyer their way to a 5 day revocation as other CAs have tried. Nevertheless, this post severely minimizes the security impact of their mistake, and provides an excellent example of why CAs should always be required to strictly adhere to the rules and not be permitted to excuse noncompliance based on their own security analysis.

By @kevinday - 6 months
https://bugzilla.mozilla.org/show_bug.cgi?id=1910322

for more background. The short story is that when doing CNAME based validation, they were supposed to put an underscore at the start of the random string for you to add to your DNS records. They still generated sufficiently random strings but didn't include a _ before it which is in violation of the RFC. The rationale is that some sites might do something like give you control of yourusername.example.com and they don't want to make it possible for random users to register the random string and be able to manipulate it. If you don't allow users to generate anything that causes a hostname to appear with a leading underscore, they can't pass the domain validation.
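To make the argument concrete, here is a minimal sketch of the kind of label check a dynamic DNS or username-subdomain service might apply. The function name and the exact pattern are assumptions for illustration. The point is only that a conventional hostname rule already rejects a leading underscore, so an underscore-prefixed validation name cannot be registered by an ordinary user, while a bare random string can.

```python
import re

# Conventional hostname label: letters, digits and hyphens, 1 to 63
# characters, not starting or ending with a hyphen. Underscores are
# not allowed anywhere, so "_..." labels are rejected.
_HOST_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def may_register(label: str) -> bool:
    """Return True if a user may claim <label>.<provider domain>."""
    return _HOST_LABEL.fullmatch(label.lower()) is not None


random_value = "da39a3ee5e6b4b0d3255bfef95601890afd80709"

# Without the prefix, the random value is an ordinary registrable label.
print(may_register(random_value))        # True
# With the prefix, the provider's existing rule blocks it.
print(may_register("_" + random_value))  # False
```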

By @olliej - 6 months
One of the impacted companies filed a restraining order, because they believe their incompetence is more important than basic functionality of the PKI. Can't wait to hear how they expect to respond if they ever encounter a cert compromise or actual misissuance, maybe they'll demand 24 hour revocation in that case?

Honestly my opinion is that this should trigger the company being banned by all CAs.

The company in question is Alegeus Technologies LLC: https://www.courtlistener.com/docket/68995396/alegeus-techno...

From basic googling it looks like a healthcare provider, so exactly the kind of company you would want to have shitty IT and security infrastructure. A++ work. Absolutely stellar.

By @jiggawatts - 6 months
I just want to call out both CrowdStrike and DigiCert for being one of "those" companies that insist on publishing critical support information behind a login with the clock ticking on a global outage of their own making.

There are no polite words that I can use to accurately convey the depth of my disappointment at this kind of inconsiderate behaviour during a crisis, so I won't say anything more.

By @ratg13 - 6 months
24h notice to change certificates in who knows how many systems, at the world's largest companies, while everyone is on vacation.

This will be interesting.

By @256_ - 6 months
> While we had regression testing in place, those tests failed to alert us to the change in functionality because the regression tests were scoped to workflows and functionality instead of the content/structure of the random value. [...]

> Unfortunately, no reviews were done to compare the legacy random value implementations with the random value implementations in the new system for every scenario.

In other words, they didn't do proper testing. At the bottom of the article they suggest they're going to improve it.
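For illustration, a test scoped to the content/structure of the random value, rather than to the workflow around it, could look something like the sketch below. Both generator functions are hypothetical stand-ins (one for the legacy behaviour, one reproducing the regression), and the assumed shape of "underscore plus 32 hex characters" is made up for the example.

```python
import secrets
import string

HEX = set(string.hexdigits.lower())


def structure_problems(value: str) -> list[str]:
    """List the ways a value deviates from the assumed '_' + 32 hex shape."""
    problems = []
    if not value.startswith("_"):
        problems.append("missing underscore prefix")
    body = value.lstrip("_")
    if len(body) != 32 or not set(body) <= HEX:
        problems.append("body is not 32 lowercase hex characters")
    return problems


def legacy_value() -> str:
    # Hypothetical stand-in for the old system, which added the prefix.
    return "_" + secrets.token_hex(16)


def migrated_value() -> str:
    # Hypothetical stand-in for the new system with the prefix dropped.
    return secrets.token_hex(16)


print(structure_problems(legacy_value()))    # []
print(structure_problems(migrated_value()))  # ['missing underscore prefix']
```

A workflow-level test would pass for both generators, since a value is produced and the flow completes either way. Only a check on the value itself tells them apart.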

By @Apfel - 6 months
Is this a potential cause of the current Azure outages hitting Western Europe? I know DigiCert are used by Azure extensively...

By @notemaker - 6 months
Can someone explain why this issue deserves a 24h notice?

Seems more reasonable to me to have a much longer deprecation notice.