August 7th, 2024

Jeremy Rowley resigns from DigiCert due to mass-revocation incident

DigiCert identified a bug allowing certificate issuance without an underscore prefix, affecting 83,267 certificates. They plan revocation within 24 hours, but critical sector customers may face reissuance challenges.


DigiCert is currently addressing a potential issue related to its DNS-based validation method for certificate issuance. A recent code review revealed that a bug in the system could allow certificates to be issued without the required underscore prefix in the CNAME resource record. This issue was identified during an investigation prompted by a certificate problem report. DigiCert supports various DNS verification methods, and the bug was found specifically in the implementation of one method where the underscore prefix was not being appended correctly. Although the issue was inadvertently resolved during a user-experience enhancement project, DigiCert is still gathering information on the impacted certificates. They have identified approximately 83,267 certificates affecting 6,807 subscribers and are preparing to initiate revocation within a 24-hour timeframe. However, some customers, particularly those in critical sectors, may face challenges in reissuing certificates without service interruptions. DigiCert is committed to adhering to the CA/Browser Forum Baseline Requirements and is actively engaging with stakeholders to address the situation.

- DigiCert discovered a bug allowing certificate issuance without an underscore prefix in CNAME records.

- Approximately 83,267 certificates affecting 6,807 subscribers are identified for potential revocation.

- The investigation was prompted by a certificate problem report, and the bug was confirmed during a code review.

- DigiCert is committed to compliance with CA/Browser Forum Baseline Requirements.

- Some customers in critical sectors may struggle with timely certificate reissuance.
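
To illustrate the check at the heart of the bug, here is a minimal Python sketch. It is not DigiCert's code. It assumes a validation scheme in which the subscriber creates a CNAME record whose leftmost label is an underscore followed by a CA-supplied random value; the function names, the example domain, and the use of `secrets.token_hex` are all hypothetical.

```python
import secrets


def build_validation_label(random_value: str) -> str:
    """Return the DNS label the subscriber is asked to create (hypothetical helper)."""
    # The underscore prefix is the part the buggy code path failed to add.
    return "_" + random_value


def is_compliant_validation_name(record_name: str, domain: str, random_value: str) -> bool:
    """Check that record_name is exactly '_<random_value>.<domain>'."""
    name = record_name.rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    suffix = "." + domain
    if not name.endswith(suffix):
        return False
    # Everything left of the domain must be the single underscore-prefixed label.
    label = name[: -len(suffix)]
    return label == "_" + random_value.lower()


if __name__ == "__main__":
    value = secrets.token_hex(16)  # 128 random bits, hex encoded (assumed format)
    good = f"{build_validation_label(value)}.example.com"
    bad = f"{value}.example.com"  # same value, underscore missing
    print(good, is_compliant_validation_name(good, "example.com", value))
    print(bad, is_compliant_validation_name(bad, "example.com", value))
```

The usual rationale given for the prefix is that ordinary hostnames cannot contain underscores, so a label starting with `_` is hard to obtain from a service that merely hands out subdomains to its users.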

Related

Entrust certificates will not be trusted in Chrome 127+

The Chrome Root Program Policy is updating trust for Entrust CAs due to compliance issues. Entrust must show improvement to maintain trust. Chrome will oversee changes to safeguard users and the web.

Telekom Security: Revocation delay for TLS certificates

Telekom Security experienced a delay in revoking TLS certificates, affecting 336 certificates due to basicConstraints not marked as critical. Efforts were made to prompt customers for replacement within 5 days. Lessons included the need for customer sensitization and faster certificate replacement procedures. Automation via protocols like ACME was considered for future processes. Stakeholders questioned the delay, but Telekom Security defended its decision based on low security risk and impact on critical infrastructures. The incident underscored challenges faced by CAs in ensuring timely revocation and the importance of continuous improvement for industry standards and trust.

Deutsche Telekom issued invalid certificates, hasn't revoked them since 6 months

Telekom Security faced delays in revoking TLS certificates, impacting critical infrastructures. Efforts were made to replace 336 certificates within 5 days, highlighting the need for faster procedures and customer sensitization. Mozilla raised concerns about the response, emphasizing the importance of compliance with industry standards.

DigiCert Revocation Incident (CNAME Domain Validation)

DigiCert reported a certificate revocation incident affecting 0.4% of domain validations due to improper Domain Control Verification. Customers must replace affected certificates promptly and follow reissue procedures.

Health industry company sues to prevent certificate revocation

Alegeus Technologies has sued DigiCert to prevent the revocation of security certificates due to a flaw in validation. A Temporary Restraining Order has been granted, complicating compliance and raising security concerns.

20 comments
By @braiamp - 5 months
I'm with amir in comment 23 and with Aaron in previous comments. Stuff happens. And when there are multiple moving pieces, the process and policies are the issue, not the individuals. Since individuals rarely have a complete overview of the entire system.

Also, as noted in the comments, it sets a bad precedent for people coming forward reporting issues.

By @bryan0 - 5 months
I think this comment from the thread sums it up:

“When DigiCert has another incident (and while I have tremendous faith in Tim, it will happen), I would rather that they have Jeremy Rowley with his wisdom and scar tissue around to guide their response and subsequent improvement.”

By @cebert - 5 months
> “The code worked in our original monolithic system but was not implemented properly when we moved to our micro-services systems.”

This could happen to anyone, but imagine being the developer or development team that made this mistake.

By @upon_drumhead - 5 months
For those of you who don't know who he is, he was the Chief Information Security Officer

https://www.digicert.com/blog/author/jeremy-rowley

By @McGlockenshire - 5 months
This should probably be a direct link to the comment announcing the resignation: https://bugzilla.mozilla.org/show_bug.cgi?id=1910322#c17

By @gklitz - 5 months
> The ultimate root cause ended up being me. I have led the compliance team for the past several years. The fact this went unnoticed in our many reviews during that time shows that we need a different approach to both our internal investigations and compliance controls. I also dropped the ball on the certificate problem report by failing to escalate the issue to engineering and give it the proper attention it deserved. Although I did some investigation, I failed to treat the allegations with sufficient seriousness based on what could have been wrong. I assumed I knew the systems and what was happening in them rather than deeply investigating the report. Finally, I didn’t do enough to eliminate the silos between compliance and engineering.

Really does sound like he personally dropped the ball in the handling of the report. It would be interesting to hear the story from the researcher, who will undoubtedly have been frustrated beyond reason that they kept acting like there was no issue despite repeated, persistent attempts at getting them to take it seriously.

By @langsoul-com - 5 months
I question why we accept someone resigning after making a big mistake.

Unless it's malice, or the fault truly is entirely on that person, what good would resigning do?

Rowley admitted he fucked up, badly, he admitted on several layers what must be changed. How he must change. How the org must change. How the way things are presently is not good enough. Made an extremely deep dive into what happened.

And now he's leaving??? Someone who royally messes up would not want to mess up on the same issue twice. So all that experience is now worthless and doesn't benefit DigiCert in the slightest.

By @fallingsquirrel - 5 months
> We note that other customers have also initiated legal action against us to block revocation.

This seems crazy to me. In what world does suing your business partner make more sense than clicking some buttons in a UI or running some shell commands to renew your cert?

By @hugneutron - 5 months
I imagine someone as articulate and humble as that guy is going to land on his feet. That was a really good write up.

By @amluto - 5 months
One thing I find odd about this: the rules for CAs are long and detailed, but they don’t seem especially complicated. If I were implementing a CA, I would have the main code (their “service oriented architecture” or a monolith or whatever) produce not just an instruction to issue a certificate but a transcript of the entire exchange. Then a completely separate code path (plain old synchronous Rust or Python or Go or Haskell or ML — no microservices) would check the transcript for compliance with each clause of the requirements and block issuance if anything fails. And raise an alert that gets noticed.

One could even get fancy and use verifiable randomness for everything in the protocol that is supposed to be random.

And then one could refactor some other code with much less worry about messing up.

This might also reduce the blast radius from a bug in some other component. If the magic random string generator can be coerced into returning ‘www’, then a separate check would prevent this from compromising everything.

(I work in a different industry, and in my industry there is plenty of complex, evolving code, that needs to do the right thing. The more competent players have separate verification code as a double-check.)
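
The separate-verifier idea in the comment above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the commenter's or any CA's code: the `Transcript` fields, the check functions, and the three rules are hypothetical simplifications, and the 112-bit and 30-day thresholds are taken from the Baseline Requirements' treatment of random values as best understood here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record of one validation exchange, emitted by the main system."""
    domain: str
    record_name: str        # DNS name where the CNAME was actually found
    random_value: str       # value the CA handed to the applicant
    random_bits: int        # entropy claimed by the generator (self-reported)
    random_created_at: datetime
    validated_at: datetime


# Each check returns None on success or a human-readable violation.
Check = Callable[[Transcript], Optional[str]]


def check_underscore_prefix(t: Transcript) -> Optional[str]:
    expected = f"_{t.random_value}.{t.domain}"
    if t.record_name != expected:
        return f"record name {t.record_name!r} is not {expected!r}"
    return None


def check_entropy(t: Transcript) -> Optional[str]:
    if t.random_bits < 112:
        return f"random value has only {t.random_bits} bits of entropy"
    return None


def check_freshness(t: Transcript) -> Optional[str]:
    if t.validated_at - t.random_created_at > timedelta(days=30):
        return "random value was older than 30 days at validation time"
    return None


CHECKS: List[Check] = [check_underscore_prefix, check_entropy, check_freshness]


def audit(t: Transcript) -> List[str]:
    """Run every check independently and collect all violations."""
    return [msg for check in CHECKS if (msg := check(t)) is not None]


def may_issue(t: Transcript) -> bool:
    """Issuance is blocked if any single check fails."""
    return not audit(t)
```

In this sketch a generator that was coerced into returning `www` fails both the prefix and the entropy check, so `may_issue` returns `False` no matter what the main code path decided. A real verifier would also need to raise an alert and measure entropy itself rather than trust a self-reported figure.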

By @mr_toad - 5 months
> We note that other customers have also initiated legal action against us to block revocation.

How can it make economic sense to initiate a lawsuit rather than just get new certificates?

By @xyst - 5 months
I have low expectations of C-level executives. But this incident and his response to it have changed my perspective of them just slightly.

It's a rare incident where a C-level executive actually takes accountability for their fuck up. Shit rolls down hill. He is very likely to end up taking the helm at another place or a startup of his own. He is the exact opposite of the CrowdStrike CEO (George Kurtz), who caused an absolute shitstorm compared to the DigiCert incident.

By @Banditoz - 5 months
> We also found that the bug in the code was inadvertently remediated when engineering completed a user-experience enhancement project that collapsed multiple random value generation microservices into a single service.

Interesting. What is the value of a microservice that generates random numbers over just using a language's SecureRandom equivalent?

By @23B1 - 5 months
He resigned with honor, grace and responsibility – and should be applauded.

This is what real accountability looks like, and doing so not only preserves the reputation and trustworthiness of his employer, but demonstrates that he is a valuable contributor and trustworthy individual. He will land on his feet as a result.

By @jtc331 - 5 months
I don’t quite follow why a missing underscore results in a security problem. It seems like it must be somehow related to what’s valid for CNAME records?

By @sneak - 5 months
Given that a revocation is simply a publication of additional data by a CA, and does not directly affect the customer’s systems, how is the TRO in this case not unconstitutional? I’m not a lawyer but it feels like prior restraint, no?

By @RevEng - 5 months
While I applaud his openness and willingness to take accountability, I agree with others that resigning shouldn't be necessary.

Resigning is what you do when you are clearly not fit for your post. Jeremy has demonstrated that he is anything but unfit. People that can see where things went wrong, who can communicate such, can come up with changes to fix those issues, and can implement them are exactly what is needed at such a high level of management. Most people would bury the story or claim ignorance, but Jeremy doesn't hide anything and takes full responsibility.

I wish Jeremy could have stayed and used this honesty and insight to make the necessary changes. Firing a C-level executive when things go wrong doesn't fix anything any more than finding a low level engineer to blame and fire. Experienced people learn lessons by making mistakes. It sucks that it happens, but unexpected circumstances can't be foreseen. Hindsight is 20/20. Now that they know, they know to look out for it and to change the system to prevent it next time.

Perhaps he did overlook it. Perhaps he didn't respond when he should have. It's easy to get complacent. This is a wake up call. I have no doubt that he would be much more attentive and responsive as a result of this, and as such, be exactly what's needed for his post.

Mistakes don't call for sacrifices; they call for systematic changes to prevent making the same mistakes again.

Thank you Jeremy for being as forthcoming as you have been. I only wish more C-level execs would do the same. I hope you find a good place to land where you can take this experience and do an even better job. And I hope that whoever replaces you can bring the same rigor and professionalism that you brought.

By @_3u10 - 5 months
Sounds like inside baseball