Technical details of Firefox's recent add-on disabling

Translator's note: for the convenience of readers, times are given in Moscow time.

We recently missed the expiration date for one of the certificates used to sign add-ons. This resulted in the add-ons being disabled for users. Now that most of the problem is fixed, I would like to share the details of what happened and the work done.

Background: add-ons and signatures

While many people use the browser out of the box, Firefox supports extensions called "add-ons". With their help, users add various features to the browser. There are over 15 thousand add-ons, ranging from ad blockers to tools for managing hundreds of tabs.

Installed add-ons must have a digital signature, which protects users from malicious add-ons and requires only minimal review of add-ons by Mozilla staff. We introduced this requirement in 2015 because we were experiencing serious problems with malicious add-ons.

How it works: every copy of Firefox contains a "root certificate". The key for this root is stored in a Hardware Security Module (HSM) that has no network access. Every few years, a new "intermediate certificate" is signed with this key, and that intermediate certificate is used when signing add-ons. When a developer submits an add-on, we create a temporary "end-entity certificate" (also called the end or leaf certificate) and sign it using the intermediate certificate. The add-on itself is then signed with the end-entity certificate. Schematically, the chain is: root certificate → intermediate certificate → end-entity certificate → add-on.

Note that each certificate has a "subject" (to whom the certificate was issued) and an "issuer" (who issued the certificate). In the case of a root certificate, "subject" = "issuer", but for other certificates, the issuer of a certificate is the subject of the parent certificate that signed it.
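The subject/issuer relationship can be illustrated with a minimal Python sketch. The `Cert` class, the `chain_is_linked` helper, and all the names in it ("Root CA", "Signing CA", "example-addon") are hypothetical and only model the naming rule described above; real certificates also carry keys, signatures, and validity dates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # to whom the certificate was issued
    issuer: str   # who issued the certificate

    @property
    def is_self_issued(self) -> bool:
        # For a root certificate, subject and issuer are the same.
        return self.subject == self.issuer


def chain_is_linked(chain):
    """Check the naming rule for a chain ordered leaf first, root last."""
    if not chain:
        return False
    links_ok = all(
        child.issuer == parent.subject
        for child, parent in zip(chain, chain[1:])
    )
    return links_ok and chain[-1].is_self_issued


# Hypothetical names, for illustration only.
root = Cert(subject="Root CA", issuer="Root CA")
intermediate = Cert(subject="Signing CA", issuer="Root CA")
end_entity = Cert(subject="example-addon", issuer="Signing CA")

print(chain_is_linked([end_entity, intermediate, root]))
print(chain_is_linked([end_entity, root]))
```

The second call skips the intermediate certificate, so the issuer of the end-entity certificate no longer matches the subject of the next certificate and the check fails.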

An important point: each add-on is signed by a unique end-entity certificate, but almost always these end-entity certificates are signed by the same intermediate certificate.

Author's note: very old add-ons are the exception; different intermediate certificates were in use at that time.

This intermediate certificate is what caused the problem: every certificate is valid only for a fixed period. Before or after that period the certificate is invalid, and the browser will not use add-ons signed with it. Unfortunately, the intermediate certificate expired on May 4 at 4:00 am.
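The validity rule amounts to a simple window check. The sketch below is illustrative only: the function name is made up, the year (2019) and the start date of the window are assumptions not stated in the text, and the expiry time is taken as 4:00 Moscow time.

```python
from datetime import datetime, timedelta, timezone

MSK = timezone(timedelta(hours=3))  # Moscow time, UTC+3


def is_within_validity(not_before, not_after, now):
    """A certificate is usable only inside its validity window."""
    return not_before <= now <= not_after


# Assumed dates: the year and the start of the window are illustrative.
not_before = datetime(2017, 5, 4, 4, 0, tzinfo=MSK)
not_after = datetime(2019, 5, 4, 4, 0, tzinfo=MSK)

just_before = not_after - timedelta(minutes=1)
just_after = not_after + timedelta(minutes=1)

print(is_within_validity(not_before, not_after, just_before))
print(is_within_validity(not_before, not_after, just_after))
```

One minute before the expiry moment the check passes; one minute after it fails, and every add-on whose chain goes through that certificate fails with it.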

The consequences did not appear immediately. Firefox does not check the signatures of installed add-ons continuously, but roughly once every 24 hours, and the check time differs for each user. As a result, some people ran into problems immediately and others much later. We first became aware of the problem around the time the certificate expired and immediately started looking for a solution.

Limiting the damage
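To see why the breakage was spread over a day, here is a small sketch of a per-user 24-hour schedule. It assumes a strictly periodic check, which simplifies the "about once every 24 hours" behaviour; the function name, the example users, and the year are hypothetical.

```python
from datetime import datetime, timedelta


def first_check_after(expiry, last_check, interval=timedelta(hours=24)):
    """Return the first scheduled signature check at or after the expiry."""
    if last_check >= expiry:
        return last_check
    elapsed = expiry - last_check
    periods = -(-elapsed // interval)  # ceiling division on timedeltas
    return last_check + periods * interval


expiry = datetime(2019, 5, 4, 4, 0)  # year assumed for illustration

# Two hypothetical users whose last checks happened at different times.
user_a = first_check_after(expiry, expiry - timedelta(hours=23))
user_b = first_check_after(expiry, expiry - timedelta(hours=1))

print(user_a - expiry)  # delay before user A's add-ons are disabled
print(user_b - expiry)  # delay before user B's add-ons are disabled
```

A user who last checked 23 hours before the expiry is hit one hour after it, while a user who checked just one hour before the expiry keeps working add-ons for another 23 hours.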

Once we realized what had happened, we tried to prevent the situation from worsening.

First, we stopped accepting and signing new add-ons. It makes no sense to use an expired certificate for this. In hindsight, we could have left things as they were. Accepting add-ons has since resumed.

Second, we immediately shipped a fix that suppressed the daily signature check. This protected users whose browsers had not yet rechecked their add-ons in the previous 24 hours. That fix has since been withdrawn and is no longer needed.

Working in parallel

In theory, the solution looks simple: create a new valid intermediate certificate and re-sign every add-on. Unfortunately, this won't work:

  • we cannot quickly re-sign 15 thousand add-ons at once; the system is not designed for such a load
  • after we re-sign the add-ons, the updated versions have to reach users. Most add-ons are installed from Mozilla servers, so Firefox would find the updates within the next 24 hours, but some developers distribute signed add-ons through third-party channels, so users would have to update those add-ons manually

Instead, we tried to develop a fix that would reach all users with little or no action on their part.

Pretty quickly, we came up with two main strategies, which we used in parallel:

  • Update Firefox to change the validity period of the certificate. This would make existing add-ons magically work again, but it would require a new build of Firefox to be released and shipped.
  • Generate a valid certificate and somehow convince Firefox to accept it instead of the existing expired one.

We decided to deploy whichever option proved workable first. In the end, we shipped the second fix (the new certificate), which is described below.

Certificate replacement

As mentioned above, this approach required two steps:

  • create a new valid certificate
  • install it remotely in Firefox

To understand why this would work, let's take a closer look at the add-on validation process. The add-on itself comes as a set of files, including the chain of certificates used for signing. As a result, the add-on can be verified as long as the browser knows the root certificate, which is built into Firefox at build time. However, as we already know, the intermediate certificate has expired, so the add-on cannot be verified.

When Firefox attempts to validate an add-on, it is not limited to the certificates contained in the add-on itself. Instead, the browser tries to build a valid certificate chain, starting with the end-entity (leaf) certificate and continuing until it reaches the root. At the first level, we start with the leaf certificate and look for a certificate whose subject is the issuer of the leaf certificate (that is, the intermediate certificate). Usually this intermediate certificate comes with the add-on, but any certificate from the browser's certificate store can also play this role. If we can remotely add a new valid certificate to the certificate store, Firefox will try to use it. [Figure: the situation before and after installing the new certificate.]

Once the new certificate is installed, Firefox has two options when checking the certificate chain: the old, expired certificate (which won't work) or the new, valid one (which will). It is important that the new certificate contains the same subject name and public key as the old one, so that the existing signature on the end-entity (leaf) certificate still verifies against it. Firefox is smart enough to try both until it finds one that works, so the add-ons will be verified again. Note that this is the same logic we use to validate TLS certificates.
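The "try every candidate until one works" behaviour can be sketched as a small recursive search. This is a toy model, not Firefox's actual verifier: keys and signatures are replaced by plain strings, and every name and date in it is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    public_key: str   # stand-in for a real public key
    signed_with: str  # stand-in for "the signature verifies under this key"
    not_after: datetime


def build_chain(cert, store, trusted_roots, now, depth=0):
    """Return a valid chain [cert, ..., root], or None if none exists."""
    if cert.not_after < now or depth > 8:
        return None
    if cert in trusted_roots:
        return [cert]
    for candidate in store:
        if candidate is cert:
            continue
        # A usable issuer must match by name and by key.
        if candidate.subject != cert.issuer:
            continue
        if candidate.public_key != cert.signed_with:
            continue
        rest = build_chain(candidate, store, trusted_roots, now, depth + 1)
        if rest is not None:
            return [cert] + rest
    return None


now = datetime(2019, 5, 5)  # assumed date, after the expiry
root = Cert("Root CA", "Root CA", "root-key", "root-key", datetime(2030, 1, 1))
old_int = Cert("Signing CA", "Root CA", "int-key", "root-key", datetime(2019, 5, 4))
new_int = Cert("Signing CA", "Root CA", "int-key", "root-key", datetime(2025, 1, 1))
leaf = Cert("example-addon", "Signing CA", "leaf-key", "int-key", datetime(2029, 1, 1))

before = build_chain(leaf, [old_int, root], [root], now)
after = build_chain(leaf, [old_int, new_int, root], [root], now)
print(before)
print([c.not_after.year for c in after])
```

With only the expired intermediate in the store the search fails. Once the replacement with the same subject and key is added, the search skips the expired candidate and succeeds through the new one, without the leaf certificate or the add-on changing at all.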

Author's note: Readers familiar with WebPKI will notice that cross-certificates work exactly the same way.

The great thing about this fix is that it doesn't require re-signing existing add-ons. As soon as the browser receives the new certificate, all add-ons will work again. What remains is the challenge of delivering the new certificate to users (automatically and remotely) and getting Firefox to recheck disabled add-ons.

Normandy and the studies system

Ironically, this problem is solved by a special kind of add-on called a "system add-on". In order to conduct studies, we developed a system called Normandy that delivers system add-ons to users. These add-ons run automatically in the browser and have extended access to Firefox's internal APIs. In particular, a system add-on can add new certificates to the certificate store.

Author's Note: We're not adding a certificate with any special privileges; it's signed with a root certificate, so Firefox trusts it. We simply add it to the pool of certificates that can be used by the browser.

So the solution is to create a system add-on that:

  • installs the new certificate we created
  • forces the browser to recheck disabled add-ons so that they work again

"But wait," you say, "add-ons don't work, so how do I run the system add-on?" We sign it with the new certificate!

Putting it all together… why did it take so long?

So, the plan was to issue a new certificate to replace the old one, create a system add-on, and install it for users through Normandy. The problems, as I said, started on May 4 at 4:00, and by 12:44 the same day, less than 9 hours later, we had sent the fix to Normandy. It took another 6-12 hours for it to reach all users. Not bad, but people on Twitter have been asking why we couldn't act faster.
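The timeline arithmetic can be checked with a few lines of Python. The clock times and the 6-12 hour rollout estimate come from the text; the year is an assumption added so that the dates are well-formed.

```python
from datetime import datetime, timedelta

# Moscow-time clock values from the text; the year is assumed.
expired = datetime(2019, 5, 4, 4, 0)
fix_sent = datetime(2019, 5, 4, 12, 44)

time_to_fix = fix_sent - expired
rollout_earliest = fix_sent + timedelta(hours=6)
rollout_latest = fix_sent + timedelta(hours=12)

print(time_to_fix)
print(rollout_earliest, rollout_latest)
```

The gap between expiry and the fix being sent is 8 hours 44 minutes, and the estimated rollout window runs from 18:44 on May 4 to 00:44 on May 5.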

First, it took time to issue a new intermediate certificate. As I mentioned above, the key for the root certificate is stored offline in the hardware security module. This is good from a security point of view, since the root is rarely used and must be kept secure, but it is inconvenient when you urgently need to sign a new certificate. One of our engineers had to travel to the place where the HSM is kept. Then there were several unsuccessful attempts to issue a correct certificate, and each attempt cost one or two hours of testing.

Second, developing the system add-on took some time. Conceptually it is very simple, but even simple programs require care. We wanted to make sure we didn't make things worse. The system add-on had to be tested before being sent to users. It also had to be signed, but our add-on signing system was disabled, so we had to find a workaround.

Finally, once the system add-on was ready to ship, deployment itself took time. The browser checks for Normandy updates every 6 hours. Not all computers are constantly on and connected to the Internet, so it takes time for the fix to spread to users.

Final steps

The system add-on delivered through Normandy should fix the problem for most users, but it does not reach everyone. Some users need a different approach:

  • users who have turned off studies or telemetry
  • users of the Android version (Fennec), where studies are not supported at all
  • users of custom builds of Firefox ESR in enterprises where telemetry cannot be enabled
  • users behind MitM proxies, because our add-on installation system uses key pinning, which does not work with such proxies
  • users of older versions of Firefox that do not support studies

There is nothing we can do about the last category of users - they should upgrade to a current version of Firefox anyway, because outdated versions have serious unpatched vulnerabilities. We know that some people stay on older versions of Firefox because they want to run older add-ons, but many of those add-ons have already been ported to newer versions of the browser. For the other users, we have developed a patch that installs the new certificate. It was released as a bugfix release (translator's note: Firefox 66.0.5), so people will get it - and most likely already have - through the normal update channel. If you are using a custom build of Firefox ESR, please contact your maintainer.

We understand that this is not ideal. In some cases, users lost add-on data (for example, data from the Multi-Account Containers add-on).

This side effect could not be avoided, but we believe that in the short term we have chosen the best solution for most users. In the long term, we will look for other, more advanced architectural approaches.

Lessons

First, our team did an amazing job of creating and shipping a fix within 12 hours of discovering the issue. As someone who attended the meetings, I can say that in this difficult situation people worked very hard and very little time was wasted.

Obviously, this was not supposed to happen at all. It is clearly worth adjusting our processes to reduce the likelihood of such incidents and make it easier to correct the consequences.

Next week we will publish an official post-mortem and a list of changes that we intend to make. For now, I'll share my own thoughts. First, there must be a better way to track the status of anything that is a potential time bomb. We need to be sure that we never end up in a situation where one of them suddenly goes off. We're still working out the details, but at a minimum we need an inventory of all such things.

Second, we need a mechanism to quickly deliver updates to users, even when - especially when - everything else is broken. It was great that we were able to use the studies system, but it is not a perfect tool and has some undesirable side effects. In particular, we know that many users have automatic updates turned on but would rather not participate in studies (I admit, I have them turned off too!). We still need a way to push updates to users, but whatever the internal technical implementation, users should be able to subscribe to updates (including hot fixes) and opt out of everything else. The update channel should also be more responsive than it is now. Even on May 6, there were still users who had received neither the fix nor the new version. Work on this problem was already under way, but this incident showed how important it is.

Finally, we'll review the add-on security architecture to make sure it provides the right level of security with minimal risk of breaking something.

Next week we will look at the results of a more thorough analysis of what happened, but in the meantime I will be happy to answer questions by e-mail: [email protected]

Source: linux.org.ru
