The Thunderbird team is a remote-first, globally distributed group, so it made perfect sense to devote an episode to Remote Work! Join Heather, Chris, and Jason for some useful tips and tricks to make your daily remote work more enjoyable and more productive. We also include tips from ThunderCast listeners Pedro and Mike, who emailed us at podcast@thunderbird.net. (You can do the same if something’s on your mind.)
Plus: An inside look at the upcoming Thunderbird Send service, some fascinating origin stories, and geeky Raspberry Pi solutions for weather and BBQ.
Listen to this episode on PeerTube.
August was a relatively calm month for the K-9 Mail team, with many taking well-deserved summer vacations and attending our first Mozilla All-Hands event. Despite the quieter pace, we managed to hit a significant milestone on our journey to Thunderbird for Android: the beta release of our new account setup interface.
Beta Release with New Account Setup: We Want Your Feedback!
We’re thrilled to announce that we rolled out a beta version featuring the new account setup UI. This has been a long-awaited feature, and even though the team was partially on vacation, we managed to get it out for user testing. The initial feedback has been encouraging, and we’re eager to hear your thoughts.
Screenshots: Add your email address · Configuration found through autoconfig · Authenticate with your OAuth provider · Adjust your account options
If you’ve tried the beta, we’d love to get your feedback. What did you like? What could be improved? Your insights will help us refine the feature for its official release.
How to Provide Feedback
You can provide feedback through the following channels:
The Thunderbird team is back from Mozilla’s All-Hands event, and we’re overwhelmed in the most positive way. In addition to the happy and positive vibes we’re feeling from meeting our colleagues in person for the first time, we have a lot of thoughts and impressions to share. Ryan, Jason, and Alex talk about how Mozilla is building AI tools for the good of humanity, and how our perception of AI has changed dramatically. Plus, the problem with the “hey Mozilla, just build a browser” argument.
Today’s episode is light on actual Thunderbird talk, and more focused on Mozilla as an organization and the future of AI. We hope you enjoy it as much as we enjoyed discussing it!
Hello Thunderbird community! We’re bringing back monthly Office Hours, now with a Saturday option to make attendance more convenient. Please see the details below to learn how and when you can meet with us to share your feedback and ask questions.
Now that Thunderbird 115 Supernova has been released, we have a lot to discuss, plan, and do! And we’re rolling out monthly Office Hours sessions so that you can:
Share your Thunderbird experiences with us
Share your ideas for the future of Thunderbird
Discuss ways to get involved with Thunderbird
Ask us questions about Thunderbird and the Thunderbird Project
Meet new team members, and meet your fellow Thunderbird users
This month we’re hosting these sessions using Zoom (in the future we plan to stand up a dedicated Jitsi instance). You can easily participate using video, dial in by phone, or ask questions in our community Matrix channel at #thunderbird:mozilla.org.
In the table above, please click a session to see the meeting time converted to your local time. If you have difficulty joining, please post in the Matrix back channel #thunderbird:mozilla.org or email us.
Hosts Wayne (Thunderbird Community Manager) and Jason (Marketing & Communications Manager), plus special guests from the Thunderbird team look forward to meeting you!
PLEASE NOTE: We’ll be recording this call for internal use and distribution only. In the future, we may explore publishing these on video platforms.
How To Join Community Office Hours By Phone
Meeting ID: 981 2417 3850
Password: 647025

Dial by your location:
+1 646 518 9805 US (New York)
+1 669 219 2599 US (San Jose)
+1 647 558 0588 Canada
+33 1 7095 0103 France
+49 69 7104 9922 Germany
+44 330 088 5830 United Kingdom

Find your local number: https://mozilla.zoom.us/u/adPpRGsVms
The day I write this, it’s very hot outside. Too hot to think of a good introduction to this blog post that also includes a link to the previous month’s progress report… Well, I guess this will have to do. I’m off to get some ice cream.
Please enjoy this brief report of our development activities in July 2023.
Improved account setup
Since Wolf joined in February of this year, he has spent a considerable amount of time on many of the individual pieces that make up the new and improved account setup user interface. July was the month when things started coming together. For the first time we were able to test the whole flow and not just individual parts.
Things were looking good. But a few small issues kept us busy and prevented us from releasing a beta version containing the new account setup.
Material 3 experiments
We’ve done some experiments to get a better idea of how much work it will be to switch the app to Material 3, the latest version of Google’s open-source design system. We’re now cautiously optimistic. And so the current plan is to switch to Material 3 before renaming the app from K-9 Mail to Thunderbird.
Community contributions
In July we merged the following pull requests by external contributors:
After a few busy days surrounding the Thunderbird Supernova release, we finally managed to publish the report of the security audit organized by OSTIF and performed by 7ASecurity. We’re happy to report that no high-risk vulnerabilities were found. The security audit did uncover a handful of low-to-medium risk vulnerabilities.
Thunderbird 115 “Supernova” ships with brand new layout options to give you a more beautiful and more productive email experience. But those new options aren’t on by default (for now), out of respect to those who have grown comfortable with Thunderbird’s Classic View throughout the years. Fortunately, getting that shiny new “Supernova” look is accomplished in just a few seconds. In this short guide, we’ll show you how to do it!
Step 1: Turn On Vertical View
First, click on the App Menu (≡) and choose “View”, followed by “Layout,” and then select “Vertical View.” This will rearrange the Folder pane, Message List pane, and Message pane to be displayed side-by-side.
Step 2: Turn On Cards View
Next, let’s switch on “Cards” view. This new way of displaying your message list is simpler and more compact, to help reduce cognitive burden when you view the list. And it’s easy to activate.
Look immediately to the right of the “Quick Filter” area for the “Message List Display Options” icon. Click it, and select “Cards View” as seen in the GIF above.
Cards View is still in active development. More features such as a message preview line and sender avatars will be added in the future.
TIP #1: Are you having trouble finding that menu? The area of Thunderbird where you activate Cards View is called the Message List Header. Some people choose to hide this section to reclaim a bit of vertical space. It’s easy to get it back: just return to the App Menu (≡), choose “View”, then “Layout,” and make sure that “Message List Header” is checked ✓.
TIP #2: Cards View is also available when using Classic or Wide layouts.
Step 3: Turn On “Tags” Folder Mode
If Tags are an important part of your workflow, now it’s easier than ever to access them.
In the Folder Pane Options menu (⋯) next to the “New Message” button, click “Folder Modes” and then choose “Tags.” That’s it! But if you want to continue customizing Thunderbird 115, you can also use this menu to hide Local Folders.
Tip: Want to move your Tags up or down in the Folder Pane? Click on the 3 vertical dots menu (⋮) next to Tags, and simply choose “Move Up” or “Move Down” as seen in the above GIF.
Step 4: Customize The Message Header
And finally, we arrive at the Message Header settings. That’s the section at the top of your email showing information such as the sender’s name, contact photo (which pulls from the Address Book), subject, any associated tags, and more. Configuring this to fit your preferences is easy: just click the “More” button with the downward-facing arrow, then select “Customize” and make it yours!
We hope this helps you enjoy an even better Thunderbird experience. Thanks for being part of the Thunderbird family, and make sure to check back later for more customization guides and usage tips.
The roadmap item we’re currently working on is Improve Account Setup. Most of our time went into working on this. However, for June there’s no exciting news to share. We mostly worked on the internal plumbing; that is important to get right, but not necessarily great material for a blog post. Hopefully there will be new screenshots to share in July’s progress report.
App maintenance
Having an app with a large user base means we can’t spend all of our time working on new features. Fixing bugs is a large and important part of the job. Here’s a writeup of just three of the bugs we fixed in June.
Folder appears to be empty
A user reported that some of their folders appear to be empty in K-9 Mail. Using the provided debug log, we were able to track this down to a message containing an invalid email address, specifically one whose local part (the text before the @ symbol) exceeds the limit of 64 characters.
The error was thrown by a newly added email address parser that is stricter than what we used before. At first it was a bit surprising that this would lead to messages in a folder not being shown. We deliberately kept this new implementation out of the code responsible for parsing emails after download and the code for displaying messages.
However, it turned out the new email address parser was used when getting the contact name belonging to an email address. This lookup is performed when loading the message list of a folder from the local database. When an error occurs during this step, an empty message list is shown to the user.
To fix this bug and limit the impact of similar problems in the future, we made the following changes:
Ignore most errors when parsing email addresses from messages that the user has received. The world is full of email addresses that violate the specification but work mostly fine in practice. However, we still want to be strict when it comes to the email addresses we accept, e.g. when setting up a new account.
Ignore errors with the email address when trying to fetch the system contact belonging to that email address. This may lead to the app not being able to fetch a contact name for a spec-violating email address. But it will no longer prevent the entire message list from loading.
We added a message with an email address whose local part exceeds the length limit to our test account. That way we are likely to catch bugs related to such email addresses before they make it into a beta release.
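To illustrate the lenient-vs-strict approach described above, here is a small, hypothetical sketch; the names and structure are illustrative only and not K-9 Mail’s actual parser API.

```kotlin
// Hedged sketch of a lenient vs. strict email address parser.
// RFC 5321 limits the local part (the text before '@') to 64 octets.
private const val MAX_LOCAL_PART_LENGTH = 64

data class EmailAddress(val localPart: String, val domain: String)

fun parseEmailAddress(input: String, strict: Boolean): EmailAddress? {
    val atIndex = input.lastIndexOf('@')
    if (atIndex <= 0 || atIndex == input.lastIndex) return null

    val localPart = input.substring(0, atIndex)
    val domain = input.substring(atIndex + 1)

    if (strict && localPart.length > MAX_LOCAL_PART_LENGTH) {
        // Reject only when validating user input, e.g. during account setup.
        return null
    }
    // When parsing received messages, tolerate spec violations and keep going.
    return EmailAddress(localPart, domain)
}
```

The idea is simply that received mail is parsed with `strict = false`, while addresses the user types during account setup are checked with `strict = true`.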
We’re very grateful to our beta testers for finding and reporting bugs like this one. That way we can fix them before they make it into a stable release.
Adding an email address to an existing contact
With the introduction of the message details screen we added a button to add a message sender or recipient to the contacts database using an installed contact app. If the email address can already be found in the contacts database, this button is hidden and tapping the participant name, email address, or contact picture opens the contacts app.
Previously the app didn’t make that distinction, and tapping an email address or participant name would open the contacts app using the “show or create” action. Apparently this reliably allowed adding an email address to an existing contact. However, the “insert” action used by the details screen only allows adding the email address to an existing contact with some contacts apps, but not others.
Screenshots: “Insert” action with a contacts app that supports adding to an existing contact, and with one that doesn’t.
We changed the action to start the contacts app from “insert” to “insert or edit”, and this seems to reliably offer the option to add the email address to an existing contact.
Screenshots: “Insert or edit” action with Google’s Contacts app and with the AOSP Contacts app.
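For reference, here is a minimal sketch of what starting the contacts app with the “insert or edit” action can look like on Android. It uses the stock ContactsContract intent extras and is an illustration rather than K-9 Mail’s exact code.

```kotlin
import android.content.Context
import android.content.Intent
import android.provider.ContactsContract

// Sketch: launch the contacts app so the user can either create a new
// contact or add this email address to an existing one.
fun addEmailToContacts(context: Context, emailAddress: String) {
    val intent = Intent(Intent.ACTION_INSERT_OR_EDIT).apply {
        type = ContactsContract.Contacts.CONTENT_ITEM_TYPE
        putExtra(ContactsContract.Intents.Insert.EMAIL, emailAddress)
    }
    context.startActivity(intent)
}
```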
Reply behavior depends on message size
A user reported that the behavior when replying to a message retrieved via a mailing list was different depending on whether the message had been downloaded completely or only partially.
K-9 Mail supports downloading only parts of a message when the email exceeds a configured size. In that case, only selected parts of the message header are downloaded as well. Unfortunately, we forgot to include the List-Post header field, which is used to determine the email address to which to reply.
The fix was simply adding List-Post to the list of header fields to fetch from the IMAP server.
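For context, the List-Post header (RFC 2369) carries the posting address in a mailto URI, e.g. `List-Post: <mailto:dev@example.org>`. A small, hypothetical sketch of extracting that address could look like this; it illustrates the idea rather than K-9 Mail’s actual implementation.

```kotlin
// Sketch: pull the posting address out of a List-Post header value.
// Handles optional URI parameters like "?subject=..." inside the brackets.
private val MAILTO_PATTERN = Regex("<mailto:([^>?]+)[^>]*>")

fun parseListPostAddress(headerValue: String): String? =
    MAILTO_PATTERN.find(headerValue)?.groupValues?.get(1)

fun main() {
    println(parseListPostAddress("<mailto:dev@example.org> (Postings are Moderated)"))
    // -> dev@example.org
}
```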
Community contributions
In June we merged the following pull requests by external contributors:
Hello Thunderbird family! First and foremost, we want to express our deepest appreciation for your patience. The road to Thunderbird 115 “Supernova” has been a long one, and we’re confident you’ll love the final result. If you’re already using Thunderbird 115, you may have noticed a feature that is conspicuously absent: Thunderbird Sync.
When we started creating our roadmap for Supernova, our feature targets were ambitious. As it turns out, a little too ambitious. We did our very best to finish up Thunderbird Sync for the initial release of version 115, but some technical blockers prevented us from moving forward fast enough to deliver it to you. Besides, this is a feature that absolutely must be secure and reliable. And it needs months of user testing to ensure that stability.
We do have the basic user interface for Thunderbird Sync designed already. However, what slowed us down is the need to hire a Site Reliability Engineer (SRE) to help us spin up our own back-end infrastructure that is independent of Firefox Sync. While our Sync functionality does use Firefox Sync code, they will end up being completely different products with different use-cases.
When Will Thunderbird Sync Be Finished?
We don’t have a solid release date, but our objective is to have Thunderbird Sync finished in time for the next ESR release, or shortly after we switch to a monthly release schedule (we’re aiming to complete that transition to monthly by early 2024).
Once we have a server and a proper back-end infrastructure, we’ll enable it on beta for you all to test.
What Data Will Thunderbird Sync Support?
We plan to support syncing of your email account definitions, credentials, signatures, saved searches, tags, tasks, filters, and most major preferences across multiple installations of Thunderbird on desktop. (Yes, this is cross-compatible with Windows, macOS, and Linux.) You’ll also be able to sync your Thunderbird accounts with the forthcoming Thunderbird for Android.
Thank you again for being patient with us as we continue to build the best possible software for managing your email and personal communications. In the future, we’re hopeful that a switch to monthly releases will allow us to put new features in your hands faster than what’s previously been possible.
Hello Thunderbird family, and welcome back to a long-overdue episode of the ThunderCast! Ryan, Alex, and Jason get together to talk about the new features and improvements in Thunderbird 115 “Supernova.” But they also share WHY those features were developed, and what’s being worked on right now.
Plus, Ryan shares some breaking news about the future of the Thunderbird Project! It’s a casual, informative, behind-the-scenes chat.
Our journey to transform K-9 Mail to Thunderbird for Android involves more than just improving the user interface and adding new features. K-9 Mail is already an important part of the open source ecosystem on Android, and because it lays the foundation for the future of Thunderbird on Android, we believe it’s important to invest in the security health of the software.
To that end, we recently enjoyed a collaboration with the Open Source Technology Improvement Fund (OSTIF) and 7ASecurity on an extensive security audit of K-9 Mail.
A team of six auditors at 7ASecurity worked diligently to identify and address any potential security or stability issues found in K-9 Mail. The audit focused specifically on threat modeling, fuzzing (a technique that simulates real-world scenarios where software might encounter unexpected or malicious inputs), and our software supply chain.
We are happy to report that zero high-risk vulnerabilities were found. The security audit did uncover a handful of low-to-medium risk vulnerabilities, the majority of which the K-9 Mail team has already resolved or is in the process of addressing.
Additionally, we’re very pleased to share this promising conclusion from OSTIF:
“[Mozilla] has an incredible foundation to begin this new chapter with, as the report notes seven wide-ranging points of secure and healthy practices and conditions of K-9 Mail that the 7ASecurity team evidenced during the engagement.”
Amir Montazery, OSTIF
The entire process was educational and productive, and we sincerely appreciated working with such professional and knowledgeable teams. We’d like to extend our deepest thanks to everyone at OSTIF, including Amir Montazery, Ashley Leszkiewicz, and Derek Zimmer, who were instrumental in orchestrating a smooth experience.
And our sincerest gratitude goes out to Abraham Aranguren, Dariusz Jastrzębski, Daniel Ortiz, Dr. Miroslav Štampar, Óscar Martínez, and Patrick Ventuzelo at 7ASecurity for their hard work and attention to detail.
“OSTIF has a strong understanding of how open source projects operate, and we really appreciated that they were able to jump in and help us coordinate this security audit of our K-9 Mail software. OSTIF and 7ASecurity were amazing partners that provided a helpful guiding hand, and made the process of doing the audit a breeze. We really appreciated their professionalism and expertise. I can confidently say that we plan on working with them again.”
Ryan Sipes, Thunderbird Product and Business Development Manager.
In the last few months, I have been building up a new business called Trade Post 47. While we envision it as a little space station in orbit of a nice planet, to most people it will be a science fiction merchandise trading company, with an online shop and booths at events like local Comic-Cons in Central Europe, especially in Austria. If you want to learn more, we have put up a complete page about us on our shop website.
To manage our products, which we get from different vendors (sometimes the same product via different vendors), as well as to plan and manage our orders, I built an internal, custom merchandise management system in my own PHP framework/CMS, CBSM (which is also used for this blog, for example). I did this mostly out of convenience, as I have to maintain this system anyhow and I needed some database tables with a fitting UI for managing our merchandise, vendors, and more (even conventions we may want to run booths at).
OTOH, the public shop is (as you may notice when looking at the website) an installation of Magento 2 (i.e. the open-source version of what is nowadays called "Adobe Commerce"). We decided to run that system because we are partnering closely with MCO Shop, a local ham radio and electronics shop, which already had this software running on the servers we share and knows how to work with it, run the upgrades, and maintain it. After all, when building a new business, as in so many areas of life, it always helps if you can share some resources and knowledge with others. First, I adapted the Magento theme to make it look more "space-like", most importantly giving it a dark instead of a light background. Once that worked well enough, I still had to get the products that we actually ordered from my custom management system into this Magento shop. Initially, I did this by creating a big CSV file and importing it into the shop, but it was clear that in the long run we needed a more fine-grained solution that can add and update entries individually.
Additionally, when we run booths on our "away missions" to events/conventions (or whenever we otherwise sell anything in person), Austrian law requires us to use a cash register system that follows strict rules and passes a certification, so that the government can be sure we pay taxes on everything we sell. For that we decided to use a solution integrated into our bookkeeping system, which also runs online as a web service: a specialized Austrian solution called FreeFinance. And of course, the cash register needs a full list of products and prices as well, which we also initially solved by creating and importing a CSV file, in anticipation of a more fine-grained solution after our first big appearance at Austria Comic-Con in early June.
As icing on the cake, we also wanted to generate nicely styled invoice documents in FreeFinance for all online shop orders that weren't paid via the cash register, and in the future, we'll want to make the online shop automatically aware of merchandise sold at events so they are removed from available stock for online purchases.
To achieve that, I looked into the APIs that both Magento and FreeFinance provide, accessing them from the custom internal system that I have full access to and that is required for providing the merchandise data anyhow. I found that the FreeFinance API is relatively simple and well-documented, and it does authentication via OAuth2, which I already had some knowledge of (and code to access it) from other projects, including some code in the CBSM system for facilitating its own logins. That said, Magento is a different beast: its product catalog feature set is way more complex, and so is its API. There is also no well-structured collective documentation that would explain what various things mean or which of the multiple paths to the same result is preferred; instead, it's a lot of turning on developer mode so that a Swagger/OpenAPI UI is available on your installation, trying things out there, and searching the web for what could work how and what a given value could mean. In addition, authentication is done via OAuth1, which is more complicated than its successor and which I didn't have any pre-existing code for, though I could build on some code from their tutorial. On the other hand, as we're running Magento ourselves on the server side, I could try things out more easily than with FreeFinance, which is a hosted service for which I needed to request access credentials from their team. Then again, FreeFinance gave us access to a testing system, whereas for Magento, for various reasons, we only have a live system and no staging/testing environment, so we can't "play around" very much when testing.
I wrote quite a bit of code for all those cases. The simplest part was, and still is, updating the cash register with our products; the only slight complication there is adding categories when needed. For adding products to the shop, I needed to take care of all kinds of things, like creating and managing configurable products, adding values to some attributes, uploading images, managing categories, and more. The curious structure of the Magento API, which requires much more detailed action than the CSV import route, did at times make this even more complicated - but it works now, and I can just add or change a product in the merch management and, at the latest on the next day, both the shop and the cash register have picked up those changes (I can trigger the sync jobs earlier if required). For creating the invoice documents, I could base some things on a make.com "blueprint" provided by FreeFinance, but for one thing, we don't want to use a paid third-party service if we can automate this ourselves, and for another, we have some restrictions and specialties of our own (like only generating invoices for orders actually paid via the web shop payment integration and not in person via the cash register). I did run into some curiosities there, like the Magento order API result containing several pieces of data multiple times, or us initially using a document layout template that didn't allow for different products having different VAT rates (which we require) - but that's working now as well. The reverse part, getting cash register purchases into the online shop, is still on my plate, but I now have a good plan for how to do that, and some time until our next big "away mission" where this will be important to have.
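To make the shape of that product sync a bit more concrete, here is a minimal, hedged sketch of pushing a single product update to a Magento-2-style REST endpoint from JVM code. The endpoint path and the bearer-token authentication are assumptions for illustration only; as described above, the real integration authenticates via OAuth1 and lives in the PHP-based CBSM system.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Hedged sketch: push one product update to a Magento-2-style REST API.
// The path /rest/V1/products/{sku} and bearer-token auth are assumptions;
// the integration described in this post actually uses OAuth1 and PHP.
fun upsertProduct(baseUrl: String, token: String, sku: String, productJson: String): Int {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$baseUrl/rest/V1/products/$sku"))
        .header("Authorization", "Bearer $token")
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(productJson))
        .build()
    // A synchronous call is fine for a nightly sync job.
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    return response.statusCode()
}
```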
All in all, this has been quite an interesting experience, and now that I am comfortable working with those systems and APIs, I'm sure I will do more with them in the long run - and our Trade Post 47 hopefully will keep growing as well and therefore have additional requirements in the future. If you are a developer and have questions about some details, feel free to contact me - and if you are running such systems yourself and need a developer who can adapt them in a similar fashion, I'm happy to offer those services as a contractor!
Searchfox (source, config source) is Mozilla’s primary code searching tool for Firefox, introduced by Bill McCloskey in 2016 and built upon prior work on DXR. This roadmap post is the second of two posts attempting to lay out where my personal efforts to enhance searchfox are headed and the decision-making framework that guides them. The first post was a more abstract product vision document and can be found here.
Discoverable, Extensible, Powerful Queries
Searchfox has a new “query” endpoint introduced in bug 1762817 which is intended to enable more powerful queries. Queries are parsed using :kats‘ query-parser crate which allows us to support our existing (secret) key:value syntax in a more rigorous way (and with automatic parse correction for the inevitable typos). In order to have a sane execution model, these key/value pairs are mapped through an extensible configuration file into a pipeline / graph execution model whose clap-based commands form the basis of our testing mechanism and can also be manually built and run from the command-line via searchfox-tool.
Above you will find a diagram rendering the execution pipeline of a search for foo, manually created from the JSON insta crate check output for the query. Bug 1763005 will add automatically generated diagrams as well as further expand the existing capability to produce markdown explanations of what is happening in each stage of the pipeline and the values at each stage.
While a new query syntax isn’t exciting on its own, what is exciting is that this infrastructure makes it easier to add functionality with confidence (and tests!). Some particular details worth discussing:
Customizable, Shareable Queries
Bug 1799796: Do you really wish that you could issue a query like webidl:CacheStorage to search just our WebIDL files for “CacheStorage”? Does your team have terminology that’s specific to your team, so that special search terms/aliases would be great, but it would feel wrong to use up all the cool short prefixes for your team alone? The new query mechanism has plans for these situations!
The new searchfox query endpoint looks like /mozilla-central/query/default. You’ll note that default looks like something that implies there are non-default options. And indeed, the plan is to allow files like this example “preset” dom.toml file to layer additional “terms” and “aliases” onto the base query_core.toml file as well as any other presets you want to build off of. You will need to add your preset to the mozsearch-mozilla repository for the tree in question, but the upside is that any query links you share will work for other people as well!
Faceting in Search Results with Shareable URLs
Bug 1799802: The basic idea of faceted search/filtering is:
You start with a basic search query.
Your results come back, potentially quite a lot of them. Too many, even!
The faceting logic looks at the various attributes of the results and classifies or “facets” them. Does that sound too circular? We just throw things in bins. If a bin ends up having a lot of things in it and there’s some hierarchy to its contents, we recursively bin those contents.
The UI presents these facets (bins), giving you a high level overview of the shape of your results, and letting you limit your results to only include certain attribute values, or to exclude based on others.
The UI is able to react quickly because it already knows about the result set.
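As a toy illustration of the binning step in the list above, here is a small sketch that facets results by their top-level directory; the SearchResult type and field names are made up for the example and are not searchfox’s actual data model.

```kotlin
// Minimal sketch of path-based faceting: group search results by their
// top-level directory and count them, so a UI can offer one-click filters
// like "exclude devtools/" or "only layout/".
data class SearchResult(val path: String, val line: Int)

fun facetByTopLevelDir(results: List<SearchResult>): Map<String, Int> =
    results
        .groupingBy { it.path.substringBefore('/') }   // e.g. "devtools", "layout"
        .eachCount()

fun main() {
    val results = listOf(
        SearchResult("devtools/client/framework/toolbox.js", 10),
        SearchResult("layout/base/PresShell.cpp", 42),
        SearchResult("layout/generic/nsIFrame.cpp", 7),
    )
    println(facetByTopLevelDir(results))  // {devtools=1, layout=2}
}
```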
The cool screenshot above is of a SIMILE Exhibit-based faceting UI I created for bugzilla a while back which may help provide a more immediate concept of how faceting works. See my exhibit blog tag for more in the space.
Here are some example facets that search can soon support:
Individual result paths: Categorize results by the path in which they happen. Do you not want to look at any results under devtools/? Push a button and filter out all those devtool results in an instant! Do you only care about layout/? Push a button and only see layout results!
Subsystem facets: moz.build files label every file in mozilla-central so that it has an associated Bugzilla Component. As of Bug 1783761 searchfox now also derives a subsystem mapping from the bugzilla components, which really just means that if you have a component that looks like “Core :: Storage: IndexedDB”, searchfox transforms that first colon into a slash so we get “Core/Storage/IndexedDB”. This would let you restrict your results to “Core/Storage” without having to manually select every Storage bugzilla component or path by hand.
Symbol relationships: Did you search for a base class or virtual method which has a number of subclasses/overrides? Do you only care about some subset of the class hierarchy? Then restrict your results to whatever the portion of the set you care about.
Recency of changes: Do you only care about seeing results whose blame history indicates it happened recently? Can do! Or maybe you only want to see code that hasn’t been touched in a long time? Uh, that might work less well until we improve the blame situation in Bug 1517978, but it’s always nice to have something to dream about.
Code coverage: Only want to see results that run a lot under our tests? Sure thing! Only want to see results that seem like we don’t have test coverage for? By looking at the result you’re now morally obligated to add test coverage!
Key to this enhancement is that the faceting state will be reflected in the URL (likely the hash) so that you can share it or navigate forward and back and the state will be the same. It’s all too common on the web for state like this to be local to the page, but central to my searchfox vision is that URLs are key. If you do a lot of faceting, the URL may become large and unwieldy, but continuing in the style of :arai‘s fantastic work on Bug 1769936 and follow-ups to make it easy to get usable markdown out of searchfox, we can help wrap your URL in markdown link syntax so that when you paste it somewhere markdown-aware, it looks nice.
Additional Query Constraints
A bunch of those facets mentioned above sound like things that it would be neat to query on, right? Maybe even put them in a preset that you can share with others? Yes, we would add explicit query constraints for those as well, as well as provide a way to convert faceted query results into a specific query that does not need to be faceted in Bug 1799805.
A variety of other additional queries become possible as well:
Searching for lines of text that are near each other, or not near each other, or maybe both inside the same argument list.
Locating member fields by type (Bug 1733217), like if you wanted to find all member fields that are smart or raw pointer references to nsILoadInfo.
Result Context Lines For All Result Types, Including Automatic Context
Screenshot: Current query results for C:4 AddOrPut
A major limitation of searchfox searches has been a lack of support for context lines. (Disclaimer: in Bug 1414954 I added secret support for fulltext-only queries by prefixing a search with context:4 or similar, but you would then want to force a fulltext query like context:4 text:my actual search or context:4 re:my.*regexp[y]?.*search.) The query mechanism already supports full context, as the above screenshot is taken from the query for C:4 AddOrPut, but note that the UX needs more passes and the gathering mechanism currently needs optimization, which I have a WIP for in Bug 1794177.
This next diagram is a live diagram from mozilla-central that I just generated with the query calls-between:'ClientSource::Focus' calls-between:'WindowClient_Binding::focus' depth:10, and which demonstrates searchfox’s understanding of our IPDL bindings, as each of the SendP*/RecvP* pairs captures IPC semantics that can only be surfaced because of searchfox’s understanding of both C++ and IPDL.
The next steps in diagramming will happen in Bug 1773165 with a focus on making the graphs interactive and applying heuristics related to graph clustering based on work on the “fancy branch” prototype and my recent work to derive the sub-component mapping for files that can in turn be propagated to classes/methods so that we can automatically collapse edges that cross sub-component boundaries (but which can be interactively expanded). This has involved a bit of yak-shaving on Bug 1776522 and Bug 1783761 and others.
Note that we also support calls-to:'Identifier' in the query endpoint as well, but the graphs look a lot messier without the clustering heuristics, so I’m not including any in this post.
Most of my work on searchfox is motivated by my desire to use diagrams in system understanding, with much of the other work being necessary because to make useful diagrams, you need to have useful and deep models of the underlying data. I’ll try and write more about this in the future, but this is definitely a case where:
A picture is worth a thousand words and iterations on the diagrams are more useful than the relevant prose.
Providing screen-reader accessible versions of the underlying data is fundamental. I have not yet ported the tree-dual version of the diagram logic from the “fancy” branch and I think this is a precondition to an initial release that’s more than just a proof-of-sorta-works.
Documentation Integration
Our in-tree docs rendered at https://firefox-source-docs.mozilla.org/ are fantastic. Searchfox cannot replace human-authored documentation, but it can help you find it! Have you spent hours understanding code, only to find after the fact that there was documentation that would have clarified what was going on? Bug 1763532 will teach searchfox to index markdown so that documentation definitions and references show up in search and we can potentially expose those in context menus. Subsequent steps could also index comment contents.
Bug 1458882 will teach searchfox how to link to the rendered documentation.
Improved Language Support
New Language Support via SCIP
With the advent of LSIF and SCIP and in particular the work by the team at sourcegraph to add language indexing built on existing analysis tools, there is now a tremendous amount of low hanging fruit in terms of off-the-shelf language indexing that searchfox can potentially ingest. Thanks to Emilio‘s initial work in Bug 1761287 we know that it’s reasonably straightforward to ingest SCIP data from these indexers.
For each additional language we want to index, we expect the primary effort required will be to make the indexer available in a taskcluster task and appropriately configure it to index the potentially many component roots within the mozilla-central mono-repo. There will also be some searchfox-specific effort required to map the symbols into searchfox’s symbol namespace.
Specific languages we can support (better):
Javascript / Typescript via scip-typescript (Bug 1740290): scip-typescript potentially allows us to expose the same enhanced understanding of JS code, especially module-based JS code, that you experience in VS code, including type inference/extraction from JSDoc. Additionally, in Bug 1775130 we can leverage the amazing eslint work already done to bring enhanced analysis to more confusing situations like our mochitests which deal with more complex global situations. Overall, this can allow us to move away from searchfox’s current “soupy” understanding of JS code where it assumes that all the JS it ever sees is running in a single global.
Searchfox’s strongest support is for C++ (and its interactions with XPIDL and IPDL), but there is still more to do here. Thankfully Botond is working to improve C++ template handling in Bug 1781178 and related bugs.
Other enhancements:
Bug 1419018: Allow distinguishing LHS/RHS usages of fields, with a prototype of relevance in Bug 1733217.
Improved Mozilla-Specific Language Support
mozilla-central contains a number of Mozilla-specific Interface Definition Languages (IDLs) and Domain Specific Languages (DSLs). Searchfox has existing support for:
XPIDL (.idl files): Our C++ support here is pretty good because XPIDL files are not preprocessed (beyond in-language support of #include and the ability to put pass-through C++ code, including preprocessor directives, inside %{C++ and %} demarcated blocks). Bug 1761689 tracks adding support for constants/enums, which are not currently supported, and I have WIPs for this. Bug 1800008 tracks adding awareness of the rust bindings.
IPDL (.ipdl and .ipdlh files): Our C++ support here is good as long as the file is not pre-processed and the rust IPDL parser hasn’t fallen behind the Python parser. Unfortunately a lot of critical files like PContent.ipdl are pre-processed, so this currently creates massive blind spots in searchfox’s understanding of the system. Bug 1661067 will move us to having the Python parser/code generator emit data searchfox can ingest.
Searchfox has planned support for:
WebIDL (.webidl files): Bug 1416899 tracks indexing WebIDL files, and Bug 1525189 tracks the JS interaction side of this.
StaticPrefList.yaml and related preference bindings: Bug 1699048 with discussion in Bug 1727789 on which it certainly depends.
Searchfox’s language indexing is inherently a form of static analysis. Consistent with the searchfox vision saying that “searchfox is not the only tool”, it makes sense to attempt to integrate with and build upon the tools that Firefox engineers are already using. Mozilla’s code-coverage data is already integrated with searchfox, and the next logical step is to integrate with pernosco, why not. I created pernosco-bridge as an experimental means of extracting data from pernosco and allowing for interactive visualizations.
The screenshot above is an example of a timeline graph automatically extracted from a config file to show data relevant to IndexedDB. IndexedDB transactions were hierarchically related to their corresponding database and the origin that opened those databases. Within each transaction, ObjectStoreAddOrPutRequestOp and CommitOp operations are graphed. Clicking on the timeline would direct the pernosco tab to jump to those instants in time.
The above is a different visualization based on a config file for DocumentChannel to help group what’s going on in a pernosco trace and surface the relevant information. If you check out the config file, you will probably find it inscrutable, but with searchfox’s structured understanding of C++ classes landed last year in Bug 1641372 we can imagine leveraging searchfox’s understanding of the codebase to make this process friendly. More importantly, there is the potential to collaboratively build a shared knowledge base of what’s most relevant for classes, so everyone doesn’t need to re-do the same work.
Using the same information pernosco-bridge used to build the hierarchically organized timelines above with extracted values like URIs, it can also build graphs of the live objects at any moment in time in the trace. Above we can see the relationship between windowGlobalParent instances and their corresponding canonicalBrowsingContexts, plus the URIs of the canonicalBrowsingContexts. We can imagine using this to help visualize representative object graphs in searchfox.
We can also imagine doing something like the above screenshot from my prior experiment pecobro where we interleave graphs of function activity into source listings.
Token-Centric Blame / “hyperannotate” Support via Microannotate
Screenshot: A demonstration of microannotate’s output
Quoting my dev-platform post about the unfortunate removal of searchfox’s first attempt at blame-skipping: “Revision history and the “annotate” / “blame” UIs for revision control are tricky because they’re built on a sequential, line-centric data-model where moving a function above another function in a file results in a destructive representational decision to treat one function as continuing through history and the other function as removed and then re-added as new code. Reformatting that maintains the overall sequence of tokens but changes how they are distributed across multiple lines also looks like removal of all of the old code and the addition of new code. Tools frequently perform heuristic-based post-passes to help identify intra-line changes which are reflected in diff UIs, as well as (entire) lines of code that are copied/moved in a revision (ex: Phabricator does this).”
The plan to move forward is to move to a token-centric approach using :marco‘s microannotate project as tracked in Bug 1517978. We would also likely want to combine this with heuristics that skip over backout pairs. The screenshot at the top of this section is of example output for nsWebBrowserPersist.cpp where colors distinguish between different blame revision origins. Note that the addition of specific arguments can be seen as well as changes to comments.
Source Listings
Bug 1781179: Improved syntax and semantic highlighting in C++ for the tip/head indexed revision.
Bug 1583635: Show expansion of C++ macros. Do you ever look at our XPCOM macrology and wish you weren’t about to spend several minutes clicking through those macros to understand what’s happening? This bug, my friend, this bug.
Bug 1796870: Adopt use of tree-sitter as a tokenizer which can improve syntax highlighting for other languages as well as position: sticky context for both the tip/head indexed revision but also for historical revisions!
Bug 1799557: Improved handling of links to source files that no longer exist by offering to show the last version of the file that existed or try and direct the user to the successor code.
Bug 1697671: Link resource:// and chrome:// URLs in source listings to the underlying source files.
Test Info Boxes
Bug 1785129: Add an info box mechanism to searchfox source listing pages to indicate the need for data collection review (“data review”).
Bug 1797855: Joel Maher and friends have been adding all kinds of great test metadata for searchfox to expose, and soon, this bug shall expose that information. Unfortunately there’s some yak shaving related to logging that remains underway.
Bug 1797857: Extend the “Go to header file”/”Go to source file” mechanism to support WPT `.headers` files and xpcshell/mochitest `^headers^` files.
Alternate Views of Data
Searchfox is able to provide more than source listings. The above screenshot shows searchfox’s understanding of the field layouts of C++ classes across all the platforms searchfox indexes on as rendered by the “fancy” branch prototype. Bug 1468445 tracks implementing a production quality version of this, noting that the data is already available, so this is straightforward. Bug 1799517 is a variation on this which would help us explicitly capture the destructor order of C++ fields.
Bug 1672307 tracks showing the binary size impact of a given file, class, etc.
Source Directory Listings
In the recently landed Bug 1783761 I moved our directory listings into rust after shaving a whole bunch of yaks. Now we can do a bunch of queries on data about files. Would you like to see all the tests that are disabled in your components? We could do this! Would you like to see all the files in your components that have been modified in the last month but have bad coverage? We could also do that! There are many possibilities here but I haven’t filed bugs for them.
Bug 1732585: Provide a way to search (phabricator revision) review/bugzilla comments related to the current file.
Bug 1657786: Create searchfox taskcluster mode/variant that can run the C++ indexer only against changed files for try builds / phabricator requests
Bug 1778802: Consider storing m-c analysis data in a git repo artifact with a bounded history via `git checkout --orphan` to enable try branch/code review features and recent semantic history support.
Easier Contributions
The largest hurdle new contributors have faced is standing up a virtual machine. In Bug 1612525 we’ve added core support for docker, and we have additional work to do in that bug to document using docker and add additional support for using docker under WSL2 on Windows. Please feel free to drop by https://chat.mozilla.org/#/room/#searchfox:mozilla.org if you need help getting started.
Deeper Integration with Mozilla Infrastructure
Currently much of searchfox runs as EC2 jobs that exist outside of taskcluster, although C++ and rust indexing artifacts as well as all coverage data and test info data come from taskcluster. Bug 1598502 tracks moving more of searchfox into taskcluster, although presumably the web-servers will still need to exist outside of taskcluster.
Searchfox (source, config source) is Mozilla’s primary code searching tool for Firefox, introduced by Bill McCloskey in 2016 and built upon prior work on DXR. This product vision post describes my personal vision for searchfox and the rationale that underpins it. I’m also writing an accompanying roadmap that describes specific potential enhancements in support of this vision and goes into the concrete features that would be implemented in its spirit; I will publish it soon.
Note that the process of developing searchfox is iterative and done in consultation with its users and other contributors, primarily in the searchfox channel on chat.mozilla.org and in its bugzilla component. Accordingly, these documents should be viewed as a basis for discussion rather than a strict project plan.
The Whys Of Searchfox
Searchfox is a Tool For System Understanding
Searchfox enables exploration and understanding of the Firefox codebase as it exists now, as it existed in the past, and to support understanding of the ramifications of potential changes.
Searchfox Is A Tool For Shared System Understanding
Firefox is a complex piece of software which has more going on than any one person can understand at a time. Searchfox enables Firefox’s contributors to leverage the documentation artifacts of other teams and contributors when exploring in isolation, and to communicate more effectively when interacting.
Searchfox Is Not The Only Tool
Searchfox integrates relevant data from automation and other tools in the Firefox development ecosystem where they make sense and provides deep links into those tools or first steps to help you get started without having to start from nothing.
The Hows Of Searchfox
Searchfox is Immediate: Low Latency and Complete
Searchfox’s results should be available in their entirety when the page load completes, ideally in much less than a second. There should be no asynchronous lazy loading or spinners. Practically speaking, if you could potentially see something on a page, you should be able to ctrl-f for it.
In situations where results are too voluminous to be practically useful, Searchfox should offer targeted follow-on searches that can relax limits and optionally provide for additional constraints so that iterative progress is made.
Searchfox is Accessible
Searchfox should always present a usable accessibility tree. This includes ensuring that any dynamically generated graphical representations such as graphviz-style diagrams have a directly usable accessibility tree or an alternate representation that maximally captures any hierarchy or clustering present in the visual presentation.
Searchfox Favors Iterative Exploration and Low Activation Energy
Searchfox seeks to avoid UX patterns where you have to metaphorically start from a blank sheet of paper or face decision paralysis choosing among a long list of options. Instead, start from whatever needle you have (an identifier name, a source file location, a string you saw in the UI) and searchfox will help you iterate and refine from there.
Searchfox Provides Stable, Useful URLs When Possible and Markdown For More Complex Situations
If you’re looking at something in searchfox, you should be able to share it as a URL, although there may be a few URLs to choose from such as whether to use a permalink which includes a specific revision identifier. More complicated situations may merit searchfox providing you with markdown that you can paste in tools that understand markdown.
After some behind-the-scenes discussions with Michael Kohler on what I could contribute at this year's FOSDEM, I ended up doing a presentation about my personal Suggestions for a Stronger Mozilla Community (video is available on the linked page). While figuring out the points I wanted to talk about and assembling my slides for that talk, I realized that one of the largest issues I'm seeing is that the Mozilla community nowadays feels very disconnected to me, like several islands: within each, good stuff is being done, but most people don't know much about what's happening elsewhere. That has been furthered by quite a few interesting projects being split off from Mozilla into separate projects in recent years (see e.g. Coqui, WebThings, and others), which often takes them off the radar of many people, even though I still consider them part of this wider community around the Mozilla Manifesto and the Open Web.
Following the talk, I brought that topic to the Reps Weekly Call this last week (see linked video), especially focusing on one slide from my FOSDEM talk about finding some kind of communication channel to cross-connect the community. As Reps are already a somewhat cross-functional community group, my hope is that a push from that direction can help get such a channel in place - and help figure out what exactly is a good idea and doable with the resources we have available (I, for example, like the idea of a podcast, as those can be listened to while traveling, cooking, doing housework, and other things - but it would be a ton of work to organize and produce).
Some ideas that came up in the Reps Call were for example a regular newsletter on Mozilla Discourse in the style of the MoCo-internal "tl;dr" (which Reps have access to via NDA), but as something that is public, as well as from and for the community - or maybe morphing some Reps Calls regularly into some sort of "Community News" calls that would highlight activities around the wider community, even bringing in people from those various projects/efforts there. But there may be more, maybe even better ideas out there.
To get this effort to the next level, we agreed that we'll first get the discussion rolling on a Discourse thread that I started after the meeting and then probably do a brainstorming video call. Then we'll take all that input and actually start experimenting with the formats that sound good and are practically achievable, to find what works for us the best way.
If you have ideas or other input on this, please join the conversation on Discourse - and also let us know if you can help in some form!
tl;dr: Happy 23rd birthday, Mozilla. And for the question: yes.
Here's a bit more rambling on this topic...
First of all, the Mozilla project was officially started on March 31, 1998, which is 23 years ago today. Happy birthday to my favorite "dino" out there! For more background, take a look at my Mozilla History talk from this year's FOSDEM, and/or watch the "Code Rush" documentary that conserved that moment in time so well and also gives nice insight into late-90's Silicon Valley culture.
Now, Mozilla initially was there to "act as the virtual meeting place for the Mozilla code", as Netscape was still around with the goal of winning back the browser market that was slipping over to Microsoft. The revolutionary stance of developing a large consumer application in the open, along with the marketing of "hack - this technology could fall into the right hands", the general novelty of the open-source movement back then - and last but not least a very friendly community (as I could find out myself) - made this young project quickly grow to be more than a development vehicle for AOL/Netscape, though. And in 2003, a mission to "preserve choice and innovation on the Internet" was set up for the project, shortly after backed by a non-profit Mozilla Foundation, and then with an independently developed Firefox browser, implementing "the idea [...] to design the best web browser for most people" - and starting to take back the web from the stagnation and lack of choice represented by >95% of the landscape being dominated by Microsoft Internet Explorer.
The exact phrasing of Mozilla's mission has been massages a few times, but from the view of the core contributors, it always meant the same thing, it currently reads:
Quote:
Our mission is to ensure the Internet is a global public resource, open and accessible to all. An Internet that truly puts people first, where individuals can shape their own experience and are empowered, safe and independent.
On the Foundation site, there’s the sentence "It is Mozilla’s duty to ensure the internet remains a force for good." - which also means pretty much the same thing, just in less specific terms. Of course, the Mozilla Manifesto also puts the spirit of the project into 10 pretty concrete technical principles, prefaced by 4 social pledges, which make it even more clear and concrete what the project sees as its core purpose.
So, if we think about the question whether we still need Mozilla nowadays, we should take a look if moving in that direction is still required and helpful, and if Mozilla is still able and willing to push those principles forward.
When quite a few communities I'm part of - or would like to be part of - are moving to Discord or are adding it as an additional option to Facebook groups, and I read the Terms of Service of those two tightly closed and privacy-unfriendly services, I have to conclude that the current Internet is not open and not putting people first, and I feel neither empowered, safe, nor independent in that space. When YouTube selects recommendations so that I live in a weird bubble that pulls me into conspiracies and negativity pretty fast, I don't feel like individuals can shape their own experience. When watching videos stored on certain sites is cheaper or less throttled than other sources with any new data plan I can get for my phone, or when geoblocking hinders me from watching even a trailer of my favorite series, I don't feel like the Internet is equally accessible to all. Neither do I when political misinformation is targeted at certain groups of users in election ads on social networks without any transparency to the public. But I would long for all of that to be different, and to follow the principles I talked about above. So, I'd say those are still required, and pushing for them would be helpful.
I definitely think we badly need a Mozilla that works on all those issues, and we need a whole lot of other projects and people to help in this space as well. Be it in advocacy, in communication, in technology (links are just examples), or in other topics.
Can all that actually succeed in improving the Internet? Well, it definitely needs all of us to help, starting with using products like Firefox, supporting organizations like Mozilla, spreading the word, maybe helping to build a community, or even to contribute where we can.
We definitely need Mozilla today, even 23 years after its inception. Maybe we need it more than ever, actually. Are you in?
As mentioned in a previous post, I've been working with the Capacity Blockchain Solutions team on the Crypto stamp project, the first physical postage stamp with a unique digital twin, issued by the Austrian Postal Service (Österreichische Post AG). After a successful release of Crypto stamp 1, one of our core ideas for a second edition was to represent stamp albums (or stamp collections) in the digital world as well - and not just the stamps themselves.
We set off to find existing standards in Ethereum contracts for grouping NFTs (ERC-721 and potentially ERC-1155 tokens) together, and found that there are a few possibilities (like EIP-998), but those get complicated very fast. We wanted a collection (a stamp album) to actually be the owner of those NFTs or "assets", but at the same time to be owned by an Ethereum account and to be transferable (or tradable) as an NFT by itself. So, for the former (being the owner of assets), it needs to be an Ethereum account (in this case, a contract), and for the latter (being owned and traded), it needs to be a single ERC-721 NFT as well. The Ethereum account should not be shared with other collections, so ownership of an asset is as transparent as people and (distributed) apps expect. Also, we wanted to be able to give names to collections (via ENS) so it would be easier for normal users to work with them - and that also requires every collection to have a distinct Ethereum account address (which the before-mentioned EIP-998 is unable to provide, for example). That said, to be NFTs themselves, the collections need to be "indexed" by what we could call a "registry of Collections".
To achieve all that, we came up with a system that we think could be a model for future similar projects as well and would ideally form the basis of a future standard itself.
At its core, a common "Collections" ERC-721 contract acts as the "registry" for all Crypto stamp collections; every individual collection is represented as an NFT in this "registry". Additionally, every time a new NFT for a collection is created, this core contract acts as a "factory" and creates a new separate contract for the collection itself, connecting this new "Collection" contract with the newly created NFT. On that new contract, we set the requested ENS name for easier addressing of the Collection.
Now this Collection contract is the account that receives ERC-721 and ERC-1155 assets, and becomes their owner. It also does some bookkeeping so it can actually be queried for assets, and it gives the owner of the Collection's own NFT (the actual owner of the Collection itself) full control over those assets, including functions to safely transfer them away again or even call functions on other contracts in the name of the Collection (similar to what you would see on e.g. multisig wallets).
As the owner of the Collection's NFT in the "registry" contract ("Collections") is the one that has power over all functionality of this Collection contract (and therefore the assets it owns), just transferring ownership of that NFT via a normal ERC-721 transfer gives a different person control. A single trade can therefore move a whole collection of assets to a new owner, just like physically handing a full album of stamps to a different person.
To go into more detail, you can look up the code of our Collections contract on Etherscan. You'll find that it exposes an ERC-721 name of "Crypto stamp Collections" with a symbol of "CSC" for the NFTs. The collections are registered as NFTs in this contract, so there's also an Etherscan Token Tracker for it, and as of writing this post, over 1600 collections are listed there. The contract lets anyone create new collections, and optionally hand over a "notification contract" address and data for registering an ENS name. When doing that, a new Collection contract is deployed and an NFT minted - but the contract deployment is done with a twist: as deploying a lot of full contracts with a larger set of code is costly, an EIP-1167 minimal proxy contract is deployed instead, which is able to hold all the data for the specific collection while calling all its code via proxying to - in this case - our Collection Prototype contract. This makes creating a new Collection contract as cheap as possible in terms of gas cost while still giving the user a good amount of functionality. Thankfully, Etherscan for example has knowledge of those minimal proxy contracts and can even show you their "read/write contract" UI with the actually available functionality - and additionally they know ENS names as well, so you can go to e.g. etherscan.io/address/kairo.c.cryptostamp.eth and read the code and data of my own collection contract. For connecting that Collection contract with its own NFT, the Collections (CSC) contract could have a translation table between token IDs and contract addresses, but we went a step further and just set the token ID to the integer value of the address itself - as an Ethereum address is a 40-character (20-byte) hexadecimal value, this results in large integer numbers (like 675946817706768216998960957837194760936536071597 for mine), but as Ethereum uses 256-bit values by default anyhow, it works perfectly well and no translation table between IDs and addresses is needed. We still have explicit functions on the main Collections (CSC) contract to get from token IDs to addresses and vice versa, though, even if in our case it can be calculated directly both ways.
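As a quick sketch of that scheme (runnable in Node.js; the example address is an arbitrary placeholder, not my actual collection):

// The token ID is simply the address interpreted as an integer - and back again.
function addressToTokenId(address) {
  return BigInt(address).toString(10);
}

function tokenIdToAddress(tokenId) {
  return "0x" + BigInt(tokenId).toString(16).padStart(40, "0");
}

const example = "0x0123456789abcdef0123456789abcdef01234567";
console.log(addressToTokenId(example));                    // a large decimal number
console.log(tokenIdToAddress(addressToTokenId(example)));  // the original address again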
Both the proxy contract pattern and the address-to-token-ID conversion scheme are optimizations we are using, but if we were to standardize collections, those would not be part of the core standard itself but rather recommended implementation practices.
Of course, users do not need to care about those details at all - they just go to crypto.post.at, click "Collections" and create their own collection right there (when logged in via MetaMask or a similar Ethereum browser module), and they also go to the website to look at its contents (e.g. crypto.post.at/collection/kairo). Ideally, they'll also be able to view and trade them on platforms like OpenSea - but the viewing needs specific support (which probably requires standardization to at least be well underway), and the trading only works well if the platform can deal with NFTs that can change value while they are on auction or on the trade market (and then any bids made before need to be invalidated or re-confirmed in some fashion). Because the latter needs a way to detect those value changes and OpenSea doesn't have that, they had to suspend trade for collections for now, after someone exploited that missing support by transferring assets out of a collection while it was on auction. That said, there are ideas on how to get this back the right way, but it will need work on both the NFT creator side (us in the specific case of collections) and the platforms that support trade, like OpenSea. Most importantly, the meta data of the NFT needs to contain some kind of "fingerprint" value that changes when any property changes that influences the value, and the trading platform needs to check for that and react properly to changes of that "fingerprint", so bids are only automatically processed as long as it doesn't change.
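To illustrate what such a "fingerprint" could look like, here is a small hypothetical sketch - this is not the actual Crypto stamp scheme, just one way to derive a value that changes whenever the holdings change:

const crypto = require("crypto");

// ownedAssets: e.g. [{ contract: "0xabc...", tokenId: "42" }, ...]
function collectionFingerprint(ownedAssets) {
  const canonical = ownedAssets
    .map((asset) => `${asset.contract.toLowerCase()}:${asset.tokenId}`)
    .sort()
    .join("|");
  // Any stable hash works; a trading platform would re-check this value before finalizing a bid.
  return crypto.createHash("sha256").update(canonical).digest("hex");
}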
For showing contents or calculating such a "fingerprint", there needs to be a way to find out which assets the collection actually owns. There are three ways to do that in theory: 1) have a list of all assets you care about, and look up if the collection address is listed as their owner, 2) look at the complete event log on the blockchain since creation of the collection and filter all NFT Transfer events for ones going to the collection address or away from it, or 3) have some way for the collection itself to record what assets it owns and allow enumeration of that list. Option 1 works well as long as your use case only covers a small number of different NFT contracts, which is what the Crypto stamp website is doing right now. Option 2 gives general results and is actually pretty feasible with the functionality existing in the Ethereum tool set (see the sketch below), but it requires a full node and is somewhat slow.
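As a rough illustration of option 2, filtering those Transfer events with web3.js could look like this (the provider URL, contract addresses, and ABI variable are placeholders, not the actual Crypto stamp setup):

const Web3 = require("web3");

// Placeholder provider URL and contract details - substitute your own node and ABI.
const web3 = new Web3("https://mainnet.infura.io/v3/<project-id>");
const erc721Abi = [ /* standard ERC-721 ABI, including the Transfer event */ ];

async function transfersInvolving(nftContractAddress, collectionAddress) {
  const nft = new web3.eth.Contract(erc721Abi, nftContractAddress);
  // All Transfer events where the collection is the receiver...
  const incoming = await nft.getPastEvents("Transfer", {
    filter: { to: collectionAddress },
    fromBlock: 0,
    toBlock: "latest",
  });
  // ...and all where it is the sender.
  const outgoing = await nft.getPastEvents("Transfer", {
    filter: { from: collectionAddress },
    fromBlock: 0,
    toBlock: "latest",
  });
  // Replaying incoming and outgoing events in block order yields the currently owned token IDs.
  return { incoming, outgoing };
}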
So, to allow general usage with decent performance, we actually implemented everything needed for option 3 in the collections contract. Any "safe transfer" of ERC-721 or ERC-1155 tokens (e.g. via a call to the safeTransferFrom() function) - which is the normal way that those are transferred between owners - actually tests whether the new owner is a simple account or a contract, and if it is a contract, it "asks" via a contract function call whether that contract can receive tokens. The collection contract uses that function call to register any such transfer into the collection and puts received assets into a list. As transferring an asset away requires a function call on the collection contract anyhow, removing it from that list can be done there. So, this list can be made available for querying and will always be accurate - as long as "safe" transfers are used. Unfortunately, ERC-721 allows "unsafe" transfers via transferFrom(), even though it warns that NFTs "MAY BE PERMANENTLY LOST" when that function is used. This was probably added to the standard mostly for compatibility with CryptoKitties, which predate the standard and only supported "unsafe" transfers. To deal with that, the collections contract has a function to "sync" ownership, which is given a contract address and token ID and adjusts the asset list accordingly by either adding or removing the entry. Note that there is a theoretical possibility to also lose an asset without being able to track it, which is why both directions are supported. (Note: OpenSea has used "unsafe" transfers in their "gift" functionality at least in the past, but that has hopefully been fixed by now.)
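To make the difference concrete, here is a small web3.js sketch of moving an NFT into a collection with a "safe" transfer and re-syncing after an "unsafe" one. The syncOwnership() name and where it lives are assumptions for illustration - check the deployed contract's ABI for the real signature:

// erc721Abi, collectionAbi and the addresses come from the deployed contracts,
// and web3 is set up as in the previous sketch.
async function addToCollection(web3, myAccount, nftContractAddress, tokenId, collectionAddress) {
  const nft = new web3.eth.Contract(erc721Abi, nftContractAddress);
  const collection = new web3.eth.Contract(collectionAbi, collectionAddress);

  // "Safe" transfer: the Collection's receiver hook records the asset in its internal list.
  await nft.methods["safeTransferFrom(address,address,uint256)"](myAccount, collectionAddress, tokenId)
    .send({ from: myAccount });

  // If some other tool used transferFrom() instead, ask the collection to re-check ownership
  // of that specific asset afterwards (syncOwnership is a hypothetical name).
  await collection.methods.syncOwnership(nftContractAddress, tokenId).send({ from: myAccount });
}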
So, when using "safe" transfers or - when "unsafe" ones are used - "syncing" afterwards, we can query the collection for its owned assets and list those in a generic way, no matter which ERC-721 or ERC-1155 assets are sent to it. As usual, any additional data and meta data of those assets can then be retrieved via their NFT contracts and their meta data URLs.
I mentioned a "notification contract" before, which can be specified at creation of a collection. When adding or removing an asset from the internal list, the collection also calls that notification contract (if one is set) to notify it of the asset list change. Using that feature, it was possible to award achievements directly on the blockchain, e.g. for collecting a certain number of NFTs of a specific type or one of each motif of Crypto stamps. Unfortunately, this additional contract call costs even more gas on Ethereum, as does tracking and awarding the achievements themselves, so rising gas costs forced us to remove that functionality: we no longer set a notification contract for new collections, and we offer an "optimization" feature that removes it from collections already created with one. This removal made transaction costs for using collections more bearable again for users, though I still believe that on-chain achievements were a great idea and probably a feature that was ahead of its time. We may come back to that idea when it can be done with an acceptably small impact on transaction cost.
One thing I also mentioned before is that the owner of a Collection can actually call functions in other contracts in the name of the Collection, similar to functionality that multisig wallets provide. This is done via an externalCall() function, to which the caller hands over a contract address to call and an encoded payload (which can relatively easily be generated e.g. via the web3.js library). The result is that the Collection can, for example, call the function for Crypto stamps sold via the OnChain shop to have their physical versions sent to a postage address - a function that only the owner of a Crypto stamp can call. As the Collection is that owner, and its own owner can invoke this "external" call, things like this can still be achieved.
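Sketched in web3.js, that could look roughly like the following. shipToMe() is the OnChain shop's shipping function as used elsewhere in these posts, while the exact externalCall() parameters are an assumption based on the description above - consult the deployed Collection contract for the real signature:

async function shipStampOwnedByCollection(web3, ownerAccount, stampTokenId, encryptedAddress) {
  // onChainShopAbi/Address and collectionAbi/Address come from the deployed contracts.
  const shop = new web3.eth.Contract(onChainShopAbi, onChainShopAddress);
  const collection = new web3.eth.Contract(collectionAbi, collectionAddress);

  // Encode the call we want the Collection - as the stamp's owner - to make on our behalf.
  const payload = shop.methods.shipToMe(encryptedAddress, stampTokenId).encodeABI();

  // Ask the Collection contract to perform that call in its own name.
  await collection.methods.externalCall(onChainShopAddress, payload)
    .send({ from: ownerAccount });
}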
To conclude, with Crypto stamp Collections we have created a simple but feature-rich solution to bring the experience of physical stamp albums to the digital world, and we see a good possibility to use the same concept more generally for collecting NFTs, enabling a whole collection of NFTs to be transferred or traded easily as one unit. And after all, NFT collectors would probably expect a collection of NFTs or a "stamp album" to have its own NFT, right? I hope we can push this concept to larger adoption in the future!
The FOSDEM conference in Brussels has become a bit of a ritual for me. Ever since 2002, there has only been a single year of the conference that I missed, and any time I was there, I did take part in the Mozilla devroom - most years also with a talk, as you can see on my slides page.
This year, things were a bit different: for obvious reasons the conference couldn't bring together thousands of developers in Brussels, but almost a month ago, in its usual spot on the calendar, it took place in a virtual setting instead. The team did an incredibly good job of hosting this huge conference in a setting completely run on Free and Open Source Software, backed by Matrix (as explained in a great talk by Matthew Hodgson) and Jitsi (see the talk by Saúl Ibarra Corretgé).
On short notice, I also added my bit to the conference - this time not talking about all the shiny new software, but diving into the past with "Mozilla History: 20+ Years And Counting". After the project has been around for that long, I figured many people may not realize its origins and especially its early history, so I tried to bring that to the audience, together with important milestones and projects along the way up to today.
The video of the talk has been available for a short time now, and if you are interested in Mozilla's history yourself, then it's surely worth a watch. Of course, my slides are online as well.
If you want to watch more videos to dig deeper into Mozilla history, I highly recommend the Code Rush documentary from when Netscape initially open-sourced Mozilla (also an awesome time capsule of late-90s Silicon Valley) and a talk on early Mozilla history that Mitchell Baker gave at an all-hands in 2012.
The Firefox part of the history is also where my song "Rock Me Firefox" (demo recording on YouTube) starts off, for anyone who wants some music to go along with all this!
We’re working on a thing to make Firefox start faster! It appears to work! Here’s a video showing off a before (left) and after (right):
Improving Firefox Startup Time With The about:home Startup Cache
For the past year or so, the Firefox Desktop Front-End Performance team has been concentrating on making improvements to browser startup performance.
The launching of an application like Firefox is quite complex. Meticulous profiling of Firefox startup in various conditions has, thankfully, helped reveal a number of opportunities where we can make improvements. We’ve been evaluating and addressing these opportunities, and several have made it into the past few Firefox releases.
This blog post is about one of those improvements that is currently in the later stages of development. I’m going to describe the improvement, and how we went about integrating it.
In a default installation of Firefox, the first (and only) tab that loads is about:home1.
The about:home page is actually the same thing that appears when you open a new tab (about:newtab). The fact that they have different addresses allows us to treat their loading differently.
Your about:home might look slightly different from the above — depending on your locale, it may or may not include the Pocket stories.
Do not be fooled by what appears to be a very simple page of images and text. This page is actually quite sophisticated under the hood. It is designed to be customized by the user in the following ways:
Users can
Collapse or expand sections
Remove sections entirely
Reorganize the order of their Top Sites by dragging and dropping
Pin and unpin Top Sites to their positions
Add their own custom Top Sites with custom thumbnails
Add or remove search engines from their Top Sites
Change the number of rows in the Top Sites and Recommended by Pocket sections
Choose to have the Highlights composed of any of the following:
Visited pages
Recent bookmarks
Recent downloads
Pages recently saved to Pocket
The user can customize these things at any time, and any open copies of the page are expected to reflect those customizations immediately.
There are further complexities beyond user customization. The page is also designed to be easy for our design and engineering teams to experiment with reorganizing the layout and composition of the page so that they can test variations on its layout in the wild.
The about:home page also has special privileges not afforded to normal websites. It can
Save and remove bookmarks
Add pages to Pocket
Cause the URL bar to be focused and selected
Show thumbnails for pages that the user has visited
Access both high and normal resolution favicons
Render information about the user’s recent activity (recent page visits, downloads, saves to Pocket, etc.)
So while at first glance, this appears to be a static page of just images and text, rest assured that the page can do much more.
Like the Firefox Developer Tools UI, about:home is written with the help of the React and Redux libraries. This has allowed the about:home development team to create sophisticated, reusable, and composable components that could be easily tested using modern JavaScript testing methods.
Unsurprisingly, this complexity and customizability comes at a cost. The page needs to request a state object from the parent process in order to have the Redux store populated and to have React render it. Essentially, the page is dynamically rendering itself after the markup of the page loads.
Startup is a critical time for an application. The user has expressed a need for their browser, and we have an obligation to serve the user as quickly and efficiently as possible. The user’s time is a resource that we should not squander. Similarly, because so much needs to occur during startup,2 disk reads, disk writes, and CPU time are also considered precious resources. They should only be used if there’s no other choice.
In this case, we believed that the CPU time and disk accesses spent constructing the state object and dynamically rendering the about:home page was competing with all of the other CPU and disk access happening during startup, and this was slowing us down from presenting about:home to the user in a timely way.
Generally speaking, in my mind there are four broad approaches to performance problems once a bottleneck has been identified.
You can widen the bottleneck (make the operations more efficient)
You can divide the bottleneck (split the work into smaller slices that can be done over a longer period of time with rests in between)
You can move the bottleneck (defer work until later when it seems that there is less competition for resources, or move it to a different thread)
You can remove the bottleneck (don’t do the work)
We started by trying to apply the last two approaches, wondering what startup performance would be like if the page did not render itself dynamically, but was instead a static page generated periodically and pulled off of the disk at startup.
Prototype when possible
The first step to improving something is finding a way to measure it. Thankfully, we already have a number of logged measurements for startup. One of those measurements gives us the time from process start to rendering the Top Sites section of about:home. This is not a perfect measurement—ideally, we’d measure to the point that the page finally “settles” and stops changing3—but for this project, this measurement served our purposes.
Before investing a bunch of time into a potential improvement, it’s usually a good idea to try to see if what you’re gaining is worth the development time. It’s not always possible to build a prototype for performance improvements, but in this case it was.
So, with that information, we thought we had a real improvement opportunity here. We decided to proceed with the idea, and began a long arduous search for “the right way to do it.”
Pre-production
As I mentioned earlier, about:home is complex. The infrastructure that powers it is complex. Coupled with the fact that no one on the Firefox Front-End Performance team had spent much time studying React and Redux, this meant that we had a lot of learning to do.
The first step was to get some React and Redux fundamentals under our belt. This meant building some small toy applications and getting familiar with the framework idioms and how things are organized.
With that grounding, the next step was to start reading the code — starting from the entrypoint into the code that powers about:home when the browser starts. This was an intense period of study that branched into many different directions. Part of the complexity was because much of the code is asynchronous and launched work on different threads, which introduced some non-determinism. While it is generally good for responsiveness to move work off of the main thread, it can lead to some complex reading and interpretation of the code when more than two threads are involved.
A tool we used during this analysis was the Firefox Profiler, to get a realistic sense of the order of executions during startup. These profiles helped to inform much of our reading of the code.
This analysis helped us solidify our mental model of how about:home loads. With that model in place, it was much easier to propose practical approaches for introducing a static about:home document into the ecosystem of pre-existing code. The Firefox Front-End Performance team documented our findings and recommendations and then presented them to the team that originally built the about:home system to ensure that we were all on the same page and that we hadn’t missed anything critical. They were already aware that we were investigating potential performance improvements, and had very useful feedback for us, as well as historical product decision context that clarified our understanding.
Critically, we presented our recommendation for loading a static about:home page at startup and ensured that there were no upcoming plans for about:home that would break our mental model or render the recommendation no longer valid. Thankfully, it sounded like we were aligned and fine to proceed with our plan.
So what was the plan? We knew that since about:home is quite dynamic and can change over time4 we needed a startup cache for about:home that could be periodically updated during the course of a browsing session. We would then load from that cache at startup. Clearly, I’m glossing over some details here, but that was the general plan.
As usual, no plan survives breakfast, and as we started to architect our solution, we identified things we would need to change along the way.
Development
We knew that the process that loads about:home would need to be able to read from the about:home startup cache. We also knew that about:home can potentially contain information about what pages the user has visited, and that about:home can do privileged things that normal web pages cannot. It seemed that this project would be a good opportunity to finish an effort that was started (and mothballed) a year or so earlier: creating a special privileged content process for about:home. We would load about:home in that process, and add assertions to ensure that privileged actions from about:home could only happen from that content process type5.
So getting the “privileged about content process”6 fixed up and ready for shipping was the first step.
This also paved the way for solving the next step, which was to enable the moz-page-thumb:// protocol for the “privileged about content process.” The moz-page-thumb:// protocol is used to show the screenshot thumbnails for pages that the user has visited in the past. The previous implementation was using Blob URLs to send those thumbnails down to the page, and those Blob URLs exist only during runtime and would not work properly after a restart.
The next step was figuring out how to build the document that would be stored in the cache. Thankfully, ReactDOMServer has the ability to render a React application to a string. This is normally used for server-side rendering of React-powered applications. This feature also allows the React library to passively attach to the server-side page without causing the DOM to be modified. With some small modifications, we were able to build a simple mechanism in a Web Worker to produce this cached document string off of the main thread. Keeping this work off of the main thread would help maintain responsiveness.
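A rough sketch of that idea - rendering to a string in a Web Worker so the main thread stays responsive - could look like the following. The component and the messaging here are stand-ins for illustration, not the actual Firefox code:

// worker.js
importScripts("react.production.min.js", "react-dom-server.browser.production.min.js");

// Stand-in for the real about:home top-level component.
function AboutHomeApp(props) {
  return React.createElement("div", { id: "root" },
    "Top Sites: " + props.state.topSites.length);
}

onmessage = (event) => {
  // event.data is assumed to be a snapshot of the page state sent from the main thread.
  const markup = ReactDOMServer.renderToString(
    React.createElement(AboutHomeApp, { state: event.data })
  );
  // Hand the rendered HTML string back so it can be written into the startup cache.
  postMessage(markup);
};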
With those pieces of foundational work out of the way, it was time to figure out the cache storage mechanism. Firefox already has a startupcache module that it uses for static resources like markup and JavaScript, but that cache is not designed to be written to periodically at runtime. We would need something different.
We had originally supposed that we would need to give the privileged about content process special access to a file on the filesystem to read from and to write to (since our sandbox prevents content processes from accessing disks directly). Initial experiments along this line worried us — we didn’t like the idea of poking holes in the sandbox if we didn’t need to. Also, adding yet another read from the filesystem during startup seemed counter to our purposes.
We evaluated IndexedDB as a storage mechanism, but the DOM team talked us out of it. The performance characteristics of IndexedDB, especially during startup, were unlikely to work for us.
Finally, after some consulting, we were directed to the HTTP cache. The HTTP cache’s job is to cache pages that the user visits (when appropriate) and to offer those caches to the user instead of hitting the network when retrieving the resource within the expiration time7. Functionally speaking, this seemed like a storage mechanism perfectly suited to our purposes.
After consulting with the Necko team and building a few proof-of-concepts, we figured out how to tie the whole system together. Importantly, we figured out how to get the initial about:home load to pull a document out from the HTTP cache rather than reading it from the application resource package.
We also figured out the cache writing mechanism. The cached document would periodically get built inside of the privileged about content process, inside of a Worker off of the main thread, and that data would then be sent back up to the parent to stream into the cache.
At this point, we felt we had all of the pieces that we needed. Construction on each component began.
One of the more gratifying parts of implementation was when we modified one of our startup tests to use the new caching mechanism.
In this graph, the Y axis is the geometric mean time to render the about:home Top Sites over 20 restarts of the browser, in milliseconds. Lower is better. The dots along the top are without the cache. The dots along the bottom are with the cache enabled. According to our measurements, we improved the rendering time from process start to Top Sites by just over 20%! We beat our prototype!
Noticeable differences
But the real proof will be if there’s actually a noticeable visual change. Here’s that screen recording again from one of our reference devices8.
The screen on the left is with the cache disabled, and on the right with the cache enabled. Looks to me like we made a noticeable dent!
Try it out!
We haven’t yet enabled the about:home startup cache in Nightly by default, but we hope to do so soon. In the meantime, Nightly users can try it out right now by going to about:preferences#experimental and toggling it on. If you find problems and have a Bugzilla account, here’s a form for submitting bugs to the right place.
You can tell if the about:home you’re looking at is from the cache by opening up the DevTools Inspector and looking for a <!-- Cached: <some date> --> comment just above the <body> tag.
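If you'd rather check from the console, a quick snippet like this (pasted into the DevTools console on about:home) should find that marker:

// Look for the cache marker comment that sits just above <body>.
const marker = [...document.documentElement.childNodes].find(
  (node) => node.nodeType === Node.COMMENT_NODE && node.data.includes("Cached")
);
console.log(marker ? marker.data.trim() : "not loaded from the startup cache");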
Caveat emptor
There are a few cases where the cache isn’t used or is invalidated.
The first case is if you’ve configured something other than about:home as your home page. In that case, the cache won’t be read from, and the code to create the cache won’t ever run. If the user ever resets about:home to be their home page, then the caching code will start working for them.
The second case is if you’ve configured Firefox to restore your previous session by default. In this case, it’s unlikely that the first tab you’ll see is about:home, so the cache won’t be read from, and the code to create the cache won’t ever run. As before, if the user switches to not loading their previous session by default, then the cache will start working for them.
Another case is when the Firefox build identifier doesn’t match the build identifier from when the cache was created. This is also how the other startupcache module for static resources works. This ensures that when an update is applied, we don’t accidentally load old assets from the cache. So the first time you launch Firefox after you apply an update will not pull the about:home document from the cache, even if one exists (and will throw the cache out if it does). For Nightly users that generally receive updated builds twice a day, this makes the cache somewhat useless. Beta and Release users update much less frequently, so we expect to see a greater impact there.
The last case is in the event that your disk is in a situation where reading the dynamic code from the asset bundle is faster than reading from the cache: if the cache isn’t ready by the time the about:home document attempts to load, we fall back to loading it the old way. We don’t expect this to happen too often, but it’s theoretically possible, so we handle that case.
Future work
The next major step is to get the about:home startup cache turned on by default on Nightly and get it tested by a broader audience. At that point, hopefully we’ll get a better sense of its behaviour out in the wild via bug reports and Telemetry. Then our improvement will either ride the release train, or we might turn it on for subsets of the Beta or Release populations to measure its impact on more realistic restart scenarios. Once we’re confident that it’s helping more than hindering, we’ll turn it on by default for everyone.
After that, I think it would be worth seeing if we can load from the cache more often. Perhaps we could load about:newtab from there as well, for example.
One step at a time!
Thanks to
Florian Quèze and Doug Thayer, both of whom independently approached me with the idea of creating a static about:home
Jay Lim and Ursula Sarracini, both of whom wrote some of the groundwork code that was needed for this feature (namely, the infrastructure for the privileged about content process, and the first version of the moz-page-thumbs support)
Gijs Kruitbosch, Kate Hudson, Ed Lee, Scott Downe, and Gavin Suntop for all of the consulting and reviews
Honza Bambas and Andrew Sutherland for storage consultations
Markus Stange, Andrew Creskey, Eric Smyth, Asif Youssuff, Dan Mosedale, Emily Derr for their generous feedback on this post
This is only true if the user hasn’t just restarted after applying an update, and if they haven’t set a custom home page or configured Firefox to restore their previous session on start. ↩
You can think of startup like a traveling circus coming to town. You have to get the trucks and trailers parked, get the tents set up, hook up power, then lighting and sound … it’s a big, complex operation, and we haven’t even shot a clown out of a cannon yet. ↩
As the user browses, bookmarks and downloads things, their Highlights and Top Sites sections might change. If Pocket is enabled, new stories will also be downloaded periodically. ↩
It’s vitally important that content processes have limited abilities. That way, if they’re ever compromised by a bad actor, there are limits to what damage they can do. The assertions mentioned in this case mean that if a compromised content process tries to “pretend” to be the privileged about content process by sending one of its messages, that the parent process will terminate that content process immediately. ↩
This has changed slightly in the past few years with a feature called Race Cache With Network, which races the disk cache with the network instead of relying on the disk entirely. ↩
Thunderbird Conversations is an add-on for Thunderbird that provides a conversation view for messages. It groups message threads together, including those stored in different folders, and allows easier reading and control for a more efficient workflow.
Over the last couple of years, Conversations has been largely rewritten to adapt to changes in Thunderbird’s architecture for add-ons. Conversations 3.1 is the result of that effort so far.
<figcaption>Message Controls Menu</figcaption>
The new version works with Thunderbird 68, as well as Thunderbird 78 which will be released soon.
<figcaption>Attachment preview area with gallery view available for images.</figcaption>
The one feature that is currently missing after the rewrite is inline quick reply. This has been of lower priority, as we have focussed on being able to keep the main part of the add-on running with the newer versions of Thunderbird. However, now that 3.1 is stable, I hope to be able to start work on a new version of quick reply soon.
More rewriting will also be continuing for the foreseeable future to further support Thunderbird’s new architecture. I’m planning a more technical blog post about this in future.
If you find an issue, or would like to help contribute to Conversations’ code, please head over to our GitHub repository.
Last year, I worked with the Capacity team on the Crypto stamp project, the first physical postage stamp with a unique digital twin, issued by the Austrian Postal Service (Österreichische Post AG). Those stamps are mainly intended as collectibles, but their physical "half" can be used as valid postage on packages or letters, and a QR code on that physical stamp links to a website presenting the digital collectible. Our job (at Capacity Blockchain Solutions) was to build that digital collectible, the website at crypto.post.at, and the back-end service that both delivers the public meta data and powers the website. I specifically did most of the work on the Ethereum Smart Contract for the digital collectible, a "non-fungible token" (NFT) using the ERC-721 standard (publicly visible), as well as the back-end REST service, which I implemented in Python (based on Flask and Web3.py). The coding for the website was done by colleagues, of course using JavaScript for the dynamic elements.
One feature we have in this project is that people can purchase Crypto stamps directly from the blockchain, with the website guiding those with an Ethereum-enabled browser (e.g. with the MetaMask add-on) through that process. By sending Ether cryptocurrency to the right address (the OnChainShop contract), they will directly receive the digital NFT - but then, every Crypto stamp consists of both a digital and a physical item, so what about the physical part?
Of course, we cannot send a physical item to an Ethereum address (which is just a mostly-random number), so we needed a way for the owner of the NFT to give us (or actually Post AG) a postal address to send the physical stamp to. For this, we added a form to allow them to enter the postal address for stamps that were bought via the OnChain shop - but then the issue arose of how we would verify that the sender was the actual owner of the NFT. Additionally, we had to figure out how to do this without requiring a separate database or authentication in the back end, as we did not need those features for anything else: authentication for purchases is already done via signed transactions on the blockchain, and any data that needs to be stored is either static or on the blockchain.
We can easily verify the ownership if we send the information to a Smart Contract function on the blockchain, given that the owner has already proven able to make such calls by purchasing via the OnChain shop, and anyone sending transactions there has to sign them. To avoid storing the whole postage address in the blockchain state database, which is expensive, we just emit an event and therefore put it in the event log, which is much cheaper and can still be read by our back-end service and forwarded to Post AG. But then, anything sent to the public Ethereum blockchain (no matter if we put it into state or logs afterwards) is also visible to everyone else, and postal addresses are private data, so we need to ensure others reading the message cannot actually read that data.
So, our basic idea sounded simple: We generate a public/private key pair, use the public key to encrypt the postage address on the website, call a Smart Contract function with that data, signed by the user, emit an event with the data, and decrypt the information on the back-end service when receiving the event, before forwarding it to the actual shipping department in a nice format. As someone who has heard a lot about encryption but not actually coded encryption usage, I was surprised how many issues we ran into when actually writing the code.
So, the first thing I did was to see what techniques there are for sending encrypted messages, and pretty soon I found ECIES and was enthusiastic: sending encrypted messages was standardized, there are libraries for this in many languages, and we would just need to use implementations of that standard on both sides and it's all solved! Yay!
So I looked for ECIES libraries, both for JavaScript to be used in the browser and for Python, making sure they are still maintained. After some looking, I settled for eccrypto (JS) and eciespy, which both looked pretty decent in terms of usage and being kept up to date. I created a private/public key pair, and encrypting back and forth via eccrypto worked, so I went on to try decrypting via eciespy, with great hope - only to see that eccrypto.encrypt() results in an object with 4 members while eciespy expects a single string as input. Hmm.
With some digging, I found out that ECIES is not the same as ECIES. Sigh. It's a standard in terms of providing a standard framework for encrypting messages, but there are multiple variants for the steps in the standardized mechanism, and both sides (encryption and decryption) need to agree on using the same ones to make it work correctly. Now, both eccrypto and eciespy implement exactly one variant - and of course two different ones. Things would have been too easy if the implementations had been compatible, right?
So, I had to unpack what ECIES does to understand better what happens there. Basically, ECIES does an ECDH exchange with the receiver's public key and a random "ephemeral" private key to derive a shared secret, which is then used as the key for AES-encrypting the message. The message is sent over to the recipient along with the AES parameters (IV, MAC) and the "ephemeral" public key. The recipient can use that public key along with their private key in ECDH, get the same shared secret, and use it with the given parameters to decrypt the message (as AES is symmetric, i.e. the same key is used for encryption and decryption).
While both libraries use the secp256k1 curve (which incidentally is also used by Ethereum and Bitcoin) for ECDH, and both use AES-256, the main difference, as I figured out, is the AES cipher block mode - eccrypto uses CBC while eciespy uses GCM. Both modes are fine for what we are doing here, but we need to make sure we use the same one on both sides. An additional difference is that eccrypto gives us the IV, MAC, ciphertext, and ephemeral public key as separate values while eciespy expects them packed into a single string - but that would be easier to cope with.
In any case, I would need to change one of the two sides and not use the simple-to-use libraries. Given that I was writing the Python code, while my colleagues working on the website were already busy enough with other feature work needed there, I decided that the JavaScript-side code would stay with eccrypto and I'd figure out the decoding part on the Python side, taking apart and adapting the steps that eciespy would have done.
We'd convert the 4 values returned from eccrypto.encrypt() to hex strings, stick them into a JSON and stringify that to hand it over to the blockchain function - using code very similar to this:
var data = JSON.stringify(addressfields);
var eccrypto = require("eccrypto");
// pubkey, tokenId, web3 and OnChainShopContract are set up elsewhere on the page.
eccrypto.encrypt(pubkey, Buffer.from(data))
  .then((encrypted) => {
    var sendData = {
      iv: encrypted.iv.toString("hex"),
      ephemPublicKey: encrypted.ephemPublicKey.toString("hex"),
      ciphertext: encrypted.ciphertext.toString("hex"),
      mac: encrypted.mac.toString("hex"),
    };
    var finalString = JSON.stringify(sendData);
    // Call the token shipping function with that final string.
    OnChainShopContract.methods.shipToMe(finalString, tokenId)
      .send({from: web3.eth.defaultAccount}).then(...)...
  });
So, on the Python side, I went and took the ECDH bits from eciespy, and by looking at the eccrypto code as an example and at the relevant Python libraries, implemented code to make AES-CBC work with the data we get from our blockchain event listener. And then I found out that it still did not work, as I got garbage out instead of the expected result. Ouch. Adding more debug messages, I realized that the key used for AES was already wrong, so ECDH resulted in the wrong shared secret. Now I was really confused: same elliptic curve, right public and private keys used, but the much-proven ECDH algorithm gives me a wrong result? How can that be? I was full of disbelief and despair, wondering if this could be solved at all.
But I went on a few web searches trying to find out why in the world ECDH could give different results in different libraries that all use the secp256k1 curve. And I found documentation of that same issue. It comes down to this: while standard ECDH returns the x coordinate of the resulting point, the libsecp256k1 developers (I believe that's a part of the Bitcoin community) found it would be more secure to instead return the SHA256 hash of both coordinates of that point. This may be a good idea when everyone uses the same library, but eccrypto uses a standard library while eciespy uses libsecp256k1 - and so they disagree on the shared secret, which is pretty unhelpful in our case.
In the end, I also replaced the ECDH pieces from eciespy with equivalent code using a standard library - and suddenly things worked! \o/
I was full of joy, and we had code we could use for Crypto stamp - and since the release in June 2019, this mechanism has been used successfully for over a hundred shipments of stamps to postal addresses (note that we had a limited amount available in the OnChainShop).
So, here's the Python code used for decrypting (we pip install eciespy cryptography in our virtualenv - not sure if eciespy is still needed, but it may be for dependencies we end up using):
import hashlib
import hmac
import logging
import sys

from Crypto.Cipher import AES
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import ec
from web3 import Web3

logger = logging.getLogger(__name__)


def ecies_decrypt(privkey, message_parts):
    # Do ECDH via the cryptography module to get the non-libsecp256k1 version.
    sender_public_key_obj = ec.EllipticCurvePublicNumbers.from_encoded_point(
        ec.SECP256K1(), message_parts["ephemPublicKey"]).public_key(default_backend())
    private_key_obj = ec.derive_private_key(
        Web3.toInt(hexstr=privkey), ec.SECP256K1(), default_backend())
    aes_shared_key = private_key_obj.exchange(ec.ECDH(), sender_public_key_obj)
    # Now let's do AES-CBC with this, including the hmac matching (modeled after eccrypto code).
    aes_keyhash = hashlib.sha512(aes_shared_key).digest()
    hmac_key = aes_keyhash[32:]
    test_hmac = hmac.new(hmac_key, message_parts["iv"] + message_parts["ephemPublicKey"]
                         + message_parts["ciphertext"], hashlib.sha256).digest()
    if test_hmac != message_parts["mac"]:
        logger.error("Mac doesn't match: %s vs. %s", test_hmac, message_parts["mac"])
        return False
    aes_key = aes_keyhash[:32]
    # Actual decrypt is modeled after ecies.utils.aes_decrypt() - but with CBC mode to match eccrypto.
    aes_cipher = AES.new(aes_key, AES.MODE_CBC, iv=message_parts["iv"])
    try:
        decrypted_bytes = aes_cipher.decrypt(message_parts["ciphertext"])
        # Padding characters (unprintable) may be at the end to fit the AES block size, so strip them.
        unprintable_chars = bytes(range(0, 32)) + bytes(range(127, 160))
        decrypted_string = decrypted_bytes.rstrip(unprintable_chars).decode("utf-8")
        return decrypted_string
    except Exception:
        logger.error("Could not decode ciphertext: %s", sys.exc_info()[0])
        return False
So, this mechanism has caused me quite a bit of work, and you probably don't want to know the words I shouted at my computer at times while trying to figure this all out, but the result works great - and if you are ever in need of something like this, I hope I could shed some light on how to achieve it!
For further illustration, here's a flow graph of how the data gets from the user to Post AG in the end - the ECIES code samples are highlighted with light blue, all encryption-related things are blue in general, red is unencrypted data, while green is encrypted data:
Thanks to Post AG and Capacity for letting me work on interesting projects like that - and keep checking crypto.post.at for news about the next iteration of Crypto stamp!
Ever since I was on a tour to Star Trek filming sites in 2016 with Geek Nation Tours and Larry Nemecek, I've become ever more interested in finding out to which actual real-world places TV/film crews have gone "on location" and shot scenes for our favorite on-screen stories. While the background of production of TV and film is of interest to me in general, I focus mostly on everything Star Trek and I love visiting locations they used and try to catch pictures that recreate the base setting of the shots in the production - but just the way the place looks "in the real world" and right now.
This has gone as far as me doing several presentations about the topic - two of which (one in German, one in English) I will give at this year's FedCon as well - and creating an experimental website at filmingsites.com where I note all locations used in Star Trek productions as soon as I become aware of them.
In the last few years, around the Star Trek Las Vegas Conventions, I did get the chance to have a few days traveling around Los Angeles and vicinity, visit a few locations and take pictures there. And after Discovery was filmed up in the Toronto area (and generally used quite few locations outside the studios), Picard is back producing in Southern California and using plenty of interesting places! And now with the first half of season 1 in the books (or at least ready for us to watch via streaming), here are a few filming sites I found in those episodes:
And we actually get started with our first location (the picture is a still from the series) in "Remembrance", right after Picard wakes up from the "cold open" dream sequence: Château Picard was filmed at Sunstone Winery's Villa this time (after different places were used in its TNG appearances). The Winery's general manager even said "We encourage all the Trekkies and Trekkers to come visit us." - so I guess I'll need to put it in my travel plans soon.
Another one I haven't seen yet but will need to put in my plans to see is One Culver, previously known as Sony Pictures Plaza. That's where the scenes in the Daystrom Institute were shot - interestingly, within walking distance of the former Desilu Culver soundstages (now "The Culver Studios") and its backlot (now a residential area), where the original Star Trek series shot its first episodes and several outdoor scenes of later ones as well. One Culver's big glass front structure and the huge screen on its inside are clearly visible multiple times in Picard's Daystrom Institute scenes, as is the rainbow arch behind it on the Sony Studios parking lot. Not having been there, I could only include a promotional picture from their website here.
Now, a third filming site that appears in "Remembrance" is actually one I do have my own pictures of: after seeing the first trailer for Picard and getting a hint where the building depicted in that clip is, I made my way last summer to a place close to Disneyland and took a few pictures of the Anaheim Convention Center. Walking over to the main entrance, I found the attached Arena to just look good, so I got a shot of that one in as well - and then I see that in this episode, they used it as the Starfleet Archive Museum!
Of course, in the second episode, "Maps and Legends", we then see the main entrance, where Picard goes to meet the C-in-C, so presumably Starfleet Headquarters. It looks like the roof scenes with Dahj would actually be on the same building: on satellite pictures, there seems to be an area with those stairs south of the main entrance. I'm still a bit sad, though, that Starfleet seems to have moved their headquarters and it's no longer the Tillman administration building that was used in previous series (actually, for both headquarters and the Academy - so maybe it comes back in some series as the Academy, with its beautiful Japanese garden).
Of course, at the end of this episode we get to Raffi's home, and we stay there for a bit and see more of it in "The End is the Beginning". The description in the episode tells us it's located at a place called "Vasquez Rocks" - and this time, that's actually the real filming site! Trekkies know this place of course, as a whole lot of Trek has been filmed there - most famously the fight between Kirk and the Gorn captain in "Arena". Vasquez Rocks has surely been one of the most-used Star Trek filming sites over the years, though - at least before Picard - I'd say it ranked second behind Bronson Canyon. How what's nowadays a Natural Area park becomes a place to live in by 2399 is up to anyone's speculation.
I guess in the 3 introductory episodes we had more different filming sites than in either of the two whole seasons of Discovery seen so far, but right in the next episode, "Absolute Candor", we got yet another interesting place! A lot of that episode takes place on the planet Vashti, with three sets of scenes on its main plaza with the bar setting: in the "cold open" / flashback, when Picard beams down to the planet again in the show's present, and before he leaves, including the fight scene. Given that there were multiple hints of shooting taking place at Universal Studios Hollywood, and the sets having a somewhat familiar look, more Mexican than totally alien, it did not take long to identify where those scenes were filmed: it's the standing "Mexican Street" / "Old Mexico Place" set on Universal's backlot - which you can usually visit with the Studio Tour as an attraction of their theme park. The pictures, of the bar area and basically from there in the direction of Picard's beam-in point, are from one of those tours I took in 2013.
In the following two episodes, I could not make out any filming sites, so I guess they pretty much filmed those at Santa Clarita Studios where the production of the series is based. I know we will have some location(s) to talk about in the second half of the season though - not sure if there's as many as in the first few episodes, but I hope we'll have a few good ones!
I've been meaning to blog again for some time, and just looked in disbelief at the date of my last post. Yes, I'm still around. I hope I get to write more often in the future.
Ludo just posted his thoughts on FOSDEM, which I also attended last weekend as a volunteer for Mozilla. I have been attending this conference since 2002, when it first went by that exact name, and since then have AFAIK only missed the 2010 edition, giving talks in the Mozilla dev room almost every year - though funnily enough, in two of the three years that I've been a member of the Mozilla Tech Speakers program, my talks were not accepted into that room, while I made it in all the years before. In fact, that says more about how interested speakers are in getting into this room nowadays, while in the past there were probably fewer submissions in total. So, this year I helped out in Sunday's Mozilla developer room by managing the crowd entering/leaving at the door(s), similar to what I did in the last few years, and given that we had fewer volunteers this year, I also helped out at the Mozilla booth on Saturday. Unfortunately, being busy volunteering on both days meant that I did not catch any talks at all at the conference (I hear there were some good ones, esp. in our dev room), but I had a number of good hallway and booth conversations with various people, esp. within the Mozilla community - be it with friends I had not seen for a while, new interesting people within and outside of Mozilla, or conversations clearing up lingering questions.
(pictures by Rabimba & Bob Chao)
Now, this was the 20th conference by the FOSDEM team (their first one went by "OSDEM", before they added the "F" in 2002), and the number 20 is coming up for me all over the place - not just that it works double duty in the current year's number 2020, but even in the months before, I started my row of 20-year anniversaries in terms of my Mozilla contributions: first bug reported in May, first contribution contact in December, first German-language Mozilla suite release on January 1, and it will continue with the 20th anniversary of my first patches to shared code this summer - see my 'My Web Story' post from 2013 for more details. So, being part of an Open-Source project with more than 20 years of history, and celebrating a number of 20th anniversaries in that community, I see that number popping up quite a bit nowadays. Around the turn of the century/millennium, a lot of change happened, for me personally but all around as well. Since then, it has been a whirlwind, and change is the one constant that really stayed with me and has become almost a good friend. A lot of changes are going on in the Mozilla community right now as well, and after a bit of a slump and trying to find my new place in this community (since I switched back from staff to volunteer in 2016), I'm definitely excited again to try and help build this next chapter of the future with my fellow Mozillians.
There's so much more going around in my mind, but for now I'll leave it at that: In past times, when I was invited as volunteer or staff, the Mozilla Summits and All-hands were points that energized me and gave me motivation to push forward on making Mozilla better. This year, FOSDEM, with my volunteering and the conversations I had, did the same job. Let's build a better Internet and a better Mozilla community!
I’m writing this in lieu of a traditional Firefox Front-end Performance Update, as I think this will be more useful in the long run than just a snapshot of what my team is doing.
I want to talk about main thread disk access (sometimes referred to more generally as “main thread IO”). Specifically, I’m going to argue that main thread disk access is lethal to program responsiveness. For some folks reading this, that might be an obvious argument not worth making, or one already made ad nauseam — if that’s you, this blog post is probably not for you. You can go ahead and skip most or all of it, if you’d like. Or just skim it. You never know — there might be something in here you didn’t know or hadn’t thought about!
For everybody else, scoot your chairs forward, grab a snack, and read on.
Disclaimer: I wouldn’t call myself a disk specialist. I don’t work for Western Digital or Seagate. I don’t design file systems. I have, however, been using and writing software for computers for a significant chunk of my life, and I seem to have accumulated a bunch of information about disks. Some of that information might be incorrect or imprecise. Please send me mail at mike dot d dot conley at gmail dot com if any of this strikes you as wildly inaccurate (though forgive me if I politely disregard pedantry), and then I can update the post.
The mechanical parts of a computer
If you grab a screwdriver and (carefully) open up a laptop or desktop computer, what do you see? Circuit boards, chips, wires and plugs. Lots of electrons flowing around in there, moving quickly and invisibly.
Notably, there aren’t many mechanical moving parts of a modern computer. Nothing to grease up, nowhere to pour lubricant. Opening up my desktop at home, the only moving parts I can really see are the cooling fans near the CPU and power supply (and if you’re like me, you’ll also notice that your cooling fans are caked with dust and in need of a cleaning).
There’s another moving part that’s harder to see — the hard drive. This might not be obvious, because most mechanical drives (I’ve heard them sometimes referred to as magnetic drives, spinny drives, physical drives, platter drives and HDDs. There are probably more terms.) hide their moving parts inside of the disk enclosure.1
If you ever get the opportunity to open one of these enclosures (perhaps the disk has failed or is otherwise being replaced, and you’re just about to get rid of it) I encourage you to.
As you disassemble the drive, what you’ll probably notice are circular parts, layered on top of one another on a motor that spins them. In between those circles are little arms that can move back and forth. This next image shows one of those circles, and one of those little arms.
<figcaption>There are several of those circles stacked on top of one another, and several of those arms in between them. We’re only seeing the top one in this photo.</figcaption>
Does this remind you of anything? The circular parts remind me of CDs and DVDs, but the arms reaching across them remind me of vinyl players.
<figcaption>Vinyl’s back, baby!</figcaption>
The comparison isn’t that outlandish. If you ignore some of the lower-level details, CDs, DVDs, vinyl players and hard drives all operate under the same basic principles:
The circular part has information encoded on it.
An arm of some kind is able to reach across the radius of the circular part.
Because the circular part is spinning, the arm is able to reach all parts of it.
The end of the arm is used to read the information encoded on it.
There’s some extra complexity for hard drives. Normally there’s more than one spinning platter and one arm, all stacked up, so it’s more like several vinyl players piled on top of one another.
Hard drives are also typically written to as well as read from, whereas CDs, DVDs and vinyls tend to be written to once, and then used as “read-only memory.” (Though, yes, there are exceptions there.)
Lastly, for hard drives, there’s a bit I’m skipping over involving caches, where parts of the information encoded on the spinning platters are temporarily held elsewhere for easier and faster access, but we’ll ignore that for now for simplicity, and because it wrecks my simile.2
So, in general, when you’re asking a computer to read a file off of your hard drive, it’s a bit like asking it to play a tune on a vinyl. It needs to find the right starting place to put the needle, then it needs to put the needle there and only then will the song play.
For hard drives, the act of moving the “arm” to find the right spot is called seeking.
Contiguous blocks of information and fragmentation
Have you ever had to defragment your hard drive? What does that even mean? I’m going to spend a few moments trying to explain that at a high-level. Again, if this is something you already understand, go ahead and skip this part.
Most functional hard drives allow you to do the following useful operations:
Write data to the drive
Read data from the drive
Remove data from the drive
That last one is interesting, because usually when you delete a file from your computer, the information isn’t actually erased from the disk. This is true even after emptying your Trash / Recycling Bin — perhaps surprisingly, the files that you asked to be removed are still there encoded on the circular platters as 1’s and 0’s. This is why it’s sometimes possible to recover deleted files even when it seems that all is lost.
Allow me to explain.
Just like there are different ways of organizing a sock drawer (at random, by colour, by type, by age, by amount of damage), there are ways of organizing a hard drive. These “ways” are called file systems. There are lots of different file systems. If you’re using a modern version of Windows, you’re probably using a file system called NTFS. One of the things that a file system is responsible for is knowing where your files are on the spinning platters. This file system is also responsible for knowing where there’s free space on the spinning platters to write new data to.
When you delete a file, what tends to happen is that your file system marks those sectors of the platter as places where new information can be written to, but doesn’t immediately overwrite those sectors. That’s one reason why sometimes deleted files can be recovered.
Depending on your file system, there’s a natural consequence as you delete and write files of different sizes to the hard drive: fragmentation. This kinda sounds like the actual physical disk is falling apart, but that’s not what it means. Data fragmentation is probably a more precise way of thinking about it.
Imagine you have a sheet of white paper broken up into a grid of 5 boxes by 5 boxes (25 boxes in total), and a box of paints and paintbrushes.
Each square on the paper is white to start. Now, starting from the top-left, and going from left-to-right, top-to-bottom, use your paint to fill in 10 of those boxes with the colour red. Now use your paint to fill in the next 5 boxes with blue. Now do 3 more boxes with yellow.
So we’ve got our colour-filled boxes in neat, organized rows (red, then blue, then yellow), and we’ve got 18 of them filled, and 7 of them still white.
Now let’s say we don’t care about the colour blue. We’re okay to paint over those now with a new colour. We also want to fill in 10 boxes with the colour purple. Hm… there aren’t enough free white boxes to put in that many purple ones, but we have these 5 blue ones we can paint over. Let’s paint over them with purple, and then put the next 5 at the end in the white boxes.
So now 23 of the boxes are filled, we’ve got 2 left at the end that are white, but also, notice that the purple boxes aren’t all together — they’ve been broken apart into two sections. They’ve been fragmented.
This is an incredibly simplified model, but (I hope) it demonstrates what happens when you delete and write files to a hard drive. Gaps open up that can be written to, and bits and pieces of files end up being distributed across the platters as fragments.
This also occurs as files grow. If, for example, we decided to paint two more white boxes red, we’d need to paint the ones at the very end, breaking up the red boxes so that they’re fragmented.
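If it helps to make the model concrete, here is a tiny, purely illustrative sketch of the paint-box idea in JavaScript. None of this is real file system code; the colours stand in for files and the array stands in for sectors on a platter.

```js
// A toy model of the 5x5 grid: 25 "sectors", written left-to-right,
// where deleting just marks sectors as reusable.
const boxes = new Array(25).fill("white");

function paint(colour, count) {
  // Fill the first `count` free boxes we find, left to right.
  const used = [];
  for (let i = 0; i < boxes.length && used.length < count; i++) {
    if (boxes[i] === "white") {
      boxes[i] = colour;
      used.push(i);
    }
  }
  return used;
}

function erase(colour) {
  // "Deleting" doesn't wipe anything; it just marks the boxes as free.
  for (let i = 0; i < boxes.length; i++) {
    if (boxes[i] === colour) {
      boxes[i] = "white";
    }
  }
}

paint("red", 10);    // boxes 0-9
paint("blue", 5);    // boxes 10-14
paint("yellow", 3);  // boxes 15-17
erase("blue");       // we no longer care about blue
console.log(paint("purple", 10));
// -> [10, 11, 12, 13, 14, 18, 19, 20, 21, 22]: purple ends up fragmented
```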
So going back to our vinyl player example for a second — the ideal scenario is that you start a song at the beginning and it plays straight through until the end, right? The more common case with disk drives, however, is you read bits and pieces of a song from different parts of the vinyl: you have to lift and move the arm each time until eventually you have heard the song from start to finish. That seeking of the arm adds overhead to the time it takes to listen to the song from beginning to end.
When your hard drive undergoes defragmentation, what your computer does is try to re-organize your disk so that files are in contiguous sectors on the platters. That’s a fancy way of saying that they’re all in a row on the platter, so they can be read in without the overhead of seeking around to assemble them from fragments.
Skipping that overhead can have huge benefits to your computer’s performance, because the disk is usually the slowest part of your computer.
On the relative input / output speeds of modern computing components
I mentioned in the disclaimer at the start of this post that I’m not a disk specialist or expert. Scott Davis is probably a better bet as one of those. His bio lists an impressive wealth of experience, and mentions that he’s “a recognized expert in virtualization, clustering, operating systems, cloud computing, file systems, storage, end user computing and cloud native applications.”
I don’t know Scott at all (if you’re reading this, Hi, Scott!), but let’s just agree for now that he probably knows more about disks than I do.
I’m picking Scott as an expert because of a particularly illustrative analogy that was posted to a blog for a company he used to work for. The analogy compares the speeds of different media that can be used to store information on a computer. Specifically, it compares the following:
RAM
The network with a decent connection
Flash drives
Magnetic hard drives — what we’ve been discussing up until now.
For these media, the post claims that input / output speed can be measured using the following units:
RAM is in nanoseconds
10GbE Network speed is in microseconds (~50 microseconds)
Flash speed is in microseconds (between 20-500+ microseconds)
Disk speed is in milliseconds
That all seems pretty fast. What’s the big deal? Well, it helps if we zoom in a little bit. The post does this by supposing that we pretend that RAM speed happens in minutes.
If that’s the case, then we’d have to measure network speed in weeks.
And if that’s the case, then we’d want to measure the speed of a Flash drive in months.
And if that’s the case, then we’d have to measure the speed of a magnetic spinny disk in decades.
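If you’d like to play with that scaling yourself, here’s a quick back-of-the-envelope sketch. The latency figures are rough, illustrative numbers drawn from the ranges above, not measurements of any particular hardware.

```js
// Pretend 1 nanosecond of real latency is 1 minute of "human" time,
// then see what the other media look like at that scale.
const SECONDS_PER_PRETEND_MINUTE = 1e-9; // 1 ns of real time == 1 pretend minute

const roughLatenciesInSeconds = {
  "RAM (~100 ns)": 100e-9,
  "10GbE network (~50 microseconds)": 50e-6,
  "Flash (~200 microseconds)": 200e-6,
  "Magnetic disk (~10 milliseconds)": 10e-3,
};

for (const [medium, seconds] of Object.entries(roughLatenciesInSeconds)) {
  const pretendMinutes = seconds / SECONDS_PER_PRETEND_MINUTE;
  const pretendDays = pretendMinutes / (60 * 24);
  console.log(`${medium}: about ${pretendDays.toFixed(1)} pretend days`);
}
// RAM comes out at a fraction of a day (about an hour and a half),
// the network at roughly a month, Flash at several months, and the
// spinning disk at somewhere around nineteen years.
```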
Update (May 23, 2019): My Uncle Mark, who also works in computing, sent me links that show similar visualizations of computing latency: this one has a really excellent infographic, and this one has more discussion. These articles highlight network latency as the worst offender, which is true especially when the quality of service is low, but I’m mostly writing this post for folks who hack on Firefox where the vast majority of networking occurs off of the main thread.
I wish I had some ACM paper, or something written by a computer science professor, that I could point you to in order to bolster the following claim. I don’t, not because one doesn’t exist, but because I’m too lazy to look for one. I hope you’ll forgive me for that, but I don’t think I’m saying anything super controversial when I say:
In the common case, for a personal computer, it’s best to assume that reading and writing to the disk is the slowest operation you can perform.
Sure, there are edge cases where other things in the system might be slower. And there is that disk cache that I breezed over earlier that might make reading or writing cheaper. And sometimes the operating system tries to do smart things to help you. For now, just let it go. I’m making a broad generalization that I think covers the common cases, and I’m talking about what’s best to assume.
Single and multi-threaded restaurants
When I try to describe threading and concurrency to someone, I inevitably fall back to the metaphor of cooks in a kitchen in a restaurant. This is a special restaurant where there’s only one seat, for a single customer — you, the user.
Single-threaded programs
Let’s imagine a restaurant that’s very, very small and simple. In this restaurant, the cook is also acting as the waiter / waitress / server. That means when you place your order, the server / cook goes into the kitchen and makes it for you. While they’re gone, you can’t really ask for anything else — the server / cook is busy making the thing you asked for last.
This is how most simple, single-threaded programs work—the user feeds in requests, maybe by clicking a button, or typing something in, maybe something else entirely—and then the program goes off and does it and returns some kind of result. Maybe at that point, the program just exits (“The restaurant is closed! Come back tomorrow!”), or maybe you can ask for something else. It’s really up to how the restaurant / program is designed that dictates this.
Suppose you’re very, very hungry, and you’ve just ordered a complex five-course meal for yourself at this restaurant. Blanching, your server / cook goes off to the kitchen. While they’re gone, nobody is refilling your water glass or giving you breadsticks. You’re pretty sure there’s activity going on in the kitchen and that the server / cook hasn’t had a heart attack back there, but you’re going to be waiting a looooong time since there’s only one person working in this place.
Maybe in some restaurants, the server / cook will dash out periodically to refill your water glass, give you some breadsticks, and update you on how things are going, but it sure would be nice if we gave this person some help back there, wouldn’t it?
Multi-threaded programs
Let’s imagine a slightly different restaurant. There are more cooks in the kitchen. The server is available to take your order (but is also able to cook in the kitchen if need be), and you make your request from the menu.
Now suppose again that you order a five-course meal. The server goes to the kitchen and tells the cooks what you just ordered. In this restaurant, suppose the kitchen staff are a really great team and don’t get in each other’s way3, so they divide up the order in a way that makes sense and get to work.
The server can come back and refill your water glass, feed you breadsticks, perhaps they can tell you an entertaining joke, perhaps they can take additional orders that won’t take as long. At any rate, in this restaurant, the interaction between the user and the server is frequent and rarely interrupted.
The waiter / waitress / server is the main thread
In these two examples, the waiter / waitress / server is what is usually called the main thread of execution, which is the part of the program that the user interacts with most directly. By moving expensive operations off of the main thread, the responsiveness of the program increases.
Have you ever seen the mouse cursor turn into an hourglass, or seen the “This program is not responding” message on Windows? Or the spinny colourful pinwheel on macOS? In those cases, the main thread is off doing something and never came back to give you your order or refill your water or breadsticks — that’s how it generally manifests in common operating systems. The program seems “unresponsive”, “sluggish”, “frozen”. It’s “hanging”, or “stuck”. When I hear those words, my immediate assumption is that the main thread is busy doing something — either it’s taking a long time (it’s making you your massive five course meal, maybe not as efficiently as it could), or it’s stuck (maybe they fell down a well!).
In either case, the general rule of thumb to improving program responsiveness is to keep the server filling the user’s water and breadsticks by offloading complex things on the menu to other cooks in the kitchen.
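To make the “more cooks” idea concrete in code, here’s a minimal, generic sketch using a Web Worker in plain browser JavaScript (purely illustrative, not actual Firefox code): the expensive work happens on another thread, and the main thread stays free to keep responding to the user.

```js
// Offload an expensive computation (the "five-course meal") to a Worker so
// the main thread (the "server") can keep handling user interaction.
function cookExpensiveMealOffMainThread(iterations) {
  const workerSource = `
    onmessage = (event) => {
      let total = 0;
      for (let i = 0; i < event.data; i++) {
        total += i; // stand-in for real, slow work
      }
      postMessage(total);
    };
  `;
  const blob = new Blob([workerSource], { type: "application/javascript" });
  const worker = new Worker(URL.createObjectURL(blob));

  return new Promise((resolve) => {
    worker.onmessage = (event) => {
      worker.terminate();
      resolve(event.data);
    };
    worker.postMessage(iterations);
  });
}

// The main thread is not blocked while the Worker runs, so clicks, typing
// and painting all keep working in the meantime.
cookExpensiveMealOffMainThread(100_000_000).then((result) => {
  console.log("Meal served:", result);
});
```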
Accessing the disk on the main thread
Recall that in the common case, for a personal computer, it’s best to assume that reading and writing to the disk is the slowest operation you can perform. In our restaurant example, reading or writing to the disk on the main thread is a bit like having your server hop onto their bike and ride out to the next town over to grab some groceries to help make what you ordered.
And sometimes, because of data fragmentation (not everything is all in one place), the server has to search amongst many many shelves all widely spaced apart to get everything.
And sometimes the grocery store is very busy because there are other restaurants out there that are grabbing supplies.
And sometimes there are police checks (anti-virus / anti-malware software) occurring for passengers along the road, where they all have to show their IDs before being allowed through.
It’s an incredibly slow operation. Hopefully by the time the server comes back, they don’t realize they have to go back out again to get more, but they might if they didn’t realize they were missing some more ingredients.4
Slow slow slow. And unresponsive. And a great way to lose a hungry customer.
For super small programs, where the kitchen is well stocked, or the ride to the grocery store doesn’t need to happen often, having a single thread and having it read or write is usually okay. I’ve certainly written my fair share of utility programs or scripts that do main thread disk access.
Firefox, the program I spend most of my time working on as my job, is not a small program. It’s a very, very, very large program. Using our restaurant model, it’s many large restaurants with many many cooks on staff. The restaurants communicate with each other and ship food and supplies back and forth using messenger bikes, to provide to you, the customer, the best meals possible.
But even with this large set of restaurants, there’s still only a single waiter / waitress / server / main thread of execution as the point of contact with the user.
Part of my job is to help organize the workflows of this restaurant so that they provide those meals as quickly as possible. Sending the server to the grocery store (main thread disk access) is part of the workflow that we absolutely need to strike from the list.
Start-up main-thread disk access
Going back to our analogy, imagine starting the program like opening the restaurant. The lights go on, the chairs come off of the tables, the kitchen gets warmed up, and prep begins.
While this is occurring, it’s all hands on deck — the server might be off in the kitchen helping to do prep, off getting cutlery organized, whatever it takes to get the restaurant open and ready to serve. Before the restaurant is open, there’s no point in having the server be idle, because the customer hasn’t been able to come in yet.
So if critical groceries and supplies have to be fetched before the restaurant can open, it’s fine to send the server to the store. Somebody has to do it.
For Firefox, there are various things that need to take place before we can display any UI. At that point, it’s usually fine to do main-thread disk access, so long as all of the things being read or written are kept to an absolute minimum. Find how much you need to do, and reduce it as much as possible.
But as soon as UI is presented to the user, the restaurant is open. At that point, the server should stay off their bike and keep chatting with the customer, even if the kitchen hasn’t finished setting up and getting all of their supplies. So to stay responsive, don’t do disk access on the main thread of execution after you’ve started to show the user some kind of UI.
Disk contention
There’s one last complication I want to capture here with our restaurant example before I wrap up. I’ve been saying that it’s important to send someone other than the server to the grocery store for supplies. That’s true — but be careful of sending too many other people at the same time.
Moving disk access off of the main thread is good for responsiveness, full stop. However, it might do nothing to actually improve the overall time that it takes to complete some amount of work. Put another way: just because the server is refilling your glass and giving you breadsticks doesn’t mean that your five-course meal is going to show up any faster.
Also, disk operations on magnetic drives do not have a constant speed. Having the disk do many things at once within a single program or across multiple programs can slow the whole set of operations down due to the overhead of seeking and context switching, since the operating system will try to serve all disk requests at once, more or less.5
Disk contention and main thread disk access is something I think a lot about these days while my team and I work on improving Firefox start-up performance.
Some questions to ask yourself when touching disk
So it’s important to be thoughtful about disk access. Are you working on code that touches disk? Here are some things to think about:
Is UI visible, and responsiveness a goal?
If so, best to move the disk access off of the main-thread. That was the main thing I wanted to capture, and I hope I’ve convinced you of that point by now.
Does the access need to occur?
As programs age and grow and contributors come and go, sometimes it’s important to take a step back and ask, “Are the assumptions of this disk access still valid? Does this access need to happen at all?” The fastest code is the code that doesn’t run at all.
What else is happening during this disk access? Can disk access be prioritized more efficiently?
This is often trickier to answer as a program continues to run. Thankfully, tools like profilers can help capture recordings of things like disk access to gain evidence of simultaneous disk access.
Start-up is a special case though, since there’s usually a somewhat deterministic / reliably stable set of operations that occur in the same way in roughly the same order during start-up. For start-up, using a tool like a profiler, you can gain a picture of the sorts of things that tend to happen during that special window of time. If you notice a lot of disk activity occurring simultaneously across multiple threads, perhaps ponder if there’s a better way of ordering those operations so that the most important ones complete first.
Can we reduce how much we need to read or write?
There are lots of wonderful compression algorithms out there with a variety of performance characteristics that might be worth pondering. It might be worth considering compressing the data that you’re storing before writing it so that the disk has to write less and read less.
Of course, there’s compression and decompression overhead to consider here. Is it worth the CPU time to save the disk time? Is there some other CPU intensive task that is more critical that’s occurring?
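As a sketch of that trade-off, here’s what “compress before writing” might look like in Node.js (a generic illustration, not Firefox code): we spend some CPU time up front so the disk has fewer bytes to write now and fewer bytes to read later.

```js
// Trade CPU time for disk time: gzip the data before writing it out.
const fs = require("fs/promises");
const zlib = require("zlib");
const { promisify } = require("util");

const gzip = promisify(zlib.gzip);
const gunzip = promisify(zlib.gunzip);

async function writeCompressed(filePath, text) {
  const compressed = await gzip(Buffer.from(text, "utf8"));
  await fs.writeFile(filePath, compressed); // fewer bytes hit the platter
}

async function readCompressed(filePath) {
  const compressed = await fs.readFile(filePath); // fewer bytes read back
  return (await gunzip(compressed)).toString("utf8");
}
```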
Can we organize the things that we want to read ahead of time so that they’re more likely to be read contiguously (without seeking the disk)?
If you know ahead of time the sorts of things that you’re going to be reading off of the disk, it’s generally a good strategy to store them in that read order. That way, in the best case scenario (the disk is defragmented), the read head can fly along the sectors and read everything in, in exactly the right order you want them. If the user has defragmented their disk, but the things you’re asking for are all out of order on the disk, you’re adding overhead to seek around to get what you want.
Supposing that the data on the disk is fragmented, I suspect having the files in order anyways is probably better than not, but I don’t think I know enough to prove it.
Flawed but useful
One of my mentors, Greg Wilson, likes to say that “all models are flawed, but some are useful”. I don’t think he coined it, but he uses it in the right places at the right times, and to me, that’s what counts.
The information in this post is not exhaustive — I glossed over and left out a lot. It’s flawed. Still, I hope it can be useful to you.
Thanks
Thanks to the following folks who read drafts of this and gave feedback:
Mandy Cheang
Emily Derr
Gijs Kruitbosch
Doug Thayer
Florian Quèze
There are also newer forms of disks called Flash disks and SSDs. I’m not really going to cover those in this post. ↩
The other thing to keep in mind is that the disk cache can have its contents evicted at any time for reasons that are out of your control. If you time it right, you can maybe increase the probability of a file you want to read being in the cache, but don’t bet the farm on it. ↩
When writing multi-threaded programs, this is much harder than it sounds! Mozilla actually developed a whole new programming language to make that easier to do correctly. ↩
Keen readers might notice I’m leaving out a discussion on Paging. That’s because this blog post is getting quite long, and because it kinda breaks the analogy a bit — who sends groceries back to a grocery store? ↩
I’ve never worked on an operating system, but I believe most modern operating systems try to do a bunch of smart things here to schedule disk requests in efficient ways. ↩
Hello, folks. I wanted to give a quick update on what the Firefox Front-end Performance team is up to, so let’s get into it.
The name of the game continues to be start-up performance. We made some really solid in-roads last quarter, and this year we want to continue to apply pressure. Specifically, we want to focus on reducing IO (specifically, main-thread IO) during browser start-up.
Reducing main thread IO during start-up
There are lots of ways to reduce IO – in the best case, we can avoid start-up IO altogether by not doing something (or deferring it until much later). In other cases, when the browser might be servicing events on the main thread, we can move IO onto another thread. We can also re-organize, pack or compress files differently so that they’re read off of the disk more efficiently.
If you want to change something, the first step is measuring it. Thankfully, my colleague Florian has written a rather brilliant test that lets us take accounting of how much IO is going on during start-up. The test is deterministic enough that he’s been able to write a whitelist for the various ways we touch the disk on the main thread during start-up, and that whitelist means we’ve made it much more difficult for new IO to be introduced on that thread.
That whitelist has been processed by the team and turned into bugs, bucketed by the start-up phase where the IO is occurring. The next step is to estimate the effort and potential payoff of fixing those bugs, and then try to whittle down the whitelist.
And that’s effectively where we’re at. We’re at the point now where we’ve got a big list of work in front of us, and we have the fun task of burning that list down!
Being better at loading DLLs on Windows
While investigating the warm-up service for Windows, Doug Thayer noticed that we were loading DLLs during start-up oddly. Specifically, using a tool called RAMMap, he noticed that we were loading DLLs using “read ahead” (eagerly reading the entirety of the DLL into memory) into a region of memory marked as not-executable. This means that anytime we actually wanted to call a library function within that DLL, we needed to load it again into an executable region of memory.
Doug also noticed that we were unnecessarily doing ReadAhead for the same libraries in the content process. This wasn’t necessary, because by the time the content process wanted to load these libraries, the parent process would have already done it and it’d still be “warm” in the system file cache.
We’re not sure why we were doing this ReadAhead-into-unexecutable-memory work – its existence in the Firefox source code goes back many, many years, and the information we’ve been able to gather about the change is pretty scant at best, even with version control. Our top hypothesis is that this was a performance optimization that made more sense in the Windows XP era, but has since stopped making sense as Windows has evolved.
UPDATE: Ehsan pointed us to this bug where the change likely first landed. It’s a long and wind-y bug, but it seems as if this was indeed a performance optimization, and efforts were put in to side-step effects from Prefetch. I suspect that later changes to how Prefetch and SuperFetch work ultimately negated this optimization.
Doug hacked together a quick prototype to try loading DLLs in a more sensible way, and he was able to capture quite an improvement in start-up time on our reference hardware:
<figcaption>This graph measures various start-up metrics. The scatter of datapoints on the left show the “control” build, and they tighten up on the right with the “test” build. Lower is better. </figcaption>
At this point, we all got pretty excited. The next step was to confirm Doug’s findings, so I took his control and test builds, and tested them independently on the reference hardware using frame recording. There was a smaller1, but still detectable improvement in the test build. At this point, we decided it was worth pursuing.
We’re also seeing the impact reflected in Telemetry. The first Nightly build with Doug Thayer’s patch went out on April 14th, and we’re starting to see a nice dip in some of our graphs here:
<figcaption>This graph measures the time at which the browser window reports that it has first painted. April 14th is the second last date on the X axis, and the Y axis is time. The top-most line is plotting the 95th percentile, and there’s a nice dip appearing around April 14th. </figcaption>
We expect this improvement to have the greatest impact on weaker hardware with slower disks, but we’ll be avoiding some unnecessary work for all Windows users, and that gets a thumbs-up in my books.
If all goes well, this fix should roll out in Firefox 68, which reaches our release audience on July 9th!
My test machine has SuperFetch disabled to help reduce noise and inconsistency with start-up tests, and we suspect SuperFetch is able to optimize start-up better in the test build ↩
With Firefox 67 only a few short weeks away, I thought it might be interesting to take a step back and talk about some of the work that the Firefox Front-end Performance team is shipping to users in that particular release.
To be clear, this is not an exhaustive list of the great performance work that’s gone into Firefox 67 – but I picked a few things that the front-end team has been focused on to talk about.
Stop loading things we don’t need right away
The fastest code is the code that doesn’t run at all. Sometimes, as the browser evolves, we realize that there are components that don’t need to be loaded right away during start-up, and can instead be deferred until sometime after start-up. Sometimes, that means we can wait until the very last moment to initialize some component – that’s called loading something lazily.
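As a rough illustration of what “lazy” means here, this is the general shape of a lazy getter in plain JavaScript. It’s a simplified sketch in the spirit of the helpers Firefox’s front-end uses, not the actual implementation, and the component name is made up.

```js
// Nothing expensive happens at start-up; the cost is paid the first time
// something actually asks for the component, and the result is cached.
function defineLazyGetter(object, name, factory) {
  Object.defineProperty(object, name, {
    configurable: true,
    get() {
      const value = factory();
      // Replace the getter with the computed value so the factory
      // only ever runs once.
      Object.defineProperty(object, name, { value });
      return value;
    },
  });
}

const lazy = {};
defineLazyGetter(lazy, "PageStyleScanner", () => {
  console.log("Expensive initialization happens only now");
  return { scan(doc) { /* ... look for alternative stylesheets ... */ } };
});

// Start-up pays nothing; the first access triggers the work:
// lazy.PageStyleScanner.scan(document);
```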
Here’s a list of things that either got deferred until sometime after start-up, or made lazy:
The Hidden Window is a mysterious chunk of code that manages the state of the global menu bar on macOS when there are no open windows. The Hidden Window is also sometimes used as a singleton DOM window where various operations can take place. On Linux and Windows, it turns out we were creating this Hidden Window far earlier than we needed to, and now it’s quite a bit lazier.
Page style
Page Style is a menu you can find under View in the main menu bar, and it’s used to switch between alternative style sheets on a page. It’s a pretty rarely used feature from what we can tell, but we were scanning pages for their alternative stylesheets far earlier than we needed to. We were also scanning pages that we know don’t have alternative stylesheets, like the about:home / about:newtab page. Now we only scan normal web pages, and we do so only after we service the idle event queue.
Cache invalidation
The Startup Cache is an important part of Firefox startup performance. Its primary job is to cache computations that occur during each startup so that they only have to happen every once in a while. For example, the mark-up of the browser UI often doesn’t change from startup to startup, so we can cache a version of the mark-up that’s faster to read from disk, and only invalidate that during upgrades.
We were invalidating the whole startup cache every time a WebExtension was installed or uninstalled. This used to be necessary for old-style XUL add-ons (since those could cause changes to the UI that would need to go into the cache), but with those add-ons no longer available, we were able to remove the invalidation. This means faster startups more often.
Don’t touch the disk
The disk is almost always the slowest part of the system. Reading and writing to the disk can take a long time, especially on spinning magnetic drives. The less we can read and write, the better. And if we’re going to read, best to do it off of the main thread so that the UI remains responsive.
Old XUL icons code
We were reading from the disk on the main thread to search for window-specific icons to display in the window titlebar.
We noticed that when we were checking that a directory exists on Windows (to write a file to it), we were using the CreateDirectoryW Windows API. This API checks each folder on the way down to the last one to see if they exist. That’s a lot of disk IO! We can avoid this if we assume that the parent directories exist, and only fall into the slow path if we fail to write our file. This means that we hit the faster path with less IO more often, which is good for responsiveness and start-up time.
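Here’s what that optimistic pattern looks like in a generic form (a Node.js sketch, not the actual Firefox C++ code): try the write first, and only pay for directory creation if the write fails because the directory is missing.

```js
const fs = require("fs/promises");
const path = require("path");

async function writeFileOptimistically(filePath, data) {
  try {
    // Fast path: assume the parent directory already exists and just write.
    await fs.writeFile(filePath, data);
  } catch (err) {
    if (err.code !== "ENOENT") {
      throw err; // some other failure; don't mask it
    }
    // Slow path: the directory really was missing, so create it (and any
    // missing ancestors) and retry the write once.
    await fs.mkdir(path.dirname(filePath), { recursive: true });
    await fs.writeFile(filePath, data);
  }
}
```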
Enjoy your Faster Fox!
Firefox 67 is slated to ship with these improvements on May 14th – just a little over a month away. Enjoy!
Firefox 66 has been released, Firefox 67 is out on the beta channel, and Firefox 68 is cooking for the folks on the Nightly channel! These trains don’t stop!
With that, let’s take a quick peek at what the Firefox Front-end Performance team has been doing these past few weeks…
Document Splitting Foundations for WebRender (In-Progress by Doug Thayer)
An impressive set of patches were recently queued to land, which should bring document splitting to WebRender, but in a disabled state. The gfx.webrender.split-render-roots pref is what controls it, but I don’t think we can reap the full benefits of document splitting until we get retained display lists enabled in the parent process for the UI. I believe, at that point, we can start enabling document splitting, which means that updating the browser UI area will not involve sending updates to the content area for WebRender.
In other WebRender news, it looks like it should be enabled by default for some of our users on the release channel in Firefox 67, due to be released in mid-May!
Warm-up Service (In-Progress by Doug Thayer)
Doug has written the bits of code that tie a Firefox preference to an HKLM registry key, which can be read by the warm-up service at start-up. The next step is to add a mode to the Firefox executable that loads its core DLLs and then exits, and then have the warm-up service call into that mode if enabled.
Once this is done, we should be in a state where we can user test this feature.
Startup Cache Telemetry (In-Progress by Doug Thayer)
Two things of note here:
With the probes having now been uplifted to Beta, data will slowly trickle in over the next few days that will show us how the Firefox startup cache is behaving in the wild for users who aren’t receiving two updates a day (like our Nightly users). This is important, because oftentimes those updates cause some or all of the startup cache to be invalidated. We’re eager to see how the startup caches are behaving in the wild on Beta.
One of the tests that was landed for the startup cache Telemetry appears to have caught an issue with how the QuantumBar code works with it – this is useful, because up until now, we’ve had very little automated testing to ensure that the startup cache is working as expected.
Smoother Tab Animations (Paused by Felipe Gomes)
UX, Product and Engineering have been having discussions about how the new tab animations work, and one thing has been decided upon: we want our User Research team to run some studies to see how tab animations are perceived before we fully commit to changing one of the fundamental interactions in the browser. So, at this time, Felipe is pausing his efforts here until User Research comes back with some guidance.
Browser Adjustment Project (Concluded by Gijs Kruitbosch)
We originally set out to see whether or not we could do something for users running weaker hardware to improve their browsing experience. Our initial hypothesis was that by lowering the frame rate of the browser on weaker hardware, we could improve the overall page load time.
This hypothesis was bolstered by measurements done in late 2018, where it appeared that by manually lowering the frame rate on a weaker reference laptop, we could improve our internal page load benchmarks by a significant degree. This measurement was reproduced by Denis Palmeiro on Vicky Chin’s team, and so Gijs started implementing a runtime detection mechanism to do that lowering of the frame rate for machines with 2 or fewer cores where each core’s clock speed was 1.8GHz or slower1.
However, since then, we’ve been unable to reproduce the same positive effect on page load time. Neither has Denis. We suspect that recent work on the RefreshDriver, which changes how often the RefreshDriver runs during the page load window, is effectively getting the same kind of win2.
We did one final experiment to see whether or not lowering the frame rate would improve battery life, and it appeared to, but not to a very high degree. We might revisit that route were we tasked with trying to improve power usage in Firefox.
Better about:newtab Preloading (Completed by Gijs Kruitbosch)
The patch to preload about:newtab in an idle callback has landed and stuck! This means that we don’t preload about:newtab immediately after opening a new tab (which is good for responsiveness right around the time when you’re likely to want to do something), and also means that we have the possibility of preloading the first new tab in new windows! Great job, Gijs!
Experiments with the Process Priority Manager (In-Progress by Mike Conley)
I had a meeting today with Saptarshi, one of our illustrious Data Scientists, to talk about the upcoming experiment. One of the things he led me to conclude was that this experiment is going to have a lot of confounds, and it will be difficult to draw conclusions from it.
Part of the reason for that is because there are often times when a background tab won’t actually have its content process priority lowered. The potential reasons for this are:
The tab is running in a content process which is also hosting a tab that is running in the foreground of either the same or some other browser window.
The tab is playing audio or video.
Because of this, we can’t actually do things like measure how page load is being impacted by this feature because we don’t have a great sense of how many tabs have their content process priorities lowered. That’s just not a thing we collect with Telemetry. It’s theoretically possible, either due to how many windows or videos or tabs our Beta users have open, that very few of them will ever actually have their content process priorities lowered, and then the data we’d draw from Telemetry would be useless.
I’m working with Saptarshi now to try to find ways of either altering the process priority manager or adding new probes to reduce the number of potential confounds.
We’re only a few weeks away from Firefox 67 merging from the Nightly channel to Beta, and since my last update, a number of things have landed.
It’s the end of a long week for me, so I apologize for the brevity here. Let’s check it out!
Document Splitting Foundations for WebRender (In-Progress by Doug Thayer)
dthayer is still trucking along here – he’s ironed out a number of glitches, and kats is giving feedback on some APZ-related changes. dthayer is also working on a WebRender API endpoint for generating frames for multiple documents in a single transaction, which should help reduce the window of opportunity for nasty synchronization bugs.
Warm-up Service (In-Progress by Doug Thayer)
dthayer is pressing ahead with this experiment to warm up a number of critical files for Firefox shortly after the OS boots. He is working on a prototype that can be controlled via a pref that we’ll be able to test on users in a lab-setting (and perhaps in the wild as a SHIELD experiment).
Startup Cache Telemetry (In-Progress by Doug Thayer)
dthayer landed this Telemetry early in the week, and data has started to trickle in. After a few more days, it should be easier for us to make inferences on how the startup caches are operating out in the wild for our Nightly users.
Lazy Hidden Window (Completed by Felipe Gomes)
After a few rounds of landings and backouts, this appears to have stuck! The hidden window is now created after the main window has finished painting, and this has resulted in a nice ts_paint (startup paint) win on our Talos benchmark!
<figcaption>This is a graph of the ts_paint startup paint Talos benchmark. The highlighted node is the first mozilla-central build with the hidden window work. Lower is better, so this looks like a nice win!</figcaption>
Browser Adjustment Project (In-Progress by Gijs Kruitbosch)
This project appears to be reaching its conclusion, but with rather unsatisfying results. Denis Palmeiro from Vicky Chin’s team has done a bunch of testing of both the original set of patches that Gijs landed to lower the global frame rate (painting and compositing) from 60fps to 30fps for low-end machines, as well as the new patches that decrease the frequency of main-thread painting (but not compositing) to 30fps. Unfortunately, this has not yielded the page load wins that we wanted1. We’re still waiting to see if there’s at least a power-usage win here worth pursuing, but we’re almost ready to pull the plug on this one.
Better about:newtab Preloading (In-Progress by Gijs Kruitbosch)
Unfortunately, there’s a small snag with a test failure in automation, but Gijs is on the case.
Experiments with the Process Priority Manager (In-Progress by Mike Conley)
The Process Priority Manager has been enabled in Nightly for a number of weeks now, and no new bugs have been filed against it. I filed a bug earlier this week to run a pref-flip experiment on Beta after the Process Priority Manager patches are uplifted later this month. Our hope is that this has a neutral or positive impact on both page load time and user retention!
Make the PageStyleChild load lazily (Completed by Mike Conley)
There’s an infrequently used feature in Firefox that allows users to switch between different CSS stylesheets that a page might offer. I’ve made the component that scans the document for alternative stylesheets much lazier, and also made it skip non-web-pages, which means (at the very least) less code running when loading about:home and about:newtab.
This was unexpected – we ran an experiment late in 2018 where we noticed that lowering the frame rate manually via the layout.frame_rate pref had a positive impact on page load time… unfortunately, this effect is no longer being observed. This might be due to other refresh driver work that has occurred in the meantime. ↩
It’s been just a little over two weeks since my last update, so let’s see where we are!
A number of our projects are centered around trying to improve start-up time. Start-up can mean a lot of things, so we’re focused specifically on cold start-up on the Windows 10 2018 reference device when the machine is at rest.
If you want to improve something, the first thing to do is measure it. There are lots of ways to measure start-up time, and one of the ways we’ve been starting to measure is by doing frame recording analysis. This is when we capture display output from a testing device, and then analyze the videos.
This animated GIF shows eight videos. The four on the left are Firefox Nightly, and the four on the right are Google Chrome (71.0.3578.98). The videos are aligned so that both browsers are started at the same time.
<figcaption>The four on the left are Firefox Nightly, and the four on the right are Google Chrome (71.0.3578.98)</figcaption>
Some immediate observations:
Firefox Nightly is consistently faster to reach first paint (that’s the big white window)
Firefox Nightly is consistently faster to draw its toolbar and browser UI
Google Chrome is faster at painting its initial content
This last bullet is where the team will be focusing its efforts – we want to have the initial content painted and settled much sooner than we currently do.
Document Splitting Foundations (In-Progress by Doug Thayer)
After some pretty significant refactorings to work better with APZ, Doug posted a new stack of patches late last week, which will sit on top of the already large stack of patches that have landed. There are still a number of reviews pending on the main stack, but this work appears to be getting pretty close to conclusion, as the patches are in the final review and polish stage.
After this, once retained display lists are enabled in the parent process, and an API is introduced to WebRender to generate frames for multiple documents in a single transaction, we can start thinking about enabling document splitting by default.
Warm-up Service (In-Progress by Doug Thayer)
A Heartbeat survey went out a week or so back to get some user feedback about a service that would speed up the launching of Firefox at the cost of adding some boot time to Windows. The responses we’ve gotten back have been quite varied, but can be generally bucketed into three (unsurprising) groups:
Users who say they do not want to make this trade
Users who say they would love to make this trade
Users who don’t care at all about this trade
Each group is sufficiently large to warrant further exploration. Our next step is to build a version of this service that we can turn on and off with a pref and test either in a lab and/or out in the wild with a SHIELD study.
Startup Cache Telemetry (In-Progress by Doug Thayer)
We do a number of things to try to improve real and perceived start-up time. One of those things is to cache things that we calculate at runtime during start-up to the disk, so that for subsequent start-ups, we don’t have to do those calculations again.
There are a number of mechanisms that use this technique, and Doug is currently adding some Telemetry to see how they’re behaving in the wild. We want to measure cache hits and misses, so that we know how healthy our cache system is out in the wild. If we get signals back that our start-up caches are missing more than we expect, this will highlight an important area for us to focus on.
Smoother Tab Animations (In-Progress by Felipe Gomes)
UX has gotten back to us with valuable feedback on the current implementation, and Felipe is going through it and trying to find the simplest way forward to address their concerns.
Having been available (though disabled by default) on Nightly, we’ve discovered one bug where the tab strip can become unresponsive to mouse events. Felipe is currently working on this.
Lazy Hidden Window (In-Progress by Felipe Gomes)
Under the hood, Firefox’s front-end has a notion of a “hidden window”. This mysterious hidden window was originally introduced long long ago1 for MacOS, where it’s possible to close all windows yet keep the application running.
Since then, it’s been (ab)used for Linux and Windows as well, as a safe-ish place to do various operations that require a window (since that window will always be around, and not go away until shutdown).
Browser Adjustment Project (In-Progress by Gijs Kruitbosch)
Gijs updated the patch so that the adjustment causes the main thread to skip every other VSync rather than switching us to 30fps globally2.
We passed the patch off to Denis Palmeiro, who has a sophisticated set-up that allows him to measure a pageload benchmark using frame recording. Unfortunately, the results we got back suggested that the new approach regressed visual page load time significantly in the majority of cases.
We’re in the midst of using the same testing rig to test the original global 30fps patch to get a sense of the magnitude of any improvements we could get here. Denis is also graciously measuring the newer patch to see if it has any positive benefits towards power consumption.
Better about:newtab Preloading (In-Progress by Gijs Kruitbosch)
By default, users see about:newtab / a.k.a Activity Stream when they open new tabs. One of the perceived performance optimizations we’ve done for many years now is to preload the next about:newtab in the background so that the next time that the user opens a tab, the about:newtab is all ready to roll.
This is a perceived performance optimization where we’re moving work around rather than doing less work.
Right now, we preload a tab almost immediately after the first tab is opened in a window. That means that the first opened tab is never preloaded, but the second one is. This is for historical reasons, but we think we can do better.
Gijs is working on making it so that we choose a better time to preload the tab – namely, when we’ve found an idle pocket of time where the user doesn’t appear to be doing anything. This should also mean that the first new tab that gets opened might also be preloaded, assuming that enough idle time was made available to trigger the preload. And if there wasn’t any idle time, that’s also good news – we never got in the user’s way by preloading when it was clear they were busy doing something else.
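In generic web JavaScript the same idea looks something like the sketch below, using requestIdleCallback. The Firefox front-end has its own internal idle-dispatch helpers, so this is only an illustration, and preloadNewTab is a made-up stand-in for the real preloading work.

```js
// Only do the speculative work when the main thread has an idle moment,
// and give up gracefully if that moment never comes.
function schedulePreload(preloadNewTab) {
  requestIdleCallback(
    (deadline) => {
      // Only start if we actually have a reasonable chunk of idle time left,
      // or if we've waited so long that the timeout fired.
      if (deadline.timeRemaining() > 10 || deadline.didTimeout) {
        preloadNewTab();
      } else {
        schedulePreload(preloadNewTab); // try again at the next idle period
      }
    },
    { timeout: 10000 } // don't wait forever; fall back after 10 seconds
  );
}

schedulePreload(() => console.log("Preloading the next about:newtab now"));
```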
Experiments with the Process Priority Manager (In-Progress by Mike Conley)
The Process Priority Manager has been enabled on Nightly for a few weeks now. Except for a (now fixed) issue where audio playing in background tabs would drop samples periodically, it’s been all quiet for regression reports.
The next step is to file a bug to run an experiment on Beta to see how this work impacts page load time.
Enable the separate Activity Stream content process by default (Stalled by Mike Conley)
This work is temporarily stalled while I work on other things, so there’s not too much to report here.
Well, here I am again – apologizing about a late update. Lots of stuff has been going on performance-wise in the Firefox code-base, and I’ll just be covering a small section of it here.
You might also notice that I changed the title of the blog series from “Firefox Performance Update” to “Firefox Front-end Performance Update”, to reflect that the things the Firefox Front-end Performance team is doing to keep Firefox speedy (though I’ll still add a grab-bag of other performance related work at the end).
So what are we waiting for? What’s been going on?
Migrate consumers to the new Places Observer system (Paused by Doug Thayer)
Doug was working on this later in 2018, and successfully ported a good chunk of our bookmarks code to use the new batched Places Observer system. There’s still a long-tail of other call sites that need to be updated to the new system, but Doug has shifted focus from this to other things in the meantime.
Document Splitting Foundations for WebRender (In-Progress by Doug Thayer)
This has been a pretty long-haul task, but Doug has been plugging away, and landed a significant chunk of the infrastructure for this. At this time, Doug is working with kats to make Document Splitting integrate nicely with Async-Pan-Zooming (APZ).
The current plan is for Document Splitting to land disabled by default, since it’s blocked by parent-process retained display lists (which still have a few bugs to shake out).
Warm-up Service (In-Progress by Doug Thayer)
Doug is investigating the practicalities of having a service run during Windows start-up to preload various files that Firefox will need when started.
We’re still researching this at multiple levels, and haven’t yet determined if this is a thing that we’d eventually want to ship. Stay tuned.
Smoother Tab Animations (In-Progress by Felipe Gomes)
After much ado, simplification, and review back-and-forth, the initial set of new tab animations have landed in Nightly. You can enable them by setting browser.tabs.newanimations to true in about:config and then restarting the browser. These new animations run entirely on the compositor, instead of painting at each refresh driver tick, so they should be smoother than the current animations that we ship.
There are still some cases that need new animations, and Felipe is waiting on UX for those.
Overhauling about:performance (V1 Completed by Florian Quèze)
The new about:performance shipped late last year, and now shows both energy as well as memory usage of your tabs and add-ons.
The current iteration allows you to close the tabs that are hogging your resources. Current plans should allow users to pause JavaScript execution in busy background tabs as well.
Browser Adjustment Project (In-Progress by Gijs Kruitbosch)
Gijs has landed some patches in Nightly (which have recently uplifted to Beta, and are only enabled on early Betas), which lowers the default frame rate of Firefox from 60fps to 30fps on devices that are considered “low-end”1.
This has been on Nightly for a while, but as our Nightly population tends to skew to more powerful hardware, we expect not a lot of users have experienced the impact there.
While the lowered frame rate seemed to have a positive impact on page load time in the lab on our “low-end” reference hardware, we’re having a much harder time measuring any appreciable improvement in CI. We have scheduled an experiment to see if improvements are detectable via our Telemetry system on Beta.
We need to be prepared that this particular adjustment will either not have the desired page load improvement, or will result in a poorer quality of experience that is not worth any page load improvement. If that’s the case, we still have a few ideas to try, including:
Lowering the refresh driver tick, rather than the global frame rate. This would mean things like scrolling and videos would still render at 60fps, but painting the UI and web content would occur at a lower frequency.
Use the hardware vsync again (switching to 30fps turns hardware vsync off), but just paint every other time. This is to test whether or not software vsync results in worse page load times than hardware vsync.
Avoiding spurious about:blank loads in the parent process (Completed by Gijs Kruitbosch)
Gijs short-circuited a bunch of places where we were needlessly creating about:blank documents that we were just going to throw away (see this bug and dependencies). There are still a long tail of cases where we still do this in some cases, but they’re not the common cases, and we’ve decided to apply effort for other initiatives in the meantime.
Experiments with the Process Priority Manager (In-Progress by Mike Conley)
This was originally Doug Thayer’s project, but I’ve taken it on while Doug focuses on the epic mountain that is WebRender Document Splitting.
If you recall, the goal of this project is to lower the process priority for tabs that are only sitting in the background. This means that if you have tabs in the background that are attempting to use system resources (running JavaScript for example), those tabs will have less priority at the operating system level than tabs that are in the foreground. This should make it harder for background tabs to cause foreground tabs to be starved of processing resources.
After clearing a few final blockers, we enabled the Process Priority Manager by default last week. We also filed a bug to keep background tabs at a higher priority if they’re playing audio or video, and the fix for that just landed in Nightly today.
So if you’re on Windows on Nightly, and you’re curious about this, you can observe the behaviour by opening up the Windows Task Manager, switching to the “Details” tab, and watching the “Base priority” reading on your firefox.exe processes as you switch tabs.
Cheaper tabs in titlebar (Completed by Mike Conley)
After an epic round of review (thanks, Dao!), the patches to move our tabs-in-titlebar logic out of JS and into CSS landed late last year.
Enable the separate Activity Stream content process by default (In-Progress by Mike Conley)
There’s one known bug remaining that’s preventing us from enabling the privileged content process by default.
Thankfully, the cause is understood, and a fix is being worked on. Unfortunately, this is one of those bugs where the proper solution involves refactoring a bit of old crufty stuff, so it’s taking longer than I’d like.
Still, if all goes well, this bug should be closed out soon, and we can see about letting the privileged content process ride the trains.
Grab bag of notable performance work
This is an informal list of things that I’ve seen land in the tree lately that I believe will have a positive performance impact for our users. Have you seen something that you’d like to nominate for a future list? Submit the bug here!
Also, keep in mind that some of these landed months ago and already shipped to release. That’s what I get for taking so long to write a blog post.
Gabriele Svelto made it so that the minidump analyzer, which runs after a crash has occurred to produce stack dumps for crash pings, runs with much lower process priority. This means that the CPU is being hogged less after a crash, when you’re likely to want to get right back to what you were doing.
Hey folks – another Performance Update coming at you! It’s been a few weeks since I posted one of these, mostly due to travel, holidays and the Mozilla SF All-Hands. However, we certainly haven’t been idle during that time. Much work has been done Performance-wise, and there’s a lot to tell. So strap in! But first…
This Performance Update is brought to you by: promiseDocumentFlushed
promiseDocumentFlushed is a utility that’s available for browser engineers in chrome documents on the window global. The goal of promiseDocumentFlushed is to help avoid synchronous layout flushes in our JavaScript code by scheduling work to only occur after the next “natural” layout flush occurs1.
promiseDocumentFlushed takes a function and returns a Promise. The function it takes will run the next time a natural layout flush and paint has finished occurring. At this point, the DOM should not be “dirty”, and size and position queries should be very cheap to calculate. It is critically important for the callback to not modify the DOM. I’ve filed bugs to make modifying the DOM inside that callback enter some kind of failure state, but it hasn’t been resolved yet.
The return value of the callback is what promiseDocumentFlushed’s returned Promise resolves with. Once the Promise resolves, it is then safe to modify the DOM.
This mechanism means that if, for some reason, you need to gather information about the size or position of things in the DOM, you can do it without forcing a synchronous layout flush – however, a paint will occur before that information is given to you. So be on the look-out for flicker, since that’s the trade-off here.
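To make that pattern concrete, here is a rough sketch of how promiseDocumentFlushed might be used from browser UI code; the element names (myToolbarItem, panel) are placeholders for illustration, not real browser code:

async function repositionPanel() {
  // Only read layout inside the callback; the DOM must not be modified here.
  const rect = await window.promiseDocumentFlushed(() => {
    return myToolbarItem.getBoundingClientRect();
  });
  // The Promise has resolved, so it is now safe to write to the DOM again.
  panel.style.left = rect.left + "px";
}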
And now, here’s a list of the projects that the team has been working on lately:
ClientStorage (In-Progress by Doug Thayer)
The ClientStorage project should allow Firefox to communicate with the GPU more efficiently on macOS, which should hopefully reduce jank on the compositor thread2. This is right on the verge of landing3, and we’re very excited to see how this impacts our macOS users!
Init WindowsJumpLists off-main-thread (Completed by Doug Thayer)
The JumpList is a Windows-only feature – essentially an application-specific context menu that opens when you right-click on the application in the task bar. Adding entries to this context menu involves talking to Windows, and unfortunately, the way we were originally doing this involved writing to the disk on the main thread. Thankfully, the API is thread-safe, so Doug was able to move the operation onto a background thread. This is good, because arewesmoothyet was reporting the Windows JumpList code as one of the primary causes of main-thread hangs caused by our front-end code.
Reduce painting while scrolling panels on macOS (Completed by Doug Thayer)
Matt Woodrow noticed that the recently added All Tabs list was performing quite poorly when scrolling it on macOS. After turning on paint-flashing for our browser UI, he noticed that we were re-painting the entire menu every time it scrolled. After some investigation, Matt realized that this was because our Graphics code was skipping some optimizations due to the rounded corners of the panels on macOS. We briefly considered removing the rounded corners on macOS, but then Doug found a more general fix, and now we only re-paint the minimum necessary to scroll the menu, and it’s much smoother!
Make the RemotePageManager lazy (In-Progress by Felipe Gomes)
Overhauling about:performance (In-Progress by Florian Quèze)
Florian is working on improving about:performance, with the hopes of making it more useful for browser engineers and users for diagnosing performance problems in Firefox. Here’s a screenshot of what he has so far:
Apparently, mining cryptocurrency takes a lot of CPU!
Thanks to the work of Tarek Ziade, we now have a reliable mechanism for getting information on which tabs are consuming CPU cycles. For example, in the above screenshot, we can see that the coinhive tab that Firefox has open is consuming a bunch of CPU in some workers (mining cryptocurrency). Florian has also been clearing out some of the older code that was supporting about:performance, including the subprocess memory table. This table was useful for our browser engineers when developing and tuning the multi-process project, but we think we can replace it now with something more actionable and relevant to our users. In the meantime, since gathering the memory data causes jank on the main thread, he’s removed the table and the supporting infrastructure. The about:performance work hasn’t landed in the tree yet, but Florian is aiming to get it reviewed and landed (preffed off) soon.
Browser Adjustment Project (In-Progress by Gijs Kruitbosch)
This is a research project to find ways that Firefox can classify the hardware it’s running on, which should make it easier for the browser to make informed decisions on how to deal with things like CPU scheduling, thread and process priority, graphics and UI optimizations, and memory reclamation strategies. This project is still in its early days, but Gijs has already identified prior art and research that we can build upon, and is looking at lightweight ways we can assign grades to a user’s CPU, disk, and graphics hardware. Then the plan is to try hooking that up to the toolkit.cosmeticAnimations pref, to test disabling those animations on weaker hardware. He’s also exploring ways in which the user can override these measurements in the event that they want to bypass the defaults that we choose for each environment.
Avoiding spurious about:blank loads in the parent process (In-Progress by Gijs Kruitbosch)
When we open new browser windows, the initial browser tab inside them runs in the parent process and loads about:blank. Soon after, we do a process flip to load a page in the content process. However, that initial about:blank still has cost, and we think we can avoid it. There’s a test failure that Gijs is grappling with, but after much thorough detective work deep in the complex ball of code that supports our window opening infrastructure, he’s figured out a path forward. We expect this project to be wrapped up soon, which should hopefully make window opening cheaper and also produce less flicker.
Load Activity Stream scripts from ScriptPreloader (Completed by Jay Lim)
Defer calculating Activity Stream state until idle (In-Progress by Jay Lim)
When Firefox starts up, one of the first things it prepares to do is show you the Activity Stream page, since that’s the default home and new tab page. Jay thinks we might be able to save the state of Activity Stream at shutdown, and load it again quickly during startup within the content process, and then defer the calculations necessary to produce a more recent state until after the parent process has become idle. We’re unsure yet what this will buy us in terms of start-up speed, but Jay is hacking together a prototype to see. I’m eager to find out!
Grab bag of Notable Performance Work
Luca Greco landed all of the infrastructure to move the WebExtension storage.local backend from a file in the profile directory to indexedDB. This should particularly help the performance of the browser when WebExtensions write small changes to large storage structures, since historically this would cause the entire JSON object for the structure to be recomputed and flushed to disk. This should also help with memory consumption. The infrastructure is disabled by default, and once this bug is fixed, it will be switched on.
Dave Townsend fixed an issue where we were requesting the favicon for new pages twice instead of once. This resulted in a 2%-3% win on our internal session restore benchmark on 64-bit Linux!
As I draw this update to a close, I want to give a shout-out to my intern and colleague Jay Lim, whose internship is ending in a few short days. Jay took to performance work like a duck to water, and his energy, ideas and work were greatly appreciated! Thank you so much, Jay!
By “natural”, I mean a layout flush triggered by the refresh driver, and not by some JavaScript requesting size or position information on a dirty DOM ↩
And when it comes to smoothness and responsiveness, jank on the compositor thread is deadly ↩
it landed and bounced once due to a crash test failure, but Doug has just gotten a fix for it approved ↩
As I mentioned previously, the Mixed Reality "virus" has caught me recently, and I spend a good portion of my Mozilla contribution time presenting and writing demos for WebVR/XR these days.
The prime driver for writing my first such demo was that I wanted to do something meaningful with A-Frame. Previously, I had only played around with the Hello WebVR example and some small alterations around the basic elements seen in that one, which is also pretty much what I taught to others in the WebVR workshops I held in Vienna last year. Now it was time to go beyond that, and as I had recently bought an HTC Vive, I wanted something where the controllers could be used - but still something that would fall back nicely and be usable in 2D mode on a desktop browser or even mobile screens.
While I was thinking about what I could work on in that area, another long-standing thought crossed my mind: How feasible is it to render OpenStreetMap (OSM) data in 3D using WebVR and A-Frame? I decided to try and find out.
First, I built on my knowledge from Lantea Maps and the fact that I had a tile cache server set up for that, and created a layer of map tiles on the ground to form the base. That brought me to a number of issues to think about and make decisions on: Should I respect the curvature of the earth, possibly putting the tiles and the viewer on a certain place on a virtual globe? Should I respect the terrain, especially the elevation of different points on the map? Also, as the VR scene relates to real-world sizes of objects, how large is a map tile actually in reality? After a lot of thinking, I decided that this would be a simple demo, so I would assume the earth is flat - both in terms of curvature (or "the globe") and terrain - and the viewer would start off at coordinates 0/0/0, with x and z being the horizontal coordinates and y the vertical one, as usual in A-Frame scenes. For the tile size, I found that with OpenStreetMap using the Mercator projection, the tiles always stay squares, with different real-world sizes depending on the latitude (and the zoom level, but I always use the same high zoom there). In this respect, I still had to account for the real world being a globe.
Once I had those tiles rendering on the ground, I could think about navigation: I added teleport controls, and later also movement controls to fly through the scene. With the W/A/S/D keys on the desktop (and later the fly controls), it was possible to "fly" underneath the ground, which was awkward, so I later wrote a very simple "position-limit" A-Frame component that prevents this - it's also a very nice example of how to build a component, because it's short and easy to understand.
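For illustration, a minimal A-Frame component along those lines could look something like the sketch below; the schema property name is only an approximation of the idea, not the actual VR Map source:

AFRAME.registerComponent("position-limit", {
  schema: {
    ymin: { type: "number", default: 0 } // lowest allowed height (illustrative)
  },
  tick: function () {
    // Clamp the entity's vertical position every frame so it never goes
    // underneath the ground plane.
    const pos = this.el.object3D.position;
    if (pos.y < this.data.ymin) {
      pos.y = this.data.ymin;
    }
  }
});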
All this isn't using OSM data per se, but just the pre-rendered tiles, so it was time to go one step further and dig into the Overpass API, which allows querying and retrieving raw geo data from OSM. With Overpass Turbo, I could try out and adjust the queries I wanted to use and then move them into my code. I decided the first exercise would be to get something that is a point on the map, a single "node" in OSM speak, and when looking at rendered maps, I found that trees fit that requirement very well. One Overpass query for "node[natural=tree]" later, plus some massaging of the result into a format that JavaScript can work with nicely, and I was able to place three-dimensional A-Frame entities in the places where the tiles had the symbols for trees! I started with simple brown cylinders for the trunks, then placed a sphere on top of them as the crown, and later got fancy by evaluating various "tags" in the data to render accurate height, crown diameter and trunk circumference, and even a different base model for needle-leaved trees, using a cone for the crown.
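As a rough sketch of that step (the bounding box and the public Overpass endpoint below are just examples, not necessarily what the demo uses), such a query can be sent and its JSON result processed like this:

// Overpass QL: all tree nodes inside a (south, west, north, east) bounding box.
const query = '[out:json];node["natural"="tree"](48.20,16.36,48.21,16.38);out;';

fetch("https://overpass-api.de/api/interpreter?data=" + encodeURIComponent(query))
  .then(response => response.json())
  .then(data => {
    // data.elements is an array of objects with lat, lon and a tags map.
    for (const node of data.elements) {
      placeTree(node.lat, node.lon, node.tags); // hypothetical helper
    }
  });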
But to make the demo really look like a map, it of course needed buildings to be rendered as well. Those are more complex, as even the simpler buildings are "ways" with a variable number of "nodes", and the more complex ones have holes in their base shape and therefore require a compound (or "relation" in OSM speak) of multiple "ways" for the outer shape and the inner holes. And then, the 2D shape given by those properties needs to be extruded to a certain height to form an actual 3D building. After finding the right Overpass query, I realized it would be best to create my own "building" geometry in A-Frame, which takes the inner and outer paths as well as the height as parameters. In the code for that, I used the THREE.js library underlying A-Frame to create a shape (potentially with holes), extrude it to the right height and rotate it to actually stand on the ground. Then I used code similar to what I had for trees to create A-Frame entities with that custom geometry. For the height, I use the explicit tags in the OSM database, estimate from the building's levels/floors if given, or else fall back to a default. And I even respect the color of the building if there is a tag specifying it.
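The core of that custom geometry boils down to a few THREE.js calls; the sketch below only illustrates the idea (the point format is an assumption, and older THREE.js versions name the depth option "amount"):

function buildingGeometry(outerPoints, holePaths, height) {
  // outerPoints / holePaths are arrays of {x, z} points already projected
  // onto the scene's ground plane (illustrative format).
  const shape = new THREE.Shape(outerPoints.map(p => new THREE.Vector2(p.x, p.z)));
  for (const hole of holePaths) {
    shape.holes.push(new THREE.Path(hole.map(p => new THREE.Vector2(p.x, p.z))));
  }
  const geometry = new THREE.ExtrudeGeometry(shape, { depth: height, bevelEnabled: false });
  // The extrusion goes along +z, so rotate the result to stand on the ground.
  geometry.rotateX(-Math.PI / 2);
  return geometry;
}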
With that in place, I had a pretty nice demo that uses data directly from OpenStreetMap to render Virtual Reality scenes that could be viewed in the desktop or mobile browser, or even in a full VR headset!
It's available under the name of "VR Map" at vrmap.kairo.at, and of course the source code can also be inspected, copied and forked on GitHub.
Again, this is intended as a demo, not a full-featured product: at this time it only renders an area of a defined size and does not include any code to load additional scenery as you move around. Also, it does not support "building parts", which are the way to specify in OSM that different parts of a building have e.g. different heights or colors. It could also be extended to render actual models of buildings when they exist and are referenced in the database (so e.g. the Eiffel Tower would look less weird when going to the Paris preset). There are a lot of things that could still be done to improve this demo for sure, but as it stands, it's a pretty simple piece of code that shows the power of both A-Frame and the OpenStreetMap data, and that's what I set out to do, after all.
My plan is to take this to multiple meetups and conferences to promote both underlying projects and get people inspired to think about what they can do with those ideas. Please let me know if you know of a good event where I can present this work. The first of those presentations happened at the ViennaJS May Meetup, see the slides and video.
I'm also in an email conversation with another OSM contributor who is using this demo as a base for some of his work, e.g. on rendering building models in 3D and VR and allowing people to correct their position data.
I hope that this demo spawns more ideas of what people can do with this toolset, and I'll also be looking into more demos that will probably move into different directions.
Ever since a close encounter with burning out (thankfully, I didn't quite get there) forced me to leave my job with Mozilla more than two years ago, I have been looking for a place and role that feels good for me in the Mozilla community. I immediately signed up to join Tech Speakers, as I have always loved talking about Mozilla tech topics - after all, breaking down complicated content and communicating it to different groups is probably my biggest strength - but finding the topics I want to present at conferences and other events has been a somewhat harder journey.
I knew I had to keep my distance from crash stats, despite knowing the area inside and out and having developed some passion for it: staying, as a volunteer, in the same area as the job that had almost burned me out was just not a good idea, from multiple points of view. I thought about building up some talks about working with data, but that still was a bit too close to that past and not what I do a lot at present (I work mostly in blockchain technology today), so that didn't go far (but maybe it will happen at some point).
On the other hand, I got more and more interested in some things the Open Innovation group at Mozilla was doing, and even more in what the Emerging Technologies teams bring into the Mozilla and web sphere. My talk (slides) at this year's local "Linuxwochen Wien" conference was a very quick run-through of what's going on there and it's a whole stack of awesomeness, from Mixed Reality via codecs, Rust, Voice and whatnot to IoT. I would love to dig a bit into the latter but I didn't yet find the time.
What I did find some time for is digging into WebVR (now WebXR, where "XR" means "Mixed Reality") and the A-Frame library that Mozilla has created to make it dead simple to create your own VR/XR experiences. Last year I did two workshops in Vienna on that area, another one this year and I'm planning more of them. It's great how people with just some HTML knowledge can build something easily there as well as people who are more into JS programming, who can dig even deeper. And the immersiveness of VR with a real headset blows people away again and again in any case, so a good thing to show off.
While last year I only had Cardboards with some left-over Sony Z3C phones (thanks to Mozilla) to show basic 3DoF (rotation-only) VR at low resolution, even that proved interesting to the people I presented to or held workshops with. This year I decided to buy an HTC Vive, seeing its price go down somewhat before the next generation of headsets ships. (As a side note, I chose the Vive over the Rift because Linux drivers are available and because I don't want to give money to Facebook.) Along with a new laptop with a high-end GPU that can drive the VR headset, I got into fully immersive 6DoF VR and, I have to say, got somewhat addicted to the experience.
I ran a demo booth with A-Painter at "Linuxwochen Wien" in May, and people were awed both at the VR experience and at the fact that this was all running in plain Firefox! Spreading the word about new web technologies can be really fun and rewarding with experiences like that. Next to showing demos and using VR myself, I also got into building WebVR/XR demos of my own (I'm more the person to do demos and prototypes and spread the word, rather than build long-lasting products) - but I'll leave that to another blog post that will be coming very soon!
So, for the moment, I have found a place I feel very comfortable with in the community: doing demos and presentations about WebVR or "Mixed Reality" (I still need to dig into AR, but I don't have fitting hardware for that yet) as well as giving people an overview of the Emerging Technologies "we" (MoCo and the Mozilla community) are bringing to the web, and trying to get people excited to use the technologies or hopefully even contribute to them. Being at the forefront of innovation for once feels really good, and I hope it lasts long!
Hello, Internet! Here we are with yet another Firefox Performance Update for your consumption. Hold onto your hats – we’re going in!
But first a word from our sponsor: ScriptPreloader!
A lot of the Firefox front-end is written using JavaScript. With the possible exception of system add-ons that update outside of the normal release cycle, these scripts tend to be the same until you update.
About a year ago, Mozilla developer Kris Maglione had an idea: let’s try to optimize browser start time by noticing which scripts are being loaded during start-up, and then converting those scripts into a binary representation1 that we can cache on disk. That way, next time we start up, we can just grab the cached binaries off of the disk, skip the parsing step and start executing the JavaScript right away.
Long-time Mozillians might know that we already do some aggressive caching to improve start time for things like XUL, XBL, manifests and other things that are read at start-up. I think we actually were already caching JavaScript files too – but I don’t think we were storing them pre-parsed. And the old caching stuff was definitely not caching scripts that were loading in content processes (since content processes didn’t exist when the old caching stuff was designed).
At any rate, my understanding is that the ScriptPreloader pays attention to script loads between main process start and the point where the first browser window fires the “browser-delayed-startup-finished” observer notification (after the window paints and does post-painting script loading). At that point, the ScriptPreloader examines the list of scripts that the parent and content processes have loaded, and2 writes their pre-parsed bytecode representation to disk.
After that cache is written, the next time the main process or content processes start up, the cache is checked for the binary data. If it exists, this means that we can skip the parsing step. The ScriptPreloader goes one step further and starts to “decode”3 that binary format off of the main thread, even before those scripts are requested. Then, when the scripts are finally requested, they’re very much ready to execute right away.
I’m now working on a series of patches in this bug that will widen the window of time where we note scripts that we can cache. This will hopefully improve the speed of privileged scripts that run up until the idle point of the first browser window.
And now for some Performance Project updates!
Early first blank paint (led by Florian Quèze)
User Research has hired a contractor to perform a study to validate our hypothesis that the early first blank paint perceived performance optimization will make Firefox seem like it’s starting faster. More data to come out of that soon!
Faster content process start-up time (led by Felipe Gomes)
The patches that Felipe wrote a few weeks back have landed and have had a positive impact! The proof is in the pudding – let’s look at some graphs:
The cpstartup impact. Those two clusters are test runs “before” and “after” Felipe’s patches landed, respectively.
The above graph shows a nice drop in the cpstartup Talos test. The cpstartup test measures the time it takes to boot up the content process and have it be ready to show you web pages.
This is a screen capture of a Base Content JS improvement in the AreWeSlimYet test. This graph measures the amount of memory that content processes consume via JavaScript not long after starting up.
In the graph above, we can see that the patches also helped reduce the memory that content processes use by default, by making more scripts only load when they’re needed.
It’s always nice to see our work have an impact in our graphs. Great work, Felipe! Keep it up!
The good news is that this appears to have had a significant and positive impact on tab switching: tab switch times went down, and the number of Nightly instances reporting tab switch spinners went down by about 10%. Great work, Doug!
A number of bugs were filed against the original bug due to some glitchy edge-cases that we don’t handle well just yet.
We also detected a ~8% resident memory regression in our automated testing suites. This was expected (keeping layers around isn’t free!) and gave us a sense of how much memory we might consume were we to enable this by default.
The experiment is concluded for now, and we’re going to disable the cache for a bit while we think about ways to improve the implementation.
ClientStorageTextureSource for macOS (led by Doug Thayer)
Swapping DataURLs for Blobs in Activity Stream (led by Jay Lim)
Jay’s patch to swap out DataURLs for Blobs for Activity Stream images has passed a first round of review from Mardak! He’s now waiting for a second review from k88hudson, and then hopefully this can land and give us a bit of a memory win. Having done some analysis, we expect this to buy back quite a bit of memory that was being contained within those long DataURL strings.
Caching Activity Stream JS in the JS Bytecode Cache (led by Jay Lim)
After examining the JavaScript Bytecode Cache that’s used for Web Content, Jay has determined that it’s really not the right mechanism for caching the Activity Stream scripts.
However, that ScriptPreloader that I was talking about earlier sounds like a much more reasonable candidate. Jay is now doing a deep dive on the ScriptPreloader to see whether or not the Activity Stream scripts are already being cached – and if not, why not.
Tab warming (led by Mike Conley)
No news is good news here. Tab warming continues to ride and no new bugs have been filed. The work to reduce the number of paints when warming tabs has stalled a bit while I dealt with a rather strange cpstartup Talos regression. Ultimately, I think I can get rid of the second paint when warming by keeping background tabs display port suppressed4, and then only triggering the display port unsuppression after a tab switch. This will happily take advantage of a painting mechanism that Doug Thayer put in as part of the LRU cache experiment.
Firefox’s Most Wanted: Performance Wins (led by YOU!)
Before we go into the grab-bag list of performance-related fixes – have you seen any patches landing that should positively impact Firefox’s performance? Let me know about it so I can include it in the list, and give appropriate shout-outs to all of the great work going on! That link again!
Grab-bag time
And now, without further ado, a list of performance work that took place in the tree:
Ryan Hunt made it so that paint threads (see Off Main Thread Painting) no longer block Display List and Frame Layer building. This means we can get more done before offloading instructions to the paint threads, and this gives paint threads more time to finish up their work, increasing the probability that they’ll be free by the time the transaction to the paint thread needs to take place.
This is an optimization that we do that shrinks the painted area to just the region that’s visible to the browser. We normally paint a bit outside the viewable area so that it’s ready when a user starts scrolling ↩
We’ve just released a new Thunderbird Conversations (previously known as Gmail Conversation View) with full support for Thunderbird 52. We’re sorry for the delay, but the good news is it should now work fine.
I’d like to thank Jonathan for letting me help out with the release process, and for all those who contributed to release or filed issues.
The add-on should work with the current Thunderbird Beta versions (56), but won’t currently work in Daily (57) due to some compatibility issues. We’re hoping to get those resolved in the next week or so.
If you want to help out with future releases, then find the source code here and come and help us with supporting users or fixing issues.
On Monday, I, along with several million other people, decided to view the Great American Eclipse. Since I presently live in Urbana, IL, that meant getting in my car and driving down I-57 towards Carbondale. This route is also what people from Chicago or Milwaukee would have taken, which means traffic was heavy. I ended up leaving around 5:45 AM, which put me among the last clutch of people leaving.
Our original destination was Goreville, IL (specifically, Ferne Clyffe State Park), but some people who arrived earlier got dissatisfied with the predicted cloudy forecast, so we moved the destination out to Cerulean, KY, which meant I ended up arriving around 11:00 AM, not much time before the partial eclipse started.
Partial eclipses are neat, but they're very much a see-them-once affair. When the moon first starts to cover the sun, you get a flurry of activity as everyone puts on the glasses, sees it, and then retreats back into the shade (it was 90°F, not at all comfortable in the sun). Then the temperature starts to drop—is that the eclipse, or this breeze that started up? As more and more gets covered, it starts to dim: I had the impression that a cloud had just passed in front of the sun, and I wanted to turn and look at that non-existent cloud. And as the sun really gets covered, trees start acting as pinhole cameras and the shadows take on a distinctive scalloped pattern.
A total eclipse though? Completely different. The immediate reaction of everyone in the group was to start planning to see the 2024 eclipse. For those of us who spent 10, 15, 20 hours trying to see 2-3 minutes of glory, the sentiment was not only that it was time well spent, but that it was worth doing again. If you missed the 2017 eclipse and are able to see the 2024 eclipse, I urge you to do so. Words and pictures simply do not do it justice.
What is the eclipse like? In the last seconds of partiality, everyone has their eyes, eclipse glasses on of course, staring at the sun. The thin crescent looks first like a side picture of an eyeball. As the time ticks by, the tendrils of orange slowly diminish until nothing can be seen—totality. Cries come out that it's safe to take the glasses off, but everyone is ripping them off anyways. Out come the camera phones, trying to capture that captivating image. That not-quite-perfect disk of black, floating in a sea of bright white wisps of the corona, not so much a circle as a stretched oval. For those who were quick enough, the Baily's beads can be seen. The photos, of course, are crap: the corona is still bright enough to blot out the dark disk of the moon.
Then, our attention is drawn away from the sun. It's cold. It's suddenly cold; the last moment of totality makes a huge difference. Probably something like 20°F off the normal high in that moment? Of course, it's dark. Not midnight, all-you-see-are-stars dark; it's more like a dusk dark. But unlike normal dusk, you can see the fringes of daylight in all directions. You can see some stars (or maybe that's just Venus; astronomy is not my strong suit), and of course a few planes are in the sky. One of them is just a moving, blinking light in the distance; another (chasing the eclipse?) is clearly visible with its contrail. And the silence. You don't notice the usual cacophony of sounds most of the time, but when everyone shushes for a moment, you hear the deafening silence of insects, of birds, of everything.
Naturally, we all point back to the total eclipse and stare at it for most of the short time. Everything else is just a distraction, after all. How long do we have? A minute. Still more time for staring. A running commentary on everything I've mentioned, all while that neck is craned skyward and away from the people you're talking to. When is it no longer safe to keep looking? Is it still safe—no orange in the eclipse glasses, should still be fine. How long do we need to look at the sun to damage our eyes? Have we done that already? Are the glasses themselves safe? As the moon moves off the sun, hold that stare until that last possible moment, catch the return of the Baily's beads. A bright spark of sun, the photosphere is made visible again, and then clamp the eyes shut as hard as possible while you fumble the glasses back on to confirm that orange is once again visible.
Finally, the rush out of town. There's a reason why everyone leaves after totality is over. Partial eclipses really aren't worth seeing twice, and we just saw one not five minutes ago. It's just the same thing in reverse. (And it's nice to get back in the car before the temperature gets warm again; my dark grey car was quite cool to the touch despite sitting in the sun for 2½ hours.) Forget trying to beat the traffic; you've got a 5-hour drive ahead of you anyways, and the traffic is going to keep pouring onto the roads over the next several hours (10 hours later, as I write this, the traffic is still bad on the eclipse exit routes). If you want to avoid it, you have to plan your route away from it instead.
I ended up using this route to get back, taking 5 hours 41 minutes and 51 seconds including a refueling stop and a bathroom break. So I don't know how bad I-57 was (I did hear there was a crash on I-57 pretty much just before I got on the road, but I didn't know that at the time), although I did see that I-69 was completely stopped when I crossed it. There were small slowdowns on the major Illinois state roads every time there was a stop sign, which could have been mitigated by stationing police cars at those intersections to effectively signalize them temporarily, but other than that, my trip home was free-flowing at the speed limit for the entire route.
Some things I've learned:
It's useful to have a GPS that doesn't require cellphone coverage to figure out your route.
It's useful to have paper maps to help plan a trip that's taking you well off the beaten path.
It's even more useful to have paper maps of the states you're in when doing that.
The Ohio River is much prettier near Cairo, IL than it is near Wheeling, WV.
The Tennessee River dam is also pretty.
Driving directions need to make the "I'm trying to avoid anything that smells like a freeway because it's going to be completely packed and impassable" mode easier to access.
Passing a car by crossing the broken yellow median will never not be scary.
Being passed by a car crossing the broken yellow median is still scary.
Driving on obscure Kentucky state roads while you're playing music is oddly peaceful and relaxing.
The best test for road hypnosis is seeing how you can drive a long, straight, flat, featureless road. You have not seen a long, straight, flat, featureless road until you've driven something like an obscure Illinois county road where the "long, straight" bit means "20 miles without even a hint of a curve" and the "featureless" means "you don't even see a house, shed, barn, or grain elevator to break up corn and soy fields." Interstates break up the straight bit a lot, and state highways tend to have lots of houses and small settlements on them that break up endless farm fields.
Police direction may not permit you to make your intended route work.
This is a repost from medium, where Arshad originally wrote the blog post.
In the past blog, I talked mostly about the development environment setup, but this blog will be about the react dialog development.
Since then I have been working on converting some more dialogs into React. So far I have converted three dialogs into their React equivalents: the calendar properties dialog, the calendar alarm dialog and the print dialog. The calendar alarm dialog and print dialog still need some work on state logic, but that is not something that will take much time. Here are some screenshots of these dialogs.
calendar-properties-dialog
print-dialog
calendar-alarm-dialog
While making React equivalents, I found out that XUL depends heavily on attributes and their values. HTML doesn’t work with attributes and their values in the same way XUL does: HTML allows attribute minimization, and with React there are some additional difficulties related to attributes. React ignores all non-default HTML attributes, so to add those attributes I have to set them explicitly using the setAttribute method on the element once it has mounted. Here is a short snippet of code which shows how I am adding custom HTML attributes and updating them in React.
class CalendarAlarmWidget extends React.Component {
  componentDidMount() {
    this.addAttributes(this.props);
  }

  componentWillReceiveProps(nextProps) {
    // need to call removeAttributes first
    // so that previous render attributes are removed
    this.removeAttributes();
    this.addAttributes(nextProps);
  }

  addAttributes(props) {
    // add attributes here
  }

  removeAttributes() {
    // remove attributes here
  }
}
XUL also has a dialog element which is used instead of window for dialog boxes. I have made its React equivalent as well, with nearly all the attributes and functionality that the XUL dialog element has. Since XUL uses a slightly different layout technique to position elements compared to HTML, I have dropped some of the layout-specific attributes. With the power of modern CSS it is quite easy to create the layout, so instead of controlling layout using attributes I am depending more on CSS to do these things. Some of the methods like centerWindowOnScreen and moveToAlertPosition depend on the parent XUL wrapper, so I have also dropped them in the React equivalent.
There are some elements in XUL whose HTML equivalents are not available, and for some XUL elements the HTML equivalents don’t have the same structure, so their appearance differs considerably. One perfect example is menulist, whose HTML equivalent is select. Unlike menulist, whose direct child is a menupopup that wraps all the menuitem elements, the select element directly wraps all the options, so the UI of select can’t be made to look exactly like menulist. option elements are also not customizable the way menuitem is, and they don’t support much styling. While it is helpful to have React components that behave similarly to their XUL counterparts, in the end only HTML will remain, so it is unavoidable that some features not useful for the new components will be dropped.
I have made some custom React elements to provide all the features that the existing dialogs provide, although I am still using the HTML select element in some places instead of the custom menulist component, because using JavaScript and extra CSS just to make the element look similar to its XUL equivalent is not worth it.
As each platform has its own specific look, there are naturally differences in the CSS rules. I have organized the files in a way that makes it easy to write rules common to all platforms, but also to add per-OS differences. A lot of the UI differences are handled automatically through -moz-appearance rules, which instruct the Mozilla platform to use OS styling to render the elements. The web app will automatically detect your OS, so you can see how the dialog will look on different platforms.
I thought it would be great to get quick suggestions and feedback on the UI of the dialogs from the community, so I have added a comment section on each dialog page. I will be adding more cool features to the web app that can help make progress on this project.
Thanks to BrowserStack for providing free OSS plans, now I can quickly check how my dialogs are looking on Windows and Mac.
Thanks to yulia [IRC nickname] for finding time to discuss the react implementation of dialog, I hope to have more react discussions in future :)
Feel free to check the dialogs on web app and comment if you have any questions.
This is a repost from medium, where Arshad originally wrote the blog post.
This summer I am working on a Thunderbird project — Convert XUL to HTML, as a Google Summer of Code 2017 candidate. I am really excited and thrilled to start my journey at Mozilla. I will be working on Mozilla Calendar add-on for Thunderbird aka Lightning. The goal of this project will be to convert XUL dialog boxes into their React versions.
Project Abstract:
Lightning has traditionally been using XUL for its user interface. To modernize, we would like to convert dialogs, tab content and other parts of the user interface to HTML. The new components should use web standards as much as possible, avoiding extensive use of third party libraries.
The second week of the coding period is about to end, and there is a lot to tell about the progress of the Convert XUL to HTML project. I was able to set up a balanced development environment and convert a dialog into React. Things are going well so far, as the time invested in setting up the development environment is bringing results.
I will start by telling a bit about the challenges that I faced and then a bit about the solutions that I worked out. Since Thunderbird doesn’t have any extra build step, it was very clear from the start that anything that needs an extra build/compile step is a NO for this project. That means I have to compromise on awesome features like hot-reloading, JSX, etc. that are often paired with React. Another minor issue that I faced was the styling of the components of the dialog box so that they look exactly like their XUL versions.
At first, I thought of going with the option of importing react and react-dom via script tags and writing the code without JSX in vanilla JS, but later I thought: why not automate away this difficulty? I set up Babel with the React preset and wrote a few lines of code to make a clean npm environment to do all these things. Since running Babel on the source directory only outputs the JS files, I wrote a few gulp tasks to copy the HTML and CSS files to the compiled JS directory.
It is kind of annoying to copy each file manually, so I opted to go with Gulp. I also wrote a bash script that removes the Babel scripts and edits the type of the main JavaScript files in the compiled directory’s HTML files. Now there is no extraneous code in the files of the compiled directory (dist).
Using Gulp, I can also live-reload the browser automatically whenever I make any changes to the source files; this is not as good as hot-reloading, but it’s better than manually hitting the refresh button.
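For a rough idea of what such a setup can look like (this is only a sketch, not the project’s actual gulpfile, and the browser-sync reload helper is an assumption):

const gulp = require("gulp");
const browserSync = require("browser-sync").create(); // assumed reload helper

// Copy HTML and CSS next to the Babel-compiled JS output.
gulp.task("copy-static", function () {
  return gulp.src("src/**/*.{html,css}").pipe(gulp.dest("dist"));
});

// Serve the compiled directory and reload the browser on changes.
gulp.task("watch", ["copy-static"], function () {
  browserSync.init({ server: "dist" });
  gulp.watch("src/**/*.{html,css}", ["copy-static"]).on("change", browserSync.reload);
});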
As a web developer, I never worried about the default styling of the browser, but for this project I have to depend entirely on the Firefox toolkit themes and Thunderbird CSS skins. It started to make sense after a few hours of work, and now I can create exactly the same layout and appearance of elements in React as they have in the XUL dialog boxes. All thanks go to the Thunderbird developer tools and DXR.
The dialog that my mentor Philipp and I decided to do first was the calendar properties dialog, as it is simple and would give me a comfortable start. This dialog is now completely done except for a few OS-specific CSS rules, which can be handled later after testing the dialog in Thunderbird. Working on this dialog was fun and easy, and I hope that continues.
Anyone can check the progress of the project by either checking out this repository or logging on to https://gsoc17-convert-xul-to-html.herokuapp.com. I have also created an iframe testing ground where a user can send and modify the state object of a dialog and open the dialog in an iframe. This page uses the same HTML5 postMessage API for communication between the iframe and its parent as it will use in Thunderbird dialog boxes, similar to how it already works for the event dialog from a past GSoC project. I am sure the testing ground will save a lot of time in debugging, and it clearly shows how things work internally within the dialog box. It is like a mini control dashboard for our dialog boxes.
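The communication pattern itself is plain postMessage; a simplified sketch of it (with an illustrative message shape and a hypothetical renderDialog entry point) looks like this:

// Parent page: send the dialog state into the iframe.
const frame = document.getElementById("dialog-frame");
frame.contentWindow.postMessage(
  { command: "loadState", state: { name: "Home", color: "#d8bfd8" } },
  "*" // a real integration would restrict this to the expected origin
);

// Inside the iframe: listen for the state and render the dialog from it.
window.addEventListener("message", event => {
  if (event.data && event.data.command === "loadState") {
    renderDialog(event.data.state);
  }
});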
We haven’t tested the current React dialog box in Thunderbird yet, but when integrating the React versions of the dialog boxes into Thunderbird, we will most likely not be using all these tools to generate the code, focusing instead on the minimal tools available in the Mozilla build system. We would like to hear suggestions from the Mozilla devtools folks to see if they have plans to improve tooling support and possibly allow JSX, as it is much easier to read than the plain JavaScript it gets converted to.
I am very excited for the next weeks, and I hope things keep going as well as they have been. Many thanks to my mentor Philipp for his continuous support, and to the Mozilla community for answering my questions on IRC. Any advice, suggestions and perhaps encouraging words are always welcome :)
My good friend Wsmwk is starting the test effort for Thunderbird 52, please give 15 minutes of your time to iron out and polish the next version of Thunderbird.
As discussed in the previous post, the HTML-based UI for editing events and tasks in a tab is still a work in progress that is in a fairly early stage and not something you could use yet. (However, for any curious folks living on the bleeding edge who might still want to check it out, the previous post also describes how to activate it.) This post relates to its implementation, namely the use of React, “a Javascript library for building user interfaces.”
For the HTML UI we decided to use React (but not JSX which is often paired with it). React basically provides a nice declarative way to define composable, reusable UI components (like a tab strip, a text box, or a drop down menu) that you use to create a UI. These are some of its main advantages over “raw” HTML. It’s also quite efficient / fast and is a library that does one thing well and can be combined with other technologies (as compared with more monolithic frameworks). I enjoyed using and learning about React. Once you understand its basic model of state management and how the components work it is not very difficult or complicated to use. I found its documentation to be quite good, and I liked how it lets you do everything in Javascript, since it generates the HTML for the UI dynamically.
One of the biggest differences when using React is that instead of storing state in DOM elements and querying them for their state (as we currently do), the app state is centralized in a top-level React component and from there it gets automatically distributed to various child components. When the state changes (on user input) React automatically updates the UI to reflect those changes. To do this it uses an internal “virtual DOM” which is basically a representation of the state of the DOM in Javascript. When there are changes it compares the previous version of that virtual DOM with the new version to decide what changes need to be made to the actual DOM. (Because the actual DOM is quite slow compared to Javascript, this approach gives React an advantage in terms of performance.) Centralizing the app state in this way simplifies things considerably. Direct interaction with DOM elements is not needed, and is actually an anti-pattern.
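As a tiny illustration of that model (simplified, and not code from the project), a top-level component owns the state, and a controlled input reads from it and writes back through setState:

class EventTitleEditor extends React.Component {
  constructor(props) {
    super(props);
    // All state lives here, not in the DOM element.
    this.state = { title: props.initialTitle || "" };
    this.handleChange = this.handleChange.bind(this);
  }
  handleChange(event) {
    // setState triggers React to re-render the affected parts of the UI.
    this.setState({ title: event.target.value });
  }
  render() {
    return React.createElement("input", {
      type: "text",
      value: this.state.title,
      onChange: this.handleChange
    });
  }
}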
One example of the power and flexibility that React offers is that I actually did the “responsive design” part of the HTML UI with React rather than CSS. The reason was that some of the UI components had to move to different positions in the UI when transitioning between the narrow and wide layouts for different window sizes. This was not really possible with CSS, at least not without overly complex workarounds. However, it was simple to do it with React because React can easily re-render the UI in any configuration you define, in this case in response to resizing the window past a certain threshold. (Once CSS grid layout is available this kind of repositioning will be straightforward to do with CSS.)
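Schematically (the 700px breakpoint and the layout helper functions below are made up for illustration), that responsive switch can be done by re-rendering on resize:

class ResponsiveItemLayout extends React.Component {
  constructor(props) {
    super(props);
    this.state = { wide: window.innerWidth > 700 };
    this.handleResize = this.handleResize.bind(this);
  }
  componentDidMount() {
    window.addEventListener("resize", this.handleResize);
  }
  componentWillUnmount() {
    window.removeEventListener("resize", this.handleResize);
  }
  handleResize() {
    this.setState({ wide: window.innerWidth > 700 });
  }
  render() {
    // Each helper returns a React element tree for one of the two layouts.
    return this.state.wide ? renderTwoColumnLayout(this.props) : renderNarrowLayout(this.props);
  }
}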
React’s different approach to state does present some challenges for using it with existing code. For this project at least it is not simply a matter of dropping it in and having it work, rather using it will entail some non-trivial code refactoring. Basically, the code will need to be separated out into different jobs. First there’s (1) interacting with the outside of the iframe (e.g. toolbar, menubar, statusbar) and (2) modifying and/or formatting the event or task data. These are needed for both the XUL and HTML UIs. Next there’s (3) updating and interacting with the XUL UI inside the iframe. Currently these things (1, 2, and 3) are usually closely intertwined, for example in a single function. Then there is (4) using React to define components and how they respond to changes to the app state, and (5) updating and interacting with the HTML UI inside the iframe (i.e. read from or write to the app state in the top-level React component). So there is some significant refactoring work to do, but after it is done the code should be more robust and maintainable.
Despite the refactoring work that may be involved, I think that React has a lot to offer for future UI work for Calendar or Thunderbird as an alternative to XUL. Especially for code that involves managing a lot of state (like the current project) using React and its approach should reduce complexity and make the code more maintainable. Also, because it mostly involves using Javascript this simplifies things for developers. When CSS grid layout is available that will also strengthen the case for HTML UI work since it will offer greater control over the layout and appearance of the UI.
I’ll close with links to two blog posts and a video about React that I found helpful:
It’s hard to believe it is already late August and this year’s Google Summer of Code is all wrapped up. The past couple of months have really flown by. In the previous post I summarized the feedback we received on the new UI design and discussed the work I’ve been doing to port the current UI (for editing events and tasks) to a tab. In this post I’ll describe how to try out this new feature in a development version of Thunderbird, and give an update on the HTML implementation of the new UI design. In my next post I’ll share some thoughts on using React for the HTML UI.
To try out editing events and tasks in a tab instead of in a dialog window you’ll need a development version of Thunderbird (aka: “Daily”). Since it is a development version you will want to use a separate profile and/or make sure your data is backed up. Once you have that all set up, you can turn on the “event in a tab” feature with a hidden preference. To access hidden preferences, go to Preferences > Advanced > Config Editor, and then search for “calendar.item.editInTab” and toggle it to true by double-clicking on it.
Or if that’s too much trouble you can just wait until it arrives in the next stable release of Thunderbird/Lightning. In the meantime, here’s what it looks like (click to enlarge):
The screenshot above shows the current XUL-based UI ported to a tab. I ended up not having much time to work on the new HTML-based UI (actually only a week or so) and did not get as far on it as I’d hoped — only as far as a basic and preliminary implementation, a starting point for further development rather than something that can be used today. For example, it does not yet support saving changes and not all of the data is loaded into the UI for a given event or task.
Some aspects do already work, like the responsive design where the UI changes to adapt to the width of the window, taking more advantage of the greater space available in a tab. Here are two screen shots that show the wide and narrow views (click to enlarge).
Even though the HTML UI is not ready for use yet, we decided to go ahead and land it in the code base as a work-in-progress for further development. So if you are curious to see where it stands, it can also be turned on with a hidden preference (“calendar.item.useNewItemUI”) in a current development version of Thunderbird, as described above. Again, be sure to use a separate profile and/or make sure your data is backed up.
For more technical details about the project, including some high-level documentation I wrote for this part of the code, see the meta bug, especially my comment #2 which summarizes the state of things as of the end of the Summer of Code period.
It was a great summer working on this project. I learned a lot and enjoyed contributing. As my time permits, I hope to continue to contribute and finish the implementation of the new UI. Many thanks to Google, Mozilla, and especially to my mentors Philipp Kewisch (Fallen) and MakeMyDay for their guidance and tireless willingness to answer my questions and review code. Also thanks to Richard Marti (Paenglab) for his help and feedback on the UI design work.
I wish there was another month of the official coding period to get the HTML implementation further along, but alas, so far we’ve only been able to help people manage their time, not actually generate more of it.
The clock has run out on Google Summer of Code 2016. In this post I’ll summarize the feedback we received on the new UI design and the work I’ve been doing since my last post.
Feedback on the New UI Design
A number of people shared their feedback on the new UI design by posting comments on the previous blog post. The response was generally positive. Here’s a brief summary:
One commenter advocated for keeping the current date/time picker design, while another just wanted to be sure to keep quick and easy text entry.
A question about how attendees’ availability would be shown (same as it is currently).
A request to consider following Google Calendar’s reminders UI.
A question about preserving the vertical scroll position across different tabs (this should not be a problem).
A concern about how the design would scale for very large numbers (say hundreds) of attendees, categories, reminders, etc. (See my reply.)
Thanks to everyone who took the time to share their thoughts. It is helpful to hear different views and get user input. If you have not weighed in yet, feel free to do so, as more feedback is always welcome. See the previous blog post for more details.
Coding the Summer Away
A lot has happened over the last couple months. The big news is that I finished porting the current UI from the window dialog to a tab. Here’s a screenshot of this XUL-based implementation of the UI in a tab (click to enlarge):
Getting this working in a really polished way took more time than I anticipated, largely because the code had to be refactored so that the majority of the UI lives inside an iframe. This entailed using asynchronous message passing for communication between the iframe’s contents and its outer parent context (e.g. toolbars, menus, statusbar, etc.), whether that context is a tab or a dialog window. While this is not a visible change, it was necessary to prepare the way for the new HTML-based design, where an HTML file will be loaded in the iframe instead of a XUL file.
Along with the iframe refactoring, there are also just a lot of details that go into providing an ideal user experience, all the little things we tend to take for granted when using software. Here’s a list of some of these things that I worked on over the last months for the XUL implementation:
when switching tabs, update the toolbar and statusbar to reflect the current tab
persist open tabs across application restarts (which requires serializing the tab state)
ask the user about saving changes before closing a tab, before closing the application window, and before quitting the application
allow customizing toolbars with the new iframe setup
provide a default window dialog height and width with the new iframe setup
display icons for tabs and related CSS/style work
get the relevant ‘Events and Tasks’ menu items to work for a task in a tab
allow hiding and showing the toolbar from the view > toolbars menu
if the user has customized their toolbar for the window dialog, migrate those settings to the tab toolbar on upgrade
fix existing mozmill tests so they work with the new iframe setup
test for regressions in SeaMonkey
In the next two posts I’ll describe how to try out this new feature in a development version of Thunderbird, discuss the HTML implementation of the new UI design, and share some thoughts on using React for the HTML implementation.
As you can see on the Event in a Tab wiki page, I have created a number of mockups, labeled A through N, for the new UI for creating, viewing, and editing calendar events and tasks. (This has given me a lot of practice using Inkscape!) The final design will be implemented in the second phase of the project. So far the revisions have been based on valuable feedback from Paenglab and MakeMyDay (thanks!), and we are now seeking broader feedback from users on the latest and greatest mockup “N” (click to view full size):
Event in a Tab, UI Design, Mockup “N”
Please take a look and send any feedback, comments, suggestions, questions, etc. to the calendar mailing list / newsgroup where we will be discussing the design, or you can leave a comment on this blog post, send a private email to mozilla@kewis.ch, or reach us via IRC (in Mozilla’s #calendar channel).
Here are some notes and details about the behavior of the proposed UI that are not apparent from a static image.
The mockup is intended as a relatively rough “wire frame” to show layout and it only approximates spacing, sizing, and aesthetic details. Unless otherwise noted, functionality is the same as in the current Lightning add-on.
A responsive design approach will be used to implement this UI in HTML. As the window expands horizontally, the elements will expand with it up to a breakpoint where the two-column “tab” layout goes into effect. Then the elements will continue to expand in both of the columns, up to a certain maximum limit at which they would expand no further. (Having this limit will keep things more focused on very wide monitors/windows.)
For vertical scrolling in a tab… Categories, Reminders, Attachments, Attendees, and Description can expand to take up as much vertical space as necessary to show all of their content. In most cases, where there are only a small number of these items, there will be enough room on the page to show them all without any scrolling. In less common cases where there are many items, the content of the tab will grow taller until it no longer fits vertically, and then the whole tab will become scrollable. (The toolbar at the top, with the buttons like “Save and Close,” will not scroll, remaining in place, still easily accessible.) This approach makes it possible to view all of the items at once when there are many of them (instead of having smaller boxes around each of these elements that are each independently scrollable). This “whole tab scrolling” approach is how it works in Google Calendar.
For vertical scrolling in a dialog window… When the contents of the tabbed box (Reminders, Attachments, Attendees, and Description) becomes too big to fit vertically, the tabbed box becomes scrollable. (Suggestions are welcome for the name of the “More” tab in the window dialog.)
The mockup shows the new date/time picker that is being developed by Mozilla. It remains to be seen whether it will be available in time for use in this project. Another possibility is the date/time picker developed by Fastmail.
Progress Report on Coding
Besides working on the design for the UI, I have continued to work on porting the current event dialog UI to a tab. I created a bug for this part of the first phase of the project, posted my first work-in-progress patch there, and am now working on the next iteration based on the feedback.
This work includes refactoring the current event dialog’s XUL file into more than one file to separate the main part of the UI from its menu bar, toolbar, and status bar items. This more modular arrangement will make it possible for the menu bar, toolbar, and status bar items to appear in the correct places in the main Thunderbird window when displaying the UI in a tab. This will solve the problem of the doubled status bar and menu bar in my first patch.
The next patch will also have a hidden preference (accessible via “about:config” but eventually to be added to Lightning’s preferences UI) that determines whether event and task dialogs are opened in a window or a tab by default.
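To make that concrete, here is a minimal sketch of how such a hidden boolean preference could be read from chrome code. The pref name and the two helper functions are hypothetical placeholders for illustration, not the names used in the actual patch.

// Sketch only: the pref name and the two helpers are hypothetical.
Components.utils.import("resource://gre/modules/Services.jsm");

function openCalendarItem(itemArgs) {
  let useTab = false;
  try {
    useTab = Services.prefs.getBoolPref("calendar.item.editInTab");
  } catch (e) {
    // Hidden pref not set yet; fall back to the dialog window.
  }
  if (useTab) {
    openItemInTab(itemArgs);     // hypothetical: load the iframe-based tab
  } else {
    openItemInDialog(itemArgs);  // hypothetical: the current window dialog
  }
}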
So overall, things are progressing well, which is a good thing since there is only about a week left before the GSoC midterm milestone, and the goal is to have phase one of the project completed by that point. After I have finished this initial “phase one” patch, and any follow-up work that needs to be done for it, we will reach a decision about whether to use XUL, Web Components, React.js, or “plain vanilla” HTML for the implementation of the new UI design, and then start working on implementing it.
Time for a progress report after my first week or so working on the Event in a Tab GSoC project. Things are going well so far. In short, I have the current event and task dialogs opening in a tab rather than a window, and I can create and edit tasks and events in a tab. While not everything is working yet, most things already are.
The trickiest part has been working with XUL, since I am not as familiar with it as I am with JavaScript. With some help from Fallen on IRC I figured out how to register a new XUL document that contains an iframe and how to load another XUL file into this iframe. For an event or task that is editable, one XUL file is loaded (calendar-event-dialog.xul), but if it is read-only, then a different XUL file is loaded (calendar-summary-dialog.xul).
Initially I used the tabmail interface’s “shared tab” option — where a single XUL file is loaded and then its appearance and content are modified to create the appearance of completely different tabs. (This is how Thunderbird’s “3-pane” and “single message” tabs work, and also Lightning’s “Calendar” and “Tasks” tabs.) However, this did not work when you opened multiple events/tasks in separate tabs. So I switched to the tabmail interface’s other option, which loads each tab separately as you would expect, and everything is now working fine.
The next step was to figure out how to access the data for an event (or task) from the tab. I actually figured out two ways to do this. The first was via the tabmail interface in the way that it is set up to work (i.e. “tabmail.currentTabInfo”). That meant that the current event dialog code (that referenced the data as a property of the “window” object) had to be changed to access it from this new location. But that is not so good since we will be supporting both window and tab options and it would be nice if the same code could “just work” for both cases as much as possible.
So I figured out a second way to provide access to the data by just putting it in the right place relative to the iframe, so that the current code could reach it without having to be modified (i.e. still as a property of the “window” object, but with the “window” being relative to the iframe). This is a better approach since the same code will work for both cases (events/tasks in a dialog window or in a tab).
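As a very rough sketch of the idea, the tab code can stash the item data on the iframe’s content window so the existing dialog code keeps reading it from its own “window” object. The element ID and the shape of the arguments below are hypothetical, and the real patch has to coordinate the timing so the property is in place before the dialog’s own load handler runs.

// Sketch only: "calendarItemFrame" and the args shape are hypothetical.
let iframe = document.getElementById("calendarItemFrame");
iframe.setAttribute("src", "chrome://calendar/content/calendar-event-dialog.xul");
// From the iframe's point of view, "window" is its own content window, so the
// unmodified dialog code can keep doing: let args = window.arguments[0];
iframe.contentWindow.arguments = [{ calendarEvent: eventToEdit, mode: "modify" }];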
One small thing I implemented via the tabmail interface is that the title of the tab indicates whether you are creating a new item or modifying an existing one and whether the item is an event or a task. However, I will probably end up re-working this because the current dialog window code updates the title of the window as you change the title of the event/task, and that code can probably also be used to generate the initial title of the tab. This is something I will be looking into as I start to really work with the event dialog code.
On the UI design side of things, I created three new mockups based on some more feedback from Richard Marti and MakeMyDay. Part of the challenge is that there are a number of elements that vary in size depending on how many items they contain (e.g. reminders, categories, attachments, attendees). Mockups K and L were my attempt at a slightly different approach for handling this, although we will be following the design of mockup J going forward. You can take a look at these mockups and read notes about them on the wiki page.
The next steps will be to push toward a more finalized design and seek broader feedback on it. On the coding side I will be identifying where things are not working yet and getting them to work. For example, the code for closing a window does not work from a tab and the status bar items are appearing just above the status bar (at the bottom of the window) because of the iframe.
So far I think things are going well. It is really encouraging that I am already able to create and modify events and tasks from a tab and that most of the basic functionality appears to be working fine.
Today is the first day of the “coding period” for Google Summer of Code 2016 and I’m excited to be working on the “Event in a Tab” project for Mozilla Calendar. The past month of the “community bonding period” has flown by as I made various preparations for the summer ahead. This post covers what I’ve been up to and my experience so far.
After the exciting news of my acceptance for GSoC I knew it was time to retire my venerable 2008 Apple laptop, which had gotten somewhat slow and “long in the tooth.” Soon, with a newly refurbished 2014 laptop from eBay in hand, I made the switch to GNU/Linux, dual-booting the latest Ubuntu 16.04. Having contributed to LilyPond before, it felt familiar to fire up a terminal, follow the instructions for setting up my development environment, and build Thunderbird/Lightning. (I was even able to make a few improvements to the documentation – removing some obsolete info, fixing a typo, etc.) One difference from what I’m used to is using Mercurial instead of Git, although the two seem fairly similar. When I was preparing my application for GSoC my build succeeded but I only got a blank white window when opening Thunderbird. This time, thanks to some guidance from my mentor Philipp about selecting the revision to build, everything worked without any problems.
One of the highlights of the bonding period was meeting my mentors Philipp Kewisch (primary mentor) and MakeMyDay (secondary mentor). We had a video chat meeting to discuss the project and get me up to speed. They have been really supportive and helpful and I feel confident about the months ahead knowing that they “have my back.” That same day I also listened in on the Thunderbird meeting with Simon Phipps answering questions about his report on potential future legal homes for Thunderbird, which was an interesting discussion.
At this point I am feeling pretty well integrated into the Mozilla infrastructure after setting up a number of accounts – for Bugzilla, MDN, the Mozilla wiki, an LDAP account for making blog posts and later for commit access, etc. I got my feet wet with IRC (nick: pmorris), introduced myself on the Calendar dev team’s mailing list, and created a tracker bug and a wiki page for the project.
Following the Mozilla way of working in the open, the wiki page provides a public place to document the high-level details related to design, implementation, and the overall project plan. If you want to learn more about this “Event in a Tab” project, check out the wiki page. It contains the mockup design that I made when applying for GSoC and my notes on the thinking behind it. I shared these with Richard Marti who is the resident expert on UI/UX for Thunderbird/Calendar and he gave me some good feedback and suggestions. I made a number of additional mockups for another round of feedback as we iterate towards the final design. One thing I have learned is that this kind of UI/UX design work is harder than it looks!
Additionally, I have been getting oriented with the code base and figuring out the first steps for the coding period, reading through XUL documentation and learning about Web Components and React, which are two options for an HTML implementation. It turns out there is a student team working on a new version of Thunderbird’s address book and they are also interested in using React, so there will be a larger conversation with the Thunderbird and Calendar dev teams about this. (Apparently React is already being used by the Developer Tools team and the Firefox Hello team.)
I think that about covers it for now. I’m excited for the coding period to get underway and grateful for the opportunity to work on this project. I’ll be posting updates to this blog under the “gsoc” tag, so you can follow my progress here.
It is about time for a new blog post. I know it has been a while and there are certainly some notable events I could have blogged about, but in today’s fast-paced world I have preferred quick Twitter messages.
The exciting news I would like to spread today is that we have a new Google Summer of Code student for this summer! May I introduce to you Paul Morris, who I believe is an awesome candidate. Here is a little information about Paul:
I am currently finishing my graduate degree and in my spare time I like to play music and work on alternative music notation systems (see Clairnote). I have written a few Firefox add-ons and I was interested in the “Event in a Tab” project because I wanted to contribute to Mozilla and to Thunderbird/Calendar, which is used by millions of people and fills an important niche. It was also a good fit for my skills and an opportunity to learn more about using HTML/CSS/JavaScript for user interfaces.
Paul will be working on the Event in a Tab project, which aims to allow opening a calendar event or task in a tab, instead of in the current event dialog. Just imagine the endless possibilities we’d have with so much space! In the end you will be able to view events and tasks both in the traditional dialog and in a tab, depending on your preference and the situation you are in.
The project will have two phases, the first taking the current event dialog code and UI as is and making it possible to open it in a tab. The textboxes will inevitably be fairly wide, but I believe this is an important first step and gives users a workable result early on.
Once this is done, the second step is to re-implement the dialog using HTML instead of XUL, with a new layout that is made for the extra space we have in a tab. The layout should be adaptable, so that when the window is resized or the event is opened in a narrow dialog, the elements fall into place, just like you’d experience in a responsively designed website. You can read more about the project on the wiki.
Paul has already made some great UI mock-ups in his proposal; we will be going through these with the Thunderbird UI experts to make sure we can provide you with the best experience possible. I am sure we will share some screenshots on the blog once the re-implementation phase comes closer.
Paul will be using this blog to give updates about his progress. The coding phase is about to start on May 22nd after which posts will become more frequent. Please join me in welcoming Paul and wishing him all the best for the summer!
Bug 1162148 – Warning: Identity file /builds/.ssh/{ffxbld_rsa,tbirdbld_dsa} not accessible: No such file or directory (won’t be needed after esr38, fixed in 42) – status?
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status TBD – bug 1176748 – fix main thread proxies to the migration code (jorgk and m_kato helped in past)
proxy (eg causing foxyproxy problems)
topcrash bug 1149287 is **31% of our crashes** – see below
bug 1211291 (esr38, unknown in esr45) – Folders are visible, but messages are not. (?related to bug 1211358 lightning chrome.manifest not updated in 38.3.0 ?)
nobody has a clue about what to do with this, leaning toward shipping 38.7.0 with no fix. (wsmwk) agree, ship without fixing it
Problems updating to 38.6 to do with Lightning bug 1249894 and its five duplicates. Two users attached their extensions folder.
Isn’t this the same as the one above, bug 1211291?
donation website is “live” but procedures to report and spend income are not established. Meeting on Thursday 2016-04-07 with MoFo admin on this.
beginning to investigate bug 1260724 IMAP failing with godaddy servers in TB 38
investigated and landed bug 1250723 (esr45) – startup CRASH: C-C TB ASSERTION: can’t be your own parent aka morkTable::AddRow (v45 topcrash)
releases
I have a huge project to convert ExQuilla to JS (prototype for a possible similar conversion for Thunderbird) that will take the majority of my time for the next six months.
Question Time
rkent, there is no 64-bit version of TB 45 for Windows. It seems 64-bit builds exist only for 46+, but the next public release will be 52. Is there a possibility of compiling 45 in 64-bit?
Someone needs to work on the release engineering to figure out how to enable those builds.
Help Wanted
Accessibility lead
Person comfortable (not necessarily a technical expert) with Core-type issues. Example: graphics knowledge to guide resolution of bug 1195947, Thunderbird hardware acceleration (HWA) issues
Lead to run the donation campaign after TB 45 is released.
tl;dr: glodastrophe, the experimental entirely-client-side JS desktop-ish email app, now supports Vega-based visualizations in addition to new support infrastructure for extension-y things and creating derived views based on the search/filter infrastructure.
Shareable email workflows (credit to :davida). If you could figure out how to set up your email client in a way that worked for you, you should be able to share that with others in a way that doesn’t require them to manually duplicate your efforts and ideally without you having to write code. (And ideally without anyone having to review code/anything in order to ensure there are no privacy or security problems in the workflow.)
Useful email visualizations. While in the end, the only visualization ever shipped with Thunderbird was the simple timeline view of the faceted global search, various experiments happened along the way, some abandoned. For example, the following screenshot shows one of the earlier stages of faceted search development where each facet attempted to visualize the relative proportion of messages sharing that facet.
At the time, the protovis JS visualization library was the state of the art. Its successor, the amazing, continually evolving d3, has eclipsed it. d3, being a JS library, requires someone to write JS code. A visualization written directly in JS runs into the whole code review issue. What would be ideal is a means of specifying visualizations that is substantially more inert and easy to sandbox.
Enter Vega, a visualization grammar that can be expressed in JSON and can not only define “simple” static visualizations, but also mind-blowing gapminder-style interactive visualizations. Also, it has some very clever dataflow stuff under the hood and builds on d3 and its well-proven magic. I performed a fairly extensive survey of the current visualization, faceting, and data processing options to help bring visualizations and faceted filtered search to glodastrophe and other potential gaia mail consumers like the Firefox OS Gaia Mail App.
Digression: Two relevant significant changes in how the gaia mail backend was designed compared to its predecessor Thunderbird (and its global database) are:
As much as possible is done in DOM/Web Worker(s). This greatly assists in UI responsiveness. Thunderbird has to do most things on the main thread because of hard-to-unwind implementation choices that permeate the codebase.
It’s assumed that the local mail client may only have a subset of the messages known to the server, that the server may be smart, and that it’s possible to convince servers to support new functionality. In many ways, this is still aspirational (the backend has not yet implemented search on server), but the architecture has always kept this in mind.
In terms of visualizations, what this means is that we pre-chew as much of the data in the worker as we can, drastically reducing both the amount of computation that needs to happen on the main (page) thread and the amount of data we have to send to it. It also means that we could potentially farm all of this out to the server if its search capabilities are sufficiently advanced. And/or the backend could cache previous results.
For example, in the faceted visualizations on the sidebar (placed side-by-side here):
In the “Prolific Authors” visualization definition, the backend in the worker constructs a Vega dataflow (only!). The search/filter mechanism is spun up and the visualization’s data gathering needs specify that we will load the messages that belong to each conversation in consideration. Then for each message we extract the author and age of the message and feed that to the dataflow graph. The data transforms bin the messages by date, facet the messages by author, and aggregate the message bins within each author. We then sort the authors by the number of messages they authored, and limit it to the top 5 authors which we then alphabetically sort. If we were doing this on the front-end, we’d have to send all N messages from the back-end. Instead, we send over just 5 histograms with a maximum of 60 data-points in each histogram, one per bin.
Same deal with “Prolific domains”, but we extract the author’s mail domain and aggregate based on that.
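For a sense of what that reduction buys, here is a plain-JavaScript sketch of a comparable per-author, binned-by-date aggregation done entirely in the worker. It is illustrative only (the real backend builds a Vega dataflow rather than hand-rolling this), and the field names, bin count, and author limit are assumptions.

// Sketch only: illustrates the worker-side reduction, not the real gloda/Vega code.
// messages: [{ author: "alice@example.com", date: someDateObject }, ...]
function prolificAuthorHistograms(messages, binCount, topN) {
  const newest = Math.max(...messages.map(m => m.date.getTime()));
  const oldest = Math.min(...messages.map(m => m.date.getTime()));
  const binSize = (newest - oldest) / binCount || 1;

  // Facet by author, binning each author's messages by date.
  const byAuthor = new Map();
  for (const msg of messages) {
    let bins = byAuthor.get(msg.author);
    if (!bins) {
      bins = new Array(binCount).fill(0);
      byAuthor.set(msg.author, bins);
    }
    const bin = Math.min(binCount - 1,
                         Math.floor((msg.date.getTime() - oldest) / binSize));
    bins[bin]++;
  }

  // Keep the topN most prolific authors, then present them alphabetically.
  const total = bins => bins.reduce((sum, n) => sum + n, 0);
  return [...byAuthor.entries()]
    .sort((a, b) => total(b[1]) - total(a[1]))
    .slice(0, topN)
    .sort((a, b) => a[0].localeCompare(b[0]))
    .map(([author, bins]) => ({ author, bins }));
}

// Only topN small histograms (e.g. 5 authors x 60 bins) cross the worker
// boundary, never the N raw messages.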
Similarly, the overview Authored content size over time heatmap visualization sends only the aggregated heatmap bins over the wire, not all the messages. Elaborating, for each message body part, we (now) compute an estimate of the number of actual “fresh” content bytes in the message. Anything we can detect as a quote or a mailing list footer or multiple paragraphs of legal disclaimers doesn’t count. The x-axis bins by time; now is on the right, the oldest considered message is on the left. The y-axis bins by the log of the authored content size. Messages with zero new bytes are at the bottom, massive essays are at the top. The current visualization is useless, but I think the ingredients can and will be used to create something more informative.
Other notable glodastrophe changes since the last blog post:
Front-end state management is now done using redux
The Material UI React library has been adopted for UI widget purposes, though the conversation and message summaries still need to be overhauled.
React was upgraded
A war was fought with flexbox and flexbox won. Hard-coding and calc() are the only reason the visualizations look reasonably sized.
Webpack is now used for bundling in order to facilitate all of these upgrades and reduce potential contributor friction.
bug 1250723 (esr45) – startup CRASH: C-C TB ASSERTION: can’t be your own parent aka morkTable::AddRow (v45 topcrash)
rkent is taking this, reproducible result of exceeding limit of key size
bug 1251120 (esr45)- crash in nsMsgI18NConvertFromUnicode (v45 topcrash)
We really need a reproducible case, individual messages don’t do the trick. (wsmwk reproduces 100%)
bug 1211291 (esr38, unknown in esr45) – Folders are visible, but messages are not. (?related to bug 1211358 lightning chrome.manifest not updated in 38.3.0 ?)
nobody has a clue about what to do with this, leaning toward shipping 38.7.0 with no fix. (wsmwk) agree, ship without fixing it
Bug 1162148 – Warning: Identity file /builds/.ssh/{ffxbld_rsa,tbirdbld_dsa} not accessible: No such file or directory (won’t be needed after esr38, fixed in 42) – status?
Problems updating to 38.6 to do with Lightning bug 1249894 and its five duplicates. Two users attached their extensions folder.
Isn’t this the same as the one above, bug 1211291?
important (but not top critical)
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status TBD – bug 1176748 – fix main thread proxies to the migration code (jorgk and m_kato helped in past)
proxy (eg causing foxyproxy problems)
topcrash bug 1149287 is **31% of our crashes** – see below
Person comfortable (not necessarily a technical expert) with Core-type issues. Example: graphics knowledge to guide resolution of bug 1195947, Thunderbird hardware acceleration (HWA) issues
We’ve been overhauling the Firefox OS Gaia Email app and its back-end to understand email conversations. I also created a react.js-based desktop-ish development UI, glodastrophe, that consumes the same back-end.
My first attempt at summaries for glodastrophe was the following:
The back-end derives a conversation summary object from all of the messages that make up the conversation whenever any message in the conversation changes. While there are some things that are always computed (the number of messages in the conversation, whether there are any unread messages, any starred/flagged messages, etc.), the back-end also provides hooks for the front-end to provide application logic to do its own processing to meet its UI needs.
In the case of this conversation summary, the application logic finds the first 3 unread messages in the conversation and stashes their date, author, and extracted snippet (if any) in a list of “tidbits”. This also is used to determine the height of the conversation summary in the conversation list. (The virtual list is aware of a quantized coordinate space where each conversation summary object is between 1 and 4 units high in this case.)
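A rough sketch of what such application-supplied summary logic might look like (the function shape and the message fields are hypothetical, just to illustrate the idea, not the actual gaia mail hook API):

// Sketch only: illustrative summarizer, not the real back-end hook.
const MAX_TIDBITS = 3;

function summarizeConversation(messages) {
  // Collect up to three unread messages to show as "tidbits".
  const tidbits = [];
  for (const msg of messages) {
    if (!msg.isRead && tidbits.length < MAX_TIDBITS) {
      tidbits.push({
        date: msg.date,
        author: msg.author,
        snippet: msg.snippet || null,
      });
    }
  }
  return {
    tidbits,
    // The summary occupies 1-4 units in the virtual list's quantized
    // coordinate space: one base unit plus one per tidbit.
    heightUnits: 1 + tidbits.length,
  };
}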
While this is interesting because it’s something Thunderbird’s thread pane could not do, it’s not clear that the tidbits are an efficient use of screen real-estate. At least not when the number of unread messages in the conversation exceeds the 3 we cap the tidbits at.
But our app logic can actually do anything it wants. It could, say, establish the threading relationship of the messages in the conversation to enable us to make a highly dubious visualization of the thread structure in the conversation, as well as show the activity in the conversation over time. Much like the visualization you already saw before you read this sentence. We can see the rhythm of the conversation. We can know whether this is a highly active conversation that’s still ongoing, or just one that someone has brought back to life.
Here’s the same visualization where we still use the d3 cluster layout but don’t clobber the x-position with our manual-quasi-logarithmic time-based scale:
Disclaimer: This visualization is ridiculously impractical in cases where a conversation has only a small number of messages. But a neat thing is that the application logic could decide to use textual tidbits for small numbers of unread and a cool graph for larger numbers. The graph’s vertical height could even vary based on the number of messages in the conversation. Or the visualization could use thread-arcs if you like visualizations but want them based on actual research.
If you’re interested in the moving pieces in the implementation, they’re here:
bug 1211160 – Why are updates with the calendar add-on messing with Thunderbird?
I (rkent) hope to respond to review comments today.
Action items from last meetings
Thunderbird Council reorganization: we’ll discuss this in the Council directly and do some reorganization. This is not a forever governance plan, but we need to be practical given the many demands of the moment.
I (rkent) sent a proposal to the Council two weeks ago, now I need to badger a few people to give responses and decisions. Gerv has a process of voting to approve the slate using tb-planning that he will be running.
38.3.1 bug 1211160 calendar 4.0.3.1 is a workaround for 38.3.0 — Thunderbird Version 38.3 Buttons not working menu items // bug 1211291 – Folders are visible, but messages are not. (?related to bug 1211358 lightning chrome.manifest not updated in 38.3.0 ?)
rkent has patch, see above
important (but not top critical)
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status TBD – bug 1176748 – fix main thread proxies to the migration code (jorgk and m_kato have done such fixes in the past)
filelink, proxy,
topcrash bug 1149287 is **31% of our crashes** – see below
bug 597369 – Override charset carried over to the next message processed (part 2).
Ongoing:
Investigating why the drafts folder gets corrupted every few months (bug 1216914)
Other work: Aurora Landings
jcranmer
Review queue is still backlogged
Most complicated review at this point is the JSAccount stuff
Looking at JS-ification concerns, see mdat/tb-planning post
Current status of JS protocol libraries:
SASL: Complete, except for NTLM, GSSAPI, and EXTERNAL (not that it’s easy to pull in)
Email-socket: prototyping along with NNTP client
NNTP client: I want to test replacing our C++ implementation with this once JSAccount is finished
JSMime: I have untested prototypes for message composition (see below)
Working on improving JS send implementation
Current idea is to try to use an internal interface to break the patches into smaller sets
PSA: if you’re touching anything in nsMsgSend or nsMsgCompUtils (or anything else in mailnews/compose except for nsMsgCompose), PLEASE ADD AN XPCSHELL TEST (or a mozmill test if not possible via xpcshell)… if you don’t, it will probably break within a year!
sshagarwal
Apologies for being invisible for quite some time
Completed the feature request to persist the delivery format status in drafts: bug 1202165
Will be working on Address book visibility issues in AB window and composition sidebar
aceman
We need a decision on {{bug|584313}} from a mailnews peer on whether it can go in (it was looked at by Neil and the comments were resolved)
Magnus to make the call if needed.
Bug 1202165 is ready in code (needs review from jcranmer, who gave r- due to a test failure). I am just working on some additional mozmill tests.
European meet-up dates:
TB is looking for events/dates in the March-to-May timeframe that would allow the team to meet with possible European partners in Europe. BA to send an e-mail to the TB team with upcoming European events.
Person comfortable (not necessarily a technical expert) with Core-type issues. Example: graphics knowledge to guide resolution of bug 1195947, Thunderbird hardware acceleration (HWA) issues
Over the past couple of days it has become apparent that there has been an issue with IMAP accounts hosted on office365 and outlook.com. Support has received a number of complaints of subscribed folders disappearing from Thunderbird and attempts to re-subscribe failing.
A workaround has been identified by turning off the Thunderbird option "Show only subscribed folders".
To turn off this option:
Right click the account in the folder pane.
Select the menu entry Settings
In the server settings for the account, select the Advanced button.
In the advanced account settings dialog, un-check the option "Show only subscribed folders"
----o0O0o----
Microsoft have now acknowledged the issue as EX41924 and are posting updates here. At the time of writing this post the latest update (update 4) is suggesting a code solution has been developed and is currently being deployed across the office365 and outlook.com web sites to remediate the failure their previous patch caused. Affected users that wish to follow the Microsoft support thread on community.office365.com can find it here.
While it is unfortunate, there is nothing the Thunderbird community can do in this case other than offer the workaround until such time as Microsoft resumes normal service.
governance, futures (rkent: nothing I want to volunteer, but I can answer questions).
builds
Action items from last meetings
Thunderbird Council reorganization: we’ll discuss this in the Council directly and do some reorganization. This is not a forever governance plan, but we need to be practical given the many demands of the moment.
38.3.1 bug 1211160 calendar 4.0.3.1 is a workaround for 38.3.0 — Thunderbird Version 38.3 Buttons not working menu items // bug 1211291 – Folders are visible, but messages are not. (?related to bug 1211358 lightning chrome.manifest not updated in 38.3.0 ?)
nightly (3 bugs blocking builds, all are assigned) – bug 1195442 – Mac builds broken ~1 month on nightly
Fixed – bug 1183490 – (dataloss) New emails do not adhere to sort by order received
Plan is to understand issues in converting completely to nextKey = ++lastKey for non-IMAP, then plan how much to implement for 38.3.0. A simple fix of the dataloss is probable. I have a try patch that does the nextKey = ++lastKey that works; now I need to think about the patch for 38 (bug 1202105 needs to go to trunk and 38)
update 2015-10-20, patches have been posted but waiting for reviews for 2 weeks (aceman will look at those)
update 2015-11-17, review done 11-14, need to update and land.
important (but not top critical)
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status TBD – bug 1176748 – fix main thread proxies to the migration code (jorgk and m_kato have done such fixes in the past)
filelink, proxy,
topcrash bug 1149287 is **31% of our crashes** – see below
Regarding the question on Enigmail/p≡p, the installation will be a single executable file that will include everything if it is installed as an add-on. In Microsoft Outlook, a similar p≡p plug-in installation takes 5 clicks and 10 seconds without asking a single question. Patrick/Volker will provide more input on this via the wiki once it is set up.
Person comfortable (not necessarily a technical expert) with Core-type issues. Example: graphics knowledge to guide resolution of bug 1195947, Thunderbird hardware acceleration (HWA) issues
Thunderbird Council reorganization: we’ll discuss this in the Council directly and do some reorganization. This is not a forever governance plan, but we need to be practical given the many demands of the moment.
38.3.1 bug 1211160 calendar 4.0.3.1 is a workaround for 38.3.0 — Thunderbird Version 38.3 Buttons not working menu items // bug 1211291 – Folders are visible, but messages are not. (?related to bug 1211358 lightning chrome.manifest not updated in 38.3.0 ?)
nightly (3 bugs blocking builds, all are assigned) – bug 1195442 – Mac builds broken ~1 month on nightly
rkent is working on it – bug 1183490 – (dataloss) New emails do not adhere to sort by order received
Plan is to understand issues in converting completely to nextKey = ++lastKey for non-IMAP, then plan how much to implement for 38.3.0. A simple fix of the dataloss is probable. I have a try patch that does the nextKey = ++lastKey that works; now I need to think about the patch for 38 (bug 1202105 needs to go to trunk and 38)
update 2015-10-20, patches have been posted but waiting for reviews for 2 weeks (aceman will look at those)
update 2015-11-17, review done 11-14, need to update and land.
important (but not top critical)
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status TBD – bug 1176748 – fix main thread proxies to the migration code (jorgk and m_kato have done such fixes in the past)
filelink, proxy,
topcrash bug 1149287 is **31% of our crashes** – see below
I was away for a full week for Thanksgiving, just back today, and have an ear infection as well, so it has not been a good week. Responding to the pEp proposals and to Mitchell’s post are my main priorities at the moment.
Persons comfortable (not necessarily technical experts) with Core-type issues. Example: graphics knowledge to guide resolution of bug 1195947, Thunderbird hardware acceleration (HWA) issues
Thunderbird Council reorganization: we’ll discuss this in the Council directly and do some reorganization. This is not a forever governance plan, but we need to be practical given the many demands of the moment.
Please help green up the tree – including not only c-c but also esr38, beta, and aurora
We must be aggressive about checkins when patches become available. Thoughts: if you file a bug, please consider fixing it, or pass it off to someone rather than rely on “the fates”. If it breaks the build or tests, please make severity=blocker
Thunderbird 45: string freeze per schedule, interface freeze later (that’s me demanding a concession like last year).
Critical Issues
Leave critical bugs here until confirmed fixed. If confirmed, then remove.
38.3.1 bug 1211160 calendar 4.0.3.1 is a workaround for 38.3.0 — Thunderbird Version 38.3 Buttons not working menu items // bug 1211291 – Folders are visible, but messages are not. (?related to bug 1211358 lightning chrome.manifest not updated in 38.3.0 ?)
nightly (3 bugs blocking builds, all are assigned) – bug 1195442 – Mac builds broken ~1 month on nightly
rkent is working on it – bug 1183490 – (dataloss) New emails do not adhere to sort by order received
Plan is to understand issues in converting completely to nextKey = ++lastKey for non-IMAP, then plan how much to implement for 38.3.0. A simple fix of the dataloss is probable. I have a try patch that does the nextKey = ++lastKey that works; now I need to think about the patch for 38 (bug 1202105 needs to go to trunk and 38)
update 2015-10-20, patches have been posted but waiting for reviews for 2 weeks (aceman will look at those)
update 2015-11-17, review done 11-14, need to update and land.
important (but not top critical)
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status TBD – bug 1176748 – fix main thread proxies to the migration code (jorgk and m_kato have done such fixes in the past)
filelink, proxy,
topcrash bug 1149287 is **31% of our crashes** – see below
bug 586587 – M-C drag and drop onto editor window – awaiting review.
bug 769604 – C-C font size toolbar button and change increase/decrease size function – awaiting review.
bug 1219928 – M-C spell checker problem – awaiting review.
Started looking at CJK problems in flowed plaintext messages: bug 26734, bug 653342.
Got hit by bug 1224840 – Assertion failure: IsOuterWindow(), fixed one occurrence.
rkent
Thunderbird’s future plans
Pending letter from Mitchell on Thunderbird and MoCo
Big issues for Thunderbird future: 1) cost of transition 2) sources of income
1) Volker Birk & another senior architect will work with you to estimate the cost of transition in man-days and $, as they have lots of experience in setting up and migrating complex systems.
2) We have been talking to DigitalCourage about setting up a donation based system and we as the p≡p Foundation will fund Thunderbird until the donation levels are sufficient for a sustained future.
Even bigger issues for the future: articulating a vision, collecting partners: p≡p, Postbox, Chinese fork, Gaia email, TDF/Collabora, MoFo, ???
We see these points as elements that need to be worked out in the six-month period that we have agreed on with MoFo. As for collecting partners, we consider this as an outcome of the Business Strategy that will be defined in the next 6 months. We, the p≡p Foundation, will start discussions with TDF. In my opinion, ‘Collecting Partners’ could require quite a large time commitment and it should hence be very well planned.
Status of MoFo-Thunderbird and Thunderbird-p≡p discussions
p≡p Foundation is waiting for MoFo to come back with final comments. In the meantime, I’ll be waiting for Kent’s input to align.
My concern is developing the capacity of Thunderbird to be self-managed and focused.
Planning a Europe meetup in January, focused on a business plan for January – June 2016
We are happy to set it up here in Europe.
Question Time
new account provisioner: Are there currently any providers other than Gandi? Would it be worth trying to find some more?
We need a plan for the build system fairly urgently, given the apparent “no” to the c-c/m-c merge
Persons comfortable (not necessarily technical experts) with Core-type issues. Example: graphics knowledge to guide resolution of bug 1195947, Thunderbird hardware acceleration (HWA) issues
needs assignee after solution identified – bug 1196662 not checking mails after hibernation, caused by core bug 1178890 (and we may be seeing other issues from 1178890, like hangs)
aceman is assignee – bug 1183490 – (dataloss) New emails do not adhere to sort by order received
status TBD – bug 1182629 – update to 38.1.0 from 38.0.1 re-enables disabled Lightning
status TBD – bug 1176399 – Multiple master password when GMail OAuth2 is enabled
status ??? – bug 1176748 – Someone needs to add additional main thread proxies to the migration code (jorgk and m_kato have in previous years been the major drivers of fixes)
topcrash bug 1149287 is **31% of our crashes** – see below
bug 1020181 – Set spell check language for recipient (published add-on as quick fix).
wsmwk
developers need to be thinking about landing stuff now, and in the next 2 months, because version 45 moves to aurora Dec 14
HWA – collected more data and filed bug 1195947 Track HWA issues
crashes
38.2.0 crash rate ~0.35, dramatically reduced from ~0.5 for 38.1.0 https://crash-stats.mozilla.com/daily?p=Thunderbird likely mostly due to disabling HWA. Only ~6 GPU crash signatures in the top 200 and the highest rank is 80 (vs several in the top 20 for 38.1.0), so most GPU driver crashes seem to be gone (but not all of the crash rate reduction will have been just from disabling HWA)
lacking time to follow up on other signatures in the “new” topcrash rankings (but that’s OK because we’re very well off compared to 38.1.0)
following progress of Thunderbird web pages moving to bedrock – they are close to being done
beware – ishikawa in the next several weeks will be landing big changes from many (>12?) bugs, both core and Thunderbird, for IO performance and error checking, eg bug 1116055
Fallen
Patched a lot of build issues, together with aleth
rkent
Code:
I think we have made progress on maildir issues, I hope to get two critical issues fixed before the next release.
I’m finally starting to turn more of my attention to management issues rather than urgent TB 38 problems. In the long run, those are very important, but the urgent easily crowds out the important. Kudos to those trying to get the tree usable again!
We could talk about:
Our thoughts on the announced changes to addons (Including not only
require signed vs unsigned?
no upside? i.e. we have no identified exploits
need to wait for enterprise impacts to be resolved
concerned that because the “unsigned process” will no longer be used by AMO, it may impact the ability to push out unsigned Thunderbird addons using the script they used to mass sign Firefox’s unsigned addons
XPCOM/XUL deprecation
need to make sure we are involved in the conversation and decisions so that developers have what they need in the “new environment”
Patrick is happy to do Enigmail integration into Thunderbird; I think we should agree. Comments?
Joshua, Suyash, and I are working on the SkinkGlue (now JsAccount) integration, which will be the basis of both JMAP and CARDDAV support. –> consider using https://github.com/gaye/dav for carddav support
Volker Birk of Pretty Easy Privacy is *very* serious about heavily engaging his Foundation with Thunderbird. Concerns? http://pep-project.org/
Then there is the legal and financial home issue …
Expect a full-court press on MoFo by me soon; if you have contacts there please advise. One thing that is clear to me after talking to people is how critical Thunderbird is in the greater software world. We DO have a place at the table! How to get MoFo to see that?
We have submitted a formal application to Software Freedom Conservancy to affiliate with them, now we wait for their evaluation queue.
Very interesting talks with The Document Foundation (TDF)/LibreOffice. #1 concern: all of our energy goes into keeping up with Mozilla, to enable future code development that we never actually have time for. Their funding model is very interesting, would be worth discussing.
hoping to meet with postbox in a few weeks
It would be really great if someone could lead efforts to prototype a fund-raising drive web page in-product. (hoping to target Oct/Nov)
aceman
looking at bug 1183490 (the order received regression), 2 options:
back out bug 854798. Should be safe. I recommend this.
finally drop msgkey=offset in mbox and just increment by 1. May be risky, but there is already a precedent in that msgs moved by filters already get key+1, so code expecting key==offset should have already been uncovered. However, there are still some uncertain places; fixing them basically means bug 793865 (the 64-bit envelope and msgkey!=offset cleanup). I would not recommend this for ESR.
finally looking to finish some feature patches (compose font selector, saved files tab, show map widget)
(do we still need this??) As underpass has pointed out repeatedly (thanks for your patience!), we need to rewrite / heavily modify the lightning articles on support.mozilla.org. let me know irc: rolandtanglao on #tb-support-crew or rtanglao AT mozilla.com OR simply start editing the articles
I suggest that for the next dot release, if we cannot get the crash fixed, we should disable the menu item that promotes this. Does not block unthrottling since mostly affects new users.
Someone needs to add additional main thread proxies to the migration code.
bug 1174797 Add a cookie exception for GMail OAuth
Does not affect unthrottling since no effect unless someone tries to use OAuth. Lightning has an example of how to fix this, so should not be difficult.
Releases
Past
31.6.0 shipped
38.0b3 shipped 2015-04-26 Sunday
38.0b4 shipped 2015-05-03 Sunday
31.7.0 2015-05-18 Monday (lots of issues – Tues 5-12 was target)
38.0b5 shipped 2015-05-19 Tuesday
38.0b6 shipped 2015-05-23
38.0.1 nominally shipped 2015-06-12
Upcoming
31.8.0 ~2015-06-20 GTB
38.1.0: FF GTB on June 22, release June 30.
Would it be possible to continue to use the beta channel for TB 38 until post-38.2.0? So maybe we could do a 38.1.0b1?
As underpass has pointed out repeatedly (thanks for your patience!), we need to rewrite / heavily modify the lightning articles on support.mozilla.org. let me know irc: rolandtanglao on #tb-support-crew or rtanglao AT mozilla.com OR simply start editing the articles
We need to fill the “Learn More” page with content, possibly point it to something more specific bug 1159682
Opt-out dialog: change “disable” to “remove” bug 1159698
My availability will be limited from June 20 – June 29. If there is going to be progress on fixing TB 38 releases in that time, someone else would have to drive it.
Monitoring issues appearing in TB 38.0.1
Previously there were issues getting TB 38 to build with current mozilla-esr38 (which resulted in THUNDERBIRD_38_0_20150603_RELBRANCH) and I suspect there may be some fixes needed to get the merge of mozilla-esr38 into THUNDERBIRD_38_VERBRANCH to work.
jcranmer
Huge backlog, being more or less processed in stack order unfortunately
Ping me on IRC if you have anything really important for me to look at
Following up on some releng future issues
I have some discussions to start after 38 is finally out the door
sshagarwal
At step one of the project: setting up a dummy protocol to understand how a protocol is added to TB using the add-on approach (using SkinkGlue).
I have taken too much time to do this mostly due to health issues continuing for over a month now. Will pick up soon.
Question Time
Jorg K:
When will JSMime take over all RFC2047 decoding? Looking at bug 1146099 I found RFC2047 code used in comm-central/mozilla/netwerk/mime.
Also, while Joshua is here: will we fix whatever we did wrong in bug 1154521?
Support team
Roland’s last day is June 30, 2015 – please needinfo :rolandtanglao or email rtanglao AT mozilla.com if you need my help starting July 1, 2015
Other
PLEASE PUT THE NEXT MEETING IN YOUR (LIGHTNING) CALENDAR
Note – meeting notes must be copied from etherpad to wiki before 5AM CET next day so that they will go public in the meeting notes blog.
…but of course there is a release for Thunderbird 38! Since the release date for Thunderbird has been postponed and in the meanwhile Firefox has released 38.0.1, Thunderbird will also be released as Thunderbird 38.0.1. Since the Lightning version is automatically generated at build time, we have just released Lightning 4.0.0.1. If you are still using Thunderbird 31 and Lightning 3.3.3, you will be getting an update in the next few days.
The exciting thing about this release is that Lightning has been integrated into Thunderbird. I expect there will be next to no issues during upgrade this time, because Thunderbird includes the Lightning addon already.
If you can’t wait, you can get Thunderbird in your language directly from mozilla.org. If you do happen to have issues with upgrading, you can also get Lightning from addons.mozilla.org. The latest SeaMonkey version is 2.33.1 at the time of writing; you need to use Lightning 3.8b2 in this case. For more information on compatibility, check out the calendar versions page.
As mentioned in a previous blog post, most fixed issues are backend fixes that won’t be very visible. We do however have a great new feature to save copies of invitations to your calendar. This helps in case you don’t care about replying to the invitation but would still like to see it in your calendar. We also have more general improvements in invitation compatibility, performance and stability and some slight visual enhancements. The full list of changes can be found on bugzilla.
If you are upgrading manually, you might want to make a backup. Although I don’t anticipate any major issues, you never know.
If you have questions, would like support, or have found a bug, feel free to leave a comment here and I’ll get back to you as soon as possible.
As underpass has pointed out repeatedly (thanks for your patience!), we need to rewrite / heavily modify the lightning articles on support.mozilla.org. let me know irc: rolandtanglao on #tb-support-crew or rtanglao AT mozilla.com OR simply start editing the articles
We need to fill the “Learn More” page with content, possibly point it to something more specific bug 1159682
Opt-out dialog: change “disable” to “remove” bug 1159698
During July I’ll be visiting family in Mongolia, but I also have a few very geeky things I want to do.
The first thing I want to do is plug in the RIPE Atlas probes I have. They are little devices that look like this:
They enable anybody with a RIPE Atlas or RIPE account to make measurements for DNS queries and more. This helps make the global Internet better. I have three of these probes I’d like to install. It’s good because, last time I checked, Mongolia didn’t have any active probes. These probes will also help the Internet become better in Mongolia. I’ll need to buy some network cables before leaving because finding these in Mongolia is going to be challenging. More on Atlas at https://atlas.ripe.net/.
The second thing I intend to do is map Mongolia a bit better on two projects. The first is related to Mozilla and maps GPS coordinates with Wi-Fi access points. Only a small part of the capital, Ulaanbaatar, is covered, as per https://location.services.mozilla.com/map#11/47.8740/106.9485. I want this coverage to be much greater, because having an open data source for this is important for the future. As mapping is my new thing, I’ll probably also edit OpenStreetMap to make the urban parts of Mongolia that I’ll visit much more usable on all the services that use OSM as a source of truth. There is already a project to map the capital city at http://hotosm.org/projects/mongolia_mapping_ulaanbaatar, but I believe OSM can serve more than just 50% of Mongolia’s population.
I got inspired to write this post by my son this morning; look what he is doing at 17 months:
Last week I gave a talk at the Philly Tech Week 2015 Dev Day organized by the delightful people at technical.ly on some of the tricks/strategies we use in the Firefox OS Gaia Email app. Note that the credit for implementing most of these techniques goes to the owner of the Email app’s front-end, James Burke. Also, a special shout-out to Vivien for the initial DOM Worker patches for the email app.
I tried to avoid having slides that I would be reading aloud as the audience read them silently, so instead of slides to share, I have the talk script. Well, I also have the slides here, but there’s not much to them. The headings below are the content of the slides, except for the one time I inline some code. Note that the live presentation must have differed slightly, because I’m sure I’m much more witty and clever in person than this script would make it seem…
Cover Slide: Who!
Hi, my name is Andrew Sutherland. I work at Mozilla on the Firefox OS Email Application. I’m here to share some strategies we used to make our HTML5 app Seem faster and sometimes actually Be faster.
What’s A Firefox OS (Screenshot Slide)
But first: What is a Firefox OS? It’s a multiprocess Firefox gecko engine on an android linux kernel where all the apps including the system UI are implemented using HTML5, CSS, and JavaScript. All the apps use some combination of standard web APIs and APIs that we hope to standardize in some form.
Here are some screenshots. We’ve got the default home screen app, the clock app, and of course, the email app.
It’s an entirely client-side offline email application, supporting IMAP4, POP3, and ActiveSync. The goal, like all Firefox OS apps shipped with the phone, is to give native apps on other platforms a run for their money.
And that begins with starting up fast.
Fast Startup: The Problems
But that’s frequently easier said than done. Slow-loading websites are still very much a thing.
The good news for the email application is that a slow network isn’t one of its problems. It’s pre-loaded on the phone. And even if it wasn’t, because of the security implications of the TCP Web API and the difficulty of explaining this risk to users in a way they won’t just click through, any TCP-using app needs to be a cryptographically signed zip file approved by a marketplace. So we do load directly from flash.
However, it’s not like flash on cellphones is equivalent to an infinitely fast, zero-latency network connection. And even if it was, in a naive app you’d still try and load all of your HTML, CSS, and JavaScript at the same time because the HTML file would reference them all. And that adds up.
It adds up in the form of event loop activity and competition with other threads and processes. With the exception of Promises which get their own micro-task queue fast-lane, the web execution model is the same as all other UI event loops; events get scheduled and then executed in the same order they are scheduled. Loading data from an asynchronous API like IndexedDB means that your read result gets in line behind everything else that’s scheduled. And in the case of the bulk of shipped Firefox OS devices, we only have a single processor core so the thread and process contention do come into play.
So we try not to be naive.
Seeming Fast at Startup: The HTML Cache
If we’re going to optimize startup, it’s good to start with what the user sees. Once an account exists for the email app, at startup we display the default account’s inbox folder.
What is the least amount of work that we can do to show that? Cache a screenshot of the Inbox. The problem with that, of course, is that a static screenshot is indistinguishable from an unresponsive application.
So we did the next best thing, (which is) we cache the actual HTML we display. At startup we load a minimal HTML file, our concatenated CSS, and just enough Javascript to figure out if we should use the HTML cache and then actually use it if appropriate. It’s not always appropriate, like if our application is being triggered to display a compose UI or from a new mail notification that wants to show a specific message or a different folder. But this is a decision we can make synchronously so it doesn’t slow us down.
Local Storage: Okay in small doses
We implement this by storing the HTML in localStorage.
Important Disclaimer! LocalStorage is a bad API. It’s a bad API because it’s synchronous. You can read any value stored in it at any time, without waiting for a callback. Which means if the data is not in memory the browser needs to block its event loop or spin a nested event loop until the data has been read from disk. Browsers avoid this now by trying to preload the Entire contents of local storage for your origin into memory as soon as they know your page is being loaded. And then they keep that information, ALL of it, in memory until your page is gone.
So if you store a megabyte of data in local storage, that’s a megabyte of data that needs to be loaded in its entirety before you can use any of it, and that hangs around in scarce phone memory.
To really make the point: do not use local storage, at least not directly. Use a library like localForage that will use IndexedDB when available, and then fail over to WebSQLDatabase and local storage in that order.
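For example, a minimal localForage sketch (assuming the library has already been loaded on the page; the key name is made up) keeps the familiar get/set shape but hands the data back asynchronously via promises instead of blocking the event loop:

// Sketch only: localForage picks IndexedDB, WebSQL, or localStorage under the hood.
var someHtmlString = '<section>cached inbox markup would go here</section>';

localforage.setItem('html_cache', someHtmlString).then(function () {
  return localforage.getItem('html_cache');
}).then(function (cachedHtml) {
  console.log('cached markup is', cachedHtml ? cachedHtml.length : 0, 'characters');
}).catch(function (err) {
  console.error('cache read/write failed', err);
});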
Now, having sufficiently warned you of the terrible evils of local storage, I can say with a sorta-clear conscience… there are upsides in this very specific case.
The synchronous nature of the API means that once we get our turn in the event loop we can act immediately. There’s no waiting around for an IndexedDB read result to get its turn on the event loop.
This matters because although the concept of loading is simple from a User Experience perspective, there’s no standard to back it up right now. Firefox OS’s UX desires are very straightforward. When you tap on an app, we zoom it in. Until the app is loaded we display the app’s icon in the center of the screen. Unfortunately the standards are still assuming that the content is right there in the HTML. This works well for document-based web pages or server-powered web apps where the contents of the page are baked in. They work less well for client-only web apps where the content lives in a database and has to be dynamically retrieved.
The two events that exist are:
“DOMContentLoaded” fires when the document has been fully parsed and all scripts not tagged as “async” have run. If there were stylesheets referenced prior to the script tags, the script tags will wait for the stylesheet loads.
“load” fires when the document has been fully loaded; stylesheets, images, everything.
But none of these have anything to do with the content in the page saying it’s actually done. This matters because these standards also say nothing about IndexedDB reads or the like. We tried to create a standards consensus around this, but it’s not there yet. So Firefox OS just uses the “load” event to decide an app or page has finished loading and it can stop showing your app icon. This largely avoids the dreaded “flash of unstyled content” problem, but it also means that your webpage or app needs to deal with this period of time by displaying a loading UI or just accepting a potentially awkward transient UI state.
We reference our stylesheet first. It includes all of our styles. We never dynamically load stylesheets because that compels a style recalculation for all nodes and potentially a reflow. We would have to have an awful lot of style declarations before considering that.
Then we have our single script file. Because the stylesheet precedes the script, our script will not execute until the stylesheet has been loaded. Then our script runs and we synchronously insert our HTML from local storage. Then DOMContentLoaded can fire. At this point the layout engine has enough information to perform a style recalculation and determine what CSS-referenced image resources need to be loaded for buttons and icons, then those load, and then we’re good to be displayed as the “load” event can fire.
After that, we’re displaying an interactive-ish HTML document. You can scroll, you can press on buttons and the :active state will apply. So things seem real.
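As an illustration only (not the app’s actual startup code), the synchronous-insert step described above might look roughly like this, with ‘htmlCache’ as a hypothetical key name:

// Runs synchronously from our single script file, before DOMContentLoaded.
var cachedHtml = window.localStorage.getItem('htmlCache');
if (cachedHtml) {
  // Restore the last-rendered cards so the user sees real-looking UI
  // the moment the "load" event lets the app be displayed.
  document.body.innerHTML = cachedHtml;
} else {
  // First run, or the cache was invalidated: show a plain loading UI.
  document.body.classList.add('loading');
}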
Being Fast: Lazy Loading and Optimized Layers
But now we need to try and get some logic in place as quickly as possible that will actually cash the checks that real-looking HTML UI is writing. And the key to that is only loading what you need when you need it, and trying to get it to load as quickly as possible.
There are many module loading and build optimizing tools out there, and most frameworks have a preferred or required way of handling this. We used the RequireJS family of Asynchronous Module Definition loaders, specifically the alameda loader and the r-dot-js optimizer.
One of the niceties of the loader plugin model is that we are able to express resource dependencies as well as code dependencies.
RequireJS Loader Plugins
var fooModule = require('./foo');
var htmlString = require('text!./foo.html');
var localizedDomNode = require('tmpl!./foo.html');
The standard CommonJS loader semantics used by node.js and io.js are what you see on the first line here. Load the module, return its exports.
But RequireJS loader plugins also allow us to do things like the second line where the exclamation point indicates that the load should occur using a loader plugin, which is itself a module that conforms to the loader plugin contract. In this case it’s saying load the file foo.html as raw text and return it as a string.
But, wait, there’s more! Loader plugins can do more than that. The third example uses a loader that loads the HTML file using the ‘text’ plugin under the hood, creates an HTML document fragment, and pre-localizes it using our localization library. And this works un-optimized in a browser, no compilation step needed, but it can also be optimized.
So when our optimizer runs, it bundles up the core modules we use, plus the modules for our “message list” card that displays the inbox. And the message list card loads its HTML snippets using the template loader plugin. The r-dot-js optimizer then locates these dependencies, and the loader plugins also have optimizer logic that results in the HTML strings being inlined in the resulting optimized file. So there’s just one single JavaScript file to load with no extra HTML file dependencies or other loads.
We then also run the optimizer against our other important cards like the “compose” card and the “message reader” card. We don’t do this for all cards because it can be hard to carve up the module dependency graph for optimization without starting to run into cases of overlap where many optimized files redundantly include files loaded by other optimized files.
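For a sense of what carving up those layers looks like, here is a hedged sketch of an r.js build profile along those lines; the module names and output directory are hypothetical, and the real build configuration is more involved.

// build.js: one optimized layer for the core plus the message list card,
// and separate layers for other important cards that exclude the core.
({
  baseUrl: 'js',
  dir: 'build',
  modules: [
    { name: 'main' },
    { name: 'cards/compose',        exclude: ['main'] },
    { name: 'cards/message_reader', exclude: ['main'] }
  ]
})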
Plus, we have another trick up our sleeve:
Seeming Fast: Preloading
Preloading. Our cards optionally know the other cards they can load. So once we display a card, we can kick off a preload of the cards that might potentially be displayed. For example, the message list card can trigger the compose card and the message reader card, so we can trigger a preload of both of those.
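With an AMD loader, such a preload can be as simple as a fire-and-forget asynchronous require call issued once the current card is on screen (the card module names here are hypothetical):

// Warm the module cache for the cards most likely to be shown next.
require(['cards/compose', 'cards/message_reader'], function () {
  // Nothing to do: we only wanted the modules (and their optimized
  // layers) fetched and evaluated ahead of time.
});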
But we don’t go overboard with preloading in the frontend because we still haven’t actually loaded the back-end that actually does all the emaily email stuff. The back-end is also chopped up into optimized layers along account type lines and online/offline needs, but the main optimized JS file still weighs in at something like 17 thousand lines of code with newlines retained.
So once our UI logic is loaded, it’s time to kick off loading the back-end. And in order to avoid impacting the responsiveness of the UI both while it loads and when we’re doing steady-state processing, we run it in a DOM Worker.
Being Responsive: Workers and SharedWorkers
DOM Workers are background JS threads that lack access to the page’s DOM, communicating with their owning page via message passing with postMessage. Normal workers are owned by a single page. SharedWorkers can be accessed via multiple pages from the same document origin.
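A stripped-down sketch of that split, with hypothetical file names and message shapes, looks like this:

// Main thread (UI): start the back-end in a worker and talk to it with
// postMessage so mail protocol and database work never block the UI.
var backend = new Worker('js/backend.js');
backend.onmessage = function (event) {
  if (event.data.type === 'folderContents') {
    renderMessageList(event.data.messages);  // hypothetical UI hook
  }
};
backend.postMessage({ type: 'openFolder', folderId: 'inbox' });

// js/backend.js (worker thread): receive commands, do the slow work.
onmessage = function (event) {
  if (event.data.type === 'openFolder') {
    // ...talk IMAP/POP3, read the database, etc., then report back...
    postMessage({ type: 'folderContents', messages: [] });
  }
};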
By doing this, we stay out of the way of the main thread. This is getting less important as browser engines support Asynchronous Panning & Zooming or “APZ” with hardware-accelerated composition, tile-based rendering, and all that good stuff. (Some might even call it magic.)
When Firefox OS started, we didn’t have APZ, so any main-thread logic had the serious potential to result in janky scrolling and the impossibility of rendering at 60 frames per second. It’s a lot easier to get 60 frames-per-second now, but even asynchronous pan and zoom potentially has to wait on dispatching an event to the main thread to figure out if the user’s tap is going to be consumed by app logic and preventDefault called on it. APZ does this because it needs to know whether it should start scrolling or not.
And speaking of 60 frames-per-second…
Being Fast: Virtual List Widgets
…the heart of a mail application is the message list. The expected UX is to be able to fling your way through the entire list of what the email app knows about and see the messages there, just like you would on a native app.
This is admittedly one of the areas where native apps have it easier. There are usually list widgets that explicitly have a contract that says they request data on an as-needed basis. They potentially even include data bindings so you can just point them at a data-store.
But HTML doesn’t yet have a concept of instantiate-on-demand for the DOM, although it’s being discussed by Firefox layout engine developers. For app purposes, the DOM is a scene graph. An extremely capable scene graph that can handle huge documents, but there are footguns and it’s arguably better to err on the side of fewer DOM nodes.
So what the email app does is we create a scroll-region div and explicitly size it based on the number of messages in the mail folder we’re displaying. We create and render enough message summary nodes to cover the current screen, 3 screens worth of messages in the direction we’re scrolling, and then we also retain up to 3 screens worth in the direction we scrolled from. We also pre-fetch 2 more screens worth of messages from the database. These constants were arrived at experimentally on prototype devices.
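A rough sketch of that sizing and windowing math follows; the item height and buffer constants here are illustrative rather than the app’s actual values, and scrollRegion and folder are assumed to exist.

var ITEM_HEIGHT = 60;                       // px; assumed uniform here
var SCREENS_AHEAD = 3, SCREENS_BEHIND = 3;  // rendered buffer on each side
var itemsPerScreen = Math.ceil(window.innerHeight / ITEM_HEIGHT);

// Size the scroll region for every message, even though only a small
// window of summary nodes is ever actually rendered.
scrollRegion.style.height = (folder.messageCount * ITEM_HEIGHT) + 'px';

function renderedRange(scrollTop) {
  var firstVisible = Math.floor(scrollTop / ITEM_HEIGHT);
  return {
    start: Math.max(0, firstVisible - SCREENS_BEHIND * itemsPerScreen),
    end: Math.min(folder.messageCount,
                  firstVisible + (1 + SCREENS_AHEAD) * itemsPerScreen)
  };
}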
We listen to “scroll” events and issue database requests and move DOM nodes around and update them as the user scrolls. For any potentially jarring or expensive transitions such as coordinate space changes from new messages being added above the current scroll position, we wait for scrolling to stop.
Nodes are absolutely positioned within the scroll area using their ‘top’ style but translation transforms also work. We remove nodes from the DOM, then update their position and their state before re-appending them. We do this because the browser APZ logic tries to be clever and figure out how to create an efficient series of layers so that it can pre-paint as much of the DOM as possible in graphic buffers, AKA layers, that can be efficiently composited by the GPU. Its goal is that when the user is scrolling, or something is being animated, that it can just move the layers around the screen or adjust their opacity or other transforms without having to ask the layout engine to re-render portions of the DOM.
When our message elements are added to the DOM with an already-initialized absolute position, the APZ logic lumps them together as something it can paint in a single layer along with the other elements in the scrolling region. But if we start moving them around while they’re still in the DOM, the layerization logic decides that they might want to independently move around more in the future and so each message item ends up in its own layer. This slows things down. But by removing them and re-adding them it sees them as new with static positions and decides that it can lump them all together in a single layer. Really, we could just create new DOM nodes, but we produce slightly less garbage this way and in the event there’s a bug, it’s nicer to mess up with 30 DOM nodes displayed incorrectly rather than 3 million.
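In code form, the recycling trick amounts to something like this (illustration only; bindMessageSummary is a hypothetical helper):

function recycleNode(node, message, topPx) {
  // Detach first so the layerization logic later sees a "new", statically
  // positioned element instead of one that moves around inside the layer.
  scrollRegion.removeChild(node);
  bindMessageSummary(node, message);  // update subject, snippet, etc.
  node.style.top = topPx + 'px';      // position it while detached
  scrollRegion.appendChild(node);
}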
But as neat as the layerization stuff is to know about on its own, I really mention it to underscore 2 suggestions:
1. Use a library when possible. Getting on and staying on APZ fast-paths is not trivial, especially across browser engines. So it’s a very good idea to use a library rather than rolling your own.
2. Use developer tools. APZ is tricky to reason about and even the developers who write the async pan & zoom logic can be surprised by what happens in complex real-world situations. And there ARE developer tools available that help you avoid needing to reason about this. Firefox OS has easy on-device developer tools that can help diagnose what’s going on or at least help tell you whether you’re making things faster or slower:
– it’s got a frames-per-second overlay; you do need to scroll like mad to get the system to want to render 60 frames-per-second, but it makes it clear what the net result is
– it has paint flashing that overlays random colors every time it paints the DOM into a layer. If the screen is flashing like a discotheque or has a lot of smeared rainbows, you know something’s wrong because the APZ logic is not able to just reuse its layers.
– devtools can enable drawing cool colored borders around the layers APZ has created so you can see if layerization is doing something crazy
There are also fancier and more complicated tools in Firefox and other browsers like Google Chrome that let you see what got painted, what the layer tree looks like, et cetera.
It was brought to my attention recently by reputable sources that the recent announcement of increased usage in recent years produced an internal firestorm within Mozilla. Key figures raised alarm that some of the tech press had interpreted the blog post as a sign that Thunderbird was not, in fact, dead. As a result, they asked Thunderbird community members to make corrections to emphasize that Mozilla was trying to kill Thunderbird.
The primary fear, it seems, is that knowledge that the largest open-source email client was still receiving regular updates would impel its userbase to agitate for increased funding and maintenance of the client to help forestall potential threats to the open nature of email as well as to innovate in the space of providing usable and private communication channels. Such funding, however, would be an unaffordable luxury and would only distract Mozilla from its central goal of building developer productivity tooling.
Persistent rumors that Mozilla would be willing to fund Thunderbird were it renamed Firefox Email were finally addressed with the comment, "such a renaming would violate our current policy that all projects be named Persona."
This post is part 8 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. Part 6 discusses how email security works in practice. Part 7 discusses the problem of trust. This part discusses why email security has largely failed.
At the end of the last part in this series, I posed the question, "Which email security protocol is most popular?" The answer to the question is actually neither S/MIME nor PGP, but a third protocol, DKIM. I haven't brought up DKIM until now because DKIM doesn't try to secure email in the same vein as S/MIME or PGP, but I still consider it relevant to discussing email security.
Unquestionably, DKIM is the only security protocol for email that can be considered successful. There are perhaps 4 billion active email addresses [1]. Of these, about 1-2 billion use DKIM. In contrast, S/MIME can count a few million users, and PGP at best a few hundred thousand. No other security protocols have really caught on past these three. Why did DKIM succeed where the others failed?
DKIM's success stems from its relatively narrow focus. It is nothing more than a cryptographic signature of the message body and a smattering of headers, and is itself stuck in the DKIM-Signature header. It is meant to be applied to messages only on outgoing servers and read and processed at the recipient mail server—it completely bypasses clients. That it bypasses clients allows it to solve the problem of key discovery and key management very easily (public keys are stored in DNS, which is already a key part of mail delivery), and its role in spam filtering is strong motivation to get it implemented quickly (it is 7 years old as of this writing). It's also simple: this one paragraph description is basically all you need to know [2].
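As a concrete illustration (all of the values below are made up), a DKIM-Signature header looks something like this; the d= and s= tags tell the receiving server which DNS TXT record (here, mail._domainkey.example.com) holds the public key, bh= is the body hash, and b= is the signature over the listed headers and that hash:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com; s=mail;
 h=from:to:subject:date; bh=Base64BodyHash...=; b=Base64Signature...=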
The failure of S/MIME and PGP to see large deployment is certainly a large topic of discussion on myriad cryptography-enthusiast mailing lists, which often like to partake in proposing new end-to-end email encryption paradigms, such as the recent DIME proposal. Quite frankly, all of these solutions suffer broadly from at least the same 5 fundamental weaknesses, and I see it as unlikely that a protocol will come about that can fix these weaknesses well enough to become successful.
The first weakness, and one I've harped about many times already, is UI. Most email security UI is abysmal and generally at best usable only by enthusiasts. At least some of this is endemic to security: while it may seem obvious how to convey what an email signature or an encrypted email signifies, how do you convey the distinctions between sign-and-encrypt, encrypt-and-sign, or an S/MIME triple wrap? The Web of Trust model used by PGP (and many other proposals) is even worse, in that it inherently requires users to take other actions out-of-band of email to work properly.
Trust is the second weakness. Consider that, for all intents and purposes, the email address is the unique identifier on the Internet. By extension, that implies that a lot of services are ultimately predicated on the notion that the ability to receive and respond to an email is a sufficient means to identify an individual. However, the entire purpose of secure email, or at least of end-to-end encryption, is subtly based on the fact that other people in fact have access to your mailbox, thus destroying the most natural ways to build trust models on the Internet. The quest for anonymity or privacy also renders untenable many other plausible ways to establish trust (e.g., phone verification or government-issued ID cards).
Key discovery is another weakness, although it's arguably the easiest one to solve. If you try to keep discovery independent of trust, the problem of key discovery is merely picking a protocol to publish and another one to find keys. Some of these already exist: PGP key servers, for example, or using DANE to publish S/MIME or PGP keys.
Key management, on the other hand, is a more troubling weakness. S/MIME, for example, basically works without issue if you have a certificate, but managing to get an S/MIME certificate is a daunting task (necessitated, in part, by its trust model—see how these issues all intertwine?). This is also where it's easy to say that webmail is an unsolvable problem, but on further reflection, I'm not sure I agree with that statement anymore. One solution is just storing the private key with the webmail provider (you're trusting them as an email client, after all), but it's also not impossible to imagine using phones or flash drives as keystores. Other key management factors are more difficult to solve: people who lose their private keys or key rollover create thorny issues. There is also the difficulty of managing user expectations: if I forget my password to most sites (even my email provider), I can usually get it reset somehow, but when a private key is lost, the user is totally and completely out of luck.
Of course, there is one glaring and almost completely insurmountable problem. Encrypted email fundamentally precludes certain features that we have come to take for granted. The lesser known of these is server-side search and filtration. While there exist some mechanisms to do search on encrypted text, those mechanisms rely on the fact that you can manipulate the text to change the message, destroying the integrity feature of secure email. They also tend to be fairly expensive. It's easy to just say "who needs server-side stuff?", but the contingent of people who do email on smartphones would not be happy to have to pay the transfer rates to download all the messages in their folder just to find one little email, nor the energy costs of doing it on the phone. And those who have really large folders—Fastmail has a design point of 1,000,000 in a single folder—would still prefer to not have to transfer all their mail even on desktops.
The more well-known feature that would disappear is spam filtration. Consider that 90% of all email is spam, and if you think your spam folder is too slim for that to be true, it's because your spam folder only contains messages that your email provider wasn't sure were spam. The loss of server-side spam filtering would dramatically increase the cost of spam (a 10% reduction in efficiency would double the amount of server storage, per my calculations), and client-side spam filtering is quite literally too slow [3] and too costly (remember smartphones? Imagine having your email take 10 times as much energy and bandwidth) to be a tenable option. And privacy or anonymity tends to be an invitation to abuse (cf. Tor and Wikipedia). Proposed solutions to the spam problem are so common that there is a checklist containing most of the objections.
When you consider all of those weaknesses, it is easy to be pessimistic about the possibility of wide deployment of powerful email security solutions. The strongest future—all email is encrypted, including metadata—is probably impossible or at least woefully impractical. That said, if you weaken some of the assumptions (say, don't desire all or most traffic to be encrypted), then solutions seem possible if difficult.
This concludes my discussion of email security, at least until things change for the better. I don't have a topic for the next part in this series picked out (this part actually concludes the set I knew I wanted to discuss when I started), although OAuth and DMARC are two topics that have been bugging me enough recently to consider writing about. They also have the unfortunate side effect of being things likely to see changes in the near future, unlike most of the topics I've discussed so far. But rest assured that I will find more difficulties in the email infrastructure to write about before long!
[1] All of these numbers are crude estimates and are accurate to only an order of magnitude. To justify my choices: I assume 1 email address per Internet user (this overestimates the developing world and underestimates the developed world). The largest webmail providers have given numbers that claim to be 1 billion active accounts between them, and all of them use DKIM. S/MIME is guessed by assuming that any smartcard deployment supports S/MIME, and noting that the US Department of Defense and Estonia's digital ID project are both heavy users of such smartcards. PGP is estimated from the size of the strong set and old numbers on the reachable set from the core Web of Trust.
[2] Ever since last April, it's become impossible to mention DKIM without referring to DMARC, as a result of Yahoo's controversial DMARC policy. A proper discussion of DMARC (and why what Yahoo did was controversial) requires explaining the mail transmission architecture and spam, however, so I'll defer that to a later post. It's also possible that changes in this space could happen within the next year.
[3] According to a former GMail spam employee, if it takes you as long as three minutes to calculate reputation, the spammer wins.
Several years back, Ehsan and Jeff Muizelaar attempted to build a unified history of mozilla-central across the Mercurial era and the CVS era. Their result is now used in the gecko-dev repository. While getting distracted by yet another side project, I thought that I might want to do the same for comm-central. It turns out that building a unified history for comm-central makes mozilla-central look easy: mozilla-central merely had one import from CVS. In contrast, comm-central imported twice from CVS (the calendar code came later), four times from mozilla-central (once with converted history), and imported twice from Instantbird's repository (once with converted history). Three of those conversions also involved moving paths. But I've worked through all of those issues to provide a nice snapshot of the repository [1]. And since I've been frustrated by failing to find good documentation on how this sort of process went for mozilla-central, I'll provide details on the process for comm-central.
The first step and probably the hardest is getting the CVS history in DVCS form (I use hg because I'm more comfortable with it, but there's effectively no difference between hg, git, or bzr here). There is a git version of mozilla's CVS tree available, but I've noticed after doing research that its last revision is about a month before the revision I need for Calendar's import. The documentation for how that repo was built is no longer on the web, although we eventually found a copy after I wrote this post on git.mozilla.org. I tried doing another conversion using hg convert to get CVS tags, but that rudely blew up in my face. For now, I've filed a bug on getting an official, branchy-and-tag-filled version of this repository, while using the current lack of history as a base. Calendar people will have to suffer missing a month of history.
CVS is famously hard to convert to more modern repositories, and, as I've done my research, Mozilla's CVS looks like it uses those features which make it difficult. In particular, both the calendar CVS import and the comm-central initial CVS import used a CVS tag HG_COMM_INITIAL_IMPORT. That tagging was done, on only a small portion of the tree, twice, about two months apart. Fortunately, mailnews code was never touched on CVS trunk after the import (there appears to be one commit on calendar after the tagging), so it is probably possible to salvage a repository-wide consistent tag.
The start of my script for conversion looks like this:
Since I don't want to include the changesets useless to comm-central history, I trimmed the history by using hg convert to eliminate changesets that don't change the necessary files. Most of the files are simple directory-wide changes, but S/MIME only moved a few files over, so it requires a more complex way to grab the file list. In addition, I also replaced the % in the usernames with the @ that they are used to appearing with in hg. The relevant code is here:
# Step 1: Trim mozilla CVS history to include only the files we are ultimately
# interested in.
cat >$WORKDIR/convert-filemap.txt <<EOF
# Revision e4f4569d451a
include directory/xpcom
include mail
include mailnews
include other-licenses/branding/thunderbird
include suite
# Revision 7c0bfdcda673
include calendar
include other-licenses/branding/sunbird
# Revision ee719a0502491fc663bda942dcfc52c0825938d3
include editor/ui
# Revision 52efa9789800829c6f0ee6a005f83ed45a250396
include db/mork/
include db/mdb/
EOF

# Add the S/MIME import files
hg -R $MC log -r "children($MC_SMIME_IMPORT)" \
--template "{file_dels % 'include {file}\n'}" >>$WORKDIR/convert-filemap.txt
if [ ! -e $WORKDIR/convert-authormap.txt ]; then
hg -R $HGCVS log --template "{email(author)}={sub('%', '@', email(author))}\n" \
| sort -u > $WORKDIR/convert-authormap.txt
fi

cd $WORKDIR
hg convert $HGCVS $OUTPUT --filemap convert-filemap.txt -A convert-authormap.txt
That last command provides us the subset of the CVS history that we need for unified history. Strictly speaking, I should be pulling a specific revision, but I happen to know that there's no need to (we're cloning the only head) in this case. At this point, we now need to pull in the mozilla-central changes before we pull in comm-central. Order is key; hg convert will only apply the graft points when converting the child changeset (which it does but once), and it needs the parents to exist before it can do that. We also need to ensure that the mozilla-central graft point is included before continuing, so we do that, and then pull mozilla-central:
CC_CVS_BASE=$(hg log -R $HGCVS -r 'tip' --template '{node}')
CC_CVS_BASE=$(grep $CC_CVS_BASE $OUTPUT/.hg/shamap | cut -d' ' -f2)
MC_CVS_BASE=$(hg log -R $HGCVS -r 'gitnode(215f52d06f4260fdcca797eebd78266524ea3d2c)' --template '{node}')
MC_CVS_BASE=$(grep $MC_CVS_BASE $OUTPUT/.hg/shamap | cut -d' ' -f2)
# Okay, now we need to build the map of revisions.
cat >$WORKDIR/convert-revmap.txt <<EOF
e4f4569d451a5e0d12a6aa33ebd916f979dd8faa $CC_CVS_BASE # Thunderbird / Suite
7c0bfdcda6731e77303f3c47b01736aaa93d5534 d4b728dc9da418f8d5601ed6735e9a00ac963c4e, $CC_CVS_BASE # Calendar
9b2a99adc05e53cd4010de512f50118594756650 $MC_CVS_BASE # Mozilla graft point
ee719a0502491fc663bda942dcfc52c0825938d3 78b3d6c649f71eff41fe3f486c6cc4f4b899fd35, $MC_EDITOR_IMPORT # Editor
8cdfed92867f885fda98664395236b7829947a1d 4b5da7e5d0680c6617ec743109e6efc88ca413da, e4e612fcae9d0e5181a5543ed17f705a83a3de71 # Chat
EOF

# Next, import mozilla-central revisions
for rev in $MC_MORK_IMPORT $MC_EDITOR_IMPORT $MC_SMIME_IMPORT; do
hg convert $MC $OUTPUT -r $rev --splicemap $WORKDIR/convert-revmap.txt \
--filemap $WORKDIR/convert-filemap.txt
done
Some notes about all of the revision ids in the script. The splicemap requires the full 40-character SHA ids; anything less and the thing complains. I also need to specify the parents of the revisions that deleted the code for the mozilla-central import, so if you go hunting for those revisions and are surprised that they don't remove the code in question, that's why.
I mentioned complications about the merges earlier. The Mork and S/MIME import codes here moved files, so that what was db/mdb in mozilla-central became db/mork. There's no support for causing the generated splice to record these as a move, so I have to manually construct those renamings:
# We need to execute a few hg move commands due to renamings.
pushd $OUTPUT
hg update -r $(grep $MC_MORK_IMPORT .hg/shamap | cut -d' ' -f2)
(hg -R $MC log -r "children($MC_MORK_IMPORT)" \
--template "{file_dels % 'hg mv {file} {sub(\"db/mdb\", \"db/mork\", file)}\n'}") | bash
hg commit -m 'Pseudo-changeset to move Mork files' -d '2011-08-06 17:25:21 +0200'
MC_MORK_IMPORT=$(hg log -r tip --template '{node}')
hg update -r $(grep $MC_SMIME_IMPORT .hg/shamap | cut -d' ' -f2)
(hg -R $MC log -r "children($MC_SMIME_IMPORT)" \
--template "{file_dels % 'hg mv {file} {sub(\"security/manager/ssl\", \"mailnews/mime\", file)}\n'}") | bash
hg commit -m 'Pseudo-changeset to move S/MIME files' -d '2014-06-15 20:51:51 -0700'
MC_SMIME_IMPORT=$(hg log -r tip --template '{node}')
popd

# Echo the new move commands to the changeset conversion map.
cat >>$WORKDIR/convert-revmap.txt <<EOF
52efa9789800829c6f0ee6a005f83ed45a250396 abfd23d7c5042bc87502506c9f34c965fb9a09d1, $MC_MORK_IMPORT # Mork
50f5b5fc3f53c680dba4f237856e530e2097adfd 97253b3cca68f1c287eb5729647ba6f9a5dab08a, $MC_SMIME_IMPORT # S/MIME
EOF
Now that we have all of the graft points defined, and all of the external code ready, we can pull comm-central and do the conversion. That's not quite it, though—when we graft the S/MIME history to the original mozilla-central history, we have a small segment of abandoned converted history. A call to hg strip removes that.
# Now, import comm-central revisions that we need
hg convert $CC $OUTPUT --splicemap $WORKDIR/convert-revmap.txt
hg strip 2f69e0a3a05a
[1] I left out one of the graft points because I just didn't want to deal with it. I'll leave it as an exercise to the reader to figure out which one it was. Hint: it's the only one I didn't know about before I searched for the archive points [2].
[2] Since I wasn't sure I knew all of the graft points, I decided to try to comb through all of the changesets to figure out who imported code. It turns out that hg log -r 'adds("**")' narrows it down nicely (1667 changesets to look at instead of 17547), and using the {file_adds} template helps winnow it down more easily.
In an xpcshell test, I recently needed a way to monitor all network requests and access both request and response data so I can save them for later use. This required a little bit of digging in Mozilla’s devtools code so I thought I’d write a short blog post about it.
This code will be used in a testcase that ensures that calendar providers in Lightning function properly. In the case of the CalDAV provider, we would need to access a real server for testing. We can’t just set up a few servers and use them for testing; it would end in an unreasonable amount of server maintenance. Given that non-local connections are not allowed when running the tests on the Mozilla build infrastructure, it wouldn’t work anyway. The solution is to create a fakeserver that is able to replay the requests in the same way. Instead of manually making the requests and figuring out how the server replies, we can use this code to quickly collect all the requests we need.
Without further delay, here is the code you have been waiting for:
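What follows is only a minimal sketch of the idea, not the original snippet: it uses the standard http-on-modify-request and http-on-examine-response observer notifications rather than the devtools code the post refers to, and capturing the actual response bodies would additionally require an nsITraceableChannel stream listener, which is omitted here.

Components.utils.import("resource://gre/modules/Services.jsm");

// Log the method and URL of every HTTP request and response the test makes.
var requestLogger = {
  observe: function(subject, topic, data) {
    var channel = subject.QueryInterface(Components.interfaces.nsIHttpChannel);
    dump(topic + ": " + channel.requestMethod + " " + channel.URI.spec + "\n");
  }
};
Services.obs.addObserver(requestLogger, "http-on-modify-request", false);
Services.obs.addObserver(requestLogger, "http-on-examine-response", false);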
Over the years I’ve organized or tried to organize PGP key signing parties every time I go somewhere. In the last year I’ve organized 3 that were successful (e.g. with more than 10 attendees).
1. Have a venue
I’ve tried a bunch of times to have people show up in the morning at the hotel I was staying at - that doesn’t work. Having catering at the venue is even better; it will encourage people to come from far away (or do the long-distance commute). Try to mark the path inside the venue with signs (paper signs saying “PGP key signing party” with arrows help).
2. Date and time
Meeting in the evening after work works better (starting after 18:00 or 18:30 works best).
Let people know how long it will take (count 1 hour per 30 participants).
3. Make people sign up
That makes people think twice before saying they will attend. It’s also an easy way for you to know how much beer/cola/etc. you’ll need to provide if you cater food.
I’ve been using Eventbrite to manage attendance at my last three meetings; it lets me:
know who is coming
mass mail participants
give them a calendar reminder
4. Reach out
For such a party you need people to attend so you need to reach out.
I always start with a search on biglumber.com to find the people using GPG who are registered on that site in the area I’m visiting (see below for what I send).
Then I look for local Linux user groups / *BSD groups and send them an announcement with:
date
venue
link to eventbrite and why I use it
ask them to forward (they know the area better than you)
I also use Lanyrd and Twitter, but I’m not convinced they work.
For my last announcement, it looked like this:
Subject: GnuPG / PGP key signing party September 26 2014
Content-Type: multipart/signed; micalg=pgp-sha256;
protocol="application/pgp-signature";
boundary="t01Mpe56TgLc7mgHKVMajjwkqQdw8XvI4"
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--t01Mpe56TgLc7mgHKVMajjwkqQdw8XvI4
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Hello my name is ludovic,
I'm a sysadmins at mozilla working remote from europe. I've been
involved with Thunderbird a lot (and still am). I'm organizing a pgp Key
signing party in the Mozilla san francisco office on September the 26th
2014 from 6PM to 8PM.
For security and assurances reasons I need to count how many people will
attend. I'v setup a eventbrite for that at
https://www.eventbrite.com/e/gnupg-pgp-key-signing-party-making-the-web-o=
f-trust-stronger-tickets-12867542165
(please take one ticket if you think about attending - If you change you
mind cancel so more people can come).
I will use the eventbrite tool to send reminders and I will try to make
a list with keys and fingerprint before the event to make things more
manageable (but I don't promise).
for those using lanyrd you will be able to use http://lanyrd.com/ccckzw.
Ludovic
ps sent to buug.org,nblug.org end penlug.org - please feel free to post
where appropriate ( the more the meerier, the stronger the web of trust).=
ps2 I have contacted people listed on biglumber to have more gpg related
people show up.
--=20
[:Usul] MOC Team at Mozilla
QA Lead fof Thunderbird
http://sietch-tabr.tumblr.com/ - http://weusepgp.info/
5. Make it easy to attend
As noted above, making a list of participants to hand out helps a lot (I’ve used http://www.phildev.net/pius/ and my own stuff to make a list). It makes it easier for you and for attendees. Tell people what they need to bring (IDs, a pen, printed fingerprints if you don’t provide a list).
6. Send reminders
Send people reminders and let them know how many people intend to show up. It boosts attendance.
I will use the eventbrite tool to send reminders and I will try to make a list with keys and fingerprint before the event to make things more manageable (but I don’t promise).
For those using Lanyrd, you will be able to use http://lanyrd.com/ccckzw. (Please tweet the event to get more people in.)
This post is part 7 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. Part 6 discusses how email security works in practice. This part discusses the problem of trust.
At a technical level, S/MIME and PGP (or at least PGP/MIME) use cryptography essentially identically. Yet the two are treated as radically different models of email security because they diverge on the most important question of public key cryptography: how do you trust the identity of a public key? Trust is critical, as it is the only way to stop an active, man-in-the-middle (MITM) attack. MITM attacks are actually easier to pull off in email, since all email messages effectively have to pass through both the sender's and the recipients' email servers [1], allowing attackers to pull off permanent, long-lasting MITM attacks [2].
S/MIME uses the same trust model that SSL uses, based on X.509 certificates and certificate authorities. X.509 certificates effectively work by providing a certificate that says who you are which is signed by another authority. In the original concept (as you might guess from the name "X.509"), the trusted authority was your telecom provider, and the certificates were furthermore intended to be a part of the global X.500 directory—a natural extension of the OSI internet model. The OSI model of the internet never gained traction, and the trusted telecom providers were replaced with trusted root CAs.
PGP, by contrast, uses a trust model that's generally known as the Web of Trust. Every user has a PGP key (containing their identity and their public key), and users can sign others' public keys. Trust generally flows from these signatures: if you trust a user, you know the keys that they sign are correct. The name "Web of Trust" comes from the vision that trust flows along the paths of signatures, building a tight web of trust.
And now for the controversial part of the post, the comparisons and critiques of these trust models. A disclaimer: I am not a security expert, although I am a programmer who revels in dreaming up arcane edge cases. I also don't use PGP at all, and use S/MIME to a very limited extent for some Mozilla work [3], although I did try a few abortive attempts to dogfood it in the past. I've attempted to replace personal experience with comprehensive research [4], but most existing critiques and comparisons of these two trust models are about 10-15 years old and predate several changes to CA certificate practices.
A basic tenet of development that I have found is that the average user is fairly ignorant. At the same time, a lot of the defense of trust models, both CAs and Web of Trust, tends to hinge on configurability. How many people, for example, know how to add or remove a CA root from Firefox, Windows, or Android? Even among the subgroup of Mozilla developers, I suspect the number of people who know how to do so are rather few. Or in the case of PGP, how many people know how to change the maximum path length? Or even understand the security implications of doing so?
Seen in the light of ignorant users, the Web of Trust is a UX disaster. Its entire security model is predicated on having users precisely specify how much they trust other people to trust others (ultimate, full, marginal, none, unknown) and also on having them continually do out-of-band verification procedures and publicly reporting those steps. In 1998, a seminal paper on the usability of a GUI for PGP encryption came to the conclusion that the UI was effectively unusable for users, to the point that only a third of the users were able to send an encrypted email (and even then, only with significant help from the test administrators), and a quarter managed to publicly announce their private keys at some point, which is pretty much the worst thing you can do. They also noted that the complex trust UI was never used by participants, although the failure of many users to get that far makes generalization dangerous [5]. While newer versions of security UI have undoubtedly fixed many of the original issues found (in no small part due to the paper, one of the first to argue that usability is integral, not orthogonal, to security), I have yet to find an actual study on the usability of the trust model itself.
The Web of Trust has other faults. The notion of "marginal" trust, it turns out, is rather broken: if you marginally trust a user who has two keys who both signed another person's key, that's the same as fully trusting a user with one key who signed that key. There are several proposals for different trust formulas [6], but none of them have caught on in practice to my knowledge.
A hidden fault is associated with its manner of presentation: in sharp contrast to CAs, the Web of Trust appears to not delegate trust, but any practical widespread deployment needs to solve the problem of contacting people who have had no prior contact. Combined with the need to bootstrap new users, this implies that there needs to be some keys that have signed a lot of other keys that are essentially default-trusted—in other words, a CA, a fact sometimes lost on advocates of the Web of Trust.
That said, a valid point in favor of the Web of Trust is that it more easily allows people to distrust CAs if they wish to. While I'm skeptical of its utility to a broader audience, the ability to do so is crucial for a not-insignificant portion of the population, and it's important enough to be explicitly called out.
X.509 certificates are most commonly discussed in the context of SSL/TLS connections, so I'll discuss them in that context as well, as the implications for S/MIME are mostly the same. Almost all criticism of this trust model essentially boils down to a single complaint: certificate authorities aren't trustworthy. A historical criticism is that the addition of CAs to the main root trust stores was ad-hoc. Since then, however, the main oligopoly of these root stores (Microsoft, Apple, Google, and Mozilla) have made their policies public and clear [7]. The introduction of the CA/Browser Forum in 2005, with a collection of major CAs and the major browsers as members, also helps in articulating common policies [8]. These policies, simplified immensely, boil down to:
You must verify information (depending on certificate type). This information must be relatively recent.
You must not use weak algorithms in your certificates (e.g., no MD5).
You must not make certificates that are valid for too long.
You must maintain revocation checking services.
You must have fairly stringent physical and digital security practices and intrusion detection mechanisms.
You must be [externally] audited every year that you follow the above rules.
If you screw up, we can kick you out.
I'm not going to claim that this is necessarily the best policy or even that any policy can feasibly stop intrusions from happening. But it's a policy, so CAs must abide by some set of rules.
Another CA criticism is the fear that they may be suborned by national government spy agencies. I find this claim underwhelming, considering that the number of certificates acquired by intrusions that were used in the wild is larger than the number of certificates acquired by national governments that were used in the wild: 1 and 0, respectively. Yet no one complains about the untrustworthiness of CAs due to their ability to be hacked by outsiders. Another attack is that CAs are controlled by profit-seeking corporations, which misses the point because the business of CAs is not selling certificates but selling their access to the root databases. As we will see shortly, jeopardizing that access is a great way for a CA to go out of business.
To understand issues involving CAs in greater detail, there are two CAs that are particularly useful to look at. The first is CACert. CACert is favored by many for its attempt to handle X.509 certificates in a Web of Trust model, so invariably every public discussion about CACert ends up devolving into an attack on other CAs for their perceived capture by national governments or corporate interests. Yet what many of the proponents for inclusion of CACert miss (or dismiss) is the fact that CACert actually failed the required audit, and it is unlikely to ever pass an audit. This shows a central failure of both CAs and Web of Trust: different people have different definitions of "trust," and in the case of CACert, some people are favoring a subjective definition (I trust their owners because they're not evil) when an objective definition fails (in this case, that the root signing key is securely kept).
The other CA of note here is DigiNotar. In July 2011, some hackers managed to acquire a few fraudulent certificates by hacking into DigiNotar's systems. By late August, people had become aware of these certificates being used in practice [9] to intercept communications, mostly in Iran. The use appears to have been caught after Chromium updates failed due to invalid certificate fingerprints. After it became clear that the fraudulent certificates were not limited to a single fake Google certificate, and that DigiNotar had failed to notify potentially affected companies of its breach, DigiNotar was swiftly removed from all of the trust databases. It ended up declaring bankruptcy within two weeks.
DigiNotar indicates several things. One, SSL MITM attacks are not theoretical (I have seen at least two or three security experts advising pre-DigiNotar that SSL MITM attacks are "theoretical" and therefore the wrong target for security mechanisms). Two, keeping the trust of browsers is necessary for commercial operation of CAs. Three, the notion that a CA is "too big to fail" is false: DigiNotar played an important role in the Dutch community as a major CA and the operator of Staat der Nederlanden. Yet when DigiNotar screwed up and lost its trust, it was swiftly kicked out despite this role. I suspect that even Verisign could be kicked out if it manages to screw up badly enough.
This isn't to say that the CA model isn't problematic. But the source of its problems is that delegating trust isn't a feasible model in the first place, a problem that it shares with the Web of Trust as well. Different notions of what "trust" actually means and the uncertainty that gets introduced as chains of trust get longer both make delegating trust weak to both social engineering and technical engineering attacks. There appears to be an increasing consensus that the best way forward is some variant of key pinning, much akin to how SSH works: once you know someone's public key, you complain if that public key appears to change, even if it appears to be "trusted." This does leave people open to attacks on first use, and the question of what to do when you need to legitimately re-key is not easy to solve.
In short, both CAs and the Web of Trust have issues. Whether or not you should prefer S/MIME or PGP ultimately comes down to the very conscious question of how you want to deal with trust—a question without a clear, obvious answer. If I appear to be painting CAs and S/MIME in a positive light and the Web of Trust and PGP in a negative one in this post, it is more because I am trying to focus on the positions less commonly taken to balance perspective on the internet. In my next post, I'll round out the discussion on email security by explaining why email security has seen poor uptake and answering the question as to which email security protocol is most popular. The answer may surprise you!
[1] Strictly speaking, you can bypass the sender's SMTP server. In practice, this is considered a hole in the SMTP system that email providers are trying to plug.
[2] I've had 13 different connections to the internet in the same time as I've had my main email address, not counting all the public wifis that I have used. Whereas an attacker would find it extraordinarily difficult to intercept all of my SSH sessions for a MITM attack, intercepting all of my email sessions is clearly far easier if the attacker were my email provider.
[3] Before you read too much into this personal choice of S/MIME over PGP, it's entirely motivated by a simple concern: S/MIME is built into Thunderbird; PGP is not. As someone who does a lot of Thunderbird development work that could easily break the Enigmail extension locally, needing to use an extension would be disruptive to workflow.
[4] This is not to say that I don't heavily research many of my other posts, but I did go so far for this one as to actually start going through a lot of published journals in an attempt to find information.
[5] It's questionable how well the usability of a trust model UI can be measured in a lab setting, since the observer effect is particularly strong for all metrics of trust.
[6] The web of trust makes a nice graph, and graphs invite lots of interesting mathematical metrics. I've always been partial to eigenvectors of the graph, myself.
[7] Mozilla's policy for addition to NSS is basically the standard policy adopted by all open-source Linux or BSD distributions, seeing as OpenSSL never attempted to produce a root database.
[8] It looks to me that it's the browsers who are more in charge in this forum than the CAs.
[9] To my knowledge, this is the first—and so far only—attempt to actively MITM an SSL connection.
We just released the second beta of Thunderbird 31. Please help us improve Thunderbird quality by uncovering bugs now in Thunderbird 31 beta so that developers have time to fix them.
- Use Thunderbird 31 beta to do formal testing. Use the moztrap testing system to run tests: choose "run tests", find the Thunderbird product, and choose the 31 test run.
This post is part 6 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. This part discusses how email security works in practice.
Email security is a rather wide-ranging topic, and one that I've wanted to cover for some time, well before several recent events that have made it come up in the wider public knowledge. There is no way I can hope to cover it in a single post (I think it would outpace even the length of my internationalization discussion), and there are definitely parts for which I am underqualified, as I am by no means an expert in cryptography. Instead, I will be discussing this over the course of several posts of which this is but the first; to ease up on the amount of background explanation, I will assume passing familiarity with cryptographic concepts like public keys, hash functions, as well as knowing what SSL and SSH are (though not necessarily how they work). If you don't have that knowledge, ask Wikipedia.
Before discussing how email security works, it is first necessary to ask what email security actually means. Unfortunately, the layman's interpretation is likely going to differ from the actual precise definition. Security is often treated by laymen as a boolean interpretation: something is either secure or insecure. The most prevalent model of security to people is SSL connections: these allow the establishment of a communication channel whose contents are secret to outside observers while also guaranteeing to the client the authenticity of the server. The server often then gets authenticity of the client via a more normal authentication scheme (i.e., the client sends a username and password). Thus there is, at the end, a channel that has both secrecy and authenticity [1]: channels with both of these are considered secure and channels without these are considered insecure [2].
In email, the situation becomes more difficult. Whereas an SSL connection is between a client and a server, the architecture of email is such that email providers must be considered as distinct entities from end users. In addition, messages can be sent from one person to multiple parties. Thus secure email is a more complex undertaking than just porting relevant details of SSL. There are two major cryptographic implementations of secure email [3]: S/MIME and PGP. In terms of implementation, they are basically the same [4], although PGP has an extra mode which wraps general ASCII (known as "ASCII-armor"), which I have been led to believe is less recommended these days. Since I know the S/MIME specifications better, I'll refer specifically to how S/MIME works.
S/MIME defines two main MIME types: multipart/signed, which contains the message text as a subpart followed by data indicating the cryptographic signature, and application/pkcs7-mime, which contains an encrypted MIME part. The important things to note about this delineation are that only the body data is encrypted [5], that it's theoretically possible to encrypt only part of a message's body, and that the signing and encryption constitute different steps. These factors combine to make for a potentially infuriating UI setup.
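For reference, a signed (but not encrypted) S/MIME message has a structure along these lines; the boundary and the base64 payload are illustrative:

Content-Type: multipart/signed; micalg=sha-256;
 protocol="application/pkcs7-signature"; boundary="boundary42"

--boundary42
Content-Type: text/plain

This is the message text that gets signed.

--boundary42
Content-Type: application/pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64

MIIB...(base64-encoded CMS signature)...
--boundary42--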
How does S/MIME tackle the challenges of encrypting email? First, rather than encrypting using recipients' public keys, the message is encrypted with a symmetric key. This symmetric key is then encrypted with each of the recipients' keys and then attached to the message. Second, by only signing or encrypting the body of the message, the transit headers are kept intact for the mail system to retain its ability to route, process, and deliver the message. The body is supposed to be prepared in the "safest" form before transit to avoid intermediate routers munging the contents. Finally, to actually ascertain what the recipients' public keys are, clients typically passively pull the information from signed emails. LDAP, unsurprisingly, contains an entry for a user's public key certificate, which could be useful in large enterprise deployments. There is also work ongoing right now to publish keys via DNS and DANE.
I mentioned before that S/MIME's use can present some interesting UI design decisions. I ended up actually testing some common email clients on how they handled S/MIME messages: Thunderbird, Apple Mail, Outlook [6], and Evolution. In my attempts to create a surreptitious signed part to confuse the UI, Outlook decided that the message had no body at all, and Thunderbird decided to ignore all indication of the existence of said part. Apple Mail managed to claim the message was signed in one of these scenarios, and Evolution took the cake by always agreeing that the message was signed [7]. It didn't even bother questioning the signature if the certificate's identity disagreed with the easily-spoofable From address. I was actually surprised by how well people did in my tests—I expected far more confusion among clients, particularly since the will to maintain S/MIME has clearly been relatively low, judging by poor support for "new" features such as triple-wrapping or header protection.
Another fault of S/MIME's design is that it rests on the mistaken belief that composing a signing step and an encryption step is equivalent in strength to a simultaneous sign-and-encrypt. Another page describes this in far better detail than I have room to; note that this flaw is fixed via triple-wrapping (which has relatively poor support). This creates yet more UI burden in how to adequately describe in UI all the various minutiae in differing security guarantees. Considering that users already have a hard time even understanding that just because a message says it's from example@isp.invalid doesn't actually mean it's from example@isp.invalid, trying to develop UI that both adequately expresses the security issues and is understandable to end-users is an extreme challenge.
What we have in S/MIME (and PGP) is a system that allows for strong guarantees, if certain conditions are met, yet is also vulnerable to breaches of security if the message handling subsystems are poorly designed. Hopefully this is a sufficient guide to the technical impacts of secure email in the email world. My next post will discuss the most critical component of secure email: the trust model. After that, I will discuss why secure email has seen poor uptake and other relevant concerns on the future of email security.
[1] This is a bit of a lie: a channel that does secrecy and authentication at different times isn't as secure as one that does them at the same time.
[2] It is worth noting that authenticity is, in many respects, necessary to achieve secrecy.
[3] This, too, is a bit of a lie. More on this in a subsequent post.
[4] I'm very aware that S/MIME and PGP use radically different trust models. Trust models will be covered later.
[5] S/MIME 3.0 did add a provision stating that if the signed/encrypted part is a message/rfc822 part, the headers of that part should override the outer message's headers. However, I am not aware of a major email client that actually handles these kind of messages gracefully.
[6] Actually, I tested Windows Live Mail instead of Outlook, but given the presence of an official MIME-to-Microsoft's-internal-message-format document which seems to agree with what Windows Live Mail was doing, I figure their output would be identical.
[7] On a more careful examination after the fact, it appears that Evolution may have tried to indicate signedness on a part-by-part basis, but the UI was sufficiently confusing that ordinary users are going to be easily confused.
We just released the first beta of Thunderbird 30. There will be two betas for 30 and probably 2 or more for 31. We need to start uncovering bugs now so that developers have time to fix things.
Now is the time to get the betas, use them as you do the current release, and file bugs. Make these bugs block our tracking bug: 1008543.
For the next beta we will need more people to do formal testing - we will use moztrap and eventbrite to track this. The more participants in this (and in others during the 31 beta period), the higher the quality. Follow this blog or subscribe to the Thunderbird-tester mailing list if you wish to make 31 a great release.
beets is the extensible music database tool every programmer with a music collection has dreamed of writing. At its simplest it’s a clever tagger that can normalize your music against the MusicBrainz database and then store the results in a searchable SQLite database. But with plugins it can fetch album art, use the Discogs music database for tagging too, calculate ReplayGain values for all your music, integrate meta-data from The Echo Nest, etc. It even has a Music Player Daemon server-mode (bpd) and a simple HTML interface (web) that lets you search for tracks and play them in your browser using the HTML5 audio tag.
I’ve tried a lot of music players through the years (alphabetically: amarok, banshee, exaile, quodlibet, rhythmbox). They all are great music players and (at least!) satisfy the traditional Artist/Album/Track hierarchy use-case, but when you exceed 20,000 tracks and you have a lot of compilation CDs, that frequently ends up not being enough. Extending them usually turned out to be too hard / not fun enough, although sometimes it was just a question of time and seeking greener pastures.
But enough context; if you’re reading my blog you probably are on board with the web platform being the greatest platform ever. The notable bits of the implementation are:
Server-wise, it’s a mash-up of beets’ MPD-alike plugin bpd and its web plugin. Rather than needing to speak the MPD protocol over TCP to get your server to play music, you can just hit it with an HTTP POST and it will enqueue and play the song. Server-sent events/EventSource are used to let the web UI hypothetically update as things happen on the server. Right now the client can indeed tell the server to play a song and hear an update via the EventSource channel, but there’s almost certainly a resource leak on the server-side and there’s a lot more web/bpd interlinking required to get it reliable. (Python’s Flask is neat, but I’m not yet clear on how to properly manage the life-cycle of a long-lived request that only dies when the connection dies since I’m seeing the appcontext get torn down even before the generator starts running.)
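For readers who haven’t poked at Flask or server-sent events before, here is a minimal sketch of the general shape described above. It is not the actual webpd code; the route names, the broadcast helper, and the missing player hookup are all invented for illustration.

```python
# Minimal sketch of the approach described above. This is NOT the actual
# beets web/bpd ("webpd") code; the route names, the broadcast helper, and
# the absent player hookup are invented for illustration.
import json
import queue

from flask import Flask, Response

app = Flask(__name__)
listeners = []  # one queue per connected EventSource client


def broadcast(event):
    """Push an event to every connected client."""
    for q in listeners:
        q.put(event)


@app.route("/item/<int:item_id>/play", methods=["POST"])
def play(item_id):
    # A real server would hand the track to the player backend here.
    broadcast({"playing": item_id})
    return "", 204


@app.route("/events")
def events():
    q = queue.Queue()
    listeners.append(q)  # never removed: the same sort of leak mentioned above

    def stream():
        # Server-sent events: each message is a "data: ...\n\n" chunk.
        while True:
            yield "data: %s\n\n" % json.dumps(q.get())

    return Response(stream(), mimetype="text/event-stream")
```

The client would POST to the play route and open an EventSource against the events route to hear about playback changes.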
The client is implemented in Polymer on top of some simple backbone.js collections that build on the existing logic from the beets web plugin.
The artist list uses the polymer-virtual-list element which is important if you’re going to be scrolling through a ton of artists. The implementation is page-based; you tell it how many pages you want and how many items are on each page. As you scroll it fires events that compel you to generate the appropriate page. It’s an interesting implementation:
Pages are allowed to be variable height and therefore their contents are too, although a fixedHeight mode is also supported.
In variable-height mode, scroll offsets are translated to page positions by guessing the page based on the height of the first page and then walking up/down from there based on cached page sizes until the right page is found. If information is missing because the user managed to trigger a huge jump, extrapolation is performed based on the average item size from the first page (roughly as in the sketch after this list).
Any changes to the contents of the list regrettably require discarding all existing pages/bindings. At this time there is no way to indicate a splice at a certain point that should simply result in a displacement of the existing items.
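As a rough, language-agnostic sketch of that variable-height lookup (this is not polymer-virtual-list’s actual code, and all the names are invented), the strategy might look something like this:

```python
# Rough sketch of the variable-height lookup described above -- this is not
# polymer-virtual-list's actual code, and all names are invented.
def page_for_offset(offset, page_heights, first_page_height):
    """Translate a scroll offset into a page index.

    page_heights maps already-measured page indices to their pixel heights;
    pages we have never measured are extrapolated from first_page_height
    (the "average item size from the first page" fallback)."""
    def height(page):
        return page_heights.get(page, first_page_height)

    # Initial guess: pretend every page is as tall as the first one.
    page = int(offset // first_page_height)
    top = sum(height(p) for p in range(page))  # estimated top of guessed page

    # Walk up or down from the guess until the offset falls inside a page.
    while offset < top:
        page -= 1
        top -= height(page)
    while offset >= top + height(page):
        top += height(page)
        page += 1
    return page
```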
Albums are loaded in batches from the server and artists dynamically derived from them. Although this would allow for the UI to update as things are retrieved, the virtual-list invalidation issue concerned me enough to have the artist-list defer initialization until all albums are loaded. On my machine a couple thousand albums load pretty quickly, so this isn’t a huge deal.
There’s filtering by artist name and by the number of albums in the database by that artist, built on backbone-filtered-collection. The latter is important to me because scrolling through hundreds of artists where I might only own one CD, or not even one whole CD, is annoying. (Although that is somewhat addressed already by only using the albumartist for the artist generation, so various-artists compilations don’t clutter things up.)
If you click on an artist it plays the first track (numerically) from the first album (alphabetically) associated with the artist. This does limit the songs you can listen to somewhat…
visualizations are done using d3.js; one svg per visualization
“What’s with all those tastefully chosen colors?” is what you are probably asking yourself. The answer? Two things!
A visualization of albums/releases in the database by time, heat-map style.
We bin all of the albums that beets knows about by year. In this case we assume that 1980 is the first interesting year and so put 1979 and everything before it (including albums without a year) in the very first bin on the left. The current year is the rightmost bucket.
We vertically divide the albums into “albums” (red), “singles” (green), and “compilations” (blue). This is accomplished by taking the MusicBrainz release group types and mapping them down to our smaller space, roughly as sketched below.
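Purely as an illustration of that binning step (not the actual visualization code; the year and albumtype field names follow beets’ album fields, and the exact type mapping here is a guess):

```python
# Illustration of the binning only, not the real visualization code. The
# year/albumtype field names follow beets' album fields, and the exact
# type mapping below is a guess at "mapping down to our smaller space".
from collections import Counter
from datetime import date

FIRST_BIN = 1979  # 1979 and everything earlier (or unknown) lands here
TYPE_MAP = {
    "album": "album",
    "single": "single",
    "ep": "single",
    "compilation": "compilation",
    "soundtrack": "compilation",
}


def bin_albums(albums):
    """albums: iterable of objects with .year and .albumtype attributes."""
    counts = Counter()
    this_year = date.today().year
    for album in albums:
        year = album.year or 0
        year = max(FIRST_BIN, min(year, this_year))  # clamp into [1979, now]
        kind = TYPE_MAP.get((album.albumtype or "").lower(), "album")
        counts[(year, kind)] += 1  # one cell of the heat map
    return counts
```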
the x-axis is “danceability”. Things to the left are less danceable. Things to the right are more danceable.
the y-axis is “valence” which they define as “the musical positiveness conveyed by a track”. Things near the top are sadder, things near the bottom are happier.
the colors are based on the type of album the track is from. The idea was that singles tend to have remixes on them, so it’s interesting if we always see a big cluster of green remixes to the right. (The per-track mapping is sketched after this list.)
tracks without the relevant data all end up in the upper-left corner. There are a lot of these. The Echo Nest is extremely generous in allowing non-commercial use of their API, but they limit you to 20 requests per minute, and at this point the beets echonest plugin needs to upload (transcoded) versions of all my tracks since my music collection is apparently more esoteric than what the servers already have fingerprints for.
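The per-track point mapping referenced above could be sketched like this; again it is illustrative only, since the real mapping lives in the d3.js client, and the attribute and color names are simply the ones mentioned in the list:

```python
# Again illustrative only: the real mapping lives in the d3.js client. The
# danceability/valence attribute names and the red/green/blue colors are
# the ones described in the list above.
COLORS = {"album": "red", "single": "green", "compilation": "blue"}


def track_point(track, album_kind):
    """Map a track to (x, y, color) in raw SVG coordinates."""
    dance = getattr(track, "danceability", None)
    valence = getattr(track, "valence", None)
    if dance is None or valence is None:
        return (0.0, 0.0, "gray")  # missing data piles up in the upper left
    # Small y is the top of the SVG, so low-valence (sad) tracks sit higher.
    return (dance, valence, COLORS.get(album_kind, "red"))
```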
Together these visualizations let us infer:
Madonna is more dancey than Morrissey! Shocking, right?
I bought the Morrissey singles box sets. And I got ripped off because there’s a distinct lack of green dots over on the right side.
Code is currently in the webpd branch of my beets fork although I should probably try and split it out into a separate repo. You need to enable the webpd plugin like you would any other plugin for it to work. There’s still a lot lot lot more work to be done for it to be usable, but I think it’s neat already. It definitely works in Firefox and Chrome.
Previously, I've been developing JSMime as a subdirectory within comm-central. However, after discussions with other developers, I have moved the official repository of record for JSMime out of comm-central and into its own repository, now found on GitHub. The repository has been cleaned up and the feature set for version 0.2 has been selected, so the current tip of JSMime (also the initial version) is version 0.2. This contains the feature set I imported into Thunderbird's source code last night, which is to say support for parsing MIME messages into the MIME tree, as well as support for parsing and encoding email address headers.
Thunderbird doesn't actually use the new code quite yet (my current tree is stuck on a mozilla-central build error, so I haven't had time to run those patches through a last-minute sanity check before requesting review), but the intent is to replace the current C++ implementations of nsIMsgHeaderParser and nsIMimeConverter with JSMime. Once those are done, I will be moving forward with my structured header plans, which more or less ought to make those interfaces obsolete.
Within JSMime itself, the pieces which I will be working on next will be rounding out the implementation of header parsing and encoding support (I have prototypes for Date headers and the infernal RFC 2231 encoding that Content-Disposition needs), as well as support for building MIME messages from their constituent parts (a feature which would be greatly appreciated in the depths of compose and import in Thunderbird). I also want to implement full IDN and EAI support, but that's hampered by the lack of a JS implementation I can use for IDN (yes, there's punycode.js, but that doesn't do StringPrep). The important task of converting the MIME tree to a list of body parts and attachments is something I do want to work on as well, but I've vacillated on the implementation here several times and I'm not sure I've found one I like yet.
JSMime, as its name implies, tries to work in as pure JS as possible, augmented with several web APIs as necessary (such as TextDecoder for charset decoding). I'm using ES6 as the base here, because it gives me several features I consider invaluable for this implementation: Promises, Map, generators, let. This means it can run on an unprivileged web page; I test JSMime using Firefox nightlies and the Firefox debugger where necessary. Unfortunately, it only really works in Firefox at the moment because V8 doesn't support many ES6 features yet (such as destructuring, which is annoying but simple enough to work around, or Map iteration, which is completely necessary for the code). I'm not opposed to changing it to make it work on Node.js or Chrome, but I don't realistically have the time to spend doing it myself; if someone else has the time, please feel free to contact me or send patches.
…unless you're an expert at assembly, that is. The title of this post was obviously meant to be an attention-grabber, but it is much truer than you might think: poorly written assembly code will probably be slower than what an optimizing compiler produces from well-written code (note that you may need to help the compiler along for things like vectorization). Now why is this?
Modern microarchitectures are incredibly complex. A modern x86 processor is superscalar, and to achieve that it internally translates instructions into micro-operations. Desktop processors will undoubtedly have multiple instruction issue per cycle, some form of register renaming, branch predictors, and so on. Minor changes (a misaligned instruction stream, a poor ordering of instructions, a bad instruction choice) can kill the ability to take advantage of these features. There are very few people who can accurately predict the performance of a given assembly stream (I myself wouldn't attempt it if the architecture can take advantage of ILP), and those people are disproportionately likely to be working on compiler optimizations. So unless you're knowledgeable enough about assembly to help work on a compiler, you probably shouldn't be hand-coding assembly to make code faster.
To give an example to elucidate this point (and the motivation for this blog post in the first place), I was given a link to an implementation of the N-queens problem in assembly. For various reasons, I decided to use this to start building a fine-grained performance measurement system. This system uses a high-resolution monotonic clock on Linux and runs the function 1000 times to warm up caches and counters, then runs the function 1000 more times, measuring each run independently and reporting the average runtime at the end. This is a single execution of the system; 20 executions of the system were used as the baseline for a t-test to determine statistical significance, as well as for visual estimation of the normality of the data. Since the runs showed a roughly constant 1-2 μs of noise, I ran all of my numbers on the 10-queens problem to better separate the data (total runtimes ended up being in the range of 200-300μs at this level). When I say that some versions are faster, the p-values for individual measurements are on the order of 10⁻²⁰, meaning there is roughly a 1-in-100,000,000,000,000,000,000 chance that the observed speedups could be produced if the programs actually took the same amount of time to run.
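The harness itself isn’t reproduced in this post; purely as an illustration of the measurement protocol (the actual system presumably timed the native functions directly, not Python), it boils down to something like:

```python
# A sketch of the measurement protocol only; the real harness presumably
# timed the native C++/assembly functions directly rather than Python.
import statistics
import time


def measure(fn, warmup=1000, runs=1000):
    """One 'execution' of the system: warm up, then time each run."""
    for _ in range(warmup):  # warm up caches, branch predictors, counters
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()  # high-resolution monotonic clock
        fn()
        samples.append(time.perf_counter_ns() - start)
    return statistics.mean(samples)  # average runtime for this execution


# 20 such executions per program variant feed a t-test (e.g.
# scipy.stats.ttest_ind) to decide whether a speedup is significant.
baseline = [measure(lambda: sum(range(1000))) for _ in range(20)]
```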
The initial assembly version of the program took about 288μs to run. The first C++ version I coded, based on the same algorithm that the author of the assembly version used, ran in 275μs. A recursive program beat out a hand-written assembly block of code... and when I manually converted the recursive program into a single loop, the runtime improved to 249μs. It wasn't until I got rid of all of the assembly in the original code that I could get the program to beat the derecursified code (at 244μs), so it's not the vectorization that's causing the code to be slow. Intrigued, I started to analyze why the original assembly was so slow.
It turns out that there are three main things that I think cause the slow speed of the original code. The first one is the alignment of branches: the assembly code contains no instructions to align basic blocks on particular branches, whereas gcc happily emits these for some basic blocks. I mention this first as it is mere conjecture; I never made an attempt to measure the effect for myself. The other two causes were directly measured by observing runtime changes as I slowly replaced the assembly with C++ code. When I replaced the use of push and pop instructions with a global static array, the runtime improved dramatically. This suggests that the alignment of the stack could be to blame (although the stack was still 8-byte aligned when I checked via gdb), which just goes to show how much alignment really does matter in code.
The final, and by far most dramatic, effect I saw involves the use of three assembly instructions: bsf (find the index of the lowest bit that is set), btc (clear a specific bit index), and shl (left shift). When I replaced the use of these instructions with the more complicated expressions int bit = x & -x and x = x - bit, the program's speed improved dramatically. The rationale for why the speed improved won't be found in latency tables, although those will tell you that bsf is not a 1-cycle operation. Rather, it lies in minutiae that aren't immediately obvious.
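For readers who haven’t seen the trick before, here is what those replacement expressions accomplish, shown in Python purely for illustration (the converted program itself is C++):

```python
# What the replacement expressions accomplish, shown in Python purely for
# illustration (the converted program itself is C++).
def set_bit_indices(x):
    """Iterate over the set bits of x, lowest first, without bsf/btc/shl."""
    while x:
        bit = x & -x               # isolate the lowest set bit (two's complement)
        yield bit.bit_length() - 1  # its index, if you actually need it
        x -= bit                   # clear that bit; the loop ends when x == 0


print(list(set_bit_indices(0b10110)))  # [1, 2, 4]
```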
The original program used the fact that bsf sets the zero flag if the input register is 0 as the condition for backtracking; the converted code just checked whether the value was 0 (using a simple test instruction). The compare and the jump instructions are essentially fused into a single operation inside the processor. In contrast, the bsf does not get to do this; combined with the higher intrinsic latency of the instruction, it means that empty loops take a lot longer to do nothing. The use of an 8-bit shift value is also interesting, as there is a rather severe penalty for using 8-bit registers in Intel processors as far as I can see.
Now, this isn't to say that the compiler will always produce the best code by itself. My final code wasn't above using x86 intrinsics for the vector instructions. Replacing the _mm_andnot_si128 intrinsic with an actual and-not on vectors caused gcc to use other, slower instructions instead of the vmovq to move the result out of the SSE registers for reasons I don't particularly want to track down. The use of the _mm_blend_epi16 and _mm_srli_si128 intrinsics can probably be replaced with __builtin_shuffle instead for more portability, but I was under the misapprehension that this was a clang-only intrinsic when I first played with the code so I never bothered to try that, and this code has passed out of my memory long enough that I don't want to try to mess with it now.
In short, compilers know things about optimizing for modern architectures that many general programmers don't. Compilers may have issues with autovectorization, but the existence of vector intrinsics allows you to force compilers to use vectorization while still giving them leeway to make decisions about instruction scheduling or code alignment, which are easy to screw up in hand-written assembly. Also, compilers are liable to get better in the future, whereas hand-written assembly code is unlikely to get faster over time. So only write assembly code if you really know what you're doing and you know you're better than the compiler.