Mike Hommey: Announcing git-cinnabar 0.6.0rc1

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull, and push from/to mercurial remote repositories, using git.

Get it on GitHub.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.10?

  • Full rewrite of git-cinnabar in Rust.
  • Push performance is between twice and 10 times faster than 0.5.x, depending on scenarios.
  • Based on git 2.38.0.
  • git cinnabar fetch now accepts a --tags flag to fetch tags.
  • git cinnabar bundle now accepts a -t flag to give a specific bundlespec.
  • git cinnabar rollback now accepts a --candidates flag to list the metadata sha1 that can be used as target of the rollback.
  • git cinnabar rollback now also accepts a --force flag to allow any commit sha1 as metadata.
  • git cinnabar now has a self-update subcommand that upgrades it when a new version is available. The subcommand is only available when building with the self-update feature (enabled on prebuilt versions of git-cinnabar).

This Week In Rust: This Week in Rust 462

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Newsletters
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week’s crate is serde-transcode, a crate to efficiently convert between various serde-supporting formats.

Thanks to Kornel for the suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

347 pull requests were merged in the last week

Rust Compiler Performance Triage

Overall a fairly quiet week in terms of new changes; the majority of the delta this week was due to reverting #101620, which was a regression noted in last week's report.

Triage done by @simulacrum. Revision range: 8fd6d03e2..d9297d22

2 Regressions, 7 Improvements, 3 Mixed; 3 of them in rollups. 53 artifact comparisons made in total.

Full report here

Call for Testing

An important step for RFC implementation is for people to experiment with the implementation and give feedback, especially before stabilization. The following RFCs would benefit from user testing before moving forward:

  • No RFCs issued a call for testing this week.

If you are a feature implementer and would like your RFC to appear on the above list, add the new call-for-testing label to your RFC along with a comment providing testing instructions and/or guidance on which aspect(s) of the feature need testing.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week, the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
  • No RFCs entered Final Comment Period this week.
Tracking Issues & PRs
New and Updated RFCs

Upcoming Events

Rusty Events between 2022-09-28 - 2022-10-26 🦀

Virtual
Asia
Europe
North America
Oceania

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Jobs

Please see the latest Who's Hiring thread on r/rust

Quote of the Week

Semver has its philosophy, but a pragmatic approach to versioning is:

<upgrades may break API> . <downgrades may break API> . <fine either way>

Kornel on rust-users

Thanks to Artem Borisovskiy for the suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, cdmistman, ericseppanen, extrawurst, andrewpollack, U007D, kolharsam, joelmarcey, mariannegoldin, bennyvasquez.

Email list hosting is sponsored by The Rust Foundation

Discuss on r/rust

William Lachance: Using Sphinx in a Monorepo

Just wanted to type up a couple of notes about working with Sphinx (the python documentation generator) inside a monorepo, an issue I’ve been struggling with (off and on) at Voltus since I started. I haven’t seen much written about this topic despite (I suspect) it being a reasonably frequent problem.

In general, there’s a lot to like about Sphinx: it’s great at handling deeply nested trees of detailed documentation with cross-references inside a version control system. It has local search that works pretty well and some themes (like readthedocs) scale pretty nicely to hundreds of documents. The directives and roles system is pretty flexible and covers most of the common things one might want to express in technical documentation. And if the built-in set of functionality isn’t enough, there’s a wealth of third party extension modules. My only major complaint is that it uses the somewhat obscure reStructuredText file format by default, but you can get around that by using the excellent MyST extension.

Unfortunately, it has a pretty deeply baked in assumption that all documentation for your project lives inside a single subfolder. This is fine for a small repository representing a single python module, like this:

<root>
README.md
setup.cfg
pyproject.toml
mymodule/
docs/

However, this doesn’t work for a large monorepo, where you would typically see something like:

<root>/module-1/submodule-a
<root>/module-1/submodule-b
<root>/module-2/submodule-c
...

In a monorepo, you usually want to include a module’s documentation inside its own directory. This allows you to use your code ownership constraints for documentation, among other things.

The naive solution would be to create a sphinx site for every single one of these submodules. This is what happened at Voltus and I don’t recommend it. For a large monorepo you’ll end up with dozens, maybe hundreds of documentation “sites”. Under this scenario, discoverability becomes a huge problem: no longer can you rely on tables of contents and the built-in search to discover content: you just have to “know” where things live. I’m more than nine months in here and I’m still discovering new documentation.

It would be much better if we could somehow collect documentation from other parts of the repository into a single site. Is this possible? tl;dr: Yes. There are a few solutions, each with their own pros and cons.

The obvious solution that doesn’t work

The most obvious solution here is to create a symbolic link inside your documentation directory, say, the following:

<root>/docs/
<root>/docs/module-1/submodule-a -> <root>/module-1/submodule-a/docs

Unfortunately, this doesn’t work. ☹️ Sphinx doesn’t follow symbolic links.

Solution 1: Just copy the files in

The next most obvious solution is to just copy the files from various parts of the monorepo into place, as part of the build system. Mozilla did this for Firefox, with the moztreedocs system.

The results look pretty good, but this is a bespoke solution. Aside from general ideas, there’s no way I’m going to be able to apply anything in moztreedocs to Voltus’s monorepo (which is based on a completely different build system). And being honest, I’m not sure if the 40+ hour (estimated) effort to reimplement it would be a good use of time compared to other things I could be doing.

Solution 2: Use the include directive with MyST

Later versions of MyST include support for directly importing a markdown file from another part of the repository.

This is a limited form of embedding: it won’t let you import an entire directory of markdown files. But if your submodules mostly just include content in the form of a README.md (or similar), it might just be enough. Just create a directory for these files to live in (say, services) and slot them in:

<root>/docs/services/module-1/submodule-a/index.md:

```{include} ../../../module-1/submodule-a/README.md
```

I’m currently in the process of implementing this solution inside Voltus. I have optimism that this will be a big (if incremental) step up over what we have right now. There are obviously limits, but you can cram a lot of useful information in a README. As a bonus, it’s a pretty nice marker for those spelunking through the source code (much more so than a forest of tiny documentation files).

Solution 3: Sphinx Collections

This one I just found out about today: Sphinx Collections is a small python module that lets you automatically import entire directories of files into your sphinx tree, under a _collections module. You configure it in your top-level conf.py like this:

extensions = [
    ...
    "sphinxcontrib.collections"
]

collections = {
    "submodule-a": {
        "driver": "symlink",
        "source": "/monorepo/module-1/submodule-a/docs",
        "target": "submodule-a"
    },
    ...
}

After setting this up, submodule-a is now available under _collections and you can include it in your table of contents like this:

...

```{toctree}
:caption: submodule-a

_collections/submodule-a/index.md
```

...

At this point, submodule-a’s documentation should be available under http://<my doc domain>/_collections/submodule-a/index.html

Pretty nifty. The main downside I’ve found so far is that this doesn’t play nicely with the Edit on GitHub links that the readthedocs theme automatically inserts (it thinks the files exist under _collections), but there’s probably a way to work around that.

I plan on investigating this approach further in the coming months.

Tantek Çelik: W3C TPAC 2022 Sustainability Community Group Meeting

This year’s W3C TPAC Plenary Day was a combination of the first ever AC open session in the early morning, and breakout sessions in the late morning and afternoon. Nick Doty proposed a breakout session for Sustainability for the Web and W3C which he & I volunteered to co-chair, as co-chairs of the Sustainability (s12y) CG which we created on Earth Day earlier this year. Nick & I met during a break on Wednesday afternoon and made plans for how we would run the session as a Sustainability CG meeting, which topics to introduce, how to deal with unproductive participation if any, and how to focus the latter part of the session into follow-up actions.

We agreed that our primary role as chairs should be facilitation. We determined a few key meeting goals, in particular to help participants:

  • Avoid/minimize any trolling or fallacy arguments (based on experience from 2021)
  • Learn who is interested in which sustainability topics & work areas
  • Determine clusters of similar, related, and overlapping sustainability topics
  • Focus on prioritizing actual sustainability work rather than process mechanics
  • Encourage active collaboration in work areas (like a do-ocracy)

The session went better than I expected. The small meeting room was packed with ~20 participants, with a few more joining us on Zoom (which thankfully worked without any issues, thanks to the W3C staff for setting that up so all we had to do as chairs was push a button to start the meeting!).

I am grateful for everyone’s participation and more importantly the shared sense of collaboration, teamwork, and frank urgency. It was great to meet & connect in-person, and see everyone on video who took time out of their days across timezones to join us. There was a lot of eagerness in participation, and Nick & I did our best to give everyone who wanted to speak time to contribute (the IRC bot Zakim's two minute speaker timer feature helped).

It was one of the more hopeful meetings I participated in all week. Thanks to Yoav Weiss for scribing the minutes. Here are a few of the highlights.

Session Introduction

Nick introduced himself and proposed topics of discussion for our breakout session.

  • How we can apply sustainability to web standards
  • Goals we could work on as a community
  • Consider metrics to enable other measures to take effect
  • Measure the impact of the W3C meetings themselves
  • Working mode and how we talk about sustainability in W3C
  • Horizontal reviews

I introduced myself and my role at Mozilla as one of our Environmental Champions, and noted that it’s been three years since we had the chance to meet in person at TPAC. Since then many of us who participate at W3C have recognized the urgency of sustainability, especially as underscored by recent IPCC reports. From the past few years of publications & discussions:

For our TPAC 2022 session, I asked that we proceed with the assumption of sustainability as a principle, and that if folks came to argue with that, that they should raise an issue with the TAG, not this meeting.

In the Call for Participation in the Sustainability Community Group, we highlighted both developing a W3C practice of Sustainability (s12y) Horizontal Review (similar to a11y, i18n, privacy, security) as proposed at TPAC 2021, and an overall venue for participants to discuss all aspects of sustainability with respect to web technologies present & future. For our limited meeting time, I asked participants to share how they want to have the biggest impact on sustainability at W3C, with the web in general, and actively prioritize our work accordingly.

Work Areas, Groups, Resources

Everyone took turns introducing themselves and expressing which aspects of sustainability were important to them, noting any particular background or applicable expertise, as well as which other W3C groups they are participating in, as opportunities for liaison and collaboration. Several clusters of interest emerged:

  • Technologies to reduce energy usage
  • W3C meetings and operations
  • Measurement
  • System Effects
  • Horizontal Review
  • Principles

The following W3C groups were noted as either already working on sustainability-related efforts or as good candidates for collaboration, and all of them except the TAG had a group co-chair in the meeting!

I proposed adding a liaisons section to our public Sustainability wiki page, explicitly listing these groups and specific items for collaboration. Participants also shared the following links to additional efforts & resources:

Sustainability Work In Public By Default

Since all our work on sustainability is built on a lot of public work by others, the best chance of our work having an impact is to also do it publicly. I therefore proposed that the Sustainability CG work in public by default, as well as sustainability work at W3C in general, and that we send that request to the AB to advise W3C accordingly. The proposal was strongly supported with no opposition.

Active Interest From Organizations

There were a number of organizations whose representatives indicated that they are committed to making a positive impact on the environment, and would like to work on efforts accordingly in the Sustainability CG, or would at least see if they could contact experts at their organizations to see if any of them were interested in contributing.

  • Igalia
  • mesur.io
  • Mozilla
  • Lawrence Berkeley National Laboratory
  • Washington Post

Meeting Wrap-up And Next Steps

We finished up the meeting with participants signing up to work on each of the work areas (clusters of interest noted above) that they were personally interested in working on. This has been captured on our wiki: W3C Wiki: Sustainability Work Areas.

The weekend after the meeting I wrote up an email summary of the meeting & next steps and sent it directly to those who were present at the meeting, encouraging them to Join the Sustainability Community Group (requires a W3C account) for future emails and updates. Nick & I are also on the W3C Community Slack #sustainability channel which I recommended joining. Signup link: https://www.w3.org/slack-w3ccommunity-invite

Next Steps: we encouraged everyone signed up for a Work Area to reach out to each other directly and determine their preferred work mode, including in which venue they’d like to do the work, whether in the Sustainability CG, another CG, or somewhere else. We noted that work on sustainable development & design of web sites in particular should be done directly with the Sustainable Web Design CG (sustyweb), “a community group dedicated to creating sustainable websites”.

Some possibilities for work modes that Work Area participants can use:

  • W3C Community Slack #sustainability channel
  • public-sustainability email list of the Sustainability CG
  • Our Sustainability wiki page, creating "/" subpages as needed

There is lots of work to do across many different areas for sustainability & the web, and for technology as a whole, which lends itself to small groups working in parallel. Nick & I want to help facilitate those that have the interest, energy, and initiative to do so. We are available to help Work Area participants pick a work mode & venue that will best meet their needs and help them get started on their projects.

The Talospace Project: Firefox 105 on POWER

Firefox 105 is out. No, it's not your imagination: I ended up skipping a couple versions. I wasn't able to build Firefox 103 because gcc 12 in Fedora 36 caused weird build failures until it was finally fixed; separately, building 104 and working more on the POWER9 JavaScript JIT got delayed because I'd finally had it with the performance issues and breakage in GNOME 42 and took a couple weeks renovating Plasma so I could be happy with my desktop environment again. Now, with both of those concerns largely resolved, everything is hopefully maintainable again, my workflows are properly restored, and we're back to the grind.

Unfortunately, we have a couple new ones. Debug builds broke in Fx103 using our standard .mozconfig when mfbt/lz4/xxhash.h was upgraded, because we compile with -Og and it wants to compile its functions with static __inline__ __attribute__((always_inline, unused)). When gcc builds a deoptimized debugging build and fails to inline those functions, it throws a compilation error, and the build screeches to a halt. (This doesn't affect Fedora's build because they always build at a sufficient optimization level such that these functions do indeed get inlined.) After a little thinking, this is the new debug .mozconfig:


export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24" # or as you likez
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9 -fpermissive -DXXH_NO_INLINE_HINTS=1"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd
ac_add_options --without-wasm-sandboxed-libraries

export GN=/home/censored/bin/gn # if you haz

This builds, or at least compiles, but fails at linkage because of the second problem. This time, it's libwebrtc ... again. To glue the Google build system onto Mozilla's, there is a fragile and system-dependent permuting-processing step that has again broken, and Mozilla would like a definitive fix. Until then, we're high and dry because the request is for the generated build file to be generated correctly rather than just patching the generated build file. That's a much bigger knot to unravel, and building the gn tool it depends on used to be incredibly difficult (it's now much easier and I was able to upgrade, but all this has done is show me where the problem is and it's not a straightforward fix). If this is not repaired, then various screen capture components used by libwebrtc are not compiled, and linking will fail. Right now it looks like we're the only platform affected even though aarch64 has been busted by the same underlying issue in the past.

The easy choice, especially if you don't use WebRTC, is just to add ac_add_options --disable-webrtc to your .mozconfig. I don't use WebRTC much and I'm pretty lazy so ordinarily I would go this route — except you, gentle reader, expect me to be able to tell you when Firefox compiles are breaking, so that brings us to the second option: Dan Horák's patch. This also works and is the version I'm typing into now. Expect to have to carry this patch in your local tree for a couple of versions until this gets dealt with.

Fortunately, the PGO-LTO patch for Firefox 101 still applies to Fx105, so you can still use that. The optimized .mozconfig is unchanged, but here it is for reference:


export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24" # or as you likez
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9 -fpermissive"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options MOZ_PGO=1

export GN=/home/censored/bin/gn # if you haz
export RUSTC_OPT_LEVEL=2

I've got one other issue to settle, and then I hope to get back to porting the JavaScript and Wasm JIT to 102ESR. But my real life and my $DAYJOB interfere with my after-hours hacking, so contributors are still solicited so that our work can benefit the OpenPOWER community. When it's only one person working on it, things are slower.

Niko Matsakis: Rust 2024…the year of everywhere?

I’ve been thinking about what “Rust 2024” will look like lately. I don’t really mean the edition itself — but more like, what will Rust feel like after we’ve finished up the next few years of work? I think the answer is that Rust 2024 is going to be the year of “everywhere”. Let me explain what I mean. Up until now, Rust has had a lot of nice features, but they only work sometimes. By the time 2024 rolls around, they’re going to work everywhere that you want to use them, and I think that’s going to make a big difference in how Rust feels.

Async everywhere

Let’s start with async. Right now, you can write async functions, but not in traits. You can’t write async closures. You can’t use async drop. This creates a real hurdle. You have to learn the workarounds (e.g., the async-trait crate), and in some cases, there are no proper workarounds (e.g., for async-drop).

Thanks to a recent PR by Michael Goulet, static async functions in traits almost work on nightly today! I’m confident we can work out the remaining kinks soon and start advancing the static subset (i.e., no support for dyn trait) towards stabilization.
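
For a concrete sense of what this looks like, here is a minimal sketch assuming the nightly feature gate as it exists today (the gate name and the trait below are illustrative and may change before stabilization):

#![feature(async_fn_in_trait)]
#![allow(incomplete_features)]

trait Fetch {
    // Desugars to a method returning some `impl Future<Output = String>`.
    async fn fetch(&self) -> String;
}

struct Memory;

impl Fetch for Memory {
    async fn fetch(&self) -> String {
        String::from("hello")
    }
}

// Static dispatch works; `dyn Fetch` is the part still being designed.
async fn fetch_twice<F: Fetch>(f: &F) -> (String, String) {
    (f.fetch().await, f.fetch().await)
}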

The plans for dyn, meanwhile, are advancing rapidly. At this point I think we have two good options on the table and I’m hopeful we can get that nailed down and start planning what’s needed to make the implementation work.

Once async functions in traits work, the next steps for core Rust will be figuring out how to support async closures and async drop. Both of them add some additional challenges — particularly async drop, which has some complex interactions with other parts of the language, as Sabrina Jewson elaborated in a great, if dense, blog post — but we’ve started to develop a crack team of people in the async working group and I’m confident we can overcome them.

There is also library work, most notably settling on some interop traits, and defining ways to write code that is portable across allocators. I would like to see more exploration of structured concurrency1, as well, or other alternatives to select! like the stream merging pattern Yosh has been advocating for.

Finally, for extra credit, I would love to see us integrate async/await keywords into other bits of the function body, permitting you to write common patterns more easily. Yoshua Wuyts has had a really interesting series of blog posts exploring these sorts of ideas. I think that being able to do for await x in y to iterate, or (a, b).await as a form of join, or async let x = … to create a future in a really lightweight way could be great.

Impl trait everywhere

The impl Trait notation is one of Rust’s most powerful conveniences, allowing you to omit specific types and instead talk about the interface you need. Like async, however, impl Trait can only be used in inherent functions and methods, and can’t be used for return types in traits, nor can it be used in type aliases, let bindings, or any number of other places it might be useful.
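
As a quick refresher on where it already works today, in argument and return position of ordinary functions (the function below is just an illustration):

use std::fmt::Debug;

// Argument position: "any type implementing Debug".
// Return position: "some iterator of Strings", without naming the concrete type.
fn describe(x: impl Debug) -> impl Iterator<Item = String> {
    [format!("{x:?}"), String::from("done")].into_iter()
}

fn main() {
    for line in describe(vec![1, 2, 3]) {
        println!("{line}");
    }
}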

Thanks to Oli Scherer’s hard work over the last year, we are nearing stabilization for impl Trait in type aliases. Oli’s work has also laid the groundwork to support impl trait in let bindings, meaning that you will be able to do something like

let iter: impl Iterator<Item = i32> = (0..10);
//        ^^^^^^^^^^^^^ Declare type of `iter` to be “some iterator”.

Finally, the same PR that added support for async fns in traits also added initial support for return-position impl trait in traits. Put it all together, and we are getting very close to letting you use impl Trait everywhere you might want to.

There is still at least one place where impl Trait is not accepted that I think it should be, which is nested in other positions. I’d like you to be able to write impl Fn(impl Debug), for example, to refer to “some closure that takes an argument of type impl Debug” (i.e., can be invoked multiple times with different debug types).
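
To make the gap concrete: impl Fn(impl Debug) does not exist today, and the closest stand-in (purely an illustrative workaround, not proposed syntax) trades the generic closure for dynamic dispatch on the argument:

use std::fmt::Debug;

fn call_with_several<F>(f: F)
where
    F: Fn(&dyn Debug), // stand-in for the hypothetical `impl Fn(impl Debug)`
{
    // The same closure is invoked with several different `Debug` types.
    f(&42_i32);
    f(&"hello");
    f(&vec![1, 2, 3]);
}

fn main() {
    call_with_several(|x| println!("{x:?}"));
}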

Generics everywhere

Generic types are a big part of how Rust libraries are built, but Rust doesn’t allow people to write generic parameters in all the places they would be useful, and limitations in the compiler prevent us from making full use of the annotations we do have.

Not being able to use generic types everywhere might seem abstract, particularly if you’re not super familiar with Rust. And indeed, for a lot of code, it’s not a big deal. But if you’re trying to write libraries, or to write one common function that will be used all over your code base, then it can quickly become a huge blocker. Moreover, given that Rust supports generic types in many places, the fact that we don’t support them in some places can be really confusing — people don’t realize that the reason their idea doesn’t work is not because the idea is wrong, it’s because the language (or, often, the compiler) is limited.

The biggest example of generics everywhere is generic associated types. Thanks to hard work by Jack Huey, Matthew Jasper, and a number of others, this feature is very close to hitting stable Rust — in fact, it is in the current beta, and should be available in 1.65. One caveat, though: the upcoming support for GATs has a number of known limitations and shortcomings, and it gives some pretty confusing errors. It’s still really useful, and a lot of people are already using it on nightly, but it’s going to require more attention before it lives up to its full potential.

You may not wind up using GATs in your code, but they will definitely be used in some of the libraries you rely on. GATs directly enable common patterns like Iterable that have heretofore been inexpressible, but we’ve also seen a lot of examples where they are used internally to help libraries present a more unified, simpler interface to their users.
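
For a flavor of what GATs unlock, here is a small sketch of a “lending iterator”, whose items borrow from the iterator itself; the trait and type names are illustrative, not standard library APIs:

trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Hands out mutable chunks of an internal buffer. Each item borrows from `self`,
// which an ordinary `Iterator` cannot express.
struct ChunkLender {
    buf: Vec<u8>,
    pos: usize,
    chunk: usize,
}

impl LendingIterator for ChunkLender {
    type Item<'a> = &'a mut [u8]
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        if self.pos >= self.buf.len() {
            return None;
        }
        let end = (self.pos + self.chunk).min(self.buf.len());
        let start = self.pos;
        self.pos = end;
        Some(&mut self.buf[start..end])
    }
}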

Beyond GATs, there are a number of other places where we could support generics, but we don’t. In the previous section, for example, I talked about being able to have a function with a parameter like impl Fn(impl Debug) — this is actually an example of a “generic closure”. That is, a closure that itself has generic arguments. Rust doesn’t support this yet, but there’s no reason we can’t.

Oftentimes, though, the work to realize “generics everywhere” is not so much a matter of extending the language as it is a matter of improving the compiler’s implementation. Rust’s current traits implementation works pretty well, but as you start to push the bounds of it, you find that there are lots of places where it could be smarter. A lot of the ergonomic problems in GATs arise exactly out of these areas.

One of the developments I’m most excited about in Rust is not any particular feature, it’s the formation of the new types team. The goal of this team is to revamp the compiler’s trait system implementation into something efficient and extensible, as well as building up a core set of contributors.

Making Rust feel simpler by making it more uniform

The topics in this post, of course, only scratch the surface of what’s going on in Rust right now. For example, I’m really excited about “everyday niceties” like let/else-syntax and if-let-pattern guards, or the scoped threads API that we got in 1.63. There are exciting conversations about ways to improve error messages. Cargo, the compiler, and rust-analyzer are all generally getting faster and more capable. And so on, and so on.

The pattern of having a feature that starts working somewhere and then extending it so that it works everywhere seems, though, to be a key part of how Rust development works. It’s inspiring also because it becomes a win-win for users. Newer users find Rust easier to use and more consistent; they don’t have to learn the “edges” of where one thing works and where it doesn’t. Experienced users gain new expressiveness and unlock patterns that were either awkward or impossible before.

One challenge with this iterative development style is that sometimes it takes a long time. Async functions, impl Trait, and generic reasoning are three areas where progress has been stalled for years, for a variety of reasons. That’s all started to shift this year, though. A big part of that is the formation of new Rust teams at many companies, allowing a lot more people to have a lot more time. It’s also just the accumulation of the hard work of many people over a long time, slowly chipping away at hard problems (to get a sense for what I mean, read Jack’s blog post on NLL removal, and take a look at the full list of contributors he cited there — just assembling the list was impressive work, not to mention the actual work itself).

It may have been a long time coming, but I’m really excited about where Rust is going right now, as well as the new crop of contributors that have started to push the compiler faster and faster than it’s ever moved before. If things continue like this, Rust in 2024 is going to be pretty damn great.

  1. Oh, my beloved moro! I will return to thee! 

The Rust Programming Language Blog: Announcing Rust 1.64.0

The Rust team is happy to announce a new version of Rust, 1.64.0. Rust is a programming language empowering everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, you can get 1.64.0 with:

rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.64.0 on GitHub.

If you'd like to help us out by testing future releases, you might consider updating locally to use the beta channel (rustup default beta) or the nightly channel (rustup default nightly). Please report any bugs you might come across!

What's in 1.64.0 stable

Enhancing .await with IntoFuture

Rust 1.64 stabilizes the IntoFuture trait. IntoFuture is a trait similar to IntoIterator, but rather than supporting for ... in ... loops, IntoFuture changes how .await works. With IntoFuture, the .await keyword can await more than just futures; it can await anything which can be converted into a Future via IntoFuture - which can help make your APIs more user-friendly!

Take for example a builder which constructs requests to some storage provider over the network:

pub struct Error { ... }
pub struct StorageResponse { ... }
pub struct StorageRequest(bool);

impl StorageRequest {
    /// Create a new instance of `StorageRequest`.
    pub fn new() -> Self { ... }
    /// Decide whether debug mode should be enabled.
    pub fn set_debug(self, b: bool) -> Self { ... }
    /// Send the request and receive a response.
    pub async fn send(self) -> Result<StorageResponse, Error> { ... }
}

Typical usage would likely look something like this:

let response = StorageRequest::new()  // 1. create a new instance
    .set_debug(true)                  // 2. set some option
    .send()                           // 3. construct the future
    .await?;                          // 4. run the future + propagate errors

This is not bad, but we can do better here. Using IntoFuture we can combine "construct the future" (line 3) and "run the future" (line 4) into a single step:

let response = StorageRequest::new()  // 1. create a new instance
    .set_debug(true)                  // 2. set some option
    .await?;                          // 3. construct + run the future + propagate errors

We can do this by implementing IntoFuture for StorageRequest. IntoFuture requires us to have a named future we can return, which we can do by creating a "boxed future" and defining a type alias for it:

// First we must import some new types into the scope.
use std::pin::Pin;
use std::future::{Future, IntoFuture};

pub struct Error { ... }
pub struct StorageResponse { ... }
pub struct StorageRequest(bool);

impl StorageRequest {
    /// Create a new instance of `StorageRequest`.
    pub fn new() -> Self { ... }
    /// Decide whether debug mode should be enabled.
    pub fn set_debug(self, b: bool) -> Self { ... }
    /// Send the request and receive a response.
    pub async fn send(self) -> Result<StorageResponse, Error> { ... }
}

// The new implementations:
// 1. create a new named future type
// 2. implement `IntoFuture` for `StorageRequest`
pub type StorageRequestFuture = Pin<Box<dyn Future<Output = Result<StorageResponse, Error>> + Send + 'static>>;
impl IntoFuture for StorageRequest {
    type IntoFuture = StorageRequestFuture;
    type Output = <StorageRequestFuture as Future>::Output;
    fn into_future(self) -> Self::IntoFuture {
        Box::pin(self.send())
    }
}

This takes a bit more code to implement, but provides a simpler API for users.

In the future, the Rust Async WG hopes to simplify creating new named futures by supporting impl Trait in type aliases (Type Alias Impl Trait or TAIT). This should make implementing IntoFuture easier by simplifying the type alias' signature, and make it more performant by removing the Box from the type alias.

C-compatible FFI types in core and alloc

When calling or being called by C ABIs, Rust code can use type aliases like c_uint or c_ulong to match the corresponding types from C on any target, without requiring target-specific code or conditionals.

Previously, these type aliases were only available in std, so code written for embedded targets and other scenarios that could only use core or alloc could not use these types.

Rust 1.64 now provides all of the c_* type aliases in core::ffi, as well as core::ffi::CStr for working with C strings. Rust 1.64 also provides alloc::ffi::CString for working with owned C strings using only the alloc crate, rather than the full std library.
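
A small sketch of what this enables in a #![no_std] library crate (the extern "C" declaration of puts is only for illustration):

#![no_std]

use core::ffi::{c_char, c_int, CStr};

extern "C" {
    // `puts` from the C standard library, declared here only as an example.
    fn puts(s: *const c_char) -> c_int;
}

pub fn print_c_string(msg: &CStr) -> c_int {
    // SAFETY: `CStr` guarantees a valid, NUL-terminated pointer.
    unsafe { puts(msg.as_ptr()) }
}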

rust-analyzer is now available via rustup

rust-analyzer is now included as part of the collection of tools included with Rust. This makes it easier to download and access rust-analyzer, and makes it available on more platforms. It is available as a rustup component which can be installed with:

rustup component add rust-analyzer

At this time, to run the rustup-installed version, you need to invoke it this way:

rustup run stable rust-analyzer

The next release of rustup will provide a built-in proxy so that running the executable rust-analyzer will launch the appropriate version.

Most users should continue to use the releases provided by the rust-analyzer team (available on the rust-analyzer releases page), which are published more frequently. Users of the official VSCode extension are not affected since it automatically downloads and updates releases in the background.

Cargo improvements: workspace inheritance and multi-target builds

When working with collections of related libraries or binary crates in one Cargo workspace, you can now avoid duplication of common field values between crates, such as common version numbers, repository URLs, or rust-version. This also helps keep these values in sync between crates when updating them. For more details, see workspace.package, workspace.dependencies, and "inheriting a dependency from a workspace".

When building for multiple targets, you can now pass multiple --target options to cargo build, to build all of those targets at once. You can also set build.target to an array of multiple targets in .cargo/config.toml to build for multiple targets by default.

Stabilized APIs

The following methods and trait implementations are now stabilized:

These types were previously stable in std::ffi, but are now also available in core and alloc:

These types were previously stable in std::os::raw, but are now also available in core::ffi and std::ffi:

We've stabilized some helpers for use with Poll, the low-level implementation underneath futures:

In the future, we hope to provide simpler APIs that require less use of low-level details like Poll and Pin, but in the meantime, these helpers make it easier to write such code.
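
One example of such a helper is std::future::poll_fn, which lets you build a small future directly from a poll closure without hand-writing a Future type (a minimal sketch):

use std::future::poll_fn;
use std::task::{Context, Poll};

// A future that returns `Pending` once (waking itself) and completes on the next poll.
async fn yield_once() {
    let mut yielded = false;
    poll_fn(move |cx: &mut Context<'_>| {
        if yielded {
            Poll::Ready(())
        } else {
            yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    })
    .await;
}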

These APIs are now usable in const contexts:

Compatibility notes

  • As previously announced, linux targets now require at least Linux kernel 3.2 (except for targets which already required a newer kernel), and linux-gnu targets now require glibc 2.17 (except for targets which already required a newer glibc).

  • Rust 1.64.0 changes the memory layout of Ipv4Addr, Ipv6Addr, SocketAddrV4 and SocketAddrV6 to be more compact and memory efficient. This internal representation was never exposed, but some crates relied on it anyway by using std::mem::transmute, resulting in invalid memory accesses. Such internal implementation details of the standard library are never considered a stable interface. To limit the damage, we worked with the authors of all of the still-maintained crates doing so to release fixed versions, which have been out for more than a year. The vast majority of impacted users should be able to mitigate with a cargo update.

  • As part of the RLS deprecation, this is also the last release containing a copy of RLS. Starting from Rust 1.65.0, RLS will be replaced by a small LSP server showing the deprecation warning.

Other changes

There are other changes in the Rust 1.64 release, including:

  • Windows builds of the Rust compiler now use profile-guided optimization, providing performance improvements of 10-20% for compiling Rust code on Windows.

  • If you define a struct containing fields that are never used, rustc will warn about the unused fields. Now, in Rust 1.64, you can enable the unused_tuple_struct_fields lint to get the same warnings about unused fields in a tuple struct. In future versions, we plan to make this lint warn by default. Fields of type unit (()) do not produce this warning, to make it easier to migrate existing code without having to change tuple indices.
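
A minimal sketch of the lint described in the last bullet (the lint is opt-in here; the example struct is illustrative):

#![warn(unused_tuple_struct_fields)]

struct Wrapper(i32, ()); // the `i32` field is never read, so the lint fires;
                         // the `()` field is exempt, as noted above

fn main() {
    let _w = Wrapper(42, ());
}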

Check out everything that changed in Rust, Cargo, and Clippy.

Contributors to 1.64.0

Many people came together to create Rust 1.64.0. We couldn't have done it without all of you. Thanks!

Niko Matsakis: Dyn async traits, part 9: call-site selection

After my last post on dyn async traits, some folks pointed out that I was overlooking a seemingly obvious possibility. Why not have the choice of how to manage the future be made at the call site? It’s true, I had largely dismissed that alternative, but it’s worth consideration. This post is going to explore what it would take to get call-site-based dispatch working, and what the ergonomics might look like. I think it’s actually fairly appealing, though it has some limitations.

If we added support for unsized return values…

The idea is to build on the mechanisms proposed in RFC 2884. With that RFC, you would be able to have functions that returned a dyn Future:

fn return_dyn() -> dyn Future<Output = ()> {
    async move { }
}

Normally, when you call a function, we can allocate space on the stack to store the return value. But when you call return_dyn, we don’t know how much space we need at compile time, so we can’t do that1. This means you can’t just write let x = return_dyn(). Instead, you have to choose how to allocate that memory. Using the APIs proposed in RFC 2884, the most common option would be to store it on the heap. A new method, Box::new_with, would be added to Box; it acts like new, but it takes a closure, and the closure can return values of any type, including dyn values:

let result = Box::new_with(|| return_dyn());
// result has type `Box<dyn Future<Output = ()>>`

Invoking new_with would be ergonomically unpleasant, so we could also add a .box operator. Rust has had an unstable box operator since forever; this might finally provide enough motivation to make it worth adding:

let result = return_dyn().box;
// result has type `Box<dyn Future<Output = ()>>`

Of course, you wouldn’t have to use Box. Assuming we have sufficient APIs available, people can write their own methods, such as something to do arena allocation…

let arena = Arena::new();
let result = arena.new_with(|| return_dyn());

…or perhaps a hypothetical maybe_box, which would use a buffer if that’s big enough, and use box otherwise:

let mut big_buf = [0; 1024];
let result = maybe_box(&mut big_buf, || return_dyn()).await;

If we add postfix macros, then we might even support something like return_dyn().maybe_box!(&mut big_buf), though I’m not sure if the current proposal would support that or not.

What are unsized return values?

This idea of returning dyn Future is sometimes called “unsized return values”, as functions can now return values of “unsized” type (i.e., types whose size is not statically known). They’ve been proposed in RFC 2884 by Olivier Faure, and I believe there were some earlier RFCs as well. The .box operator, meanwhile, has been a part of “nightly Rust” since approximately forever, though it’s currently written in prefix form, i.e., box foo2.

The primary motivation for both unsized-return-values and .box has historically been efficiency: they permit in-place initialization in cases where it is not possible today. For example, if I write Box::new([0; 1024]) today, I am technically allocating a [0; 1024] buffer on the stack and then copying it into the box:

// First evaluate the argument, creating the temporary:
let temp: [u8; 1024] = ...;

// Then invoke `Box::new`, which allocates a Box...
let ptr: *mut T = allocate_memory();

// ...and copies the memory in.
std::ptr::write(ptr, temp);

The optimizer may be able to fix that, but it’s not trivial. If you look at the order of operations, it requires making the allocation happen before the arguments are evaluated. LLVM considers calls to known allocators to be “side-effect free”, but promoting them is still risky, since it means that more memory is allocated earlier, which can lead to memory exhaustion. The point isn’t so much to look at exactly what optimizations LLVM will do in practice, so much as to say that it is not trivial to optimize away the temporary: it requires some thoughtful heuristics.

How would unsized return values work?

This merits a blog post of its own, and I won’t dive into details. For our purposes here, the key point is that somehow when the callee goes to return its final value, it can use whatever strategy the caller prefers to get a return point, and write the return value directly in there. RFC 2884 proposes one solution based on generators, but I would want to spend time thinking through all the alternatives before we settled on something.

Using dynamic return types for async fn in traits

So, the question is, can we use dyn return types to help with async function in traits? Continuing with my example from my previous post, if you have an AsyncIterator trait…

trait AsyncIterator {
    type Item;
    
    async fn next(&mut self) -> Option<Self::Item>;
}

…the idea is that calling next on a dyn AsyncIterator type would yield dyn Future<Output = Option<Self::Item>>. Therefore, one could write code like this:

async fn use_dyn(di: &mut dyn AsyncIterator) {
    di.next().box.await;
    //       ^^^^
}

The expression di.next() by itself yields a dyn Future. This type is not sized and so it won’t compile on its own. Adding .box produces a Box<dyn Future>, which you can then await.3

Compared to the Boxing adapter I discussed before, this is relatively straightforward to explain. I’m not entirely sure which is more convenient to use in practice: it depends how many dyn values you create and how many methods you call on them. Certainly you can work around the problem of having to write .box at each call-site via wrapper types or helper methods that do it for you.

Complication: dyn AsyncIterator does not implement AsyncIterator

There is one complication. Today in Rust, every dyn Trait type also implements Trait. But can dyn AsyncIterator implement AsyncIterator? In fact, it cannot! The problem is that the AsyncIterator trait defines next as returning impl Future<..>, which is actually shorthand for impl Future<..> + Sized, but we said that next would return dyn Future<..>, which is ?Sized. So the dyn AsyncIterator type doesn’t meet the bounds the trait requires. Hmm.

But…does dyn AsyncIterator have to implement AsyncIterator?

There is no “hard and fixed” reason that dyn Trait types have to implement Trait, and there are a few good reasons not to do it. The alternative to dyn safety is a design like this: you can always create a dyn Trait value for any Trait, but you may not be able to use all of its members. For example, given a dyn Iterator, you could call next, but you couldn’t call generic methods like map. In fact, we’ve kind of got this design in practice, thanks to the where Self: Sized hack that lets us exclude methods from being used on dyn values.
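
Here is that where Self: Sized hack in miniature (a small illustrative example, not code from any particular library): the generic method is carved out of the vtable, so the trait stays usable as a dyn type.

trait Counter {
    fn next(&mut self) -> u32;

    // Excluded from `dyn Counter` by the `Self: Sized` bound.
    fn map_next<F: Fn(u32) -> u32>(&mut self, f: F) -> u32
    where
        Self: Sized,
    {
        f(self.next())
    }
}

struct FromZero(u32);

impl Counter for FromZero {
    fn next(&mut self) -> u32 {
        let n = self.0;
        self.0 += 1;
        n
    }
}

fn main() {
    let mut c = FromZero(0);
    let dyn_c: &mut dyn Counter = &mut c;
    let _ = dyn_c.next(); // fine: virtual dispatch
    // dyn_c.map_next(|x| x + 1); // error: `map_next` requires `Self: Sized`
}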

Why did we adopt object safety in the first place? If you look back at RFC 255, the primary motivation for this rule was ergonomics: clearer rules and better error messages. Although I argued for RFC 255 at the time, I don’t think these motivations have aged so well. Right now, for example, if you have a trait with a generic method, you get an error when you try to create a dyn Trait value, telling you that you cannot create a dyn Trait from a trait with a generic method. But it may well be clearer to get an error at the point where you try to call that generic method, telling you that you cannot call generic methods through dyn Trait.

Another motivation for having dyn Trait implement Trait was that one could write a generic function with T: Trait and have it work equally well for object types. That capability is useful, but because you have to write T: ?Sized to take advantage of it, it only really works if you plan carefully. In practice what I’ve found works much better is to implement Trait for &dyn Trait.
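
A minimal sketch of that pattern, using a simple non-async trait for illustration:

trait Greet {
    fn greet(&self) -> String;
}

// With this impl, generic code written against `T: Greet` (keeping the default
// `Sized` bound) also accepts trait objects, no `T: ?Sized` required.
impl<'a> Greet for &'a dyn Greet {
    fn greet(&self) -> String {
        (**self).greet()
    }
}

fn call_generic<T: Greet>(t: T) -> String {
    t.greet()
}

struct English;

impl Greet for English {
    fn greet(&self) -> String {
        String::from("hello")
    }
}

fn main() {
    let obj: &dyn Greet = &English;
    println!("{}", call_generic(obj));
}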

What would it mean to remove the rule that dyn AsyncIterator: AsyncIterator?

I think the new system would be something like this…

  • You can always4 create a dyn Foo value. The dyn Foo type would define inherent methods based on the trait Foo that use dynamic dispatch, but with some changes:
    • Async functions and other methods defined with -> impl Trait return -> dyn Trait instead.
    • Generic methods, methods referencing Self, and other such cases are excluded. These cannot be handled with virtual dispatch.
  • If Foo is object safe using today’s rules, dyn Foo: Foo holds. Otherwise, it does not.5
    • On a related but orthogonal note, I would like to make a dyn keyword required to declare dyn safety.

Implications of removing that rule

This implies that dyn AsyncIterator (or any trait with async functions/RPITIT6) will not implement AsyncIterator. So if I write this function…

async fn use_any<I>(x: &mut I)
where
    I: ?Sized + AsyncIterator,
{
    x.next().await;
}

…I cannot use it with I = dyn AsyncIterator. You can see why: it calls next and assumes the result is Sized (as promised by the trait), so it doesn’t add any kind of .box directive (and it shouldn’t have to).

What you can do is implement a wrapper type that encapsulates the boxing:

struct BoxingAsyncIterator<'i, I> {
    iter: &'i mut dyn AsyncIterator<Item = I>
}

impl<'i, I> AsyncIterator for BoxingAsyncIterator<'i, I> {
    type Item = I;
    
    async fn next(&mut self) -> Option<Self::Item> {
        self.iter.next().box.await
    }
}

…and then you can call use_any(BoxingAsyncIterator::new(ai)).7

Limitation: what if you wanted to do stack allocation?

One of the goals with the previous proposal was to allow you to write code that used dyn AsyncIterator which worked equally well in std and no-std environments. I would say that goal was partially achieved. The core idea was that the caller would choose the strategy by which the future got allocated, and so it could opt to use inline allocation (and thus be no-std compatible) or use boxing (and thus be simple).

In this proposal, the call-site has to choose. You might think then that you could just choose to use stack allocation at the call-site and thus be no-std compatible. But how does one choose stack allocation? It’s actually quite tricky! Part of the problem is that async stack frames are stored in structs, and thus we cannot support something like alloca (at least not for values that will be live across an await, which includes any future that is awaited8). In fact, even outside of async, using alloca is quite hard! The problem is that a stack is, well, a stack. Ideally, you would do the allocation just before your callee returns, since that’s when you know how much memory you need. But at that time, your callee is still using the stack, so your allocation is in the wrong spot.9 I personally think we should just rule out the idea of using alloca to do stack allocation.

If we can’t use alloca, what can we do? We have a few choices. In the very beginning, I talked about the idea of a maybe_box function that would take a buffer and use it only for really large values. That’s kind of nifty, but it still relies on a box fallback, so it doesn’t really work for no-std.10 Might be a nice alternative to stackfuture though!11

You can also achieve inlining by writing wrapper types (something tmandry and I prototyped some time back), but the challenge then is that your callee doesn’t accept a &mut dyn AsyncIterator, it accepts something like &mut DynAsyncIter, where DynAsyncIter is a struct that you defined to do the wrapping.

All told, I think the answer in reality would be: If you want to be used in a no-std environment, you don’t use dyn in your public interfaces. Just use impl AsyncIterator. You can use hacks like the wrapper types internally if you really want dynamic dispatch.

Question: How much room is there for the compiler to get clever?

One other concern I had in thinking about this proposal was that it seemed like it was overspecified. That is, the vast majority of call-sites in this proposal will be written with .box, which thus specifies that they should allocate a box to store the result. But what about ideas like caching the box across invocations, or “best effort” stack allocation? Where do they fit in? From what I can tell, those optimizations are still possible, so long as the Box which would be allocated doesn’t escape the function (which was the same condition we had before).

The way to think of it: by writing foo().box.await, the user told us to use the boxing allocator to box the return value of foo. But we can then see that this result is passed to await, which takes ownership and later frees it. We can thus decide to substitute a different allocator, perhaps one that reuses the box across invocations, or tries to use stack memory; this is fine so long as we modified the freeing code to match. Doing this relies on knowing that the allocated value is immediately returned to us and that it never leaves our control.

Conclusion

To sum up, I think for most users this design would work like so…

  • You can use dyn with traits that have async functions, but you have to write .box every time you call a method.
  • You get to use .box in other places too, and we gain at least some support for unsized return values.12
  • If you want to write code that is sometimes using dyn and sometimes using static dispatch, you’ll have to write some awkward wrapper types.13
  • If you are writing no-std code, use impl Trait, not dyn Trait; if you must use dyn, it’ll require wrapper types.

Initially, I dismissed call-site allocation because it violated dyn Trait: Trait and it didn’t allow code to be written with dyn that could work in both std and no-std. But I think that violating dyn Trait: Trait may actually be good, and I’m not sure how important that latter constraint truly is. Furthermore, I think that Boxing::new and the various “dyn adapters” are probably going to be pretty confusing for users, but writing .box on a call-site is relatively easy to explain (“we don’t know what future you need, so you have to box it”). So now it seems a lot more appealing to me, and I’m grateful to Olivier Faure for bringing it up again.

One possible extension would be to permit users to specify the type of each returned future in some way. As I was finishing up this post, I saw that matthieum posted an intriguing idea in this direction on the internals thread. In general, I do see a need for some kind of “trait adapters”, such that you can take a base trait like Iterator and “adapt” it in various ways, e.g. producing a version that uses async methods, or which is const-safe. This has some pretty heavy overlap with the whole keyword generics initiative too. I think it’s a good extension to think about, but it wouldn’t be part of the “MVP” that we ship first.

Thoughts?

Please leave comments in this internals thread, thanks!

Appendix A: the Output associated type

Here is an interesting thing! The FnOnce trait, implemented by all callable things, defines its associated type Output as Sized! We have to change this if we want to allow unsized return values.

In theory, this could be a big backwards compatibility hazard. Code that writes F::Output can assume, based on the trait, that the return value is sized – so if we remove that bound, the code will no longer build!

Fortunately, I think this is ok. We’ve deliberately restricted the fn types so you can only use them with the () notation, e.g., where F: FnOnce() or where F: FnOnce() -> (). Both of these forms expand to something which explicitly specifies Output, like F: FnOnce<(), Output = ()>. What this means is that even if you write really generic code…

fn foo<F, R>(f: F)
where
    F: FnOnce<Output = R>
{
    let value: F::Output = f();
    ...
}

…when you write F::Output, that is actually normalized to R, and the type R has its own (implicit) Sized bound.

(There was actually a recent unsoundness related to this bound, closed by this PR, and we discussed exactly this forwards compatibility question on Zulip.)

Footnotes

  1. I can hear you now: “but what about alloca!” I’ll get there. 

  2. The box foo operator supported by the compiler has no current path to stabilization. There were earlier plans (see RFC 809 and RFC 1228), but we ultimately abandoned those efforts. Part of the problem, in fact, was that the precedence of box foo made for bad ergonomics: foo.box works much better. 

  3. If you try to await a Box<dyn Future> today, you get an error that it needs to be pinned. I think we can solve that by implementing IntoFuture for Box<dyn Future> and having that convert it to Pin<Box<dyn Future>>

  4. Or almost always? I may be overlooking some edge cases. 

  5. Internally in the compiler, this would require modifying the definition of MIR to make “dyn dispatch” more first-class. 

  6. Don’t know what RPITIT stands for?! “Return position impl trait in traits!” Get with the program! 

  7. This is basically what the “magical” Boxing::new would have done for you in the older proposal. 

  8. Brief explanation of why async and alloca don’t mix here. 

  9. I was told Ada compilers will allocate the memory at the top of the stack, copy it over to the start of the function’s area, and then pop what’s left. Theoretically possible! 

  10. You could imagine a version that aborted the code if the size is wrong, too, which would make it no-std safe, but not in a reliable way (aborts == yuck). 

  11. Conceivably you could set the size to size_of(SomeOtherType) to automatically determine how much space is needed. 

  12. I say at least some because I suspect many details of the more general case would remain unstable until we gain more experience. 

  13. You have to write awkward wrapper types for now, anyway. I’m intrigued by ideas about how we could make that more automatic, but I think it’s way out of scope here. 

Firefox NightlyThese Weeks In Firefox: Issue 124

Highlights

Friends of the Firefox team

Introductions/Shout-Outs

  • Welcome Schalk! Schalk has been contributing for a while and is the community manager for MDN Web Docs, and is hanging out to hear about DevTools-y things and other interesting things going on in Firefox-land to help promote them to the wider community

Resolved bugs (excluding employees)

Volunteers that fixed more than one bug

  • axtinemvsn (one of our CalState students!)
  • Itiel

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions

WebExtensions Framework
  • Fixed a regression on accessing static theme resources from other extensions (introduced in Firefox 105 by Bug 1711168, new restrictions on accessing extensions resources not explicitly set as web_accessible_resources) – Bug 1786564 (landed in Firefox 105) and Bug 1790115 (landed in Firefox 106, followup fix related to extension pages running in private browsing windows)
  • Small tweaks and fixes related to the unified extensions toolbar button – Bug 1790015 / Bug 1784223 / Bug 1789407
  • Cleanups related to the Manifest Version 3 CSP – Bug 1789751 (removed localhost from script-src directive) / Bug 1766881 (removed unnecessary object-src)
Addon Manager & about:addons
  • Emilio enabled modern flexbox use in the about:addons page (instead of XUL layout) – Bug 1790308
  • Itiel has updated the about:addons accent color var to use the Photon color and updated the “Available Updates” dot badge to use the expected Photon accent color – Bug 1787651

Developer Tools

DevTools
  • Eugene fixed a bug with the Network Monitor Websocket inspector, where messages would disappear when using filters in combination with the “keep messages” checkbox (bug)
  • Alex is updating the devtools codebase to prepare for ESM-ification:
  • The Network Monitor used to incorrectly show sizes in kibibytes (1024-based) instead of kilobytes (1000-based). Hubert fixed this issue and we now show the correct sizes and correct units everywhere in the Netmonitor (bug)
  • Alex keeps fixing bugs and UX issues around WebExtension debugging. Whenever you reloaded an extension, the Debugger would no longer show its sources. This was a recent regression, but it is now fixed and tested (bug)
  • Hubert fixed a bug with the new Edit and Resend panel, where we would crash if the request was too big. (bug)
  • Nicolas fixed a performance regression in the StyleEditor (bug), which was caused by performing too many cross-compartment property accesses.
WebDriver BiDi
  • We added basic support for the “script.getRealms” command which returns the information about available WindowRealms, including sandboxes. This information contains realm ids which will be used to run script evaluation commands. (bug)
  • We extended the Remote Agent implementation to allow Marionette and WebDriver BiDi to open and close tabs in GeckoView applications. As a result we were able to enable ~300 additional WebDriver tests on Android. (bug)

ESMification status

Lint, Docs and Workflow

  • https is now the default to use in tests.
    • Please only disable the rule if you explicitly need to test insecure connections – and add a comment if you do disable.
  • You can now specify a --rule parameter to ./mach eslint (not ./mach lint -l eslint), which allows you to test turning on an ESLint rule.
  • We now have two new rules which are currently manually run.
    • The rules:
      • mozilla/valid-ci-uses checks that:
        • Ci.nsIFoo is a valid interface.
        • Ci.nsIFoo.CONSTANT is a valid constant available on the interface.
      • mozilla/valid-services-property checks that:
        • Services.foo.bar() is a valid property on the interface associated with Services.foo.
    • These will be added to run on CI as a tier-2 task in the next couple of months.
    • For now, they can be manually run via
      • MOZ_OBJDIR=objdir-ff-opt ./mach eslint --rule="mozilla/valid-services-property: error" --rule="mozilla/valid-ci-uses: error" *
      • There are a few non-critical existing failures which will be resolved before CI lands.

Migration Improvements (CalState LA Project)

  • Students had a Hack Weekend the weekend before last to get up to speed with our tooling
  • Quite a few Good First Bugs landed to support the ESMification process
  • We’re starting the students off on researching the best ways of importing favicons from other browsers into Firefox. Watch this space!

Picture-in-Picture

Search and Navigation

Storybook / Reusable components

  • The ./mach storybook commands have landed!
    • ./mach storybook install # Run this the first time
    • ./mach storybook
    • ./mach storybook launch # Run this in a separate shell

This Week In RustThis Week in Rust 461

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is match_deref, a macro crate to implement deref patterns on stable Rust.

Thanks to meithecatte for the suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

347 pull requests were merged in the last week

Rust Compiler Performance Triage

This was a fairly negative week for compiler performance, with regressions overall up to 14% on some workloads (primarily incr-unchanged scenarios), largely caused by #101620. We are still chasing down either a revert or a fix for that regression, though a partial mitigation in #101862 has been applied. Hopefully the full fix or revert will be part of the next triage report.

We also saw a number of other regressions land, though most were much smaller in magnitude.

Triage done by @simulacrum. Revision range: 17cbdfd0..8fd6d03

See the full report for more details.

Call for Testing

An important step for RFC implementation is for people to experiment with the implementation and give feedback, especially before stabilization. The following RFCs would benefit from user testing before moving forward:

  • No RFCs issued a call for testing this week.

If you are a feature implementer and would like your RFC to appear on the above list, add the new call-for-testing label to your RFC along with a comment providing testing instructions and/or guidance on which aspect(s) of the feature need testing.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

  • No RFCs were approved this week.
Final Comment Period

Every week, the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
  • No RFCs entered Final Comment Period this week.
Tracking Issues & PRs
  • No Tracking Issues or PRs entered Final Comment Period this week.
New and Updated RFCs

Upcoming Events

Rusty Events between 2022-09-21 - 2022-10-19 🦀

Virtual
Asia
Europe
North America
Oceania

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Jobs

Please see the latest Who's Hiring thread on r/rust

Quote of the Week

At the #LinuxPlumbers Rust MC: "I'm Matthew Wilcox, I'm one of the authors of the NVMe spec, I'm the one who suggested you make an NVMe driver to demonstrate the value of Rust. You have succeeded beyond my wildest expectations. These performance numbers are phenomenal."

Josh Triplett paraphrasing Matthew Wilcox as spoken at the Linux Plumbers Conference Q&A session

Thanks to Josh Triplett for the self-suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, cdmistman, ericseppanen, extrawurst, andrewpollack, U007D, kolharsam, joelmarcey, mariannegoldin.

Email list hosting is sponsored by The Rust Foundation

Discuss on r/rust

Niko MatsakisWhat I meant by the “soul of Rust”

Re-reading my previous post, I felt I should clarify why I called it the “soul of Rust”. The soul of Rust, to my mind, is definitely not being explicit about allocation. Rather, it’s about the struggle between a few key values — especially productivity and versatility1 in tension with transparency. Rust’s goal has always been to feel like a high-level language but with the performance and control of a low-level one. Oftentimes, we are able to find a “third way” that removes the tradeoff, solving both goals pretty well. But finding those “third ways” takes time — and sometimes we just have to accept a certain hit to one value or another for the time being to make progress. It’s exactly at these times, when we have to make a difficult call, that questions about the “soul of Rust” start to come into play. I’ve been thinking about this a lot, so I thought I would write a post that expands on the role of transparency in Rust, and some of the tensions that arise around it.

Why do we value transparency?

From the draft Rustacean Principles:

🔧 Transparent: “you can predict and control low-level details”

The C language, famously, maps quite closely to how machines typically operate. So much so that people have sometimes called it “portable assembly”.2 Both C++ and Rust are trying to carry on that tradition, but to add on higher levels of abstraction. Inevitably, this leads to tension. Operator overloading, for example, makes figuring out what a + b does more difficult.3
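
A tiny, made-up example of that tension: once Add is overloaded, a + b can run arbitrary user code rather than a single machine instruction (Clamped here is a hypothetical newtype, not anything from the standard library):

use std::ops::Add;

#[derive(Copy, Clone, Debug)]
struct Clamped(u8);

impl Add for Clamped {
    type Output = Clamped;
    fn add(self, other: Clamped) -> Clamped {
        // Not a plain machine add: this saturates instead of wrapping.
        Clamped(self.0.saturating_add(other.0))
    }
}

fn main() {
    let (a, b) = (Clamped(200), Clamped(100));
    println!("{:?}", a + b); // prints Clamped(255)
}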

Transparency gives you control

Transparency doesn’t automatically give high performance, but it does give control. This helps when crafting your system, since you can set it up to do what you want, but it also helps when analyzing its performance or debugging. There’s nothing more frustrating than staring at code for hours and hours only to realize that the source of your problem isn’t anywhere in the code you can see — it lies in some invisible interaction that wasn’t made explicit.

Transparency can cost performance

The flip-side of transparency is overspecification. The more directly your program maps to assembly, the less room the compiler and runtime have to do clever things, which can lead to lower performance. In Rust, we are always looking for places where we can be less transparent in order to gain performance — but only up to a point. One example is struct layout: the Rust compiler retains the freedom to reorder fields in a struct, enabling us to make more compact data structures. That’s less transparent than C, but usually not in a way that you care about. (And, of course, if you want to specify the order of your fields, we offer the #[repr] attribute.)
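
As a small illustration of that layout point, here is a hedged sketch (the struct names are made up, and the exact sizes are not guaranteed for the default representation):

#[allow(dead_code)]
struct Reordered {
    // Default (Rust) layout: the compiler may reorder these fields to pack
    // them more tightly; on a typical 64-bit target this ends up 16 bytes.
    a: u8,
    b: u64,
    c: u8,
}

#[allow(dead_code)]
#[repr(C)]
struct DeclaredOrder {
    // #[repr(C)] pins the declared order (and a C-compatible layout),
    // trading compactness for transparency; typically 24 bytes here.
    a: u8,
    b: u64,
    c: u8,
}

fn main() {
    println!(
        "{} vs {}",
        std::mem::size_of::<Reordered>(),
        std::mem::size_of::<DeclaredOrder>()
    );
}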

Transparency hurts versatility and productivity

The bigger price of transparency, though, is versatility. It forces everyone to care about low-level details that may not actually matter to the problem at hand4. Relevant to dyn async trait, most async Rust systems, for example, perform allocations left and right. The fact that a particular call to an async function might invoke Box::new is unlikely to be a performance problem. For those users, selecting a Boxing adapter adds to the overall complexity they have to manage for very little gain. If you’re working on a project where you don’t need peak performance, that’s going to make Rust less appealing than other languages. I’m not saying that’s bad, but it’s a fact.

A zero-sum situation…

At this moment in the design of async traits, we are struggling with a core question here of “how versatile can Rust be”. Right now, it feels like a “zero sum situation”. We can add in something like Boxing::new to preserve transparency, but it’s going to cost us some in versatility — hopefully not too much.

…for now?

I do wonder, though, if there’s a “third way” waiting somewhere. I hinted at this a bit in the previous post. At the moment, I don’t know what that third way is, and I think that requiring an explicit adapter is the most practical way forward. But it seems to me that it’s not a perfect sweet spot yet, and I am hopeful we’ll be able to subsume it into something more general.

Some ingredients that might lead to a ‘third way’:

  • With-clauses or capabilities: I am intrigued by the idea of with-clauses and the general idea of scoped capabilities. We might be able to think about the “default adapter” as something that gets specified via a with-clause?
  • Const evaluation: One of the niftier uses for const evaluation is for “meta-programming” that customizes how Rust is compiled. For example, we could potentially let you write a const fn that creates the vtable data structure for a given trait.
  • Profiles and portability: Can we find a better way to identify the kinds of transparency that you want, perhaps via some kind of ‘profiles’? I feel we already have ‘de facto’ profiles right now, but we don’t recognize them. “No std” is a clear example, but another would be the set of operating systems or architectures that you try to support. Recognizing that different users have different needs, and giving people a way to choose which one fits them best, might allow us to be more supportive of all our users — but then again, it might make Rust “modal” and more confusing.

Comments?

Please leave comments in this internals thread. Thanks!

Footnotes

  1. I didn’t write about versatility in my original post: instead I focused on the hit to productivity. But as I think about it now, versatility is really what’s at play here — versatility really meant that Rust was useful for high-level things and low-level things, and I think that requiring an explicit dyn adaptor is unquestionably a hit against being high-level. Interestingly, I put versatility after transparency in the list, meaning that it was lower priority, and that seems to back up the decision to have some kind of explicit adaptor. 

  2. At this point, some folks point out all the myriad subtleties and details that are actually hidden in C code. Hush you. 

  3. I remember a colleague at a past job discovering that somebody had overloaded the -> operator in our codebase. They sent out an angry email, “When does it stop? Must I examine every dot and squiggle in the code?” (NB: Rust supports overloading the deref operator.) 

  4. Put another way, being transparent about one thing can make other things more obscure (“can’t see the forest for the trees”). 

Niko MatsakisDyn async traits, part 8: the soul of Rust

In the last few months, Tyler Mandry and I have been circulating a “User’s Guide from the Future” that describes our current proposed design for async functions in traits. In this blog post, I want to deep dive on one aspect of that proposal: how to handle dynamic dispatch. My goal here is to explore the space a bit and also to address one particularly tricky topic: how explicit do we have to be about the possibility of allocation? This is a tricky topic, and one that gets at that core question: what is the soul of Rust?

The running example trait

Throughout this blog post, I am going to focus exclusively on this example trait, AsyncIterator:

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

And we’re particularly focused on the scenario where we are invoking next via dynamic dispatch:

fn make_dyn<AI: AsyncIterator>(mut ai: AI) {
    use_dyn(&mut ai); // <— coercion from `&mut AI` to `&mut dyn AsyncIterator`
}

fn use_dyn(di: &mut dyn AsyncIterator) {
    di.next().await; // <— this call right here!
}

Even though I’m focusing the blog post on this particular snippet of code, everything I’m talking about is applicable to any trait with methods that return impl Trait (async functions themselves being a shorthand for a function that returns impl Future).

The basic challenge that we have to face is this:

  • The caller function, use_dyn, doesn’t know what impl is behind the dyn, so it needs to allocate a fixed amount of space that works for everybody. It also needs some kind of vtable so it knows what poll method to call.
  • The callee, AI::next, needs to be able to package up the future for its next function in some way to fit the caller’s expectations.

The first blog post in this series1 explains the problem in more detail.

A brief tour through the options

One of the challenges here is that there are many, many ways to make this work, and none of them is “obviously best”. What follows is, I think, an exhaustive list of the various ways one might handle the situation. If anybody has an idea that doesn’t fit into this list, I’d love to hear it.

Box it. The most obvious strategy is to have the callee box the future type, effectively returning a Box<dyn Future>, and have the caller invoke the poll method via virtual dispatch. This is what the async-trait crate does (although it also boxes for static dispatch, which we don’t have to do).
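
For concreteness, here is a minimal hand-written sketch of this strategy in today’s Rust, roughly what the async-trait crate expands to (Counter is a made-up example type, and the Send bound that async-trait normally adds is omitted):

use std::future::Future;
use std::pin::Pin;

trait AsyncIterator {
    type Item;
    // `async fn next(&mut self) -> Option<Self::Item>` is rewritten so that
    // every implementation returns the same concrete type: a boxed, pinned
    // future.
    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'a>>;
}

struct Counter(u32);

impl AsyncIterator for Counter {
    type Item = u32;
    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<u32>> + 'a>> {
        // The allocation the rest of this post worries about: Box::pin runs
        // on every call to `next`.
        Box::pin(async move {
            self.0 += 1;
            Some(self.0)
        })
    }
}

A caller holding a &mut dyn AsyncIterator can then call next and await the boxed future via virtual dispatch, at the cost of one heap allocation per call.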

Box it with some custom allocator. You might want to box the future with a custom allocator.

Box it and cache box in the caller. For most applications, boxing itself is not a performance problem, unless it occurs repeatedly in a tight loop. Mathias Einwag pointed out if you have some code that is repeatedly calling next on the same object, you could have that caller cache the box in between calls, and have the callee reuse it. This way you only have to actually allocate once.

Inline it into the iterator. Another option is to store all the state needed by the function in the AsyncIter type itself. This is actually what the existing Stream trait does, if you think about it: instead of returning a future, it offers a poll_next method, so that the implementor of Stream effectively is the future, and the caller doesn’t have to store any state. Tyler and I worked out a more general way to do inlining that doesn’t require user intervention, where you basically wrap the AsyncIterator type in another type W that has a field big enough to store the next future. When you call next, this wrapper W stores the future into that field and then returns a pointer to the field, so that the caller only has to poll that pointer. One problem with inlining things into the iterator is that it only works well for &mut self methods, since in that case there can be at most one active future at a time. With &self methods, you could have any number of active futures.
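
For comparison, the “inlined” shape described above looks roughly like the following sketch; PollNext is a made-up name, but the signature is essentially that of the existing futures::Stream::poll_next:

use std::pin::Pin;
use std::task::{Context, Poll};

trait PollNext {
    type Item;
    // No future value is returned: the implementor itself holds all the
    // in-flight state, and the caller just polls it directly.
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}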

Box it and cache box in the callee. Instead of inlining the entire future into the AsyncIterator type, you could inline just one pointer-word slot, so that you can cache and reuse the Box that next returns. The upside of this strategy is that the cached box moves with the iterator and can potentially be reused across callers. The downside is that once the caller has finished, the cached box lives on until the object itself is destroyed.

Have caller allocate maximal space. Another strategy is to have the caller allocate a big chunk of space on the stack, one that should be big enough for every callee. If you know the callees your code will have to handle, and the futures for those callees are close enough in size, this strategy works well. Eric Holk recently released the stackfuture crate that can help automate it. One problem with this strategy is that the caller has to know the size of all its callees.

Have caller allocate some space, and fall back to boxing for large callees. If you don’t know the sizes of all your callees, or those sizes have a wide distribution, another strategy might be to have the caller allocate some amount of stack space (say, 128 bytes) and then have the callee invoke Box if that space is not enough.

Alloca on the caller side. You might think you can store the size of the future to be returned in the vtable and then have the caller “alloca” that space — i.e., bump the stack pointer by some dynamic amount. Interestingly, this doesn’t work with Rust’s async model. Async tasks require that the size of the stack frame is known up front.

Side stack. Similar to the previous suggestion, you could imagine having the async runtimes provide some kind of “dynamic side stack” for each task.2 We could then allocate the right amount of space on this stack. This is probably the most efficient option, but it assumes that the runtime is able to provide a dynamic stack. Runtimes like embassy wouldn’t be able to do this. Moreover, we don’t have any sort of protocol for this sort of thing right now. Introducing a side-stack also starts to “eat away” at some of the appeal of Rust’s async model, which is designed to allocate the “perfect size stack” up front and avoid the need to allocate a “big stack per task”.3

Can async functions used with dyn be “normal”?

One of my initial goals for async functions in traits was that they should feel “as natural as possible”. In particular, I wanted you to be able to use them with dynamic dispatch in just the same way as you would a synchronous function. In other words, I wanted this code to compile, and I would want it to work even if use_dyn were put into another crate (and therefore were compiled with no idea of who is calling it):

fn make_dyn<AI: AsyncIterator>(mut ai: AI) {
    use_dyn(&mut ai);
}

fn use_dyn(di: &mut dyn AsyncIterator) {
    di.next().await;
}

My hope was that we could make this code work just as it is by selecting some kind of default strategy that works most of the time, and then provide ways for you to pick other strategies for those cases where the default strategy is not a good fit. The problem though is that there is no single default strategy that seems “obvious and right almost all of the time”…

Strategy / Downside
  • Box it (with default allocator): requires allocation, not especially efficient
  • Box it with cache on caller side: requires allocation
  • Inline it into the iterator: adds space to AI, doesn’t work for &self
  • Box it with cache on callee side: requires allocation, adds space to AI, doesn’t work for &self
  • Allocate maximal space: can’t necessarily use that across crates, requires extensive interprocedural analysis
  • Allocate some space, fallback: uses allocator, requires extensive interprocedural analysis or else random guesswork
  • Alloca on the caller side: incompatible with async Rust
  • Side-stack: requires cooperation from runtime and allocation

The soul of Rust

This is where we get to the “soul of Rust”. Looking at the above table, the strategy that seems the closest to “obviously correct” is “box it”. It works fine with separate compilation, fits great with Rust’s async model, and it matches what people are doing today in practice. I’ve spoken with a fair number of people who use async Rust in production, and virtually all of them agreed that “box by default, but let me control it” would work great in practice.

And yet, when we floated the idea of using this as the default, Josh Triplett objected strenuously, and I think for good reason. Josh’s core concern was that this would be crossing a line for Rust. Until now, there has been no way to allocate heap memory without some kind of explicit operation (though that operation could be a function call). But if we wanted to make “box it” the default strategy, then you’d be able to write “innocent looking” Rust code that nonetheless is invoking Box::new. In particular, it would be invoking Box::new each time that next is called, to box up the future. But that is very unclear from reading over make_dyn and use_dyn.

As an example of where this might matter, it might be that you are writing some sensitive systems code where allocation is something you always do with great care. It doesn’t mean the code is no-std, it may have access to an allocator, but you still would like to know exactly where you will be doing allocations. Today, you can audit the code by hand, scanning for “obvious” allocation points like Box::new or vec![]. Under this proposal, while it would still be possible, the presence of an allocation in the code is much less obvious. The allocation is “injected” as part of the vtable construction process. To figure out that this will happen, you have to know Rust’s rules quite well, and you also have to know the signature of the callee (because in this case, the vtable is built as part of an implicit coercion). In short, scanning for allocation went from being relatively obvious to requiring a PhD in Rustology. Hmm.

On the other hand, if scanning for allocations is what is important, we could address that in many ways. We could add an “allow by default” lint to flag the points where the “default vtable” is constructed, and you could enable it in your project. This way the compiler would warn you about the possible future allocation. In fact, even today, scanning for allocations is actually much harder than I made it out to be: you can easily see if your function allocates, but you can’t easily see what its callees do. You have to read deeply into all of your dependencies and, if there are function pointers or dyn Trait values, figure out what code is potentially being called. With compiler/language support, we could make that whole process much more first-class and better.

In a way, though, the technical arguments are beside the point. “Rust makes allocations explicit” is widely seen as a key attribute of Rust’s design. In making this change, we would be tweaking that rule to be something like ”Rust makes allocations explicit most of the time”. This would be harder for users to understand, and it would introduce doubt as to whether Rust really intends to be the kind of language that can replace C and C++4.

Looking to the Rustacean design principles for guidance

Some time back, Josh and I drew up a draft set of design principles for Rust. It’s interesting to look back on them and see what they have to say about this question:

  • ⚙️ Reliable: “if it compiles, it works”
  • 🐎 Performant: “idiomatic code runs efficiently”
  • 🥰 Supportive: “the language, tools, and community are here to help”
  • 🧩 Productive: “a little effort does a lot of work”
  • 🔧 Transparent: “you can predict and control low-level details”
  • 🤸 Versatile: “you can do anything with Rust”

Boxing by default, to my mind, scores as follows:

  • 🐎 Performant: meh. The real goal with performant is that the cleanest code also runs the fastest. Boxing on every dynamic call doesn’t meet this goal, but something like “boxing with caller-side caching” or “have caller allocate space and fall back to boxing” very well might.
  • 🧩 Productive: yes! Virtually every production user of async Rust that I’ve talked to has agreed that having code box by default (but giving the option to do something else for tight loops) would be a great sweet spot for Rust.
  • 🔧 Transparent: no. As I wrote before, understanding when a call may box now requires a PhD in Rustology, so this definitely fails on transparency.

(The other principles are not affected in any notable way, I don’t think.)

What the “user’s guide from the future” suggests

These considerations led Tyler and I to a different design. In the “User’s Guide From the Future” document from before, you’ll see that it does not accept the running example just as is. Instead, if you were to compile the example code we’ve been using thus far, you’d get an error:

error[E0277]: the type `AI` cannot be converted to a
              `dyn AsyncIterator` without an adapter
 --> src/lib.rs:3:23
  |
3 |     use_dyn(&mut ai);
  |                  ^^ adapter required to convert to `dyn AsyncIterator`
  |
  = help: consider introducing the `Boxing` adapter,
    which will box the futures returned by each async fn
3 |     use_dyn(&mut Boxing::new(ai));
                     ++++++++++++  +

As the error suggests, in order to get the boxing behavior, you have to opt-in via a type that we called Boxing5:

fn make_dyn<AI: AsyncIterator>(ai: AI) {
    use_dyn(&mut Boxing::new(ai));
    //          ^^^^^^^^^^^
}

fn use_dyn(di: &mut dyn AsyncIterator) {
    di.next().await;
}

Under this design, you can only create a &mut dyn AsyncIterator when the caller can verify that the next method returns a type from which a dyn* can be constructed. If that’s not the case, and it’s usually not, you can use the Boxing::new adapter to create a Boxing<AI>. Via some kind of compiler magic that ahem we haven’t fully worked out yet6, you could coerce a Boxing<AI> into a dyn AsyncIterator.

The details of the Boxing type need more work7, but the basic idea remains the same: require users to make some explicit opt-in to the default vtable strategy, which may indeed perform allocation.

How does Boxing rank on the design principles?

To my mind, adding the Boxing adapter ranks as follows…

  • 🐎 Performant: meh. This is roughly the same as before. We’ll come back to this.
  • 🥰 Supportive: yes! The error message guides you to exactly what you need to do, and hopefully links to a well-written explanation that can help you learn about why this is required.
  • 🧩 Productive: meh. Having to add a Boxing::new call each time you create a dyn AsyncIterator is not great, but also on-par with other Rust papercuts.
  • 🔧 Transparent: yes! It is easy to see that boxing may occur in the future now.

This design is now transparent. It’s also less productive than before, but we’ve tried to make up for it with supportiveness. “Rust isn’t always easy, but it’s always helpful.”

Improving performance with a more complex ABI

One thing that bugs me about the “box by default” strategy is that the performance is only “meh”. I like stories like Iterator, where you write nice code and you get tight loops. It bothers me that writing “nice” async code yields a naive, middling efficiency story.

That said, I think this is something we could fix in the future, and I think we could fix it backwards compatibly. The idea would be to extend our ABI when doing virtual calls so that the caller has the option to provide some “scratch space” for the callee. For example, we could then do things like analyze the binary to get a good guess as to how much stack space is needed (either by doing dataflow or just by looking at all implementations of AsyncIterator). We could then have the caller reserve stack space for the future and pass a pointer into the callee — the callee would still have the option of allocating, if for example, there wasn’t enough stack space, but it could make use of the space in the common case.

Interestingly, I think that if we did this, we would also be putting some pressure on Rust’s “transparency” story again. While Rust leans heavily on optimizations to get performance, we’ve generally restricted ourselves to simple, local ones like inlining; we don’t require interprocedural dataflow in particular, although of course it helps (and LLVM does it). But getting a good estimate of how much stack space to reserve for potential callees would violate that rule (we’d also need some simple escape analysis, as I describe in Appendix A). All of this adds up to a bit of ‘performance unpredictability’. Still, I don’t see this as a big problem, particularly since the fallback is just to use Box::new, and as we’ve said, for most users that is perfectly adequate.

Picking another strategy, such as inlining

Of course, maybe you don’t want to use Boxing. It would also be possible to construct other kinds of adapters, and they would work in a similar fashion. For example, an inlining adapter might look like:

fn make_dyn<AI: AsyncIterator>(ai: AI) {
    use_dyn(&mut InlineAsyncIterator::new(ai));
    //           ^^^^^^^^^^^^^^^^^^^^^^^^
}

The InlineAsyncIterator<AI> type would add the extra space to store the future, so that when the next method is called, it writes the future into its own fields and then returns it to the caller. Similarly, a cached box adapter might be &mut CachedAsyncIterator::new(ai), only it would use a field to cache the resulting Box.

You may have noticed that the inline/cached adapters include the name of the trait. That’s because they aren’t relying on compiler magic like Boxing, but are instead intended to be authored by end-users, and we don’t yet have a way to be generic over any trait definition. (The proposal as we wrote it uses macros to generate an adapter type for any trait you wish to adapt.) This is something I’d love to address in the future. You can read more about how adapters work here.

Conclusion

OK, so let’s put it all together into a coherent design proposal:

  • You cannot coerce from an arbitrary type AI into a dyn AsyncIterator. Instead, you must select an adaptor:
    • Typically you want Boxing, which has a decent performance profile and “just works”.
    • But users can write their own adapters to implement other strategies, such as InlineAsyncIterator or CachingAsyncIterator.
  • From an implementation perspective:
    • When invoked via dynamic dispatch, async functions return a dyn* Future. The caller can invoke poll via virtual dispatch and invoke the (virtual) drop function when it’s ready to dispose of the future.
    • The vtable created for Boxing<AI> will allocate a box to store the future AI::next() and use that to create the dyn* Future.
    • The vtable for other adapters can use whatever strategy they want. InlineAsyncIterator<AI>, for example, stores the AI::next() future into a field in the wrapper, takes a raw pointer to that field, and creates a dyn* Future from this raw pointer.
  • Possible future extension for better performance:8
    • We modify the ABI for async trait functions (or any trait function using return-position impl trait) to allow the caller to optionally provide stack space. The Boxing adapter, if such stack space is available, will use it to avoid boxing when it can. This would have to be coupled with some compiler analysis to figure out how much stack space to pre-allocate.

This lets us express virtually any pattern. It’s even possible to express side-stacks, if the runtime provides a suitable adapter (e.g., TokioSideStackAdapter::new(ai)), though if side-stacks become popular I would rather consider a more standard means to expose them.

The main downsides to this proposal are:

  • Users have to write Boxing::new, which is a productivity and learnability hit, but it avoids a big hit to transparency. Is that the right call? I’m still not entirely sure, though my heart increasingly says yes. It’s also something we could revisit in the future (e.g., add a default adapter).
  • If we opt to modify the ABI, we’re adding some complexity there, but in exchange for potentially quite a lot of performance. I would expect us not to do this initially, but to explore it as an extension in the future once we have more data about how important it is.

There is one pattern that we can’t express: “have caller allocate maximal space”. This pattern guarantees that heap allocation is not needed; the best we can do is a heuristic that tries to avoid heap allocation, since we have to consider public functions on crate boundaries and the like. To offer a guarantee, the argument type needs to change from &mut dyn AsyncIterator (which accepts any async iterator) to something narrower. This would also support futures that escape the stack frame (see Appendix A below). It seems likely that these details don’t matter, and that either inline futures or heuristics would suffice, but if not, a crate like stackfuture remains an option.

Comments?

Please leave comments in this internals thread. Thanks!

Appendix A: futures that escape the stack frame

In all of this discussion, I’ve been assuming that the async call was followed closely by an await. But what happens if the future is not awaited, but instead is moved into the heap or other locations?

fn foo(x: &mut dyn AsyncIterator<Item = u32>) -> impl Future<Output = Option<u32>> + '_ {
    x.next()
}

For boxing, this kind of code doesn’t pose any problem at all. But if we had allocated space on the stack to store the future, examples like this would be a problem. So long as the scratch space is optional, with a fallback to boxing, this is no problem. We can do an escape analysis and avoid the use of scratch space for examples like this.

Footnotes

  1. Written in Sep 2020, egads! 

  2. I was intrigued to learn that this is what Ada does, and that Ada features like returning dynamically sized types are built on this model. I’m not sure how SPARK and other Ada subsets that target embedded spaces manage that, I’d like to learn more about it. 

  3. Of course, without a side stack, we are left using mechanisms like Box::new to cover cases like dynamic dispatch or recursive functions. This becomes a kind of pessimistically sized segmented stack, where we allocate for each little piece of extra state that we need. A side stack might be an appealing middle ground, but because of cases like embassy, it can’t be the only option. 

  4. Ironically, C++ itself inserts implicit heap allocations to help with coroutines! 

  5. Suggestions for a better name very welcome. 

  6. Pay no attention to the compiler author behind the curtain. 🪄 🌈 Avert your eyes! 

  7. e.g., if you look closely at the User’s Guide from the Future, you’ll see that it writes Boxing::new(&mut ai), and not &mut Boxing::new(ai). I go back and forth on this one. 

  8. I should clarify that, while Tyler and I have discussed this, I don’t know how he feels about it. I wouldn’t call it ‘part of the proposal’ exactly, more like an extension I am interested in. 

Cameron KaiserSeptember patch set for TenFourFox

102 is now the next Firefox Extended Support Release, so it's time for spring cleaning — if you're a resident of the Southern Hemisphere — in the TenFourFox repository. Besides refreshing the maintenance scripts to pull certificate, timezone and HSTS updates from this new source, I also implemented all the relevant security and stability patches from the last gasp of 91ESR (none likely to be exploitable on Power Macs without a direct attack, but many likely to crash them), added an Fx102 user agent choice to the TenFourFox preference pane, updated the ATSUI font blacklist (thanks to Chris T for the report) and updated zlib to 1.2.12, picking up multiple bug fixes and some modest performance improvements. This touches a lot of low-level stuff so updating will require a complete rebuild from scratch (instructions). Sorry about that, it's necessary!

If you're new to building your own copy of TenFourFox, this article from last year is still current with the process and what's out there for alternatives and assistance.

Mozilla Performance BlogA different perspective

Usually, in our articles, we talk about performance from the performance engineer’s perspective, but in this one, I want to take a step back and look at it from another perspective. Earlier this year, I talked to an engineer about including more debugging information in the bugs we are filing for regressions. Trying to put that discussion into context, I realized the performance sheriffing process is complex and that many of our engineers have limited knowledge of how we detect regressions, how we identify the patch that introduced it, and how to respond to a notification of a regression.

As a result, I decided to make a recording about how a Sheriff catches a regression and files the bug, and then how the engineer that wrote the source code causing the regression can get the information they need to resolve it. The video below has a simplified version of how the Performance Sheriffs open a performance regression.

In short, if there’s no test gap between the last good and the first regressed revision, a regression will be filed on the bug that caused it and linked to the alert.

Filing a regression – Demo

I caused a regression! Now what?

If you caused a regression then a sheriff will open a regression bug and set the regressor bug’s id in the Regressed by field. In the regression description, you’ll find the tests that regressed and you’ll be able to view a particular graph or the updated alert. Note: almost always the alert contains more items than the description. The video below will show you how to zoom in and find the regressing data point, see the job, trigger a profiler run, and see the taskcluster task for it. There you’ll find the payload, dependencies, artifacts, or parsed log.

Investigating a regression – Demo

The full process of investigating an alert and finding the cause of a regression is much more complex than these examples. It has three phases before, and one after, which are: triaging the alerts, investigating the graphs, and filing the regression bug. The one after is following up and offering support to the author of the regressing bug to understand and/or fix the issue. These phases are illustrated below.

Sheriffing Workflow

Improvements

We have made several small improvements to the regression bug template that are worth noting:

  • We added links to the ratio (magnitude) column that open the graph of each alert item
  • Previously the performance sheriffs set the severity of the regression, but we now allow the triage owners to determine severity based on the data provided
  • We added a paragraph that lets you know you can trigger profiling jobs for the regressed tests before and after the commit, or ask the sheriff to do this for you.
  • We added a cron job that will trigger performance tests for patches that are most likely to change the performance numbers

Work in progress

There are also three impactful projects in terms of performance:

  1. Integrating the side-by-side script to CI, the ultimate goal being to have side-by-side video comparisons generated automatically on regressions. Currently, there’s a local perftest-tools command that does the comparison.
  2. Having the profiler automatically triggered for the same purpose: having more investigation data available when a regression happens.
  3. Developing a more user-friendly performance comparison tool, PerfCompare, to replace Perfherder Compare View.

Mozilla Open Policy & Advocacy BlogMozilla Responds to EU General Court’s Judgment on Google Android

This week, the EU’s General Court largely upheld the decision sanctioning Google for restricting competition on the Android mobile operating system. But, on their own, the judgment and the record fine do not help to unlock competition and choice online, especially when it comes to browsers.

In July 2018, when the European Commission announced its decision, we expressed hope that the result would help to level the playing field for independent browsers like Firefox and provide real choice for consumers. Sadly for billions of people around the world who use browsers every day, this hope has not been realized – yet.

The case may rumble on in appeals for several more years, but Mozilla will continue to advocate for an Internet which is open, accessible, private, and secure for all, and we will continue to build products which advance this vision. We hope that those with the power to improve browser choice for consumers will also work towards these tangible goals.

The post Mozilla Responds to EU General Court’s Judgment on Google Android appeared first on Open Policy & Advocacy.

The Rust Programming Language BlogConst Eval (Un)Safety Rules

In a recent Rust issue (#99923), a developer noted that the upcoming 1.64-beta version of Rust had started signalling errors on their crate, icu4x. The icu4x crate uses unsafe code during const evaluation. Const evaluation, or just "const-eval", runs at compile-time but produces values that may end up embedded in the final object code that executes at runtime.

Rust's const-eval system supports both safe and unsafe Rust, but the rules for what unsafe code is allowed to do during const-eval are even more strict than what is allowed for unsafe code at runtime. This post is going to go into detail about one of those rules.

(Note: If your const code does not use any unsafe blocks or call any const fn with an unsafe block, then you do not need to worry about this!)

A new diagnostic to watch for

The problem, reduced over the course of the comment thread of #99923, is that certain static initialization expressions (see below) are defined as having undefined behavior (UB) at compile time (playground):

pub static FOO: () = unsafe {
    let illegal_ptr2int: usize = std::mem::transmute(&());
    let _copy = illegal_ptr2int;
};

(Many thanks to @eddyb for the minimal reproduction!)

The code above was accepted by Rust versions 1.63 and earlier, but in the Rust 1.64-beta, it now causes a compile time error with the following message:

error[E0080]: could not evaluate static initializer
 --> demo.rs:3:17
  |
3 |     let _copy = illegal_ptr2int;
  |                 ^^^^^^^^^^^^^^^ unable to turn pointer into raw bytes
  |
  = help: this code performed an operation that depends on the underlying bytes representing a pointer
  = help: the absolute address of a pointer is not known at compile-time, so such operations are not supported

As the message says, this operation is not supported: the transmute above is trying to reinterpret the memory address &() as an integer of type usize. The compiler cannot predict what memory address the () would be associated with at execution time, so it refuses to allow that reinterpretation.

When you write safe Rust, then the compiler is responsible for preventing undefined behavior. When you write any unsafe code (be it const or non-const), you are responsible for preventing UB, and during const-eval, the rules about what unsafe code has defined behavior are even more strict than the analogous rules governing Rust's runtime semantics. (In other words, more code is classified as "UB" than you may have otherwise realized.)

If you hit undefined behavior during const-eval, the Rust compiler will protect itself from adverse effects such as the undefined behavior leaking into the type system, but there are few guarantees other than that. For example, compile-time UB could lead to runtime UB. Furthermore, if you have UB at const-eval time, there is no guarantee that your code will be accepted from one compiler version to another.

What is new here

You might be thinking: "it used to be accepted; therefore, there must be some value for the memory address that the previous version of the compiler was using here."

But such reasoning would be based on an imprecise view of what the Rust compiler was doing here.

The const-eval machinery of the Rust compiler (also known as "the CTFE engine") is built upon a MIR interpreter which uses an abstract model of a hypothetical machine as the foundation for evaluating such expressions. This abstract model doesn't have to represent memory addresses as mere integers; in fact, to support fine-grained checking for UB, it uses a much richer datatype for the values that are held in the abstract memory store.
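
As a purely illustrative toy (these are not the CTFE engine’s actual types), the idea is that an interpreter value can be “a known integer” or “a pointer into some allocation whose absolute address is unknown”, rather than always being a plain number:

// Toy sketch only; the real CTFE/Miri value representation is richer and
// named differently.
#[allow(dead_code)]
enum AbstractValue {
    // An integer whose bits are fully known at compile time.
    Int(u128),
    // A pointer, identified by which allocation it points into plus an
    // offset, but with no known absolute machine address.
    Ptr { alloc_id: u64, offset: u64 },
}

Transmuting a Ptr-style value to usize cannot yield an Int, because the absolute address simply is not known at compile time; that is what the diagnostic above is complaining about.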

(The aforementioned MIR interpreter is also the basis for Miri, a research tool that interprets non-const Rust code, with a focus on explicit detection of undefined behavior. The Miri developers are the primary contributors to the CTFE engine in the Rust compiler.)

The details of the CTFE engine's value representation do not matter too much for our discussion here. We merely note that earlier versions of the compiler silently accepted expressions that seemed to transmute memory addresses into integers, copied them around, and then transmuted them back into addresses; but that was not what was actually happening under the hood. Instead, what was happening was that the values were passed around blindly (after all, the whole point of transmute is that it does no transformation on its input value, so it is a no-op in terms of its operational semantics).

The fact that it was passing a memory address into a context where you would expect there to always be an integer value would only be caught, if at all, at some later point.

For example, the const-eval machinery rejects code that attempts to embed the transmuted pointer into a value that could be used by runtime code, like so (playground):

pub static FOO: usize = unsafe {
    let illegal_ptr2int: usize = std::mem::transmute(&());
    illegal_ptr2int
};

Likewise, it rejects code that attempts to perform arithmetic on that non-integer value, like so (playground):

pub static FOO: () = unsafe {
    let illegal_ptr2int: usize = std::mem::transmute(&());
    let _incremented = illegal_ptr2int + 1;
};

Both of the latter two variants are rejected in stable Rust, and have been for as long as Rust has accepted pointer-to-integer conversions in static initializers (see e.g. Rust 1.52).

More similar than different

In fact, all of the examples provided above are exhibiting undefined behavior according to the semantics of Rust's const-eval system.

The first example with _copy was accepted in Rust versions 1.46 through 1.63 because of CTFE implementation artifacts. The CTFE engine puts considerable effort into detecting UB, but does not catch all instances of it. Furthermore, by default, such detection can be delayed to a point far after where the actual problematic expression is found.

But with nightly Rust, we can opt into extra checks for UB that the engine provides, by passing the unstable flag -Z extra-const-ub-checks. If we do that, then for all of the above examples we get the same result:

error[E0080]: could not evaluate static initializer
 --> demo.rs:2:34
  |
2 |     let illegal_ptr2int: usize = std::mem::transmute(&());
  |                                  ^^^^^^^^^^^^^^^^^^^^^^^^ unable to turn pointer into raw bytes
  |
  = help: this code performed an operation that depends on the underlying bytes representing a pointer
  = help: the absolute address of a pointer is not known at compile-time, so such operations are not supported

The earlier examples had diagnostic output that put the blame in a misleading place. With the more precise checking -Z extra-const-ub-checks enabled, the compiler highlights the expression where we can first witness UB: the original transmute itself! (Which was stated at the outset of this post; here we are just pointing out that these tools can pinpoint the injection point more precisely.)

Why not have these extra const-ub checks on by default? Well, the checks introduce performance overhead upon Rust compilation time, and we do not know if that overhead can be made acceptable. (However, recent debate among Miri developers indicates that the inherent cost here might not be as bad as they had originally thought. Perhaps a future version of the compiler will have these extra checks on by default.)

Change is hard

You might well be wondering at this point: "Wait, when is it okay to transmute a pointer to a usize during const evaluation?" And the answer is simple: "Never."

Transmuting a pointer to a usize during const-eval has always been undefined behavior, ever since const-eval added support for transmute and union. You can read more about this in the const_fn_transmute / const_fn_union stabilization report, specifically the subsection entitled "Pointer-integer-transmutes". (It is also mentioned in the documentation for transmute.)

Thus, we can see that the classification of the above examples as UB during const evaluation is not a new thing at all. The only change here was that the CTFE engine had some internal changes that made it start detecting the UB rather than silently ignoring it.

This means the Rust compiler has a shifting notion of what UB it will explicitly catch. We anticipated this: RFC 3016, "const UB", explicitly says:

[...] there is no guarantee that UB is reliably detected during CTFE. This can change from compiler version to compiler version: CTFE code that causes UB could build fine with one compiler and fail to build with another. (This is in accordance with the general policy that unsound code is not subject to stability guarantees.)

Having said that: So much of Rust's success has been built around the trust that we have earned with our community. Yes, the project has always reserved the right to make breaking changes when resolving soundness bugs; but we have also strived to mitigate such breakage whenever feasible, via things like future-incompatible lints.

Today, with our current const-eval architecture, it is not feasible to ensure that changes such as the one that injected issue #99923 go through a future-incompat warning cycle. The compiler team plans to keep our eye on issues in this space. If we see evidence that these kinds of changes do cause breakage to a non-trivial number of crates, then we will investigate further how we might smooth the transition path between compiler releases. However, we need to balance any such goal against the fact that Miri has a very limited set of developers: the researchers determining how to define the semantics of unsafe languages like Rust. We do not want to slow their work down!

What you can do for safety's sake

If you observe the could not evaluate static initializer message on your crate atop Rust 1.64, and it was compiling with previous versions of Rust, we want you to let us know: file an issue!

We have performed a crater run for the 1.64-beta and that did not find any other instances of this particular problem. If you can test compiling your crate atop the 1.64-beta before the stable release goes out on September 22nd, all the better! One easy way to try the beta is to use rustup's override shorthand for it:

$ rustup update beta
$ cargo +beta build

As Rust's const-eval evolves, we may see another case like this arise again. If you want to defend against future instances of const-eval UB, we recommend that you set up a continuous integration service to invoke the nightly rustc with the unstable -Z extra-const-ub-checks flag on your code.
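
For example (this exact invocation is only a sketch; the -Z extra-const-ub-checks flag itself is the only part taken from this post), such a CI job could run something like:

$ rustup toolchain install nightly
$ RUSTFLAGS="-Z extra-const-ub-checks" cargo +nightly build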

Want to help?

As you might imagine, a lot of us are pretty interested in questions such as "what should be undefined behavior?"

See for example Ralf Jung's excellent blog series on why pointers are complicated (parts I, II, III), which contain some of the details elided above about the representation of pointer values, and spell out reasons why you might want to be concerned about pointer-to-usize transmutes even outside of const-eval.

If you are interested in trying to help us figure out answers to those kinds of questions, please join us in the unsafe code guidelines zulip.

If you are interested in learning more about Miri, or contributing to it, you can say Hello in the miri zulip.

Conclusion

To sum it all up: When you write safe Rust, then the compiler is responsible for preventing undefined behavior. When you write any unsafe code, you are responsible for preventing undefined behavior. Rust's const-eval system has a stricter set of rules governing what unsafe code has defined behavior: specifically, reinterpreting (aka "transmuting") a pointer value as a usize is undefined behavior during const-eval. If you have undefined behavior at const-eval time, there is no guarantee that your code will be accepted from one compiler version to another.

The compiler team is hoping that issue #99923 is an exceptional fluke and that the 1.64 stable release will not encounter any other surprises related to the aforementioned change to the const-eval machinery.

But fluke or not, the issue provided excellent motivation to spend some time exploring facets of Rust's const-eval architecture and the interpreter that underlies it. We hope you enjoyed reading this as much as we did writing it.

This Week In RustThis Week in Rust 460

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Foundation
Newsletters
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is bstr, a fast and featureful byte-string library.

Thanks to 8573 for the suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

324 pull requests were merged in the last week

Rust Compiler Performance Triage

From the viewpoint of metrics gathering, this was an absolutely terrible week, because the vast majority of this week's report is dominated by noise. Several benchmarks (html5ever, cranelift-codegen, and keccak) have all been exhibiting bimodal behavior where their compile-times would regress and improve randomly from run to run. Looking past that, we had one small win from adding an inline directive.

Triage done by @pnkfelix. Revision range: e7cdd4c0..17cbdfd0

Summary:

(instructions:u)             mean     range              count
Regressions ❌ (primary)      1.1%    [0.2%, 6.2%]       26
Regressions ❌ (secondary)    1.9%    [0.1%, 5.6%]       34
Improvements ✅ (primary)    -1.8%    [-29.4%, -0.2%]    42
Improvements ✅ (secondary)  -1.3%    [-5.3%, -0.2%]     50
All ❌✅ (primary)            -0.7%    [-29.4%, 6.2%]     68

11 Regressions, 11 Improvements, 13 Mixed; 11 of them in rollups
71 artifact comparisons made in total

Full report here

Call for Testing

An important step for RFC implementation is for people to experiment with the implementation and give feedback, especially before stabilization. The following RFCs would benefit from user testing before moving forward:

  • No RFCs issued a call for testing this week.

If you are a feature implementer and would like your RFC to appear on the above list, add the new call-for-testing label to your RFC along with a comment providing testing instructions and/or guidance on which aspect(s) of the feature need testing.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week, the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs
New and Updated RFCs
  • No New or Updated RFCs were created this week.

Upcoming Events

Rusty Events between 2022-09-14 - 2022-10-12 🦀

Virtual
Europe
North America
Oceania

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Jobs

Please see the latest Who's Hiring thread on r/rust

Quote of the Week

In Rust We Trust

Alexander Sidorov on Medium

Thanks to Anton Fetisov for the suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, cdmistman, ericseppanen, extrawurst, andrewpollack, U007D, kolharsam, joelmarcey, mariannegoldin.

Email list hosting is sponsored by The Rust Foundation

Discuss on r/rust

The Rust Programming Language BlogSecurity advisories for Cargo (CVE-2022-36113, CVE-2022-36114)

This is a cross-post of the official security advisory. The official advisory contains a signed version with our PGP key, as well.

The Rust Security Response WG was notified that Cargo did not prevent extracting some malformed packages downloaded from alternate registries. An attacker able to upload packages to an alternate registry could fill the filesystem or corrupt arbitrary files when Cargo downloaded the package.

These issues have been assigned CVE-2022-36113 and CVE-2022-36114. The severity of these vulnerabilities is "low" for users of alternate registries. Users relying on crates.io are not affected.

Note that by design Cargo allows code execution at build time, due to build scripts and procedural macros. The vulnerabilities in this advisory allow performing a subset of the possible damage in a harder-to-track-down way. Your dependencies must still be trusted if you want to be protected from attacks, as it's possible to perform the same attacks with build scripts and procedural macros.

Arbitrary file corruption (CVE-2022-36113)

After a package is downloaded, Cargo extracts its source code in the ~/.cargo folder on disk, making it available to the Rust projects it builds. To record when an extraction is successful, Cargo writes "ok" to the .cargo-ok file at the root of the extracted source code once it has extracted all the files.

It was discovered that Cargo allowed packages to contain a .cargo-ok symbolic link, which Cargo would extract. Then, when Cargo attempted to write "ok" into .cargo-ok, it would actually replace the first two bytes of the file the symlink pointed to with ok. This would allow an attacker to corrupt one file on the machine using Cargo to extract the package.

Disk space exhaustion (CVE-2022-36114)

It was discovered that Cargo did not limit the amount of data extracted from compressed archives. An attacker could upload to an alternate registry a specially crafted package that extracts way more data than its size (also known as a "zip bomb"), exhausting the disk space on the machine using Cargo to download the package.

Affected versions

Both vulnerabilities are present in all versions of Cargo. Rust 1.64, to be released on September 22nd, will include fixes for both of them.

Since these vulnerabilities are just a more limited way to accomplish what malicious build scripts or procedural macros can do, we decided not to publish Rust point releases backporting the security fix. Patch files for Rust 1.63.0 are available in the wg-security-response repository for people building their own toolchains.

Mitigations

We recommend users of alternate registries to exercise care in which packages they download, by only including trusted dependencies in their projects. Please note that even with these vulnerabilities fixed, by design Cargo allows arbitrary code execution at build time thanks to build scripts and procedural macros: a malicious dependency will be able to cause damage regardless of these vulnerabilities.

crates.io implemented server-side checks to reject these kinds of packages years ago, and there are no packages on crates.io exploiting these vulnerabilities. crates.io users still need to exercise care in choosing their dependencies though, as the same concerns about build scripts and procedural macros apply here.

Acknowledgements

We want to thank Ori Hollander from JFrog Security Research for responsibly disclosing this to us according to the Rust security policy.

We also want to thank Josh Triplett for developing the fixes, Weihang Lo for developing the tests, and Pietro Albini for writing this advisory. The disclosure was coordinated by Pietro Albini and Josh Stone.

Support.Mozilla.OrgTribute to FredMcD

It brings us great sadness to share the news that FredMcD has recently passed away.

If you ever posted a question to our Support Forum, you may be familiar with a contributor named “FredMcD”. Fred was one of the most active contributors in Mozilla Support, and for many years remained one of our core contributors. He was awarded a forum contributor badge every year since 2013 for his consistency in contributing to the Support Forum.

He was a dedicated contributor, super helpful, and very loyal to Firefox users, making over 81,400 contributions to the Support Forum since 2013. During the COVID-19 lockdown period, he focused on helping people all over the world when they were online the most – at one point he was doing approximately 3,600 responses in 90 days, an average of 40 a day.

In March 2022, I learned the news that he was hospitalized for a few weeks. He was back active in our forum shortly after he was discharged. But then we never heard from him again after his last contribution on May 5, 2022. There’s very little we know about Fred. But we were finally able to confirm his passing just recently.

We surely lost a great contributor. He was a helpful community member and his assistance with incidents was greatly appreciated. His support approach was always straightforward and simple. It was not rare for him to solve a problem in one go, like this or this one.

To honor his passing, we added his name to the about:credits page to make sure that his contribution and impact on Mozilla will never be forgotten. He will surely be missed by the community.


I’d like to thank Paul for his collaboration in this post and for his help in getting Fred’s name to the about:credits page. Thanks, Paul!

 

Mozilla ThunderbirdThunderbird Tip: Customize Colors In The Spaces Toolbar

In our last video tip, you learned how to manually sort the order of all your mail and account folders. Let’s keep that theme of customization rolling forward with a quick video guide on customizing the Spaces Toolbar that debuted in Thunderbird 102.

The Spaces Toolbar is on the left hand side of your Thunderbird client and gives you fast, easy access to your most important activities. With a single click you can navigate between Mail, Address Books, Calendars, Tasks, Chat, settings, and your installed add-ons and themes.

Watch below how to customize it!

Video Guide: Customizing The Spaces Toolbar In Thunderbird

This 2-minute tip video shows you how to easily customize the Spaces Toolbar in Thunderbird 102.

*Note that the color tools available to you will vary depending on the operating system you’re using. If you’re looking to discover some pleasing color palettes, we recommend the excellent, free tools at colorhunt.co.


Have You Subscribed To Our YouTube Channel?

We’re currently building the next exciting era of Thunderbird, and developing a Thunderbird experience for mobile. We’re also putting out more content and communication across various platforms to keep you informed. And, of course, to show you some great usage tips along the way.

To accomplish that, we’ve launched our YouTube channel to help you get the most out of Thunderbird. You can subscribe here. Help us reach more people than ever before by liking each video and leaving a comment if it helped!


Another Tip Before You Go?

The post Thunderbird Tip: Customize Colors In The Spaces Toolbar appeared first on The Thunderbird Blog.

The Mozilla BlogAnnouncing Carlos Torres, Mozilla’s new Chief Legal Officer

I am pleased to announce that starting today, September 12, Carlos Torres has joined Mozilla as our Chief Legal Officer. In this role Carlos will be responsible for leading our global legal and public policy teams, developing legal, regulatory and policy strategies that support Mozilla’s mission. He will also manage all regulatory issues and serve as a strategic business partner helping us accelerate our growth and evolution. Carlos will also serve as Corporate Secretary. He will report to me and join our steering committee.

Carlos Torres joins Mozilla executive team.

Carlos stood out in the interview process because of his great breadth of experience across many topics including strategic and commercial agreements, product, privacy, IP, employment, board matters, investments, regulatory and litigation. He brings experience in both large and small companies, and in organizations with different risk profiles as well as a deep belief in Mozilla’s commitment to innovation and to an open internet.

“Mozilla continues to be a unique and respected voice in technology, in a world that needs trusted institutions more than ever,” said Torres. “There is no other organization that combines community, product, technology and advocacy to produce trusted innovative products that people love. I’ve always admired Mozilla for its principled, people-focused approach and I’m grateful for the opportunity to contribute to Mozilla’s mission and evolution.”

Carlos comes to us most recently from Flashbots where he led the company’s legal and strategic initiatives. Prior to that, he was General Counsel for two startups and spent over a decade at Salesforce in a variety of leadership roles including VP, Business Development and Strategic Alliances and VP, Associate General Counsel, Chief of Staff. He also served as senior counsel of a biotech company and started his legal career at Orrick, Herrington & Sutcliffe.

The post Announcing Carlos Torres, Mozilla’s new Chief Legal Officer appeared first on The Mozilla Blog.

IRL (podcast)The AI Medicine Cabinet

Life, death and data. AI’s capacity to support research on human health is well documented. But so are the harms of biased datasets and misdiagnoses. How can AI developers build healthier systems? We take a look at a new dataset for Black skin health, a Covid chatbot in Rwanda, AI diagnostics in rural India, and elusive privacy in mental health apps.

Avery Smith is a software engineer in Maryland who lost his wife to skin cancer. This inspired him to create the Black Skin Health AI Dataset and the web app, Melalogic.

Remy Muhire works on open source speech recognition software in Rwanda, including a Covid-19 chatbot, Mbaza, which 2 million people have used so far.

Radhika Radhakrishnan is a feminist scholar who studies how AI diagnostic systems are deployed in rural India by tech companies and hospitals, as well as the limits of consent.

Jen Caltrider is the lead investigator on a special edition of Mozilla’s “Privacy Not Included” buyer’s guide that investigated the privacy and security of mental health apps.

IRL is an original podcast from Mozilla, the non-profit behind Firefox. In Season 6, host Bridget Todd shares stories of people who make AI more trustworthy in real life. This season doubles as Mozilla’s 2022 Internet Health Report. Go to the report for show notes, transcripts, and more.

 

 

Firefox NightlyThese Weeks In Firefox: Issue 123

Highlights

  • We have a brand new batch of students from CalState LA working with us on improving the Migrator component as part of a Capstone course! They’re just getting ramped up, so stay tuned to hear more from them.
  • WebExtension Manifest v3 work is well underway
    • Added initial support for the subset of the DeclarativeNetRequest API used to manage the session rules
    • Support for Event Pages (background pages with the persistent flag set to false) has been enabled for Manifest v2 WebExtensions on all channels
      • This should make it easier to gradually migrate Manifest v2 WebExtensions to Manifest v3.
  • The DevTools team has some great updates for you:
    • They’ve made opening the debugger ~6 to 9% faster (perfherder), by changing how we’re handling syntax highlighting (bug)
    • Julian made opening the StyleEditor 90% faster (perfherder) on pages with minified stylesheets (bug)
      • This impacts websites like Gmail, but also probably opening the StyleEditor in BrowserToolbox
    • Go check out the DevTools section down below for more awesome improvements!
  • Janvi01 added new seek button controls to the Picture-in-Picture player window to move forward or backward 5 seconds
    • This must be enabled by setting `media.videocontrols.picture-in-picture.improved-video-controls.enabled` to `true` in about:config
    • This might undergo some visual redesign as we consider additional controls
Screenshot of a Picture-in-Picture window displaying video player controls, including newly added seek forward and seek backward buttons

The seek buttons allow you to move forward or backward by 5 seconds on Picture-in-Picture.

Friends of the Firefox team

Resolved bugs (excluding employees)

Script to find new contributors from bug list

Volunteers that fixed more than one bug

  • Itiel
  • Jonas Jenwald [:Snuffleupagus]

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions

WebExtensions Framework
  • Fixed a regression related to the extension sidebar panel becoming blank when the user changes the browser language while the sidebar panel is open – Bug 1786262 (regressed in Firefox 100 by Bug 1755181)
  • Fixed a recent browserAction popup regression affecting action popups opened while the widget was automatically moved into the overflow menu – Bug 1786587 (regressed recently in the current Nightly 106 by Bug 1783972)
  • John Bieling moved LanguageDetector.jsm to the toolkit level to fix the i18n.detectLanguage API in non Firefox Desktop builds, including Firefox for Android (where LanguageDetector.jsm was not available) – Bug 1764698 / Bug 1712214
Addon Manager & about:addons
  • Nicolas Chevobbe migrated GMPProvider.jsm to system ESM modules – Bug 1787724
  • Itiel contributed a fix to about:addons styling to improve contrast on the “Available Updates” badge in dark mode – Bug 1787621
WebExtensions APIs
  • The action API (browserAction/pageAction) setPopup now accepts only same-extension URLs as popup URLs; this restriction is enforced on manifest_version 3 extensions and extended to manifest_version 2 extensions on Firefox for Android (and in general GeckoView builds) – Bug 1760608
  • As part of the ongoing ManifestVersion 3 (MV3) work:
    • Event Pages:
      • Extend Event Pages lifetime if there are API listener calls still pending – Bug 1785294
    • Follow-ups related to content scripts registered as persistAcrossSession using the new scripting API
      • Clear persisted content scripts on add-on updates – Bug 1787145
      • Avoid waiting for the scripting API rkv store initialization on add-on startup if the extension doesn’t have any persisted content scripts – Bug 1785898
    • Enforced a stricter append-only behavior on manifest_version 3 extensions changing the ContentSecurityPolicy headers from a WebRequest blocking listener – Bug 1785821
      • Similar restrictions are expected to be applied also to other security headers as part of a separate followup – Bug 1786919
      • These changes introduce a stricter behavior for manifest_version 3; we expect more follow-ups will be needed to open up again use cases that are not allowed for extensions running under this stricter behavior.

Developer Tools

DevTools
  • After adding a similar feature for Headers objects, Colin Cazabet added previews for FormData instances in the console and debugger (bug)
Screenshot of a FormData instance being previewed on the Browser Toolbox console

Want to be better informed of a FormData instance? You can now view previews on the Browser Toolbox console.

  • Alex added the “reload” button to WebExtension toolboxes (bug)
Screenshot of a reload button visible on an instance of a WebExtension toolbox debugger

At the click of a button, you can now reload an extension with ease.

  • Alex also added the ability to automatically open DevTools from the webext CLI, adding a --devtools flag (bugzilla bug, webext Github PR)
    • Unfortunately this introduced a regression which was swiftly fixed by Rob Wu (bug)
  • Finally, Alex started ESMifying the DevTools codebase, starting with Launcher.jsm (bug)
    • He will then proceed with the whole devtools/client folder
  • Julian also fixed sorting cookies in the Storage inspector (bug)
  • We fixed a long standing issue where errors and console.log messages would not be displayed in the right order in the console when they were emitted during the same millisecond (bug)
  • Hubert fixed a debugger crash when using “Close all tabs” on a minified file displaying an Error (bug)
  • By popular demand, Hubert reintroduced the simple “Resend” context menu entry in the Netmonitor (bug)
Screenshot of a context menu option "Resend" being shown on the Devtools - in the Network monitor tab

Need to send a new request? No worries – just press “Resend” in the context menu.

  • We now show condition text for @supports rules in the Inspector (bug)…
  • …as well as for @container rules (bug) (container queries are still behind layout.css.container-queries.enabled)
Screenshot of the CSS "@supports" and "@container" rules and their condition texts in the DevTools Inspector.

More details are now displayed alongside the @supports and @container rules.

WebDriver BiDi
  • Julian worked on support for object references
    • He added resultOwnership support for script.evaluate and script.callFunction so you can get a reference (handle id) to a given object from the page, which you can then pass to other commands (bug, bug)
    • And he implemented the script.disown command, which lets the user release the reference so the object can be GCed (bug)
      • Note that those references are also cleared on navigation
  • Henrik fixed an issue in Marionette where the client didn’t mark the session as deleted when in_app shutdown was requested (bug)

 

Fluent

ESMification status

Screenshot of a data table comparing how many ESM and JSM files remain as of September 5, 2022

We’ve migrated ~9.7% of JSM files so far! Thanks to all our contributors who are helping with this effort.

  • ESMification is underway! We have 125 .sys.mjs files and 1278 jsm files, ~9.7%
    • There are more patches in flight.
  • Don’t forget to add [esmification-timeline] to the whiteboard so it shows up on the status page.
  • When adding new system module files, please use the new system rather than old jsm files.
  • Check out this walkthrough to see how to do one of these conversions.
    • Don’t forget to do both parts – 1) changing the modules and 2) updating imports in other files.
  • There’s an #esmification Matrix room for questions and coordination
  • [mconley] A number of ESMification bugs have been filed for our CalState students to get their feet wet on. If you see an open ESMification bug blocking this meta, please don’t take it.

Lint, Docs and Workflow

PDFs and Printing

Performance Tools (aka Firefox Profiler)

  • Added an [X] button on track names in the timeline to quickly hide tracks. (PR #4170)

    Screenshot of a new X button for tracks displayed in Firefox Profiler in order to quickly hide tracks

    X button will be shown when a user hovers a track to quickly hide it.

  • Added self time category breakdown to our bottom panel sidebar. Thanks to our contributor parttimenerd for adding this feature! Example profile (PR #4195)
Screenshot of Firefox Profiler's self time category breakdown being displayed on the bottom panel sidebar

There is now a self time category breakdown below the running time category breakdown.

  • Added the sum of the power usage in power track’s tooltips while selecting a time range. Example profile (PR #4172)
Screenshot of a tooltip displaying total power usage for a power track in Firefox Profiler

Details on power usage can be viewed, especially for a specific range within a track.

  • Landed a big backend refactoring that allows us to not crash on errors during profiling/capturing, such as OOM. (Bug 1612799)
  • Raptor browsertime tests should now output Firefox Profiler profiles for all the treeherder runs by default. You can directly open them in the Firefox Profiler view with a single click. (Bug 1786400)

Search and Navigation

  • Daisuke made a change so that searches from the address bar now get a smaller frecency boost. This should reduce the number of search result pages shown in address bar searches.
  • Daisuke fixed an issue where text selection in the address bar could be lost when switching tabs.
  • Stephanie improved suggestions when typing about: into the address bar.
  • QuickActions:
    • Daisuke has fixed various issues with the inspector and print QuickActions showing in incorrect states, updated the “refresh”, “clear” and “update” actions to take the appropriate action, and made lots of other bug fixes and visual updates.
    • Dale has split out the “add-ons” action into separate “extensions”, “themes” etc actions and prepared QuickActions for upcoming experimentation.

Hacks.Mozilla.OrgThe 100% Markdown Expedition

A snowy mountain peak at sunset

The 100% Markdown Expedition

In June 2021, we decided to start converting the source code for MDN web docs from HTML into a format that would be easier for us to work with. The goal was to get 100% of our manually-written documentation converted to Markdown, and we really had a mountain of source code to climb for this particular expedition.

In this post, we’ll describe why we decided to migrate to Markdown, and the steps you can take that will help us on our mission.

Why get to 100% Markdown?

We want to get all active content on MDN Web Docs to Markdown for several reasons. The top three reasons are:

  • Markdown is a much more approachable and friendlier way to contribute to MDN Web Docs content. Having all content in Markdown will help create a unified contribution experience across languages and repositories.
  • With all content in Markdown, the MDN engineering team will be able to clean up a lot of the currently maintained code. Having less code to maintain will enable them to focus on improving the tooling for writers and contributors. Better tooling will lead to a more enjoyable contribution workflow.
  • All content in Markdown will allow the MDN Web Docs team to run the same linting rules across all active languages.

Here is the tracking issue for this project on the translated content repository.

Tools

This section describes the tools you’ll need to participate in this project.

Git

If you do not have git installed, you can follow the steps described on this getting started page.

https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

If you are on Linux or macOS, you may already have Git. To check, open your terminal and run: git --version

On Windows, there are a couple of options:

GitHub

We’re tracking source code and managing contributions on GitHub, so the following will be needed:

• A GitHub account.
• The GitHub CLI to follow the commands below. (Encouraged, but optional, i.e., if you are already comfortable using Git, you can accomplish all the same tasks without the need for the GitHub CLI.)
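
If you go the GitHub CLI route, a typical first step after installing it is to authenticate (a minimal sketch; the interactive prompts will guide you through the rest):

gh auth login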

Nodejs

First, install nvm – https://github.com/nvm-sh/nvm#installing-and-updating or on Windows https://github.com/coreybutler/nvm-windows

Once all of the above is installed, install Nodejs version 16 with NVM:

nvm install 16
nvm use 16
node --version

This should output a Nodejs version number that is similar to v16.15.1.

Repositories

You’ll need code and content from several repositories for this project, as listed below.

You only need to fork the translated-content repository. We will make direct clones of the other two repositories.

Clone the above repositories and your fork of translated-content as follows using the GitHub CLI:

gh repo clone mdn/markdown
gh repo clone mdn/content
gh repo clone username/translated-content # replace username with your GitHub username

Setting up the conversion tool

cd markdown
yarn

You’ll also need to add some configuration via an .env file. In the root of the directory, create a new file called .env with the following contents:

CONTENT_TRANSLATED_ROOT=../translated-content/files

Setting up the content repository

cd .. # This moves you out of the `markdown` folder
cd content
yarn

Converting to Markdown

I will touch on some specific commands here, but for detailed documentation, please check out the markdown repo’s README.

We maintain a list of documents that need to be converted to Markdown in this Google sheet. There is a worksheet for each language. The worksheets are sorted in the order of the number of documents to be converted in each language – from the lowest to the highest. You do not need to understand the language to do the conversion. As long as you are comfortable with Markdown and some HTML, you will be able to contribute.

NOTE: You can find a useful reference to the flavor of Markdown supported on MDN Web Docs. There are some customizations, but in general, it is based on GitHub flavoured Markdown.

The steps
Creating an issue

On the translated-content repository, go to the Issues tab and click on the “New issue” button. As mentioned in the introduction, there is a tracking issue for this work, so it is good practice to reference the tracking issue in the issue you’ll create.

You will be presented with three options when you click the “New issue” button. For our purposes here, we will choose the “Open a blank issue” option. For the title of the issue, use something like, “chore: convert mozilla/firefox/releases for Spanish to Markdown”. In your description, you can add something like the following:

As part of the larger 100% Markdown project, I am converting the set of documents under mozilla/firefox/releases to Markdown.

NOTE: You will most likely be unable to assign an issue to yourself. The best thing to do here is to mention the localization team member for the appropriate locale and ask them to assign the issue to you. For example, on GitHub you would add a comment like this: “Hey @mdn/yari-content-es I would like to work on this issue, please assign it to me. Thank you!”

You can find a list of teams here.

Updating the spreadsheet

The tracking spreadsheet contains a couple of fields that you should update if you intend to work on specific items. The first thing you need to add is your GitHub username, linking the text to your GitHub profile. Secondly, set the status to “In progress”. In the issue column, paste a link to the issue you created in the previous step.

Creating a feature branch

It is a common practice on projects that use Git and GitHub to follow a feature branch workflow. I therefore need to create a feature branch for the work on the translated-content repository. To do this, we will again use our issue as a reference.

Let’s say your issue was called “chore: convert mozilla/firefox/releases for Spanish to Markdown” with an id of 8192. You will do the following at the root of the translated-content repository folder:

NOTE: The translated content repository is a very active repository. Before creating your feature branch, be sure to pull the latest from the remote using the command git pull upstream main

git pull upstream main
git switch -c 8192-chore-es-convert-firefox-release-docs-to-markdown

NOTE: In older versions of Git, you will need to use git checkout -B 8192-chore-es-convert-firefox-release-docs-to-markdown.

The above command will create the feature branch and switch to it.

Running the conversion

Now you are ready to do the conversion. The Markdown conversion tool has a couple of modes you can run it in:

  • dry – Run the script, but do not actually write any output
  • keep – Run the script and do the conversion, but do not delete the HTML file
  • replace – Do the conversion and delete the HTML file

You will almost always start with a dry run.

NOTE: Before running the command below, ensure that you are in the root of the markdown repository.

yarn h2m mozilla/firefox/releases --locale es --mode dry

This is because the conversion tool will sometimes encounter situations where it does not know how to convert parts of the document. The markdown tool will produce a report with details of the errors encountered. For example:

# Report from 9/1/2022, 2:40:14 PM
## All unhandled elements
- li.toggle (4)
- dl (2)
- ol (1)
## Details per Document
### [/es/docs/Mozilla/Firefox/Releases/1.5](<https://developer.mozilla.org/es/docs/Mozilla/Firefox/Releases/1.5>)
#### Invalid AST transformations
##### dl (101:1) => listItem

type: "text"
value: ""

### [/es/docs/Mozilla/Firefox/Releases/3](<https://developer.mozilla.org/es/docs/Mozilla/Firefox/Releases/3>)
### Missing conversion rules
- dl (218:1)

The first line in the report states that the tool had a problem converting four instances of li.toggle. So, there are four list items with the class attribute set to toggle. In the larger report, there is this section:

### [/es/docs/Mozilla/Firefox/Releases/9](<https://developer.mozilla.org/es/docs/Mozilla/Firefox/Releases/9>)
#### Invalid AST transformations
##### ol (14:3) => list

type: "html"
value: "<li class=\\"toggle\\"><details><summary>Notas de la Versión para Desarrolladores de Firefox</summary><ol><li><a href=\\"/es/docs/Mozilla/Firefox/Releases\\">Notas de la Versión para Desarrolladores de Firefox</a></li></ol></details></li>",type: "html"
value: "<li class=\\"toggle\\"><details><summary>Complementos</summary><ol><li><a href=\\"/es/Add-ons/WebExtensions\\">Extensiones del navegador</a></li><li><a href=\\"/es/Add-ons/Themes\\">Temas</a></li></ol></details></li>",type: "html"
value: "<li class=\\"toggle\\"><details><summary>Firefox por dentro</summary><ol><li><a href=\\"/es/docs/Mozilla/\\">Proyecto Mozilla (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/Gecko\\">Gecko</a></li><li><a href=\\"/es/docs/Mozilla/Firefox/Headless_mode\\">Headless mode</a></li><li><a href=\\"/es/docs/Mozilla/JavaScript_code_modules\\">Modulos de código JavaScript (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/js-ctypes\\">JS-ctypes (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/MathML_Project\\">Proyecto MathML</a></li><li><a href=\\"/es/docs/Mozilla/MFBT\\">MFBT (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/Projects\\">Proyectos Mozilla (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/Preferences\\">Sistema de Preferencias (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/WebIDL_bindings\\">Ataduras WebIDL (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/Tech/XPCOM\\">XPCOM</a></li><li><a href=\\"/es/docs/Mozilla/Tech/XUL\\">XUL</a></li></ol></details></li>",type: "html"
value: "<li class=\\"toggle\\"><details><summary>Crear y contribuir</summary><ol><li><a href=\\"/es/docs/Mozilla/Developer_guide/Build_Instructions\\">Instrucciones para la compilación</a></li><li><a href=\\"/es/docs/Mozilla/Developer_guide/Build_Instructions/Configuring_Build_Options\\">Configurar las opciones de compilación</a></li><li><a href=\\"/es/docs/Mozilla/Developer_guide/Build_Instructions/How_Mozilla_s_build_system_works\\">Cómo funciona el sistema de compilación (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/Developer_guide/Source_Code/Mercurial\\">Código fuente de Mozilla</a></li><li><a href=\\"/es/docs/Mozilla/Localization\\">Localización</a></li><li><a href=\\"/es/docs/Mozilla/Mercurial\\">Mercurial (Inglés)</a></li><li><a href=\\"/es/docs/Mozilla/QA\\">Garantía de Calidad</a></li><li><a href=\\"/es/docs/Mozilla/Using_Mozilla_code_in_other_projects\\">Usar Mozilla en otros proyectos (Inglés)</a></li></ol></details></li>"

The problem is therefore in the file /es/docs/Mozilla/Firefox/Releases/9. In this instance, we can ignore this as we will simply leave the HTML as is in the Markdown. This is sometimes needed as the HTML we need cannot be accurately represented in Markdown. The part you cannot see in the output above is this portion of the file:

<div><section id="Quick_links">
  <ol>
    <li class="toggle">

If you do a search in the main content repo you will find lots of instances of this. In all those cases, you will see that the HTML is kept in place and this section is not converted to Markdown.

The next two problematic items are two dl or description list elements. These elements will require manual conversion using the guidelines in our documentation. The last item, the ol, is actually related to the li.toggle issue. Those list items are wrapped by an ol and, because the tool is not sure what to do with the list items, it also complains about the ordered list item.

Now that we understand what the problems are, we have two options. We can run the exact same command but this time use the replace mode, or we can use the keep mode. I am going to go ahead and run the command with replace. While the previous command did not actually write anything to the translated content repository, when run with replace it will create a new file called index.md with the converted Markdown and delete the index.html that resides in the same directory.

yarn h2m mozilla/firefox/releases --locale es --mode replace
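
If you would rather keep the original HTML around while you review the generated Markdown, the keep mode works the same way; it simply swaps the --mode value used above:

yarn h2m mozilla/firefox/releases --locale es --mode keep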

Following the guidelines from the report, I will have to pay particular attention to the following files post conversion:

  • /es/docs/Mozilla/Firefox/Releases/1.5
  • /es/docs/Mozilla/Firefox/Releases/3
  • /es/docs/Mozilla/Firefox/Releases/9

After running the command, run the following at the root of the translated content repository folder: git status. This will show you a list of the changes made by the command. Depending on the number of files touched, the output can be verbose. The vital thing to keep an eye out for is that there are no changes to folders or files you did not expect.

Testing the changes

Now that the conversion has been done, we need to review the syntax and see that the pages render correctly. This is where the content repo is going to come into play. As with the markdown repository, we also need to create a .env file at the root of the content folder.

CONTENT_TRANSLATED_ROOT=../translated-content/files

With this in place we can start the development server and take a look at the pages in the browser. To start the server, run yarn start. You should see output like the following:

❯ yarn start
yarn run v1.22.17
$ yarn up-to-date-check && env-cmd --silent cross-env CONTENT_ROOT=files REACT_APP_DISABLE_AUTH=true BUILD_OUT_ROOT=build yari-server
$ node scripts/up-to-date-check.js
[HPM] Proxy created: /  -> <https://developer.mozilla.org>
CONTENT_ROOT: /Users/schalkneethling/mechanical-ink/dev/mozilla/content/files
Listening on port 5042

Go ahead and open http://localhost:5042 which will serve the homepage. To find the URL for one of the pages that was converted open up the Markdown file and look at the slug in the frontmatter. When you ran git status earlier, it would have printed out the file paths to the terminal window. The file path will show you exactly where to find the file, for example, files/es/mozilla/firefox/releases/1.5/index.md. Go ahead and open the file in your editor of choice.

In the frontmatter, you will find an entry like this:

slug: Mozilla/Firefox/Releases/1.5

To load the page in your browser, you will always prepend http://localhost:5042/es/docs/ to the slug. In other words, the final URL you will open in your browser will be http://localhost:5042/es/docs/Mozilla/Firefox/Releases/1.5. You can open the English version of the page in a separate tab to compare, but be aware that the content could be wildly different as you might have converted a page that has not been updated in some time.

What you want to look out for is anything in the page that looks like it is not rendering correctly. If you find something that looks incorrect, look at the Markdown file and see if you can find any syntax that looks incorrect or completely broken. It can be extremely useful to use a tool such as VSCode with a Markdown tool and Prettier installed.

Even if the rendered content looks good, do take a minute and skim over the generated Markdown and see if the linters bring up any possible errors.

NOTE: If you see code like this {{FirefoxSidebar}}, this is a macro call. There is not a lot of documentation yet, but these macros come from KumaScript in Yari.

A couple of other things to keep in mind. When you run into an error, before you spend a lot of time trying to understand what exactly the problem is or how to fix it, do the following:

  1. Look for the same page in the content repository and make sure the page still exists. If it was removed from the content repository, you can safely remove it from translated-content as well.
  2. Look at the same page in another language that has already been converted and see how they solved the problem.

For example, I ran into an error where a page I loaded simply printed the following in the browser: Error: 500 on /es/docs/Mozilla/Firefox/Releases/2/Adding_feed_readers_to_Firefox/index.json: SyntaxError: Expected "u" or ["bfnrt\\\\/] but "_" found.. I narrowed it down to the following piece of code inside the Markdown:

{{ languages( { "en": "en/Adding\\_feed\\_readers\\_to\\_Firefox", "ja": "ja/Adding\\_feed\\_readers\\_to\\_Firefox", "zh-tw": "zh\\_tw/\\u65b0\\u589e\\u6d88\\u606f\\u4f86\\u6e90\\u95b1\\u8b80\\u5de5\\u5177" } ) }}

In French it seems that they removed the page, but when I looked in zh-tw it looks like they simply removed this macro call. I opted for the latter and just removed the macro call. This solved the problem and the page rendered correctly. Once you have gone through all of the files you converted it is time to open a pull request.

Preparing and opening a pull request

Start by getting all your changes ready for committing:

# the dot says add everything
git add .

If you run git status now you will see something like the following:

❯ git status
On branch 8192-chore-es-convert-firefox-release-docs-to-markdown
Changes to be committed: # this will be followed by a list of the files that have been added, ready for commit

Commit your changes:

git commit -m 'chore: convert Firefox release docs to markdown for Spanish'

Finally you need to push the changes to GitHub so we can open the pull request:

git push origin 8192-chore-es-convert-firefox-release-docs-to-markdown

You can now head over to the translated content repository on GitHub where you should see a banner that asks whether you want to open a pull request. Click the “Compare & pull request” button and look over your changes on the next page to ensure there are no surprises.

At this point, you can also add some more information and context around the pull request in the description box. It is also critical that you add a line as follows: “Fix #8192”. Substitute the number with the number of the issue you created earlier. This links the issue and the pull request, and once the pull request is merged, GitHub will automatically close the issue.

Once you are satisfied with the changes as well as your description, go ahead and click the button to open the pull request. At this stage GitHub will auto-assign someone from the appropriate localization team to review your pull request. You can now sit back and wait for feedback. Once you receive feedback, address any changes requested by the reviewer and update your pull request.

Once you are both satisfied with the end result, the pull request will be merged and you will have helped us get a little bit closer to 100% Markdown. Thank you! One final step remains though. Open the spreadsheet and update the relevant rows with a link to the pull request, and update the status to “In review”.

Once the pull request has been merged, remember to come back and update the status to done.

Reach out if you need help

If you run into any problems and have questions, please join our MDN Web Docs channel on Matrix.

https://matrix.to/#/#mdn:mozilla.org

 

Photo by Cristian Grecu on Unsplash

The post The 100% Markdown Expedition appeared first on Mozilla Hacks - the Web developer blog.

The Mozilla BlogThe children’s book author behind #disabledandcute on her favorite corners of the internet

Keah Brown rests her head on her hand while posing for a photo. Photo: Carissa King

Here at Mozilla, we are the first to admit the internet isn’t perfect, but we are also quick to point out that the internet is pretty darn magical. The internet opens up doors and opportunities, allows for people to connect with others, and lets everyone find where they belong — their corners of the internet. We all have an internet story worth sharing. In My Corner Of The Internet, we talk with people about the online spaces they can’t get enough of, what we should save in Pocket to read later, and what sites and forums shaped them.

This month we chat with writer Keah Brown. She created the viral #disabledandcute hashtag and just published “Sam’s Super Seats,” her debut children’s book about a girl with cerebral palsy who goes back-to-school shopping with her best friends. She talks about celebrating the joys of young people with disabilities online, her love for the band Paramore, other pop culture “-mores” she’s obsessed with and a deep dive into a TV show reboot that never was. 

What is your favorite corner of the internet?

Film and TV chats with my friends, the corner of the internet that loves Drew Barrymore (because duh!), the new “A League of Their Own” discussions corner, the Paramore fandom because they are the best band in the world, and the rom-com corner of the internet.

What is an internet deep dive that you can’t wait to jump back into? 

The deep dive into why we won’t see “Lizzie McGuire” the reboot.

What is the one tab you always regret closing? 

The YouTube video I [opened] in a new tab so I wouldn’t lose it when the video I was watching ended.

What can you not stop talking about on the internet right now? 

My new children’s book, “Sam’s Super Seats,” Paramore, Drew Barrymore, “A League of Their Own” the series, getting ready to move into my first apartment, Meg Thee Stallion, and Renaissance, Beyonce’s [new] album.

What was the first online community you engaged with? 

The “Glee” fandom on Tumblr.

What articles and videos are in your Pocket waiting to be read/watched right now? 

Abbi Jacobson on “The Daily Show,” apartment tours on the Listed YouTube channel, “10 Renter-Friendly Fixes for Your First Apartment,” Tracee Ellis Ross on “Hart to Heart,” and Meghan Markle’s podcast interview with Serena Williams.

How can parents or other caretakers of young people with disabilities use the internet to fight stigma and celebrate their joys?

By being aware that the disabled people in their lives are people first and deserve to be treated as such. Ask them for permission before posting about them online. Fight for them like you would anyone else you love and treat them like fully realized human beings 🙂 What I think is also important is that we center disabled people themselves and give them the space to share their own stories and celebrate joy while fighting stigma, too.

If you could create your own corner of the internet what would it look like? 

The really cool thing is that it looks exactly like the one I have now. I talk about all my favorite things starting with the “-mores”: Drew Barrymore, Mandy Moore, Paramore and then there is what I’m watching for film and TV, then we have house tours, book promotion (buy “Sam’s Super Seats”!) and people who love cheesecake and pizza. 


Keah Brown is a journalist, author and screenwriter. Keah is the creator of the viral hashtag #DisabledAndCute. Her work has appeared in Town & Country Magazine, Teen Vogue, Elle, Harper’s Bazaar, Marie Claire UK, and The New York Times, among other publications. Her essay collection “The Pretty One” and picture book “Sam’s Super Seats” are both out now. You can follow her on Twitter and Instagram.

An illustration reads: The Tech Talk

Talk to your kids about online safety

Get tips

The post The children’s book author behind #disabledandcute on her favorite corners of the internet appeared first on The Mozilla Blog.

This Week In RustThis Week in Rust 459

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Foundation
Newsletters
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is sql-query-builder, a library to write SQL queries in a simple and composable way.

Thanks to Belchior Oliveira for the self-suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

417 pull requests were merged in the last week

Rust Compiler Performance Triage

A relatively quiet week where regressions unfortunately outweighed improvements. What's more, many of the regressions that were found seemed somewhat mysterious requiring some deeper investigations.

Triage done by @rylev. Revision range: 0631ea5d73..09fb0bc6e

Summary:

(instructions:u)             mean     range              count
Regressions ❌ (primary)      0.7%    [0.2%, 4.5%]       85
Regressions ❌ (secondary)    1.0%    [0.3%, 5.4%]       87
Improvements ✅ (primary)    -0.7%    [-1.0%, -0.5%]     9
Improvements ✅ (secondary)  -1.4%    [-2.7%, -0.5%]     22
All ❌✅ (primary)            0.5%    [-1.0%, 4.5%]      94

2 Regressions, 3 Improvements, 2 Mixed; 3 of them in rollups
40 artifact comparisons made in total

Full report

Call for Testing

An important step for RFC implementation is for people to experiment with the implementation and give feedback, especially before stabilization. The following RFCs would benefit from user testing before moving forward:

  • No RFCs issued a call for testing this week.

If you are a feature implementer and would like your RFC to appear on the above list, add the new call-for-testing label to your RFC along with a comment providing testing instructions and/or guidance on which aspect(s) of the feature need testing.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

  • No RFCs were approved this week.
Final Comment Period

Every week, the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs
New and Updated RFCs

Upcoming Events

Rusty Events between 2022-09-07 - 2022-10-05 🦀

Virtual
Europe
North America
Oceania
South America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Jobs

Please see the latest Who's Hiring thread on r/rust

Quote of the Week

So long, and thanks for all the turbofish.

moltonel on r/rust

Thanks to Josh Triplett for the suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, cdmistman, ericseppanen, extrawurst, andrewpollack, U007D, kolharsam, joelmarcey, mariannegoldin.

Email list hosting is sponsored by The Rust Foundation

Discuss on r/rust

Dave TownsendUsing VS Code for merges in Mercurial

VS Code is now a great visual merge tool. Here is how to set it up as the merge tool and visual diff tool for Mercurial.
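
The post has the full walkthrough; as a rough sketch of what such a setup can look like (assuming VS Code 1.69 or later with its three-way merge editor, and using "vsd" and "code" as illustrative tool names; the exact configuration in the post may differ), an ~/.hgrc could contain:

[extensions]
extdiff =

[extdiff]
# "hg vsd" opens a visual diff in VS Code
cmd.vsd = code
opts.vsd = --wait --diff

[ui]
merge = code

[merge-tools]
code.executable = code
code.args = --wait --merge $other $local $base $output
code.gui = True

With a setup along these lines, hg vsd would show diffs in VS Code, and unresolved merges (for example via hg resolve) would open in VS Code's merge editor.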

The Mozilla BlogThe Tech Talk

The internet is a great place for families. It gives us new opportunities to discover the world, connect with others and just generally make our lives easier and more colorful.

But it also comes with new challenges and complications for the people raising the next generations. Mozilla wants to help families make the best online decisions, whatever that looks like, with our latest series, The Tech Talk.

Talk to your kids about online safety

Get tips


An illustration shows a silhouette of a child surrounded by emojis.

Concerned about screen time?

Here’s what experts are saying.


An illustration shows a digital pop-up box that reads: A back-to-school checklist for online safety: Set up new passwords. Check your devices' privacy settings. Protect your child's browsing information. Discuss parental controls with the whole family. Have the "tech talk."

A back-to-school checklist for online safety

This school year, make the best use of the internet while staying safe.


A child smiles while using a table computer.

Are parental controls the answer to keeping kids safe online?

There are a few things to consider before giving parental controls a go.


An illustration shows three columns containing newspaper icons along with social media icons.

5 ways to fight misinformation on your social feed

Slow your scroll with this guide that we created with the News Literacy Project and the Teens for Press Freedom.


Ten young people lean on a wall looking down at their phones.

A little less misinformation, a little more action

How do teens engage with information on social media? We asked them.


Keah Brown rests her head on her hand while posing for a photo.

The children’s book author behind #disabledandcute on her favorite corners of the internet

Keah Brown talks about celebrating the joys of young people with disabilities online.

The post The Tech Talk appeared first on The Mozilla Blog.

Firefox NightlyThese Weeks In Firefox: Issue 122

August 23rd, 2022

Highlights

  • Starting from Firefox 106, popups related to WebExtensions browserAction widgets that are moved inside the overflow panel will open in their own separate panel, instead of being opened inside the overflow panel with a fixed width (Bug 1783972)
  • Huge thanks to contributor ramya for their work to land a patch that adds a “print the current page only” option to the new printing UI. This was a much-requested feature that made 105’s release notes!
  • Closing out our Browser Toolbox project, the Multiprocess Browser Toolbox officially replaces the Browser Toolbox on all channels starting with Firefox 105 (bug).
    • Even though this work started almost three years ago, it was only enabled on Nightly and local builds for performance reasons (preference browsertoolbox.fission).
  • Nicolas added a toolbar to switch between multiprocess and parent-process modes (bug) in the Multiprocess Browser Toolbox.
    • The “multiprocess” mode works in exactly the same way as the default Multiprocess Browser Toolbox, whereas the “parent-process” mode emulates the “old” Browser Toolbox, and should be faster.
  • The default mode is set to “parent-process” (bug) (preference)
  • dharvey on the Search Team has landed some updates to Quick Actions in the AwesomeBar:
    • Quick Actions can be tested by making sure that the following prefs are set in about:config:
      • browser.urlbar.quickactions.enabled = true
      • browser.urlbar.shortcuts.quickactions = true
    • We now support Quick Action in different languages. Bug 1778184
    • We fixed a bug where it was harder to exit Screenshot mode after being invoked from a Quick Action. Bug 1781049
    • We now do whole phrase matches for Quick Actions. Bug 1781112
      • Example “How to take a screenshot” now matches the Screenshot Quick Action.
    • Created a shortcut Icon for Quick Actions. Bug 1781048, Bug 1783616
    • Added an option to disable Quick Actions. Bug 1783452

Friends of the Firefox team

Resolved bugs (excluding employees)

Script to find new contributors from bug list

Volunteers that fixed more than one bug
  • Ramya
  • Pmcghen
  • Luke.swiderski
  • Pierov
  • Msmolens
  • Harshraitth2
  • Colin.cazabet
  • Janvibajo1
  • jonas.jenwald

 

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions
WebExtensions Framework
WebExtension APIs
  • Starting from Firefox 105 the new scripting API supports dynamically registered content scripts that are persisted across sessions (Bug 1751436)
Developer Tools
Toolbox
  • Thanks to Raphaël Ferrand for adding a new CSS warning, displayed when using width/height on ruby elements (bug)
  • Thanks to Luke Swiderski for addressing an annoying UX issue in the debugger. The breakpoints panel no longer expands automatically when navigating the call stack (bug)
  • Luke also added Composition Event support to our Event Listener Breakpoints (bug), under the “Keyboard” category. They are used when you indirectly enter text with an Input method editor (IME), e.g. Chinese characters on a Latin keyboard.
  • Thanks to Colin Cazabet, the object inspector now supports previewing Headers (bug)
  • Thanks to pmcghen, the Responsive Design Mode’s device selector correctly shows the Operating System for all devices. We used to incorrectly classify some Android versions as Linux, and iPadOS was not recognized; these should all be accurate now.
  • Browser Toolbox:
    • The team fixed a few bugs which would occur when switching from one mode to another (bug, bug and bug). Please report any breakage you see when switching between the two modes!
    • Alex added support for the ctrl/cmd+alt+R shortcut in the Browser Toolbox (bug), which will now behave the same way as the “Restart (Developer)” action available on local builds. And ctrl/cmd + R will no longer blank the Browser.
  • Nicolas fixed a nasty performance regression when trying to open long minified files in the debugger (bug).
  • Bomsy fixed a bug on the Edit and Resend panel of the network monitor, to correctly update the content length of the request if needed (bug).
WebDriver BiDi
  • Sasha added support for the “sandbox” argument to the script “evaluate” and “callFunction” commands (bug). As the name suggests this allows you to execute scripts in a sandbox, and limit the side-effects of the test on the actual page.
  • Henrik fixed an issue on environments using IPv6 by default, which were unable to connect to RemoteAgent on “localhost” (bug).
ESMification status
Lint, Docs and Workflow
Password Manager
Picture-in-Picture
  • Janvi01 added new hover states to the PiP window controls!
  • Janvi01 also implemented a new PiP player control button for toggling fullscreen mode
    • Currently set behind a pref `media.videocontrols.picture-in-picture.improved-video-controls.enabled`
Performance
Search and Navigation
  • mcheang refactored the way SearchSettings.sys.mjs handles the user’s settings file that stores search engine information in Bug 1779094
  • daisuke fixed various papercut bugs in the Address Bar results menu, Search Bar, and search icons.
  • Various work is being done by Standard8 to move search and newtab code to ES modules. Bug 1779984.
  • Updated the Qwant search engine icon with a new design. Bug 1784877
  • jteow improved the phrasing on the connection error page. Bug 780413
  • Stephanie fixed Bug 1770818 so that the address bar is now focused when opening a new window with a custom URL configured.

 

Mozilla Addons BlogHello from the new developer advocate

Hello extension developers, I’m Juhis, it’s a pleasure to meet you all. In the beginning of August I joined Mozilla and the Firefox add-ons team as a developer advocate. I expect us to see each other quite a lot in the future. My mom taught me to always introduce myself to new people so here we go!

My goal is to help all of you to learn from each other, to build great add-ons and to make that journey an enjoyable experience. Also, I want to be your voice to the teams building Firefox and add-ons tooling.

My journey into the world of software

I’m originally from Finland and grew up in a rather small town in the southwest. I got excited about computers from a very young age. I vividly remember a moment from my childhood when my sister created a digital painting of two horses, but since it was too large for the screen, I had to scroll to reveal the other horse. That blew my four-year-old mind and I’ve been fascinated by the opportunities of technology ever since.

After some years working in professional software development, I realized I could have the greatest impact by building communities and helping others become developers rather than just coding myself. Ever since, I’ve been building developer communities, organizing meetups, teaching programming, and serving as a general advocate for the potential of technology.

I believe in the positive empowerment that technology can bring to individuals all around the world. Whether it’s someone building something small to solve a problem in their daily life, someone building tools for their community, or being able to build and run your own business, there are so many ways we can leverage technology for good.

Customize your own internet experience with add-ons

The idea of shaping your own internet experience has been close to my heart for a long time. It can be something relatively simple like running custom CSS through existing extensions to make a website more enjoyable to use, or maybe it’s building big extensions for thousands of other people to enjoy. I’m excited to now be in a position where I can help others to build great add-ons of their own.

To better understand what a new extension developer goes through, I built an extension following our documentation and processes. I built it for fellow Pokemon TCG players who want a more visual way to read decklists online. Pokemon TCG card viewer can be installed from addons.mozilla.org. It adds a hover state to card codes it recognizes and displays a picture of the card on hover.

The best way to find me is in the Mozilla Matrix server as @hamatti:mozilla.org in the Add-ons channel. Come say hi!

The post Hello from the new developer advocate appeared first on Mozilla Add-ons Community Blog.

Mozilla ThunderbirdThunderbird Tip: How To Manually Sort All Email And Account Folders

In our last blog post, you learned an easy way to change the order your accounts are displayed in Thunderbird. Today, we have a short video guide that takes your organizing one step further. You’ll learn how to manually sort all of the Thunderbird folders you have. That includes any newsgroup and RSS feed subscriptions too!


Have You Subscribed To Our YouTube Channel?

We’re currently building the next exciting era of Thunderbird, and developing a Thunderbird experience for mobile. We’re also putting out more content and communication across various platforms to keep you informed. And, of course, to show you some great usage tips along the way.

To accomplish that, we’ve launched our YouTube channel to help you get the most out of Thunderbird. You can subscribe here. Help us reach 1000 subscribers by the end of September!


Video Guide: Manually Sort Your Thunderbird Folders

The short video below shows you everything you need to know:

We plan to produce many more tips just like this on our YouTube channel. We’ll also share them right here on the Thunderbird blog, so grab our RSS feed! (Need a guide for using RSS with Thunderbird? Here you go!)

Do you have a good Thunderbird tip we should turn into a video? Let us know in the comments, and thank you for using Thunderbird!

The post Thunderbird Tip: How To Manually Sort All Email And Account Folders appeared first on The Thunderbird Blog.

Wladimir PalantWhen extension pages are web-accessible

In the article discussing the attack surface of extension pages I said:

Websites, malicious or not, cannot usually access extension pages directly however.

And then I proceeded talking about extension pages as if this security mechanism were always in place. But that isn’t the case of course, and extensions will quite often disable it at least partially.

The impact of extension pages being exposed to the web is severe and warrants a thorough discussion in a separate article. So here it comes.

Note: This article is part of a series on the basics of browser extension security. It’s meant to provide you with some understanding of the field and serve as a reference for my more specific articles. You can browse the extension-security-basics category to see other published articles in this series.

Why display extension pages within web pages?

Very often an extension will want to display some of its user interface on regular web pages. Our example extension took the approach of injecting its content directly into the page:

let div = document.createElement("div");
div.innerHTML = result.message + " <button>Explain</button>";
document.body.appendChild(div);

Whether this approach works depends very much on the website. Even for non-malicious websites, one never knows what CSS styles are used by the website and how they will impact this code. So extension developers will try to find a context of their own for the extension’s user interface, one where it won’t be affected by whatever unexpected stuff the website might be doing.

This kind of context is provided by the <iframe> element, whatever we load there will no longer be affected by the parent page.

Except: A frame displaying about:blank may be easy to create, but its contents are accessible not merely to your content script but to the web page as well. So the web page may decide to do something with them, whether unintentionally (because the frame is mistaken for one of its own) or with a malicious purpose.
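
For illustration, here is a minimal sketch (not from the original article or example extension) of what a page script could do with such a frame. Because an about:blank frame inherits the page’s origin, the page can simply reach into it:

for (const frame of document.querySelectorAll("iframe")) {
  if (frame.src === "" || frame.src === "about:blank") {
    // contentDocument is accessible because the frame is same-origin with the page
    const doc = frame.contentDocument;
    console.log("Frame contents:", doc.body.innerHTML);
    doc.body.innerHTML = "<p>Replaced by the web page</p>";
  }
}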

The obvious solution: load an extension page in that frame. The frame will not be considered same-origin by the browser, so the browser won’t grant the website access to it. It’s the secure solution. Well, mostly at least…

Loading an extension page in a frame

I’ll discuss all the changes to the example extension one by one. But you can download the ZIP file with the resulting extension source code here.

So let’s say we add a message.html page to the extension, one that will display the message outlined above. How will the content script load it on a page?

let frame = document.createElement("iframe");
frame.src = chrome.runtime.getURL("message.html");
frame.style.borderWidth = "0";
frame.style.width = "100%";
frame.style.height = "100px";
document.body.appendChild(frame);

When we add this code to our script.js content script and open example.com we get the following:

Screenshot of the example domain. Below the usual content the “sad page” symbol is displayed and the text “This page has been blocked by Chromium.”

That’s the security mechanism mentioned in the previous article: web pages are usually not allowed to interact with extension pages directly. The same restriction applies to our content script, so loading the extension page fails.

Note: This only applies to Chromium-based browsers. In Mozilla Firefox the code above will succeed. Content scripts have the same access rights as extension pages here, meaning that they can load extension pages even when the web page they attach to cannot.

The solution? Make the page web-accessible. It means adding the following line to the extension’s manifest.json file:

{
  
  "web_accessible_resources": ["message.html"],
  
}

The good news: now the content script is allowed to load message.html. The bad news: any web page is also allowed to load message.html. This page is no longer protected against malicious web pages messing with it directly.

Do extensions even do this?

Obviously, extension pages not being web-accessible is a useful security mechanism. But, as we’ve seen before, disabling security mechanisms isn’t uncommon. So, how many extensions declare their pages as web-accessible?

It’s hard to tell for sure because web_accessible_resources can contain wildcard matches and it isn’t obvious whether these apply to any HTML pages. However, looking for explicit allowing of .html resources in my extension survey, I can see that at least 8% of the extensions do this.

Here again, more popular extensions are more likely to relax security mechanisms. When looking at extensions with at least 10,000 users, the share of those with web-accessible extension pages goes up to almost 17%. And for extensions with at least 100,000 users it’s even 25% of them.

Some extensions will go as far as declaring all of extension resources web-accessible. These are a minority however, with their share staying below 2% even for the popular extensions.

A vulnerable message page

Of course, a web-accessible extension page isn’t necessarily a vulnerable extension page. At this stage it’s merely more exposed. It typically becomes vulnerable when extension developers give in to their natural urge to make things more generic.

For example, we could just move the code displaying the message from the content script into the extension page. But why do that? We could make that a generic message page and keep all the logic in the content script.

And since it is a generic message page displaying generic messages, the content script needs to tell it what to do. For example, it could use URL parameters for that:

chrome.storage.local.get("message", result =>
{
  frame.src = chrome.runtime.getURL("message.html") +
    "?message=" + encodeURIComponent(result.message) +
    "&url=https://example.net/explanation";
});

The extension page now gets two parameters: the message to be displayed and the address to be opened if the button is clicked.

And the script doing the processing in the extension page would then look like this:

$(() =>
{
  let params = new URLSearchParams(location.search);
  $(document.body).append(params.get("message") + " <button>Explain</button>");
  $("body > button").click(() =>
  {
    chrome.tabs.create({ url: params.get("url") });
  });
});

This has the added benefit that the background page is no longer necessary. It can be removed because the message page has all the necessary privileges; it doesn’t need to delegate the task of opening a new tab.

Yes, this is using jQuery again, with its affinity for running JavaScript code as an unexpected side-effect. And it appears to work correctly. We get a message similar to the one produced by the original extension. Yet this time page CSS no longer applies to it.

Screenshot of the example domain. Below the usual content a message says “Hi there!” along with a button labeled “Explain.”

Achieving Remote Code Execution

People familiar with Cross-site Scripting (XSS) vulnerabilities probably noticed already that the way the message parameter is handled is vulnerable. Since the message.html page is now web-accessible, the web page can take the frame created by the content script and rewrite the parameters:

setTimeout(() =>
{
  let frame = document.querySelector("iframe:last-child");
  let src = frame.src;

  // Remove existing query parameters
  src = src.replace(/\?.*/, "");

  // Add malicious query parameters
  src += "?message=" + encodeURIComponent("<script>alert('XSS')</script>");

  // Load into frame
  frame.src = src;
}, 1000);

Yes, the extension page will attempt to run the script passed in the parameter, which is stopped by Content Security Policy here as well:

Screenshot of an issue displayed in Developer Tools with the text “Content Security Policy of your site blocks the use of 'eval' in JavaScript”

So in order for this to be a proper Remote Code Execution vulnerability, our example extension also needs to relax its Content Security Policy in the manifest.json file:

{
  
  "content_security_policy": "script-src 'self' 'unsafe-eval'; object-src 'self';",
  
}

As I explained in the previous article, CSP being weakened in this way is remarkably common. Once this change is made, the attack results in the expected message indicating code execution in the extension page context:

Message showing on the example domain with the text: “chrome-extension://… says: XSS”

Note: Quite a few stars have to align for this attack to work. Chrome will generally ignore 'unsafe-inline' directive for scripts, so inline scripts will never execute. Here it only works because jQuery versions before 3.4.0 will call eval() on inline scripts. And eval() calls can be allowed with the 'unsafe-eval' directive.

Triggering the attack at will

The approach outlined here relies on the extension injecting its frame into the page. But our example extension only does it on example.com. Does it mean that other websites cannot exploit it?

Usually they still can, at least in Chromium-based browsers. That’s because the extension page address is always the same:

chrome-extension://<extension-id>/message.html

For public extensions the extension ID is known. For example, if you switch on Developer Mode in Chrome you will see it in the list of installed extensions:

Screenshot of the Adobe Acrobat extension listing. Below the extension description, there is a line labeled ID followed by a combination of 32 letters.

So any website can create this frame and exploit the vulnerability instead of waiting for the extension to create it:

let frame = document.createElement("iframe");
frame.src = "chrome-extension://abcdefghijklmnopabcdefghijklmnop/message.html?message="
  + encodeURIComponent("<script>alert('XSS')</script>");
document.body.appendChild(frame);

This approach won’t work in Firefox because the page address is built using a different, user-specific extension ID. In Manifest V3 Chrome also introduced a use_dynamic_url flag to the web_accessible_resources entry which has a similar effect. At the moment barely any extensions use this flag however.
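
For reference, in Manifest V3 the web_accessible_resources entries are objects rather than plain paths, and that is where the flag goes. Here is a sketch of what opting in might look like (only the relevant manifest fragment is shown; the matches pattern below is an assumption, not taken from the example extension):

{
  "web_accessible_resources": [
    {
      "resources": ["message.html"],
      "matches": ["<all_urls>"],
      "use_dynamic_url": true
    }
  ]
}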

What if code execution is impossible?

But what if the extension does not relax Content Security Policy? Or if it doesn’t use jQuery? Is this extension no longer vulnerable then?

The extension page remains vulnerable to HTML injection of course. This means that a website could e.g. open this extension page as a new tab and display its own content there. For the user it will look like a legitimate extension page, so they might be inclined to trust the content and maybe even enter sensitive data into an HTML form provided there.

Also, if a vulnerable extension page contains sensitive data, this data could be extracted by injecting CSS code. I previously outlined how such an attack would work against Google web pages, but it works against a browser extension as well of course.

Finally, there is also the url parameter here. Even without code execution, we can make this extension open whichever page we like:

let frame = document.createElement("iframe");
frame.src = "chrome-extension://abcdefghijklmnopabcdefghijklmnop/message.html?message="
  + "&url=data:,Hi!";
document.body.appendChild(frame);

If the user now clicks that “Explain” button, the address data:,Hi! loads in a new tab, even though websites aren’t normally allowed to open it for security reasons. So this vulnerability allows websites to hijack the extension’s tabs.create() API.

Wait, but the user still needs to click that button, right? Isn’t that quite a bit of a setback?

Actually, tricking the user into doing that is easy with clickjacking. The approach: we make that frame invisible. And we also clip it to make sure only a piece of the button is visible. Then we place the frame under the mouse cursor whenever the user moves it, so when the user clicks anywhere this button receives the click.

let frame = document.createElement("iframe");
frame.style.position = "absolute";
frame.style.opacity = "0.0001";
frame.style.clip = "rect(10px 60px 30px 40px)";
frame.src = "chrome-extension://abcdefghijklmnopabcdefghijklmnop/message.html?message="
  + "&url=data:,Hi!";
document.body.appendChild(frame);

window.addEventListener("mousemove", event =>
{
  frame.style.left = (event.clientX - 50) + "px";
  frame.style.top = (event.clientY - 20) + "px";
});

The user doesn’t see anything unusual here. Yet when they click anywhere on the page the address data:,Hi! loads in a new tab.

Passing data via window.postMessage()

When a content script passes data to an extension page, it isn’t always using URL parameters. Another common approach is window.postMessage(). In principle, this method gives developers better control over who can do what. The extension page and content script can inspect the event.origin and event.source properties to ensure that only trusted parties can communicate here.

In reality, however, this method is meant for communication between web pages, not between extension parts. So it doesn’t allow distinguishing between a web page and the content script running in that web page, for example. Securing this communication channel is inherently difficult, and extensions frequently fail to do it correctly.

Worse yet, its convenience and capability for bi-directional communication invite exchanging way more data. Web pages can listen in on this data, and they could attempt to send messages of their own. In the worst-case scenario, this exposes functionality that allows compromising all of the extension’s capabilities.
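
To illustrate why this is hard to get right, here is a sketch with a hypothetical message format (not taken from the example extension). A listener in the extension frame cannot tell the content script apart from the website, because both run in the same window and report the same origin:

// In message.html (the extension page inside the frame): handle requests
// posted by the content script in the parent document.
window.addEventListener("message", event => {
  // event.origin is the parent page's origin whether the message was posted
  // by the content script or by the website's own scripts, so it cannot be
  // used to tell the two apart.
  if (event.data && event.data.type === "open-url") {
    chrome.tabs.create({ url: event.data.url });
  }
});

// In the web page: post the exact same message into the frame and trigger
// the privileged action.
document.querySelector("iframe").contentWindow.postMessage(
  { type: "open-url", url: "https://attacker.example/" }, "*");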

Recommendations for developers

Obviously, all the recommendations from the previous article apply here as well. These help prevent code execution vulnerabilities in extension pages or at least limit vulnerability scope.

In addition, making extension pages web-accessible should be considered carefully. It may sound obvious, but please don’t mark pages as web-accessible unless you absolutely have to. And apply the use_dynamic_url flag if you can.

Also, web-accessible pages require additional security scrutiny. Any parameters passed in by methods accessible to web pages should be considered untrusted. If possible, don’t even use communication methods that are accessible to web pages.

Yes, the runtime.sendMessage() API requires communicating via the background page, which makes it far less convenient. Yes, that safe replacement for window.postMessage() for extensions to use isn’t getting any traction. Still, that way you won’t accidentally make mistakes that will compromise the security of your extension.
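
As a rough sketch of that safer pattern (the “openExplanation” message name is made up for illustration), the content script asks the extension’s privileged side to act on its behalf:

// Content script: web pages cannot send runtime messages to this extension
// unless it explicitly opts in via externally_connectable, so the listener
// below only ever hears from the extension's own code.
chrome.runtime.sendMessage({ action: "openExplanation" });

// Background script: performs the privileged operation for the content script.
chrome.runtime.onMessage.addListener(message => {
  if (message.action === "openExplanation") {
    chrome.tabs.create({ url: "https://example.net/explanation" });
  }
});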

Support.Mozilla.OrgWhat’s up with SUMO – August 2022

Hi everybody,

Summer is not a thing in my home country, Indonesia. But I learn that taking some time off after having done a lot of work in the first half of the year is useful for my well-being. So I hope you had a chance to take a break this summer.

We’re already past the halfway point of Q3, so let’s see what SUMO has been up to, with renewed excitement after this holiday season.

Welcome note and shout-outs

  • Thanks to Felipe for doing a short experiment on social support mentoring. This was helpful to understand what other contributors might need when they start contributing.
  • Thanks to top contributors for Firefox for iOS in the forum. We are in need of more iOS contributors in the forum, so your contribution is highly appreciated.
  • I’d like to give special thanks to a few contributors who have started contributing more to the KB recently: Denys, Kaie, Lisah933, jmaustin, and many others.

If you know anyone that we should feature here, please contact Kiki and we’ll make sure to add them in our next edition.

Community news

  • We are now sharing social and mobile support stats regularly. This is an effort to make sure that contributors are kept updated and exposed to both contribution areas. We know it’s not always easy to discover opportunities to contribute to social or mobile support, since we use different tools for these contribution areas. Check out the latest update from last week.
  • The long-awaited fix for the automatic function that shortens KB article links has been released to production. Read more about this change in this contributor thread, including how you can help remove the manual links that we added in the past when the functionality was broken.
  • Check out our post about Back to School marketing campaign if you haven’t.

Catch up

  • Consider subscribing to Firefox Daily Digest if you haven’t to get daily updates about Firefox from across different platforms.
  • Watch the monthly community call if you haven’t. Learn more about what’s new in July! Reminder: Don’t hesitate to join the call in person if you can. We try our best to provide a safe space for everyone to contribute. You’re more than welcome to lurk in the call if you don’t feel comfortable turning on your video or speaking up. If you feel shy to ask questions during the meeting, feel free to add your questions on the contributor forum in advance, or put them in our Matrix channel, so we can answer them during the meeting.
  • If you’re an NDA’ed contributor, you can watch the recording of the Customer Experience weekly scrum meeting from AirMozilla to catch up with the latest product updates.
  • Check out the following release notes from Kitsune this month:

Community stats

KB

KB pageviews (*)

* KB pageviews number is a total of KB pageviews for /en-US/ only

Month Page views Vs previous month
Jul 2022 7,325,189 -5.94%

Top 5 KB contributors in the last 90 days: 

KB Localization

Top 10 locales based on total page views

Locale Jul 2022 pageviews (*)
de 8.31%
zh-CN 7.01%
fr 5.94%
es 5.91%
pt-BR 4.75%
ru 4.14%
ja 3.93%
It 2.13%
zh-TW 1.99%
pl 1.94%
* Locale pageviews is the overall number of pageviews from the given locale (KB and other pages)

** Localization progress is the percentage of localized articles out of all KB articles per locale

Top 5 localization contributors in the last 90 days: 

Forum Support

Forum stats

-TBD-

Top 5 forum contributors in the last 90 days: 

Social Support

Month Total incoming conv Conv interacted Resolution rate
Jul 2022 237 251 75.11%

Top 5 Social Support contributors in the past 2 months: 

  1. Bithiah K
  2. Christophe Villeneuve
  3. Felipe Koji
  4. Kaio Duarte
  5. Matt Cianfarani

Play Store Support

Channel (Jul 2022) Total priority review Total priority review replied Total reviews replied
Firefox for Android 2155 508 575
Firefox Focus for Android 45 18 92
Firefox Klar Android 3 0 0

Top 5 Play Store contributors in the past 2 months: 

  • Paul Wright
  • Selim Şumlu
  • Felipe Koji
  • Tim Maks
  • Matt Cianfarani

Product updates

To catch up on product release updates, please watch the recording of the Customer Experience scrum meeting from AirMozilla. You can also subscribe to the AirMozilla folder by clicking on the Subscribe button at the top right corner of the page to get notifications each time we add a new recording.

Useful links:

Jody HeavenerA tip on using peer dependencies with TypeScript

I encountered this issue recently and felt pretty silly when I realized the simple mistake I was making, so allow me to share, in hopes that it saves someone else time...

When you're developing an NPM package that makes use of a dependency also used by the main application, you might consider listing it as a peer dependency.

Take this example, where we move React from "dependencies" to "peerDependencies":

-  "dependencies": {
-    "react": "^16.8.4 || ^17.0.0"
-  },
   "devDependencies": {
     "@docusaurus/theme-classic": "^2.0.1",
     "@types/react": "^18.0.17"
+  },
+  "peerDependencies": {
+    "react": "^16.8.4 || ^17.0.0"
   }

React is now a peer dependency, which means the main application needs to list it in its own dependencies. Additionally, we're able to keep developing this package with no issues from TypeScript (can you see why?).

Now notice that other package, @docusaurus/theme-classic. I wanted to make this one a peer dependency as well, so I did just that:

   "devDependencies": {
-    "@docusaurus/theme-classic": "^2.0.1",
     "@types/react": "^18.0.17"
   },
   "peerDependencies": {
+    "@docusaurus/theme-classic": "^2.0.1",
     "react": "^16.8.4 || ^17.0.0"
   }
 }

But after I made this change, TypeScript wasn't happy. 😔 When I tried importing from that module I got the typical "Cannot find module or its corresponding type declarations" error. I spent quite a while scratching my head, trying to understand peer dependencies. I knew package manager CLIs don't automatically install peer dependencies, but I couldn't figure out why other packages, such as React, were working while this one wasn't.

And this is where I felt silly after figuring it out: the @docusaurus/theme-classic package was supplying its own type declarations, so moving it over to peer dependencies was eliminating its types altogether.

To address this, the simplest solution I've found is to duplicate that dependency over to "devDependencies". Doing this makes sure that it is installed locally while you develop the package, while also maintaining its status as a peer dependency when the main application consumes it.

   "devDependencies": {
+    "@docusaurus/theme-classic": "^2.0.1",
     "@types/react": "^18.0.17"
   },
   "peerDependencies": {

I've also tried playing with the install-peers package, that claims to install all your peer dependencies as dev dependencies, but wasn't having much success with it.

If you have your own solution for this problem, I'd love to hear it!

Mozilla Open Policy & Advocacy BlogMozilla Meetups – The Long Road to Federal Privacy Protections: Are We There Yet?

Register Below!
Join us for a discussion about the need for comprehensive privacy reform and whether the political landscape is ready to make it happen.

The panel session will be immediately followed by a happy hour reception with drinks and light fare. 

Date and time: Wednesday, September 21st – panel starts @ 4:00PM promptly (doors @ 3:45pm)
Location: Wunder Garten, 1101 First St. NE, Washington, DC 20002

The post Mozilla Meetups – The Long Road to Federal Privacy Protections: Are We There Yet? appeared first on Open Policy & Advocacy.

Spidermonkey Development BlogSpiderMonkey Newsletter (Firefox 104-105)

SpiderMonkey is the JavaScript engine used in Mozilla Firefox. This newsletter gives an overview of the JavaScript and WebAssembly work we’ve done as part of the Firefox 104 and 105 Nightly release cycles.

👷🏽‍♀️ New features

  • We’ve implemented the ShadowRealms proposal (disabled by default).
  • We’ve shipped the array findLast and findLastIndex functions (Firefox 104).
  • We’ve removed range restrictions from various Intl objects to match spec changes.

Features that are in progress:

  • We’ve started to implement the decorator proposal.
  • We implemented more instructions for the Wasm GC proposal (disabled by default).
  • We implemented more instructions and optimizations for the Wasm function references proposal (disabled by default).
  • We removed support for Wasm runtime types because this was removed from the spec.

⚙️ Modernizing JS modules

We’re working on improving our implementation of modules. This includes supporting modules in Workers, adding support for Import Maps, and ESMification (replacing the JSM module system for Firefox internal JS code with standard ECMAScript modules).

  • See the AreWeESMifiedYet website for the status of ESMification.
  • We’ve ported the module implementation from self-hosted JS to C++.
  • We’ve made a lot of changes to the rewritten code to match the latest version of the spec better.
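
Related to the ESMification work mentioned above, this is roughly what the change looks like for a Firefox internal module (a simplified sketch; the module name here is hypothetical):

// Before: a JSM module loaded with ChromeUtils.import
const { ExampleModule } = ChromeUtils.import(
  "resource://gre/modules/ExampleModule.jsm"
);

// After: a standard ECMAScript module loaded with ChromeUtils.importESModule
const { ExampleModule } = ChromeUtils.importESModule(
  "resource://gre/modules/ExampleModule.sys.mjs"
);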

💾 Robust Caching

We’re working on better (in-memory) caching of JS scripts based on the new Stencil format. This will let us integrate better with other resource caches used in Gecko, hit the cache in more cases, and will open the door to potentially cache JIT-related hints.

The team is currently working on removing the dependency on JSContext for off-thread parsing. This will make it easier to integrate with browser background threads and will further simplify the JS engine.

  • We’ve introduced ErrorContext for reporting errors without using JSContext.
  • We’re now using ErrorContext to report out-of-memory exceptions in the frontend.
  • We’re now using ErrorContext to report allocation-overflow exceptions in the frontend.
  • We’ve changed the over-recursion checks in the frontend to not depend on JSContext.

🚀 Performance

  • We optimized and reduced the size of Wasm metadata.
  • We optimized StringBuffer by reducing the number of memory (re)allocations.
  • The performance team added SIMD optimizations to speed up some string and array builtins.
  • We optimized iterators used in self-hosted code to avoid some extra allocations.
  • We optimized the object allocation code more. This made the JSON parsing benchmark more than 30% faster.
  • We fixed some places to avoid unnecessary GC tenuring.
  • We implemented a simpler heuristic for the GC heap limit and resizing (disabled by default).
  • We fixed a performance cliff for certain objects with sparse elements.
  • We added some optimizations to inline the string startsWith and endsWith functions in the JIT in certain cases.
  • We optimized the code generated for string substring operations.
  • We changed our optimizing compiler to use frame pointer relative addressing (instead of stack pointer relative) for JS and Wasm code on x86/x64.
  • We added a browser pref (disabled by default) for disabling Spectre JIT mitigations in isolated Fission content processes.

📚 Miscellaneous

  • We fixed some issues with instant evaluation in the web console, to avoid unwanted side-effects from higher order self-hosted functions.
  • The profiler team has added labels to more builtin functions.
  • We fixed a bug that could result in missing stack frames in the profiler.
  • We simplified our telemetry code to be less error-prone and easier to work with.
  • We improved Math.pow accuracy for large exponents.
  • We removed the unmaintained TraceLogger integration.
  • We updated our copy of irregexp to the latest upstream code.
  • We added more linting and code formatting for self-hosted code.
  • We changed non-Wasm SharedArrayBuffer to use calloc instead of mapped memory. This uses less memory and avoids out-of-memory exceptions in some cases.

Hacks.Mozilla.OrgMerging two GitHub repositories without losing commit history

Merging two GitHub repositories without losing history

We are in the process of merging smaller example code repositories into larger parent repositories on the MDN Web Docs project. While we thought that copying the files from one repository into the new one would lose commit history, we felt that this might be an OK strategy. After all, we are not deleting the old repository but archiving it.

After having moved a few of these, we did receive an issue from a community member stating that it is not ideal to lose history while moving these repositories and that there could be a relatively simple way to avoid this. I experimented with a couple of different options and finally settled on a strategy based on the one shared by Eric Lee on his blog.

tl;dr The approach is to use basic git commands to apply all of the history of our old repo onto a new repo without needing special tooling.

Getting started

For the experiment, I used the sw-test repository that is meant to be merged into the dom-examples repository.

This is how Eric describes the first steps:

# Assume the current directory is where we want the new repository to be created
# Create the new repository

git init

# Before we do a merge, we need to have an initial commit, so we’ll make a dummy commit

dir > deleteme.txt
git add .
git commit -m "Initial dummy commit"

# Add a remote for and fetch the old repo
git remote add -f old_a <OldA repo URL>

# Merge the files from old_a/master into new/master
git merge old_a/master

I could skip everything up to the git remote ... step as my target repository already had some history, so I started as follows:

git clone https://github.com/mdn/dom-examples.git
cd dom-examples

Running git log on this repository, I see the following commit history:

commit cdfd2aeb93cb4bd8456345881997fcec1057efbb (HEAD -> master, upstream/master)
Merge: 1c7ff6e dfe991b
Author:
Date:   Fri Aug 5 10:21:27 2022 +0200

    Merge pull request #143 from mdn/sideshowbarker/webgl-sample6-UNPACK_FLIP_Y_WEBGL

    “Using textures in WebGL”: Fix orientation of Firefox logo

commit dfe991b5d1b34a492ccd524131982e140cf1e555
Author:
Date:   Fri Aug 5 17:08:50 2022 +0900

    “Using textures in WebGL”: Fix orientation of Firefox logo

    Fixes <https://github.com/mdn/content/issues/10132>

commit 1c7ff6eec8bb0fff5630a66a32d1b9b6b9d5a6e5
Merge: be41273 5618100
Author:
Date:   Fri Aug 5 09:01:56 2022 +0200

    Merge pull request #142 from mdn/sideshowbarker/webgl-demo-add-playsInline-drop-autoplay

    WebGL sample8: Drop “autoplay”; add “playsInline”

commit 56181007b7a33907097d767dfe837bb5573dcd38
Author:
Date:   Fri Aug 5 13:41:45 2022 +0900

With the current setup, I could continue from the git remote command, but I wondered if the current directory contained files or folders that would conflict with those in the service worker repository. I searched around some more to see if anyone else had run into this same situation but did not find an answer. Then it hit me! I need to prepare the service worker repo to be moved.

What do I mean by that? I need to create a new directory in the root of the sw-test repo called service-worker/sw-test and move all relevant files into this new subdirectory. This will allow me to safely merge it into dom-examples as everything is contained in a subfolder already.

To get started, I need to clone the repo we want to merge into dom-examples.

git clone https://github.com/mdn/sw-test.git
cd sw-test

Ok, now we can start preparing the repo. The first step is to create our new subdirectory.

mkdir service-worker
mkdir service-worker/sw-test

With this in place, I simply need to move everything in the root directory to the subdirectory. To do this, we will make use of the move (mv) command:

NOTE: Do not yet run any of the commands below at this stage.


# enable extendedglob for ZSH
set -o extendedglob
mv ^sw-test(D) service-worker/sw-test

The above command is a little more complex than you might think. It uses a negation syntax. The next section explains why we need it and how to enable it.

How to exclude subdirectories when using mv

While the end goal seemed simple, I am pretty sure I grew a small animal’s worth of grey hair trying to figure out how to make that last move command work. I read many StackOverflow threads, blog posts, and manual pages for the different commands with varying amounts of success. However, none of the initial set of options quite met my needs. I finally stumbled upon two StackOverflow threads that brought me to the answer.

To spare you the trouble, here is what I had to do.

First, a note. I am on a Mac using ZSH (since macOS Catalina, this is now the default shell). Depending on your shell, the instructions below may differ.

For new versions of ZSH, you use the set -o and set +o commands to enable and disable settings. To enable extendedglob, I used the following command:


# Yes, this _enables_ it
set -o extendedglob

On older versions of ZSH, you use the setopt and unsetopt commands.

setopt extendedglob

With bash, you can achieve the same using the following command:

shopt -s extglob

Why do you even have to do this, you may ask? Without this, you will not be able to use the negation operator I use in the above move command, which is the crux of the whole thing. If you do the following, for example:

mkdir service-worker
mv * service-worker/sw-test

It will “work,” but you will see an error message like this:

mv: rename service-worker to service-worker/sw-test/service-worker: Invalid argument

We want to tell the operating system to move everything into our new subfolder except the subfolder itself. We, therefore, need this negation syntax. It is not enabled by default because it could cause problems if file names contain some of the extendedglob patterns, such as ^. So we need to enable it explicitly.

NOTE: You might also want to disable it after completing your move operation.

Now that we know how and why we want extendedglob enabled, we move on to using our new powers.

NOTE: Do not yet run any of the commands below at this stage.

mv ^sw-test(D) service-worker/sw-test

The above means:

  • Move all the files in the current directory into service-worker/sw-test.
  • Do not try to move the service-worker directory itself.
  • The (D) option tells the move command to also move all hidden files, such as .gitignore, and hidden folders, such as .git.

NOTE: I found that if I typed mv ^sw-test and pressed tab, my terminal would expand the command to mv CODE_OF_CONDUCT.md LICENSE README.md app.js gallery image-list.js index.html service-worker star-wars-logo.jpg style.css sw.js. If I typed mv ^sw-test(D) and pressed tab, it would expand to mv .git .prettierrc CODE_OF_CONDUCT.md LICENSE README.md app.js gallery image-list.js index.html service-worker star-wars-logo.jpg style.css sw.js. This is interesting because it clearly demonstrates what happens under the hood. This allows you to see the effect of using (D) clearly. I am not sure whether this is just a native ZSH thing or one of my terminal plugins, such as Fig. Your mileage may vary.

Handling hidden files and creating a pull request

While it is nice to be able to move all of the hidden files and folders like this, it causes a problem. Because the .git folder is transferred into our new subfolder, our root directory is no longer seen as a Git repository. This is a problem.

Therefore, I will not run the above command with (D) but instead move the hidden files as a separate step. I will run the following command instead:

mv ^(sw-test|service-worker) service-worker/sw-test

At this stage, if you run ls it will look like it moved everything. That is not the case because the ls command does not list hidden files. To do that, you need to pass the -A flag as shown below:

ls -A

You should now see something like the following:

❯ ls -A
.git           .prettierrc    service-worker

Looking at the above output, I realized that I should not need to move the .git folder. All I needed to do now was to run the following command:

mv .prettierrc service-worker

After running the above command, ls -A will now output the following:

❯ ls -A
.git simple-service-worker

Time to do a little celebration dance 😁

We can move on now that we have successfully moved everything into our new subdirectory. However, while doing this, I realized I forgot to create a feature branch for the work.

Not a problem. I just run the command, git switch -C prepare-repo-for-move. Running git status at this point should output something like this:

❯ git status
On branch prepare-repo-for-move
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	deleted:    .prettierrc
	deleted:    CODE_OF_CONDUCT.md
	deleted:    LICENSE
	deleted:    README.md
	deleted:    app.js
	deleted:    gallery/bountyHunters.jpg
	deleted:    gallery/myLittleVader.jpg
	deleted:    gallery/snowTroopers.jpg
	deleted:    image-list.js
	deleted:    index.html
	deleted:    star-wars-logo.jpg
	deleted:    style.css
	deleted:    sw.js

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	service-worker/

no changes added to commit (use "git add" and/or "git commit -a")

Great! Let’s add our changes and commit them.

git add .
git commit -m 'Moved all source files into new subdirectory'

Now we want to push our changes and open a pull request.

Woop! Let’s push:

git push origin prepare-repo-for-move

Head over to your repository on GitHub. You should see a banner like “prepare-repo-for-move had recent pushes less than a minute ago” and a “Compare & pull request” button.

Click the button and follow the steps to open the pull request. Once the pull request is green and ready to merge, go ahead and merge!

NOTE: Depending on your workflow, this is the point to ask a team member to review your proposed changes before merging. It is also a good idea to have a look over the changes in the “Files changed” tab to ensure nothing is part of the pull request you did not intend. If any conflicts prevent your pull request from being merged, GitHub will warn you about these, and you will need to resolve them. This can be done directly on GitHub.com or locally and pushed to GitHub as a separate commit.

When you head back to the code view on GitHub, you should see our new subdirectory and the .gitignore file.

With that, our repository is ready to move.

Merging our repositories

Back in the terminal, you want to switch back to the main branch:

git switch main

You can now safely delete the feature branch and pull down the changes from your remote.

git branch -D prepare-repo-for-move
git pull origin main

Running ls -A after pulling the latest should now show the following:

❯ ls -A
.git           README.md      service-worker

Also, running git log in the root outputs the following:

commit 8fdfe7379130b8d6ea13ea8bf14a0bb45ad725d0 (HEAD -> gh-pages, origin/gh-pages, origin/HEAD)
Author: Schalk Neethling
Date:   Thu Aug 11 22:56:48 2022 +0200

    Create README.md

commit 254a95749c4cc3d7d2c7ec8a5902bea225870176
Merge: f5c319b bc2cdd9
Author: Schalk Neethling
Date:   Thu Aug 11 22:55:26 2022 +0200

    Merge pull request #45 from mdn/prepare-repo-for-move

    chore: prepare repo for move to dom-examples

commit bc2cdd939f568380ce03d56f50f16f2dc98d750c (origin/prepare-repo-for-move)
Author: Schalk Neethling
Date:   Thu Aug 11 22:53:13 2022 +0200

    chore: prepare repo for move to dom-examples

    Prepping the repository for the move to dom-examples

commit f5c319be3b8d4f14a1505173910877ca3bb429e5
Merge: d587747 2ed0eff
Author: Ruth John
Date:   Fri Mar 18 12:24:09 2022 +0000

    Merge pull request #43 from SimonSiefke/add-navigation-preload

Here are the commands left over from where we diverted earlier on.

# Add a remote for and fetch the old repo
git remote add -f old_a <OldA repo URL>

# Merge the files from old_a/master into new/master
git merge old_a/master

Alrighty, let’s wrap this up. First, we need to move into the root of the project to which we want to move our project. For our purpose here, this is the dom-examples directory. Once in the root of the directory, run the following:

git remote add -f swtest https://github.com/mdn/sw-test.git

NOTE: The -f tells Git to fetch the remote branches. The swtest is a name you give to the remote, so this could really be anything.

After running the command, I got the following output:

❯ git remote add -f swtest https://github.com/mdn/sw-test.git
Updating swtest
remote: Enumerating objects: 500, done.
remote: Counting objects: 100% (75/75), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 500 (delta 35), reused 45 (delta 15), pack-reused 425
Receiving objects: 100% (500/500), 759.76 KiB | 981.00 KiB/s, done.
Resolving deltas: 100% (269/269), done.
From <https://github.com/mdn/sw-test>
 * [new branch]      gh-pages        -> swtest/gh-pages
 * [new branch]      master          -> swtest/master
 * [new branch]      move-prettierrc -> swtest/move-prettierrc
 * [new branch]      rename-sw-test  -> swtest/rename-sw-test

NOTE: While we deleted the branch locally, this is not automatically synced with the remote, so this is why you will still see a reference to the rename-sw-test branch. If you wanted to delete it on the remote, you would run the following from the root of that repository: git push origin :rename-sw-test (if you have configured your repository “to automatically delete head branches”, this will be automatically deleted for you)

Only a few commands left.

NOTE: Do not yet run any of the commands below at this stage.

git merge swtest/gh-pages

Whoops! When I ran the above, I got the following error:

❯ git merge swtest/gh-pages
fatal: refusing to merge unrelated histories

But this is pretty much exactly what I do want, right? Refusing to merge unrelated histories is the default behavior of the merge command, but you can pass a flag to allow it.

git merge swtest/gh-pages --allow-unrelated-histories

NOTE: Why gh-pages? More often than not, the one you will merge here will be main but for this particular repository, the default branch was named gh-pages. It used to be that when using GitHub pages, you would need a branch called gh-pages that will then be automatically deployed by GitHub to a URL that would be something like mdn.github.io/sw-test.

After running the above, I got the following:

❯ git merge swtest/gh-pages --allow-unrelated-histories
Auto-merging README.md
CONFLICT (add/add): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.

Ah yes, of course. Our current project and the one we are merging both contain a README.md, so Git is asking us to decide what to do. If you open up the README.md file in your editor, you will notice something like this:

<<<<<<< HEAD

=======

There might be a number of these in the file. You will also see some entries like this, >>>>>>> swtest/gh-pages. This highlights the conflicts that Git is not sure how to resolve. You could go through and clear these manually. In this instance, I just want what is in the README.md at the root of the dom-examples repo, so I will clean up the conflicts or copy the content from the README.md from GitHub.
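
If you would rather resolve this from the command line than edit the file by hand, one option (a sketch, assuming you want to keep the dom-examples version of README.md wholesale) is:

# During a merge, --ours refers to the current branch (dom-examples),
# --theirs to the branch being merged in (swtest/gh-pages)
git checkout --ours -- README.md

The file still needs to be staged afterwards, which is exactly what the next step does.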

As Git requested, we will add and commit our changes.

git add .
git commit -m 'merging sw-test into dom-examples'

The above resulted in the following output:

❯ git commit
[146-chore-move-sw-test-into-dom-examples 4300221] Merge remote-tracking branch 'swtest/gh-pages' into 146-chore-move-sw-test-into-dom-examples

If I now run git log in the root of the directory, I see the following:

commit 4300221fe76d324966826b528f4a901c5f17ae20 (HEAD -> 146-chore-move-sw-test-into-dom-examples)
Merge: cdfd2ae 70c0e1e
Author: Schalk Neethling
Date:   Sat Aug 13 14:02:48 2022 +0200

    Merge remote-tracking branch 'swtest/gh-pages' into 146-chore-move-sw-test-into-dom-examples

commit 70c0e1e53ddb7d7a26e746c4a3412ccef5a683d3 (swtest/gh-pages)
Merge: 4b7cfb2 d4a042d
Author: Schalk Neethling
Date:   Sat Aug 13 13:30:58 2022 +0200

    Merge pull request #47 from mdn/move-prettierrc

    chore: move prettierrc

commit d4a042df51ab65e60498e949ffb2092ac9bccffc (swtest/move-prettierrc)
Author: Schalk Neethling
Date:   Sat Aug 13 13:29:56 2022 +0200

    chore: move prettierrc

    Move `.prettierrc` into the siple-service-worker folder

commit 4b7cfb239a148095b770602d8f6d00c9f8b8cc15
Merge: 8fdfe73 c86d1a1
Author: Schalk Neethling
Date:   Sat Aug 13 13:22:31 2022 +0200

    Merge pull request #46 from mdn/rename-sw-test

Yahoooo! That is the history from sw-test now in our current repository! Running ls -A now shows me:

❯ ls -A
.git                           indexeddb-examples             screen-wake-lock-api
.gitignore                     insert-adjacent                screenleft-screentop
CODE_OF_CONDUCT.md             matchmedia                     scrolltooptions
LICENSE                        media                          server-sent-events
README.md                      media-session                  service-worker
abort-api                      mediaquerylist                 streams
auxclick                       payment-request                touchevents
canvas                         performance-apis               web-animations-api
channel-messaging-basic        picture-in-picture             web-crypto
channel-messaging-multimessage pointer-lock                   web-share
drag-and-drop                  pointerevents                  web-speech-api
fullscreen-api                 reporting-api                  web-storage
htmldialogelement-basic        resize-event                   web-workers
indexeddb-api                  resize-observer                webgl-examples

And if I run ls -A service-worker/, I get:

❯ ls -A service-worker/
simple-service-worker

And finally, running ls -A service-worker/simple-service-worker/ shows:

❯ ls -A service-worker/simple-service-worker/
.prettierrc        README.md          image-list.js      style.css
CODE_OF_CONDUCT.md app.js             index.html         sw.js
LICENSE            gallery            star-wars-logo.jpg

All that is left is to push to remote.

git push origin 146-chore-move-sw-test-into-dom-examples

NOTE: Do not squash merge this pull request, or else all commits will be squashed together as a single commit. Instead, you want to use a merge commit. You can read all the details about merge methods in their documentation on GitHub.

After you merge the pull request, go ahead and browse the commit history of the repo. You will find that the commit history is intact and merged. o/\o You can now go ahead and either delete or archive the old repository.

At this point the remote configured for our target repo serves no purpose, so we can safely remove it.

git remote rm swtest
In Conclusion

The steps to accomplish this task are then as follows:

# Clone the repository you want to merge
git clone https://github.com/mdn/sw-test.git
cd sw-test

# Create your feature branch
git switch -C prepare-repo-for-move
# NOTE: With older versions of Git you can run:
# git checkout -b prepare-repo-for-move

# Create directories as needed. You may only need one, not two as
# in the example below.
mkdir service-worker
mkdir service-worker/sw-test

# Enable extendedglob so we can use negation
# The command below is for modern versions of ZSH. See earlier
# in the post for examples for bash and older versions of ZSH
set -o extendedglob

# Move everything except hidden files into your subdirectory,
# also, exclude your target directories
mv ^(sw-test|service-worker) service-worker/sw-test

# Move any of the hidden files or folders you _do_ want
# to move into the subdirectory
mv .prettierrc service-worker

# Add and commit your changes
git add .
git commit -m 'Moved all source files into new subdirectory'

# Push your changes to GitHub
git push origin prepare-repo-for-move

# Head over to the repository on GitHub, open and merge your pull request
# Back in the terminal, switch to your `main` branch
git switch main

# Delete your feature branch
# This is not technically required, but I like to clean up after myself :)
git branch -D prepare-repo-for-move
# Pull the changes you just merged
git pull origin main

# Change to the root directory of your target repository
# If you have not yet cloned your target repository, change
# out of your current directory
cd ..

# Clone your target repository
git clone https://github.com/mdn/dom-examples.git
# Change directory
cd dom-examples

# Create a feature branch for the work
git switch -C 146-chore-move-sw-test-into-dom-examples

# Add your merge target as a remote
git remote add -f swtest https://github.com/mdn/sw-test.git

# Merge the merge target and allow unrelated history
git merge swtest/gh-pages --allow-unrelated-histories

# Add and commit your changes
git add .
git commit -m 'merging sw-test into dom-examples'

# Push your changes to GitHub
git push origin 146-chore-move-sw-test-into-dom-examples

# Open the pull request, have it reviewed by a team member, and merge.
# Do not squash merge this pull request, or else all commits will be
# squashed together as a single commit. Instead, you want to use a merge commit.

# Remove the remote for the merge target
git remote rm swtest

Hopefully, you now know how to exclude subdirectories using the mv command, set and view shell configuration, and merge the file contents of a git repo into a new repository while preserving the entire commit history using only basic git commands.

The post Merging two GitHub repositories without losing commit history appeared first on Mozilla Hacks - the Web developer blog.

IRL (podcast)The Truth is Out There

Murky political groups are exploiting social media systems to spread disinformation. With important elections taking place around the world this year, who is pushing back? We meet grassroots groups in Africa and beyond who are using AI to tackle disinformation in languages and countries underserved by big tech companies.

Justin Arenstein is the founder of Code for Africa, an organization that works with newsrooms across 21 countries to fact check, track and combat the global disinformation industry.

Tarunima Prabhakar builds tools and datasets to respond to online misinformation in India, as co-founder of the open-source technology community, Tattle.

Sahar Massachi was a data engineer at Facebook and now leads the Integrity Institute, a new network for people who work on integrity teams at social media companies. 

Raashi Saxena in India was the global project coordinator of Hatebase, a crowdsourced repository of online hate speech in 98 languages, run by the Sentinel Project. 

IRL is an original podcast from Mozilla, the non-profit behind Firefox. In Season 6, host Bridget Todd shares stories of people who make AI more trustworthy in real life. This season doubles as Mozilla’s 2022 Internet Health Report. Go to the report for show notes, transcripts, and more.

 

Mozilla AccessibilityCache the World Opt In Preview

Early last year, the Firefox accessibility team began work on a major project to re-design the browser’s accessibility engine. The accessibility engine is responsible for providing assistive technologies like screen readers with the information they need to properly and efficiently announce web page content.

Firefox’s accessibility engine dates back many, many years, having received a hurried but less than ideal update with Firefox’s move to a multi-process architecture about 5 years ago. The current Firefox multi-process accessibility architecture suffers from considerable, sometimes catastrophic performance issues and it is more costly and difficult to maintain than we’d like. These deficiencies prompted the team to design and build a newer, faster and more maintainable accessibility engine and the team has dubbed that project “Cache the World“. It’s called that because the new engine automatically sends information from web content processes to a cache in the browser’s main process for consumption by assistive technologies.

This week, the Cache the World project has reached sufficient completeness in the Firefox Nightly build for the Windows operating system that we are ready for some of you to try it out and let us know how it’s working.

If you’re a Windows screen reader user, and you’re brave enough to test Firefox Nightly builds, you can enable the new accessibility engine by going to Firefox Settings, entering accessibility cache into the search box, tabbing to the Accessibility cache check box and turning it on. Firefox will then prompt you to restart. After Firefox restarts, your screen reader will be relying on the new code, which should be considerably faster but may still have correctness and stability bugs.

If you experience Firefox crashes with the new accessibility engine enabled, please submit those crash reports so we can fix them. If you encounter other capability failures that you do not experience with the old accessibility engine, we’d love to get a Bugzilla bug report or even a note on our chat. One known current problem is that live regions do not work correctly with JAWS. We are working to diagnose this and hope to fix it as soon as we can.

There’s still more work to do – fixing bugs, optimizing performance and bringing platforms other than Windows up to speed – but we’re far enough along that we could really use your help completing the project, especially in identifying unknown failures that will certainly happen out there in the wild West of the web.

There will be more updates here as we make further progress.

The post Cache the World Opt In Preview appeared first on Mozilla Accessibility.

Will Kahn-GreeneVolunteer Responsibility Amnesty Day: 06-2022

Summary

Back in June, I saw a note about Volunteer Responsibility Amnesty Day in Sumana's Changeset Consulting newsletter. The idea of it really struck a chord with me. I wondered whether running an event like this at work would help. With that, I coordinated an event, ran it, and this is the blog post summarizing how it went.

The context

As people leave Mozilla, the libraries, processes, services, and other responsibilities (hidden and visible) all suddenly become unowned. In some cases, these things get passed to teams and individuals and there's a clear handoff. In a lot of cases, stuff just gets dropped on the floor.

Some of these things should remain on the floor--we shouldn't maintain all the things forever. Sometimes things get maintained because of inertia rather than actual need. Letting these drop and decay over time is fine.

Some of these things turn out to be critical cogs in the machinations of complex systems. Letting these drop and decay over time can sometimes lead to a huge emergency involving a lot of unscheduled scrambling to fix. That's bad. No one likes that.

In the last year, I had picked up a bunch of stuff from people who had left and it was increasingly hard to juggle it all. Thus taking a day to audit all the things on my plate and figuring out which ones I don't want to do anymore seemed really helpful.

Further, even without people leaving, new projects show up, pipelines are added, new services are stood up--there's more stuff running and more stuff to do to keep it all running.

Thus I wondered, what if other people in Data Org at Mozilla had similar issues? What if there were tasks and responsibilities that we had accumulated over the years that, if we stepped back and looked at them, didn't really need to be done anymore? What if there were people who had too many things on their plate and people who had a lot of space? Maybe an audit would surface this and let us collectively shuffle some things around.

Setting it up

In that context, I decided to coordinate a Volunteer Responsibility Amnesty Day for Data Org.

I decided to structure it a little differently because I wanted to run something that people could participate in regardless of what time zone they were in. I wanted it to produce an output that individuals could talk with their managers about--something they could use to take stock of where things were at, surface work individuals were doing that managers may not know about, and provide a punch list of actions to fix any problems that came up.

I threw together a Google doc that summarized the goals, provided a template for the audit, and included a next-steps section, which pretty much amounted to: tell us on Slack and bring it up with your manager in your next 1:1. Here's the doc:

https://docs.google.com/document/d/19NF69uavGXii_DEkRpQsJuklHxWoPTWwxBp_ucRela4/edit#

I talked to my manager about it. I mentioned it in meetings and in various channels on Slack.

On the actual day, I posted a few reminders in Slack.

How'd it go?

I figured it was worth doing once. Maybe it would be helpful? Maybe not? Maybe it helps us reduce the amount of stuff we're doing solely for inertia purposes?

I didn't get a lot of signal about how it went, though.

I know chutten participated and the audit was helpful for him. He has a ton of stuff on his plate.

I know Jan-Erik participated. I don't know if it was helpful for him.

I heard that Alessio decided to do this with his team every 6 months or so.

While I did organize the event, I actually didn't participate. I forget what happened, but something came up and I was bogged down with that.

That's about all I know. I think there are specific people who have a lot of stuff on their plate and this was helpful, but generally either people didn't participate (Maybe they were bogged down like me? Maybe they don't have much they're juggling?) or I never found out they participated.

Epilog

I think it was useful to do. It was a very low-effort experiment to see if something like this would be helpful. If people had a lot on their plates, it seems like this would have surfaced a bunch of things, allowing us to improve people's work lives. I think for specific people who have a lot on their plate, it was a helpful exercise.

I didn't get enough signal to make me want to spend the time to run it again in December.

Given that:

  1. I think it's good to run individually. If you're feeling overwhelmed with stuff, an audit is a great place to start figuring out how to fix that.

  2. It might be good to run in a small team as an exercise in taking stock of what's going on and rebalancing things.

  3. It's probably not helpful to run in an org where maybe it ends up being more bookkeeping work than it's worth.

The Mozilla BlogSlow your scroll: 5 ways to fight misinformation on your social feed

An illustration shows three columns containing newspaper icons along with social media icons. (Credit: Nick Velazquez / Mozilla)

The news is overwhelming. Attention spans are waning. Combine those with social media feeds that are optimized for endless scrolling, and we get an internet where misinformation thrives. 

In many ways, consuming news has become a social act. We get to share what we’re reading and thinking through social media. Other people respond with their own thoughts and opinions. Algorithms pick up on all of this activity, and soon enough, our feeds feed us what to consume next – one after another. While it could be actual news and accurate information, often, it’s an opinionated take, inaccuracy or even propaganda. 

Of course, the internet also connects us with reliable sources. But when it comes to social media, it becomes a matter of whether or not we actually stop scrolling and take the time to verify what we’re seeing and hearing. So, how can we fight misinformation in our never-ending feeds? Consider these five tips.


1. Filter out the aesthetics

Cool infographic catch your eye? Know that it’s probably designed to do just that: grab our attention. Same with content from creators we love. One day they’re dancing, the next they’re giving us health advice. Before taking what we see and hear at face value, we should ask ourselves the 5 Ws:

  • Who is posting? Are they the original source of the information? If not, who is?
  • What is the subject of the post? Is it the source’s expertise or are they relaying something they experienced first-hand?
  • When was it posted? Is the information still relevant today, or have circumstances changed?
  • If it’s an image or a video, where is the event that’s depicted located?
  • Why did they post it? Are they trying to sell you something or gain your support in any way?

2. If something sparks emotion, take a beat

Shocking images and videos can spread quickly on social media. It doesn’t mean we can’t trust them, but it does mean that stakes are higher when they turn out to be misleading or manipulated. 

Before hitting that like or share button, consider what might happen if that turns out to be the case. How would sharing false information affect us, other people or the larger world? Emotions can cloud our judgment, especially when a topic feels personal, so just taking a moment to let our critical thinking kick in can often do the trick.

3. Know when it’s time to dig deeper

There can be obvious signs of misinformation. Think typos, grammatical errors and clear alteration of images or videos. But many times, it’s hard to tell. Is it a screenshot of an article with no link, or footage of a large protest? Does the post address a polarizing topic? 

It might even take an expert like an investigative journalist, fact-checker or researcher to figure out whether a piece of media has been manipulated or if a post is the product of a sophisticated disinformation campaign. That’s when knowing how to find experts’ work — trustworthy sources — comes in handy. 

4. Report misinformation

If you’ve determined that something is false, report it in the app. Social media companies often rely on users to flag misleading and dangerous content, so take an extra but impactful step to help make sure others don’t fall for misinformation. 

5. Feed your curiosity – outside the feed

Real talk: Our attention spans are getting shorter, and learning about the world through quick, visual content can be more entertaining than reading. That’s OK! Still, we should give ourselves some time to explore what piques our interests outside of our social media apps.

Hear something outrageous? Look up news articles and learn more, maybe you can even do something about it. Concerned about vaccines, a pandemic or another public health emergency? Educate yourself and see what your local health officials are saying. Feel strongly about a topic everyone’s talking about online? Start a conversation about it in real life. Our screens give us a window to the larger world, but looking up to notice what’s right in front of us can be pretty great too. 

This guide was created in partnership with the News Literacy Project and the Teens for Press Freedom. The News Literacy Project, a nonpartisan education nonprofit, is building a national movement to advance the practice of news literacy throughout American society, creating better informed, more engaged and more empowered individuals — and ultimately a stronger democracy. The Teens for Press Freedom is a national, youth-led organization dedicated to promoting freedom of the press and factual literacy among teens.


The internet is a great place for families. It gives us new opportunities to discover the world, connect with others and just generally make our lives easier and more colorful. But it also comes with new challenges and complications for the people raising the next generations. Mozilla wants to help families make the best online decisions, whatever that looks like, with our latest series, The Tech Talk.


The post Slow your scroll: 5 ways to fight misinformation on your social feed appeared first on The Mozilla Blog.

The Mozilla BlogFirefox Presents: Feeling alive with the ‘Stoke King’

If you could use a little hyping up to go outside, look no further than Wade Holland’s social media feeds. A former competitive skier from Montana, Holland encourages people to find their “stoke” – whether that’s by going on a mountain bike ride, rollerblading, or just feeling the sun on your skin.

“It can be finding a little park right behind your house and singing and dancing in it,” Holland said. “You don’t have to hike Everest. You can do whatever elevates your stoke!”

Now based in Los Angeles, the 34-year-old content creator calls himself a “stoke king.” His vibe is that of a very enthusiastic personal trainer, except you’ll see his outfit from a mile away, the gym is nature, and he’s training you to amp your zest for life all the way up to 11. 

Holland is his own personal success story. Years of injuries made him rethink his goal of becoming a professional skier. While filming a backcountry skiing video with a crew at 21, he flew about 60 feet and landed on his hip on a rock, shattering his femur. He had to be rescued by helicopter and taken into surgery, during which he had a titanium rod placed in his leg.

“I almost didn’t make it back from that,” Holland said. 

While the injury didn’t stop him from being active outdoors, he had to scale back. 

“It made me realize that maybe what I’m better at is getting other people excited about what I love so much,” he said. “That led me to a path of creating content that helps people get to a destination and feel good about themselves doing it.”

Holland’s mission became convincing people that anyone can go outside and enjoy nature, wherever they are and whatever their ability. No sleek cycling suit, surfboard or ski poles needed.

After years of consistently producing content, his ability to get people just as excited as he is has paid off. Wade’s motivational adventure posts have drawn 38,500 followers on TikTok and 213,000 on Instagram, where he met his partner Abby Wren, who’s a makeup content creator. 

Wade Holland holds up his hands, wearing gloves that read "stoked." (Photo: Nita Hong for Mozilla)

Holland had been booked to host an event  in Victoria, Canada. Always on the lookout for opportunities to collaborate, he searched #contentcreators and found Wren, a fellow Montana native. He asked her if she wanted to meet up. She agreed and asked to meet in Vancouver.

But the ferry wasn’t running that day, and sea planes were fully booked. Holland, not wanting to miss the chance to meet Wren, persuaded a helicopter company to help.

“I said, ‘Hey, this is kind of wild, but I’m trying to meet this woman who could be my future wife.’ I showed them a picture of Abby and told them that if they let me get on this helicopter, I’ll make them a 30-second video,” Wade recalled. “They said, ‘Wow, we’ve never been pitched that idea, this seems so outlandish. But this is going to be a hell of a story. Get on.’”

Holland and Wren have been together ever since, and they plan on getting married next year. His life changed because of his excitement to meet a woman he’d never met, and he got others to feel as thrilled as he was.

“Each day I’m reminded how much life is a gift,” Holland said. “That it’s my responsibility to squeeze the most out of every day I have on this planet, bring my passion and enthusiasm to connect with my community online, and inspire them to get outside and stay stoked.”

Firefox is exploring all the ways the internet makes our planet an awesome place. Almost everything we do today ties back to the online world in some way — so, join us in highlighting the funny, weird, inspiring and courageous stories that remind us why we love the world wide web.


The post Firefox Presents: Feeling alive with the ‘Stoke King’ appeared first on The Mozilla Blog.

Mozilla ThunderbirdWe Asked AI To Create These Beautiful Thunderbird Wallpapers

The buzz around AI-generated artwork continues to grow with each passing week. As machine-learning-driven AI systems like DALL·E 2, Midjourney, and Stable Diffusion continue to evolve, some truly awe-inspiring creations are being unleashed onto the world. We wanted to tap into that creative energy to produce some unique desktop wallpapers for the Thunderbird community!

So, we fed Midjourney the official Thunderbird logo and a series of descriptive text prompts to produce the stunning desktop wallpapers you see below. (Can you spot which one is also inspired by our friends at Firefox?)

Dozens of variations and hundreds of images later, we narrowed it down to four designs. Aside from adding a small Thunderbird watermark in the lower corners of each wallpaper, these images are exactly as the Midjourney AI produced them.

View And Download The Thunderbird Wallpapers

We did take the liberty of upscaling each image to UltraHD resolution, meaning they’ll look fantastic even on your 4K monitors. And of course, on your 1080p or 1440p panels as well.

Just click each image below to download the full-resolution file.

If you love them, share this page and tell people about Thunderbird! And if you end up using them as your PC wallpaper, send us a screenshot on Mastodon or Twitter.

Thunderbird is the leading open-source, cross-platform email and calendaring client, free for business and personal use. We want it to stay secure and become even better. Donations allow us to hire developers, pay for infrastructure, expand our userbase, and continue to improve.

Click here to make a donation

The post We Asked AI To Create These Beautiful Thunderbird Wallpapers appeared first on The Thunderbird Blog.

The Mozilla BlogA little less misinformation, a little more action

Ten young people lean on a wall looking down at their phones. (Credit: Nick Velazquez / Mozilla)

As each generation comes of age, they challenge the norms that came before them. If you were to ask most people their go-to way to search, they would mention a search engine. But for Gen Z, TikTok has become one of the most popular ways to find information.

Adrienne Sheares, a social media strategist and a millennial who grew up relying on search engines, had difficulty grasping the habit. So, she spoke with a small group of Gen Zers and reported what she heard in a recent Twitter thread.

Among her learnings: Young people are drawn to content TikTok curates for them, they prefer watching quick videos over reading, and they know misinformation exists and “will avoid content on the platform that can easily be false.” Sheares’ thread went viral. Her curiosity resonated, especially with people whose habits are very different from those of Gen Z.

As part of our mission at Mozilla, we’re working to support families in having a healthy relationship with the internet. That includes an online experience where young people are equipped to cut through the noise – including misinformation. So we wanted to learn more about how Gen Z consumes the news, and how families can encourage curiosity about current events without shutting out social media. After all, while it may be rife with misinformation, it’s still an essential platform for many teens to connect with their peers.

We spoke with members of Teens for Press Freedom, a youth-led organization that advocates for news literacy among teenagers. We asked Sofia, Agatha, Charlotte, Eloise and Kevin – who are all in their teens – about how they engage with information on social media, their concerns about algorithms and how we can help Gen Zers fight misinformation. Here’s what they said. 


Gen Zers are vocal about their values

The way we consume news has become intrinsically social. People start sharing the news they’re consuming because that’s what you do on Instagram and other platforms. People say, “Hey, I’m reading this and therefore, I fit into this educated part of political American life. I have a real opinion that’s very valid.” Everyone wants to feel like they’re part of that group.

CHARLOTTE, CO-FOUNDER OF TEENS FOR PRESS FREEDOM

Agatha, co-director of Teens for Press Freedom, first took notice of how news spreads on Instagram in 2020, when she was 14. 

“There was this post about Palestine and Israel that was incredibly antisemitic,” Agatha recalled. “It was sort of convincing people that they should be antisemitic. That obviously isn’t right. I’m Jewish, and I felt like the post associated Jewish people with the actions of Israel’s government. That felt like misinformation because I didn’t do anything. It seemed to blame people who have never even lived in Israel.”

She started seeing more and more posts with misinformation about other issues, including COVID-19, the Black Lives Matter protests and violence against Asian Americans. “People were sharing them because it looked cool, like they were doing the right thing by spreading these infographics and letting their thousands of followers know about these incidents,” Agatha said.

Many young people want to publicly express their values. However, they run into a problem in the way they do it. 

“People weren’t making sure that the information they were spreading was actually correct and not just something somebody had written, copied into a graphic and sent it out to the world,” Agatha said. 

Charlotte, who co-founded Teens for Press Freedom and is now an incoming freshman at Dartmouth College, said many people fall into a trap of “virtue signaling.”

“The way we consume news has become intrinsically social,” Charlotte said. “People start sharing the news they’re consuming because that’s what you do on Instagram and other platforms. People say, ‘Hey, I’m reading this and therefore, I fit into this educated part of political American life. I have a real opinion that’s very valid.’ Everyone wants to feel like they’re part of that group.”

The infinite feed has shaped Gen Z’s online habits

There’s something about the endless scroll that is so compelling to people.

CHARLOTTE, CO-FOUNDER OF TEENS FOR PRESS FREEDOM

Facebook launched in 2004, YouTube in 2005, Twitter in 2006, Instagram in 2010 and Snapchat in 2011. Millennials came of age as those platforms exploded. Gen Zers – those born after 1996, or people 25 and younger, as classified by the Pew Research Center – don’t remember a time when the internet wasn’t a major means of personal communication and media consumption.

Social media feeds favor information presented succinctly, so users can quickly move on to the next post one after another. TikTok, launched in 2016, has “hacked that algorithm so well,” Charlotte said. “Now, everyone’s using it. There’s YouTube Shorts, Instagram Reels, Netflix Fast Laughs. There’s something about the endless scroll that is so compelling to people. That just invites us to spend hours and hours learning about the world in that way.”

Teens today have lived most of their lives in that world, and it has affected how they consume the news.

Short attention spans fuel misinformation

When I’m listening to music, I can’t get myself to sit through a full song without skipping to the next one. Consuming things is just what we’re programmed to do.

ELOISE, ADVOCACY DIRECTOR OF TEENS FOR PRESS FREEDOM

Many teens know how to confirm facts through resources on the internet. That’s thanks to ongoing efforts by educators who include verifying information in their lesson plans. 

Kevin, workshop team director at Teens for Press Freedom, recently saw a post on Instagram purportedly about a California bill that would allow late-term abortions. “I looked it up because I was curious,” he said. He quickly learned that the law doesn’t actually propose that. 

The issue, Kevin said, is taking the time to fact-check. 

“We’re a generation constantly fed and fed and fed and given things to consume,” said Eloise, advocacy director at Teens for Press Freedom. “Our attention spans are significantly lower than generations before us. When I’m listening to music, I can’t get myself to sit through a full song without skipping to the next one. Consuming things is just what we’re programmed to do.”

That may be why many Gen Zers prefer watching short videos to learn information instead of reading articles.

“People feel like reading the news is not something to prioritize when they can just look at headlines,” Agatha said. “A lot of newspapers have an audio link now so people listen to it instead. Or it’ll say five-minute read, and people will take five minutes to read it. But they don’t want to spend 10, 20 minutes informing themselves on what’s happening to the world.”

News events become more engaging on social media with flashy imagery and content that highlights the outrageous. While this means platforms have become a breeding ground for misinformation, there’s also a silver lining: Younger generations have become more motivated than ever to engage in issues they care about.

Teens are aware of the power of algorithms

We find that [algorithms are] kind of abusing our personal information.

KEVIN, WORKSHOP TEAM DIRECTOR OF TEENS FOR PRESS FREEDOM

While many Gen Zers feel equipped to figure out what’s real or not on social media, algorithms that feed users content curated to each individual are hurting their ability to slow down and choose what they consume. 

“A lot of misinformation are half-truths, like it’s almost believable enough that you can accept it without doing any extra research,” said Sofia, a high school junior and co-director of Teens for Press Freedom. “You go to TikTok to be entertained, and if that entertainment is inundated with misleading information, you’re consuming it without knowing you’re consuming it.”

The teens expressed concern about algorithm-based technologies being tested on young people. Kevin sees it as “abusing their personal information.” Being fed posts based on each person’s interests can create a distorted ecosystem of content that includes misleading, even manipulative, information.

“You’re sucked into this world of people you don’t know, and you see all these different ideas and things that are your interests, and you spend hours and hours on there,” Agatha said. “Their ideas sort of become yours. Your opinion then becomes TikTok’s opinion and vice versa.”

Sofia said this has contributed to the loss of productive conversation around politics: “Algorithms are not only creepy. It’s really damaging not just to the individual but to the political situation in the United States. People are only seeing content that aligns with their beliefs.”

Charlotte said, “There’s this rhetoric about how Gen Z is the most informed generation because of social media, and in many ways that’s true. But social media isn’t really the great democratizer. There’s [also] a lot we don’t know because of these algorithms.”

There are ways to help younger generations fight misinformation

Rather than being talked at, [teens] can talk to each other about issues.

SOFIA, CO-DIRECTOR OF TEENS FOR PRESS FREEDOM

While education about trustworthy sources needs to continue through school, the group said we need to expand the conversation to social media. 

“A lot of people our age think that being critical of sources is something school-related,” Kevin said. “People will say something like, ‘I saw this on TikTok, and then you know, very non-reluctantly quote social media as a source of information.”

Applying the process of verifying information on social media means facilitating discussions among people who consume content in similar ways. 

“Rather than being talked at, they can talk to each other about issues,” Sofia said. “If they’re convinced by someone their own age that what they’re experiencing is not something that they alone have to go through, or that they alone have to figure out a solution, that makes the whole thing a lot easier to confront.”

For parents, this can mean finding peer-to-peer resources for their kids like Teens for Press Freedom’s misinformation workshops. Families can also have real conversations with their children about their values and issues they care about, encouraging curiosity instead of avoiding complicated topics. 

Ultimately, adults can use their power to support efforts to make the internet a better place – one where technology doesn’t use children’s data against them. Young people will tell us what they need if we ask. We can’t let algorithms do that work for us.  


The internet is a great place for families. It gives us new opportunities to discover the world, connect with others and just generally make our lives easier and more colorful. But it also comes with new challenges and complications for the people raising the next generations. Mozilla wants to help families make the best online decisions, whatever that looks like, with our latest series, The Tech Talk.


The post A little less misinformation, a little more action appeared first on The Mozilla Blog.

Mozilla ThunderbirdThunderbird Tip: Rearrange The Order Of Your Accounts

One of Thunderbird’s strengths is managing multiple email accounts, newsgroup accounts, and RSS feed subscriptions. But how do you display those accounts in the order YOU want? It’s super easy, and our new Thunderbird Tips video (viewable below) shows you how in less than one minute!

But First: Our YouTube Channel!

We’re currently building the next exciting era of Thunderbird, and developing a Thunderbird experience for mobile. But we’re also trying to put out more content across various platforms to keep you informed — and maybe even entertained!

To that end, we are relaunching our YouTube channel with a forthcoming new podcast, and a series of tips and tricks to help you get the most out of Thunderbird. You can subscribe here. Help us reach 1000 subscribers by the end of August!


Bonus accessibility tips:

1) The keyboard shortcut for this command is ALT + ⬆/⬇ (OPTION + ⬆/⬇ on macOS).
2) Account Settings can also be accessed via the Spaces toolbar, App Menu, and Account Central.

As always, thanks for using Thunderbird! And thanks for making Thunderbird possible with your support and donations.

The post Thunderbird Tip: Rearrange The Order Of Your Accounts appeared first on The Thunderbird Blog.

Mozilla Open Policy & Advocacy BlogIt’s Time to Pass U.S. Federal Privacy Legislation

Despite being a powerhouse of technology and innovation, the U.S. lags behind global counterparts when it comes to privacy protections. Every day, people face the real possibility that their very personal information could fall into the hands of third parties seeking to weaponize it against them.

At Mozilla, we strive to not only empower people with tools to protect their own privacy, but also to influence other companies to adopt better privacy practices. That said, we can’t solve every problem with a technical fix or rely on companies to voluntarily prioritize privacy.

The good news? After decades of failed attempts and false starts, real reform may finally be on the horizon. We’ve recently seen more momentum than ever for policy changes that would provide meaningful protections for consumers and more accountability from companies. It’s time that we tackle the real-world harms that emerge as a result of pervasive data collection online and abusive privacy practices.

Strong federal privacy legislation is critical in creating an environment where users can truly benefit from the technologies they rely on without paying the premium of exploitation of their personal data. Last month, the House Committee on Energy & Commerce took the important step of voting the bipartisan American Data Privacy and Protection Act (ADPPA) out of committee and advancing the bill to the House floor. Mozilla supports these efforts and encourages Congress to pass the ADPPA.

Stalling on federal policy efforts would only hurt American consumers. We look forward to continuing our work with policymakers and regulators to achieve meaningful reform that restores trust online and holds companies accountable. There’s more progress than ever before towards a solution. We can’t miss this moment.

The post It’s Time to Pass U.S. Federal Privacy Legislation appeared first on Open Policy & Advocacy.

Wladimir PalantAttack surface of extension pages

In the previous article we discussed extension privileges. And as we know from another article, extension pages are the extension context with full access to these privileges. So if someone were to attack a browser extension, attempting Remote Code Execution (RCE) in an extension page would be the obvious thing to do.

In this article we’ll make some changes to the example extension to make such an attack against it feasible. But don’t be mistaken: rendering our extension vulnerable requires actual work, thanks to the security measures implemented by the browsers.

This doesn’t mean that such attacks are never feasible against real-world extensions. Sometimes even these highly efficient mechanisms fail to prevent a catastrophic vulnerability. And then there are of course extensions explicitly disabling security mechanisms, with similarly catastrophic results. Ironically, both of these examples are supposed security products created by big antivirus vendors.

Note: This article is part of a series on the basics of browser extension security. It’s meant to provide you with some understanding of the field and serve as a reference for my more specific articles. You can browse the extension-security-basics category to see other published articles in this series.

What does RCE look like?

Extension pages are just regular HTML pages. So what we call Remote Code Execution here is usually called a Cross-site Scripting (XSS) vulnerability in other contexts. It’s merely that the impact of such vulnerabilities is typically more severe with browser extensions.

A classic XSS vulnerability would involve insecurely handling untrusted HTML code:

var div = document.createElement("div");
div.innerHTML = untrustedData;
document.body.appendChild(div);

If an attacker can decide what kind of data is assigned to innerHTML here, they could choose a value like <img src="x" onerror="alert('XSS')">. Once that image is added to the document, the browser will attempt to load it. The load fails, which triggers the error event handler. And that handler is defined inline, meaning that the JavaScript code alert('XSS') will run. So you get a message indicating successful exploitation:

A browser alert message titled “My extension” containing the text “XSS”.

And here is your first hurdle: the typical attack target is the background page, thanks to how central it is to most browser extensions. Yet the background page isn’t visible, meaning that it has little reason to deal with HTML code.

What about pages executing untrusted code directly then? Something along the lines of:

eval(untrustedData);

At first glance, this looks similarly unlikely. No developer would actually do that, right?

Actually, they would if they use jQuery which has an affinity for running JavaScript code as an unexpected side-effect.

Modifying the example extension

I’ll discuss all the changes to the example extension one by one. But you can download the ZIP file with the complete extension source code here.

Before an extension page can run malicious code, this code has to come from somewhere. Websites, malicious or not, cannot usually access extension pages directly however. So they have to rely on extension content scripts to pass malicious data along. This separation of concerns reduces the attack surface considerably.

But let’s say that our extension wanted to display the price of the item currently viewed. The issue: the content script cannot download the JSON file with the price. That’s because the content script itself runs on www.example.com whereas JSON files are stored on data.example.com, so same-origin policy kicks in.

No problem, the content script can ask the background page to download the data:

chrome.runtime.sendMessage({
  type: "check_price",
  url: location.href.replace("www.", "data.") + ".json"
}, response => alert("The price is: " + response));

Next step: the background page needs to handle this message. And extension developers might decide that the fetch API is too complicated, which is why they’d rather use jQuery.ajax() instead. So they do the following:

chrome.runtime.onMessage.addListener((request, sender, sendResponse) =>
{
  if (request.type == "check_price")
  {
    $.get(request.url).done(data =>
    {
      sendResponse(data.price);
    });
    return true;
  }
});

Looks simple enough. The extension needs to load the latest jQuery 2.x library as a background script and request permission to access data.example.com, meaning the following changes to manifest.json:

{
  
  "permissions": [
    "storage",
    "https://data.example.com/*"
  ],
  
  "background": {
    "scripts": [
      "jquery-2.2.4.min.js",
      "background.js"
    ]
  },
  
}

This appears to work correctly. When the content script executes on https://www.example.com/my-item it will ask the background page to download https://data.example.com/my-item.json. The background page complies, parses the JSON data, gets the price field and sends it back to the content script.

The attack

You might wonder: where did we tell jQuery to parse JSON data? And we actually didn’t. jQuery merely guessed that we want it to parse JSON because we downloaded a JSON file.

What happens if https://data.example.com/my-item.json is not a JSON file? Then jQuery might interpret this data as any one of its supported data types. By default those are: xml, json, script or html. And you can probably spot the issue already: script type is not safe.

So if a website wanted to exploit our extension, the easiest way would be to serve a JavaScript file (MIME type application/javascript) under https://data.example.com/my-item.json. One could use the following code for example:

alert("Running in the extension!");

Will jQuery then try to run that script inside the background page? You bet!

But the browser saves the day once again. The Developer Tools display the following issue for the background page now:

Screenshot with the text “Content Security Policy of your site blocks the use of 'eval' in JavaScript” indicating an issue in jquery-2.2.4.min.js

Note: There is a reason why I didn’t use jQuery 3.x. The developers eventually came around and disabled this dangerous behavior for cross-domain requests. In jQuery 4.x it will even be disabled for all requests. Still, jQuery 2.x and even 1.x remain way too common in browser extensions.

Making the attack succeed

The Content Security Policy (CSP) mechanism which stopped this attack is extremely effective. The default setting for browser extension pages is rather restrictive:

script-src 'self'; object-src 'self';

The script-src entry here determines what scripts can be run by extension pages. 'self' means that only scripts contained in the extension itself are allowed. No amount of trickery will make this extension run a malicious script on an extension page. This protection renders all vulnerabilities non-exploitable or at least reduces their severity considerably. Well, almost all vulnerabilities.

That’s unless an extension relaxes this protection, which is way too common. For example, some extensions will explicitly change this setting in their manifest.json file to allow eval() calls:

{
  
  "content_security_policy": "script-src 'self' 'unsafe-eval'; object-src 'self';",
  
}

Protection is gone and the attack described above suddenly works!

A browser alert message titled “My extension” containing the text “Running in the extension!”.

Do I hear you mumble “cheating”? “No real extension would do that” you say? I beg to differ. In my extension survey 7.9% of the extensions use 'unsafe-eval' to relax the default Content Security Policy setting.

In fact: more popular extensions are more likely to be the offenders here. When looking at extensions with more than 10,000 users, it’s already 12.5% of them. And for extensions with at least 100,000 users this share goes up to 15.4%.

Further CSP circumvention approaches

Edit (2022-08-24): This section originally mentioned 'unsafe-inline' script source. It is ignored for browser extensions however, so that it isn’t actually relevant in this context.

It doesn’t always have to be the 'unsafe-eval' script source which essentially drops all defenses. Sometimes it is something way more innocuous, such as adding some website as a trusted script source:

{
  
  "content_security_policy": "script-src 'self' https://example.com/;"
  
}

With example.com being some big name’s code hosting or even the extension owner’s very own website, it certainly can be trusted? How likely is it that someone will hack that server only to run some malicious script in the extension?

Actually, hacking the server often isn’t necessary. Occasionally, example.com will contain a JSONP endpoint or something similar. For example, https://example.com/get_data?callback=ready might produce a response like this:

ready({...some data here...});

Attackers would attempt to manipulate this callback name, e.g. loading https://example.com/get_data?callback=alert("XSS")// which will result in the following script:

alert("XSS")//({...some data here...});

That’s it, now example.com can be used to produce a script with arbitrary code and CSP protection is no longer effective.

Side-note: These days JSONP endpoints usually restrict callback names to alphanumeric characters only, to prevent this very kind of abuse. However, JSONP endpoints without such protection are still too common.
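
To illustrate what that restriction looks like, here is a hedged sketch of the kind of server-side check involved. It assumes a Node.js server using Express; app and getData() are hypothetical stand-ins for the endpoint’s actual setup:

// Hypothetical Express-style handler; getData() stands in for whatever
// produces the JSONP payload.
app.get("/get_data", (req, res) =>
{
  var callback = String(req.query.callback || "callback");
  // Only allow plain identifiers, so the callback name cannot be turned
  // into arbitrary JavaScript.
  if (!/^[A-Za-z0-9_]+$/.test(callback))
  {
    res.status(400).send("Invalid callback name");
    return;
  }
  res.type("application/javascript");
  res.send(callback + "(" + JSON.stringify(getData()) + ");");
});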

Edit (2022-08-25): The original version of this article mentioned open redirects as another CSP circumvention approach. Current browser versions check redirect target against CSP as well however.

Recommendations for developers

So if you are an extension developer and you want to protect your extension against this kind of attack, what can you do?

First and foremost: let Content Security Policy protect you. Avoid adding 'unsafe-eval' at any cost. Rather than allowing external script sources, bundle these scripts with your extension. If you absolutely cannot avoid loading external scripts, try to keep the list short.
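
For the price-checking handler from earlier, that can be as simple as replacing jQuery with the built-in fetch API. Here is a minimal sketch under the same assumptions as the example above (same message format, error handling kept to a bare minimum):

chrome.runtime.onMessage.addListener((request, sender, sendResponse) =>
{
  if (request.type == "check_price")
  {
    // fetch() never executes the response; response.json() parses it
    // strictly as JSON and rejects anything else.
    fetch(request.url)
      .then(response => response.json())
      .then(data => sendResponse(data.price))
      .catch(() => sendResponse(null));
    return true;
  }
});

If the server responds with JavaScript instead of JSON, the promise simply rejects. There is no code execution, and no reason to relax the Content Security Policy.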

And then there is the usual advice for preventing XSS vulnerabilities (a short example follows the list):

  • Don’t mess with HTML code directly, use safe DOM manipulation methods such as createElement(), setAttribute(), textContent.
  • For more complicated user interfaces use safe frameworks such as React or Vue.
  • Do not use jQuery.
  • If you absolutely have to handle dynamic HTML code, always pass it through a sanitizer such as DOMPurify and soon hopefully the built-in HTML Sanitizer API.
  • When adding links dynamically, always make sure that the link target starts with https:// or at least http:// so that nobody can smuggle in a javascript: link.
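
For example, displaying the price from the earlier scenario could look like the following minimal sketch using only safe DOM APIs. Note that untrustedPrice and untrustedUrl are hypothetical placeholders for data received from a content script or a server:

// untrustedPrice and untrustedUrl are hypothetical placeholders.
var container = document.createElement("div");
container.textContent = "The price is: " + untrustedPrice; // treated as text, never parsed as HTML

var link = document.createElement("a");
link.textContent = "View item";
try
{
  var url = new URL(untrustedUrl, location.href);
  // Only allow http(s) targets, so nobody can smuggle in a javascript: link.
  if (url.protocol == "https:" || url.protocol == "http:")
    link.href = url.href;
}
catch (e)
{
  // Not a valid URL, leave the link without a target.
}
container.appendChild(link);
document.body.appendChild(container);

Because the untrusted values only ever end up in text nodes and vetted attributes, there is nothing for the browser to interpret as HTML or script.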

The Mozilla BlogHow Firefox’s Total Cookie Protection and container extensions work together

When we recently announced the full public roll-out of Firefox Total Cookie Protection — a new default browser feature that automatically confines cookies to the websites that created them, thus eliminating the most common method that sites use to track you around the web — it raised a question: Do container extensions like Mozilla’s Facebook Container and Multi-Account Containers still serve a purpose, since they similarly perform anti-tracking functions by suppressing cookie trails?

In short, yes. Container extensions offer additional benefits even beyond the sweeping new privacy enhancements introduced with Firefox Total Cookie Protection.

Total Cookie Protection + container extensions = enhanced anti-tracking 

Total Cookie Protection isolates cookies from each website you visit, so Firefox users now receive comprehensive cookie suppression wherever they go on the web. 

However, Total Cookie Protection does not isolate cookies from different open tabs under the same domain. So for instance, if you have Google Shopping open in one tab, Gmail in another, and Google News in a third, Google will know you have all three pages open and connect their cookie trails. 

Total Cookie Protection creates a separate cookie jar for each website you visit. (Illustration: Meghan Newell)

But with a container extension, you can isolate cookies even within parts or pages of the same domain. You could have Gmail open in one container tab and Google Shopping and News in other containers (for instance, under different accounts) and Google will be oblivious to their relation. 

Beyond this added privacy protection, container extensions are most useful as an easy means of separating different parts of your online life (e.g. personal, work) within the same browser. 

A couple reasons you might want Multi-Account Containers installed on Firefox… 

  • Avoid logging in and out of different accounts under the same web platform; for example, with containers you could have separate instances of Slack open at the same time — one for work, another for friends. 
  • If multiple family members or roommates share Firefox on one computer, each person can easily access their own container with a couple clicks.

While, technically, you can create a Facebook container within Multi-Account Containers, the Facebook Container extension is intended to provide a simple, targeted solution for the many Facebook users concerned about the pervasive ways the social media behemoth tracks you around the web.

Facebook tracks your online moves outside of Facebook through the various widgets you find embedded ubiquitously around the web (e.g. “Like” buttons or Facebook comments on articles, social share features, etc.). The convenience of automatic sign-in when you visit Facebook is because of cookies. However, this convenience comes at a steep privacy cost — those same cookies can tell Facebook about any page you visit associated with one of its embedded features. 

But with Facebook Container installed on Firefox, you maintain the convenience of automatic Facebook sign-in while cutting off the cookie trail to other sites you visit outside of Facebook.

So if you want superior anti-tracking built right into your browser, plus the enhanced privacy protections and organizational convenience of containers, install a container extension on Firefox and rest easy knowing your cookie trails aren’t exposed. 


The post How Firefox’s Total Cookie Protection and container extensions work together appeared first on The Mozilla Blog.


Niko MatsakisCome contribute to Salsa 2022!

Have you heard of the Salsa project? Salsa is a library for incremental computation – it’s used by rust-analyzer, for example, to stay responsive as you type into your IDE (we have also discussed using it in rustc, though more work is needed there). We are in the midst of a big push right now to develop and release Salsa 2022, a major new revision to the API that will make Salsa far more natural to use. I’m writing this blog post both to advertise that ongoing work and to put out a call for contribution. Salsa doesn’t yet have a large group of maintainers, and I would like to fix that. If you’ve been looking for an open source project to try and get involved in, maybe take a look at our Salsa 2022 tracking issue and see if there is an issue you’d like to tackle?

So wait, what does Salsa do?

Salsa is designed to help you build programs that respond to rapidly changing inputs. The prototypical example is a compiler, especially an IDE. You’d like to be able to do things like “jump to definition” and keep those results up-to-date even as the user is actively typing. Salsa can help you build programs that manage that.

The key way that Salsa achieves reuse is through memoization. The idea is that you define a function that does some specific computation, let’s say it has the job of parsing the input and creating the Abstract Syntax Tree (AST):

fn parse_program(input: &str) -> AST { }

Then later I have other functions that might take parts of that AST and operate on them, such as type-checking:

fn type_check(function: &AstFunction) { }

In a setup like this, I would like to have it so that when my base input changes, I do have to re-parse but I don’t necessarily have to run the type checker. For example, if the only change to my program was to add a comment, then maybe my AST is not affected, and so I don’t need to run the type checker again. Or perhaps the AST contains many functions, and only one of them changed, so while I have to type check that function, I don’t want to type check the others. Salsa can help you manage this sort of thing automatically.

What is Salsa 2022 and how is it different?

The original salsa system was modeled very closely on the rustc query system. As such, it required you to structure your program entirely in terms of functions and queries that called one another. All data was passed through return values. This is a very powerful and flexible system, but it can also be kind of mind-bending sometimes to figure out how to “close the loop”, particularly if you wanted to get effective re-use, or do lazy computation.

Just looking at the parse_program function we saw before, it was defined to return a complete AST:

fn parse_program(input: &str) -> AST { }

But that AST has, internally, a lot of structure. For example, perhaps an AST looks like a set of functions:

struct Ast {
    functions: Vec<AstFunction>
}

struct AstFunction {
    name: Name,
    body: AstFunctionBody,
}

struct AstFunctionBody {
    ...
}

Under the old Salsa, changes were tracked at a pretty coarse-grained level. So if your input changed, and the content of any function body changed, then your entire AST was considered to have changed. If you were naive about it, this would mean that everything would have to be type-checked again. In order to get good reuse, you had to change the structure of your program pretty dramatically from the “natural structure” that you started with.

Enter: tracked structs

The newer Salsa introduces tracked structs, which makes this a lot easier. The idea is that you can label a struct as tracked, and now its fields become managed by the database:

#[salsa::tracked]
struct AstFunction {
    name: Name,
    body: AstFunctionBody,
}

When a struct is declared as tracked, then we also track accesses to its fields. This means that if the parser produces the same set of functions, then its output is considered not to have changed, even if the function bodies are different. When the type checker reads the function body, we’ll track that read independently. So if just one function has changed, only that function will be type checked again.

Goal: relatively natural

The goal of Salsa 2022 is that you should be able to convert a program to use Salsa without dramatically restructuring it. It should still feel quite similar to the ‘natural structure’ that you would have used if you didn’t care about incremental reuse.

Using techniques like tracked structs, you can keep the pattern of a compiler as a kind of “big function” that passes the input through many phases, while still getting pretty good re-use:

fn typical_compiler(input: &str) -> Result {
    let ast = parse_ast(input);
    for function in &ast.functions {
        type_check(function);
    }
    ...
}

Salsa 2022 also has other nice features, such as accumulators for managing diagnostics and built-in interning.

If you’d like to learn more about how Salsa works, check out the overview page or read through the (WIP) tutorial, which covers the design of a complete compiler and interpreter.

How to get involved

As I mentioned, the purpose of this blog post is to serve as a call for contribution. Salsa is a cool project but it doesn’t have a lot of active maintainers, and we are actively looking to recruit new people.

The Salsa 2022 tracking issue contains a list of possible items to work on. Many of those items have mentoring instructions, just search for things tagged with good first issue. There is also documentation of salsa’s internal structure on the main web page that can help you navigate the code base. Finally, we have a Zulip instance where we hang out and chat (the #good-first-issue stream is a good place to ask for help!)

Mozilla ThunderbirdThunderbird Time Machine: Windows XP + Thunderbird 1.0

Let’s step back into the Thunderbird Time Machine, and transport ourselves back to November 2004. If you were a tech-obsessed geek like me, maybe you were upgrading Windows 98 to Windows XP. Or playing Valve’s legendary shooter Half-Life 2. Maybe you were eagerly installing a pair of newly released open-source software applications called Firefox 1.0 and Thunderbird 1.0…


As we work toward a new era of Thunderbird, we’re also revisiting its roots. Because the entirety of Thunderbird’s releases and corresponding release notes have been preserved, I’ve started a self-guided tour of Thunderbird’s history. Read the first post in this series here:


“Thunderbirds Are GO!”

Before we get into the features of Thunderbird 1.0, I have to call out the endearing credits reel that could be viewed from the “About Thunderbird” menu. You could really feel that spark of creativity and fun from the developers:

(Video: the Thunderbird 1.0 credits reel. Yes, we have a YouTube channel! Subscribe for more.)

Windows XP + 2 New Open-Source Alternatives

Thunderbird 1.0 launched in the prime of Windows XP, and it had a companion for the journey: Firefox 1.0! Though both of these applications had previous versions (with different logos and different names), their official 1.0 releases were milestones. Especially because they were open-source and represented quality alternatives to existing “walled-garden” options.

Thunderbird 1.0 and Firefox 1.0 installers on Windows XP

(Thunderbird was, and always has been, completely free to download and use. But the internet was far less ubiquitous than it is now, so we offered to mail users within the United States a CD-ROM for $5.95.)

Without a doubt, Mozilla Thunderbird is a very good e-mail client. It sends and receives mail, it checks it for spam, handles multiple accounts, imports data from your old e-mail application, scrapes RSS news feeds, and is even cross-platform.

Thunderbird 1.0 Review | Ars Technica

Visually, it prided itself on having a pretty consistent look across Windows, Mac OS X, and Linux distributions like CentOS 3.3 or Red Hat. And the iconography was updated to be more colorful and playful, in a time when skeuomorphic design reigned supreme.

Groundbreaking Features In Thunderbird 1.0

Thunderbird 1.0 launched with a really diverse set of features. It offered add-ons (just like its brother Firefox) to extend functionality. But it also delivered some cutting-edge stuff like:

  • Adaptive junk mail controls
  • RSS integration (this was only 2 months after podcasts first debuted)
  • Effortless migration from Outlook Express and Eudora
  • A Global Inbox that could combine multiple POP3 email accounts
  • Message Grouping (by date, sender, priority, custom labels, and more)
  • Automatic blocking of remote image requests from unknown senders
Thunderbird 1.0 About Page

Feeling Adventurous? Try It For Yourself!

If you have a PowerPC Mac, a 32-bit Linux distribution, or any real or virtualized version of Windows after Windows 98, you can take your own trip down memory lane. All of Thunderbird’s releases are archived here.

Thunderbird is the leading open-source, cross-platform email and calendaring client, free for business and personal use. We want it to stay secure and become even better. Donations allow us to hire developers, pay for infrastructure, expand our userbase, and continue to improve.

Click here to make a donation

The post Thunderbird Time Machine: Windows XP + Thunderbird 1.0 appeared first on The Thunderbird Blog.

Wladimir PalantImpact of extension privileges

As we’ve seen in the previous article, a browser extension isn’t very different from a website. It’s all the same HTML pages and JavaScript code. The code executes in the browser’s regular sandbox. So what can websites possibly gain by exploiting vulnerabilities in a browser extension?

Well, access to extension privileges of course. Browser extensions usually have lots of those, typically explicitly defined in the permissions entry of the extension manifest, but some are granted implicitly. Reason enough to take a closer look at some of these permissions and their potential for abuse.

Extension manifest of some Avast Secure Browser built-in extensions, declaring a huge list of permissions

Note: This article is part of a series on the basics of browser extension security. It’s meant to provide you with some understanding of the field and serve as a reference for my more specific articles. You can browse the extension-security-basics category to see other published articles in this series.

The crown jewels: host-based permissions

Have a look at the permissions entry of your favorite extension’s manifest. Chances are, you will find entries like the following there:

"permissions": [
  "*://*/*"
]

Or:

"permissions": [
  "http://*/*",
  "https://*/*"
]

Or:

"permissions": [
  "<all_urls>"
]

While these three variants aren’t strictly identical, from a security point of view the differences don’t matter: this extension requests access to each and every website on the web.

Making requests

When regular websites use XMLHttpRequest or the fetch API, they are restricted to requesting data from their own website only. Other websites are out of reach by default, unless those websites opt in explicitly by means of CORS.

For browser extensions, host-based permissions remove that obstacle. A browser extension can call fetch("https://gmail.com/") and get a response back. And this means that, as long as you are currently logged into GMail, the extension can download all your emails. It can also send a request instructing GMail to send an email in your name.

It’s similar with your social media accounts and anything else that can be accessed without entering credentials explicitly. You think that your Twitter data is public anyway? But your direct messages are not. And a compromised browser extension can potentially send tweets or direct messages in your name.

The requests can be initiated by any extension page (e.g. the persistent background page). On Firefox, host-based permissions allow content scripts to make arbitrary requests as well. There are no visual clues when an extension performs unexpected requests; if an extension turns malicious, users usually won’t notice.

Watching tab updates

Host-based permissions also unlock “advanced” tabs API functionality. They allow the extension to call tabs.query() and not only get a list of the user’s browser tabs back but also learn which web page (meaning address and title) is loaded.

Not only that, listeners like tabs.onUpdated become way more useful as well. These will be notified whenever a new page loads into a tab.

So a compromised or malicious browser extension has everything necessary to spy on the user. It knows which web pages the user visits, how long they stay there, where they go then and when they switch tabs. This can be misused for creating browsing profiles (word is, these sell well) – or by an abusive ex/employer/government.

Running content scripts

We’ve already seen a content script and some of its potential to manipulate web pages. However, content scripts aren’t necessarily written statically into the extension manifest. Given sufficient host-based permissions, extensions can also load them dynamically by calling tabs.executeScript() or scripting.executeScript().

Both APIs allow executing not merely files contained in the extension as content scripts but also arbitrary code. The former allows passing in JavaScript code as a string, while the latter expects a JavaScript function, which is less prone to injection vulnerabilities. Still, both APIs will wreak havoc if misused.

In addition to the capabilities above, content scripts could for example intercept credentials as these are entered into web pages. Another classic way to abuse them is injecting advertising on each and every website. Adding scam messages to abuse the credibility of news websites is also possible. Finally, they could manipulate banking websites to reroute money transfers.

Implicit privileges

Some extension privileges don’t have to be explicitly declared. One example is the tabs API: its basic functionality is accessible without any privileges whatsoever. Any extension can be notified when you open and close tabs, it merely won’t know which website these tabs correspond with.

Sounds too harmless? The tabs.create() API is somewhat less so. It can be used to create a new tab, essentially the same as window.open() which can be called by any website. Yet while window.open() is subject to the pop-up blocker, tabs.create() isn’t. An extension can create any number of tabs whenever it wants.

If you look through possible tabs.create() parameters, you’ll also notice that its capabilities go way beyond what window.open() is allowed to control. And while Firefox doesn’t allow data: URIs to be used with this API, Chrome has no such protection. Use of such URIs on the top level has been banned due to being abused for phishing.

tabs.update() is very similar to tabs.create() but will modify an existing tab. So a malicious extension can for example arbitrarily load an advertising page into one of your tabs, and it can activate the corresponding tab as well.

Webcam, geolocation and friends

You probably know that websites can request special permissions, e.g. in order to access your webcam (video conferencing tools) or geographical location (maps). These are features with considerable potential for abuse, so users have to confirm each time that they still want to allow this.

Not so with browser extensions. If a browser extension wants access to your webcam or microphone, it only needs to ask for permission once. Typically, an extension will do so immediately after being installed. Once this prompt is accepted, webcam access is possible at any time, even if the user isn’t interacting with the extension at this point. Yes, a user will only accept this prompt if the extension really needs webcam access. But after that they have to trust the extension not to record anything secretly.

With access to your exact geographical location or contents of your clipboard, granting permission explicitly is unnecessary altogether. An extension simply adds geolocation or clipboard to the permissions entry of its manifest. These access privileges are then granted implicitly when the extension is installed. So a malicious or compromised extension with these privileges can create your movement profile or monitor your clipboard for copied passwords without you noticing anything.

Other means of exfiltrating browsing data

Somebody who wants to learn about the user’s browsing behavior, be it an advertiser or an abusive ex, doesn’t necessarily need host-based permissions for that. Adding the history keyword to the permissions entry of the extension manifest grants access to the history API. It allows retrieving the user’s entire browsing history all at once, without waiting for the user to visit these websites again.

The bookmarks permission has similar abuse potential: it allows reading out all bookmarks via the bookmarks API. For people using bookmarks, their bookmark collection and bookmark creation timestamps tell a lot about that user’s preferences.

The storage permission

We’ve already seen our example extension use the storage permission to store a message text. This permission looks harmless enough. The extension storage is merely a key-value collection, very similar to localStorage that any website could use. Sure, letting arbitrary websites access this storage is problematic if some valuable data is stored inside. But what if the extension is only storing some basic settings?

You have to remember that one basic issue of online advertising is reliably recognizing visitors. If you visit site A, advertisers will want to know whether you visited site B before and what you’ve bought there. Historically, this goal has been achieved via the cookies mechanism.

Now cookies aren’t very reliable. Browsers are giving users much control over cookies, and they are increasingly restricting cookie usage altogether. So there is a demand for cookie replacements, which has led to several “supercookie” approaches being designed: various pieces of data related to the user’s system leaked by the browser are thrown together to build a user identifier. We’ve seen this escalate into a cat-and-mouse game between advertisers and browser vendors, the former constantly looking for new identifiers while the latter attempt to restrict user-specific data as much as possible.

Any advertiser using supercookies will be more than happy to throw extension storage into the mix if some browser extension exposes it. It allows storing a persistent user identifier much like cookies do. But unlike with cookies, none of the restrictions imposed by browser vendors will apply here. And the user won’t be able to remove this identifier by any means other than uninstalling the problematic extension.

More privileges

The permissions entry of the extension manifest can grant more privileges. There are too many to cover all of them here; the nativeMessaging permission in particular will be covered in a separate article. MDN provides an overview of the permissions that are currently supported.

Why not restrict extension privileges?

Google’s developer policies explicitly prohibit requesting more privileges than necessary for the extension to function. In my experience this rule in fact works. I can only think of one case where a browser extension requested too many privileges, and this particular extension was being distributed with the browser rather than via some add-on store.

The reason why the majority of popular extensions request a very far-reaching set of privileges is neither malice nor incompetence. It’s rather a simple fact: in order to do something useful you need the privileges to do something useful. Extensions restricted to a handful of websites are rarely interesting enough; they need to make an impact on all of the internet to become popular. Extensions that ask you to upload or type things in manually are inconvenient, so popular extensions request webcam/geolocation/clipboard access to automate the process.

In some cases browsers could do better to limit the abuse potential of extension privileges. For example, Chrome allows screen recording via the tabCapture or desktopCapture APIs. The abuse potential is low because the former can only be started as a response to a user action (typically clicking the extension icon) whereas the latter brings up a prompt to select the application window to be recorded. Both are sufficient to prevent extensions from silently starting to record in the background. Either of these approaches would have worked to limit the abuse potential of webcam access.

However, such security improvements tend to make extensions less flexible and less user-friendly. A good example here is the activeTab permission. Its purpose is to make requesting host privileges for the entire internet unnecessary. Instead, the extension can access the current tab when the extension is explicitly activated, typically by clicking its icon.

That approach works well for some extensions, particularly those where the user needs to explicitly trigger an action. However, it doesn’t work in scenarios where extensions have to perform their work automatically (which is more convenient for the user) or where the extension action cannot be executed immediately and requires preparation. So in my extension survey, I see an almost equal split: 19.5% of extensions use the activeTab permission and 19.1% use host permissions for all of the internet.

But that does not account for the extensions’ popularity. If I only consider the more popular extensions, the ones with 10,000 users and more, things change quite considerably. At 22%, the proportion of extensions using the activeTab permission increases only slightly. Yet a whopping 33.7% of the popular extensions ask for host permissions for each and every website.

The Mozilla BlogAnnouncing Steve Teixeira, Mozilla’s new Chief Product Officer

I am pleased to share that Steve Teixeira has joined Mozilla as our Chief Product Officer. During our search for a Chief Product Officer, Steve stood out to us because of his extensive experience at tech and internet companies, where he played instrumental roles in shaping products across research, design, security, and development, and in getting them out to market.

Steve Teixeira joins Mozilla executive team. Steve was photographed in Redmond, Wash., August 5, 2022. (Photo by Dan DeLong for Mozilla)

As Chief Product Officer, Steve will be responsible for leading our product teams. This will include setting a product vision and strategy that accelerates the growth and impact of our existing products and setting the foundation for new product development.  His product management and technical expertise as well as his leadership experience are the right fit to lead our product teams into Mozilla’s next chapter. 

“There are few opportunities today to build software that is unambiguously good for the world while also being loveable for customers and great for business,” said Teixeira. “I see that potential in Firefox, Pocket, and the rest of the Mozilla product family. I’m also excited about being a part of the evolution of the product family that comes from projecting Mozilla’s evergreen principles through a modern lens to solve some of today’s most vexing challenges for people on the internet.”

Steve comes to us most recently from Twitter, where he spent eight months as a Vice President of Product for their Machine Learning and Data platforms. Prior to that, Steve led Product Management, Design and Research in Facebook’s Infrastructure organization. He also spent almost 14 years at Microsoft where he was responsible for the Windows third-party software ecosystems and held leadership roles in Windows IoT, Visual Studio and the Technical Computing Group. Steve also held a variety of engineering roles at small and medium-sized companies in the Valley in spaces like developer tools, endpoint security, mobile computing, and professional services. 

Steve will report to me and sit on the steering committee.

The post Announcing Steve Teixeira, Mozilla’s new Chief Product Officer appeared first on The Mozilla Blog.

IRL (podcast)AI from Above

An aerial picture can tell a thousand stories. But who gets to tell them? From above the clouds, our world is surveilled and datafied. Those who control the data, control the narratives. We explore the legacy of spatial apartheid in South Africa’s townships, and hear from people around the world who are reclaiming power over their own maps.

Raesetje Sefala is mapping the legacy of spatial apartheid in South Africa as a computer vision researcher with Timnit Gebru’s Distributed AI Research Institute (DAIR).

Astha Kapoor researches how communities and organizations can be ‘stewards’ of data about people and places as co-founder of the Aapti Institute in India.

Michael Running Wolf is the founder of Indigenous in AI. He is working on speech recognition and immersive spatial experiences with augmented and virtual reality in Canada.

Denise McKenzie is a location data expert who works with the global mapping organization PLACE to empower governments and communities to use advanced spatial data.

IRL is an original podcast from Mozilla, the non-profit behind Firefox. In Season 6, host Bridget Todd shares stories of people who make AI more trustworthy in real life. This season doubles as Mozilla’s 2022 Internet Health Report.  Go to the report for show notes, transcripts, and more.

The Rust Programming Language BlogAnnouncing Rust 1.63.0

The Rust team is happy to announce a new version of Rust, 1.63.0. Rust is a programming language empowering everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, you can get 1.63.0 with:

rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.63.0 on GitHub.

If you'd like to help us out by testing future releases, you might consider updating locally to use the beta channel (rustup default beta) or the nightly channel (rustup default nightly). Please report any bugs you might come across!

What's in 1.63.0 stable

Scoped threads

Rust code could launch new threads with std::thread::spawn since 1.0, but this function bounds its closure with 'static. Roughly, this means that threads currently must have ownership of any arguments passed into their closure; you can't pass borrowed data into a thread. In cases where the threads are expected to exit by the end of the function (by being join()'d), this isn't strictly necessary and can require workarounds like placing the data in an Arc.
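To illustrate the pre-1.63 situation, here is a minimal sketch of the Arc workaround mentioned above (the data and the thread body are placeholders chosen for this post, not taken from the release notes):

use std::sync::Arc;
use std::thread;

fn main() {
    // Because `thread::spawn` requires a 'static closure, borrowed data
    // can't be passed in directly; instead it is moved behind an Arc.
    let a = Arc::new(vec![1, 2, 3]);
    let a_for_thread = Arc::clone(&a);

    let handle = thread::spawn(move || {
        // The thread owns its own Arc clone rather than borrowing `a`.
        println!("{:?}", a_for_thread);
    });

    handle.join().unwrap();
    println!("main thread still has access: {:?}", a);
}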

Now, with 1.63.0, the standard library is adding scoped threads, which allow spawning a thread borrowing from the local stack frame. The std::thread::scope API provides the necessary guarantee that any spawned threads will have exited prior to itself returning, which allows for safely borrowing data. Here's an example:

let mut a = vec![1, 2, 3];
let mut x = 0;

std::thread::scope(|s| {
    s.spawn(|| {
        println!("hello from the first scoped thread");
        // We can borrow `a` here.
        dbg!(&a);
    });
    s.spawn(|| {
        println!("hello from the second scoped thread");
        // We can even mutably borrow `x` here,
        // because no other threads are using it.
        x += a[0] + a[2];
    });
    println!("hello from the main thread");
});

// After the scope, we can modify and access our variables again:
a.push(4);
assert_eq!(x, a.len());

Rust ownership for raw file descriptors/handles (I/O Safety)

Previously, Rust code working with platform APIs taking raw file descriptors (on unix-style platforms) or handles (on Windows) would typically work directly with a platform-specific representation of the descriptor (for example, a c_int, or the alias RawFd). For Rust bindings to such native APIs, the type system then failed to encode whether the API would take ownership of the file descriptor (e.g., close) or merely borrow it (e.g., dup).

Now, Rust provides wrapper types such as BorrowedFd and OwnedFd, which are marked as #[repr(transparent)], meaning that extern "C" bindings can directly take these types to encode the ownership semantics. See the stabilized APIs section for the full list of wrapper types stabilized in 1.63; currently, they are available on cfg(unix) platforms, Windows, and WASI.

We recommend that new APIs use these types instead of the previous type aliases (like RawFd).
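As a rough, Unix-only sketch of how these types encode ownership at an API boundary (the helper functions here are made up for illustration; only the std types and conversions are real):

use std::fs::File;
use std::os::unix::io::{AsFd, BorrowedFd, OwnedFd};

// Borrows the descriptor: the caller keeps ownership and the fd stays open.
fn log_fd(fd: BorrowedFd<'_>) {
    println!("operating on {:?}", fd);
}

// Takes ownership: the descriptor is closed when `fd` is dropped.
fn consume_fd(fd: OwnedFd) {
    drop(fd);
}

fn main() -> std::io::Result<()> {
    let file = File::open("/dev/null")?;
    log_fd(file.as_fd());            // `file` is still usable afterwards
    consume_fd(OwnedFd::from(file)); // `file` is moved and eventually closed
    Ok(())
}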

const Mutex, RwLock, Condvar initialization

The Condvar::new, Mutex::new, and RwLock::new functions are now callable in const contexts, which allows avoiding the use of crates like lazy_static for creating global statics with Mutex, RwLock, or Condvar values. This builds on the work in 1.62 to enable thinner and faster mutexes on Linux.
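For example, a global protected by a Mutex can now be declared directly as a static, with no lazy initialization crate required (a minimal sketch; Condvar::new works the same way):

use std::sync::{Mutex, RwLock};

// These constructors are `const` as of 1.63, so they can appear in statics.
static COUNTER: Mutex<u64> = Mutex::new(0);
static SETTINGS: RwLock<Vec<String>> = RwLock::new(Vec::new());

fn main() {
    *COUNTER.lock().unwrap() += 1;
    SETTINGS.write().unwrap().push("verbose".to_string());
    assert_eq!(*COUNTER.lock().unwrap(), 1);
}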

Turbofish for generics in functions with impl Trait

For a function signature like fn foo<T>(value: T, f: impl Copy), it was an error to specify the concrete type of T via turbofish: foo::<u32>(3, 3) would fail with:

error[E0632]: cannot provide explicit generic arguments when `impl Trait` is used in argument position
 --> src/lib.rs:4:11
  |
4 |     foo::<u32>(3, 3);
  |           ^^^ explicit generic argument not allowed
  |
  = note: see issue #83701 <https://github.com/rust-lang/rust/issues/83701> for more information

In 1.63, this restriction is relaxed, and the explicit type of the generic can be specified. However, the impl Trait parameter, despite desugaring to a generic, remains opaque and cannot be specified via turbofish.
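A minimal sketch of what this means in practice (the function has the same shape as the example above):

fn foo<T>(value: T, f: impl Copy) -> T {
    let _ = f;
    value
}

fn main() {
    // As of 1.63, the explicit generic argument for `T` is accepted.
    // The `impl Copy` parameter is still inferred from the call site
    // and cannot be named via turbofish.
    let x = foo::<u32>(3, 3);
    assert_eq!(x, 3);
}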

Non-lexical lifetimes migration complete

As detailed in this blog post, we've fully removed the previous lexical borrow checker from rustc across all editions, fully enabling the new, non-lexical version of the borrow checker. Since the borrow checker doesn't affect the output of rustc, this won't change the behavior of any programs, but it completes a long-running migration (started in the initial stabilization of NLL for the 2018 edition) to deliver the full benefits of the new borrow checker across all editions of Rust. For most users, this change will bring slightly better diagnostics for some borrow checking errors, but will not otherwise impact which code they can write.

You can read more about non-lexical lifetimes in this section of the 2018 edition announcement.

Stabilized APIs

The following methods and trait implementations are now stabilized:

These APIs are now usable in const contexts:

Other changes

There are other changes in the Rust 1.63.0 release. Check out what changed in Rust, Cargo, and Clippy.

Contributors to 1.63.0

Many people came together to create Rust 1.63.0. We couldn't have done it without all of you. Thanks!

The Mozilla BlogWhy I joined Mozilla’s Board of Directors

I first started working with digitalization and the internet when I became CEO of Scandinavia Online in 1998. It was the leading online service in the Nordics and we were pioneers and idealists. I learnt a lot from that experience: the endless opportunities, the tricky business models and the extreme ups and downs in hypes and busts of evaluation. I also remember Mozilla during that time as a beacon of competence and idealism, as well as a champion for the open internet as a force for good.

Kristin Skogen Lund

Since those early days I have worked in the media industry, telecoms and interest organizations. Today I serve as CEO of Schibsted, the leading Nordic-based media company (which initially started Scandinavia Online back in the days). We own and operate around 70 digital consumer brands across media, online marketplaces, financial services, price comparison services and technology ventures. Within the global industry, we were known as one of the few traditional media companies that adapted to the digital world early on by disrupting our business model and gaining a position in the digital landscape early.

I am deeply engaged in public policy and I serve as president of the European Tech Alliance (EUTA), comprising the leading tech companies of Europe. We work to influence and improve the EU’s digital regulation and to ensure an optimal breeding ground for European digital entrepreneurship. This work is essential as our societies depend upon technology being a force for good, something that cannot be taken for granted, nor is it always the case.

I take great honor in serving on the board of Mozilla to help promote its vision and work to diversify and expand to new audiences and services. It is exciting to serve on the board of a US-based company with such strong roots and that has been an inspiration for me these past 25 years.

The process of meeting board members and management has strengthened my impression of a very capable and engaged team. To build on past successes is never easy, but in Mozilla’s case it is all the more important — not just for Mozilla, but for the health of the internet and thus our global community. I look very much forward to being part of, and contributing to, that tremendous endeavor.

The post Why I joined Mozilla’s Board of Directors appeared first on The Mozilla Blog.

Wladimir PalantAnatomy of a basic extension

I am starting an article series explaining the basics of browser extension security. It’s meant to provide you with some understanding of the field and serve as a reference for my more specific articles. You can browse the extension-security-basics category to see other published articles in this series.

Before we go for a deeper dive, let’s get a better understanding of what a browser extension actually is. We’ll take a look at a simple example extension and the different contexts in which its code runs.

Browser extensions? What kind of browser extensions?

Browser extensions were introduced to the general public by Mozilla more than two decades ago. Seeing their success, other browser vendors developed their own extension models. However, Google Chrome becoming the prevalent browser eventually caused all the other extension models to go extinct. At this point in time, only Chrome-compatible extensions are still relevant.

These extensions are supported by Google Chrome itself and other Chromium-based browsers such as Microsoft Edge, Opera or Vivaldi. The Mozilla Firefox browser uses an independent implementation of the extension APIs, the only one as far as I am aware. While mostly compatible, Mozilla’s extension APIs have been improved in some areas. Some of these improvements have a security impact, but with extension development centered on Google Chrome these days, I doubt that many extension developers are aware of them.

Another interesting aspect is that Firefox for Android also supports extensions, unlike the mobile versions of Google Chrome. This is merely of theoretical importance however, as only add-ons from a very short list can be installed. Two years ago I voiced my concern about this restrictive approach, yet as of today this list still contains only ten browser extensions.

The example extension

So our example extension is going to be a Chrome-compatible one. I’ll discuss the files one by one but you can download the entire source code to play around with here. Unpack this ZIP file to some directory.

All browsers support trying out extensions by loading them from a directory. In Chromium-based browsers you go to chrome://extensions/, enable developer mode and use the “Load unpacked” button. In Firefox you go to about:debugging#/runtime/this-firefox and click the “Load Temporary Add-on” button.

This extension uses questionable approaches on purpose. It has several potential security issues, though none of them are currently exploitable. Small changes to the extension’s functionality will change that, and I’ll introduce these changes in future articles.

The extension manifest

The central piece of an extension is its manifest, a file named manifest.json. Ours looks like this:

{
  "manifest_version": 2,
  "name": "My extension",
  "version": "1.0",
  "permissions": [
    "storage"
  ],
  "content_scripts": [
    {
      "js": [
        "script.js"
      ],
      "matches": [
        "https://example.com/*",
        "https://www.example.com/*"
      ]
    }
  ],
  "background": {
    "scripts": [
      "background.js"
    ]
  },
  "options_ui": {
    "page": "options.html"
  }
}

We use manifest version 2 here. Eventually manifest version 3 is supposed to replace it completely. Yet in my current survey of extension manifests only 16% of all extensions used the newer version.

This manifest declares that the extension requires the storage permission. This means that it can use the storage API to store its data persistently. Unlike cookies or localStorage APIs which give users some level of control, extension storage can normally only be cleared by uninstalling the extension.

It also declares that this extension contains the content script script.js, an options page options.html and a background script background.js. We’ll take a look at all of these next.

The content script

Content scripts are loaded whenever the user navigates to a matching page, in our case any page matching the https://example.com/* expression. They execute like the page’s own scripts and have arbitrary access to the page’s Document Object Model (DOM). Our content script uses that access to display a notification message:

chrome.storage.local.get("message", result =>
{
  let div = document.createElement("div");
  div.innerHTML = result.message + " <button>Explain</button>";
  div.querySelector("button").addEventListener("click", () =>
  {
    chrome.runtime.sendMessage("explain");
  });
  document.body.appendChild(div);
});

This uses the storage API to retrieve the message value from the extension’s storage. That message is then added to the page along with a button labeled “Explain”. This is what it looks like on example.com:

Usual website text starting with “Example Domain.” Below it a block saying “Hi there!” followed by a button titled “Explain.”

What happens when this button is clicked? The content script uses runtime.sendMessage() API to send a message to the extension pages. That’s because a content script only has direct access to a handful of APIs such as storage. Everything else has to be done by extension pages that content scripts can send messages to.

The content script capabilities differ slightly depending on the browser. For Chromium-based browsers you can find the list in the Chrome Developers documentation; for Firefox, MDN is the ultimate source.

The background page

Usually, when content scripts send a message, its destination is the background page. The background page is a special page that is always present unless specified otherwise in the extension manifest. It is invisible to the user, despite being a regular page with its own DOM and everything. Its function is typically coordinating all other parts of the extension.

Wait, our extension manifest doesn’t even define a background page! There is only a background script. How does that work?

There is still a background page. It is called _generated_background_page.html and contains the following code:

<!DOCTYPE html>
<body>
<script src="background.js"></script>

If a background page isn’t declared explicitly, the browser will helpfully generate one automatically and make sure all the declared background scripts are loaded into it.

And here is our background script:

chrome.runtime.onMessage.addListener((request, sender, sendResponse) =>
{
  if (request == "explain")
  {
    chrome.tabs.create({ url: "https://example.net/explanation" });
  }
})

It uses the runtime.onMessage API to listen to messages. When an "explain" message is received, it uses the tabs API to open a page in a new tab.

The options page

But how did that message text get into extension storage? Probably via an options page that allows configuring the extension.

Browser extensions can contain various kinds of pages. There are for example action pages that are displayed in a drop-down when the extension icon is clicked. Or pages that the extension will load in a new tab. Unlike the background page, these pages aren’t persistent but rather load when needed. Yet all of them can receive messages from content scripts. And all of them have full access to extension-specific APIs, as far as the extension’s permissions allow.

Our extension manifest declares an options page. This page is displayed on top of the extension details, provided the user manages to find the button to open it, which browser vendors hide rather well:

Extension details displayed by the browser: permissions, source, and also “Extension options” button. On top of that page a modal dialog is displayed with the title “My extension”. Inside it the text “Please enter a message” followed by a text box. The value “Hi there!” is filled into the text box.

It’s a regular HTML page, nothing special about it:

<html>

<head>
  <script src="options.js"></script>
</head>

<body>
  Please enter a message:
  <input id="message" style="width: 100%;">
</body>

</html>

The script loaded by this page makes sure that any changes to the message field are immediately saved to the extension storage where our content script will retrieve them:

function init()
{
  let element = document.getElementById("message");
  chrome.storage.local.get("message", result => element.value = result.message);
  element.addEventListener("input", () =>
  {
    chrome.storage.local.set({ message: element.value });
  });
}

window.addEventListener("load", init, { once: true });

The relevant contexts

Altogether the relevant contexts for browser extensions look like this:

Browser extension consists of two sections: extension pages (background page, action page, options page) and content scripts (example.com content script, example.net content script, example.info content script). Content scripts interact with extension pages and with websites: example.com content script with example.com website, example.net content script with example.net website, example.info content script with example.info website. Extension pages interact with browser’s extension APIs, connected websites and desktop applications.

In this article we’ve already seen content scripts that can access websites directly but are barred from accessing most of the extension APIs. Instead, content scripts will communicate with extension pages.

The extension pages are way more powerful than content scripts and have full access to extension APIs. They usually won’t communicate with websites directly however, instead relying on content scripts for website access.

Desktop applications and connected websites are out of scope for this article. We’ll take a thorough look at them later.

Mozilla ThunderbirdHow You Can Contribute To Thunderbird Without Knowing How To Code

Thunderbird and K-9 Mail are both open-source software projects. That means anyone can contribute to them, improve them, and make them better products. But how does one contribute? You must need some programming skills, right? No! Do you want to learn how to help make a big difference in the global Thunderbird community, without knowing a single line of code? We have a few ideas to share.

Our friend Dustin Krysak, a contributor to Ubuntu Budgie and a customer solutions engineer at Sysdig, brilliantly compares a software project to a construction job:

“Programming is the carpenter, but you still need the architects, designers, project managers, and permit people to make a construction job come together.”

Dustin Krysak

Similarly, making an open-source project like Thunderbird requires an entire community! We need the talents of programmers, graphic designers, translators, writers, financial supporters, enthusiastic fans, quality assurance helpers, bug hunters, & beta testers.

Even if you just have an idea to share, you can make a difference!

No matter what your skill set is, you can absolutely help make Thunderbird better than ever. Here are a few ideas to get you started.


Join The Support Crew

Are you an experienced Thunderbird user who knows the software inside and out? Maybe you want to pay it forward and volunteer some time to help new users! We even have a private discussion group for our support crew to help each other, so they can better support Thunderbird users.

To get involved: https://wiki.mozilla.org/Thunderbird/tb-support-crew

Testing

Want to help improve Thunderbird by simply using it? Testing is a great way to contribute and requires no prior experience! Help us catch those bugs before they get loose!

Here’s all the info you need to get started with testing: https://wiki.mozilla.org/Thunderbird:Testing

Let’s Go Bug Hunting

Speaking of bugs, capturing and reporting Thunderbird bugs is really important. It’s invaluable! And you’ll enjoy the satisfaction of helping MILLIONS of other users avoid that bug in the future!

We use Mozilla’s Bugzilla, a very powerful tool: https://bugzilla.mozilla.org/

Translate This

We want the entire world to use Thunderbird, which is why it’s currently available in more than 60 languages. If you’re a wordsmith who understands multiple languages, you can aid in our ongoing efforts to translate various Thunderbird web pages.

Join 100 other contributors who help translate Thunderbird websites! https://pontoon.mozilla.org/projects/thunderbirdnet/

Document All The Things

Know what else is important? Documentation! From beginner tutorials to technical guides, there’s always a need for helpful information to be written down and easily found.

There are many ways you can contribute to Thunderbird documentation. Start here: https://www.thunderbird.net/en-US/get-involved/#documentation

Financial Support

Financial contributions are another way to help. Thunderbird is both free and freedom respecting, but we’re also completely funded by donations!

Year after year, the generosity of our donors makes it possible for Thunderbird to thrive. You can contribute a one-time or recurring monthly donation at give.thunderbird.net.

Sharing Is Caring

Do you share our tweets, Fediverse posts, or Facebook messages? Do you tell your friends and colleagues about Thunderbird? Then yes, you are contributing!

Word of mouth and community promotions are yet another key ingredient to the success of Thunderbird, and any open source project.

Enthusiasm is contagious. Keep sharing ❤

Coding

If you DO have coding skills there are so many ways to help! One of those ways is adding new functionality and designs to Thunderbird with Extensions and Themes!

Here’s what you need to know about making add-ons for Thunderbird: https://developer.thunderbird.net/add-ons/about-add-ons

Last but certainly not least, we’re always looking for contributions to Thunderbird itself from the talented FOSS developer community!

Our Developer Hub has everything you need to start hacking away on Thunderbird: https://developer.thunderbird.net

K-9 Mail and Thunderbird Mobile

Finally, there’s the newest member of the Thunderbird family, K-9 Mail. As we work towards bringing Thunderbird to Android, contributions are encouraged! Find out how you can help the K-9 Mail team here: https://k9mail.app/contribute

We can’t wait to see the many important ways you’ll contribute to Thunderbird. If you have questions about it, leave a comment on this post or ask us on social media.

The post How You Can Contribute To Thunderbird Without Knowing How To Code appeared first on The Thunderbird Blog.

The Rust Programming Language BlogNon-lexical lifetimes (NLL) fully stable

As of Rust 1.63 (releasing next week), the "non-lexical lifetimes" (NLL) work will be enabled by default. NLL is the second iteration of Rust's borrow checker. The RFC actually does quite a nice job of highlighting some of the motivating examples. "But," I hear you saying, "wasn't NLL included in Rust 2018?" And yes, yes it was! But at that time, NLL was only enabled for Rust 2018 code, while Rust 2015 code ran in "migration mode". When in "migration mode," the compiler would run both the old and the new borrow checker and compare the results. This way, we could give warnings for older code that should never have compiled in the first place; we could also limit the impact of any bugs in the new code. Over time, we have limited migration mode to be closer and closer to just running the new-style borrow checker: in the next release, that process completes, and all Rust code will be checked with NLL.

How does removing the old borrow checker affect users?

At this point, we have almost completely merged "migration mode" and "regular mode", so switching to NLL will have very little impact on the user experience. A number of diagnostics changed, mostly for the better -- Jack Huey gives the full details in his blog post.

Credit where credit is due

The work to remove the old borrow checker has been going on for years. It's been a long, tedious, and largely thankless process. We'd like to take a moment to highlight the various people involved and make sure they are recognized for their hard work:

Jack's blog post includes a detailed narrative of all the work involved if you'd like more details! It's a fun read.

Looking forward: what can we expect for the "borrow checker of the future"?

The next frontier for Rust borrow checking is taking the polonius project and moving it from research experiment to production code. Polonius is a next-generation version of the borrow checker that was "spun off" from the main NLL effort in 2018, as we were getting NLL ready to ship in production. Its most important contribution is fixing a known limitation of the borrow checker, demonstrated by the following example:

fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    if let Some(s) = vec.last() { // borrows vec
        // returning s here forces vec to be borrowed
        // for the rest of the function, even though it
        // shouldn't have to be
        return s; 
    }
    
    // Because vec is borrowed, this call to vec.push gives
    // an error!
    vec.push("".to_string()); // ERROR
    vec.last().unwrap()
}

This example doesn't compile today (try it for yourself), though there's not a good reason for that. You can often work around the problem by editing the code to introduce a redundant let (as shown in this example), but with polonius, it will compile as is. If you'd like to learn more about how polonius (and the existing borrow checker) works1, you can watch my talk from Rust Belt Rust.
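For reference, here is one common way to restructure the function so that it compiles under today's borrow checker; this is a sketch of the general workaround pattern, not necessarily the exact code behind the link above:

fn last_or_push<'a>(vec: &'a mut Vec<String>) -> &'a String {
    // Check with a short-lived borrow first, so that the long-lived
    // borrow is only created on the branch that immediately returns.
    if !vec.is_empty() {
        return vec.last().unwrap();
    }

    vec.push("".to_string());
    vec.last().unwrap()
}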

  1. Or where the name "polonius" comes from!

The Mozilla BlogA back-to-school checklist for online safety

The first day of school is right around the corner. Whether that brings some relief, gives you jitters or both, we’re here to support families with one major thing: internet safety.  

For parents, thinking about the dangers of the web can be scary. But it doesn’t have to be. While the internet isn’t perfect, it’s also a wonderful place for learning and connecting with others. Here’s what families can do to make the best of it while staying safe this school year. 

An illustration shows a digital pop-up box that reads: A back-to-school checklist for online safety: Set up new passwords. Check your devices' privacy settings. Protect your child's browsing information. Discuss parental controls with the whole family. Have the "tech talk." (Credit: Nick Velazquez / Mozilla)

1. Set up new passwords

Back-to-school season is a good time to update passwords, since students often log in to the same learning tools they use at home and on campus. It’s important to teach kids the basics of password hygiene, including keeping passwords in a safe place and regularly changing them. 

2. Check your devices’ privacy settings 

Whether you have a preschooler who uses the family tablet to watch videos or a kid who’s finally ready for a phone, make sure to set up these devices with data privacy in mind. Figure out – together, if possible – which information they’re sharing with the apps they use. 

Have a school-issued device? Take the time to look into the settings, and don’t be afraid to ask teachers and school administrators about how the tools and software used in classrooms are handling students’ data.

3. Protect your child’s browsing information

An investigation by The Markup, in collaboration with Mozilla Rally, exposed how federal financial aid applications automatically sent students’ personal information to Facebook – even if a student didn’t have a Facebook account. It’s just one example of how invasive big tech’s data tracking has become. One way to cut the amount of information companies are collecting about your kid is by protecting their internet browsing data. 

Firefox has Total Cookie Protection on by default for all users. That means that when your child visits a website, cookies (which store bits of information a page remembers about them) stay within that website and out of the hands of companies that want to track their online behavior and target them with ads. 

How to make Firefox the default browser on a desktop computer:

  • If you haven’t already, download Firefox and open the app. 
  • In the menu bar at the top of the screen, click on Firefox > Preferences.
  • In the general panel, click on the Make Default button.

How to make Firefox the default browser on mobile:

  • Download Firefox.
  • On an iOS device, go to settings, scroll down and click on Firefox > Default Browser App > Firefox
  • On an Android, open the app. Click on the menu button next to the address bar > Settings > Set as default browser > Firefox for Android > Set as default.

Find more information about setting Firefox as the default browser on iOS and Android here.

Make Firefox your default browser on mobile.

4. Discuss parental controls with the whole family

Relying on parental control settings to limit kids’ screen time and block websites may be tempting. But no tool can completely protect kids online. One thing that researchers and advocates agree on when it comes to technology: open communication. Parents should talk to their children about whether or not they need to use parental controls and why. They should also figure out a plan to ease restrictions as kids learn how to manage themselves online. 

Ready for that conversation? Here are some Firefox extensions to consider with your family: 

  • Unhook
    Specific to YouTube, Unhook strips away a lot of the distracting “rabbit hole” elements of the site, including suggested videos, trending content and comments. 
  • Tomato Clock
    Based on a renowned time management method (Pomodoro technique), this extension helps a user focus on the computer by breaking up work intervals into defined “tomato” bursts. While this productivity extension could benefit anyone, parents might find it useful for helping kids stay focused during online school time.
  • Block Site
    Try this add-on if your family has agreed to implement restrictions on specific websites. With its password control feature, not only can parents continue to visit these websites, but they can also leave custom display messages if their kid tries to access a restricted site (“Busted! Shouldn’t you be doing homework?”) as well as redirect from one site to another (e.g. Roblox.com to a public library website).

If you’re new to extensions, you can learn more here

5. Have the “tech talk”

Of course, besides weak passwords and school work distractions, there’s plenty of age-appropriate topics that parents may want to talk to their children about. “It helps to talk about values first, then think through together – in developmentally appropriate ways, based on a child’s life stage – how to put those values into practice,” said Leah A. Plunkett, a Harvard Law School lecturer who teaches a course on youth and digital citizenship.

Another idea: Consider putting what your family has agreed upon on paper and have everyone sign it. Or, use a template like Common Sense Media‘s or this one, which also list items that parents can agree to do, like recognizing the role media plays in their kids’ lives, even if they don’t fully understand it.

Like with any other aspect of parenting, providing kids safe and healthy experiences online is more complicated than it seems. The process won’t be perfect, but learning together – with help from trusted sources – can go a long way. 


The internet is a great place for families. It gives us new opportunities to discover the world, connect with others and just generally make our lives easier and more colorful. But it also comes with new challenges and complications for the people raising the next generations. Mozilla wants to help families make the best online decisions, whatever that looks like, with our latest series, The Tech Talk.

An illustration reads: The Tech Talk

Talk to your kids about online safety

Get tips

The post A back-to-school checklist for online safety appeared first on The Mozilla Blog.

Firefox Add-on ReviewsTranslate the web easily with a browser extension

Do you do a lot of language translating on the web? Are you constantly copying text from one browser tab and navigating to another to paste it? Maybe you like to compare translations from different services like Google Translate or Bing Translate? Need easy access to text-to-speech features? 

Online translation services provide a hugely valuable function, but for those of us who do a lot of translating on the web, the process is time-consuming and cumbersome. With the right browser extension, however, web translations become a whole lot easier and faster. Here are some fantastic translation extensions for folks with differing needs…

I just want a simple, efficient way to translate. I don’t need fancy features.

Simple Translate

It doesn’t get much simpler than this. Highlight the text you want to translate and click the extension’s toolbar icon to activate a streamlined pop-up. Your highlighted text automatically appears in the pop-up’s translation field and a drop-down menu lets you easily select your target language. Simple Translate also features a handy “Translate this page” button should you want that. 

Translate Web Pages

Maybe you just need to translate full web pages, like reading news articles in other languages, how-to guides, or job related sites. If so, Translate Web Pages could be the ideal solution for you with its sharp focus on full-page utility. 

However the extension also benefits from a few intriguing additional features, like the ability to select up to three top languages you most commonly translate into (each one easily accessible with a single click in the pop-up menu), designate specific sites to always translate for you upon arrival, and your choice of three translation engines: Google, Yandex, and DeepL. 

To Google Translate

Very popular, very simple translation extension that exclusively uses Google’s translation services, including text-to-speech. 

Simply highlight any text on a web page and right-click to pull up a To Google Translate context menu that allows three actions: 1) translate into your preferred language; 2) listen to audio of the text; 3) translate the entire page.

Right-click any highlighted text to activate To Google Translate.

Privacy is a priority. I’m uncomfortable sending my translations to a cloud.

Mozilla’s very own Firefox Translations is unique among its peers in that all translations occur locally in the browser instead of accessing translation data across the web or in a cloud. This approach is more private because the contents of your translations never leave your machine.

Firefox Translations is still relatively young in its development cycle and new languages are being added all the time.

I do a ton of translating. I need power features to save me time and trouble.

ImTranslator

Striking a balance between out-of-the-box ease and deep customization potential, ImTranslator leverages three top translation engines (Google, Bing, Translator) to cover 100+ languages; the extension itself is even available in nearly two-dozen languages. 

Other strong features include text-to-speech, dictionary and spell check in eight languages, hotkey customization, and a huge array of ways to tweak the look of ImTranslator’s interface—from light and dark themes to font size and more. 

Mate Translate

A slick, intuitive extension that performs all the basic translation functions very well, but it’s Mate Translate’s paid tier that unlocks some unique features, such as Sync (saved translations can appear across devices and browsers, including iPhones and Mac). 

There’s also a neat Phrasebook feature, which lets you build custom word and phrase lists so you can return to common translations you frequently need. It works offline, too, so it’s ideal for travellers who need quick reference to common foreign phrases. 

These are some of our favorites, but there are plenty more translation extensions to explore on addons.mozilla.org.

Manish GoregaokarSo Zero It's ... Negative? (Zero-Copy #3)

This is part 3 of a three-part series on interesting abstractions for zero-copy deserialization I’ve been working on over the last year. This part is about eliminating the deserialization step entirely. Part 1 is about making it more pleasant to work with and can be found here; while Part 2 is about making it work for more types and can be found here. The posts can be read in any order, though only the first post contains an explanation of what zero-copy deserialization is.

And when Alexander saw the breadth of his work, he wept. For there were no more copies left to zero.

—Hans Gruber, after designing three increasingly unhinged zero-copy crates

Part 1 of this series attempted to answer the question “how can we make zero-copy deserialization pleasant”, while part 2 answered “how do we make zero-copy deserialization more useful?”.

This part goes one step further and asks “what if we could avoid deserialization altogether?”.

Speech bubble for character Confused pion
Wait, what?

Bear with me.

As mentioned in the previous posts, internationalization libraries like ICU4X need to be able to load and manage a lot of internationalization data. ICU4X in particular wants this part of the process to be as flexible and efficient as possible. The focus on efficiency is why we use zero-copy deserialization for basically everything, whereas the focus on flexibility has led to a robust and pluggable data loading infrastructure that allows you to mix and match data sources.

Deserialization is a great way to load data since it’s in and of itself quite flexible! You can put your data in a neat little package and load it off the filesystem! Or send it over the network! It’s even better when you have efficient techniques like zero-copy deserialization because the cost is low.

But the thing is, there is still a cost. Even with zero-copy deserialization, you have to validate the data you receive. It’s often a cost folks are happy to pay, but that’s not always the case.

For example, you might be, say, a web browser interested in using ICU4X, and you really care about startup times. Browsers typically need to set up a lot of stuff when being started up (and when opening a new tab!), and every millisecond counts when it comes to giving the user a smooth experience. Browsers also typically ship with most of the internationalization data they need already. Spending precious time deserializing data that you shipped with is suboptimal.

What would be ideal would be something that works like this:

static DATA: &Data = &serde_json::deserialize!(include_bytes!("./testdata.json"));

where you can have stuff get deserialized at compile time and loaded into a static. Unfortunately, Rust const support is not at the stage where the above code is possible whilst working within serde’s generic framework, though it might be in a year or so.

You could write a very unsafe version of serde::Deserialize that operates on fully trusted data and uses some data format that is easy to zero-copy deserialize whilst avoiding any kind of validation. However, this would still have some cost: you still have to scan the data to reconstruct the full deserialized output. More importantly, it would require a parallel universe of unsafe serde-like traits that everyone has to derive or implement, where even small bugs in manual implementations would likely cause memory corruption.

Speech bubble for character Positive pion
Sounds like you need some format that needs no validation or scanning to zero-copy deserialize, and can be produced safely. But that doesn’t exist, does it?

It does.

… but you’re not going to like where I’m going with this.

Speech bubble for character Positive pion
Oh no.

There is such a format: Rust code. Specifically, Rust code in statics. When compiled, Rust statics are basically “free” to load, beyond the typical costs involved in paging in memory. The Rust compiler trusts itself to be good at codegen, so it doesn’t need validation when loading a compiled static from memory. There is the possibility of codegen bugs, however we have to trust the compiler about that for the rest of our program anyway!

This is even more “zero” than “zero-copy deserialization”! Regular “zero-copy deserialization” still involves a scanning step and potentially a validation step; it’s really more about “zero allocations” than about actually avoiding all of the copies. On the other hand, there are truly no copies or any other work going on when you load Rust statics; the data is already ready to go as a &'static reference!

We just have to figure out a way to “serialize to const Rust code” such that the resultant Rust code could just be compiled in to the binary, and people who need to load trusted data into ICU4X can load it for free!
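
Concretely, the kind of output we’re talking about is just plain old Rust statics. Here’s an illustrative sketch; the type and values are made up for this post, not actual ICU4X data:

// A made-up data type, purely for illustration
pub struct DecimalSymbols<'a> {
    pub decimal_separator: &'a str,
    pub grouping_separator: &'a str,
}

// This static costs essentially nothing to "load" at runtime: it is compiled
// straight into the binary's read-only data.
pub static EN_SYMBOLS: DecimalSymbols<'static> = DecimalSymbols {
    decimal_separator: ".",
    grouping_separator: ",",
};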

Speech bubble for character Confused pion
What does “const code” mean in this context?

In Rust, const code is essentially code that can be proven to be side-effect-free; it’s the only kind of code allowed in statics, consts, and const fns.

Speech bubble for character Confused pion
I see! Does this code actually have to be “constant”?

Not quite! Rust supports mutation and even loops in const code! Ultimately, it has to be the kind of code that can be computed at compile time with no difference of behavior: so no reading from files or the network, or using random numbers.

For a long time only very simple code was allowed in const, but over the last year the scope of what that environment can do has expanded greatly, and it’s actually possible to do complicated things here, which is precisely what enables us to actually do “serialize to Rust code” in a reasonable way.
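
For example, something like the following is perfectly valid const code today (a simple illustration, not ICU4X code):

// Non-trivial const evaluation: mutation and loops are fine, as long as
// everything can be computed at compile time.
const fn sum_of_squares(n: u32) -> u32 {
    let mut total = 0;
    let mut i = 1;
    while i <= n {
        total += i * i;
        i += 1;
    }
    total
}

static SUM: u32 = sum_of_squares(10); // evaluated entirely at compile time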

databake

A lot of the design here can also be found in the design doc. While I did the bulk of the design for this crate, it was almost completely implemented by Robert, who also worked on integrating it into ICU4X, and cleaned up the design in the process.

Enter databake (née crabbake). databake is a crate that provides just this: the ability to serialize your types to const code that can then be used in statics, allowing for truly zero-cost data loading with no deserialization necessary!

The core entry point to databake is the Bake trait:

pub trait Bake {
    fn bake(&self, ctx: &CrateEnv) -> TokenStream;
}

A TokenStream is the type typically used in Rust procedural macros to represent a snippet of Rust code. The Bake trait allows you to take an instance of a type, and convert it to Rust code that represents the same value.

The CrateEnv object is used to track which crates are needed, so that it is possible for tools generating this code to let the user know which direct dependencies are needed.

This trait is augmented by a #[derive(Bake)] custom derive that can be used to apply it to most types automatically:

// inside crate `bar`, module `module.rs`

use databake::Bake;

#[derive(Bake)]
#[databake(path = bar::module)]
pub struct Person<'a> {
   pub name: &'a str,
   pub age: u32,
}

As with most custom derives, this only works on structs and enums that contain other types that already implement Bake. Most types not involving mandatory allocation should be able to implement it.
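
For types that can’t use the derive, a hand-written impl is conceptually simple. Here’s a rough sketch for a hypothetical wrapper type, assuming databake, quote, and proc_macro2 as dependencies; Meters and my_crate are made-up names, and the exact way you’d register dependencies with the CrateEnv is glossed over:

use databake::{Bake, CrateEnv};
use proc_macro2::TokenStream;
use quote::quote;

pub struct Meters(pub u32);

impl Bake for Meters {
    fn bake(&self, ctx: &CrateEnv) -> TokenStream {
        // A real impl would also register its crate name with `ctx` here, so
        // that code generators can report it as a required dependency.
        let value = self.0.bake(ctx);
        quote! { my_crate::Meters(#value) }
    }
}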

How to use it

databake itself doesn’t really prescribe any particular code generation strategy. It can be used in a proc macro, in a build.rs, or even in a separate binary. ICU4X does the latter, since that’s just how ICU4X’s data generation model works: clients can use the binary to customize the format and contents of the data they need.

So a typical way of using this crate might be to do something like this in build.rs:

use some_dep::Data;
use databake::Bake;
use quote::quote;
use std::{env, fs, path::Path};

fn main() {
   // load data from file
   let json_data = include_str!("data.json");

   // deserialize from json
   let my_data: Data = serde_json::from_str(json_data).expect("failed to parse data.json");

   // get a token tree out of it
   // (CrateEnv tracks which crates the generated code will reference)
   let ctx = databake::CrateEnv::default();
   let baked = my_data.bake(&ctx);


   // Construct rust code with this in a static
   // The quote macro is used by procedural macros to do easy codegen,
   // but it's useful in build scripts as well.
   let my_data_rs = quote! {
      use some_dep::Data;
      static MY_DATA: Data = #baked;
   };

   // Write to file
   let out_dir = env::var_os("OUT_DIR").unwrap();
   let dest_path = Path::new(&out_dir).join("data.rs");
   fs::write(
      &dest_path,
      &my_data_rs.to_string()
   ).unwrap();

   // (Optional step omitted: run rustfmt on the file)

   // tell Cargo that we depend on this file
   println!("cargo:rerun-if-changed=data.json");
}
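
The generated file can then be pulled into the crate with the usual include! pattern for build-script output; a minimal sketch of the consuming side, assuming the build.rs above:

// e.g. in src/lib.rs
include!(concat!(env!("OUT_DIR"), "/data.rs"));

pub fn data() -> &'static Data {
    &MY_DATA
}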

What it looks like

ICU4X generates all of its test data into JSON, postcard, and “baked” formats. For example, for this JSON data representing how a particular locale does numbers, the “baked” data looks like this. That’s a rather simple data type, but we do use this for more complex data like date time symbol data, which is unfortunately too big for GitHub to render normally.

ICU4X’s code for generating this is in this file. It’s complicated primarily because ICU4X’s data generation pipeline is super configurable and complicated. The core thing it does is, for each piece of data, call tokenize(), which is a thin wrapper around calling .bake() on the data (plus some other bookkeeping). It then takes all of the data and organizes it into files like those linked above, populated with a static for each piece of data. In our case, we include all this generated Rust code in our “testdata” crate as a module, but there are many possibilities here!

For our “test” data, which is currently 2.7 MB in the postcard format (which is optimized for being lightweight), the same data ends up being 11 MB of JSON, and 18 MB of generated Rust code! That’s … a lot of Rust code, and tools like rust-analyzer struggle to load it. It’s of course much smaller once compiled into the binary, though that’s much harder to measure, because Rust is quite aggressive at optimizing unused data out in the baked version (where it has ample opportunity to). From various unscientific tests, it seems like 2MB of deduplicated postcard data corresponds to roughly 500KB of deduplicated baked data. This makes sense, since one can expect baked data to be near the theoretical limit of how small the data is without applying some heavy compression. Furthermore, while we deduplicate baked data at a per-locale level, it can take advantage of LLVM’s ability to deduplicate statics further, so if, for example, two different locales have mostly the same data for a given data key1 with some differences, LLVM may be able to use the same statics for sub-data.

Limitations

const support in Rust still has a ways to go. For example, it doesn’t yet support creating objects like Strings which are usually on the heap, though they are working on allowing this. This isn’t a huge problem for us; all of our data already supports zero-copy deserialization, which means that for every instance of our data types, there is some way to represent it as a borrow from another static.

A more pesky limitation is that you can’t interact with traits in const environments. To some extent, were that possible, the purpose of this crate could also have been fulfilled by making the serde pipeline const-friendly2, and then the code snippet from the beginning of this post would work:

static DATA: &Data = &serde_json::deserialize!(include_bytes!("./testdata.json"));

This means that for things like ZeroVec (see part 2), we can’t actually just make their safe constructors const and pass in data to be validated — the validation code is all behind traits — so we have to unsafely construct them. This is somewhat unfortunate; however, if the zerovec byte representation had trouble roundtripping we would ultimately have larger problems, so it’s not an introduction of a new surface of unsafety. We’re still able to validate things when generating the baked data, we just can’t get the compiler to also re-validate before agreeing to compile the const code.

Try it out!

databake is much less mature than yoke and zerovec, but it does seem to work rather well so far. Try it out! Let me know what you think!

Thanks to Finch, Jane, Shane, and Robert for reviewing drafts of this post

  1. In ICU4X, a “data key” can be used to talk about a specific type of data, for example the decimal symbols data has a decimal/symbols@1 data key. 

  2. Mind you, this would not be an easy task, but it would likely integrate with the ecosystem really well. 

Manish GoregaokarZero-Copy All the Things! (Zero-Copy #2)

This is part 2 of a three-part series on interesting abstractions for zero-copy deserialization I’ve been working on over the last year. This part is about making zero-copy deserialization work for more types. Part 1 is about making it more pleasant to work with and can be found here; while Part 3 is about eliminating the deserialization step entirely and can be found here. The posts can be read in any order, though only the first post contains an explanation of what zero-copy deserialization is.

Background

This section is the same as in the last article and can be skipped if you’ve read it

For the past year and a half I’ve been working full time on ICU4X, a new internationalization library in Rust being built under the Unicode Consortium as a collaboration between various companies.

There’s a lot I can say about ICU4X, but to focus on one core value proposition: we want it to be modular both in data and code. We want ICU4X to be usable on embedded platforms, where memory is at a premium. We want applications constrained by download size to be able to support all languages rather than pick a couple popular ones because they cannot afford to bundle in all that data. As a part of this, we want loading data to be fast and pluggable. Users should be able to design their own data loading strategies for their individual use cases.

See, a key part of performing correct internationalization is the data. Different locales1 do things differently, and all of the information on this needs to go somewhere, preferably not code. You need data on how a particular locale formats dates2, or how plurals work in a particular language, or how to accurately segment languages like Thai which are typically not written with spaces so that you can insert linebreaks in appropriate positions.

Given the focus on data, a very attractive option for us is zero-copy deserialization. In the process of trying to do zero-copy deserialization well, we’ve built some cool new libraries, this article is about one of them.

What can you zero-copy?

Speech bubble for character Positive pion
If you’re unfamiliar with zero-copy deserialization, check out the explanation in the previous article!

In the previous article we explored how zero-copy deserialization could be made more pleasant to work with by erasing the lifetimes. In essence, we were expanding our capabilities on what you can do with zero-copy data.

This article is about expanding our capabilities on what we can make zero-copy data.

We previously saw this struct:

#[derive(Serialize, Deserialize)]
struct Person {
    // this field is nearly free to construct
    age: u8,
    // constructing this will involve a small allocation and copy
    name: String,
    // this may take a while
    rust_files_written: Vec<String>,
}

and made the name field zero-copy by replacing it with a Cow<'a, str>. However, we weren’t able to do the same with the rust_files_written field because serde does not handle zero-copy deserialization for things other than [u8] and str. Forget nested collections like Vec<String> (as &[&str]); even Vec<u32> (as &[u32]) can’t be made zero-copy easily!

This is not a fundamental restriction in zero-copy deserialization; indeed, the excellent rkyv library is able to support data like this. However, it’s not as slam-dunk easy as str and [u8], and it’s understandable that serde wishes not to pick sides on any tradeoffs here and leaves it up to its users.

So what’s the actual problem here?

Blefuscudian Bewilderment

The short answer is: endianness, alignment, and for Vec<String>, indirection.

See, the way zero-copy deserialization works is by directly taking a pointer to the memory and declaring it to be the desired value. For this to work, that data must be of a kind that looks the same on all machines, and must be legal to take a reference to.

This is pretty straightforward for [u8] and str: their data is identical on every system. str does need a validation step to ensure it’s valid UTF-8, but the general thrust of zero-copy serialization is to replace expensive deserialization with cheaper validation, so we’re fine with that.
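
In std terms, that validation-instead-of-deserialization step for str is just the usual UTF-8 check:

// "Deserializing" a zero-copy &str is just validating the bytes and
// reinterpreting them; no allocation or copying happens.
let bytes: &[u8] = b"hello world";
let s: &str = std::str::from_utf8(bytes).expect("valid UTF-8");
assert_eq!(s, "hello world");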

On the other hand, the borrowed version of Vec<String>, &[&str] is unlikely to look the same even across different executions of the program on the same system, because it contains pointers (indirection) that’ll change each time depending on the data source!

Pointers are hard. What about Vec<u32>/[u32]? Surely there’s nothing wrong with a pile of integers?

(Image: Dracula, dispensing wisdom on the subject of zero-copy deserialization.)

This is where the endianness and alignment come in. Firstly, a u32 doesn’t look exactly the same on all systems, some systems are “big endian”, where the integer 0x00ABCDEF would be represented in memory as [0x00, 0xAB, 0xCD, 0xEF], whereas others are “little endian” and would represent it [0xEF, 0xCD, 0xAB, 0x00]. Most systems these days are little-endian, but not all, so you may need to care about this.

This would mean that a [u32] serialized on a little endian system would come out completely garbled on a big-endian system if we’re naïvely zero-copy deserializing.
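
To make that concrete, here’s a tiny standalone illustration using plain std, with no zerovec involved:

let value: u32 = 0x00ABCDEF;

// The in-memory representation depends on the machine: this prints
// [EF, CD, AB, 00] on a little-endian machine and [00, AB, CD, EF] on a
// big-endian one.
println!("{:02X?}", value.to_ne_bytes());

// So zero-copy-friendly formats have to pin down a byte order explicitly:
assert_eq!(value.to_le_bytes(), [0xEF, 0xCD, 0xAB, 0x00]);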

Secondly, a lot of systems impose alignment restrictions on types like u32. A u32 cannot be found at any old memory address, on most modern systems it must be found at a memory address that’s a multiple of 4. Similarly, a u64 must be at a memory address that’s a multiple of 8, and so on. The subsection of data being serialized, however, may be found at any address. It’s possible to design a serialization framework where a particular field in the data is forced to have a particular alignment (rkyv has this), however it’s kinda tricky and requires you to have control over the alignment of the original loaded data, which isn’t a part of serde’s model.

So how can we address this?

ZeroVec and VarZeroVec

A lot of the design here can be found explained in the design doc

After a bunch of discussions with Shane, we designed and wrote zerovec, a crate that attempts to solve this problem, in a way that works with serde.

The core abstractions of the crate are the two types, ZeroVec and VarZeroVec, which are essentially zero-copy enabled versions of Cow<'a, [T]>, for fixed-size and variable-size T types.

ZeroVec can be used with any type implementing ULE (more on what this means later), which is by default all of the integer types and can be extended to most Copy types. It’s rather similar to &[T], however instead of returning references to its elements, it copies them out. While ZeroVec is a Cow-like borrowed-or-owned type3, there is a fully borrowed variant ZeroSlice that it derefs to.

Similarly, VarZeroVec may be used with types implementing VarULE (e.g. str). It is able to hand out references to its elements: VarZeroVec<str> behaves very similarly to how &[str] would work if such a type were allowed to exist in Rust. You can even nest them, making types like VarZeroVec<VarZeroSlice<ZeroSlice<u32>>>, the zero-copy equivalent of Vec<Vec<Vec<u32>>>.

There’s also a ZeroMap type that provides a binary-search based map that works with types compatible with either ZeroVec or VarZeroVec.

So, for example, to make the following struct zero-copy:

#[derive(serde::Serialize, serde::Deserialize)]
struct DataStruct {
    nums: Vec<u32>,
    chars: Vec<char>,
    strs: Vec<String>,
}

you can do something like this:

#[derive(serde::Serialize, serde::Deserialize)]
pub struct DataStruct<'data> {
    #[serde(borrow)]
    nums: ZeroVec<'data, u32>,
    #[serde(borrow)]
    chars: ZeroVec<'data, char>,
    #[serde(borrow)]
    strs: VarZeroVec<'data, str>,
}

Once deserialized, the data can be accessed with data.nums.get(index) or data.strs[index], etc.
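
Here’s a minimal sketch of what that looks like, assuming zerovec’s serde feature is enabled and serde_json is available; I’m relying on the human-readable serde impls for these types accepting plain JSON arrays, which is how I remember them behaving:

let json = r#"{ "nums": [1, 2, 3], "chars": ["a", "b"], "strs": ["hello", "world"] }"#;
let data: DataStruct<'_> = serde_json::from_str(json).expect("valid JSON");

// ZeroVec copies elements out on access...
assert_eq!(data.nums.get(1), Some(2));
// ...while VarZeroVec hands out references into the backing buffer.
assert_eq!(data.strs.get(0), Some("hello"));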

Custom types can also be supported within these types with some effort, if you’d like the following complex data to be zero-copy:

#[derive(Copy, Clone, PartialEq, Eq, Ord, PartialOrd, serde::Serialize, serde::Deserialize)]
struct Date {
    y: u64,
    m: u8,
    d: u8
}

#[derive(Clone, PartialEq, Eq, Ord, PartialOrd, serde::Serialize, serde::Deserialize)]
struct Person {
    birthday: Date,
    favorite_character: char,
    name: String,
}

#[derive(serde::Serialize, serde::Deserialize)]
struct Data {
    important_dates: Vec<Date>,
    important_people: Vec<Person>,
    birthdays_to_people: HashMap<Date, Person>
}

you can do something like this:

// custom fixed-size ULE type for ZeroVec
#[zerovec::make_ule(DateULE)]
#[derive(Copy, Clone, PartialEq, Eq, Ord, PartialOrd, serde::Serialize, serde::Deserialize)]
struct Date {
    y: u64,
    m: u8,
    d: u8
}

// custom variable sized VarULE type for VarZeroVec
#[zerovec::make_varule(PersonULE)]
#[zerovec::derive(Serialize, Deserialize)] // add Serde impls to PersonULE
#[derive(Clone, PartialEq, Eq, Ord, PartialOrd, serde::Serialize, serde::Deserialize)]
struct Person<'data> {
    birthday: Date,
    favorite_character: char,
    #[serde(borrow)]
    name: Cow<'data, str>,
}

#[derive(serde::Serialize, serde::Deserialize)]
struct Data<'data> {
    #[serde(borrow)]
    important_dates: ZeroVec<'data, Date>,
    // note: VarZeroVec always must reference the unsized ULE type directly
    #[serde(borrow)]
    important_people: VarZeroVec<'data, PersonULE>,
    #[serde(borrow)]
    birthdays_to_people: ZeroMap<'data, Date, PersonULE>
}

Unfortunately the inner “ULE type” workings are not completely hidden from the user, especially for VarZeroVec-compatible types, but the crate does a fair number of things to attempt to make it pleasant to work with.

In general, ZeroVec should be used for types that are fixed-size and implement Copy, whereas VarZeroVec is to be used with types that logically contain a variable amount of data, like vectors, maps, strings, and aggregates of the same. VarZeroVec will always be used with a dynamically sized type, yielding references to that type.

I’ve noted before that these types are like Cow<'a, T>; they can be dealt with in a mutable-owned fashion, but it’s not the primary focus of the crate. In particular, VarZeroVec<T> will be significantly slower to mutate than something like Vec<String>, since all operations are done on the same buffer format. The general idea of this crate is that you probably will be generating your data in a situation without too many performance constraints, but you want the operation of reading the data to be fast. So, where necessary, the crate trades off mutation performance for deserialization/read performance. Still, it’s not terribly slow, just something to look out for and benchmark if necessary.

How it works

Most of the crate is built on the ULE and VarULE traits. Both of these traits are unsafe traits (though as shown above most users need not manually implement them). “ULE” stands for “unaligned little-endian”, and marks types which have no alignment requirements and have the same representation across endiannesses, preferring to be identical to the little-endian representation where relevant4.

There’s also a safe AsULE trait that allows one to convert a type between itself and some corresponding ULE type.

pub unsafe trait ULE: Sized + Copy + 'static {
    // Validate that a byte slice is appropriate to treat as a reference to this type
    fn validate_byte_slice(bytes: &[u8]) -> Result<(), ZeroVecError>;

    // less relevant utility methods omitted
}

pub trait AsULE: Copy {
    type ULE: ULE;

    // Convert to the ULE type
    fn to_unaligned(self) -> Self::ULE;
    // Convert back from the ULE type
    fn from_unaligned(unaligned: Self::ULE) -> Self;
}

pub unsafe trait VarULE: 'static {
    // Validate that a byte slice is appropriate to treat as a reference to this type
    fn validate_byte_slice(_bytes: &[u8]) -> Result<(), ZeroVecError>;

    // Construct a reference to Self from a known-valid byte slice
    // This is necessary since VarULE types are dynamically sized and the working of the metadata
    // of the fat pointer varies between such types
    unsafe fn from_byte_slice_unchecked(bytes: &[u8]) -> &Self;

    // less relevant utility methods omitted
}

ZeroVec<T> takes in types that are AsULE and stores them internally as slices of their ULE types (&[T::ULE]). Such slices can be freely zero-copy serialized. When you attempt to index a ZeroVec, it converts the value back to T on the fly, an operation that’s usually just an unaligned load.
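
As a rough sketch of what that means for u32 (this is not zerovec’s actual ULE type, just an illustration of the idea):

#[derive(Copy, Clone)]
#[repr(transparent)]
struct U32ULE([u8; 4]);

fn to_unaligned(x: u32) -> U32ULE {
    U32ULE(x.to_le_bytes()) // a fixed little-endian byte order...
}

fn from_unaligned(u: U32ULE) -> u32 {
    u32::from_le_bytes(u.0) // ...read back with an unaligned load
}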

VarZeroVec<T> is a bit more complicated. The beginning of its memory stores the indices of every element in the vector, followed by the data for all of the elements just splatted one after the other. As long as the dynamically sized data can be represented in a flat fashion (without further internal indirection), it can implement VarULE, and thus be used in VarZeroVec<T>. str implements this, but so do ZeroSlice<T> and VarZeroSlice<T>, allowing for infinite nesting of zerovec types!
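
To make the layout concrete, here’s a toy version of the lookup logic, glossing over the real encoding details (index width, length prefix, and the validation that happens when a VarZeroVec is constructed):

fn get_str<'a>(indices: &[usize], data: &'a [u8], i: usize) -> &'a str {
    let start = indices[i];
    let end = indices.get(i + 1).copied().unwrap_or(data.len());
    std::str::from_utf8(&data[start..end]).expect("valid UTF-8")
}

// ["foo", "hello", "hi"] splatted into one buffer, plus byte offsets:
let indices = [0usize, 3, 8];
let data = b"foohellohi";
assert_eq!(get_str(&indices, data, 1), "hello");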

ZeroMap works similarly to the litemap crate: it’s a map built out of two vectors, using binary search to find keys. This isn’t always as efficient as a hash map, but it can work well in a zero-copy way since it can just be backed by ZeroVec and VarZeroVec. There’s a bunch of trait infrastructure that allows it to automatically select ZeroVec or VarZeroVec for each of the key and value vectors based on the type of the key or value.

What about rkyv?

An important question when we started down this path was: what about rkyv? It had at the time just received a fair amount of attention in the Rust community, and seemed like a pretty cool library targeting the same space.

And in general if you’re looking for zero-copy deserialization, I wholeheartedly recommend looking at it! It’s an impressive library with a lot of thought put into it. When I was refining zerovec I learned a lot from rkyv, having some insightful discussions with David and comparing notes on approaches.

The main sticking point, for us, was that rkyv works kinda separately from serde: it uses its own traits and own serialization mechanism. We really liked serde’s model and wanted to keep using it, especially since we wanted to support a variety of human-readable and non-human-readable data formats, including postcard, which is explicitly designed for low-resource environments. This becomes even more important for data interchange; we’d want programs written in other languages to be able to construct and send over data without necessarily being constrained to a particular wire format.

The goal of zerovec is essentially to bring rkyv-like improvements to a serde universe without disrupting that universe too much. zerovec types, on human-readable formats like JSON, serialize to a normal human-readable representation of the structure, and on binary formats like postcard, serialize to a compact, zero-copy-friendly representation that Just Works.

How does it perform?

So off the bat I’ll mention that rkyv maintains a very good benchmark suite that I really need to get around to integrating with zerovec, but haven’t yet.

Speech bubble for character Negative pion
Why not go do that first? It would make your post better!

Well, I was delaying working on this post until I had those benchmarks integrated, but that’s not how executive function works, and at this point I’d rather publish with the benchmarks I have rather than delaying further. I might update this post with the Good Benchmarks later!

Speech bubble for character Negative pion
Hmph.

The complete benchmark run details can be found here (run via cargo bench at 1e072b32). I’m pulling out some specific data points for illustration:

ZeroVec:

Benchmark                                               | Slice     | ZeroVec
Deserialization (with bincode)                          |           |
  Deserialize a vector of 100 u32s                      | 141.55 ns | 12.166 ns
  Deserialize a vector of 15 chars                      | 225.55 ns | 25.668 ns
  Deserialize and then sum a vector of 20 u32s          | 47.423 ns | 14.131 ns
Element fetching performance                            |           |
  Sum a vector of 75 u32 elements                       | 4.3091 ns | 5.7108 ns
  Binary search a vector of 1000 u32 elements, 50 times | 428.48 ns | 565.23 ns
Serialization                                           |           |
  Serialize a vector of 20 u32s                         | 51.324 ns | 21.582 ns
  Serialize a vector of 15 chars                        | 195.75 ns | 21.123 ns


In general we don’t care about serialization performance much, however serialization is fast here because ZeroVecs are always stored in memory as the same form they would be serialized at. This can make mutation slower. Fetching operations are a little bit slower on ZeroVec. The deserialization performance is where we see our real wins, sometimes being more than ten times as fast!

VarZeroVec:

The strings are randomly generated, picked with sizes between 2 and 20 code points, and the same set of strings is used for any given row.

Benchmark                               | Vec<String> | Vec<&str> | VarZeroVec
Deserialize (len 100)                   | 11.274 us   | 2.2486 us | 1.9446 us
Count code points (len 100)             |             | 728.99 ns | 1265.0 ns
Binary search for 1 element (len 500)   |             | 57.788 ns | 122.10 ns
Binary search for 10 elements (len 500) |             | 451.40 ns | 803.67 ns


Here, fetching operations are a bit slower since they need to read the indexing array, but there’s still a decent win for zero-copy deserialization. The deserialization wins stack up for more complex data; for Vec<String> you can get most of the wins by using Vec<&str>, but that’s not necessarily possible for something more complex. We don’t currently have mutation benchmarks for VarZeroVec, but mutation can be slow and as mentioned before it’s not intended to be used much in client code.

Some of this is still in flux; for example we are in the process of making VarZeroVec’s buffer format configurable so that users can pick their precise tradeoffs.

Try it out!

Similar to yoke, I don’t consider the zerovec crate “done” yet, but it’s been in use in ICU4X for a year now and I consider it mature enough to recommend to others. Try it out! Let me know what you think!

Thanks to Finch, Jane, and Shane for reviewing drafts of this post

  1. A locale is typically a language and location, though it may contain additional information like the writing system or even things like the calendar system in use. 

  2. Bear in mind, this isn’t just a matter of picking a format like MM-DD-YYYY! Dates in just US English can look like 4/10/22 or 4/10/2022 or April 10, 2022, or Sunday, April 10, 2022 C.E., or Sun, Apr 10, 2022, and that’s not without thinking about week numbers, quarters, or time! This quickly adds up to a decent amount of data for each locale. 

  3. As mentioned in the previous post, while zero-copy deserializing, it is typical to use borrowed-or-owned types like Cow over pure borrowed types because it’s not necessary that data in a human-readable format will be able to zero-copy deserialize. 

  4. Most modern systems are little endian, so this imposes one fewer potential cost on conversion. 

Manish GoregaokarNot a Yoking Matter (Zero-Copy #1)

This is part 1 of a three-part series on interesting abstractions for zero-copy deserialization I’ve been working on over the last year. This part is about making zero-copy deserialization more pleasant to work with. Part 2 is about making it work for more types and can be found here; while Part 3 is about eliminating the deserialization step entirely and can be found here. The posts can be read in any order, though this post contains an explanation of what zero-copy deserialization is.

Background

For the past year and a half I’ve been working full time on ICU4X, a new internationalization library in Rust being built under the Unicode Consortium as a collaboration between various companies.

There’s a lot I can say about ICU4X, but to focus on one core value proposition: we want it to be modular both in data and code. We want ICU4X to be usable on embedded platforms, where memory is at a premium. We want applications constrained by download size to be able to support all languages rather than pick a couple popular ones because they cannot afford to bundle in all that data. As a part of this, we want loading data to be fast and pluggable. Users should be able to design their own data loading strategies for their individual use cases.

See, a key part of performing correct internationalization is the data. Different locales1 do things differently, and all of the information on this needs to go somewhere, preferably not code. You need data on how a particular locale formats dates2, or how plurals work in a particular language, or how to accurately segment languages like Thai which are typically not written with spaces so that you can insert linebreaks in appropriate positions.

Given the focus on data, a very attractive option for us is zero-copy deserialization. In the process of trying to do zero-copy deserialization well, we’ve built some cool new libraries, this article is about one of them.

(Image: Gary Larson, “Cow Tools”, The Far Side. October 1982.)

Zero-copy deserialization: the basics

This section can be skipped if you’re already familiar with zero-copy deserialization in Rust

Deserialization typically involves two tasks, done in concert: validating the data, and constructing an in-memory representation that can be programmatically accessed; i.e., the final deserialized value.

Depending on the format, the former is typically rather fast, but the latter can be super slow, typically around any variable-sized data which needs a new allocation and often a large copy.

#[derive(Serialize, Deserialize)]
struct Person {
    // this field is nearly free to construct
    age: u8,
    // constructing this will involve a small allocation and copy
    name: String,
    // this may take a while
    rust_files_written: Vec<String>,
}

A typical binary data format will probably store this as a byte for the age, followed by the length of name, followed by the bytes for name, followed by another length for the vector, followed by a length and string data for each String value. Deserializing the u8 age just involves reading it, but the other two fields require allocating sufficient memory and copying each byte over, in addition to any validation the types may need.
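
As a concrete, purely illustrative sketch (not any particular real format), such a layout for Person { age: 37, name: "Ana", rust_files_written: vec!["lib.rs"] } might look like this, using single-byte lengths for simplicity:

let bytes: &[u8] = &[
    37,                                    // age
    3, b'A', b'n', b'a',                   // name: length, then bytes
    1,                                     // rust_files_written: vector length
    6, b'l', b'i', b'b', b'.', b'r', b's', // first string: length, then bytes
];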

A common technique in this scenario is to skip the allocation and copy by simply validating the bytes and storing a reference to the original data. This can only be done for serialization formats where the data is represented identically in the serialized file and in the deserialized value.

When using serde in Rust, this is typically done by using a Cow<'a, T> with #[serde(borrow)]:

#[derive(Serialize, Deserialize)]
struct Person<'a> {
    age: u8,
    #[serde(borrow)]
    name: Cow<'a, str>,
}

Now, when name is being deserialized, the deserializer only needs to validate that it is in fact a valid UTF-8 str, and the final value for name will be a reference to the original data being deserialized from itself.

An &'a str can also be used instead of the Cow, however this makes the Deserialize impl much less general, since formats that do not store strings identically to their in-memory representation (e.g. JSON with strings that include escapes) will not be able to fall back to an owned value. As a result of this, owned-or-borrowed Cow<'a, T> is often a cornerstone of good design when writing Rust code partaking in zero-copy deserialization.

You may notice that rust_files_written can’t be found in this new struct. This is because serde, out of the box, can’t handle zero-copy deserialization for anything other than str and [u8], for very good reasons. Other frameworks like rkyv can, however we’ve also managed to make this possible with serde. I’ll go in more depth about said reasons and our solution in part 2.
Speech bubble for character Confused pion
Aren’t there still copies occurring here with the age field?

Yes, “zero-copy” is somewhat of a misnomer; what it really means is “zero allocations”, or, alternatively, “zero large copies”. Look at it this way: data like age does get copied, but without, say, allocating a vector of Person<'a>, you’re only going to see that copy occur a couple times when individually deserializing Person<'a>s or when deserializing some struct that contains Person<'a> a couple times. To have a large copy occur without involving allocations, your type would have to be something that is that large on the stack in the first place, which people avoid in general because it means a large copy every time you move the value around even when you’re not deserializing.

When life gives you lifetimes ….

Zero-copy deserialization in Rust has one very pesky downside: the lifetimes. Suddenly, all of your deserialized types have lifetimes on them. Of course they would; they’re no longer self-contained, instead containing references to the data they were originally deserialized from!

This isn’t a problem unique to Rust, either: zero-copy deserialization always introduces more complex dependencies between your types, and different frameworks handle this differently, from leaving management of the lifetimes to the user, to using reference counting or a GC to ensure the data sticks around. Rust serialization libraries can do stuff like this if they wish, too. In this case, serde, in a very Rusty fashion, wants the library user to have control over the precise memory management here and surfaces this problem as a lifetime.

Unfortunately, lifetimes like these tend to make their way into everything. Every type holding onto your deserialized type needs a lifetime now and it’s likely going to become your users’ problem too.

Furthermore, Rust lifetimes are a purely compile-time construct. If your value is of a type with a lifetime, you need to know at compile time by when it will definitely no longer be in use, and you need to hold on to its source data until then. Rust’s design means that you don’t need to worry about getting this wrong, since the compiler will catch you, but you still need to do it.

All of this isn’t ideal for cases where you want to manage the lifetimes at runtime, e.g. if your data is being deserialized from a larger file and you wish to cache the loaded file as long as data deserialized from it is still around.

Typically in such cases you can use Rc<T>, which is effectively the “runtime instead of compile time” version of &'a T’s safe shared reference, but this only works for cases where you’re sharing homogeneous types, whereas in this case we’re attempting to share different types deserialized from one blob of data, which itself is of a different type.

ICU4X would like users to be able to make use of caching and other data management strategies as needed, so this won’t do at all. For a while ICU4X had not one but two pervasive lifetimes threaded throughout most of its types: it was both confusing and not in line with our goals.

… make life take the lifetimes back

A lot of the design here can be found explained in the design doc

After a bunch of discussion on this, primarily with Shane, I designed yoke, a crate that attempts to provide lifetime erasure in Rust via self-referential types.

Speech bubble for character Confused pion
Wait, lifetime erasure?

Like type erasure! “Type erasure” (in Rust, done using dyn Trait) lets you take a compile time concept (the type of a value) and move it into something that can be decided at runtime. Analogously, the core value proposition of yoke is to take types burdened with the compile time concept of lifetimes and allow that lifetime to be decided at runtime instead.

Speech bubble for character Confused pion
Doesn’t Rc<T> already let you make lifetimes a runtime decision?

Kind of. Rc<T> on its own lets you avoid compile-time lifetimes, whereas Yoke works with situations where there is already a lifetime (e.g. due to zero-copy deserialization) that you want to paper over.

Speech bubble for character Confused pion
Cool! What does that look like?

The general idea is that you can take a zero-copy deserializeable type like a Cow<'a, str> (or something more complicated) and “yoke” it to the value it was deserialized from, which we call a “cart”.

Speech bubble for character Negative pion
*groan* not another crate named with a pun, Manish.

I will never stop.

Anyway, here’s what that looks like.

// Some types explicitly mentioned for clarity

// load a file
let file: Rc<[u8]> = fs::read("data.postcard")?.into();

// create a new Rc reference to the file data by cloning it,
// then use it as a cart for a Yoke
let y: Yoke<Cow<'static, str>, Rc<[u8]>> = Yoke::attach_to_cart(file.clone(), |contents| {
    // deserialize from the file (from_bytes returns a Result, so handle the error case)
    let cow: Cow<str> = postcard::from_bytes(contents).expect("deserialization failed");
    cow
});

// the string is still accessible with `.get()`
println!("{}", y.get());

drop(y);
// only now will the reference count on the file be decreased

Some of the APIs here may not quite work due to current compiler bugs. In this blog post I’m using the ideal version of these APIs for illustrative purposes, but it’s worth checking with the Yoke docs to see if you may need to use an alternate workaround API. Most of the bugs have been fixed as of Rust 1.61.
Speech bubble for character Positive pion
The example above uses postcard: postcard is a really neat serde-compatible binary serialization format, designed for use on resource constrained environments. It’s quite fast and has a low codesize, check it out!

The type Yoke<Cow<'static, str>, Rc<[u8]>> is “a lifetime-erased Cow<str> ‘yoked’ to a backing data store ‘cart’ that is an Rc<[u8]>”. What this means is that the Cow contains references to data from the cart, however, the Yoke will hold on to the cart type until it is done, which ensures the references from the Cow no longer dangle.

Most operations on the data within a Yoke operate via .get(), which in this case will return a Cow<'a, str>, where 'a is the lifetime of borrow of .get(). This keeps things safe: a Cow<'static, str> is not really safe to distribute in this case since Cow is not actually borrowing from static data; however it’s fine as long as we transform the lifetime to something shorter during accesses.

Turns out, the 'static found in Yoke types is actually a lie! Rust doesn’t really let you work with types with borrowed content without mentioning some lifetime, and here we want to relieve the compiler from its duty of managing lifetimes and manage them ourselves, so we need to give it something so that we can name the type, and 'static is the only preexisting named lifetime in Rust.

The actual signature of .get() is a bit weird since it needs to be generic, but if our borrowed type is Foo<'a>, then the signature of .get() is something like this:

impl Yoke<Foo<'static>> {
    fn get<'a>(&'a self) -> &'a Foo<'a> {
        ...
    }
}

For a type to be allowed within a Yoke<Y, C>, it must implement Yokeable<'a>. This trait is unsafe to manually implement, in most cases you should autoderive it with #[derive(Yokeable)]:

#[derive(Yokeable, Serialize, Deserialize)]
struct Person<'a> {
    age: u8,
    #[serde(borrow)]
    name: Cow<'a, str>,
}

let person: Yoke<Person<'static>, Rc<[u8]>> = Yoke::attach_to_cart(file.clone(), |contents| {
    postcard::from_bytes(contents).expect("deserialization failed")
});

Unlike most #[derive]s, Yokeable can be derived even if the fields do not already implement Yokeable, except for cases when fields with lifetimes also have other generic parameters. In such cases it typically suffices to tag the type with #[yoke(prove_covariance_manually)] and ensure any fields with lifetimes also implement Yokeable.

There’s a bunch more you can do with Yoke, for example you can “project” a yoke to get a new yoke with a subset of the data found in the initial one:

let person: Yoke<Person<'static>, Rc<[u8]>> = ....;

let person_name: Yoke<Cow<'static, str>, Rc<[u8]>> = person.project(|p, _| p.name);

This allows one to mix data coming from disparate Yokes.

Yokes are, perhaps surprisingly, mutable as well! They are, after all, primarily intended to be used with copy-on-write data, so there are ways to mutate them provided that no additional borrowed data sneaks in:

let mut person: Yoke<Person<'static>, Rc<[u8]>> = ....;

// make the name sound fancier
person.with_mut(|person| {
    // this will convert the `Cow` into owned one
    person.name.to_mut().push(", Esq.")
})

Overall Yoke is a pretty powerful abstraction, useful for a host of situations involving zero-copy deserialization as well as other cases involving heavy borrowing. In ICU4X the abstractions we use to load data always use Yokes, allowing various data loading strategies — including caching — to be mixed.

How it works

Speech bubble for character Positive pion
Manish is about to say the word “covariant” so I’m going to get ahead of him and say: If you have trouble understanding this and the next section, don’t worry! The internal workings of his crate rely on multiple niche concepts that most Rustaceans never need to care about, even those working on otherwise advanced code.

Yoke works by relying on the concept of a covariant lifetime. The Yokeable trait looks like this:

pub unsafe trait Yokeable<'a>: 'static {
    type Output: 'a;
    // methods omitted
}

and a typical implementation would look something like this:

unsafe impl<'a> Yokeable<'a> for Cow<'static, str> {
    type Output = Cow<'a, str>;
    // ...
}

An implementation of this trait will be implemented on the 'static version of a type with a lifetime (which I will call Self<'static>3 in this post), and maps the type to a version of it with a lifetime (Self<'a>). It must only be implemented on types where the lifetime 'a is covariant, i.e., where it’s safe to treat Self<'a> as Self<'b> when 'b is a shorter lifetime. Most types with lifetimes fall in this category4, especially in the space of zero-copy deserialization.
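
Here’s a small illustration of what covariance buys you, using plain Cow with no yoke involved:

use std::borrow::Cow;

// A Cow borrowing for a long lifetime can be used where one borrowing for a
// shorter lifetime is expected. The compiler accepts this implicit coercion
// only because Cow's lifetime parameter is covariant.
fn shorten<'short, 'long: 'short>(c: Cow<'long, str>) -> Cow<'short, str> {
    c
}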

Speech bubble for character Positive pion
You can read more about variance in the nomicon!

For any Yokeable type Foo<'static>, you can obtain the version of that type with a lifetime 'a with <Foo as Yokeable<'a>>::Output. The Yokeable trait exposes some methods that allow one to safely carry out the various transforms that are allowed on a type with a covariant lifetime.

#[derive(Yokeable)], in most cases, relies on the compiler’s ability to determine if a lifetime is covariant, and doesn’t actually generate much code! In most cases, the bodies of the various functions on Yokeable are pure safe code, looking like this:

impl<'a> Yokeable<'a> for Foo<'static> {
    type Output = Foo<'a>;
    fn transform(&self) -> &Self::Output {
        self
    }
    fn transform_owned(self) -> Self::Output {
        self
    }
    fn transform_mut<F>(&'a mut self, f: F)
    where
        F: 'static + for<'b> FnOnce(&'b mut Self::Output) {
        f(self)
    }
    // fn make() omitted since it's not as relevant
}

The compiler knows these are safe because it knows that the type is covariant, and the Yokeable trait allows us to talk about types where these operations are safe, generically.

Speech bubble for character Positive pion
In other words, there’s a certain useful property about lifetime “stretchiness” that the compiler knows about, and we can check that the property applies to a type by generating code that the compiler would refuse to compile if the property did not apply.

Using this trait, Yoke then works by storing Self<'static> and transforming it to a shorter, more local lifetime before handing it out to any consumers, using the methods on Yokeable in various ways. Knowing that the lifetime is covariant is what makes it safe to do such lifetime “squeezing”. The 'static is a lie, but it’s safe to do that kind of thing as long as the value isn’t actually accessed with the 'static lifetime, and we take great care to ensure it doesn’t leak.

Better conversions: ZeroFrom

A crate that pairs well with this is zerofrom, primarily designed and written by Shane. It comes with the ZeroFrom trait:

pub trait ZeroFrom<'zf, C: ?Sized>: 'zf {
    fn zero_from(other: &'zf C) -> Self;
}

The idea of this trait is to be able to work generically with types convertible to (often zero-copy) borrowed types.

For example, Cow<'zf, str> implements both ZeroFrom<'zf, str> and ZeroFrom<'zf, String>, as well as ZeroFrom<'zf, Cow<'a, str>>. It’s similar to the AsRef trait but it allows for more flexibility on the kinds of borrowing occurring, and implementors are supposed to minimize the amount of copying during such a conversion. For example, when ZeroFrom-constructing a Cow<'zf, str> from some other Cow<'a, str>, it will always construct a Cow::Borrowed, even if the original Cow<'a, str> were owned.
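
As a small sketch of implementing this by hand, using the trait definition above; the types here are made up for illustration:

use std::borrow::Cow;
use zerofrom::ZeroFrom;

struct OwnedPerson {
    name: String,
    hometown: String,
}

struct BorrowedPerson<'zf> {
    name: Cow<'zf, str>,
    hometown: Cow<'zf, str>,
}

impl<'zf> ZeroFrom<'zf, OwnedPerson> for BorrowedPerson<'zf> {
    fn zero_from(other: &'zf OwnedPerson) -> Self {
        BorrowedPerson {
            // borrow rather than copy
            name: Cow::Borrowed(other.name.as_str()),
            hometown: Cow::Borrowed(other.hometown.as_str()),
        }
    }
}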

Yoke has a convenient constructor Yoke::attach_to_zero_copy_cart() that can create a Yoke<Y, C> out of a cart type C if Y<'zf> implements ZeroFrom<'zf, C> for all lifetimes 'zf. This is useful for cases where you want to do basic self-referential types but aren’t doing any fancy zero-copy deserialization.

… make life rue the day it thought it could give you lifetimes

Life with this crate hasn’t been all peachy. We’ve, uh … unfortunately discovered a toweringly large pile of gnarly compiler bugs. A lot of this has its root in the fact that Yokeable<'a> in most cases is bound via for<'a> Yokeable<'a> (“Yokeable<'a> for all possible lifetimes 'a”). The for<'a> is a niche feature known as a higher-ranked lifetime or trait bound (often referred to as “HRTB”), and while it’s always been necessary in some capacity for Rust’s typesystem to be able to reason about function pointers, it’s also always been rather buggy and is often discouraged for usages like this.

We’re using it so that we can talk about the lifetime of a type in a generic sense. Fortunately, there is a language feature under active development that will be better suited for this: Generic Associated Types.

This feature isn’t stable yet, but, fortunately for us, most compiler bugs involving for<'a> also impact GATs, so we have been benefitting from the GAT work, and a lot of our bug reports have helped shore up the GAT code. Huge shout out to Jack Huey for fixing a lot of these bugs, and eddyb for helping out in the debugging process.

As of Rust 1.61, a lot of the major bugs have been fixed, however there are still some bugs around trait bounds for which the yoke crate maintains some workaround helpers. It has been our experience that most compiler bugs here are not restrictive when it comes to what you can do with the crate, but you may end up with code that looks less than ideal. Overall, we still find it worth it: we’re able to do some really neat zero-copy stuff in a way that’s externally convenient (even if some of the internal code is messy), and we don’t have lifetimes everywhere.

Try it out!

While I don’t consider the yoke crate “done” yet, it’s been in use in ICU4X for a year now and I consider it mature enough to recommend to others. Try it out! Let me know what you think!

Thanks to Finch, Jane, and Shane for reviewing drafts of this post

  1. A locale is typically a language and location, though it may contain additional information like the writing system or even things like the calendar system in use. 

  2. Bear in mind, this isn’t just a matter of picking a format like MM-DD-YYYY! Dates in just US English can look like 4/10/22 or 4/10/2022 or April 10, 2022, or Sunday, April 10, 2022 C.E., or Sun, Apr 10, 2022, and that’s not without thinking about week numbers, quarters, or time! This quickly adds up to a decent amount of data for each locale. 

  3. This isn’t real Rust syntax; since Self is always just Self, but we need to be able to refer to Self as a higher-kinded type in this scenario. 

  4. Types that aren’t are ones involving mutability (&mut or interior mutability) around the lifetime, and ones involving function pointers and trait objects. 

IRL (podcast)When an Algorithm is Your Boss

Gig workers around the world report directly to algorithms in precarious jobs created by secretive corporations. We take you to the streets of Quito, Ecuador where delivery workers are protesting against artificial intelligence, and we hear solutions from people in several countries on how to audit the algorithms and reclaim rights.

Eduardo Meneses is gearing up with allies to ‘audit the algorithms’ of delivery platforms in Ecuador as the Global Head of Social Change at Thoughtworks.

Dan Calacci at the MIT Media Lab is developing open source tools and systems that empower workers to take control of their data.

Aída Ponce Del Castillo is working on AI regulation to protect the rights of platform workers as a lawyer with the European Trade Union Institute in Brussels.

Yuly Ramirez is the general secretary of a coalition of digital platform workers in Ecuador and José Gonzalez is a delivery driver in Quito, Ecuador.

IRL is an original podcast from Mozilla, the non-profit behind Firefox. In Season 6, host Bridget Todd shares stories of people who make AI more trustworthy in real life. This season doubles as Mozilla’s 2022 Internet Health Report. Go to the report for show notes, transcripts, and more.

 

The Rust Programming Language BlogIncreasing the glibc and Linux kernel requirements

The minimum requirements for Rust toolchains targeting Linux will increase with the Rust 1.64.0 release (slated for September 22nd, 2022). The new minimum requirements are:

  • glibc >= 2.17 (previously glibc >= 2.11)
  • kernel >= 3.2 (previously kernel >= 2.6.32)

These requirements apply both to running the Rust compiler itself (and other Rust tooling like Cargo or Rustup), and to running binaries produced by Rust, if they use the libstd.

If you are not targeting an old long-term-support distribution, or embedded hardware running an old Linux version, this change is unlikely to affect you. Otherwise, read on!

Affected targets

In principle, the new kernel requirements affect all *-linux-* targets, while the glibc requirements affect all *-linux-gnu* targets. In practice, many targets were already requiring newer kernel or glibc versions. The requirements for such targets do not change.

Among targets for which a Rust host toolchain is distributed, the following are affected:

  • i686-unknown-linux-gnu (Tier 1)
  • x86_64-unknown-linux-gnu (Tier 1)
  • x86_64-unknown-linux-musl (Tier 2 with host tools)
  • powerpc-unknown-linux-gnu (Tier 2 with host tools)
  • powerpc64-unknown-linux-gnu (Tier 2 with host tools)
  • s390x-unknown-linux-gnu (Tier 2 with host tools)

The following are not affected, because they already had higher glibc/kernel requirements:

  • aarch64-unknown-linux-gnu (Tier 1)
  • aarch64-unknown-linux-musl (Tier 2 with host tools)
  • arm-unknown-linux-gnueabi (Tier 2 with host tools)
  • arm-unknown-linux-gnueabihf (Tier 2 with host tools)
  • armv7-unknown-linux-gnueabihf (Tier 2 with host tools)
  • mips-unknown-linux-gnu (Tier 2 with host tools)
  • powerpc64le-unknown-linux-gnu (Tier 2 with host tools)
  • riscv64gc-unknown-linux-gnu (Tier 2 with host tools)

For other tier 2 or tier 3 targets, for which no Rust toolchain is distributed, we do not accurately track minimum requirements, and they may or may not be affected by this change. *-linux-musl* targets are only affected by the kernel requirements, not the glibc requirements. Targets which only use libcore and not libstd are unaffected.

A list of supported targets and their requirements can be found on the platform support page.

Affected systems

The glibc and kernel versions used for the new baseline requirements are already close to a decade old. As such, this change should only affect users that either target old long-term-support Linux distributions, or embedded hardware running old versions of Linux.

The following Linux distributions are still supported under the new requirements:

  • RHEL 7 (glibc 2.17, kernel 3.10)
  • SLES 12-SP5 (glibc 2.22, kernel 4.12.14)
  • Debian 8 (glibc 2.19, kernel 3.16.7)
  • Ubuntu 14.04 (glibc 2.19, kernel 3.13)

The following distributions are not supported under the new requirements:

  • RHEL 6 (glibc 2.12, kernel 2.6.32)
  • SLES 11-SP4 (glibc 2.11.3, kernel 3.0.101)
  • Debian 6 (glibc 2.11, kernel 2.6.32), Debian 7 (glibc 2.13, kernel 3.2.41)
  • Ubuntu 12.04 (glibc 2.15, kernel 3.2)

Out of the distributions in the second list, only RHEL 6 still has limited vendor support (ELS).

Why increase the requirements?

We want Rust, and binaries produced by Rust, to be as widely usable as possible. At the same time, the Rust project only has limited resources to maintain compatibility with old environments.

There are two parts to the toolchain requirements: The minimum requirements for running the Rust compiler on a host system, and the minimum requirements for cross-compiled binaries.

The minimum requirements for host toolchains affect our build system. Rust CI produces binary artifacts for dozens of different targets. Creating binaries that support old glibc versions requires either building on an operating system with old glibc (for native builds) or using a buildroot with an old glibc version (for cross-compiled builds).

At the same time, Rust relies on LLVM for optimization and code generation, which regularly increases its toolchain requirements. LLVM 16 will require GCC 7.1 or newer (and LLVM 15 supports GCC 5.1 in name only). Creating a build environment that has both a very old glibc and a recent compiler becomes increasingly hard over time. crosstool-ng (which we use for most cross-compilation needs) does not support both targeting glibc 2.11 and using a compiler that satisfies the new LLVM requirements.

The requirements for cross-compiled binaries have a different motivation: They affect which kernel versions need to be supported by libstd. Increasing the kernel requirements allows libstd to use newer syscalls, without having to maintain and test compatibility with kernels that do not support them.

The new baseline requirements were picked as the least common denominator among long-term-support distributions that still have active support. This is currently RHEL 7 with glibc 2.17 and kernel 3.10. The kernel requirement is picked as 3.2 instead, because this is the minimum requirement of glibc itself, and there is little relevant API difference between these versions.

What should I do?

If you or your organization are affected by this change, there are a number of viable options depending on your situation:

  • Upgrade your target system, or raise the minimum requirements of your software, to satisfy the new constraints.
  • If you are running the Rust compiler on an old host, consider cross-compiling from a newer host instead.
  • If you are targeting an old glibc version, consider targeting musl instead.
  • If you are targeting an old kernel version and use libstd, you may be out of luck: In this case you may have to either freeze your current Rust version, or maintain a fork of libstd that supports older kernels.

Mike Hommey: Announcing git-cinnabar 0.5.10

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.9?

  • Fixed exceptions during config initialization.
  • Fixed swapped error messages.
  • Fixed correctness issues with bundle chunks with no delta node.
  • This is probably the last 0.5.x release before 0.6.0.

Mozilla Localization (L10N): L10n Report: July 2022 Edition

Please note some of the information provided in this report may be subject to change as we are sometimes sharing information about projects that are still in early stages and are not final yet. 

Welcome!

Are you a locale leader and want us to include new members in our upcoming reports? Contact us!

New content and projects

What’s new or coming up in Firefox desktop

While the last months have been pretty quiet in terms of new content for Firefox, we’re approaching a new major release for 2022, and that will include new features and dedicated onboarding.

Part of the content has already started landing in the last few days; expect more in the coming weeks. In the meantime, make sure to check out the feature name guidelines for Firefox View and Colorways.

In terms of upcoming deadlines: Firefox 104 is currently in Beta and it will be possible to update translations up to August 14.

What’s new or coming up in mobile

Mobile releases now align more closely to desktop release schedules, so you may notice that target dates for these projects are the same in Pontoon. As with desktop, things are quiet now for mobile, but we’ll be seeing more strings landing in the coming weeks for the next major release.

What’s new or coming up in web projects

Firefox Relay website & add-on

We’re expanding Firefox Relay Premium into new locales across Europe: Austria, Belgium, Cyprus, Estonia, Finland, France, Germany, Greece, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Portugal, Slovakia, Slovenia, Spain, Sweden, and Switzerland. In order to deliver a truly great experience to our users in these new locales, we would like to make sure that users can utilize our products in the language they feel most comfortable with. Having these languages localized will help us connect with more users on already complex topics like privacy and security, and offer them greater protections.

If you don’t see the product offered in your language in the markets above, maybe you can help by requesting to localize the product. Thank you for helping spread the word.

What’s new or coming up in Pontoon

  • When a 100% TM match is available, it now automatically appears in the editor if the string doesn’t have any translations yet.

    100% matches from Translation Memory now automatically appear in the editor

  • Before new users make their first contribution to a locale, they are now provided with guidelines. And when they submit their first suggestion, team managers get notified.

    Tooltip with guidelines for new contributors.

  • The Contributors page on the Team dashboard has been reorganized. Contributors are grouped by their role within the team, which makes it easier to identify and reach out to team managers.

    Team contributors grouped by role.

  • We have introduced a new list parameter in translate view URLs, which allows for presenting a selected list of strings in the sidebar.
  • Deadlines have been renamed to Target Dates.
  • Thanks to Eemeli for making a bunch of under-the-hood improvements, which make our codebase much easier to build on.

Events

Want to showcase an event coming up that your community is participating in? Contact us and we’ll include it.

Friends of the Lion

Know someone in your l10n community who’s been doing a great job and should appear here? Contact us and we’ll make sure they get a shout-out!

Useful Links

Questions? Want to get involved?

If you want to get involved, or have any question about l10n, reach out to:

Did you enjoy reading this report? Let us know how we can improve it.

Mozilla Performance Blog: Performance Tools Newsletter (H1 2022)

As the Perf-Tools team, we are responsible for the Firefox Profiler. This newsletter gives an overview of the new features and improvements we’ve done in the first half of 2022.

You can find the previous newsletter here, which covered Q4 2021. This newsletter contains updates related to work done in the first half of 2022.

Here are some highlights.

Documentation

The main Profiler documentation was refreshed, to better match what has changed in recent years.

If you are new to the Profiler, or if you would like your friends, colleagues, or customers to learn about it, please visit: https://profiler.firefox.com/docs/#/./guide-getting-started.
(This link is also accessible from the landing page at https://profiler.firefox.com.)
Screenshot of the top of https://profiler.firefox.com/docs/#/./guide-getting-started

As a reminder, other aspects of the Profiler are captured in separate locations:

For developers of the Firefox application, or if you are just interested in some technical details, see the internal documentation at https://firefox-source-docs.mozilla.org/tools/profiler/.

For web developers, one point of interest is that you can add “markers” to your own pages through the JavaScript User Timing API, which will then be visible in profiles. This can be useful to measure some specific actions in your web pages.

Finally, the GitHub repository for the front-end side of the Profiler (the website that actually displays profiles after they have been captured) contains some detailed documentation about profile file formats and inner workings of the website.

New DevTools performance panel

The new performance panel was released in Firefox 98. This panel makes it easier for web developers to record and share profiles, as it opens a new profiler.firefox.com tab after capturing the profile.
Screenshot of the devtools performance panel

To access it, open the DevTools (main menu > More tools > Web Developer Tools), and select the “Performance” panel.

Note that having the DevTools open may use some processing power, so the resulting profile of any web page may be slightly skewed, with some of the profile data being related to DevTools. If these effects are noticeable in your profiles, just close the DevTools window while profiling. As an alternative, you may choose to control the profiler through the toolbar button (enabled when first visiting https://profiler.firefox.com), or the accompanying shortcuts: Ctrl+Shift+1 to start, and Ctrl+Shift+2 to stop.

Internationalization

The profiler.firefox.com interface is now available in languages other than English, and may be changed on the fly from the language picker at the bottom right corner.
Screenshot of the language picker
You may request new languages, and contribute, at https://pontoon.mozilla.org/projects/firefox-profiler/ after signing in with a Firefox Account.

Markers

A number of improvements happened around markers: how they’re collected, some new useful markers that were added around Firefox, and how they are better presented.

  • Instant markers (those without duration) with the same name are now grouped in the top line, instead of being sprinkled semi-randomly among interval markers. And they are all displayed as diamond shapes, making it easier to notice multiple adjacent markers.
  • Interval markers are now all displayed as rectangles with a visible darker outline, which better indicates if and where they start and stop. Very short markers that fit in 1 pixel look like dark vertical lines.
    Examples of different marker displays
  • Inter-Process Communication (IPC) markers are now captured from any thread, even those not currently being profiled, in order to give a more complete picture of interactions between Firefox processes. These messages have become more numerous and important since site isolation was improved by using separate Operating System processes for each web server address – so that web servers cannot spy on each other through a common web page.
    IPC markers also have a new context menu option, to select the other thread on the other side of the IPC message channel.
    IPC marker context menu showing "Select the send thread", and from there an arrow pointing at the corresponding marker for the IPC that was sent.
  • "Awake": In most threads, this shows the actual time spent running tasks (as opposed to waiting for these tasks). It also includes how much CPU is spent, and where available which CPU core ran it, and at which priority level. These markers are especially useful to see how often and when a thread woke up.
    "Awake" marker tooltip, with duration, CPU Id, and priority details
  • "DLLLoad": Time spent loading DLLs on Windows.
  • "Set/Sample/Clear Animation" markers on the compositor thread.
  • DOMEvent and CSS Animation/Transition markers now show the target of the event/animation
  • Screenshot marker tooltips now show the actual screenshot.
  • New context menu option “Copy page URL”.

Other improvements

  • We now support profiling of private browsing windows. Previously we were disabling the profiler as soon as a private window was opened. Now we profile them but mark everything as coming from a private window, so that we can remove the data if the profile is shared (which is the default).
  • Per-process CPU utilization. The feature is still marked as experimental but should be usable. Enable it in about:profiling, and when viewing the profile, open the JavaScript console and run experimental.enableProcessCPUTracks() to show the graphs.
    In particular, it can highlight work done in the whole process, which may not happen in the currently-visible threads; If you notice some high values there, but cannot find the corresponding work in individual threads, consider selecting more threads to profile before re-running the profiler.
    For example, in the following screenshot the zone marked ① looks idle in the main thread; but the “Process CPU” reveals that there was significant activity at ② in this process, and a bit of exploring hidden threads found that the “StyleThreads” at ③ were the ones working hard at that time.
    Profile with Process CPU track, see text above for explanation
  • Ability to capture stack traces on all versions of Firefox for macOS. It used to only work on some pre-release channels.
  • Profiling data from early times during a process startup used to be split into their own tracks (annotated with “pre-xul”); they are now combined with the correct threads past this early startup phase.
  • The memory track’s tooltip now shows the number of memory operations performed between two samples.
  • Animations were removed from various places for users with prefer-reduced-motion.
  • Profiles captured from Linux perf now distinguish kernel and user frames with different colors. (Thank you to contributor Dave Rigby.)
  • Experimental support for capturing actual power usage (not just CPU utilization times) on some platforms (Windows 11 and Apple Silicon as of this writing). This work is still progressing, and will bring another set of measurements that are very important to improve the performance of Firefox and websites.
  • Miscellaneous internal performance improvements of the Profiler itself, to better handle capturing and displaying more and more data.

Meet the team, and get help

If you profiled something and are puzzled with the profile you captured, we have the Joy of Profiling (#joy-of-profiling:mozilla.org) channel where people share their profiles and get help from the people who are more familiar with the Firefox Profiler. If you’re interested in the Profiler itself, our main channel is https://matrix.to/#/#profiler:mozilla.org.

We also have Joy of Profiling Open Sessions where people bring their profile and we try to analyze the profile together over Zoom. See the Performance Office Hours calendar.

And if you would just like to see some Firefox developers analyze profiles, watch Mike Conley and friends in the original Joy of Profiling, “unscripted, unplanned, uncensored, and true to life”.

 

Until next time, Happy Profiling!

Firefox Nightly: These Weeks In Firefox: Issue 121

Highlights

Friends of the Firefox team

Resolved bugs (excluding employees)

Volunteers that fixed more than one bug

  • Janvi Bajoria [:janvi01]
  • sayuree
  • Shane Hughes [:aminomancer]

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions

WebExtensions Framework
    • As part of the ongoing ManifestVersion 3 (MV3) work:
        • The new unified toolbar button is meant to replace the browserAction toolbar buttons, but it does not yet cover all the features that those buttons currently provide. The new toolbar button is currently only enabled when the “extensions.unifiedExtensions.enabled” preference is explicitly set to true in about:config.
        • A huge shout out to both Itiel and James Teh for the great support they provided in the reviews for the unified extensions button!
      • Event Pages (non-persistent background pages): In Firefox >= 104, the event page will not be terminated if there is an active connection to a native app through the native messaging APIs (browser.runtime.connectNative and browser.runtime.sendNativeMessage) – Bug 1770696
      • New web_accessible_resources manifest field syntax: In Firefox >= 104, the “matches” properties of the web_accessible_resources entries in the manifest_version 3 format support the “<all_urls>” host permission string – Bug 1776841
WebExtension APIs
  • In Firefox >= 104, restricted schemes (e.g. “resources” or “chrome” schemes) are allowed in the scripting.registerContentScripts API when called from a (MV2) privileged extension – Bug 1756758
  • Fixed a bug with browser.extension.getViews and preloaded browserAction popup pages – Bug 1780008
  • Starting from Firefox 104, the history WebExtensions API will use the internal PlacesUtils.history async API
    • Some extensions (e.g. the DownThemAll! add-on) were calling the history API during startup, so this fix will also result in a startup performance improvement for Firefox users who have extensions using this API and blocking web requests triggered during Firefox startup.
    • Huge shout out to Emilio for investigating and fixing this issue!

Developer Tools

Toolbox
  • Many thanks to arai for helping us with instant evaluation issues (bug & bug).
    • Those patches will prevent the following expressions from being instantly evaluated as the user is typing:
      • [1, 2, 3].map(alert)
      • Function.prototype.call.bind(Function.prototype.call)(alert);
  • We fixed a bug for log points in the Browser Toolbox where the logs would only appear if the “Show Content Messages” setting was checked (bug)
  • Julian fixed an issue with debugging addons + reloading (easily triggered when using webext to write your extension) (bug)
    • Uplifted to ESR
  • Bomsy worked on a couple of things in the Netmonitor which should improve memory usage and performance
    • Network monitoring is now disabled in the Browser Toolbox until the user selects the  Netmonitor (bug)
    • We now properly clear data when the request list is cleared (bug)
  • Ochameau is still making progress on Debugger source tree stability and performance (bug, bug, bug and bug, showing decent performance improvements to the Browser Toolbox: DAMP results)
WebDriver BiDi

ESMification status

Lint, Docs and Workflow

Picture-in-Picture

Performance

  • hiro and florian closed out this bug, which could cause the compositor to run at 60Hz at all times on Windows, even when nothing is animating!
  • emilio made chrome windows support document.visibilityState, which means the refresh driver is now throttled in fully occluded background windows.

Performance Tools (aka Firefox Profiler)

  • Removed the timeline graph type radio buttons (#4147)
    • The "before" state of the Profiler UI is shown, with a section showing the radio buttons for changing the graph type between "Categories with CPU", "Categories" and "Stack height".

      Before

    • The "after" state of the Profiler UI is shown. The section showing the graph type radio buttons is gone.

      After!

    • If you are a power user and would like to use another timeline graph type, you can call window.toggleTimelineType from the devtools console with various types. See the console message in profiler.firefox.com for more details.
  • The Profiler no longer crashes when the profile data is too big; instead, we discard only the profile of the child process that was too big, and we log error messages in the JSON file. It’s visible in profile.profileGatheringLog from the console. (Bug 1779685, Bug 1758643, Bug 1779367)
  • Added a power profiling setting to the profiler popup (power usage data in Watts is available only on Windows 11 with Intel CPUs and on Apple Silicon, but the preset can still be used elsewhere for low-overhead profiling of what’s causing thread wake-ups) (Bug 1778282). You can change the profiler setting either via the profiler popup or about:profiling.
    • The Profiler settings UI showing a list of radio buttons. Each radio button sets the Profiler into a preset configuration. A new configuration is highlighted for profiling Power Usage.

      This will be very handy for finding power consumption optimizations!

  • Added profiler sub-category for Wasm frames (Bug 1780383)
  • Added doc for local profiling on Android with screenshots. Here’s the link to the doc. (#4145)
  • Hid the user interface components showing stacks for tracks that don’t have stack samples (#4133)

Search and Navigation

Mozilla Open Policy & Advocacy Blog: Mozilla submits comments in OSTP consultation on privacy-preserving data sharing

Earlier this month, the US Office of Science and Technology Policy (OSTP) asked stakeholders to contribute to the development of a national strategy for “responsibly harnessing privacy-preserving data sharing and analytics to benefit individuals and society.” This effort offers a much-needed opportunity to advance privacy in online advertising, an industry that has not seen improvement in many years.

In our comments, we set out the work that Mozilla has undertaken over the past decade to shape the evolution of privacy preserving advertising, both in our products, and in how we engage with regulators and standards bodies.

Mozilla has often outlined that the current state of the web is not sustainable, particularly how online advertising works today. The ecosystem is broken. It’s opaque by design, rife with fraud, and does not serve the vast majority of those who depend on it – most importantly, the people who use the open web. The ways in which advertising is conducted today – through pervasive tracking, serial privacy violations, market consolidation, and lack of transparency – are not working and cause more harm than good.

At Mozilla, we’ve been working to drive the industry in a better direction through technical solutions. However, technical work alone can’t address disinformation, discrimination, societal manipulation, privacy violations, and more. A complementary regulatory framework is necessary to mitigate the most egregious practices in the ecosystem and ensure that the outcomes of such practices (discrimination, electoral manipulation, etc.) are untenable under law rather than due to selective product policy enforcement.

Our vision is a web which empowers individuals to make informed choices without their privacy and security being compromised.  There is a real opportunity now to improve the privacy properties of online advertising. We must draw upon the internet’s founding principles of transparency, public participation, and innovation. We look forward to seeing how OSTP’s national strategy progresses this vision.

The post Mozilla submits comments in OSTP consultation on privacy-preserving data sharing appeared first on Open Policy & Advocacy.

Mozilla Thunderbird: Thunderbird Time Machine, 2003: A Look Back At Thunderbird 0.1

Let’s take a walk down memory lane to the summer of 2003. Linkin Park, 50 Cent, and Evanescence have top-selling new albums. Apple’s iPod hasn’t even sold 1 million units. Mozilla’s new web browser used to be called Phoenix, but now it’s called Firebird. And a new cross-platform, open-source application called Thunderbird has debuted from the foundations of Mozilla Mail.

Because the entirety of Thunderbird’s releases and corresponding release notes have been preserved, I’ve started a self-guided tour of Thunderbird’s history. Why? A mixture of personal and technical curiosity. I used Thunderbird for a couple years in the mid-2000s, and again more recently, but there are giant gaps in my experience. So I’m revisiting every single major version to discover the nuances between releases; the changes big and small.

(If you ever get the craving to do the same, I’ve found the easiest operating system to use is Windows, preferably inside a virtual machine. Early versions of Thunderbird for Macs were built for PowerPC architecture, while early Linux versions were 32-bit only. Both may cause you headaches with modern PC hardware!)

3-Pane Mail Layout: A Solid Foundation!

Below is my screenshot of Thunderbird 0.1 running on a Windows 11 virtual machine.

The first thing you’re probably thinking is “well, not much has changed!” With respect to the classic 3-pane mail presentation, you’re absolutely right! (Hey, why mess with a good thing?)

A screenshot of Thunderbird 0.1 from 2003, running on modern hardware and Windows 11.

Thousands of changes have been made to the client between Thunderbird 0.1 and Thunderbird 102, both under the hood and cosmetically. But it’s clear that Thunderbird started with a strong foundation. And it remains one of the most flexible, customizable applications you can use.

Something else stands out about that screenshot above: the original Thunderbird logo. Far removed from the modern, flat, circular logo we have today, this original logo simply took the Mozilla Phoenix/Firebird logo and gave it a blue coat of paint:

The original Mozilla Thunderbird (top) and Mozilla Phoenix (bottom) logos

Thunderbird 0.1 Release Notes: “Everything Is New”

Back in 2003, much of what we take for granted in Thunderbird now was actually groundbreaking. Things like UI extensions to extend functionality, and user-modifiable theming were forward-thinking ideas. For a bit of historical appreciation, here are the release notes for Thunderbird 0.1:

  • Customizable Toolbars and Mail 3-pane: Toolbars can be customized the way you want them. Choose View / Toolbars / Customize inside any window. Mozilla Thunderbird also supports a new vertical 3-pane configuration (Tools / Options / General), giving you even more choice in how you want to view your mail.
  • Extensions: UI extensions can be added to Mozilla Thunderbird to customize your experience with specific features and enhancements that you need. Extensions allow you to add features particular to your needs such as offline mail support. A full list of available extensions can be found here.
  • Contacts Manager: A contacts sidebar for mail compose makes it easy and convenient to add address book contacts to emails.
  • Junk Mail Detection: In addition to automatically detecting junk mail using the same method as Mozilla Mail, Thunderbird also sanitizes HTML in mail marked as junk in order to better protect your privacy and give peace of mind when viewing a message identified as junk.
  • New default theme: Mozilla Thunderbird 0.1 sports a crisp, fresh and attractive theme, based on the amazing Qute theme by Arvid Axelsson. This is the same theme used by Mozilla Firebird, giving Thunderbird a similar look and feel. Thunderbird also supports a growing number of downloadable themes which alter the appearance of the client.
  • Stream-lined user interface and simplified Options UI.
  • Integrated spell checker.

Next Time, Inside The Thunderbird Time Machine…

A fictitious entry from a Livejournal page, circa December 2004:

“I had a super productive weekend! Finally finished Half-Life 2 and cannot wait for the sequel! I also upgraded my Dell Inspiron 7000 laptop from Windows 98 to Windows XP, so it’s time to install Firefox 1.0 and Thunderbird 1.0. Looking forward to trying this new open-source software!”

Thunderbird is the leading open-source, cross-platform email and calendaring client, free for business and personal use. We want it to stay secure and become even better. Donations allow us to hire developers, pay for infrastructure, expand our userbase, and continue to improve.

Click here to make a donation

The post Thunderbird Time Machine, 2003: A Look Back At Thunderbird 0.1 appeared first on The Thunderbird Blog.

Data@Mozilla: This Week in Data: Python Environment Freshness

(“This Week in Glean Data” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)

By: Perry McManis and Chelsea Troy

A note on audience: the intended reader for this post is a data scientist or analyst, product owner or manager, or similar who uses Python regularly but has not had the opportunity to work with engineering processes to the degree they may like. Experienced engineers may still benefit from the friendly reminder to keep their environments fresh and up-to-date.

When was the last time you remade your local Python environment? One month ago? Six months ago? 1997?

Wait, please, don’t leave. I know, I might as well have asked you when the last time you cleaned out the food trap in your dishwasher was and I apologize. But this is almost as important. Almost.

If you don’t recall when, go ahead and check when you made your currently most used environment. It might surprise you how long ago it was.

# See this helpful stack overflow post by Timur Shtatland: https://stackoverflow.com/a/69109373
Mac: conda env list -v -v -v | grep -v '^#' | perl -lane 'print $F[-1]' | xargs /bin/ls -lrtd
Linux: conda env list | grep -v '^#' | perl -lane 'print $F[-1]' | xargs ls -lrt1d
Windows: conda env list
# Find the top level directory of your envs, e.g. C:\Users\yourname\miniconda3\envs
Windows: dir /T:C C:\Users\yourname\miniconda3\envs

Don’t feel bad though, if it does surprise you, or the answer is one you’d not admit publicly. Python environments are hard. Not in the everything is hard until you know how way, but in the why doesn’t this work? This worked last week! way. And the impetus is often to just not mess with things. Especially if you have that one environment that you’ve been using for the last 4 years, you know the one you have propped up with popsicle sticks and duct tape? But I’d like to propose that you consider regularly remaking your environments, and you build your own processes for doing so.

It is my opinion that if you can, you should be working in a fresh environment.

Much like the best by date, what is fresh is contextual. But if you start getting that when did I stand this env up? feeling, it’s time. Working in a fresh environment has a few benefits. Firstly, it makes it more likely that other folks will be able to easily duplicate it. Similarly to how providing an accurate forecast becomes increasingly difficult as you go further into the future, as you get further away from the date you completed a task in a changing ecosystem, the less likely it is that task can be successfully completed again.

Perhaps even more relevant is that packages often release security updates, APIs improve, functionality that you originally had to implement yourself may even get an official release. Official releases, especially for higher level programming languages like Python, are often highly optimized. For many researchers, those optimizations are out of the scope of their work, and rightly so. But the included version of that calculation in your favorite stats package will not only have several engineers working on it to make it run as quickly as possible, now you have the benefit of many researchers testing it concurrently with you.

These issues can collide spectacularly in cases where people get stuck trying to replicate your environment due to a deprecated version of a requirement. And if you never update your own environment, it could take someone else bringing it up to you to even notice that one of the packages you are using is no longer available, or an API has been moved from experimental to release, or removed altogether.

There is no best way of making fresh environments, but I have a few suggestions you might consider.

I will preface by saying that my preference is for command line tools, and these suggestions reflect that. Using graphical interfaces is a perfectly valid way to handle your environments, I’m just not that familiar with them, so while I think the ideas of environment freshness still apply, you will have to find your own way with them. And more generally, I would encourage you to develop your own processes anyway. These are more suggestions on where to start, and not all of them need find their way into your routines.

If you are completely unfamiliar with these environments, and you’ve been working in your base environment, I would recommend in the strongest terms possible that you immediately back it up. Python environments are shockingly easy to break beyond repair and tend to do so at the worst possible time in the worst possible way. Think live demo in front of the whole company that’s being simulcast on youtube. LeVar Burton is in the audience. You don’t want to disappoint him, do you? The easiest way to quickly make a backup is to create a new environment through the normal means, confirm it has everything you need in it, and make a copy of the whole install folder of the original.
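
As a rough sketch of that backup step (the environment name and the ~/miniconda3 path are assumptions; adjust them for your own setup):

# Stand up a replacement environment through the normal means
conda create -n base_check python
conda activate base_check
# Install and spot-check the packages you rely on here before trusting it
conda deactivate
# Keep a copy of the whole install folder of the original
cp -r ~/miniconda3 ~/miniconda3.bak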

If you’re not in the habit of making new environments, the next time you need to do an update for a package you use constantly, consider making an entirely new environment for it. As an added bonus this will give you a fallback option in case something goes wrong. If you’ve not done this before, one of the easiest ways is to utilize pip’s freeze function.

# Export the packages from the current environment (pip list avoids local file paths)
pip list --format=freeze > requirements.txt
# Create and activate a fresh environment
conda create -n {new env name}
conda activate {new env name}
# Reinstall everything, then upgrade the package you needed to update
pip install -r requirements.txt
pip install {package} --upgrade

When you create your requirements.txt file, it’s usually a pretty good idea to go through it. A common gotcha is that you might see local file paths in place of version numbers. That is why we used pip list here. But it never hurts to check.

Take a look at your version numbers, are any of these really out of date? That is something we want to fix, but often some of our important packages have dependencies that require specific versions and we have to be careful not to break those dependencies. But we can work around that while getting the newest version we can by removing those dependencies from our requirements file and installing our most critical packages separately. That way we let pip or conda get the newest versions of everything that will work. For example, if I need pandas, and I know pandas depends on numpy, I can remove both from my requirements document and let pip handle my dependencies for me.

# Make sure pip itself is up to date
pip install --upgrade pip
# Install everything else (pandas and numpy removed from requirements.txt)
pip install -r requirements.txt
# Let pip resolve the newest pandas and a compatible numpy for us
pip install pandas

Something you may notice is that this block looks like it should be something that could be packaged up since it’s just a collection of commands. And indeed it can. We can put this in a shell script and, with a bit of work, add a command line option to more or less fire off a new environment for us in one go. This can also be expanded with shell commands for cases where we may need a compiler, a tool from another language, even a GitHub repo, etc. Assuming we have a way to run shell scripts, let’s call this create_env.sh:

conda deactivate
conda create -n $1
conda activate $1

apt install gcc
apt install g++

pip install --upgrade pip
pip install pystan==2.19.1.1
python3 -m pip install prophet --no-cache-dir

pip install -r requirements.txt
pip install scikit-learn

git clone https://github.com/mozilla-mobile/fenix.git

cd ./fenix

echo "Finished creating new environment: $1"

And by adding some flag handling (sketched below), we can now use bash to call sh create_env.sh newenv and be ready to go.
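
Here is one possible shape for that flag handling at the top of create_env.sh (the usage message is illustrative, not part of the original script):

# Bail out early if no environment name was given
if [ -z "$1" ]; then
  echo "Usage: sh create_env.sh <new env name>"
  exit 1
fi
echo "Creating new environment: $1"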

It will likely take some experimentation the first time or two. But once you know the steps you need to follow, getting new environment creation down to just a few minutes is as easy as packaging your steps up. And if you want to share, you can send your setup script rather than a list of instructions. Including this in your repository with a descriptive name and a mention in your README.md is a low effort way to help other folks get going with less friction.

There are of course tons of other great ways to package environments, like Docker. I would encourage you to read into them if you are interested in reproducibility beyond the simpler case of just rebuilding your local environment with regularity. There are a huge number of fascinating and immensely powerful tools out there to explore, should you wish to bring even more rigor to your Python working environments.

In the end, the main thing is to work out a fast and repeatable method that enables you to get your environment up and going again quickly from scratch. One that works for you. And then, when you get the feeling that your environment has been around for a while, you won’t have to worry about making a new environment being an all-day or, even worse, all-week affair. By investing in your own process, you will save yourself loads of time in the long run, you may even save your colleagues some, and hopefully you will spare yourself some frustration, too.

Like anything, the key to working out your process is repetitions. The first time will be hard, though maybe some of the tips here can make it a bit easier. But the second time will be easier. And after a handful, you will have developed a framework that will allow you to make your work more portable, more resilient and less angering, even beyond Python.

Support.Mozilla.Org: Introducing Smith Ellis

Hi everybody,

I’m so happy to introduce our latest addition to the Customer Experience team. Smith Ellis is going to join forces with Tasos and Ryan to develop our support platform. It’s been a while since we got more than 2 engineers on the team, so I’m personally excited to see what we can unlock with more engineers.

Here’s a bit of an intro from Smith:

Hello Mozillians!  I’m Smith Ellis, and I’m joining the Customer Experience team as a Software Engineer. I’m more than happy to be here. I’ve held many technical and management roles in the past and have found that doing work that makes a difference is what makes me happy. My main hobbies are electronics, music, video games, programming, welding and playing with my kids.

I look forward to meeting you and making a difference with Mozilla.

Please join me in congratulating and welcoming Smith into our SUMO family!