Martin Thompson: Bundling for the Web

The idea of bundling is deceptively simple. Take a bunch of stuff and glom them into a single package. So why is it so difficult to teach the web how to bundle?

The Web already does bundling

A bundled resource is a resource that composes multiple pieces of content. Bundles can consist of content of a single type or of mixed types.

Take something like JavaScript[1]. A very large proportion of the JavaScript content on the web is bundled today. If you haven’t bundled, minified, and compressed your JavaScript, you have left easy performance wins unrealized.

HTML is a bundling format in its own right, with inline JavaScript and CSS. Bundling other content is also possible with data: URIs, even if this has some drawbacks.

Then there are CSS preprocessors, which provide bundling options, image spriting, and myriad other hacks.

And that leaves aside the whole mess of zipfiles, tarballs, and self-extracting executables that are used for a variety of Web-adjacent purposes. Those matter too, but they are generally not Web-visible.

Why we might want bundles

What is immediately clear from this brief review of available Web bundling options is that they are all terrible in varying degrees. The reasons are varied, and a close examination of them is probably not worthwhile.

It might be best just to view this as the legacy of a system that evolved in piecemeal fashion; an evolutionary artifact along a dimension that nature did not regard as critical to success.

I’m more interested in what reasons we might have for improving the situation. There are reasons in support of bundling, but I doubt that introducing native support for bundling technology will fundamentally change the way Web content is delivered.

Expanding the set of options for content delivery could still have value for some use cases or deployment environments.

In researching this, I was reminded of work that Jonas Sicking did to identify use cases. There are lots of reasons and requirements that are worth looking at. Some of the reasoning is dated, but there is a lot of relevant material, even five years on.

Efficiency

One set of touted advantages for bundling relates to performance and efficiency. Today, we have a better understanding of the ways in which performance is affected by resource composition, so this has been narrowed down to two primary features: compression efficiency and reduced overheads.

Compression efficiency can be dramatically improved if similar resources are bundled together. This is because the larger shared context results in more repetition and gives a compressor more opportunities to find and exploit similarities.
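
To make this concrete, here is a minimal Node.js sketch (the file names are hypothetical) that compares compressing two similar scripts separately against compressing them concatenated as a single bundle; with enough shared code, the bundled form generally compresses smaller:

const zlib = require('zlib');
const fs = require('fs');

// Two hypothetical resources that share a lot of code.
const a = fs.readFileSync('article.js');
const b = fs.readFileSync('home.js');

const separate = zlib.gzipSync(a).length + zlib.gzipSync(b).length;
const bundled = zlib.gzipSync(Buffer.concat([a, b])).length;

// With enough shared context, `bundled` typically comes out noticeably
// smaller than `separate`, because the compressor can reference repeated
// sequences across both inputs.
console.log({ separate, bundled });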

Bundling is not the only way to achieve this. Alternative methods of attaining compression gains have been explored, such as SDCH and cross-stream compression contexts for HTTP/2. Prototypes of the latter showed immense improvements in compression efficiency and corresponding performance gains. However, general solutions like these have not been successful in finding ways to manage operational security concerns.

Bundling could also reduce overheads. While HTTP/2 and HTTP/3 reduce the cost of making requests, those costs still compound when multiple resources are involved. The claim here is that internal handling of individual requests in browsers has inefficiencies that are hard to eliminate without some form of bundling.

I find it curious that protocol-level inefficiencies are not blamed here, but rather inter-process communication between internal browser processes. Not having examined this closely, I can’t really speak to these claims, but they are quite credible.

What I do know is that performance in this space is subtle. When we were building HTTP/2, we found that performance was highly sensitive to the number of requests that could be made by clients in the first few round trips of a connection. The way that networking protocols work means that there is very limited space for sending anything early in a connection[2]. The main motivation for HTTP header compression was that it allowed significantly more requests to be made early in a connection. By reducing request counts, bundling might do the same.

One of the other potential benefits of bundling is in eliminating additional round trips. When content is requested, a bundle can carry resources that the client does not yet know it needs. Without bundling, a resource that references another resource adds an additional round trip, as the first resource needs to be fetched before the second one is even known to the client.

Again, experience with HTTP/2 suggests that performance gains from sending extra resources are not easy to obtain. Sending unrequested resources is exactly what HTTP/2 server push promised, but as we have learned, the wins are not easy to realize. Attempts to improve performance with server push produced mixed results and sometimes large regressions. The problem is that servers are unable to accurately predict when to push content, so they push data that is not needed. To date, no studies have shown reliable strategies that servers can use to improve performance with server push.

The uncertainty regarding server push performance means that compression gains and reductions in overhead are the primary focus of current performance-seeking uses of bundles. These together might be enough to counteract the waste of delivering unwanted resources.

I personally remain lukewarm on using bundling as a performance tool. Shortcomings in protocols — or implementations — seem like they could be addressed at that level.

Ergonomics

The use of bundlers is an established practice in Web development. Being able to outsource some of the responsibility for managing the complexities of content delivery is no doubt part of the appeal.

The value of being able to compose complex content into a single package should not be underestimated.

Bundling of content into a single file is a property common to many systems. Providing a single item to manage with a single identity simplifies interactions. This is how we expect content of all kinds to be delivered, whether it is applications, books, libraries, or any other sort of digital artifact. The Web here is something of an aberration in that it resists the idea that parts of it can be roped off into a discrete unit with a finite size and name.

Though this usage pattern might be partly attributed to path dependence, the usability benefits of individual files cannot be so readily dismissed. Being able to manage bundles as a single unit where necessary, while still identifying the component pieces, is likely to be a fairly large gain for developers.

For me, this reason might be enough to justify using bundles, even despite some of their drawbacks.

Why we might not want bundles

The act of bundling subsumes the identity of each piece of bundled content into the identity of the bundle that is formed. This produces a number of effects, some of them desirable (as discussed), some of them less so.

As far as effects go, whether they are valuable or harmful might depend on context and perspective. Some of these effects might simply be managed as trade-offs, with site or server developers being able to choose how content is composed in order to balance various factors like total bytes transferred or latency.

If bundling only represented trade-offs that affected the operation of servers, then we might be able to resolve whether the feature is worth pursuing on the grounds of simple cost-benefit. Where things get more interesting is where choices might involve depriving others of their own choices. Balancing the needs of clients and servers is occasionally necessary. Determining the effect of server choices on clients — and the people they might act for — is therefore an important part of any analysis we might perform.

Cache efficiency and bundle composition

Content construction and serving infrastructure generally operates with imperfect knowledge of the state of caches. Not knowing what a client might need can make it hard to know what content to serve at any given point in time.

Optimizing the composition of the bundles used on a site for clients with a variety of cache states can be particularly challenging if caches operate at the granularity of resources. Clients that have no prior state might benefit from maximal bundling, which allows better realization of the aforementioned efficiency gains.

On the other hand, clients that have previously received an older version of the same content might only need to receive updates for those things that have changed. Similarly, clients that have previously received content for other pages might already hold some of the same content. In both cases, receiving copies of content that was already transferred might negate any efficiency gains.

This is a problem that JavaScript bundlers have to deal with today. As an optimization problem it is made difficult by the combination of poor information about client state with the complexity of code dependency graphs and the potential for clients to follow different paths through sites.

For example, consider the code that is used on an article page on a hypothetical news site and the code used on the home page of the same site. Some of that code will be common, if we make the assumption that site developers use common tools. Bundlers might deal with this by making three bundles: one of common code, plus one each of article and home page code. For a very simple site like this, that allows all the code to be delivered in just two bundles on either type of page, plus an extra bundle when navigating from an article to the home page or vice versa.
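
As a rough sketch of how that three-bundle split might be configured, a webpack-style configuration could look something like the following; the entry names and thresholds are illustrative rather than taken from any real site:

// Hypothetical bundler configuration producing common, article, and home bundles.
module.exports = {
  entry: {
    article: './src/article.js',
    home: './src/home.js',
  },
  optimization: {
    splitChunks: {
      cacheGroups: {
        common: {
          name: 'common',
          chunks: 'all',
          minChunks: 2, // code used by both pages lands in the common bundle
        },
      },
    },
  },
};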

As the number of different types of page increases, splitting code into multiple bundles breaks down. The number of bundle permutations can increase much faster than the number of discrete uses. In the extreme, the number of bundles could end up being factorial on the number of types of page, limited only by the number of resources that might be bundled. Of course, well before that point is reached, the complexity cost of bundling likely exceeds any benefits it might provide.

To deal with this, bundlers have a bunch of heuristics that balance the costs of providing too much data in a bundle for a particular purpose, against the costs of potentially providing bundled data that is already present. Some sites take this a little further and use service workers to enhance browser caching logic[3].
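
As an illustration of the kind of service-worker enhancement mentioned above (and in footnote 3), a rough sketch might serve a cached bundle immediately while refreshing it in the background; the cache name and the .js filter are assumptions:

// Minimal stale-while-revalidate-style handler for script bundles.
self.addEventListener('fetch', (event) => {
  if (!event.request.url.endsWith('.js')) {
    return; // let other requests go to the network as usual
  }
  event.respondWith(
    caches.open('bundles-v1').then(async (cache) => {
      const cached = await cache.match(event.request);
      const refresh = fetch(event.request).then((response) => {
        cache.put(event.request, response.clone());
        return response;
      });
      // Use the cached copy if present; otherwise wait for the network.
      return cached || refresh;
    })
  );
});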

It is at this point that you might recognize an opportunity. If clients understood the structure of bundles, then maybe they could do something to avoid fetching redundant data. Maybe providing a way to selectively request pieces of bundles could reduce the cost of fetching bundles when parts of the bundle are already present. That would allow the bundlers to skew their heuristics more toward putting stuff in bundles. It might even be possible to tune first-time queries this way.

The thing is, we’ve already tried that.

A standard for inefficient caching

There is a long history in HTTP of failed innovation when it comes to standardizing improvements for cache efficiency. Though cache invalidation is recognized as one of the hard problems in computer science, there are quite a few examples of successful deployments of proprietary solutions in server and CDN infrastructure.

A few caching innovations have made it into HTTP over time, such as the recent immutable Cache-Control directive. That particular solution is quite relevant in this context due to the way that it supports content-based URI construction, but it is still narrower in applicability than a good solution in this space might need.

More general solutions that aim to improve the ability to eliminate wasted requests in a wider range of cases are more difficult. The cache digests proposal is notable here in that it got a lot further, going through several revisions in the IETF working group process. It still failed.

If the goal of failing is to learn, then this too was a failure: it died largely for the most ignominious of reasons, a lack of interest. Claims from clients that cache digests are too expensive to implement are credible here, but not entirely satisfactory in light of the change to use Cuckoo filters in later versions and recent storage partitioning work.

The point of this little digression is to highlight the inherent difficulties in trying to standardize enhancements to caching on the Web. My view is that it would be unwise to attempt to tackle such a difficult problem as part of introducing a new feature. I would be surprised if the success of bundling depends on finding a solution to this problem; if it does, that might suggest that the marginal benefit of bundling — for performance — is not sufficient to justify the effort[4].

Erasing resource identity

An issue that was first[5] raised by Brave is that the use of bundles creates opportunities for sites to obfuscate the identity of resources. The thesis is that bundling could confound content-blocking techniques by making it easier to rewrite identifiers.

For those who rely on the identity of resources to understand the semantics and intent of the identified resource, there are some ways in which bundling might affect their decision-making. The primary concern is that references between resources in the same bundle are fundamentally more malleable than other references. As the reference and its target are in the same place, it is trivial, at least in theory, to change the identifier.

Brave and several others are therefore concerned that bundling will make it easier for sites to defeat URL-based classification of resources. In the extreme, identifiers could be rewritten for every request, negating any attempt to use those identifiers for classification.

One of the most interesting properties of the Web is the way that it insinuates a browser - and user agency - into the process. The way that happens is that the Web[6] is structurally biased toward functioning better when sites expose semantic information to browsers. This property, that we like to call semantic availability, is what allows browsers to be opinionated about content rather than acting as a dumb pipe[7].

Yes, it’s about ad blockers

Just so that this is clear, this is mostly about blocking advertising.

While more advanced ad blocking techniques also draw on contextual clues about resources, those methods are more costly. Most ad blocking decisions are made based on the URL of resources. Using the resource identity allows the ad blocker to prevent the load, which not only means that the ad is not displayed, but also that the resources needed to retrieve it are not spent[8].
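
A simplified sketch of that kind of URL-based decision follows; real content blockers rely on large, regularly updated filter lists and much more elaborate matching, so the patterns and domain here are purely illustrative:

// Toy URL classifier: block the request before any bytes are fetched.
const blockedPatterns = [
  /https?:\/\/([^/]+\.)?adserver\.example\//,
  /\/ads\.js(\?|$)/,
];

function shouldBlock(url) {
  return blockedPatterns.some((pattern) => pattern.test(url));
}

shouldBlock('https://cdn.adserver.example/tag/ads.js'); // true: never fetched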

While many people might choose to block ads, sites don’t like being denied the revenue that advertising provides. Some sites already use techniques that are designed to show advertising to users of ad blockers, so it is not unreasonable to expect tools to be used to prevent classification.

It is important to note that this is not a situation that requires absolute certainty. The sorry state of Web privacy means that we have a lot of places where various forces are in tension or transition. The point of Brave’s complaint here is not that bundling outright prevents the sort of classification they seek, but that it changes the balance of system dynamics by giving sites another tool that they might employ to avoid classification.

Of course, when it is a question of degree, we need to discuss and agree how much the introduction of such a tool affects the existing system. That’s where this gets hard.

Coordination artifacts

As much as these concerns are serious, I tend to think that Jeffrey Yasskin’s analysis of the problem is correct. That analysis essentially concludes that the reason we have URIs is to facilitate coordination between different entities. As long as there is a need to coordinate between the different entities that provide the resources that might be composed into a web page, that coordination will expose information that can be used for classification.

That is, to the extent to which bundles enable obfuscation of identifiers, that obfuscation relies on coordination. Any coordination that would enable obfuscation with bundling is equally effective and easy to apply without bundling.

Single-page coordination

Take a single Web page. Pretend for a moment that the web page exists in a vacuum, with no relationship to other pages at all. You could take all the resources that comprise that page and form them into a single bundle. As all resources are in the one place, it would be trivial to rewrite the references between those resources. Or, the identity of resources could be erased entirely by inlining everything. If every request for that page produced a bundle with a different set of resource identifiers, it would be impossible to infer anything about the contents of resources based on their identity alone.

Unitary bundles for every page are an extreme that is almost certainly impractical. If sites were delivered this way, there would be no caching, which means no reuse of common components. Using the Web would be virtually intolerable.

Providing a strong incentive to deploy pages as discrete bundles — something Google Search has done to enable preloading search results for cooperating sites — could effectively force sites to bundle in this way. Erasing or obfuscating internal links in these bundles does seem natural at this point, if only to try to reclaim some of the lost performance, but that assumes an unnatural pressure toward bundling[9].

Absent perverse incentives, sites are often built from components developed by multiple groups, even if that is just different teams working at the same company. To the extent that teams operate independently, they need to agree on how they interface. The closer the teams work together, and the more tightly they are able to coordinate, the more flexible those interfaces can be.

There are several natural interface points on the Web. Of these the URL remains a key interface point[10]. A simple string that provides a handle for a whole bundle[11] of collected concepts is a powerful abstraction.

Cross-site coordination

Interfaces between components therefore often use URLs, especially once cross-origin content is involved. For widely-used components that enable communication between sites, URLs are almost always involved. If you want to use React, the primary interface is a URL:

<script src="https://unpkg.com/react@17/umd/react.production.min.js" crossorigin></script>
<script src="https://unpkg.com/react-dom@17/umd/react-dom.production.min.js" crossorigin></script>

If you want add Google analytics, there is a bit of JavaScript[12] as well, but the URL is still key:

<script async src="https://www.googletagmanager.com/gtag/js?id=$XXX"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());
  gtag('config', '$XXX');
</script>

The same applies to advertising.

The scale of coordination required to change these URLs is such that changes cannot be effected on a per-request basis; they take months, if not years[13].

Even for resources on the same site, a version of the same coordination problem exists. Content that might be used by multiple pages will be requested at different times. At a minimum, changing the identity of resources would mean forgoing any reuse of cached resources. Caching provides such a large performance advantage that I can’t imagine sites giving that up.

Even if caching were not incentive enough, I suggest that the benefits of stability of references are enough to ensure that identifiers don’t change arbitrarily.

Loose coupling

As long as loose coupling is a feature of Web development, the way that resources are identified will remain a key part of how the interfaces between components is managed. Those identifiers will therefore tend to be stable. That stability will allow the semantics of those resources to be learned.

Bundles do not change these dynamics in any meaningful way, except to the extent that they might enable better atomicity. That is, it becomes easier to coordinate changes to references and content if the content is distributed in a single indivisible unit. That’s not nothing, but — as the case of selective fetches and cache optimization highlights — content from bundles needs to be reused in different contexts, so the applicability of indivisible units is severely limited.

Of course, there are ways of enabling coordination that might allow for constructing identifiers that are less semantically meaningful. To draw on the earlier point about the Web already having bundling options, advertising code could be inlined with other JavaScript or in HTML, rather than having it load directly from the advertiser. In the extreme, servers could rewrite all content with encrypted URLs with a per-user key. None of this depends on the deployment of new Web bundling technology, but it does require close coordination.

All or nothing bundles

Even if it were possible to identify unwanted content, opponents of bundling point out that placing that content in the same bundle as critical resources makes it difficult to avoid loading the unwanted content. Some of the performance gains from content blockers are the result of not fetching content[14]. Bundling unwanted content might eliminate the cost and performance benefits of content blocking.

This is another important criticism that ties in with the earlier concerns regarding bundle composition and reuse. And, as with the previous problems, the concern is not that native, generic bundling capabilities enable this sort of bundling for the first time, but that they make it more readily accessible.

This problem, more so than the caching one, might motivate designs for selective acquisition of bundled content.

Existing techniques for selective content fetching, like HTTP range requests, don’t reliably work here, as compression can render byte ranges useless. That leads to inventing new systems for selective acquisition of bundles. Selective removal of content from compressed bundles does seem to be possible at some levels, but this leads to a complex system, and the effects on other protocol participants are non-trivial.

At some level, clients might want to say “just send me all the code, without the advertising”, but that might not work so well. Asking for bundle manifests so that content might be selectively fetched adds an additional round trip. Moving bundle manifests out of the bundles and into content[15] gives clients the information they need to be selective about which resources they want, but it requires moving information about the composition of resources into the content that references it. That too requires coordination.

For caches, this can add an extra burden. Using the Vary HTTP header field would be necessary to ensure that caches would not break when content from bundles is fetched selectively[16]. But unless a cache fully understands these requests and how selectors are applied, it is exposed to a combinatorial explosion of different bundle variants. Without updating caches to understand selectors, the effect is that caches end up bearing the load for the myriad permutations of bundles that might be needed.

Supplanting resource identity

A final concern is the ability — at least in active proposals — for bundled content to be identified with URLs from the same origin as the bundle itself. For example, a bundle at https://example.com/foo/bundle might contain content that is identified as https://example.com/foo/script.js. This is a long-standing concern that applies to many previous attempts at bundling or packaging.

This ability is constrained, but the intent is to have content in a bundle act as a valid substitute for other resources. This has implications for anyone deploying a server, who now needs to ensure that bundles aren’t hosted adjacent to content that might not want interference from the bundle.

At this point, I will note that this is also the point of signed exchanges. The constraints on what can be replaced and how are important details, but the goal is the same: signed exchanges allow a bundle to speak for other resources, except that in that case the resources are served by a completely different origin.

You might point out that this sort of thing is already possible with service workers. Service workers take what it means to subvert the identity of resources to the next level. A request that is handled by a service worker can be turned into any other request or requests (with any cardinality). Service workers are limited though. A site can opt to perform whatever substitutions it likes, but it can only do that for its own requests. Bundles propose something that might be enabled for any server, inadvertently or otherwise.

One proposal says that all supplanted resources must be identical to the resources they supplant. The theory there is that clients could fetch the resource from within a bundle or directly and expect the same result. It goes on to suggest that a mismatch between these fetches might be cause for a client to stop using the bundle. However, it is perfectly normal in HTTP for the same resource to return different content when fetched multiple times, even when the fetch is made by the same client or at the same time. So it is hard to imagine how a client would treat inconsistency as anything other than normal. If clients abandoned bundles whenever they observed a mismatch, bundling would become totally unreliable, whatever advantages it might otherwise provide.

One good reason for enabling equivalence of bundled and unbundled resources is to provide a graceful fallback in the case that bundling is not supported by a client. One bad reason is to ensure that the internal identifiers in bundles are “real” and that the fallback does not change behaviour; see the previous points about the folly of attempting to enforce genuine equivalence.

Indirection for identifiers

Addressing the problem of one resource speaking unilaterally for another resource requires a little creativity. Here the solution is hinted at by both service workers and JavaScript import maps. Import maps are especially instructive, as they make it clear that the mapping from an import specifier to a URL is not the URL resolution function from RFC 3986; import specifiers are explicitly not URLs, relative or otherwise.

This leaves open the possibility of a layer of indirection, either the limited form provided in import maps that takes one string and produces another, or the Turing-complete version that service workers enable.

In other words, we allow those places that express the identity of resources to tell the browser how to interpret values. This is something that HTML has had forever, with the <base> element. This is also the fundamental concept behind the fetch maps proposal, which looks like this[17]:

<script type="fetchmap">
{
  "urls": {
    "/styles.css": "/styles.a74fs3.css",
    "/bg.png": "/bg.8e3ac4.png"
  }
}
</script>
<link rel="stylesheet" href="/styles.css">

Then, when the browser is asked to fetch /styles.css, it knows to fetch /styles.a74fs3.css instead.

The beauty of this approach is that the change only exists where the reference is made. The canonical identity of the resource is the same for everyone (it’s https://example.com/styles.a74fs3.css), only the way that reference is expressed changes.

In other words, the common property between these designs — service workers, <base>, import maps, or fetch maps — is that the indirection only occurs at the explicit request of the thing that makes the reference. A site deliberately chooses to use this facility, and if it does, it controls the substitution of resource identities. There is no lateral replacement of content, as all of the logic occurs at the point the reference is made.

Making resource maps work

Of course, fitting this indirection into an existing system requires a few awkward adaptations. But it seems like this particular design could be quite workable.

Anne van Kesteren pointed out that many of the places where identifiers appear are concretely URLs. APIs assume that these identifiers can be manipulated as URIs, and violating that expectation would break the things that rely on it. If we are going to enable this sort of indirection, then we need to ensure that URIs stay URIs. That doesn’t mean that URIs need to be HTTP, just that they are still URIs. Thus, you might choose to construct identifiers with a new URI scheme in order to satisfy this requirement[18]:

<a href="scheme-for-mappings:hats">buy hats here</a>

Of course, in the fetch map example given, those identifiers look like and can act like URLs. In the absence of a map, they translate directly to relative URLs. That’s probably a useful feature to retain, as it means that references in local files still resolve to local files during development. Using a new scheme won’t have that advantage. A new scheme might be an option, but it doesn’t seem to be a necessary feature of the design.

I can also credit Anne with the idea that we model this indirection as a redirect, something like an HTTP 303 (See Other). The Web is already able to manage redirection for all sorts of resources, so this would not disrupt things too much.

That is not to say that this is easy, as these redirects will need to conform to established standards for the Web, with respect to the origin model and integration with things like Content Security Policy. It will need to be decided how resource maps affect cross-origin content. And many other details will need to be thought about carefully. But again, the design seems at least plausible.

Of note here is that resource maps can be polyfilled with service workers. That suggests we might just have sites build this logic into service workers. That could work, and it might be the basis for initial experiments. A static format is likely superior as it makes the information more readily available.
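
As a sketch of what such a polyfill could look like, a service worker might apply a static map at fetch time; the map below mirrors the earlier fetch map example and is entirely hypothetical:

// Resource-map polyfill: substitution happens only for this site's own requests,
// at the point where the reference is resolved.
const resourceMap = {
  'https://example.com/styles.css': 'https://example.com/styles.a74fs3.css',
  'https://example.com/bg.png': 'https://example.com/bg.8e3ac4.png',
};

self.addEventListener('fetch', (event) => {
  const target = resourceMap[event.request.url];
  if (target) {
    event.respondWith(fetch(target));
  }
  // Unmapped requests fall through to the network untouched.
});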

Alternatives and bundle URIs

Providing indirection is just one piece of enabling use of bundled content. Seamless integration needs two additional pieces.

The first is an agreed method of identifying the contents of bundles. The IETF WPACK working group has had several discussions about this. These discussions were inconclusive, in part because it was difficult to manage conflicting requirements. However, a design grounded in a map-like construct might loosen some of the constraints that disqualified some of the past options that were considered.

In particular, the idea that a bundle might itself have an implicit resource map was not considered. That could enable the use of simple identifiers for references between resources in the same bundle without forcing links in bundled content to be rewritten. And any ugly URI scheme syntax for bundles might then be abstracted away elegantly.

The second major piece to getting this working is a map that provides multiple alternatives. In previous proposals, mappings were strictly one-to-one. A one-to-many map could offer browsers a choice of resources that the referencing entity considers to be equivalent[19]. The browser is then able to select the option that it prefers. If an alternative references a bundle the browser already has, that would be good cause to use that option.
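
As a rough sketch (not any concrete proposal), that selection might amount to walking the list of equivalent alternatives and preferring one that is already cached; caches.match here stands in for whatever internal lookup a browser would actually use:

// Pick the first alternative we already have; otherwise take the first option offered.
async function pickAlternative(alternatives) {
  for (const url of alternatives) {
    if (await caches.match(url)) {
      return url; // reuse content that is already local
    }
  }
  return alternatives[0];
}

// e.g. pickAlternative(['https://example.com/page.bundle', 'https://example.com/script.js'])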

Presenting multiple options also allows browsers to experiment with different policies with respect to fetching content when bundles are offered. If bundled content tends to perform better on initial visits, then browsers might request bundles then. If bundled content tends to perform poorly when there is some valid, cached content available already, then the browser might request individual resources in that case.

A resource map might be used to enable deployment of new bundling formats, or even new retrieval methods[20].

Selective acquisition

One advantage of providing an identifier map like this is that it provides a browser with some insight into what bundles contain before fetching them[21]. Thus, a browser might be able to make a decision about whether a bundle is worth fetching. If most of the content is stuff that the browser does not want, then it might choose to fetch individual resources instead.

Having a reference map might thereby reduce the pressure to design mechanisms for partial bundle fetching and caching. Adding some additional metadata, like hints about resource size, might further allow for better tuning of this logic.
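
To illustrate, a purely hypothetical heuristic over such size hints might estimate how much of a bundle would actually be useful before deciding whether to fetch it whole or fall back to individual resources:

// entries: [{ url, size }] taken from the map's hints; isNeeded: predicate over entries.
function preferBundle(entries, isNeeded, minUsefulFraction = 0.5) {
  const total = entries.reduce((sum, entry) => sum + entry.size, 0);
  const wanted = entries
    .filter(isNeeded)
    .reduce((sum, entry) => sum + entry.size, 0);
  // Fetch the bundle only if enough of its bytes would actually be used.
  return total > 0 && wanted / total >= minUsefulFraction;
}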

Reference maps could even provide content classification tools with more information about resources that they can use. Even in a simple one-to-one mapping, like with an import map, there are two identifiers that might be used to classify content. Even if one of these is nonsense, the other could be usable.

While this requires more sophistication on the part of classifiers, it also provides opportunities for better classification. With alternative sources, even if the identifier for one source does not reveal any useful information, an alternative might.

Now that I’m fully into speculating about possibilities, this opens some interesting options. The care that was taken to ensure that pages don’t break when Google Analytics fails to load could be managed differently. Remember that script:

window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', '$XXX');

As you can see, the primary interface is always defined: the window.dataLayer object falls back to a dumb array if the script didn’t load. With multiple alternatives, the fallback logic here could be encoded in the map as a data: URI instead:

<element-for-mappings type="text/media-type-for-mappings+json">
{ "scheme-for-mappings:ga": [
  "https://www.googletagmanager.com/gtag/js?id=$XXX",
  "data:text/javascript;charset=utf-8;base64,d2luZG93LmRhdGFMYXllcj1bXTtmdW5jdGlvbiBndGFnKCl7ZGF0YUxheWVyLnB1c2goYXJndW1lbnRzKTt9Z3RhZygnanMnLG5ldyBEYXRlKCkpO2d0YWcoJ2NvbmZpZycsJyRYWFgnKTs="
]}</element-for-mappings>
<script async src="scheme-for-mappings:ga"></script>

In this case, a content blocker that decides to block the HTTPS fetch could allow the data: URI and thereby preserve compatibility. Nothing really changed, except that the fallback script is async too. Of course, this is an unlikely outcome as this is not even remotely backward-compatible, but it does give some hints about some of the possibilities.

Next steps

So that was many more words than I expected to write. The size and complexity of this problem continues to be impressive. No doubt this conversation will continue for some time before we reach some sort of conclusion.

For me, the realization that it is possible to provide finer control over how outgoing references are managed was a big deal. We don’t have to accept a design that allows one resource to speak for others; we just have to allow for control over how references are made. That’s a fairly substantial improvement over most existing proposals and the basis upon which something good might be built.

I still have serious reservations about the caching and performance trade-offs involved with bundling. Attempting to solve this problem with selective fetching of bundle contents seems like far too much complexity. Not only does it require addressing the known-hard problem of cache invalidation, it also requires that we find solutions to problems that have defied solutions on numerous occasions in the past.

That said, I’ve concluded that giving servers the choice in how content is assembled does not result in bad outcomes for others, so we are no longer talking about negative externalities.

If we accept that selective fetching is a difficult problem, supporting bundles only gives servers more choices. What we learn from that might give us the information that allows us to find solutions later. Resource maps mean that we can always fall back to fetching resources individually, which has been pretty effective so far. But resources maps might also be the framework on which we build new experiments with alternative resource fetching models.

All that said, the usability advantages provided by bundles seem to be sufficient justification for enabling their support. That applies even if there is uncertainty about performance. That applies even if we don’t initially solve those performance problems. One enormous problem at a time, please.


  1. Have I ever mentioned that I loathe CamelCase names? Thanks 1990s. ↩︎

  2. This is due to the way congestion control algorithms operate. These start out slow in case the network is constrained, but gradually speed up. ↩︎

  3. Tantek Çelik pointed out that you can use a service worker to load old content at the same time as checking asynchronously for updates. The fact is, service workers can do just about anything discussed here. You only have to write and maintain a service worker. ↩︎

  4. You might reasonably suggest that this sort of thinking tends toward suboptimal local minima. That is a fair criticism, but my rejoinder there might be that conditioning success on a design that reduces to a previously unsolved problem is not really a good strategy either. Besides, accepting suboptimal local minima is part of how we make forward progress without endless second-guessing. ↩︎

  5. I seem to recall this being raised before Pete Snyder opened this issue, perhaps at the ESCAPE workshop, but I can’t put a name to it. ↩︎

  6. In particular, the split between style (CSS) and semantics (HTML). ↩︎

  7. At this point, a footnote seems necessary. Yes, a browser is an intermediary. All previous complaints apply. It would be dishonest to deny the possibility that a browser might abuse its position of privilege. But that is the topic for a much longer posting. ↩︎

  8. This more than makes up for the overheads of the ad blocker in most cases, with page loads being considerably faster on ad-heavy pages. ↩︎

  9. If it isn’t clear, I’m firmly of the opinion that Google’s AMP Cache is not just a bad idea, but an abuse of Google’s market dominance. It also happens to be a gross waste of resources in a lot of cases, as Google pushes content that can be either already present or content for links that won’t ever be followed. Of course, if they guess right and you follow a link, navigation is fast. Whoosh. ↩︎

  10. With increasing amounts of scripts, interfaces might also be expressed at the JavaScript module or function level. ↩︎

  11. Yep. Pun totally intended. ↩︎

  12. Worth noting here is the care Google takes to structure the script to avoid breaking pages when their JavaScript load is blocked by an ad blocker. ↩︎

  13. I wonder how many people are still fetching ga.js from Google. ↩︎

  14. Note that, at least for ad blocking, the biggest gains come from not executing unwanted content, as executing ad content almost always leads to a chain of additional fetches. Saving the CPU time is the third major component of the savings. ↩︎

  15. Yes, that effectively means bundling them with content. ↩︎

  16. Curiously, the Variants design might not be a good fit here as it provides enumeration of alternatives, which is tricky for the same reason that caching in ignorance of bundling is. ↩︎

  17. There is lots to quibble about in the exact spelling in this example, but I just copied from the proposal directly. ↩︎

  18. It’s tempting here to suggest urn:, but that might cause some heads to explode. ↩︎

  19. The thought occurs that this is something that could be exploited to allow for safe patching of dependencies when combined with semantic versioning. For instance, I will accept any version X.Y.? of this file greater than X.Y.Z. We can leave that idea for another day though. ↩︎

  20. Using IPFS seems far more plausible if you allow it as one option of many with the option for graceful fallback. ↩︎

  21. To what extent providing information ahead of time can be used to improve performance is something that I have often wondered about; it seems like it has some interesting trade-offs that might be worth studying. ↩︎

The Talospace Project: Firefox 86 on POWER

Firefox 86 is out, not only with multiple picture-in-picture (now have all the Weird Al videos open simultaneously!) and total cookie protection (not to be confused with other things called TCP), but also with some noticeable performance improvements, and it finally gets rid of Backspace backing you up, a key I have never pressed to go back a page. Or maybe those performance improvements are due to further improvements to our LTO-PGO recipe, which uses Fedora's work to get rid of the sidecar shell script. Now, with this single patch plus their change to nsTerminator.cpp to allow optimization to be unbounded by time, you can build a fully link- and profile-guided optimized version for OpenPOWER and gcc with much less work. Firefox 86 also incorporates our low-level Power-specific fix to xpconnect.

Our .mozconfigs are mostly the same except for purging a couple of iffy options. Here's Optimized:


export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full
ac_add_options MOZ_PGO=1

# uncomment if you have it
#export GN=/home/censored/bin/gn
And here's Debug:

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd

# uncomment if you have it
#export GN=/home/censored/bin/gn

Aaron Klotz: 2018 Roundup: H2

This is the fifth post in my “2018 Roundup” series. For an index of all entries, please see my blog entry for Q1.

Yes, you are reading the dates correctly: I am posting this nearly two years after I began this series. I am trying to get caught up on documenting my past work!

Preparing to Enable the Launcher Process by Default

CI and Developer Tooling

Given that the launcher process completely changes how our Win32 Firefox builds start, I needed to update both our CI harnesses and the launcher process itself. I didn’t do much that was particularly noteworthy from a technical standpoint, but I will mention some important points:

During normal use, the launcher process usually exits immediately after the browser process is confirmed to have started. This was a deliberate design decision that I made. Having the launcher process wait for the browser process to terminate would not do any harm, however I did not want the launcher process hanging around in Task Manager and being misunderstood by users who are checking their browser’s resource usage.

On the other hand, such a design completely breaks scripts that expect to start Firefox and be able to synchronously wait for the browser to exit before continuing! Clearly I needed to provide an opt-in for the latter case, so I added the --wait-for-browser command-line option. The launcher process also implicitly enables this mode under a few other scenarios.

Secondly, there is the issue of debugging. Developers were previously used to attaching to the first firefox.exe process they see and expecting to be debugging the browser process. With the launcher process enabled by default, this is no longer the case.

There are a few options here:

  • Visual Studio users may install the Child Process Debugging Power Tool, which enables the VS debugger to attach to child processes;
  • WinDbg users may start their debugger with the -o command-line flag, or use the Debug child processes also checkbox in the GUI;
  • I added support for a MOZ_DEBUG_BROWSER_PAUSE environment variable, which allows developers to set a timeout (in seconds) for the browser process to print its pid to stdout and wait for a debugger attachment.

Performance Testing

As I have alluded to in previous posts, I needed to measure the effect of adding an additional process to the critical path of Firefox startup. Since in-process testing will not work in this case, I needed to use something that could provide a holistic view across both launcher and browser processes. I decided to enhance our existing xperf suite in Talos to support my use case.

I already had prior experience with xperf; I spent a significant part of 2013 working with Joel Maher to put the xperf Talos suite into production. I also knew that the existing code was not sufficiently generic to be able to handle my use case.

I threw together a rudimentary analysis framework for working with CSV-exported xperf data. Then, after Joel’s review, I vendored it into mozilla-central and used it to construct an analysis for startup time. [While a more thorough discussion of this framework is definitely warranted, I also feel that it is tangential to the discussion at hand; I’ll write a dedicated blog entry about this topic in the future. – Aaron]

In essence, the analysis considers the following facts when processing an xperf recording:

  • The launcher process will be the first firefox.exe process that runs;
  • The browser process will be started by the launcher process;
  • The browser process will fire a session store window restored event.

For our analysis, we needed to do the following:

  1. Find the event showing the first firefox.exe process being created;
  2. Find the session store window restored event from the second process;
  3. Output the time interval between the two events.

This block of code demonstrates how that analysis is specified using my analyzer framework.

Overall, these test results were quite positive. We saw a very slight but imperceptible increase in startup time on machines with solid-state drives, however the security benefits from the launcher process outweigh this very small regression.

Most interestingly, we saw a significant improvement in startup time on Windows 10 machines with magnetic hard disks! As I mentioned in Q2 Part 3, I believe this improvement is due to reduced hard disk seeking thanks to the launcher process forcing \windows\system32 to the front of the dynamic linker’s search path.

Error and Experimentation Readiness

By Q3 I had the launcher process in a state where it was built by default into Firefox, but it was still opt-in. As I have written previously, we needed the launcher process to gracefully fail even without having the benefit of various Gecko services such as telemetry and the crash reporter.

Error Propagation

First of all, I created a new class, WindowsError, that encapsulates all types of Windows error codes. As an aside, I would strongly encourage all Gecko developers who are writing new code that invokes Windows APIs to use this class in your error handling.

WindowsError is currently able to store Win32 DWORD error codes, NTSTATUS error codes, and HRESULT error codes. Internally the code is stored as an HRESULT, since that type has encodings to support the other two. WindowsError also provides a method to convert its error code to a localized string for human-readable output.

As for the launcher process itself, nearly every function in the launcher process returns a mozilla::Result-based type. In case of error, we return a LauncherResult, which [as of 2018; this has changed more recently – Aaron] is a structure containing the error’s source file, line number, and WindowsError describing the failure.

Detecting Browser Process Failures

While all Results in the launcher process may be indicating a successful start, we may not yet be out of the woods! Consider the possibility that the various interventions taken by the launcher process might have somehow impaired the browser process’s ability to start!

The way we deal with this is to record timestamps for both the launcher process and the browser process. We record these timestamps in a way that distinguishes between multiple distinct Firefox installations.

In the ideal scenario where both processes are functioning correctly, we expect a timestamp for the launcher process to be recorded, followed by a timestamp for the browser process.

If something goes wrong with the browser process, it will not be able to record its timestamp.

The next time the launcher process is started, it checks for timestamps recorded from the previous run. If the browser process’s timestamp is either missing or older than the previous launcher timestamp, then we know that something went wrong. In this case, the launcher process disables itself and proceeds to start the browser process without any of its usual interventions.

Once the browser has successfully started, it reflects the launcher process state into telemetry, preferences, and about:support.

Future attempts to start Firefox will bypass the launcher process until the next time the installation’s binaries are updated, at which point we reset the timestamps and attempt once again to start with the launcher process. We do this in the hope that whatever was failing in version n might be fixed in version n + 1.

Note that this update behaviour implies that there is no way to forcibly and permanently disable the launcher process. This is by design: the timestamp feature is designed to prevent the browser from becoming unusable, not to provide configurability. The launcher process is a security feature and not something that we should want users adjusting any more than we would want users to be disabling the capability system or some other important security mitigation. In fact, my original roadmap for InjectEject called for eventually removing the timestamp code once the launcher failure rate became small enough.

Experimentation and Emergency

The pref reflection built into the timestamp system is bi-directional. This allowed us to ship a release where we ran a study with a fraction of users running with the launcher process enabled by default.

Once we rolled out the launcher process at 100%, this pref also served as a useful “emergency kill switch” that we could have flipped if necessary.

Fortunately our experiments were successful and we rolled the launcher process out to release at 100% without ever needing the kill switch!

At this point, this pref should probably be removed, as we no longer need nor want to control launcher process deployment in this way.

Error Reporting

When telemetry is enabled, the launcher process is able to convert its LauncherResult into a ping which is sent in the background by ping-sender. When telemetry is disabled, we perform a last-ditch effort to surface the error by logging details about the LauncherResult failure in the Windows Event Log.

In Conclusion

Thanks for reading! This concludes my 2018 Roundup series! There is so much more work from 2018 that I did for this project that I wish I could discuss, but for security reasons I must refrain. Nonetheless, I hope you enjoyed this series. Stay tuned for more roundups in the future!

Data@Mozilla: This Week in Glean: Boring Monitoring

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)

All “This Week in Glean” blog posts are listed in the TWiG index (and on the Mozilla Data blog).


Every Monday the Glean team has its weekly Glean SDK meeting. This meeting serves two main purposes: first, discussing the features and bugs the team is currently investigating or that were requested by outside stakeholders; and second, bug triage and monitoring of the data that Glean reports in the wild.

Most of the time looking at our monitoring is boring and that’s a good thing.

From the beginning the Glean SDK supported extensive error reporting on data collected by the framework inside end-user applications. Errors are produced when the application tries to record invalid values. That could be a negative value for a counter that should only ever go up or stopping a timer that was never started. Sometimes this comes down to a simple bug in the code logic and should be fixed in the implementation. But often this is due to unexpected and surprising behavior of the application the developers definitely didn’t think about. Do you know all the ways that your Android application can be started? There’s a whole lot of events that can launch it, even in the background, and you might miss instrumenting all the right parts sometimes. Of course this should then also be fixed in the implementation.

Monitoring Firefox for Android

For our weekly monitoring we look at one application in particular: Firefox for Android. Because errors are reported in the same way as other metrics we are able to query our database, aggregate the data by specific metrics and errors, generate graphs from it and create dashboards on our instance of Redash.

Figure: Graph of the error counts for different metrics in Firefox for Android

The above graph displays error counts for different metrics. Each line is a specific metric and error (such as Invalid Value or Invalid State). The exact numbers are not important. What we’re interested in is the general trend. Are the errors per metric stable, or are there sudden jumps? Upward jumps indicate a problem; downward jumps probably mean the underlying bug got fixed and is finally rolled out in an update to users.

Figure: Rate of affected clients in Firefox for Android

We have another graph that doesn’t take the raw number of errors, but averages it across the entire population. A sharp increase in error counts sometimes comes from a small number of clients, whereas the errors for others stay at the same low-level. That’s still a concern for us, but knowing that a potential bug is limited to a small number of clients may help with finding and fixing it. And sometimes it’s really just bogus client data we get and can dismiss fully.

Most of the time these graphs stay rather flat and boring and we can quickly continue with other work. Sometimes though we can catch potential issues in the first days after a rollout.

Figure: Sudden jump upwards in errors for 2 metrics in Firefox for Android Nightly

In this graph from the nightly release of Firefox for Android two metrics started reporting a number of errors that’s far above any other error we see. We can then quickly find the implementation of these metrics and report that to the responsible team (Filed bug, and the remediation PR).

But can’t that be automated?

It probably can! But it requires more work than throwing together a dashboard with graphs. It’s also not easy to define thresholds on these changes and when to report them. There’s work underway that will hopefully enable us to more quickly build up these dashboards for any product using the Glean SDK, which we can then also extend to do more automated reporting. The final goal should be that the product teams themselves are responsible for monitoring their data.

William Lachance: Community @ Mozilla: People First, Open Source Second

It’s coming up on ten years at Mozilla for me, by far the longest I’ve held any job personally and exceedingly long by the standards of the technology industry. When I joined up in Summer 2011 to work on Engineering Productivity1, I really did see it as a dream job: I’d be paid to work full time on free software, which (along with open data and governance) I genuinely saw as one of the best hopes for the future. I was somewhat less sure about Mozilla’s “mission”: the notion of protecting the “open web” felt nebulous and ill-defined and the Mozilla Manifesto seemed vague when it departed from the practical aspects of shipping a useful open source product to users.

It seems ridiculously naive in retrospect, but I can remember thinking at the time that the right amount of “open source” would solve all the problems. What can I say? It was the era of the Arab Spring, WikiLeaks had not yet become a scandal, Google still felt like something of a benevolent upstart, even Facebook’s mission of “making the world more connected” sounded great to me at the time. If we could just push more things out in the open, then the right solutions would become apparent and fixing the structural problems society was facing would become easy!

What a difference a decade makes. The events of the last few years have demonstrated (conclusively, in my view) that open systems aren’t necessarily a protector against abuse by governments, technology monopolies and ill-intentioned groups of individuals alike. Amazon, Google and Facebook are (still) some of the top contributors to key pieces of open source infrastructure but it’s now beyond any doubt that they’re also responsible for amplifying a very large share of the problems global society is experiencing.

At the same time, some of the darker sides of open source software development have become harder and harder to ignore. In particular:

  • Harassment and microaggressions inside open source communities are rampant: aggressive behaviour in issue trackers, personal attacks on discussion forums, the list goes on. Women and non-binary people are disproportionately affected, although this behaviour exacts a psychological toll on everyone.
  • Open source software as exploitation: I’ve worked with lots of contributors while at Mozilla. It’s hard to estimate this accurately, but based on some back-of-the-envelope calculations, I’d estimate that the efforts of community volunteers on projects I’ve been involved in have added up (conservatively) to hundreds of thousands of U.S. dollars in labour which has never been directly compensated monetarily. Based on this experience (as well as what I’ve observed elsewhere), I’d argue that Mozilla as a whole could not actually survive on a sustained basis without unpaid work, which (at least on its face) seems highly problematic and creates a lingering feeling of guilt given how much I’ve benefited financially from my time here.
  • It’s a road to burnout. Properly managing and nurturing an open source community is deeply complex work, involving a sustained amount of both attention and emotional labour — this is difficult glue work that is not always recognized or supported by peers or management. Many of the people I’ve met over the years (community volunteers and Mozilla employees alike) have ended up feeling like it just isn’t worth the effort and have either stopped doing it or have outright left Mozilla. If it weren’t for an intensive meditation practice which I established around the time I started working here, I suspect I would have been in this category by now.

All this has led to a personal crisis of faith. Do openness and transparency inherently lead to bad outcomes? Should I continue to advocate for it in my position? As I mentioned above, the opportunity to work in the open with the community is the main thing that brought me to Mozilla— if I can’t find a way of incorporating this viewpoint into my work, what am I even doing here?

Trying to answer these questions, I went back to the manifesto that I just skimmed over in my early days. Besides openness — what are Mozilla’s values, really, and do I identify with them? Immediately I was struck by how much it felt like it was written explicitly for the present moment (even aside from the addendums which were added in 2018). Many points seem to confront problems we’re grappling with now which I was only beginning to perceive ten years ago.

Beyond that, there was also something that resonated with me on a deeper level. There were a few points, highlighted in bold, that really stood out:

  1. The internet is an integral part of modern life—a key component in education, communication, collaboration, business, entertainment and society as a whole.
  2. The internet is a global public resource that must remain open and accessible.
  3. The internet must enrich the lives of individual human beings.
  4. Individuals’ security and privacy on the internet are fundamental and must not be treated as optional.
  5. Individuals must have the ability to shape the internet and their own experiences on the internet.
  6. The effectiveness of the internet as a public resource depends upon interoperability (protocols, data formats, content), innovation and decentralized participation worldwide.
  7. Free and open source software promotes the development of the internet as a public resource.
  8. Transparent community-based processes promote participation, accountability and trust.
  9. Commercial involvement in the development of the internet brings many benefits; a balance between commercial profit and public benefit is critical.
  10. Magnifying the public benefit aspects of the internet is an important goal, worthy of time, attention and commitment.

I think it’s worth digging beneath the surface of these points: what is the underlying value system behind them? I’d argue it’s this, simply put: human beings really do matter. They’re not just line items in a spreadsheet or some other resource to be optimized. They are an end in and of themselves. People (more so than a software development methodology) are the reason why I show up every day to do the work that I do. This is really an absolute which has enduring weight: it’s a foundational truth of every major world religion, to say nothing of modern social democracy.

What does working and building in the open mean, then? As we’ve seen above, it certainly isn’t something I’d consider “good” all by itself. Instead, I’d suggest it’s a strategy which (if we’re going to follow it) should come out of that underlying recognition of the worth of Mozilla’s employees, community members, and users. Every single one of these people matters, deeply. I’d argue, then, that Mozilla should consider the following factors in terms of how we work in the open:

  • Are our spaces2 generally safe for people of all backgrounds to be their authentic selves? This not only means free from sexual harassment and racial discrimination, but also that they’re able to work to their full potential. This means creating opportunities for everyone to climb the contribution curve, among other things.
  • We need to be more honest and direct about the economic benefits that community members bring to Mozilla. I’m not sure exactly what this means right now (and of course Mozilla’s options are constrained both legally and economically), but we need to do better about acknowledging their contributions to Mozilla’s bottom line and making sure there is a fair exchange of value on both sides. At the very minimum, we need to make sure that people’s contributions help them grow professionally or otherwise if we can’t guarantee monetary compensation for their efforts.
  • We need to acknowledge the efforts that our employees make in creating functional communities. This work does not come for free and we need to start acknowledging it in both our career development paths and when looking at individual performance. Similarly, we need to provide better guidance and mentorship on how to do this work in a way that does not exact too hard a personal toll on the people involved — this is a complex topic, but a lot of it in my opinion comes down to better onboarding practices (which is something we should be doing anyway) as well as setting better boundaries (both in terms of work/life balance, as well as what you’ll accept in your interactions).
  • Finally, what is the end result of our work? Do the software and systems we build genuinely enrich people’s lives? Do people become better informed after using our software? Does our software help them make better decisions? Free software might be good in itself, but one must also factor in how it is used when measuring its social utility (see: Facebook).

None of the above is easy to address. But the alternatives are either to close everything down to public participation (which I’d argue would lead to the death of Mozilla as an organization: it just doesn’t have the resources to compete in the marketplace without the backing of the community) or to continue down the present path (which I don’t think is sustainable either). The last ten years have shown that the “open source on auto-pilot” approach just doesn’t work.

I suspect these problems aren’t specific to Mozilla and affect other communities that work in the open. I’d be interested in hearing other perspectives on this family of problems: if you have anything to add, my contact information is below.

  1. I’ve since moved to the Data team

  2. This includes our internal communications channels like our Matrix instance as well as issue trackers like Bugzilla. There’s also a question of what to do about non-Mozilla channels, like Twitter or the Orange Site. Although not Mozilla spaces, these places are often vectors for harassment of community members. I don’t have any good answers for what to do about this, aside from offering my solidarity and support to those suffering abuse on these channels. Disagreement with Mozilla’s strategy or policy is one thing, but personal attacks, harassment, and character assassination are never ok— no matter where it’s happening.

Jan-Erik RedigerThis Week in Glean: Boring Monitoring

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.)

All "This Week in Glean" blog posts are listed in the TWiG index (and on the Mozilla Data blog). This article is cross-posted on the Mozilla Data blog.


Every Monday the Glean team has its weekly Glean SDK meeting. This meeting has two main parts: first, discussing the features and bugs the team is currently investigating or that were requested by outside stakeholders; and second, bug triage and monitoring of the data that Glean reports in the wild.

Most of the time looking at our monitoring is boring and that's a good thing.

From the beginning the Glean SDK supported extensive error reporting on data collected by the framework inside end-user applications. Errors are produced when the application tries to record invalid values. That could be a negative value for a counter that should only ever go up or stopping a timer that was never started. Sometimes this comes down to a simple bug in the code logic and should be fixed in the implementation. But often this is due to unexpected and surprising behavior of the application the developers definitely didn't think about. Do you know all the ways that your Android application can be started? There's a whole lot of events that can launch it, even in the background, and you might miss instrumenting all the right parts sometimes. Of course this should then also be fixed in the implementation.

Monitoring Firefox for Android

For our weekly monitoring we look at one application in particular: Firefox for Android. Because errors are reported in the same way as other metrics we are able to query our database, aggregate the data by specific metrics and errors, generate graphs from it and create dashboards on our instance of Redash.

Graph of the error counts for different metrics in Firefox for Android

The above graph displays error counts for different metrics. Each line is a specific metric and error (such as Invalid Value or Invalid State). The exact numbers are not important. What we're interested in is the general trend. Are the errors per metric stable, or are there sudden jumps? Upward jumps indicate a problem; downward jumps probably mean the underlying bug got fixed and the fix has finally rolled out to users in an update.

Rate of affected clients in Firefox for Android

We have another graph that doesn't take the raw number of errors, but averages it across the entire population. A sharp increase in error counts sometimes comes from a small number of clients, whereas the errors for others stay at the same low level. That's still a concern for us, but knowing that a potential bug is limited to a small number of clients may help with finding and fixing it. And sometimes it's really just bogus client data that we can dismiss entirely.

Most of the time these graphs stay rather flat and boring and we can quickly continue with other work. Sometimes though we can catch potential issues in the first days after a rollout.

Sudden jump upwards in errors for 2 metrics in Firefox for Android Nightly

In this graph from the nightly release of Firefox for Android two metrics started reporting a number of errors that's far above any other error we see. We can then quickly find the implementation of these metrics and report that to the responsible team (Filed bug, and the remediation PR).

But can't that be automated?

It probably can! But it requires more work than throwing together a dashboard with graphs. It's also not easy to define thresholds for these changes or to decide when to report them. There's work underway that will hopefully enable us to build up these dashboards more quickly for any product using the Glean SDK, which we can then also extend to do more automated reporting. The final goal should be that the product teams themselves are responsible for monitoring their data.

Karl DubostThe Benefits Of Code Review For The Reviewer

Code Review is an essential part of the process of publishing code. We often talk about the benefits of code review for projects and for people writing the code. I want to talk about the benefits for the person actually reviewing the code.

Path in a bamboo garden

Understanding The Project

When doing code review, we don't necessarily have a good understanding of the project, or at least not the same level of understanding as the person who wrote the code.

Reviewing is a good way to piece together all the parts that make the project work.

Learning How To Better Code

A lot of the reviews I have been involved with have taught me how to become a better developer. Nobody has full knowledge of every language, algorithm construct, or data structure. When reviewing, we learn as much as we help. For things which seem unclear, we dive into the documentation to better understand the intent. We can weigh our existing knowledge against the knowledge the developer brings.

The review might bring a new solution to the table that we didn't know existed. A review is not only about discovering errors or weaknesses in the code; it's about improving the code by exchanging ideas with the developer.

Instant Gratification

There is a feel-good opportunity in doing good code reviews, specifically when the review helps to improve both the code and the developer. Nothing is better than a final comment from a developer who is happy to have their code merged and feels their skills have improved.

To comment…

Otsukare!

Support.Mozilla.OrgIntroducing Fabiola Lopez

Hi everyone,

Please join us in welcoming Fabiola Lopez (Fabi) to the team. Fabi will be helping us with support content in English and Spanish, so you’ll see her in both locales. Here’s a little more about Fabi:

 

Hi, everyone! I’m Fabi, and I am a content writer and a translator. I will be working with you to create content for all our users. You will surely find me writing, proofreading, editing and localizing articles. If you have any ideas to help make our content more user-friendly, please reach out to me. Thanks to your help, we make this possible.

 

Also, Angela’s contract ended last week. We’d like to thank Angela for her support over the past year.

This Week In RustThis Week in Rust 379

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

No newsletters or official blog posts this week.

Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is lever, a library for writing transactional systems.

Thanks to Mahmud Bulut for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

starlight - Support for "unsafe" cases of finally

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

329 pull requests were merged in the last week

Rust Compiler Performance Triage

Overall, a positive week for compiler performance with only one moderate regression. The change that introduced the regression leads to significantly improved bootstrap speed of the compiler as well as easier maintainability.

Triage done by @rylev. Revision range: f1c47c..301ad8a

1 Regression, 5 Improvements, 0 Mixed; 0 of them in rollups

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs

New RFCs

Upcoming Events

Online
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

ForAllSecure

NZXT

Parity

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Finally, I feel it is necessary to debunk the “fighting the borrow checker” legend, a story depicting the Rust compiler as a boogeyman: in my experience, it happens mostly to beginners and the 1% trying to micro-optimize code or push the boundaries. Most experienced Rust developers know exactly how to model their code in a way that no time is wasted fighting the compiler on design issues, and can spot anti-patterns at a glance, just like most people know how to drive their car on the correct side of the road to avoid accidents, and notice those who don’t!

Simon Chemouil on the Kraken blog

Thanks to scottmcm for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Hacks.Mozilla.OrgA Fabulous February Firefox — 86!

Looking into the near distance, we can see the end of February loitering on the horizon, threatening to give way to March at any moment. To keep you engaged until then, we’d like to introduce you to Firefox 86. The new version features some interesting and fun new goodies including support for the Intl.DisplayNames object, the :autofill pseudo-class, and a much better <iframe> inspection feature in DevTools.

This blog post provides merely a set of highlights; for all the details, check out the following:

Better <iframe> inspection

The Firefox web console used to include a cd() helper command that enabled developers to change the DevTools’ context to inspect a specific <iframe> present on the page. This helper has been removed in favor of the iframe context picker, which is much easier to use.

When inspecting a page with <iframe>s present, the DevTools will show the iframe context picker button.

Firefox devtools, showing the select iframe dropdown menu, a list of the iframes on the page that can be selected from

When pressed, it will display a drop-down menu listing all the URLs of content embedded in the page inside <iframe>s. Choose one of these, and the inspector, console, debugger, and all other developer tools will then target that <iframe>, essentially behaving as if the rest of the page does not exist.

:autofill

The :autofill CSS pseudo-class matches when an <input> element has had its value auto-filled by the browser. The class stops matching as soon as the user edits the field.

For example:

input:-webkit-autofill {
  border: 3px solid blue;
}

input:autofill {
  border: 3px solid blue;
}

Firefox 86 supports the unprefixed version with the -webkit-prefixed version also supported as an alias. Most other browsers just support the prefixed version, so you should provide both for maximum browser support.

Intl.DisplayNames

The Intl.DisplayNames built-in object has been enabled by default in Firefox 86. This enables the consistent translation of language, region, and script display names. A simple example looks like so:

// Get English currency code display names
let currencyNames = new Intl.DisplayNames(['en'], {type: 'currency'});

// Get currency names
currencyNames.of('USD'); // "US Dollar"
currencyNames.of('EUR'); // "Euro"
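
The same constructor covers the other name types mentioned above as well. As a brief additional sketch (not part of the original post; the exact strings returned depend on the browser’s locale data):

// Get English region display names
let regionNames = new Intl.DisplayNames(['en'], {type: 'region'});
regionNames.of('US'); // "United States"
regionNames.of('JP'); // "Japan"

// Get French language display names
let languageNames = new Intl.DisplayNames(['fr'], {type: 'language'});
languageNames.of('en'); // "anglais"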

Nightly preview — image-set()

The image-set() CSS function lets the browser pick the most appropriate CSS image from a provided set. This is useful for implementing responsive images in CSS, respecting the fact that resolution and bandwidth differ by device and network access.

The syntax looks like so:

background-image: image-set("cat.png" 1x,
                            "cat-2x.png" 2x,
                            "cat-print.png" 600dpi);

Given the set of options, the browser will choose the most appropriate one for the current device’s resolution — users of lower-resolution devices will appreciate not having to download a large hi-res image that they don’t need, while users of more modern devices will be happy to receive a sharper, crisper image that looks better on their device.

WebExtensions

As part of our work on Manifest V3, we have landed an experimental base content security policy (CSP) behind a preference in Firefox 86. The new CSP disallows remote code execution. This restriction only applies to extensions using manifest_version 3, which is not currently supported in Firefox (currently, only manifest_version 2 is supported).

If you would like to test the new CSP for extension pages and content scripts, you must change your extension’s manifest_version to 3 and set extensions.manifestv3.enabled to true in about:config. Because this is a highly experimental and evolving feature, we want developers to be aware that extensions that work with the new CSP may break as more changes are implemented.
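
For illustration only, a minimal test extension’s manifest might look like the sketch below (the name and version are placeholders; the experimental base CSP applies once manifest_version is 3 and the preference above is enabled):

{
  "manifest_version": 3,
  "name": "csp-test-extension",
  "version": "0.1"
}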

The post A Fabulous February Firefox — 86! appeared first on Mozilla Hacks - the Web developer blog.

The Firefox FrontierTry Firefox Picture-in-Picture for multi-tasking with videos

The Picture-in-Picture feature in the Firefox browser makes multitasking with video content easy, no window shuffling necessary. With Picture-in-Picture, you can play a video in a separate, scalable window that … Read more

The post Try Firefox Picture-in-Picture for multi-tasking with videos appeared first on The Firefox Frontier.

About:CommunityNew Contributors In Firefox 86

With the release of Firefox 86, we are pleased to welcome many new friends of the Fox, developers who’ve contributed their first code changes to Firefox in version 86; 25 of them were brand new volunteers! Please join us in congratulating, thanking and welcoming all of these diligent and enthusiastic contributors, and take a look at their excellent work:

The Mozilla BlogLatest Firefox release includes Multiple Picture-in-Picture and Total Cookie Protection

Beginning last year, the internet began playing a bigger role in our lives than ever before. In the US, we went from only three percent of workers to more than forty percent working from home in 2020, all powered by the web. We also relied on it to stay informed, and connect with friends and family when we couldn’t meet in-person.

And despite the many difficulties we all have faced online and offline, we’re proud to keep making Firefox an essential part of what makes the web work.

Today I’m sharing two new features: multiple picture-in-picture (multi-PiP) and our latest privacy protection combo. Multi-PiP allows multiple videos to play at the same time — all the adorable animal videos or NCAA Tournament anyone? And our latest privacy protection, the dynamic duo of Total Cookie Protection (technically known as State Partitioning or Dynamic First-Party Isolation) and Supercookie Protections (launched in last month’s release) are here to combat cross-site cookie tracking once and for all.

Today’s Firefox features:

Multiple Picture-in-Picture to help multi-task

Our Picture-in-Picture feature topped our Best of Firefox 2020 features list and we heard from people who wanted more than just one picture-in-picture view. In today’s release, we added multiple picture-in-picture views, available on Mac, Linux and Windows, including keyboard controls for fast forward and rewind. Haven’t been to a zoo in a while? Now you can visit your favorite animal at the zoo, along with any other animals around the world, with multiple views. Also, we can’t help that it coincides with one of the biggest sports events this year in March. 🏀 😉



New privacy protections to stop cookie tracking

Today, we are announcing Total Cookie Protection for Firefox, a major new milestone in our work to protect your privacy. Total Cookie Protection stops cookies from tracking you around the web by creating a separate cookie jar for every website. Total Cookie Protection joins our suite of privacy protections called ETP (Enhanced Tracking Protection). In combining Total Cookie Protection with last month’s supercookie protections, Firefox is now armed with very strong, comprehensive protection against cookie tracking. This will be available in ETP Strict Mode in both the desktop and Android version. Here’s how it works:

Total Cookie Protection confines all cookies from each website in a separate cookie jar

In our ongoing commitment to bring the best innovations in privacy, we are working tirelessly to improve how Firefox protects our users from tracking. In 2019, Firefox introduced Enhanced Tracking Protection (ETP) which blocks cookies from known, identified trackers, based on the Disconnect list. To bring even more comprehensive protection, Total Cookie Protection confines all cookies from each website in a separate cookie jar so that cookies can no longer be used to track you across the web as you browse from site to site. For a technical look at how this works, you can dig into the details in our post on our Security Blog. You can turn on Total Cookie Protection by setting your Firefox privacy controls to Strict mode.

Join our journey to evolve Firefox

If it’s been a while since you’ve used Firefox, now is the time to try Firefox again and see today’s features. You can download the latest version of Firefox for your desktop and mobile devices and get ready for an exciting year ahead.

The post Latest Firefox release includes Multiple Picture-in-Picture and Total Cookie Protection appeared first on The Mozilla Blog.

Mozilla Security BlogFirefox 86 Introduces Total Cookie Protection

Today we are pleased to announce Total Cookie Protection, a major privacy advance in Firefox built into ETP Strict Mode. Total Cookie Protection confines cookies to the site where they were created, which prevents tracking companies from using these cookies to track your browsing from site to site.

Cookies, those well-known morsels of data that web browsers store on a website’s behalf, are a useful technology, but also a serious privacy vulnerability. That’s because the prevailing behavior of web browsers allows cookies to be shared between websites, thereby enabling those who would spy on you to “tag” your browser and track you as you browse. This type of cookie-based tracking has long been the most prevalent method for gathering intelligence on users. It’s a key component of the mass commercial tracking that allows advertising companies to quietly build a detailed personal profile of you.

In 2019, Firefox introduced Enhanced Tracking Protection by default, blocking cookies from companies that have been identified as trackers by our partners at Disconnect. But we wanted to take protections to the next level and create even more comprehensive protections against cookie-based tracking to ensure that no cookies can be used to track you from site to site as you browse the web.

Our new feature, Total Cookie Protection, works by maintaining a separate “cookie jar” for each website you visit. Any time a website, or third-party content embedded in a website, deposits a cookie in your browser, that cookie is confined to the cookie jar assigned to that website, such that it is not allowed to be shared with any other website.

Total Cookie Protection creates a separate cookie jar for each website you visit. (Illustration: Meghan Newell)

In addition, Total Cookie Protection makes a limited exception for cross-site cookies when they are needed for non-tracking purposes, such as those used by popular third-party login providers. Only when Total Cookie Protection detects that you intend to use a provider, will it give that provider permission to use a cross-site cookie specifically for the site you’re currently visiting. Such momentary exceptions allow for strong privacy protection without affecting your browsing experience.

In combination with the Supercookie Protections we announced last month, Total Cookie Protection provides comprehensive partitioning of cookies and other site data between websites in Firefox. Together these features prevent websites from being able to “tag” your browser,  thereby eliminating the most pervasive cross-site tracking technique.

To learn more technical details about how Total Cookie Protection works under the hood, you can read the MDN page on State Partitioning and our blog post on Mozilla Hacks.

Thank you

Total Cookie Protection touches many parts of Firefox, and was the work of many members of our engineering team: Andrea Marchesini, Gary Chen, Nihanth Subramanya, Paul Zühlcke, Steven Englehardt, Tanvi Vyas, Anne van Kesteren, Ethan Tseng, Prangya Basu, Wennie Leung, Ehsan Akhgari, and Dimi Lee.

We wish to express our gratitude to the many Mozillians who contributed to and supported this work, including: Selena Deckelmann, Mikal Lewis, Tom Ritter, Eric Rescorla, Olli Pettay, Kim Moir, Gregory Mierzwinski, Doug Thayer, and Vicky Chin.

Total Cookie Protection is an evolution of the First-Party-Isolation feature, a privacy protection that is shipped in Tor Browser. We are thankful to the Tor Project for that close collaboration.

We also want to acknowledge past and ongoing work by colleagues in the Brave, Chrome, and Safari teams to develop state partitioning in their own browsers.

The post Firefox 86 Introduces Total Cookie Protection appeared first on Mozilla Security Blog.

Hacks.Mozilla.OrgIntroducing State Partitioning

State Partitioning is the technical term for a new privacy feature in Firefox called Total Cookie Protection, which will be available in ETP Strict Mode in Firefox 86. This article shows how State Partitioning works inside of Firefox and explains what developers of third-party integrations can do to stay compatible with the latest changes.

Web sites utilize a variety of different APIs to store data in the browser. Most famous are cookies, which are commonly used to build login sessions and provide a customized user experience. We call these stateful APIs, because they are able to establish state that will persist through reloads, navigations and browser restarts. While these APIs allow developers to enrich a user’s web experience, they also enable nefarious web tracking which jeopardizes user privacy. To fight abuse of these APIs Mozilla is introducing State Partitioning in Firefox 86.

Stateful Web APIs in Firefox are:

  • Storage: Cookies, Local Storage, Session Storage, Cache Storage, and IndexedDB
  • Workers: SharedWorkers and ServiceWorkers
  • Communication channel: Broadcast channel

To fight against web tracking, Firefox currently relies on Enhanced Tracking Protection (ETP) which blocks cookies and other shared state from known trackers, based on the Disconnect list. This form of cookie blocking is an effective approach to stop tracking, but it has its limitations. ETP protects users from the 3000 most common and pervasive identified trackers, but its protection relies on the fact that the list is complete and always up-to-date. Ensuring completeness is difficult, and trackers can try to circumvent the list by registering new domain names. Additionally, identifying trackers is a time-consuming task and commonly adds a delay on a scale of months before a new tracking domain is added to the list.

To address the limitations of ETP and provide comprehensive protection against trackers, we introduce a technique called State Partitioning, which will prevent cookie-based tracking universally, without the need for a list.

State Partitioning is complemented by our efforts to eliminate the usage of non-traditional storage mechanisms (“supercookies”) as a tracking vector, for example through the partitioning of network state, which was recently rolled out in Firefox 85.

State Partitioning – How it works in Firefox

To explain State Partitioning, we should first take a look at how stateful Web APIs enable tracking on the Web.  While these APIs were not designed for tracking, their state is shared with a website regardless of whether it is loaded as a first-party or embedded as a third-party, for example in an iframe or as a simple image (“tracking pixel”). This shared state allows trackers embedded in other websites to track you across the Web, most commonly by setting cookies.

For example, a cookie of www.tracker.com will be shared on foo.com and bar.com if they both embed www.tracker.com as a third-party. So, www.tracker.com can connect your activities on both sites by using the cookie as an identifier.

ETP will prevent this by simply blocking access to shared state for embedded instances of www.tracker.com. Without the ability to set cookies, the tracker can not easily re-identify you.

Cookie-based tracking without protections, both instances of www.tracker.com share the same cookie.

 

In comparison, State Partitioning will also prevent shared third-party state, but it does so without blocking cookie access entirely. With State Partitioning, shared state such as cookies, localStorage, etc. will be partitioned (isolated) by the top-level website you’re visiting. In other words, every first party and its embedded third-party contexts will be put into a self-contained bucket.

Firefox is using double-keying to implement State Partitioning, which will add an additional key to the origin of the website that is accessing these states. We use the scheme and registrable domain (also known as eTLD+1) of the top-level site as the additional key. Following the above example, cookies for www.tracker.com will be keyed differently under foo.com and bar.com. Instead of looking up the cookie jar for www.tracker.com, State Partitioning will use www.tracker.com^http://foo.com and www.tracker.com^http://bar.com respectively.
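
As a rough mental model only (this is not Firefox’s actual implementation, and the naive eTLD+1 helper below ignores the Public Suffix List), the keying could be sketched like this:

// Conceptual sketch: the storage key for an embedded resource combines
// its own host with the scheme + registrable domain of the top-level site.
function naiveRegistrableDomain(hostname) {
  // Simplification for illustration: take the last two labels.
  // Real code must consult the Public Suffix List (e.g. for "co.uk").
  return hostname.split('.').slice(-2).join('.');
}

function partitionedKey(embeddedHost, topLevelUrl) {
  const top = new URL(topLevelUrl);
  return `${embeddedHost}^${top.protocol}//${naiveRegistrableDomain(top.hostname)}`;
}

partitionedKey('www.tracker.com', 'http://foo.com/page');
// -> "www.tracker.com^http://foo.com"
partitionedKey('www.tracker.com', 'http://bar.com/other');
// -> "www.tracker.com^http://bar.com"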

Cookie-based tracking prevented by State Partitioning, by double-keying both instances of www.tracker.com.

 

Thus, there will be two distinct cookie jars for www.tracker.com under these two top-level websites.

This takes away the tracker’s ability to use cookies and other previously shared state to identify individuals across sites. Now the state is separate (“partitioned”) instead of shared across different first-party domains.

It is important to understand that State Partitioning will apply to every embedded third-party resource, regardless of whether it is a tracker or not.

This brings great benefits for privacy, allowing us to extend protections beyond the Disconnect list, and it allows embedded websites to continue to use their cookies and storage as they normally would, as long as they don’t need cross-site access. In the next section we will examine what embedded websites can do if they have a legitimate need for cross-site shared state.

State Partitioning – Web Compatibility

Given that State Partitioning brings a fundamental change to Firefox, ensuring web compatibility and an unbroken user and developer experience is a top concern. Inevitably, State Partitioning will break websites by preventing legitimate uses of third-party state. For example, Single Sign-On (SSO) services rely on third-party cookies to sign in users across multiple websites. State Partitioning will break SSO because the SSO provider will not be able to access its first-party state when embedded in another top-level website, meaning it is unable to recognize a logged-in user.

Third-party SSO cookies partitioned by State Partitioning, the SSO iframe cannot get the first-party cookie access.

 

In order to resolve these compatibility issues of State Partitioning, we allow the state to be unpartitioned in certain cases. When unpartitioning takes effect, we will stop using double-keying and revert to the ordinary (first-party) key.

Given the above example, after unpartitioning, the top-level SSO site and the embedded SSO service’s iframe will start to use the same storage key, meaning that they will both access the same cookie jar. So, the iframe can get the login credentials via a third-party cookie.

The SSO site has been unpartitioned, the SSO iframe gets the first-party cookie access.

 

State Partitioning – Getting Cross-Site Cookie Access

There are two scenarios in which Firefox might unpartition states for websites to allow for access to first-party (cross-site) cookies and other state:

  1. When an embedded iframe calls the Storage Access API.
  2. Based on a set of automated heuristics.

Storage Access API

The Storage Access API is a newly proposed JavaScript API to handle legitimate exceptions from privacy protections in modern browsers, such as ETP in Firefox or Intelligent Tracking Prevention in Safari. It allows the restricted third-party context to request first-party storage access (access to cookies and other shared state) for itself. In some cases, browsers will show a permission prompt so the user can decide whether they trust the third party enough to allow this access.

The Firefox user prompt of the Storage Access API.

 

A partitioned third-party context can use the Storage Access API to gain a storage permission which grants unpartitioned access to its first-party state.

This functionality is expressed through the document.requestStorageAccess method. Another method, document.hasStorageAccess, can be used to find out whether your current browsing context has access to its first party storage. As outlined on MDN, document.requestStorageAccess is subject to a number of restrictions, standardized as well as browser-specific, that protect users from abuse. Developers should make note of these and adjust their site’s user experience accordingly.

As an example, Zendesk will show a message with a call-to-action element to handle the standard requirement of transient user activation (e.g. a button click). In Safari, it will also spawn a popup that the user has to activate to ensure that the browser-specific requirements of Webkit are satisfied.

Zendesk notification to allow the user to trigger the request for storage access.

 

After the user has granted access, Firefox will remember the storage permission for 30 days.

Note that the third-party context will only be unpartitioned under the top-level domain for which the storage access has been requested. For other top-level domains, the third-party context will still be partitioned. Let’s say there is a cross-origin iframe from example.com which is embedded in foo.com, and example.com uses the Storage Access API to request first-party access on foo.com, which the user allows. In this case, example.com will have unpartitioned access to its own first-party cookies on foo.com. Later, the user loads another page, bar.com, which also embeds example.com. But this time, example.com will remain partitioned under bar.com because there is no storage permission there.

document.hasStorageAccess().then(hasAccess => {
  if (!hasAccess) {
    return document.requestStorageAccess();
  }
}).then(_ => {
  // Now we have unpartitioned storage access for the next 30 days!
  //
  // …
}).catch(_ => {
  // error obtaining storage access.
});

JavaScript example of using the Storage Access API to request storage access from the user.

Currently, the Storage Access API is supported by Safari, Edge, and Firefox. It is behind a feature flag in Chrome.

Automatic unpartitioning through heuristics

In the Firefox storage access policy, we have defined several heuristics to address Web compatibility issues. The heuristics are designed to catch the most common scenarios of using third-party storage on the web (outside of tracking) and allow storage access in order to keep websites working normally. For example, in Single Sign-On flows it is common to open a popup that allows the user to sign in, and to transmit that sign-in information back to the website that opened the popup. Firefox will detect this case and automatically grant storage access.

Note that these heuristics are not designed for the long term. Using the Storage Access API is the recommended solution for websites that need unpartitioned access. We will continually evaluate the necessity of these heuristics and remove them as appropriate. Therefore, developers should not rely on them now or in the future.

State Partitioning – User controls for Cross-Site Cookie Access

We have introduced a new UI for State Partitioning which allows users to be aware of which third parties have acquired unpartitioned storage access and provides fine-grained control of that access. Firefox will show the unpartitioned domains in the permission section of the site identity panel. The “Cross-site Cookies” permission indicates the unpartitioned domain, and users can delete the permission from the UI directly by clicking the cancel button alongside the permission entries.

The post Introducing State Partitioning appeared first on Mozilla Hacks - the Web developer blog.

Karl DubostBrowser Wish List - Bookmark This Selection

Some of us are keeping notes of bread and crumbs fallen everywhere. A dead leaf, a piece of string, a forgotten note washed away on a beach, and things read in a book. We collect memories and inspiration.

All browsers have a feature called "Bookmark This Page". It is essentially the same poor, barely manageable tool in every browser. If you do not want to rely on a third-party service or an add-on, what the browser has to offer is not very satisfying.

Firefox gives us the possibility to change the name, to choose where to put it, and to add tags at the moment we save it.

Firefox Menu for bookmarking a page

Edge follows the same conventions without the tagging.

Edge Menu for bookmarking a page

Safari offers something slightly more evolved with a Description field.

Safari Menu for bookmarking a page

But none of them is satisfying for the Web drifters, the poets collecting memories, the archivists and the explorers. And it's unfortunate, because it looks like such low-hanging fruit. It ties in very much with my previous post about Browser Time Machine.

Bookmark This Selection

What I would like from the bookmark feature in the browser is the ability not only to bookmark the full page, but also to select a piece of the page that is reflected in the bookmark, be it through the normal menu as we have seen above or through the contextual menu of the browser.

Firefox Contextual menu after selecting a text

Then, once the bookmarks are collected, I can do full-text searches on all the collected texts.

And yes, some add-ons exist, but I just wish the feature were native to the browser. And I do not want to rely on a third-party service. My quotes are mine only and should not necessarily be shared with a server on someone else's machine.

Memex has very interesting features, but it is someone else's service. Pocket (even if it belongs to Mozilla) does not answer my needs: I need to open an account, and it is someone else's server.

To comment

Otsukare!

Tantek ÇelikLife Happens: Towards Community Support For Prioritizing Life Events And Mutual Care, Starting With The #IndieWeb

“Life Happens” is an acknowledgement that there are numerous things that people experience in their actual physical lives that suddenly take higher priority than nearly anything else (like participation in volunteer-based communities), and those communities (like the IndieWeb) should acknowledge, accept, and be supportive of community members experiencing such events.

What Happens

What kind of events? Off the top of my head I came up with several that I’ve witnessed community members (including a few myself) experience, like:

  • getting married — not having experienced this myself, I can only imagine that for some folks it causes a priorities reset
  • having a child — from what I've seen this pretty much causes nearly everything else that isn’t essential to get dropped, acknowledging that there are many family shapes, without judgment of any
  • going through a bad breakup or divorce — the trauma, depression etc. experienced can make you want to not show up for anything, sometimes not even get out of bed
  • starting a new job — that takes up all your time, and/or polices what you can say online, or where you may participate
  • becoming an essential caregiver — caring for an aging, sick, or critically ill parent, family member, or other person
  • buying a house — often associated with a shift in focus of personal project time (hat tip: Marty McGuire)
  • home repairs or renovations — similar to “new house” project time, or urgent repairs. This is one that I’ve been personally both “dealing with” and somewhat embracing since December 2019 (with maybe a few weeks off at times), due to an infrastructure failure the previous month, which turned into an inspired series of renovations
  • death of a family member, friend, pet
  • … more examples of how life happens on the wiki

Values, People, and Making It Explicit

When these things happen, as a community, I feel we should respond with kindness, support, and understanding when someone steps back from community participation or projects. We should not shame or guilt them in any way, and ideally act in such a way that welcomes their return whenever they are able to do so.

Many projects (especially open source software) often talk about their “bus factor” (or more positively worded “lottery factor”). However that framing focuses on the robustness of the project (or company) rather than those contributing to it. Right there in IndieWeb’s motto is an encouragement to reframe: be a “people-focused alternative to the corporate […]”.

The point of “life happens” is to decenter the corporation or project when it comes to such matters, and instead focus on the good of the people in the community. Resiliency of humanity over resiliency of any particular project or organization.

Adopting such values and practices explicitly is more robust than depending on accidental good faith or informal cultural support. Such emotional care should be the clearly written default, rather than something everyone has to notice and figure out on their own. I want to encourage more mutual care-taking as a form of community-based resiliency, and make it less work for folks experiencing “life happens” moments. Through such care, I believe you get actually sustainable community resiliency, without having to sacrifice or burn people out.

Acknowledging Life Happens And You Should Take Care

It’s important to communicate to community members, and especially new community members that a community believes in mutual care-taking. That yes, if and when “life happens” to you that:

  • we want you to take care of what you need to take care of
  • you are encouraged to prioritize those things most important to you, and that the community will not judge or shame you in any way
  • you should not feel guilty about being absent, or abruptly having to stop participating
  • it is ok to ask for help in the community with any of your community projects or areas of participation, no matter what size or importance
  • the community will be here for you when you’re able to and want to return

It’s an incomplete & imperfect list, yet hopefully captures the values and general feeling of support. More suggestions welcome.

How to Help

Similarly, if you notice someone active in the community is missing, if you feel you know them well enough, you’re encouraged to reach out and unobtrusively check on them, and ask (within your capacity) if there’s anything you can do to help out with any community projects or areas of participation.

Thanks to Chris Aldrich for expanding upon How to help and encouraging folks to keep in mind that on top of these life changes and stresses, the need to make changes to social activities (like decreasing or ceasing participation in the IndieWeb community) can be an additional compounding stress on top of the others. Our goal should be to mitigate this additional stress as much as possible.

How to Repair

Absence(s) from the community can result in shared resources or projects falling behind or breaking. It’s important to provide guidance to the community on how to help repair such things, especially in a caring way without any shame or guilt. Speaking in a second-person voice:

You might notice that one or more projects, wiki pages, or sections appear to be abandoned or in disrepair. This could be for any number of reasons, so it’s best to ask about it in a discussion channel to see if anyone knows what’s going on. If it appears someone is missing (for any reason), you may do kind and respectful repairs on related pages (wikifying), in a manner that attempts to minimize or avoid any guilt or shame, and ideally makes it clear they are welcome back any time.

If you come across an IndieWeb Examples section on a page where the links either don’t work (404, broken in some other way, or support appears to have been dropped), move that specific IndieWeb Example to a “Past Examples” section, and fix the links with Internet Archive versions, perhaps at a point in time of when the links were published (e.g. permalinks with dates in their paths), or by viewing history on the wiki page and determining when the broken links were added.

Encouraging More Communities To Be Supportive When Life Happens

When I shared these thoughts with the IndieWeb chat and wiki a couple of weeks ago, no one knew of any other open (source, standards, etc.) communities that had such an explicit “Life Happens” statement or otherwise explicitly captured such a sentiment.

My hope is that the IndieWeb community can set a good example here for making a community more humane and caring (rather than the “just work harder” capitalist default, or quiet unemotional detached neglect of an abandoned GitHub repo).

That being said, we’re definitely interested in knowing about other intentional creative communities with any similar explicit sentiments or statements of community care, especially those that acknowledge that members of a community may experience things which are more important to them than their participation in that community, and being supportive of that.

This blog post is a snapshot in time and my own expression, most of which is shared freely on the IndieWeb wiki.

If this kind of statement resonates with you and your communities, you’re encouraged to write one of your own, borrowing freely from the latest (and CC0 licensed) version on the wiki: life happens. Attribution optional. Either way, let us know, as it would be great to collect other examples of communities with explicit “life happens” statements.

Thanks

Thanks to early feedback & review in chat from Kevin Marks, Jacky Alcine, Anthony Ciccarello, Ben Werdmüller, and gRegor Morrill. On the wiki page, thanks for excellent additions from Chris Aldrich, and proofreading & precise fixes from Jeremy Cherfas. Thanks for the kind tweets Ana Rodrigues and Barry Frost.

Now back to some current “life happens” things… (also posted on IndieNews)

Cameron KaiserTenFourFox FPR30 SPR2 available

TenFourFox Feature Parity Release "30.2" (SPR 2) is now available for testing (downloads, hashes, release notes). The reason this is another security-only release is my work schedule, and also that I spent a lot of time spinning my wheels on issue 621, which is the highest priority JavaScript concern because it is an outright crash. The hope was that I could recreate one of the Apple discussion pages locally and mess with it and maybe understand what is unsettling the parser, but even though I thought I had all the components, it still won't load or reproduce in a controlled environment. I've spent too much time on it, and even if I could do more diagnostic analysis I still don't know if I can do anything better than "not crash" (and in SPR2 there is a better "don't crash" fix, just one that doesn't restore any functionality). Still, if you are desperate to see this fixed, see if you can create a completely local Apple discussions page or clump of files that reliably crashes the browser. If you do this, either attach the archive to the Github issue or open a Tenderapp ticket and let's have a look. No promises, but if the community wants this fixed, the community will need to do some work on it.

In the meantime, I want to get back to devising a settings tab to allow the browser to automatically pick appropriate user agents and/or start reader mode by domain so that sites that are expensive or may not work with TenFourFox's older hacked-up Firefox 45 base can automatically select a fallback. Our systems are only getting slower compared to all the crap modern sites are adding, after all. I still need to do some plumbing work on it, so the fully-fledged feature is not likely to be in FPR31, but I do intend to have some more pieces of the plumbing documented so that you can at least mess with that. The user-agent selector will be based on the existing functionality that was restored in FPR17.

Assuming no major issues, FPR30 SPR2 goes live Monday evening Pacific as usual.

Daniel Stenberg“I will slaughter you”

You might know that I’ve posted funny emails I’ve received on my blog several times in the past. The kind of emails people send me when they experience problems with some device they own (like a car) and they contact me because my email address happens to be visible somewhere.

People sometimes say I should get a different email address or use another one in the curl license file, but I’ve truly never had a problem with these emails, as they mostly remind me of the tough challenges that modern technical life brings to people, and they give me insights into the kinds of things that run curl.

But not all of these emails are “funny”.

Category: not funny

Today I received the following email

From: Al Nocai <[redacted]@icloud.com>
Date: Fri, 19 Feb 2021 03:02:24 -0600
Subject: I will slaughter you

That subject.

As an open source maintainer of over twenty years, I know flame wars and personal attacks; I have a fairly thick skin and I don’t let words get to me easily. It took me a minute to absorb and realize it was actually meant as a direct physical threat. It found its way through and got to me. This level of aggressiveness is not what I’m prepared for.

Attached in this email, there were seven images and no text at all. The images all look like screenshots from a phone and the first one is clearly showing source code I wrote and my copyright line:

The other images showed other source code and related build/software info of other components, but I couldn’t spot how they were associated with me in any way.

No explanation, just that subject and the seven images and I was left to draw my own conclusions.

I presume the name in the email is made up and the email account is probably a throw-away one. The time zone used in the Date: string might imply US central standard time but could of course easily be phony as well.

How I responded

Normally I don’t respond to these confused emails because the distance between me and the person writing them is usually almost interplanetary. This time though, it was so far beyond what’s acceptable to me and in any decent society I couldn’t just let it slide. After I took a little pause and walked around my house for a few minutes to cool off, I wrote a really angry reply and sent it off.

This was a totally and completely utterly unacceptable email and it hurt me deep in my soul. You should be ashamed and seriously reconsider your manners.

I have no idea what your screenshots are supposed to show, but clearly something somewhere is using code I wrote. Code I have written runs in virtually every Internet connected device on the planet and in most cases the users download and use it without even telling me, for free.

Clearly you don’t deserve my code.

I don’t expect that it will be read or make any difference.

Update below, added after my initial post.

Al Nocai’s response

Contrary to my expectations above, he responded. It’s not even worth commenting on, but for transparency I’ll include it here.

I do not care. Your bullshit software was an attack vector that cost me a multimillion dollar defense project.

Your bullshit software has been used to root me and multiple others. I lost over $15k in prototyping alone from bullshit rooting to the charge arbitrators.

I have now since October been sandboxed because of your bullshit software so dipshit google kids could grift me trying to get out of the sandbox because they are too piss poor to know shat they are doing.

You know what I did to deserve that? I tried to develop a trade route in tech and establish project based learning methodologies to make sure kids aren’t left behind. You know who is all over those god damn files? You are. Its sickening. I got breached in Oct 2020 through federal server hijacking, and I owe a great amount of that to you.

Ive had to sit and watch as i reported:

  1. fireeye Oct/2020
  2. Solarwinds Oct/2020
  3. Zyxel Modem Breach Oct/2020
  4. Multiple Sigover attack vectors utilizing favicon XML injection
  5. JS Stochastic templating utilizing comparison expressions to write to data registers
  6. Get strong armed by $50billion companies because i exposed bullshit malware

And i was rooted and had my important correspondence all rerouted as some sick fuck dismantled my life with the code you have your name plastered all over. I cant even leave the country because of the situation; qas you have so effectively built a code base to shit all over people, I dont give a shit how you feel about this.

You built a formula 1 race car and tossed the keys to kids with ego problems. Now i have to deal with Win10 0-days because this garbage.

I lost my family, my country my friends, my home and 6 years of work trying to build a better place for posterity. And it has beginnings in that code. That code is used to root and exploit people. That code is used to blackmail people.

So no, I don’t feel bad one bit. You knew exactly the utility of what you were building. And you thought it was all a big joke. Im not laughing. I am so far past that point now.

/- Al

Al continues

Nine hours after I first published this blog post, Al replied again with two additional emails. His third and fourth emails to me.

Email 3:

https://davidkrider.com/i-will-slaughter-you-daniel-haxx-se/
Step up. You arent scaring me. What led me here? The 5th violent attempt on my life. Apple terms of service? gtfo, thanks for the platform.

Amusingly he has found a blog post about my blog post.

Email 4:

There is the project: MOUT Ops Risk Analysis through Wide Band Em Spectrum analysis through different fourier transforms.
You and whoever the fuck david dick rider is, you are a part of this.
Federal server breaches-
Accomplice to attempted murder-
Fraud-
just a few.

I have talked to now: FBI FBI Regional, VA, VA OIG, FCC, SEC, NSA, DOH, GSA, DOI, CIA, CFPB, HUD, MS, Convercent, as of today 22 separate local law enforcement agencies calling my ass up and wasting my time.

You and dick ridin’ dave are respinsible. I dont give a shit, call the cops. I cuss them out wheb they call and they all go silent.

I’ve kept his peculiar formatting and typos. In email 4 there was also a PDF file attached named BustyBabes 4.pdf. It is apparently a 13 page document about the “NERVEBUS NERVOUS SYSTEM” described in the first paragraph as “NerveBus Nervous System aims to be a general utility platform that provides comprehensive and complex analysis to provide the end user with cohesive, coherent and “real-time” information about the environment it monitors.”. There’s no mention of curl or my name in the document.

Since I don’t know the status of this document I will not share it publicly, but here’s a screenshot of the front page:

Related

This topic on hacker news and reddit.

I have reported the threat to the Swedish police (where I live).

The Mozilla BlogExpanding Mozilla’s Boards

I’m delighted to share that the Mozilla Foundation and Corporation Boards are each welcoming a new member.

Wambui Kinya is Vice President of Partner Engineering at Andela, a Lagos-based global talent network that connects companies with vetted, remote engineers from Africa and other emerging markets. Andela’s vision is a world where the most talented people can build a career commensurate with their ability – not their race, gender, or geography. Wambui joins the Mozilla Foundation Board and you can read more from her, here, on why she is joining. Motivated by the intersection of Africa, technology and social impact, Wambui has led business development and technology delivery, digital technology implementation, and marketing enablement across Africa, the United States, Europe and South America. In 2020 she was selected as one of the “Top 30 Most Influential Women” by CIO East Africa.

Laura Chambers is Chief Executive Officer of Willow Innovations, which addresses one of the biggest challenges for mothers, with the world’s first quiet, all-in-one, in-bra, wearable breast pump. She joins the Mozilla Corporation Board. Laura holds a wealth of knowledge in internet product, marketplace, payment, and community engagement from her time at AirBnB, eBay, PayPal, and Skype, as well as her current role at Willow. Her experience also includes business operations, marketing, shipping, global customer trust and community engagement. Laura brings a clear understanding of the challenges we face in building a better internet, coupled with strong business acumen, and an acute ability to hone in on key issues and potential solutions. You can read more from Laura about why she is joining here.

At Mozilla, we invite our Board members to be more involved with management, employees and volunteers than is generally the case, as I’ve written about in the past. To ready them for this, Wambui and Laura met with existing Board members, members of the management team, individual contributors and volunteers.

We know that the challenges of the modern internet are big, and expanding our capacity will help us develop solutions to those challenges. I am sure that Laura and Wambui’s insights and strategic thinking will be a great addition to our boards.

The post Expanding Mozilla’s Boards appeared first on The Mozilla Blog.

The Mozilla BlogWhy I’m Joining Mozilla’s Board of Directors

Laura Chambers

Like many of us I suspect, I have long been a fairly passive “end-user” of the internet. In my daily life, I’ve merrily skipped along it to simplify and accelerate my life, to be entertained, to connect with far-flung friends and family, and to navigate my daily life. In my career in Silicon Valley, I’ve happily used it as a trusty building block to help build many consumer technologies and brands – in roles leading turnarounds and transformations at market-creating companies like eBay, PayPal, Skype, Airbnb, and most recently as CEO of Willow Innovations Inc.

But over the past few years, my relationship with the internet has significantly changed. We’ve all had to face up to the cracks and flaws … many of which have been there for a while, but have recently opened into gaping chasms that we can’t ignore. The impact of curated platforms and data misuse on families, friendships, communities, politics and the global landscape has been staggering. And it’s hit close to home … I have three young children, all of whom are getting online much faster and earlier than expected, due to the craziness of homeschooling, and my concerns about their safety and privacy are tremendous. All of a sudden, my happy glances at the internet have been replaced with side-eyes of mistrust.

So last year, in between juggling new jobs, home-offices full of snoring dogs, and home schooling, I started to think about what I could do to help. In that journey, I was incredibly fortunate to connect with the team at Mozilla. As I learned more about the team, met the talented people at the helm, and dove into their incredible mission to ensure the internet is free, open and accessible to all, I couldn’t think of a better way to do something practical and meaningful to help than through joining the Board.

The opportunity ahead is astounding … using the power of the open internet to make the world a better, freer, more positively connected place. Mozilla has an extraordinary legacy of leading that charge, and I couldn’t be more thrilled to join the exceptional group driving toward a much better future. I look forward to us all once again being able to merrily skip along our daily lives, with the internet as our trusty guide and friend along the way.

The post Why I’m Joining Mozilla’s Board of Directors appeared first on The Mozilla Blog.

Armen ZambranoMaking pip installations faster with wheel

I recently noticed the following message in Sentry’s pip installation step:

Using legacy ‘setup.py install’ for openapi-core, since package ‘wheel’ is not installed.

Upon some investigation, I noticed that the wheel package was not being installed. After making some changes, I can now guarantee that our development environment installs it by default, and it has given us roughly a 40–50% speed gain.

Timings from before and after installing wheel

The screenshot above shows the same step from two different GitHub workflows; it installs Sentry’s Python packages inside a fresh virtualenv, with the pip cache available.

If you see a message saying that the wheel package is not installed, make sure to attend to it!
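
The fix is usually a one-liner. Here is a minimal sketch, assuming a typical requirements.txt-based setup (the file name is illustrative, not necessarily Sentry’s actual layout):

# Install wheel before the rest of the dependencies so pip can build wheels
# instead of falling back to the legacy 'setup.py install' path.
pip install --upgrade pip wheel
pip install -r requirements.txt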

The Mozilla BlogWhy I’m Joining Mozilla’s Board of Directors

Wambui Kinya

My introduction to Mozilla was when Firefox was first launched. I was starting my career as a software developer in Boston, MA at the time. My experience was that Firefox was a far superior browser. I was also deeply fascinated by the notion that, as an open community, we could build and evolve a product for the greater good.

You have probably deduced from this, that I am also old enough that growing up in my native country, Kenya, most of my formative years were under the policies of “poverty reduction programs” dictated and enforced by countries and institutions in the northern hemisphere. My firsthand experience of many of these programs was observing my mother, a phenomenal environmental engineer and academic, work tirelessly to try to convince donor organizations to be more inclusive of the communities they sought to serve and benefit.

This drive to have greater inclusion and representation was deepened over ten years of being a woman and person of color in technology in corporate America. I will spare you the heartache of recounting my experiences of being the first or the only one. But I must also acknowledge, I was fortunate enough to have leaders who wanted to help me succeed and grow. As my professional exposure became more global, I felt an urgency to have more representation and greater voice from Africa.

When I moved back to Kenya, ten years ago, I was excited about the advances in access to technology. However, I was disheartened that it was primarily as consumers rather than creators of technology products. We were increasingly distanced from the concentration of power influencing our access, our data and our ability to build and compete in this internet age.

My professional journey has since been informed by the culmination of believing in the talent that is in Africa, the desire to build for Africa and by extension the digital sovereignty of citizens of the global south. I was greatly influenced by the audacity of organizations like ThoughtWorks that thought deeply about the fight against digital colonialism and invested in free and open source products and communities. This is the context in which I was professionally reintroduced to Mozilla and its manifesto.

Mozilla’s commitment and reputation to “ensure the internet remains a public resource that is open and accessible to us all” has consistently inspired me. However, there is an increased urgency to HOW this is done given the times we live in. We must not only build, convene and enable technology and communities on issues like disinformation, privacy, trustworthy AI and digital rights, but it is imperative that we consider:

  • how to rally citizens and ensure greater representation;
  • how we connect leaders and enable greater agency to produce; and finally,
  • how we shape an agenda that is more inclusive.

This is why I have joined the Mozilla board. I am truly honored and look forward to contributing but also learning alongside you.

Onwards ever, backwards never!

The post Why I’m Joining Mozilla’s Board of Directors appeared first on The Mozilla Blog.

Benjamin BouvierA primer on code generation in Cranelift

Cranelift is a code generator written in the Rust programming language. It aims to compile quickly while producing machine code that runs at reasonable speeds.

The Cranelift compilation model consists of compiling functions one by one, holding extra information about external entities, like external functions, memory addresses, and so on. This model allows for concurrent and parallel compilation of individual functions, which supports the goal of fast compilation. It was designed this way to allow for just-in-time (JIT) compilation of WebAssembly binary code in Firefox, although its scope has broadened a bit. Nowadays it is used in a few different WebAssembly runtimes, including Wasmtime and Wasmer, but also as an alternative backend for Rust debug compilation, thanks to cg_clif.

A classic compiler design usually consists of running a parser to translate the source into some form of intermediate representation, then running optimization passes over it, and finally feeding the result to the machine code generator.

This blog post focuses on the final step, namely the concepts that are involved in code generation, and what they map to in Cranelift. To make things more concrete, we'll take a specific instruction, and see how it's translated, from its creation down to code generation. At each step of the process, I'll provide a short (ahem) high-level explanation of the concepts involved, and I'll show what they map to in Cranelift, using the example instruction. While this is not a tutorial detailing how to add new instructions in Cranelift, this should be an interesting read for anyone who's interested in compilers, and this could be an entry point if you're interested in hacking on the Cranelift codegen crate.

This is our plan for this blog post, shown as a pipeline: each plain name is data, each name in parentheses is a process. We're going to go through each of them below.

Optimized CLIF → (Lowering) → VCode → (Register allocation) → Final VCode → (Machine code generation) → Machine code artifacts

Intermediate representations

Compilers use intermediate representations (IR) to represent source code. Here we're interested in representations of the data flow, that is, the instructions themselves and only that. The IRs contain information about the instructions, their operands, type specialization information, and any additional metadata that might be useful. IRs usually map to a certain level of abstraction, and as such, they are useful for solving different problems that require different levels of abstraction. Their shape (which data structures are used) and their number often have a huge impact on the performance of the compiler itself (that is, how fast it is at compiling).

In general, most programming languages use IRs internally, yet these are invisible to programmers. The reason is that source code is usually first parsed (tokenized, verified) and then translated into an IR. The abstract syntax tree, aka AST, is one such IR, representing the program in a format that's very close to the source code. Since the raison d'être of Cranelift is to be a code generator, having a text format is secondary, and only useful for testing and debugging purposes. That's why embedders directly create and manipulate Cranelift's IR.

At the time of writing, Cranelift has two IRs to represent the function's code:

  • one external, high-level intermediate representation, called CLIF (for Cranelift IR format),
  • one internal, low-level intermediate representation called VCode (for virtual-registerized code).

CLIF IR

CLIF is the IR that Cranelift embedders create and manipulate. It consists of high-level typed operations that are convenient to use and/or can be simply translated to machine code. It is in static single assignment (SSA) form: each value referenced by an operation (SSA value) is defined only once, and may have as many uses as desired. CLIF is practical to use and manipulate for classic compiler optimization passes (e.g. LICM), as it is generic over the target architecture we're compiling to.

let x = builder.ins().iconst(types::I64, 42);
let y = builder.ins().iconst(types::I64, 1337);
let sum = builder.ins().iadd(x, y);

An example of Rust code that would generate CLIF IR: using an IR builder, two constant 64-bit integer SSA values x and y are created and then added together. The result is stored in the sum SSA value, which can then be consumed by other instructions.
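
For reference, the textual form of the CLIF produced by a snippet like this would look roughly as follows (the value numbering is illustrative):

v0 = iconst.i64 42
v1 = iconst.i64 1337
v2 = iadd v0, v1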

The code for the IR builder we're manipulating above is automatically generated by the cranelift-codegen build script. The build script uses a domain specific meta language (DSL)1 that defines the instructions, their input and output operands, which input types are allowed, how the output type is inferred, etc. We won't take a look at this today: this is a bit too far from code generation, but this could be material for another blog post.

As an example of a full-blown CLIF generator, there is a crate in the Cranelift project that allows translating from the WebAssembly binary format to CLIF. The Cranelift backend for Rustc uses its own CLIF generator that translates from one of the Rust compiler's IRs.

Finally, it's time to reveal what's going to be our running example! The Chosen One is the iadd CLIF operation, which adds two integers of any supported width, with wrapping semantics. It is both simple to understand and it exhibits interesting behaviors on the two architectures we're interested in. So, let's continue down the pipeline!

VCode IR

Later on, the CLIF intermediate representation is lowered, i.e. transformed from a high-level form into a lower-level one. Here lower level means a form more specialized for a machine architecture. This lower IR is called VCode in Cranelift. The values it references are called virtual registers (more on the virtual bit below). They're not in SSA form anymore: each virtual register may be redefined as many times as we want. This IR is used to encode register allocation constraints and it guides machine code generation. As a matter of fact, since this information is tied to the machine code's representation itself, this IR is also target-specific: there's one flavor of VCode for each CPU architecture we're compiling to.

Let's get back to our example, which we're going to compile for two instruction set architectures:

  • ARM 64-bit (aka aarch64), which is used in most mobile devices and is starting to become mainstream on laptops (Apple's M1 Macs, some Chromebooks);
  • Intel's x86 64-bit (aka x86_64, also abbreviated x64), which is used in most desktop and laptop machines.

An integer addition machine instruction on aarch64 takes three operands: two input operands (one of which must be a register), and a third, output register operand. On the x86_64 architecture, the equivalent instruction involves a total of two registers: one that is a read-only source register, and another that is an in-out modified register, serving as both the second source and the destination. We'll get back to this.

So considering iadd, let's look at (one of2) the VCode instructions used to represent integer additions on aarch64 (as defined in cranelift/codegen/src/isa/aarch64/inst/mod.rs):

/// An ALU operation with two register sources and a register destination.
AluRRR {
    alu_op: ALUOp,
    rd: Writable<Reg>,
    rn: Reg,
    rm: Reg,
},

Some details here:

  • alu_op defines the sub-opcode used in the ALU (Arithmetic Logic Unit). It will be ALUOp::Add64 for a 64-bit integer addition.
  • rn and rm are the conventional aarch64 names for the two input registers.
  • rd is the destination register. See how it's marked as Writable, while the two others are not? Writable is a plain Rust wrapper that makes sure that we can statically differentiate read-only registers from writable registers; a neat trick that allows us to catch more issues at compile-time.

All this information is directly tied to the machine code representation of an addition instruction on aarch64: each field is later used to select some bytes that will be generated during code generation.

As said before, the VCode is specific to each architecture, so x86_64 has a different VCode representation for the same instruction (as defined in cranelift/codegen/src/isa/x64/inst/mod.rs):

/// Integer arithmetic/bit-twiddling: (add sub and or xor mul adc? sbb?) (32 64) (reg addr imm) reg
AluRmiR {
    is_64: bool,
    op: AluRmiROpcode,
    src: RegMemImm,
    dst: Writable<Reg>,
},

Here, the sub-opcode is defined as part of the AluRmiROpcode enum (the comment hints at which other x86 machine instructions are generated by this same VCode). See how there's only one src (source) register (or memory or immediate operand), while the instruction conceptually takes two inputs? That's because the dst (destination) register is expected to be modified, that is, both read (so it's the second input operand) and written to (so it's the result register). In equivalent C code, x86's add instruction doesn't actually do a = b + c. What it does is a += b, that is, one of the sources is consumed by the instruction. This is an artifact inherited from the design of older x86 machines from the 1970s, when instructions were designed around an accumulator model (and efficiently representing three operands in a CISC architecture would make the encoding larger and harder to handle than it is).
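
To make the difference concrete, here is a small illustration in plain Rust (this is not Cranelift code; the function names and the use of u64 are just for the example):

// aarch64-style three-operand add: `add rd, rn, rm`.
// The destination is separate; neither input is modified.
fn add_aarch64(rn: u64, rm: u64) -> u64 {
    rn.wrapping_add(rm)
}

// x86_64-style two-operand add: `add dst, src`.
// The destination is tied: it is read as the second input, then overwritten.
fn add_x86_64(dst: &mut u64, src: u64) {
    *dst = dst.wrapping_add(src);
}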

Instruction selection (lowering)

As said before, converting from the high-level IR (CLIF) to the low-level IR (VCode) is called lowering. Since VCode is target-dependent, this process is also target-dependent. That's where we consider which machine instructions eventually get used for a given CLIF opcode. There are many ways to achieve the same machine state for given semantics, but some of these ways are faster than others, and/or require fewer code bytes. The problem can be summed up like this: given some CLIF, which VCode can we create to generate the fastest and/or smallest machine code that carries out the desired semantics? This is called instruction selection, because we're selecting the VCode instructions among a set of different possible instructions.

How do these IRs map to each other? A given CLIF node may be lowered into 1 to N VCode instructions. A given VCode instruction may lead to the generation of 1 to M machine instructions. There are no rules governing the maximum number of entities mapped. For instance, the integer addition CLIF opcode iadd on 64-bit inputs maps to a single VCode instruction on aarch64, and that VCode instruction then causes a single machine instruction to be generated.

Other CLIF opcodes may eventually generate more than a single machine instruction. Consider the CLIF opcode for signed integer division, idiv. Its semantics define that it traps on a zero divisor and on integer overflow3. On aarch64, this is lowered into:

  • one VCode instruction that checks whether the divisor is zero and traps if it is
  • two VCode instructions for comparing the input values against the minimal integer value and -1
  • one VCode instruction to trap if the two input values match what we checked against
  • and one VCode instruction that does the actual division operation.

Each of these VCode instructions then generates one or more machine code instructions, resulting in a somewhat longer sequence.
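
As a conceptual sketch, the checks performed by that lowered sequence correspond to something like the following plain Rust (this illustrates the semantics only; it is not Cranelift lowering code):

fn checked_sdiv(x: i64, y: i64) -> i64 {
    if y == 0 {
        panic!("trap: division by zero"); // VCode: trap on a zero divisor
    }
    if x == i64::MIN && y == -1 {
        panic!("trap: integer overflow"); // VCode: compare against INT_MIN and -1, then trap
    }
    x.wrapping_div(y) // VCode: the actual division instruction
}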

Let's look at the lowering of iadd on aarch64 (in cranelift/codegen/src/isa/aarch64/lower_inst.rs), edited and simplified for clarity. I've added comments in the code, explaining what each line does:

Opcode::Iadd => {
    // Get the destination register.
    let rd = get_output_reg(ctx, outputs[0]).only_reg().unwrap();
    // Get the controlling type of the addition (32-bits int or 64-bits int or
    // int vector, etc.).
    let ty = ty.unwrap();
    // Force one of the inputs into a register, not applying any signed- or
    // zero-extension.
    let rn = put_input_in_reg(ctx, inputs[0], NarrowValueMode::None);
    // Try to see if we can encode the second operand as an immediate on
    // 12-bits, maybe by negating it;
    // Otherwise, put it into a register.
    let (rm, negated) = put_input_in_rse_imm12_maybe_negated(
        ctx,
        inputs[1],
        ty_bits(ty),
        NarrowValueMode::None,
    );
    // Select the ALU subopcode, based on possible negation and controlling
    // type.
    let alu_op = if !negated {
        choose_32_64(ty, ALUOp::Add32, ALUOp::Add64)
    } else {
        choose_32_64(ty, ALUOp::Sub32, ALUOp::Sub64)
    };
    // Emit the VCode instruction in the VCode stream.
    ctx.emit(alu_inst_imm12(alu_op, rd, rn, rm));
}

In fact, the alu_inst_imm12 wrapper can create one VCode instruction among a set of possible ones (since we're trying to select the best one). For the sake of simplicity, we'll assume that AluRRR is going to be generated, i.e. the selected instruction is the one using only register encodings for the input values.

Register allocation

VCode with virtual registers → (Register allocation) → VCode with real registers → (Code generation) → Machine code

VCode, registers and stack slots

Hey, ever wondered what the V in VCode meant? Back to the drawing board. While a program may reference a theoretically unlimited number of instructions, each referencing a theoretically unlimited number of values as inputs and outputs, the physical machine only has a fixed set of containers for those values:

  • either they must live in machine registers: very fast to access in the CPU, take some CPU real estate, thus are costly, so there are usually few of them.
  • or they must live in the process' stack memory: it's slower to access, but we can have virtually any amount of stack slots.

mov %edi,-0x4(%rbp)
mov %rsi,-0x10(%rbp)
mov -0x4(%rbp),%eax

In this example of x86 machine code, %edi, %rsi, %rbp and %eax are all registers; the stack slots are memory addresses computed as the frame pointer (%rbp) plus an offset value (which happens to be negative here). Note that stack slots may also be addressed relative to the stack pointer (%rsp) in general.

Defining the register allocation problem

The problem of mapping the IR values (in VCode these are the Reg) to machine "containers" is called register allocation (aka regalloc). Inputs to register allocation can be as numerous as we want them, and map to "virtual" values, hence we call them virtual registers. And... that's where the V from VCode comes from: the instructions in VCode reference values that are virtual registers before register allocation, so we say the code is in virtualized register form. The output of register allocation is a set of new instructions, where the virtual registers have been replaced by real registers (the physical ones, limited in quantity) or stack slots references (and other additional metadata).

// Before register allocation, with unlimited virtual registers:
v2 = v0 + v1
v3 = v2 * 2
v4 = v2 + 1
v5 = v4 + v3
return v5

// One possible register allocation, on a machine that has 2 registers %r0, %r1:
%r0 = %r0 + %r1
%r1 = %r0 * 2
%r0 = %r0 + 1
%r1 = %r0 + %r1
return %r1

When all is well, the virtual registers don't all conceptually live at the same time, and they can each be put into a physical register. Issues arise when there aren't enough physical registers to hold all the virtual registers that are live at the same time, which is the case for... a very large majority of programs. Then, register allocation must decide which values continue to live in registers at a given program point, and which should be spilled into a stack slot, effectively storing them onto the stack for later use. Reusing them later implies reloading them from the stack slot, using a load machine instruction. The complexity resides in choosing which values should be spilled, at which program point they should be spilled, and at which program points we should reload them, if we need to do so. Making good choices there has a large impact on the speed of the generated code, since memory accesses to the stack imply an additional runtime cost. For instance, a variable that's frequently used in a hot loop should live in a register for the whole loop's lifetime, and not be spilled/reloaded in the middle of the loop.

// Before register allocation, with unlimited virtual registers:
v2 = v0 + v1
v3 = v0 + v2
v4 = v3 + v1
return v4

// One possible register allocation, on a machine that has 2 registers %r0, %r1.
// We need to spill one value, because there's a point where 3 values are live at the same time!
spill %r1 --> stack_slot(0)
%r1 = %r0 + %r1
%r1 = %r0 + %r1
reload stack_slot(0) --> %r0
%r1 = %r1 + %r0
return %r1

And, since we like to have our cake and eat it too, the register allocator itself should be fast: it should not take an unbounded amount of time to make these allocation decisions. Register allocation has the good taste to be an NP-complete problem. Concretely, this means that implementations cannot always find the optimal solution for arbitrary inputs in reasonable time; instead they approximate good solutions using heuristics, in worst-case quadratic time over the size of the input. All of this means that register allocation is a whole research field of its own, and it has been extensively studied for some time now. It is a fascinating problem.

Register allocation in Cranelift

Back to Cranelift. The register allocation contract is that if a value must live in a real register at a given program point, then it does live where it should (unless register allocation is impossible). At the start of code generation for a VCode instruction, we are guaranteed that the input values live in real registers, and that the output real register is available before the next VCode instruction.

You might have noticed that the VCode instructions only refer to registers, and not stack slots. But where are the stack slots, then? The trick is that the stack slots are invisible to VCode. Register allocation may create an arbitrary number of spills, reloads, and register moves4 around VCode instructions, to ensure that their register allocation constraints are met. This is why the output of register allocation is a new list of instructions, which includes not only the initial instructions, now filled with the actual registers, but also the additional spill, reload and move (VCode) instructions added by regalloc.

As said before, this problem is sufficiently complex, involved and independent from the rest of the code (assuming the right set of interfaces!) that its code lives in a separate crate, regalloc.rs, with its own fuzzing and testing infrastructure. I hope to shed some light on it at some point too.

What's interesting to us today is the register allocation constraints. Consider the aarch64 integer add instruction add rd, rn, rm: rd is the output virtual register that's written to, while rn and rm are the inputs, thus read from. We need to inform the register allocation algorithm about these constraints. In regalloc jargon, "read from" is known as used, while "written to" is known as defined. Here, the aarch64 VCode instruction AluRRR does use rn and rm, and it defines rd. This usage information is collected in the aarch64_get_regs function (cranelift/codegen/src/isa/aarch64/inst/mod.rs):

fn aarch64_get_regs(inst: &Inst, collector: &mut RegUsageCollector) {
    match inst {
        &Inst::AluRRR { rd, rn, rm, .. } => {
            collector.add_def(rd);
            collector.add_use(rn);
            collector.add_use(rm);
        }
        // etc.

Then, after register allocation has assigned the physical registers, we need to tell it how to replace virtual register mentions with physical register mentions. This is done in the aarch64_map_regs function (same file as above):

fn aarch64_map_regs<RUM: RegUsageMapper>(inst: &mut Inst, mapper: &RUM) {
    // ...
    match inst {
        &mut Inst::AluRRR {
            ref mut rd,
            ref mut rn,
            ref mut rm,
            ..
        } => {
            map_def(mapper, rd);
            map_use(mapper, rn);
            map_use(mapper, rm);
        }

        // etc.

Note how this mirrors quite precisely what the usage collector did: we're replacing the virtual register mention for the defined register rd with the information (which real register) provided by the RegUsageMapper. These two functions must stay in sync, otherwise here be dragons (and bugs that are very hard to debug)!

Register allocation on x86

On Intel's x86, register allocation may be a bit trickier: in some cases, the lowering needs to be carefully written so it satisfies some register allocation constraints that are very specific to this architecture. In particular, x86 has fixed register constraints as well as tied operands.

For this specific part, we'll look at the integer shift-left instruction, which is equivalent to C's x << y. Why this particular instruction? It exhibits both properties that we're interested in studying here. The lowering of iadd is similar, albeit slightly simpler, as it only involves tied operands.

Fixed register constraints

On the one hand, some instructions expect their inputs to be in fixed registers, that is, specific registers arbitrarily predefined by the architecture manual. For the example of the shift instruction, if the count is not statically known at compile time (it's not a shift by a constant value), then the amount by which we're shifting must be in the rcx register5.

Now, how do we make sure that the input value actually is in rcx? We can mark rcx as used in the get_regs function so regalloc knows about this, but nothing ensures that the input resides in it at the beginning of the instruction. To resolve this, we'll introduce a move instruction during lowering, that is going to copy the input value into rcx. Then we're sure it lives there, and register allocation knows it's used: we're good to go!

In a nutshell, this shows how lowering and register allocation play together:

  • during lowering, we introduce a move from a dynamic shift input value to rcx before the actual shift
  • in the register usage function, we mark rcx as used
  • (nothing to do in the register mapping function: rcx is a real register already)

Tied operands

On the other hand, some instructions have operands that are both read and written at the same time: we call them modified in Cranelift and regalloc.rs, but they're also known as tied operands in the compiler literature. It's not just that there's a register that must be read, and a register that must be written to: they must be the same register. How do we model this, then?

Consider a naive solution. We take the input virtual register, and decide it's allocated to the same register as the output (modified) register. Unfortunately, if the chosen virtual register was going to be reused by a later VCode instruction, then its value would be overwritten (clobbered) by the current instruction. This would result in incorrect code being generated, so this is not acceptable. In general, we can't clobber the value held by an input during lowering, because it's regalloc's role to make that kind of decision.

// Before register allocation, with virtual registers:
v2 = v0 + v1
v3 = v0 + 42

// After register allocation, on a machine with two registers %r0 and %r1:
// assign v0 to %r0, v1 to %r1, v2 to %r0
%r0 += v1
... = %r0 + 42 // ohnoes! the value in %r0 is v2, not v0 anymore!

The right solution is, again, to copy this input virtual register into the output virtual register, right before the instruction. This way, we can still reuse the untouched input register in other instructions without modifying it: only the copy is written to.

Phew! We can now look at the entire lowering for the shift-left instruction, edited and commented for clarity:

// Read the instruction operand size from the output's type.
let size = dst_ty.bytes() as u8;

// Put the left hand side into a virtual register.
let lhs = put_input_in_reg(ctx, inputs[0]);

// Put the right hand side (shift amount) into either an immediate (if it's
// statically known at compile time), or into a virtual register.
let (count, rhs) =
    if let Some(cst) = ctx.get_input_as_source_or_const(insn, 1).constant {
        // Mask count, according to Cranelift's semantics.
        let cst = (cst as u8) & (dst_ty.bits() as u8 - 1);
        (Some(cst), None)
    } else {
        (None, Some(put_input_in_reg(ctx, inputs[1])))
    };

// Get the destination virtual register.
let dst = get_output_reg(ctx, outputs[0]).only_reg().unwrap();

// Copy the left hand side into the (modified) output operand, to satisfy the
// mod constraint.
ctx.emit(Inst::mov_r_r(true, lhs, dst));

// If the shift count is statically known: nothing particular to do. Otherwise,
// we need to put it in the RCX register.
if count.is_none() {
    let w_rcx = Writable::from_reg(regs::rcx());
    // Copy the shift count (which is in rhs) into RCX.
    ctx.emit(Inst::mov_r_r(true, rhs.unwrap(), w_rcx));
}

// Generate the actual shift instruction.
ctx.emit(Inst::shift_r(size, ShiftKind::ShiftLeft, count, dst));

And this is how we tell the register usage collector about our constraints:

Inst::ShiftR { num_bits, dst, .. } => {
    if num_bits.is_none() {
        // if the shift count is dynamic, mark RCX as used.
        collector.add_use(regs::rcx());
    }
    // In all the cases, the destination operand is modified.
    collector.add_mod(*dst);
}

Only the modified register needs to be mapped to its allocated physical register:

Inst::ShiftR { ref mut dst, .. } => {
    map_mod(mapper, dst);
}

Virtual register copies and performance

Do these virtual register copies sound costly to you? In theory, they could lead to the generation of extra move instructions, increasing the size of the generated code and adding a small runtime cost. In practice, register allocation, through its interface, knows how to identify move instructions, along with their source and destination. By analyzing them, it can see when a source isn't used after a given move instruction, and thus allocate the same register to both the source and the destination of the move. Then, when Cranelift generates the code, it avoids generating a move from a physical register to the same one6. As a matter of fact, creating a VCode copy doesn't necessarily mean that a machine code move instruction will be generated later: it is present just in case regalloc needs it, and it can be avoided when it's spurious.
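
Concretely, using the same pseudo-notation as the earlier register examples (this is an illustration, not actual VCode):

// Before register allocation: a copy was inserted to satisfy the tied operand.
v2 = copy v0
v2 = v2 << v1

// After register allocation, if v0 is dead after the copy, v0 and v2 can share
// %r0, and the now-useless "%r0 = copy %r0" is simply never emitted:
%r0 = %r0 << %r1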

Code generation

Oh my, we're getting closer to actually being able to run the code! Once register allocation has run, we can generate the actual machine code for the VCode instructions. Cool kids call this step of the pipeline codegen, for code generation. This is the part where we decipher the architecture manuals provided by the CPU vendors, and generate the raw machine bytes for our machine instructions. In Cranelift, this means filling a code buffer (there's a MachBuffer sink interface for this!), returned along with some internal relocations7 and additional metadata. Let's see what happens for our integer addition when the time comes to generate the code for its VCode equivalent AluRRR on aarch64 (in cranelift/codegen/src/isa/aarch64/inst/emit.rs):

// We match on the VCode's identity here:
&Inst::AluRRR { alu_op, rd, rn, rm } => {
    // First select the top 11 bits based on the ALU subopcode.
    let top11 = match alu_op {
        ALUOp::Add32 => 0b00001011_000,
        ALUOp::Add64 => 0b10001011_000,
        // etc
    };
    // Then decide the bits 10 to 15, based on the ALU subopcode as well.
    let bit15_10 = match alu_op {
        // other cases
        _ => 0b000000,
    };
    // Then use a helper and pass forward the allocated physical register
    // values.
    sink.put4(enc_arith_rrr(top11, bit15_10, rd, rn, rm));
}

And what's this enc_arith_rrr doing, then?

fn enc_arith_rrr(bits_31_21: u32, bits_15_10: u32, rd: Writable<Reg>, rn: Reg, rm: Reg) -> u32 {
    (bits_31_21 << 21)
        | (bits_15_10 << 10)
        | machreg_to_gpr(rd.to_reg())
        | (machreg_to_gpr(rn) << 5)
        | (machreg_to_gpr(rm) << 16)
}

Encoding the instruction parts (operands, register mentions) is a lot of bit twiddling and fun. We do this for each VCode instruction, until we've generated the whole function's body. As you may remember, at this point register allocation may have added some spill/reload/move instructions. From the codegen's point of view, these are just regular instructions with precomputed operands (either real registers, or memory operands involving the stack pointer), so they're not treated specially; they're generated the same way other VCode instructions are.
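
As a quick sanity check, we can reproduce the bit layout computed by enc_arith_rrr for `add x0, x1, x2` in a few lines of standalone Rust (this is not Cranelift code; raw register numbers stand in for Cranelift's Reg and Writable<Reg> types):

fn enc_add64(rd: u32, rn: u32, rm: u32) -> u32 {
    let top11 = 0b10001011_000; // ALUOp::Add64
    let bit15_10 = 0b000000;
    (top11 << 21) | (bit15_10 << 10) | rd | (rn << 5) | (rm << 16)
}

fn main() {
    // `add x0, x1, x2` encodes to 0x8B020020 on aarch64.
    assert_eq!(enc_add64(0, 1, 2), 0x8B02_0020);
}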

More work is then done by the codegen backend to optimize block placement, compute final branch offsets, etc. If you're interested in this, I strongly encourage you to go read this blog post by Chris Fallin. After this, we're finally done: we've produced a code buffer, as well as external relocations (to other functions, memory addresses, etc.) for a single function. The code generator's task is complete: the final steps consist of linking and, optionally, producing an executable binary.

Mission accomplished!

So, we're done for today! Thanks for reading this far, I hope it has been a useful and pleasant read! Feel free to reach out to me on the twitterz if you have additional remarks/questions, and to go contribute to Wasmtime/Cranelift if this sort of thing is interesting to you 😇. Until next time, take care of yourselves!

Thanks to Chris Fallin for reading and suggesting improvements to this blog post.


  1. Really, Rust is the DSL. It was Python code before, which had the advantage of being faster to update, yet it did a lot of magic behind the curtain, which wasn't very friendly for new people trying to learn and use Cranelift. Even though a statically typed language helps with exploration through tooling, this meta-language is expected to partially disappear in the long run; see Chris' blog post on this topic.

  2. Aarch64 connoisseurs may notice that there are other ways to encode an addition. Say, if one of the input operands was the result of a bit shift instruction by an immediate value, then it's possible to embed the shift within the add, so we end up with fewer machine instructions (and lower the register pressure). This other possible encoding is sufficiently different in terms of register allocation and code generation that it justifies having its own VCode instruction. AluRRR is simpler in the sense that it's only concerned with register inputs and outputs, thus a perfect example for this post. 

  3. What's an integer overflow for signed integer division? Consider an integer value represented on N bits. If you try to divide the smallest integer value -2**(N-1) by -1, the result should be 2**(N-1), but this is out of range, since the biggest signed integer value we can represent on N bits is (2**(N-1)) - 1! So this will overflow and wrap around to -2**(N-1), which is the initial value, but not the correct result. Good luck debugging this without a software trap!

  4. Register moves may be introduced because a successor block (in the control flow graph) expects a given virtual register to live in a particular real register, or because a particular instruction requires a virtual register to be allocated to a fixed real register that's busy: regalloc can then temporarily divert the busy register into another unused register. 

  5. The c in rcx actually stands for count; this is a property inherited from former CPU designs. 

  6. Unless this move carries sign- or zero-extending semantics, which is the case for e.g. x86's 32-bits mov instructions on a 64-bits architecture. 

  7. Relocations are placeholders for information we don't yet have access to. For instance, when we're generating a jump instruction, the jump target offset may not be determined yet. So we record where the jump instruction is in the code stream, as well as which control flow block it should jump into, so we can patch it later when the final offsets are known: that's the content of our relocation.

This Week In RustThis Week in Rust 378

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

No newsletters this week.

Official
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

Despite having no nominations, this week's crate is firestorm, a fast intrusive flamegraph profiling library.

llogiq is pretty pleased anyway with the suggestion.

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

340 pull requests were merged in the last week

Rust Compiler Performance Triage

A mostly quiet week, though with an excellent improvement in bootstrap times, shaving a couple percent off the total and 10% off of rustc_middle due to changes in the code being compiled.

Triage done by @simulacrum. Revision range: ea09825..f1c47c7

1 Regressions, 2 Improvements, 1 Mixed

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs

New RFCs

Upcoming Events

Online

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Have you seen someone juggle several items with one hand? That's the point of async. Blocking (non-async) is like writing - it requires constant work from each hand. If you want to write twice as fast you'll need two hands and write with both at the same time. That's multithreading. If you juggle, the moment the item leaves your hand and is in the air, your hand is left with nothing to do. That's similar to network IO - you make a request and are just waiting for the server to respond. You could be doing something in the meantime, like catching another item and throwing it back up again. That's what "await" does - it says I threw an item into the air, so I want my current thread / hand to switch over to catch something else now.

/u/OS6aDohpegavod4 on /r/rust

Thanks to Jacob Pratt for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Daniel StenbergTransfers vs connections spring cleanup

Warning: this post is full of libcurl internal architectural details and not much else.

Within libcurl there are two primary objects being handled; transfers and connections. The transfers objects are struct Curl_easy and the connection counterparts are struct connectdata.

This is a separation and architecture as old as libcurl, even if the internal struct names have changed a little through the years. A transfer is associated with none or one connection object and there’s a pool with potentially several previously used, live, connections stored for possible future reuse.

A simplified schematic picture could look something like this:

One transfer, one connection and three idle connections in the pool.

Transfers to connections

These objects are protocol agnostic so they work like this no matter which scheme was used for the URL you’re transferring with curl.

Before the introduction of HTTP/2 into curl, which landed for the first time in September 2013, there was also a fixed relationship: one transfer always used (none or) one connection, and that connection was then also used by a single transfer. libcurl stored the association in the objects both ways: the transfer object got a pointer to the current connection, and the connection object got a pointer to the current transfer.

Multiplexing shook things up

Lots of code in libcurl passed around the connection pointer (conn) because well, it was convenient. We could find the transfer object from that (conn->data) just fine.

When multiplexing arrived with HTTP/2, we then could start doing multiple transfers that share a single connection. Since we passed around the conn pointer as input to so many functions internally, we had to make sure we updated the conn->data pointer in lots of places to make sure it pointed to the current driving transfer.

This was always awkward and a source of agony and bugs over the years. At least twice I started to work on cleaning this up from my end, but it quickly became a really large job that was hard to do in a single big blow, and I abandoned the work. Both times.

Third time’s the charm

This architectural “wart” kept bugging me and on January 8, 2021 I emailed the curl-library list to start a more organized effort to clean this up:

Conclusion: we should stop using ‘conn->data’ in libcurl

Status: there are 939 current uses of this pointer

Mission: reduce the use of this pointer, aiming to reach a point in the future when we can remove it from the connection struct.

Little by little

With the help of fellow curl maintainer Patrick Monnerat I started to submit pull requests that would remove the use of this pointer.

Little by little we changed functions and logic to be anchored on the transfer rather than the connection (data->conn is still fine, as that can only ever be NULL or a single connection). I made a wiki page to keep an updated count of the number of references. After the first ten pull requests we were down to just over a hundred from the initial 919 – yeah, the mail quote says 939, but it turned out the grep pattern was slightly wrong!

We decided to hold off a bit when we got closer to the 7.75.0 release so that we wouldn’t risk doing something close to the ship date that would jeopardize it. Once the release had been pushed out the door we could continue the journey.

Gone!

As of today, February 16 2021, the internal pointer formerly known as conn->data doesn’t exist anymore in libcurl. It can’t be used, and this refactor is complete. It took at least 20 separate commits to get the job done.

I hope this new order will help us make fewer mistakes, as we don’t have to update this pointer anymore.

I’m very happy we could do this revamp without it affecting the API or ABI in any way. These are all just internal artifacts that are not visible to the outside.

One of a thousand little things

This is just a tiny detail, but the internals of a project like curl consist of a thousand little details, and this is one way we make sure the code remains in good shape. We identify improvements and we perform them. One by one. We never stop and we’re never done. Together we take this project into the future and help the world do Internet transfers properly.

Mozilla Privacy BlogMozilla Mornings: Unpacking the DSA’s risk-based approach

On 25 February, Mozilla will host the next installment of Mozilla Mornings – our regular event series that brings together policy experts, policymakers and practitioners for insight and discussion on the latest EU digital policy developments.

This installment of Mozilla Mornings will focus on the DSA’s risk-based approach, specifically the draft law’s provisions on risk assessment, risk mitigation, and auditing for very large online platforms. We’ll be looking at what these provisions seek to solve for; how they’re likely to work in practice; and what we can learn from related proposals in other jurisdictions.

Speakers

Carly Kind
Director, Ada Lovelace Institute

Ben Scott
Executive Director, Reset

Owen Bennett
Senior Policy Manager, Mozilla Corporation

Moderated by Brian Maguire
EU journalist and broadcaster

 

Logistical information

25 February, 2021

11:00-12:00 CET

Zoom Webinar (conferencing details to be provided on morning of event)

Register your attendance here

The post Mozilla Mornings: Unpacking the DSA’s risk-based approach appeared first on Open Policy & Advocacy.

Karl DubostCapping macOS User Agent String on macOS 11

This is to keep track of and document the sequence of events related to macOS 11 and another cascade of breakages caused by the change of user agent strings. There is no good solution. One more time, it shows how sniffing User Agent strings is both dangerous (future fail) and a source of issues.

Brace for impact!

Lion foot statue

Capping macOS 11 version in User Agent History

  • 2020-06-25 OPENED WebKit 213622 - Safari 14 - User Agent string shows incorrect OS version

    A reporter claims it breaks many websites but without giving details about which websites. There's a mention about VP9

    browser supports vp9

    I left a comment there to get more details.

  • 2020-09-15 OPENED WebKit 216593 - [macOS] Limit reported macOS release to 10.15 series.

        if (!osVersion.startsWith("10"))
            osVersion = "10_15_6"_s;
    

    With some comments in the review:

    preserve original OS version on older macOS at Charles's request

    I suspect this is the Charles, the proxy app.

    2020-09-16 FIXED

  • 2020-10-05 OPENED WebKit 217364 - [macOS] Bump reported current shipping release UA to 10_15_7

    On macOS Catalina 10.15.7, Safari reports platform user agent with OS version 10_15_7. On macOS Big Sur 11.0, Safari reports platform user agent with OS version 10_15_6. It's a bit odd to have Big Sur report an older OS version than Catalina. Bump the reported current shipping release UA from 10_15_6 to 10_15_7.

    The issue here is that macOS 11 (Big Sur) reports an older version number than macOS 10.15 (Catalina), because the previous bug hardcoded the version string.

        if (!osVersion.startsWith("10"))
            osVersion = "10_15_7"_s;
    

    This is still hardcoded because, as explained in this comment:

    Catalina quality updates are done, so 10.15.7 is the last patch version. Security SUs from this point on won’t increment the patch version, and does not affect the user agent.

    2020-10-06 FIXED

  • 2020-10-11 Unity [WebGL][macOS] Builds do not run when using Big Sur

    UnityLoader.js is the culprit.

    They fixed it in January 2021(?). But there is a lot of legacy code running out there which could not be updated.

    Ironically, there’s no easy way to detect the Unity library in order to create a site intervention that would apply to all games with the issue. Capping the UA string will fix that.

  • 2020-11-30 OPENED Webkit 219346 - User-agent on macOS 11.0.1 reports as 10_15_6 which is older than latest Catalina release.

    It was closed as a duplicate of 217364, but there's an interesting description:

    Regression from 216593. That rev hard codes the User-Agent header to report MacOS X 10_15_6 on macOS 11.0+ which breaks Duo Security UA sniffing OS version check. Duo security check fails because latest version of macOS Catalina is 10.15.7 but 10.15.6 is being reported.

  • 2020-11-30 OPENED Gecko 1679929 - Cap the User-Agent string's reported macOS version at 10.15

    There is a patch for Gecko to cap the user agent string the same way that Apple does for Safari. This will solve the issue with Unity games, which have been unable to adjust their source code to the new version of Unity.

    // Cap the reported macOS version at 10.15 (like Safari) to avoid breaking
    // sites that assume the UA's macOS version always begins with "10.".
    int uaVersion = (majorVersion >= 11 || minorVersion > 15) ? 15 : minorVersion;
    
    // Always return an "Intel" UA string, even on ARM64 macOS like Safari does.
    mOscpu = nsPrintfCString("Intel Mac OS X 10.%d", uaVersion);
    

    It should land very soon, this week (week 8, February 2021), on Firefox Nightly 87. We can then monitor if anything is breaking with this change.

  • 2020-12-04 OPENED Gecko 1680516 - [Apple Chip - ARM64 M1] Game is not loaded on Gamearter.com

    Older versions of the Unity JS library used to run games break when the macOS version in the browser's user agent string is 11.0 (a minimal sketch of this failure mode appears after this timeline).

    The Mozilla webcompat team proposed to fix this with a Site Intervention for gamearter specifically. This doesn't solve the breakage for other games.

  • 2020-12-14 OPENED Gecko 1682238 - Override navigator.userAgent for gamearter.com on macOS 11.0

    A quick way to fix the issue in Firefox for gamearter was to release a site intervention by the Mozilla webcompat team:

    "use strict";
    
    /*
    * Bug 1682238 - Override navigator.userAgent for gamearter.com on macOS 11.0
    * Bug 1680516 - Game is not loaded on gamearter.com
    *
    * Unity < 2021.1.0a2 is unable to correctly parse User Agents with
    * "Mac OS X 11.0" in them, so let's override to "Mac OS X 10.16" instead
    * for now.
    */
    
    /* globals exportFunction */
    
    if (navigator.userAgent.includes("Mac OS X 11.")) {
      console.info(
        "The user agent has been overridden for compatibility reasons. See https://bugzilla.mozilla.org/show_bug.cgi?id=1680516 for details."
      );

      let originalUA = navigator.userAgent;
      Object.defineProperty(window.navigator.wrappedJSObject, "userAgent", {
        get: exportFunction(function() {
          return originalUA.replace(/Mac OS X 11\.(\d)+;/, "Mac OS X 10.16;");
        }, window),

        set: exportFunction(function() {}, window),
      });
    }
    
  • 2020-12-16 OPENED WebKit 219977 - WebP loading error in Safari on iOS 14.3

    In this comment, Cloudinary explains how they try to work around the underlying system bug with UA detection:

    Cloudinary is attempting to work around this issue by turning off WebP support to affected clients.

    If this is indeed about the underlying OS frameworks, rather than the browser version, as far as we can tell it appeared sometime after MacOS 11.0.1 and before or in 11.1.0. All we have been able to narrow down on the iOS side is ≥14.0.

    If you have additional guidance on which versions of the OSes are affected, so that we can prevent Safari users from receiving broken images, it would be much appreciated!

    Eric Portis (Cloudinary) created some tests:

      • WebPs that break in iOS ≥ 14.3 & MacOS ≥ 11.1
      • Tiny WebP

    The issue seems to affect Cloudflare as well.

  • 2021-01-05 OPENED WebKit WebP failures [ Big Sur ] fast/images/webp-as-image.html is failing

  • 2021-01-29 OPENED Blink 1171998 - Nearly all Unity WebGL games fail to run in Chrome on macOS 11 because of userAgent

  • 2021-02-06 OPENED Blink 1175225 - Cap the reported macOS version in the user-agent string at 10_15_7

    Colleagues at Mozilla, on the Firefox team, and Apple, on the Safari team, report that there are a long tail of websites broken from reporting the current macOS Big Sur version, e.g. 11_0_0, in the user agent string:

    Mac OS X 11_0_0

    and for this reason, as well as slightly improving user privacy, have decided to cap the reported OS version in the user agent string at 10.15.7:

    Mac OS X 10_15_7

  • 2021-02-09 Blink Intent to Ship: User Agent string: cap macOS version number to 10_15_7

    Ken Russell sends an intent to cap the macOS version in the Chrome (Blink) user agent string at 10_15_7, following in the footsteps of Apple and Mozilla. In the intent to ship, there is a discussion about solving the issue with Client Hints. Sec-CH-UA-Platform-Version would be a possibility, but Client Hints is not yet deployed across browsers and there is not yet full consensus about it. It is a specification pushed by Google and so far only partially implemented by Google in Chrome.

    Masataka Yakura shared with me (Thanks!) two threads on the webkit-dev mailing list: one from May 2020 and another from November 2020.

    In May, Maciej said:

    I think there’s a number of things in the spec that should be cleaned up before an implementation ships enabled by default, specifically around interop, privacy, and protection against UA lockouts. I know there are PRs in flight for some of these issues. I think it would be good to get more of the open issues to resolution before actually shipping this.

    And in November, Maciej did another round of spec review with one decisive issue.

    Note that Google sent an Intent to Ship: Client Hints infrastructure and UA Client Hints on January 14, 2020, and this was enabled a couple of days ago, on February 11, 2021.
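
To make the Unity failure mode concrete, here is a minimal sketch of the kind of fragile version check involved. This is a hypothetical example, not UnityLoader's actual code; it simply assumes the macOS version in the UA string always starts with "10".

// Hypothetical fragile UA sniffing, not UnityLoader's real code.
function macOSMinorVersion(ua) {
  const match = ua.match(/Mac OS X 10[._](\d+)/);
  return match ? parseInt(match[1], 10) : null;
}

const bigSur = "Mozilla/5.0 (Macintosh; Intel Mac OS X 11.0; rv:85.0) Gecko/20100101 Firefox/85.0";
const capped = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:87.0) Gecko/20100101 Firefox/87.0";

console.log(macOSMinorVersion(bigSur)); // null -- the "10" prefix is gone, so version checks misfire
console.log(macOSMinorVersion(capped)); // 15 -- capping the UA keeps old parsers working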

And I'm pretty sure the story is not over. There will probably be more breakages and more unknown bugs.

Otsukare!

Project TofinoIntroducing Project Mentat, a flexible embedded knowledge store

Edit, January 2017: to avoid confusion and to better follow Mozilla’s early-stage project naming guidelines, we’ve renamed Datomish to Project Mentat. This post has been altered to match.

For several months now, a small team at Mozilla has been exploring new ways of building a browser. We called that effort Tofino, and it’s now morphed into the Browser Futures Group.

As part of that, Nick Alexander and I have been working on a persistent embedded knowledge store called Project Mentat. Mentat is designed to ship in client applications, storing relational data on disk with a flexible schema.

It’s a little different to most of the storage systems you’re used to, so let’s start at the beginning and explain why. If you’re only interested in the what, skip down to just above the example code.

As we began building Tofino’s data layer, we observed a few things:

  • We knew we’d need to store new types of data as our product goals shifted: page metadata, saved content, browsing activity, location. The set of actions the user can take, and the data they generate, is bound to grow over time. We didn’t (don’t!) know what these were in advance.
  • We wanted to support front-end innovation without being gated on some storage developer writing a complicated migration. We’ve seen database evolution become a locus of significant complexity and risk — “here be dragons” — in several other applications. Ultimately it becomes easier to store data elsewhere (a new database, simple prefs files, a key-value table, or JSON on disk) than to properly integrate it into the existing database schema.
  • As part of that front-end innovation, sometimes we’d have two different ‘forks’ both growing the data model in two directions at once. That’s a difficult problem to address with a tool like SQLite.
  • Front-end developers were interested in looser approaches to accessing stored data than specialized query endpoints: e.g., Lin Clark suggested that GraphQL might be a better fit. Only a month or two into building Tofino we already saw the number of API endpoints, parameters, and fields growing as we added features. Specialized API endpoints turn into ad hoc query languages.
  • Syncability was a constant specter hovering at the back of our minds: getting the data model right for future syncing (or partial hosting on a service) was important.

Many of these concerns happen to be shared across other projects at Mozilla: Activity Stream, for example, also needs to store a growing set of page summary attributes for visited pages, and join those attributes against your main browsing history.

Nick and I started out supporting Tofino with a simple store in SQLite. We knew it had to adapt to an unknown set of use cases, so we decided to follow the principles of CQRS.

CQRS — Command Query Responsibility Segregation — recognizes that it’s hard to pick a single data storage model that works for all of your readers and writers… particularly the ones you don’t know about yet.

As you begin building an application, it’s easy to dive head-first into storing data to directly support your first user experience. As the experience changes, and new experiences are added, your single data model is pulled in diverging directions.

A common second system syndrome for this is to reactively aim for maximum generality. You build a single normalized super-flexible data model (or key-value store, or document store)… and soon you find that it’s expensive to query, complex to maintain, has designed-in capabilities that will never be used, and you still have tensions between different consumers.

The CQRS approach, at its root, is to separate the ‘command’ from the ‘query’: store a data model that’s very close to what the writer knows (typically a stream of events), and then materialize as many query-side data stores as you need to support your readers. When you need to support a new kind of fast read, you only need to do two things: figure out how to materialize a view from history, and figure out how to incrementally update it as new events arrive. You shouldn’t need to touch the base storage schema at all. When a consumer is ripped out of the product, you just throw away their materialized views.

Viewed through that lens, everything you do in a browser is an event with a context and a timestamp: “the user bookmarked page X at time T in session S”, “the user visited URL X at time T in session S for reason R, coming from visit V1”. Store everything you know, materialize everything you need.

We built that with SQLite.
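
As a rough illustration of the shape of such a store (a toy sketch, not Tofino's actual data layer): the command side is an append-only event log, and the query side is a materialized view that can be rebuilt from history or updated incrementally as events arrive.

// Toy CQRS sketch, not Tofino's real code.
// Command side: an append-only log of events.
const events = [];

function recordVisit(url, timestamp, session) {
  events.push({ type: "visit", url, timestamp, session });
}

// Query side: a materialized view of the most recent visit per URL.
// It can be rebuilt from the log or updated incrementally.
const lastVisit = new Map();

function applyEvent(event) {
  if (event.type === "visit") {
    const previous = lastVisit.get(event.url) || 0;
    lastVisit.set(event.url, Math.max(previous, event.timestamp));
  }
}

recordVisit("https://mozilla.org/", Date.now(), "session-1");
events.forEach(applyEvent);
console.log(lastVisit.get("https://mozilla.org/"));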

This was a clear and flexible concept, and it allowed us to adapt, but the implementation in JS involved lots of boilerplate and was somewhat cumbersome to maintain manually: the programmer does the work of defining how events are stored, how they map to more efficient views for querying, and how tables are migrated when the schema changes. You can see this starting to get painful even early in Tofino’s evolution, even without data migrations.

Quite soon it became clear that a conventional embedded SQL database wasn’t a direct fit for a problem in which the schema grows organically — particularly not one in which multiple experimental interfaces might be sharing a database. Furthermore, being elbow-deep in SQL wasn’t second-nature for Tofino’s webby team, so the work of evolving storage fell to just a few of us. (Does any project ever have enough people to work on storage?) We began to look for alternatives.

We explored a range of existing solutions: key-value stores, graph databases, and document stores, as well as the usual relational databases. Each seemed to be missing some key feature.

Most good storage systems simply aren’t suitable for embedding in a client application. There are lots of great storage systems that run on the JVM and scale across clusters, but we need to run on your Windows tablet! At the other end of the spectrum, most webby storage libraries aren’t intended to scale to the amount of data we need to store. Most graph and key-value stores are missing one or more of full-text indexing (crucial for the content we handle), expressive querying, defined schemas, or the kinds of indexing we need (e.g., fast range queries over visit timestamps). ‘Easy’ storage systems of all stripes often neglect concurrency, or transactionality, or multiple consumers. And most don’t give much thought to how materialized views and caches would be built on top to address the tension between flexibility and speed.

We found a couple of solutions that seemed to have the right shape (which I’ll discuss below), but weren’t quite something we could ship. Datomic is a production-grade JVM-based clustered relational knowledge store. It’s great, as you’d expect from Cognitect, but it’s not open-source and we couldn’t feasibly embed it in a Mozilla product. DataScript is a ClojureScript implementation of Datomic’s ideas, but it’s intended for in-memory use, and we need persistent storage for our datoms.

Nick and I try to be responsible engineers, so we explored the cheap solution first: adding persistence to DataScript. We thought we might be able to leverage all of the work that went into DataScript, and just flush data to disk. It soon became apparent that we couldn’t resolve the impedance mismatch between a synchronous in-memory store and asynchronous persistence, and we had concerns about memory usage with large datasets. Project Mentat was born.

Mentat is built on top of SQLite, so it gets all of SQLite’s reliability and features: full-text search, transactionality, durable storage, and a small memory footprint.

On top of that we’ve layered ideas from DataScript and Datomic: a transaction log with first-class transactions so we can see and annotate a history of events without boilerplate; a first-class mutable schema, so we can easily grow the knowledge store in new directions and introspect it at runtime; Datalog for storage-agnostic querying; and an expressive strongly typed schema language.

Datalog queries are translated into SQL for execution, taking full advantage of both the application’s rich schema and SQLite’s fast indices and mature SQL query planner.
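
To give a flavour of what that translation can look like, here is an illustrative sketch (not Mentat's actual generated SQL or table layout), assuming a hypothetical datoms table of (entity, attribute, value) rows and a query for the most recent visit to a given page:

-- Illustrative only: not Mentat's real schema or generated SQL.
SELECT MAX(visited.value) AS last_visit
FROM datoms AS page
JOIN datoms AS visited ON visited.entity = page.entity
WHERE page.attribute = 'page/url'
  AND page.value = 'https://mozilla.org/'
  AND visited.attribute = 'page/visitedAt';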

You can see more comparisons between Project Mentat and those storage systems in the README.

A proper tutorial will take more space than this blog post allows, but you can see a brief example in JS. It looks a little like this:

// Open a database.
let db = await datomish.open("/tmp/testing.db");
// Make sure we have our current schema.
await db.ensureSchema(schema);
// Add some data. Note that we use a temporary ID (the real ID
// will be assigned by Mentat).
let txResult = await db.transact([
  {"db/id": datomish.tempid(),
   "page/url": "https://mozilla.org/",
   "page/title": "Mozilla"}
]);
// Let's extend our schema. In the real world this would
// typically happen across releases.
schema.attributes.push({"name": "page/visitedAt",
                        "type": "instant",
                        "cardinality": "many",
                        "doc": "A visit to the page."});
await db.ensureSchema(schema);
// Now we can make assertions with the new vocabulary
// about existing entities.
// Note that we simply let Mentat find which page
// we're talking about by URL -- the URL is a unique property
// -- so we just use a tempid again.
await db.transact([
  {"db/id": datomish.tempid(),
   "page/url": "https://mozilla.org/",
   "page/visitedAt": (new Date())}
]);
// When did we most recently visit this page?
let date = (await db.q(
  `[:find (max ?date) .
    :in $ ?url
    :where
    [?page :page/url ?url]
    [?page :page/visitedAt ?date]]`,
  {"inputs": {"url": "https://mozilla.org/"}}));
console.log("Most recent visit: " + date);

Project Mentat is implemented in ClojureScript, and currently runs on three platforms: Node, Firefox (using Sqlite.jsm), and the JVM. We use DataScript’s excellent parser (thanks to Nikita Prokopov, principal author of DataScript!).

Addition, January 2017: we are in the process of rewriting Mentat in Rust. More blog posts to follow!

Nick has just finished porting Tofino’s User Agent Service to use Mentat for storage, which is an important milestone for us, and a bigger example of Mentat in use if you’re looking for one.

What’s next?

We’re hoping to learn some lessons. We think we’ve built a system that makes good tradeoffs: Mentat delivers schema flexibility with minimal boilerplate, and achieves similar query speeds to an application-specific normalized schema. Even the storage space overhead is acceptable.

I’m sure Tofino will push our performance boundaries, and we have a few ideas about how to exploit Mentat’s schema flexibility to help the rest of the Tofino team continue to move quickly. It’s exciting to have a solution that we feel strikes a good balance between storage rigor and real-world flexibility, and I can’t wait to see where else it’ll be a good fit.

If you’d like to come along on this journey with us, feel free to take a look at the GitHub repo, come find us on Slack in #mentat, or drop me an email with any questions. Mentat isn’t yet complete, but the API is quite stable. If you’re adventurous, consider using it for your next Electron app or Firefox add-on (there’s an example in the GitHub repository)… and please do send us feedback and file issues!

Acknowledgements

Many thanks to Lina Cambridge, Grisha Kruglov, Joe Walker, Erik Rose, and Nicholas Alexander for reviewing drafts of this post.


Introducing Project Mentat, a flexible embedded knowledge store was originally published in Project Tofino on Medium, where people are continuing the conversation by highlighting and responding to this story.

Mozilla Localization (L10N)L10n Report: February 2021 Edition

Welcome!

New localizers

  • Ibrahim of Hausa (ha) drove the Common Voice web part to completion shortly after he joined the community.
  • Crowdsource Kurdish, and Amed of Kurmanji Kurdish (kmr) teamed up to finish the Common Voice site localization.
  • Saltykimchi of Malay (ms) joins us from the Common Voice community.
  • Ibrahimi of Pashto (ps) completed the Common Voice site localization in a few days!
  • Reem of Swahili (sw) has been laser focused on the Terminology project.

Are you a locale leader and want us to include new members in our upcoming reports? Contact us!

New community/locales added

  • Mossi (mos)
  • Pashto (ps)

New content and projects

What’s new or coming up in Firefox desktop

First of all, let’s all congratulate the Silesian (szl) team for making their way into the official builds of Firefox. After spending several months in Nightly, they’re now ready for a general audience and will ride the trains to Beta and Release with Firefox 87.

Upcoming deadlines:

  • Firefox 86 is currently in Beta and will be released on February 23. The deadline to update localizations is on February 14.
  • Firefox 87 is in Nightly and will move to Beta on February 22.

This means that, as of February 23, we’ll be only two cycles away from the next big release of Firefox (89), which will include the UI redesign internally called Proton. Several strings have already been exposed for localization, and you can start testing them – always in a new profile! – by manually setting these preferences to true in about:config:

  • browser.proton.appmenu.enabled
  • browser.proton.enabled
  • browser.proton.tabs.enabled

It’s a constant work in progress, so expect the UI to change frequently, as new elements are added every few days.

One important thing to note: English will change several elements of the UI from Title Case to Sentence case. These changes will not require locales to retranslate all the strings, but each locale is expected to have clearly defined rules in their style guides about the correct capitalization to use for each part of the UI. If your locale follows the same capitalization rules as en-US, then you’ll need to manually change these strings to match the updated version.

We’ll have more detailed follow-ups in the coming week about Proton, highlighting the key areas to test. In the meantime, make sure that your style guides are in good shape, and get in touch if you don’t know how to work on them in GitHub.

What’s new or coming up in mobile

You may have noticed some changes to the Firefox for Android (“Fenix”) release schedule – that affects in turn our l10n schedule for the project.

In fact, Firefox for Android is now mirroring the Firefox Desktop schedule (as much as possible). While you will notice that the Pontoon l10n deadlines are not quite the same between Firefox Android and Firefox Desktop, their release cadence will be the same, and this will help streamline our main products.

Firefox for iOS remains unchanged for now – although the team is aiming to streamline the release process as well. However, this also depends on Apple, so this may take more time to implement.

Concerning the Proton redesign (see section above about Desktop), we still do not know to what extent it will affect mobile. Stay tuned!

What’s new or coming up in web projects

Firefox Accounts:

The payment settings feature is going to be updated later this month through a Beta release. It will be open for localization at a later date. Stay tuned!

mozilla.org

Migration to the Fluent format continues, and the webdev team aims to wrap up the migration by the end of February. We kindly remind all the communities to check the migrated files for warnings and to fix them right away. Otherwise, the strings will appear in English on an activated page in production, or the page may fall back to English because it can’t meet the activation threshold of 80% completion. Please follow the priority order of the pages and work through them one at a time.

Common Voice

The project will be moved to Mozilla Foundation later this year. More details will be shared as soon as they become available.

This is a fairly small release, as the transition details are being finalized.

  • Fixed bug where “Voices Online” wasn’t tracking activity anymore
  • Redirected language request modal to Github issue template
  • Updated average seconds based on corpus 6.1
  • Increased leaderboards “load more” function from 5 additional records to 20
  • Localization/sentence updates

What’s new or coming up in SuMo

Since the beginning of 2021, SUMO has been supporting Firefox 85. You can see the full list of articles that we added and updated for Firefox 85 in the SUMO Sprint wiki page here.

We also have good news from the Dutch team, who have been changing their team formation and finally managed to localize 100% of the support articles in SUMO. This is a huge milestone for the team, which has been a little bit behind in the past couple of years.

There are a lot more interesting changes coming up in our pipeline. Feel free to join SUMO Matrix room to discuss or just say hi.

Friends of the Lion

Image by Elio Qoshi

  • The Frisian (fy-NL) community hit the national news with the Voice Challenge, thanks to Wim for leading the effort. It was a competition between the Frisian and Dutch languages, a campaign to encourage more people to donate their voices through different platforms and capture the broadest demographics. The ultimate goal is to collect about 300 hours of spoken Frisian.
  • Dutch team (nl) in SUMO, especially Tim Maks, Wim Benes, Onno Ekker, and Mark Heijl for completing 100% localization of the support articles in SUMO.

Know someone in your l10n community who’s been doing a great job and should appear here? Contact one of the l10n-drivers and we’ll make sure they get a shout-out (see list at the bottom)!

Useful Links

Hacks.Mozilla.OrgMDN localization update, February 2021

In our previous post, An update on MDN Web Docs’ localization strategy, we explained our broad strategy for moving forward with allowing translation edits on MDN again. The MDN localization communities are waiting for news of our progress on unfreezing the top-tier locales, and here we are. In this post we’ll look at where we’ve got to so far in 2021, and what you can expect moving forward.

Normalizing slugs between locales

Previously on MDN, we allowed translators to localize document URL slugs as well as the document title and body content. This sounds good in principle, but has created a bunch of problems. It has resulted in situations where it is very difficult to keep document structures consistent.

If you want to change the structure or location of a set of documentation, it can be nearly impossible to verify that you’ve moved all of the localized versions along with the en-US versions — some of them will be under differently-named slugs both in the original and new locations, meaning that you’d have to spend time tracking them down, and time creating new parent pages with the correct slugs, etc.

As a knock-on effect, this has also resulted in a number of localized pages being orphaned (not being attached to any parent en-US pages), and a number of en-US pages being translated more than once (e.g. localized once under the existing en-US slug, and then again under a localized slug).

For example, the following table shows the top-level directories in the en-US locale as of Feb 1, 2021, compared to that of the fr locale.

en-US:

games
glossary
learn
mdn
mozilla
plugins
related
tools
web
webassembly

fr:

accessibilité
adaptation_des_applications_xul_pour_firefox_1.5
améliorations_dom_dans_firefox_3
améliorations_svg_dans_firefox_3
améliorations_xul_dans_firefox_3
apprendre
astuces_css
bugs_importants_corrigés_dans_firefox_3
changements_dans_gecko_1.9_affectant_les_sites_web
chrome
comment_créer_un_arbre_dom
compilation_et_installation
contrôles_dhtml_personnalisés_navigables_au_clavier
css
dhtml
dom
développement_web
explorer_un_tableau_html_avec_des_interfaces_dom_et_javascript
faq_sur_les_transformations_xsl_dans_mozilla
fuel
games
glossaire
glossary
html
inset-block-end
inset-block-start
inset-inline-end
inset-inline-start
inspecteur_dom
introduction_(alternative)
introduction_à_la_cryptographie_à_clef_publique
javascript
jeux
la_sécurité_dans_firefox_2
learn
localization
mdn
mdn_a_dix_ans
mise_à_jour_des_applications_web_pour_firefox_3
mise_à_jour_des_extensions_pour_firefox_2
mise_à_jour_des_extensions_pour_firefox_3
mozilla
navigatorusermedia.getusermedia
npapi
outils
référence_dom_gecko
sgml
svg_dans_firefox
tosource
tostring
type_mime_incorrect_pour_les_fichiers_css
un_raycaster_basique_avec_canvas
utilisation_de_xpath
utilisation_du_cache_de_firefox_1.5
web
webapi
webassembly
webrtc
xhtml
xmlserializer
xpcom
xslt_dans_gecko
xsltprocessor
zoom_pleine_page
à_propos_du_document_object_model

To make the non-en-US locales consistent and manageable, we are going to move to having en-US slugs only — all localized pages will be moved under their equivalent location in the en-US tree. In cases where that location cannot be reliably determined — e.g. where the documents are orphans or duplicates — we will put those documents into a specific storage directory, give them an appropriate prefix, and ask the maintenance communities for each unfrozen locale to sort out what to do with them.

  • Every localized document will be kept in a separate repo from the en-US content, but will have a corresponding en-US document with the same slug (folder path).
  • At first this will be enforced during deployment — we will move all the localized documents so that their locations are synchronized with their en-US equivalents. Every document that does not have a corresponding en-US document will be prefixed with orphaned during deployment. We plan to further automate this to check whenever a PR is created against the repo. We will also funnel back changes from the main en-US content repo, i.e. if an en-US page is moved, the localized equivalents will be automatically moved too.
  • All locales will be migrated. Unfortunately, some documents will be marked as orphaned and some others will be marked as conflicting (that is, a conflicting prefix is added to their slug). Conflicting documents have a corresponding en-US document that has multiple translations in the same locale.
  • We plan to delete, archive, or move out orphaned/conflicting content.
  • Nothing will be lost since everything is in a git repo (even if something is deleted, it can still be recovered from the git history).

Processes for identifying unmaintained content

The other problem we have been wrestling with is how to identify what localized content is worth keeping, and what isn’t. Since many locales have been largely unmaintained for a long time, they contain a lot of content that is very out-of-date and getting further out-of-date as time goes on. Many of these documents are either not relevant any more at all, incomplete, or simply too much work to bring up to date (it would be better to just start from nothing).

It would be better for everyone involved to just delete this unmaintained content, so we can concentrate on higher-value content.

The criteria we have identified so far to indicate unmaintained content is as follows:

  • Pages that should have compat tables, which are missing them.
  • Pages that should have interactive examples and/or embedded examples, which are missing them.
  • Pages that should have a sidebar, but don’t.
  • Pages where the KumaScript is breaking so much that it’s not really renderable in a usable way.

These criteria are largely measurable; we ran some scripts on the translated pages to calculate which ones could be marked as unmaintained (they match one or more of the above). The results are as follows:

If you look for compat, interactive examples, live samples, orphans, and all sidebars:

  • Unmaintained: 30.3%
  • Disconnected (orphaned): 3.1%

If you look for compat, interactive examples, live samples, orphans, but not sidebars:

  • Unmaintained: 27.5%
  • Disconnected (orphaned):  3.1%

This would allow us to get rid of a large number of low-quality pages, and make dealing with localizations easier.

We created a spreadsheet that lists all the pages that would be put in the unmaintained category under the above rules, in case you were interested in checking them out.

Stopping the display of non-tier 1 locales

After we have unfrozen the “tier 1” locales (fr, ja, zh-CN, zh-TW), we are planning to stop displaying other locales. If no-one has the time to maintain a locale, and it is getting more out-of-date all the time, it is better to just not show it rather than have potentially harmful unmaintained content available to mislead people.

This makes sense considering how the system currently works. If someone has their browser language set to, say, fr, we will automatically serve them the fr version of a page, if it exists, rather than the en-US version — even if the fr version is old and really out-of-date, and the en-US version is high-quality and up-to-date.

Going forward, we will show en-US and the tier 1 locales that have active maintenance communities, but we will not display the other locales. To get a locale displayed again, we require an active community to step up and agree to have responsibility for maintaining that locale (which means reviewing pull requests, fixing issues filed against that locale, and doing a reasonable job of keeping the content up to date as new content is added to the en-US docs).

If you are interested in maintaining an unmaintained locale, we are more than happy to talk to you. We just need a plan. Please get in touch!

Note: Not showing the non-tier 1 locales doesn’t mean that we will delete all the content. We are intending to keep it available in our archived-content repo in case anyone needs to access it.

Next steps

The immediate next step is to get the tier 1 locales unfrozen, so we can start to get those communities active again and make that content better. We are hoping to get this done by the start of March. The normalizing slugs work will happen as part of this.

After that we will start to look at stopping the display of non-tier 1 localized content — that will follow soon after.

Identifying and removing unmaintained content will be a longer game to play — we want to involve our active localization communities in this work for the tier 1 locales, so this will be done after the other two items.

The post MDN localization update, February 2021 appeared first on Mozilla Hacks - the Web developer blog.

Karl DubostWhiteboard Reactionaries

The eminent Mike Taylor has dubbed us with one of his knightly tweets. Something something about

new interview question: on a whiteboard, re-implement the following in React (using the marker color of your choice)

Sir Bruce Lawson OM (Oh My…), a never ending disco knight, has commented about Mike's tweet, pointing out that:

the real test is your choice of marker colour. So, how would you go about making the right choice? Obviously, that depends where you’re interviewing.

I simply and firmly disagree and throw my gauntlet at Bruce's face. Choose your weapons, time and witnesses.

The important part of this tweet is how Mike Taylor points out how the Sillycon Valley industry is just a pack of die-hard stick-in-the-mud reactionaries who have promoted the whiteboard to the pinnacle of one's dull abilities to regurgitate the most devitalizing Kardashianesque answers to stackoverflow problems. Young programmers! Rise! In front of the whiteboard, just walk out. Refuse the tyranny of the past, the chalk of ignorance.

Where are the humans, the progress? Where are the shores of the oceans, the Célestin Freinet, Maria Montessori and A. S. Neill, the lichens, the moss and the humus, the sap of imagination, the liberty of our creativity.

Otsukare!

Daniel Stenbergcurl –fail-with-body

That’s --fail-with-body, using two dashes in front of the name.

This is a brand new command line option added to curl, to appear in the 7.76.0 release. It works like --fail but with one little addition, and I’m hoping the name implies it well enough: it also provides the response body. The --fail option has turned out to be surprisingly popular, but users have repeatedly asked for a way to also get the body stored. --fail makes curl stop immediately after having received the response headers – if the response code says so.

--fail-with-body will instead first save the body per normal conventions and then return an error if the HTTP response code was 400 or larger.

To be used like this:

curl --fail-with-body -o output https://example.com/404.html

If the page is missing on that HTTPS server, curl will return exit code 22 and save the error message response in the file named ‘output’.

Not complicated at all. But has been requested many times!

This is curl’s 238th command line option.

The Rust Programming Language BlogAnnouncing Rust 1.50.0

The Rust team is happy to announce a new version of Rust, 1.50.0. Rust is a programming language that is empowering everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, getting Rust 1.50.0 is as easy as:

rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.50.0 on GitHub.

What's in 1.50.0 stable

For this release, we have improved array indexing, expanded safe access to union fields, and added to the standard library. See the detailed release notes to learn about other changes not covered by this post.

Const-generic array indexing

Continuing the march toward stable const generics, this release adds implementations of ops::Index and IndexMut for arrays [T; N] for any length of const N. The indexing operator [] already worked on arrays through built-in compiler magic, but at the type level, arrays didn't actually implement the library traits until now.

fn second<C>(container: &C) -> &C::Output
where
    C: std::ops::Index<usize> + ?Sized,
{
    &container[1]
}

fn main() {
    let array: [i32; 3] = [1, 2, 3];
    assert_eq!(second(&array[..]), &2); // slices worked before
    assert_eq!(second(&array), &2); // now it also works directly
}

const value repetition for arrays

Arrays in Rust can be written either as a list [a, b, c] or a repetition [x; N]. For lengths N greater than one, repetition has only been allowed for xs that are Copy, and RFC 2203 sought to allow any const expression there. However, while that feature was unstable for arbitrary expressions, its implementation since Rust 1.38 accidentally allowed stable use of const values in array repetition.

fn main() {
    // This is not allowed, because `Option<Vec<i32>>` does not implement `Copy`.
    let array: [Option<Vec<i32>>; 10] = [None; 10];

    const NONE: Option<Vec<i32>> = None;
    const EMPTY: Option<Vec<i32>> = Some(Vec::new());

    // However, repeating a `const` value is allowed!
    let nones = [NONE; 10];
    let empties = [EMPTY; 10];
}

In Rust 1.50, that stabilization is formally acknowledged. In the future, to avoid such "temporary" named constants, you can look forward to inline const expressions per RFC 2920.

Safe assignments to ManuallyDrop<T> union fields

Rust 1.49 made it possible to add ManuallyDrop<T> fields to a union as part of allowing Drop for unions at all. However, unions don't drop old values when a field is assigned, since they don't know which variant was formerly valid, so safe Rust previously limited this to Copy types only, which never Drop. Of course, ManuallyDrop<T> also doesn't need to Drop, so now Rust 1.50 allows safe assignments to these fields as well.
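
Here is a minimal sketch (not taken from the release notes) of what the relaxed rule permits:

use std::mem::ManuallyDrop;

union Slot {
    // `String` is not `Copy`, so this field must be wrapped in `ManuallyDrop`.
    text: ManuallyDrop<String>,
    number: u32,
}

fn main() {
    let mut slot = Slot { number: 7 };
    // New in Rust 1.50: this assignment is safe, because assigning to a
    // `ManuallyDrop` field never drops the previous contents.
    slot.text = ManuallyDrop::new(String::from("hello"));
    // Reading a union field is still unsafe, and we must drop the String
    // ourselves since `ManuallyDrop` suppresses the destructor.
    unsafe {
        println!("{}", &*slot.text);
        ManuallyDrop::drop(&mut slot.text);
    }
}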

A niche for File on Unix platforms

Some types in Rust have specific limitations on what is considered a valid value, which may not cover the entire range of possible memory values. We call any remaining invalid value a niche, and this space may be used for type layout optimizations. For example, in Rust 1.28 we introduced NonZero integer types (like NonZeroU8) where 0 is a niche, and this allowed Option<NonZero> to use 0 to represent None with no extra memory.

On Unix platforms, Rust's File is simply made of the system's integer file descriptor, and this happens to have a possible niche as well because it can never be -1! System calls which return a file descriptor use -1 to indicate that an error occurred (check errno) so it's never possible for -1 to be a real file descriptor. Starting in Rust 1.50 this niche is added to the type's definition so it can be used in layout optimizations too. It follows that Option<File> will now have the same size as File itself!
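
A quick way to observe the effect (a small sketch, not from the release notes; the assertion holds on Unix with Rust 1.50 or later):

use std::fs::File;
use std::mem::size_of;

fn main() {
    // The -1 niche in the underlying file descriptor lets Option<File>
    // reuse that value for None instead of adding a separate discriminant.
    assert_eq!(size_of::<Option<File>>(), size_of::<File>());
    println!("File: {} bytes, Option<File>: {} bytes",
             size_of::<File>(), size_of::<Option<File>>());
}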

Library changes

In Rust 1.50.0, there are nine new stable functions:

And quite a few existing functions were made const:

See the detailed release notes to learn about other changes.

Other changes

There are other changes in the Rust 1.50.0 release: check out what changed in Rust, Cargo, and Clippy.

Contributors to 1.50.0

Many people came together to create Rust 1.50.0. We couldn't have done it without all of you. Thanks!

Mike HoyeText And Context

Memetic

This image is a reference to the four-square Drake template – originally Drake holding up a hand and turning away from something disapprovingly in the top half, while pointing favorably to something else in the lower half – featuring Xzibit rather than Drake, himself meme-famous for “yo dawg we heard you like cars, so we put a car in your car so you can drive while you drive”, to whose recursive nature this image is of course an homage. In the upper left panel, Xzibit is looking away disappointedly from the upper right, which contains a painting by Pieter Bruegel the Elder of the biblical Tower Of Babel. In the lower left, Xzibit is now looking favorably towards an image of another deeply nested meme.

This particular meme features the lead singer from Nickelback holding up a picture frame, a still from the video of their song “Photograph”. The “you know I had to do it to ’em” guy is in the distant background. Inside, the frame is cut in four by a two-axis graph, with “authoritarian/libertarian” on the Y axis and “economic-left/economic-right” on the X axis, overlaid with the words “young man, take the breadsticks and run, I said young man, man door hand hook car gun“, a play on both an old bit about bailing out of a bad conversation while stealing breadsticks, the lyrics to The Village People’s “YMCA”, and adding “gun” to the end of some sentence to shock its audience. These lyrics are arranged within those four quadrants in a visual reference to “loss.jpg”, a widely derided four-panel webcomic from 2008.

Taken as a whole the image is an oblique comment on the Biblical “Tower Of Babel” reference, specifically Genesis 11, in which “… the Lord said, Behold, the people is one, and they have all one language; and this they begin to do: and now nothing will be restrained from them, which they have imagined to do. Go to, let us go down, and there confound their language, that they may not understand one another’s speech” and the proliferation of deeply nested and frequently incomprehensible memes as a form of explicitly intra-generational communication.

So, yeah, there’s a lot going on in there.

I asked about using alt-text for captioning images like that in a few different forums the other day, to learn what the right thing is with respect to memes or jokes. If the image is the joke, is it useful (or expected) that the caption is written to try to deliver the joke, rather than be purely descriptive?

On the one hand, I’d expect you want the punchline to land, but I also want the caption to be usable and useful, and I assume that there are cultural assumptions and expectations in this space that I’m unaware of.

As intended, the question I asked wasn’t so much about “giving away” the punchline as it is about ensuring its delivery; either way you have to give away the joke, but does an image description phrased as a joke help, or hinder (or accidentally insult?) its intended audience?

I’m paraphrasing, but a few of the answers all said sort of the same useful and insightful thing: “The tool is the description of the image; the goal is to include people in the conversation. Use the tool to accomplish the goal.”

Which I kind of love.

And in what should not have stopped surprising me ages ago but still consistently does, I was reminded that accessibility efforts support people far outside their intended audience. In this case, maybe that description makes the joke accessible to people who have perfectly good eyesight but haven’t been neck deep in memetics since they can-hazzed their first cheezeburgers and don’t quite know why this deep-fried, abstract level-nine metareference they’re seeing is hilarious.

The Mozilla BlogNext steps on trustworthy AI: transparency, bias and better data governance

Over the last few years, Mozilla has turned its attention to AI, asking: how can we make the data driven technologies we all use everyday more trustworthy? How can we make things like social networks, home assistants and search engines both more helpful and less harmful in the era ahead?

In 2021, we will take a next step with this work by digging deeper in three areas where we think we can make real progress: transparency, bias and better data governance. While these may feel like big, abstract concepts at first glance, all three are at the heart of problems we hear about everyday in the news: problems that are top of mind not just in tech circles, but also amongst policy makers, business leaders and the public at large.

Think about this: we know that social networks are driving misinformation and political divisions around the world. And there is growing consensus that we urgently need to do something to fix this. Yet we can’t easily see inside — we can’t scrutinize — the AI that drives these platforms, making genuine fixes and real accountability impossible. Researchers, policy makers and developers need to be able to see how these systems work (transparency) if we’re going to tackle this issue.

Or, this: we know that AI driven technology can discriminate, exclude or otherwise harm some people more than others. And, as automated systems become commonplace in everything from online advertising to financial services to policing, the impact of these systems becomes ever more real. We need to look at how systemic racism and the lack of diversity in the tech industry sits at the root of these problems (bias). Concretely, we also need to build tools to detect and mitigate bias — and to build for inclusivity — within the technologies that we use everyday.

And, finally, this: we know the constant collection of data about what we do online makes (most of) us deeply uncomfortable. And we know that current data collection practices are at the heart of many of the problems we face with tech today, including misinformation and discrimination. Yet there are few examples of technology that works differently. We need to develop new methods that use AI and data in a way that respects us as people, and that gives us power over the data collected about us (better data governance) — and then using these new methods to create alternatives to the online products and services we all use today.

Late last year, we zeroed in on transparency, bias and data governance for the reasons suggested above — each of these areas are central to the biggest ‘technology meets society’ issues that we face today. There is growing consensus that we need to tackle these issues. Importantly, we believe that this widespread awareness creates a unique opportunity for us to act: to build products, write laws and develop norms that result in a very different digital world. Over the next few years, we have a chance to make real progress towards more trustworthy AI — and a better internet — overall.

This opportunity for action — the chance to make the internet different and better — has shaped how we think about the next steps in our work. Practically, the teams within Mozilla Foundation are organizing our 2021 work around objectives tied to these themes:

  1. Test AI transparency best practices to increase adoption by builders and policymakers.
  2. Accelerate the impact of people working to mitigate bias in AI.
  3. Accelerate equitable data governance alternatives as a way to advance trustworthy AI.

These teams are also focusing on collaborating with others across the internet health movement — and with people in other social movements — to make progress on these issues. We’ve set a specific 2021 objective to ‘partner with diverse movements at the intersection of their primary issues and trustworthy AI’.

We already have momentum — and work underway — on all of these topics, although more with some than others. We spent much of last year developing initiatives related to better data governance, including the Data Futures Lab, which announced its first round of grantee partners in December. And, also in 2020, we worked with citizens on projects like YouTube Regrets Reporter to show what social media transparency could look like in action. While our work is more nascent on the issue of bias, we are supporting the work of people like Mozilla Fellows Deborah Raji and Camille François who are exploring concrete ways to tackle this challenge. We hope to learn from them as we shape our own thinking here.

Our high level plans for this work are outlined in our 2021 Objectives and Key Results, which you can find on the Mozilla wiki. We’ll post more detail on our plans — and calls for partnership — in the coming weeks, including overviews of our work and thinking on transparency, bias and better data governance. We’ll also post about efforts to expand partnerships we have with organizations in other movements.

As we look ahead, it’s important to remember: AI and data are defining computing technologies of today, just like the web was the defining technology 20 years ago when Mozilla was founded. As with the web, the norms we set around both AI and data have the potential to delight us and unlock good, or to discriminate and divide. It’s still early days. We still have the chance to define where AI will take us, and to bend it towards helping rather than harming humanity. That’s an important place for all of us to be focusing our attention right now.

P.S. for more background on Mozilla’s thinking about trustworthy AI, take a look at this blog post and associated discussion paper.

The post Next steps on trustworthy AI: transparency, bias and better data governance appeared first on The Mozilla Blog.

The Firefox FrontierLove lockdown: Four people reveal how they stay privacy-aware while using dating apps

Dating during a global pandemic is the definition of “it’s complicated”. Between the screen fatigue and social distancing, meeting someone in today’s world feels impossible. Yet, people are still finding … Read more

The post Love lockdown: Four people reveal how they stay privacy-aware while using dating apps appeared first on The Firefox Frontier.

This Week In RustThis Week in Rust 377

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Newsletters
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is threadIO, a crate that makes disk IO in a background thread easy and elegant.

Thanks to David Andersen for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

Fuchsia has several issues available:

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

384 pull requests were merged in the last week

Rust Compiler Performance Triage

No triage report this week

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs

New RFCs

Upcoming Events

Online
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

The main theme of Rust is not systems programming, speed, or memory safety - it's moving runtime problems to compile time. Everything else is incidental. This is an invaluable quality of any language, and is something Rust greatly excels at.

/u/OS6aDohpegavod4 on /r/rust

Thanks to Chris for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Data@MozillaThis Week in Glean: Backfilling rejected GPUActive Telemetry data

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla uses to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as Glean inspires it. You can find an index of all TWiG posts online.)

Data ingestion is a process that involves decompressing, validating, and transforming millions of documents every hour. The schemas of data coming into our systems are ever-evolving, sometimes causing partial outages of data availability when the conditions are ripe. Once the outage has been resolved, we run a backfill to fill in the gaps for all the missing data. In this post, I’ll discuss the error discovery and recovery processes through a recent bug.

Catching and fixing the error

 

Every Monday, a group of data engineers pores over a set of dashboards and plots indicating data ingestion health. On 2020-08-04, we filed a bug where we observed an elevated rate of schema validation errors coming from environment/system/gfx/adapters/N/GPUActive. For mistakes like these, which make up a small fraction of our overall volume, partial outages are typically not urgent (as in not “we need to drop everything right now and resolve this stat!” critical). We called in the subject-matter experts and found out that the code responsible for reporting multiple GPUs in the environment had changed.

A few weeks after filing the GPUActive bug, an intern reached out to me about a DNS study he was running. I helped figure out that the external monitor setup with his MacBook was causing rejections like the ones we had seen weeks before. One PR and one deploy later, I watched the error rates for the GPUActive field abruptly drop to zero.

Figure: Error counts for environment/system/gfx/adapters/N/GPUActive

The schema’s misspecification resulted in 4.1 million documents between 2020-07-04 and 2020-08-20 being sent to our error stream, awaiting reprocessing.

Running a backfill

In January of 2021, we ran the backfill of the GPUActive rejects. First, we determined the backfill range by querying the relevant error table:

SELECT
  DATE(submission_timestamp) AS dt,
  COUNT(*)
FROM
  `moz-fx-data-shared-prod.payload_bytes_error.telemetry`
WHERE
  submission_timestamp < '2020-08-21'
  AND submission_timestamp > '2020-07-03'
  AND exception_class = 'org.everit.json.schema.ValidationException'
  AND error_message LIKE '%GPUActive%'
GROUP BY
  1
ORDER BY
  1

The query helped verify the date range of the errors and their counts: 2020-07-04 through 2020-08-20. The following tables were affected:

crash
dnssec-study-v1
event
first-shutdown
heartbeat
main
modules
new-profile
update
voice

We isolated the error documents into a backfill project named moz-fx-data-backfill-7 and mirrored our production BigQuery datasets and tables into it.

SELECT
  *
FROM
  `moz-fx-data-shared-prod.payload_bytes_error.telemetry`
WHERE
  DATE(submission_timestamp) BETWEEN "2020-07-04"
  AND "2020-08-20"
  AND exception_class = 'org.everit.json.schema.ValidationException'
  AND error_message LIKE '%GPUActive%'

Then we ran a suitable Dataflow job to populate our tables using the same ingestion code as the production jobs. It took about 31 minutes to run to completion. We copied and deduplicated the data into a dataset that mirrored our production environment.

gcloud config set project moz-fx-data-backfill-7
dates=$(python3 -c 'from datetime import datetime as dt, timedelta;
start=dt.fromisoformat("2020-07-04");
end=dt.fromisoformat("2020-08-21");
days=(end-start).days;
print(" ".join([(start + timedelta(i)).isoformat()[:10] for i in range(days)]))')
./script/copy_deduplicate --project-id moz-fx-data-backfill-7 --dates $(echo $dates)

This step took hours because it iterated over all tables for ~50 days, regardless of whether they contained data. Future backfills should probably remove empty tables before kicking off this script.

Now that the tables were populated, we handled the data deletion requests that had come in since the time of the initial error. A module named Shredder serves self-service deletion requests in BigQuery ETL. We ran Shredder from the bigquery-etl root:

script/shredder_delete \
  --billing-projects moz-fx-data-backfill-7 \
  --source-project moz-fx-data-shared-prod \
  --target-project moz-fx-data-backfill-7 \
  --start_date 2020-06-01 \
  --only 'telemetry_stable.*' \
  --dry_run

This removed relevant rows from our final tables.

INFO:root:Scanned 515495784 bytes and deleted 1280 rows from moz-fx-data-backfill-7.telemetry_stable.crash_v4
INFO:root:Scanned 35301644397 bytes and deleted 45159 rows from moz-fx-data-backfill-7.telemetry_stable.event_v4
INFO:root:Scanned 1059770786 bytes and deleted 169 rows from moz-fx-data-backfill-7.telemetry_stable.first_shutdown_v4
INFO:root:Scanned 286322673 bytes and deleted 2 rows from moz-fx-data-backfill-7.telemetry_stable.heartbeat_v4
INFO:root:Scanned 134028021311 bytes and deleted 13872 rows from moz-fx-data-backfill-7.telemetry_stable.main_v4
INFO:root:Scanned 2795691020 bytes and deleted 1071 rows from moz-fx-data-backfill-7.telemetry_stable.modules_v4
INFO:root:Scanned 302643221 bytes and deleted 163 rows from moz-fx-data-backfill-7.telemetry_stable.new_profile_v4
INFO:root:Scanned 1245911143 bytes and deleted 6477 rows from moz-fx-data-backfill-7.telemetry_stable.update_v4
INFO:root:Scanned 286924248 bytes and deleted 10 rows from moz-fx-data-backfill-7.telemetry_stable.voice_v4
INFO:root:Scanned 175822424583 and deleted 68203 rows in total

Once this was all done, we appended each of these tables to the production tables. Appending requires superuser permissions, so it was handed off to another engineer to finalize the deed. Afterward, we deleted the rows in the error table corresponding to the backfilled pings from the backfill-7 project.

DELETE
FROM
  `moz-fx-data-shared-prod.payload_bytes_error.telemetry`
WHERE
  DATE(submission_timestamp) BETWEEN "2020-07-04"
  AND "2020-08-20"
  AND exception_class = 'org.everit.json.schema.ValidationException'
  AND error_message LIKE '%GPUActive%'

Finally, we updated the production errors with new errors generated from the backfill process.

bq cp --append_table \
  moz-fx-data-backfill-7:payload_bytes_error.telemetry \
  moz-fx-data-shared-prod:payload_bytes_error.telemetry

Now those rejected pings are available for analysis down the line. For the unadulterated backfill logs, see this PR to bigquery-backfill.

Conclusions

No system is perfect, but the processes we have in place allow us to systematically understand the surface area of issues and systematically address failures. Our health check meeting improves our situational awareness of changes upstream in applications like Firefox, while our backfill logs in bigquery-backfill allow us to practice dealing with the complexities of recovering from partial outages. These underlying processes and systems are the same ones that facilitate the broader Glean ecosystem at Mozilla and will continue to exist as long as the data flows.

Mozilla Addons BlogExtensions in Firefox 86

Firefox 86 will be released on February 23, 2021. We’d like to call out two highlights and several bug fixes for the WebExtensions API that will ship with this release.

Highlights

  • Extensions that have host permissions for tabs no longer need to request the broader “tabs” permission to have access to the tab URL, title, and favicon URL.
  • As part of our work on Manifest V3, we have landed an experimental base content security policy (CSP) behind a preference in Firefox 86.  The new CSP disallows remote code execution. This restriction only applies to extensions using manifest_version 3, which is not currently supported in Firefox (currently, only manifest_version 2 is supported). If you would like to test the new CSP for extension pages and content scripts, you must change your extension’s manifest_version to 3 and set extensions.manifestv3.enabled to true in about:config. Because this is a highly experimental and evolving feature, we want developers to be aware that extensions that work with the new CSP may break tomorrow as more changes are implemented.

Bug fixes

  • Redirected URIs can now be set to a loopback address in the identity.launchWebAuthFlow API. This fix makes it possible for extensions to successfully integrate with OAuth authentication for some common web services like Google and Facebook. This will also be uplifted to Firefox Extended Support Release (ESR) 78.
  • Firefox 76 introduced a regression where webRequest.StreamFilter connections did not disconnect when expected, causing the loading icon on tabs to spin persistently. This has now been fixed. We’ve also fixed a bug that caused crashes when using view-source requests.
  • The zoom levels for the extensions options pages embedded in the Firefox Add-ons Manager (about:addons) tabs should work as expected.
  • Now that the tabs hiding API is enabled by default, the extensions.webextensions.tabhide.enabled preference is no longer displayed and references to it have been removed.

As a quick note, going forward we’ll be publishing release updates in the Firefox developer release notes on MDN. We will still announce major changes to the WebExtensions API, like new APIs, significant enhancements, and deprecation notices, on this blog as they become available.

Thanks

Many thanks to community members Sonia Singla, Tilden Windsor, robbendebiene, and Brenda Natalia for their contributions to this release!

The post Extensions in Firefox 86 appeared first on Mozilla Add-ons Blog.

Hacks.Mozilla.OrgBrowser fuzzing at Mozilla

Introduction

Mozilla has been fuzzing Firefox and its underlying components for a while. It has proven to be one of the most efficient ways to identify quality and security issues. In general, we apply fuzzing on different levels: there is fuzzing the browser as a whole, but a significant amount of time is also spent on fuzzing isolated code (e.g. with libFuzzer) or whole components such as the JS engine using separate shells. In this blog post, we will talk specifically about browser fuzzing only, and go into detail on the pipeline we’ve developed. This single pipeline is the result of years of work that the fuzzing team has put into aggregating our browser fuzzing efforts to provide consistently actionable issues to developers and to ease integration of internal and external fuzzing tools as they become available.

Diagram showing interaction of systems used in Mozilla's browser fuzzing workflow

Build instrumentation

To be as effective as possible we make use of different methods of detecting errors. These include sanitizers such as AddressSanitizer (with LeakSanitizer), ThreadSanitizer, and UndefinedBehaviorSanitizer, as well as using debug builds that enable assertions and other runtime checks. We also make use of debuggers such as rr and Valgrind. Each of these tools provides a different lens to help uncover specific bug types, but many are incompatible with each other or require their own custom build to function or provide optimal results. Beyond debugging and error detection, some tools, such as code coverage and libFuzzer, cannot work at all without build instrumentation. Each operating system and architecture combination requires a unique build and may only support a subset of these tools.

Last, each variation has multiple active branches including Release, Beta, Nightly, and Extended Support Release (ESR). The Firefox CI Taskcluster instance builds each of these periodically.

Downloading builds

Taskcluster makes it easy to find and download the latest build to test. We discussed above the number of variants created by different instrumentation types, and we need to fuzz them in automation. Because of the large number of combinations of builds, artifacts, architectures, and operating systems, and the need to unpack each of them, downloading is a non-trivial task.

To help reduce the complexity of build management, we developed a tool called fuzzfetch. Fuzzfetch makes it easy to specify the required build parameters and it will download and unpack the build. It also supports downloading specified revisions to make it useful with bisection tools.

How we generate the test cases

As the goal of this blog post is to explain the whole pipeline, we won’t spend much time explaining fuzzers. If you are interested, please read “Fuzzing Firefox with WebIDL” and the in-tree documentation. We use a combination of publicly available and custom-built fuzzers to generate test cases.

How we execute, report, and scale

For fuzzers that target the browser, Grizzly manages and runs test cases and monitors for results. Creating an adapter allows us to easily run existing fuzzers in Grizzly.

Simplified Python code for a Grizzly adaptor using an external fuzzer.
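
The original post shows this as an image; since the image does not survive here, the sketch below gives the rough shape of such an adapter. The class and method names, and the external fuzzer invocation, are hypothetical stand-ins rather than Grizzly’s actual interface.

import subprocess
from pathlib import Path


class ExternalFuzzerAdapter:
    """Hypothetical glue between an external test case generator and the harness."""

    def __init__(self, fuzzer_binary: str, work_dir: str) -> None:
        self.fuzzer_binary = fuzzer_binary
        self.work_dir = Path(work_dir)
        self.work_dir.mkdir(parents=True, exist_ok=True)

    def generate(self, iteration: int) -> Path:
        # Ask the external fuzzer to write one HTML test case and return its path;
        # the harness then serves it to the browser and watches for results.
        out_file = self.work_dir / f"test_{iteration:06d}.html"
        subprocess.run([self.fuzzer_binary, "--output", str(out_file)], check=True)
        return out_file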

To make full use of available resources on any given machine, we run multiple instances of Grizzly in parallel.

For each fuzzer, we create containers to encapsulate the configuration required to run it. These exist in the Orion monorepo. Each fuzzer has a configuration with deployment specifics and resource allocation depending on the priority of the fuzzer. Taskcluster continuously deploys these configurations to distribute work and manage fuzzing nodes.

Grizzly Target handles the detection of issues such as hangs, crashes, and other defects. Target is an interface between Grizzly and the browser. Detected issues are automatically packaged and reported to a FuzzManager server. The FuzzManager server provides automation and a UI for triaging the results.

Other, more targeted fuzzers use the JS shell, and libFuzzer-based targets use the fuzzing interface. Many third-party libraries are also fuzzed in OSS-Fuzz. These deserve mention but are outside of the scope of this post.

Managing results

Running multiple fuzzers against various targets at scale generates a large amount of data. These crashes are not suitable for direct entry into a bug tracking system like Bugzilla. We have tools to manage this data and get it ready to report.

The FuzzManager client library filters out crash variations and duplicate results before they leave the fuzzing node. Unique results are reported to a FuzzManager server. The FuzzManager web interface allows for the creation of signatures that help group reports together in buckets to aid the client in detecting duplicate results.

Fuzzers commonly generate test cases that are hundreds or even thousands of lines long. FuzzManager buckets are automatically scanned to queue reduction tasks in Taskcluster. These reduction tasks use Grizzly Reduce and Lithium to apply different reduction strategies, often removing the majority of the unnecessary data. Each bucket is continually processed until a successful reduction is complete. Then an engineer can do a final inspection of the minimized test case and attach it to a bug report. The final result is often used as a crash test in the Firefox test suite.
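
The strategies themselves vary, but the core idea behind this kind of automated reduction can be sketched as a chunk-removal loop in the spirit of delta debugging. This is an illustration of the concept, not Lithium’s or Grizzly Reduce’s actual code, and is_interesting is a placeholder for “the browser still crashes with this test case”.

from typing import Callable, List


def reduce_lines(lines: List[str], is_interesting: Callable[[List[str]], bool]) -> List[str]:
    # Repeatedly try dropping chunks of lines, keeping any removal that still
    # reproduces the original result, then shrink the chunk size.
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and is_interesting(candidate):
                lines = candidate  # the crash still reproduces without this chunk
            else:
                i += chunk  # this chunk is needed, move past it
        chunk //= 2
    return lines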

Animation showing an example testcase reduction using Grizzly

Code coverage of the fuzzer is also measured periodically. FuzzManager is used again to collect code coverage data and generate coverage reports.

Creating optimal bug reports

Our goal is to create actionable bug reports to get issues fixed as soon as possible while minimizing overhead for developers.

We do this by providing:

  • crash information such as logs and a stack trace
  • build and environment information
  • reduced test case
  • Pernosco session
  • regression range (bisections via Bugmon)
  • verification via Bugmon

Grizzly Replay is a tool that forms the basic execution engine for Bugmon and Grizzly Reduce, and makes it easy to collect rr traces to submit to Pernosco. It makes re-running browser test cases easy both in automation and for manual use. It simplifies working with stubborn test cases and test cases that trigger multiple results.

As mentioned, we have also been making use of Pernosco. Pernosco is a tool that provides a web interface for rr traces and makes them available to developers without the need for direct access to the execution environment. It is an amazing tool developed by a company of the same name which significantly helps to debug massively parallel applications. It is also very helpful when test cases are too unreliable to reduce or attach to bug reports. Creating an rr trace and uploading it can make stalled bug reports actionable.

The combination of Grizzly and Pernosco have had the added benefit of making infrequent, hard to reproduce issues, actionable. A test case for a very inconsistent issue can be run hundreds or thousands of times until the desired crash occurs under rr. The trace is automatically collected and ready to be submitted to Pernosco and fixed by a developer, instead of being passed over because it was not actionable.

How we interact with developers

To request that new features get a proper assessment, the fuzzing team can be reached at fuzzing@mozilla.com or on Matrix. This is also a great way to get in touch for any reason. We are happy to help you with any fuzzing related questions or ideas. We will also reach out when we receive information about new initiatives and features that we think will require attention. Once fuzzing of a component begins, we communicate mainly via Bugzilla. As mentioned, we strive to open actionable issues or enhance existing issues logged by others.

Bugmon is used to automatically bisect regression ranges. This notifies the appropriate people as quickly as possible and verifies bugs once they are marked as FIXED. Closing a bug automatically removes it from FuzzManager, so if a similar bug finds its way into the code base, it can be identified again.

Some issues found during fuzzing will prevent us from effectively fuzzing a feature or build variant. These are known as fuzz-blockers, and they come in a few different forms. These issues may seem benign from a product perspective, but they can block fuzzers from targeting important code paths or even prevent fuzzing a target altogether. Prioritizing these issues appropriately and getting them fixed quickly is very helpful and much appreciated by the fuzzing team.

PrefPicker manages the set of Firefox preferences used for fuzzing. When adding features behind a pref, consider adding it to the PrefPicker fuzzing template to have it enabled during fuzzing. Periodic audits of the PrefPicker fuzzing template can help ensure areas are not missed and resources are used as effectively as possible.

Measuring success

As in other fields, measurement is a key part of evaluating success. We leverage the meta bug feature of Bugzilla to help us keep track of the issues identified by fuzzers. We strive to have a meta bug per fuzzer and for each new component fuzzed.

For example, the meta bug for Domino lists all the issues (over 1100!) identified by this tool. Using this Bugzilla data, we are able to show the impact over the years of our various fuzzers.

Bar graph showing number of bugs reported by Domino over time

Number of bugs reported by Domino over time

These dashboards help evaluate the return on investment of a fuzzer.

Conclusion

There are many components in the fuzzing pipeline. These components are constantly evolving to keep up with changes in debugging tools, execution environments, and browser internals. Developers are always adding, removing, and updating browser features. Bugs are being detected, triaged, and logged. Keeping everything running continuously and targeting as much code as possible requires constant and ongoing efforts.

If you work on Firefox, you can help by keeping us informed of new features and initiatives that may affect or require fuzzing, by prioritizing fuzz-blockers, and by curating fuzzing preferences in PrefPicker. If fuzzing interests you, please take part in the bug bounty program. Our tools are available publicly, and we encourage bug hunting.

The post Browser fuzzing at Mozilla appeared first on Mozilla Hacks - the Web developer blog.

Mozilla Attack & DefenseGuest Blog Post: Good First Steps to Find Security Bugs in Fenix (Part 2)

 

This blog post is one of several guest blog posts, where we invite participants of our bug bounty program to write about bugs they’ve reported to us.

Continuing with Part 1, this article introduces some practices for finding security bugs in Fenix.

Fenix’s architecture is unique. Many of the browser features are not implemented in Fenix itself – they come from independent and reusable libraries such as GeckoView and Mozilla Android Components (known as Mozac). Fenix as a browser application combines these libraries as building blocks for its internals, and the fenix project itself is primarily a user interface. Mozac is noteworthy because it connects web content rendered in GeckoView to the native Android world.

There are common pitfalls that lead to security bugs in the connection between web content and native apps. In this post, we’ll take a look at one of the pitfalls: private browsing mode bypasses. While looking for this class of bug, I discovered three separate but similar issues (Bugs 1657251, 1658231, and 1663261.)

Pitfalls in Private Browsing Mode

Take a look at the following two lines of HTML.

<img src="test1.png">
<link rel="icon" type="image/png" href="test2.png">

Although these two HTML tags look similar in that they both fetch and render PNG images from the server, their internal processing is very different. For the former <img> tag, GeckoView fetches the image from the server and renders it in the HTML document, whereas the latter <link rel="icon"> tag identifies a favicon for the page, and code in Mozac fetches the image and renders it as part of an Android view. When I discovered these vulnerabilities in the fall of 2020, an HTTP request sent from the <img> tag showed the string “Firefox” in the User-Agent header, whereas the request from <link rel="icon"> showed the string “MozacFetch”.

Like other browsers, GeckoView has separate contexts for normal mode and private browsing mode, so the cookies and local storage areas in private browsing mode are completely separated from normal mode and these values are never shared. On the other hand, the URL fetch class in Mozac – written in Kotlin – has only a single cookie store. If a favicon request responded with a Set-Cookie header, the cookie would be stored in that shared store, and a later favicon fetch in private browsing mode would send the same cookie back to the server, and vice versa (Bug 1657251).
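
One easy way to observe this class of bug from the outside is to serve a page whose favicon response sets a cookie, and then compare what comes back from normal and private tabs. A minimal local test server sketch (port, paths, and cookie name are arbitrary):

from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Log the cookies and User-Agent of every request so that favicon
        # fetches from normal and private tabs can be compared.
        print(self.path, "| Cookie:", self.headers.get("Cookie"),
              "| UA:", self.headers.get("User-Agent"))
        if self.path == "/favicon.ico":
            self.send_response(200)
            self.send_header("Content-Type", "image/png")
            self.send_header("Set-Cookie", "favicon-probe=1")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b'<link rel="icon" href="/favicon.ico">hello')


HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()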

This same type of bug appears not only in Favicon, but also in other features that have a similar mechanism. One example is the Web Notification API. Web Notifications is a feature that shows an OS-level notification through JavaScript. Similar to favicons, an icon image can appear in the notification dialog – and it had a bug that shared private browsing mode cookies with the normal mode in the exact same way (Bug 1658231).

These bugs do not only occur when loading icon images. Bug 1663261 points out that a similar bypass occurs when downloading linked files via <a download>. File downloads are also handled by Mozac’s Downloads feature, which satisfies the same conditions to cause a similar flaw.

As you can see, Mozac’s URL fetch is one of the places that creates inconsistencies with web content. Other than private browsing mode, there are various other security protection mechanisms in the web world, such as port blocks, HSTS, CSP, Mixed-Content Block, etc. These protections are sometimes overlooked when issuing HTTP requests from another component. By focusing on these common pitfalls, you’ll likely be able to find new security bugs continuously into the future.

Using the difference in User-Agent to distinguish the initiator of the request was a useful technique for finding these kinds of bugs, but it’s no longer available in today’s Fenix. If you can build Fenix yourself, you can still use this technique by setting a custom request header that marks requests coming from Mozac, as shown below.

GeckoViewFetchClient.kt

  private fun WebRequest.Builder.addHeadersFrom(request: Request): WebRequest.Builder {
      request.headers?.forEach { header ->
          addHeader(header.name, header.value)
      }
+     addHeader("X-REQUESTED-BY", "MozacFetch") // add a marker header for requests issued by Mozac
      return this
  }

For monitoring HTTP requests, Remote Debugging is useful. Requests sent from Mozac will show up in the Network tab of the Multiprocess Toolbox process in the Remote Debugging window. You can find them by filtering on the string “MozacFetch”.

Have a good bug hunt!

Mozilla Performance BlogPerformance Sheriff Newsletter (January 2021)

In January there were 106 alerts generated, resulting in 15 regression bugs being filed on average 4.3 days after the regressing change landed.

Welcome to the January 2021 edition of the performance sheriffing newsletter. Here you’ll find the usual summary of our sheriffing efficiency metrics, followed by some analysis of the bug products and components that were identified as the cause of regressions in 2020. If you’re interested (and if you have access) you can view the full dashboard.

Sheriffing Efficiency (Jan 2021)

Sheriffing efficiency

  • All alerts were triaged in an average of 1.2 days
  • 90% of alerts were triaged within 3 days
  • Valid regressions were associated with bugs in 1.7 days
  • 100% of valid regressions were associated with bugs within 5 days

Regression Bug Analysis

January was a quiet month for alerts, so I thought I’d share some analysis I performed recently on the performance regression bugs identified in 2020. Note that this analysis is biased towards areas we have test coverage, areas of active development, and areas that are sensitive to performance.

Products

To create the following chart, I collected the product/component of all bugs indicated as a regressor for regression bugs. Where no regressor bugs were given, I report the product/component of the regression bug itself.
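
For anyone who wants to reproduce this kind of tally, a rough sketch using the Bugzilla REST API is shown below; the keyword and date filter are assumptions standing in for the actual query used here.

from collections import Counter
import requests

BUGZILLA = "https://bugzilla.mozilla.org/rest/bug"
# Assumed filters: a keyword marking sheriffed performance regressions and a
# creation-time lower bound; the real analysis may have used a different query.
params = {
    "keywords": "perf-alert",
    "creation_time": "2020-01-01",
    "include_fields": "id,product,component,regressed_by",
}
bugs = requests.get(BUGZILLA, params=params).json()["bugs"]

tally = Counter()
for bug in bugs:
    if bug.get("regressed_by"):
        # A regressor was identified: count its product/component instead.
        regressor = requests.get(
            f"{BUGZILLA}/{bug['regressed_by'][0]}",
            params={"include_fields": "product,component"},
        ).json()["bugs"][0]
        tally[(regressor["product"], regressor["component"])] += 1
    else:
        tally[(bug["product"], bug["component"])] += 1

for (product, component), count in tally.most_common(10):
    print(f"{product} :: {component}: {count}")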

Regression Bugs by Product (2020)

Perhaps unsurprisingly, the majority of the regression bugs were opened against the Core product. The 16% in the Testing product is likely caused by Testing::Performance being the default product/component for regression bugs.

Components

First, let’s take a look over the components within the Core product:

Regression Bugs by Core Component (2020)

To conserve space, and ensure the above chart is readable, I’ve grouped 29 components with 2 or fewer regression bugs into an “Other” category. Now let’s look at the components that fall outside of the Core product:

Regression Bugs by Component (2020)

Similar to the previous chart, I’ve grouped products/components with just one regression bug into an “Other” category.

Priority

Finally, let’s take a look at the priority of the bugs. Although 37% have no priority set, all but one of these had an assignee, and most of them have been resolved.

Regression Bugs by Priorities (2020)

Summary of alerts

Each month I’ll highlight the regressions and improvements found.

Note that whilst I usually allow one week to pass before generating the report, there are still alerts under investigation for the period covered in this article. This means that whilst I believe these metrics to be accurate at the time of writing, some of them may change over time.

I would love to hear your feedback on this article, the queries, the dashboard, or anything else related to performance sheriffing or performance testing. You can comment here, or find the team on Matrix in #perftest or #perfsheriffs.

The dashboard for January can be found here (for those with access).

Daniel Stenbergcurl supports rustls

curl is an internet transfer engine. A rather modular one too. Parts of curl’s functionality are provided by selectable alternative implementations that we call backends. You select which backends to enable at build time, and in many cases the backends are enabled and powered by different 3rd party libraries.

Many backends

curl has a range of such alternative backends for various features:

  1. International Domain Names
  2. Name resolving
  3. TLS
  4. SSH
  5. HTTP/3
  6. HTTP content encoding
  7. HTTP

Stable API and ABI

Maintaining a stable API and ABI is key to libcurl. As long as those promises are kept, changing internals such as switching between backends is perfectly fine.

The API is the armored front door that we don’t change. The backends are the garden on the back of the house that we can dig up and replant every year if we want, without us having to change the front door.

TLS backends

Already back in 2005 we added support for using an alternative TLS library in curl when we added support for GnuTLS in addition to OpenSSL, and since then we’ve added many more. We do this by having an internal API through which we do all the TLS related things and for each third party library we support we have code that does the necessary logic to connect the internal API with the corresponding TLS library.

rustls

Today, we merged support for yet another TLS library: rustls. This is a TLS library written in Rust, and it has a C API provided by a separate project called crustls. Strictly speaking, curl is built to use crustls.

This is still early days for the rustls backend and it is not yet feature complete. There’s more work to do and polish to apply before we can think of it as a proper competitor to the already established and well-used TLS backends, but with this merge it makes it much easier for more people to help out and test it out. Feel free and encouraged to join in!

We count this addition as the 14th concurrently supported TLS library in curl. I’m not aware of any other project, anywhere, that supports more or even this many TLS libraries.

rustls again!

The TLS library named mesalink is actually already using rustls, but under an OpenSSL API disguise, and we have supported that for a few years already…

Credits

The TLS backend code for rustls was written and contributed by Jacob Hoffman-Andrews.

The Mozilla BlogMozilla Welcomes the Rust Foundation

Today Mozilla is thrilled to join the Rust community in announcing the formation of the Rust Foundation. The Rust Foundation will be the home of the popular Rust programming language that began within Mozilla. Rust has long been bigger than just a Mozilla project and today’s announcement is the culmination of many years of community building and collaboration. Mozilla is pleased to be a founding Platinum Sponsor of the Rust Foundation and looks forward to working with it to help Rust continue to grow and prosper.

Rust is an open-source programming language focused on safety, speed and concurrency. It started life as a side project in Mozilla Research. Back in 2010, Graydon Hoare presented work on something he hoped would become a “slightly less annoying” programming language that could deliver better memory safety and more concurrency. Within a few years, Rust had grown into a project with an independent governance structure and contributions from inside and outside Mozilla. In 2015, the Rust project announced the first stable release, Rust 1.0.

Success quickly followed. Rust is so popular that it has been voted the “most loved” programming language in Stack Overflow’s developer survey for five years in a row. Adoption is increasing as companies big and small, scientists, and many others discover its power and usability. Mozilla used Rust to build Stylo, the CSS engine in Firefox (replacing approximately 160,000 lines of C++ with 85,000 lines of Rust).

It takes a lot for a new programming language to be successful. Rust’s growth is thanks to literally thousands of contributors and a strong culture of inclusion. The wide range of contributors and adopters has made Rust a better language for everyone.

Mozilla is proud of its role in Rust’s creation and we are happy to see it outgrow its origins and secure a dedicated organization to support its continued evolution. Given its reach and impact, Rust will benefit from an organization that is 100% focused on the project.

The new Rust Foundation will have board representation from a wide set of stakeholders to help set a path to its own future. Other entities will be able to provide direct financial resources to Rust beyond in-kind contributions. The Rust Foundation will not replace the existing community and technical governance for Rust. Rather, it will be the organization that hosts Rust infrastructure, supports the community, and stewards the language for the benefit of all users.

Mozilla joins all Rustaceans in welcoming the new Rust Foundation.

The post Mozilla Welcomes the Rust Foundation appeared first on The Mozilla Blog.

Mike TaylorObsolete RFCs and obsolete Cookie Path checking comments

The other day I was reading Firefox’s CookieService.cpp to figure out how Firefox determines its maximum cookie size (more on that one day, maybe) when the following comment (from 2002, according to blame) caught my eye:

The following test is part of the RFC2109 spec.  Loosely speaking, it says that a site cannot set a cookie for a path that it is not on.  See bug 155083.  However this patch broke several sites -- nordea (bug 155768) and citibank (bug 156725).  So this test has been disabled, unless we can evangelize these sites.

Note 1: Anything having to do with broken websites is wont to catch my attention, especially olde bugs (let’s face it, in 2002 the internet was basically the High Middle Ages. Like yeah, we were killing it with the technological innovation on top of windmills and we’re getting pretty good at farming and what not, but it’s still the Middle Ages compared to today and kind of sucked).

Note 2: The two sites referenced in the Firefox comment are banks (see 155768 and 156725). And one of the axioms of web compatibility is that if you break a bank with some cool new API or non-security bug fix, game over, it’s getting reverted. And I’m pretty sure you can’t legally create test accounts for banks to run tests against and Silk Road got taken down by the feds.

But at the time, the now obsolete rfc2109 had this to say about cookie Path attributes:

To prevent possible security or privacy violations, a user agent rejects a cookie (shall not store its information) if any of the following is true:

  * The value for the Path attribute is not a prefix of the request-URI.

Well, as documented, that wasn’t really web-compatible, and it’s kind of a theoretical concern (so long as you enforce path-match rules before handing out cookies from the cookie jar. ¯\_(ツ)_/¯). So Firefox commented out the conditional that would reject the cookie and added the comment above in question. As a result, people’s banks started working in Firefox again (in 2002, remember, so people could check their online balance then hit up the ATM to buy Beyblades and Harry Potter merch, and whatever else was popular back then).

My colleague Lily pointed out that Chromium has a similar comment in canonical_cookie.cc:

The RFC says the path should be a prefix of the current URL path. However, Mozilla allows you to set any path for compatibility with broken websites.  We unfortunately will mimic this behavior.  We try to be generous and accept cookies with an invalid path attribute, and default the path to something reasonable.

These days, rfc6265 (and 6265bis) is much more pragmatic and states exactly what Firefox and Chromium are doing:

…the Path attribute does not provide any integrity protection because the user agent will accept an arbitrary Path attribute in a Set-Cookie header.
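
That read-side check is the path-match rule from RFC 6265, section 5.1.4; roughly, in code:

def path_matches(request_path: str, cookie_path: str) -> bool:
    # A cookie is only handed out when its Path attribute path-matches the request path.
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


assert path_matches("/accounts/balance", "/accounts")
assert not path_matches("/accountsevil", "/accounts")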

Never one to pass up on an opportunity to delete things, I wrote some patches for Firefox and Chromium so maybe someone reading Cookie code in the future doesn’t get distracted.

Aside 1: Awkwardly my moz-phab account has been disabled, so I just attached the patch file using Splinter like it’s 2002 (more Medieval code review tech references).

Aside 2: Both of these comments have two spaces after periods. Remember that?

Will Kahn-GreeneMarkus v3.0.0 released! Better metrics API for Python projects.

What is it?

Markus is a Python library for generating metrics.

Markus makes it easier to generate metrics in your program by:

  • providing multiple backends (Datadog statsd, statsd, logging, logging roll-up, and so on) for sending metrics data to different places

  • sending metrics to multiple backends at the same time

  • providing testing helpers for easy verification of metrics generation

  • providing a decoupled architecture that makes it easier to write code to generate metrics without having to worry about whether a metrics client has been created and configured--similar to the Python logging module in this way

We use it at Mozilla on many projects.
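
For a sense of what that looks like in practice, here is a small usage sketch; the backend choice and metric keys are just examples.

import time

import markus

# Configure a backend once at startup; LoggingMetrics writes metrics via the
# Python logging module, which is handy for local development.
markus.configure(backends=[{"class": "markus.backends.logging.LoggingMetrics"}])

metrics = markus.get_metrics(__name__)


def process(payload):
    return payload  # stand-in for real work


def handle_request(payload):
    metrics.incr("requests")  # count an event
    start = time.perf_counter()
    result = process(payload)
    metrics.timing("request_duration", value=(time.perf_counter() - start) * 1000.0)
    return result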

v3.0.0 released!

I released v3.0.0 just now. Changes:

Features

  • Added support for Python 3.9 (#79). Thank you, Brady!

  • Changed assert_* helper methods on markus.testing.MetricsMock to print the records to stdout if the assertion fails. This can save some time debugging failing tests. (#74)

Backwards incompatible changes

  • Dropped support for Python 3.5 (#78). Thank you, Brady!

  • markus.testing.MetricsMock.get_records and markus.testing.MetricsMock.filter_records return markus.main.MetricsRecord instances now.

    This might require you to rewrite/update tests that use the MetricsMock.

Where to go for more

Changes for this release: https://markus.readthedocs.io/en/latest/history.html#february-5th-2021

Documentation and quickstart here: https://markus.readthedocs.io/en/latest/index.html

Source code and issue tracker here: https://github.com/willkg/markus/

Let me know how this helps you!

Firefox NightlyThese Weeks in Firefox: Issue 87

Highlights

  • Starting from Firefox 86, WebExtensions will not need to request the broader “tabs” permission to have access to some of the more privileged parts of the tabs API (in particular access to the tab url, title and favicon url) on tabs they have host permissions for – Bug 1679688. Thanks to robbendebiene for contributing this enhancement!
  • Over ¼ of our Nightly population has Fission enabled, either by opting in, or via Normandy!
    • You can go to about:support to see if Fission is enabled. You can opt in to using it on Nightly by visiting about:preferences#experimental
    • Think you’ve found a Fission bug? Please file it here!
  • With the export and now import of logins landed and looking likely to ship soon, we are starting to have a much better story for migrating from other browsers, password managers, other Firefox profiles, etc. We ingest a common subset of the many fields these kinds of software export. Please try it out and file bugs!
  • Multiple Picture-in-Picture player support has been enabled to ride the trains in Firefox 86!

Friends of the Firefox team

Resolved bugs (excluding employees)

Fixed more than one bug (between Jan. 12th and Jan 26th)

  • Hunter Jones
  • Swapnik Katkoori
  • Tim Nguyen :ntim

Project Updates

Add-ons / Web Extensions

Addon Manager & about:addons
  • Starting from Firefox 86 about:addons will not (wrongly) show a pending update badge on all addons cards when a new extension is installed – Bug 1659283
    • Thanks to Tilden Windsor for contributing this fix!
  • In preparation for “addons.mozilla.org API v3 deprecation”, usage of the addons.mozilla.org (AMO) API in Firefox has been updated to point to the AMO API v4 – Bug 1686187 (riding Firefox 86 train, will be also uplifted to ESR 78.8)
  • “Line Extension” badge description in about:addons updated to make it clear that the extensions built by Mozilla are reviewed for security and performance (similarly to the description we already have on the “Recommended Extensions” badges) and to match the wording for the similar badge shown on the AMO listing pages – Bug 1687375
WebExtensions Framework
  • Manifest V3 content security policy (CSP) updated in Nightly Fx86, the new base CSP will disallow remotely hosted code in extensions with manifest_version 3 (this is ongoing work, part of the changes needed to support manifest v3 extensions in Firefox, and so this restriction does not affect manifest v2 extensions) – Bug 1594234
WebExtension APIs
  • WebRequest internals do not await on “webRequest.onSendHeaders” listeners anymore (because they are not blocking listeners). Thanks to Brenda M Lima for contributing this fix!

Developer Tools

  • Removed cd() command (was available on the Command line in the Console panel), bug 
    • The alternative will be JS Context Selector / IFrame picker
  • Fixed support for debugging mochitests (bug)
    • mach test devtools/client/netmonitor/test/browser_net_api-calls.js --jsdebugger
    • Also works for xpcshell tests
  • DevTools Fission M3 planning and analysis
    • Backlog almost ready
    • Implementation starts next week

Fission

  • Over ¼ of our Nightly population has Fission enabled, either by opting in, or via Normandy!

Lint

Password Manager

  • Some contributors to thank:

PDFs & Printing

  • Rolling out on release. Currently at 25% enabled, plan to monitor errors and increase to 100% in late February
  • Simplify page feature is a work-in-progress, but close to being finished.
  • Duplex printing orientation is the last remaining feature to add. We’re waiting on icons from UX.

Performance

Picture-in-Picture

  • New group of MSU students just started! This semester we’ll be working with:
    • Tony (frostwyrm98)
    • David (heftydav)
    • Swapnik (katkoor2)
    • Oliver (popeoliv)
    • Guanlin (chenggu3)
  • This past weekend was our intro hackathon:
    • Over the weekend, they already landed:
      • Bug 1670094 – Fixed Picture-in-Picture (PIP) explainer text never getting RTL aligned
      • Bug 1678351 – Removed some dead CSS
      • Bug 1626600 – Leave out the PIP context menu item for empty <video>’s
    • Not yet landed but made progress:
      • Bug 1654054 – Port video controls strings to Fluent
      • Bug 1674152 – Make PIP code generate Sphinx docs
      • Bug 1669205 – PIP icon will disappear when dragging the tab to create a new window
  • Here’s the metabug for all their work.

Search and Navigation

  • Added a new Nightly Experiments option for Address Bar IME (Input Method Editor) users – Bug 1685991
  • A non-working Switch to Tab result could be offered for Top Sites in Private Browsing windows – Bug 1681697
  • History results were not shown in Search Mode when the “Show history before search suggestions” option was enabled – Bug 1672507
  • Address Bar performance improvements when highlighting search strings – Bug 1687767
  • Fixed built-in Ebay search engine with multi word search strings – Bug 1683991

Screenshots

  • Screenshots has new module owners. It was recently updated to use `browser.tabs.captureTab`. We hope to clean up the module a bit and start opening up mentored bugs.

The Mozilla BlogWhat WebRTC means for you

If I told you that two weeks ago IETF and W3C finally published the standards for WebRTC, your response would probably be to ask what all those acronyms were. Read on to find out!

Widely available high quality videoconferencing is one of the real successes of the Internet. The idea of videoconferencing is of course old (go watch that scene in 2001 where Heywood Floyd makes a video call to his family on a Bell videophone), but until fairly recently it required specialized equipment or at least downloading specialized software. Simply put, WebRTC is videoconferencing (VC) in a Web browser, with no download: you just go to a Web site and make a call. Most of the major VC services have a WebRTC version: this includes Google Meet, Cisco WebEx, and Microsoft Teams, plus a whole bunch of smaller players.

A toolkit, not a phone

WebRTC isn’t a complete videoconferencing system; it’s a set of tools built in to the browser that take care of many of the hard pieces of building a VC system so that you don’t have to. This includes:

  • Capturing the audio and video from the computer’s microphone and camera. This also includes what’s called Acoustic Echo Cancellation: removing echos (hopefully) even when people don’t wear headphones.
  • Allowing the two endpoints to negotiate their capabilities (e.g., “I want to send and receive video at 1080p using the AV1 codec”) and arrive at a common set of parameters.
  • Establishing a secure connection between you and other people on the call. This includes getting your data through any NATs or firewalls that may be on your network.
  • Compressing the audio and video for transmission to the other side and then reassembling it on receipt. It’s also necessary to deal with situations where some of the data is lost, in which case you want to avoid having the picture freeze or hearing audio glitches.

This functionality is embedded in what’s called an application programming interface (API): a set of commands that the programmer can give the browser to get it to set up a video call. The upshot of this is that it’s possible to write a very basic VC system in a very small number of lines of code. Building a production system is more work, but with WebRTC, the browser does much of the work of building the client side for you.

Standardization

Importantly, this functionality is all standardized: the API itself was published by the World Wide Web Consortium (W3C), and the network protocols (encryption, compression, NAT traversal, etc.) were standardized by the Internet Engineering Task Force (IETF). The result is a giant pile of specifications, including the API specification, the protocol for negotiating what media will be sent or received, and a mechanism for sending peer-to-peer data. All in all, this represents a huge amount of work by too many people to count, spanning a decade and resulting in hundreds of pages of specifications.

The result is that it’s possible to build a VC system that will work for everyone right in their browser and without them having to install any software.

Ironically, the actual publication of the standards is kind of anticlimactic: every major browser has been shipping WebRTC for years and as I mentioned above, there are a large number of WebRTC VC systems. This is a good thing: widespread deployment is the only way to get confidence that technologies really work as expected and that the documents are clear enough to implement from. What the standards reflect is the collective judgement of the technical community that we have a system which generally works and that we’re not going to change the basic pieces. It also means that it’s time for VC providers who implemented non-standard mechanisms to update to what the standards say[1].

Why do you care about any of this?

At this point you might be thinking “OK, you all did a lot of work, but why does it matter? Can’t I just download Zoom?” There are a number of important reasons why WebRTC is a big deal, as described below.

Security

Probably the most important reason is security. Because WebRTC runs entirely in the browser, it means that you don’t need to worry about security issues in the software that the VC provider wants you to download. As an example, last year Zoom had a number of high profile security flaws that would, for instance, have allowed web sites to add you to calls without your permission, or mount what’s called a Remote Code Execution attack that would allow attackers to run their code on your computer. By contrast, because WebRTC doesn’t require a download, you’re not exposed to whatever vulnerabilities the vendor may have in their client. Of course browsers don’t have a perfect security record, but every major browser invests a huge amount in security technologies like sandboxing. Moreover, you’re already running a browser, so every additional application you run increases your security risk. For this reason, Kaspersky recommends running the Zoom Web client, even though the experience is a lot worse than the app.[2]

The second security advantage of WebRTC-based conferencing is that the browser controls access to the camera and microphone. This means that you can easily prevent sites from using them, as well as be sure when they are in use. For instance, Firefox prompts you before letting a site use the camera and microphone and then shows something in the URL bar whenever they are live.

WebRTC is always encrypted in transit without the VC system having to do anything else, so you mostly don’t have to ask whether the vendor has done a good job with their encryption. This is one of the pieces of WebRTC that Mozilla was most involved in putting into place, in line with Mozilla Manifesto principle number 4 (Individuals’ security and privacy on the internet are fundamental and must not be treated as optional.). Even more exciting, we’re starting to see work on built-in end-to-end encrypted conferencing for WebRTC built on MLS and SFrame. This will help address the one major security feature that some native clients have that WebRTC does not provide: preventing the service from listening in on your calls. It’s good to see progress on that front.

Low Friction

Because WebRTC-based video calling apps work out of the box with a standard Web browser, they dramatically reduce friction. For users, this means they can just join a call without having to install anything, which makes life a lot easier. I’ve been on plenty of calls where someone couldn’t join — often because their company used a different VC system — because they hadn’t downloaded the right software, and this happens a lot less now that it just works with your browser. This can be an even bigger issue in enterprises that have restrictions on what software can be installed.

For people who want to stand up a new VC service, WebRTC means that they don’t need to write a new piece of client software and get people to download it. This makes it much easier to enter the market without having to worry about users being locked into one VC system and unable to use yours.

None of this means that you can’t build your own client and a number of popular systems such as WebEx and Meet have downloadable endpoints (or, in the case of WebEx, hardware devices you can buy). But it means you don’t have to, and if you do things right, browser users will be able to talk to your custom endpoints, thus giving casual users an easy way to try out your service without being too committed.[3]

Enhancing The Web

Because WebRTC is part of the Web, not isolated into a separate app, that means that it can be used not just for conferencing applications but to enhance the Web itself. You want to add an audio stream to your game? Share your screen in a webinar? Upload video from your camera? No problem, just use WebRTC.

One exciting thing about WebRTC is that there turn out to be a lot of Web applications that can use WebRTC besides just video calling. Probably the most interesting is the use of WebRTC “Data Channels”, which allow a pair of clients to set up a connection between them which they can use to directly exchange data. This has a number of interesting applications, including gaming, file transfer, and even BitTorrent in the browser. It’s still early days, but I think we’re going to be seeing a lot of DataChannels in the future.

The bigger picture

By itself, WebRTC is a big step forward for the Web. If you’d told people 20 years ago that they would be doing video calling from their browser, they would have laughed at you — and I have to admit, I was initially skeptical — and yet I do that almost every day at work. But more importantly, it’s a great example of the power the Web has to make people’s lives better and of what we can do when we work together to do that.


  1. Technical note: probably the biggest source of problems for Firefox users is people who implemented a Chrome-specific mechanism for handling multiple media streams called “Plan B”. The IETF eventually went with something called “Unified Plan” and Chrome supports it (as does Google Meet) but there are still a number of services, such as Slack and Facebook Video Calling, which do Plan B only which means they don’t work properly with Firefox, which implemented Unified Plan. ↩︎
  2. The Zoom Web client is an interesting case in that it’s only partly WebRTC. Unlike (say) Google Meet, Zoom Web uses WebRTC to capture audio and video and to transmit media over the network, but does all the audio and video locally using WebAssembly. It’s a testament to the power of WebAssembly that this works at all, but a head-to-head comparison of Zoom Web to other clients such as Meet or Jitsi reveals the advantages of using the WebRTC APIs built into the browser. ↩︎
  3. Google has open sourced their WebRTC stack, which makes it easier to write your own downloadable client, including one which will interoperate with browsers. ↩︎

The post What WebRTC means for you appeared first on The Mozilla Blog.

Daniel StenbergWebinar: curl, Hyper and Rust

On February 11th, 2021 18:00 UTC (10am Pacific time, 19:00 Central Europe) we invite you to participate in a webinar we call “curl, Hyper and Rust”. To join us at the live event, please register via the link below:

https://www.wolfssl.com/isrg-partner-webinar/

What is the project about, how will this improve curl and Hyper, how was it done, what lessons can be learned, what more can we expect in the future and how can newcomers join in and help?

Participating speakers in this webinar are:

Daniel Stenberg. Founder of and lead developer of curl.

Josh Aas, Executive Director at ISRG / Let’s Encrypt.

Sean McArthur, Lead developer of Hyper.

The event went on for 60 minutes, including the Q&A session at the end.

Recording

Questions?

If you already have a question you want to ask, please let us know ahead of time. Either in a reply here on the blog, or as a reply on one of the many tweets that you will see about this event from me and my fellow “webinarees”.

Mozilla GFXImproving texture atlas allocation in WebRender

This is going to be a rather technical dive into a recent improvement that went into WebRender.

Texture atlas allocation

In order to submit work to the GPU efficiently, WebRender groups as many drawing primitives as it can into what we call batches. A batch is submitted to the GPU as a single drawing command and has a few constraints. For example, a batch can only reference a fixed set of resources (such as GPU buffers and textures). So in order to group as many drawing primitives as possible into a single batch, we need to pack as many drawing parameters as possible into as few resources as possible. When rendering text, WebRender pre-renders the glyphs before compositing them on the screen, so this means packing as many pre-rendered glyphs as possible into a single texture; the same applies to rendering images and various other things.

For a moment, let’s simplify the case of images and text and assume it is the same problem: input rectangles of various sizes that we need to pack into larger textures. This is the job of the texture atlas allocator. Another common name for this is rectangle bin packing.

Many in game and web development are used to packing many images into fewer assets. In most cases this can be achieved at build time, which means the texture atlas allocator isn’t constrained by allocation performance and only needs to find a good layout for a fixed set of rectangles, without supporting dynamic allocation/deallocation within the atlas at run time. I call this “static” atlas allocation, as opposed to “dynamic” atlas allocation.

There’s a lot more literature out there about static than dynamic atlas allocation. I recommend reading A thousand ways to pack the bin which is a very good survey of various static packing algorithms. Dynamic atlas allocation is unfortunately more difficult to implement while keeping good run-time performance. WebRender needs to maintain texture atlases into which items are added and removed over time. In other words we don’t have a way around needing dynamic atlas allocation.

A while back

A while back, WebRender used a simple implementation of the guillotine algorithm (explained in A thousand ways to pack the bin). This algorithm strikes a good compromise between packing quality and implementation complexity.
The main idea behind it can be explained simply: “Maintain a list of free rectangles, find one that can hold your allocation, split the requested allocation size out of it, creating up to two additional rectangles that are added back to the free list.” There is subtlety in which free rectangle to choose and how to split it, but overall the algorithm is built upon reassuringly understandable concepts.
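
In pseudo-Python, the allocation half of that idea fits in a few lines; this is a sketch of the concept only, not WebRender’s implementation.

def allocate(free_list, w, h):
    # free_list holds (x, y, width, height) tuples of free space.
    for i, (fx, fy, fw, fh) in enumerate(free_list):
        if w <= fw and h <= fh:
            del free_list[i]
            # Carve the request out of the corner with a "guillotine" split,
            # producing up to two leftover free rectangles.
            if fw - w > 0:
                free_list.append((fx + w, fy, fw - w, h))   # to the right of the item
            if fh - h > 0:
                free_list.append((fx, fy + h, fw, fh - h))  # below the item
            return (fx, fy, w, h)
    return None  # no free rectangle can hold the request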

Deallocation could simply consist of adding the deallocated rectangle back to the free list, but without some way to merge back neighboring free rectangles, the atlas would quickly get into a fragmented state, with a lot of small free rectangles and no ability to allocate larger ones anymore.

Lots of free space, but too fragmented to host large-ish allocations.

To address that, WebRender’s implementation would regularly do an O(n²) search to find and merge neighboring free rectangles, which was very slow when dealing with thousands of items. Eventually we stopped using the guillotine allocator in systems that needed support for deallocation, replacing it with a very simple slab allocator which I’ll get back to further down this post.

Moving to a worse allocator because of the run-time defragmentation issue was rubbing me the wrong way, so as a side project I wrote a guillotine allocator that tracks rectangle splits in a tree in order to find and merge neighboring free rectangles in constant instead of quadratic time. I published it in the guillotiere crate. I wrote about how it works in detail in the documentation so I won’t go over it here. I’m quite happy about how it turned out, although I haven’t pushed to use it in WebRender, mostly because I wanted to first see evidence that this type of change was needed and I already had evidence for many other things that needed to be worked on.

The slab allocator

What replaced WebRender’s guillotine allocator in the texture cache was a very simple one based on fixed power-of-two square slabs, with a few special-cased rectangular slab sizes for tall and narrow items to avoid wasting too much space. The texture is split into 512×512 regions, and each region is split into a grid of slabs with a fixed slab size per region.

The slab allocator in action. This is a debugging view generated from a real browsing session.

This is a very simple scheme with very fast allocation and deallocation; however, it tends to waste a lot of texture memory. For example, allocating an 8×10 pixel glyph occupies a 16×16 slot, wasting more than twice the requested space. Ouch!
In addition, since each region can only hold a single slab size, space can also be wasted when a region ends up with few allocations because its slab size happens to be uncommon.
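
To make the waste concrete, the slab selection boils down to rounding each item up to a power-of-two square (the special rectangular sizes are left out of this sketch):

def slab_size(w, h):
    # Next power of two that is >= the largest dimension.
    side = 1 << max(w - 1, h - 1).bit_length()
    return side, side


w, h = 8, 10
sw, sh = slab_size(w, h)                                 # -> (16, 16)
print(f"waste: {sw * sh - w * h} of {sw * sh} pixels")   # 176 of 256 pixels wasted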

Improvements to the slab allocator

Images and glyphs used to be cached in the same textures. However, we render images and glyphs with different shaders, so currently they can never be used in the same rendering batches. I changed images and glyphs to be cached in separate sets of textures, which provided a few opportunities.

Not mixing images and glyphs means the glyph textures get more room for glyphs, which reduces the number of textures containing glyphs overall. In other words, fewer chances to break batches. The same naturally applies to images. This is of course at the expense of allocating more textures on average, but it is a good trade-off for us, and we are about to compensate for the memory increase by using tighter packing.

In addition, glyphs and images are different types of workloads: we usually have a few hundred images of all sizes in the cache, while we have thousands of glyphs most of which have similar small sizes. Separating them allows us to introduce some simple workload-specific optimizations.

The first optimization came from noticing that glyphs are almost never larger than 128px. Having more, smaller regions reduces the amount of atlas space that is wasted by partially empty regions and allows us to hold more slab sizes at a given time, so I reduced the region size from 512×512 to 128×128 in the glyph atlases. In the unlikely event that a glyph is larger than 128×128, it will go into the image atlas.

Next, I recorded the allocations and deallocations browsing different pages, gathered some statistics about most common glyph sizes and noticed that on a low-dpi screen, a quarter of the glyphs would land in a 16×16 slab but would have fit in a 8×16 slab. In latin scripts at least, glyphs are usually taller than wide. Adding 8×16 and 16×32 slab sizes that take advantage of this helps a lot.
I could have further optimized specific slab sizes by looking at the data I had collected, but the more slab sizes I would add, the higher the risk of regressing different workloads. This problem is called over-fitting. I don’t know enough about the many non-latin scripts used around the world to trust that my testing workloads were representative enough, so I decided that I should stick to safe bets (such as “glyphs are usually small”) and avoid piling up optimizations that might penalize some languages. Adding two slab sizes was fine (and worth it!) but I wouldn’t add ten more of them.

The original slab allocator needed two textures to store a workload that the improved allocator can fit into a single one.

At this point, I had nice improvements to glyph allocation using the slab allocator, but I had a clear picture of the ceiling I would hit from the fixed slab allocation approach.

Shelf packing allocators

I already had guillotiere in my toolbelt, in addition to which I experimented with two algorithms derived from the shelf packing allocation strategy, both of them released in the Rust crate etagere. The general idea behind shelf packing is to separate the 2-dimensional allocation problem into a 1D vertical allocator for the shelves and within each shelf, 1D horizontal allocation for the items.

The atlas is initialized with no shelf. When allocating an item, we first find the shelf that is the best vertical fit for the item; if there is none, or the best fit would waste too much vertical space, we add a shelf. Once we have found or added a suitable shelf, a horizontal slice of it is used to host the allocation.
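
That allocation path can be sketched in a handful of lines (conceptual only; etagere’s real implementation also deals with alignment, deallocation, and shelf merging):

class Shelf:
    def __init__(self, y, height):
        self.y, self.height, self.cursor_x = y, height, 0


def allocate(shelves, atlas_w, atlas_h, w, h):
    # Pick the shelf that wastes the least vertical space above the item.
    best = min(
        (s for s in shelves if h <= s.height and s.cursor_x + w <= atlas_w),
        key=lambda s: s.height - h,
        default=None,
    )
    if best is None:
        # No suitable shelf: open a new one below the last shelf.
        y = shelves[-1].y + shelves[-1].height if shelves else 0
        if y + h > atlas_h:
            return None  # the atlas is full
        best = Shelf(y, h)
        shelves.append(best)
    x = best.cursor_x
    best.cursor_x += w
    return (x, best.y, w, h)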

At a glance we can see that this scheme is likely to provide much better packing than the slab allocator. For one, items are tightly packed horizontally within the shelves. That alone saves a lot of space compared to the power-of-two slab widths. A bit of waste happens vertically, between an item and the top of its shelf. How much the shelf allocator wastes vertically depends on how the shelf heights are chosen. Since we aren’t constrained to power-of-two sizes, we can also do much better than the slab allocator vertically.

The bucketed shelf allocator

The first shelf allocator I implemented was inspired by Mapbox’s shelf-pack allocator written in JavaScript. It has an interesting bucketing strategy: items are accumulated into fixed-size “buckets” that behave like small bump allocators. Shelves are divided into a certain number of buckets, and a bucket is only freed when all of its elements are freed. The trade-off here is to keep atlas space occupied for longer in order to reduce the CPU cost of allocating and deallocating. Only the top-most shelf is removed when empty, so consecutive empty shelves in the middle aren’t merged until they become the top-most shelves, which can cause a bit of vertical fragmentation in long-running sessions. When the atlas is full of (potentially empty) shelves, the chance that a new item is too tall to fit into one of the existing shelves depends on how common the item’s height is. Glyphs tend to be of similar (small) heights, so it works out well enough.

I added very limited support for merging neighboring empty shelves. When an allocation fails, the atlas iterates over the shelves and checks whether there is a sequence of empty shelves that, in total, would be able to fit the requested allocation. If so, the first shelf of the sequence grows to the size of the sum and the other shelves are squashed to zero height. It sounds like a band-aid (it is), but the code is simple and it works within the constraints that keep the rest of the allocator very simple and fast. It’s only a limited form of support for merging empty shelves, but it was an improvement for workloads that contain both small and large items.

<figcaption>Image generated from the glyph cache in a real browsing session via a debugging tool. We see fewer, wider boxes rather than many thin boxes because the allocator doesn’t internally keep track of each item rectangle individually, so we only see buckets filling up instead.</figcaption>

This allocator worked quite well for the glyph texture (unsurprisingly, since Mapbox’s implementation, which inspired it, is used for their glyph cache). The bucketing strategy was problematic with large images, however: the relative cost of keeping allocated space around for longer is higher for larger items, especially in long-running sessions. This made the allocator a good candidate for the glyph cache but not for the image cache.

The simple shelf allocator

The guillotine allocator was working rather well with images. I was close to just using it for the image cache and moving on. However, having spent a lot of time looking at various allocation patterns, my intuition was that we could do better. This is largely thanks to being able to visualize the algorithms via our integrated debugging tool, which can generate nice SVG visualizations.

It motivated experimenting with a second shelf allocator. This one is conceptually even simpler: a basic vertical 1D allocator for shelves with a basic horizontal 1D allocator per shelf. Since all items are managed individually, they are deallocated eagerly, which is the main advantage over the bucketed implementation. It is also why it is slower than the bucketed allocator, especially when the number of items is high. This allocator also has full support for merging and splitting empty shelves wherever they are in the atlas.

<figcaption>This was generated from the same glyph cache workload as the previous image.</figcaption>

Unlike the bucketed allocator, this one worked very well for the image workloads. For short sessions (visiting only a handful of web pages) it did not pack as tightly as the guillotine allocator, but after browsing for longer periods of time it tended to deal with fragmentation better.

<figcaption>The simple shelf allocator used on the image cache. Notice how different the image workloads look (using the same texture size), with far fewer items and a mix of large and small item sizes.</figcaption>

The implementation is very simple: a linear scan over the shelves, then another linear scan within the selected shelf to find a spot for the allocation. I expected performance to scale somewhat poorly with a high number of glyphs (we are dealing with thousands of glyphs, which arguably isn’t that high), but the performance hit wasn’t as bad as I had anticipated, probably helped by a mostly cache-friendly underlying data structure.

A few other experiments

For both allocators I implemented the ability to split the atlas into a fixed number of columns. Adding columns means more (smaller) shelves in the atlas, which further reduces vertical fragmentation issues, at the cost of wasting some space at the end of the shelves. Good results were obtained on 2048×2048 atlases with two columns. You can see in the previous two images that the shelf allocator was configured to use two columns.

The shelf allocators support arranging items in vertical shelves instead of horizontal ones. This can have an impact depending on the type of workload, for example if there is more variation in width than in height for the requested allocations. As far as my testing went, it did not make a significant difference with the workloads recorded in Firefox, so I kept the default horizontal shelves.

The allocators also support enforcing specific alignments in x and y (effectively, rounding up the size of allocated items to a multiple of the x and y alignment). This introduces a bit of wasted space but avoids leaving tiny holes in some cases. Some platforms also require a certain alignment for various texture transfer operations, so it is useful to have this knob at our disposal. In the Firefox integration, we use different alignments for each type of atlas, favoring small alignments for atlases that mostly contain small items to keep the relative wasted space small.

Conclusion

<figcaption>Various visualizations generated while I was working on this. It’s been really fun to be able to “look” at the algorithms at each step of the process.</figcaption>

The guillotine allocator is the best at keeping track of all available space and provides the best packing of the algorithms I tried. The shelf allocators waste a bit of space by simplifying the arrangement into shelves, and the slab allocator wastes a lot of space for the sake of simplicity. On the other hand, the guillotine allocator is the slowest when dealing with multiple thousands of items and can suffer from fragmentation in some of the workloads I recorded. Overall, the best compromise was the simple shelf allocator, which I ended up integrating into Firefox for both glyph and image textures in the cache (in both cases configured with two columns per texture). The bucketed allocator is still a very reasonable option for glyphs, and we could switch to it in the future if we decide to trade some packing efficiency for allocation/deallocation performance. In other parts of WebRender, for short-lived atlases (a single frame), the guillotine allocation algorithm is used.

These observations are mostly workload-dependent, though. Workloads are rarely completely random so results may vary.

There are other algorithms I could have explored (and maybe will someday, who knows), but I had found a satisfying compromise between simplicity, packing efficiency, and performance. I wasn’t aiming for state-of-the-art packing efficiency. Simplicity was a very important parameter, and whatever solution I came up with had to be simple enough to ship in a web browser without risk.

To recap, my goals were to:

  • allow packing more texture cache items into fewer textures,
  • reduce the amount of texture allocation/deallocation churn,
  • avoid increasing GPU memory usage, and if possible reduce it.

This was achieved by improving atlas packing to the point that we rarely have to allocate multiple textures for each item type. The results look pretty good so far. Before these changes, glyphs in Firefox would often be spread over a number of textures after visiting only a couple of websites. Currently the cache eviction is tuned so that we rarely need more than one or two textures with the new allocator, and I am planning to crank it up so we only use a single texture. For images, the shelf allocator is a pretty big win as well: what used to fit into five textures now fits into two or three. Today this translates into fewer draw calls and fewer CPU-to-GPU transfers, which has a noticeable impact on performance on low-end Intel GPUs, in addition to reducing GPU memory usage.

The slab allocator improvements landed in bug 1674443 and shipped in Firefox 85, while the shelf allocator integration work went in bug 1679751 and will hit the release channel in Firefox 86. The interesting parts of this work are packaged in a couple of Rust crates (guillotiere and etagere) under the permissive MIT OR Apache-2.0 license.

Armen ZambranoMaking Big Sur and pyenv play nicely

Soon after Big Sur came out, I received my new work laptop and decided to upgrade to it. Unfortunately, I quickly discovered that the Python setup needed for Sentry required some changes. Since it took me a bit of time to figure out, I decided to document it for anyone trying to solve the same problem.

If you are curious about everything I went through and want references to the upstream issues, you can visit this issue. It’s a bit raw; the most important notes are in the first comment.

On Big Sur, if you try to install older versions of Python you will need to tell pyenv to patch the code. For instance, you can install Python 3.8.7 the typical way (pyenv install 3.8.7); however, if you try to install 3.8.0 or earlier, you will have to patch the code before building Python.

pyenv install --patch 3.6.10 < \
<(curl -sSL https://github.com/python/cpython/commit/8ea6353.patch\?full_index\=1)

If your pyenv version is older than 1.2.22 you will also need to specify LDFLAGS. You can read more about it here.

LDFLAGS="-L$(xcrun --show-sdk-path)/usr/lib ${LDFLAGS}" \
pyenv install --patch 3.6.10 < \
<(curl -sSL https://github.com/python/cpython/commit/8ea6353.patch\?full_index\=1)

It seems very simple, however, it took me a lot of work to figure it out. I hope I saved you some time!

Martin GigerSunsetting Stream Notifier

I have decided to halt any plans to maintain the extension and focus on other spare time open source projects instead. I should have probably made this decision about seven months ago, when Twitch integration broke, however this extension means a lot to me. It was my first browser extension that still exists and went …

The post Sunsetting Stream Notifier appeared first on Humanoids beLog.

Will Kahn-GreeneSocorro: This Period in Crash Stats: Volume 2021.1

New features and changes in Crash Stats

Crash Stats crash report view pages show Breadcrumbs information

In 2020q3, Roger and I worked out a new Breadcrumbs crash annotation for crash reports generated by the android-components crash reporter. It's a JSON-encoded field with a structure just like the one that the sentry-sdk sends. Incoming crash reports that have this annotation will show the data on the crash report view page to people who have protected data access.

/images/tpics_2021_1_breadcrumbs.thumbnail.png

Figure 1: Screenshot of Breadcrumbs data in Details tab of crash report view on Crash Stats.

I implemented it based on what we get in the structure and what's in the Sentry interface.

Breadcrumbs information is not searchable with Supersearch. Currently, it's only shown in the crash report view in the Details tab.

Does this help your work? Are there ways to improve this? If so, let me know!

This work was done in [bug 1671276].

Crash Stats crash report view pages show Java exceptions

For the longest of long times, crash reports from a Java process included a JavaStackTrace annotation which was a big unstructured string of problematic parseability and I couldn't do much with it.

In 2020q4, Roger and I worked out a new JavaException crash annotation which was a JSON-encoded structured string containing the exception information. Not only does it have the final exception, but it also has cascading exceptions if there are any! With a structured form of the exception, we can suddenly do a lot of interesting things.

As a first step, I added display of the Java exception information to the crash report view page in the Details tab. It's in the same place that you would see the crashing thread stack if this were a C++/Rust crash.

Just like JavaStackTrace, the JavaException annotation has some data in it that can have PII in it. Because of that, the Socorro processor generates two versions of the data: one that's sanitized (no java exception message values) and one that's raw. If you have protected data access, you can see the raw one.

The interface is pretty wide and exceeds the screenshot. Sorry about that.

/images/tpics_2021_1_exception.thumbnail.png

Figure 2: Screenshot of Java exception data in Details tab of crash report view in Crash Stats.

My next step is to use the structured exception information to improve Java crash report signatures. I'm doing that work in [bug 1541120] and hoping to land that in 2021q1. More on that later.

Does this help your work? Are there ways to improve this? If so, let me know!

This work was done in [bug 1675560].

Changes to crash report view

One of the things I don't like about the crash report view is that it's impossible to intuit where the data you're looking at is from. Further, some of the tabs were unclear about what bits of data were protected data and what bits weren't. I've been improving that over time.

The most recent step involved the following changes:

  1. The "Metadata" tab was renamed to "Crash Annotations". This tab holds the crash annotation data from the raw crash before processing as well as a few fields that the collector adds when accepting a crash report from the client. Most of the fields are defined in the CrashAnnotations.yaml file in mozilla-central. The ones that aren't, yet, should get added. I have that on my list of things to get to.

  2. The "Crash Annotations" tab is now split into public and protected data sections. I hope this makes it a little clearer which is which.

  3. I removed some unneeded fields that the collector adds at ingestion.

Does this help your work? Are there ways to improve this? If so, let me know!

What's in the queue

In addition to the day-to-day stuff, I'm working on the following big projects in 2021q1.

Remove the Email field

Last year, I looked into who's using the Email field and for what, whether the data was any good, and in what circumstances do we even get Email data. That work was done in [bug 1626277].

The consensus is that we should remove it: not all of the crash reporter clients let a user enter an email address, we don't seem to use the data, and it's pretty toxic data to have.

The first step of that is to delete the field from the crash report at ingestion. I'll be doing that work in [bug 1688905].

The second step is to remove it from the webapp. I'll be doing that work in [bug 1688907].

Once that's done, I'll write up some bugs to remove it from the crash reporter clients and wherever else it is in products.

Does this affect you? If so, let me know!

Redo signature generation for Java crashes

Currently, signature generation for Java crashes is pretty basic and it's not flexible in the ways we need it. Now we can fix that.

I need some Java crash expertise to bounce ideas off of and to help me verify "goodness" of signatures. If you're interested in helping in any capacity or if you have opinions on how it should work or what you need out of it, please let me know.

I'm hoping to do this work in 2021q1.

The tracker bug is [bug 1541120].

Closing

Thank you to Roger Yang who implemented Breadcrumbs and JavaException reporting and Gabriele Svelto who advised on new annotations and how things should work! Thank you to everyone who submits signature generation changes--I really appreciate your efforts!

Daniel Stenbergcurl 7.75.0 is smaller

There’s been another 56 day release cycle and here’s another curl release to chew on!

Release presentation

Numbers

the 197th release
6 changes
56 days (total: 8,357)

113 bug fixes (total: 6,682)
268 commits (total: 26,752)
0 new public libcurl function (total: 85)
1 new curl_easy_setopt() option (total: 285)

2 new curl command line option (total: 237)
58 contributors, 30 new (total: 2,322)
31 authors, 17 new (total: 860)
0 security fixes (total: 98)
0 USD paid in Bug Bounties (total: 4,400 USD)

Security

No new security advisories this time!

Changes

We added --create-file-mode to the command line tool. To be used for the protocols where curl needs to tell the remote server what “mode” to use for the file when created. libcurl already supported this, but now we expose the functionality to the tool users as well.
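
A rough sketch of how the new option might be used (the file name and host are placeholders; the exact set of protocols it applies to is documented in the manual):

# Ask curl to create the remote file with mode 0644 when uploading
curl --create-file-mode 0644 -T localfile sftp://upload.example.com/incoming/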

The --write-out option got five new “variables” to use. Detailed in this separate blog post.

The CURLOPT_RESOLVE option got an extended format that now allows entries to be inserted to get timed-out after the standard DNS cache expiry time-out.
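
If I read the change correctly, the command-line counterpart is a ‘+’ prefix on --resolve entries; treat the exact syntax below as an assumption and double-check the documentation:

# A '+' prefixed entry is allowed to expire after the normal DNS cache
# timeout, instead of sticking around for the whole session
curl --resolve +example.com:443:192.0.2.1 https://example.com/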

gophers:// – the Gopher protocol done over TLS – is now supported by curl.
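
For example (the host name is a placeholder):

curl gophers://gopher.example.org/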

As a new experimentally supported HTTP backend, you can now build curl to use Hyper. It is not quite up to 100% parity in features just yet.

AWS HTTP v4 Signature support. This is an authentication method for HTTP used by AWS and some other services. See CURLOPT_AWS_SIGV4 for libcurl and --aws-sigv4 for the curl tool.
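
A hedged sketch of what a signed request could look like (the provider string, region, service, credentials and URL are all placeholders; see the --aws-sigv4 documentation for the exact format):

curl --aws-sigv4 "aws:amz:us-east-1:s3" \
  --user "ACCESS_KEY:SECRET_KEY" \
  "https://example-bucket.s3.us-east-1.amazonaws.com/object"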

Bug-fixes

Some of the notable things we’ve fixed this time around…

Reduced struct sizes

In my ongoing struggle to remove “dead weight” and ensure that curl can run on devices as small as possible, I’ve trimmed down the size of several key structs in curl. The memory footprint of libcurl is now smaller than it has been for a very long time.

Reduced conn->data references

While not exactly a bug-fix itself, this is a step in a larger refactor of libcurl where we work on removing all references back from connections to the transfer. The grand idea is that transfers can point to connections, but since a connection can be owned and used by many transfers, we should remove all code that references back to a transfer from the connection, to simplify internals. We’re not quite there yet.

Silly writeout time units bug

Many users found out that when asking the curl tool to output timing information with -w, I accidentally made it show microseconds instead of seconds in 7.74.0! This is fixed and we’re back to the way it always was now…

CURLOPT_REQUEST_TARGET works with HTTP proxy

The option that lets the user set the “request target” of an HTTP request to something custom (like for example “*” when you want to issue a request using the OPTIONS method) didn’t work over proxy!
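
For illustration, something along these lines should now also work through a proxy (the proxy address is a placeholder):

# Send "OPTIONS *" to the origin while going through an HTTP proxy
curl -X OPTIONS --request-target "*" -x http://proxy.example.com:8080 http://example.com/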

CURLOPT_FAILONERROR fails after all headers

Often used with the tool’s --fail flag, this is a feature that makes libcurl stop and return an error if the HTTP response code is 400 or larger. Starting in this version, curl will still read and parse all the response headers before it stops and exits. This allows curl to act on and store contents from the other headers, which can be used for features in combination with --fail.

Proxy-Connection duplicated headers

In some circumstances, providing a custom “Proxy-Connection:” header for an HTTP request would still get curl’s own added header in the request as well, making the request get sent with a duplicate set!

CONNECT chunked encoding race condition

There was a bug in the code that handles proxy responses, when the body of the CONNECT responses was using chunked-encoding. curl could end up thinking the response had ended before it actually had…

proper User-Agent header setup

Back in 7.71.0 we fixed a problem with the user-agent header and made it get stored correctly in the transfer struct, from previously having been stored in the connection struct.

That caused a regression that is now fixed. The previous code had a mistake that caused the user-agent header to not get used when a connection was re-used or multiplexed, which from an outside perspective made it appear to go missing in a random fashion…

add support for %if [feature] conditions in tests

Thanks to the new preprocessor we added for test cases some releases ago, we could take the next step and offer conditionals in the test cases, so that tests can run and behave differently depending on features and parameters. Previously, we resorted to making a test require a certain feature set to run at all, skipping it completely if the feature set could not be satisfied; with this new ability we can make tests more flexible and thus cover a wider variety of features.

if IDNA conversion fails, fallback to Transitional

A user reported that curl failed to get the data when given a funny URL, while it worked fine in browsers (and wget):

The host name consists of a heart and a fox emoji in the .ws top-level domain. This is yet another URLs-are-weird issue, and how to do International Domain Names with them is indeed a complicated matter, but starting now curl falls back and tries a more conservative approach if the first take fails and voilà, now curl too can get the heart-fox URL just fine… Regular readers of my blog might remember the smiley URLs of 2015, which were similar.

urldata: make magic first struct field

We provide types for the most commonly used handles in the libcurl API as typedef’ed void pointers. The handles are typically declared like this:

CURL *easy;
CURLM *multi;
CURLSH *shared;

… but since they’re typedefed void-pointers, the compiler cannot helpfully point out if a user passes in the wrong handle to the wrong libcurl function and havoc can ensue.

Starting now, all these three handles have a “magic” struct field in the same relative place within their structs so that libcurl can much better detect when the wrong kind of handle is passed to a function and instead of going bananas or even crash, libcurl can more properly and friendly return an error code saying the input was wrong.

Since you’d ask: using void-pointers like that was a mistake and we regret it. There are better ways to accomplish the same thing, but the train has left. When we’ve tried to correct this situation there have been cracks in the universe and complaints have been shouted through the ether.

SECURE_RENEGOTIATION support for wolfSSL

Turned out we didn’t support this before and it wasn’t hard to add…

openssl: lowercase the host name before using it for SNI

The SNI (Server Name Indication) field is data set in TLS that tells the remote server which host name we want to connect to, so that it can present the client with the correct certificate etc since the server might serve multiple host names.

The spec clearly says that this field should contain the DNS name and that it is case insensitive – like DNS names are. Turns out it wasn’t even hard to find multiple servers which misbehave and return the wrong cert if the given SNI name is not sent lowercase! The popular browsers typically always send the SNI names like that… In curl we keep the host name internally exactly as it was given to us in the URL.

With a silent protest that nobody cares about, we went ahead and made curl also lowercase the host name in the SNI field when using OpenSSL.

I did not research how all the other TLS libraries curl can use behave when setting SNI. This same change is probably going to have to be done in more places, or perhaps we need to surrender and do the lowercasing once and for all at a central place… Time will tell!

Future!

We have several pull requests in the queue suggesting changes, which means the next release is likely to be named 7.76.0 and the plan is to ship that on March 31st, 2021.

Send us your bug reports and pull requests!

This Week In RustThis Week in Rust 376

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

No official blog posts this week.

Newsletters
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is fancy-regex, a regex implementation using regex for speed and backtracking for fancy features.

Thanks to Benjamin Minixhofer for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

323 pull requests were merged in the last week

Rust Compiler Performance Triage

Another week dominated by rollups, most of which had relatively small changes with unclear causes embedded. Overall no major changes in performance this week.

Triage done by @simulacrum. Revision range: 1483e67addd37d9bd20ba3b4613b678ee9ad4d68..f6cb45ad01a4518f615926f39801996622f46179

Link

2 Regressions, 1 Improvement, 1 Mixed

3 of them in rollups

See the full report for more.

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in the final comment period.

Tracking Issues & PRs

New RFCs

Upcoming Events

Online
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

This time we had two very good quotes, I could not decide, so here are both:

What I have been learning ... was not Rust in particular, but how to write sound software in general, and that in my opinion is the largest asset that the rust community tough me, through the language and tools that you developed.

Under this prism, it was really easy for me to justify the step learning curve that Rust offers: I wanted to learn how to write sound software, writing sound software is really hard , and the Rust compiler is a really good teacher.

[...]

This ability to identify unsound code transcends Rust's language, and in my opinion is heavily under-represented in most cost-benefit analysis over learning Rust or not.

Jorge Leitao on rust-users

and

Having a fast language is not enough (ASM), and having a language with strong type guarantees neither (Haskell), and having a language with ease of use and portability also neither (Python/Java). Combine all of them together, and you get the best of all these worlds.

Rust is not the best option for any coding philosophy, it’s the option that is currently the best at combining all these philosophies.

/u/CalligrapherMinute77 on /r/rust

Thanks to 2e71828 and Rusty Shell for their respective suggestions.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

The Talospace ProjectFollowup on Firefox 85 for POWER: new low-level fix

Shortly after posting my usual update on Firefox on POWER, I started to notice odd occasional tab crashes in Fx85 that weren't happening in Firefox 84. Dan Horák independently E-mailed me to report the same thing. After some digging, it turned out that our fix way back when for Firefox 70 was incomplete: although it renovated the glue that allows scripts to call native functions and fixed a lot of problems, it had an undiagnosed edge case where if we had a whole lot of float arguments we would spill parameters to the wrong place in the stack frame. Guess what type of function was now getting newly called?

This fix is now in the tree as bug 1690152; read that bug for the dirty details. You will need to apply it to Firefox 85 and rebuild, though I plan to ask to land it on beta 86 once it sticks, and it will definitely be in Firefox 87. It should also be applied to ESR 78, though that older version doesn't exhibit the crashes with the frequency Fx85 does. This bug also only trips in optimized builds.

Mozilla Addons Blogaddons.mozilla.org API v3 Deprecation

The addons.mozilla.org (AMO) external API can be used by users and developers to get information about add-ons available on AMO, and to submit new add-on versions for signing. Among other things, it's also used by Firefox for recommendations, by the web-ext tool, and internally within the addons.mozilla.org website.

We plan to shut down Version 3 (v3) of the AMO API on December 31, 2021. If you have any personal scripts that rely on v3 of the API, or if you interact with the API through other means, we recommend that you switch to the stable v4. You don’t need to take any action if you don’t use the AMO API directly. The AMO API v3 is entirely unconnected to manifest v3 for the WebExtensions API, which is the umbrella project for major changes to the extensions platform itself.

Roughly five years ago, we introduced v3 of the AMO API for add-on signing. Since then, we have continued developing additional versions of the API to fulfill new requirements, but have maintained v3 to preserve backwards compatibility. However, having to maintain multiple different versions has become a burden. This is why we’re planning to update dependent projects to use v4 of the API soon and shut down v3 at the end of the year.

You can find more information about v3 and v4 on our API documentation site. When updating your scripts, we suggest just making the change from “/v3/” to “/v4/” and seeing if everything still works – in most cases it will.
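
For example, if a script queries the add-on search endpoint, the only change needed should be the version segment of the path (endpoint shown here purely as an illustration; check the API documentation for the calls you actually use):

# Old: https://addons.mozilla.org/api/v3/addons/search/?q=ublock
curl "https://addons.mozilla.org/api/v4/addons/search/?q=ublock"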

Feel free to contact us if you have any difficulties.

The post addons.mozilla.org API v3 Deprecation appeared first on Mozilla Add-ons Blog.

Tiger OakesTurning junk phones into an art display

Phones attached to wood panel with black wire wall art stemming from it

What do you do with your old phone when you get a new one? It probably goes to a pile in the back of the closet, or to the dump. I help my family with tech support and end up with any of their old devices, so my pile of junk phones got bigger and bigger.

I didn’t want to leave the phones lying around and collecting dust. One day I had the idea to stick them on the wall like digital photo frames. After some time with Velcro, paint, and programming, I had all the phones up and running.

What it does

These phones can do anything a normal phone does, but I’ve tweaked the use cases since they’re meant to be viewed and not touched.

  • Upload images to each individual cell phone or an image that spans multiple phones.
  • Show a drink list or other text across all phones.
  • Indicate at a glance if I or my partner is in a meeting.
Example of CellWall running with different screens

Parts and assembly

Each of these phones has a story of its own. Some have cracked screens, some can’t connect to the internet, and one I found in the woods. To build Cell Wall, I needed a physical board, software for the phones, and a way to mount the phones on the board. You might have some of this lying around already!

Wood plank base (the “Wall”)

First off: there needs to be a panel for the phones to sit on top of. You could choose to stick phones directly on your wall, but I live in an apartment and I wanted to make something I could remove. I previously tried using a foam board but decided to “upgrade” to a wood panel with paint.

I started off arranging the phones on the floor and figuring out how much space was needed. I took some measurements and estimated that the board needed to be 2 feet by 1 1/2 feet to comfortably fit all the phones and wires.

Once I had some rough measurements, I took a trip to Home Depot. Home Depot sells precut 2 feet by 2 feet wood panels, so I found a sturdy light piece. You can get wood cut for free inside the store by an employee or at a DIY station, so I took out a saw and cut off the extra 6-inch piece.

The edges can be a little sharp afterwards. Use a block of sandpaper to smooth them out.

I wanted the wood board to blend in with my wall and not look like…wood. At a craft store, I picked up a small bottle of white paint and a paintbrush. At home, on top of some trash bags, I started painting a few coats of white.

Mounting and connecting the phones (the “Cells”)

To keep the phones from falling off, I use Velcro. It’s perfect for securely attaching the phones to the board while allowing them to be removed if needed.

Before sticking them on, I also double-checked that the phones turn on at all. Most do, and the ones that are busted make a nice extra decoration.

If the phone does turn on, enable developer mode. Open settings, open the System section, and go to “About phone”. Developer mode is hidden here - by tapping on “Build number” many times, you eventually get a prompt indicating you are now a true Android developer.

The wires are laid out with a bunch of tiny wire clips. $7 will get you 100 of these clips in a bag, and I’ve laid them out so each clip only contains 1 or 2 wires. The wires themselves are all standard phone USB cables you probably have lying around for charging. You can also buy extra cables for less than a dollar each at Monoprice.

All the wires feed into a USB hub. This hub lets me connect all the phones to a computer just using a single wire. I had one lying around, but similar hubs are on Amazon for $20. The hub needs a second cable that plugs directly into an outlet and provides extra power, since it needs to charge so many phones.

Software

With all the phones hooked up to the USB hub, I can connect them all to a single computer server. All of these phones are running Android, and I’ll use this computer to send commands to them.

How to talk to Android phones from a computer

Usually, phones communicate to a server through the internet over WiFi. But, some of the phones don’t have working WiFi, so I need to connect over the USB cable instead. The computer communicates with the phones using a program from Google called the Android Debug Bridge. This program, called ADB for short, lets you control an Android phone by sending commands, such as installing a new app, simulating a button, or starting an app.

You can check if ADB can connect to your devices by running the command adb devices. The first time this runs, each phone gets a prompt to check if you trust this computer. Check the “remember” box and hit OK.
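
A session might look something like this (the serial numbers are made up):

$ adb devices
List of devices attached
0123456789ABCDEF    device
9A301FFAZ008XY      device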

Android uses a system called “intents” to open an app. The simplest example is tapping an icon on the home screen, which sends a “launch” intent. However, you can also send intents with additional data, such as an email subject and body when opening an email app, or the address of a website when opening a web browser. Using this system, I can send some data to a custom Android app over ADB that tells it which screen to display.

# Command to send an intent using ADB
# -a: the intent action type, such as viewing a website
# -d: the data URI to pass in the intent
adb shell am start \
  -a android.intent.action.VIEW \
  -d https://example.com
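
To target a specific app rather than whatever handles the action, an intent can also name a component and carry string extras. The package, activity, and extra names below are hypothetical, not the real CellWall app's:

# Launch a specific activity and pass extras describing what to display
adb shell am start \
  -n com.example.cellwall/.DisplayActivity \
  --es screen text \
  --es content "Lemonade, iced tea, cold brew"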

The Android client

Each phone is running a custom Android app that interprets intents then displays one of 3 screens.

  • The text screen shows large text on a coloured background.
  • The image screen shows one full-screen image loaded over the internet.
  • The website screen loads a website, which is rendered with GeckoView from Mozilla.

This doesn’t sound like a lot, but when all the devices are connected together to a single source, you can achieve complicated functionality.

The Node.js server

The core logic doesn’t run on the phones but instead runs on the computer all the phones are connected to. Any computer with a USB port can work as the server that the phones connect to, but the Raspberry Pi is nice and small and uses less power.

This computer runs server software that acts as the manager for all the connected devices, sending them different data. It can take a large photo, crop it into smaller pieces on the server, and send one piece to each phone so that the image spans the whole wall. It can also take a list of text, such as a grocery list, and send individual lines to each cell.

The server software is written in TypeScript and creates an HTTP server to expose functionality through different web addresses. This allows other programs to communicate with the server and lets me make a bridge with a Google Home or smart home software.
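
As a purely hypothetical sketch of what talking to such a server could look like (the host, port, and route are made up; the real API lives in the cell-wall repository):

# Ask the server to spread a line of text across the connected phones
curl -X POST "http://raspberrypi.local:3000/text" \
  -H "Content-Type: text/plain" \
  --data "Welcome! Drinks are in the fridge."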

The remote control

To control CellWall, I wrote a small JavaScript app served by the Node server. It includes a few buttons to turn each display on, controls for specific screens, and presets to display. These input elements all send HTTP requests to the server, which then converts them into ADB commands sent to the cells.

Remote control app with power buttons, device selection, and manual display controls
Diagram of the request flow: the remote sends an HTTP request to the server, which tells ADB to send a VIEW intent to each phone

As a nice final touch, I put some black masking tape to resemble wires coming out of the board. While this is optional, it makes a nice Zoom background for meetings. My partner’s desk is across the room, and I frequently hear her coworkers comment on the display behind her.

I hope you’re inspired to try something similar yourself. All of my project code is available on GitHub. Let me know how yours turns out! I’m happy to answer any questions on Twitter @Not_Woods.

NotWoods/cell-wall A multi-device display for showing interactive data, such as photos, weather information, calendar appointments, and more.

Cameron KaiserFloodgap.com down due to domain squatter attack on Network Solutions

Floodgap sites are down because someone did a mass attack on NetSol (this also attacked Perl.com and others). I'm working to resolve this. More shortly.

Update: Looks like it was a social engineering attack. I spoke with a very helpful person in their security department (Beth) and she walked me through it. On the 26th someone initiated a webchat with their account representatives and presented official-looking but fraudulent identity documents (a photo ID, a business license and a utility bill), then got control of the account and logged in and changed everything. NetSol is in the process of reversing the damage and restoring the DNS entries. They will be following up with me for a post-mortem. I do want to say I appreciate how quickly and seriously they are taking this whole issue.

If you are on Network Solutions, check your domains this morning, please. I'm just a "little" site, and I bet a lot of them were attacked in a similar fashion.

Update the second: Domains should be back up, but it may take a while for them to propagate. The servers themselves were unaffected, and I don't store any user data anyway.

The Firefox FrontierFour ways to protect your data privacy and still be online

Today is Data Privacy Day, which is a good reminder that data privacy is a thing, and you’re in charge of it. The simple truth: your personal data is very … Read more

The post Four ways to protect your data privacy and still be online appeared first on The Firefox Frontier.

Daniel StenbergWhat if GitHub is the devil?

Some critics think the curl project shouldn’t use GitHub. The reasons for being against GitHub hosting tend to be one or more of:

  1. it is an evil proprietary platform
  2. it is run by Microsoft and they are evil
  3. GitHub is American thus evil

Some have insisted on craziness like “we let GitHub hold our source code hostage”.

Why GitHub?

The curl project switched to GitHub (from Sourceforge) almost eleven years ago and it’s been a smooth ride ever since.

We’re on GitHub not only because it provides a myriad of practical features and is a stable and snappy service for hosting and managing source code. GitHub is also a developer hub for millions of developers who already have accounts and are familiar with the GitHub style of developing, the terms and the tools. By being on GitHub, we reduce friction from the contribution process and we maximize the ability for others to join in and help. We lower the bar. This is good for us.

I like GitHub.

Self-hosting is not better

Providing even close to the same uptime and snappy response times with a self-hosted service is a challenge, and it would take someone’s time and energy to volunteer that work – time and energy we can now spend on developing the project instead. As a small independent open source project, we don’t have any “infrastructure department” that would do it for us. And trust me: we already have enough infrastructure management to deal with without adding to that pile.

… and by running our own hosted version, we would lose the “network effect” and convenience for people that already are on and know the platform. We would also lose the easy integration with cool services like the many different CI and code analyzer jobs we run.

Proprietary is not the issue

While git is open source, GitHub is a proprietary system. But the thing is that even if we would go with a competitor and get our code hosting done elsewhere, our code would still be stored on a machine somewhere in a remote server park we cannot physically access – ever. It doesn’t matter if that hosting company uses open source or proprietary code. If they decide to switch off the servers one day, or even just selectively block our project, there’s nothing we can do to get our stuff back out from there.

We have to work so that we minimize the risk for it and the effects from it if it still happens.

A proprietary software platform holds our code just as much hostage as any free or open source software platform would, simply by the fact that we let someone else host it. They run the servers our code is stored on.

If GitHub takes the ball and goes home

No matter which service we use, there’s always a risk that they will turn off the light one day and not come back – or just change the rules or licensing terms that would prevent us from staying there. We cannot avoid that risk. But we can make sure that we’re smart about it, have a contingency plan or at least an idea of what to do when that day comes.

If GitHub shuts down immediately and we get zero warning to rescue anything at all from them, what would be the result for the curl project?

Code. We would still have the entire git repository with all code, all source history and all existing branches up until that point. We’re hundreds of developers who pull that repository frequently, and many automatically, so there’s a very distributed backup all over the world.

CI. Most of our CI setup is done with yaml config files in the source repo. If we transition to another hosting platform, we could reuse them.

Issues. Bug reports and pull requests are stored on GitHub and a sudden exit would definitely make us lose some of them. We do daily “extractions” of all issues and pull requests, so a lot of metadata could still be saved and preserved. I don’t think this would be a terribly hard blow either: we move long-standing bugs and ideas over to documents in the repository, so the currently open ones would likely be resubmitted in the near future.
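
As an illustration of one way such an extraction could be done (not necessarily how the curl project’s own scripts work), the GitHub CLI can dump issue and pull request metadata to JSON:

# Snapshot issue and pull request metadata for safe keeping
gh issue list --repo curl/curl --state all --limit 1000 \
  --json number,title,state,labels > issues-backup.json
gh pr list --repo curl/curl --state all --limit 1000 \
  --json number,title,state > prs-backup.json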

There’s no doubt that it would be a significant speed bump for the project, but it would not be worse than that. We could bounce back on a new platform and development would go on within days.

Low risk

It’s rare for a service to go dark suddenly, with no warning and no heads-up, leaving projects completely stranded. In most cases, we get alerts and notifications and have a chance to transition cleanly and in an orderly fashion.

There are alternatives

Sure there are alternatives. Both pure GitHub alternatives that look similar and provide similar services, and projects that would allow us to run similar things ourselves and host locally. There are many options.

I’m not looking for alternatives. I’m not planning to switch hosting anytime soon! As mentioned above, I think GitHub is a net positive for the curl project.

Nothing lasts forever

We’ve switched services several times before and I’m expecting that we will change again in the future, for all sorts of hosting and related project offerings that we provide to the work and to the developers and participators within the project. Nothing lasts forever.

When a service we use goes down or just turns sour, we will figure out the best possible replacement and take the jump. Then we patch up all the cracks the jump may have caused and continue the race into the future. Onward and upward. The way we know and the way we’ve done for over twenty years already.

Credits

Image by Elias Sch. from Pixabay

Updates

After this blog post went live, some users remarked that I’m “disingenuous” in the list of reasons at the top that people have presented to me. This, because I don’t mention the moral issues staying on GitHub presents – like for example previously reported workplace conflicts and their association with hideous American immigration authorities.

This is rather the opposite of disingenuous. This is the truth. Not a single person has ever asked me to leave GitHub for those reasons. Not me personally, and nobody has asked it of the wider project either.

These are good reasons to discuss and consider if a service should be used. Have there been violations of “decency” significant enough that should make us leave? Have we crossed that line in the sand? I’m leaning to “no” now, but I’m always listening to what curl users and developers say. Where do you think the line is drawn?

The Talospace ProjectFirefox 85 on POWER

Firefox 85 declares war on supercookies, enables link preloading and adds improved developer tools (just in time, since Google's playing games with Chromium users again). This version builds and runs so far uneventfully on this Talos II. If you want a full PGO-LTO build, which I strongly recommend if you're going to bother building it yourself, grab the shell script from Firefox 82 if you haven't already and use this updated diff to enable PGO for gcc. Either way, the optimized and debug .mozconfigs I use are also unchanged from Fx82. At some point I'll get around to writing an upstreamable patch and then we won't have to keep carrying the diff around.

Data@MozillaThis Week in Glean: The Glean Dictionary

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)

On behalf of Mozilla’s Data group, I’m happy to announce the availability of the first milestone of the Glean Dictionary, a project to provide a comprehensive “data dictionary” of the data Mozilla collects inside its products and how it makes use of it. You can access it via this development URL:

https://dictionary.protosaur.dev/

The goal of this first milestone was to provide an equivalent to the popular “probe” dictionary for newer applications which use the Glean SDK, such as Firefox for Android. As Firefox on Glean (FoG) comes together, this will also serve as an index of what data is available for Firefox and how to access it.

Part of the vision of this project is to act as a showcase for Mozilla’s practices around lean data and data governance: you’ll note that every metric and ping in the Glean Dictionary has a data review associated with it — giving the general public a window into what we’re collecting and why.

In addition to displaying a browsable inventory of the low-level metrics which these applications collect, the Glean Dictionary also provides:

  • Code search functionality (via Searchfox) to see where any given data collection is defined and used.
  • Information on how this information is represented inside Mozilla’s BigQuery data store.
  • Where available, links to browse / view this information using the Glean Aggregated Metrics Dashboard (GLAM).

Over the next few months, we’ll be expanding the Glean Dictionary to include derived datasets and dashboards / reports built using this data, as well as allow users to add their own annotations on metric behaviour via a GitHub-based documentation system. For more information, see the project proposal.

The Glean Dictionary is the result of the efforts of many contributors, both inside and outside Mozilla Data. Special shout-out to Linh Nguyen, who has been moving mountains inside the codebase as part of an Outreachy internship with us. We welcome your feedback and involvement! For more information, see our project repository and Matrix channel (#glean-dictionary on chat.mozilla.org).

Mozilla Privacy BlogFive issues shaping data, tech and privacy in the African region in 2021

The COVID 19 crisis increased our reliance on technology and accelerated tech disruption and innovation, as we innovated to fight the virus and cushion the impact. Nowhere was this felt more keenly than in the African region, where the number of people with internet access continued to increase and the corresponding risks to their privacy and data protection rose in tandem. On the eve of 2021 Data Privacy Day, we take stock of the key issues that will shape data and privacy in the Africa region in the coming year.

  • Data Protection: Africa is often used as a testing ground for technologies produced in other countries. As a result, people’s data is increasingly stored in hundreds of databases globally. While many in the region are still excluded from enjoying basic rights, their personal information is a valuable commodity in the global market, even when no safeguard mechanisms exist. Where safeguards exist, they are still woefully inadequate. One of the reasons Cambridge Analytica could amass large databases of personal information was the lack of data protection mechanisms in countries where they operated. This 2017 global scandal served as a wakeup call for stronger data protection for many African states. Many countries are therefore strengthening their legal provisions regarding access to personal data, with over 30 countries having enacted Data Protection legislation. Legislators have the African Union Convention on Cybersecurity and Personal Data Protection (Malabo Convention) 2014 to draw upon and we are likely to see more countries enacting privacy laws and setting up Data Protection Authorities in 2021, in a region that would otherwise have taken a decade or more to enact similar protections.
  • Digital ID:The UN’s Sustainable Development Goal 16.9 aims to provide legal identity for all, including birth registration. But this goal is often conflated with and used to justify high-tech biometric ID schemes. These national level identification projects are frequently funded through development aid agencies or development loans from multilaterals and are often duplicative of existing schemes. They are set up as unique, centralised single sources of information on people and meant to replace existing sectoral databases, and guarantee access to a series of critical public and private services. However, they risk abusing privacy and amplifying patterns of discrimination and exclusion. Given the impending rollout of COVID-19 vaccination procedures, we can expect digital ID to remain a key issue across the region. It will be vital that the discrimination risks inherent within digital IDs are not amplified to deny basic health benefits.
  • Behavioural Biometrics: In addition to government-issued digital IDs, our online and offline lives now require either identifying ourselves or being identified. Social login services which let us log in with Facebook, ad IDs that are used to target ads to us, or an Apple ID that connects our text messages, music preferences, app purchases, and payments to a single identifier, are all examples of private companies using identity systems to amass vast databases about us. Today’s behavioural biometrics technologies go further and use hundreds of unique parameters to analyse how someone uses their digital devices, their browsing history, how they hold their phone, and even how quickly they text, providing a mechanism to passively authenticate people without their knowledge. For example, mobile lending services are “commodifying the routine habits of Kenyans, transforming their behaviour into reputational data” to be monitored, assessed, and shared adding another worryingly sophisticated layer to identity verification leading to invasion of privacy and exclusion.
  • Fintech: An estimated 1.7 billion people lack access to financial services. FinTech solutions e.g. mobile money, have assumed a role in improving financial inclusion, while also serving as a catalyst for innovation in sectors like health, agriculture, etc. These solutions are becoming a way of life in many African countries, attracting significant investments in new transaction technologies. FinTech products collect significant amounts of personal data, including users’ names, location records, bank account details, email addresses, and sensitive data relating to religious practices, ethnicity, race, credit information, etc. The sheer volume of information increases its sensitivity and over time a FinTech company may generate a very detailed and complete picture of an individual while also collecting data that may have nothing to do with financial scope, for example, text messages, call logs, and address books. Credit analytics firms like Cignifi are extracting data from unbanked users to develop predictive algorithms. As FinTech continues to grow exponentially across the region in 2021 we can expect a lot of focus on ensuring companies adopt responsible, secure, and privacy-protective practices.
  • Surveillance and facial recognition technologies: Increased state and corporate surveillance through foreign-sourced technologies raises questions of how best to safeguard privacy on the continent. Governments in the region are likely to use surveillance technologies more to curb freedom of expression and freedom of assembly in contexts of political control. The increasing use of facial recognition technologies without accompanying legislation to mitigate privacy, security and discrimination risks is of great concern. The effort to call out and rein in bad practices and ensure legislative safeguards will continue in 2021. Fortunately, we already have some successes to build on. For instance, in South Africa, certain provisions of the Regulation of Interception of Communications and Provision of Communication Related Information Act have been declared unconstitutional.

As we move through 2021, the African region will continue to see Big Tech’s unencumbered rise, with vulnerable peoples’ data being used to enhance companies’ innovations, entrench their economic and political power, while impacting the social lives of billions of people. Ahead of Data Privacy Day, we must remember that our work to ensure data protection and data privacy will not be complete until all individuals, no matter where they are located in the world, enjoy the same rights and protections.

The post Five issues shaping data, tech and privacy in the African region in 2021 appeared first on Open Policy & Advocacy.

Daniel Stenbergcurl your own error message

The --write-out (or -w for short) curl command line option is a gem for shell script authors looking for more information from a curl transfer. Experienced users know that this option lets you extract things such as detailed timings, the response code, transfer speeds and sizes of various kinds. A while ago we even made it possible to output JSON.

Maybe the best resource to learn more about it is the dedicated section in Everything curl. You’ll like it!

Now even more versatile

In curl 7.75.0 (released on February 3, 2021) we introduce five new variables for this option, and I’ll elaborate on some of the fun new things you can do with these!

These new variables were invented when we received a bug report pointing out that when a user transfers many URLs in parallel and one or some of them fail, the error message doesn’t identify exactly which of the URLs failed. We should improve the error messages to fix this!

Or wait a minute. What if we provide enough details for --write-out to let the user customize the error message completely by themselves and thus get exactly the info they want?

onerror

Using this, you can specify a message that only gets written if the transfer ends in error, that is, with a non-zero exit code. An example could look like this:

curl -w '%{onerror}failed\n' $URL -o saved -s

If the transfer is OK, it says nothing. If it fails, the text to the right of the “onerror” variable gets output. And that text can of course contain other variables!

This command line uses -s for “silent” to make it inhibit the regular built-in error message.

url

To help craft a good error message, maybe you want the URL included that was used in the transfer?

curl -w '%{onerror}%{url} failed\n' $URL

urlnum

If you get more than one URL in the command line, it might be helpful to get the index number of the used URL. This is of course especially useful if you for example work with the same URL multiple times in the same command line and just one of them fails!

curl -w '%{onerror}URL %{urlnum} failed\n' $URL $URL

exitcode

The regular built-in curl error message shows the exit code, as it helps diagnose exactly what the problem was. Include that in the error message like:

curl -w '%{onerror}%{url} got %{exitcode}\n' $URL

errormsg

This is the human readable explanation for the problem. The error message. Mimic the default curl error message like this:

curl -w '%{onerror}curl: %{exitcode} %{errormsg}\n' $URL

stderr

This “variable” existed already before 7.75.0. It allows you to make sure the output message is sent to stderr instead of stdout, which then makes it even more like a real error message:

curl -w '%{onerror}%{stderr}curl: %{exitcode} %{errormsg}\n' $URL

More

These new variables work fine after %{onerror}, but they of course also work when there was no error, and they work whether you use -Z for parallel transfers or do them serially, one after the other.
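
Putting the new variables together, a single command line can produce a tailored error line per failed transfer, even with parallel transfers (the URLs below are just placeholders):

curl -Z -s -w '%{onerror}%{stderr}curl: URL %{urlnum} (%{url}) got %{exitcode}: %{errormsg}\n' $URL1 $URL2 $URL3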

William LachanceThe Glean Dictionary

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)

On behalf of Mozilla’s Data group, I’m happy to announce the availability of the first milestone of the Glean Dictionary, a project to provide a comprehensive “data dictionary” of the data Mozilla collects inside its products and how it makes use of it. You can access it via this development URL:

https://dictionary.protosaur.dev/

The goal of this first milestone was to provide an equivalent to the popular “probe” dictionary for newer applications which use the Glean SDK, such as Firefox for Android. As Firefox on Glean (FoG) comes together, this will also serve as an index of what data is available for Firefox and how to access it.

Part of the vision of this project is to act as a showcase for Mozilla’s practices around lean data and data governance: you’ll note that every metric and ping in the Glean Dictionary has a data review associated with it — giving the general public a window into what we’re collecting and why.

In addition to displaying a browsable inventory of the low-level metrics which these applications collect, the Glean Dictionary also provides:

  • Code search functionality (via Searchfox) to see where any given data collection is defined and used.
  • Information on how this information is represented inside Mozilla’s BigQuery data store.
  • Where available, links to browse / view this information using the Glean Aggregated Metrics Dashboard (GLAM).

Over the next few months, we’ll be expanding the Glean Dictionary to include derived datasets and dashboards / reports built using this data, as well as allow users to add their own annotations on metric behaviour via a GitHub-based documentation system. For more information, see the project proposal.

The Glean Dictionary is the result of the efforts of many contributors, both inside and outside Mozilla Data. Special shout-out to Linh Nguyen, who has been moving mountains inside the codebase as part of an Outreachy internship with us. We welcome your feedback and involvement! For more information, see our project repository and Matrix channel (#glean-dictionary on chat.mozilla.org).

Mozilla Attack & DefenseEffectively Fuzzing the IPC Layer in Firefox

The Inter-Process Communication (IPC) Layer within Firefox provides a cornerstone of Firefox’s multi-process Security Architecture. Thus, eliminating security vulnerabilities within the IPC Layer remains critical. Within this blog post we survey and describe the different communication methods Firefox uses to perform inter-process communication, which hopefully provides logical entry points for effectively fuzzing the IPC Layer in Firefox.

Background on the Multi Process Security Architecture of Firefox

When starting the Firefox web browser it internally spawns one privileged process (also known as the parent process) which then launches and coordinates activities of multiple content processes. This multi process architecture allows Firefox to separate more complicated or less trustworthy code into processes, most of which have reduced access to operating system resources or user files. (Entering about:processes into the address bar shows detailed information about all of the running processes). As a consequence, less privileged code will need to ask more privileged code to perform operations which it itself cannot. That request for delegation of operations or in general any communication between content and parent process happens through the IPC Layer.

IPC – Inter Process Communication 

From a security perspective the Inter-Process Communication (IPC) is of particular interest because it spans several security boundaries in Firefox. The most obvious one is the PARENT <-> CONTENT process boundary. The content (or child process), which hosts one or more tabs containing web content, is unprivileged and sandboxed and in threat modeling scenarios often considered to be compromised and running arbitrary attacker code. The parent process on the other hand has full access to the host machine. While this parent-child relationship is not the only security boundary (see e.g. this documentation on process privileges), it is the most critical one from a security perspective because any violation will result in a sandbox escape.

Firefox internally uses three main communication methods (plus an obsolete one) through which content processes can communicate with the main process. In more detail, inter-process communication between processes in Firefox happens through either: (1) IPDL protocols, (2) Shared Memory, (3) JS Actors, or sometimes (4) the obsolete and outdated Message Manager. Please note that (3) and (4) are internally built on top of (1) and hence are IPDL-aware.

  1. IPDL
    The Inter-process-communication Protocol Definition Language (IPDL) is a Mozilla-specific language used to define the mechanism to pass messages between processes in C++ code. The primary protocol that governs the communication between the parent and child process in Firefox is the Content protocol. As you can see in its source, named PContent.ipdl, the definition is fairly large and includes a lot of other protocols – because there are a number of sub-protocols in Content, the whole protocol definition is hierarchical. Inside each protocol, there are methods for starting each sub-protocol and providing access to the methods inside the newly started sub-protocol. For automated testing, this structure also means that there is not one flat attack surface containing all interesting methods. Instead, it might take a number of message round trips to reach a certain sub-protocol which then might surface a vulnerability. In addition to PContent.ipdl, Firefox also uses PBackground.ipdl, which is not included in Content but allows for background communication and hence provides another mechanism to pass messages between processes in C++.
  2. Shared Memory
    Shmems are the main objects to share memory across processes. They are IPDL-aware and are created and managed by IPDL actors. The easiest way to find them in code is to search for uses of ‘AllocShmem/AllocUnsafeShmem’, or to search all files ending in ‘.ipdl’. As an example, PContent::InvokeDragSession uses PARENT <-> CONTENT Shmems to transfer drag/drop images, files, etc. An easier example, though not always involving the parent process, can be found in PCompositorBridge::EndRecordingToMemory, which is used to record the browser display for things like collecting visual performance metrics. It returns a CollectedFramesParams struct which contains a Shmem. First, the message is sent by a PCompositorBridgeChild actor to a GPU process’ PCompositorBridgeParent, where the Shmem is allocated, filled as a char* and returned. It is then interpreted by the child actor as a buffer of raw bytes, using a Span object. Please note that the above example does not necessarily involve the parent process because it illustrates two of the more subtly confusing aspects of the process design. First, PCompositorBridge actors exist as both GPU <-> PARENT actors for browser UI rendering, called “chrome” in some parts of documentation, and GPU <-> CONTENT actors for page rendering. (The comment for PCompositorBridge explains this relationship in more detail.) Second, the GPU process does not always exist. For example, there is no GPU process on Mac based systems. This does not mean that these actors are not used — it just means that when there is no GPU process, the parent process serves in its place. Finally, SharedMemoryBasic is a lower-level shared memory object that does not use IPDL. Its use is not well standardized but it is heavily used in normal operations, for example by SourceSurfaceSharedData to share texture data between processes.
  3. JSActors
    The preferred way to perform IPC in JavaScript is based on JS Actors, where a JSProcessActor provides a communication channel between a child process and its parent, and a JSWindowActor provides a communication channel between a frame and its parent. After registering a new JS Actor, it provides two methods for sending messages: (a) sendAsyncMessage() and (b) sendQuery(), and one for receiving messages: receiveMessage(). (A minimal sketch of this pattern follows right after this list.)
  4. Message Manager
    In addition to JS Actors there is yet another method to perform IPC in JavaScript, namely the Message Manager. Even though we are replacing IPC communication based on Message Manager with the more robust JS Actors, there are still instances in the codebase which rely on the Message Manager. Its general structure is very loose and there is no registry of methods and also no specification of types. MessageManager messages are sent using the following four JavaScript methods: (a) sendSyncMessage(), (b) sendAsyncMessage(), (c) sendRpcMessage(), and (d) broadcastAsyncMessage(). Enabling MOZ_LOG=“MessageManager:5” will output all of the messages passed back and forth across the IPC Layer. MessageManager is built atop IPDL using PContent::SyncMessage and PContent::AsyncMessage.
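
To make the JS Actors shape more concrete, here is a heavily simplified sketch of the registration-plus-actor-class pattern. All names, module URIs, message names, and the event hook below are hypothetical and only illustrate the flow; this is not code from the Firefox tree.

// Registration, run once in the parent process (hypothetical actor named "Hypothetical").
ChromeUtils.registerWindowActor("Hypothetical", {
  parent: { moduleURI: "resource://gre/actors/HypotheticalParent.jsm" },
  child: {
    moduleURI: "resource://gre/actors/HypotheticalChild.jsm",
    events: { DOMContentLoaded: {} }, // deliver this DOM event to the child actor
  },
});

// HypotheticalParent.jsm (runs in the parent process)
var EXPORTED_SYMBOLS = ["HypotheticalParent"];
class HypotheticalParent extends JSWindowActorParent {
  receiveMessage(message) {
    if (message.name === "Hypothetical:Ping") {
      // The return value resolves the child's sendQuery() promise.
      return { reply: "pong" };
    }
  }
}

// HypotheticalChild.jsm (runs in the content process, one instance per frame)
var EXPORTED_SYMBOLS = ["HypotheticalChild"];
class HypotheticalChild extends JSWindowActorChild {
  async handleEvent(event) {
    if (event.type === "DOMContentLoaded") {
      const answer = await this.sendQuery("Hypothetical:Ping");
      console.log("parent replied:", answer.reply);
    }
  }
}

From a fuzzing perspective, every registered actor and every message name it accepts is another entry point across the process boundary.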

Fuzz Testing In-Isolation vs. System-Testing

For automated testing, in particular fuzzing, isolating the components to be tested has proven to be effective many times. For IPC, unfortunately this approach has not been successful because the interesting classes, such as the ContentParent IPC endpoint, have very complex runtime contracts with their surrounding environment. Using them in isolation results in a lot of false positive crashes. As an example, compare the libFuzzer target for ContentParent, which has found some security bugs, but also an even larger number of (mostly nullptr) crashes that are related to missing initialization steps. Reporting false positives not only lowers confidence in the tool’s results and requires additional maintenance, but also indicates that parts of the attack surface are missed because of an improper setup. Hence, we believe that a system testing approach is the only viable solution for comprehensive IPC testing.

One potential approach to fuzz such scenarios effectively could be to start Firefox with a new tab, navigate to a dummy page, then perform a snapshot of the parent (process- or VM-based snapshot fuzzing) and then replace the regular child messages with coverage-guided fuzzing. The snapshot approach would further allow resetting the parent to a defined state from time to time without suffering the major performance bottleneck of restarting the process. As described above in the IPDL section, it is crucial to have multiple messages going back and forth to ensure that we can reach deep into the protocol tree. And finally, the reproducibility of the crashes is crucial, since bugs without reliable steps to reproduce usually receive a lot less traction. Put differently, vulnerabilities with reliable steps to reproduce can be isolated, addressed and fixed a lot faster.

For VM-based snapshot fuzzing, we are aware of certain requirements that need to be fulfilled for successful fuzzing. In particular:

  • Callback when the process is ready for fuzzing
    In order to know when to start intercepting and fuzzing communication, some kind of callback must be made at the right point when things are “ready”. As an example, the point when a new tab becomes ready for browsing, or more precisely has loaded its URI, might be a good entry point for PARENT <-> CONTENT fuzzing. One way to intercept the message exactly at this point in time might be to check for a STATE_STOP message with the expected URI that you are loading, e.g. in OnStateChange().
  • Callback for when communication sockets are created
    On Linux, Firefox uses UNIX Sockets for IPC communication. If you want to intercept the creation of sockets in the parent, the ChannelImpl::CreatePipe method is what you should be looking for.

Examples of Previous Bugs

We have found (security) bugs through various means, including static analysis, manual audits, and libFuzzer targets on isolated parts (which has the problems described above). Looking through the reports of those bugs might additionally provide some useful information:

Further Fuzzing related resources for Firefox

The following resources are not specifically for IPC fuzzing but might provide additional background information and are widely used at Mozilla for fuzzing Firefox in various ways:

  • prefpicker – Compiles prefs.js files used for fuzzing Firefox
  • fuzzfetch – A tool to download all sorts of Firefox or JS shell builds from our automation
  • ffpuppet – Makes it easier to automate browser profile setup, startup etc.
  • Fuzzing Interface – Shows how libFuzzer targets, instrumentation, etc. work in our codebase
  • Sanitizers – How to build with various sanitizers, known issues, workarounds, etc.

Going Forward

Providing architectural insights into the security design of a system is crucial for truly working in the open and ultimately allows contributors, hackers, and bug bounty hunters to verify and challenge our design decisions. We would like to point out that bugs in the IPC Layer are eligible for a bug bounty — you can report potential vulnerabilities simply by filing a bug on Bugzilla. Thank you!

About:CommunityNew contributors to Firefox 85

With Firefox 85 fresh out of the oven, we are delighted to welcome the developers who contributed their first code change to Firefox in this release, 13 of whom are new volunteers! Please join us in thanking each of them, and take a look at their contributions:

The Firefox FrontierJessica Rosenworcel’s appointment is good for the internet

With a new year comes change, and one change we’re glad to see in 2021 is new leadership at the Federal Communications Commission (FCC). On Thursday, Jan. 21, Jessica Rosenworcel, … Read more

The post Jessica Rosenworcel’s appointment is good for the internet appeared first on The Firefox Frontier.

Hacks.Mozilla.OrgJanuary brings us Firefox 85

To wrap up January, we are proud to bring you the release of Firefox 85. In this version we are bringing you support for the :focus-visible pseudo-class in CSS and associated devtools, <link rel="preload">, and the complete removal of Flash support from Firefox. We’d also like to invite you to preview two exciting new JavaScript features in the current Firefox Nightly — top-level await and relative indexing via the .at() method. Have fun!

This blog post provides merely a set of highlights; for all the details, check out the following:

:focus-visible

The :focus-visible pseudo-class, previously supported in Firefox via the proprietary :-moz-focusring pseudo-class, allows the developer to apply styling to elements in cases where browsers use heuristics to determine that focus should be made evident on the element.

The most obvious case is when you use the keyboard to focus an element such as a button or link. There are often cases where designers will want to get rid of the ugly focus-ring, commonly achieved using something like :focus { outline: none }, but this causes problems for keyboard users, for whom the focus-ring is an essential accessibility aid.

:focus-visible allows you to apply a focus-ring alternative style only when the element is focused using the keyboard, and not when it is clicked.

For example, this HTML:

<p><button>Test button</button></p>
<p><input type="text" value="Test input"></p>
<p><a href="#">Test link</a></p>

Could be styled like this:

/* remove the default focus outline only on browsers that support :focus-visible  */
a:not(:focus-visible), button:not(:focus-visible), input:not(:focus-visible) {
  outline: none;
}

/* Add a strong indication on browsers that support :focus-visible */
a:focus-visible, button:focus-visible, input:focus-visible {
  outline: 4px dashed orange;
}

And as another nice addition, the Firefox DevTools’ Page Inspector now allows you to toggle :focus-visible styles in its Rules View. See Viewing common pseudo-classes for more details.

Preload

After a couple of false starts in previous versions, we are now proud to announce support for <link rel="preload">, which allows developers to instruct the browser to preemptively fetch and cache high-importance resources ahead of time. This ensures they are available earlier and are less likely to block page rendering, improving performance.

This is done by including rel="preload" on your link element, and an as attribute containing the type of resource that is being preloaded, for example:

<link rel="preload" href="style.css" as="style">
<link rel="preload" href="main.js" as="script">

You can also include a type attribute containing the MIME type of the resource, so a browser can quickly see what resources are on offer, and ignore ones that it doesn’t support:

<link rel="preload" href="video.mp4" as="video" type="video/mp4">
<link rel="preload" href="image.webp" as="image" type="image/webp">

See Preloading content with rel=”preload” for more information.

The Flash is dead, long live the Flash

Firefox 85 sees the complete removal of Flash support from the browser, with no means to turn it back on. This is a coordinated effort across browsers, and as our plugin roadmap shows, it has been on the cards for a long time.

For some like myself — who have many nostalgic memories of the early days of the web, and all the creativity, innovation, and just plain fun that Flash brought us — this is a bittersweet day. It is sad to say goodbye to it, but at the same time the advantages of doing so are clear. Rest well, dear Flash.

Nightly previews

There are a couple of upcoming additions to Gecko that are currently available only in our Nightly Preview. We thought you’d like to get a chance to test them early and give us feedback, so please let us know what you think in the comments below!

Top-level await

async/await has been around for a while now, and is proving popular with JavaScript developers because it allows us to write promise-based async code more cleanly and logically. The following trivial example illustrates the idea of using the await keyword inside an async function to turn a returned value into a resolved promise.

async function hello() {
  return await Promise.resolve("Hello");
}

hello().then(alert);

The trouble here is that await was originally only allowed inside async functions, and not in the global scope. The experimental top-level await proposal addresses this by allowing await at the top level of modules. This has many advantages in situations like wanting to await the loading of modules in your JS application. Check out the proposal for some useful examples.
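
For instance, a module can now wait for a fetched resource before the rest of the module (and anything that imports it) runs. A minimal sketch, assuming a module context and a placeholder URL:

// config.mjs: top-level await is only valid in modules
const response = await fetch("https://example.com/config.json");
export const config = await response.json();

// main.mjs: evaluation waits until config.mjs has finished its awaits
import { config } from "./config.mjs";
console.log(config);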

What’re you pointing at() ?

Currently an ECMAScript stage 3 draft proposal, the relative indexing method .at() has been added to Array, String, and TypedArray instances to provide an easy way of returning specific index values in a relative manner. You can use a positive index to count forwards from position 0, or a negative value to count backwards from the highest index position.

Try these, for example:

let myString = 'Hello, how are you?';
myString.at(4);
myString.at(-3);

let myArray = [0, 10, 35, 70, 100, 300];
myArray.at(1);
myArray.at(-2);

WebExtensions

Last but not least, let’s look at what has changed in our WebExtensions implementation in Fx 85.

And finally, we want to remind you about upcoming site isolation changes with Project Fission. As we previously mentioned, the drawWindow() method is being deprecated as part of this work. If you use this API, we recommend that you switch to using the captureTab() method instead.
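
For instance, where an extension previously drew a tab into a canvas with drawWindow(), it can ask the browser for an image of the tab instead. A rough sketch of that migration, assuming a background script in an extension with the "<all_urls>" permission (the function name and logging are illustrative only):

// background.js: capture the currently active tab as a PNG data URL
async function snapshotActiveTab() {
  const [tab] = await browser.tabs.query({ active: true, currentWindow: true });
  const dataUrl = await browser.tabs.captureTab(tab.id, { format: "png" });
  console.log("captured data URL of length", dataUrl.length);
}

browser.browserAction.onClicked.addListener(snapshotActiveTab);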

The post January brings us Firefox 85 appeared first on Mozilla Hacks - the Web developer blog.

The Mozilla BlogWhy getting voting right is hard, Part V: DREs (spoiler: they’re bad)

This is the fifth post in my series on voting systems (catch up on parts I, II, III and IV), focusing on computerized voting machines. The technical term for these is Direct Recording Electronic (DRE) voting systems, but in practice what this means is that you vote on some kind of computer, typically using a touch screen interface. As with precinct-count optical scan, the machine produces a total count, typically recorded on a memory card, printed out on a paper receipt-like tape, or both. These can be sent back to election headquarters, together with the ballots, where they are aggregated.

Accessibility

One of the major selling points of DREs is accessibility: paper ballots are difficult for people with a number of disabilities to access without assistance. At least in principle DREs can be made more accessible, for instance fitted with audio interfaces, sip-puff devices, etc. Another advantage of DREs is that they scale better to multiple languages: you of course still have to encode ballot definitions in each new language, but you don’t need to worry about whether you’ve printed enough ballots in any given language.[1]

In practice, the accessibility of DREs is not that great:

Noel Runyan is one of the few people who sits at the crossroads of
this debate. He has 50 years of experience designing accessible
systems and is both a computer scientist and disabled. He was dragged
into this debate, he said, because there were so few other people who
had a stake in both fields.

Voting machines for all is clearly not the right position, Runyan
said. But neither is the universal requirement for hand-marked paper
ballots.

“The [Americans with Disabilities Act], Hava and decency require that
we allow disabled people to vote and have accessible voting systems,”
Runyan said.

Yet Runyan also believes the voting machines on the market today are
“garbage”. They neither provide any real sense of security against
physical or cyber-attacks that could alter an election, nor do they
have good user interfaces for voters regardless of disability status.

See also the 2007 California Top-to-Bottom-Review accessibility report for a long catalog of the failings of accessible voting systems at the time, which don’t seem to have improved much. With all that said, having any kind of accessibility is a pretty big improvement. In particular, this was the first time that many visually impaired voters were able to vote without assistance.

Clarifying Voter Intent

As discussed in previous posts, one of the challenges with any kind of hand-marked ballot is dealing with edge cases where the markings are not clear and you have to discern voter intent. Arguments about how to interpret (or discard) these ambiguous ballots have been important in at least two very high stakes US elections, the 2000 Bush/Gore Florida Presidential contest (conducted on punch card machines) and the 2008 Coleman/Franken Minnesota Senate contest (conducted on optical scan machines). It’s traditional at this point to show the following picture of one of the “scrutineers” from the Florida recount trying to interpret a punch card ballot[2]:

[Image: a scrutineer from the Florida recount examining a punch card ballot]

In a DRE system, by contrast, all of the interpretation of voter intent is done by the computer, with the expectation that any misinterpretation will be caught by the voter checking the DRE’s work (typically at some summary screen before casting). In addition, the DRE can warn users about potential errors on their part (or just make them impossible by forbidding voters from voting for >1 candidate, etc.). To the extent to which voters actually check that the DRE is behaving correctly, this seems like an advantage, but if they do not (see below) then it’s just destroying information which might be used to conduct a more accurate election. We have trouble measuring the error rate of DREs in the field — again, because the errors are erased and because observing actual voters while casting ballots is a violation of ballot privacy and secrecy — but Michael Byrne reports that under laboratory conditions, DREs have comparable error rates (~1-2%) to hand-marked optical scan ballots, so this suggests that the outcome is about neutral.

Scalability

DREs have far worse scaling properties than optical scan systems. The number of voters that can vote at once is one of the main limits on how fast people can get through a polling place. Thus, you’d like to have as many voting stations as possible. However, DREs are expensive to buy (as well as to set up), which puts pressure on the number of machines you can deploy. To make matters worse, you need more machines than you would expect by just calculating the total amount of time people need to vote.

The intuition here is that people don’t vote evenly throughout the day, so you need many more machines than you would need to handle the average arrival rate. For instance, if you expect to see 1200 voters over a 12 hour period and each voter takes 6 minutes to vote, you might think you could get by with 10 machines. However, what actually happens is that a lot of people vote before work, at lunch, and after work and so you get a line that builds up early, gradually dissipates throughout the morning, with a lot of machines standing idle, builds up again around lunch, then dissipates, and then another long line that starts to build up around 5 PM. The math here is complicated, but roughly speaking you need about twice as many machines as you would expect to ensure that lines stay short. In addition, the problem gets worse when there is high turnout.
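
To make that concrete, here is a toy queue simulation of a single polling place. The numbers follow the example above (roughly 1,200 voters over a 12-hour day, 6 minutes per ballot), while the peaked arrival profile and everything else in it are illustrative assumptions rather than data from any real election.

// Toy model: voters arrive in three peaks (before work, lunch, after work)
// on top of a low background rate, and each occupies a machine for 6 minutes.
const DAY_MINUTES = 12 * 60;
const VOTE_MINUTES = 6;

function arrivalsAt(minute) {
  const peaks = [60, 330, 630]; // assumed peak times within the voting day
  const background = 0.3;       // arrivals per minute between peaks
  const peaked = peaks.reduce(
    (sum, p) => sum + 2.9 * Math.exp(-((minute - p) ** 2) / (2 * 45 ** 2)),
    0
  );
  return background + peaked;   // integrates to roughly 1,200 voters per day
}

function worstWait(machines) {
  const queue = [];             // arrival minute of each voter still waiting
  const freeAt = new Array(machines).fill(0);
  let carry = 0;
  let worst = 0;
  for (let t = 0; t < DAY_MINUTES; t++) {
    carry += arrivalsAt(t);
    while (carry >= 1) { queue.push(t); carry -= 1; }
    for (let m = 0; m < machines; m++) {
      if (freeAt[m] <= t && queue.length > 0) {
        worst = Math.max(worst, t - queue.shift());
        freeAt[m] = t + VOTE_MINUTES;
      }
    }
  }
  return worst;
}

console.log("worst wait with 10 machines:", worstWait(10), "minutes");
console.log("worst wait with 20 machines:", worstWait(20), "minutes");

In this toy model the naive figure of 10 machines produces peak waits on the order of an hour, while doubling the machine count keeps the worst wait to a few minutes, which is the intuition behind the rule of thumb above.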

These problems exist to some extent with optical scan, but the main difference is that the voting stations — typically a table and a privacy shield — are cheap, so you can afford to have overcapacity. Moreover, if you really start getting backed up you can let voters fill out ballots on clipboards or whatever. This isn’t to say that there’s no way to get long lines with paper ballots; for instance, you could have problems at checkin or a backup at the precinct count scanner, but in general paper should be more resilient to high turnout than DREs. It’s also more resilient to failure: if the scanners fail, you can just have people cast ballots in a ballot box for later scanning. If the DREs fail, people can’t vote unless you have backup paper ballots.

Security

DREs are computers and as discussed in Part III, any kind of computerized voting is dangerous because computers can be compromised. This is especially dangerous in a DRE system because the computer completely controls the user’s experience: it can let the voter vote for Smith — and even show the voter that they voted for Smith — and then record a vote for Jones. In the most basic DRE system, this kind of fraud is essentially undetectable: you simply have to trust the computer. For obvious reasons, this is not good. To quote Richard Barnes, “for security people ‘trust’ is a bad word.”

How to compromise a voting machine

There are a number of ways in which a voting machine might get compromised. The simplest is that someone with physical access might subvert it (for obvious[3] reasons, you don’t want voting machines to be networked, let alone connected to the Internet). The bad news is that — at least in the past — a number of studies of DREs have found it fairly easy to compromise DREs even with momentary access. For instance, in 2007, Feldman, Halderman, and Felten studied the Diebold AccuVote-TS and found that:

1. Malicious software running on a single voting machine can steal votes
with little if any risk of detection. The malicious software can modify
all of the records, audit logs, and counters kept by the voting machine,
so that even careful forensic examination of these records will find
nothing amiss. We have constructed demonstration software that carries
out this vote-stealing attack.

2. Anyone who has physical access to a voting machine, or to a memory
card that will later be inserted into a machine, can install said
malicious software using a simple method that takes as little as
one minute. In practice, poll workers and others often have
unsupervised access to the machines.

As I said in Part III, most of the work here was done in the 2000s, so it’s possible that things have improved, but the available evidence suggests otherwise. Moreover, there are limits to how good a job it seems possible to do here.

As with precinct-count machines, there are a number of ways in which an attacker might get enough physical access to the machine in order to attack them. Anyone who has access to the warehouse where the machines are stored could potentially tamper with them. In addition it’s not uncommon for voting machines to be stored overnight at polling places before the election, where you’re mostly relying on whatever lock the church or school or whatever has on its doors. It’s also not impossible that a voter could exploit temporary physical access to a machine in order to compromise it — remember that there usually will be a lot of machines in a given location so it’s hard to supervise them all — but that is a somewhat harder attack to mount.

Viral attacks

However, there is another more serious attack modality: device administration. Prior to each election, DREs need to be initialized with the ballot contents for each contest. The details of how this is done vary: for instance, one might connect them via a cable to the Election Management System (EMS) [–corrected from “Server”], insert a memory stick programmed by the EMS, or sometimes connect over a local network. In each case, this electronic connection is a potential avenue for attack by an attacker who controls the EMS. This connection can also be an opportunity for a compromised voting machine to attack the EMS. Together, these provide the potential conditions for a virus: an attacker compromises a single DRE and then uses that to attack the EMS, and then uses the EMS to attack every DRE in the jurisdiction. This has been demonstrated on real systems. Here’s Feldman et al. again:

3. AccuVote-TS machines are susceptible to voting-machine viruses—computer
viruses that can spread malicious software automatically and invisibly from
machine to machine during normal pre- and post-election activity. We have
constructed a demonstration virus that spreads in this way, installing our
demonstration vote-stealing program on every machine it infects.

It’s important to remember that this kind of attack is also potentially possible with precinct-count opscan machines: any time you have computers in the polling place you run this risk. The major difference is that with precinct-count opscan machines, you have the paper ballots available so you can recount them without trusting the computer.

Voter Verifiable Paper Audit Trails (VVPAT)

Because of this kind of concern, some DREs are fitted with what’s called a Voter Verifiable Paper Audit Trail (VVPAT). A typical VVPAT is a reel-to-reel thermal printer (think credit card receipts) behind a clear cover that is attached to the voting machine, as in the picture of a Hart voting machine below (the VVPAT is the grey box on the left). [Picture by Joseph Lorenzo Hall].

Hart eSlate with VVPAT

The typical way this works is that after the voter has made their selections they will be presented with a final confirmation screen. At the same time, the VVPAT will print out a summary of their choices which the voter can check. If they are correct, the voter accepts them. If not, they can go back and correct their choices, and then go back to the confirmation screen. The idea is that the VVPAT then becomes an untamperable — at least electronically — record of the voter’s choices and can be counted separately if there is some concern about the correctness of the machine tally. If everyone did this, then DREs with VVPAT would be software independent (recall our discussion of SI in Part III of this series).

The major problem with VVPATs is that voters make mistakes and they aren’t very good about checking the results. This means that a compromised machine can change the voter’s vote (as if the voter had made a mistake). If the voter doesn’t catch the mistake, then the attacker wins, and if they do, they’re allowed to correct the mistake.[4] We do have some data on this from Bernhard et al., who studied Ballot Marking Devices (BMDs), which are like DREs except that they print out optical scan ballots (see below). They found that if left to themselves around 6.5% of voters (in a simulated but realistic setting) will detect ballots being changed, which is pretty bad. There is some good news here, which is that with appropriate warnings by the “poll workers” the researchers were able to raise the detection rate to 85.7%, though it’s not clear how feasible it is to get poll workers to give those warnings.

Privacy/Secrecy of the Ballot

The DRE privacy/secrecy story is also somewhat disappointing. There are two main ways that the system can leak how a voter voted: via Cast Vote Records (CVRs) and via the VVPAT paper record. A CVR is just an electronic representation of a given voter’s ballot stored on the DRE’s “disk”. In principle, you might think that you could just store the totals for each contest, but it’s convenient to have CVRs around for a variety of reasons, including post-election analysis (looking for undervotes, possible tabulation errors, etc.) In any case, it’s common practice to record them and the Voluntary Voting Systems Guidelines (VVSG) promulgated by the US Election Assistance Commission encourage vendors to do so. This isn’t necessarily a problem if CVRs are handled correctly, but it must be impossible to link a CVR back to a voter. This means they have to be stored in a random order with no identifying marks that lead back to voter sequence. Historically, manufacturers have not always gotten this right, as, for instance, the California TTBR found (See Section 4.4.8 and Section 6.8). These problems can also exist with precinct count optical scan systems, but I forgot to mention it in my post on them. Sorry about that. Even if this part is done correctly, there are risks of pattern voting attacks in which the voter casts their ballot in a specific unique way, though again this can happen with optical scan.

The VVPAT also presents a problem. As described above, VVPATs are typically one long strip of paper, with the result that the VVPAT reflects the order in which votes were cast. An attacker who can observe the order in which voters voted and who also has access to the VVPAT can easily determine how each voter voted. This issue can be mostly mitigated with election procedures which cut the VVPAT roll apart prior to usage, but absent those procedures it represents a risk.

Ballot Marking Devices

The final thing I want to cover in this post is what’s called a Ballot Marking Device (BMD) [also known as an Electronic Ballot Marker (EBM)]. BMDs have gained popularity in recent years — especially with people from the computer science voting security community — as a design that tries to blend some of the good parts of DREs with some of the good parts of paper ballots. For example, the Voting Works open source machine design is a BMD, as is Los Angeles’s new VSAP machine.

A BMD is conceptually similar to a DRE but with two important differences:

  1. It doesn’t have a VVPAT but instead prints out a ballot which can be fed into an optical scanner.
  2. Because the actual ballot counting is done by the scanner, you don’t need the machine to count votes, so it doesn’t need to store CVRs or maintain vote totals.

BMDs address the privacy issues with DREs fairly effectively: you don’t need to store CVRs in the machine and the ballots are to some extent randomized in the ballot boxes and handling process. They also partly address the scaling issues: while BMDs aren’t any cheaper, if a long line develops you can fall back to hand-marked optical scan ballots without disrupting any of your back-end processes.

It’s less clear that they address the security issues: a compromised BMD can cheat just as much as a compromised DRE and so they still rely on the voter checking their ballot. There have been some somewhat tricky attacks proposed on DREs where the attacker controls the printer in a way that fools the user about the VVPAT record and these can’t be mounted with a BMD, but it’s not clear how practical those attacks are in any case. Probably the biggest security advantage of a BMD is that you don’t need to worry about trusting the machine count or the communications channel back from the machine: you just count the opscan ballots without having to mess around with the VVPAT.[5]  And of course because they’re fundamentally just a mechanism for printing paper ballots, it’s straightforward to fall back to paper in case of failure or long lines.

Up Next: Post-Election Audits

We’ve now covered all the major methods used for casting and counting votes. That’s just the beginning, though: if you want to have confidence in an election you need to be able to audit the results. That’s a topic that deserves its own post.


  1. For instance, Santa Clara county produces ballots in English, Chinese, Spanish, Tagalog, Vietnamese, Hindi, Japanese, Khmer, and Korean. ↩︎
  2. Punch cards are an old system with some interesting properties. The voter marks their ballot by punching holes in a punch card. The card itself has no candidates written on it but is instead inserted into a holder that lists the contests and choices. The card itself is then read by a standard punch card reader. This seems like it ought to be fairly straightforward but went wrong in a number of ways in Florida due to a combination of poor ballot design and an unfortunate technical failure mode: it was possible to punch the cards incompletely and as the voting machine filled up with chads (the little pieces of paper that you punched out), it would sometimes become harder to punch the ballot completely. This resulted in a number of ballots which had partially detached (“hanging”) chads or just dimpled chads, leading to debates about how to interpret them. Wikipedia has a pretty good description of what happened here. ↩︎
  3. At least they should be obvious: It’s incredibly hard to write software that can resist compromise by a dedicated attacker who has direct access (this is why you have to keep upgrading your browser and operating system to fix security issues). Given the critical nature of voting machines, you really don’t want them attached to the Internet. ↩︎
  4. In principle, this might leave statistical artifacts, such as a higher rate of correcting from Smith -> Jones than Jones -> Smith, but it would take a fair amount of work to be sure that this wasn’t just random error. ↩︎
  5. We’ve touched on this a few times, but one of the real advantages of paper ballots is that they serve as a single common format for votes. Once you have that format, it’s possible to have multiple methods for writing (by hand, BMD) and reading (by hand, central count opscan, precinct count opscan) the ballots. That gives you increased flexibility because it means that you can innovate in one area without affecting others, as well as allowing either the writing side (voters) or reading side (election officials) to change its processes without affecting the other. This is a principle with applicability far beyond voting. Interoperable standardized data formats and protocols are a basic foundation of the Internet and the Web and much of what has made the rapid advancement of the Internet possible. ↩︎

Mozilla Security BlogFirefox 85 Cracks Down on Supercookies

Trackers and adtech companies have long abused browser features to follow people around the web. Since 2018, we have been dedicated to reducing the number of ways our users can be tracked. As a first line of defense, we’ve blocked cookies from known trackers and scripts from known fingerprinting companies.

In Firefox 85, we’re introducing a fundamental change in the browser’s network architecture to make all of our users safer: we now partition network connections and caches by the website being visited. Trackers can abuse caches to create supercookies and can use connection identifiers to track users. But by isolating caches and network connections to the website they were created on, we make them useless for cross-site tracking.

What are supercookies?

In short, supercookies can be used in place of ordinary cookies to store user identifiers, but they are much more difficult to delete and block. This makes it nearly impossible for users to protect their privacy as they browse the web. Over the years, trackers have been found storing user identifiers as supercookies in increasingly obscure parts of the browser, including in Flash storage, ETags, and HSTS flags.

The changes we’re making in Firefox 85 greatly reduce the effectiveness of cache-based supercookies by eliminating a tracker’s ability to use them across websites.

How does partitioning network state prevent cross-site tracking?

Like all web browsers, Firefox shares some internal resources between websites to reduce overhead. Firefox’s image cache is a good example: if the same image is embedded on multiple websites, Firefox will load the image from the network during a visit to the first website and on subsequent websites would traditionally load the image from the browser’s local image cache (rather than reloading from the network). Similarly, Firefox would reuse a single network connection when loading resources from the same party embedded on multiple websites. These techniques are intended to save a user bandwidth and time.

Unfortunately, some trackers have found ways to abuse these shared resources to follow users around the web. In the case of Firefox’s image cache, a tracker can create a supercookie by “encoding” an identifier for the user in a cached image on one website, and then “retrieving” that identifier on a different website by embedding the same image. To prevent this possibility, Firefox 85 uses a different image cache for every website a user visits. That means we still load cached images when a user revisits the same site, but we don’t share those caches across sites.
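
Conceptually, partitioning just means the cache key includes the top-level site the user is visiting, not only the resource URL. A toy illustration of the idea (this is not Firefox's actual cache implementation; the sites and data are made up):

// Unpartitioned: one entry per resource URL, shared by every site that embeds it.
const sharedCache = new Map();
const trackerPixel = "https://tracker.example/pixel.png";
sharedCache.set(trackerPixel, new Uint8Array([/* cached image bytes */]));

// Partitioned: the key also contains the top-level site, so each site gets its
// own copy and a cached entry can no longer act as a cross-site identifier.
const partitionedCache = new Map();
const cacheKey = (topLevelSite, resourceUrl) => `${topLevelSite}|${resourceUrl}`;
partitionedCache.set(cacheKey("https://news.example", trackerPixel), new Uint8Array([/* copy A */]));
partitionedCache.set(cacheKey("https://shop.example", trackerPixel), new Uint8Array([/* copy B */]));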

In fact, there are many different caches trackers can abuse to build supercookies. Firefox 85 partitions all of the following caches by the top-level site being visited: HTTP cache, image cache, favicon cache, HSTS cache, OCSP cache, style sheet cache, font cache, DNS cache, HTTP Authentication cache, Alt-Svc cache, and TLS certificate cache.

To further protect users from connection-based tracking, Firefox 85 also partitions pooled connections, prefetch connections, preconnect connections, speculative connections, and TLS session identifiers.

This partitioning applies to all third-party resources embedded on a website, regardless of whether Firefox considers that resource to have loaded from a tracking domain. Our metrics show a very modest impact on page load time: between a 0.09% and 0.75% increase at the 80th percentile and below, and a maximum increase of 1.32% at the 85th percentile. These impacts are similar to those reported by the Chrome team for similar cache protections they are planning to roll out.

Systematic network partitioning makes it harder for trackers to circumvent Firefox’s anti-tracking features, but we still have more work to do to continue to strengthen our protections. Stay tuned for more privacy protections in the coming months!

Thank you

Re-architecting how Firefox handles network connections and caches was no small task, and would not have been possible without the tireless work of our engineering team: Andrea Marchesini, Tim Huang, Gary Chen, Johann Hofmann, Tanvi Vyas, Anne van Kesteren, Ethan Tseng, Prangya Basu, Wennie Leung, Ehsan Akhgari, and Dimi Lee.

We wish to express our gratitude to the many Mozillians who contributed to and supported this work, including: Selena Deckelmann, Mikal Lewis, Tom Ritter, Eric Rescorla, Olli Pettay, Kim Moir, Gregory Mierzwinski, Doug Thayer, and Vicky Chin.

We also want to acknowledge past and ongoing efforts carried out by colleagues in the Brave, Chrome, Safari and Tor Browser teams to combat supercookies in their own browsers.

The post Firefox 85 Cracks Down on Supercookies appeared first on Mozilla Security Blog.

Hacks.Mozilla.OrgWelcoming Open Web Docs to the MDN family

Collaborating with the community has always been at the heart of MDN Web Docs content work — individual community members constantly make small (and not so small) fixes to help incrementally improve the content, and our partner orgs regularly come on board to help with strategy and documenting web platform features that they have an interest in.

At the end of 2020, we launched our new Yari platform, which exposes our content in a GitHub repo and therefore opens up many more valuable contribution opportunities than before.

And today, we wanted to spread the word about another fantastic event for enabling more collaboration on MDN — the launch of the Open Web Docs organization.

Open Web Docs

Open Web Docs (OWD) is an open collective, created in collaboration between several key MDN partner organizations to ensure the long-term health of open web platform documentation on de facto standard resources like MDN Web Docs, independently of any single vendor or organization. It will do this by collecting funding to finance writing staff and helping manage the communities and processes that will deliver on present and future documentation needs.

You will hear more about OWD, MDN, and opportunities to collaborate on web standards documentation very soon — a future post will outline exactly how the MDN collaborative content process will work going forward.

Until then, we are proud to join our partners in welcoming OWD into the world.

See also

The post Welcoming Open Web Docs to the MDN family appeared first on Mozilla Hacks - the Web developer blog.

Mozilla VR BlogA New Year, A New Hubs

An updated look & feel for Hubs, with an all-new user interface, is now live.

Just over two years ago, we introduced a preview release of Hubs. Our hope was to bring people together to create, socialize and collaborate around the world in a new and fun way. Since then, we’ve watched our community grow and use Hubs in ways we could only imagine. We’ve seen students use Hubs to celebrate their graduations last May, educational organizations use Hubs to help educators adapt to this new world we’re in, and heck, even NASA has used Hubs to feature new ways of working. In today’s world where we’re spending more time online, Hubs has been the go-to online place to have fun and try new experiences.

Today’s update brings new features including a chat sidebar, a new streamlined design for desktop and mobile devices, and a support forum to help our community get the most out of their Hubs experience.

The New Hubs Experience

We’re excited to announce a new update to Hubs that makes it easier than ever to connect with the people you care about remotely. The update includes:

Stay in the Conversation with new Chat sidebar

Chat scroll back has been a highly requested feature in Hubs. Before today’s update, messages sent in Hubs were ephemeral and disappeared after just a few seconds. The chat messages were also drawn over the room UI, which could prevent scene content from being viewed. With the new chat sidebar, you’ll be able to see chat from the moment you join the lobby, and choose when to show or hide the panel. On desktop, if the chat panel is closed, you’ll still get the quick text notifications, which have moved from the center of the screen to the bottom-left.

[Image: A preview of the new chat feature in the lobby of a Hubs room]

Streamlined experience for desktop and mobile

In the past, our team took a design approach that kept the desktop, mobile, and virtual reality interfaces tightly coupled. This often meant that the application’s interactions were tailored primarily to virtual reality devices, but in practice, the vast majority of Hubs users are visiting rooms on non-VR devices. This update separates the desktop and mobile interfaces to align more closely with industry-standard best practices, and makes the experience of being in a Hubs room more tailored to the device you’re using at any given time. We’ve improved menu navigation by making menus full-screen on mobile devices, and by consolidating options and preferences for personalizing your experience.

[Image: The preferences menu on mobile (left) and desktop (right)]

For our Hubs Cloud customers, we’re planning to release the UI changes after March 25th, 2021.  If you’re running Hubs Cloud out of the box on AWS, no manual updates will be required. If you have a custom fork, you will need to pull the changes into your client manually. We’ve created a guide here to explain what changes need to be made. For help with updates to Hubs Cloud or custom clients, you can connect with us on GitHub. We will be releasing an update to Hubs Cloud next week that does not include the UI redesign.

Helping you get the most out of your Hubs experience through our community

We’re excited to share that you can now get answers to questions about Hubs using support.mozilla.org. In addition to articles to help with basic Hubs setup and troubleshooting, the ‘Ask a Question’ forum is now available. This is a new place for the community and team to help answer questions about Hubs. If you’re an active Hubs user, you can contribute by answering questions and flagging information for the team. If you’re new to Hubs and find yourself needing some help getting up and running, pop over and let us know how we can help.

In the coming months, we’ll have additional detail to share about accessibility and localization in the new Hubs client. In the meantime, we invite you to check out the new Hubs experience on either your mobile or desktop device and let us know what you think!

Thank you to the following community members for letting us include clips of their scenes and events in our promo video: Paradowski, XP Portal, Narratify, REM5 For Good, Innovación Educativa del Tecnológico de Monterrey, Jordan Elevons, and Brendan Bradley. For more information, see the video description on Vimeo.