Christian Heilmann: That one tweet…

One simple tweet made me feel terrible. One simple tweet made me doubt myself. One simple tweet – hopefully not meant to be mean – had a devastating effect on me. Here’s how and why, and a reminder that you should not be the person who, with one simple tweet, causes anguish like that.

you don't know the struggle someone had to go through to get where they are

Beep beep, I’m a roadrunner

As readers of this blog, you know that the last weeks have been hectic for me:

I bounced from conference to conference, delivering a new talk at each of them and making my slides available to the public. I do it because I care about people who cannot get to the conference, and for those I coach about speaking, so they can re-use the decks if they want to. I also record my talks and publish them on YouTube so people can listen. Mostly I do these for myself, so I can get better at what I do. This is a trick I explained in the developer evangelism handbook – another service I provide for free.

Publishing on the go is damn hard:

  • Most Wi-Fi at events is flaky or very slow.
  • I travel world-wide which means I have no data on my phone without roaming and such.
  • Many times, uploading my slides needs four to five attempts.
  • Creating the screencast can totally drain the battery of my laptop with no power plug in sight.
  • Uploading the screencast can mean I have to do it overnight.

The format of your slides is irrelevant to these issues. HTML, PowerPoint, Keynote – lots of images means lots of bytes.

Traveling and presenting is tough – physical space still matters

Presenting and traveling is both stressful and taxing. Many people ask me how I do it, and my answer is simply this: the positive feedback I get, and seeing people improve when they get my advice, is a great reward and keeps me going. The last few weeks have been especially taxing as I also need to move out of my flat. I keep getting calls from my estate agent telling me I need to wire money or be somewhere I cannot be. I also haven’t seen my partner for more than a few hours because we are both busy.

I love my job. I still get excited to go to conferences, hear other presenters, listen to people’s feedback and help them out. A large part of my career is based on professional relationships that formed at events.

The lonely part of the rockstar life

Bill Murray in Lost in Translation

Being a public speaker means you don’t get much time to yourself. At an event you sleep four to five hours a night on average, as you don’t want to be the rockstar presenter who arrives, delivers a canned talk and leaves. You are there for the attendees, so you sacrifice your personal time. That’s something to prepare for. If you make promises, you also need to deliver on them immediately. Any promise to look into something or to contact someone that you don’t follow up on as soon as you can piles up into a large backlog, and you end up having a hard time remembering what it was you wanted to find out.

It also can make you feel very lonely. I’ve had many conversations with other presenters who feel very down because they are not with the people they care about, in the place they call home, or in an environment they understand and feel comfortable in. Sure, hotels, airports and conference venues are all lush and have a “jet set” feel to them. They are also very nondescript and make you feel like a stranger.

Progressive Enhancement discussions happen and I can’t be part of them!

I care deeply about progressive enhancement. To me, it means you care more for the users of your products than you care about development convenience. It is a fundamental principle of the web, and – to me – the start of caring about accessibility.

In the last few weeks progressive enhancement was a hot topic in our little world and I wanted to chime in many a time. After all, I wrote training materials on this 11 years ago, published a few books on it and keep banging that drum. But, I was busy with the other events on my backlog and the agreed topics I would cover.

That’s why I was very happy when “at the frontend” came up as a speaking opportunity and I submitted a talk about progressive enhancement. In this talk, I explain in detail that it is not about the JavaScript on or off case. I give out a lot of information and insight into why progressive enhancement is much more than that.

That Tweet

Just before my talk, I uploaded and tweeted my deck, in case people were interested. And then I got this tweet as an answer:

Yehuda Katz: @codepo8, I can't see your slides without JavaScript

It made me angry – a few minutes before my talk. It made me angry because of a few things:

  • it is insincere – this is not someone who has trouble accessing my content. It is someone who wants to point out one flaw to have a “ha-ha you’re doing it wrong” moment.
  • the poster didn’t bother to read what I wrote at all – not even the blog post I wrote a few days before covering exactly the same topic or others explaining that the availability of JS is not what PE is about at all.
  • it judges the content of a publication by the channel it was published on. Zeldman wrote an excellent piece years ago about how this is a knee-jerk reaction and an utter fallacy.
  • it makes me responsible for Slideshare’s interface – a service used by many people as a great place to share decks.
  • it boils the topic I talked about and care deeply for down to a simple binary state. This state isn’t even binary if you analyse it, and it is not the issue. Saying something is or isn’t PE based on whether JavaScript is available is the same technical nonsense as saying a text-only version means you are accessible.

I posted the Zeldman article as an answer to the tweet and got reprimanded for not using any of the dozens of available HTML slide deck formats that progressively enhance a document. Never mind that using Keynote makes me more effective and helps me with re-use. I have betrayed the cause, should do better, and should feel bad for being such a terrible slide-creator. OK then. I shrugged this off before, and will again. But this time I was vulnerable and it hurt more.

Siding with my critics

In addition to my having a lot of respect for what Yehuda has achieved, other people started favouriting the tweet. People I look up to, people I care about:

  • Alex Russell, probably one of the most gifted engineers I know, with a vocabulary that makes a thesaurus blush.
  • Michael Mahemoff, always around with incredibly good advice when HTML5 and apps were the question.
  • My colleague Jacob Rossi, who blows me away every single day with his insight and tech knowledge.

And that’s when my anger turned inward and the ugly voice of impostor syndrome reared its head. Here is what it told me:

  • You’re a fool. You’re making a clown of yourself trying to explain something everyone knows and nobody gives a shit about. This battle is lost.
  • You’re trying to cover up your loss of reality of what’s needed nowadays to be a kick-ass developer by releasing a lot of talks nobody needs. Why do you care about writing a new talk every time? Just do one, keep delivering it and do some real work instead
  • Everybody else moved on, you just don’t want to admit to yourself that you lost track.
  • They are correct in mocking you. You are a hypocrite for preaching things and then violating them by not using HTML for a format of publication it wasn’t intended for.

I felt devastated; I doubted everything I did. When I delivered the talk I had so looked forward to, and many people thanked me for my insights, I felt even worse:

  • Am I just playing a role?
  • Am I making the lives of those who want to follow what I advocate unnecessarily hard?
  • Shouldn’t they just build things that work in Chrome now and burn them in a month and replace them with the next new thing?

Eventually, I did what I always do and what all of you should: tell the impostor syndrome voice in your head to fuck off and let my voice of experience ratify what I am doing. I know my stuff, I did this for a long time and I have a great job working on excellent products.

Recovery and no need for retribution

I didn’t have any time to dwell more on this, as I went to the next conference. A wonderful place where every presentation was full of personal stories, warmth and advice how to be better in communicating with another. A place with people from 37 countries coming together to celebrate their love for a product that brings them closer. A place where people brought their families and children although it was a geek event.

I’m not looking for pity here. I am not harbouring a grudge against Yehuda and I don’t want anyone to reprimand him. My insecurities and how they manifest themselves when I am vulnerable and tired are my problem. There are many other people out there with worse issues and they are being attacked and taken advantage of and scared. These are the ones we need to help.

Think before trying to win with a tweet

What I want, though, is to make you aware that everything you do online has an effect. And I want you to think next time before you post the “ha-ha you are wrong” tweet or favourite and amplify it. I want you to:

  • read the whole publication before judging it
  • question if your criticism really is warranted
  • wonder how much work it was to publish the thing
  • consider how the author feels when the work is reduced to one thing that might be wrong.

Social media was meant to make media more social. Not to make it easier to attack and shut people up or tell them what you think they should do without asking about the how and why.

I’m happy. Help others to be the same.

Patrick Cloke: New Position in Cyber Security at Percipient Networks

Note

If you’re hitting this from Planet Mozilla, this doesn’t mean I’m leaving the Mozilla community, since I’m not (nor was I ever) a Mozilla employee.

After working for The MITRE Corporation for a bit over four years, I left a few weeks ago to begin work at a cyber security start-up: Percipient Networks. Currently our main product is STRONGARM: an intelligent DNS blackhole. Usually DNS blackholes are set up to block known bad domains by sending requests for those domains to a non-routable address or to localhost. STRONGARM redirects that traffic for identification and analysis. You could give it a try and let us know of any feedback you might have! Much of my involvement has been in the design and creation of the blackhole, including writing protocol parsers for both standard protocols and malware.

So far, I’ve been greatly enjoying my new position. There’s been a renewed focus on technical work, and I’m in a position to greatly influence both STRONGARM and Percipient Networks. My average day involves many more activities now, including technical work: reverse engineering, reviewing/writing code, or reading RFCs; as well as other work: mentoring [1], user support, writing documentation, and putting desks together [2]. I’ve been thoroughly enjoying the varied activities!

Shifting software stacks has also been nice. I’m now writing mostly Python code, instead of mostly MATLAB, Java and C/C++ [3]. It has been great to see how many ready-to-use packages are available for Python! I’ve been very impressed with the ecosystem, and have been encouraged to feed back into the open-source community where appropriate.

[1]We currently have four interns, so there’s always some mentoring to do!
[2]We got a delivery of 10 desks a couple of weeks ago and spent the evening putting them together.
[3]I originally titled this post "xx days since my last semi-colon!", since that has gone from being a common key press of mine to a rare one. Although now I just get confused when switching between Python and JavaScript, since semicolons are optional in both, but encouraged in JavaScript and discouraged in Python…

Air Mozilla: World Wide Haxe Conference 2015

World Wide Haxe Conference 2015 – the fifth international conference in English about the Haxe programming language.

Emma Irwin: #MozLove for Tanner

Let’s move from being ‘bandwagon-y’ about appreciation to being active participants and believers in surfacing the accomplishments of others.

I block off a bit of time each month as my #mozlove shout-out day. This is the day I try to blog, or write a couple of LinkedIn recommendations, for community members I am grateful for, amazed by, or otherwise think you should know about! And so, I want to surface the story of Tanner today.

 


Tanner is a part of a core team contributing to Community Ops, and numerous other projects across Mozilla. Tanner has been a huge help to me personally, in tackling nasty server setup things. He is a true pleasure to work with. Tanner has also done a fantastic job as a Mozilla Rep, most recently helping with a Mozilla booth at SCALE 13x.


I asked Tanner what first compelled him to contribute:

At the time I was sharing one computer with four others in my family; it was running Windows Vista. For some reason that I couldn’t figure out, Internet Explorer caused Windows to BSOD. I tried everything I could to fix my then-beloved Internet Explorer (I feel gross saying this). Eventually I gave up on it, and started trying out other browsers. For a while I used Opera, and really liked it. Then it too started causing some sort of problem; I don’t remember what. After a while, I remembered that one of my friends had told me about this program called Firefox, so I decided to give it a try. It was love at first sight.

I saw something interesting on the support page. It was run entirely by volunteers, and *I* could help!

I also asked him what was most rewarding, and found a kindred spirit in his response:  ‘learning and teaching’.

 Teaching someone just feels different than learning it yourself. My thinking is that if you are able to teach someone how to do something the right way, you have to be at least competent in what you’re doing, and that would say a lot about your abilities.

There are challenges in participation, and I asked Tanner about those as well:

  1. Avoiding Burn-out (there are only 24 hours in the day and Tanner has admitted to being up at 3AM reading docs)
  2. Working ‘around the world’ can be hard – finding time to collaborate with someone else on the other side of the world is challenging
  3. Managing relationships and handling disagreements is harder on the web than in person. (I would suggest this is a skill-set all its own that will serve Tanner well outside Mozilla)

Being somewhat nosy, I am always curious about what family and friends know about our contributors’ time and about Mozilla in general. So I asked about that as well, and I’m glad I did, because of this:

 A lot of people do have trouble understanding how I’d do something that I’m good at for free. When people say that, I explain to them that Mozilla isn’t *just* Firefox, and I’m not doing it just because it’s fun, but because we have to “save the Web”.

Tanner describes himself as being from: “The Middle of Nowhere, USA, also known as Cedar Rapids, Iowa.”  And so, if you are ever wondering where the Middle of Nowhere, USA is – I can now share that with you here:)

 

(map of Iowa)

Tanner will be attending the University of Northern Iowa this fall, where he will be studying Networking and Systems (although I expect he can probably do a bit of the teaching as well!).

When he’s not being a volunteer super-hero, Tanner loves to go back to his love of marching bands and Drum Corps – with YouTube videos of bands being his favorite viewing choice. (Note to Tanner: I used to drum in pipe bands; we should jam sometime soon.)

I’m grateful for a community that includes Tanner, and so grateful he took the time to answer my questions!

You can read Tanner’s Blog here

Consider writing a #mozlove blog or tweet for someone you appreciate!

 

 

Jet Villegas: Firefox Platform Rendering – Current Work

I’m often asked “what are you working on?” Here’s a snapshot of some of the things currently on my teams’ front burners:

I’m surely forgetting a few things, but that’s a quick snapshot for now. Do you have suggestions for what Platform Rendering features we should pick up next? Add your comments below…

Manish Goregaokar: How Rust Achieves Thread Safety

In every talk I have given till now, the question “how does Rust achieve thread safety?” has invariably come up [1]. I usually just give an overview, but this post provides a more comprehensive explanation for those who are interested.

See also: Huon’s blog post on the same topic

In my previous post I touched a bit on the Copy trait. There are other such “marker” traits in the standard library, and the ones relevant to this discussion are Send and Sync. I recommend reading that post if you’re not familiar with Rust wrapper types like RefCell and Rc, since I’ll be using them as examples throughout this post; but the concepts explained here are largely independent.

For the purposes of this post, I’ll restrict thread safety to mean no data races or cross-thread dangling pointers. Rust doesn’t aim to solve race conditions. However, there are projects which utilize the type system to provide some form of extra safety, for example rust-sessions attempts to provide protocol safety using session types.

These traits are auto-implemented using a feature called “opt-in builtin traits”. So, for example, if struct Foo is Sync, structs containing Foo will also be Sync (provided their other fields are Sync too), unless we explicitly opt out using impl !Sync for Bar {}. Similarly, if struct Foo is not Sync, structs containing it will not be Sync either, unless they explicitly opt in (unsafe impl Sync for Bar {}).
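
As a minimal sketch of these mechanics: the impl !Sync opt-out syntax is behind a nightly feature gate, so the example below shows the stable direction instead — a raw-pointer field opts a type out of Send and Sync automatically, and an unsafe impl opts it back in. The Wrapper type is made up for illustration.

use std::ptr;
use std::thread;

// A raw pointer field automatically opts the containing struct
// out of both Send and Sync.
#[allow(dead_code)]
struct Wrapper {
    data: *mut u8,
}

// Opting back in is an explicit, unsafe promise that the type really
// is safe to move between threads (Send) and share between threads (Sync).
unsafe impl Send for Wrapper {}
unsafe impl Sync for Wrapper {}

fn main() {
    let w = Wrapper { data: ptr::null_mut() };
    // Thanks to the unsafe impls above, moving `w` into a thread compiles.
    thread::spawn(move || {
        let _w = w;
    })
    .join()
    .unwrap();
}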

This means that, for example, a Sender for a Send type is itself Send, but a Sender for a non-Send type will not be Send. This pattern is quite powerful; it lets one use channels with non-threadsafe data in a single-threaded context without requiring a separate “single threaded” channel abstraction.
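
A small sketch of that pattern using the standard mpsc channel (the Rc value is a made-up example): because both ends stay on one thread, a channel of non-Send data is perfectly fine.

use std::rc::Rc;
use std::sync::mpsc::channel;

fn main() {
    // Rc<i32> is not Send, yet a channel of Rc values works fine
    // as long as both ends stay on the same thread.
    let (tx, rx) = channel();
    tx.send(Rc::new(42)).unwrap();
    println!("received: {}", rx.recv().unwrap());

    // Moving `tx` into thread::spawn would not compile, because
    // Sender<Rc<i32>> is only Send when Rc<i32> is Send.
}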

At the same time, structs like Rc and RefCell which contain Send/Sync fields have explicitly opted out of one or more of these because the invariants they rely on do not hold in threaded situations.

It’s actually possible to design your own library with comparable thread safety guarantees outside of the compiler — while these marker traits are specially treated by the compiler, the special treatment is not necessary for their working. Any two opt-in builtin traits could be used here.

Send and Sync have slightly differing meanings, but are very intertwined.

Send types can be moved between threads without an issue. It answers the question “if this variable were moved to another thread, would it still be valid for use?”. Most objects which completely own their contained data qualify here. Notably, Rc doesn’t (since it is shared ownership). Another exception is LocalKey, which does own its data but isn’t valid from other threads. Borrowed data does qualify to be Send, but in most cases it can’t be sent across threads due to a constraint that will be touched upon later.

Even though a type like RefCell uses non-atomic reference counting, it can be sent safely between threads because sending is a transfer of ownership (a move). Sending a RefCell to another thread will be a move and will make it unusable from the original thread; so this is fine.
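
A quick sketch (with made-up values) of moving a RefCell into another thread: ownership is transferred, so the non-atomic borrow counter is never touched by two threads at once.

use std::cell::RefCell;
use std::thread;

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);

    // The RefCell is moved into the closure; only the new thread can use it.
    let handle = thread::spawn(move || {
        cell.borrow_mut().push(4);
        cell.into_inner()
    });

    println!("{:?}", handle.join().unwrap());
    // `cell` is no longer accessible here: it was moved into the closure.
}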

Sync, on the other hand, is about synchronous access. It answers the question: “if multiple threads were all trying to access this data, would it be safe?”. Types like Mutex and other lock/atomic based types implement this, along with primitive types. Things containing pointers generally are not Sync.

Sync is sort of a crutch to Send; it helps make other types Send when sharing is involved. For example, &T and Arc<T> are only Send when the inner data is Sync (there’s an additional Send bound in the case of Arc<T>). In words, stuff that has shared/borrowed ownership can be sent to another thread if the shared/borrowed data is synchronous-safe.

RefCell, while Send, is not Sync because of the non-atomic reference counting.

Bringing it together, the gatekeeper for all this is thread::spawn(). It has the signature

pub fn spawn<F, T>(f: F) -> JoinHandle<T> where F: FnOnce() -> T, F: Send + 'static, T: Send + 'static

Admittedly, this is confusing/noisy, partially because it’s allowed to return a value, and also because it returns a handle from which we can block on a thread join. We can conjure a simpler spawn API for our needs though:

pub fn spawn<F>(f: F) where F: FnOnce(), F: Send + 'static

which can be called like:

let mut x = vec![1, 2, 3, 4];

// `move` instructs the closure to move out of its environment
thread::spawn(move || {
    x.push(1);
});

// x is not accessible here since it was moved

In words, spawn() will take a callable (usually a closure) that will be called once, and contains data which is Send and 'static. Here, 'static just means that there is no borrowed data contained in the closure. This is the aforementioned constraint that prevents the sharing of borrowed data across threads; without it we would be able to send a borrowed pointer to a thread that could easily outlive the borrow, causing safety issues.

There’s a slight nuance here about the closures — closures can capture outer variables, but by default they do so by reference (hence the move keyword). They auto-implement Send and Sync depending on their capture clauses. For more on their internal representation, see huon’s post. In this case, x was captured by move; i.e. as Vec<T> (instead of being similar to &Vec<T> or something), so the closure itself can be Send. Without the move keyword, the closure would not be 'static since it contains borrowed content.

Since the closure inherits the Send/Sync/'static-ness of its captured data, a closure capturing data of the correct type will satisfy the F: Send+'static bound.

Some examples of things that are allowed and not allowed by this function (for the type of x):

  • Vec<T>, Box<T> are allowed because they are Send and 'static (when the inner type is of the same kind)
  • &T isn’t allowed because it’s not 'static. This is good, because borrows should have a statically-known lifetime. Sending a borrowed pointer to a thread may lead to a use after free, or otherwise break aliasing rules.
  • Rc<T> isn’t Send, so it isn’t allowed. We could have some other Rc<T>s hanging around, and end up with a data race on the refcount.
  • Arc<Vec<u32>> is allowed (Vec<T> is Send and Sync if the inner type is); we can’t cause a safety violation here. Iterator invalidation requires mutation, and Arc<T> doesn’t provide this by default.
  • Arc<Cell<T>> isn’t allowed. Cell<T> provides copying-based internal mutability, and isn’t Sync (so the Arc<Cell<T>> isn’t Send). If this were allowed, we could have cases where larger structs are getting written to from different threads simultaneously resulting in some random mishmash of the two. In other words, a data race.
  • Arc<Mutex<T>> or Arc<RwLock<T>> are allowed. The inner types use threadsafe locks and provide lock-based internal mutability. They can guarantee that only one thread is writing to them at any point in time. For this reason, the mutexes are Sync regardless of the inner T, and Sync types can be shared safely with wrappers like Arc. From the point of view of the inner type, it’s only being accessed by one thread at a time (slightly more complex in the case of RwLock), so it doesn’t need to know about the threads involved. There can’t be data races when Sync types like these are involved.
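
To make that last point concrete, here is a minimal sketch (a made-up counter) of the usual Arc<Mutex<T>> sharing pattern:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // The lock ensures only one thread mutates at a time, which
                // is why Mutex<T> is Sync and Arc<Mutex<T>> is Send here.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("total: {}", *counter.lock().unwrap());
}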

As mentioned before, you can in fact create a Sender/Receiver pair of non-Send objects. This sounds a bit counterintuitive — shouldn’t we be only sending values which are Send? However, Sender<T> is only Send if T is Send; so even if we can use a Sender of a non-Send type, we cannot send it to another thread, so it cannot be used to violate thread safety.

There is also a way to utilize the Send-ness of &T (which is not 'static) for some Sync T, namely thread::scoped. This function does not have the 'static bound, but it instead has an RAII guard which forces a join before the borrow ends. This allows for easy fork-join parallelism without necessarily needing a Mutex. Sadly, there are problems which crop up when this interacts with Rc cycles, so the API is currently unstable and will be redesigned. This is not a problem with the language design or the design of Send/Sync, rather it is a perfect storm of small design inconsistencies in the libraries.
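
As a rough sketch of the fork-join-with-borrows idea being described — note that this uses the scoped-thread API that was stabilized much later as std::thread::scope, not the guard-based thread::scoped discussed above:

use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4];

    // The scope guarantees every spawned thread is joined before it
    // returns, so borrowing the non-'static `data` is safe.
    thread::scope(|s| {
        s.spawn(|| println!("first half:  {:?}", &data[..2]));
        s.spawn(|| println!("second half: {:?}", &data[2..]));
    });

    // The borrows ended with the scope, so `data` is still usable here.
    println!("all: {:?}", data);
}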

Discuss: HN, Reddit


  1. So much so that I added bonus slides about thread safety to the end of my deck, and of course I ended up using them at the talk I gave recently.

David Rajchenbach Teller: Re-dreaming Firefox (1): Firefox Agents

Gerv’s recent post on the Jeeves Test got me thinking of the Firefox of my dreams. So I decided to write down a few ideas on how I would like to experience the web. Today: Firefox Agents. Let me emphasise that the features described in this blog post do not exist.

Marcel uses Firefox every day, for quite a number of things.

  • He uses Firefox for fun, for watching videos and playing online games. For this purpose, he has installed a few tools for finding and downloading videos. Also, one of his main search engines is YouTube. Suggested movies? Sure, as long as they are fun.
  • He uses Firefox for social networks. He follows his friends, he searches on Facebook, or Twitter, or Google+. If anything looks fun, or useful, he’d like to be informed.
  • He uses Firefox for managing his bank accounts, his taxes, his health insurance. For this purpose, he has paranoid security settings – to avoid phishing, he can only browse to a few whitelisted websites – and no add-ons. He may be interested in getting information from these few websites, and in security updates, but that’s about it. Also, since Firefox handles all his passwords, it must itself be protected by a password.
  • He uses Firefox to read his Gmail account. And to read his other Gmail account. And he doesn’t want to leak privacy information by doing so on the same Firefox that he’s using for browsing.
  • Oh, and he may also be using Firefox for browsing websites that are sensitive for any kind of reason, whether he’s hunting for gifts for his close family, dating online, chatting with hackers, discussing politics, helping NGOs in sensitive parts of the globe, visiting BitTorrent trackers, consulting a physician through some online service, or, well, anything else that requires privacy. He’d like to perform such browsing with additional anonymity guarantees. This also means locking Firefox with a password.
  • Sometimes, his children or friends borrow his computer and use Firefox, too.

Of course, since Marcel brings his own device to (or from) work, that’s the same Firefox that he’s using for all of these tasks, and he’s probably even doing several of these tasks at the same time.

So, Marcel has a set of contradictory requirements, not to mention that each of his uses of Firefox needs to pass a distinct Jeeves Test. How do we keep him happy nevertheless?

Introducing Firefox Agents

In the rest of this post, I will be calling each of these uses of Firefox an Agent (if we ever implement this feature, it will, of course, be called Persona). Each Agent matches one way you use Firefox. While Firefox may be delivered with a predefined set of Agents, users can easily create new Agents. In the example, Marcel has his “Fun Agent”, his “Social Agent”, his “Work Agent”, etc.

Each Agent is unique:

  • Each Agent has its own icon on Marcel’s menu/desktop/tablet/phone and task list.
  • Each Agent has its own visual identity, to make sure that work-related stuff doesn’t end up accidentally in the Fun Agent.
  • Each Agent has its own set of preferences, bookmarks, remembered passwords, cookies, cache, and add-ons.
  • Each website may be connected to a given Agent, so that links received through Gmail or through Thunderbird, for instance, automatically open with the right Agent.

As a consequence, any technology that can come bundled with Firefox to, for instance, provide search suggestions or any other kind of website suggestions is tied to an Agent. For instance, Marcel’s browsing a dating site, or shopping for shoes, or having religious activities will not be visible to any of his colleagues looking above his shoulder at his Work Agent, nor will it be tied to either of Marcel’s Gmail accounts. This greatly increases the chances of suggestion technologies passing the Jeeves Test.

Agents are also connected:

  • A menu in each Agent, as well as a keyboard shortcut, lets users quickly open/switch to other Agents.
  • When an Agent follows a link to a website that belongs to another Agent, the relevant Agent opens automatically.
  • Bookmarks may be pushed, on demand, from one Agent to another one.
  • Passwords may be pulled, on demand, from one Agent to another one.

How far are we from Agents?

Technologically speaking, Firefox Agents almost exist. Indeed, Firefox has supported Profiles forever, since way before Firefox 1.0. I generally have three instances of Firefox opened at the same time (four when I’m doing web development), and it works nicely.

With a few add-ons, you can get almost everything, although not entirely connected together:

  • Profilist helps a lot with switching between profiles, and the dev version adds distinct icons;
  • Firefox Themes implement distinct appearances;
  • there are add-ons implementing whitelist browsing;
  • there are add-ons implementing password-protected Firefox.

A few features are missing, but as you can see, the list is actually quite short:

  • Pushing/pulling passwords and bookmarks between Agents (although that’s a subset of what Firefox Accounts can do).
  • Attaching specific websites to specific Agents (although this doesn’t seem too difficult to implement).
  • Connecting this all together.

What now?

I would like to browse with this Firefox. Would you?


Selena Deckelmann: Migrating to Taskcluster: work underway!

Mozilla’s build and test infrastructure has relied on Buildbot as the backbone of our systems for many years. Asking around, I heard that we started using Buildbot around 2008. The time has come for a change!

Many of the people working on migrating from Buildbot to Taskcluster gathered all together for the first time to talk about migration this morning. (A recording of the meeting is available)

The goal of this work is to shut down Buildbot and identify a timeline. Our first goal post is to eliminate the Buildbot Scheduler by moving build production entirely into TaskCluster, and scheduling tests in TaskCluster.

Today, most FirefoxOS builds and tests are in Taskcluster. Nearly everything else for Firefox is driven by Buildbot.

Our current tracker bug is ‘Buildbot -> TaskCluster transition‘. At a high level, the big projects underway are:

We have quite a few things to figure out in the Windows and Mac OS X realm where we’re interacting with hardware, and some work is left to be done to support Windows in AWS. We’re planning to get more clarity on the work that needs to be done there next week.

The bugs identified seem tantalizingly close to describing most of the issues that remain in porting our builds. The plan is to have a timeline documented for builds to be fully migrated over by Whistler! We are also working on migrating tests, but for now believe the Buildbot Bridge will help us get tests out of the Buildbot scheduler, even if we continue to need Buildbot masters for a while. An interesting idea about using runner to manage hardware instead of the masters was raised during the meeting that we’ll be exploring further.

If you’re interested in learning more about TaskCluster and how to use it, Chris Cooper is running a training on Monday June 1 at 1:30pm PT.

Ping me on IRC, Twitter or email if you have questions!

David Humphrey: Messing with MessageChannel

We're getting close to being able to ship a beta release of our work porting Brackets to the browser. I'll spend a bunch of time blogging about it when we do, and detail some of the interesting problems we solved along the way. Today I wanted to talk about a patch I wrote this week and what I learned in the process, specifically, using MessageChannel for cross-origin data sharing.

Brackets needs a POSIX filesystem, which is why we spent so much time on filer.js, which is exactly that. Filer stores filesystem nodes and data blocks in IndexedDB (or WebSQL on older browsers). Since this means that filesystem data is stored per-origin, and shared across tabs/windows, we have to be careful when building an app that lets a user write arbitrary HTML, CSS, and JavaScript that is then run in the page (did I mention we've built a complete web server and browser on top of filer.js, because it's awesome!).

Our situation isn't that unique: we want to allow potentially dangerous script from the user to get published using our web app; but we need isolation between the web app and the code editor and "browser" that's rendering the content in the editor and filesystem. We do this by isolating the hosting web app from the editor/browser portion using an iframe and separate origins.

Which leads me back to the problem of cross-origin data sharing and MessageChannel. We need access to the filesystem data in the hosting app, so that a logged in user can publish their code to a server. Since the hosted app and the editor iframe run on different origins, we have to somehow allow one to access the data in the other.

Our current solution (we're still testing, but so far it looks good) is to put the filesystem (i.e., IndexedDB database) in the hosting app, and use a MessageChannel to proxy calls to the filesystem from the editor iframe. This is fairly straightforward, since all filesystem operations were already async.

Before this week, I'd only read about MessageChannel, but never really played with it. I found it mostly easy to use, but with a few gotchas. At first glance it looks a lot like postMessage between windows. What's different is that you don't have to validate origins on every call. Instead, a MessageChannel exposes two MessagePort objects: one is held onto by the initiating script; the other is transferred to the remote script.

I think this initial "handshake" is one of the harder things to get your head around when you begin using this approach. To start using a MessageChannel, you first have to do a regular postMessage in order to get the second MessagePort over to the remote script. Furthermore, you need to do it using the often overlooked third argument to postMessage, which lets you include Transferable objects. These objects get transferred (i.e., their ownership switches to the remote execution context).

In code you're doing something like this:

/**
 * In the hosting app's js
 */
var channel = new MessageChannel();  
var port = channel.port1;

...

// Wait until the iframe is loaded, via event or some postMessage
// setup, then post to the iframe, indicating that you're
// passing (i.e., transferring) the second port over which
// future communication will happen.
iframe.contentWindow.postMessage("here's your port...",  
                                 "*",
                                 [channel.port2]);

// Now wire the "local" port so we can get events from the iframe
function onMessage(e) {  
  var data = e.data;
  // do something with data passed by remote
}
port.addEventListener("message", onMessage, false);

// And, since we used addEventListener vs. onmessage, call start()
// see https://developer.mozilla.org/en-US/docs/Web/API/MessagePort/start
port.start();  

...

// Send some data to the remote end.  
var data = {...};  
port.postMessage(data);  

I'm using a window and iframe, but you could also use a worker (or your iframe could pass along to its worker, etc). On the other end, you do something like this:

/**
 * In the remote iframe's js
 */

var port;

// Let the remote side know we're ready to receive the port
parent.postMessage("send me the port, please", "*");

// Wait for a response, then wire the port for `message` events
function receivePort(e) {  
  removeEventListener("message", receivePort, true);

  if(e.data === "here's your port...") {
    port = e.ports[0];

    function onMessage(e) {
      var data = e.data;
      // do something with data passed by remote
    }

    port.addEventListener("message", onMessage, false);
    // Make sure you call start() if you use addEventListener
    port.start();
  }
}
addEventListener("message", receivePort, true);

...

// Send some data to the other end
var data = {...};  
port.postMessage(data);  

Simple, right? It's mostly that easy, but here's the fine print:

  • It works today in every modern browser except IE 9 and Firefox, where it's awaiting final review and behind a feature pref. I ended up using a slightly modified version of MessageChannel.js as a polyfill. (We need this to land in Mozilla!)
  • You have to be careful with event handling on the ports, since using addEventListener requires an explicit call to start which onmessage doesn't. It's documented, but I know I wasted too much time on that one, so be warned.
  • You can safely pass all manner of data across the channel, except for things like functions, and you can use Transferables once again, for things that you want to ship wholesale across to the remote side.
  • Trying to transfer an ArrayBuffer via postMessage doesn't work right now in Blink.

I was extremely pleased to find that I could adapt our filesystem in roughly a day to work across origins, without losing a ton of performance. I'd highly recommend looking at MessageChannels when you have a similar problem to solve.

Gregory Szorc: Important Changes to MozReview

This was a busy week for MozReview! There are a number of changes people need to be aware of.

Support for Specifying Reviewers via Commit Messages

MozReview will now parse r?gps syntax out of commit messages to set reviewers for pushed commits.

Read the docs for more information, including why we are preferring r? to r=.

Since it landed, a number of workflow concerns have been reported. See the bug tree for bug 1142251 before filing a bug to help avoid duplicates.

Thank Dan Minor for the feature!

Review Attachment/Flag Per Commit

Since the beginning of MozReview, there was one Bugzilla attachment / review flag per commit series. This has changed to one attachment / review flag per commit.

Before, you needed to grant Ship It on the parent/root review request in order to r+ the MozReview review request. Now, you grant Ship It on individual commits and these turn into individual r+ on Bugzilla. To reinforce that reviewing the parent/root review request doesn't do anything meaningful any more, the Ship It button and checkbox have been removed from the parent/root review request.

The new model more closely maps to how code review in Bugzilla has worked at Mozilla for ages. And it is a better foundation for the future workflows we're trying to enable.

We tried to run a one-time migration script to convert existing parent/root attachments/review flags to per-commit attachments/flags. However, there were issues. We will attempt again sometime next week. In the interim, in-flight review requests may enter an inconsistent state if they are updated. If a new push is performed, the old parent/root attachment/review flag may linger and per-commit attachments/flags will be created. This could be confusing. The workaround is to manually clear the r? flag from the parent/root attachment or wait for the migration script to run in a few days.

Mark Côté put in a lot of hard work to make this change happen.

r? Flags Cleared After Review

Before, submitting a review without granting Ship It wouldn't do anything to the r? flag: the r? flag would linger.

Now, submitting review without granting Ship It will clear the r? flag. We think the new default is better for the majority of users. However, we recognize it isn't always wanted. There is a bug open to provide a checkbox to keep the r? flag present.

Metadata Added to Changesets

If you update to the tip of the version-control-tools repository (you should do this every week or so to stay fresh - use mach mercurial-setup to do this automatically), metadata will automatically be added to commits when working with MozReview-enabled repositories.

Specifically, we insert a hidden, unique, random ID into every changeset. This ID can be used to map commits to each other. We don't use this ID yet. But we have plans.

A side-effect of this change is that you can no longer push to MozReview if you have uncommitted local changes. If this is annoying, use hg shelve and hg unshelve to create and undo temporary commits. If this is too annoying, complain by filing a bug and we can look into doing this automatically as part of pushing.

What's Next?

We're actively working on more workflow enhancements to make MozReview an even more compelling experience.

I'm building a web service to query file metadata from moz.build files. This will allow MozReview to automatically file bugs in proper components based on what files changed. Once code reviewer metadata is added to moz.build files, it will enable us to assign reviewers automatically as well. The end goal here is to lower the number of steps needed to turn changed code into a landing. This will be useful when we turn GitHub pull requests into MozReview review requests (random GitHub contributors shouldn't need to know who to flag for review, nor should they be required to file a bug if they write some code). Oh yeah, we're working on integrating GitHub pull requests.

Another area of focus is better commit tracking and partially landed series. I have preliminary patches for automatically adding review URL annotations to commit messages. This will enable people to easily go from commit (message) to MozReview, without having to jump through Bugzilla. This also enables better commit tracking. If you e.g. perform complicated history rewriting, the review URL annotation will enable the MozReview server to better map previously-submitted commits to existing review requests. Partially landed series will enable commits to land as soon as they are reviewed, without having to wait on the entire series. It's my strong belief that if a commit is granted review, it should land as soon as possible. This helps prevent bit rot of ready-to-land commits. Landings also make people feel better because you feel like you've accomplished something. Positive feedback loops are good.

Major work is also being done to overhaul the web UI. The commit series view (which is currently populated via XHR) will soon be generated on the server and served as part of the page. This should make pages load a bit faster. And, things should be prettier as well.

And, of course, work is being invested into Autoland. Support for submitting pushes to Try landed a few weeks ago. Having Autoland actually land reviewed commits is on the radar.

Exciting times are ahead. Please continue to provide feedback. If you see something, say something.

Niko Matsakis: Virtual Structs Part 2: Classes strike back

This is the second post summarizing my current thoughts about ideas related to “virtual structs”. In the last post, I described how, when coding C++, I find myself missing Rust’s enum type. In this post, I want to turn it around. I’m going to describe why the class model can be great, and something that’s actually kind of missing from Rust. In the next post, I’ll talk about how I think we can get the best of both worlds for Rust. As in the first post, I’m focusing here primarily on the data layout side of the equation; I’ll discuss virtual dispatch afterwards.

(Very) brief recap

In the previous post, I described how one can setup a class hierarchy in C++ (or Java, Scala, etc) with a base class and one subclass for every variant:

class Error { ... };
class FileNotFound : public Error { ... };
class UnexpectedChar : public Error { ... };

This winds up being very similar to a Rust enum:

enum ErrorCode {
    FileNotFound,
    UnexpectedChar
}

However, there are some important differences. Chief among them is that the Rust enum has a size equal to the size of its largest variant, which means that Rust enums can be passed “by value” rather than using a box. This winds up being absolutely crucial to Rust: it’s what allows us to use Option<&T>, for example, as a zero-cost nullable pointer. It’s what allows us to make arrays of enums (rather than arrays of boxed enums). It’s what allows us to overwrite one enum value with another, e.g. to change from None to Some(_). And so forth.
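
A small sketch that makes these layout claims observable with std::mem::size_of (the Payload enum is a made-up example):

use std::mem::size_of;

#[allow(dead_code)]
enum ErrorCode {
    FileNotFound,
    UnexpectedChar,
}

// One variant with a large payload drags the whole enum's size up.
#[allow(dead_code)]
enum Payload {
    Small(u8),
    Big([u64; 8]),
}

fn main() {
    // Option<&T> is pointer-sized: the null case is reused for None,
    // so the "nullable pointer" costs nothing extra.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    // An enum is as big as its largest variant (plus tag and padding).
    println!("ErrorCode: {} bytes", size_of::<ErrorCode>());
    println!("Payload:   {} bytes", size_of::<Payload>());
}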

Problem #1: Memory bloat

There are a lot of use cases, however, where having a size equal to the largest variant is actually a handicap. Consider, for example, the way the rustc compiler represents Rust types (this is actually a cleaned up and simplified version of the real thing).

The type Ty represents a rust type:

// 'tcx is the lifetime of the arena in which we allocate type information
type Ty<'tcx> = &'tcx TypeStructure<'tcx>;

As you can see, it is in fact a reference to a TypeStructure (this is called sty in the Rust compiler, which isn’t completely up to date with modern Rust conventions). The lifetime 'tcx here represents the lifetime of the arena in which we allocate all of our type information. So when you see a type like &'tcx, it represents interned information allocated in an arena. (As an aside, we added the arena back before we even had lifetimes at all, and used to use unsafe pointers here. The fact that we use proper lifetimes here is thanks to the awesome eddyb and his super duper safe-ty branch. What a guy.)

So, here is the first observation: in practice, we are already boxing all the instances of TypeStructure (you may recall that the fact that classes forced us to box was a downside before). We have to, because types are recursively structured. In this case, the ‘box’ is an arena allocation, but still the point remains that we always pass types by reference. And, moreover, once we create a Ty, it is immutable – we never switch a type from one variant to another.

The actual TypeStructure enum is defined something like this:

enum TypeStructure<'tcx> {
    Bool,                                      // bool
    Reference(Region, Mutability, Type<'tcx>), // &'x T, &'x mut T
    Struct(DefId, &'tcx Substs<'tcx>),         // Foo<..>
    Enum(DefId, &'tcx Substs<'tcx>),           // Foo<..>
    BareFn(&'tcx BareFnData<'tcx>),            // fn(..)
    ...
}

You can see that, in addition to the types themselves, we also intern a lot of the data in the variants themselves. For example, the BareFn variant takes a &'tcx BareFnData<'tcx>. The reason we do this is because otherwise the size of the TypeStructure type balloons very quickly. This is because some variants, like BareFn, have a lot of associated data (e.g., the ABI, the types of all the arguments, etc). In contrast, types like structs or references have relatively little associated data. Nonetheless, the size of the TypeStructure type is determined by the largest variant, so it doesn’t matter if all the variants are small but one: the enum is still large. To fix this, Huon spent quite a bit of time analyzing the size of each variant and introducing indirection and interning to bring it down.

Consider what would have happened if we had used classes instead. In that case, the type structure might look like:

typedef TypeStructure *Ty;
class TypeStructure { .. };
class Bool : public TypeStructure { .. };
class Reference : public TypeStructure { .. };
class Struct : public TypeStructure { .. };
class Enum : public TypeStructure { .. };
class BareFn : public TypeStructure { .. };

In this case, whenever we allocated a Reference from the arena, we would allocate precisely the amount of memory that a Reference needs. Similarly, if we allocated a BareFn type, we’d use more memory for that particular instance, but it wouldn’t affect the other kinds of types. Nice.

Problem #2: Common fields

The definition for Ty that I gave in the previous section was actually somewhat simplified compared to what we really do in rustc. The actual definition looks more like:

// 'tcx is the lifetime of the arena in which we allocate type information
type Ty<'tcx> = &'tcx TypeData<'tcx>;

struct TypeData<'tcx> {
    id: u32,
    flags: u32,
    ...,
    structure: TypeStructure<'tcx>,
}

As you can see, Ty is in fact a reference not to a TypeStructure directly but to a struct wrapper, TypeData. This wrapper defines a few fields that are common to all types, such as a unique integer id and a set of flags. We could put those fields into the variants of TypeStructure, but it’d be repetitive, annoying, and inefficient.

Nonetheless, introducing this wrapper struct feels a bit indirect. If we are using classes, it would be natural for these fields to live on the base class:

typedef TypeStructure *Ty;
class TypeStructure {
    unsigned id;
    unsigned flags;
    ...
};
class Bool : public TypeStructure { .. };
class Reference : public TypeStructure { .. };
class Struct : public TypeStructure { .. };
class Enum : public TypeStructure { .. };
class BareFn : public TypeStructure { .. };

In fact, we could go further. There are many variants that share common bits of data. For example, structs and enums are both just a kind of nominal type (“named” type). Almost always, in fact, we wish to treat them the same. So we could refine the hierarchy a bit to reflect this:

class Nominal : public TypeStructure {
    DefId def_id;
    Substs substs;
};
class Struct : public Nominal {
};
class Enum : public Nominal {
};

Now code that wants to work uniformly on either a struct or enum could just take a Nominal*.

Note that while it’s relatively easy in Rust to handle the case where all variants have common fields, it’s a lot more awkward to handle a case like Struct or Enum, where only some of the variants have common fields.
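
As a rough sketch of what that awkwardness looks like in today's Rust, with simplified, hypothetical stand-ins for the compiler's types: the shared data has to be factored into its own struct and then re-extracted by matching, rather than being reachable through a Nominal-typed reference.

#[derive(Debug)]
#[allow(dead_code)]
struct NominalData {
    def_id: u32, // simplified stand-in for DefId, Substs, etc.
}

#[allow(dead_code)]
enum TypeStructure {
    Bool,
    Struct(NominalData),
    Enum(NominalData),
}

// Code that wants "either a struct or an enum" has to match and pull
// out the shared part by hand.
fn nominal_data(ty: &TypeStructure) -> Option<&NominalData> {
    match *ty {
        TypeStructure::Struct(ref n) | TypeStructure::Enum(ref n) => Some(n),
        TypeStructure::Bool => None,
    }
}

fn main() {
    let t = TypeStructure::Struct(NominalData { def_id: 7 });
    println!("{:?}", nominal_data(&t));
}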

Problem #3: Initialization of common fields

Rust differs from purely OO languages in that it does not have special constructors. An instance of a struct in Rust is constructed by supplying values for all of its fields. One great thing about this approach is that “partially initialized” struct instances are never exposed. However, the Rust approach has a downside, particularly when we consider code where you have lots of variants with common fields: there is no way to write a fn that initializes only the common fields.
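
For contrast, a sketch of the closest thing available today: with the wrapper-struct layout from Problem #2, a plain function can initialize just the common fields, taking the variant-specific part as an argument. This does not help when the common fields live on the variants themselves, which is the case being described here (field names below are simplified and hypothetical).

#[allow(dead_code)]
enum TypeStructure {
    Bool,
    BareFn { abi: u32 }, // simplified, made-up payload
}

#[allow(dead_code)]
struct TypeData {
    id: u32,
    flags: u32,
    structure: TypeStructure,
}

// Initializes only the common fields; the caller supplies the variant.
fn mk_type(id: u32, structure: TypeStructure) -> TypeData {
    TypeData { id, flags: 0, structure }
}

fn main() {
    let t = mk_type(1, TypeStructure::Bool);
    println!("id={} flags={}", t.id, t.flags);
}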

C++ and Java take a different approach to initialization based on constructors. The idea of a constructor is that you first allocate the complete structure you are going to create, and then execute a routine which fills in the fields. This approach to constructors has a lot of problems – some of which I’ll detail below – and I would not advocate for adding it to Rust. However, it does make it convenient to separately abstract over the initialization of base class fields from subclass fields:

typedef TypeStructure *Ty;
class TypeStructure {
    unsigned id;
    unsigned flags;

    TypeStructure(unsigned id, unsigned flags)
      : id(id), flags(flags)
    { }
};

class Bool : public TypeStructure {
    Bool(unsigned id)
      : TypeStructure(id, 0) // bools have no flags
    { }
};

Here, the constructor for TypeStructure initializes the TypeStructure fields, and the Bool constructor initializes the Bool fields. Imagine we were to add a field to TypeStructure that is always 0, such as some sort of counter. We could do this without changing any of the subclasses:

class TypeStructure {
    unsigned id;
    unsigned flags;
    unsigned counter; // new

    TypeStructure(unsigned id, unsigned flags)
      : id(id), flags(flags), counter(0)
    { }
};

If you have a lot of variants, being able to extract the common initialization code into a function of some kind is pretty important.

Now, I promised a critique of constructors, so here we go. The biggest reason we do not have them in Rust is that constructors rely on exposing a partially initialized this pointer. This raises the question of what value the fields of that this pointer have before the constructor finishes: in C++, the answer is just undefined behavior. Java at least guarantees that everything is zeroed. But since Rust lacks the idea of a “universal null” – which is an important safety guarantee! – we don’t have such a convenient option. And there are other weird things to consider: what happens if you call a virtual function during the base type constructor, for example? (The answer here again varies by language.)

So, I don’t want to add OO-style constructors to Rust, but I do want some way to pull out the initialization code for common fields into a subroutine that can be shared and reused. This is tricky.

Problem #4: Refinement types

Related to the last point, Rust currently lacks a way to “refine” the type of an enum to indicate the set of variants that it might be. It would be great to be able to say not just “this is a TypeStructure”, but also things like “this is a TypeStructure that corresponds to some nominal type (i.e., a struct or an enum), though I don’t know precisely which kind”. As you’ve probably surmised, making each variant its own type – as you would in the classes approach – gives you a simple form of refinement types for free.

To see what I mean, consider the class hierarchy we built for TypeStructure:

typedef TypeStructure *Ty;
class TypeStructure { .. };
class Bool : public TypeStructure { .. };
class Reference : public TypeStructure { .. };
class Nominal : public TypeStructure { .. }
class Struct : public Nominal { .. };
class Enum : public Nominal { .. };
class BareFn : public TypeStructure { .. };

Now, I can pass around a TypeStructure* to indicate “any sort of type”, or a Nominal* to indicate “a struct or an enum”, or a BareFn* to mean “a bare fn type”, and so forth.

If we limit ourselves to single inheritance, that means one can construct an arbitrary tree of refinements. Certainly one can imagine wanting arbitrary refinements, though in my own investigations I have always found a tree to be sufficient. In C++ and Scala, of course, one can use multiple inheritance to create arbitrary refinements, and I think one can imagine doing something similar in Rust with traits.

As an aside, the right way to handle ‘datasort refinements’ has been a topic of discussion in Rust for some time; I’ve posted a different proposal in the past, and, somewhat amusingly, my very first post on this blog was on this topic as well. I personally find that building on a variant hierarchy, as above, is a very appealing solution to this problem, because it avoids introducing a “new concept” for refinements: it just leverages the same structure that is giving you common fields and letting you control layout.

Conclusion

So we’ve seen that there are also advantages to the approach of using subclasses to model variants. I showed this using the TypeStructure example, but there are lots of cases where this arises. In the compiler alone, I would say that the abstract syntax tree, the borrow checker’s LoanPath, the memory categorization cmt types, and probably a bunch of other cases would benefit from a more class-like approach. Servo developers have long been requesting something more class-like for use in the DOM. I feel quite confident that there are many other crates at large that could similarly benefit.

Interestingly, Rust can gain a lot of the benefits of the subclass approach—namely, common fields and refinement types—just by making enum variants into types. There have been proposals along these lines before, and I think that’s an important ingredient for the final plan.

Perhaps the biggest difference between the two approaches is the size of the “base type”. That is, in Rust’s current enum model, the base type (TypeStructure) is the size of the maximal variant. In the subclass model, the base class has an indeterminate size, and so must be referenced by pointer. Neither of these are an “expressiveness” distinction—we’ve seen that you can model anything in either approach. But it has a big effect on how easy it is to write code.

One interesting question is whether we can concisely state conditions in which one would prefer to have “precise variant sizes” (class-like) vs “largest variant” (enum). I think the “precise sizes” approach is better when the following apply:

  1. A recursive type (like a tree), which tends to force boxing anyhow. Examples: the AST or types in the compiler, DOM in servo, a GUI.
  2. Instances never change what variant they are.
  3. Potentially wide variance in the sizes of the variants.

The fact that this is really a kind of efficiency tuning is an important insight. Hopefully our final design can make it relatively easy to change between the ‘maximal size’ and the ‘unknown size’ variants, since it may not be obvious from the get go which is better.

Preview of the next post

The next post will describe a scheme in which we could wed together enums and structs, gaining the advantages of both. I don’t plan to touch virtual dispatch yet, but instead just keep focusing on concrete types.

Adam Munter“The” Problem With “The” Perimeter

“It’s secure, we transmit it over SSL to a cluster behind a firewall in a restricted vlan.”
Protontorpedo
“But my PCI QSA said I had to do it this way to be compliant.”

This study by Gemalto presents some interesting survey results about perceptions of security perimeters: 64% of IT decision makers are looking to increase spend on perimeter security within the next six months, and a third of those polled believe that unauthorized users have access to their information assets. It also revealed that only 8% of data affected by breaches was protected by encryption.

The perimeter is dead, long live the perimeter! The Jericho Forum started discussing “de-perimeterization” in 2003. If you hung out with pentesters, you already knew the concept of ‘perimeter’ was built on shaky foundations. The growth of mobile, web APIs, and the Internet of Things has only served to drive the point home. Yet, there is an entire industry of VC-funded product companies and armies of consultants who are still operating from the mental model of there being “a perimeter.”[0]

In discussion about “the perimeter,” it’s not the concept of “perimeter” that is most problematic, it’s the word “the.”

There is not only “a” perimeter; there are many possible logical perimeters, depending on the viewpoint of the actor you are considering. There are an unquantifiable number of theoretical overlaid perimeters based on the perspective of the actors you’re considering and their motivation, time, and resources: what they can interact with or observe; what each of those components can interact with, including humans and their processes, automated data processing systems, and authentication and authorization systems; all the software, libraries, and hardware dependencies going down to the firmware; the interactions between different systems that might interpret the same data to mean different things; and all execution paths, known and unknown.

The best CSOs know they are working on a problem that has no solution endpoint, and that thinking so isn’t even the right mindset or model. They know they are living in a world of resource scarcity, facing a problem of potentially unlimited size, so they start with asset classification, threat modeling[1] and inventorying. Without that it’s impossible to even have a rough idea of the shape and size of the problem. They know that their actual perimeter isn’t what’s drawn inside an arbitrary theoretical border in a diagram. It’s based on the attackable surface area seen by a potential attacker, the value of the resource to the attacker, and the many possible paths that could be taken to reach it in a way that is useful to the attacker – not some imaginary mental model of logical border control.

You’ve deployed anti-malware and anti-APT products, network and web app firewalls, endpoint protection and database encryption. Fully PCI compliant! All useful when applied with knowledge of what you’re protecting, how, from whom, and why. But if you don’t consider what you’re protecting and from whom as you design and build systems, not so useful. Imagine the following scenario: all of these perimeter protection technologies allow SSL traffic through port 443 to your webserver’s REST API listeners. The listening application has permission to access the encrypted database to read or modify data. And when the attacker finds a logic vulnerability that lets them access data which their user id should not be able to see, it looks like normal application traffic to your IDS/IPS and web app firewall. As requested, the application uses its credentials to retrieve decrypted data and present it to the user.

Footnotes

0. I’m already skeptical about the usefulness of studies that aggregate data in this way. N percent of respondents think that y% is the correct amount to spend on security technology categories A, B, C. Who cares? The increasing year-over-year numbers of attacks are the result of the distribution of knowledge during the time surveyed, and in any event these numbers aggregate a huge variety of industries, business histories, risk tolerances, and other tastes and preferences.
1. Threat modeling doesn’t mean technical decomposition to identify possible attacks – that’s attack modeling, though the two are often confused, even in many books and articles. The “threat” is “customer data exposed to unauthorized individuals.” The business “risk” is “data exposure would lead to headline risk (bad press) and loss of data worth approximately $N.” The technical risk is “the application was built using inline SQL queries and is vulnerable to SQL injection,” and “the database is encrypted but the application’s credentials let it retrieve cleartext data,” and probably a bunch of other things.


Filed under: infosec, web security Tagged: infosec, Mozilla, perimeter, webappsec

Gregory SzorcFaster Cloning from hg.mozilla.org With Server Provided Bundles

When you type hg clone, the Mercurial server will create a bundle from repository content at the time of the request and stream it to the client. (Git works essentially the same way.)

This approach usually just works. But there are some downsides, particularly with large repositories.

Creating bundles for large repositories is not cheap. For mozilla-central, Firefox's main repository, it takes ~280s of CPU time on my 2014 MacBook Pro to generate a bundle. Every time a client runs a hg clone https://hg.mozilla.org/mozilla-central, a server somewhere is spinning a CPU core generating ~1.1 GB of data. What's more, if another clone arrives at the same time, another process will perform the exact same work! When we talk about multiple minutes of CPU time per request, this extra work starts to add up.

Another problem with large repositories is interrupted downloads. If you suffer a connectivity blip during your clone command, you'll have to start from scratch. This potentially means re-transferring hundreds of megabytes from the server. It also means the server has to generate a new bundle, consuming even more CPU time. This is not good for the user or the server.

There have been multiple outages of hg.mozilla.org as a result of the service being flooded with clone requests to large repositories. Dozens of clients (most of them in Firefox or Firefox OS release automation) have cloned the same repository around the same time and overwhelmed network bandwidth in the data center or CPU cores on the Mercurial servers.

A common solution to this problem is to not use the clone command to receive initial repository data from the server. Instead, a static bundle file will be generated and made available to clients. Clients will call hg init to create an empty repository then will perform an hg unbundle to apply the contents of a pre-generated bundle file. They will then run hg pull to fetch new data that was created after the bundle was generated. (It's worth noting that Git's clone --reference option is similar.)

This is a good technical solution. Firefox and Firefox OS release automation have effectively implemented this. However, it is a lot of work: you have to build your own bundle generation and hosting infrastructure and you have to remember that every hg clone should probably be using bundles instead. It is extra complexity and complexity that must be undertaken by every client. If a client forgets, the consequences can be disastrous (clone flooding leading to service outage). Client-side opt-in is prone to lapses and doesn't scale.

As of today, we've deployed a more scalable, server-based solution to hg.mozilla.org.

hg.mozilla.org is now itself generating bundles for a handful of repositories, including mozilla-central, inbound, fx-team, and mozharness. These bundles are being uploaded to Amazon S3. And those bundles are being advertised by the server over Mercurial's wire protocol.

When you install the bundleclone Mercurial extension, hg clone is taught to look for bundles being advertised on the server. If a bundle is available, the bundle is downloaded, applied, and then the client does the equivalent of an hg pull to fetch all new data since when the bundle was generated. If a bundle exists, it is used transparently: no client side cooperation is needed beyond installing the bundleclone extension. If a bundle doesn't exist, it simply falls back to Mercurial's default behavior. This effectively shifts responsibility for doing efficient clones from clients to server operators, which means server operators don't need cooperation from clients to enact important service changes. Before, if clients weren't using bundles, we'd have to wait for clients to update their code. Now, we can see a repository is being cloned heavily and start generating bundles for it without having to wait for the client to deploy new code.

Furthermore, we've built primitive content negotiation into the process. The server doesn't simply advertise one bundle file: it advertises several bundle files. We offer gzip, bzip2, and stream bundles. gzip is what Mercurial uses by default. It works OK. bzip2 bundles are smaller, but they take longer to process. stream bundles are essentially tar archives of the .hg/store directory and are larger than gzip bundles, but insanely fast because there is very little CPU required to apply them. In addition, we advertise URLs for multiple S3 regions, currently us-west-2 (Oregon) and us-east-1 (Virginia). This enables clients to prefer the bundle most appropriate for them.

A benefit of serving bundles from S3 is that Firefox and Firefox OS release automation (the biggest consumers of hg.mozilla.org) live in Amazon EC2. They are able to fetch from S3 over a gigabit network. And, since we're transferring data within the same AWS region, there are no data transfer costs. Previously, we were transferring ~1.1 GB from a Mozilla data center to EC2 for each clone. This took up bandwidth in Mozilla's network and cost Mozilla money to send data thousands of miles away. And, we never came close to saturating a gigabit network (we do with stream bundles). Wins everywhere!

The full instructions detail how to use bundleclone. I recommend everyone at Mozilla install the extension because there should be no downside to doing it.

Once bundleclone is deployed to Firefox and Firefox OS release automation, we should hopefully never again see those machines bring down hg.mozilla.org due to a flood of clone requests. We should also see a drastic reduction in load to hg.mozilla.org. I'm optimistic bandwidth will decrease by over 50%!

It's worth noting that the functionality from the bundleclone extension is coming to vanilla Mercurial. The functionality (which was initially added by Mozilla's Mike Hommey) is part of Mercurial's bundle2 protocol, which is available, but isn't enabled by default yet. bundleclone is thus a temporary solution to bring us server stability and client improvements until modern Mercurial versions are deployed everywhere in a few months time.

Finally, I would like to credit Augie Fackler for the original idea for server-assisted bundle-based clones.

Mozilla Open Policy & Advocacy BlogCopyright reform in the European Union

The European Union is considering broad reform of copyright regulations as part of a “Digital Single Market” reform agenda. Review of the current Copyright Directive, passed in 2001, began with a report by MEP Julia Reda. The European Parliament will vote on that report and a number of amendments this summer, and the process will continue with a legislative proposal from the European Commission in the autumn. Over the next few months we plan to add our voice to this debate; in some cases supporting existing ideas, in other cases raising new issues.

This post lays out some of the improvements we’d like to see in the EU’s copyright regime – to preserve and protect the Web, and to better advance the innovation and competition principles of the Mozilla Manifesto. Most of the objectives we identify are actively being discussed today as part of copyright reform. Our advocacy is intended to highlight these, and characterize positions on them. We also offer a proposed exception for interoperability to push the conversation in a slightly new direction. We believe an explicit exception for interoperability would directly advance the goal of promoting innovation and competition through copyright law.

Promoting innovation and competition

“The effectiveness of the Internet as a public resource depends upon interoperability (protocols, data formats, content), innovation and decentralized participation worldwide.” – Mozilla Manifesto Principle #6

Clarity, consistency, and new exceptions are needed to ensure that Europe’s new copyright system encourages innovation and competition instead of stifling it. If new and creative uses of copyrighted content can be shut down unconditionally, innovation suffers. If copyright is used to unduly restrict new businesses from adding value to existing data or software, competition suffers.

Open norm: Implement a new, general exception to copyright allowing actions which pass the 3-step test of the Berne Convention. That test says that any exception to copyright must be a special case, that it should not conflict with a normal exploitation of the work, and it should not unreasonably prejudice the legitimate interests of the author. The idea of an “open norm” is to capture a natural balance for innovation and competition, allowing the copyright holder to retain normal exclusionary rights but not exceptional restrictive capabilities with regards to potential future innovative or competing technologies.

Quotation: Expand existing protections for text quotations to all media and a wider range of uses. An exception of this type is fundamental not only for free expression and democratic dialogue, but also to promote innovation when the quoter is adding value through technology (such as a website which displays and combines excerpts of other pages to meet a new user need).

Interoperability: An exception for acts necessary to enable ongoing interoperability with an existing computer program, protocol, or data format. This would directly enable competition and technology innovation. Such interoperation is also necessary for full accessibility for the disabled (who are often not appropriately catered for in standard programs), and to allow citizens to benefit fully from other exceptions to copyright.

Not breaking the Internet

“The Internet is a global public resource that must remain open and accessible.” – Mozilla Manifesto Principle #2

The Internet has numerous technical and policy features which have combined, sometimes by happy coincidence, to make it what it is today. Clear legislation to preserve and protect these core capabilities would be a powerful assurance, and avoid creating chilling risk and uncertainty.

Hyperlinking: hyperlinking should not be considered as any form of “communication to a public”. A recent Court of Justice of the EU ruling stated that hyperlinking was generally legal, as it does not consist of communication to a “new public.” A stronger and more common-sense rule would be a legislative determination that linking, in and of itself, does not constitute communicating the linked content to a public under copyright law. The acts of communicating and making content available are done by the person who placed the content on the target server, not by those making links to content.

Robust protections for intermediaries: a requirement for due legal process before intermediaries are compelled to take down content. While it makes sense for content hosters to be asked to remove copyright-infringing material within their control, a mandatory requirement to do so should not be triggered by mere assertion, but only after appropriate legal process. The existing waiver for liability for intermediaries should thus be strengthened with an improved definition of “actual knowledge” that requires such process, and (relatedly) to allow minor, reasonable modifications to data (e.g. for network management) without loss of protection.

We look forward to working with European policymakers to build consensus on the best ways to protect and promote innovation and competition on the Internet.

Chris Riley
Gervase Markham
Jochai Ben-Avie

The Servo BlogThis Week In Servo 33

In the past two weeks, we merged 73 pull requests.

We have a new member on our team. Please welcome Emily Dunham! Emily will be the DevOps engineer for both Servo and Rust. She has a post about her ideas regarding open infrastructure which is worth reading.

Josh discussed Servo and Rust at a programming talk show hosted by Alexander Putilin.

We have an impending upgrade of the SpiderMonkey JavaScript engine by Michael Wu. This moves us from a very old SpiderMonkey to a recent-ish one. Naturally, the team is quite excited about the prospect of getting rid of all the old bugs and getting shiny new ones in their place.

Notable additions

New contributors

Screenshots

Hebrew Wikipedia in servo-shell

This shows off the CSS direction property. RTL text still needs some work

Meetings

Minutes

  • We discussed forking or committing to maintaining the extensions we need in glutin. Glutin is trying to stay focused and “not become a toolkit”, but there are some changes we need in it for embedding. Currently we have some changes on our fork; but we’d prefer to not use tweaked forks for community-maintained dependencies and were exploring the possibilities.
  • Mike is back and working on more embedding!
  • There was some planning for the Rust-in-Gecko sessions at Whistler
  • RTL is coming!

Monica ChewTracking Protection for Firefox at Web 2.0 Security and Privacy 2015

Edited to add: I wrote a followup post to address comments here and elsewhere that advertising is working as intended. This paper has been reported incorrectly in several places as being about cookie blocking. Tracking protection blocks all traffic, not just cookies.

My paper with Georgios Kontaxis got best paper award at the Web 2.0 Security and Privacy workshop today! Georgios re-ran the performance evaluations on top news sites and the decrease in page load time with tracking protection enabled is even higher (44%!) than in our Air Mozilla talk last August, due to prevalence of embedded third party content on news sites. You can read the paper here.

This paper is the last artifact of my work at Mozilla, since I left employment there at the beginning of April. I believe that Mozilla can make progress in privacy, but leadership needs to recognize that current advertising practices that enable "free" content are in direct conflict with security, privacy, stability, and performance concerns -- and that Firefox is first and foremost a user-agent, not an industry-agent.

Advertising does not make content free. It merely externalizes the costs in a way that incentivizes malicious or incompetent players to build things like Superfish, infect 1 in 20 machines with ad injection malware, and create sites that require unsafe plugins and take twice as many resources to load, quite expensive in terms of bandwidth, power, and stability.

It will take a major force to disrupt this ecosystem and motivate alternative revenue models. I hope that Mozilla can be that force.

Monica ChewAdvertising: a sustainable utopia?

Advertising generates $50 billion annually in the US alone, but how much of that figure reflects real value? Approximately ⅓ of click traffic is fraudulent, leading to $10 billion in wasted spending annually. Counting revenue due to fraud towards the value of advertising is like counting money spent on diabetes treatments as part of the GDP -- if those figures went to zero, it would reflect a healthier ecosystem, or healthier people in the diabetes case. For people making money on advertising, it is difficult to accept that a reduction in annual revenue can mean that things are better for everyone else.

Even when ads are displayed to real people, they often create little to no value for the ad creator. According to Google, half of ads are never viewable, not even for a second. In addition, adblocking usage grew by 70% last year, and 41% of people between 18-29 use an adblocker. The advertising industry responds to these trends by making ads increasingly distracting (requiring large amounts of resources and unsafe plugins to run), collecting increasingly large amounts of data, and creating more opportunities for abuse by government agencies and other malicious actors. As Mitchell Baker put it, do we want to live in a house or a fish bowl?

There has to be a better way. Why can’t a person buy and blank out all of the ad space on sites they visit at a deep discount, since targeting machinery would no longer be relevant? Why aren’t subscriptions available as bundle deals, like in streaming video? Solutions like these are hypothetical and will remain so as long as we maintain the fiction that the current advertising revenue model is a sustainable utopia.

John O'Duinn“RelEng as a Force Multiplier” at RelEng Conf 2015

Last week, I was honored to give the closing talk at RelEng Conf 2015, here in Florence, Italy.

I’ve used this same title in previous presentations; the mindset it portrays still feels important to me. Every time I give this presentation, I am invigorated by the enthusiastic response, and I work to improve it further, so I rewrite it again. This most recent presentation at RelEngConf2015 was almost a complete rewrite; only a couple of slides remain from the original. Click on the thumbnail to get the slides.

The main focus of this talk was:
1) Release Engineers build pipelines, while developers build products. When done correctly, this pipeline makes the entire company more effective. By contrast, done incorrectly, broken pipelines will hamper a company – sometimes fatally. This different perspective and career focus is important to keep in mind when hiring to solve company infrastructure problems.

2) Release Engineers routinely talk and listen with developers and testers, typically about the current project-in-progress. That’s good – we obviously need to keep doing that. However, I believe that Release Engineers also need to spend time talking to and listening with people who have a very different perspective – people who care about the fate of the *company*, as opposed to a specific project, and have a very different longer-term perspective. Typically, these people have titles like Founder/CxO/VP but every company has different cultural leaders and uses slightly different titles, so some detective work is in order. The important point here is to talk with people who care about the fate of the company, as opposed to the fate of a specific project – and keep that perspective in mind when building a pipeline that helps *all* your customers.

3) To illustrate these points, I then went into detail on some technical, and culture change, projects to highlight their strategic importance.

As usual, it was a lively presentation with lots of active Q+A during the talk, as well as the following break-out session. Afterwards, 25 of us managed to find a great dinner (without a reservation!) in a nearby restaurant where the geek talk continued at full force for several more hours.

All in all, a wonderful day.

It was also great to meet up with catlee in person again. We both had lots to catch up on, in work and in life.

Bram Adams, Christian, Foutse, Kim and Stephany Bellomo all deserve ongoing credit for continuing to make this unusual and very educational conference come to life, as well as for curating the openness that is the hallmark of this event. As usual, I love the mix of academic researchers and industry practitioners, with talks alternating between industry and academic speakers all day long. The different perspectives, and the eagerness of everyone to have fully honest “what worked, what didn’t work” conversations with others from very different backgrounds is truly refreshing… and was very informative for everyone. I’m already looking forward to the next RelEngConf!!

[updated to fix some typos. John 28may2015]

Air MozillaParticipation at Mozilla

Participation at Mozilla The Participation Forum

Joel Maherthe orange factor – no need to retrigger this week

Last week I did another round of re-triggering for a root cause and found some root causes! This week I got an email from orange factor outlining the top 10 failures on the trees (as we do every week).

Unfortunately, as of this morning there is no work for me to do – maybe next week I can hunt.

Here is the breakdown of bugs:

  • Bug 1081925 Intermittent browser_popup_blocker.js
    • investigated last week, test is disabled by a sheriff
  • Bug 1118277 Intermittent browser_popup_blocker.js
    • investigated last week, test is disabled by a sheriff
  • Bug 1096302 Intermittent test_collapse.html
    • test is fixed!  already landed
  • Bug 1121145 Intermittent browser_panel_toggle.js
    • too old!  problem got worse on April 24th
  • Bug 1157948 DMError: Non-zero return code for command
    • too old!  most likely a harness/infra issue
  • Bug 1166041 Intermittent LeakSanitizer
    • patch is already on this bug
  • Bug 1165938 Intermittent media-source
    • disabled the test already!
  • Bug 1149955 Intermittent Win8-PGO test_shared_all.py
    • too old!
  • Bug 1160008 Intermittent testVideoDiscovery
    • too old!
  • Bug 1137757 Intermittent Linux debug mochitest-dt1 command timed out
    • harness infra, test chunk is taking too long- problem is being addressed with more chunks.

As you can see there isn’t much to do here.  Maybe next week we will have some actions we can take.  Once I have about 10 bugs investigated I will summarize the bugs, related dates, and status, etc.


Air MozillaReps weekly

Reps weekly Weekly Mozilla Reps call

Selena Deckelmannpushlog from last night, a brief look at Try

One of the mysterious and amazing parts of Mozilla’s Release Engineering infrastructure is the Try server, or just “Try”. This is how Firefox and FirefoxOS developers can push changes to Mozilla’s large build and test system, made up of about 3000 servers at any time. There are a couple amazing things about this — one is that anyone can request access to push to try, not just core developers or Mozilla employees. It is a publicly-available system and tool. Second is that a remarkable amount of control is offered over which builds to produce and which tests are run. Each Try run could consume 300+ hours of machine time if every possible build and test option is selected.

This blog post is a brain dump from a few days of noodling and a quick review of the pushlog for Try, which shows exactly the options developers are choosing for their Try runs.

To use Try, you need to include a string of configuration that looks something like this in your topmost hg commit:

try: -b do -p emulator,emulator-jb,emulator-kk,linux32_gecko,linux64_gecko,macosx64_gecko,win32_gecko -u all -t none

That’s a recommended string for B2G developers from the Sheriff best practices wiki page. If you’re interested in how this works, the code for the try syntax parser itself is here.
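To make the shape of that syntax concrete, here is a rough sketch – written in Rust purely for illustration, and emphatically not the real parser – of how a try string breaks down into flag/value pairs:

// Extract "-flag value" pairs from a commit message containing try syntax.
fn parse_try(commit_msg: &str) -> Vec<(String, String)> {
    let syntax = match commit_msg.find("try:") {
        Some(idx) => &commit_msg[idx + 4..],
        None => return Vec::new(),
    };
    let mut pairs = Vec::new();
    let mut tokens = syntax.split_whitespace().peekable();
    while let Some(tok) = tokens.next() {
        if !tok.starts_with('-') {
            continue; // stray token; the real parser is more forgiving
        }
        // A flag's value is the next token, unless that token is another flag.
        let has_value = match tokens.peek() {
            Some(next) => !next.starts_with('-'),
            None => false,
        };
        let value = if has_value {
            tokens.next().unwrap().to_string()
        } else {
            String::new()
        };
        pairs.push((tok.to_string(), value));
    }
    pairs
}

Running it on the recommended B2G string above would yield pairs like ("-b", "do"), ("-u", "all") and ("-t", "none"), with the full platform list as the value of "-p".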

You can include a Try configuration string in an empty commit, or as the last part of an existing commit message. What most developers tell me they do is have an empty commit with the Try string in it, and they remove the extra commit before merging a patch. From all the feedback I’ve read and heard, I think that’s probably what we should document on the wiki page for Try, and maybe have a secondary page with “variants” for those that want to use more advanced tooling. KISS rule seems to apply here.

If you’re a regular user of Try, you might have heard of the high scores tracker. What you might not know is that there is a JSON file behind this page and it contains quite a bit of history that’s used to generate that page. You can find it if you just replace ‘.html’ with ‘.json’.

Something about the 8-bit ambiance of this page made me think of “texts from last night”. But in reality, Try is busiest during typical Pacific Time working hours.

The high scores page also made me curious about the actual Try strings that people were using. I pulled them all out and had a look at what config options were most common.

Of the 1262 pushes documented today in that file:

  • 760 used ‘-b do’ meaning both Debug and Opt builds are made. I wonder whether this should just be the default, or whether we should have some clear recommendations about what developers should do here.
  • 366 used ‘-p all’ meaning build on 28 platforms, and produce 28 binaries. Some people might intend this, but I wonder if some other default might be more helpful.
  • 456 used ‘-u all’ meaning that all the unit tests were run.
  • 1024 used ‘-t none’ reflecting the waning use of Talos tests.

I’m still thinking about how to use this information. I have a few ideas:

  • Change the defaults for a minimal try run
  • Make some commonly-used aliases for things like “build on B2G platforms” or “tests that catch a lot of problems”
  • Create a dashboard that shows TopN try syntax strings
  • Update the parser to include more of the options documented on various wiki pages as environment variables

If you’re a regular user of Try, what would you like to see changed? Ping me in #releng, email or tweet your thoughts!

And some background on me: I’ve been working with the Release Engineering team since April 1, 2015, and most of that time so far was spent on #buildduty, a topic I’m planning to write a few blog posts about. I’m also having a look at planning for the Task Cluster migration (away from BuildBot), monitoring and developer environments for releng tooling. I’m also working on a zine to share at Whistler of what is going on when you push to Try. Finally, I stood up a bot for reporting alerts through AWS SNS and to be able to file #buildduty related bugzilla bugs.

My goal right now is to find ways of communicating about and sharing the important work that Release Engineering is doing. Some of that is creating tracker bugs and having meetings to spread knowledge. Some of that is documenting our infrastructure and drawing pictures of how our systems interact. A lot of it is listening and learning from the engineers who build and ship Firefox to the world.

Armen Zambranomozci 0.7.0 - Less network fetches - great speed improvements!

This release is not large in scope but it has many performance improvements.
The main improvement is that we now cache information where possible instead of fetching it over the network every time; the network cost was very high.
You can read more about it here: http://explique.me/cProfile

Contributions

Thanks to @adusca @parkouss @vaibhavmagarwal for their contributions on this release.

How to update

Run "pip install -U mozci" to update

Major highlights

  • Reduce drastically the number of requests by caching where possible
  • If a failed build has uploaded good files, let's use them
  • Added support for retriggering and cancelling jobs
  • Retrigger a job once with a count of N instead of triggering individually N times

Minor improvements

  • Documentation updates
  • Add waffle.io badge

All changes

You can see all changes in here:
0.6.0...0.7.0


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Laura HilligerMozilla Academy Thoughts

Most of this post will be much easier to understand if you view Mark’s recent talk at the Next Billion in London, skim or reread Mark’s posts, and the Wiki.

Thoughts on messaging

One thing that strikes me is the balance between inclusivity and having an opinion. To me, it feels off-kilter for the long-term adoptability of Mozilla Academy. We’re idealists, I know, but it can be off-putting, and if we’re thinking of the billions of Facebook users, it may hamper us.

Some specific examples:

  • “Freedom Fighters” – Vernacular in our context vs. mass context. We idealists believe we’re resisting accepted social structures that are outdated in the modern world, but I don’t think most people view the “old” world as oppressive or illegitimate.

“a person who is part of an organized group fighting against a cruel and unfair government or system”– Webster’s definition of “Freedom Fighter”

I’ve talked about structural crises before – the societal movement from agriculturally-based to industrial-based to digitally-based societies – to me the “freedom fighter” vernacular is too heavy handed, as it ignores culturally validated complacency as well as fear that is associated with structural change. People resist structural change, sometimes for hundreds of years, because they are unable to adjust as quickly as other people want them to. Or, they simply don’t want to / don’t believe it. Can we speak in a voice inclusive of these viewpoints? Is that setting the bar too high?

  • the Romans – Mark uses the Romans as an example of how powerful, effective communication can lead to strong empires. I think it’s a good analogy, but I also think there’s a note about “propaganda” lost in our messaging. Are we, as idealists, vocalizing our own propaganda? We are definitely biased…
  • the move from country to city – this analogy feels non-inclusive to me as it underscores an opinion that one way of life (city) is better than another (country), it reiterates a rejection of traditionalist ideas. While it might be true at an economic level, the personal level is the piece that speaks to people. This article on ambition and relationships in relation to human happiness articulates my internal conflict.
  • Compulsory education – In passing, Mark uses this as a societal innovation, but it would be good to consider its weaknesses (and what we’re doing about it) in the messaging. One could argue that our educational structures increasingly ignore psychological understandings of how learning works. See the unschooling movement (and next point).

Missed opportunity: Compulsory Education for the 21st Century

Mark says that the 3rd piece, digital literacy, is in his opinion the most important piece (I agree), but he stops short of vocalizing the connection to “being educated”. Your view is relative to your experiences, cultural background and social position, so technical skill acquisition can lead to the expansion of those perceptions by widening your network of influences. It’s a powerful and objective argument in educational theory, and could be simplified for mass appeal + understanding.

He touches ever so lightly on our pedagogy, but this is another place where this community is quite strong. Mozilla Academy as a beacon of learner-centric, production oriented learning is, IMHO, another strong selling point.

Messaging in short

I think the current messaging is powerful for kindred spirits, but potentially weak for mass appeal. Maybe that’s ok, intentional even, but I wanted to point it out. Obviously, as one can see from the other documentation that exists on this topic, the vision is considering the above points, so that’s just a messaging thing.

Access

Mark also spoke about looking backwards and how people will think about personal computing devices. Whether they’ll understand that they can do anything with them, or see their device the way people see televisions – this is another opportunity to weave in the educational value for society writ large. Our connections increasingly permeate our lives. Governmental and capitalist intervention is using ICTs as mechanisms of control – how can this aspect be brought in (in a balanced, inclusive way)? I think convenience plays a big role here, so when I say “mechanisms of control” I don’t mean it intentionally negatively.

Common Approaches (on Curriculum)

For years I’ve been pushing a model of remixable, customizable curriculum. I’ve worked with folks across the community to integrate this model, and it’s been somewhat successful (e.g. everyone seems to be sold on the modularity as a non-negotiable, Mozillians are trying to use our pedagogy, etc). Yesterday, Robert Friedman and I talked about the layers of Mozilla Curriculum as being:

  1. Value Proposition Layer : The learning objectives and curriculum themselves
  2. Presentation Layer : Culture & Practice of teaching with that curriculum (in specific contexts)
  3. Mechanics Layer : the platform(s) that allow for modularity and customization outside of the example contexts we provide

Those three things likely need to be presupposed with the Web Literacy Map and Open Leadership Map as well as across the board components. Potentially, there are other competency and skill “maps” that need to sit on top of these two meta frameworks (e.g. Web Literacy for Science or Open Leadership for News). What we haven’t done is establish across the board components for that modularity, but I think a working group here might make that easier. For me, it’s been difficult to issue any sort of directive on “required” components because I’ve worried about innovation being stymied once we create such a framework. The Web Literacy Map contains enough flexibility to overcome this and what we’ve been talking about around Open Leadership is getting there too (assuming you think of these things as “a work in progress”) . How can we systematize curricular components with enough flexibility?

There’s a delicate balance between just enough and too much structure, and I think that conflicts will arise if we are too strict in our common approaches with such a diverse set of people/programs/topics.

The Academy Governance Model

“Highlighting value propositions through use cases (personas) helps clarify the vision / idea, makes others excited and wanting to participate) could be the start of both the development and structural strategy.” Mark’s “aha” moment from a recent workshop

There’s a lot to be learned from formal education, especially if you look at the all-too-familiar formulaic academic areas in tandem with our audience(s). Formal universities have Colleges, Departments, Subjects. Most develop pathways and curriculum separately from the “university” as a whole (but inform one another via their bicameral governance structures), and the “all too familiar” bit is the accepted standard of what is and is not a University Department (culturally defined).

If we start to think about the specific personas who would benefit from Mozilla Academy, we might be able to remix University structures to help us facilitate both the innovative development of specific Mozilla Academy “departments” (defined and built by personas in niche areas) as well as a meta framework and governance structure for what the Academy as a whole is and how it functions.

Currently, the working groups indicate a move towards a bicameral governance structure, but the “common approaches” note seems to pull away from that (with a step towards unicameral structures?). Stepping back from the working groups defined there and my questions around common approaches, I wonder whether it would be worthwhile to think about how bicameralism might (or might not) invite radical participation.

As always, it’s an open conversation so feel free to push back or share your thoughts here in the comments, in a responsive post or on Twitter.

 

 

Byron Joneshappy bmo push day!

it appears to be “bmo push week“.  today’s push addresses some issues caused by yesterday’s push.

the following changes have been pushed to bugzilla.mozilla.org:

  • [1168900] “Can’t call method “id” on unblessed reference” when changing component for a bug
  • [1168790] “End this group of criteria” under Custom Search is broken for query.cgi
  • [1168751] Request nagger sends me ginormous emails because I have a lot of outgoing requests
  • [1168826] only include requests someone is waiting on in the manager’s “watching” emails

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

Mozilla Open Policy & Advocacy BlogIt’s the final countdown: US surveillance reform nearing the finish line

The USA FREEDOM Act was born out of the shocking revelations of overbroad government surveillance that began two years ago. In its short life (it was first introduced in 2013) the USA FREEDOM Act has gone through many iterations; along with many others we have supported this legislation in our push for broader surveillance reform. The USA FREEDOM Act, although not perfect and clearly a product of compromise, deserved our support. Yet, even after overwhelming support in the House, the Senate failed to do its job late last Friday night and failed to pass the USA FREEDOM Act (defeated by a 57-42 vote). Just three more votes were needed to move this legislation forward. The word “disappointing” doesn’t really express the proper sentiment.

Facing expiration (at midnight on May 31, 2015) of three legal authorities authorizing US surveillance programs, the Senate also declined on Friday night to advance a two-month reauthorization of these authorities (by a vote of 45-54). The strategic decision-making here is still confounding. Why wait until the night before a holiday recess when the House has already left to take these votes to the floor? Now, Senate Majority Leader Mitch McConnell – left with few options — is taking the unusual step of calling a Sunday May 31st session to try again to pass something just hours before these authorities sunset. This won’t be without several challenges.

Any paths for emergency reauthorization or any “compromises” will face filibuster threats from, among others, Senator Rand Paul, who took to the floor for 10 and a half hours last week to lambast the surveillance state and refused Majority Leader McConnell’s attempts on Friday evening to extend these expiring provisions by even one day. Senator Paul’s hardline opposition, combined with the nuances of the Senate’s arcane procedures and the presence of a number of other privacy champions in the Senate, means it will be nearly impossible to pass legislation in such a short window of time.

On the House side, Representatives passed the USA FREEDOM Act overwhelmingly. The House is scheduled to return from recess and vote no earlier than 6:30PM on June 1st – well after the authorities sunset. There may be some last minute wrangling during this recess week, but without some agreement from the House to take up an interim reauthorization prior to the May 31st expiration, it appears that the Senate must either pass the House version of USA FREEDOM or let these authorities sunset. Put another way, if the Senate doesn’t pass the House version of USA FREEDOM, there will be a gap in authorized collection by the intelligence community. Justice Department officials have indicated that the NSA has already begun shutting down the 215 programs so as to be in compliance if a sunset occurs.

So what actually happens if these provisions sunset? It is tempting to view a sunset as a privacy win, as the government would be restrained from bulk collection of our telephone records after all. While there’s truth to that, we can and should expect to see the national security hawks and the intelligence community push hard to reinstate some version of these authorities. A likely possibility is a Senate effort to pass a weaker “reform” bill, a watered down USA FREEDOM Act. For example, several senators have been vocal about their belief in the need for a data retention mandate — which we’ve spoken out against before — so if these authorities expire, we fear post-sunset legislative efforts will likely include such a requirement as a pathway to 60 votes in the Senate. Already Senate Intelligence Chair Richard Burr and Vice Chair Diane Feinstein have each floated bills to this end. We oppose both of these bills; they do not provide the reforms that we need.

However, as Harley Geiger of CDT notes in a recent post on Lawfare: “A few hours’ sunset may not make much difference to beltway insiders, but a post-sunset vote re-instating Sec. 215 can correctly be seen by voters as a huge expansion in government surveillance power after that power had gone dormant.” While the Senate may look to compromise after sunset, the House may well lean the other direction. House Judiciary Chairman Goodlatte and Ranking Member Conyers have indicated in a recent statement, “If the Senate chooses to allow these authorities to expire, they should do so knowing that sunset may be permanent.”

Yet, permanent sunset of Section 215 would not satisfy the Senate, and it would not be a panacea for civil liberties. The government could still conduct bulk collection in a few different ways:

  1. The government could use the FISA pen register/trap and trace statute (PR/TT), for example, which the government has previously used to conduct bulk collection of email metadata (the program allegedly ended in 2011 due to a lack of resources and demonstrated efficacy). It’s worth noting that while bulk collection under PR/TT is curtailed in the USA FREEDOM Act, this statute doesn’t otherwise have a sunset.
  2. Through other legal pathways — the vast majority of mass surveillance occurs under other authorities such as Executive Order 12333 and Section 702 of the FISA Amendments Act. While the USA FREEDOM Act doesn’t address these authorities either, we believe passing this legislation will serve as a better foundation for further reform than a sunset.
  3. Through grandfathering — Section 215’s sunset provision has a clause saying that investigations that were begun before the sunset occurred may continue. The FBI doesn’t put a time limit on its investigations, so whatever broad investigation they’ve been using to justify bulk collection could foreseeably continue indefinitely. While the recent 2nd Circuit Court of Appeals opinion might mitigate this a bit, it’s unclear if the Foreign Intelligence Surveillance Court will apply that court’s reasoning to the government.

Privacy and security of users on the Internet is fundamental. Mozilla stands in continued support of the USA FREEDOM Act, as do hundreds of thousands of users, tech companies, privacy advocates, the Director of National Intelligence, the Attorney General, and the White House. Indeed, a bipartisan majority of both chambers of Congress support this legislation. We urge the House to stand behind their overwhelming vote to approve this legislation and to demand reforms that are at least as strong as the USA FREEDOM Act. Ultimately, this bill failed on a procedural vote on Friday night by just three votes. When the Senate reconvenes this Sunday, we hope three senators will step up to help enact the real reforms that users demand.

 

 

Mike ConleyThe Joy of Coding (Ep. 16): Wacky Morning DJ

I’m on vacation this week, but the show must go on! So I pre-recorded a shorter episode of The Joy of Coding last Friday.

In this episode [1], I focused on a tool I wrote that I alluded to in the last episode, which is a soundboard to use during Joy of Coding episodes.

I demo the tool, and then I explain how it works. After I finished the episode, I pushed the repository to GitHub, and you can check that out right here.

So I’ll see you next week with a full length episode! Take care!


  1. Which, several times, I mistakenly refer to as the 15th episode, and not the 16th. Whoops. 

Air MozillaKids' Vision - Mentorship Series

Kids' Vision - Mentorship Series Mozilla hosts Kids Vision Bay Area Mentor Series

Air MozillaQuality Team (QA) Public Meeting

Quality Team (QA) Public Meeting This is the meeting where all the Mozilla quality teams meet, swap ideas, exchange notes on what is upcoming, and strategize around community building and...

Benjamin SmedbergYak Shaving

Yak shaving tends to be looked down on. I don’t necessarily see it that way. It can be a way to pay down technical debt, or learn a new skill. In many ways I consider it a sign of broad engineering skill if somebody is capable of solving a multi-part problem.

It started so innocently. My team has been working on unifying the Firefox Health Report and Telemetry data collection systems, and there was a bug that I thought I could knock off pretty easily: “FHR data migration: org.mozilla.crashes”. Below are the roadblocks, mishaps, and sideshows that resulted, and I’m not even done yet:

Tryserver failure: crashes

Constant crashes only on Linux opt builds. It turned out this was entirely my fault. The following is not a safe access pattern because of C++ temporary lifetimes:

nsCSubstringTuple str = str1 + str2;
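// The tuple does not own the string data: per the temporary-lifetime issue
// described above, by the time the call below runs, `str` can reference
// storage that has already been destroyed.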
Fn(str);

Backout #1: talos xperf failure

After landing, the code was backed out because the xperf Talos test detected main-thread I/O. On desktop, this was a simple ordering problem: we always do that I/O during startup to initialize the crypto system; I just moved it slightly earlier in the startup sequence. Why are we initializing the crypto system? To generate a random number. Fixed this by whitelisting the I/O. This involved landing code to the separate Talos repo and then telling the main Firefox tree to use the new revision. Much thanks to Aaron Klotz for helping me figure out the right steps.

Backout #2: test timeouts

Test timeouts if the first test of a test run uses the PopupNotifications API. This wasn’t caught during initial try runs because it appeared to be a well-known random orange. I was apparently changing the startup sequence just enough to tickle a focus bug in the test harness. It so happened that the particular test which runs first depends on the e10s or non-e10s configuration, leading to some confusion about what was going on. Fortunately, I was able to reproduce this locally. Gavin Sharp and Neil Deakin helped get the test harness in order in bug 1138079.

Local test failures on Linux

I discovered that several xpcshell tests were failing locally on Linux which were working fine on tryserver. After some debugging, I discovered that the tests thought I wasn’t using Linux, because the cargo-culted test for Linux was let isLinux = ("@mozilla.org/gnome-gconf-service;1" in Cc). This means that if gconf is disabled at build time or not present at runtime, the tests will fail. I installed GConf2-devel and rebuilt my tree and things were much better.

Incorrect failure case in the extension manager

While debugging the test failures, I discovered an incorrect codepath in GMPProvider.jsm for clients which are not Windows, Mac, or Linux (Android and the non-Linux Unixes).

Android performance regression

The landing caused an Android startup performance regression, bug 1163049. On Android, we don’t initialize NSS during startup, and the earlier initialization of the addon manager caused us to generate random Sync IDs for addons. I first fixed this by using Math.random() instead of good crypto, but Richard Newman suggested that I just make Sync generation lazy. This appears to work and will land when there is an open tree.

mach bootstrap on Fedora doesn’t work for Android

As part of debugging the performance regression, I built Firefox for Android for the first time in several years. I discovered that mach bootstrap for Android isn’t implemented on Fedora Core. I manually installed packages until it built properly. I have a list of the packages I installed and I’ll file a bug to fix mach bootstrap when I get a chance.

android build-tools not found

A configure check for the android build-tools package failed. I still don’t understand exactly why this happened; it has something to do with a version that’s too new and unexpected. Nick Alexander pointed me at a patch on bug 1162000 which fixed this for me locally, but it’s not the “right” fix and so it’s not checked into the tree.

Debugging on Android (jimdb)

Binary debugging on Android turned out to be difficult. There are some great docs on the wiki, but those docs failed to mention that you have to pass the configure flag --enable-debug-symbols. After that, I discovered that pending breakpoints don’t work by default with Android debugging, and since I was debugging a startup issue, that was a critical failure. I wrote an ask.mozilla.org question and got a custom patch which finally made debugging work. I also had to patch the implementation of DumpJSStack() so that it would print to logcat on Android; this is another bug that I’ll file later when I clean up my tree.

Crash reporting broken on Mac

I broke crash report submission on mac for some users. Crash report annotations were being truncated from unicode instead of converted from UTF8. Because JSON.stringify doesn’t produce ASCII, this was breaking crash reporting when we tried to parse the resulting data. This was an API bug that existed prior to the patch, but I should have caught it earlier. Shoutout to Ted Mielczarek for fixing this and adding automated tests!

Semi-related weirdness: improving the startup performance of Pocket

The Firefox Pocket integration caused a significant startup performance issue on some trees. The details are especially gnarly, but it seems that by reordering the initialization of the addon manager, I was able to turn a performance regression into a win by accident. Probably something to do with I/O wait, but it still feels like spooky action at a distance. Kudos to Joel Maher, Jared Wein and Gijs Kruitbosch for diving into this under time pressure.

Experiences like this are frustrating, but as long as it’s possible to keep the final goal in sight, fixing unrelated bugs along the way might be the best thing for everyone involved. It will certainly save context-switches from other experts to help out. And doing the Android build and debugging was a useful learning experience.

Perhaps, though, I’ll go back to my primary job of being a manager.

Air MozillaSecurity Services / MDN Update

Security Services / MDN Update An update on the roadmap and plan for Minion and some new MDN projects coming up in Q3 and the rest of 2015

Air MozillaProduct Coordination Meeting

Product Coordination Meeting Duration: 10 minutes This is a weekly status meeting, every Wednesday, that helps coordinate the shipping of our products (across 4 release channels) in order...

Armen ZambranoWelcome adusca!

It is my privilege to announce that adusca (blog) joined Mozilla on Monday as an Outreachy intern for the next 4 months.

adusca has an outstanding number of contributions over the last few months including Mozilla CI Tools (which we're working on together).

Here's a bit about herself from her blog:
Hi! I’m Alice. I studied Mathematics in college. I was doing a Master’s degree in Mathematical Economics before getting serious about programming.
She is also a graduate from Hacker's School.

Even though Alice has not been a programmer for many years, she has already shown lots of potential. For instance, she wrote a script to generate scheduling relations for buildbot; for this and many other reasons I tip my hat.

adusca will initially help me out with creating a generic pulse listener to handle job cancellations and retriggers for Treeherder. The intent is to create a way for Mozilla CI tools to manage scheduling on behalf of TH, pave the way for more sophisticated Mozilla CI actions, and allow other people to piggyback on this pulse service and trigger their own actions.

If you have not yet had a chance to welcome her and get to know her, I highly encourage you to do so.

Welcome Alice!


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Mark SurmanMozilla Academy Strategy Update

One of MoFo’s main goals for 2015 is to come up with an ambitious learning and community strategy. The codename for this is ‘Mozilla Academy’. As a way to get the process rolling, I wrote a long post in March outlining what we might include in that strategy. Since then, I’ve been putting together a team to dig into the strategy work formally.

This post is an update on that process in FAQ form. More substance and meat is coming in future posts. Also, there is lots of info on the wiki.

Q1. What are we trying to do?

Our main goal is alignment: to get everyone working on Mozilla’s learning and leadership development programs pointed in the same direction. The three main places we need to align are:

  1. Purpose: help people learn and hone the ability to read | write | participate.
  2. Process: people learn and improve by making things (in a community of like-minded peers).
  3. Poetry: tie back to ‘web = public resource’ narrative. Strong Mozilla brand.

At the end of the year, we will have a unified strategy that connects Mozilla’s learning and leadership development offerings (Webmaker, Hive, Open News, etc.). Right now, we do good work in these areas, but they’re a bit fragmented. We need to fix that by creating a coherent story and common approaches that will increase the impact these programs can have on the world.

Q2. What is ‘Mozilla Academy’?

That’s what we’re trying to figure out. At the very least, Mozilla Academy will be a clearly packaged and branded harmonization of Mozilla’s learning and leadership programs. People will be able to clearly understand what we’re doing and which parts are for them. Mozilla Academy may also include a common set of web literacy skills, curriculum format and learning approaches that we use across programs. We are also reviewing the possibility of a shared set of credentials or roles for people participating in Mozilla Academy.

Q3. Who is ‘Mozilla Academy’ for?

Over the past few weeks, we’ve started to look at who we’re trying to serve with our existing programs (blog post on this soon). Using the ‘scale vs depth’ graph in the Mozilla Learning plan as a framework, we see three main audiences:

  • 1.4 billion Facebook users. Or, whatever metric you use to count active people on the internet. We can reach some percentage of these people with software or marketing that invite people to ‘read | write | participate’. We probably won’t get them to want to ‘learn’ in an explicit way. They will learn by doing. Which is fine. Webmaker and SmartOn currently focus on this group.
  • People who actively want to grow their web literacy and skills. These are people interested enough in skills or technology or Mozilla that they will choose to participate in an explicit learning activity. They include everyone from young people in afterschool programs to web developers who might be interested in taking a course with Mozilla. Mozilla Clubs, Hive and MDN’s nascent learning program currently focus on this group.
  • People who want to hone their skills *and* have an impact on the world. These are people who already understand the web and technology at some level, but want to get better. They are also interested in doing something good for the web, the world or both. They include everyone from an educator wanting to create digital curriculum to a developer who wants to make the world of news or science better. Hive, ReMo and our community-based fellowships currently serve these people.

A big part of the strategy process is getting clear on these audiences. From there we can start to ask questions like: who can Mozilla best serve?; where can we have the most impact?; can people in one group serve or support people in another? Once we have the answers to these questions we can decide where to place our biggest bets (we need to do this!). And we can start raising more money to support our ambitious plans.

Q4. What is a ‘strategy’ useful for?

We want to accomplish a few things as a result of this process. A. A way to clearly communicate the ‘what and why’ of Mozilla’s learning and leadership efforts. B. A framework for designing new programs, adjusting program designs and fundraising for program growth. C. Common approaches and platforms we can use across programs. These things are important if we want Mozilla to stay in learning and leadership for the long haul, which we do.

Q5. What do you mean by ‘common approaches’?

There are a number of places where we do similar work in different ways. For example, Mozilla Clubs, Hive, Mozilla Developer Network, Open News and Mozilla Science Lab are all working on curriculum but do not yet have a shared curriculum model or repository. Similarly, Mozilla runs four fellowship programs but does not have a shared definition of a ‘Mozilla Fellow’. Common approaches could help here.

Q6. Are you developing a new program for Mozilla?

That’s not our goal. We like most of the work we’re doing now. As outlined in the 2015 Mozilla Learning Plan, our aim is to keep building on the strongest elements of our work and then connect these elements where it makes sense. We may modify, add or cut program elements in the future, but that’s not our main focus.

Q7. Are you set on the ‘Mozilla Academy’ name?

It’s pretty unlikely that we will use that name. Many people hate it. However, we needed a moniker to use during the strategy process. For better or for worse, that’s the one we chose.

Q8. What’s the timing for all of this?

We will have a basic alignment framework around ‘purpose, process and poetry’ by the end of June. We’ll work with the team at the Mozilla All Hands in Whistler. We will develop specific program designs, engage in a  broad conversation and run experiments. By October, we will have an updated version of the Mozilla Learning plan, which will lay out our work for 2016+.

As indicated above, the aim of this post is to give a process update. There is much more info on the process, who’s running it and what all the pieces are in the Mozilla Learning strategy wiki FAQ. The wiki also has info on how to get involved. If you have additional questions, ask them here. I’ll respond to the comments and also add my answers to the wiki.

In terms of substance, I’m planning a number of posts in coming weeks on topics like the essence of web literacy, who our audiences are and how we think about learning. People leading Mozilla Academy working groups will also be posting on substantive topics like our evolving thinking around the web literacy map and fellows programs. And, of course, the wiki will be growing with substantive strategy documents covering many of the topics above.


Filed under: education, mozilla, webmakers

Manish GoregaokarWrapper Types in Rust: Choosing Your Guarantees

In my previous post I talked a bit about why the RWlock pattern is important for accessing data, which is why Rust enforces this pattern either at compile time or runtime depending on the abstractions used.

It occurred to me that there are many such abstractions in Rust, each with their unique guarantees. The programmer once again has the choice between runtime and compile time enforcement. I realized that this plethora of “wrapper types”[1] could be daunting to newcomers; in this post I intend to give a thorough explanation of what some prominent ones do and when they should be used.

I’m assuming the reader knows about ownership and borrowing in Rust. Nevertheless, I will attempt to keep the majority of this post accessible to those not yet familiar with these concepts. Aside from the two links into the book above, these two blog posts cover the topic in depth.

Basic pointer types

Box<T>

Box<T> is an “owned pointer” or a “box”. While it can hand out borrowed references to the data, it is the only owner of the data. In particular, when something like the following occurs:

let x = Box::new(1);
let y = x;
// x no longer accessible here

Here, the box was moved into y. As x no longer owns it, the compiler will no longer allow the programmer to use x after this. A box can similarly be moved out of a function by returning, and when a box (one which hasn’t been moved) goes out of scope, destructors are run, deallocating the inner data.

This abstraction is a low cost abstraction for dynamic allocation. If you want to allocate some memory on the heap and safely pass a pointer to that memory around, this is ideal. Note that you will only be allowed to share borrowed references to this by the regular borrowing rules, checked at compile time.

Interlude: Copy

Move/ownership semantics are not special to Box<T>; it is a feature of all types which are not Copy.

A Copy type is one where all the data it logically encompasses (usually, owns) is part of its stack representation[2]. Most types containing pointers to other data are not Copy, since there is additional data elsewhere, and simply copying the stack representation may accidentally share ownership of that data in an unsafe manner.

Types like Vec<T> and String, which also have data on the heap, are not Copy. Primitive types like integers and booleans are Copy.

&T and raw pointers are Copy. Even though they do point to further data, they do not “own” that data. Whereas Box<T> can be thought of as “some data which happens to be dynamically allocated”, &T is thought of as “a borrowing reference to some data”. Even though both are pointers, only the first is considered to be “data”. Hence, a copy of the first should involve a copy of the data (which is not part of its stack representation), but a copy of the second only needs a copy of the reference. &mut T is not Copy because mutable aliases cannot be shared, and &mut T “owns” the data it points to somewhat since it can mutate.

Practically speaking, a type can be Copy if a copy of its stack representation doesn’t violate memory safety.
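
To make that concrete, here is a small sketch in the style of the earlier snippets, contrasting a Copy type with a non-Copy Box:

let a = 1;                 // i32 is Copy
let b = a;                 // bit-for-bit copy of the stack value
println!("{} {}", a, b);   // both bindings remain usable

let x = Box::new(1);       // Box<i32> is not Copy
let y = x;                 // ownership moves into y
// println!("{}", x);      // error: use of moved value `x`
println!("{}", y);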

&T and &mut T

These are immutable and mutable references respectively. They follow the “read-write lock” pattern described in my previous post, such that one may either have only one mutable reference to some data, or any number of immutable ones, but not both. This guarantee is enforced at compile time, and has no visible cost at runtime. In most cases such pointers suffice for sharing cheap references between sections of code.

These pointers cannot be copied in such a way that they outlive the lifetime associated with them.

*const T and *mut T

These are C-like raw pointers with no lifetime or ownership attached to them. They just point to some location in memory with no other restrictions. The only guarantee that these provide is that they cannot be dereferenced except in code marked unsafe.

These are useful when building safe, low cost abstractions like Vec<T>, but should be avoided in safe code.
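As a quick illustration (a sketch, not from the original post): creating a raw pointer is safe, and only dereferencing it requires an unsafe block.

let x = 10;
let p: *const i32 = &x;    // taking a raw pointer is safe
unsafe {
    println!("{}", *p);    // dereferencing it is only allowed in unsafe code
}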

Rc<T>

This is the first wrapper we will cover that has a runtime cost.

Rc<T> is a reference counted pointer. In other words, this lets us have multiple “owning” pointers to the same data, and the data will be freed (destructors will be run) when all pointers are out of scope.

Internally, it contains a shared “reference count”, which is incremented each time the Rc is cloned, and decremented each time one of the Rcs goes out of scope. The main responsibility of Rc<T> is to ensure that destructors are called for shared data.

The internal data here is immutable, and if a cycle of references is created, the data will be leaked. If we want data that doesn’t leak when there are cycles, we need a garbage collector. I do not know of any existing GCs in Rust, but I am working on one with Michael Layzell and there’s another cycle collecting one being written by Nick Fitzgerald.

Guarantees

The main guarantee provided here is that the data will not be destroyed until all references to it are out of scope.

This should be used when you wish to dynamically allocate and share some data (read-only) between various portions of your program, where it is not certain which portion will finish using the pointer last. It’s a viable alternative to &T when &T is either impossible to statically check for correctness, or would create extremely unergonomic code that the programmer does not wish to spend the development cost of working with.

This pointer is not thread safe, and Rust will not let it be sent or shared with other threads. This lets one avoid the cost of atomics in situations where they are unnecessary.

There is a sister smart pointer to this one, Weak<T>. This is a non-owning, but also non-borrowed, smart pointer. It is also similar to &T, but it is not restricted in lifetime — a Weak<T> can be held on to forever. However, it is possible that an attempt to access the inner data may fail and return None, since this can outlive the owned Rcs. This is useful for when one wants cyclic data structures and other things.
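
A minimal sketch of Rc and Weak together, using today's std API (Rc::downgrade and upgrade):

use std::rc::Rc;

let strong = Rc::new(5);
let weak = Rc::downgrade(&strong);        // non-owning handle

assert_eq!(*weak.upgrade().unwrap(), 5);  // Some(Rc) while the data is alive

drop(strong);                             // last owning Rc goes away
assert!(weak.upgrade().is_none());        // the Weak can tell the data is gone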

Cost

As far as memory goes, Rc<T> is a single allocation, though it will allocate two extra words as compared to a regular Box<T> (for “strong” and “weak” refcounts).

Rc<T> has the computational cost of incrementing/decrementing the refcount whenever it is cloned or goes out of scope respectively. Note that a clone will not do a deep copy; rather, it will simply increment the inner reference count and return a copy of the Rc<T>.
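
For instance (a sketch; Rc::strong_count is today's std helper for inspecting the count):

use std::rc::Rc;

let a = Rc::new(vec![1, 2, 3]);
let b = a.clone();                     // increments the refcount; no deep copy
println!("{}", Rc::strong_count(&a));  // 2
drop(b);                               // decrements the refcount
println!("{}", Rc::strong_count(&a));  // 1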

Cell types

“Cells” provide interior mutability. In other words, they contain data which can be manipulated even if the type cannot be obtained in a mutable form (for example, when it is behind an &-ptr or Rc<T>).

The documentation for the cell module has a pretty good explanation for these.

These types are generally found in struct fields, but they may be found elsewhere too.

Cell<T>

Cell<T> is a type that provides zero-cost interior mutability, but only for Copy types. Since the compiler knows that all the data owned by the contained value is on the stack, there’s no worry of leaking any data behind references (or worse!) by simply replacing the data.

It is still possible to violate your own invariants using this wrapper, so be careful when using it. If a field is wrapped in Cell, it’s a nice indicator that the chunk of data is mutable and may not stay the same between the time you first read it and when you intend to use it.

let x = Cell::new(1);
let y = &x;
let z = &x;
x.set(2);
y.set(3);
z.set(4);
println!("{}", x.get());

Note that here we were able to mutate the same value from various immutable references.

This has the same runtime cost as the following:

let mut x = 1;
let y = &mut x;
let z = &mut x;
x = 2;
*y = 3;
*z = 4;
println!("{}", x;

but it has the added benefit of actually compiling successfully.

Guarantees

This relaxes the “no aliasing with mutability” restriction in places where it’s unnecessary. However, this also relaxes the guarantees that the restriction provides; so if one’s invariants depend on data stored within Cell, one should be careful.

This is useful for mutating primitives and other Copy types when there is no easy way of doing it in line with the static rules of & and &mut.

Gábor Lehel summed up the guarantees provided by Cell in a rather succinct manner:

The basic guarantee we need to ensure is that interior references can’t be invalidated (left dangling) by mutation of the outer structure. (Think about references to the interiors of types like Option, Box, Vec, etc.) &, &mut, and Cell each make a different tradeoff here. & allows shared interior references but forbids mutation; &mut allows mutation xor interior references but not sharing; Cell allows shared mutability but not interior references.

[because Cell operates by copying, it can’t create interior references]

This comment by Eddy also touches on the guarantees of Cell and the alternatives.

Cost

There is no runtime cost to using Cell<T>; however, if one is using it to wrap larger (Copy) structs, it might be worthwhile to instead wrap individual fields in Cell<T>, since each write is otherwise a full copy of the struct.
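
As an illustration of that point, a small sketch with hypothetical Point fields:

use std::cell::Cell;

struct Point {
    x: Cell<i32>,
    y: Cell<i32>,
}

let p = Point { x: Cell::new(0), y: Cell::new(0) };
p.x.set(5);                               // copies a single i32, not the whole Point
println!("{} {}", p.x.get(), p.y.get());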

RefCell<T>

RefCell<T> also provides interior mutability, but isn’t restricted to Copy types.

Instead, it has a runtime cost. RefCell<T> enforces the RWLock pattern at runtime (it’s like a single-threaded mutex), unlike &T/&mut T which do so at compile time. This is done by the borrow() and borrow_mut() functions, which modify an internal reference count and return smart pointers which can be dereferenced immutably and mutably respectively. The refcount is restored when the smart pointers go out of scope. With this system, we can dynamically ensure that there are never any other borrows active when a mutable borrow is active. If the programmer attempts to make such a borrow, the thread will panic.

let x = RefCell::new(vec![1,2,3,4]);
{
    println!("{:?}", *x.borrow())
}

{
    let mut my_ref = x.borrow_mut();
    my_ref.push(1);
}
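
And, for contrast, a minimal sketch of the failure mode described above — holding a shared borrow while requesting a mutable one panics at runtime:

use std::cell::RefCell;

let x = RefCell::new(5);
let _shared = x.borrow();        // a shared borrow is still alive here...
let _exclusive = x.borrow_mut(); // ...so this second, mutable borrow panics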

Similar to Cell, this is mainly useful for situations where it’s hard or impossible to satisfy the borrow checker. Generally one knows that such mutations won’t happen in a nested form, but it’s good to check.

For large, complicated programs, it becomes useful to put some things in RefCells to make things simpler. For example, a lot of the maps in the ctxt struct in the rust compiler internals are inside this wrapper. These are only modified once (during creation, which is not right after initialization) or a couple of times in well-separated places. However, since this struct is pervasively used everywhere, juggling mutable and immutable pointers would be hard (perhaps impossible) and probably form a soup of &-ptrs which would be hard to extend. On the other hand, the RefCell provides a cheap (not zero-cost) way of safely accessing these. In the future, if someone adds some code that attempts to modify the cell when it’s already borrowed, it will cause a (usually deterministic) panic which can be traced back to the offending borrow.

Similarly, in Servo’s DOM we have a lot of mutation, most of which is local to a DOM type, but some of which crisscrosses the DOM and modifies various things. Using RefCell and Cell to guard all mutation lets us avoid worrying about mutability everywhere, and it simultaneously highlights the places where mutation is actually happening.

Note that RefCell should be avoided if a mostly simple solution is possible with & pointers.

Guarantees

RefCell relaxes the static restrictions preventing aliased mutation, and replaces them with dynamic ones. As such the guarantees have not changed.

Cost

RefCell does not allocate, but it contains an additional “borrow state” indicator (one word in size) along with the data.

At runtime each borrow causes a modification/check of the refcount.

Synchronous types

Many of the types above cannot be used in a threadsafe manner. Particularly, Rc<T> and RefCell<T>, which both use non-atomic ref counts, cannot be used this way. This makes them cheaper to use, but one needs thread-safe versions of these too. They exist, in the form of Arc<T> and Mutex<T>/RwLock<T>.

Note that the non-threadsafe types cannot be sent between threads, and this is checked at compile time. I’ll touch on how this is done in a later blog post.

There are many useful wrappers for concurrent programming in the sync module, but I’m only going to cover the major ones.

Arc<T>

Arc<T> is just a version of Rc<T> that uses an atomic reference count (hence, “Arc”). This can be sent freely between threads.

C++’s shared_ptr is similar to Arc, however in C++’s case the inner data is always mutable. For semantics similar to those from C++, we should use Arc<Mutex<T>>, Arc<RwLock<T>>, or Arc<UnsafeCell<T>>[3] (UnsafeCell<T> is a cell type that can be used to hold any data and has no runtime cost, but accessing it requires unsafe blocks). The last one should only be used if one is certain that the usage won’t cause any memory unsafety. Remember that writing to a struct is not an atomic operation, and many functions like vec.push() can reallocate internally and cause unsafe behavior (so even monotonicity may not be enough to justify UnsafeCell).

Guarantees

Like Rc, this provides the (thread safe) guarantee that the destructor for the internal data will be run when the last Arc goes out of scope (barring any cycles).

Cost

This has the added cost of using atomics for changing the refcount (which will happen whenever it is cloned or goes out of scope). When sharing data from an Arc in a single thread, it is preferable to share & pointers whenever possible.
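
For illustration, a minimal sketch (not from the original post) of sharing read-only data across threads with Arc:

use std::sync::Arc;
use std::thread;

let data = Arc::new(vec![1, 2, 3]);
let mut handles = Vec::new();
for _ in 0..3 {
    let data = data.clone();            // atomic refcount increment
    handles.push(thread::spawn(move || {
        println!("{:?}", *data);        // read-only access from another thread
    }));
}
for handle in handles {
    handle.join().unwrap();
}
// the Vec is freed exactly once, when the last Arc goes out of scope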

Mutex<T> and RwLock<T>

Mutex<T> and RwLock<T> provide mutual-exclusion via RAII guards. For both of these, the mutex is opaque until one calls lock() on it, at which point the thread will block until a lock can be acquired, and then a guard will be returned. This guard can be used to access the inner data (mutably), and the lock will be released when the guard goes out of scope.

{
    let guard = mutex.lock();
    // guard dereferences mutably to the inner type
    *guard += 1;
} // lock released when destructor runs

RwLock has the added benefit of being efficient for multiple reads. It is always safe to have multiple readers to shared data as long as there are no writers; and RwLock lets readers acquire a “read lock”. Such locks can be acquired concurrently and are kept track of via a reference count. Writers must obtain a “write lock” which can only be obtained when all readers have gone out of scope.
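
A minimal sketch of that behaviour (the unwrap handles lock poisoning, which is out of scope here):

use std::sync::RwLock;

let lock = RwLock::new(5);

{
    let r1 = lock.read().unwrap();     // multiple read locks can coexist
    let r2 = lock.read().unwrap();
    assert_eq!(*r1 + *r2, 10);
}                                      // read guards released here

{
    let mut w = lock.write().unwrap(); // the write lock is exclusive
    *w += 1;
}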

Guarantees

Both of these provide safe shared mutability across threads, however they are prone to deadlocks. Some level of additional protocol safety can be obtained via the type system. An example of this is rust-sessions, an experimental library which uses session types for protocol safety.

Costs

These use internal atomic-like types to maintain the locks, and these are pretty costly (they can block all memory reads across processors till they’re done). Waiting on these locks can also be slow when there’s a lot of concurrent access happening.

Composition

A common gripe when reading Rust code is with stuff like Rc<RefCell<Vec<T>>> and more complicated compositions of such types.

Usually, it’s a case of composing together the guarantees that one needs, without paying for stuff that is unnecessary.

For example, Rc<RefCell<T>> is one such composition. Rc<T> itself can’t be dereferenced mutably, because it provides sharing, and shared mutability on its own is unsafe; so we put a RefCell inside to get dynamically verified shared mutability. Now we have shared mutable data, but it’s shared in a way that guarantees there is either a single mutator (and no readers) or multiple readers.

Now, we can take this a step further, and have Rc<RefCell<Vec<T>>> or Rc<Vec<RefCell<T>>>. These are both shareable, mutable vectors, but they’re not the same.

With the former, the RefCell is wrapping the Vec, so the Vec in its entirety is mutable. At the same time, there can only be one mutable borrow of the whole Vec at a given time. This means that your code cannot simultaneously work on different elements of the vector from different Rc handles. However, we are able to push and pop from the Vec at will. This is similar to an &mut Vec<T> with the borrow checking done at runtime.

With the latter, the borrowing is of individual elements, but the overall vector is immutable. Thus, we can independently borrow separate elements, but we cannot push or pop from the vector. This is similar to an &mut [T][4], but, again, the borrow checking is at runtime.
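
For concreteness, a small sketch (in the style of the earlier snippets) contrasting the two:

use std::rc::Rc;
use std::cell::RefCell;

// RefCell around the Vec: the whole vector is mutable, one borrow at a time.
let whole: Rc<RefCell<Vec<i32>>> = Rc::new(RefCell::new(vec![1, 2, 3]));
whole.borrow_mut().push(4);                 // can grow or shrink the Vec

// RefCell around each element: elements are independently mutable,
// but the vector's length is fixed.
let per_element: Rc<Vec<RefCell<i32>>> =
    Rc::new(vec![RefCell::new(1), RefCell::new(2)]);
*per_element[0].borrow_mut() += 10;         // borrow elements separately
println!("{:?} {:?}", *whole.borrow(), per_element);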

In concurrent programs, we have a similar situation with Arc<Mutex<T>>, which provides shared mutability and ownership.

When reading code that uses these, go in step by step and look at the guarantees/costs provided.

When choosing a composed type, we must do the reverse; figure out which guarantees we want, and at which point of the composition we need them. For example, if there is a choice between Vec<RefCell<T>> and RefCell<Vec<T>>, we should figure out the tradeoffs as done above and pick one.

Discuss: HN, Reddit


  1. I’m not sure if this is the technical term for them, but I’ll be calling them that throughout this post.

  2. By “stack representation” I mean the data on the stack when a value of this type is held on the stack. For example, a Vec<T> has a stack representation of a pointer and two integers (length, capacity). While there is more data behind the indirection of the pointer, it is not part of the stack-held portion of the Vec. Looking at this a different way, a type is Copy if a memcopy of the data copies all the data owned by it.

  3. Arc<UnsafeCell<T>> actually won’t compile since UnsafeCell<T> isn’t Send or Sync, but we can wrap it in a type and implement Send/Sync for it manually to get Arc<Wrapper<T>> where Wrapper is struct Wrapper<T>(UnsafeCell<T>).

  4. &[T] and &mut [T] are slices; they consist of a pointer and a length and can refer to a portion of a vector or array. &mut [T] can have its elements mutated, however its length cannot be touched.

Mozilla Addons BlogUpdate on Extension Signing and New Developer Agreement

If you have an active extension listing on AMO you probably got a message from us already, explaining how we will automatically sign your add-on and provide it to your users via automatic updates. The automatic signing process will run this week, in batches, and we will notify you when your add-on is signed. Please take some time to test the signed version in the current release version of Firefox and either Developer Edition or Nightly (where Firefox already warns about unsigned extensions).

If you’re unfamiliar with extension signing, please read the original announcement for context.

Next week, we will activate two new features on AMO: signing of new add-on versions after they are reviewed, and add-on submission for developers who wish to have their add-ons signed but don’t want them listed on AMO. We will post another update once this happens. When this is done, all extension developers will be able to have their extensions signed, with enough time to update their users before signing becomes a requirement in release versions of Firefox.

New Developer Agreement

Since we will be signing add-ons that won’t be listed on AMO, we have updated the Add-on Distribution Developer Agreement to cover the new ways in which we will handle add-ons. This document hadn’t been touched for years, so we took our time and significantly updated its contents to reflect how we do things now. Hopefully it is also easier to read than its previous version.

Note that the new agreement will go into effect on June 1st. The version that is displayed on AMO when you submit a new add-on will be updated then, and all active developers on AMO will be notified about it.

If you want to stay up to date with changes related to extension signing, you can follow this blog or check in regularly to the wiki page, where I update the timeline information as it becomes clearer.

Air MozillaBugzilla Development Meeting

Bugzilla Development Meeting Help define, plan, design, and implement Bugzilla's future!

Gervase MarkhamThe Jeeves Test

What is a browser for? What should it do, or not do? What should it be?

People within the Mozilla project have been recently discussing the user value of some new features in Firefox. I think a person’s view of questions of this nature will depend on their view of the role of the browser. One option is the “featureless window on the web” view – the browser is nothing, the site is everything. But as one participant said, this leads to all the value-add and features being provided by the sites, which is not a recipe for user control, or for using the browser to advance the Mozilla mission.

I think the best vision for Firefox is as your “Internet butler” – quiet and refined, highly capable, provides what you need even before you know you need it, who gently guides you out of trouble but generally does his thing without you needing to think about him or provide explicit direction or management.

Bertie using an early voice interface prototype

So I’d like to propose the “Jeeves Test” for evaluating feature proposals for Firefox. It works like this: imagine Bertie Wooster, relaxing in an armchair in his apartment, with a cigarette, a gin and tonic, and a tablet computer. Then take the user value proposition of an idea, write it in appropriately deferential language, and see if you can imagine Jeeves whispering it into his ear. If you can’t, perhaps it’s not something we want to do.

To make that a bit more concrete, here are some examples of things that might pass: “Here’s an English translation of this Serbian page, sir”, or “For your safety, sir, access to this malware page has been blocked”. And here are some which might not: “For your convenience, sir, I’ve exempted aol.com from your popup blocker”, or “You’ll be pleased to find, sir, that the user interface has been substantially rearranged”.

There may be occasions where we’d want to do something which doesn’t obviously pass the Jeeves Test, if the effects on the broader web ecosystem of making the change are significantly positive. Some of the things we do to improve web security but which have a short-term compatibility impact might fall into that category. “Let me ensure this site doesn’t load for you, sir” generally doesn’t go down well, after all. But in those cases, that longer-term or broader value has to be clearly articulated – before we make the change – if we are to avoid an exasperated “Dash it, Jeeves… why?” from our userbase.

Mozilla Release Management TeamFirefox 38.0.5 rc to rc2

A small update for next week's release. We took a patch to fix a JS top crash and 3 changes to fix an important regression in WebRTC.

  • 6 changesets
  • 10 files changed
  • 85 insertions
  • 35 deletions

Extension   Occurrences
cpp         5
xml         1
txt         1
mn          1
html        1
h           1

Module      Occurrences
js          5
mobile      1
media       1
dom         1
config      1
browser     1

List of changesets:

tbirdbldAutomated checkin: version bump for thunderbird 38.0b6 release. DONTBUILD CLOSED TREE a=release - 7eb2d1522ac2
Margaret LeibovicBug 1166392 - Include about:reader strings on Android. r=mfinkle, a=sledru - a468d1edeac8
Tooru FujisawaBug 1159973 - Abort parsing when TokenStream::SourceCoords hits OOM. r=jorendorff, a=sylvestre - 414d430012ee
Paul AdenotBug 1166183 - Back out the direct listener removal landed by mistake in Bug 1141781. r=jesup, a=sledru - 9525d8723413
Paul AdenotBug 1166183 - Work around Bug 934512 in track_peerConnection_replaceTrack.html. r=pehrson, a=sledru - 6ee39f753d19
Andreas PehrsonBug 1166183 - Reset PipelineListener's flag after ReplaceTrack(). r=bwc, a=sledru - 19b9ffda6e2f

Byron Joneshappy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

  • [1168357] Unable to reset mentor in modal bug view
  • [1146782] migrate autocomplete from yui to jquery
  • [1165355] sanitizeme.pl script should remove webservice api keys from database to remove sensitive information
  • [1158010] provide a standard and simple way to render relative dates, in perl and javascript
  • [1168303] send-request-nags.pl is triggering oom on bugzillaadm.private.scl3
  • [1162854] hitting back after making an edit should keep edit mode open
  • [1164850] add preferences to request nagging watching facility (reviews only, extended period, and skip encryption)
  • [1164780] updating an attachment with pre-existing review requests shouldn’t enforce the “last logged in to bugzilla” limit
  • [1162427] the “reset assignee to default” checkbox / functionality should be visible

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

Marcia KnousQA BuddyUp team meets again

The QA BuddyUp Team met once again prior to the Mozilla Balkans meetup in Bucharest. Thanks to Softvision, we had a great workspace to use for the day, and we had a lot of good discussion about the next steps for the project as well as our approach to automation. Pictured below are the members of the team (from left to right): Ada, Madalina, Ioana, fredy, Christos and Bebe.  You can see the notes from our discussion here. The day we met there were some challenges with working on automation, but

Daniel Stenbergpicturing curl’s future

development graph

There will be more stuff over time in the cURL project. Exactly which stuff, and how long everything will take, we don’t know. It depends largely on who works on what and how much time said persons can spend on implementing the stuff they work on…

I suspect we might be able to do things slightly faster over time, which is why the red arrow isn’t just a straight line.

I drew this little picture inspired from discussions with friends after a talk I did about curl and how development works in an open source project such as this. We know we will work on things that will improve the products but we don’t see exactly what very far in advance. I tweeted this picture a few days ago, and it turned out very popular.

Doug BelshawWeb Literacy: what happens beyond peak centralisation and software with shareholders?

There’s no TIDE podcast this week, so I thought I’d record a blog post today. Here’s the abstract:

We’re at peak centralisation of our data in online services, with data as the new oil. It’s a time of ‘frictionless sharing’, but also a time when we’re increasingly having decisions made on our behalf by algorithms. Education is now subject to a land grab by ‘software with shareholders’ looking to profit from collecting, mining, packaging, and selling learner data. This article explores some of the issues at stake, as well as pointing towards the seeds of a potential solution.

The Code Acts in Education blog I mention in the introduction to this piece can be found here and Ben Williamson is @BenPatrickWill on Twitter.

Comments (once you’ve listened!) much appreciated. I’ve still got time to re-work this… :)

(no audio? click here!)

References

Belshaw, D.A.J. (2014a). Software with shareholders (or, the menace of private public spaces). Doug Belshaw’s blog. 23 April 2014. http://dougbelshaw.com/blog/2014/04/23/software-with-shareholders.

Belshaw, D.A.J. (2014b). Curate or Be Curated: Why Our Information Environment is Crucial to a Flourishing Democracy, Civil Society. DMLcentral. 23 October 2014. http://dmlcentral.net/blog/doug-belshaw/curate-or-be-curated-why-our-information-environment-crucial-flourishing-democracy.

Dixon-Thayer, D. (2015). Mozilla View on Zero-Rating. Open Policy & Advocacy Blog. Mozilla. 5 May 2015. https://blog.mozilla.org/netpolicy/2015/05/05/mozilla-view-on-zero-rating.

Flew, T. (2008). New Media: An Introduction (3rd ed.). Melbourne: Oxford University Press.

Gillula, J. & Malcolm, J. (2015). Internet.org Is Not Neutral, Not Secure, and Not the Internet. Deeplinks Blog. Electronic Frontier Foundation. 18 May 2015. https://www.eff.org/deeplinks/2015/05/internetorg-not-neutral-not-secure-and-not-internet.

Kramer, A.D.I., Guillory, J.E., Hancock, J.T. (2014) Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences of the United Sates of America. 111(24).

McNeal, G.S. (2014). Facebook Manipulated User News Feeds To Create Emotional Responses. Forbes. 28 June 2014. http://www.forbes.com/sites/gregorymcneal/2014/06/28/facebook-manipulated-user-news-feeds-to-create-emotional-contagion

Mozilla. (2015). Web Literacy Map v1.1. https://teach.mozilla.org/teach-like-mozilla/web-literacy

Thorp, J. (2012). Big Data Is Not the New Oil. Harvard Business Review. 30 November 2012. https://hbr.org/2012/11/data-humans-and-the-new-oil.

Image CC BY-NC Graham Chastney

Air MozillaSpiderMonkey garbage collection update

SpiderMonkey garbage collection update An overview of the progress we've made in the last two years development on the SpiderMonkey GC implementing generational collection and starting work on compacting.

Chris AtLeeRelEng 2015 (part 1)

Last week, I had the opportunity to attend RelEng 2015 - the 3rd International Workshop of Release Engineering. This was a fantastic conference, and I came away with lots of new ideas for things to try here at Mozilla.

I'd like to share some of my thoughts and notes I took about some of the sessions. As of yet, the speakers' slides aren't collected or linked to from the conference website. Hopefully they'll get them up soon! The program and abstracts are available here.

For your sake (and mine!) I've split up my notes into a few separate posts. This post covers the introduction and keynote.

tl;dr

"Continuous deployment" of web applications is basically a solved problem today. What remains is for organizations to adopt best practices. Mobile/desktop applications remain a challenge.

Cisco relies heavily on collecting and analyzing metrics to inform their approach to software development. Statistically speaking, quality is the best driver of customer satisfaction. There are many aspects to product quality, but new lines of code introduced per release gives a good predictor of how many new bugs will be introduced. It's always challenging to find enough resources to focus on software quality; being able to correlate quality to customer satisfaction (and therefore market share, $$$) is one technique for getting organizational support for shipping high quality software. Common release criteria such as bugs found during testing, or bug fix rate, are used to inform stakeholders as to the quality of the release.

Introductory Session

Bram Adams and Foutse Khomh kicked things off with an overview of "continuous deployment" over the last 5 years. Back in 2009 we were already talking about systems where pushing to version control would trigger tens of thousands of tests, and do canary deployments up to 50 times a day.

Today we see companies like Facebook demonstrating that continuous deployment of web applications is basically a solved problem. Many organizations are still trying to implement these techniques. Mobile [and desktop!] applications still present a challenge.

Keynote

Pete Rotella from Cisco discussed how he and his team measured and predicted release quality for various projects at Cisco. His team is quite focused on data and analytics.

Cisco has relatively long release cycles compared to what we at Mozilla are used to now. They release 2-3 times per year, with each release representing approximately 500kloc of new code. Their customers really like predictable release cycles, and also don't like releases that are too frequent. Many of their customers have their own testing / validation cycles for releases, and so are only willing to update for something they deem critical.

Pete described how he thought software projects had four degrees of freedom in which to operate, and how quality ends up being the one sacrificed most often in order to compensate for constraints in the others:

  • resources (people / money): It's generally hard to hire more people or find room in the budget to meet the increasing demands of customers. You also run into the mythical man month problem by trying to throw more people at a problem.

  • schedule (time): Having standard release cycles means organizations don't usually have a lot of room to push out the schedule so that features can be completed properly.

    I feel that at Mozilla, the rapid release cycle has helped us out to some extent here. The theory is that if your feature isn't ready for the current version, it can wait for the next release which is only 6 weeks behind. However, I do worry that we have too many features trying to get finished off in aurora or even in beta.

  • content (features): Another way to get more room to operate is to cut features. However, it's generally hard to cut content or features, because those are what customers are most interested in.

  • quality: Pete believes this is where most organizations steal resources from, to make up for people/schedule/content constraints. It's a poor long-term play, and despite "quality is our top priority" being the Official Party Line, most organizations don't invest enough here. What's working against quality?

    • plethora of releases: lots of projects / products / special requests for releases. Attempts to reduce the # of releases have failed on most occasions.
    • monetization of quality is difficult. Pete suggests tying the cost of a poor quality release to this. How many customers will we lose with a buggy release?
    • having RelEng and QA embedded in Engineering teams is a problem; they should be independent organizations so that their recommendations can have more weight.
    • "control point exceptions" are common. e.g. VP overrides recommendations of QA / RelEng and ships the release.

Why should we focus on quality? Pete's metrics show that it's the strongest driver of customer satisfaction. Your product's customer satisfaction needs to be more than 4.3/5 to get more than marginal market share.

How can RelEng improve metrics?

  • simple dashboards
  • actionable metrics - people need to know how to move the needle
  • passive - use existing data. everybody's stretched thin, so requiring other teams to add more metadata for your metrics isn't going to work.
  • standardized quality metrics across the company
  • informing engineering teams about risk
  • correlation with customer experience.

Interestingly, handling the backlog of bugs has minimal impact on customer satisfaction. In addition, there's substantial risk introduced whenever bugs are fixed late in a release cycle. There's an exponential relationship between new lines of code added and # of defects introduced, and therefore customer satisfaction.

Another good indicator of customer satisfaction is the number of "Customer found defects" - i.e. the number of bugs found and reported by their customers vs. bugs found internally.

Pete's data shows that if they can find more than 80% of the bugs in a release prior to it being shipped, then the remaining bugs are very unlikely to impact customers. He uses lines of code added for previous releases, and historical bug counts per version to estimate number of bugs introduced in the current version given the new lines of code added. This 80% figure represents one of their "Release Criteria". If less than 80% of predicted bugs have been found, then the release is considered risky.

Another "Release Criteria" Pete discussed was the weekly rate of fixing bugs. Data shows that good quality releases have the weekly bug fix rate drop to 43% of the maximum rate at the end of the testing cycle. This data demonstrates that changes late in the cycle have a negative impact on software quality. You really want to be fixing fewer and fewer bugs as you get closer to release.

I really enjoyed Pete's talk! There are definitely a lot of things to think about, and how we might apply them at Mozilla.

Andrea MarchesiniConsole API in workers and DevTools

As a web developer, debugging Service Worker code can be a complex process. For this reason, we are putting extra effort into improving the debugging tools for Firefox. This blog post focuses on our work on the Console API and how it can be used in any kind of Worker and shown in the DevTools WebConsole.

As you probably know, since February 2014 Firefox supports the Console API in Workers (bug 965860). We have recently extended that support to different kinds of Workers (January 2015 - bug 1058644). However, log messages from those Workers have not yet been shown in the DevTools WebConsole.

I am pleased to say that in Nightly, and soon in release builds, you are now able to see the log messages from Shared Workers, Service Workers and Add-on Workers by enabling them from the ‘logging’ menu. Here is a screenshot:

This works with remote debugging too! Happy debugging!

Air MozillaReaching more users through better accessibility

Reaching more users through better accessibility Using Firefox OS to test your apps for better accessibility and usability of your mobile web application

Air MozillaPrivacy features for Firefox for Android

Privacy features for Firefox for Android Supporting privacy on the mobile web with built-in features and add-ons

Air MozillaMartes Mozilleros

Martes Mozilleros Reunión bi-semanal para hablar sobre el estado de Mozilla, la comunidad y sus proyectos. Bi-weekly meeting to talk (in Spanish) about Mozilla status, community and...

Air MozillaServo (the parallel web browser) and YOU!

Servo (the parallel web browser) and YOU! A beginner's guide to contributing to Servo

Air MozillaSilexLabs Test

SilexLabs Test Silex Labs

Daniel Stenberg2015 curl user poll analysis

My full 30 page document with all details and analyses of the curl user poll 2015 is now available. It shows details of all the questions, most of them with a comparison with last year’s survey. The write-ins are also full of good advice, wisdom and some signs of ignorance or unawareness.

I hope all curl hackers and others generally interested in the project can use my “report” to learn something about our users and our user’s view of the project and our products.

Let’s use this to guide us going forward.

keep-calm-and-improve-curl

Byron Joneshappy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

  • [1166280] max-width missing from bottom-actions, causing “top” and “new bug” buttons misalignment on the right edge of window
  • [1163868] Include requests from others in RequestNagger
  • [1164321] fix chrome issues
  • [1166616] disabling of reviewer last-seen doesn’t work for users that don’t have a last_seen_date
  • [1166777] Script for generating API keys
  • [1166623] move MAX_REVIEWER_LAST_SEEN_DAYS_AGO from a constant to a parameter
  • [1135164] display a warning on show_bug when an unassigned bug has a patch attached
  • [1158513] Script to convert MozReview attachments from parents to children
  • [1168190] “See Also” should accept webcompat.com/issues/ – prefixed URLs

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

David AscherNot Technologism, Alchemy.

Heated argument...

A couple of years ago, I was part of a panel with the remarkable Evgeny Morozov, at an event hosted by the Museum of Vancouver.  The event wasn’t remarkable, except that I ended up in a somewhat high-energy debate with Mr. Morozov, without particularly understanding why.  He ascribed to me beliefs I didn’t hold, which made a rigorous intellectual argument (which I generally enjoy) quite elusive and baffling.

It took a follow-up discussion with my political theorist brother to elucidate it.  In particular, he helped me realize that while for decades I’ve described myself as a technologist, depending on your definition of that word, I was misrepresenting myself.  Starting with my first job in my mid teens, my career (modulo an academic side-trip) has essentially been to bring technological skills to bear on a situation.  Using the word technologist to describe myself felt right, without much analysis.  As my brother pointed out, using an academic lexicon, the word technologist can be used to mean “someone who thinks technology is the answer”.  Using that definition, capital is to capitalist as technology is to technologist.  Once I understood that, I realized why Mr. Morozov was arguing with me.  He thought I believed in technologism, much like the way the New Yorker describes Marc Andreesen (I’m no Andreesen in many ways, but I do think he’d have been a better foil for Morozov).  It also made me realize that he either didn’t understand what Mozilla is trying to do, or didn’t believe me/us.

Using that definition, I am not a technologist any more than I am a capitalist or a regulator[ist?].  I resist absolutes, as it seems to me blatantly obvious (and probably boring) that except for rhetorical/marketing purposes (such as Morozov’s), the choices which one should make when building or changing a system, at every scale from the individual to the society, must be thoughtfully balanced.  This balance is hard to articulate and even harder to maintain, in a political environment which blurs subtlety and a network environment which amplifies differences.

 

The Purpose of Argument

 

It is in this balance that lies evolutionary progress.  I have been re-re-watching The Wire, thinking about Baltimore.  In season 3, one of the district commanders, sick of the status quo, tries to create a “free zone” for unimpeded drug dealing, an Amsterdam-in-Baltimore, in order to draw many of the associated social ills away from the rest of the district.  In doing this, he uses his authority (and fearlessness, thanks to his impending retirement), and combines management techniques, marketing, and positive and negative incentives, to try fix a systemic social problem.  I want more stories like that (especially less fictional ones).

I don’t believe that technology is “the answer,” just like police vans aren’t the answer.  That said, ignoring the impact that technology can have on societal (or business!) challenges is just as silly as saying that technology holds the answer.  Yes, technology choices are often political, and that’s just fine, especially if we invite non-techies to help make those political decisions.  When it comes to driving behavioral change, tech is a tool like many others, from empathy and consumer marketing to capital and incentives.  Let’s understand and use all of these tools.

Human psychology explains why marketing works, and marketing has definitely been used for silly reasons in the past. Still, if part of your goal involves getting people to do something, refusing to use human psychology to further your well-intentioned outcome is cutting off your nose to spite your face.  Similarly, it’s unarguable that those who wield gigantic capital have had outsize influence in our society, and some of them are sociopaths. Still, refusing to consider how one could use capital and incentives to drive behavior is an equally flawed limit.  Lastly, obviously, technology can be dehumanizing. But technology can also, if wielded thoughtfully, be liberating.  Each has power, and power isn’t intrinsically evil.

1+1+1 = magic

Tech companies of the last decade have shown the outsize impact that these three disciplines, when wielded in coordinated fashion, can have on the world.  Uber’s success isn’t due to smart deployment of tech, psychology or capital.  Uber’s success is due to a brilliant combination of all three (and more).  I’ll leave it history to figure out whether Uber’s impact will be net positive or negative based on metrics that I care about, but the magical quality that combination unleashed is awe inspiring.

 

object.

 

This effective combination of disciplines is a critical step which I think eludes many people and organizations, due to specialization without coordination.  It is relatively clear how to become a better technologist or marketer.  The ladder of capitalist success is equally straightforward.  But figuring out how to blend those disciplines and unlock magic is elusive, both within a career path and within organizations.

I wonder if it’s possible to label this pursuit, in a gang-signal sort of way. Venture capitalists, marketers, and technologists all gain a lot by simply affiliating with these fields.  I find any one of these dimensions deeply boring, and the combinations so much more provocative.  Who’s with me and what do you call yourselves?  Combinatorists? Alchemists?

Aki Sasakiintroducing scriptharness

I found myself missing mozharness at various points over the past 10 months. Several things kept me from using it at my then-new job:

  • Even though we had kept mozharness.base.* largely non-Mozilla-specific, the mozharness clone-and-run model meant there was a lot of Mozilla-specific code that came along with it.

  • The monolithic BaseScript + mixins model had a very steep barrier of entry. There's a significant learning curve, and scripts need to be fully ported to mozharness to take advantage of its features.

I had wanted to address these issues for years, but never had time to devote fully to harness-specific development.

Now I do.

Introducing scriptharness 0.1.0:

I'm proud of this. I'm also aware it's not mature [yet], and it's currently missing some functionality.

There are some ideas I'd love to explore before 1.0.0:

  • multiple Script objects with threading and separate logs

  • Config Type Definitions

  • rethink how to enable/disable actions. I could keep mozharness' --add-action clobber --no-upload structure, or play with --actions +clobber -upload or something. (The - before the upload in the latter might cause argparse issues?)

  • also, explore Maven-style actions (all actions before target action are enabled) and actions with dependencies on other actions. I prefer keeping each action independent, idempotent, and individually targetable, but I can see someone wanting the other behavior for certain scripts.

  • I've already split out strings from code in a number of places, for unit testing. I'm curious what it would take to make scriptharness localizable, and if there would be demand for it.

  • ahal suggested adding structured logging; I'd love to investigate that.

I already have 0.2.0 on the brain. I'd love any feedback or patches.



comment count unavailable comments

Aki Sasakimozharness turns 5

Five years ago today, I landed the first mozharness commit in my user repo. (github)

starting something, or wasting my time. Log.py + a scratch trunk_nightly.json

The project had three initial goals:

  • First and foremost, I was tasked with building a multi-locale Fennec on Maemo. This was a more complex task than could sanely fit in a buildbot factory.

  • The Mozilla Releng team was already discussing pulling logic out of buildbot factories and into client-side scripts. I had been wanting to write a second version of my script framework idea. The first version was closed-source, perl, and very company- and product-specific. The second would improve on the first in every way, while keeping its three central principles of full logging, flexible config, and modular actions.

  • Finally, at that point I was still a Perl developer learning Python. I tend to learn languages by writing a project from scratch in that new language; this was my opportunity.

Multi-locale Fennec became a reality, and then we started adding projects to mozharness, one by one.

As of last July, mozharness was the client-side engine for the majority of Mozilla's CI and release infrastructure. I still see plenty of activity in bugmail and IRC these days. I'll be the first to point out its shortcomings, but I think overall it has been a success.

Happy birthday, mozharness!



comment count unavailable comments

Air MozillaMozilla Weekly Project Meeting

Mozilla Weekly Project Meeting The Monday Project Meeting

Nick DesaulniersInterpreter, Compiler, JIT

Interpreters and compilers are interesting programs, themselves used to run or translate other programs, respectively. Those other programs that might be interpreted might be languages like JavaScript, Ruby, Python, PHP, and Perl. The other programs that might be compiled are C, C++, and to some extent Java and C#.

Taking the time to do translation to native machine code ahead of time can result in better performance at runtime, but an interpreter can get to work right away without spending any time translating. There happens to be a sweet spot somewhere in between interpretation and compilation that combines the best of both worlds. Such a technique is called Just In Time (JIT) compiling. While interpreting, compiling, and JIT’ing code might sound radically different, they’re actually strikingly similar. In this post, I hope to show how similar they are by comparing the code for an interpreter, a compiler, and a JIT compiler for the language Brainfuck in around 100 lines of C code each.

All of the code in the post is up on GitHub.

Brainfuck is an interesting, if hard to read, language. It only has eight operations it can perform > < + - . , [ ], yet is Turing complete. There’s nothing really to lex; each character is a token, and if the token is not one of the eight operators, it’s ignored. There’s also not much of a grammar to parse; the forward jumping and backwards jumping operators should be matched for well formed input, but that’s about it. In this post, we’ll skip input validation and assume well-formed input so we can focus on the interpretation/code generation. You can read more about it on the Wikipedia page, which we’ll be using as a reference throughout.

A Brainfuck program operates on a 30,000 element byte array initialized to all zeros. It starts off with a data pointer that initially points to the first element in the data array or “tape.” In C code for an interpreter that might look like:

// Initialize the tape with 30,000 zeroes.
unsigned char tape [30000] = { 0 };

// Set the pointer to point at the left most cell of the tape.
unsigned char* ptr = tape;

Then, since we’re performing an operation for each character in the Brainfuck source, we can have a for loop over every character with a nested switch statement containing case statements for each operator.
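
As a rough, runnable sketch of that structure (the input program here is a hypothetical hard-coded string, and it jumps slightly ahead to the pointer and arithmetic cases described next; the real interpreter reads the program from a file):

#include <stdio.h>
#include <string.h>

int main(void) {
  unsigned char tape[30000] = { 0 };
  unsigned char* ptr = tape;
  const char* input = "+++>++<-";           // hypothetical Brainfuck program

  // One pass over the source: one case per operator.
  for (size_t i = 0; i < strlen(input); ++i) {
    switch (input[i]) {
      case '>': ++ptr;    break;
      case '<': --ptr;    break;
      case '+': ++(*ptr); break;
      case '-': --(*ptr); break;
      default:            break;            // any other character is ignored
    }
  }
  printf("cell 0 = %d, cell 1 = %d\n", tape[0], tape[1]);
  return 0;
}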

The first two operators, > and < increment and decrement the data pointer.

case '>': ++ptr; break;
case '<': --ptr; break;

One thing that could be bad: because the interpreter is written in C, the tape is a plain array, and we’re not validating our inputs, there’s potential for a buffer overrun since we never perform bounds checks. Again, we’re punting and assuming well formed input to keep the code short and the point clear.
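
If you did want to harden the pointer-movement cases, a sketch of a bounds check (not part of the post’s ~50 SLOC interpreter) might look like:

// Optional hardening, assuming <stdio.h> and <stdlib.h> are included:
// refuse to walk the pointer off either end of the 30,000 cell tape.
case '>':
  if (ptr < tape + 30000 - 1) ++ptr;
  else { fputs("tape overrun\n", stderr); exit(EXIT_FAILURE); }
  break;
case '<':
  if (ptr > tape) --ptr;
  else { fputs("tape underrun\n", stderr); exit(EXIT_FAILURE); }
  break;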

Next up are the + and - operators, used for incrementing and decrementing the cell pointed to by the data pointer by one.

case '+': ++(*ptr); break;
case '-': --(*ptr); break;

The operators . and , provide Brainfuck’s only means of output and input, by writing the value pointed to by the data pointer to stdout as an ASCII value, or reading one byte from stdin as an ASCII value and writing it to the cell pointed to by the data pointer.

case '.': putchar(*ptr); break;
case ',': *ptr = getchar(); break;

Finally, our looping constructs, [ and ]. From the definition on Wikipedia, for [: “if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command,” and for ]: “if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command.”

I interpret that as:

case '[':
  if (!(*ptr)) {
    int loop = 1;
    while (loop > 0) {
      unsigned char current_char = input[++i];
      if (current_char == ']') {
        --loop;
      } else if (current_char == '[') {
        ++loop;
      }
    }
  }
  break;
case ']':
  if (*ptr) {
    int loop = 1;
    while (loop > 0) {
      unsigned char current_char = input[--i];
      if (current_char == '[') {
        --loop;
      } else if (current_char == ']') {
        ++loop;
      }
    }
  }
  break;

Here the variable loop keeps track of open brackets for which we’ve not seen a matching close bracket, i.e. our nesting depth.

So we can see the interpreter is quite basic: in around 50 SLOC we’re able to read a byte and immediately perform an action based on the operator. How we perform that operation might not be the fastest approach, though.

How about if we want to compile the Brainfuck source code to native machine code? Well, we need to know a little bit about our host machine’s Instruction Set Architecture (ISA) and Application Binary Interface (ABI). The rest of the code in this post will not be as portable as the above C code, since it assumes an x86-64 ISA and UNIX ABI. Now would be a good time to take a detour and learn more about writing assembly for x86-64. The interpreter is even portable enough to build with Emscripten and run in a browser!

For our compiler, we’ll iterate over every character in the source file again, switching on the recognized operator. This time, instead of performing an action right away, we’ll print assembly instructions to stdout. Doing so means running the compiler on an input file, redirecting stdout to a file, then running the system assembler and linker on that file. We’ll stick to emitting assembly, and leave assembling (though it’s not too difficult) and linking to the system tools for now.
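
In outline (a sketch, with input again standing in for the Brainfuck source that the full program reads from a file), the compiler is shaped like this:

// Sketch of the compiler's structure: a prologue, one assembly snippet per
// operator, then an epilogue, all printed to stdout.
puts(prologue);   // defined next
for (long i = 0; input[i] != '\0'; ++i) {
  switch (input[i]) {
    // one case per operator, each printing assembly (shown below)
    default: break;
  }
}
puts(epilogue);   // defined at the end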

First, we need to print a prologue for our compiled code:

const char* const prologue =
  ".text\n"
  ".globl _main\n"
  "_main:\n"
  "  pushq %rbp\n"
  "  movq %rsp, %rbp\n"
  "  pushq %r12\n"        // store callee saved register
  "  subq $30008, %rsp\n" // allocate 30,008 B on stack, and realign
  "  leaq (%rsp), %rdi\n" // address of beginning of tape
  "  movl $0, %esi\n"     // fill with 0's
  "  movq $30000, %rdx\n" // length 30,000 B
  "  call _memset\n"      // memset
  "  movq %rsp, %r12";
puts(prologue);

During the linking phase, we’ll make sure to link in libc so we can call memset. What we’re doing is backing up the callee saved registers we’ll be using, stack allocating the tape, realigning the stack (x86-64 ABI point #1), copying the address of the beginning of the tape into the register for the first argument, setting the second argument to the constant 0 and the third to 30000, then calling memset. Finally, we use the callee saved register %r12 as our data pointer, holding an address into the stack-allocated tape.

We can expect the call to memset to result in a segfault if we simply subtract 30000 B and don’t account for the two registers (64 b, i.e. 8 B, each) we pushed onto the stack. The stack is misaligned upon function entry in x86-64: the first push realigns it on a 16 B boundary, and the second push misaligns it again. Since 30000 is a multiple of 16, we allocate an additional 8 B on the stack to restore 16 B alignment before the call (x86-64 ABI point #1).
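
To make that alignment bookkeeping concrete, here is the state of %rsp modulo 16 through the prologue (the ABI requires it to be 0 at the point of each call):

  %rsp % 16 == 8   on entry to _main (the call pushed an 8 B return address)
  %rsp % 16 == 0   after pushq %rbp
  %rsp % 16 == 8   after pushq %r12
  %rsp % 16 == 0   after subq $30008, %rsp   (30008 % 16 == 8)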

Moving the data pointer (>, <) and modifying the pointed-to value (+, -) are straightforward:

case '>':
  puts("  inc %r12");
  break;
case '<':
  puts("  dec %r12");
  break;
case '+':
  puts("  incb (%r12)");
  break;
case '-':
  puts("  decb (%r12)");
  break;

For output, ., we need to copy the pointed-to byte into the register for the first argument to putchar. We explicitly zero out the upper bits of the register before calling putchar, since it takes an int (32 b) but we’re only copying a char (8 b) (look up C’s type promotion rules for more info). x86-64 has an instruction that does both, movzXX, where the first X is the source size (b, w) and the second is the destination size (w, l, q). Thus movzbl moves a byte (8 b) into a double word (32 b). %rdi and %edi are the same register, but %rdi is the full 64 b register, while %edi is the lowest (or least significant) 32 b.

case '.':
  // move byte to double word and zero upper bits since putchar takes an
  // int.
  puts("  movzbl (%r12), %edi");
  puts("  call _putchar");
  break;

Input (,) is easy: call getchar and move the lowest byte of the result into the cell pointed to by the data pointer. %al is the lowest 8 b of the 64 b %rax register.

case ',':
  puts("  call _getchar");
  puts("  movb %al, (%r12)");
  break;

As usual, the looping constructs ([ & ]) are much more work. We have to match up jumps to their labels, and in an assembly program labels must be unique. One way to solve this: whenever we encounter an opening bracket, push a monotonically increasing number, representing the number of opening brackets seen so far, onto a stack-like data structure. Then we do our comparison and jump to the label that will be produced by the matching close bracket. Next, we insert our starting label, and finally we increment the number of brackets seen.

case '[':
  stack_push(&stack, num_brackets);
  puts("  cmpb $0, (%r12)");
  printf("  je bracket_%d_end\n", num_brackets);
  printf("bracket_%d_start:\n", num_brackets++);
  break;

For close brackets, we pop the number of the matching open bracket (or rather, of the most recent open bracket for which we have yet to see a matching close bracket) off of the stack, do our comparison, jump to the matching start label, and finally place our end label.

case ']':
  stack_pop(&stack, &matching_bracket);
  puts("  cmpb $0, (%r12)");
  printf("  jne bracket_%d_start\n", matching_bracket);
  printf("bracket_%d_end:\n", matching_bracket);
  break;
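
The stack_push and stack_pop helpers used in these two cases aren’t shown in the post; a minimal fixed-size integer stack along these lines would do (the struct layout and capacity here are assumptions, not the post’s actual code):

#include <assert.h>

#define STACK_CAPACITY 1024 // maximum bracket nesting depth we support

typedef struct {
  int items [STACK_CAPACITY];
  int top; // index of the next free slot
} stack_t;

void stack_push (stack_t* s, int value) {
  assert(s->top < STACK_CAPACITY); // nesting too deep
  s->items[s->top++] = value;
}

void stack_pop (stack_t* s, int* value) {
  assert(s->top > 0); // unbalanced brackets
  *value = s->items[--s->top];
}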

So for sequential loops ([][]) we can expect the relevant assembly to look like:

  cmpb $0, (%r12)
  je bracket_0_end
bracket_0_start:

  cmpb $0, (%r12)
  jne bracket_0_start
bracket_0_end:

  cmpb $0, (%r12)
  je bracket_1_end
bracket_1_start:

  cmpb $0, (%r12)
  jne bracket_1_start
bracket_1_end:

and for nested loops ([[]]), we can expect assembly like the following (note the difference in the order of numbered start and end labels):

  cmpb $0, (%r12)
  je bracket_0_end
bracket_0_start:

  cmpb $0, (%r12)
  je bracket_1_end
bracket_1_start:

  cmpb $0, (%r12)
  jne bracket_1_start
bracket_1_end:

  cmpb $0, (%r12)
  jne bracket_0_start
bracket_0_end:

Finally, we need an epilogue to clean up the stack and callee saved registers after ourselves.

const char* const epilogue =
  "  addq $30008, %rsp\n" // clean up tape from stack.
  "  popq %r12\n" // restore callee saved register
  "  popq %rbp\n"
  "  ret\n";
puts(epilogue);

The compiler is a pain when modifying and running a Brainfuck program; it takes a couple of extra commands to compile the Brainfuck program to assembly, assemble that into an object file, link it into an executable, and run it, whereas with the interpreter we can just run it. The trade-off is that the compiled version is quite a bit faster. How much faster? Let’s save that for later.

Wouldn’t it be nice if there was a translation and execution technique that didn’t force us to compile our code every time we changed it and wanted to run it, but also gave us performance closer to that of compiled code? That’s where a JIT compiler comes in!

For the basics of JITing code, make sure you read my previous article on the basics of JITing code in C. We’re going to follow the same technique of creating executable memory, copying bytes into that memory, casting it to a function pointer, then calling it. Just like the interpreter and the compiler, we’re going to do a unique action for each recognized token. What’s different is that for each operator we’re going to push opcodes into a dynamic array, so that it can grow based on our sequential reading of input; this also simplifies calculating relative offsets for branching operations.
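
That dynamic array is the instruction_stream used below. Its implementation isn’t shown in the post, but from the way it’s used, a minimal sketch might look like this (the field names match the later usage; everything else is an assumption):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Growable byte buffer that the JIT appends opcodes to.
typedef struct {
  uint8_t* data;     // the bytes pushed so far
  size_t size;       // number of bytes used
  size_t capacity;   // number of bytes allocated
} vector_t;

// Append len bytes to the end of the buffer, growing it as needed.
// (Error handling for realloc failure is omitted in this sketch.)
void vector_push (vector_t* v, const void* bytes, size_t len) {
  if (v->size + len > v->capacity) {
    size_t new_capacity = v->capacity ? v->capacity : 64;
    while (new_capacity < v->size + len) new_capacity *= 2;
    v->data = realloc(v->data, new_capacity);
    v->capacity = new_capacity;
  }
  memcpy(v->data + v->size, bytes, len);
  v->size += len;
}

void vector_destroy (vector_t* v) {
  free(v->data);
  v->data = NULL;
  v->size = v->capacity = 0;
}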

The other special thing we’re going to do is pass the addresses of our libc functions (memset, putchar, and getchar) into our JIT’ed function at runtime. This avoids those kooky stub functions you might see in a disassembled executable. That means we’ll be invoking our JIT’ed function like:

typedef void* fn_memset (void*, int, size_t);
typedef int fn_putchar (int);
typedef int fn_getchar ();
void (*jitted_func) (fn_memset, fn_putchar, fn_getchar) = mem;
jitted_func(memset, putchar, getchar);

Where mem is our mmap’ed executable memory with our opcodes copied into it, and the typedefs are for the respective function signatures of the function pointers we’ll be passing to our JIT’ed code. We’re kind of getting ahead of ourselves, but knowing how we will invoke the dynamically created executable code gives us an idea of how the code itself will work.

The prologue is quite involved, so we’ll take it a step at a time. First, we have the usual prologue:

char prologue [] = {
  0x55, // pushq %rbp
  0x48, 0x89, 0xE5, // movq %rsp, %rbp

Then we want to back up our callee saved registers that we’ll be using. Expect horrific and difficult to debug bugs if you forget to do this.

  0x41, 0x54, // pushq %r12
  0x41, 0x55, // pushq %r13
  0x41, 0x56, // pushq %r14

At this point, %rdi will contain the address of memset, %rsi will contain the address of putchar, and %rdx will contain the address of getchar, see x86-64 ABI point #2. We want to store these in callee saved registers before calling any of them, else they may clobber %rdi, %rsi, or %rdx since they’re not “callee saved,” rather “call clobbered.” See x86-64 ABI point #4.

  0x49, 0x89, 0xFC, // movq %rdi, %r12
  0x49, 0x89, 0xF5, // movq %rsi, %r13
  0x49, 0x89, 0xD6, // movq %rdx, %r14

At this point, %r12 will contain the address of memset, %r13 will contain the address of putchar, and %r14 will contain the address of getchar.

Next up is allocating 30008 B on the stack:

  0x48, 0x81, 0xEC, 0x38, 0x75, 0x00, 0x00, // subq $30008, %rsp

This is our first hint at how numbers whose values are larger than the maximum representable in a single byte are encoded on x86-64. Where in this instruction is the value 30008? The answer is the 4 byte sequence 0x38, 0x75, 0x00, 0x00. The x86-64 architecture is “Little Endian,” which means that the least significant byte comes first and the most significant byte comes last. When humans do math, they typically write numbers the other way, “Big Endian”: we write decimal ten as “10” and not “01.” So 0x38, 0x75, 0x00, 0x00 in Little Endian is 0x00, 0x00, 0x75, 0x38 in Big Endian, which is 7*16^3 + 5*16^2 + 3*16^1 + 8*16^0 = 30008 in decimal, the number of bytes we want to subtract from the stack pointer. As in the compiler, we allocate an additional 8 B for alignment: we pushed an even number of 64 b registers onto a stack that was misaligned on entry, so we need the extra 8 B to realign the stack pointer.
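
This byte order is also exactly what the vector_write32LE helper used later for patching jumps has to produce. Building on the vector_t sketch above (so again an assumption rather than the post’s actual code), it might look like:

// Write a 32 b signed value at a given offset in the buffer, least
// significant byte first (Little Endian).
void vector_write32LE (vector_t* v, size_t offset, int32_t value) {
  uint32_t u = (uint32_t) value; // two's complement bit pattern
  v->data[offset + 0] = (uint8_t) ( u         & 0xFF);
  v->data[offset + 1] = (uint8_t) ((u >> 8)   & 0xFF);
  v->data[offset + 2] = (uint8_t) ((u >> 16)  & 0xFF);
  v->data[offset + 3] = (uint8_t) ((u >> 24)  & 0xFF);
}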

Next in the prologue, we set up and call memset:

  // address of beginning of tape
  0x48, 0x8D, 0x3C, 0x24, // leaq (%rsp), %rdi
  // fill with 0's
  0xBE, 0x00, 0x00, 0x00, 0x00, // movl $0, %esi
  // length 30,000 B
  0x48, 0xC7, 0xC2, 0x30, 0x75, 0x00, 0x00, // movq $30000, %rdx
  // memset
  0x41, 0xFF, 0xD4, // callq *%r12

After invoking memset, %rdi, %rsi, and %rdx will contain garbage values since they are “call clobbered” registers. At this point we no longer need memset, so we now use %r12 as our data pointer. %rsp will point to the top (technically the bottom) of the stack, which is the beginning of our memset’ed tape. That’s the end of our prologue.

  0x49, 0x89, 0xE4 // movq %rsp, %r12
};

We can then push our prologue into our dynamic array implementation:

vector_push(&instruction_stream, prologue, sizeof(prologue));

Now we iterate over our Brainfuck program and switch on the operations again. For pointer increment and decrement, we just nudge %r12.

case '>':
  {
    char opcodes [] = {
      0x49, 0xFF, 0xC4 // inc %r12
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;
case '<':
  {
    char opcodes [] = {
      0x49, 0xFF, 0xCC // dec %r12
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;

That extra block in each case of the switch statement is there because in C/C++ we can’t define (and initialize) variables directly under a case label without introducing a new scope; the braces give the opcodes array its own block.

Pointer deref then increment/decrement are equally uninspiring:

case '+':
  {
    char opcodes [] = {
      0x41, 0xFE, 0x04, 0x24 // incb (%r12)
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;
case '-':
  {
    char opcodes [] = {
      0x41, 0xFE, 0x0C, 0x24 // decb (%r12)
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;

I/O might sound interesting, but in x86-64 we simply have an opcode for calling the function whose address is held in a register. %r13 contains the address of putchar, while %r14 contains the address of getchar.

case '.':
  {
    char opcodes [] = {
      0x41, 0x0F, 0xB6, 0x3C, 0x24, // movzbl (%r12), %edi
      0x41, 0xFF, 0xD5 // callq *%r13
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;
case ',':
  {
    char opcodes [] = {
      0x41, 0xFF, 0xD6, // callq *%r14
      0x41, 0x88, 0x04, 0x24 // movb %al, (%r12)
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;

Now, with our looping constructs, we get to the fun part. With the compiler, we deferred the concept of “relocation” to the assembler: we simply emitted labels that the assembler turned into relative offsets (jumps by values relative to the last byte of the jump instruction). We’ve found ourselves in a Catch-22 though: how many bytes forward do we jump to a matching close bracket that we haven’t seen yet?

Normally, an assembler might have a data structure known as a “relocation table.” It keeps track of the first byte after a label and of jumps, rewriting jumps-to-labels (which aren’t kept around in the resulting binary executable) into relative jumps. Spidermonkey, Firefox’s JavaScript virtual machine, has two classes for this, MacroAssembler and Label. Spidermonkey embeds a linked list into the opcodes it generates for jumps for which it hasn’t yet seen a label. Once it finds the label, it walks the linked list (which itself is embedded in the emitted instruction stream), patching up these locations as it goes.

For Brainfuck, we don’t have to do anything quite as fancy, since each label only ever has one jump site. Instead, we can use a stack of integers that are offsets into our dynamic array, and do the relocation once we know exactly where we’re jumping to.

case '[':
  {
    char opcodes [] = {
      0x41, 0x80, 0x3C, 0x24, 0x00, // cmpb $0, (%r12)
      // Needs to be patched up
      0x0F, 0x84, 0x00, 0x00, 0x00, 0x00 // je <32 b relative offset, 2's complement, LE>
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  stack_push(&relocation_table, instruction_stream.size); // create a label after
  break;

First we push the compare and jump opcodes, but for now we leave the relative offset blank (four zero bytes); we will come back and patch it up later. Then we push the current length of the dynamic array, which just so happens to be the offset into the instruction stream of the next instruction.

All of the relocation magic happens in the case for the closing bracket.

case ']':
  {
    char opcodes [] = {
      0x41, 0x80, 0x3C, 0x24, 0x00, // cmpb $0, (%r12)
      // Needs to be patched up
      0x0F, 0x85, 0x00, 0x00, 0x00, 0x00 // jne <32 b relative offset, 2's complement, LE>
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  // ...

First, we push our comparison and jump instructions into the dynamic array. We actually know the relative offset we need to jump back to at this point, and thus don’t strictly need to push four empty bytes, but doing so makes the following math a little simpler, as we’re not done with this case yet.

  // ...
  stack_pop(&relocation_table, &relocation_site);
  relative_offset = instruction_stream.size - relocation_site;
  // ...

We pop the matching offset into the dynamic array (from the matching open bracket), and calculate the difference from the current size of the instruction stream to the matching offset to get our relative offset. What’s interesting is that this offset is equal in magnitude for the forward and backwards jumps that we now need to patch up. We simply go back in our instruction stream 4 B, and write that relative offset negated as a 32 b LE number (patching our backwards jump), then go back to the site of our forward jump minus 4 B and write that relative offset as a 32 b LE number (patching our forwards jump).

  // ...
  vector_write32LE(&instruction_stream, instruction_stream.size - 4, -relative_offset);
  vector_write32LE(&instruction_stream, relocation_site - 4, relative_offset);
  break;

Thus, when writing a JIT, one must worry about manual relocation. From the Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z “A relative offset (rel8, rel16, or rel32) is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed, 8-bit or 32-bit immediate value, which is added to the instruction pointer.”
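
To put numbers on that (made up purely for illustration): suppose the stream is 100 B long when we hit a [. After pushing its 11 B of cmpb/je opcodes, relocation_site is 111. If the loop body then adds 20 B and the matching ] pushes its own 11 B, the stream is 142 B long, so relative_offset is 142 - 111 = 31. Writing -31 into the jne’s displacement field (bytes 138 through 141) makes the backwards jump land at 142 + (-31) = 111, the start of the loop body, and writing +31 into the je’s displacement field (bytes 107 through 110) makes the forward jump land at 111 + 31 = 142, just past the loop.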

The last thing we push onto our instruction stream is clean up code in the epilogue.

char epilogue [] = {
  0x48, 0x81, 0xC4, 0x38, 0x75, 0x00, 0x00, // addq $30008, %rsp
  // restore callee saved registers
  0x41, 0x5E, // popq %r14
  0x41, 0x5D, // popq %r13
  0x41, 0x5C, // popq %r12
  0x5D, // popq %rbp
  0xC3 // ret
};
vector_push(&instruction_stream, epilogue, sizeof(epilogue));

A dynamic array of bytes isn’t really useful, so we need to create executable memory the size of the current instruction stream and copy all of the machine opcodes into it, cast it to a function pointer, call it, and finally clean up:

void* mem = mmap(NULL, instruction_stream.size, PROT_WRITE | PROT_EXEC,
  MAP_ANON | MAP_PRIVATE, -1, 0);
memcpy(mem, instruction_stream.data, instruction_stream.size);
void (*jitted_func) (fn_memset, fn_putchar, fn_getchar) = mem;
jitted_func(memset, putchar, getchar);
munmap(mem, instruction_stream.size);
vector_destroy(&instruction_stream);

Note: we could have used the instruction stream rewinding technique to move the addresses of memset, putchar, and getchar as 64 b immediate values into %r12-%r14, which would have simplified our JIT’d function’s type signature.

Compile that, and we now have a function that will JIT compile and execute Brainfuck in roughly 141 SLOC. And, we can make changes to our Brainfuck program and not have to recompile it like we did with the Brainfuck compiler.

Hopefully it’s becoming apparent how similarly an interpreter, a compiler, and a JIT behave. In the interpreter, we immediately execute some operation. In the compiler, we emit the equivalent text-based assembly instructions corresponding to what the interpreter would have done for the higher level language. In the JIT, we emit the binary opcodes into executable memory and manually perform relocation, where the binary opcodes are equivalent to the text-based assembly we might emit in the compiler. A production ready JIT would probably have macros for each operation the JIT would perform, so the code would look more like the compiler rather than raw arrays of bytes (though the preprocessor would translate those macros into just that). The entire process is basically disassembling C code with gobjdump -S -M suffix a.out and punching in hex like one would with a Gameshark.
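
As a sketch of what such a macro might look like (hypothetical, not from the linked sources):

// Hypothetical convenience macro: emit a fixed byte sequence into the
// instruction stream without writing out the array and vector_push call at
// every use site.
#define EMIT(stream, ...)                                \
  do {                                                   \
    char _opcodes [] = { __VA_ARGS__ };                  \
    vector_push((stream), _opcodes, sizeof(_opcodes));   \
  } while (0)

// Usage, equivalent to the '>' case above:
// EMIT(&instruction_stream, 0x49, 0xFF, 0xC4); // inc %r12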

Compare pointer incrementing from the three:

Interpreter:

case '>': ++ptr; break;

Compiler:

case '>':
  puts("  inc %r12");
  break;

JIT:

case '>':
  {
    char opcodes [] = {
      0x49, 0xFF, 0xC4 // inc %r12
    };
    vector_push(&instruction_stream, opcodes, sizeof(opcodes));
  }
  break;

Or compare the full sources of the interpreter, the compiler, and the JIT. Each, at ~100 lines of code, should be fairly easy to digest.

Let’s now examine the performance of these three. One of the longer running Brainfuck programs I can find is one that prints the Mandelbrot set as ASCII art to stdout.

Running the UNIX command time on the interpreter, compiled result, and the JIT, we should expect numbers similar to:

$ time ./interpreter ../samples/mandelbrot.b
43.54s user 0.03s system 99% cpu 43.581 total

$ ./compiler ../samples/mandelbrot.b > temp.s; ../assemble.sh temp.s; time ./a.out
3.24s user 0.01s system 99% cpu 3.254 total

$ time ./jit ../samples/mandelbrot.b
3.27s user 0.01s system 99% cpu 3.282 total

The interpreter is an order of magnitude slower than the compiled result or the JIT. Then again, the interpreter isn’t able to jump back and forth as efficiently as the compiler or JIT: it scans back and forth for matching brackets (O(N)), while the other two jump to where they need to go in a few instructions (O(1)). A production interpreter would probably translate the higher level language to a bytecode and thus be able to calculate the offsets used for jumps directly, rather than scanning back and forth.
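
As a sketch of that idea (names like program_length and jump_table are assumptions, not from the linked sources), a one-time pre-pass over the source could record each bracket’s match, turning the interpreter’s bracket cases into O(1) jumps:

// Pre-pass: for each '[' or ']' at index i, jump_table[i] holds the index of
// its matching bracket. Assumes well formed input, like the rest of the post.
long jump_table [program_length];
{
  long open_brackets [program_length]; // indices of unmatched '[' so far
  long depth = 0;
  for (long i = 0; i < program_length; ++i) {
    if (input[i] == '[') {
      open_brackets[depth++] = i;
    } else if (input[i] == ']') {
      long open = open_brackets[--depth];
      jump_table[open] = i;
      jump_table[i] = open;
    }
  }
}

// The interpreter's bracket cases then shrink to:
case '[': if (!(*ptr)) i = jump_table[i]; break;
case ']': if (*ptr)    i = jump_table[i]; break;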

The interpreter bounces back and forth between looking up an operation and then doing something based on it, then lookup, and so on. The compiler and JIT perform the translation first, then the execution, without interleaving the two.

The compiled result is the fastest, as expected, since it doesn’t have the overhead the JIT does of having to read the input file or build up the instructions to execute at runtime. The compiler has read and translated the input file ahead of time.

What if we take into account the time it takes to compile the source code, and run it?

$ time (./compiler ../samples/mandelbrot.b > temp.s; ../assemble.sh temp.s; ./a.out)
3.27s user 0.08s system 99% cpu 3.353 total

Including the time it takes to compile the code then run it, the compiled results are now slightly slower than the JIT (though I bet the multiple processes we start up are suspect), but with the JIT we pay the price to compile each and every time we run our code. With the compiler, we pay that tax once. When compilation time is cheap, as is the case with our Brainfuck compiler & JIT, it makes sense to prefer the JIT; it allows us to quickly make changes to our code and re-run it. When compilation is expensive, we might only want to pay the compilation tax once, especially if we plan on running the program repeatedly.

JITs are neat, but compared to compilers they can be more complex to implement. They also repeatedly re-parse input files and re-build instruction streams at runtime. Where they can shine is in bridging the gap for dynamically typed languages, where the runtime itself is much more dynamic and thus harder (if not impossible) to optimize ahead of time. Being able to jump into JIT’d native code from an interpreter and back gives you the best of both (interpreted and compiled) worlds.

Nathan Froyd: white space as unused advertising space

I picked up Matthew Crawford’s The World Beyond Your Head this weekend. The introduction, subtitled “Attention as a Cultural Problem”, opens with these words:

The idea of writing this book gained strength one day when I swiped my bank card to pay for groceries. I watched the screen intently, waiting for it to prompt me to do the next step. During the following seconds it became clear that some genius had realized that a person in this situation is a captive audience. During those intervals between swiping my card, confirming the amount, and entering my PIN, I was shown advertisements. The intervals themselves, which I had previously assumed were a mere artifact of the communication technology, now seemed to be something more deliberately calibrated. These haltings now served somebody’s interest.

I have had a similar experience: the gas station down the road from me has begun playing loud “news media” clips on the digital display of the gas pump while your car is being refueled, cleverly exploiting the driver as a captive audience. Despite this gas station being somewhat closer to my house and cheaper than the alternatives, I have not been back since I discovered this practice.

Crawford continues, describing how a recent airline trip bombarded him with advertisements in “unused” (“unmonetized”?) spaces: on the fold-down tray table in his airplane seat, the moving handrail on the escalator in the airport, on the key card (!) for his hotel room. The logic of filling up unused space reaches even to airport security:

But in the last few years, I have found I have to be careful at the far end of [going through airport security], because the bottoms of the gray trays that you place your items in for X-ray screening are now papered with advertisements, and their visual clutter makes it very easy to miss a pinky-sized flash memory stick against a picture of fanned-out L’Oréal lipstick colors…

Somehow L’Oréal has the Transportation Security Administration on its side. Who made the decision to pimp out the security trays with these advertisements? The answer, of course, is that Nobody decided on behalf of the public. Someone made a suggestion, and Nobody responded in the only way that seemed reasonable: here is an “inefficient” use of space that could instead be used to “inform” the public of “opportunities.” Justifications of this flavor are so much a part of the taken-for-granted field of public discourse that they may override our immediate experience and render it unintelligible to us. Our annoyance dissipates into vague impotence because we have no public language in which to articulate it, and we search instead for a diagnosis of ourselves: Why am I so angry? It may be time to adjust the meds.

Reading the introduction seemed especially pertinent in light of last week’s announcement about Suggested Tiles. The snippets in about:home featuring Mozilla properties or efforts, or even co-opting tiles on about:newtab for similar purposes, feel qualitatively different from using those same tiles for advertisements from third parties bound to Mozilla only through the exchange of money. I have at least consented to the former, I think, by downloading Firefox and participating in that ecosystem, similar to how Chrome might ask you to sign into your Google account when the browser opens. The same logic is decidedly not true in the advertising case.

People’s attention is a scarce resource. We should be treating it as such in Firefox, not by “informing” them of “opportunities” from third parties unrelated to Mozilla’s mission, even if we act with the utmost concern for their privacy. Like the gas station near my house, people are not going to come to Firefox because it shows them advertisements in “inefficiently” used space. At best, they will tolerate the advertisements (maybe even taking steps to turn them off); at worst, they’ll use a different web browser and they won’t come back.

Mike Hommey: Dogfooding Firefox GTK+3

Thanks to Lee Salzman, the state of GTK+3 support in Firefox got better. Unit tests went from looking like this:

To looking like this:

[screenshot: elm-after]

There’s obviously some work left to make those look even better, but we’ve come a long way.

Ludovic Hirlimann recently asked if there were builds to dogfood and that prompted me to attempt making the builds from the elm branch auto-update. Which, after several attempts, I managed to get working with gross (but small) hacks of the build system.

So here we are. If you want to dogfood GTK+3 Firefox, here is what you can do:

  • In a normal Linux nightly, go to about:config and create the following string preferences (right-click, New, String):
    • “app.update.url.override” with the value “https://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/elm-linux/latest/update.xml” for 32-bit builds, or “https://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/elm-linux64/latest/update.xml” for 64-bit builds,
    • “app.update.certs.3.issuerName” with the value “CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US”,
    • “app.update.certs.3.commonName” with the value “ftp.mozilla.org”.
  • Open the burger menu, click the “?” icon, then choose “About Nightly”. This should check for an update, find one, and download it. This will upgrade to a GTK+3 build.

Alternatively, you can just download and install the elm builds directly (32-bit, 64-bit).

If for some reason, you want to go back to a normal GTK+2 nightly, go to about:config, find the “app.update.url.override” preference and set it to an empty value. Triggering the update from “About Nightly” won’t, however, work until the next nightly is available, so give it a day.

As mentioned in my previous post about GTK+3, if you’re interested in making those builds work better, you are welcome to help:

Alex Gibson: UpFront Conference 2015

I took a trip to Manchester last week to attend UpFront Conf 2015. It was a full day of talks on web design and front-end development, and covered a range of topics including UI design, web performance, front-end testing, and typography. It even had a talk dedicated to games console browsers! The line-up of speakers delivered a good balance of both creative and technical talks, which made for an enjoyable day of brain-food. Here are my notes from the talks:

Brad Frost - Atomic design

Brad Frost started off the day by sharing his experience working with clients to design reusable, responsive UI components. Atomic Design focuses on building individual components instead of web pages. This helps to promote reusability, which in turn improves workflow, speeds up prototyping, and makes things more easily testable. Brad says we should be investing more time in keeping our style guides up to date, as well as developing common pattern libraries. These are things that often get treated as auxiliary aspects of a project, but if we prioritize our work using a clear methodology, we can work more efficiently in the long run.

Alica Sedlock - Jumping into front end testing

Alica Sedlock gave a useful high-level overview of front-end testing, which covered areas such as Unit tests, Integration tests, and Visual Regression testing. A lot of the usual suspects for unit and integration tests were covered, such as Jasmine, Mocha, Karma, and headless browsers such as PhantomJS. What I found particularly interesting (or new to me, anyway) was how to test for visual regressions. This can be done using tools such as PhantomCSS and CasperJS to take screenshots of before and after states, and then diff the changes. Pretty neat!

Soledad Penadés - The disconnected ensemble: Scattered clouds, underground

Mozilla’s own Soledad Penadés gave a really fun talk & tech demo, sharing some of her experiments exploring the concepts of a P2P Web, and how this could be applied to mobile platforms such as Firefox OS. Sole built some musical toys, using Web Audio API and Web Components, which could be run and shared over a local network. This showed off some pretty interesting tech, including how an app can run its own local web server and share addresses to clients using NFC. Very cool!

I’d also never met Sole in person before, despite working at the same company (in the same country, even), so it was nice to get to say hi and chat a bit in the break!

View Sole’s demo source code.

Dean Hume - Faster mobile websites

In the first talk of the day dedicated to web performance, Dean Hume gave an overview on why page loading speed is so important, and how we are often failing horribly at performance when it comes to mobile. Dean gave an insight into using the RAIL approach (Response, Animate, Idle, Load) to help improve overall page speed, covering techniques such as image optimization, critical CSS, responsive images, and more. The most high level goals are to reduce overall page weight, minimize the number of requests a page needs to make, and speed up the time to first paint.

View Dean’s slides.

Ben Foxall - The Internet of browsers

Ben Foxall demoed a fun interactive concept that played on the “Internet of Things”, and showed that technology is very much a part of our physical world. Ben used browser metadata from devices in the audience to produce a visualization which highlighted where people were sitting in relation to the speaker. This was accomplished using common Web APIs such as geolocation, orientation, proximity, ambient light, touch and sound.

View Ben’s interactive demo.

Richard Rutter - Web typography you could be doing now

Richard Rutter gave a talk on common typography best practices, and how best to apply them to web design. Being more developer-oriented, I found this quite insightful, and I also learnt a few new CSS properties that I never even knew existed (e.g. font-variant-numeric). Richard also made a great point that we should all be serving woff2 for our web fonts by now, which can save up to 30% in file size.

View Richard’s slides.

Anna Debenham - Games Console browsers

Anna Debenham gave a really interesting talk on the rise in use of games console browsers. Apparently 18% of people in the UK used a games console to log onto a social media web site in 2013. Another interesting group is 14-16 year olds, 20% of whom in the UK use a games console browser to access the internet (likely because they don’t yet have a mobile device with a data contract).

Anna went on to showcase the wide range of user inputs a games console can have, including gestures, voice commands, keypads, touch screens, and styluses to name just a few. Web browsers on these devices can also have very tight memory constraints, as well as pretty poor standards support. We need to try and optimize our web pages to be as light as possible, as well as make sure to always support common inputs such as keyboard, and use focus styles more effectively.

Yesenia Perez-Cruz - Design decisions through the lens of performance

Yesenia Perez-Cruz gave the second talk to focus on web performance, which was also the final talk of the day. I thought this one was particularly good, as Yesenia shared how she works as a web designer to make informed decisions that won’t negatively impact a web page, taking into consideration factors like overall page weight, loading time and number of requests. This is something that designers often don’t prioritize, or that they consider to be something only a developer needs to worry about.

In Yesenia’s experience, slow and heavy sites are often a result of poor planning, communication, and awareness. Traditional waterfall processes often result in optimization being only an afterthought, or something that the developer needs to try and squeeze in toward the end of a project. It shouldn’t be this way. As a designer, Yesenia asks questions like, “How many requests will a carousel add?”, “How will performance be affected if we add another font weight?”, “Do we really need that parallax background?” Her suggestion to help weigh up these questions is to establish a web performance budget for any given page, and to make performance an overall project goal from the outset. Performance should be considered a design feature, not just a side effect of development.

Mozilla Release Management Team: Firefox 38.0.5b3 to 38.0.5 RC

Ready for the release! This RC was mainly about fixing the last pocket bugs and some stability fixes.

  • 22 changesets
  • 47 files changed
  • 301 insertions
  • 191 deletions

Extension    Occurrences
html         13
cpp          6
js           3
sh           2
properties   2
ini          2
py           1
mn           1
json         1
jsm          1
java         1
h            1

Module    Occurrences
dom       16
mobile    15
browser   7
toolkit   2
testing   2
js        2
gfx       2
layout    1

List of changesets:

Jean-Yves Avenard: Bug 1154881 - Disable test. r=karlt, a=test-only - 573c47bc1bf2
Nick Alexander: Bug 1151619 - Add Adjust SDK license. r=gerv, a=NPOTB - 62e7fffff542
Ryan VanderMeulen: Bug 1164866 - Bump mozharness.json to rev 6f91445be987. a=test-only - f2ef3e1dadaf
Jared Wein: Bug 1166240 - Add pocket.svg to aero section of toolkit's windows/jar.mn. r=Gijs, a=gavin - 58d8fb9fc5e3
James Willcox: Bug 1163841 - Always call eglInitialize(), but kill the preloading hack (which was crashing before). r=nchen, a=sledru - daa1f205525a
Benjamin Chen: Bug 1149842 - Release the mutex for NS_OpenAnonymousTemporaryFile to prevent the deadlock. r=roc, a=sledru - 06bdddc6463d
Chris Manchester: Bug 978846 - Add a file to the tree to tell mozharness what arguments from try are acceptable to pass on to the harness process. r=ahal, a=test-only - cda517b321ee
Alexandre Lissy: Bug 960762 - Fix intermittence of Notification mochitests. r=mhenretty, a=test-only - fe2c942655ec
Aaron Klotz: Bug 1158761 - Part 1: Make CheckPluginStopEvent run asynchronously. r=bholley, a=sledru - c163f5453215
Aaron Klotz: Bug 1158761 - Part 2: Update checks for plugin stop event in tests. r=jimm, a=sledru - aa884d29e93c
tbirdbld: Automated checkin: version bump for thunderbird 38.0b6 release. DONTBUILD CLOSED TREE a=release - 7f925ad5b331
Justin Dolske: Bug 1164649 - More late string changes in Pocket. r=jaws a=Sylvestre - 36b60a224d01
Geoff Brown: Bug 1073761 - Increase timeout for test_value_storage. r=dholbert, a=test-only - 1266331d5bc7
Kyle Machulis: Bug 1166870 - Fix permissions on settings event tests. a=test-only - 9e473441cbd9
Albert Crespell: Bug 849642 - Intermittent test_networkstats_enabled_perm.html. r=ettseng, a=test-only - bee6825f6c92
Albert Crespell: Bug 958689 - Fix intermittent errors in networkstats tests. r=ettseng, a=test-only - ad098fdd6f81
Milan Sreckovic: Bug 1156058 - Null pointer check. r=jgilbert, a=sledru - 013da2859c88
Nicholas Nethercote: Bug 1103375 - Fix some crashes triggered from about:memory. r=mrbkap, a=sledru - b90caf52b6e2
Gijs Kruitbosch: Bug 1166771 - Force isArticle to false on pushstate on non-article pages. r=margaret, a=sledru - 17169e355c59
Jeff Muizelaar: Bug 1165732 - Block WARP when using the built-in VGA driver. r=bas, a=sledru - a297bd71b81a
Gijs Kruitbosch: Bug 1167096 - Flip introductory prefs if there's no saved state. r=jaws, a=sledru - 3ef925962765
Nick Thomas: Backout rev 27bacb9dff64 to make mozilla-release ready to do release builds again, ra=release DONTBUILD - 79f9cd31b4b1

Christian Heilmann: The Ryanair approach to progressive enhancement

I fly – a lot. I spend more time in airports, in the air, hotel rooms and conferences than at home. As I am a natural recording and analysing device, I take in a lot of things on my travels. People at airports are stressed, confused, don’t pay attention to things, eat badly and are not always feeling good. They are tired, they feel rushed and they just want to get things over with and get where they want to go. Others – those new to travel – are overly excited about everything and want to do things right, making mistakes because they are too eager. Exactly what users on the web are like. I found that companies who use technology for the benefit of their users are the ones people love and support. That’s what progressive enhancement means to me. But let’s start at the beginning.

Getting somewhere by plane is pretty simple. You buy a ticket and you get a booking confirmation number, an airport you leave from, a time and a destination airport. To claim all this and get on the flight, you also need to prove that you are you. On domestic flights you can do this with the credit card you booked the flight with, a driving license or your passport. For international travel, the latter is always the safest option.

The main thing you have to fear about flying is delays that make you miss your plane. Delays can come from natural problems, technical failures with the plane or the airport, or issues with air traffic control. It is busy up in the blue yonder, as this gorgeous visualisation shows. Another big issue is getting to the airport in time, as all kinds of traffic problems can delay you.

You can’t do much about that – you just have to take it in stride. I plan 3 hours from my house to sitting on the plane.

Avoid the queue

One thing you want to avoid is queues. The longer the queue, the more likely you are to miss your plane. Every single person in that queue and their problems become yours.

Airport Queue. Photo by James Emery

Airlines understand that and over the years have put improvements in place that make it easier for you to get up in the air.

In essence, what you need to get in exchange for your information is a boarding pass. It is the proof that all is well and you are good to go.

Passport with boarding passes. Photo by mroach

The fool-proof way of doing that is having check-in counters. These have people with computers and you go there, tell them your information and you get your boarding pass. You can also drop off your luggage and you get up-to-date information from them on delays, gates and – if you are lucky – upgrades. Be nice to them – they have a tough job and they can mess up your travels if you give them a tough time.

Improvement: self check-in counters

Manned check-in counters are also the most time-consuming and expensive option. They also don’t scale to hundreds of customers – hence the queues.

Self check-in counters

The first step to improve this was self-check-in terminals. If you allow people to type in their booking confirmation and scan their passport, a machine can issue the boarding pass. You can then have a special check-in counter only for those who need to drop off luggage. Those without luggage move on to the next level without having to interact with a person behind the counter or take up a space in the queue. Those who don’t know how to use the machine, forgot some information or encounter a technical failure can still go to a manned check-in counter.

Improvement: mobile apps

Mobile boarding pass in app

Nowadays this is even better. We have online check-in that allows us to check in at home and print out our own boarding passes. As printer ink is expensive and boarding passes tend to be A4 and littered with ads, you can also use apps on smartphones.

Of course, every airline has their own app, and they all work differently and – at times – in mysterious ways. But let’s not dwell on that.

Apps are incredible – they show you when your flight happens, delays, and you don’t need to print out anything. You get this uplifting feeling that you’re part of a technical elite and that you know your stuff.

Mobile app offering an upgrade for 'null'

Of course, as soon as you go high-tech, things also break:

  • You can always run out of battery
  • Apps crash and need to have a connection to restart and refresh your booking content. That’s why a lot of people take screenshots of their boarding passes in the app.
  • You need to turn off your phone on planes, which means on changing to another plane you need to re-boot it, which takes time.
  • Some airports don’t have digital readers of QR codes or have access to priority lane only as a rubber stamp on a paper boarding pass (looking at you, SFO). That’s why you need a printout.
  • Staff checking your boarding pass at security and gate staff tend to wait for your phone display to go to sleep before trying to scan it. Then they ask you to enter your unlock code. There is probably some reason for that.
  • Some security lanes need you to keep your boarding pass with you but you can’t keep your phone on you as it needs to be X-Rayed. You see the problem…

Despite all that, you are still safe. When things go wrong, there are the fallbacks of the machines or the manned counter to go back to.

This is progressive enhancement

This is progressive enhancement.

  • You put things in place that work and you make it more convenient for your users who have technical abilities.
  • You analyse the task at hand and offer the most basic solution.
  • With this as a security blanket, you think of ways to improve the experience and distribute the load.

You make it easier for users who frequently use your product. That’s why I get access to fast-track security lanes and lounges. I get a reward for saving the company time and money and allowing them to cater to more users.

You almost never meet people in these lounges who have bad things to say about the airline. Of course they are stressed – everybody at an airport is – but there is a trust in the company they chose, and good experiences mean a good relationship. You can check in 24 hours before your flight and all you bring to the airport is your phone and your passport. If you fail to do so, or you feel like it, you can still go to the counter. You feel like James Bond or Tony Stark.

Forcing your users to upgrade

Then there is Ryanair and other budget airlines. You will be hard pushed to find anyone who loves them. The mood ranges from “meh, it is convenient, as I can afford it” to “necessary evil” and ends in “spawn of satan and bane of my existence”. Why is that?

Well, budget airlines try to save and make money wherever they can. They have less ground staff and fewer check-in counters. They have online check-in and expect you to bring a printout of your boarding pass. They have draconian measures when it comes to the size and weight of your luggage. They are less concerned about your available space on the plane, or happy to charge extra for it. Instead of using a service, it feels like you have to game it. You need to be on your toes, or you pay extra. You feel like you have to work for what you already paid for, and you feel not empowered but stupid when you forget the one thing the company requires you to have – things other airlines don’t bother with.

They also have apps, and pretty ones at that. When everything goes right, these are cool. Yet they come with silly limitations. These companies chose to offer apps so they could cut down on ground staff and check-in counters. The apps are not an improvement or a convenience; they have become a necessity.

The “let’s make you queue anyways” app experience

The other day I was in Italy, flying to Germany with Ryanair. I have no Italian data connection and roaming is expensive. I also had no wireless in the hotel or at the convention I attended. Ryanair allows me to check in online with a browser 24 hours before the flight. I couldn’t. Using the app is even more draconian: you can only check in two hours before the flight. If you remember, I add my 3 hour cushion for getting to the airport, which means that when I need to check in I am on the road – and in London that means I am underground without a connection.

I grumpily queued up at the hot, packed airport in a massive queue full of screaming kids and drunk tourists. Others were standing over half-unpacked luggage as their passports were missing. When I arrived at the counter, the clerk told me that I should have printed out my boarding pass or checked in with the app, and as I had failed to do so, I would now need to pay 45 Euro for him to print my boarding pass.

This was almost the price of the ticket. I told him that because of the 2 hour period and me not having connectivity, I couldn’t do that. All I got was “this is our policy”.

I ground my teeth and turned on my roaming data, trying to check in with the app. Instead of asking for my name and booking confirmation, it asked for all kinds of extra information. I guess the reason was that I hadn’t booked the ticket myself – someone had booked it for me. The necessary information included entering a lot of dates with a confusing date picker. In the end, I was one minute late and the app told me there was no way I could check in without going to a counter. I queued up again, and the clerk told me that I could not pay at his counter. Instead I needed to go to the other side of the airport to the ticketing counter, pay there, and bring back a printout showing that I had paid. Of course, there was another queue. Coming back, I ended up in yet another queue, this time for another flight. I barely made it to my plane.

Guess what my attitude towards future business with this airline is. Right – they have a bleak future with me.

Progressive enhancement is for the user and you benefit, too

And this is when you use progressive enhancement the wrong way. Yes, an app is an improvement over queuing up or printing out. But you shouldn’t add arbitrary rules or punish those who can’t use it. Progressive enhancement is for the benefit of the end user; we also benefit a lot from it. Unlike the physical world of airports, we can enhance without extra overhead. We don’t need to hire extra ground staff or put up hardware to read passports. All we need to do is analyse:

  • What is the basic information the user needs to provide to fulfill a task
  • What is the simplest interface to reach this
  • How can we improve the experience for more advanced users and those on more advanced hardware?

The latter is the main thing: you don’t rely on any of those. Instead you test if they can be applied and apply them as needed.

Progressive enhancement is not about adding more work to your product. It is about protecting the main use case of your product and then enhance it with new functionality as it becomes available. Google is a great example of that. Turn off JavaScript and you still get a form to enter information in and you get a search result page with ads on it. This is how you find things and Google makes money. Anything else they added over time makes it more convenient for you but is not needed. It also offers them more opportunities to show you more ads and point at other services.

Use progressive enhancement as a means to reward your users. Don’t expect them to do things for you just to use your product. If the tools you use mean your users have to have a “modern” browser and load a lot of script, you share your problems with them. You can only get away with that if you offer them a cheaper version of what others offer, but that’s a risky race to take part in. You can win their current business, but never their hearts or support. You become a necessary evil, not something they tell others about.

Andy McKay: Accept Header

Recently I used an endpoint that had the following HTTP Accept Header:

*/*; q=0.5, application/xml

This server is saying it will accept "application/xml", but if that's not available, then it will accept anything. The "q" indicates that "application/xml" is preferred.

The assumption is that if you send a piece of content, you will send an appropriate HTTP Content-Type Header. Then the server will know how to parse it.

This server wanted a token, just some string, echoed back to it. Given the HTTP headers, it seemed perfectly fine for me to send:

Content-Type: application/json
"some token"

Or even [1]:

Content-Type: application/xml
<?xml version="1.0" encoding="UTF-8"?><token>some token</token>

As it turned out, it only accepted one thing, and that's "text/plain", so something like:

Content-Type: text/plain
some token

This endpoint was hit through a web interface, which was hitting a remote server, so it took a little time to debug. But here's the thing: I think accepting */* is a little unusual, unless you really are going to accept anything – really, anything? Will you accept text/csv? How about audio/ogg or video/ogg? Here's one list of types [2].

The advantage of using a limited Accept header is that the server and client in question can figure things out without you having to write extra code. If the server explicitly declares what kinds of content it can accept, then the client can check that it can actually return the data encoded in that manner.

If your endpoint sends a */* Accept header and someone sends you something you don't know how to parse, hopefully you'll send them back a 406 response to tell the caller that you don't Accept that kind of content.

In this example I've been talking about a server API, but of course this applies to anything in the HTTP world. For example, my browser sends the following: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8. It would much rather have HTML or XHTML, but it will Accept anything and then do its best to render it. That seems reasonable for a browser.

But for an API that only accepts JSON or XML? Maybe we should make an effort to tighten up our Accept values on our APIs [3].

Footnotes

  1. Although it was never stated what the syntax for the XML was, so who knows.
  2. There isn't really a definitive list.
  3. Yes, that includes some APIs I've written.

Mike Conley: Things I’ve Learned This Week (May 18 – May 22, 2015)

You might have noticed that I had no “Things I’ve Learned This Week” post last week. Sorry about that – by the end of the week, I looked at my Evernote of “lessons from the week”, and it was empty. I’m certain I’d learned stuff, but I just failed to write it down. So I guess the lesson I learned last week was, always write down what you learn.

How to make your mozilla-central Mercurial clone work faster

I like Mercurial. I also like Git, but recently, I’ve gotten pretty used to Mercurial.

One complaint I hear over and over (and I’m guilty of it myself sometimes), is that “Mercurial is slow”. I’ve even experienced that slowness during some of my Joy of Coding episodes.

This past week, I was helping my awesome new intern get set up to tear into some e10s bugs, and at some point we went through this document to get her .hgrc all set up.

This document did not exist when I first started working with Mercurial – back then, I was using mq or sometimes pbranch, and grumbling about how I missed Git.

But there is some gold in this document.

gps has been doing some killer work documenting best practices with Mercurial, and this document is one of the results of his labour.

The part that’s really made the difference for me is the hgwatchman bit.

watchman is a tool that some folks at Facebook wrote to monitor changes in a folder. hgwatchman is an extension for Mercurial that takes advantage of watchman for a repository, smartly precomputing a bunch of stuff when the folder changes so that when you fire a command, like

hg status

It takes a fraction of the time it’d take without hgwatchman. A fraction.

Here’s how I set hgwatchman up on my MacBook (though you should probably go by the Mercurial for Mozillians doc as the official reference):

  1. Install watchman with brew:
    brew install watchman
  2. Clone the hgwatchman extension to some folder that you can easily remember and build it:
    hg clone https://bitbucket.org/facebook/hgwatchman
    cd hgwatchman
    make local
  3. Add the following lines to my user .hgrc:
    [extensions]
    hgwatchman = cloned-in-dir/hgwatchman/hgwatchman
  4. Make sure the extension is properly installed by running:
    hg help extensions
  5. hgwatchman should be listed under “enabled extensions”. If it didn’t work, keep in mind that you want to target the hgwatchman directory
  6. And then in my mozilla-central .hg/.hgrc:
    [watchman]
    mode = on
  7. Boom, you’re done!

Congratulations, hg should feel snappier now!
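
If you want to sanity-check the speedup yourself, a rough timing sketch (assuming Python 3 and that you run it from inside your clone) could look like this:

  # Rough timing sketch: run `hg status` a few times and print how long each
  # call takes. Assumes Python 3 and that hg is on your PATH.
  import subprocess
  import time

  for run in range(3):
      start = time.time()
      subprocess.run(["hg", "status"], check=True, stdout=subprocess.DEVNULL)
      print(f"hg status run {run + 1}: {time.time() - start:.2f}s")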

Next step is to try out this chg thing, though I'm having some issues still.

Mike ConleyThe Joy of Coding (Ep. 15): OS X Printing Returns

In Episode 15, we kept working on the same bug as the last two episodes – proxying the printing dialog on OS X to the parent process from the content process. At the end of Episode 14, we’d finished the serialization bits, and put in the infrastructure for deserialization. In this episode, we did the rest of the deserialization work.

And then we attempted to print a test page. And it worked!

We did it!

Then, we cleaned up the patches and posted them up for review. I had a lot of questions about my Objective-C++ stuff, specifically with regards to memory management (it seems as if some things in Objective-C++ are memory managed, and it’s not immediately obvious what that applies to). So I’ve requested review, and I hope to hear back from someone more experienced soon!

I also plugged a new show that’s starting up! If you’re a designer, and want to see how a designer at Mozilla does their work, you’ll love The Design Hour, by Ricardo Vazquez. His design chops are formidable, and he shows you exactly how he operates. It’s great!

Finally, I failed to mention that I’m on holiday next week, so I can’t stream live. I have, however, pre-recorded a shorter Episode 16, which should air at the right time slot next week. The show must go on!

Episode Agenda

References

Bug 1091112 – Print dialog doesn’t get focus automatically, if e10s is enabled – Notes

Air MozillaMozilla Balkans Meetup

Mozilla Balkans Meetup The Balkans Inter-Community meet-up 2015 will take place in Bucharest, Romania, on May 22-24th. Lead contributors from Balkan communities will be invited and sponsored by...

Robert O'Callahanrr Performance Update

It's been a while (March 2014 to be precise) since I gathered meaningful rr performance numbers. I'm preparing a talk for the TCE 2015 conference and as part of that I ran some new benchmarks with mozilla-central Firefox. It turned out that numbers had regressed --- unsurprisingly, since we don't have continuous performance tests for rr, and a lot has changed since March 2014. In particular, Firefox has evolved a lot, our tests have changed, we're using x86-64 now instead of x86-32, and rr has changed a lot. Over the last few days I studied the regressions and fixed a number of issues: in particular, during the transition to x86-64 some of the optimizations related to syscall-buffering were lost because we weren't patching some important syscall callsites and we weren't handling the recvfrom syscall, which is common in 64-bit Firefox. I also realized that in some cases we were flushing much more data from the syscallbuf to the trace file than we'd actually recorded in the buffer, massively bloating the traces, and fixed that.

There are still some regressions evident since last March. Octane overhead has increased significantly. Forcing Octane to run on a single core without rr shows a similar overhead; in particular that alone causes one test (Mandreel) to regress by a factor of 10! My guess is that Spidermonkey is using multiple cores much more aggressively than it did last year, and because it's carefully tuned for Octane, going back to a single core really hurts performance. Replay overhead on the HTML mochitests has also increased significantly; I think this is partly because we changed rr to disable syscall buffering on writes to standard output. This improves the debugging experience but it results in a lot more overhead during replay.

Overall though, I remain very happy with rr performance, especially recording performance, which is critical when you're trying to capture a test failure under rr. Replay performance is becoming more important since it impacts the debugging experience, especially reverse execution; but doing a lot of work to improve raw replay performance is low priority since I think there are projects that could provide a better improvement in the debugging experience for less work (e.g. the ability to take a checkpoint during a recording and start debugging from there, and implementing support for gdb's evaluate-in-target conditional breakpoints).

Air MozillaThe Joy of Coding (mconley livehacks on Firefox) - Episode 16

The Joy of Coding (mconley livehacks on Firefox) - Episode 16 mconley livehacks on real Firefox bugs while thinking aloud. Unscripted, unplanned, uncensored, and true to life, watch what a Firefox Desktop engineer does to close...

Mozilla Reps CommunityReps Weekly Call – May 21st 2015

Last Thursday we had our weekly call about the Reps program, where we talk about what’s going on in the program and what Reps have been doing during the last week.


Summary

  • Webmaker & Mozilla Learning Update.
  • Suggested Tiles for Firefox update.
  • Featured Events.
  • Help me with my project.
  • Whistler WorkWeek – Reimbursements
  • Mozilla Reps SEA (SouthEast Asia) Online Meetup

AirMozilla video

Detailed notes

Shoutouts to Alex Wafula, African Reps, @konstantina, @Ioana and @lshapiro

Webmaker & Mozilla Learning update

Michelle joined the call to talk about webmaker and Mozilla Learning projects.

Blog post about Webmaker changes.

Have a question? Ask on discourse.

Elio shared his experience of how they set up a Club and what it looks like. More about how to set up a club. We have also created a topic to share your experience or ask questions about clubs.

Suggested Tiles for Firefox update

Patrick joined the call to update Reps about suggested Tiles in Firefox.

This week it's going to be announced that Suggested Tiles are landing in beta, starting with US users.

Firefox will use the history locally to suggest interesting tiles for the user, and it's going to be super easy to opt out or hide tiles you are not interested in. Firefox is the one deciding which tiles to show, not the partners.

Reps can be involved with this project in two ways:

  • Suggesting community tiles.
  • Helping to curate relevant local content from partners.

Patrick will work with the Reps team to open this opportunity and to improve localization around this announcement and the technical details.

We have opened a discourse topic to ask any questions you might have.

Featured events

These are some events that have happened or are happening this week.

  • Mozilla Balkans: 22-25 May. More info on the wiki
  • Rust Release parties: 23rd (Pune, Bangalore)
  • Debian/Ubuntu community Conference: 23-24 (Milano).
  • Mozilla QA Bangladesh, Train the contributors: 26th (Dhaka).
  • Festival TIK 2015 – Bandung: 28-29 (Bandung, Indonesia).

Help me with my project!

Staff onboarding

@george would love to know if a few volunteers would be excited to help out with new staff onboarding.

This requires availability at 17:15 UTC on Mondays for a 15-minute presentation. The benefit is that we would provide public speaking/presentation training and coaching, and you would get to talk to new hires about the community and how awesome it is.

Business card generator

@helios needs help with the business card generator, which is written in nodejs.

Original generator idea.

Firefox e10s

@lshapiro needs help testing multiprocess (e10s) in Firefox Developer Edition and with add-ons.

http://arewee10syet.com

Whistler WorkWeek – Reimbursements

We are only one month away from the workweek, and there might be some questions about how to help volunteers who need reimbursements.

Mozillians will reach out to Reps for reimbursements, so help them as best you can to make the reimbursement process smooth.

There will be an event created on the Reps portal to use on the budget form; otherwise, add the Mozillians page URL (or Reps profile URL) as the event in the request form.

Contact your mentor if you have doubts about reimbursing without an event; for other questions, reach out to @franc.

Mozilla Reps SEA (SouthEast Asia) Online Meetup

The next online meetup of ReMo SEA will be on Friday, 22 May 2015, at 1200Z (UTC).

This is a monthly meet-up held by @bobreyes. Reps based in nearby countries (e.g. China [including Hong Kong], Taiwan, Japan and Korea) are also welcome to attend the online meetup; even people from Europe or the Americas are invited to join!

They will share more details once the meet-up is over.

Full raw notes.

Don’t forget to comment about this call on Discourse and we hope to see you next week!

Andy McKayCommon People

Twenty years ago (May 22nd 1995), Pulp released the single "Common People":

Unfortunately, that version is censored. At 2 mins 30 seconds the lyrics are:

You'll never watch your life slide out of view,
and dance and drink and screw
Because there's nothing else to do.

The version on youtube omits "and screw". This song has some great lyrics:

Smoke some fags and play some pool, pretend you never went to school.
But still you'll never get it right
'cos when you're laid in bed at night watching roaches climb the wall
If you call your Dad he could stop it all.

This was a defining song of the Britpop era and I can remember it clearly being part of an Oasis (Manchester) vs Pulp (Sheffield) rivalry.

Of course, you haven't made it until William Shatner covers it:

Of that Jarvis said:

In 2011 Jarvis Cocker praised the cover version: "I was very flattered by that because I was a massive Star Trek fan as a kid and so you know, Captain Kirk is singing my song! So that was amazing."

Wikipedia

Apparently the subject of the song is a lady who might have been named Danae, my wife's name.

There is a documentary about Pulp I haven't seen:

I'm now going to go listen to every Pulp song ever.

Robert O'CallahanBlinkOn 4

Last week I went to BlinkOn 4 in Sydney, having been invited by a Google developer. It was a lot of fun and I'm glad I was able to go. A few impressions:

It was good to hear talk about acting responsibly for the Web platform. My views about Google are a matter of public record, but the Blink developers I talked to have good intentions.

The talks were generally good, but there wasn't as much audience interaction as I'd expected. In my experience interaction makes most talks a lot better, and the BlinkOn environment is well-suited to interaction, so I'd encourage BlinkOn speakers and audiences to be a bit more interactive next time. I admit I didn't ask as many questions during talks as I usually do, because I felt the time belonged to actual Blink developers.

Blink project leaders felt that there wasn't enough long-term code ownership, so they formed subteams to own specific areas. It's a tricky balance between strong ownership, agile migration to areas of need, and giving people the flexibility to work on what excites them. I think Mozilla has a good balance right now.

The Blink event scheduling work is probably the only engine work I saw at BlinkOn that I thought was really important and that we're not currently working on in Gecko. We need to get going on that.

Another nice thing that Blink has that Gecko needs is the ability to do A/B performance testing on users in the field, i.e. switch on a new code path for N% of users and see how that affects performance telemetry.
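
The bucketing half of that is simple enough to sketch; something like the following hypothetical illustration (not Blink's or Gecko's actual mechanism) puts N% of clients on the new code path, after which you compare their telemetry against the rest:

  # Hypothetical illustration of field A/B bucketing (not Blink's or Gecko's
  # actual mechanism): hash a stable client id into a bucket in [0, 100) and
  # enable the new code path for the first `percent` of clients.
  import hashlib

  def in_experiment(client_id: str, experiment: str, percent: float) -> bool:
      digest = hashlib.sha256(f"{experiment}:{client_id}".encode()).hexdigest()
      bucket = (int(digest[:8], 16) % 10000) / 100.0  # 0.00 .. 99.99
      return bucket < percent

  # Example: enable the new path for 5% of clients.
  print(in_experiment("client-1234", "new-scheduler-path", 5.0))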

On the other hand, we're doing some cool stuff that Blink doesn't have people working on --- e.g. image downscaling during decode, and compositor-driven video frame selection.

I spent a lot of time talking to Google staff working on the Blink "slimming paint" project. Their design is similar to some of what Gecko does, so I had information for them, but I also learned a fair bit by talking to their people. I think their design can be improved on, but we'll have to see about that.

Perhaps the best part of the conference was swapping war stories, realizing that we all struggle with basically the same set of problems, and remembering that the grass is definitely not all green on anyone's side of the fence. For example, Blink struggles with flaky tests just as we do, and deals with them the same way (by disabling them!).

It would be cool to have a browser implementors' workshop after some TPAC; a venue to swap war stories and share knowledge about how to implement all the specs we agreed on at TPAC :-).

Monica ChewFirefox 32 supports Public Key Pinning

Public Key Pinning helps ensure that people are connecting to the sites they intend. Pinning allows site operators to specify which certificate authorities (CAs) issue valid certificates for them, rather than accepting any one of the hundreds of built-in root certificates that ship with Firefox. If any certificate in the verified certificate chain corresponds to one of the known good certificates, Firefox displays the lock icon as normal.
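
In rough pseudocode the check looks something like this (a simplified sketch, not Firefox's actual implementation, using SHA-256 hashes of each certificate's public key info as the "known good" identifiers):

  # Simplified sketch of the pin check described above (not Firefox's actual
  # implementation): accept the chain if the SHA-256 hash of any certificate's
  # SubjectPublicKeyInfo matches one of the pins configured for the site.
  import hashlib

  def chain_matches_pins(chain_spki_der: list, pinned_sha256_hex: set) -> bool:
      for spki in chain_spki_der:  # DER-encoded SubjectPublicKeyInfo per cert
          if hashlib.sha256(spki).hexdigest() in pinned_sha256_hex:
              return True
      return False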

Pinning helps protect users from man-in-the-middle attacks and rogue certificate authorities. When the root cert for a pinned site does not match one of the known good CAs, Firefox will reject the connection with a pinning error. This type of error can also occur if a CA mis-issues a certificate.

Pinning errors can be transient. For example, if a person is signing into WiFi, they may see an error like the one below when visiting a pinned site. The error should disappear if the person reloads after the WiFi access is set up.

Firefox 32 and above support built-in pins, which means that the list of acceptable certificate authorities must be set at time of build for each pinned domain. Pinning is enforced by default. Sites may advertise their support for pinning with the Public Key Pinning Extension for HTTP, which we hope to implement soon. Pinned domains include addons.mozilla.org and Twitter in Firefox 32, and Google domains in Firefox 33, with more domains to come. That means that Firefox users can visit Mozilla, Twitter and Google domains more safely. For the full list of pinned domains and rollout status, please see the Public Key Pinning wiki.
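
For the curious, a site opting in via the HTTP extension would advertise its pins in a Public-Key-Pins response header. Here is a minimal sketch (assuming Flask and placeholder pin values; illustrative only, not Firefox or Mozilla code) of serving such a header:

  # Minimal sketch (assuming Flask; the pin-sha256 values are placeholders)
  # of the kind of Public-Key-Pins response header a pinned site would send.
  from flask import Flask

  app = Flask(__name__)

  @app.after_request
  def add_hpkp_header(response):
      response.headers["Public-Key-Pins"] = (
          'pin-sha256="PLACEHOLDERbase64HashOfCurrentKey000000000000="; '
          'pin-sha256="PLACEHOLDERbase64HashOfBackupKey1111111111111="; '
          "max-age=5184000; includeSubDomains"
      )
      return response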

Thanks to Camilo Viecco for the initial implementation and David Keeler for many reviews!

Air MozillaGerman speaking community bi-weekly meeting

German speaking community bi-weekly meeting Bi-weekly meeting of the German-speaking community.

Mozilla WebDev CommunityBeer and Tell – May 2015

Once a month, web developers from across the Mozilla Project get together to organize our political lobbying group, Web Developers Against Reality. In between sessions with titles like “Three Dimensions: The Last Great Lie” and “You Aren’t Real, Start Acting Like It”, we find time to talk about our side projects and drink, an occurrence we like to call “Beer and Tell”.

There’s a wiki page available with a list of the presenters, as well as links to their presentation materials. There’s also a recording available courtesy of Air Mozilla.

Groovecoder: WellHub

Groovecoder stopped by to share WellHub, a site for storing and visualizing log data from wells. The site was created for StartupWeekend Tulsa, and uses WebGL (via ThreeJS) + WebVR to allow for visualization of the wells based on their longitude/latitude and altitude using an Oculus Rift or similar virtual reality headset.

Osmose: Refract

Next up was Osmose (that's me!), who shared some updates to Refract, a webpage previously shown in Beer and Tell that turns any webpage into an installable application. The main change this month was adding support for generating Chrome Apps in addition to the Open Web Apps that it already supported.


This month’s session was a productive one, up until a pro-reality plant asked why we were having a real-life meetup for an anti-reality group, at which point most of the people in attendance began to scream uncontrollably.

If you’re interested in attending the next Beer and Tell, sign up for the dev-webdev@lists.mozilla.org mailing list. An email is sent out a week beforehand with connection details. You could even add yourself to the wiki and show off your side-project!

See you next month!