Planet Mozilla Automation

May 04, 2016

David Burns

GeckoDriver (Marionette) Release v0.7.1

I have just released a new version of Marionette, or rather of the executable that you need to download.

The main fix in this release is the ability to send over a custom profile that will be used. To use the custom profile you will need to set the marionette:true capability and pass in a profile when you instantiate your FirefoxDriver.
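With the Python bindings, that capability setup looks roughly like the sketch below. Only the capabilities dict runs as-is; the Selenium calls are commented out because they need a local Firefox and the executable, the profile path is a placeholder, and the keyword names assume the Selenium 2.x-era API:

```python
# Capabilities needed so the session is driven by Marionette rather than
# the old extension-based FirefoxDriver.
caps = {
    "browserName": "firefox",
    "marionette": True,  # required for this release of the executable
}

# With the (assumed) Selenium 2.x-era Python bindings, passing a custom
# profile would then look like:
#
#   from selenium import webdriver
#   profile = webdriver.FirefoxProfile("/path/to/custom/profile")
#   driver = webdriver.Firefox(capabilities=caps, firefox_profile=profile)
#   driver.get("")
#   driver.quit()
```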

We have also fixed a number of minor issues like IPv6 support and compiler warnings.

We have also moved the repository where our executable is developed to live under the Mozilla Organization. It is now called GeckoDriver. We will be updating its name in Selenium and in documentation over the next few weeks.

Since you are awesome early adopters, it would be great if you could raise bugs.

I am not expecting everything to work, but below is a quick list of things that I know don't work.

Switching of Frames needs to be done with either a WebElement or an index. Windows can only be switched by window handles.

If in doubt, raise bugs!

Thanks for being an early adopter and thanks for raising bugs as you find them!

May 04, 2016 11:00 AM

May 03, 2016

Armen Zambrano G. (@armenzg)

Replay Pulse messages

If you know what Pulse is and you would like to write some integration tests for an app that consumes its messages, pulse_replay might make your life a bit easier.

You can learn more about it by reading this quick

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

May 03, 2016 07:52 PM

May 02, 2016

Armen Zambrano G. (@armenzg)

Open Platform Operations’ logo design

Last year, the Platform Operations organization was born and it brought together multiple teams across Mozilla which empower development with tools and processes.

This year, we've decided to create a logo that identifies us as an organization and builds our self-identity.

We've filed this issue for a logo design [1] and we would like to have a call for any community members to propose their designs. We would like to have all applications in by May 13th. Soon after that, we will figure out a way to narrow it down to one logo! (details to be determined).

We would also like to thank whoever made the logo which we pick at the end (details also to be determined).

Looking forward to collaborating with you and seeing what we create!


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

May 02, 2016 05:45 PM

Maja Frydrychowicz

Not Testing a Firefox Build (Generic Tasks in TaskCluster)

A few months ago I wrote about my tentative setup of a TaskCluster task that was neither a build nor a test. Since then, gps has implemented “generic” in-tree tasks so I adapted my initial work to take advantage of that.

Triggered by file changes

All along I wanted to run some in-tree tests without having them wait around for a Firefox build or any other dependencies they don’t need. So I originally implemented this task as a “build” so that it would get scheduled for every incoming changeset in Mozilla’s repositories.

But forget “builds”, forget “tests” — now there’s a third category of tasks that we’ll call “generic” and it’s exactly what I need.

In base_jobs.yml I say, “hey, here’s a new task called marionette-harness — run it whenever there’s a change under (branch)/testing/marionette/harness”. Of course, I can also just trigger the task with try syntax like try: -p linux64_tc -j marionette-harness -u none -t none.
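The file-pattern trigger amounts to a prefix check over the files changed by a push; a minimal sketch (the function name is mine, not the in-tree code):

```python
def triggers_marionette_harness(changed_files):
    """Return True if any changed file lives under the directory that
    base_jobs.yml watches for the marionette-harness task."""
    prefix = "testing/marionette/harness/"
    return any(path.startswith(prefix) for path in changed_files)

# A push touching the harness schedules the task; unrelated pushes don't.
```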

When the task is triggered, a chain of events follows:

For Tasks that Make Sense in a gecko Source Checkout

As you can see, I made the script in the desktop-build docker image execute an arbitrary in-tree JOB_SCRIPT, and I created to run mozharness within a gecko source checkout.

Why not the desktop-test image?

But we can also run arbitrary mozharness scripts thanks to the configuration in the desktop-test docker image! Yes, and all of that configuration is geared toward testing a Firefox binary, which implies downloading tools that my task either doesn’t need or already has access to in the source tree. Now we have a lighter-weight option for executing tests that don’t exercise Firefox.

Why not mach?

In my lazy work-in-progress, I had originally executed the Marionette harness tests via a simple call to mach, yet now I have this crazy chain of shell scripts that leads all the way to mozharness. The mach command didn’t disappear — you can run Marionette harness tests with ./mach python-test .... However, mozharness provides clearer control of Python dependencies, appropriate handling of return codes to report test results to Treeherder, and it lets me write a job-specific script and configuration.

May 02, 2016 04:00 AM

April 22, 2016

Mark Côté

How MozReview helps

A great post on code review is making its rounds. It’s started some discussion amongst Mozillians, and it got me thinking about how MozReview helps with the author’s points. It’s particularly interesting because apparently Twitter uses Review Board for code reviews, which is a core element of the whole MozReview system.

The author notes that it’s very important for reviewers to know what reviews are waiting on them, but also that Review Board itself doesn’t do a good job of this. MozReview fixes this problem by piggybacking on Bugzilla’s review flags, which have a number of features built around them: indicators, dashboards, notification emails, and reminder emails. People can even subscribe to the reminders for other reviewers; this is a way managers can ensure that their teams are responding promptly to review requests. We’ve also toyed around with the idea of using push notifications to notify people currently using Bugzilla that they have a new request (also relevant to the section on being “interrupt-driven”).

On the submitter side, MozReview’s core support for microcommits—a feature we built on top of Review Board, within our extensions—helps “keep reviews as small as possible”. While it’s impossible to enforce small commits within a tool, we’ve tried to make it as painless as possible to split up work into a series of small changes.

The MozReview team has made progress on automated static analysis (linters and the like), which helps submitters verify that their commits follow stylistic rules and other such conventions. It will also shorten review time, as the reviewer will not have to spend time pointing out these issues; when the review bots have given their r+s, the reviewer will be able to focus solely on the logic. As we continue to grow the MozReview team, we’ll be devoting some time to finishing up this feature.

April 22, 2016 03:39 PM

Armen Zambrano G. (@armenzg)

The Joy of Automation

This post is to announce The Joy of Automation YouTube channel. On this channel you should be able to watch presentations about automation work by Mozilla's Platform Operations. I hope folks other than me will want to share their videos here as well.

This follows the idea that mconley started with The Joy of Coding and his livehacks.
At the moment there are only "Unscripted" videos of me hacking away. I hope one day to do live hacks, but for now they're offline videos.

Mistakes I made, which any Platform Ops member wanting to contribute may want to avoid:

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 22, 2016 02:26 PM

David Burns

Selenium WebDriver and Firefox 46

As of Firefox 46, the extension-based FirefoxDriver will no longer work. This is because of the new add-on policy that Mozilla is enforcing to try to help protect end users from installers inserting add-ons that are not what the user wants. This version is due for release next week.

This does not mean that your tests need to stop working entirely as there are options to keep them working.


Firstly, you can use Marionette, the Mozilla version of FirefoxDriver, to drive Firefox. It has been in Firefox since about version 24, and we have slowly been bringing it up to the Selenium level as Mozilla priorities allowed. Currently Marionette is passing ~85% of the Selenium test suite.

I have written up some documentation on how to use Marionette on MDN

I am not expecting everything to work, but below is a quick list of things that I know don't work.

It would be great if you could raise bugs.

Firefox 45 ESR

If you don't want to worry about Marionette, the other option is to downgrade to Firefox 45, preferably the ESR, as it won't update to 46; it will update to Firefox 52 in about 6-9 months' time, at which point you will need to use Marionette.

Marionette will be turned on by default from Selenium 3, which is currently being worked on by the Selenium community. Ideally when Firefox 52 comes around you will just update to Selenium 3 and, fingers crossed, all works as planned.

April 22, 2016 10:54 AM

April 17, 2016

Armen Zambrano G. (@armenzg)

Project definition: Give Treeherder the ability to schedule TaskCluster jobs

This is a project definition that I put up for GSoC 2016. This helps students to get started researching the project.

The main things I give in here are:

NOTE: This project has a few parts that carry risk and could change the implementation. It depends on close collaboration with dustin.

Mentor: armenzg 
IRC:   #ateam channel

Give Treeherder the ability to schedule TaskCluster jobs

This work will enable "adding new jobs" on Treeherder to work with pushes lacking TaskCluster jobs (our new continuous integration system).
Read this blog post to know how the project was built for Buildbot jobs (our old continuous integration system).

The main work for this project is tracked in bug 1254325.

In order for this to work we need the following pieces:

A - Generate data source with all possible tasks

B - Teach Treeherder to use the artifact

C - Teach pulse_actions to listen for requests from Treeherder

  • pulse_actions is a pulse listener of Treeherder actions
  • You can see pulse_actions’ workflow in here
  • Once part B is completed, we will be able to listen for messages requesting certain TaskCluster tasks to be scheduled and we will schedule those tasks on behalf of the user
  • RISK: Depending on whether the TaskCluster actions project is completed on time, we might instead make POST requests to an API

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 17, 2016 04:01 PM

Project definition: SETA re-write

As an attempt to attract candidates to GSoC I wanted to make sure that the possible projects were achievable rather than leading them down a path of pain and struggle. It also helps me picture the order in which it makes most sense to accomplish things.

It was also a good exercise for students, who had to read the definition and ask questions about what was not clear, and it gave them lots to read about the project.

I want to share this and another project definition in case it is useful for others.

We want to rewrite SETA to be easy to deploy through Heroku and to support TaskCluster (our new continuous integration system) [0].

Please read this document carefully before starting to ask questions. There is high interest in this project and it is burdensome to have to re-explain it to every new prospective student.

Main mentor: armenzg (#ateam)
Co-mentor: jmaher (#ateam)

Please read jmaher’s blog post carefully [1] before reading anymore.

Now that you have read jmaher’s blog post, I will briefly go into some specifics.
SETA reduces the number of jobs that get scheduled on a developer’s push.
A job is every single letter you see on Treeherder. For every developer’s push there is a number of these jobs scheduled.
On every push, Buildbot [6] decides what to schedule depending on the data that it fetched from SETA [7].

The purpose of this project is two-fold:
  1. Write SETA as an independent project that is:
    1. maintainable
    2. more reliable
    3. automatically deployed through Heroku app
  2. Support TaskCluster, our new CI (continuous integration system)

NOTE: The current code of SETA [2] lives within a repository called ouija.

Ouija does the following for SETA:
  1. It has a cronjob which kicks in every 12 hours to scrape information about jobs from every push
  2. It takes the information about jobs (which it grabs from Treeherder) into a database

SETA then queries the database to determine which jobs should be scheduled. SETA chooses jobs that are good at reporting issues introduced by developers. SETA has its own set of tables and adds the data there for quick reference.

Involved pieces for this project:
  1. Get familiar with deploying apps and using databases in Heroku
  2. Host SETA in Heroku instead of
  3. Teach SETA about TaskCluster
  4. Change the gecko decision task to reliably use SETA [5][6]
    1. If the SETA service is not available we should fall back to run all tasks/jobs
  5. Document how SETA works and auto-deployments of docs and Heroku
    1. Write automatically generated documentation
    2. Add auto-deployments to Heroku and readthedocs
  6. Add tests for SETA
    1. Add tox/travis support for tests and flake8
  7. Re-write SETA using ActiveData [3] instead of using data collected by Ouija
  8. Make the current CI (Buildbot) use the new SETA Heroku service
  9. Create SETA data for per test information instead of per job information (stretch goal)
    1. On Treeherder we have jobs that contain tests
    2. Tests re-order between those different chunks
    3. We want to run jobs at a per-directory level or per-manifest
  10. Add priorities into SETA data (stretch goal)
    1. Priority 1 jobs get triggered on every push
    2. Priority 2 jobs get triggered on every Y pushes
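
The decision logic, including the fallback from item 4.1, can be sketched like this; fetch_low_value_jobs is a hypothetical callable standing in for the request to the SETA Heroku service:

```python
def jobs_to_schedule(all_jobs, fetch_low_value_jobs):
    """Drop the jobs SETA marks as low value for this push; if the SETA
    service is unavailable, fall back to scheduling everything."""
    try:
        low_value = set(fetch_low_value_jobs())
    except Exception:
        # SETA is down or unreachable: run all tasks/jobs (item 4.1)
        return list(all_jobs)
    return [job for job in all_jobs if job not in low_value]
```

For example, with a fetcher that reports only one low-value job, every other job still gets scheduled; with a fetcher that raises, the full job list comes back untouched.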

[6] testing/taskcluster/

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 17, 2016 03:54 PM

April 13, 2016

Armen Zambrano G. (@armenzg)

Improving how I write Python tests

The main focus of this post is what I've learned about writing Python tests: using mocks and patching functions properly. This is not an exhaustive post.

What I'm writing now is something I should have learned many years ago as a Python developer. It can be embarrassing to admit; however, I've decided to share this with you since I know it would have helped me earlier in my career and I hope it might help you as well.

Somebody has probably written about this topic and if you're aware of a good blog post covering this similar topic please let me know. I would like to see what else I've missed.

Also, if you want to start a Python project from scratch or to improve your current one, I suggest you read "Open Sourcing a Python Project the Right Way". Many of the things he mentions are what I follow for mozci.

This post might also be useful for new contributors trying to write tests for your project.

My takeaway

These are some of the things I've learned

  1. Make running tests easy
    • We use tox to help us create a Python virtual environment, install the dependencies for the project and to execute the tests
    • Here's the tox.ini I use for mozci
  2. If you use py.test learn how to not capture the output
    • Use the -s flag to not capture the output
    • If your project does not print but instead uses logging, add the pytest-capturelog plugin to py.test and the log output will be shown immediately
  3. If you use py.test learn how to jump into the debugger upon failures
    • Use --pdb to drop into the Python debugger upon failure
  4. Learn how to use @patch and Mock properly

How I write tests

This is what I do:

@patch properly and use Mocks

What I'm doing now to patch modules is the following:
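The snippet from the original post is not reproduced here, so below is a minimal sketch of the pattern with invented names (TreeherderClient is illustrative, not mozci code): mock out the collaborator with patch.object so the test exercises the real logic without hitting the network.

```python
from unittest.mock import patch

class TreeherderClient:
    """Toy stand-in for a class that talks to a remote API."""

    def query_api(self, url):
        raise RuntimeError("real network call -- tests must never get here")

    def latest_revision(self, repo):
        data = self.query_api("https://treeherder.example/%s" % repo)
        return data["revision"]

# Patch the attribute on the class for the duration of the block only;
# latest_revision's real logic still runs, but query_api is a Mock.
with patch.object(TreeherderClient, "query_api",
                  return_value={"revision": "abc123"}) as mocked:
    client = TreeherderClient()
    assert client.latest_revision("mozilla-central") == "abc123"
    mocked.assert_called_once_with("https://treeherder.example/mozilla-central")
```

The key habit is to patch where the name is looked up by the code under test, and to assert both on the return value and on how the mocked collaborator was called.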

The way Mozilla CI tools is designed begs for integration tests; however, I don't think it is worth going beyond unit testing + mocking. The reason is that mozci might not stick around once we have fully migrated away from Buildbot, which was the hard part to solve.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 13, 2016 07:26 PM

April 12, 2016

Armen Zambrano G. (@armenzg)

mozci-trigger now installs with pip install mozci-scripts

If you use mozci from the command line this applies to you; otherwise, carry on! :)

In order to use mozci from the command line you now have to install with this:
pip install mozci-scripts
instead of:
pip install mozci

This helps to maintain the scripts separately from the core library since we can control which version of mozci the scripts use.

All scripts now live under the scripts/ directory instead of in the library:

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 12, 2016 07:32 PM

March 31, 2016

Geoff Brown

Firefox for Android Performance Measures – Q1 Check-up


APK Size

You can see the size of every build on treeherder using Perfherder.

Here’s how the APK size changed over the quarter, for mozilla-central Android 4.0 API15+ opt builds:


The dramatic decrease in February was caused by bug 1233799, which enabled the download content service and removed fonts from the APK.

For the same period, generally increased in size:


The recent decrease in libxul was caused by bug 1259521, an upgrade of the Android NDK.


This quarter we began tracking some memory metrics, using test_awsy_lite.


These memory measurements are generally steady over the quarter, with some small improvements.


This section tracks Perfherder graphs for mozilla-central builds of Firefox for Android, for Talos tests run on Autophone, on android-6-0-armv8-api15. The test names shown are those used on treeherder. See for background on Talos.

In previous quarters, these tests were running on Pandaboards; beginning this quarter, these tests run on actual phones via Autophone.


An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.


Generic page load test. Lower values are better.


No significant improvements or regressions noted for tsvgx or tp4m.


Throbber Start / Throbber Stop

These graphs are taken from  Browser startup performance is measured on real phones (a variety of popular devices).



There was a lot of work on Autophone this quarter, with new devices added and old devices retired or re-purposed. These graphs show devices running mozilla-central builds, of which none were in continuous use over the quarter.

Throbber Start/Stop test regressions are tracked by bug 953342; a recent regression in throbber start is under investigation in bug 1259479.


mozbench has been retired. 😦

Long live!:) I’ll check in on next quarter.

March 31, 2016 03:27 PM

Henrik Skupin

Review of Firefox desktop automation work – Q1 2016

Today is the last day of Q1 2016, which means it's time to review what I have done during all those last weeks. When I checked my status reports it's kind of a lot, so I will shorten it a bit and only talk about the really important changes.

Build System / Mozharness

After I had to dig into mozharness to get support for Firefox UI Tests during last quarter I have seen that more work had to be done to fully support tests which utilize Nightly or Release builds of Firefox.

The most challenging work for me (because I had never done a build system patch before) was indeed prefixing the test_packages.json file which gets uploaded next to any nightly build to This work was necessary because without the prefix the file was always overwritten by later build uploads. This meant that when trying to get the test archives for OS X and Linux, the Windows ones were always returned. Due to binary incompatibilities between those platforms this situation caused complete bustage. No-one noticed that until now because every other testsuite is run on a checkin basis and doesn't have to rely on the nightly build folders on For Taskcluster this wasn't a problem.

In regard to firefox-ui-tests, I was finally able to get a test task added to Taskcluster which will execute our firefox-ui-tests for each check-in, in both e10s and non-e10s mode. Due to current Taskcluster limitations this only runs for Linux64 debug, but that already helps a lot and I hope that we can increase platform coverage soon. If you are interested in the results you can have a look at Treeherder.

Other Mozharness specific changes are the following ones:

Firefox UI Tests

The biggest change for us this quarter was the move of the Firefox UI tests from our external Github repository to mozilla-central. It means that our test code including the harness and Firefox Puppeteer is in sync with changes to Firefox now and regressions caused by ui changes should be very seldom. And with the Taskcluster task as mentioned above it’s even easier to spot those regressors on mozilla-inbound.

The move itself was easy but keeping backward compatibility with mozmill-ci and other Firefox branches down to mozilla-esr38 was a lot of work. To achieve that I first had to convert all three different modules (harness, puppeteer, tests) to individual Python packages. Those got landed for Firefox 46.0 on mozilla-central and then backported to Firefox 45.0 which also became our new ESR release. Due to backport complexity for older branches I decided to not land packages for Firefox 44.0, 43.0, and 38ESR. Instead those branches got smaller updates for the harness so that they had full support for our latest mozharness script on mozilla-central. Yes, in case you wonder all branches used mozharness from mozilla-central at this time. It was easier to do, and I finally switched to branch specific mozharness scripts later in mozmill-ci once Firefox 45.0 and its ESR release were out.

Adding mach support for Firefox UI Tests on mozilla-central was the next step to assist in running our tests. Required arguments from before are now magically selected by mach, and that allowed me to remove the firefox-ui-test dependency on firefox_harness, which was always a thorn in our eyes. As final result I was even able to completely remove the firefox-ui-test package, so that we are now free in moving our tests to any place in the tree!

In case you want to know more about our tests please check out our new documentation on MDN which can be found here:

Mozmill CI

Lots of changes have been done to this project to accommodate the Jenkins jobs to all the Firefox UI Tests modifications. Especially that I needed a generic solution which works for all existing Firefox versions. The first real task was to no longer use the firefox-ui-tests Github repository to grab the tests from, but instead let mozharness download the appropriate test package as produced and uploaded with builds to

It was all fine immediately for en-US builds given that the location of the test_packages.json file is distributed along with the Mozilla Pulse build notification. But it’s not the case for l10n builds and funsize update notifications. For those we have to utilize mozdownload to fetch the correct URL based on the version, platform, and build id. So all fine. A special situation came up for update tests which actually use two different Firefox builds. If we get the tests for the pre build, how can we magically switch the tests for the target version? Given that there is no easy way I decided to always use the tests from the target version, and in case of UI changes we have to keep backward compatibility code in our tests and Firefox Puppeteer. This is maybe the most ideal solution for us.

Another issue I had to solve with test packages was with release candidate builds. For those builds Release Engineering is not creating nor uploading any test archive. So a connection had to be made between candidate builds and CI (tinderbox) builds. As it turned out, the two properties which helped here are the revision and the branch. With them I at least know the changeset of the mozilla-beta, mozilla-release, and mozilla-esr* branches as used to trigger the release build process. But sadly that's only a tag, and no builds nor tests are getting created for it, so something more was necessary. After some investigation I found out that Treeherder and its REST API can be of help. Using the known tag and walking back through the parents until Treeherder reports a successful build for the given platform allowed me to retrieve the next possible revision to be used with mozdownload to retrieve the test_packages.json URL. I know it's not perfect, but it satisfies us enough for now.
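The traversal described above can be sketched as follows; parents_of and has_successful_build are hypothetical callables standing in for the pushlog lookup and the Treeherder REST query:

```python
def find_buildable_revision(tag_revision, parents_of, has_successful_build,
                            max_depth=50):
    """Walk back from the release tag's changeset until Treeherder reports
    a successful build for the wanted platform. Returns None if no such
    revision is found within max_depth parents."""
    revision = tag_revision
    for _ in range(max_depth):
        if revision is None:
            return None  # ran off the end of the known history
        if has_successful_build(revision):
            return revision
        revision = parents_of(revision)
    return None
```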

Then the release promotion project as worked on by the Release Engineering team was close to be activated. I heard a couple of days before, that Firefox 46.0b1 will be the first candidate to get it tested on. It gave me basically no time for testing at all. Thanks to all the support from Rail Aliiev I was able to get the new Mozilla Pulse listener created to handle appropriate release promotion build notifications. Given that with release promotion we create the candidates based on a signed off CI build we already have a valid revision to be used with mozdownload to retrieve the test_packages.json file – so no need for the above mentioned Treeherder traversal code. \o/ Once all has been implemented Firefox 46.0b3 was the first beta release for which we were able to process the release promotion notifications.

At the same time as the release promotion news I also got informed by Robert Kaiser that the ondemand update jobs as performed with Mozmill do not work anymore. As it turned out, a change in the JS engine caused the bustage for Firefox 46.0b1. Given that Mozmill is dead I was not going to update it again. Instead I converted the ondemand update jobs to make use of Firefox-UI-Tests. This went pretty well, also because we had already been running those tests for a while on mozilla-central and mozilla-aurora for nightly builds. As a result we were able to run update jobs a day later for Firefox 46.0b1 and noticed that nearly all locales on Windows were busted, so only en-US finally got shipped. Not sure if that would have been as visible with Mozmill.

Last but not least I also removed the workaround which let all test jobs use the mozharness script from mozilla-central. It’s simply not necessary anymore given that all required features in mozharness are part of ESR45 now.

What’s next

I already have plans what’s next. But given that I will be away from work for a full month now, I will have to revisit those once I’m back in May. I promise that I will also blog about them around that time.

March 31, 2016 02:10 PM

March 23, 2016

Byron Jones

mozreview and inline comments on the diff view

when a comment is left on a review in review board/mozreview it is currently displayed as a small square in the left column.

comment indicator

our top reviewers have strongly indicated that this is suboptimal and would prefer to match what most other code review systems do in displaying comments as an inline block on the diff. i agree — review comments are important and deserve more attention in the user interface; they should be impossible to miss.

while the upstream review board team have long said that the current display is in need of fixing, there is minimal love for the inline comments approach.

recently we worked on a plan of attack to appease both our reviewers and upstream’s requirements.  :smacleod, :mconley and i talked through a wide assortment of design goals and potential issues.  we have a design document and i’ve started mocking up the approach in html:

inline comments mock-up

we also kicked off discussions with the upstream review board development team, and are hopeful that this is a feature that will be accepted upstream.

Filed under: mozreview

March 23, 2016 03:39 PM

March 10, 2016

Geoff Brown

Reduce, reuse, recycle

As Firefox for Android drops support for ancient versions of Android, I find my collection of test phones becoming less and less relevant. For instance, I have a Galaxy S that works fine but only runs Android 2.2.1 (API 8), and I have a Galaxy Nexus that runs Android 4.0.1 (API 14). I cannot run current builds of Firefox for Android on either phone, and, perhaps because I rooted them or otherwise messed around with them in the distant past, neither phone will upgrade to a newer version of Android.

I have been letting these phones gather dust while I test on emulators, but I recently needed a real phone and managed to breathe new life into the Galaxy Nexus using an AOSP build. I wanted all the development bells and whistles and a root shell, so I made a full-eng build and I updated the Galaxy Nexus to Android 4.3 (API 18) — good enough for Firefox for Android, at least for a while!

Basically, I followed the instructions at, building on Ubuntu 14.04. For the Galaxy Nexus, that broke down to:

mkdir aosp
cd aosp
repo init -u -b android-4.3_r1 # Galaxy Nexus
repo sync   # this can take several hours
# Download all binaries from the relevant section of 
# .
# I used "Galaxy Nexus (GSM/HSPA+) binaries for Android 4.3 (JWR66Y)".
# Extract each (6x) downloaded archive, extracting into <aosp>.
# Execute each (6x) .sh and accept prompts, populating <aosp>/vendor.
source build/
lunch full_maguro-eng
# use update-alternatives to select Java 6; I needed all 5 of these
sudo update-alternatives --config java
sudo update-alternatives --config javac
sudo update-alternatives --config javah
sudo update-alternatives --config javadoc
sudo update-alternatives --config javap
make -j4    # this can take a couple of hours

Once make completed, I had binaries in <aosp>/out/… I put the phone in bootloader mode (hold down Volume Up + Volume Down + Power to boot the Galaxy Nexus), connected it by USB and executed “fastboot -w flashall”.

Actually, in my case, fastboot could not see the connected device unless I ran it as root. In the root account I didn’t have the right environment settings, so I needed to do something like:

sudo /bin/bash
source build/
lunch full_maguro-eng
fastboot -w flashall

If you are following along, don’t forget to undo your java update-alternatives when you are done!

It took some time to download and build, but the procedure was fairly straightforward and the results excellent: I feel like I have a new phone, perfectly clean and functional — and rooted!

(I have had no similar luck with the Galaxy S: AOSP binaries are only supplied for Nexus devices, and I see no AOSP instructions for the Galaxy S. Maybe it’s time to recycle this one.)

March 10, 2016 11:53 PM

March 01, 2016

Robert Wood

Publishing Sensor Data on SparkFun


With the ‘Internet of Things’ taking off there are some cloud services out there that allow you to publish data from your various IoT devices. One such service is hosted by SparkFun, and I wanted to give it a try to see what the world of publishing sensor data is like.

The SparkFun Cloud

SparkFun is a free sensor data publishing service that lets you create a data ‘stream’, name your stream, and define what type of data you will be publishing. It’s cool as you don’t have to create an account to use their publishing cloud; instead of requiring a sign-in you are provided with a stream public key and private key. You don’t have to provide any personal information (you can even publish completely anonymously if you like).

Your data stream is allocated 50 MB maximum, and once you reach that limit it acts as a queue where the oldest sensor data records will be deleted as new ones are received. Any data that you do publish will be public, and it may be used by ‘data scientists’ around the world. However it appears that you do have full control over your own data, in that you can choose to clear out your data stream or delete it completely at any given time. There is a limit to how fast you can post data to your stream: a maximum of 100 data pushes in any given 15 minute window (approximately one post to your data stream every 9 seconds).
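Publishing then boils down to hitting the stream's input URL with the private key and the field values. Below is a sketch of building such a URL, assuming the phant-style layout SparkFun used (base/input/public_key?private_key=...); the base URL and keys are placeholders:

```python
import urllib.parse

def build_input_url(base, public_key, private_key, fields):
    """Build an input URL of the (assumed) form
    <base>/input/<public_key>?private_key=...&<field>=<value>."""
    params = [("private_key", private_key)] + sorted(fields.items())
    return "%s/input/%s?%s" % (base, public_key, urllib.parse.urlencode(params))

# One GET or POST to this URL records one row in the stream; remember the
# rate limit above of roughly one post every 9 seconds.
url = build_input_url("https://data.example", "PUBLIC_KEY", "PRIVATE_KEY",
                      {"temp": 21.5, "light": 512})
```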

SparkFun also has an online storefront where you can purchase IOT hacking equipment such as Arduino boards, Raspberry Pis, sensor kits, their own branded hacking boards, and lots of other fun stuff. Note that the publishing cloud supports data posted from any platform, not just from hardware purchased from the SparkFun store.

My Data Experiment

To try it out, I decided to create a live data stream and post light sensor and temperature data from my office environment, using my Arduino Uno board and sensor shield. I am running an Ubuntu 14.04 VM on my MacBook.

Step 1: Setup my Arduino Board

As mentioned earlier, I’m using an Arduino Uno board with an attached Tinkerkit sensor shield, which I purchased as part of a sensor starter kit.

Arduino Uno with Tinkerkit Sensor Shield


I previously flashed my Arduino board with the Arduino ‘Standard Firmata’ firmware (to allow my board to run code and to talk to my VM, etc.) using the Arduino IDE. I won’t go into the details as that is basically the standard way to setup an Arduino board before you start hacking with it.

After attaching the sensor shield to the Arduino board, I attached a photoresistor to the sensor shield input I0, a temperature sensor to the shield’s input I1, and also a green LED to the shield’s output O0.

Arduino Tinkerkit Sensor Shield ports


I just used a box to stand up the sensors so that I could interact with them easily. The green LED will be used just to indicate that sensor data is being received.

Tinkerkit LED, photoresistor, and temperature sensor


Step 2: Create a SparkFun Stream

It was really easy to create a new SparkFun data stream, and as I mentioned earlier the best part is that no account creation or sign-in is required. To create the data stream I just browsed to the service and then clicked the ‘Create’ button to ‘create a free data stream immediately’.

Create stream page on


Once on the create stream page, I just had to:

Then I clicked the ‘Save’ button and was good to go! The new stream was created and a new stream page was displayed, which showed my new stream public key (which is part of the stream URL), alias, private key and a delete key. These keys are required to publish to, clear, or delete your data stream (and there is an option to provide your email address so the keys are sent to you).

Step 3: Write the Code

With the Arduino board and sensors ready, and my SparkFun data stream created, next I had to write the code to grab the sensor data from the board and post it to the data stream.

I wanted to write the code in node.js, so I used a super cool framework called Johnny-Five. Johnny-Five is an open-source framework that lets you talk to your Arduino board using node.js and the Arduino’s ‘StandardFirmata’ firmware. The JS code runs on your local machine but connects to and interacts with the Arduino board via the board’s firmware. With the J5 API you can create objects to control your Arduino board and interact with the board inputs and outputs. J5 also adds support for the REPL (Read-Eval-Print-Loop) command line, so when you are running your node.js code you can type commands into the terminal and interact with your Arduino board live (using the objects that you created in your code). This is great for debugging and for just learning how to use the API in general.

J5 makes it super easy. As an example, using J5 and node.js to control an LED that is attached to the sensor shield output O0, you would do something like:

var five = require('johnny-five');
var board = new five.Board();

// when board is booted and connected, turn on the LED
board.on('ready', function() {
  var myLed = new five.Led('O0');
  myLed.on();
});

When creating a new J5 sensor object, you can specify the frequency (in milliseconds) at which you want to take sensor data readings. A callback is specified that will be invoked each time new data is received from the sensor. For example, to use J5 to read light sensor data from the Arduino board every 10 seconds:

var five = require('johnny-five');
var board = new five.Board();

// when board is booted and connected, get light sensor data every 10 seconds
board.on('ready', function() {
  var myLightSensor = new five.Sensor({pin: 'A0', freq: 10000});
  myLightSensor.on('data', () => {
    var lightValue = myLightSensor.value;
    console.log(`Light sensor reading: ${lightValue}`);
  });
});

To post sensor data to the SparkFun data stream, SparkFun provides the Phant-Client NPM package which uses the Phant API. First you have to connect to your existing SparkFun data stream, like this:

var Phant = require('phant-client').Phant;

var phant = new Phant();
var iri = '' + '<your_data_stream_public_key_here>';

// connect to the data stream
phant.connect(iri, function(error, streamd) {
    if (error) {
      // handle error
    } else {
      // successfully connected to the data stream, so continue
    }
});

After connecting to the stream you need to add the stream’s private key to the stream object, so that you can post data:

myStream = streamd;
myStream.privateKey = sparkey; // sparkey contains the private key

Then the sensor data (lightValue, and tempValue in this example) can be posted using phant.add. A callback is provided which is invoked once the data has been successfully posted to your stream. Important: the value names you provide must match the field names that you specified at the time of your stream creation:

// post to my stream
phant.add(myStream, {light: lightValue, temp_c: tempValue}, () => {
  console.log(`Successfully posted to ${iri}`);
});

My full node.js code for this experiment can be found in my GitHub repo.

Step 4: Start it Up!

With my Arduino board connected to USB and powered up, I started my node.js program in an Ubuntu terminal (providing my SparkFun stream private key as an environment variable on the command line).

My program successfully connected to my SparkFun data stream online. Once the Arduino board entered the ‘ready’ state, my program began grabbing light and temperature sensor data every minute, posting each data set to my live stream. I let it run for a while to generate some data (and at one point I turned on a lamp and touched the temperature sensor, to cause noticeable changes in the sensor values).

Console output for my node.js program. I turned a desk lamp on and touched the temperature sensor just before sensor reading 8


Step 5: View the Data Online

Now that data was successfully being posted to my live stream, I could go and check it out! I just browsed to my sensor stream alias or directly to my sensor stream URL, and there it is. Success!

My resulting sensor data stream on


Besides viewing your own data, you can explore the other SparkFun Public Streams and see what other people are posting too.

Exporting Sensor Data

Another really cool feature is that you can export all of your sensor data from the cloud to your local drive. All you have to do is browse to your data stream, then in the top left there is the option to export to JSON, CSV, etc. I tested this out and exported all of my sensor data at that particular time to JSON, and you can see the resulting JSON file in my GitHub repo.

In Summary

Using my Arduino board, the Johnny-Five node.js framework, and the Phant-Client, it was relatively easy to publish my sensor data online. Cloud publishing services like this will encourage and enable hackers to get involved in the internet of things and learn new skills in general. Very cool!

March 01, 2016 08:49 PM

David Burns


The thing that is at the core of every hyper effective team is trust. Without it, any of the pieces that make the team hyper effective can fall apart very quickly. This is something that I have always instinctively known. I always work hard with my reports to make sure they can trust me. If they trust me, and more importantly I trust them, then I can ask them to take on work and then just come back every so often to see if they are stuck.

The other week I was in Washington, D.C. to meet up with my manager peers. This was done with the plan of seeing how we can interact with each other, build bridges and, more importantly, build trust.

How did we do this?

We did a few trust exercises which, I am not going to lie, were extremely uncomfortable. The one that actually made me shake in my boots was one where I had to think of things I was proud of last year and things I could have done better. Then I needed to say what I was planning for this year that I would be proud of. Once my part was done, the rest of the group could make comments about me.

"They are my peers, they are open to me all the time..." is what my brain should have been saying. In actual fact it was saying, "They are about to crucify you...". The irony is that my peers are a lovely group who are amazingly supportive. My brain knows that but went into flight mode...

This exercise showed that people are allowed to say both positive and negative things about your work. Always assume the best in people (at least until they prove otherwise).

It showed that conflict is ok; in actual fact it is extremely healthy! Well, as long as it is constructive to the group and not destructive.

We also read The Five Dysfunctions of a Team, which I highly recommend. It puts trust at the heart of all the things people do!

March 01, 2016 02:21 PM

February 19, 2016

Joel Maher

QoC.2 – Iterations and thoughts

Quite a few weeks ago now, the second official Quarter of Contribution wrapped up.  We had advertised 4 projects and found awesome contributors for all 4.  While all hackers gave a good effort, sometimes plans change and life gets in the way.  In the end we had 2 projects with very active contributors.

We had two projects with a lot of activity throughout the project:

First off, this 2nd round of QoC wouldn’t have been possible without the Mentors creating projects and mentoring, nor without the great contributors volunteering their time to build great tools and features.

I really like to look at what worked and what didn’t, let me try to summarize some thoughts.

What worked well:

What I would like to see changed for QoC.3:

As it stands now, we are pushing on submitting Outreachy and GSoC project proposals; assuming those programs pick up our projects, we will revisit QoC.3 around September or November.


February 19, 2016 06:19 PM

QoC.2 – WPT Results Viewer – wrapping up

Quite a few weeks ago now, the second official Quarter of Contribution wrapped up.  We had advertised 4 projects and found awesome contributors for all 4.  While all hackers gave a good effort, sometimes plans change and life gets in the way.  In the end we had 2 projects with very active contributors.

In this post, I want to talk about WPT Results Viewer.  You can find the code on github, and still find the team on irc in #ateam.  As this finished up, I reached out to :martianwars to learn what his experience was like. Here are his own words:

What interested you in QoC?

So I’d been contributing to Mozilla for sometime fixing random bugs here and there. I was looking for something larger and more interesting. I think that was the major motivation behind QoC, besides Manishearth’s recommendation to work on the Web Platform Test Viewer. I guess I’m really happy that QoC came around the right time!

What challenges did you encounter while working on your project?  How did you solve them?

I guess the major issue while working on wptview was the lack of Javascript knowledge and the lack of help online when it came to Lovefield. But like every project, I doubt I would have enjoyed much had I known everything required right from the start. I’m glad I got jgraham as a mentor, who made sure I worked my way up the learning curve as we made steady progress.

What are some things you learned?

So I definitely learnt some Javascript, code styling, the importance of code reviews, but there was a lot more to this project. I think the most important thing that I learnt was patience. I generally tend to search for StackOverflow answers when it I need to perform a programming task I’m unaware of. With Lovefield being a relatively new project, I was compelled to patiently read and understand the documentation and sample programs. I also learnt a bit on how a large open source community functions, and I feel excited being a part of it!  A bit irrelevant to the question, but I think I’ve made some friends in #ateam :) The IRC is like my second home, and helps me escape life’s never ending stress, to a wonderland of ideas and excitement!

If you were to give advice to students looking at doing a QoC, what would you tell them?

Well the first thing I would advice them is not to be afraid, especially of asking the so called “stupid” questions on the IRC. The second thing would be to make sure they give the project a decent amount of time, not with the aim of completing it or something, but to learn as much as they can:) Showing enthusiasm is the best way to ensure one has a worthwhile QoC:) Lastly, I’ve tried my level best to get a few newcomers into wptview. I think spreading the knowledge one learns is important, and one should try to motivate others to join open source:)

If you were to give advice to mentors wanting to mentor a project, what would you tell them?

I think jgraham has set a great example of what an ideal mentor should be like. Like I mentioned earlier, James helped me learn while we made steady progress. I especially appreciate the way he had (has rather) planned this project. Every feature was slowly built upon and in the right order, and he ensured the project continued to progress while I was away. He would give me a sufficient insight into each feature, and leave the technical aspects to me, correcting my fallacies after the first commit. I think this is the right approach. Lastly, a quality every mentor MUST have, is to be awake at 1am on a weekend night reviewing PRs😉

Personally I have really enjoyed getting to know :martianwars and seeing the great progress he has made.

February 19, 2016 05:44 PM

February 16, 2016

Maja Frydrychowicz

First Experiment with TaskCluster

TaskCluster is a new-ish continuous integration system made at Mozilla. It manages the scheduling and execution of tasks based on a graph of their dependencies. It’s a general CI tool, and could be used for any kind of job, not just Mozilla things.

However, the example I describe here refers to a Mozilla-centric use case of TaskCluster1: tasks are run per check-in on the branches of Mozilla’s Mercurial repository and then results are posted to Treeherder. For now, the tasks can be configured to run in Docker images (Linux), but other platforms are in the works2.

So, I want to schedule a task! I need to add a new task to the task graph that’s created for each revision submitted to (This is part of my work on deploying a suite of tests for the Marionette Python test runner, i.e. testing the test harness itself.)

The rest of this post describes what I learned while making this work-in-progress.

There are builds and there are tests

mozilla-taskcluster operates based on the info under testing/taskcluster/tasks in Mozilla’s source tree, where there are yaml files that describe tasks. Specific tasks can inherit common configuration options from base yaml files.

The yaml files are organized into two main categories of tasks: builds and tests. This is just a convention in mozilla-taskcluster about how to group task configurations; TC itself doesn’t actually know or care whether a task is a build or a test.

The task I’m creating doesn’t quite fit into either category: it runs harness tests that just exercise the Python runner code in marionette_client, so I only need a source checkout, not a Firefox build. I’d like these tests to run quickly without having to wait around for a build. Another example of such a task is the recently-created ESLint task.

Scheduling a task

Just adding a yaml file that describes your new task under testing/taskcluster/tasks isn’t enough to get it scheduled: you must also add it to the list of tasks in base_jobs.yml, and define an identifier for your task in base_job_flags.yml. This identifier is used in base_jobs.yml, and also by people who want to run your task when pushing to try.

How does scheduling work? First a decision task generates a task graph, which describes all the tasks and their relationships. More precisely, it looks at base_jobs.yml and other yaml files in testing/taskcluster/tasks and spits out a json artifact, graph.json3. Then, graph.json gets sent to TC’s createTask endpoint, which takes care of the actual scheduling.

In the excerpt below, you can see a task definition with a requires field and you can recognize a lot of fields that are in common with the ‘task’ section of the yaml files under testing/taskcluster/tasks/.

"tasks": [
      "requires": [
        // id of a build task that this task depends on
      "task": {
        "taskId": "c2VD_eCgQyeUDVOjsmQZSg"
        "extra": {
          "treeherder": {
              "groupName": "Reftest", 
              "groupSymbol": "tc-R", 
        "metadata": {
          "description": "Reftest test run 1", 
          "name": "[TC] Reftest", 

For now at least, a major assumption in the task-graph creation process seems to be that test tasks can depend on build tasks and build tasks don’t really4 depend on anything. So:

So, I added marionette-harness under builds. Recall, my task isn’t a build task, but it doesn’t depend on a build, so it’s not a test, so I’ll treat it like a build.

# in base_job_flags.yml
builds:
  # ...
  - marionette-harness

# in base_jobs.yml
builds:
  # ...
  marionette-harness:
    platforms:
      - Linux64
    types:
      opt:
        task: tasks/tests/harness_marionette.yml

This will allow me to trigger my task with the following try syntax: try: -b o -p marionette-harness. Cool.

Make your task do stuff

Now I have to add some stuff to tasks/tests/harness_marionette.yml. Many of my choices here are based on the work done for the ESLint task. I created a base task called harness_test.yml by mostly copying bits and pieces from the basic build task, build.yml and making a few small changes. The actual task, harness_marionette.yml inherits from harness_test.yml and defines specifics like Treeherder symbols and the command to run.

The command

The heart of the task is in task.payload.command. You could chain a bunch of shell commands together directly in this field of the yaml file, but it’s better not to. Instead, it’s common to call a TaskCluster-friendly shell script that’s available in your task’s environment. For example, the desktop-test docker image has a script through which you can call the mozharness script for your tests. There’s a similar script on desktop-build. Both of these scripts depend on environment variables set elsewhere in your task definition, or in the Docker image used by your task. The environment might also provide utilities like tc-vcs, which is used for checking out source code.

# in harness_marionette.yml
payload:
  command:
    - bash
    - -cx
    - >
        tc-vcs checkout ./gecko {{base_repository}} {{head_repository}} {{head_rev}} {{head_ref}} &&
        cd gecko &&
        ./mach marionette-harness-test

My task’s payload.command should be moved into a custom shell script, but for now it just chains together the source checkout and a call to mach. It’s not terrible of me to use mach in this case because I expect my task to work in a build environment, but most tests would likely call mozharness.

Configuring the task’s environment

Where should the task run? What resources should it have access to? This was probably the hardest piece for me to figure out.


My task will run in a docker image using a docker-worker5. The image, called desktop-build, is defined in-tree under testing/docker. There are many other images defined there, but I only considered desktop-build versus desktop-test. I opted for desktop-build because desktop-test seems to contain mozharness-related stuff that I don’t need for now.

# harness_test.yml
image:
  type: 'task-image'
  path: 'public/image.tar'
  taskId: '{{#task_id_for_image}}desktop-build{{/task_id_for_image}}'

The image is stored as an artifact of another TC task, which makes it a ‘task-image’. Which artifact? The default is public/image.tar. Which task do I find the image in? The magic incantation '{{#task_id_for_image}}desktop-build{{/task_id_for_image}}' somehow6 obtains the correct ID, and if I look at a particular run of my task, the above snippet does indeed get populated with an actual taskId.

"image": {
  "path": "public/image.tar",
  // Mystery task that makes a desktop-build image for us. Thanks, mystery task!
  "taskId": "aqt_YdmkTvugYB5b-OvvJw", 
  "type": "task-image"

Snooping around in the handy Task Inspector, I found that the magical mystery task is defined in image.yml and runs Fun. It’s also quite convenient to define and test your own custom image.

Other details that I mostly ignored

# in harness_test.yml
scopes:
  # Nearly all of our build tasks use tc-vcs
  - 'docker-worker:cache:level-{{level}}-{{project}}-tc-vcs'
payload:
  cache:
    # The taskcluster-vcs tooling stores the large clone caches in this
    # directory and will reuse them for new requests. This saves about 20s
    # and is the most generic cache possible.
    level-{{level}}-{{project}}-tc-vcs: '/home/worker/.tc-vcs'

Yay for trial and error


Blog posts from other TaskCluster users at Mozilla:

There is lots of great documentation at, but these sections were especially useful to me:


Thanks to dustin, pmoore and others for corrections and feedback.

  1. This is accomplished in part thanks to mozilla-taskcluster, a service that links Mozilla’s hg repo to TaskCluster and creates each decision task. More at TaskCluster at Mozilla 

  2. Run tasks on any platform thanks to generic worker 

  3. To look at a graph.json artifact, go to Treeherder, click a green ‘D’ job, then Job details > Inspect Task, where you should find a list of artifacts. 

  4. It’s not really true that build tasks don’t depend on anything. Any task that uses a task-image depends on the task that creates the image. I’m sorry for saying ‘task’ five times in every sentence, by the way. 

  5. …as opposed to a generic worker

  6. {{#task_id_for_image}} is an example of a predefined variable that we can use in our TC yaml files. Where do they come from? How do they get populated? I don’t know. 

February 16, 2016 05:00 AM

February 12, 2016

Andrew Halberstadt

The Zen of Mach

Mach is the Mozilla developer's swiss army knife. It gathers all the important commands you'll ever need to run, and puts them in one convenient place. Instead of hunting down documentation, or asking for help on irc, often a simple |mach help| is all that's needed to get you started. Mach is great. But lately, mach is becoming more like the Mozilla developer's toolbox. It still has everything you need but it weighs a ton, and it takes a good deal of rummaging around to find anything.

Frankly, a good deal of the mach commands that exist now are either poorly written, confusing to use, or even have no business being mach commands in the first place. Why is this important? What's wrong with having a toolbox?

Here's a quote from an excellent article on engineering effectiveness from the Developer Productivity lead at Twitter:

Finally there’s a psychological aspect to providing good tools to engineers that I have to believe has a really (sic) impact on people’s overall effectiveness. On one hand, good tools are just a pleasure to work with. On that basis alone, we should provide good tools for the same reason so many companies provide awesome food to their employees: it just makes coming to work every day that much more of a pleasure. But good tools play another important role: because the tools we use are themselves software, and we all spend all day writing software, having to do so with bad tools has this corrosive psychological effect of suggesting that maybe we don’t actually know how to write good software. Intellectually we may know that there are different groups working on internal tools than the main features of the product but if the tools you use get in your way or are obviously poorly engineered, it’s hard not to doubt your company’s overall competence.

Working with good tools is a pleasure. Rather than breaking mental focus, they keep you in the zone. They do not deny you your zen. Mach is the frontline, it is the main interface to Mozilla for most developers. For this reason, it's especially important that mach and all of its commands are an absolute joy to use.

There is already good documentation for building a mach command, so I'm not going to go over that. Instead, here are some practical tips to help keep your mach command simple, intuitive and enjoyable to use.

Keep Logic out of It

As awesome as mach is, it doesn't sprinkle magic fairy dust on your messy jumble of code to make it smell like a bunch of roses. So unless your mach command is trivial, don't stuff all your logic into a single command file. Instead, create a dedicated python package that contains all your functionality, and turn your command module into a dumb dispatcher. This python package will henceforth be called the 'underlying library'.

Doing this makes your command more maintainable, more extensible and more re-useable. It's a no-brainer!
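As a minimal sketch of the shape this takes (the names `run_tool` and `mycommand_handler` are hypothetical, and the mach decorator plumbing is omitted):

```python
# --- the underlying library (its own package, testable without mach) ---
def run_tool(paths, verbose=False):
    """All the real functionality lives here."""
    return {'processed': list(paths), 'verbose': verbose}

# --- the dumb dispatcher (what the mach command body reduces to) ---
def mycommand_handler(paths, verbose=False):
    # no logic here beyond forwarding arguments
    return run_tool(paths, verbose=verbose)

result = mycommand_handler(['foo.js', 'bar.js'], verbose=True)
```

The point is that the dispatcher stays boring: anything worth testing or reusing lives in the library.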

No Global Imports

Other than things that live in the stdlib, mozbuild or mach itself, don't import anything at your command module's global scope. Doing so evaluates the imported file every time the mach binary is invoked. No one wants your module to load itself when running an unrelated command or |mach help|.

It's easy to see how this can quickly add up to be a huge performance cost.
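The fix is to defer the import into the command body, so the cost is only paid when the command actually runs. A sketch (`decimal` is just a stand-in for some expensive dependency):

```python
import sys

def mycommand():
    # import inside the function body, not at module scope
    import decimal
    return decimal.Decimal('1.5') * 2

# Nothing heavy is loaded when the module is merely defined...
sys.modules.pop('decimal', None)
assert 'decimal' not in sys.modules

result = mycommand()

# ...the dependency is only imported on first invocation.
assert 'decimal' in sys.modules
```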

Re-use the Argument Parser

If your underlying library has a CLI itself, don't redefine all the arguments with @CommandArgument decorators. Your redefined arguments will get out of date, and your users will become frustrated. It also encourages a pattern of adding 'mach-only' features, which seem like a good idea at first, but as I explain in the next section, leads down a bad path.

Instead, import the underlying library's ArgumentParser directly. You can do this by using the parser argument to the @Command decorator. It'll even conveniently accept a callable so you can avoid global imports. Here's an example:

```python
def setup_argument_parser():
    from mymodule import MyModuleParser
    return MyModuleParser()

@CommandProvider
class MachCommands(object):
    @Command('mycommand', category='misc',
             description='does something',
             parser=setup_argument_parser)
    def mycommand(self, **kwargs):
        # arguments from MyModuleParser are in kwargs
        pass
```

If the underlying ArgumentParser has arguments you'd like to avoid exposing to your mach command, you can use argparse.SUPPRESS to hide them from the help.
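For instance (a generic argparse sketch with made-up option names, not tied to any particular mach command):

```python
import argparse

parser = argparse.ArgumentParser(prog='mycommand')
parser.add_argument('--verbose', action='store_true',
                    help='print more output')
# Still accepted on the command line, but omitted from --help entirely.
parser.add_argument('--internal-knob', help=argparse.SUPPRESS)

args = parser.parse_args(['--verbose', '--internal-knob', '7'])
```

Here `args.internal_knob` is set as usual, yet `parser.format_help()` never mentions the hidden option.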

Don't Treat the Underlying Library Like a Black Box

Sometimes the underlying library is a huge mess. It can be very tempting to treat it like a black box and use your mach command as a convenient little fantasy-land wrapper where you can put all the nice things without having to worry about the darkness below.

This situation is temporary. You'll quickly make the situation way worse than before, as not only will your mach command devolve into a similar state of darkness, but now changes to the underlying library can potentially break your mach command. Just suck it up and pay a little technical debt now, to avoid many times that debt in the future. Implement all new features and UX improvements directly in the underlying library.

Keep the CLI Simple

The command line is a user interface, so put some thought into making your command useable and intuitive. It should be easy to figure out how to use your command simply by looking at its help. If you find your command's list of arguments growing to a size of epic proportions, consider breaking your command up into subcommands with an @SubCommand decorator.

Rather than putting the onus on your user to choose every minor detail, make the experience more magical than a Disney band.

Be Annoyingly Helpful When Something Goes Wrong

You want your mach command to be like one of those super helpful customer service reps. The ones with the big fake smiles and reassuring voices. When something goes wrong, your command should calm your users and tell them everything is ok, no matter what crazy environment they have.

Instead of printing an error message, print an error paragraph. Use natural language. Include all relevant paths and details. Format it nicely. Create separate paragraphs for each possible failure. But most importantly, only be annoying after something went wrong.

Use Conditions Liberally

A mach command will only be enabled if all of its condition functions return True. This keeps the global |mach help| free of clutter, and makes it painfully obvious when your command is or isn't supposed to work. A command that only works on Android shouldn't show up for a Firefox desktop developer. That only leads to confusion.

Here's an example:

```python
from mozbuild.base import (
    MachCommandBase,
    MachCommandConditions as conditions,
)

@CommandProvider
class MachCommands(MachCommandBase):
    @Command('mycommand', category='post-build',
             description='does stuff',
             conditions=[conditions.is_android])
    def mycommand(self):
        pass
```

If the user does not have an active fennec objdir, the above command will not show up by default in |mach help|, and trying to run it will display an appropriate error message.

Design Breadth First

Put another way, keep the big picture in mind. It's ok to implement a mach command with super specific functionality, but try to think about how it will be extended in the future and build with that in mind. We don't want a situation where we clone a command to do something only slightly differently (e.g. |mach mochitest| and |mach mochitest-b2g-desktop| from back in the day) because the original wasn't extensible enough.

It's good to improve a very specific use case that impacts a small number of people, but it's better to create a base upon which other slightly different use cases can be improved as well.

Take a Breath

Congratulations, now you are a mach guru. Take a breath, smell the flowers and revel in the satisfaction of designing a great user experience. But most importantly, enjoy coming into work and getting to use kick-ass tools.

February 12, 2016 05:27 PM

February 09, 2016

Dave Hunt

Python testing sprint 2016

In June, the pytest developer community are gathering in Freiburg, Germany for a development sprint. This is being funded via an Indiegogo campaign, which needs your help to reach the goal! I am excited to say that I will be attending, which means that after over 5 years of using pytest, I’ll finally get to meet some of the core contributors.

I first learned about pytest when I joined Mozilla in late 2010. Much of the browser based automation at that time was either using Selenium IDE or Python’s unittest. There was a need to simplify much of the Python code, and to standardise across the various suites. One important requirement was the generation of JUnit XML reports (considered essential for reporting results in Jenkins) without compromising the ability to run tests in parallel. Initially we looked into nose, but there was an issue with this exact requirement. Fortunately, pytest didn’t have a problem with this – JUnit XML was supported in core and was compatible with the pytest-xdist plugin for running tests in parallel.

Ever since the decision to use pytest was made, I have not seen a compelling reason to switch away. I’ve worked on various projects, some with overly complex suites based on unittest, and I’ve always been grateful when I’ve been able to return to pytest. The active development of pytest has meant we’ve never had to worry about the project becoming unsupported. I’ve also always found the core contributors to be extremely friendly and helpful on IRC (#pylib on whenever I need help. I’ve also more recently been following the pytest-dev mailing list.

I’ve recently written about the various plugins that we’ve released, which have allowed us to considerably reduce the amount of duplication between our various automation suites. This is even more critical as the Web QA team shifts some of the responsibility and ownership of some of their suites to the developers. This means we can continue to enhance the plugins and benefit all of the users at once, and our users are not limited to teams at Mozilla. The pytest user base is large, and that means our plugins are discovered and used by many. I always love hearing from users, especially when they submit their own enhancements to our plugins!

There are a few features I particularly like in pytest. Highest on the list is probably fixtures, which can really simplify setup and teardown, whilst keeping the codebase very clean. I also like being able to mark tests and use this to influence the collection of tests. One I find myself using a lot is a ‘smoke’ or ‘sanity’ marker, which collects a subset of the tests for when you can’t afford to run the entire suite.
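As a quick illustration (my own sketch, not code from any Mozilla suite; the `numbers` fixture and the `smoke` marker name are made up), a fixture and a marker together look like this:

```python
import pytest

@pytest.fixture
def numbers():
    # Fixture: this setup code runs before each test that requests it;
    # anything after a `yield` here would serve as teardown.
    return [1, 2, 3]

@pytest.mark.smoke
def test_sum(numbers):
    # Part of the 'smoke' subset, selected with `pytest -m smoke`.
    assert sum(numbers) == 6

def test_reversed(numbers):
    # Collected in a full run, but excluded by `pytest -m smoke`.
    assert numbers[::-1] == [3, 2, 1]
```

A full run collects both tests, while `pytest -m smoke` collects only the marked one; registering the marker in the project configuration avoids marker warnings in newer pytest versions.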

During the sprint in June, I’d like to spend some time improving our plugins. In particular I hope to learn better ways to write tests for plugins. I’m not sure how much I’ll be able to help with the core pytest development, but I do have my own wishlist for improvements. This includes the following:

Maybe I’ll even be able to work on one of these, or any of the open issues on pytest with guidance from the experts in the room.

February 09, 2016 05:16 PM

mozregression updates

Release 2.3.0 and GUI release 0.9.0

Mozregression 2.3.0 and GUI 0.9.0 has been released!

Changes for both GUI and command line:

GUI changes only:

Command line changes only:

Thanks to Wasif Hyder, Mike Ling and Saurabh Singhal for their contributions!

February 09, 2016 12:00 AM

February 02, 2016

Henrik Skupin

Firefox Desktop automation goals Q1 2016

As promised in my last blog post, I don’t want to blog only about goals from past quarters, but also about planned work and what’s currently in progress. So this post will be the first one to shed some light on my active work.

First, let’s get started with my goals for this quarter.

Execute firefox-ui-tests in TaskCluster

Now that our tests are located in mozilla-central, mozilla-aurora, and mozilla-beta, we want to see them run on a per-check-in basis, including on try. Usually you would set up Buildbot jobs to get the tasks you want running. But given that the build system will be moved to Taskcluster in the next couple of months, we decided to start directly with the new CI infrastructure.

So how will this look, and how will mozmill-ci cope with it? For the latter I can say that we don’t want to run more tests than we do right now. This is mostly due to our limited infrastructure, which I have to maintain myself. Needing to run firefox-ui-tests for each check-in on all platforms, and even for try pushes, would mean that we totally exceed the machine capacity. Therefore we continue to use mozmill-ci for now to test nightly and release builds for en-US, but also a couple of other locales. This might change later this year when mozmill-ci can be replaced by running all the tasks in Taskcluster.

Anyway, for now my job is to get the firefox-ui-tests running in Taskcluster once a build task has finished. Although this can only be done for Linux right now, it shouldn’t matter that much, given that nothing in our firefox-puppeteer package is platform dependent so far. Expanding testing to other platforms should be trivial later on. For now the primary goal is to see test results of our tests in Treeherder and to let developers know what needs to be changed if e.g. UI changes are causing a regression for us.

If you are interested in more details have a look at bug 1237550.

Documentation of firefox-ui-tests and mozmill-ci

We have been submitting our test results to Treeherder for a while now, and they are pretty stable. But the jobs are still listed as Tier-3 and are not taken care of by sheriffs. To reach the Tier-2 level we definitely need proper documentation for our firefox-ui-tests, and especially for mozmill-ci. In case of test failures or build bustage, the sheriffs have to know what needs to be done.

Now that the dust caused by all the refactoring and moving the firefox-ui-tests to settles a bit, we want to start working more with contributors again. To allow easy contribution I will create various pieces of project documentation showing how to get started and how to submit patches. Ultimately I want to see a Quarter of Contribution project for our firefox-ui-tests around the middle of this year. Let’s see how this goes…

More details about that can be found on bug 1237552.

February 02, 2016 04:39 PM

Armen Zambrano G. (@armenzg)

End of MozCI QoC term

We recently completed another edition of Quarter of Contribution and I had the privilege to work with MikeLing, F3real & xenny.
I want to take a moment to thank all three of you for your hard work and contributions! It was a pleasure to work together with you during this period.

Some of the highlights of this term are:

You can see all other mozci contributions in here.

One of the things I learned from this QoC term:
  • Prepare sets of issues that are related which build towards a goal or a feature.
    • The better you think it through the easier it will be for you and the contributors
    • In GitHub you can create milestones of associated issues
  • Remind them to review their own code.
    • This is something I try to do for my own patches and saves me from my own embarrassment :)
  • Ask contributors to test their code before requesting formal review.
    • It forces them to test that it does what they expect it to do
  • Set expectations for review turnaround.
    • I could not be reviewing code every day since I had my own personal deliverables. I set Monday, Wednesday and Friday as code review days.
It was a good learning experience for me and I hope it was beneficial for them as well.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

February 02, 2016 02:48 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

February 02, 2016 07:23 AM

January 27, 2016

Geoff Brown


Bug 1233220 added a new Android-only mochitest-chrome test called test_awsy_lite.html. Inspired by, test_awsy_lite runs similar code and takes similar measurements to, but runs as a simple mochitest and reports results to Perfherder.

There are some interesting trade-offs to this approach to performance testing, compared to running a custom harness like or Talos.

+ Writing and adding a mochitest is very simple.

+ It is easy to report to Perfherder (see

+ Tests can be run locally to reproduce and debug test failures or irregularities.

+ There’s no special hardware to maintain. This is a big win compared to ad-hoc systems that might fail because someone kicks the phone hanging off the laptop that’s been tucked under their desk, or because of network changes, or failing hardware. was plagued by problems like this and hasn’t produced results in over a year.

? Your new mochitest is automatically run on every push…unless the test job is coalesced or optimized away by SETA.

? Results are tracked in Perfherder. I am a big fan of Perfherder and think it has a solid UI that works for a variety of data (APK sizes, build times, Talos results). I expect Perfherder will accommodate test_awsy_lite data too, but some comparisons may be less convenient to view in Perfherder compared to a custom UI, like

– For Android, mochitests are run only on Android emulators, running on AWS. That may not be representative of performance on real phones — but I’m hoping memory use is similar on emulators.

– Tests cannot run for too long. Some Talos and other performance tests run many iterations or pause for long periods of time, resulting in run-times of 20 minutes or more. Generally, a mochitest should not run for that long and will probably cause some sort of timeout if it does.

For test_awsy_lite.html, I took a few short-cuts, worth noting:

Results are in Perfherder. Add data for “android-2-3-armv7-api9” or “android-4-3-armv7-api15” and you will see various tests named “Resident Memory …”, each corresponding to a traditional measurement.
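For context on how results get into Perfherder: it picks data out of the test log by looking for lines that start with `PERFHERDER_DATA:` followed by a JSON payload. Below is a rough Python sketch of building such a line; the framework name, suite name, and value are made up for illustration, not what test_awsy_lite actually emits:

```python
import json

def perfherder_line(suite_name, value):
    """Build a PERFHERDER_DATA log line for a single-value suite.

    The suite/subtest names and the framework name here are
    illustrative, not the real test_awsy_lite ones.
    """
    payload = {
        "framework": {"name": "awsy"},
        "suites": [
            {
                "name": suite_name,
                "subtests": [{"name": suite_name, "value": value}],
            }
        ],
    }
    # Perfherder parses everything after the "PERFHERDER_DATA: " prefix as JSON.
    return "PERFHERDER_DATA: " + json.dumps(payload)

line = perfherder_line("Resident Memory Tabs open", 170000000)
```

In a mochitest the equivalent line is simply written to the test log from JavaScript; the key point is the prefix plus a well-formed JSON payload.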


January 27, 2016 01:39 PM

January 20, 2016

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

January 20, 2016 07:33 AM

January 19, 2016

mozregression updates

Release 2.2.0 and GUI release 0.8.0

New mozregression releases, coming with great new features!

Changes for both GUI and command line:

GUI changes only:

Command line changes only:

January 19, 2016 12:00 AM

January 13, 2016

David Burns

Marionette Executable Release v0.6.0

I have just released a new version of the Marionette, well the executable that you need to download.

The main fix in this release is the ability to speak to Firefox and get meaningful error messages back; not making sure that we don't run commands out of sync was a slight oversight on our part. We have also added getPageSource. This "native" call runs in the browser, instead of trying to do it in the JavaScript sandbox, which is what a number of the drivers were attempting. This will be added to the specification very soon.

I have also landed the update to interactions to the specification. This reads much better and has prose that makes it implementable. I suspect as the likes of Google and Microsoft start looking to implement it there will be bugs that need fixing.

Since you are awesome early adopters it would be great if we could raise bugs.

I am not expecting everything to work but below is a quick list that I know doesn't work.

Switching of Frames needs to be done with either a WebElement or an index. Windows can only be switched by window handles. This is currently how it has been discussed in the specification.

If in doubt, raise bugs!

Thanks for being an early adopter and thanks for raising bugs as you find them!

January 13, 2016 02:48 PM

January 11, 2016

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

January 11, 2016 03:13 PM

Julien Pagès

mozregression – Engineering Productivity Project of the Month

Hello from Engineering Productivity! Once a month we highlight one of our projects to help the Mozilla community discover a useful tool or an interesting contribution opportunity.

This month’s project is mozregression!

Why is mozregression useful ?

mozregression helps find regressions in Mozilla projects like Firefox or Firefox for Android. It downloads and runs the builds between two dates (or changesets) known to be good and bad, and lets you test each build, bisecting to find the smallest possible range of changesets in which the regression appeared.

It does not build the application under test locally; instead, it uses pre-built binaries, making it fast and easy for everyone to look for the origin of a regression.


# Search a Firefox regression in mozilla-central starting from 2016-01-01

mozregression -g 2016-01-01

# Firefox regression, on mozilla-aurora from 2015-09-01 to 2015-10-01

mozregression --repo aurora -g 2015-09-01 -b 2015-10-01

# Look for a regression in fennec (firefox for android)

mozregression --app fennec -g 2015-09-01 -b 2015-10-01

# regression on firefox in inbound, using debug builds and starting from known changesets

mozregression -b 6f4da397ac3c -g eb4dc9b5a928 -B debug --repo inbound
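All of the commands above drive the same underlying bisection. As a rough illustration (my own sketch, not mozregression's actual code), the core loop looks something like this, with `is_good` standing in for the manual "did the bug reproduce?" step:

```python
def bisect(builds, is_good):
    """Narrow an ordered list of builds (oldest to newest) down to the
    adjacent good/bad pair where the regression first appears.

    Assumes builds[0] is known good and builds[-1] is known bad.
    """
    lo, hi = 0, len(builds) - 1  # builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(builds[mid]):
            lo = mid  # regression landed after this build
        else:
            hi = mid  # regression landed at or before this build
    return builds[lo], builds[hi]

# With ten numbered builds and a regression introduced in build 7:
last_good, first_bad = bisect(list(range(10)), lambda b: b < 7)
# last_good, first_bad == (6, 7)
```

Each iteration halves the range, which is why a month of nightlies takes only a handful of manual checks.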

Note that a graphical interface also exists.

Where do I find mozregression ?

Start with:

What are we working on ?

Currently mozregression is improving in multiple areas, among them:


William Lachance (:wlach) and myself (:parkouss) are the current maintainers of mozregression.

We welcome contributors! Mike Ling has been helping the project for quite some time now, adding useful features and fixing various bugs – he’s currently working on providing ready-to-use binaries for Mac OS X. A big thanks to Mike Ling for your contributions!

Also thanks to Saurabh Singhal and Wasif Hider, who are recent contributors on the graphical user interface.

If you want to contribute as a developer or help us on the documentation, please say hi on the #ateam irc channel!

Reporting bugs / new ideas

You can also help a lot by reporting bugs or new ideas! Please file bugs on bugzilla with the mozregression component:

mozregression’s bug list:

For more information about all Engineering Productivity projects visit our wiki. If you’re interested in helping out, the A-Team bootcamp has resources for getting started.

January 11, 2016 02:47 PM

David Burns

Public Source vs Open Source

A few weeks ago I had an interesting conversation on Twitter, and then on instant messaging, about a piece of software that was open sourced. Some thought, and might still think, that the new piece of software would not be supported.

There was also recently a good blog post from James Long about how it can be hard to create open source code and then maintain it: either life gets in the way, or another project does.

I have always had a different view of code to most people. The idea is simple, at least in my head.

Open Source

The idea of Open Source has changed over the years, which means the original meaning is no longer quite right. Open Source has come to imply a certain amount of community involvement, such as people submitting patches (PRs on GitHub).

Projects that have been around for a very long time have organically grown some kind of community. These tend to be people who believe in the project, or who see other people working on it and get to know them. We see meet-ups forming as more and more people get involved.

This is the best part of open source code! The openness of people and code, all wrapped up in one!

However, not all code out in the open will achieve this! (And that is fine; not all pieces of code need to have a community. Imagine if every package on NPM had a meet-up!?!?)

Public Source

Public source is everything that open source has, minus all the community side of things. A really good example of public source is Android. You can see all the code and derive your own work from it, but want to submit a patch? Well, Cyanogen might take it, but Google, seemingly, doesn't care.

Also, most projects on GitHub probably fall under this category, especially if the person is a starter and not a maintainer, to use James' concept.

The thing to remember is that everyone wins when the code is in the public. Before you get hung up on "support" from people who have given up their code, and their time, to put it out there, remember that open source needs to grow from public source!

January 11, 2016 10:46 AM

January 08, 2016

Henrik Skupin

Review of automation work – Q4 2015

The last quarter of 2015 is gone and it’s time to reflect on what happened in Q4. In the following you will find a full overview of the whole quarter, for the last time: from now on I will post at shorter intervals about specific topics instead of covering everything. This was actually a wish from our latest automation survey, which I want to implement now. I hope you will like it.

During the last quarter my focus was completely on getting our firefox-ui-tests moved into mozilla-central, and on using mozharness to execute firefox-ui-tests in mozmill-ci via the test archive. As a result I had less time for other projects. So let’s go into the details…

Firefox UI Tests / Mozharness

One thing you really want with tests located in the tree is that they do not fail. So I spent a good amount of time fixing our top failures, and all those regressions caused by UI changes (like the security center) in Firefox, as preparation for the move. I got them all green, and am trying my best to keep that state while we are in the transition.

The next thing was to clean up the repository and split the different sub-folders into their own packages. With that, others could e.g. depend on our firefox-puppeteer package for their own tests. The whole refactoring work has been done on bug 1232967. If you wonder why this bug is not closed yet, it’s because we still have to wait with landing the patch until mozmill-ci production uses the new mozharness code. This will hopefully happen soon, and only waits on some other bugs being fixed.

But based on those created packages we were able to use exactly that code to get our harness, puppeteer, and tests landed on We also package them into the archive for use in mozmill-ci. Details about all that can be found on bug 1212609. But please be aware that we still use the Github repository as integration repository. I regularly mirror the code to hg, which has to happen until we can also use the test package for localized builds and update tests.

Besides all that, a couple of mozharness fixes were also necessary. So I implemented better fetching of the tooltool script, added the uninstall feature, and set up the handling of crash symbols for firefox-ui-tests. Finally, the addition of test package support finished up my work on mozharness for Q4 2015.

During all that time I was also sheriffing our test results on Treeherder (e.g. mozilla-central), because we are still at the Tier-3 level and sheriffs don’t take care of those jobs.

Mozmill CI

Our Jenkins-based CI system is still called mozmill-ci, even though it doesn’t really run any Mozmill tests anymore. We decided not to change its name given that it will only be around this year, until we can run all of our tests in TaskCluster. But lots of changes have landed, which I want to announce below:

Addons / Tools

I also had some time to work on supporting tools. Together with the help of contributors we got the following done:


Nightly Tester Tools


So all in all it was a productive quarter with lots of things accomplished. I’m glad that we got all of this done. Now in Q1 it will continue, and more interesting work is in front of me, which I’m excited about. I will announce that soon in my next blog post.

Until then I would like to give a little more insight into our current core team for Firefox automation. A picture taken during our all-hands work week in Orlando in early December shows Syd, Maja, myself, and David:

Group Picture

Let’s get into 2016 with lots of ideas, discussions, and enough energy to get those things done.

January 08, 2016 10:25 PM

Dan Minor

Using masked writes with ARM NEON intrinsics

I recently fixed Bug 1105513, which was to provide an ARM NEON optimized version of AudioBlockPanStereoToStereo for the case where “OnTheLeft” is an array. This is used by the StereoPanner node when the pan value is set for a future time, for instance with code like the following:

panner = oac.createStereoPanner();
panner.pan.setValueAtTime(-0.1, 0.0);
panner.pan.setValueAtTime(0.2, 0.5);

The “OnTheLeft” values determine whether the sound is on the left or right of the listener at a given time, which controls the interpolation calculation performed when panning. If this changes with time, then this is passed as an array rather than as a constant.

The unoptimized version of this function checks each value of “OnTheLeft” and performs the appropriate calculation. This isn’t an option for NEON which lacks this kind of conditional execution.

The bright side is that NEON does provide masked writes, where a variable controls which components of a vector are written. Unfortunately, the NEON documentation is sparse at best, so it took a few tries to get things right.

The first trick is to convert from a bool to a suitable mask. What a bool is is, of course, platform dependent, but in this case I had an array of eight bytes, each containing a zero or a one. The best solution I came up with was to load them as a vector of 8 unsigned bytes and then set each corresponding float lane of the mask individually:

isOnTheLeft = vld1_u8((uint8_t *)&aIsOnTheLeft[i]);
voutL0 = vsetq_lane_f32(vget_lane_u8(isOnTheLeft, 0), voutL0, 0);
voutL0 = vsetq_lane_f32(vget_lane_u8(isOnTheLeft, 1), voutL0, 1);

Once loaded, they can be converted into a suitable mask using the vcgtq_f32 function, which sets all bits of each result lane where the corresponding lane of the first argument is greater than that of the second:

voutL0 = (float32x4_t)vcgtq_f32(voutL0, zero);

After that, the appropriate calculations are done for both the case where “OnTheLeft” is true and where it is false. These are then written to the result using the vbslq function, which takes the mask as its first argument and selects from the second and third arguments based upon the bits of the mask:

voutL0 = vbslq_f32((uint32x4_t)voutL0, onleft0, notonleft0);

I evaluated these changes on a StereoPanner benchmark where I saw a small performance improvement.

January 08, 2016 01:53 PM

Mark Côté

BMO in 2015

It’s been a whole year since my last BMO update, partly because I’ve been busy with MozReview (and blogging a lot about it), and partly because the BMO team got distracted from our goals by a few sudden priority changes, which I’ll get to later in this post.

Plans from 2014

Even with some large interruptions, we fully achieved three of our five goals for the year and made good progress on a fourth.

Alternative Bug Views

Have you tried out the new modal UI? Although not completely finished (it lacks a few features that the standard UI has), it’s very usable. I don’t remember the last time I had to switch back, and I’ve been using it for at least 6 months. Bonus: gone is the intermediate page when you change a bug’s product, a gripe from time immemorial!

Even though there are still a large number of controls, the new UI is a lot more streamlined. glob gave a brief presentation at a Mozilla Project Meeting in November if you’d like to learn more.

The part we haven’t yet undertaken is building on this new framework to provide alternate views of bug data depending on what the user is trying to accomplish. We want to experiment with stripping down the presented data to only what is needed for a particular task, e.g. developing, triaging, approving, etc. The new UI is a lot more flexible than the old, so in 2016 we’ll build out at least one new task-centric view.

GitHub Authentication

If you haven’t noticed, you can log into BMO via GitHub. If you’ve never used BMO before, you’ll be prompted to set up an account after authenticating. As with Persona, only users with no special privileges (i.e. not admins nor people in security groups) can log in via GitHub.

Auth Delegation

Originally designed to smooth the process of logging into Review Board, auth delegation for API keys is actually a general-use feature that greatly improves the user experience, not to mention security, of third-party apps by allowing them to delegate authentication to BMO. There’s now no reason for apps to directly ask for your BMO credentials!

MozReview Details

There’s now a panel just above the attachments table that shows all the MozReview commits associated with the displayed bug along with a bit of other info:

We’re currently sorting out a single method to display other relevant information, notably the status of reviews, and then we’ll add that to this table.

Improved Searchability

This is the big item we haven’t made much progress on. We’ve got a plan to mirror some data to an Elasticsearch cluster and wire it into Quick Search. We’ve even started on the implementation, but it won’t be ready until mid-2016. It will increase search speeds, addressing one of the more common complaints about BMO.

Curve balls

We had two sets of surprises in 2015. One was work that ended up consuming more time than we had expected, and the other was important work that suddenly got a big priority boost.

BMO Backup in AWS

The first is that we moved the BMO failover out of a data center in Phoenix and into the cloud. IT did most of the work, but we had to make a series of changes to BMO to facilitate the move. We also had a lot of testing to do. The upside is that our new failover system has had more testing than our old one had for quite some time!

Hardened Security

In August we found out that an attacker had compromised a privileged BMO account, using a combination of a weak, reused password and an exploit in another web site. In addition to a huge forensics effort from the great security folks at Mozilla, the BMO team implemented a number of security enhancements to BMO, most notably two-factor authentication. This work naturally took high priority and is the main reason for the slippage of our big 2015 goals. Here’s to a more secure 2016!

Other Stuff

As usual, the BMO team rolled out a pile of smaller fixes, enhancements, improvements, and new features. A few notable examples include

You can always find the exhaustive list of recent changes to BMO on the wiki or on the group/mailing list.

January 08, 2016 01:20 AM

January 05, 2016

Henrik Skupin

Automation Survey Follow-up

As promised in my last post about the automation survey results, I wanted to come up with a follow-up to clarify our next steps in being more open about our activities, discussions, and also quarterly goals. Sorry that it has taken a bit longer, but the end of the quarter, and especially of the year, is mostly packed with stuff to finish up. Also, the all-hands work week in Orlando at the beginning of December held me off from doing a lot of real work.

So let’s get started with the mailing list topic first. As we have seen, most people kinda like to get our news via the automation mailing list. But given the low usage of that list in the last months, that was a bit surprising. Nearly all the time I sent emails myself (not counting Travis results). That means we want to implement a change here. From now on we won’t use the list but will instead utilize the list. Also, because this is the recommended list for the Engineering Productivity team we are all part of, discussions will reach a larger audience. So please subscribe to this list via Google Groups or email.

For status updates about our current activities we started to use last quarter. It seems to work pretty well for us, and everyone else is welcome to also post updates to our automation project section. If you are interested in those updates, read through that list or simply subscribe to the page in your RSS reader.

Please also note that from now on there will be no Firefox Automation reports anymore. Instead I will reduce the amount of different contents, and only write about projects I worked on. So keep an eye out to not miss those!

January 05, 2016 12:03 PM

January 04, 2016

Mark Côté

Review Board history

A few weeks ago, mdoglio found an article from six years ago comparing Review Board and Splinter in the context of GNOME engineering. This was a fascinating read because, without having read the article in advance, the MozReview team ended up implementing almost everything the author talked about.

Firstly, I admit the comparison isn’t quite fair: GNOME doesn’t use attachment flags, which BMO relies on heavily. I haven’t ever submitted a patch to GNOME, but I suspect BMO’s use of review flags makes the review process at least a bit simpler.

The first problem with Review Board that he points out is that the “post-review command-line leaves a lot to be desired when compared to git-bz”. This was something we did early on in MozReview, albeit with Mercurial instead: the ability to push patches up to MozReview with the hg command. Admittedly, we need an extension, mainly because of interactions with BMO, but we’ve automated that setup with mach mercurial-setup to reduce the friction. Pushing commits is the area of MozReview that has seen the fewest complaints, so I think the team did a great job there in making it intuitive and easy to use.

Then we get to what the author describes as “a more fundamental problem”: “a review in Review Board is of a single diff”. As he continues, “More complex enhancements are almost always done as patchsets [emphasis his], with each patch in the set being kept as standalone as possible. … Trying to handle this explicitly in the Review Board user interface would require some substantial changes”. This was also an early feature of MozReview, implemented at the same time as hg push support. It’s a core philosophy baked into MozReview, the single biggest feature that distinguishes MozReview from pretty much every other code-review tool out there. It’s interesting to see that people were thinking about this years before we started down that road.

An interesting aside: he says that “a single diff … [is] not very natural to how people work with Git”. The article was written in 2009, as GitHub was just starting to gain popularity. GitHub users tend to push fix-up commits to address review points rather than editing the original commits. This is at least in part due to limitations present early on in GitHub: comments would be lost if the commit was updated. The MozReview team, in fact, has gotten some push back from people who like working this way, who want to make a bunch of follow-up commits and then squash them all down to a single commit before landing. People who strongly support splitting work into several logical commits and updating them in place actually tend to be Mercurial users now, especially those that use the evolve extension, which can even track bigger changes like commit reordering and insertion.

Back to Review Board. The author moves onto how they’d have to integrate Review Board with Bugzilla: “some sort of single-sign-on across Bugzilla and Review Board”, “a bugzilla extension to link to reviews”, and “a Review Board extension to update bugs in Bugzilla”. Those are some of the first features we developed, and then later improved on.

There are other points he lists that we don’t have, like an “automated process to keep the repository list in Review Board in sync with the 600+ GNOME repositories”. Luckily many people at Mozilla work on just one repo: mozilla-central. But it’s true that we have to add others manually.

Another is “reduc[ing] the amount of noise for bug reporters”, which you get if you confine all patch-specific discussion to the review tool. We don’t have this yet; to ease the transition to Review Board, we currently mirror pretty much everything to Bugzilla. I would really like to see us move more and more of code-related discussion to Review Board, however. Hopefully as more people transition to using MozReview full time, we can get there.

Lastly, I have to laugh a bit at “it has a very slick and well developed web interface for reviewing and commenting on patches”. Clearly we thought so as well, but there are those that prefer the simplicity of Splinter, even in 2015, although probably mostly from habit. Trying to reconcile these two views is very challenging.

January 04, 2016 10:19 PM

David Burns

The "power" of overworking

The other week I was in Orlando, Florida for a Mozilla All-Hands. It is a week where around 1200 Mozillians get together to spend time with each other planning, coding, or solving some hard problems.

One of the topics that came up was how someone always seemed to be online. This comment was a little more than "they never seem to go offline from IRC"; it was "they seem to be commenting on things around 20 hours a day". Overworking is an easy thing to do, and when you love your job you can easily be pulled into this trap.

I use the word trap and I mean it!

If you overwork, you put yourself into a state where people come to expect that you will overwork. If you overwork and have a manager who doesn't notice, then when you do normal hours they begin to think that you are slacking. If you do have a manager who is telling you to stop overdoing it, you might then have colleagues who don't notice that you work all those hours; they then expect you to be this machine, doing everything and more. And the colleagues who notice you doing too many hours start to think your manager is a poor manager for not helping you keep a good work/life balance.

At this point, everyone is starting to lose. You are not being as productive as you could be. Studies have shown that working more than 40 hours a week only marginally increases productivity, and even that boost lasts only a few weeks before productivity drops below what you would have if you worked 40 hours a week.

The reasons for overworking can be numerous, but the one that regularly stands out is imposter syndrome. "If I work 50 hours a week then people won't see me fail because I will hopefully have fixed it in time." This is a fallacy: people are happy to wait for problems to be fixed as long as they are in hand. Having one person be responsible for fixing things is a road to ruin. A good team is measured by how quickly they help colleagues. If you fall, there will be 2 people there to pick you up.

Before you start working more than 40 hours a week, think about the people this is going to impact. It is not only your colleagues, who have to clean up the technical debt; it is also your loved ones. Missing an anniversary, a birthday, a dance/music recital. Work is never worth missing that!

If you are working more than 40 hours, I suggest bringing this up in your next 1:1. A good manager will appreciate that you are doing some self care and will work with you on making changes to your workload. They could be over-promising their team and need to get this under control.

January 04, 2016 09:10 PM

December 31, 2015

Geoff Brown

Firefox for Android Performance Measures – Q4 Check-up



APK Size

This quarter we began tracking the size of the Firefox for Android APK, and some of its components. You can see the size of every build on treeherder using Perfherder.

Here’s how the APK size changed over the last 2 months, for mozilla-central Android 4.0 opt builds:


There are lots of increases and a few decreases here. The most significant decrease (almost half a megabyte) is on Nov 23, from mfinkle’s change for Bug 1223526. The most significant increase (~200K) is on Dec 20, from a Skia update, Bug 1082598.

It is worth noting that the sizes of over the same period were almost always increasing:



This section tracks Perfherder graphs for mozilla-central builds of Firefox for Android, for Talos tests run on Android 4.0 Opt. The test names shown are those used on treeherder. See for background on Talos.

We intend to retire the remaining Android Talos tests, migrating these tests to autophone in the very near future.


Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

This test is no longer running. It was noisy and needed to be rewritten for APZ. See discussion in bug 1213032 and bug 1230572.


An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode, thus reflecting the maximum rendering throughput of each test. The reported value is the page load time or, for animations/iterations, the overall duration the sequence/animation took to complete. Lower values are better.


730 (start of period) – 110 (end of period)

A small regression at the end of November corresponded with the introduction of APZ; it was investigated in bug 1229118. An extraordinary improvement on Dec 25 was the result of jchen’s refactoring.


Generic page load test. Lower values are better.


730 (start of period) – 680 (end of period)

Note the same regression and improvement as seen in tsvgx.


Throbber Start / Throbber Stop

These graphs are taken from  Browser startup performance is measured on real phones (a variety of popular devices).




Android tests are no longer run on Eideticker.


These graphs are taken from the mozbench dashboard at which includes some comparisons involving Firefox for Android. More info at


Sadly, the other mobile benchmarks have no data for most of November and December…I’m not sure why.

December 31, 2015 08:44 PM

Robert Wood

Posting to Treeherder from Jenkins


The Firefox OS post-commit Raptor performance automated tests currently run on real FxOS remote devices, driven by Jenkins jobs. The performance results are published on the raptor dashboard. In order to add more visibility for these tests, and gaia performance testing in general, we decided to also post raptor performance test data on the Mozilla treeherder dashboard.

If the raptor automation was running inside of taskcluster, posting to treeherder would be taken care of easily via the actual taskcluster task-graph. However when running automation on jenkins, in order to post data on treeherder it does take a bit of extra work. This blog post will tell you how to do it!

For the purpose of this blog I will summarize the steps I took, but I strongly encourage you to refer to the treeherder docs if you run into problems or just want more detailed information. Thanks to the treeherder team for the help they gave me along the way.


It is assumed that you already have a linux development environment setup (I am running Ubuntu 14.04 x64). Posting to treeherder can be done via a treeherder python client or a node.js client; this blog post only provides direction for using the python client. It is assumed that your development environment already has Python 2.7.9+ installed (2.7.9+ is required for authentication).

Virtual Box and Vagrant are also required for the development environment. If you don’t have them installed, please refer to the Virtual Box downloads site and Vagrant Docs on how to install them.

Setup the Development Environment

In order to develop and test your submission code to post data to treeherder, you need to setup the development environment.

Clone the Repository

git clone

Install Treeherder Client

pip install treeherder-client

You will also need these supporting packages:

pip install mozinfo boto

Start Vagrant

Instead of posting to the live treeherder site during your development, it’s best to run your own local treeherder instance. To do that, you need to start vagrant. First add the following IP entry to your /etc/hosts file:

Then from your local treeherder github repository directory, start it up:

~/treeherder$ vagrant up

Wait for vagrant to boot up, it can take several minutes. Watch for errors; I had a couple of errors on first time startup and had to rebuild my local virtualbox (sudo /etc/init.d/vboxdrv setup) and install this package:

sudo apt-get install nfs-kernel-server

Start Local Treeherder Instance

Now that vagrant is running we can startup our local treeherder instance! Open a terminal and from your treeherder repository folder, SSH into your vagrant:

~/treeherder$ vagrant ssh

Inside the vagrant vm, startup treeherder:

vagrant ~/treeherder $ ./bin/run_gunicorn

Wait for that puppy to start, and then in Firefox go to your local instance URL:

You’ll see your local treeherder instance dashboard, however you’ll notice there’s no data! In order to receive real live treeherder data on your local test instance, you need to ‘ingest’ the data.

Ingest Live Data

In another terminal session ssh into vagrant and start the worker, as follows:

~/treeherder$ vagrant ssh

vagrant ~/treeherder $ celery -A treeherder worker -B --concurrency 5

Now if you wait a minute and then refresh your local treeherder instance in Firefox, you'll see some live data magically appear!

Local Treeherder Credentials

In order to test posting data to treeherder you need credentials, even if just posting to your local treeherder instance. To generate credentials for your local treeherder instance, there’s a command to run inside a vagrant ssh session, as follows (replace ‘test-client-id’ with your desired test client id):

~/treeherder$ vagrant ssh

vagrant ~/treeherder $ ./ create_credentials test-client-id "Local instance test credentials"

The generated credentials will be displayed on the console. Be sure to record your client-id and resulting secret somewhere.

If you had any issues with setting up your development environment, I encourage you to review the setup info in the treeherder docs for further details.

Develop and Test Submission Code

Now that you have your development environment setup and your local treeherder instance up and running, you’re ready to start developing your submission code. The submission code will ultimately run as part of your jenkins build.

Basically you need to write code that builds treeherder job data and submits it, using the treeherder client.

Submission Code Example

Henrik on the Engineering Productivity team pointed me towards the treeherder submission code from the Mozmill-ci automation framework. I grabbed the submission code from there as a start, created my own python project, and modified the code as required to meet the needs for raptor. If you like, clone or fork the raptor post-to-treeherder github repo to use as a starting point, and modify it accordingly for your project.

The python script that we use to submit raptor data to treeherder is here. I’ll explain the code a little bit further now.

Treeherder Job Data

Each treeherder data set contains job data, which includes:

Product name: As it sounds, we use 'Raptor'

Repository: The Mozilla repo that you ran your tests against, and the tree that you want your treeherder results to be published to (e.g. 'b2g-inbound')

Platform: This is listed next to your treeherder results group (e.g. 'B2G Raptor opt')

Group name: Summary name for your test data (e.g. 'Raptor Performance'). This appears when you mouse over your group symbol on treeherder.

Group symbol: The name of the group to hold all of your job results. For example, for the Raptor coldlaunch test we use the group symbol 'Coldlaunch'. The actual jobs on treeherder will appear inside brackets following the group symbol.

Job name: Name for the individual treeherder job. For Raptor, we use the name of the gaia app that the performance test was run against (e.g. 'clock'). This name appears when you mouse over the job symbol on treeherder.

Job symbol: The job code on treeherder for the actual job result. This is the symbol that turns colour based on the job result; when it is moused over, the job state and result appear. For Raptor we use a three-letter code for the gaia app that was tested (e.g. 'clk').

In the Raptor treeherder submission code, the job details are stored in a configuration file here.

Revision hash: This is the gecko version hash, for the version of gecko that your tests were run against. This is how treeherder finds your dataset to post to – your submission code will add a group and job to the already existing treeherder dataset for the specified gecko revision. In my submission code, this is where the revision hash is retrieved.

In the Raptor treeherder submission code, this is where it creates the initial treeherder job dataset.
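The fields above can be gathered into a single job dataset before submission. Here is a minimal sketch in plain Python; the dict layout is illustrative only, since the real treeherder-client wraps these fields in its own job and collection objects:

```python
# Sketch: assembling the treeherder job metadata described above into a
# plain dict. Field names mirror the prose; the actual treeherder-client
# API wraps these in TreeherderJob/TreeherderJobCollection objects.
def build_job_dataset(revision_hash, app_name, app_symbol, state):
    return {
        "product_name": "Raptor",
        "repository": "b2g-inbound",
        "platform": "B2G Raptor opt",
        "group_name": "Raptor Performance",
        "group_symbol": "Coldlaunch",
        "job_name": app_name,        # e.g. "clock"
        "job_symbol": app_symbol,    # e.g. "clk"
        "revision_hash": revision_hash,
        "state": state,              # "running" or "completed"
    }
```

The same function is called once with state "running" when the build starts, and again with state "completed" (plus a result) when it finishes, as described below.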

Treeherder Job State

In order to have your jenkins job appear on treeherder like other jobs do, with a status of 'running' followed by a status of 'completed', you need to post the treeherder data set once when you are starting your build on jenkins, and then post again at the end of your jenkins job.

For Raptor, just before starting the performance test I submit the treeherder data set, with:


Then after the performance test has finished then I post the treeherder data set again, this time with:


Treeherder Job Result

For Raptor, at the start the treeherder job has a status of 'running'. After the performance test is finished, I submit the dataset to treeherder again, in the 'completed' state with one of three results: busted, success, or testfailed. The result is specified by adding the following to your treeherder job dataset (example for a passed job):


if the test didn’t finish for some reason (i.e. test itself is incomplete, timed out, etc) then I post the ‘completed’ state with a result of ‘BUSTED’.

If the Raptor performance test completed successfully, I have code that checks the results. If result is deemed to be a pass, then I post the ‘completed’ state with a result of ‘SUCCESS’.

If the Raptor performance test completed successfully, and the results code determines that the result itself indicates a performance regression, then I post the ‘completed’ state with a result of ‘TESTFAILED’.

This is the part of the code that checks the Raptor results and determines the treeherder job result.

In my submission code, this is where the actual treeherder job status is set. For more detailed information about building the treeherder job dataset, see the Job Collections section on the treeherder docs.
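The three outcomes above reduce to a small mapping. A hypothetical helper (the function and parameter names are mine, not taken from the Raptor code):

```python
def treeherder_result(test_finished, perf_regression):
    """Map a Raptor run outcome to a treeherder job result, per the
    three cases described above. Always paired with state 'completed'."""
    if not test_finished:
        return "busted"       # test incomplete, timed out, etc.
    if perf_regression:
        return "testfailed"   # completed, but results show a regression
    return "success"          # completed and results look good
```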

Test on Local Instance

Once your submission code is ready, test it out first by submitting to your local treeherder instance. To do that, just specify the treeherder URL to be:

Use your local treeherder credentials that you previously generated above. For example, this is how the command line looks for my submission code, to submit to my local treeherder instance:

(raptor-th)rwood@ubuntu:~/raptor-post$ ./ --repository=b2g-inbound --treeherder-url= --treeherder-client-id=xxx --treeherder-secret=xxx --test-type=cold-launch --app-name='clock' --app_symbol='clock' --build-state=completed


When you run into submission errors while testing your code, there are a couple of log files that might give you more info to help debug.


vagrant /var/log/treeherder $ ls
treeherder.log  treeherder.log.1  treeherder.log.2

Or in the vagrant SSH session, when starting up the treeherder instance, just do:

vagrant ~/treeherder $ ./bin/run_gunicorn | grep error

Request Credentials for Staging and Production

In order to submit data to live treeherder staging and production, you need credentials. To find out how to request credentials, see Managing API Credentials in the treeherder docs.

Test Posting to Treeherder Staging

Now at this point your submission code should be working great and posting successfully to your local treeherder instance. The next step, now that you have your treeherder credentials, is to test posting to live treeherder staging. Simply run your submission code locally as you did before, but change your submission URL to point to treeherder staging:

Use the credentials for live treeherder staging, as you requested above.

Add Submission Code to Jenkins

This is a given; in Jenkins just add an execute shell step to clone your github repo where your submission code lives. Then add a managed shell script (or execute shell step) that uses your submission code the same way you have tested it locally. Ensure you are posting to the treeherder staging URL first, NOT production.

You may not want to fail your entire jenkins build if submitting to treeherder fails for some reason. For Raptor, if the performance tests have completed successfully but submitting to treeherder fails, I don’t want the jenkins build to fail because the performance tests passed. Therefore, in my managed script (or in your execute shell step), finish with this line:

exit 0

Then your jenkins submission code step will always return success and won’t fail the jenkins build.

Example Jenkins Submission Code

For an example of how we use our treeherder submission script from jenkins, see our jenkins coldlaunch driver script.

Test from Jenkins to Treeherder Staging

Let your jenkins builds run (or retrigger builds) and verify the results are being posted to live treeherder staging. They will look the same as they did when you tested posting to treeherder staging from your local machine. If there are errors you may need to fix something in your jenkins execute shell or managed script.

Server Clock Time

One issue that I ran into: submitting to my local treeherder instance was working great; however, submitting to live treeherder staging from jenkins was failing with the following error:

13:50:05 requests.exceptions.HTTPError: 403 Client Error: FORBIDDEN for url

Turns out the problem is that the server time on the jenkins node was off by 13 minutes. If the jenkins node time is off from the treeherder server time by more than 60 seconds, then authentication will fail; so be sure that your jenkins server time is correct.
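One way to sanity-check skew before submitting is to compare local time against the Date header of any HTTP response from the treeherder host. A sketch using only the standard library (the helper name is mine; the 60-second threshold matches the authentication window described above):

```python
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

def clock_skew_ok(date_header, max_skew=60):
    """Return True if local UTC time is within max_skew seconds of the
    server time reported in an HTTP 'Date' header (RFC 1123 format)."""
    server = parsedate_to_datetime(date_header)
    local = datetime.now(timezone.utc)
    return abs((local - server).total_seconds()) <= max_skew
```

Run this against a response from the treeherder server before posting; if it returns False, fix the jenkins node's clock (e.g. via NTP) before retrying.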

Switch to Treeherder Production

Be sure to test posting from jenkins to treeherder staging for a good while before going live on production. You may want to ask someone in your team for a final review of your submission code, and to approve how the dataset looks on treeherder. Once you are happy with it, and want to switch to production, all you do is update your jenkins code to use the treeherder production URL instead of staging:

I hope that you found this blog post useful. If you have any questions about my submission code feel free to contact me. Happy submitting!

December 31, 2015 07:01 PM

December 30, 2015

Julien Pagès

Convert Firefox into Emacs

Firefox is a great browser. One of the reasons I really love it is because it is highly configurable: as an Emacs user, I wanted to use emacs key bindings inside Firefox – well, it’s easy to do that. And much more!

Most of the magic for me comes from the awesome keysnail addon. It basically converts Firefox into Emacs, is also highly configurable and has plugins.

For example, I now use C-x <left> and C-x <right> to switch tabs; C-x b to choose a specific tab (using the Tanything plugin) or C-x k to kill a tab. Tabs are now like Emacs buffers! Keysnail supports the mark, incremental search (C-s and C-r), specific functions, … Even M-x is implemented, to search for and run specific commands!

Also I use the Find As You Type Firefox feature, for links. It’s awesome: I just hit ‘, then start typing some letters in a link title that I want to follow – I can then use C-s or C-r to find next/previous matching links if needed, then I just press Return to follow the link.

I can browse the web more efficiently, I use the mouse less, and I can reuse the same key bindings in Emacs and Firefox! I keep my configuration files on github; feel free to look at them if you're interested!

Happy browsing!

December 30, 2015 03:49 PM

December 24, 2015

Geoff Brown

Comparing Linux mochitest results across environments

A few weeks ago, I was trying to run Linux Debug mochitests in an unfamiliar environment and that got me to thinking about how well tests run on different computers. How much does the run-time environment – the hardware, the OS, system applications, UI, etc. – affect the reliability of tests?

At that time, Linux 64 Debug plain, non-e10s mochitests on treeherder – M(1) .. M(5) – were running well: Nearly all jobs were green. The most frequent intermittent failure was dom/html/test/test_fullscreen-api-race.html, but even that test failed only about 1 time in 10. I wondered, are those tests as reliable in other environments? Do intermittent failures reproduce with the same frequency on other computers?

Experiment: Borrow a test slave, run tests over VNC

I borrowed an aws test slave – see – and used VNC to access the slave and run tests. I downloaded builds and test packages from mozilla-central and invoked with the same arguments used for the automated tests shown on treeherder.   To save time, I restricted my tests to mochitest-1, but I repeated mochitest-1 10 times. All tests passed all 10 times. Additional runs produced intermittent failures, like test_fullscreen-api-race, with approximately the same frequency reported by Orange Factor for recent builds. tl;dr Treeherder results, including intermittent failures, for mochitests can be reliably reproduced on borrowed slaves accessed with VNC.

Experiment: Run tests on my laptop

Next I tried running tests on my laptop, a ThinkPad w540 running Ubuntu 14. I downloaded the same builds and test packages from mozilla-central and invoked with the same arguments used for the automated tests shown on treeherder. This time I noticed different results immediately: several tests in mochitest-1 failed consistently. I investigated and tracked down some failures to environmental causes: essential components like pulseaudio or gstreamer not installed or not configured correctly. Once I corrected those issues, I still had a few permanent test failures (like dom/base/test/test_applet_alternate_content.html, which has no bugs on file) and very frequent intermittents (like dom/base/test/test_bug704320_policyset.html, which is decidedly low-frequency in Orange Factor). I also could not reproduce the most frequent mochitest-1 intermittents I found on Orange Factor and reproduced earlier on the borrowed slave. An intermittent failure like test_fullscreen-api-race, which I could generally reproduce at least once in 10 to 20 runs on a borrowed slave, I could not reproduce at all in over 100 runs on my laptop. (That’s 100 runs of the entire mochitest-1 job. I also tried running specific tests or specific directories of tests up to 1000 times, but I still could not reproduce the most common intermittent failures seen on treeherder.) tl;dr Intermittent failures seen on treeherder are frequently impossible to reproduce on my laptop; some failures seen on my laptop have never been reported before.

Experiment: Run tests on a Digital Ocean instance

Digital Ocean offers virtual servers in the cloud, similar to AWS EC2. Digital Ocean is of interest because rr can be used on Digital Ocean but not on aws. I repeated my test runs, again with the same methodology, on a Digital Ocean instance set up earlier this year for Orange Hunter.

My experience on Digital Ocean was very similar to that on my own laptop. Most tests pass, but there are some failures seen on Digital Ocean that are not seen on treeherder and not seen on my laptop, and intermittent failures which occur with some frequency on treeherder could not be reproduced on Digital Ocean.

tl;dr Intermittent failures seen on treeherder are frequently impossible to reproduce on Digital Ocean; some failures seen on Digital Ocean have never been reported before; failures on Digital Ocean are also different (or of different frequency) from those seen on my laptop.


I found it relatively easy to run Linux Debug mochitests in various environments in a manner similar to the test jobs we see on treeherder. Test results were similar to treeherder, in that most tests passed. That’s all good, and expected.

However, test results often differed in small but significant ways across environments and I could not reproduce most frequent intermittent failures seen on treeherder and tracked in Orange Factor. This is rather discouraging and the cause of the concern mentioned in my last post: While rr appears to be an excellent tool for recording and replaying intermittent test failures and seems to have minimal impact on the chances of reproducing an intermittent failure, rr cannot be run on the aws instances used to run Firefox tests in continuous integration, and it seems difficult to reproduce many intermittent test failures in different environments. (I don’t have a good sense of why this is: Timing differences, hardware, OS, system configuration?)

If rr could be run on aws, all would be grand: We could record test runs in aws with excellent chances of reproducing and recording intermittent test failures and could make those recordings available to developers interested in debugging the failures. But I don’t think that’s possible.

We had hoped that we could run tests in another environment (Digital Ocean) and observe the same failures seen on aws and reported in treeherder, but that doesn’t seem to be the case.

Another possibility is bug 1226676: We hope to start running Linux tests in a docker container soon. Once that’s working, if rr can be run in the container, perhaps intermittent failures will behave the same way and can be reproduced and recorded.

December 24, 2015 05:28 AM

December 22, 2015

Dave Hunt

Selenium tests with pytest

When you think of Mozilla you most likely first associate it with Firefox or our mission to build a better internet. You may not think we have many websites of our own, beyond perhaps the one where you can download our products. It’s only when you start listing them that you realise how many we actually have; addons repository, product support, app marketplace, build results, crash statistics, community directory, contributor tasks, technical documentation, and that’s just a few! Each of these have a suite of automated functional tests that simulate a user interacting with their browser. For most of these we’re using Python and the pytest harness. Our framework has evolved over time, and this year there have been a few exciting changes.

Over four years ago we developed and released a plugin for pytest that removed a lot of duplicate code from across our suites. This plugin did several things; it handled starting a Selenium browser, passing credentials for tests to use, and generating an HTML report. As it didn’t just do one job, it was rather difficult to name. In the end we picked pytest-mozwebqa because it was only specific in addressing the needs of the Web QA team at Mozilla. It really took us to a new level of consistency and quality across all our web automation projects.

Enhanced HTML report generated by pytest-html

This year, when I officially joined the Web QA team, I started working on breaking the plugin up into smaller plugins, each with a single purpose. The first to be released was the HTML report generation (pytest-html), which generates a single file report as an alternative to the existing JUnit report or console output. The plugin was written such that the report can be enhanced by other plugins, which ultimately allows us to include screenshots and other useful things in the report.

Next up was the variables injection (pytest-variables). This was needed primarily because we have tests that require an existing user account in the application under test. We couldn’t simply hard-code these credentials into our tests, because our tests are open source, and if we exposed these credentials someone may be able to use them and adversely affect our test results. With this plugin we are able to store our credentials in a private JSON file that can be simply referenced from the command line.

The final plugin was for browser provisioning (pytest-selenium). This started as a fork of the original plugin because much of the code already existed. There were a number of improvements, such as providing direct access to the Selenium object in tests, and avoiding setting a default implicit wait. In addition to supporting Sauce Labs, we also added support for BrowserStack and TestingBot.

Now that pytest-selenium has been released, we have started to migrate our own projects away from pytest-mozwebqa. The migration is relatively painless, but does involve changes to tests. If you’re a user of pytest-mozwebqa you can check out a few examples of the migration. There will no longer be any releases of pytest-mozwebqa and I will soon be marking this project as deprecated.

The most rewarding consequence of breaking up the plugins is that we’ve already seen individual contributors adopting and submitting patches. If you’re using any of these plugins let us know – I always love hearing how and where our tools are used!

December 22, 2015 09:45 AM

mozregression updates

Release 2.1.0 and GUI release 0.7.0

This is a minor release of the command-line and GUI mozregression tools.

On the command line side:

And other minor features and fixes: bug 1195390, bug 1231745, bug 1232879, bug 1233649 and bug 1233905.

On the GUI side:

And other fixes: bug 1232660 and bug 1233657

Thanks to Saurabh Singhal and GopianiS! They are new mozregression contributors who helped me with this release.

December 22, 2015 12:00 AM

December 18, 2015

Geoff Brown

Recording and replaying mochitests with rr and mach

rr is a lightweight debugging tool that allows program execution to be recorded and subsequently replayed and debugged. gdb-based debugging of recordings is enhanced by reverse execution.

rr can be used to record and replay Firefox and Firefox tests on Linux. If you have rr installed and have a Linux Debug build of Firefox handy, recording a mochitest is as simple as:

  ./mach mochitest --debugger=rr ...

For example, to record a single mochitest:

  ./mach mochitest testing/mochitest/tests/Harness_sanity/test_sanitySimpletest.html \
    --keep-open=false --debugger=rr

Even better, use --run-until-failure to repeat the mochitest until an intermittent failure occurs:

  ./mach mochitest testing/mochitest/tests/Harness_sanity/test_sanitySimpletest.html \
    --keep-open=false --run-until-failure --debugger=rr

To replay and debug the most recent recording:

  rr replay

Similar techniques can be applied to reftests, xpcshell tests, etc.

For a fun and simple experiment, you can update a test to fail randomly, maybe based on Math.random(). Run the test in a loop or with –run-until-failure to reproduce your failure, then replay: Your “random” failure should occur at exactly the same point in execution on replay.

In recent weeks, I have run many mochitests on my laptop in rr, hoping to improve my understanding of how well rr can record and replay intermittent test failures.

rr has some, but only a little, effect on test run-time. I can normally run mochitest-1 via mach on my laptop in about 17 minutes; with rr, that increases to about 22 minutes (130% of normal). That’s consistent with :roc’s observations at

I observed no difference in test results when running on my laptop: the same tests passed and failed with or without rr, and intermittent failures occurred with approximately the same frequency with or without rr. (This may not be universal; others have noted differences.)

So my experience with rr has been very encouraging: If I can reproduce an intermittent test failure on my laptop, I can record it with rr, then debug it at my leisure and benefit from rr “extras” like reverse execution. This seems great!

I still have a concern about the practical application of rr to recording intermittent failures reported on treeherder…I’ll try to write a follow-up post on that soon.

December 18, 2015 06:47 PM

December 05, 2015

Alice Scarpa

How I made Treeherder do something it was not meant to

For the past couple of months I have been working on integrating Try Extender with Treeherder. The goal was to add an “Add new jobs” button to Treeherder that would display every possible job for that push. Users would then be able to click on the jobs they want to trigger them.

It was a fun project in which I had a lot of help from the Treeherder team and I ended up learning a little about how TH works.

How Treeherder shows jobs

For every push, Treeherder makes a request to its API to obtain a JSON object with every job for that push and their respective symbols, statuses, types, platforms and whatever else is needed to correctly display them. Every single one of these jobs has an id and a row in Treeherder’s job database.

Buildbot jobs enter TH’s job database as part of the ETL layer. Celery tasks parse JSON files that are generated every minute by BuildAPI.

Runnable jobs database

Treeherder already knows how to get a list of jobs from an API endpoint and display them in the right places (if you are curious, mapResultSetJobs carries most of the weight). All I needed to do was add a new API endpoint with the list of every possible job (and the associated information).

To feed the information to the new endpoint, I created a table of runnable jobs. Jobs enter this new table through a daily task that downloads and processes allthethings.json.
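That daily task essentially walks allthethings.json for buildbot builder names. A sketch, under the assumption (worth verifying against the file you actually download) that allthethings.json carries a top-level "builders" mapping keyed by buildername:

```python
# Sketch: extracting runnable-job rows from a parsed
# allthethings.json-style dict. The "builders"/"properties" layout
# is an assumption about the file format of the time.
def extract_runnable_jobs(allthethings):
    """Yield (buildername, properties) pairs for every builder."""
    for buildername, info in allthethings.get("builders", {}).items():
        yield buildername, info.get("properties", {})

# Tiny illustrative input:
sample = {
    "builders": {
        "Ubuntu VM 12.04 b2g-inbound opt test mochitest-1": {
            "properties": {"branch": "b2g-inbound"},
        }
    }
}
jobs = dict(extract_runnable_jobs(sample))
```

Each resulting pair becomes a row in the runnable jobs table that the new API endpoint serves.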

Setting things up

With the database part ready, some things had to be done on the UI side. An (extremely reasonable) assumption made by Treeherder is that it will only show jobs that exist. Since runnable jobs don’t exist, I had to create a new type of job button that would not open the information panel and that would allow users to click on several jobs at the same time.

The triggering part was done by sending Pulse messages to Pulse Actions, which would then schedule jobs using mozci and releng’s amazing BuildBot Bridge (armenzg did a great job adding BBB support to mozci).

Possible improvements

The UX is not very intuitive.

Selecting several jobs is very annoying. One idea to fix that is to have a keyboard shortcut to “select all visible jobs”, so users could use the search box to filter only the jobs they wanted (e.g. “e10s”) and select everything that is showing.

Known problems

Since the triggering part happens in Pulse Actions and the selecting part happens in Treeherder, we don’t tell users what happened with their requests. Until bug 1032163 lands, only the push author and people with an “” email address will be able to extend pushes. Right now we have no way of telling users that their request was denied.

We can schedule test jobs when no build job exists, and we can trigger test jobs when the build job is already completed. But when the build job is currently running/pending, we don’t trigger anything. We could either trigger an additional build job or do nothing, and we choose to do nothing to avoid triggering costly unnecessary build jobs.

What about TaskCluster jobs?

Currently “Add new jobs” only supports triggering Buildbot jobs. What is needed to support TaskCluster jobs? 2 things:

If anyone is interested in working on this, please ping me (I’m adusca on IRC), or we can talk more about it in Mozlando ;)

December 05, 2015 12:00 AM

December 04, 2015

mozregression updates

GUI Release 0.6.0

After mozregression 2.0, it is time for the GUI to follow!

The 0.6.0 GUI release is based on the changes from mozregression 2.0: the bisection flow is now updated, and starting a bisection should be a lot easier since it no longer asks for a nightly or inbound bisection kind - simply choose an application, possibly a branch and some options (build type, bits), then choose the regression range based on dates, release numbers, build ids or raw changesets.

That’s all. :)

All in all, a great simplification of the interface and more power. Give it a try!

December 04, 2015 12:00 AM

December 03, 2015

Henrik Skupin

Results of the Firefox Automation Survey

On November 23rd I blogged about the active survey covering the information flow inside our Firefox Automation team. The survey was open until November 30th, and I thank every participant who took the time to fill it out. Below you can find the results:

Most of the contributors following our activities have been with Mozilla for the last 3 years, and half of them joined less than a year ago. There is also a 1:1 split between volunteers and paid staff members. This is most likely an artifact of the low number of responses, but in any case increasing the number of volunteers is certainly something we want to follow up on in the next months.

The question about which communication channel is preferred for getting the latest news was answered with 78% for the automation mailing list. I find this a strange result, given that we haven't really used that list for active discussions in the past months. But it means we should put more focus on the list. Beside that, 55% follow our activities on Bugzilla via component watchers. I would assume those are mostly our paid staff, who have to follow each other's work regarding reviews, needinfo requests, and process updates. 44% read our blog posts on the Mozilla A-Team Planet. So in the future we will put more focus on both blog posts and discussions on the mailing list.

More than half of our followers check for updates at least once a day. So when we get started with interesting discussions I would expect good activity throughout the day.

44% feel less informed about our current activities, and another 33% answered this question with ‘Mostly’. That confirms what I already suspected, and it clearly needs action on our side to be more communicative. Doing so might also bring more people into our active projects, where mentoring would be much more valuable and time-effective than handling drive-by projects which we cannot fully support.

The type of news most requested is definitely the latest changes and code landings from contributors. Covering those will ensure people feel recognized, and contributors will also know each other's work and see its effectiveness with regard to our project goals. Discussions about various automation-related topics (as mentioned above) are also highly wanted. Other topics like quarterly goals and current status updates are wanted as well, and we will see how we can do that. We might be able to fold those general updates into the Engineering Productivity updates, which are pushed out twice a month via the A-Team Planet.

There is also a bit of confusion about the Firefox Automation team and how it relates to the Engineering Productivity team (formerly A-Team). Effectively we are all part of the latter, and the “virtual” Automation team was only created when we got shifted back and forth between the A-Team and QA-Team. This will not happen anymore, so we agreed to retire the name.

All in all there are some topics which will need further discussions. I will follow-up with another blog post soon which will show off our plans for improvements and how we want to work to make it happen.

December 03, 2015 11:59 AM

December 01, 2015

mozregression updates

Release 2.0.0

2.0.0 is a major release of mozregression, as we changed the bisection flow based on ideas in the post I wrote a couple of weeks ago.

Now mozregression will automatically detect a merge commit and switch to bisecting the branch where the merged commits come from. So mozilla-inbound is no longer the default for Firefox when bisecting by date - there is no default now.

Based on that, we have been able to simplify the overall usage of mozregression:

These changes add some possibilities for bisecting that were not available before, like bisecting using changesets on mozilla-central, or specifying only a good changeset (the bad changeset will be implied, and will be the most recent one).

Some examples:

# bisect using dates
mozregression -g 2015-11-20 -b 2015-11-25  # implied branch is m-c
mozregression -g 2015-11-20 -b 2015-11-25 --repo inbound
# bisect using changesets
mozregression -g dcd5230c4ce1 -b 931721112d8e  # implied branch is m-i
mozregression -g 1b2e15608f34 -b abbd213422a5 --repo m-c
# use debug builds
mozregression -g 2015-11-20 -b 2015-11-25 -B debug
mozregression -g dcd5230c4ce1 -b 931721112d8e -B debug
# of course, --launch works the same way now
mozregression --launch abbd213422a5 --repo m-c
mozregression --launch 2015-11-25 --repo m-i -B debug

Just keep in mind that when you use a changeset, the default branch will be the default integration branch for the application instead of the release branch. For firefox, mozilla-inbound will be the default when you use a changeset, and mozilla-central will be used otherwise. This is historical and we may change that in the future - for now just keep that in mind, or always specify a branch to be sure.
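That default-branch rule can be sketched roughly like this (simplified and illustrative; mozregression's real input handling covers more forms than a bare hex test):

```python
import re

# Illustrative sketch of the rule described above: a changeset implies the
# integration branch, anything else (e.g. a date) implies the release branch.
CHANGESET_RE = re.compile(r"^[0-9a-f]{6,40}$")

def default_branch(good, bad,
                   integration="mozilla-inbound", release="mozilla-central"):
    if CHANGESET_RE.match(good) or CHANGESET_RE.match(bad):
        return integration
    return release

print(default_branch("2015-11-20", "2015-11-25"))      # mozilla-central
print(default_branch("dcd5230c4ce1", "931721112d8e"))  # mozilla-inbound
```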

See Bugs 1095058, 1210013, 1225544, 1228951, 1225541 and 1228857 for a description of the technical implementation.

December 01, 2015 12:00 AM

November 26, 2015

Armen Zambrano G. (@armenzg)

Mozhginfo/Pushlog client released

If you've ever spent time trying to query revision metadata from hg, you can now use a Python library we've released to do so.

In bug 1203621 [1], our community contributor @MikeLing has helped us release the module we had written for Mozilla CI tools.

You can find the pushlog_client package in here [3] and you can find the code in here [4]

Thanks MikeLing!


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

November 26, 2015 03:17 PM

November 24, 2015

Armen Zambrano G. (@armenzg)

Welcome F3real, xenny and MikeLing!

As described by jmaher, this week we started the first week of mozci's quarter of contribution.

I want to personally welcome Stefan, Vaibhav and Mike to mozci. We hope you get to learn and we thank you for helping Mozilla move forward in this corner of our automation systems.

I also want to give thanks to Alice for committing to mentoring. This would not be possible without her help.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

November 24, 2015 05:58 PM

Mozilla CI tools meet up

In order to help the contributors' of mozci's quarter of contribution, we have set up a Mozci meet up this Friday.

If you're interested in learning about Mozilla's CI, how to contribute, or how to build your own scheduling with mozci, come and join us!

9am ET -> other time zones
Vidyo room:

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

November 24, 2015 05:52 PM

mozregression updates

Release 1.2.0 and GUI release 0.5.0

Minor releases of the command line and GUI mozregression tools.

On the command line side:

mozregression --launch 20151102030241

On the GUI side:

November 24, 2015 12:00 AM

November 23, 2015

Henrik Skupin

Survey about sharing information inside the Firefox Automation team

Within the Firefox Automation team we were struggling a bit to share information about our work over the last couple of months. That mainly happened because I was alone and not able to blog more often than once a quarter. The same applies to our dev-automation mailing list, which mostly only received emails from Travis CI with testing results.

Given that the team has now grown to 4 people (beside me this is Maja Frydrychowicz, Syd Polk, and David Burns), we want to be more open again and also try to get more people involved in our projects. To ensure that we do not use the wrong communication channels – depending on where most of our readers are – I have set up a little survey. It will only take you a minute to go through, but it will help us a lot to know more about the preferences of our automation geeks. So please take that little time and help us.

The survey can be found here and is open until end of November 2015:

Thank you a lot!

November 23, 2015 09:40 PM

November 20, 2015

Geoff Brown

Running and debugging Firefox for Android with mach

Recent updates to mach provide support for running and debugging Firefox for Android.

When run from a Firefox for Android context, ‘mach run’ starts Firefox on a connected Android device. As with other Android mach commands, if no device is found, mach offers to start an emulator, and if Firefox is not installed, mach offers to install it.

gbrown@mozpad:~/src$ ./mach run
No Android devices connected. Start an emulator? (Y/n) y 
Starting emulator running Android 4.3...
It looks like Firefox is not installed on this device.
Install Firefox? (Y/n) y
Installing Firefox. This may take a while...
 1:22.97 /usr/bin/make -C . -j8 -s -w install
 1:32.04 make: Entering directory `/home/gbrown/objdirs/droid'
 1:47.48 2729 KB/s (42924584 bytes in 15.358s)
 1:48.22     pkg: /data/local/tmp/
 2:05.97 Success
 2:06.34 make: Leaving directory `/home/gbrown/objdirs/droid'
Starting: Intent { act=android.activity.MAIN cmp=org.mozilla.fennec_gbrown/.App }

Parameters can be passed to Firefox on the command line. For example, ‘mach run –guest’ starts Firefox in guest mode.

mach also supports gdb-based debugging with JimDB, :jchen’s celebrated fork of gdb for Firefox for Android. ‘mach run –debug’ starts JimDB. If necessary, mach will even fetch, install, and configure JimDB for you.

  $ ./mach run --debug
  JimDB (arm) not found: /home/gbrown/.mozbuild/android-device/jimdb-arm does not exist
  Download and setup JimDB (arm)? (Y/n) y
  Installing JimDB (linux64/arm). This may take a while...
   * [new branch]      master     -> origin/master
   * [new tag]         gdbutils-2 -> gdbutils-2
   * [new tag]         initial-release -> initial-release
   1:45.57 /home/gbrown/.mozbuild/android-device/jimdb-arm/bin/gdb -q --args 

  Fennec GDB utilities
    (see utils/gdbinit and utils/gdbinit.local on how to configure settings)
  1. Debug Fennec (default)
  2. Debug Fennec with env vars and args
  3. Debug using jdb
  4. Debug content Mochitest
  5. Debug compiled-code unit test
  6. Debug Fennec with pid
  Enter option from above: 1

  New ADB device is "emulator-5554"
  Using device emulator-5554
  Using object directory: /home/gbrown/objdirs/droid
  Set sysroot to "/home/gbrown/.mozbuild/android-device/jimdb-arm/lib/emulator-5554".
  Updated solib-search-path.
  Ignoring BHM signal.
  Using package org.mozilla.fennec_gbrown.
  Launching org.mozilla.fennec_gbrown... Done
  Attaching to pid 674... Done
  Setting up remote debugging... Done

  Ready. Use "continue" to resume execution.
  : No such file or directory.

See for more info on JimDB.

November 20, 2015 04:05 PM

Joel Maher

Introducing the contributors for the MozCI Project

As I previously announced who will be working on Pulse Guardian, the Web Platform Tests Results Explorer, and the  Web Driver Infrastructure projects, I would like to introduce the contributors for the 4th project this quarter, Mozilla CI Tools – Polish and Packaging:

* MikeLing (:mikeling on IRC) –

What interests you in this specific project?

As its document described, Mozilla CI Tools is designed to allow interacting with the various components which compose Mozilla’s Continuous Integration. So, I think get involved into it can help me know more about how Treeherder and Mozci works and give me better understanding of A-team.

What do you plan to get out of this after 8 weeks?

Keep trying my best to contribute! Hope I can push this project forward with Armen, Alice and other contributors in the future :)

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I’m a guy who would like to keep challenge myself and try new stuff.

* Stefan (:F3real on IRC) –

What interests you in this specific project?

I thought it would be good starting project and help me learn new things.

What do you plan to get out of this after 8 weeks?

Expand my knowledge and meet new people.

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I play guitar but I don’ t think that’s really interesting.

* Vaibhav Tulsyan (:xenny on IRC) –

What interests you in this specific project?

Continuous Integration, in general, is interesting for me.

What do you plan to get out of this after 8 weeks?

I want to learn how to work efficiently in a team in spite of working remotely, learn how to explore a new code base and some new things about Python, git, hg and Mozilla. Apart from learning, I want to be useful to the community in some way. I hope to contribute to Mozilla for a long term, and I hope that this helps me build a solid foundation.

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

One of my hobbies is to create algorithmic problems from real-world situations. I like to think a lot about the purpose of existence, how people think about things/events and what affects their thinking. I like teaching and gaining satisfaction from others’ understanding.


Please join me in welcoming all the contributors to this project and the previously mentioned ones as they have committed to work on a larger project with their free time!

November 20, 2015 03:37 PM

Introducing a contributor for the WebDriver Infrastructure project

As I previously announced who will be working on Pulse Guardian and the Web Platform Tests Results Explorer, let me introduce who will be working on Web Platform Tests – WebDriver Infrastructure:

* Ravi Shankar (:waffles on IRC) –

What interests you in this specific project?

There are several. Though I love coding, I’m usually more inclined to Python & Rust (so, a “Python project” is what excited me at first). Then, my recently-developed interest in networking code (ever since my work on a network-related issue in Servo), and finally, I’m very curious about how we’re establishing the Python-JS communication and emulate user inputs.

What do you plan to get out of this after 8 weeks?

Over the past few months of my (fractional) contributions to Mozilla, I’ve always learned something useful whenever I finish working on a bug/issue. Since this is a somewhat “giant” implementation that requires more time and commitment, I think I’ll learn some great deal of stuff in relatively less time (which is what excites me).

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

Well, I juggle, or I (try to) reproduce some random music in my flute (actually, a Bansuri – Indian flute) when I’m away from my keyboard.


We look forward to working with Ravi over the next 8 weeks.  Please say hi in irc when you see :waffles in channel:)

November 20, 2015 03:25 PM

Introducing 2 contributors for the Web Platform Tests project

As I previously announced who will be working on Pulse Guardian, let me introduce who will be working on Web Platform Tests – Results Explorer:

* Kalpesh Krishna (:martianwars on irc) –

What interests you in this specific project?

I have been contributing to Mozilla for a couple of months now and was keen on taking up a project on a slightly larger scale. This particular project was recommended to me by Manish Goregaokar. I had worked out a few issues in Servo prior to this and all involved Web Platform Tests in some form. That was the initial motivation. I find this project really interesting as it gives me a chance to help build an interface that will simplify browser comparison so much! This project seems to have more of planning rather than execution, and that’s another reason that I’m so excited! Besides, I think this would be a good chance to try out some statistics / data visualization ideas I have, though they might be a bit irrelevant to the goal.

What do you plan to get out of this after 8 weeks?

I plan to learn as much as I can, make some great friends, and most importantly make a real sizeable contribution to open source:)

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I love to star gaze. Constellations and Messier objects fascinate me. Given a chance, I would love to let my imagination run wild and draw my own set of constellations! I have an unusual ambition in life. Though a student of Electrical Engineering, I have always wanted to own a chocolate factory (too much Roald Dahl as a child) and have done some research regarding the same. Fingers crossed! I also love to collect Rubiks Cube style puzzles. I make it a point to increase my collection by 3-4 puzzles every semester and learn how to solve them. I’m not fast at any of them, but love solving them!

* Daniel Deutsch

What interests you in this specific project?

I am really interested in getting involved in Web Standards. Also, I am excited to be involved in a project that is bigger than itself–something that spans the Internet and makes it better for everyone (web authors and users).

What do you plan to get out of this after 8 weeks?

As primarily a Rails developer, I am hoping to expand my skill-set. Specifically, I am looking forward to writing some Python and learning more about JavaScript. Also, I am excited to dig deeper into automated testing. Lastly, I think Mozilla does a lot of great work and am excited to help in the effort to drive the web forward with open source contribution.

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I live in Brooklyn, NY and have terrible taste in music. I like writing long emails, running, and Vim.


We look forward to working with these great 2 hackers over the next 8 weeks.

November 20, 2015 03:20 PM

Introducing a contributor for the Pulse Guardian project

3 weeks ago we announced the new Quarter of Contribution, today I would like to introduce the participants.  Personally I really enjoy meeting new contributors and learning about them. It is exciting to see interest in all 4 projects.  Let me introduce who will be working on Pulse Guardian – Core Hacker:

Mike Yao

What interests you in this specific project?

Python, infrastructure

What do you plan to get out of this after 8 weeks?

Continue to contribute to Mozilla

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

Cooking/food lover, I was chef long time ago. Free software/Open source and Linux changed my mind and career.


I do recall one other eager contributor who might join in late when exams are completed; meanwhile, enjoy learning a bit about Mike Yao (who was introduced to Mozilla by Mike Ling, who did our first ever Quarter of Contribution).

November 20, 2015 03:14 PM

November 14, 2015

Julien Pagès

mozregression – new way for handling merges

I am currently investigating how we can make mozregression smarter to handle merges, and I will explain how in this post.


While bisecting builds with mozregression on mozilla-central, we often end up with a merge commit. These commits often incorporate many individual changes, consider for example this url for a merge commit. A regression will be hard to find inside such a large range of commits.

How mozregression currently works

Once we reach a one-day range by bisecting mozilla-central or a release branch, we keep the most recent tested commit and use it as the end of a new range to bisect mozilla-inbound (or another integration branch, depending on the application). The beginning of that mozilla-inbound range is a commit found 4 days before the date the commit was pushed on mozilla-central, to be sure we won't miss any commit in mozilla-central.
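The 4-day back-off is simple date arithmetic; a sketch (the function name and date format are illustrative, not mozregression's actual code):

```python
from datetime import datetime, timedelta

def inbound_start_date(mc_push_date, days_back=4):
    """Sketch of the current heuristic: start the integration-branch range
    `days_back` days before the mozilla-central push date, so no commit
    can be missed. Names here are illustrative only."""
    push_date = datetime.strptime(mc_push_date, "%Y-%m-%d")
    return (push_date - timedelta(days=days_back)).strftime("%Y-%m-%d")

print(inbound_start_date("2015-10-02"))  # 2015-09-28
```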

But there are multiple problems. First, it is not always the case that the offending commit really comes from m-i. It could be from any other integration branch (fx-team, b2g-inbound, etc.). Second, bisecting over a 4-day range in mozilla-inbound may involve testing a lot of builds, some of which are useless to test.

Another approach

How can we improve this ? As just stated, there are two points that can be improved:

So, how can this be achieved ? Here is my current approach (technical):

  1. Once we are done with the nightlies (one build per day) from a bisection on m-c or any release branch, switch to taskcluster to download possible builds in between. This way we reduce the range to two pushes (one good, one bad) instead of a full day. Since we tested both, only the commits in the most recent push may contain the regression.
  2. Read the commit message of the topmost commit in the most recent push. If it does not look like a merge commit, then we can't do anything more (maybe this is genuinely not a merge, in which case we are done).
  3. We have a merge push. So now we try to find the exact commits around, on the branch where the merged commits come from.
  4. Bisect this new push range using the changesets and the branch found above, reduce that range and go to 2.

Let’s take an example:

mozregression -g 2015-09-20 -b 2015-10-10

We are bisecting firefox, on mozilla-central. Let's say we end up with a range 2015-10-01 – 2015-10-02. This is how the pushlog will look at the end: 4 pushes and more than 250 changesets.

Now mozregression will automatically reduce the range (still on mozilla-central) by asking you good/bad for those remaining pushes. So, we would end up with two pushes – one we know is good because we tested the top most commit, and the other we know is bad for the same reason. Look at the following pushlog, showing what is still untested (except for the merge commit itself) – 96 commits, coming from m-i.

And then mozregression will detect that it is a merge push from m-i, so automatically it will let you bisect this range of pushes from m-i. That is, our 96 changesets from m-c now converted to testable pushes in m-i. And we will end with a smaller range, for example this one where it will be easy to find our regression because this is one push without any merge.


Note that both methods would have worked for the example above, mainly because we end up in commits originating from m-i. I tried another bisection, this time trying to find a commit in fx-team – in that case the current mozregression is simply lost, but the new method handled it well.

Also, using the current method, the example above would have required around 7 steps after reducing to the one-day range. The new approach can achieve the same with around 5 steps.

Last but not least, this new flow is much cleaner:

  1. start to bisect from a given branch. Reduce the range to one push on that branch.
  2. if we found a merge, find the branch, the new pushes, and go to 1 to bisect some more with this new data. Else we are done.

Is this applicable ?

Well, it relies on two things. The first (and we already rely on this a bit currently) is that a merged commit can be found in the branch it comes from, using the changeset. I have to ask the vcs gurus whether that is reliable, but from my tests it works well.

The second is that we need to detect a merge commit – and which branch the commits come from. Thanks to the consistency of the sheriffs' commit messages, this is easy.
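A toy sketch of such merge-message detection (the pattern is illustrative of the sheriffs' convention; mozregression's real parsing would need to be more careful):

```python
import re

# Illustrative only: sheriffs' merge messages follow a predictable pattern
# such as "Merge mozilla-inbound to mozilla-central a=merge".
MERGE_RE = re.compile(r"^merge (?P<src>[\w-]+) to (?P<dst>[\w-]+)", re.IGNORECASE)

def merged_from(commit_message):
    """Return the source branch if the message looks like a merge, else None."""
    m = MERGE_RE.match(commit_message)
    return m.group("src") if m else None

print(merged_from("Merge mozilla-inbound to mozilla-central a=merge"))  # mozilla-inbound
print(merged_from("Bug 1234 - fix a thing; r=someone"))                 # None
```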

Even if it is not applicable everywhere for some reason, it appears that it often works. Using this technique would result in a more accurate and helpful bisection, with speed gain and increased chances to find the root cause of a regression.

This needs some more thinking and testing to determine the limits (what if this doesn't work? Should we, or can we, fall back to the old method in that case?), but this is definitely something I will explore more to improve the usefulness of mozregression.

November 14, 2015 09:42 AM

November 11, 2015

Joel Maher

Adventures in Task Cluster – running a custom Docker image

I needed to get compiz on the machine (bug 1223123), and I thought it should be on the base image.  So, taking the path of most resistance, I dove deep into the internals of taskcluster/docker and figured this out.  To be fair, :ahal gave me a lot of pointers; in fact, if I had taken better notes this might have been a single pass to success vs. 3 attempts.

First let me explain a bit about how the docker image is defined and how it is referenced and built up.  We define the image to use in-tree.  In this case we are using taskcluster/desktop-test:0.4.4 for the automation jobs.  If you look carefully at the definition in-tree, we have a Dockerfile which outlines which base image we inherit from:

FROM          taskcluster/ubuntu1204-test-upd:

This means there is another image called ubuntu1204-test-upd, and this is also defined in tree which then references a 3rd image, ubuntu1204-test.  These all build upon each other creating the final image we use for testing.  If you look in each of the directories, there is a REGISTRY and a VERSION file, these are used to determine the image to pull, so in the case of wanting:

docker pull taskcluster/desktop-test:0.4.4

we would effectively be using:

docker pull {REGISTRY}/desktop-test:{VERSION}

For our use case, taskcluster/desktop-test is defined on Docker Hub.  This means that you could create a new version of the ‘desktop-test’ container and use it while pushing to try.  In fact that is all that is needed.
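The way REGISTRY and VERSION combine into the image tag can be sketched like this (the temp directory and values are stand-ins for the real in-tree files):

```shell
# Illustrative sketch of how the in-tree REGISTRY and VERSION files
# combine into the docker image tag. Paths and values are stand-ins.
IMAGE_DIR=$(mktemp -d)
echo taskcluster > "$IMAGE_DIR/REGISTRY"
echo 0.4.4 > "$IMAGE_DIR/VERSION"
IMAGE="$(cat "$IMAGE_DIR/REGISTRY")/desktop-test:$(cat "$IMAGE_DIR/VERSION")"
echo "$IMAGE"   # prints the tag you would pass to `docker pull`
```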

First let's talk about how to create an image.  I found that I needed to create both a desktop-test and an ubuntu1204-test image on Docker Hub.  Luckily there is a script in-tree which will take a currently running container and make a convenient package ready to upload; the steps would be:

  • docker pull taskcluster/desktop-test:0.4.4
  • docker run taskcluster/desktop-test:0.4.4
  • apt-get install compiz; # inside of docker, make modifications
  • # now on the host we prepare the new image (using elvis314 as the docker hub account)
  • echo elvis314 > testing/docker/desktop-test/REGISTRY
  • echo 0.4.5 > testing/docker/desktop-test/VERSION  # NOTE: I incremented the version
  • cd testing/docker
  • docker-test # go run a 5K
  • docker push elvis314/desktop-test # go run a 10K

Those are the simple steps to update an image; now we want to verify the image has what we need.  While I am not an expert in docker, I do like to keep my containers under control, so I do a |docker ps -a| and then a |docker rm <cid>| for any containers that are old.  Now to verify, I do this:

  • docker pull elvis314/desktop-test:0.4.5
  • docker run elvis314/desktop-test:0.4.5
  • compiz # this verifies my change is there, I should get an error about display not found!

I will continue on here assuming things are working.  As you saw earlier, I had modified files in testing/docker/desktop-test; these should be part of a patch to push to try.  In fact that is all the magic.  To actually use compiz successfully, I needed to add a line launching |compiz &| after Xvfb is initialized.

Now when you push to try with your patch, any tasks that used taskcluster/desktop-test before will use the new image (i.e. elvis314/desktop-test).  In this case I was able to see the test cases that opened dialogs and windows pass on try!

November 11, 2015 03:03 PM

Dan Minor

Autoland Update

Today I made the first successful autolanding to mozilla-inbound from MozReview. We’ve also been dogfooding autolanding to version-control-tools for the past few weeks without running into any problems, although the volume of commits to mozilla-inbound will give the automatic rebase and retry code more exercise than it receives when landing to version-control-tools.

Bug 1220214 tracks the workflow and user interface improvements we want to make before we enable autolanding to mozilla-inbound for everyone. The largest (and riskiest) change we want to make is to enable automatic rewriting of commit summaries to reflect who actually granted a “ship-it” in MozReview.

Without this work, people would have to amend their commits prior to landing to replace any r? with an r=, which makes autolanding much less useful. I recently fixed Bug 1160479, which was the Autoland service portion of the rewriting. Glob is nearly done with Bug 1220232, the MozReview portion, which determines the new commit summary and provides a confirmation dialog to the user.
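A toy sketch of that kind of summary rewrite (the reviewer nicks, regex, and rule shown are illustrative only, not Autoland's actual implementation):

```python
import re

# Illustrative sketch: replace "r?nick" with "r=nick" only for reviewers
# who actually granted a ship-it. Not MozReview/Autoland's real code.
def rewrite_summary(summary, shipit_nicks):
    def sub(match):
        nick = match.group(1)
        return "r=" + nick if nick in shipit_nicks else match.group(0)
    return re.sub(r"r\?(\w+)", sub, summary)

print(rewrite_summary("Bug 1 - Fix thing; r?gps r?smacleod", {"gps"}))
# Bug 1 - Fix thing; r=gps r?smacleod
```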

November 11, 2015 01:27 PM

November 09, 2015

Joel Maher

Adventures in Task Cluster – Running tests locally

There is a lot of promise around Taskcluster (the replacement for BuildBot in our CI system at Mozilla) to be the best thing since sliced bread.  One of the deliverables on the Engineering Productivity team this quarter is to stand up the Linux debug tests on Taskcluster in parallel to running them normally via Buildbot.  Of course next quarter it would be logical to turn off the BuildBot tests and run tests via Taskcluster.

This post will outline some of the things I did to run the tests locally.  What is neat is that we run the taskcluster jobs inside a Docker image (yes this is Linux only), and we can download the exact OS container and configuration that runs the tests.

I started out with a try server push which generated some data and a lot of failed tests.  Sadly I found that the Treeherder integration was not really there for results.  We have a fancy popup in Treeherder when you click on a job, but for Taskcluster jobs all you need is to find the link to inspect the task.  When you inspect a task, it takes you to a Taskcluster-specific page that has information about the task.  In fact you can watch a test run live (at least from the log-output point of view).  In this case my test job is completed and I want to see the errors in the log, so I can click on the link for live.log and search away.  The other piece of critical information is the ‘Task‘ tab at the top of the inspect-task page.  Here you can see the details about the docker image used, what binaries and other files were used, and the golden nugget at the bottom of the page: the “Run Locally” script!  You can cut and paste this script into a bash shell and theoretically reproduce the exact same failures!

As you can imagine, this is exactly what I did, and it didn’t work!  Luckily in the #taskcluster channel there were a lot of folks to help me get going.  The problem I had was that I didn’t have a v4l2loopback device available.  This is interesting because we need this in many of our unittests, and it means that our host operating system running docker needs to provide video/audio devices for the docker container to use.  Now it is time to hack this up a bit; let me start:

First, let's pull down the docker image used (from the run locally script):

docker pull 'taskcluster/desktop-test:0.4.4'

Next, let's prepare my local host machine to run by installing/setting up v4l2loopback:

sudo apt-get install v4l2loopback-dkms

sudo modprobe v4l2loopback devices=2

Now we can try to run docker again, this time adding the --device option:

docker run -ti \
  --name "${NAME}" \
  --device=/dev/video1:/dev/video1 \
  -e MOZHARNESS_SCRIPT='mozharness/scripts/' \
  -e MOZHARNESS_CONFIG='mozharness/configs/unittests/ mozharness/configs/' \
  -e GECKO_HEAD_REV='5e76c816870fdfd46701fd22eccb70258dfb3b0c' \

Now when I run the test command, I don’t get v4l2loopback failures!

bash /home/worker/bin/ --no-read-buildbot-config '--installer-url=' '--test-packages-url=' '--download-symbols=ondemand' '--mochitest-suite=browser-chrome-chunked' '--total-chunk=7' '--this-chunk=1'

In fact, I get the same failures as I did when the job originally ran :)  This is great, except for the fact that I don’t have an easy way to run the test by itself, debug, or watch the screen; let me go into a few details on that.

Given a failure in browser/components/search/test/browser_searchbar_keyboard_navigation.js, how do we get more information on that?  Locally I would do:

./mach test browser/components/search/test/browser_searchbar_keyboard_navigation.js

Then at least see if anything looks odd in the console, on the screen, etc.  I might look at the test and see where we are failing to give me more clues.  How do I do this in a docker container?  The command above to run the tests calls a setup script, which then runs as the user ‘worker’ (not as user root).  It is important that we use the ‘worker’ user, as the pactl program to find audio devices will fail as root.  Now what happens is we set up the box for testing, including running pulseaudio, Xvfb, compiz (after bug 1223123), and bootstrapping mozharness.  Finally we call the mozharness script to run the job we care about, in this case ‘mochitest-browser-chrome-chunked’, chunk 1.  It is important to follow these details because mozharness downloads all python packages, tools, firefox binaries, other binaries, test harnesses, and tests.  Then we create a python virtualenv to set up the python environment to run the tests, while putting all the files and unpacking them in the proper places.  Now mozharness can call the test harness (python --browser-chrome …).  Given this overview of what happens, it seems as though we should be able to run: <params> --test-path browser/components/search/test

Why this doesn’t work is that mozharness has no method for passing in a directory or single test, let alone doing other simple things that |./mach test| allows.  In fact, in order to run this single test, we need to:

Of course most of this is scripted; how can we take advantage of our scripts to set things up for us?  What I did was hack the script locally to not run mozharness and instead echo the command, and likewise with the mozharness script to echo the test harness call instead of calling it.  Here are the commands I ended up using:

  • bash /home/worker/bin/ --no-read-buildbot-config '--installer-url=' '--test-packages-url=' '--download-symbols=ondemand' '--mochitest-suite=browser-chrome-chunked' '--total-chunk=7' --this-chunk=1
  • #now that it failed, we can do:
  • cd workspace/build
  • . venv/bin/activate
  • cd ../build/tests/mochitest
  • python --app ../../application/firefox/firefox --utility-path ../bin --extra-profile-file ../bin/plugins --certificate-path ../certs --browser-chrome browser/browser/components/search/test/
  • # NOTE: you might not want --browser-chrome or the specific directory, but you can adjust the parameters used

This is how I was able to run a single directory, and then a single test.  Unfortunately that just proved that I could hack around the test case a bit and look at the output.  In docker there is no simple way to view the screen.   To solve this I had to install x11vnc:

apt-get install x11vnc

Assuming the Xvfb server is running, you can then do:

x11vnc &

This allows you to connect with VNC to the docker container!  The problem is you need the IP address.  I then need to get the IP address from the host by doing:

docker ps #find the container id (cid) from the list

docker inspect <cid> | grep IPAddress

For me this prints the container’s IP address, and now from my host I can do:


This is great as I can now see what is going on with the machine while the test is running!
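The IP lookup above can also be scripted instead of eyeballing `docker inspect | grep`. Here is a minimal Python sketch, assuming you feed it the JSON that `docker inspect <cid>` prints (the sample data below is made up; in real use you would capture the output with `subprocess`):

```python
import json

# `docker inspect <cid>` prints a JSON array with one object per container.
# Made-up sample output; in practice, obtain it via something like:
#   subprocess.check_output(["docker", "inspect", cid])
sample_output = """
[
  {
    "Id": "abc123",
    "NetworkSettings": {
      "IPAddress": "172.17.0.2"
    }
  }
]
"""

def container_ip(inspect_json):
    """Extract the IPAddress field from docker inspect output."""
    containers = json.loads(inspect_json)
    return containers[0]["NetworkSettings"]["IPAddress"]

print(container_ip(sample_output))  # 172.17.0.2
```

You could then point your VNC viewer at whatever address this returns.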

This is it for now.  I suspect in the future we will make this simpler by doing:

Stay tuned for my next post on how to update your own custom TaskCluster image: yes, it is possible if you are patient.

November 09, 2015 08:49 PM

November 05, 2015

Jonathan Griffin

Engineering Productivity Update, November 5, 2015

It’s the first week of November, and because of the December all-hands and the end-of-year holidays, this essentially means the quarter is half over. You can see what the team is up to and how we’re tracking against our deliverables with this spreadsheet.

Highlights

gps did some interesting work investigating ways to increase cloning performance on Windows; it turns out closing files which have been appended is a very expensive process there. He also helped roll out bundle-related cloning improvements in Mercurial 3.6.

Community: jmaher has posted details about our newest Quarter of Contribution. One of our former Outreachy interns, adusca, has blogged about what she gets out of contributing to open source software.

MozReview and Autoland: dminor blogged about the most recent MozReview work week in Toronto. Meanwhile, mcote is busy trying to design a more intuitive way to deal with parent-child review requests. And glob, who is jumping in to help out with MozReview, has created a high-level diagram sketching out MozReview’s primary components and dependencies.

Autoland has been enabled for the version-control-tools repo and is being dogfooded by the team. We hope to have it turned on for landings to mozilla-inbound within a couple of weeks.

Treeherder: the team is in London this week working on the automatic starring project. They should be rolling out an experimental UI soon for feedback from sheriffs and others. armenzg has fixed several issues with automatic backfilling so it should be more useful.

Perfherder: wlach has blogged about recent improvements to Perfherder, including the ability to track the size of the Firefox installer.

Developer Workflows: gbrown has enabled |mach run| to work with Android.

TaskCluster Support: the mochitest-gl job on linux64-debug is now running in TaskCluster side-by-side with buildbot. Work is ongoing to green up other suites in TaskCluster. A few other problems (like failure to upload structured logs) need to be fixed before we can turn off the corresponding buildbot jobs and make the TaskCluster jobs “official”.

e10s Support: we are planning to turn on e10s tests on Windows 7 as they are greened up; the first job which will be added is the e10s version of mochitest-gl, and the next is likely mochitest-devtools-chrome. To help mitigate capacity impacts, we’ve turned off Windows XP tests by default on try in order to allow us to move some machines from the Windows XP pool to the Windows 7 pool, and some machines have already been moved from the Linux 64 pool (which only runs Talos and Android x86 tests) to the Windows 7 pool. Combined with some changes recently made by Releng, Windows wait times are currently not problematic.

WebDriver: ato, jgraham and dburns recently went to Japan to attend W3C TPAC to discuss the WebDriver specification. They will be extending the charter of the working group to get it through to CR. This will mean certain parts of the specification need to be finished as soon as possible to start getting feedback.

The Details


Mobile Automation

Firefox and Media Automation


Perfherder/Performance Testing

TaskCluster Support

General Automation



November 05, 2015 01:59 PM

November 04, 2015

William Lachance

Perfherder: Onward!

In addition to the database refactoring I mentioned a few weeks ago, some cool stuff has been going into Perfherder lately.

Tracking installer size

Perfherder is now tracking the size of the Firefox installer for the various platforms we support (bug 1149164). I originally only intended to track Android .APK size (on request from the mobile team), but installer sizes for other platforms came along for the ride. I don’t think anyone will complain. :)

Screen Shot 2015-11-03 at 5.28.48 PM


Just as exciting to me as the feature itself is how it’s implemented: I added a log parser to treeherder which just picks up a line called “PERFHERDER_DATA” in the logs with specially formatted JSON data, and then automatically stores whatever metrics are in there in the database (platform, options, etc. are automatically determined). For example, on Linux:

PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, "suites": [{"subtests": [{"name": "", "value": 99030741}], "name": "installer size", "value": 55555785}]}

This should make it super easy for people to add their own metrics to Perfherder for build and test jobs. We’ll have to be somewhat careful about how we do this (we don’t want to add thousands of new series with irrelevant / inconsistent data) but I think there’s lots of potential here to be able to track things we care about on a per-commit basis. Maybe build times (?).
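To sketch how simple it is to consume such lines (this is an illustration, not the actual treeherder parser code), a few lines of Python suffice, using the Linux example above:

```python
import json

# Example log line in the format described above (values from the Linux example).
log_line = ('PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, '
            '"suites": [{"subtests": [{"name": "", "value": 99030741}], '
            '"name": "installer size", "value": 55555785}]}')

PREFIX = "PERFHERDER_DATA: "

def parse_perfherder_line(line):
    """Return the decoded metrics dict if the line carries PERFHERDER_DATA, else None."""
    idx = line.find(PREFIX)
    if idx == -1:
        return None
    return json.loads(line[idx + len(PREFIX):])

data = parse_perfherder_line(log_line)
print(data["suites"][0]["name"], data["suites"][0]["value"])  # installer size 55555785
```

Anything a build or test job can print to its log, a parser like this can turn into structured metrics.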

More compare view improvements

I added filtering to the Perfherder compare view and added back links to the graphs view. Filtering should make it easier to highlight particular problematic tests in bug reports, etc. The graphs links shouldn’t really be necessary, but unfortunately are due to the unreliability of our data — sometimes you can only see if a particular difference between two revisions is worth paying attention to in the context of the numbers over the last several weeks.

Screen Shot 2015-11-03 at 5.37.02 PM


Even after the summer of contribution has ended, Mike Ling continues to do great work. Looking at the commit log over the past few weeks, he’s been responsible for the following fixes and improvements:

Next up

My main goal for this quarter is to create a fully functional interface for actually sheriffing performance regressions, to replace alertmanager. Work on this has been going well. More soon.

Screen Shot 2015-11-04 at 10.41.26 AM

November 04, 2015 03:45 PM

Alice Scarpa

What I got from contributing to OSS

There are a lot of good altruistic reasons to contribute to Open Source Software, but this post focuses on my selfish reasons.

Learning Projects

I’m OK at reading books, implementing examples and doing exercises, but when it came to thinking about good projects to get my hands dirty and implement stuff, I had a lot of trouble coming up with things to do. OSS provides an endless supply of bugs, projects and features to work on.

Code Reviews

Before I got started on OSS, the only person who ever really read my code was myself. Every patch I submitted to Mozilla was reviewed by at least one person, and that really improved my code. From running a Python linter plugin in Emacs to learning idiomatic ways of writing expressions, I learned a lot of good habits.


Whenever I was working on a bug, I could ask for help and someone would always answer, no matter if it was a problem specific to a bug or a general language/module/tool question. This way I was able to accomplish things that were unimaginable to me before.


Knowing someone is using a feature/tool I wrote is an amazing feeling. Even bug reports make me happy! I cherish every IRC mention of my projects.


Before I got started with OSS, all of my programming experience came from books and small projects. Contributing to OSS, I got a chance to work on larger codebases, work with other people and play with technologies that I wouldn’t get to use by myself.


I’m now part of a very friendly community, full of people that I respect, like and trust. They help me a lot, and sometimes I even get to help back!


I used to be very afraid of not being good enough to contribute to OSS. I was not sure I was a real programmer. There were several bugs that I was completely sure I would not be able to fix, until I fixed them. Now I look back at what I did and I feel proud. I feel like maybe I really am a programmer.

If you are interested in long-term contributing, the A-team has some pretty cool contribution opportunities in the next Quarter of Contribution. Check it out!

November 04, 2015 12:00 AM

November 03, 2015

Joel Maher

Lost in Data – Episode 3 – digging into alerts from an uplift

Yesterday I recorded a session where I looked at alerts from an uplift.  I did a lot of rambling and not a lot of content, but there are a few interesting differences between uplift alerts and normal alerts:

If you want to take a look at this, the link is on


I do plan to do more episodes soon, a few topics of interest:

November 03, 2015 01:24 PM

October 31, 2015

Julien Pagès

mozregression 1.1.0 release

New release of mozregression, with some goodies!

See for more details and the full changelog.

October 31, 2015 10:04 AM

mozregression updates

Release 1.1.0

This new release of mozregression includes some new features:

And a few bugfixes also:

Thanks to Mikeling for being really active on some bugs here!

There is also basic support for firefox os builds (flame, aries, simulator). Lots of work still needs to be done to make it really useful (see bug 1205560), but it is now possible to bisect between dates or changesets on a given branch: mozregression will download the builds and ask you to flash them on the device.

# Regression finding by date range with aries-opt builds (defaults to b2g-inbound)
mozregression --app b2g-aries --bad 2015-09-10 --good 2015-09-07
# Regression finding on mozilla-inbound for debug builds
mozregression --app b2g-aries --build-type debug --bad 2015-09-10 --good 2015-09-07 \
              --inbound-branch mozilla-inbound
# Flame builds with a good and bad revision
mozregression --app b2g-flame --bad-rev c4bf8c0c2044 --good-rev b93dd434b3cd
# find more information
mozregression --help
mozregression --list-build-types

Thanks to Michael Shal, Naoki Hirata and others for helping me on this.

October 31, 2015 12:00 AM

October 29, 2015

Joel Maher

Looking for hackers interested in hacking for 6-8 weeks on a Quarter of Contribution project

Today I am happy to announce the second iteration of the Quarter of Contribution.  This will begin on November 23 and run until January 18th.

We are looking for contributors who want to tackle more bugs or a larger project and who are looking to prove existing skills or work on learning new skills.

There are 4 great projects that we have:

There are no requirements to be an accomplished developer.  Instead we are looking for folks who know the basics and want to improve.  If you are interested, please read about the program and the projects and ask questions to the mentors or in the #ateam channel on

Happy hacking!

October 29, 2015 08:26 PM

October 23, 2015

William Lachance

The new old Perfherder data model

I spent a good chunk of time last quarter redesigning how Perfherder stores its data internally. Here are some notes on this change, for posterity.

Perfherder’s data model is based around two concepts:

  1. Series signatures: A unique set of properties (platform, test name, suite name, options) that identifies a performance test.
  2. Series data: A set of measurements for a series signature, indexed by treeherder push and job information.

When it was first written, Perfherder stored the second type of data as a JSON-encoded series in a relational (MySQL) database. That is, instead of storing each datum as a row in the database, we would store sequences of them. The assumption was that for the common case (getting a bunch of data to plot on a graph), this would be faster than fetching a bunch of rows and then encoding them as JSON. Unfortunately this wasn’t really true, and it had some serious drawbacks besides.

First, the approach’s performance was awful when it came time to add new data. To avoid needing to decode or download the full stored series when you wanted to render only a small subset of it, we stored the same series multiple times over various time intervals. For example, we stored the series data for one day, one week… all the way up to one year. You can probably see the problem already: you have to decode and re-encode the same data structure many times for each time interval for every new performance datum you were inserting into the database. The pseudo code looked something like this for each push:

for each platform we're testing talos on:
  for each talos job for the platform:
    for each test suite in the talos job:
      for each subtest in the test suite:
        for each time interval in one year, 90 days, 60 days, ...:
           fetch and decode json series for that time interval from db
           add datapoint to end of series
           re-encode series as json and store in db

Consider that we have some 6 platforms (android, linux64, osx, winxp, win7, win8), 20ish test suites with potentially dozens of subtests… and you can see where the problems begin.
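A back-of-envelope calculation makes the multiplication concrete. The counts below are assumptions loosely based on the figures in the text (the subtest average in particular is made up):

```python
platforms = 6   # android, linux64, osx, winxp, win7, win8
suites = 20     # "20ish test suites"
subtests = 12   # "potentially dozens" -- assumed average, illustrative only
intervals = 6   # one year, 90 days, 60 days, ... -- assumed count

# Each new datapoint triggers a fetch/decode/append/re-encode cycle
# for every stored time interval of its series.
reencodes_per_push = platforms * suites * subtests * intervals
print(reencodes_per_push)  # 8640
```

Thousands of fetch/decode/re-encode round trips per push, before a single graph is ever drawn.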

In addition to being slow to write, this was also a pig in terms of disk space consumption. The overhead of JSON (“{, }” characters, object properties) really starts to add up when you’re storing millions of performance measurements. We got around this (sort of) by gzipping the contents of these series, but that still left us with gigantic mysql replay logs as we stored the complete “transaction” of replacing each of these series rows thousands of times per day. At one point, we completely ran out of disk space on the treeherder staging instance due to this issue.

Read performance was also often terrible for many common use cases. The original assumption I mentioned above was wrong: rendering points on a graph is only one use case a system like Perfherder has to handle. We also want to be able to get the set of series values associated with two result sets (to render comparison views) or to look up the data associated with a particular job. We were essentially indexing the performance data only on one single dimension (time) which made these other types of operations unnecessarily complex and slow — especially as the data you want to look up ages. For example, to look up a two week old comparison between two pushes, you’d also have to fetch the data for every subsequent push. That’s a lot of unnecessary overhead when you’re rendering a comparison view with 100 or so different performance tests:

Screen Shot 2015-08-07 at 1.57.39 PM

So what’s the alternative? It’s actually the most obvious thing: just encode one database row per performance series value and create indexes on each of the properties that we might want to search on (repository, timestamp, job id, push id). Yes, this is a lot of rows (the new database stands at 48 million rows of performance data, and counting) but you know what? MySQL is designed to handle that sort of load. The current performance data table looks like this:

| Field          | Type             |
| id             | int(11)          |
| job_id         | int(10) unsigned |
| result_set_id  | int(10) unsigned |
| value          | double           |
| push_timestamp | datetime(6)      |
| repository_id  | int(11)          | 
| signature_id   | int(11)          | 

MySQL can store each of these structures very efficiently; I haven’t done the exact calculations, but this is well under 50 bytes per row. Including indexes, the complete set of performance data going back to last year clocks in at 15 gigs. Not bad. And we can examine this data structure across any combination of dimensions we like (push, job, timestamp, repository) making common queries to perfherder very fast.
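The row-per-datum model is easy to demonstrate with SQLite standing in for MySQL. This is a simplified sketch of the schema above with made-up sample values, showing how one indexed range scan fetches exactly one series:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE performance_datum (
        id INTEGER PRIMARY KEY,
        job_id INTEGER,
        result_set_id INTEGER,
        value REAL,
        push_timestamp TEXT,
        repository_id INTEGER,
        signature_id INTEGER
    )
""")
# Index the dimensions we actually query on.
conn.execute("CREATE INDEX idx_sig_time ON performance_datum "
             "(repository_id, signature_id, push_timestamp)")

rows = [  # (id, job_id, result_set_id, value, push_timestamp, repository_id, signature_id)
    (1, 100, 50, 641.2, "2015-10-01", 1, 7),
    (2, 101, 51, 655.9, "2015-10-02", 1, 7),
    (3, 102, 52, 12.3,  "2015-10-02", 1, 8),  # a different series
]
conn.executemany("INSERT INTO performance_datum VALUES (?,?,?,?,?,?,?)", rows)

# Fetch one series over a time range: a single indexed range scan,
# with no decoding of unrelated data.
series = conn.execute(
    "SELECT push_timestamp, value FROM performance_datum "
    "WHERE repository_id = 1 AND signature_id = 7 "
    "AND push_timestamp >= '2015-10-01' ORDER BY push_timestamp"
).fetchall()
print(series)  # [('2015-10-01', 641.2), ('2015-10-02', 655.9)]
```

The same table answers lookups by job, by push, or by time range just by hitting a different index, which is exactly what the comparison and graph views need.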

What about the initial assumption, that it would be faster to get a series out of the database if it’s already pre-encoded? Nope, not really. If you have a good index and you’re only fetching the data you need, the overhead of encoding a bunch of database rows to JSON is pretty minor. From my (remote) location in Toronto, I can fetch 30 days of tcheck2 data in 250 ms. Almost certainly most of that is network latency. If the original implementation was faster, it’s not by a significant amount.

Screen Shot 2015-10-23 at 1.55.09 PM

Lesson: Sometimes using ancient technologies (SQL) in the most obvious way is the right thing to do. DoTheSimplestThingThatCouldPossiblyWork

October 23, 2015 06:28 PM

Dan Minor

MozReview Toronto Work Week

We’re just wrapping up another MozReview work week, this time in Toronto. Our main goal was to indoctrinate Glob into MozReview development as he is joining us for at least a few quarters. Since we reserve our fortnightly “Engineering Productivity Updates” for significant contributions, here is a list of my insignificant contributions from this week instead:

October 23, 2015 01:26 PM

October 22, 2015

Byron Jones

moving from bugzilla to mozreview

for the next couple of quarters (at least) i’ll be shifting my attention full time from bugzilla to mozreview. this switch involves a change of language, frameworks, and of course teams. i’m looking forward to new challenges.

one of the first things i’ve done is sketch out a high level architectural diagram of mozreview and its prime dependencies:

MozReview Architectural Diagram

mozreview exists as an extension to reviewboard, using bugzilla for user authentication, ldap to check commit levels, with autoland pushing commits automatically to try (and to mozilla-central soon).  there are mercurial extensions on both the client and server to make pushing things easier, and there are plans to perform static analysis with bots.

Filed under: mozilla, mozreview

October 22, 2015 07:53 PM

Mark Côté

MozReview's Parental issues

As mentioned in my previous post on MozReview, one of the biggest sources of confusion is the way we present the “squashed” diffs, that is, the diffs that show all of the changes in a commit series, the sum of all the proposed changes. We also refer to these as “parent” review requests, since they function as something to hold all the commits together. They are stored in MozReview as separate review requests, similar to the individual commits.

The confusion results from several things:

There are a few simple things we can do to fix these problems: use better link names, put a big “This is an overview of the commit series” message, and/or put a warning “You must review individual commits” on the review dialog. But really, we need to step back and think about the way we present the squashed diffs, and if they even make sense as a concept in MozReview.

To reiterate, squashed diffs provide a complete view of a whole commit series. The concept of a commit series doesn’t exist in core Review Board (nor does it exist in many other code-review tools), but it’s central to the idea of the repository-centric approach (like in GitHub pull requests). We added this concept by storing metadata resulting from pushes to tie commit series together with a parent, and we added UI elements like the commits table.

There are three broad ways we can deal with squashed diffs going forward. We need to settle on one and make the associated UI changes to make our model clear to users.

  1. Remove squashed diffs altogether.

    This is the simplest option. Squashed diffs aren’t actually technically necessary, and they can distract reviewers from the individual commits, which is where they should be spending most of their time, since, in most cases, this is how the code will be landing in the main repository. Some other repository-centric review tools, like Critic, don’t have the concept of an overview diff, so there are precedents. However, it might be a bit heavy handed to tell reviewers that they can’t view all the commits as a single diff (at least, without pulling them down locally).

  2. Continue to allow reviews, of some sort, on squashed diffs.

    This is what we have now: reviewers can leave reviews (at the moment, comments only) on squashed diffs. If we decide we want to continue to allow users to leave reviews on squashed diffs, we’ll need to both figure out a better UI to distinguish them from the individual commits and also settle several open questions:

    • Should reviewers be able to grant ship its (i.e. r+s) on squashed diffs? This would imply that the commits probably haven’t been reviewed individually, which would defeat the purpose of a commit-centric system. That said, reviewer time is very important, so we could have a trade off to support more work flows.

    • Conversely, should reviewers be able to leave comments on the parent diff? For simplicity, we could allow reviewers to leave a “ship it” review on a squashed diff that would apply to all commits but force them to leave any comments on diffs on the commits themselves. This would essentially remove the ability to review squashed diffs themselves but would leave the convenience of saying “this is all good”.

    • If we do want to allow review comments on squashed diffs, how should they be consolidated with the reviews on individual commits? Right now, reviews (general comments and comments on diffs) for the squashed diff and all commits are all on separate pages/views. Giving one view into all activity on a commit series would be ideal if we want to support squashed-diff reviews. Arguably, this would be valuable even if we didn’t have reviews on squashed diffs.

    For comparison, GitHub pull requests support this model. There are three tabs in a pull request: “Files changed”, which is the squashed diff; “Commits”, which is a list of commits with links to the individual commit diffs; and “Conversation”, which shows comments on the commits and on the squashed diff (along with other events like updates to the commits). The way they are presented is a little confusing (comments on the squashed diff are just labelled “<user> commented on the diff”, whereas comments on the diffs are of the form “<user> commented on <file> in <commit hash>”), but it is a useful single view. However, note that pull requests do not have the concept of a “ship it” or “r+”, which makes the GitHub interface simpler.

    This approach would support multiple reviewer work flows, but it is also the most complicated, both in terms of UX and technical implementation, and it waters down the philosophy behind MozReview.

  3. Provide read-only overview diffs.

    The third approach is to keep squashed diffs but make them read only. They could be used as reference, to get a big picture of the whole series, but since they are read only, they would be easily distinguishable from commits and would force reviewers to look at the individual commits. This is really just option 1 above, with a reference view of the whole series. It would be more work than option 1 but less than option 2, and would preserve the philosophy.

The MozReview team has been leaning towards option 3. We have a mock-up that strips away a lot of the UI that would be useless in this scenario and makes the intention clear. It’s not the prettiest, but it wouldn’t take too much work to get here:

However, we’d like to hear user feedback before making any decisions. Whichever option we go with, we’ll come up with a plan to get there that ideally will have incremental improvements, depending on the complexity of the full solution, so that we can start to fix things right away.

October 22, 2015 06:29 PM