Planet Mozilla Automation

May 27, 2015

Armen Zambrano G. (@armenzg)

Welcome adusca!

It is my privilege to announce that adusca (blog) joined Mozilla on Monday as an Outreachy intern for the next 4 months.

adusca has made an outstanding number of contributions over the last few months, including to Mozilla CI Tools (which we're working on together).

Here's a bit about her, from her blog:
Hi! I’m Alice. I studied Mathematics in college. I was doing a Master’s degree in Mathematical Economics before getting serious about programming.
She is also a graduate of Hacker School.

Even though Alice has not been programming for many years, she has already shown lots of potential. For instance, she wrote a script to generate scheduling relations for buildbot; for this and many other reasons I tip my hat to her.

adusca will initially help me out with creating a generic Pulse listener to handle job cancellations and retriggers for Treeherder. The intent is to create a way for Mozilla CI Tools to manage scheduling on behalf of Treeherder, pave the way for more sophisticated Mozilla CI actions, and allow other people to piggyback on this Pulse service and trigger their own actions.

If you have not yet had a chance to welcome her and get to know her, I highly encourage you to do so.

Welcome Alice!

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

May 27, 2015 05:10 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

May 27, 2015 07:11 AM

May 26, 2015

Geoff Brown

Handling intermittent test timeouts in long-running tests

Tests running on our new-ish Android 4.3 Opt emulator platform have recently been plagued by intermittent timeouts, and I have been taking a closer look at some of them (like bug 919246 and bug 1154505).

A few of these tests normally run “quickly”. Think of a test that runs to completion in under 10 seconds most of the time but times out after 300+ seconds intermittently. In a case like this, it seems likely that there is an intermittent hang and the test needs debugging to determine the underlying cause.

But most of the recent Android 4.3 Opt test timeouts seem to be affecting what I classify as “long-running” tests. Think of a test that normally runs to completion in 250 to 299 seconds, but intermittently times out after 300 seconds. It seems likely that normal variations in test duration are intermittently pushing past the timeout threshold; if we can tolerate a longer time-out, or make the test run faster in general, we can probably eliminate the intermittent test failure.

We have a lot of options for dealing with long-running tests that sometimes timeout.

Option: Simplify or optimize the test

Long-running tests are usually doing a lot of work. A lot of assertions can be run in 300 seconds, even on a slow platform! Do we need to test all of those cases, or could some be eliminated? Is there some setup or teardown code being run repeatedly that could be run just once, or even just less often?

We usually don’t worry about optimizing tests, but sometimes a little effort can help a test run much more efficiently, saving test time, money (think AWS costs), and aggravation like intermittent timeouts.

Option: Split the test into 2 or more smaller tests

Some tests can be split into 2 or more smaller tests with minimal effort. Instead of testing 100 different cases in one test, we may be able to test 50 in each. There may be some loss of efficiency: Maybe some setup code will need to be run twice, and copied and pasted to the second test. But now each half runs faster, reducing the chance of a timeout. And when one test fails, the cause is – at least slightly – more isolated.

Option: Request a longer timeout for the test

Mochitests can call SimpleTest.requestLongerTimeout(2) to double the length of the timeout applied to the test. We currently have about 100 mochitests that use this feature.

For xpcshell tests, the same thing can be accomplished with a manifest annotation:

requesttimeoutfactor = 2
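
For xpcshell, the annotation lives in the test's section of the manifest; a hypothetical xpcshell.ini entry (the file names here are made up for illustration) might look like:

```ini
[DEFAULT]
head = head.js

# This test normally runs close to the harness timeout; allow it 2x.
[test_long_running.js]
requesttimeoutfactor = 2
```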

That’s a really simple “fix” and an effective way of declaring that a test is known to be long-running.

On the other hand, it is avoiding the problem and potentially covering up an issue that could be solved more effectively by splitting, optimizing, or simplifying. Also, long-running tests make our test job “chunking” less effective: It’s harder to split load evenly amongst jobs when some tests run 100 times longer than others.

Option: Skip the test on slow platforms

Sometimes it’s not worth the effort. Do we really need to run this test on Android as well as on all the desktop platforms? Do we get value from running this test on both Android 2.3 and Android 4.3? We may “disable our way to victory” too often, but this is a simple strategy, doesn’t affect other platforms and sometimes it feels like the right thing to do.

Option: Run on faster hardware

This usually isn’t practical, but in special circumstances it seems like the best way forward.

If you have a lot of timeouts from long-running tests on one platform and those tests don’t timeout on other platforms, it may be time to take a closer look at the platform.

Our Android arm emulator test platforms are infamous for slowness. In fairness, the emulator has a lot of work to do, Firefox is complex, our tests are often relentless (compared to human-driven browsing), and we normally run the emulator on the remarkably slow (and cheap!) m1.medium AWS instances.

If we are willing to pay for better CPU, memory, and I/O capabilities, we can easily speed up the emulator by running on a faster AWS instance type — but the cost must be justified.

I recently tried running Android 4.3 Debug mochitests on m1.medium and found that many tests timed out. Also, since all tests were taking longer, each test job (each “chunk”) needed 2 to 3 hours to complete — much longer than we can wait. Increasing chunks seemed impractical (we would need 50 or so) and we would still have all those individual timeouts to deal with. In this case, running the emulator on c3.xlarge instances for Debug mochitests made a big difference, allowing them to run in the same number of chunks as Opt on m1.medium and eliminating nearly all timeouts.

I’ve enjoyed investigating mochitest timeouts and found most of them to be easy to resolve. I’ll try to investigate more timeouts as I see them. Won’t you join me?

May 26, 2015 11:34 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

May 26, 2015 04:49 AM

May 25, 2015

Alice Scarpa

Using cProfile with gprof2dot to identify bottlenecks

At Mozilla CI Tools we have a script that can sometimes be annoyingly slow. I used cProfile with gprof2dot to better understand how to improve it.

To get a pretty graph of the script’s behaviour I ran (file names here are placeholders):

python -m cProfile -o timing the_script.py
gprof2dot -f pstats timing -o timing.dot
dot -Tsvg timing.dot -o graph_master.svg

This gave me a very useful graph.

Looking at the graph I was able to identify two bottlenecks that were low-hanging fruit: query_jobs and valid_revision. These two functions are called many times in the script with the same arguments, which means that by adding some simple caching I could improve the script’s speed. Preliminary results show a 2x speed-up. There is still a lot of room for improvement, but it’s a nice start.
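
The caching can be as simple as a memoization decorator keyed on the arguments. This is only an illustrative sketch (not the actual mozci code, and valid_revision here is a hypothetical stand-in); it assumes the cached functions always return the same result for the same arguments during one run:

```python
import functools

def memoize(func):
    """Cache results per argument tuple so repeated calls return instantly."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper

@memoize
def valid_revision(repo, revision):
    # Imagine an expensive network request here (hypothetical stand-in).
    return True
```

With something like this in place, the second and later calls with the same arguments skip the expensive work entirely.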

May 25, 2015 12:00 AM

May 20, 2015

Joel Maher

re-triggering for a [root] cause – version 1.57

Last week I wrote some notes about re-triggering jobs to find a root cause. This week I decided to look at the orange factor email of the top 10 bugs and see how I could help. Looking at each of the 10 bugs, I found 3 worth investigating and ignored the other 7.



Looking at the bugs of interest, I jumped right into retriggering. This time around I did 20 retriggers for the original changeset, then went back 30 revisions in history, doing the same thing for every 5th revision. Effectively this meant 20 retriggers for each of the 0th, 5th, 10th, 15th, 20th, 25th, and 30th revisions in the history list (140 retriggers in total).
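
The revision selection above is easy to script; here is a sketch of the arithmetic (variable names are mine, not mozci's):

```python
# Retrigger the original changeset (index 0) and every 5th revision
# back through the last 30, 20 times each.
RETRIGGERS_PER_REVISION = 20

revision_indexes = list(range(0, 31, 5))  # [0, 5, 10, 15, 20, 25, 30]
total_retriggers = RETRIGGERS_PER_REVISION * len(revision_indexes)  # 140
print(revision_indexes, total_retriggers)
```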

I ran into issues doing this, specifically on bug 1073761. The reason is that for about 7 revisions in history the Windows 8 builds failed! Luckily the builds finished far enough to produce a binary+tests package so we could run tests, but mozci didn’t understand that the build was available. That required some manual retriggering. A few cases in both sets of retriggers were actual build failures, which meant manually picking a different revision to retrigger on. It was then fairly easy to run my tool again and fill in the 4 missing revisions using slightly different mozci parameters.

This was a bit frustrating, as there was a lot of manual digging and retriggering due to build failures. Luckily 2 of the top 10 bugs have the same root cause, and we figured it out. Including IRC chatter and this blog post, I have roughly 3 hours invested in this experiment.

May 20, 2015 01:41 PM

May 19, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

May 19, 2015 06:19 AM

May 18, 2015

Joel Maher

A-Team contribution opportunity – Dashboard Hacker

I am excited to announce a new focused contribution project – Dashboard Hacker. Last week we previewed that today we would announce 2 contribution projects. This is an unpaid program where we are looking for 1-2 contributors who will dedicate 5-10 hours/week for at least 8 weeks. More time is welcome, but not required.

What is a dashboard hacker?

When a developer is ready to land code, they want to test it. Getting the results and understanding them is made a lot easier by good dashboards and tools. For this project, we have a starting point with our performance data view: fixing up a series of nice-to-have polish features and then ensuring that it is easy to use within a normal developer workflow. Part of the developer workflow is the regular job view; if time permits, there are some fun experiments we would like to implement there. These bugs, features, and projects are all smaller and self-contained, which makes them great projects for someone looking to contribute.

What is required of you to participate?

What we will guarantee from our end:

How do you apply?

Get in touch with us either by replying to the post, commenting in the bug, or contacting us on IRC (I am :jmaher in #ateam; :wlach will be the primary mentor). We will point you at a starter bug and introduce you to the bugs and problems to solve. If you have prior work (links to Bugzilla, GitHub, blogs, etc.) that would help us learn more about you, that would be a plus.

How will you select the candidates?

There are no real criteria here. One factor will be whether you can meet the criteria outlined above and how well you pick up the problem space. Ultimately it will be up to the mentor (for this project, :wlach). If you do apply and we have already picked a candidate, or don’t choose you for other reasons, we do plan to repeat this every few months.

Looking forward to building great things!

May 18, 2015 07:43 PM

A-Team contribution opportunity – DX (Developer Ergonomics)

I am excited to announce a new focused contribution project – Developer Ergonomics/Experience, otherwise known as DX. Last week we previewed that today we would announce 2 contribution projects. This is an unpaid program where we are looking for 1-2 contributors who will dedicate 5-10 hours/week for at least 8 weeks. More time is welcome, but not required.

What does DX mean?

We chose this project because we continue to experience frustration while fixing bugs and debugging test failures. Many people have suggested great ideas; in this case we have set aside a few (look at the dependent bugs: cleaning up argument parsers, helping our tests run in smarter chunks, making it easier to run tests locally or on a server, etc.) which would clean things up and be harder than a good first bug, yet each issue by itself would be too easy for an internship. Our goal is to clean up our test harnesses and tools and, if time permits, add things to the workflow which make it easier for developers to do their job!

What is required of you to participate?

What we will guarantee from our end:

How do you apply?

Get in touch with us either by replying to the post, commenting in the bug, or contacting us on IRC (I am :jmaher in #ateam). We will point you at a starter bug and introduce you to the bugs and problems to solve. If you have prior work (links to Bugzilla, GitHub, blogs, etc.) that would help us learn more about you, that would be a plus.

How will you select the candidates?

There are no real criteria here. One factor will be whether you can meet the criteria outlined above and how well you pick up the problem space. Ultimately it will be up to the mentor (for this project, it will be me). If you do apply and we have already picked a candidate, or don’t choose you for other reasons, we do plan to repeat this every few months.

Looking forward to building great things!

May 18, 2015 07:42 PM

Mark Côté

Project Isolation

The other day I read about another new Mozilla project that decided to go with GitHub issues instead of our Bugzilla installation (BMO). The author’s arguments make a lot of sense: GitHub issues are much simpler and faster, and if you keep your code in GitHub, you get tighter integration. The author notes that a downside is the inability to file security or confidential bugs, for which Bugzilla has a fine-grained permission system, and that he’d just put those (rare) issues on BMO.

The one downside he doesn’t mention is interdependencies with other Mozilla projects, e.g. the Depends On/Blocks fields. This is where Bugzilla gets into project, product, and perhaps even program management by allowing people to easily track dependency chains, which is invaluable in planning. Many people actually file bugs solely as trackers for a particular feature or project, hanging all the work items and bugs off of it, and sometimes that work crosses product boundaries. There are also a number of tracking flags and fields that managers use to prioritize work and decide which releases to target.

If I had to rebut my own point, I would argue that the projects that use GitHub issues are relatively isolated, and so dependency tracking is not particularly important. Why clutter up and slow down the UI with lots of features that I don’t need for my project? In particular, most of the tracking features are currently used only by, and thus designed for, the Firefox products (aside: this is one reason the new modal UI hides most of these fields by default if they have never been set).

This seems hard to refute, and I certainly wouldn’t want to force an admittedly complex tool on anyone who had much simpler needs. But something still wasn’t sitting right with me, and it took a while to figure out what it was. As usual, it was that a different question was going unasked, leading to unspoken assumptions: why do we have so many isolated projects, and what are we giving up by having such loose (or even no) integration amongst all our work?

Working on projects in isolation is comforting because you don’t have to think about all the other things going on in your organization—in other words, you don’t have to communicate with very many people. A lack of communication, however, leads to several problems:

By working in isolation, we can’t leverage each other’s strengths and accomplishments. We waste effort and lose great opportunities to deliver amazing things. We know that places like Twitter use monorepos to get some of these benefits, like a single build/test/deploy toolchain and coordination of breaking changes. This is what facilitates architectures like microservices and SOAs. Even if we don’t want to go down those paths, there is still a clear benefit to program management by at least integrating the tracking and planning of all of our various endeavours and directions. We need better organization-wide coordination.

We’re already taking some steps in this direction, like moving Firefox and Cloud Services to one division. But there are many other teams that could benefit from better integration, many teams that are duplicating effort and missing out on chances to work together. It’s a huge effort, but maybe we need to form a team to define a strategy and process—a Strategic Integration Team perhaps?

May 18, 2015 02:37 AM

May 15, 2015

Armen Zambrano G. (@armenzg)

mozci 0.6.0 - Trigger based on Treeherder filters, Windows support, flexible and encrypted password management

In this release of mozci we have a lot of developer-facing improvements, like Windows support and more flexibility in password management.
We also have our latest experimental script, mozci-triggerbyfilters.

How to update

Run "pip install -U mozci" to update.


We have moved all scripts from scripts/ to mozci/scripts/.
Note that you can now use "pip install" and have all scripts available as mozci-name_of_script_here in your PATH.


We want to welcome @KWierso as our latest contributor!
Our gratitude to @Gijs for reporting the Windows issues and for all his feedback.
Congratulations to @parkouss for making the first project using mozci as its dependency.
In this release we had @adusca and @vaibhavmagarwal as our main and very active contributors.

Major highlights

  • Added script to trigger jobs based on Treeherder filters
    • This allows using filters like --include "web-platform-tests", which will trigger all matching builders
    • You can also use --exclude to exclude builders you don't want
  • With the new trigger by filters script you can preview what will be triggered:
233 jobs will be triggered, do you wish to continue? y/n/d (d=show details) d
05/15/2015 02:58:17 INFO: The following jobs will be triggered:
Android 4.0 armv7 API 11+ try opt test mochitest-1
Android 4.0 armv7 API 11+ try opt test mochitest-2
  • Removed storing passwords in plain text (sorry!)
    • We now prompt the user to choose whether to store their password encrypted
  • When you use "pip install" we will also install the main scripts as mozci-name_of_script_here binaries
    • This makes it easier to use the binaries in any location
  • Windows issues
    • The python module is incapable of decompressing large binaries
    • Do not store buildjson on a temp file and then move

Minor improvements

  • Updated docs
  • Improved wording when triggering a build instead of a test job
  • Loosened up the python requirements from == to >=
  • Added filters to

All changes

You can see all changes here:

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

May 15, 2015 08:13 PM

Joel Maher

Watching the watcher – Some data on the Talos alerts we generate

What are the performance regressions at Mozilla, who monitors them, and what kinds of regressions do we see? I want to answer these questions with a few peeks at the data. There are plenty of previous blog posts I have done outlining stats, trends, and the process. Let's recap briefly what we do, then look at the breakdown of alerts (not necessarily bugs).

When Talos uploads numbers to graph server, they get stored and eventually run through a calculation loop to find regressions and improvements. As of Jan 1, 2015, we upload these to as well as emailing the offending patch author (if they can easily be identified). There are a couple of folks (performance sheriffs) who look at the alerts and triage them. If necessary, a bug is filed for further investigation. Reading this brief recap of what happens to our performance numbers probably doesn’t inspire folks; what is interesting is looking at the actual data we have.

Let's start with some basic facts about alerts in the last 12 months:

As you can see this is not a casual hobby, it is a real system helping out in fixing and understanding hundreds of performance issues.

We generate alerts on a variety of branches, here is the breakdown of branches and alerts/branch;

number of regression alerts we have received per branch

There are a few things to keep in mind here: mobile/mozilla-central/Firefox are the same branch, and for non-PGO branches that is only Linux/Windows/Android, not OS X.

Looking at that graph is sort of uninspiring; most of the alerts land on fx-team and mozilla-inbound, then show up on the other branches as we merge code. We run more tests/platforms and land/back out changes more frequently on mozilla-inbound and fx-team, which is why they have a larger number of alerts.

Given that we have so many alerts and have manually triaged them, what state do the alerts end up in?

Current state of alerts

The interesting data point here is that 43% of our alerts are duplicates.  A few reasons for this:

The last piece of information that I would like to share is the break down of alerts per test:

Alerts per test

number of alerts per test (some excluded)

There are a few outliers, but we need to keep in mind that active work was being done in certain areas which would explain a lot of alerts for a given test.  There are 35 different test types which wouldn’t look good in an image, so I have excluded retired tests, counters, startup tests, and android tests.

Personally, I am looking forward to the next year as we transition some tools and do some hacking on the reporting, alert generation and overall process.  Thanks for reading!

May 15, 2015 12:40 PM

May 12, 2015

Joel Maher

community hacking – thoughts on what works for the automation and tools team

Community is a word that means a lot of things to different people. When there is talk of community at an A*Team meeting, some people perk up and others tune out. Having taken a voluntary role in leading many community efforts on the A*Team over the last year, here are some thoughts I have on accepting contributions, growing community, and making it work within the team.


Historically on the A*Team we would file bugs which are mentored (and discoverable via bugsahoy) and blog/advertise for help wanted. This is always met with great enthusiasm from a lot of contributors. What does this mean for the mentor? There are a few common axes here:

We need to appreciate all types of contributions and ensure we do our best to encourage folks to participate. As a mentor, if you have a lot of high-touch, low-reward, short-term contributors, it is exhausting and demoralizing. No wonder a lot of people don’t want to mentor folks through their contributions. It is also unrealistic to expect a bunch of seasoned coders to show up, implement all the great features, and then repeat that for years on end.

The question remains: how do you find low-touch contributors, or identify the high-touch ones who end up learning fast (some of the best contributors fall into this latter category)?

Growing Community:

The simple answer here is to file a bunch of bugs. In fact, whenever we do this they get fixed really fast. That turns into a problem when you have 8 new contributors, 2 mentors, and 10 good first bugs. Of course it is possible to find more mentors, and it is possible to file more bugs. In reality this doesn’t work well for most people and projects.

The real question to ask is: what kind of growth are you looking for? The answer is different for many people. What we find valuable is slowly growing our long-term/low-touch contributors by giving them more responsibility (i.e. ownership) and really depending on them for input on projects. There is also a need to grow mentors, and mentors can be contributors as well! Lastly, it is great to have a larger pool of short-term contributors who have ramped up on a few projects and enjoy pitching in once in a while.

How can we foster a better environment for both mentors and contributors?  Here are a few key areas:

Just focusing on the relationships and what comes after the good first bugs will go a long way in retaining new contributors and reducing the time spent helping out.

How we make it work in the A*Team:

The A*Team is not perfect. We have few mentors, and community is not built into the way we work. Some of this is circumstantial, but a lot of it is within our control. So what do we do, and what does and does not work for us?

Once a month we meet to discuss what is going on within the community on our team.  We have tackled topics such as project documentation, bootcamp tools/docs, discoverability, good next bugs, good first projects, and prioritizing our projects for encouraging new contributors.

While that sounds good, it is the work of a few people. There is a lot of negative history of contributors fixing one bug and taking off. Much frustration is expressed around helping someone with basic pull requests and patch management, over and over again. While we can document things all day long, the reality is that new contributors won’t read the docs and will still ask questions.

The good news is in the last year we have seen a much larger impact of contributors to our respective projects.  Many great ideas were introduced, problems were solved, and experiments were conducted- all by the growing pool of contributors who associate themselves with the A*Team!

Recently, we discussed the most desirable attributes of contributors, trying to think about the problem in a different way. It boiled down to a couple of things: willingness to learn, and sticking around at least for the medium term.

Going forward we are working on growing our mentor pool and focusing on key projects, so that the high-touch and time-intensive learning curve only happens in areas where we can spread the love between domain experts and folks just getting started.

Keep an eye out for more posts in the coming week(s) outlining some new projects and opportunities to get involved.

May 12, 2015 07:41 PM

Henrik Skupin

Firefox Automation report – Q1 2015

As you may have noticed, I was not able to come up with status reports of the Firefox Automation team during the whole last quarter. I feel sad about it, but there was simply no time to keep up with those blog posts. Even now I’m not sure how often I will be able to blog, so maybe I will aim to do it at least once a quarter or, if possible, once a month.

You may ask how that came about. The answer is simple: our team faced some changes and finally a massive loss of core members, which means that of the former 6 people only I remain. Since the end of February all 5 former team members from Softvision are no longer participating in any of the maintained projects. Thanks to all of them for the great help over all the last months and years! But every project we own is now on my shoulders alone, and this is a hell of a lot of work, with downsides like not being able to do as many reviews as I want for side projects. One positive thing at least was that I got pulled back into the A-Team at the same time. With that move I’m once more closer to all the people who care about the basics of all test infrastructure at Mozilla. I feel back home.

So what have I done the whole last quarter? First, there was the ongoing daily work of maintaining our Mozmill CI system. This was usually a job for a dedicated person over all the last months; the amount of work can sometimes eat up a whole day, especially if several regressions have been found or incompatible changes have landed in Firefox. Seeing my deliverables for Q1, it was clear that we had to cut down the time spent on those failures. As a result we started to partially skip tests which were failing; there was no time to get any of those fixed. Happily the latest version of Mozmill is still working kinda nicely, so no other work had to be dedicated to this project.

Most of my time during the last quarter I actually had to spend on Marionette, especially building up wrapper scripts so Marionette can be used as the test framework for Firefox Desktop. This was a kinda large change for us but totally important in terms of maintenance burden and sustainability. The code base of Mozmill is kinda outdated, and features like Electrolysis (e10s) will totally break it. Given that a rewrite of the test framework is too cost-intensive, the decision was made to transition our Mozmill tests over to Marionette. A side effect was that a lot of missing features had to be implemented in Marionette to bring it to the level of what Mozmill offers. Thanks for the amazing work go to Andrew Halberstadt, David Burns, Jonathan Griffin, and especially Chris Manchester.

For the new UI-driven tests for Firefox Desktop we created the firefox-ui-tests repository on GitHub. We decided on that name to make clear which product the tests belong to, and also to get rid of any relationship to the underlying test framework name. This repository contains the harness extensions around Marionette, a separate puppeteer library for back-end and UI modules, and last but not least the tests themselves. As the goal for Q1 we had to get the basics working, including the full set of remote security tests and, most importantly, the update tests. A lot of help on the security tests came from Barbara Miller, our intern from December to March. She did a great amount of work here and also assisted other community members in getting their code done. Finally we got all the security tests converted.

My own focus, beside the harness pieces, was the update tests. Given the complete refactoring of those Mozmill tests, we were able to easily port them over to Marionette. We tried to keep the class structure as is, and only made enhancements where necessary. Here Bob Silverberg helped with two big chunks of work, for which I'm gladly thankful! Thanks a lot! With all modules in place I finally converted the update tests and got them running for each version of Firefox down to 38.0, which will be the next ESR release and is kinda important to be tested with Marionette. For stability and ease of contribution we added support for Travis CI to our new repository. It helps us a lot with reviews of patches from community members, and they can also see immediately whether the changes they have made work as expected.

The next big chunk of work will be to get those tests running in Mozmill CI (to be renamed) and to move the test reporting to Treeherder. We also want to get our update tests for Firefox releases executed by the RelEng system, to further reduce the amount of time QE needs for signoffs. I will talk more about this work in my next blog post, so please stay tuned.

Meeting Details

If you are interested in further details and discussions, you might also want to have a look at the meeting agendas, the video recordings, and the notes from the Firefox Automation meetings. Please note that since the end of February we no longer host a meeting, due to the low attendance and to other meetings, like the A-team ones, where I report my status.

May 12, 2015 04:03 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

May 12, 2015 01:26 PM

Joel Maher

Re-Triggering for a [root] cause – some notes from doing it

With all this talk of intermittent failures and folks coming up with ideas on how to solve them, I figured I should dive head first into looking at failures. I have been working with a few folks on this, specifically :parkouss and :vaibhav1994. In this experiment (actually the second time doing it), I take a given intermittent-failure bug and retrigger it. If it reproduces, I then go back in history looking for where it became intermittent. This weekend I wrote up some notes while trying to define what an intermittent is.

Let's outline the parameters for this experiment first:

Here are what comes out of this:

The next step was to look at each of the 25 bugs and see if it made sense to do this.  In fact, I decided not to take action on 13 of the bugs (remember, this is an experiment, so my reasoning for ignoring these 13 could be biased):

This leaves us with 12 bugs to investigate.  The premise here is easy, find the first occurrence of the intermittent (branch, platform, testjob, revision) and re-trigger it 20 times (picked for simplicity).  When the results are in, see if we have reproduced it.  In fact, only 5 bugs reproduced the exact error in the bug when re-triggered 20 times on a specific job that showed the error.
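As a sanity check on the "20 times (picked for simplicity)": if a failure happens independently with probability p on each run, the chance that 20 retriggers reproduce it at least once is 1 - (1 - p)^20. A quick illustration (my own arithmetic, not part of the experiment's tooling):

```python
# Chance of seeing at least one failure in n retriggers,
# assuming the test fails independently with probability p per run.
def p_reproduce(p, n=20):
    return 1 - (1 - p) ** n

# A 1-in-10 failure is very likely to show up in 20 retriggers;
# a 1-in-100 failure escapes them most of the time.
for p in (0.10, 0.05, 0.01):
    print(f"p={p:.2f}: {p_reproduce(p):.2f}")
```

This is why a job that fails rarely can look "not reproduced" after 20 retriggers even though the bug is real.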

Moving on, I started re-triggering jobs back in the pushlog to see where it was introduced.  I started off with going back 2/4/6 revisions, but got more aggressive as I didn’t see patterns.  Here is a summary of what the 5 bugs turned out like:

In summary, out of 356 bugs, 2 root causes were found by re-triggering.  In terms of time invested, I have put about 6 hours into finding the root causes of the 5 bugs.

May 12, 2015 01:00 PM

May 11, 2015

Julien Pagès

mozregression-gui first release!

A year ago, I started to contribute to Mozilla. Mozregression (a regression range finder for Mozilla nightly and inbound builds) was one of the first projects that caught my attention: I made one patch, then another, and another… I was mentored by the awesome William Lachance. Since then, I have become one of the core contributors and a packager for mozregression, and am now myself mentoring people making new contributions to this project!

At the time of the Google Summer of Code, William and I thought about adding a graphical interface to mozregression. Unfortunately, Mozilla was not accepted for GSoC 2015 – but the idea was still interesting – and we worked on it anyway.

And here we are! Ready for a first release of mozregression-gui! Please give it a try, and help us by reporting bugs and giving feedback. :) For now we provide binaries for Windows and 64-bit Linux (a Mac port is planned) but it is still possible to build from source on each platform.

Much work remains to be done on the GUI, and this is a good place to start contributing to Mozilla – so if you’d like to hack with Python and Qt, don’t wait and get involved!

May 11, 2015 06:44 PM

mozregression updates

Mozregression GUI is out!

I am proud to announce the first release of mozregression-gui, a graphical interface for mozregression!

This is a pre-release alpha version, provided so you can help us to find bugs and give us some feedback.

Give it a try!

May 11, 2015 12:00 AM

May 10, 2015

Joel Maher

intermittent oranges- missing a real definition

There are all kinds of great ideas folks have for fixing intermittent issues.  In fact, each idea in and of itself is a worthwhile endeavor.  I have spent some time over the last couple of months fixing them, filing bugs on them, and really discussing them.  One question remains: what is the definition of an intermittent?

I don’t plan to lay out a definition; instead I plan to ask some questions and lay out some parameters.  According to Orange Factor, there are 4640 failures in the last week (May 3 -> May 10), across 514 unique bugs.  These are all failures that the sheriffs have done some kind of manual work on to star in Treeherder.  I am not sure anybody can paint a pretty picture that makes it appear we don’t have intermittent failures.

Looking at a few bugs, there are many reasons for intermittent failures:

There are a lot of reasons, many of these have nothing to do with poor test cases or bad code in Firefox.  But many of these are showing up many times a day and as a developer who wants to fix a bad test, many are not really actionable.  Do we need to have some part of a definition to include something that is actionable?

Looking at the history of ‘intermittent-failure’ bugs in Bugzilla, many occur once and never occur again.  In fact this is the case for over half of the bugs filed (we file upwards of 100 new bugs/week).  While there are probably reasons for a given test case to fail, if it failed in August 2014 and has never failed again, is that test case intermittent?  As a developer, could you really do anything about it, given that reproducing it is virtually impossible?

This is where I start to realize we need to find a way to identify real intermittent bugs/tests and not clutter the statistics with tests which are virtually impossible to reproduce.  Thinking back to what is actionable- I have found that while filing bugs for Talos regressions the closer the bug is filed to the original patch landing, the better the chance it will get fixed.  Adding to that point, we only keep 30 days of builds/test packages around for our CI automation.  I really think a definition of an intermittent needs to have some kind of concept of time.  Should we ignore intermittent failures which occur only once in 90 days?  Maybe ignore ones that don’t reproduce after 1000 iterations?  Some could argue that we look in a smaller or larger window of time/iterations.

Lastly, when looking into specific bugs, I find that many are already fixed.  Many of the intermittent failures are actually fixed!  Do we track how many get fixed?  How many have patches or have had debugging already take place?  For example, in the last 28 days we have filed 417 intermittents, of which 55 are already resolved, and of the remaining 362 only 25 have occurred >= 20 times.  Of those 25 bugs, 4 already have patches.  It appears a lot of work is done to fix intermittent failures which are actionable.  Are the ones which are not being fixed not actionable?  Are they in a component where all the developers are busy and heads down?

In a perfect world a failure would never occur, all tests would be green, and all users would use Firefox.  In reality we have to deal with thousands of failures every week, most of which never happen again.  This quarter I would like to see many folks get involved in discussions and determine:

Thanks for reading, I look forward to hearing from many who have ideas on this subject.  Stay tuned for an upcoming blog post about re-trigging intermittent failures to find the root cause.

May 10, 2015 09:49 PM

May 08, 2015

Armen Zambrano G. (@armenzg)

"Thank you!"

This week I had a co-worker thank me again for a project I worked on at the end of 2014.
It touched my heart to hear it so I encourage you to do the same if someone has enabled you recently.

I have failed again and again to use the power of these two words, and I hope to use them more often.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

May 08, 2015 10:20 PM

mozci 0.5.0 released - Store password in keyring, prevent corrupted data, progress bar and many small improvements

In this release we have many small improvements that help with issues we have found.

The main improvement is that we no longer store credentials in plain text (sorry!) but use keyring to store them encrypted.

We also prevent partial downloads (corrupted data) and added a progress bar to downloads.

Congrats to @chmanchester as our latest contributor!
Our usual and very appreciated contributions are by @adusca, @jmaher and @vaibhavmagarwal

Minor improvements:
  • Lots of test changes and increased coverage
  • Do not use the root logger but a mozci logger
  • Allow passing custom files to a triggered job
  • Work around buildbot status corruptions (Issue 167)
  • Allow passing buildernames with lower case and removing trailing spaces (since we sometimes copy/paste from TH)
  • Added support to build a buildername based on trychooser syntax
  • Allow passing extra properties when scheduling a job on Buildbot
You can see all changes in here:

Link to official release notes.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

May 08, 2015 10:15 PM

May 05, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

May 05, 2015 05:52 AM

April 28, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

April 28, 2015 06:21 AM

April 27, 2015

Armen Zambrano G. (@armenzg)

mozci hackday - Friday May 1st, 2015

I recently blogged about mozci, and I was pleasantly surprised that people are curious about it.

I want to spend Friday fixing some issues in the tool, and I wonder if you would like to join me to learn more about it and help me fix some of them.

I will be available as armenzg_mozci from 9am to 5pm EDT on IRC (#ateam channel).
I'm happy to jump on Vidyo to give you a hand understanding mozci.

I hand-picked some issues that I could use a hand with.
Documentation and definition of the project in readthedocs.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 27, 2015 05:30 PM

April 24, 2015

Armen Zambrano G. (@armenzg)

What Mozilla CI tools is and what it can do for you (aka mozci)

Mozci (Mozilla CI tools) is a Python library, set of scripts and package which allows you to trigger jobs on
Only jobs that run on Release Engineering's Buildbot setup can be triggered. Most (if not all) Firefox desktop and Firefox for Android jobs can be triggered, and I believe some B2G jobs still can be as well.

NOTE: Most B2G jobs are not supported yet since they run on TaskCluster. Support for them will be added this quarter.

Using it

Once you check out the code:
git clone
python develop
you can run scripts like this one (click here for other scripts):
python scripts/ \
  --buildername "Rev5 MacOSX Yosemite 10.10 fx-team talos dromaeojs" \
  --rev e16054134e12 --times 10
which would trigger a specific job 10 times.

NOTE: This works whether or not a build job already exists to trigger the test job; mozci will trigger everything required to get you what you need.

One of the many other options: if you want to trigger the same job for the last X revisions, use --back-revisions X.

There are many use cases and options listed in here.

A use case for developers

One use case which could be useful to developers (thanks @mike_conley!) is if you pushed to try and used this try syntax: "try: -b o -p win32 -u mochitests -t none". Unfortunately, you later determine that you really need this one: "try: -b o -p linux64,macosx64,win32 -u reftest,mochitests -t none".

In normal circumstances you would push to the try server again; however, with mozci (once someone implements this), we could simply pass the new syntax to a script (or to ./mach) and trigger everything you need, rather than having to push again and waste resources and your time!
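The retrigger-instead-of-repush idea boils down to diffing the two try syntax strings and scheduling only the missing combinations; a toy sketch (the parser is deliberately simplified to -p/-u, and the actual scheduling call is omitted since it would depend on mozci's API):

```python
def parse_try(syntax):
    """Very simplified try-syntax parser: only -p (platforms) and -u (suites)."""
    tokens = syntax.replace("try:", "").split()
    opts = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-p", "-u"):
            opts[flag] = set(value.split(","))
    return opts

def missing_jobs(old, new):
    """(platform, suite) combinations the new syntax needs that the old one lacks."""
    old_jobs = {(p, u) for p in old["-p"] for u in old["-u"]}
    new_jobs = {(p, u) for p in new["-p"] for u in new["-u"]}
    return new_jobs - old_jobs

old = parse_try("try: -b o -p win32 -u mochitests -t none")
new = parse_try("try: -b o -p linux64,macosx64,win32 -u reftest,mochitests -t none")
for platform, suite in sorted(missing_jobs(old, new)):
    print(platform, suite)
```

Only the five missing platform/suite pairs would be triggered; the win32 mochitests job from the first push is reused.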

If you have other use cases, please file an issue in here.

If you want to read about the definition of the project, vision, use cases or FAQ please visit the documentation.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 24, 2015 08:23 PM

Syd Polk


In order to run our multi-machine tests, we are going to need some kind of process runner which can execute remote-control commands on the test machines. Our solution uses a technology called Steeplechase, which has a simple command language. It uses another technology, a signalling server. That in turn relies on node.js, which allows a server to run JavaScript without having to have a browser.

So, I first created another Linux VM to run these pieces. It won't need as many cores or as much RAM as the Jenkins machine, but it will potentially need more disk space for logs and binaries. So, the specs on this machine:
Next, I installed the signalling server and its dependencies. I ran the following:

sudo apt-get install git nodejs
npm install

Now, I ran this to get the signalling server source:

git clone

Finally, I started it up:

cd simplesignalling
nodejs ./server.js

Now, for steeplechase, we need the following dependencies:

sudo apt-get install python-setuptools

Clone the steeplechase repo:
git clone

We need to bootstrap the python environment for steeplechase.
cd steeplechase
sudo python install

The machine is now set up and ready to run steeplechase, but without test clients there is nothing for steeplechase to talk to. Next installment: setting up the test clients.

April 24, 2015 07:31 PM


So, we are hiring in my group. Here is the link to the job posting:

We are looking for people with automation skills. What does that mean, you might ask? Well, here is what I think.

First of all, automation engineers are developers. They have good skills at developing automation software. At Mozilla, that means they know Python, JavaScript, C, or C++. They know how to code and debug.

However, they are also QA engineers. They have a desire to break software to make it better. They have a desire to have some level of comfort in measurable quality. They are the kind of person who breaks websites trying to do normal things like buy a shirt or log in to their bank. And they always want to know why it breaks, and may try to figure it out because they are angry, they are curious, or they enjoy the thrill of a problem solved.

These kinds of people are very hard to find. Most people with coding skills want to work on some other kind of software than automation. Most really good testers don't necessarily have the technical skills to write automation.

I am plowing through a bunch of resumes this afternoon. Most of them are from candidates who submit their resume to every position they can find, regardless of qualifications. While it is true that there are many fewer jobs in tech than there are people who want them, it is also true that the number of qualified candidates is very small. There is no such thing as a candidate who matches all of the job requirements perfectly. My philosophy is to look for somebody who is smart, has a track record of success, knows some of your required skills, has knowledge of similar skills, and has a track record of learning. Add on top of that a requirement for social skills, and you have a good candidate.

It's a large job standing in front of the firehose and screening resumes. It's fun, but at the end of the day, I am glad it is my boss's job and I am only helping out.

April 24, 2015 07:31 PM

Resume screening burnout

While taking care of getting cars fixed in anticipation of summer road trips, I spent two days screening resumes. I am burned out!

So, going back to building my lab tomorrow.

April 24, 2015 07:29 PM

Back to lab building

Doing two things at once:

- Starting to play with an ESX server a coworker set up.
- Continuing building out a home lab.

So, ESX requires Virtual Center to actually build machines. You download Virtual Center with your browser and install it.

Oh, yeah. It requires Windows. Sigh.

So I decided to make a VM for it. Making a Windows 8 VM is a pain because you have to do registry tricks so as not to be presented with swipe panels you can't dismiss. So I made a Windows 7 VM instead. Straightforward. Except that now I am waiting for 145 Windows updates to download....

Meanwhile, back to building my lab. I need to build clients now. Time for two more Ubuntu VMs. I'm running out of memory, so these will be 2-core/2 GB machines (we'll try that).

And, of course, Linux has its own set of updates....

And the Macs are now wanting to update...

Lots of waiting at times when building labs. More later.

April 24, 2015 07:29 PM

Networking home lab setup

Things work much better when you have static IP addresses. This means that your IP addresses won't change when you restart your VMs, so any command lines you use will remain valid. VMware Fusion does not make assigning IP addresses easy, but it can certainly be done.

When you created your VMs, Fusion assigned them generated MAC addresses. You need to retrieve those here:

Virtual Machine -> Network Adapter -> Network Adapter Settings... Turn down the "Advanced Options" disclosure triangle...

Once you have the MAC addresses for your VMs, you can change your config. One source I drew heavily on is here; you have to restart your network services if VMware is running. And you have to be careful; Fusion loves to blow away your changes.

On my machine, here is the mapping:

# Configuration file for ISC 2.0 vmnet-dhcpd operating on vmnet8.
# This file was automatically generated by the VMware configuration program.
# See Instructions below if you want to modify it.
# We set domain-name-servers to make some DHCP clients happy
# (dhclient as configured in SuSE, TurboLinux, etc.).
# We also supply a domain name to make pump (Red Hat 6.x) happy.

###### VMNET DHCP Configuration. Start of "DO NOT MODIFY SECTION" #####
# Modification Instructions: This section of the configuration file contains
# information generated by the configuration program. Do not modify this
# section.
# You are free to modify everything else. Also, this section must start
# on a new line
# This file will get backed up with a different name in the same directory
# if this section is edited and you try to configure DHCP again.

# Written at: 05/14/2014 16:13:27
allow unknown-clients;
default-lease-time 1800; # default is 30 minutes
max-lease-time 7200; # default is 2 hours

subnet netmask {
option broadcast-address;
option domain-name-servers;
option domain-name localdomain;
default-lease-time 1800; # default is 30 minutes
max-lease-time 7200; # default is 2 hours
option netbios-name-servers;
option routers;
}

host vmnet8 {
hardware ethernet 00:50:56:C0:00:08;
option domain-name-servers;
option domain-name "";
option routers;
}
####### VMNET DHCP Configuration. End of "DO NOT MODIFY SECTION" #######

host jenkins {
hardware ethernet 00:0c:29:ff:39:df;
}

host steeplechase {
hardware ethernet 00:0c:29:5a:8e:75;
}

host linux64-negatus-01 {
hardware ethernet 00:0c:29:b7:fe:99;
}

host linux64-negatus-02 {
hardware ethernet 00:0c:29:69:b0:0f;
}

Restart the vmware network.

sudo /Applications/VMware\ --configure
sudo /Applications/VMware\ --stop
sudo /Applications/VMware\ --start

This article also discusses this issue.

And we can see that all four machines are up and running by pinging them from Terminal on the same machine (they are not visible outside of the Mac):

sydpolkzillambp:~ spolk$ ping -c 1
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=0.355 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.355/0.355/0.355/0.000 ms
sydpolkzillambp:~ spolk$ ping -c 1
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=0.263 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.263/0.263/0.263/0.000 ms
sydpolkzillambp:~ spolk$ ping -c 1
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=0.338 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.338/0.338/0.338/0.000 ms
sydpolkzillambp:~ spolk$ ping -c 1
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=64 time=0.331 ms

--- ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.331/0.331/0.331/0.000 ms
sydpolkzillambp:~ spolk$

OK, so next we'll put it all together.

April 24, 2015 07:28 PM

Virtual Center

The ESX server I am installing onto does not have any OS IOS images to use to make VMs. So I need to have them locally. However, I'm in Austin, and the ESX server is in Mt. View. This is not going to work. Somebody at Moz HQ is going to get up a machine I can login via Remote Desktop Connection so I can continue.

April 24, 2015 07:28 PM

Home lab

Now, it's time to try everything out. The steeplechase machine needs to have firefox binaries and test artifacts. So, let's go get them.

The nightly build artifacts are here. On the steeplechase machine, we need to download both firefox-33.0a1.en-US.linux-x86_64.tar.bz2 and We will then unpack them appropriately:

mkdir firefox-releases
cd firefox-releases
mv ~/Downloads/firefox* .
tar xvfj firefox*.tar.bz2
mkdir tests
cd tests
unzip ../firefox*.zip

We have to have node running on this machine:

mozilla@jenkins-steeplechase:~$ cd simplesignalling/
mozilla@jenkins-steeplechase:~/simplesignalling$ ls
package.json server.js
mozilla@jenkins-steeplechase:~/simplesignalling$ nodejs server.js

We need to start the agent on the two Negatus machines:

mozilla@ubuntu:~/src$ cd Negatus/
mozilla@ubuntu:~/src/Negatus$ git pull
Already up-to-date.
mozilla@ubuntu:~/src/Negatus$ ./agent
Command handler listening on
Heartbeat handler listening on
Query url: IPADDR=
No SUTAgent.ini data.
No reboot callback data.

Mmm. It looks like I forgot a step. Running server.js should have output something. Looking back on our internal notes, I needed to run this:
npm install

If you do that from the simplesignalling directory, it will fail with something like:
npm ERR! Error: Invalid version: "0.1"
npm ERR! at Object.module.exports.fixVersionField (/usr/lib/nodejs/normalize-package-data/lib/fixer.js:178:13)
npm ERR! at /usr/lib/nodejs/normalize-package-data/lib/normalize.js:29:38
npm ERR! at Array.forEach (native)
npm ERR! at normalize (/usr/lib/nodejs/normalize-package-data/lib/normalize.js:28:15)

Once you install this correctly, then server.js will output something correctly:
mozilla@jenkins-steeplechase:~/simplesignalling$ nodejs server.js 
info - started

Now, we are ready to try to run steeplechase.
mozilla@jenkins-steeplechase:~/steeplechase$ python `pwd`/steeplechase/ --binary /home/mozilla/firefox-releases/firefox/firefox --specialpowers-path /home/mozilla/firefox-releases/tests/steeplechase/specialpowers --prefs-file /home/mozilla/firefox-releases/tests/steeplechase/prefs_general.js --signalling-server '' --html-manifest /home/mozilla/firefox-releases/tests/steeplechase/tests/steeplechase.ini --host1 --host2
steeplechase INFO | Pushing app to Client 1...
steeplechase INFO | Pushing app to Client 2...
Writing profile for Client 1...
Pushing profile to Client 1...
cmd: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', '']
Writing profile for Client 2...
Pushing profile to Client 2...
cmd: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', '']
steeplechase INFO | Waiting for results...
steeplechase INFO | All clients finished
steeplechase INFO | Result summary:
steeplechase INFO | Passed: 112
steeplechase INFO | Failed: 0

It worked! We have a running lab on Linux now.

Next step: Get Jenkins to invoke this.

April 24, 2015 07:28 PM

ESX, remotely

OK, somebody gave me VNC access to a Linux box in the office over VPN, so I can download ISOs. MSDN's site has a really, really bad captcha, using a password manager was a pain, and it required two-factor auth, but I'm safe now, right?

Anyway, I can now download ISOs onto a machine on the same network as the ESX box. I need to figure out how to get ESX to mount the newly created NFS export where the ISO lives.

OK, I figured it out. You create the VM in vSphere Client. Let the boot fail. Then connect the .iso stored on your data store to the DVD drive. Finally, press Ctrl-Alt-Ins, and it should boot from your ISO.

More later.

April 24, 2015 07:28 PM

Bootstrapping a data center remotely

Using a VM running vSphere connecting to an ESX server 2000 miles away over VPN, and using VNC to connect to a known Linux box on the same network as the ESX box, I was able to download enough ISOs to create a Windows 7 VM. The idea is to put vSphere Client on it, and use Microsoft Remote Desktop to connect to it. I can then download ISOs to its hard drive and create VMs with "ISO on local disk" option.

Had to download MS RDC. It's free now. This is great.

I am also starting to build up the first Linux box. I used VNC on the other machine to download the ISO of Ubuntu 14.10 Desktop, and used the vSphere Client to install the machine. Of course, I had to use the vSphere Client to access the desktop. Doing this from RDC is painful; typing often results in duplicated characters. So I installed openssh-server:

sudo apt-get install openssh-server
sudo /etc/init.d/ssh restart

And then I could ssh in from a Terminal on my machine. Ah, much better.

I still want desktop access, though. I enabled Desktop Sharing on the Linux VM. Alas, the Mac Screen Sharing could not connect to it, although it could connect to an Ubuntu 12 machine. My coworker pointed me to this article, and now the Linux machine is good to go.

April 24, 2015 07:28 PM

Running the tests

One of my coworkers is going to post a public document on how to get steeplechase and Negatus running. Once he posts that, I will repost the link here. Basically, it looks like this:

- One machine needs to run simplesignalling. This is a nodejs-based server that facilitates Firefox communication.
- A machine needs to run steeplechase. This can be the same machine that runs simplesignalling, but it doesn't have to be.
- Each of the client machines runs Negatus, which is a test agent.

The steeplechase machine needs to download the firefox binaries and tests from The binaries and test files have to be de-archived. Steeplechase can then be run:

% tar xvfj firefox-33.0a1.en-US.linux-x86_64.tar.bz2 
% mkdir tests
% cd tests/
% unzip ../
% mkdir ~/logs
% python ~/src/steeplechase/steeplechase/ --binary /home/mozilla/firefox-releases/firefox/firefox --specialpowers-path /home/mozilla/firefox-releases/tests/steeplechase/specialpowers --prefs-file /home/mozilla/firefox-releases/tests/steeplechase/prefs_general.js --signalling-server '' --html-manifest /home/mozilla/firefox-releases/tests/steeplechase/tests/steeplechase.ini --save-logs-to ~/logs/ --host1 --host2
steeplechase INFO | Pushing app to Client 1...
steeplechase INFO | Pushing app to Client 2...
Writing profile for Client 1...
Pushing profile to Client 1...
cmd: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', '']
Writing profile for Client 2...
Pushing profile to Client 2...
cmd: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', '']
steeplechase INFO | Waiting for results...
steeplechase INFO | All clients finished
steeplechase INFO | Result summary:
steeplechase INFO | Passed: 118
steeplechase INFO | Failed: 0

I now have this working on both my lab at home and the ESX lab in Mountain View. Next: making it work autonomously.

April 24, 2015 07:27 PM

OK, time to step back

Occasionally, you reach a little milestone, and it helps to make a to-do list. So, here is what is left to have this lab up and running, now that I have 3 machines that can run tests.

1. Get Negatus to run on boot for the two client machines.
2. On my home lab, develop the Jenkins scripts and job templates that will do the work.
3. Install the Jenkins instance in the ESX lab.
4. Port the Jenkins work from the home lab to the ESX lab.
5. Start adding platforms.

That's a good general list.

Tomorrow is a test day, so I probably won't be doing much with this stuff. Let's see what I can get done today.

April 24, 2015 07:27 PM

New task

I was given a new area to investigate on top of building out a lab. Basically, we have some media streaming tests in our tree, in <mozilla-central>/content/media/test, and they are flaky. I have been asked to investigate a couple of things, using test_seek.html as an example:

- The tests actually get run in parallel. I have been asked to see if running them singly affects the intermittent failure rate.
- I have been asked to split this file up. It currently calls 13 sub-files. I have already generated the first of those 13 files and it works fine.

All of this requires checking out the Firefox source (instructions here), and then running Mochitest (instructions here). I can run the individual test.

If you do add a test file in a directory such as <mozilla-central>/content/media/test, you have to add it to the mochitest.ini file in that directory for it to be picked up by the system.
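That registration step is mechanical enough to script; a sketch using configparser (the manifest excerpt is made up, and real mochitest.ini files carry more keys, but the idea is just adding one section per test file):

```python
import configparser
import io

# Hypothetical excerpt of content/media/test/mochitest.ini.
manifest = configparser.ConfigParser(allow_no_value=True)
manifest.read_string("""\
[DEFAULT]
support-files = manifest.js seek.webm

[test_seek.html]
""")

# Register a newly split-out sub-test so the harness picks it up.
manifest.add_section("test_seek-1.html")

out = io.StringIO()
manifest.write(out)
print(out.getvalue())
```

Forgetting this step means the new file simply never runs, with no error to tell you why.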

I have two build trees (my Mercurial foo is low, and I am lazy), one to generate patches for the split out of the tests, and one to generate patches for our try system running only one test at a time.

Alas, I am not at my house; I am with family in another state, and the internet is slow here. It will be upgraded soon, but this is taking a while...

April 24, 2015 07:27 PM

[LAB] Setting up for jenkins

Packages required on Jenkins machine:

Packages required on Steeplechase machine:
  • openjdk-7-jre-headless
  • curl
Jenkins plugins:
  • Git plugin
  • Mercurial plugin
I am afraid I finished the buildout of this without good notes. Sorry!

April 24, 2015 07:27 PM

Three things I am working on

The first one is this Bugzilla bug. Basically, the audio/video tests sometimes time out in Mozilla's build/test environment, and we are trying to track down which tests are sensitive. We run a lot of tests in Amazon's EC2 cloud, and disk and network access are not predictable in that environment.

My boss and I have submitted a few patches to unparallelize some of the tests to see if it helps. The latest patch was run by our environment in tbpl, here. I need to run quite a few more tests and analyze them today. One thing I had to learn was Mercurial queues; using them is the best way to work with patches for Firefox and Bugzilla. More info here and here. You have to be really careful with them, as it is easy to blow away work. Still, it's a really nice system for managing patches. In git, you would do the same with local branches, but it's not quite as easy.

The second is a test timeout running mochitest on a subdirectory on a Mac OS X VM. Mochitest is the oldest Firefox test suite. More info on it can be found here. I chose to run it from a build tree, so I had to build Firefox on the Mac. Info on that here. I then ran the following:

./mach mochitest-plain content/media/test

And then watched the magic! Note that this is the same test suite as the one involved in the first problem I am working on. Sometimes I get a test failure on my VM that nobody else seems to be running into, so I am trying to gather enough data to file a bug. A coworker suggested setting MINIDUMP_STACKWALK before running the test. Once I dug enough for somebody to tell me that this tool was part of another Mercurial repo, I tried it. No dice. Need to submit that bug report today.

And last, I am having trouble getting steeplechase to run in my Jenkins lab. The latest:

cmd: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', '']
Traceback (most recent call last):
File "/home/mozilla/jenkins/workspace/linux64-linux64/steeplechase/steeplechase/", line 311, in
sys.exit(0 if main(sys.argv[1:]) else 1)
File "/home/mozilla/jenkins/workspace/linux64-linux64/steeplechase/steeplechase/", line 301, in main
html_pass_count, html_fail_count =
File "/home/mozilla/jenkins/workspace/linux64-linux64/steeplechase/steeplechase/", line 187, in run
passes, failures = result
TypeError: 'NoneType' object is not iterable
Exception in thread Client 1:
Traceback (most recent call last):
File "/usr/lib/python2.7/", line 810, in __bootstrap_inner
File "/home/mozilla/jenkins/workspace/linux64-linux64/steeplechase/steeplechase/", line 100, in run
output = dm.shellCheckOutput(cmd, env=env)
File "/usr/local/lib/python2.7/dist-packages/mozdevice-0.37-py2.7.egg/mozdevice/", line 395, in shellCheckOutput
raise DMError("Non-zero return code for command: %s (output: '%s', retval: '%s')" % (cmd, output, retval))
DMError: Non-zero return code for command: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', ''] (output: 'r', retval: '256')

Exception in thread Client 2:
Traceback (most recent call last):
File "/usr/lib/python2.7/", line 810, in __bootstrap_inner
File "/home/mozilla/jenkins/workspace/linux64-linux64/steeplechase/steeplechase/", line 100, in run
output = dm.shellCheckOutput(cmd, env=env)
File "/usr/local/lib/python2.7/dist-packages/mozdevice-0.37-py2.7.egg/mozdevice/", line 395, in shellCheckOutput
raise DMError("Non-zero return code for command: %s (output: '%s', retval: '%s')" % (cmd, output, retval))
DMError: Non-zero return code for command: ['/tmp/tests/steeplechase/app/firefox', '-no-remote', '-profile', '/tmp/tests/steeplechase/profile', ''] (output: 'r', retval: '256')

Need to see what that is about...
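For what it's worth, the TypeError in the first traceback is an ordinary unpacking failure: `passes, failures = result` raises exactly that error when the value being unpacked is None, i.e. when the run returned nothing because the clients died. A minimal reproduction (steeplechase itself is not needed):

```python
# Reproduce the TypeError from the steeplechase traceback: unpacking None
# into two names fails, masking the real failure (the dead client).
def run_clients():
    # stand-in for steeplechase's run(); returns None when a client dies
    return None

result = run_clients()
try:
    passes, failures = result  # the line that raised in the traceback
except TypeError as err:
    print("unpack failed:", err)
```

So the TypeError is a symptom; the DMError from shellCheckOutput (retval '256') is the actual failure to chase.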

April 24, 2015 07:26 PM

Three things I am working on - follow-up

Thing 1 - Bugzilla 1036992 - I have a patch now which completes the split up of test_seek.html, including refactoring of common javascript in test_seek-split*.html files. Figuring out how to add the javascript so that it was actually available was fun. Basically, everything goes in mochitest.ini. Who knew?

Thing 2 - mochitest dying on my VM. Well, it still happens, and it often does not leave a stack dump. Still working on this one.

Thing 3 - Jenkins and nodejs. I restarted everything and it started working. Huh. Weird.

So, now I am setting up Jenkins in Mountain View to reproduce the result. Unfortunately, when the physical box was moved into the lab, the networking broke. All of the existing VMs have IPv6 addresses rather than the static addresses I thought were assigned.

Coworker in charge of box said he would look at that.

So, now I am building linux32 builders for this test on my home VM. I am working out of one of my relative's houses in the Midwest right now, and her internet connection was SLOW. Well, the upgrade came through yesterday, and now it is as fast as my home network in Texas.

It helps.

April 24, 2015 07:26 PM


You actually have to do work to run 32-bit binaries on 64-bit linux. I knew this, but I rediscovered this fact this morning.

Of course, this got much harder in modern Ubuntu:

sudo apt-get install libxtst6:i386 libxext6:i386 libxi6:i386 libncurses5:i386 libxt6:i386 libxpm4:i386 libxmu6:i386 libxp6:i386
sudo apt-get install libstdc++6-4.8-dbg:i386


April 24, 2015 07:26 PM

OK, it's just too hard to nail down all of the libraries to run firefox32 on linux64. I give up. Average users won't do this.

April 24, 2015 07:26 PM

Another fun fact: For Negatus to be able to run firefox for the tests, it has to run it in a display. Without setting up a fake X session, the Negatus client has to be logged in, at least on Linux. I am sure that this will be true on other platforms as well. For Linux, we could run in a virtual frame buffer, but I am not sure that this is necessary. Just set up the account the test will run in to auto login. Make sure it is on a private network, though...
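If the logged-in session turns out to be too fragile, the standard Linux workaround is a virtual framebuffer. Here is a sketch of what that would look like, shown dry-run style without executing anything (the display number, package availability, and the firefox/profile paths are assumptions reused from the traceback above):

```shell
# Sketch: run the steeplechase-launched firefox under Xvfb instead of a
# logged-in X session. The commands are printed, not executed.
FIREFOX=/tmp/tests/steeplechase/app/firefox
PROFILE=/tmp/tests/steeplechase/profile
echo "Xvfb :99 -screen 0 1280x1024x24 &"
echo "DISPLAY=:99 $FIREFOX -no-remote -profile $PROFILE"
```

The auto-login approach avoids all of this, which is why it may still be the simpler choice.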

April 24, 2015 07:25 PM


I have a Jenkins instance running the regression suite on Firefox nightly builds in all platform combinations of linux64, linux32, and macosx. The front page looks like this:

So, I have proof-of-concept. I am demoing this at our QA Work Week QA Fair later today. Should be fun!

Lots of work to do, however.

  1. I need to investigate using parameterized builds. Having to create and maintain a separate Jenkins job for each combination is painful (I have a lot of experience with this from Coverity, alas).
  2. Henrik Skupin has a Jenkins instance, and he has solved the problem of the nightly version numbers changing every month. Need to implement that.
  3. Need to add jobs for nightly versions connecting to Aurora, Beta, Release and ERS versions of Firefox.
  4. The test is not very meaningful yet. I need to take the test additions done by Geo Mealer in our Sunny Day Environment and run the connections for 1 minute each. I really want to do the parameterization first so I don't have to update dozens of jobs.
  5. I need to put this in our ESX farm. However, I only want to do this when most of the above is done. I am going to set up Jenkins on the ESX farm, running linux64 regressions for now.
  6. I have got to get Jenkins and Steeplechase working for Windows. While I am in Mountain View this week, I do not have access to my Windows VMs. Will have to wait until I get home next week.
  7. Sunny Day Environment does most of its job maintenance and triggering via cron jobs. It would be nice to move this into Jenkins.
  8. Need to report results of test runs into Treeherder. This is a pretty big effort.
  9. And then, there is B2G and Android.
  10. Steeplechase changes
    1. Run with existing binaries but new profile. (or whatever; independent control)
    2. Send binary archives down instead of directories, and tell clients how to unpack them.
    3. Steeplechase should talk to treeherder.
Ought to keep me busy.

April 24, 2015 07:25 PM

Which Firefox build to download?

So, eventually this system is going to have to download all Firefox releases and test them. All of the releases are available here:

So, there are a LOT of releases on this server. So which ones do we want? Mozilla has five (or more) active release trains at any one point:

There is a utility called mozdownload which is here. Once you build and install it, mozdownload is a big help. It deals with changing file names and dates and the like.

So where is all of this stuff on And what is the mozdownload command-line for it?
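The per-train table didn't survive syndication, but the command lines have roughly these shapes (printed dry-run style so nothing is downloaded here; the exact flag spellings are from memory of mozdownload's CLI and should be double-checked against mozdownload --help):

```shell
# Hypothetical mozdownload invocations, one per release train; echoed
# rather than run. Verify flags against your installed mozdownload.
NIGHTLY="mozdownload --type=daily --branch=mozilla-central --platform=linux64"
RELEASE="mozdownload --type=release --version=latest --platform=linux64"
CANDIDATE="mozdownload --type=candidate --version=38.0 --platform=linux64"
echo "$NIGHTLY"
echo "$RELEASE"
echo "$CANDIDATE"
```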

I hope that this helps. It sure helps me.

April 24, 2015 07:25 PM

Back to Jenkins fun

So, now, I know how to download binaries and get the correct versions independent of what their names actually are.

Now, I need my jobs to trigger each other. The scheme I have is:

  1. Run download. If the binary is newer, overwrite the canonically named version, i.e., firefox-latest-nightly.en-US.linux-x86_64.tar.bz2.
  2. Another job triggers when firefox-latest-nightly.en-US.linux-x86_64.tar.bz2 changes. It runs on the Steeplechase machine, and it expands the archive into a known directory location.
  3. It then triggers jobs for all of the steeplechase runs that depend on it.
At Coverity I used the URLSCM plugin to do the triggering. Basically it used the SCM polling mechanism built into Jenkins to see if the local copy of a file is newer than the version at a URL. The problem is, this mechanism broke a few years ago, and to this day nobody has fixed the bug.

Today I found out about another Jenkins plugin, URLTrigger Plugin. This allows you to trigger builds off a variety of things, but most germane to me is that it can trigger if md5 checksums are different. I am trying this out overnight; we'll see what happens.
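The md5 idea is simple enough to sketch in a few lines of Python — compare the current checksum of the watched file against the one recorded on the previous poll, and fire only on change (this is an illustration of the concept, not URLTrigger's actual code):

```python
import hashlib
import os
import tempfile

def file_md5(path):
    """Checksum the watched artifact."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def should_trigger(path, state_file):
    """Fire when the artifact's md5 differs from the last recorded one."""
    current = file_md5(path)
    previous = None
    if os.path.exists(state_file):
        with open(state_file) as f:
            previous = f.read().strip()
    with open(state_file, "w") as f:
        f.write(current)
    return current != previous

# demo against a temp directory
workdir = tempfile.mkdtemp()
artifact = os.path.join(workdir, "firefox-latest-nightly.tar.bz2")
state = os.path.join(workdir, "last.md5")

with open(artifact, "wb") as f:
    f.write(b"build-1")
print(should_trigger(artifact, state))  # first poll: no recorded checksum, fires
print(should_trigger(artifact, state))  # unchanged artifact: no trigger
with open(artifact, "wb") as f:
    f.write(b"build-2")
print(should_trigger(artifact, state))  # new contents: fires again
```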

April 24, 2015 07:25 PM

Working Jenkins

Well, that was a lot of trial and error.

So, here is how the Jenkins setup is working. I wrote a script (available in this repo in my github account) that calls other scripts in the checkout directory. What it does:

- Calls mozdownload. It will download the latest nightly build, but the name does not stay the same from day-to-day.
- If this is the first time it is downloaded, it will copy the download payload to "firefox-latest-nightly.en-US.<platform>.<ext>", where <platform> and <ext> are appropriate to the platform we are caching.
- If there is one there, it uses the unix find command to find the name of the latest binary, and copies that one. It also finds binaries older than the cached version and removes them.
- On the Mac, this runs in a Mac builder, and this script will open the .dmg, copy the contents out, and repackage them into a .tar.bz2 file, since the steeplechase machine will be on Linux and doesn't easily know how to open a .dmg file.
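That Mac repack step might look something like the sketch below (the hdiutil half only makes sense on a Mac, so it appears as comments with flags from memory; the tar half runs anywhere, shown here against a placeholder directory):

```shell
# Sketch of the dmg -> tar.bz2 repack. Mac-only steps, as comments:
#   hdiutil attach firefox-latest-nightly.en-US.mac.dmg -mountpoint /tmp/ffdmg
#   cp -R /tmp/ffdmg/Firefox.app "$WORK/repack/"
#   hdiutil detach /tmp/ffdmg
# The repack into tar.bz2, runnable anywhere, with a stand-in app bundle:
WORK=$(mktemp -d)
mkdir -p "$WORK/repack/Firefox.app"
echo "placeholder" > "$WORK/repack/Firefox.app/contents"
tar -cjf "$WORK/firefox-latest-nightly.en-US.mac.tar.bz2" -C "$WORK/repack" Firefox.app
ls "$WORK"
```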

This leaves an artifact on the Jenkins master's filesystem. This is important later on.

We also have to have the payload (this is the Firefox 34 version) with the tests directory. We are using the linux64 version, because our tests do not require any platform-specific compiled assets from the package. Unfortunately, mozdownload does not know about the tests payload, and this url will have to be updated every time there is a version bump. Maybe I'll add that to mozdownload some day.

So that's great. We have firefox binaries, and test assets. Both of these need to be on the local filesystem of the steeplechase machine. So, how do we get them there?

My previous post talked about a couple of plugins designed to help track assets and when they changed:

For the tests download, I use the URL Trigger to detect changes and URLSCM to download it.

For the Firefox binaries, I was trying to use URL Trigger to track changes in the URL of the Last Successful builds, but they were never triggering. Instead, I use the FSTrigger to detect when files change on the Jenkins master itself.

So there is the sequence (using linux64 as an example):

  1. Once every 24 hours, firefox-nightly-linux64 fires. It runs the download script, which runs mozdownload to get the linux64 binary. If there is a new binary, the new firefox-latest-nightly.en-US.tar.bz2 file is archived.
  2. trigger-firefox-nightly-linux64 (running on the Jenkins master) notices the new file, and immediately triggers expand-firefox-nightly-linux64.
  3. On the steeplechase machine, expand-firefox-nightly-linux64:
    1. Copies the firefox-latest-nightly-en-US.tar.bz2 file from the Jenkins master to the local filesystem.
    2. Expands the payload to the /home/mozilla/firefoxes/nightly/linux64 directory.
    3. Triggers all of the steeplechase jobs based on linux64 to be run.
  4. A steeplechase job will run, passing the correct binaries and test files to the test machines.
I have also added the SCM Sync plugin to the Jenkins instance so hopefully I won't have to create all of these jobs on the ESX machine from scratch (although I will have to edit them).

April 24, 2015 06:27 PM

How to build Platform QA Jenkins

So, I never have done a step-by-step to build the Jenkins instance that I have built. Knowing how to do it yourself might be important if you want to test jobs running in Jenkins.

First of all, decide on the host OS for Jenkins. I chose Ubuntu 14 for several reasons:

Setting up the master VM

  • Install Ubuntu on a VM host, like VirtualBox or VMWare.
    • Give the VM at least 2 cores, 80 GB of disk space, and 4 GB of memory.
  • Follow the instructions here:
  • Install the following Plugins. First follow the following steps:
    • Navigate to your jenkins instance.
    • Click on "Manage Jenkins"
    • Click on "Manage Plugins". (This is the url http://localhost:8080/pluginManager on your VM)
    • Click on "Available". Install the following:
      • File System SCM
      • Filesystem Trigger Plug-in
      • GIT client plugin
      • GIT Parameter Plug-in
      • GIT plugin
      • GitHub API Plugin
      • Mercurial plugin
      • Multiple SCMs plugin
      • SSH Agent Plugin
      • SSH Credentials Plugin
      • SSH plugin
      • SSH Slaves plugin
      • URLTrigger Plug-in
      • Windows Slaves Plugin
      • Workspace Cleanup Plugin
    • Restart Jenkins.
Next Installment: Add a builder

April 24, 2015 06:27 PM

Armen Zambrano G. (@armenzg)

Firefox UI update testing

We currently trigger UI update tests manually for Firefox releases. There are automated headless update verification tests but they don't test the UI of Firefox.

The goal is to integrate this UI update testing as part of the Firefox releases.
This will require changes to firefox-ui-tests, buildbot scheduling changes, Marionette changes and other Mozbase packages. The ultimate goal is to speed up our turn around on releases.

The update testing code was recently ported from Mozmill to use Marionette to drive the testing.

I've already written some documentation on how to run the update verification using Release Engineering configuration files. You can use my tools repository until the code lands (update_testing is the branch to be used).

My deliverable is to ensure that the update testing works reliably on Release Engineering infrastructure and there is existing scheduling code for it.

You can read more about this project in bug 1148546.

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 24, 2015 02:42 PM

April 23, 2015

Geoff Brown

Android 4.3 Opt tests running on trunk trees

Beginning today, “Android 4.3 API11+ opt” unit tests are running on treeherder on all trunk trees. These tests run against our standard Firefox for Android API11+ arm builds, but run in an Android arm emulator running Android 4.3. Just like the existing Android 2.3 tests, the emulator for 4.3 runs on an aws instance.

The emulator environment has these characteristics:

This Android 4.3 emulator environment is very much like the existing “Android 2.3 API9 opt” environment. Broadly, tests seem to run in about the same amount of time on 4.3 as on 2.3 and we see some of the same failures on 4.3 as on 2.3. One significant difference between the 4.3 and 2.3 environments is the “device manager” used to communicate between the test harnesses and the device. On Android 2.3, sutagent is installed on the device and a custom tcp protocol is used to push/pull files, start processes, etc; on Android 4.3, sutagent is not used at all (it doesn’t play well with SELinux security) and adb is used instead.
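The practical difference shows up in how the harness talks to the device. Here is a sketch of the adb-style calls (the helper below is hypothetical, for illustration only — it is not mozdevice's real API, and no device is contacted):

```python
def adb_cmd(serial, *args):
    """Build an adb command line targeting one emulator/device."""
    return ["adb", "-s", serial, *args]

# the kinds of operations sutagent used to handle over its custom tcp protocol
push = adb_cmd("emulator-5554", "push", "robocop.apk", "/data/local/tmp/")
shell = adb_cmd("emulator-5554", "shell", "am", "start", "-n",
                "org.mozilla.fennec/.App")
print(push)
print(shell)
```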

Android 4.3 API11+ opt tests are available on try and run as a consequence of:

try: -b o -p android-api-11 -u …

As Android 4.3 API11+ opt tests have been introduced, the corresponding “Android 4.0 API11+ opt” test jobs have been disabled. Android 4.0 tests were running on our aging Pandaboards; running in the emulator on aws is more scalable, cost-effective, and future-safe.

Android 4.0 API11+ debug tests continue to run, but we plan to migrate those to the 4.3 emulator soon.

A few Android 4.0 API11+ opt Talos tests continue to run. We are evaluating whether those can be replaced by similar Autophone tests.

As with the introduction of any new test platform, some tests failed on Android 4.3 and had to be disabled or marked as failing on 4.3. Corresponding bugs have been opened for these tests, and you can find them by searching Bugzilla for bugs with “[test disabled on android 4.3]” on the whiteboard.

Great thanks to everyone who has contributed to the 4.3 effort, but especially :kmoir for all the Release Engineering work to get everything running smoothly in continuous integration.

Looking for more details about the new environment? Have a look at:

Bug 1062365 Investigate Android 4.3/4.4 emulator test setup

Bug 1133833 Android 4.3 emulator tests

…or ask me!

April 23, 2015 10:27 PM

William Lachance

PyCon 2015

So I went to PyCon 2015. While I didn’t leave quite as inspired as I did in 2014 (when I discovered iPython), it was a great experience and I learned a ton. Once again, I was incredibly impressed with the organization of the conference and the diversity and quality of the speakers.

Since Mozilla was nice enough to sponsor my attendance, I figured I should do another round up of notable talks that I went to.

Technical stuff that was directly relevant to what I work on:

Non-technical stuff:

I probably missed out on a bunch of interesting things. If you also went to PyCon, please feel free to add links to your favorite talks in the comments!

April 23, 2015 02:55 PM

April 21, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

April 21, 2015 06:37 AM

April 20, 2015

Armen Zambrano G. (@armenzg)

How to install pywin32 on Windows

In Mozilla's Release Engineering Windows machines we have pywin32 installed.
You need this dependency if you're going to run older scripts that are part of the release process.
Unfortunately, at the moment, we can't get rid of this dependency and need to install it.

If you're not using Mozilla-build, you can easily install it with these steps:
NOTE: These are 32-bit binary installers. 64-bit binaries are also available.

In Mozilla we use Mozilla-build which brings most of the tools you need to build Firefox.
Python is included in it; however, pywin32 is currently not part of it (bug to fix this).

Since the process was a bit painful for me, I will take note of it for future reference.
I tried a few approaches until I figured out that we need to use easy_install instead of pip, and we need to point to an .exe file rather than a normal Python package.

Use easy_install

Here it is:
You will know that it worked if you can run this without any errors:
python -c "import win32api" 
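If you want the same check without a hard failure — say, in a setup script that should also run off Windows — a small sketch (my addition, not part of the original instructions):

```python
import importlib.util

# Detect pywin32 without crashing on platforms where it can't exist.
def have_pywin32():
    return importlib.util.find_spec("win32api") is not None

print("pywin32 present" if have_pywin32() else "pywin32 missing")
```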

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

April 20, 2015 03:59 PM

April 17, 2015

mozregression updates

Release 0.36

This release includes a bugfix for the download in background feature introduced in 0.35 (see bug 1153801).

Also, on Firefox for Android we now use a custom profile, meaning:

See bug 1147576 for more information.

April 17, 2015 12:00 AM

April 14, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

April 14, 2015 06:41 AM

April 13, 2015

Geoff Brown

mach support for mochitest-plain and mochitest-chrome on Android

We recently added mach commands for running mochitest-plain and mochitest-chrome tests on Android. For now, these commands only support the adb device manager (no way to run with sutagent) and there is minimal support for mochitest options available on desktop; see/comment on bug 1152944.

The old make targets continue to work, but should be considered deprecated.

See / for detailed instructions.

April 13, 2015 09:59 PM

April 07, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

April 07, 2015 07:17 AM

April 01, 2015

Byron Jones

happy bmo push day!

in order to fix a needinfo clearing issue with the new ui, the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

April 01, 2015 03:15 PM

Geoff Brown

Firefox for Android Performance Measures – February/March Check-up

I skipped my regular February post and I am thinking of writing up these performance summaries less frequently — maybe every 2 or 3 months. Any objections?


– Talos trobopan, tprovider, and ts tests have been retired.

– Big improvement in tsvgx.


This section tracks Perfomatic graphs from for mozilla-central builds of Firefox for Android, for Talos tests run on Android 4.0 O. The test names shown are those used on treeherder. See for background on Talos.


Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

18.7 (start of period) – 19.0 (end of period)


This test is no longer run.


This test is no longer run.


An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.


3500 (start of period) – 720 (end of period).

Big improvements Feb 26 and March 6. The March 6 improvement seems to have been caused by a Talos update.


Generic page load test. Lower values are better.

710 (start of period) – 680 (end of period).

Throbber Start / Throbber Stop

These graphs are taken from  Browser startup performance is measured on real phones (a variety of popular devices).

Time to throbber start seems steady this month.


Time to throbber stop has some small improvements and regressions.



These graphs are taken from the Eideticker dashboard. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at:






I recently discovered the mozbench dashboard, which includes some comparisons involving Firefox for Android. More info at





April 01, 2015 02:43 PM

March 31, 2015

Byron Jones’s new look

this quarter i’ve been working on redesigning how bugs are viewed and edited on — expect large changes to how bmo looks and feels!

unsurprisingly some of the oldest code in bugzilla is that which displays bugs; it has grown organically over time to cope with the many varying requirements of its users worldwide. while there have been ui improvements over time (such as the sandstone skin), we felt it was time to take a step back and start looking at bugzilla with a fresh set of eyes. we wanted something that was designed for mozilla’s workflow, that didn’t look like it was designed last century, and would provide us with a flexible base upon which we could build further improvements.

a core idea of the design is to load the bug initially in a read-only “view” mode, requiring the user to click on an “edit” button to make most changes. this enables us to defer loading of a lot of data when the page is initially loaded, as well as providing a much cleaner and less overwhelming view of bugs.


major interface changes include:

the view/edit mode:


you can use it today!

this new view has been deployed to, and you can enable it by setting the user preference “experimental user interface” to “on”.

you can also enable it per-bug by appending &format=modal to the url. once enabled you can disable it per-bug by appending &format=default to the url.

what next?

there’s still a lot to be done before there’s feature parity between the new modal view and the current show_bug.  some of the major items missing with the initial release include:

you can view the complete list of bugs, or file a new bug if you discover something broken or missing that hasn’t already been reported.

Filed under: bmo

March 31, 2015 06:32 AM

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

March 31, 2015 06:01 AM

March 29, 2015

Andrew Halberstadt

Making mercurial bookmarks more git-like

I mentioned in my previous post a mercurial extension I wrote for making bookmarks easier to manipulate. Since then it has undergone a large overhaul, and I believe it is now stable and intuitive enough to advertise a bit more widely.

Introducing bookbinder

When working with bookmarks (or anonymous heads) I often wanted to operate on the entire series of commits within the feature I was working on. I often found myself digging out revision numbers to find the first commit in a bookmark to do things like rebasing, grafting or diffing. This was annoying. I wanted bookmarks to work more like a git-style branch, that has a definite start as well as an end. And I wanted to be able to easily refer to the set of commits contained within. Enter bookbinder.

First, you can install bookbinder by cloning:

```bash
$ hg clone
```

Then add the following to your hgrc:

```ini
[extensions]
bookbinder = path/to/bookbinder
```

Usage is simple. Any command that accepts a revset with --rev, will be wrapped so that bookmark labels are replaced with the series of commits contained within the bookmark.

For example, let's say we create a bookmark to work on a feature called foo and make two commits:

```bash
$ hg log -f
changeset:   2:fcd3bdafbc88
bookmark:    foo
summary:     Modify foo

changeset:   1:8dec92fc1b1c
summary:     Implement foo

changeset:   0:165467d1f143
summary:     Initial commit
```

Without bookbinder, bookmarks are only labels to a commit:

```bash
$ hg log -r foo
changeset:   2:fcd3bdafbc88
bookmark:    foo
summary:     Modify foo
```

But with bookbinder, bookmarks become a logical series of related commits. They are more similar to git-style branches:

```bash
$ hg log -r foo
changeset:   2:fcd3bdafbc88
bookmark:    foo
summary:     Modify foo

changeset:   1:8dec92fc1b1c
summary:     Implement foo
```

Remember hg log is just one example. Bookbinder automatically detects and wraps all commands that have a --rev option and that can receive a series of commits. It even finds commands from arbitrary extensions that may be installed! Here are a few examples that I've found handy in addition to hg log:

```bash
$ hg rebase -r <bookmark> -d <dest>
$ hg diff -r <bookmark>
$ hg graft -r <bookmark>
$ hg grep -r <bookmark>
$ hg fold -r <bookmark>
$ hg prune -r <bookmark>
```

They all replace the single commit pointed to by the bookmark with the series of commits within the bookmark. But what if you actually only want the single commit pointed to by the bookmark label? Bookbinder uses '.' as an escape character, so using the example above:

```bash
$ hg log -r .foo
changeset:   2:fcd3bdafbc88
bookmark:    foo
summary:     Modify foo
```

Bookbinder will also detect if bookmarks are based on top of one another:

```bash
$ hg rebase -r my_bookmark_2 -d my_bookmark_1
```

Running hg log -r my_bookmark_2 will not print any of the commits contained by my_bookmark_1.

The gory details

But how does bookbinder know where one feature ends, and another begins? Bookbinder implements a new revset called "feature". The feature revset is roughly equivalent to the following alias (kudos to smacleod for coming up with it):

```ini
[revsetalias]
feature($1) = ($1 or (ancestors($1) and not (excludemarks($1) or ancestors(excludemarks($1))))) and not public() and not merge()
excludemarks($1) = ancestors(parents($1)) and bookmark()
```

Here is a formal definition. A commit C is "within" a feature branch ending at revision R if all of the following statements are true:

  1. C is R or C is an ancestor of R
  2. C is not public
  3. C is not a merge commit
  4. no bookmarks exist in [C, R) for C != R
  5. all commits in (C, R) are also within R for C != R

In easier to understand terms, this means all ancestors of a revision that aren't public, a merge commit or part of a different bookmark, are within that revision's 'feature'. One thing to be aware of, is that this definition allows empty bookmarks. For example, if you create a new bookmark on a public commit and haven't made any changes yet, that bookmark is "empty". Running hg log -r with an empty bookmark won't have any output.

The feature revset that bookbinder exposes, works just as well on revisions that don't have any associated bookmark. For example, if you are working with an anonymous head, you could do:

```bash
$ hg log -r 'feature(<rev>)'
```

In fact, when you pass in a bookmark label to a supported command, bookbinder is literally just substituting -r <bookmark> with -r feature(<bookmark>). All the hard work is happening in the feature revset.
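That substitution is easy to illustrate with a toy model (this is not bookbinder's actual code; it also models the '.' escape described above):

```python
def bind_revsets(args):
    """Rewrite -r/--rev arguments the way bookbinder does (toy version)."""
    out = []
    it = iter(args)
    for arg in it:
        out.append(arg)
        if arg in ("-r", "--rev"):
            label = next(it)
            if label.startswith("."):
                out.append(label[1:])              # '.' escape: just the label
            else:
                out.append("feature(%s)" % label)  # bind the whole series
    return out

print(bind_revsets(["hg", "log", "-r", "foo"]))
print(bind_revsets(["hg", "log", "-r", ".foo"]))
```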

In closing, bookbinder has helped me make a lot more sense out of my bookmark based workflow. It's solving a problem I think should be handled in mercurial core; maybe one day I'll attempt to submit a patch upstream. But until then, I hope it can be useful to others as well.

March 29, 2015 10:43 PM

March 27, 2015

William Lachance

Perfherder update: Summary series drilldown

Just wanted to give another quick Perfherder update. Since the last time, I’ve added summary series (which is what GraphServer shows you), so we now have (in theory) the best of both worlds when it comes to Talos data: aggregate summaries of the various suites we run (tp5, tart, etc), with the ability to dig into individual results as needed. This kind of analysis wasn’t possible with Graphserver and I’m hopeful this will be helpful in tracking down the root causes of Talos regressions more effectively.

Let’s give an example of where this might be useful by showing how it can highlight problems. Recently we tracked a regression in the Customization Animation Tests (CART) suite from the commit in bug 1128354. Using Mishra Vikas‘s new “highlight revision mode” in Perfherder (combined with the revision hash when the regression was pushed to inbound), we can quickly zero in on the location of it:

Screen Shot 2015-03-27 at 3.18.28 PM

It does indeed look like things ticked up after this commit for the CART suite, but why? By clicking on the datapoint, you can open up a subtest summary view beneath the graph:

Screen Shot 2015-03-27 at 2.35.25 PM

We see here that it looks like the 3-customize-enter-css.all.TART entry ticked up a bunch. The related test 3-customize-enter-css.half.TART ticked up a bit too. The changes elsewhere look minimal. But is that a trend that holds across the data over time? We can add some of the relevant subtests to the overall graph view to get a closer look:

Screen Shot 2015-03-27 at 2.36.49 PM

As is hopefully obvious, this confirms that the affected subtest continues to hold its higher value while another test just bounces around more or less in the range it was before.

Hope people find this useful! If you want to play with this yourself, you can access the perfherder UI at

March 27, 2015 07:24 PM

Geoff Brown

Complete logcats for Android tests (updated for 2015)

I described Android test “complete logcats” last year in, but a few details have changed since then, so here’s a re-write!

“Logcats” – those Android logs you see when you execute “adb logcat” – are an essential part of debugging Firefox for Android. We include logcats in our Android test logs on treeherder: After a test run, we run logcat on the device, collect the output and dump it to the test log. Sometimes those logcats are very useful; other times, they are too little, too late. A typical problem is that a failure occurs early in a test run, but does not cause the test to fail immediately; by the time the test ends, the fixed-size logcat buffer has filled up and overwritten the earlier, important messages. How frustrating!

All Android test jobs also offer “complete logcats”: logcat is run for the duration of the test job, the output is collected continuously, and dumped to a file. At the end of the test job, the file is uploaded to an aws server, and a link is displayed in treeherder. Here’s a sample of a treeherder summary, from the bottom right (you may need to resize or scroll to see the whole thing):


Notice the “artifact uploaded logcat.log” line? Open that link and you have a complete logcat showing what was happening on the device for the duration of the test job.

We have not changed the “old” logcat features in test logs: We still run logcat at the end of most jobs and dump the output to the test log. That might be more convenient in some cases.

Happy test debugging!

March 27, 2015 06:27 PM

March 26, 2015

Armen Zambrano G. (@armenzg)

mozci 0.4.0 released - Many bug fixes and improved performance

For the release notes with all their hyperlinks, go here.

NOTE: I did a 0.3.1 release but the right number should have been 0.4.0

This release does not add any major features, however, it fixes many issues and has much better performance.

Many thanks to @adusca, @jmaher and @vaibhavmagarwal for their contributions.



For all changes visit: 0.3.0...0.4.0

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

March 26, 2015 08:36 PM

March 24, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

March 24, 2015 06:27 AM

March 17, 2015

Byron Jones

happy bmo push day!

the following changes have been pushed to

discuss these changes on

Filed under: bmo, mozilla

March 17, 2015 08:29 AM

March 15, 2015

Alice Scarpa

Platforms, buildbots and allthethings

When I asked about larger projects to get involved in, Joel (:jmaher) told me about a new tool suite that’s being written in Python: Mozilla Continuous Integration Tools. The project maintainer is Armen Zambrano (:armenzg), and talking with him on the #ateam IRC channel I discovered that a big problem they had was the function associated_build_job(). That ended up sending me on a journey through the buildbot source code. But first, the problem.

What should associated_build_job() do?

When someone makes a new commit to a Mozilla repo, a lot of tests run in a lot of different platforms. You can see what is happening live on Treeherder. Every test job has a corresponding build job that triggers it, e.g. “Ubuntu VM 12.04 mozilla-central opt test cppunit” is triggered by “Linux mozilla-central build” (you can see the name of the build job clicking on the green ‘B’).

The function associated_build_job() should receive a test job and return its corresponding build job. That sounds easy: Treeherder already knows every test run by a given build job, right? Turns out that’s not the case. Treeherder just shows what’s happening on Mozilla’s machines; it has no knowledge of what triggers what. The original code used a mapping to know that “Ubuntu VM 12.04” is Linux, but that mapping wasn’t very robust and already failed for some buildernames.

How a build job triggers test jobs

If there is no list of what triggers what, how do test jobs get triggered? To answer that I had to study a bit of buildbot source code.

In buildbot, when a build job finishes it sends a message. When a test scheduler receives the right message, it triggers the tests it knows about. For example, “Linux mozilla-central build” sends the message “mozilla-central-linux-opt-unittest” and the test scheduler “tests-mozilla-central-ubuntu32_vm-opt-unittest” waits for that message to trigger all of its test jobs, including “Ubuntu VM 12.04 mozilla-central opt test cppunit”.

There is a file called allthethings.json where a lot of information about builders is dumped every day. I thought that adding what message a build job sends and what message a test scheduler listens to would solve the problem.

After I started working on that, I discovered that the problem was more complicated than it originally looked. Not every builder sends a message the same way, and how a builder sends a message is going to change soon, because a lot of jobs are moving to mozharness.

The code ended up really messy, fragile to changes, and hard to maintain. Instead of trying to improve it, I talked with armenzg, my mentor, and we decided that it was not a reasonable approach. But I did have a version of allthethings.json with triggers information, and I could use that to generate test cases (12576 of them!) for associated_build_job(). Turns out that the original version returned an error in 1707 of those cases and got the wrong result 5340 times.

New heuristic

Working with allthethings.json and buildbot I noticed that the message sent by a build job was always just its shortname with one or two suffixes appended. I could guess what message a build job would send from its shortname, a parameter already available in allthethings. I still needed a key in every test scheduler telling what message it listens to, but that part was ready from my previous approach.
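That heuristic can be sketched as follows. Note this is illustrative only: the suffix list, the 'listens_to' key, and the job data below are stand-ins, not the real buildbot values.

```python
# Hypothetical sketch of the shortname heuristic; the suffixes and data
# layout are illustrative, not the real buildbot configuration.
KNOWN_SUFFIXES = ['-opt-unittest', '-debug-unittest']

def possible_messages(shortname):
    """Messages a build job could send: its shortname plus one known suffix."""
    return [shortname + suffix for suffix in KNOWN_SUFFIXES]

def associated_build_job(test_scheduler, build_jobs):
    """Find the build job whose guessed message matches what the
    test scheduler listens to."""
    for name, props in build_jobs.items():
        if test_scheduler['listens_to'] in possible_messages(props['shortname']):
            return name
    return None

build_jobs = {
    'Linux mozilla-central build': {'shortname': 'mozilla-central-linux'},
}
scheduler = {'listens_to': 'mozilla-central-linux-opt-unittest'}
print(associated_build_job(scheduler, build_jobs))  # Linux mozilla-central build
```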

After my patch to add more things to allthethings was merged, I could fully test my new heuristic. It passed every one of the 12576 test cases! It has been in production for about 3 weeks and so far no bugs. In fact, when I wrote a script to find out if it failed for any builder in allthethings I ended up finding 300 test jobs that didn’t have build jobs, and this was actually a previously unknown buildbot bug.

March 15, 2015 12:00 AM

March 12, 2015

mozregression updates

Release 0.35

This release contains the implementation of a new algorithm to download the next builds while you are currently evaluating one (bug 999019).

On a fast enough connection, this can considerably speed up the bisection process by reducing the amount of time spent waiting for downloads to complete.
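The core idea can be sketched like this (illustrative only, not mozregression's actual code; in real bisection the next build depends on your good/bad verdict, so the real algorithm is more involved): overlap downloading of the next candidate with evaluation of the current one.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch: while one build is being evaluated, the next
# candidate downloads in a background thread. download() and
# evaluate() are stand-ins for the real operations.
def run(builds, download, evaluate):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(download, builds[0])
        for i, build in enumerate(builds):
            path = future.result()  # block until this build's download is done
            if i + 1 < len(builds):
                # Kick off the next download before evaluation starts.
                future = pool.submit(download, builds[i + 1])
            evaluate(path)
```

The waiting the user experiences is then only the part of each download that does not overlap with the previous evaluation.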

March 12, 2015 12:00 AM

March 10, 2015

Byron Jones

happy bmo push day!


bugzilla now shows the status of mozreview hosted reviews:


it’s now possible to prevent review, feedback, and needinfo requests:

block review requests

discuss these changes on

Filed under: bmo, mozilla

March 10, 2015 07:11 AM

March 09, 2015

David Burns

Marionette - Where we are

I thought I would spend some time describing where we are with the Marionette project. For those that don't know, Marionette is the project to implement WebDriver in Firefox. We are implementing WebDriver based on the W3C WebDriver specification.

We have been working quite hard to get as much of the implementation done as per the specification. One thing to note is that there are a few places where the specification and the open source project have diverged, but hopefully a Selenium 3 release can align them again.

So... what is left to do for the Marionette project to be able to ship its 1.0 release?

and a few other things. Feel free to look at our current roadmap!

That means we have some of the big ticket items, like modal dialog support, landed! We have some of the actions landed and most importantly we have large parts of a driver executable (written in Rust!), like chromedriver or internetexplorerdriver, completed.

Some things are going more slowly than anticipated and other sections are shooting along, so all in all I am really pleased with the current progress!

If you want to help out, we have a number of good first bugs that you can all join in!

March 09, 2015 09:44 PM

March 07, 2015

Alice Scarpa

Structured logs in Test Informant

Test Informant is a test monitoring service for Mozilla. It provides a high level view of what’s going on with automated tests.

It currently uses pulse to listen for the completion of build jobs and then downloads the associated file (which can reach 200 MB), finds test manifests and parses them. After that it saves the information it found in a Mongo database.

The problem with that approach is that only a subset of tests is compatible with manifestparser, and Test Informant only supports those.

mozlog.structured provides structured logs as JSON files with information about tests that is easily machine-readable, but can also be interpreted by humans. Here is an example. Most of the suites compatible with manifestparser are also compatible with mozlog.structured, and so by using structured logs we get a lot of new suites. Therefore, the goal was to make Test Informant use structured logs instead of parsing manifests.

Listening for test jobs

Structured logs are attached to test jobs, not build jobs. So the first step to make Test-Informant use structured logs would be to make it listen for test jobs instead of build jobs on the pulse listener.

The hardest part was making sure my code was working as expected. I had to wait for tests to show up in pulse, and sometimes that took a long time. One thing that helped was listening to ‘mozilla-inbound’ instead of ‘mozilla-central’, because it has much more activity. As a result, I ended up receiving several “Your pulse queue is full” emails.

Reading structured logs

After step 1 was ready, it was time to actually consume structured logs. Turns out that it’s pretty easy. All I had to do was adapt the example code from ahal’s blog post and that part was done.
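Consuming a structured log really is straightforward. Here is a minimal sketch without the mozlog API (the field names follow the mozlog JSON format; the sample lines are made up):

```python
import json

# Each line of a structured log is a JSON object with an 'action' key.
# Collect the final status of each test from its 'test_end' message.
def summarize(lines):
    results = {}
    for line in lines:
        message = json.loads(line)
        if message.get('action') == 'test_end':
            results[message['test']] = message['status']
    return results

log = [
    '{"action": "test_start", "test": "test_a.html"}',
    '{"action": "test_end", "test": "test_a.html", "status": "OK"}',
]
print(summarize(log))  # {'test_a.html': 'OK'}
```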

Dealing with database issues was harder. I had to worry about race conditions and also deal with chunking (some test suites are split into several chunks that must be added to the same database entry).

Adding more tests

Test-Informant only deals with tests in its configuration file. There were a bunch of new tests that had structured logs but weren’t in it yet. I wrote the following script to identify pairs of platforms and suites that were compatible with structured logs:

def print_name(self, data):
    # Structured ("raw") logs are the blobber files ending in _raw.log.
    structured_logs = [(fn, url) for fn, url in data['blobber_files'].iteritems()
                       if fn.endswith('_raw.log')]
    # Report platform/suite pairs that have structured logs but are not
    # yet known to the configuration.
    if structured_logs and not self.get_suite_name(data['test'], data['platform']):
        print data['platform'], data['test']

I left my script running for a couple of hours, and after that I added the new platforms/suites I found to the configuration file.

What happened

This is a sample report using structured logs. It has a bunch of new tests, and that is actually a problem. The report links to skipped tests, but now not every test lives where our previous URL scheme assumed, so our old way of figuring out the URL does not work anymore. My work resulted in four PRs that were merged on the structured_log branch, and we are waiting for follow-up work from other contributors to merge structured_logs into master.

It was a really rewarding project and I’m excited about seeing it being used!

March 07, 2015 12:00 AM

March 06, 2015

Armen Zambrano G. (@armenzg)

How to generate data potentially useful to a dynamically generated trychooser UI

If you're interested in generating an up-to-date trychooser, I would love to hear from you.
adusca has helped me generate data similar to what a dynamic trychooser UI could use.
If you would like to help, please visit bug 983802 and let us know.

In order to generate the data all you have to do is:
git clone
cd mozilla_ci_tools
python develop
python scripts/misc/

That's it! You will then have a graphs.json dictionary with some of the pieces needed. Once we have an idea of how to generate the UI and what we're missing, we can modify this script.

Here's some of the output:
    "android": [

Here are the remaining keys:
[u'android', u'android-api-11', u'android-api-9', u'android-armv6', u'android-x86', u'emulator', u'emulator-jb', u'emulator-kk', u'linux', u'linux-pgo', u'linux32_gecko', u'linux64', u'linux64-asan', u'linux64-cc', u'linux64-mulet', u'linux64-pgo', u'linux64_gecko', u'macosx64', u'win32', u'win32-pgo', u'win64', u'win64-pgo']
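For what it's worth, consuming that file is a one-liner. A minimal sketch (the key structure is assumed from the sample output above):

```python
import json

# Hypothetical sketch: list the top-level platform keys of graphs.json
# (structure assumed from the sample output in this post).
def platform_keys(path):
    with open(path) as f:
        return sorted(json.load(f))
```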

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

March 06, 2015 04:41 PM

March 05, 2015

Armen Zambrano G. (@armenzg)

mozci 0.3.0 - Support for backfilling jobs on treeherder added

Sometimes on treeherder, jobs get coalesced (i.e. we run the tests only on the most recent revision) in order to handle load. This is good so we can catch up when many pushes are committed on a tree.

However, when a job run on the most recent code comes back failing, we need to find out which revision introduced the regression. This is when we need to backfill up to the last good run.

In this release of mozci we have added the ability to --backfill:
python scripts/ --buildername "b2g_ubuntu64_vm cedar debug test gaia-js-integration-5" --dry-run --revision 2dea8b3c6c91 --backfill
This should be especially useful for sheriffs.
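The backfill logic itself can be sketched like this (hypothetical helper functions, not the real mozci API): starting from the failing revision, schedule the job on each older revision that was coalesced, stopping at the most recent revision where the job already ran.

```python
# Hypothetical sketch of backfilling; trigger_job(), has_job() and the
# revisions below are illustrative, not the real mozci interfaces.
def backfill(buildername, revisions, has_job, trigger_job):
    """revisions: newest-first list ending at a revision known to have run."""
    for revision in revisions:
        if has_job(buildername, revision):
            break  # reached the most recent revision that already ran
        trigger_job(buildername, revision)

ran = {'aaa111'}
triggered = []
backfill('b2g_ubuntu64_vm cedar debug test gaia-js-integration-5',
         ['ddd444', 'ccc333', 'bbb222', 'aaa111'],
         lambda b, r: r in ran,
         lambda b, r: triggered.append(r))
print(triggered)  # ['ddd444', 'ccc333', 'bbb222']
```

Once the backfilled jobs report their results, the first failing revision in the gap is the one that introduced the regression.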

You can start using mozci as long as you have LDAP credentials. Follow these steps to get started:
git clone
python develop (or install)

Release notes

Thanks again to vaibhav1994 and adusca for their many contributions in this release.

Major changes
  • Issue #75 - Added the ability to backfill changes until last good is found
  • No need to use --repo-name anymore
  • Issue #83 - Look for request_ids from a better place
  • Add interface to get status information instead of scheduling info
Minor fixes:
  • Fixes to make livehtml documentation
  • Make determine_upstream_builder() case insensitive
Release notes:
PyPi package:

Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

March 05, 2015 04:19 PM

March 04, 2015

mozregression updates

Release 0.34

There are a bunch of new things in this new release of mozregression (0.34)!

March 04, 2015 12:00 AM