Planet Mozilla Automation

April 23, 2014

Joel Maher

Vaibhav has a blog - a perspective of a new Mozilla hacker

As I mentioned earlier this year, I have had the pleasure of working with Vaibhav. Now that time has passed, he continues to contribute to Mozilla, and he will be participating this year in Google Summer of Code with Mozilla. I am excited.

He now has a blog - while there is only one post so far, he will be posting ~weekly with updates on his GSoC project and other fun topics.


April 23, 2014 04:37 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

April 23, 2014 06:59 AM

April 22, 2014

William Lachance

PyCon 2014 impressions: ipython notebook is the future & more

This year’s PyCon US (Python Conference) was in my city of residence (Montréal) so I took the opportunity to go and see what was up in the world of the language I use the most at Mozilla. It was pretty great!

ipython

The highlight for me was learning about the possibilities of ipython notebooks, an absolutely fantastic interactive tool for debugging python in a live browser-based environment. I’d heard about it before, but it wasn’t immediately apparent how it would really improve things — it seemed to be just a less convenient interface to the python console that required me to futz around with my web browser. Watching a few presentations on the topic made me realize how wrong I was. It’s already changed the way I work with Eideticker data, for the better.

Using ipython to analyze some eideticker data

I think the basic premise is really quite simple: a better interface for typing in, experimenting with, and running python code. If you stop and think about it, the modern web interface supports a much richer vocabulary of interactive concepts than the console (or even text editors like emacs): there’s no reason we shouldn’t take advantage of it.
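
To make this concrete, here is the kind of cell you might type into a notebook. A minimal sketch with synthetic data (not real Eideticker output):

import numpy as np
import pandas as pd

# synthetic stand-in for per-frame pixel-difference values from a capture
framediffs = pd.Series(np.random.randint(0, 5000, size=600), name="framediff")

print(framediffs.describe())  # the summary table renders directly under the cell
framediffs.plot()             # so does the plot (with inline matplotlib enabled)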

Here are the (IMO) killer features that make it worth using:

To learn more about how to use ipython notebooks for data analysis, I highly recommend Julia Evans’ talk Diving into Open Data with IPython Notebook & Pandas, which you can find on pyvideo.org.

Other Good Talks

I saw some other good talks at the conference; here are some of them:

April 22, 2014 09:36 PM

April 21, 2014

Joel Maher

Hi Vikas Mishra, welcome to the Mozilla community!

I remember right before the winter break, when I was off for a couple of weeks to end the year, a new person showed up and started working on some of my bugs. This was Vikas, and he was eager and motivated to do a lot of great things. Fast forward to today, and I still chat with Vikas weekly on irc. When he gets a break from school and other activities of interest, he will be hunting for bugs to fix, usually knocking them out 3 or 4 at a time. In fact, I was surprised when I did a query on bugzilla and on github to see that he is involved in a variety of projects at Mozilla, such as Phonebook, Marionette, Talos, and test automation.

I wanted to learn more about Vikas, so I asked him some questions:

Tell me about where you live?

Currently I live in Goa, India, which is famous for its beaches and attracts many tourists from India as well as foreign nationals. The best thing about this place is my campus, which feels like a home away from home now. I like this place for its beautiful weather, its beaches, and the ever friendly nature of the people living here. This place has brought a lot of changes in my personality and has given me a chance to make some new friends who are always ready to help.

Tell me about your school?

I am a first year undergraduate student at Birla Institute of Technology and Science Pilani, Goa Campus (popularly known as BITS Pilani Goa), pursuing an M.Sc. (Hons) in Economics. I am enrolled in a dual degree scheme offered by my college, wherein I get an option to take a B.E. degree depending on my first year’s performance, and I hope to get a C.S. major. My favorite subject so far has been Computer Science, as I’ve always been fascinated by computers and love to code.

How did you get involved with Mozilla?

My first interaction with the Mozilla community was when I started using Firefox a few years back. Last December I decided to start contributing to some open source organizations, searched for organizations to start with, and decided to go with Mozilla. I started with Mozilla since it’s a very large community, so there’s always lots of help available. My experience with the Mozilla community so far has been absolutely great. I had no idea how the open source community works and was afraid to ask silly questions, as I thought people wouldn’t have time to answer them. I remember when I was assigned my first bug and Joel Maher was my mentor: while setting up the local environment I kept getting errors and used to pastebin those to him, and it took me hours to set it up, but he was totally cool with it even though I myself was annoyed with those everlasting errors. Even now I get stuck in some places, but he’s always there to help me. Besides him there are other members too who have helped a lot and are always ready to help if required. My experience so far has been brilliant, and I am looking forward to learning new things and meeting some great people in the future as well.

What hobbies and activities do you enjoy doing?

Nowadays I spend most of my time coding and learning new stuff. I love the concept of open source software and love to interact with the community and gain valuable experience. Besides coding I enjoy playing football and cricket, and I am a complete movie buff. I also love to hang out with friends and spend time with them.

Where do you see yourself in 5 years?

Five years down the road I see myself as a software engineer working for Mozilla (preferably on the A-Team), solving problems and developing new tools.

What advice would you give someone?

Although I don’t see myself as someone who can give advice about life, if someone asked me I would advise them to enjoy life to the fullest and do whatever they love to do, no matter what others expect of them. And if they are in the software industry, obviously I’ll advise them to contribute to the open source community :)

I have enjoyed watching Vikas learn his way around Mozilla. If you have worked with him, you probably already know why I have enjoyed my interactions with him; if you haven’t had the chance to work with him, say hi on irc - his nick is :mishravikas.


April 21, 2014 12:38 PM

April 14, 2014

Joel Maher

browser-chrome is greener and in many chunks

On Friday we rolled out a big change to split up our browser-chrome tests. It started out as a great idea to split the devtools out into their own suite; then, after testing, we ended up chunking the remaining browser-chrome tests into 3 chunks.

No more 200-minute wait times; in fact, we are probably running too many chunks. A lot of heavy lifting took place, much of it in releng by Armen and Ben, and much work from Gavin and RyanVM, who pushed hard and proposed great ideas to see this through.

What is next?

There are a few more test cases to fix, and we need to get all these changes onto Aurora. We have more (lower priority) work we want to do on running the tests differently to help isolate issues where one test affects another.

In the next few weeks I want to put together a list of projects and bugs that we can work on to make our tests more useful and reliable. Stay tuned!


April 14, 2014 03:39 PM

Bob Clary

Splitting and Packing Android Boot images

After wandering twisty little passages, all alike, concerning rooting Android devices, I decided the best way forward for devices with unlocked bootloaders was to root via modifying the default.prop values in the boot image. There are a number of scripts available on various sites which purport to unpack and pack Android boot images, but it seemed that the best approach was to go to the source and see how Android itself deals with boot images.

I created spbootimg and pkbootimg from the official mkbootimg so that they would, at the very least, properly handle official Android boot images.

spbootimg

usage: spbootimg
       -i|--input <bootimg-filename>

spbootimg splits an Android boot image file into separate files:

* <bootimg-filename>-kernel – the kernel
* <bootimg-filename>-first-ramdisk – the ramdisk
* <bootimg-filename>-second-ramdisk – only created if it existed in the input boot image file
* <bootimg-filename>-header – a text file containing constants discovered in the boot image

You can download the source for spbootimg and pkbootimg and build it yourself. I don’t provide binaries because I do not wish people who do not understand the consequences of their actions to brick their devices.
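
For reference, here is a minimal Python sketch of reading the constants that end up in the header file; the field layout follows the classic boot_img_hdr struct from AOSP's bootimg.h, and the input filename is hypothetical:

import struct

def read_boot_header(path):
    # Per bootimg.h: 8-byte "ANDROID!" magic, eight unsigned 32-bit fields,
    # 8 unused bytes, a 16-byte name and a 512-byte kernel command line.
    with open(path, "rb") as f:
        data = f.read(608)
    if data[:8] != b"ANDROID!":
        raise ValueError("not an Android boot image")
    keys = ("kernel_size", "kernel_addr", "ramdisk_size", "ramdisk_addr",
            "second_size", "second_addr", "tags_addr", "page_size")
    header = dict(zip(keys, struct.unpack_from("<8I", data, 8)))
    header["name"] = data[48:64].rstrip(b"\0")
    header["cmdline"] = data[64:576].rstrip(b"\0")
    return header

print(read_boot_header("boot.img"))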

pkbootimg

usage: pkbootimg
       --kernel <kernel-filename>
       --kernel-addr <kernel-address>
       --ramdisk <ramdisk-filename>
       --ramdisk-addr <ramdisk-address>
       [ --second <2ndbootloader-filename> ]
       --second-addr <second-address>
       --tags-addr <tags-address>
       [ --cmdline <kernel-commandline> ]
       [ --board <boardname> ]
       [ --pagesize <pagesize> ]
       -o|--output <bootimg-filename>

pkbootimg takes the output of spbootimg - a kernel file, a ramdisk file, and an optional second ramdisk file - and, using the command line, board, page size, and address information discovered in the original boot image, packs them into a new boot image which can be flashed onto a device using fastboot.

default.prop

Looking at the source for adb.c, we see that at a minimum we need a build of adbd where ALLOW_ADBD_ROOT was set, and either of the following set in the ramdisk’s default.prop:

ro.secure=0

or

ro.debuggable=1
service.adb.root=1

ro.secure=0 will result in adbd running as root by default. ro.debuggable=1 together with service.adb.root=1 will allow you to run adbd as root via the adb root command.
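
Once the ramdisk is unpacked, applying those settings is mechanical. A minimal Python sketch, assuming you repack the ramdisk and boot image afterwards:

def patch_default_prop(path):
    # Force the properties (described above) that let adbd run as root.
    wanted = {"ro.secure": "0", "ro.debuggable": "1", "service.adb.root": "1"}
    with open(path) as f:
        lines = [line.rstrip("\n") for line in f]
    out = []
    for line in lines:
        key = line.split("=", 1)[0]
        out.append("%s=%s" % (key, wanted.pop(key)) if key in wanted else line)
    out.extend("%s=%s" % item for item in wanted.items())  # add missing props
    with open(path, "w") as f:
        f.write("\n".join(out) + "\n")

patch_default_prop("ramdisk/default.prop")  # hypothetical unpacked location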

If your device’s version of adb was not built with ALLOW_ADBD_ROOT set, you will need to build your own version and place it in the ramdisk at sbin/adbd.

Once you are able to run adb as root via adb root, you will be able to remount the /system/ directory as writable and can install anything you wish.

su

Some of the automation used at Mozilla requires a version of the su command which has the same command line syntax as sh. In particular, we require that su support calling commands via su -c "command args...". You can possibly build your own version of su which will run commands as root without access control, or you can use one of the available “Superuser” apps which manage access to su and provide some degree of security. I’ve found Koushik Dutta’s Superuser to work well with Android up to 4.4.

If you use Koushik’s Superuser, you will need at least version 1.0.3.0 and will need to make sure that the install-recovery.sh script is properly installed so that Koushik’s su runs as a daemon. This is automatically handled if you install Superuser.apk via a recovery image, but if you install Superuser manually, you will need to make sure to unpack Superuser.apk and manually install:

April 14, 2014 12:21 PM

April 11, 2014

Andrew Halberstadt

Part 2: How to deal with IFFY requirements

My last post was basically a very long-winded way of saying, "we have a problem". It kind of did a little dance around "why is there a problem" and "how do we fix it", but I want to explore these two questions in a bit more detail. Specifically, I want to return to the two case studies and explore why our test harnesses don't work and why mozharness does work, even though both have IFFY (in flux for years) requirements. Then I will explore how to use the lessons learned to improve our general test harness design.

### DRY is not everything

I talked a lot about the [DRY principle][1] in the last article. Basically the conclusion about it was that it is very useful, but that we tend to fixate on it to the point where we ignore other equally useful principles. Having reached this conclusion, I did a quick internet search and found [an article][2] by Joel Abrahamsson arguing the exact same point (albeit much more succinctly than me). Through his article I found out about the [SOLID principles][3] of object oriented design (have I been living under a rock?). They are all very useful guidelines, but there are two that immediately made me think of our test harnesses in a bad way. The first is the [single responsibility principle][4] (which I was delighted to find is meant to mitigate requirement changes) and the second is the [open/closed principle][5].

The single responsibility principle states that a class should only be responsible for one thing, and responsibility for that thing should not be shared with other classes. What is a responsibility? A responsibility is defined as a *reason to change*. To use the wikipedia example, a class that prints a block of text can undergo two changes. The content of the text can change, or the format of the text can change. These are two different responsibilities that should be split out into different classes.

The open/closed principle states that software should be open for extension, but closed for modification. In other words, it should be possible to change the behaviour of the software only by adding new code, without needing to modify any existing code. A popular way of implementing this is through abstract base classes. Here the interface is closed for modification, and each new implementation is an extension of that.

Our test harnesses fail miserably at both of these principles. Instead of having several classes, each with a well defined responsibility, we have a single class responsible for everything. Instead of being able to add some functionality without worrying about breaking something else, you have to take great pains that your change won't affect some other platform you don't even care about! Mozharness, on the other hand, while not perfect, does a much better job at both principles. The concept of actions makes it easy to extend functionality without modifying existing code. Just add a new action to the list! The core library is also much better separated by responsibility. There is a clear separation between general script, build, and testing related functionality.

### Inheritance is evil

This is probably old news to many people, but it's something that I'm just starting to figure out on my own. I like Zed Shaw's [analogy from *Learn Python the Hard Way*][6] the best. Instead of butchering it, here it is in its entirety.

> In the fairy tales about heroes defeating evil villains there's always a dark forest of some kind. It could be a cave, a forest, another planet, just some place that everyone knows the hero shouldn't go. Of course, shortly after the villain is introduced you find out, yes, the hero has to go to that stupid forest to kill the bad guy. It seems the hero just keeps getting into situations that require him to risk his life in this evil forest.
>
> You rarely read fairy tales about the heroes who are smart enough to just avoid the whole situation entirely. You never hear a hero say, "Wait a minute, if I leave to make my fortunes on the high seas leaving Buttercup behind I could die and then she'd have to marry some ugly prince named Humperdink. Humperdink! I think I'll stay here and start a Farm Boy for Rent business." If he did that there'd be no fire swamp, dying, reanimation, sword fights, giants, or any kind of story really. Because of this, the forest in these stories seems to exist like a black hole that drags the hero in no matter what they do.
>
> In object-oriented programming, Inheritance is the evil forest. Experienced programmers know to avoid this evil because they know that deep inside the Dark Forest Inheritance is the Evil Queen Multiple Inheritance. She likes to eat software and programmers with her massive complexity teeth, chewing on the flesh of the fallen. But the forest is so powerful and so tempting that nearly every programmer has to go into it, and try to make it out alive with the Evil Queen's head before they can call themselves real programmers. You just can't resist the Inheritance Forest's pull, so you go in. After the adventure you learn to just stay out of that stupid forest and bring an army if you are ever forced to go in again.
>
> This is basically a funny way to say that I'm going to teach you something you should avoid called Inheritance. Programmers who are currently in the forest battling the Queen will probably tell you that you have to go in. They say this because they need your help since what they've created is probably too much for them to handle. But you should always remember this:
>
> Most of the uses of inheritance can be simplified or replaced with composition, and multiple inheritance should be avoided at all costs.

I had never heard the (apparently popular) term "composition over inheritance". Basically, unless you really really mean it, always go for "X has a Y" instead of "X is a Y". Never do "X is a Y" for the sole purpose of avoiding code duplication. This is exactly the mistake we made in our test harnesses. The Android and B2G runners just inherited everything from the desktop runner, but oops, turns out all three are actually quite different from one another. Mozharness, while again not perfect, does a better job at avoiding inheritance. While it makes heavy use of the mixin pattern (which, yes, is still inheritance), at least it promotes separation of concerns more than classic inheritance.

### Practical Lessons

So this is all well and great, but how can we apply all of this to our automation code base? A smarter way to approach our test harness design would have been to have most of the shared code between the three platforms in a single (relatively) bare-bones runner that *has a* target environment (e.g. desktop Firefox, Fennec or B2G in this case). In this model there is no inheritance, and no code duplication. It is easy to extend without modifying (just add a new target environment), and there are clear and distinct responsibilities between managing tests/results and actually launching them. In fact, this is how the gaia team implemented their [marionette-js-runner][7].
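
A minimal Python sketch of that shape (all names hypothetical; the real runners are far more involved):

    class FirefoxTarget(object):
        """Knows only how to launch desktop Firefox."""
        def launch(self, tests):
            print("launching Firefox with %d tests" % len(tests))

    class B2GTarget(object):
        """Knows only how to launch B2G."""
        def launch(self, tests):
            print("launching B2G with %d tests" % len(tests))

    class TestRunner(object):
        """Bare-bones shared logic: the runner *has a* target, it *is not* one."""
        def __init__(self, target):
            self.target = target

        def run(self, tests):
            # shared manifest parsing / result collection would live here
            self.target.launch(tests)

    TestRunner(B2GTarget()).run(["test_foo.py"])
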
I'm not sure if that pattern is common to node's [mocha runner][8] or something of their own design, but I like it. I'd also like our test harnesses to employ mozharness' concept of actions. Each action could be as atomic a setup step as possible. For example, setting preferences in the profile is a single action. Setting environment is another. Parsing a manifest could be a third. Each target environment would consist of a list of actions that are run in a particular order. If code needs to be shared, simply add the corresponding action to whichever targets need it. If not, just don't include the action in the list for targets that don't need it.

My dream end state here is that there is no distinction between test runners and mozharness scripts. They are both trying to do the same thing (perform setup, launch some code, collect results), so why bother wrapping one around the other? The test harness should just *be* a mozharness script and vice versa. This would bring actions into test harnesses, and allow mozharness scripts to live in-tree.

### Conclusion

Is it possible to avoid code duplication with a project that has IFFY requirements? I think yes. But I still maintain it is exceptionally hard. It wasn't until after it was too late and I had a lot of time to think about it that I realized the mistakes we made. And even with what I know now, I don't think I would have fared much better given the urgency and time constraints we were under. Though next time, I think I'll at least have a better chance.

[1]: http://en.wikipedia.org/wiki/DRY_principle
[2]: http://joelabrahamsson.com/the-dry-obsession
[3]: http://en.wikipedia.org/wiki/SOLID
[4]: http://en.wikipedia.org/wiki/Single_responsibility_principle
[5]: http://en.wikipedia.org/wiki/Open/closed_principle
[6]: http://learnpythonthehardway.org/book/ex44.html
[7]: https://github.com/mozilla-b2g/marionette-js-runner
[8]: http://visionmedia.github.io/mocha

April 11, 2014 01:24 PM

Henrik Skupin

Firefox Automation report – week 9/10 2014

In this post you can find an overview of the work that happened in the Firefox Automation team during weeks 9 and 10. I myself was on vacation for a week - a bit of relaxing before the work on the TPS test framework gets started.

Highlights

In preparation for running Mozmill tests for Firefox Metro in our Mozmill-CI system, Andreea has started to get support for Metro builds and the appropriate tests included.

With help from Henrik we got Mozmill 2.0.6 released. It contains a helpful fix for waitForPageLoad(), which lets you know which page was being loaded and its status in case of a failure. This might help us nail down the intermittent failures we see when loading remote and even local pages. But the most important part of this release is indeed the support for mozcrash. Even though we cannot have full support yet, due to missing symbol files for daily builds on ftp.mozilla.org, we can at least show that a crash happened during a testrun and let the user know about the local minidump files.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 9 and week 10.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 9 and week 10.

April 11, 2014 09:28 AM

April 10, 2014

Henrik Skupin

Firefox Automation report – week 7/8 2014

The current workload is still affecting my time for getting out our automation status reports. The current updates are a bit old but still worth mentioning. So let’s get them out.

Highlights

As mentioned in my last report, we had issues with Mozmill on Windows while running our restart tests. During a restart of Firefox, Mozmill wasn’t waiting long enough to detect that the old process was gone, so a new instance was started immediately. Sadly that process failed with the “profile already in use” error. The reason was broken handling of process.poll() in mozprocess, which Henrik fixed in bug 947248. Thankfully no new Mozmill release was necessary; we only re-created our Mozmill environments with the new mozprocess version.

With the ongoing pressure of getting automated tests implemented for the Metro mode of Firefox, our Softvision automation team members concentrated their work on adding new library modules and enhancing existing ones. Another big step was the addition of Metro support to our Mozmill dashboard by Andrei. With that we got new filter settings to show only results for Metro builds.

When Henrik noticed one morning that our Mozmill-CI staging instance was running out of disk space, he did some investigation and saw that we were using the identical build retention settings as production. Without more than 40GB of disk space, we were trying to keep the latest 100 builds for each job. This certainly won’t work, so Cosmin worked on reducing the number of retained builds to 5. Besides this, we were also able to fix our l10n testrun for localized builds of Firefox Aurora.

Given that we stopped support for Mozmill 1.5 and continued with Mozmill 2.0 a while ago, we totally missed updating our Mozmill tests documentation on MDN. Henrik worked on getting all of it up-to-date, so new community members won’t struggle anymore.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 7 and week 8.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the Firefox Automation meetings of week 7 and week 8.

April 10, 2014 08:36 AM

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

April 10, 2014 05:21 AM

April 09, 2014

Joel Maher

is a phone too hard to use?

Working at Mozilla, I get to see a lot of great things. One of them is collaborating with my team (as we are almost all remoties), and I have been doing that for almost 6 years. Sometime around 3 years ago we switched to using Vidyo as a way to communicate in meetings. This is great: we can see and hear each other. Unfortunately heartbleed came out and affected Mozilla’s Vidyo servers, so yesterday and today we have been without Vidyo.

Now I am getting meeting cancellation notices. Why are we cancelling meetings? Did meetings not happen 3 years ago? Mozilla actually creates an operating system for a … phone. In fact our old teleconferencing system is still in place. I thought about this earlier today and wondered why we are cancelling meetings. Personally I always put Vidyo in the background during meetings and keep IRC in the foreground. Am I in the minority?

I am not advocating for scrapping Vidyo; instead I would like to attend meetings, and if we find they cannot be held without Vidyo, we should cancel them (and not reschedule them).

Meetings existed before Vidyo, and Open Source existed before GitHub; we don’t need the latest and greatest things to function in life. Pick up a phone and discuss what needs to be discussed.


April 09, 2014 04:43 PM

April 08, 2014

Joel Maher

polishing browser-chrome – coming to a branch near you soon

For the last 2 weeks I have gone head first into a world of resolving some issues with our mochitest browser-chrome tests, together with RyanVM, Armen, and the help of Gavin and many developers who are fixing problems left and right.

There are 3 projects I have been focusing on:

1) Moving our Linux debug browser chrome tests off our old fedora slaves in a datacenter and running them on ec2 slave instances, in bug 987892.

These are live and green on all Firefox 29, 30, and 31 trees! More work is needed for Firefox 28 and ESR 24, which should be wrapped up this week. Next week we can stop running all linux unittests on fedora slaves.

2) Splitting all the developer tools tests out of the browser-chrome suite into their own suite in bug 984930.

browser-chrome tests have been a thorn in the side of the sheriff team for many months. More and more, the rapidly growing features and tests of developer tools have been causing the entire browser-chrome suite to fail, and in the case of debug, to run for hours. Splitting this out gives us a small shield of isolation. In fact, we have this running well on Cedar, and we are pushing hard to have it rolled out to our production and development branches by the end of this week!

3) Splitting the remaining browser chrome tests into 3 chunks, in bug 819963.

Just like the developer tools, we have been running browser-chrome in 3 chunks on Cedar. With just 7 tests disabled, we are consistently green.

While there are a lot of other changes going on under the hood, what will be seen by next week on your favorite branch of Firefox will be:


April 08, 2014 07:54 PM

April 04, 2014

Mark Côté

Bugzfeed: Bugzilla push notifications

A large number of external applications have grown up around Bugzilla, serving a variety of purposes. One thing many of these apps have in common is a need to get updates from Bugzilla. Unfortunately, until recently the only way to get notifications of changes was to poll Bugzilla. Everyone knows that polling is bad, particularly because it doesn’t scale well, but there was no alternative.

Thus I would like to introduce to the world Bugzfeed, a WebSocket app that allows you to subscribe to one or more bugs and get pushed notifications when they change. It’s rather a small app, based on Tornado, and has a very simple interface, so it should scale quite nicely. It relies on a few moving parts to work, but I’ll start with the basics and explain the whole system later.

The production version is at ws://bugzfeed.mozilla.org. I also made a very simple (and ugly) example app for you to use and examine. A development version of Bugzfeed is available at ws://bugzfeed-dev.allizom.org; it’s tied to the development Bugzilla server, so it’s a good place to experiment if you’re a Mozilla contributor; you can make whatever changes you need to bugzilla-dev without worrying about messing with production data. You’ll need to get someone in #bmo on irc.mozilla.org to reset your password, since we periodically refresh and sanitize the database on bugzilla-dev, and email is disabled so you can’t reset it yourself.

(This makes me think that there should probably be a Bugzfeed instance tied to Landfill; maybe I’ll look into that, in particular if we implement middleware other than Pulse (see below).)

Client commands, responses, and notifications are all in JSON format. The project wiki page has the full list of commands. Here’s a little example of what you need to send to subscribe to bugs 1234 and 5678:

{"command": "subscribe", "bugs": [1234, 5678]}

The server will send a simple response, including a list of all the bugs you are (now) subscribed to:

{"command": "subscribe", "result": "ok", "bugs": [1234, 5678]}

Now you can just wait for notifications to be pushed from the server to your app:

{"command": "update", "bug": 1234, "when": "2014-04-03T21:13:45"}

Wait, you are probably asking, that’s it? That’s all I get?

The short answer is yup, that’s it. You can now use the regular REST API to get further details about what changed.

The longer answer is yup, that’s it, because security. Bugzilla has evolved a very fine-grained security system. We have bugs, attachments, and even comments that can only be seen by a privileged few, due to security, legal, and other considerations. Furthermore, many of the variables involved in determining whether a particular user can see a particular bug/attachment/comment can change at any time: not only can elements of a bug shift between public and confidential, but so can a user’s groups, and the groups themselves. Monitoring for all those possible changes would make this app significantly more complex and brittle, so we opted for the most secure notification, which is also the simplest: just a bug ID and a timestamp. All the other work is handled by the standard Bugzilla APIs.

(You might also be asking “why is ‘update’ considered a command?” and, to be honest, I’m not sure, so maybe that’ll change.)

There are other commands, and some limited caching of changes in case your client disconnects; see the project wiki page for more.
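
As an illustration, a complete client fits in a few lines of Python. A sketch assuming the third-party websocket-client package (pip install websocket-client):

import json
import websocket

ws = websocket.create_connection("ws://bugzfeed.mozilla.org")
ws.send(json.dumps({"command": "subscribe", "bugs": [1234, 5678]}))
print(ws.recv())  # the subscribe confirmation shown above
while True:
    message = json.loads(ws.recv())
    if message.get("command") == "update":
        # look up what actually changed via the regular REST API
        print("bug %s changed at %s" % (message["bug"], message["when"]))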


So how does it work? Here’s a system diagram created by contributor musingmario:

Bugzfeed system diagram

The four main pieces (with links to source) are

On the Bugzilla side, the BMO team created an extension which writes the bug ID and timestamp to a table when any bug changes. A simple Python app polls this table and sends all the updates to Pulse, cleaning up the table as it does so.

Pulse is a Mozilla RabbitMQ server with a specific configuration and message format implementing the publish/subscribe pattern. The usage is somewhat Mozilla specific, but it would be pretty easy to set up a similar system or even modify Bugzfeed and the Bugzilla shim to use RabbitMQ directly, or a different AMQP system like ØMQ.

Notifications from all bugs flow through Pulse; it is Bugzfeed, the WebSocket server, that does the filtering for its clients to notify only on subscribed bugs. Subscribing to individual notifications from Pulse is possible via topics, but this requires one channel per bug, so I doubt it would be any more efficient if hundreds of clients are connected to Bugzfeed.

While you could have the Bugzfeed server read directly from the Bugzilla database, eliminating the shim and the queuing system, having an intermediary allows us to easily stand up more Bugzfeed servers if load gets too high, as each Bugzfeed instance would see the stream of changes via its own subscriber queue. We can also easily interface new applications to the notification stream, such as the BMO Elastic Search cluster.

Enough technicalities; go out and play with it! And if you want to adapt it for your own Bugzilla installation, I’d be more than willing to help out.

April 04, 2014 04:13 AM

April 01, 2014

Geoff Brown

Android 2.3 Opt tests on tbpl

Today, we started running some Android 2.3 Opt tests on tbpl:

(screenshot: Android 2.3 Opt test results on tbpl)

“Android 2.3 Opt” tests run on emulators running Android 2.3. The emulator is simply the Android arm emulator, taken from the Android SDK (version 18). The emulator runs a special build of Gingerbread (2.3.7), patched and built specifically to support our Android tests. The emulator is running on an aws ec2 host. Android 2.3 Opt runs one emulator at a time on a host (unlike the Android x86 emulator tests, which run up to 4 emulators concurrently on one ix host).

Android 2.3 Opt tests generally run slower than tests run on devices. We have found that tests will run faster on faster hosts; for instance, if we run the emulator on an aws m3.large instance (more memory, more cpu), mochitests run in about 1/3 of the time that they do currently, on m1.medium instances.

Reftests – plain reftests, js reftests, and crashtests – run particularly slowly. In fact, they take so long that we cannot run them to completion with a reasonable number of test chunks. We are investigating more and also considering the simple solution: running on different hosts.

We have no plans to run Talos tests on Android 2.3 Opt; we think there is limited value in running performance tests on emulators.

Android 2.3 Opt tests are supported on try — “try: -b o -p android …” You can also request that a slave be loaned to you for debugging more intense problems: https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_slave. In my experience, these methods – try and slave loans – are more effective at reproducing test results than running an emulator locally: The host seems to affect the emulator’s behavior in significant and unpredictable ways.

Once the Android 2.3 Opt tests are running reliably, we hope to stop the corresponding tests on Android 2.2 Opt, reducing the burden on our old and limited population of Tegra boards.

As with any new test platform, we had to disable some tests to get a clean run suitable for tbpl. These are tracked in bug 979921.

There are also a few unresolved issues causing infrequent problems in active tests. These are tracked in bug 967704.


April 01, 2014 08:51 PM

Joel Maher

Performance Bugs – How to stay on top of Talos regressions

Talos is the framework used for desktop Firefox to measure performance for every patch that gets checked in.  Running tests for every checkin on every platform is great, but who looks at the results?

As I mentioned in a previous blog post, I have been looking at the alerts which are posted to dev.tree-management, and taking action on them if necessary.  I will save discussing my alert manager tool for another day.  One great thing about our alert system is that we send an email to the original patch author if we can determine who it is.  What is great is many developers already take note of this and take actions on their own.  I see many patches backed out or discussed with no one but the developer initiating the action.

So why do we need a Talos alert sheriff? For the main reason that not even half of the regressions are acted upon. There are valid reasons for this (wrong patch identified, noisy data, doesn’t seem related to the patch), and of course many regressions are ignored due to lack of time. When I started filing bugs 6 months ago, I incorrectly assumed all of them would be fixed or resolved as wontfix for a valid reason. This happens for most of the bugs, but many regressions get forgotten about.

When we did the uplift of Firefox 30 from mozilla-central to mozilla-aurora, we saw 26 regression alerts come in and 4 improvement alerts.  This prompted us to revisit the process of what we were doing and what could be done better.  Here are some of the new things we will be doing:

As this process goes through a cycle or two, we will refine it to ensure we have less noise for developers and more accuracy in tracking regressions faster.


April 01, 2014 05:30 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

April 01, 2014 06:27 AM

March 31, 2014

Geoff Brown

Firefox for Android Performance Measures – March check-up

My monthly review of Firefox for Android performance measurements. March highlights:

- 3 throbber start/stop regressions

- Eideticker not reporting results for the last couple of weeks.

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android 2.2 opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcanvasmark

This test runs the third-party CanvasMark benchmark suite, which measures the browser’s ability to render a variety of canvas animations at a smooth framerate as the scenes grow more complex. Results are a score “based on the length of time the browser was able to maintain the test scene at greater than 30 FPS, multiplied by a weighting for the complexity of each test type”. Higher values are better.

7200 (start of period) – 6300 (end of period).

Regression of March 5 – bug 980423 (disable skia-gl).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

24 (start of period) – 24 (end of period)

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

110000 (start of period) – 110000 (end of period)

tprovider

Performance of the history and bookmarks provider. Reports the time (ms) to perform a group of database operations. Lower values are better.

375 (start of period) – 425 (end of period).

Regression of March 29 – bug 990101. (test modified)

tsvgx

An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode, thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations, the overall duration the sequence/animation took to complete. Lower values are better.

7600 (start of period) – 7300 (end of period).

tp4m

Generic page load test. Lower values are better.

710 (start of period) – 750 (end of period).

No specific regression identified.

ts_paint

Startup performance test. Lower values are better.

3600 (start of period) – 3600 (end of period).

Throbber Start / Throbber Stop

These graphs are taken from http://phonedash.mozilla.org.  Browser startup performance is measured on real phones (a variety of popular devices).

3 regressions were reported this month: bug 980757, bug 982864, bug 986416.

:bc continued his work on noise reduction in March. Changes in the test setup have likely affected the phonedash graphs this month. We’ll check back at the end of April.

Eideticker

These graphs are taken from http://eideticker.mozilla.org. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

Eideticker results for the last couple of weeks are not available. We’ll check back at the end of April.


March 31, 2014 09:58 PM

Android 4.0 Debug tests on tbpl

Today, we started running some Android 4.0 Debug mochitests and js-reftests on tbpl.

(screenshot: Android 4.0 Debug tests on tbpl)

“Android 4.0 Debug” tests run on Pandaboards running Android 4.0, just like the existing “Android 4.0 Opt” tests which have been running for some time. Unlike the Opt tests, the Debug tests run debug builds, with more log messages than Opt and notably, assertions. The “complete logcats” can be very useful for these jobs — see Complete logcats for Android tests.

Other test suites – the remaining mochitests chunks, robocop, reftests, etc – run on Android 4.0 Debug only on the Cedar tree at this time. They mostly work, but have failures that make them too unreliable to run on trunk trees. Would you like to see more Android 4.0 Debug tests running? A few test failures are all that is blocking us from running the remainder of our test suites. See Bug 940068 for the list of known failures.


March 31, 2014 06:34 PM

March 26, 2014

Dave Hunt

Hunting for performance regressions in Firefox OS

At Mozilla we’re running performance tests against Firefox OS devices several times a day, and you can see these results on our dashboard. Unfortunately it takes a while to run these tests, which means we’re not able to run them against each and every push, and therefore when a regression is detected we can have a tough time determining the cause.

We do of course have several different types of performance testing, but for the purposes of this post I’m going to focus on the cold launch of applications measured by b2gperf. This particular test launches 15 of the packaged applications (each one is launched 30 times) and measures how long it takes. Note that this is how long it takes to launch the app, and not how long it takes for the app to be ready to use.

In order to assist with tracking down performance regressions I have written a tool to discover any Firefox OS builds generated after the last known good revision and before the first known bad revision, and trigger additional tests to fill in the gaps. The results are sent via e-mail for the recipient to review and either revise the regression range or (hopefully) identify the commit that caused the regression.

Before I talk about how to use the tool, there’s a rather important prerequisite to using it. As our continuous integration solution involves Jenkins, you will need to have access to an instance with at least one job configured specifically for this purpose.

The simplest approach is to use our Jenkins instance, which requires Mozilla-VPN access and access to our tinderbox builds. If you have these you can use the instance running at http://selenium.qa.mtv2.mozilla.com:8080 and the b2g.hamachi.perf job.

Even if you have access to our Jenkins instance and the device builds, you may still want to set up a local instance. This will allow you to run the tests without tying up the devices we have dedicated to running them, and you won’t be contending for resources. If you’re going to set up a local instance you will of course need at least one Firefox OS device and access to tinderbox builds for the device.

You can download the latest long-term support release (recommended) of Jenkins from here. Once you have that, run java -jar jenkins.war to start it up. You’ll be able to see the dashboard at http://localhost:8080 where you can create a new job. The job must accept the following parameters, which are sent by the command line tool when it triggers jobs.

BUILD_REVISION – This will be populated with the revision of the build that will be tested.
BUILD_TIMESTAMP – A formatted timestamp of the selected build for inclusion in the e-mail notification.
BUILD_LOCATION – The URL of build to download.
APPS – A comma-separated list of the names of the applications to test.
NOTIFICATION_ADDRESS – The e-mail address to send the results to.
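
For illustration, triggering such a parameterized job by hand needs only one HTTP request to Jenkins’ buildWithParameters endpoint; the tool described below does the equivalent for every matched build. A Python sketch with hypothetical values:

import requests

params = {
    "BUILD_REVISION": "f98c5c2d6bba",
    "BUILD_TIMESTAMP": "24 March 2014 13:36:41",
    "BUILD_LOCATION": "https://example.com/tinderbox-builds/build.zip",
    "APPS": "Settings",
    "NOTIFICATION_ADDRESS": "you@example.com",
}
# Jenkins queues one run of the job with these parameters.
requests.post("http://localhost:8080/job/b2g.hamachi.perf/buildWithParameters",
              data=params)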

Your job can then use these parameters to run the desired tests. There are a few things I’d recommend, which we’re using for our instance. If you have access to our instance it may also make sense to use the b2g.hamachi.perf job as a template for yours:

Once you have a suitable Jenkins instance and job available, you can move onto triggering your tests. The quickest way to install the b2ghaystack tool is to run the following in a terminal:

pip install git+git://github.com/davehunt/b2ghaystack.git#egg=b2ghaystack

Note that this requires you to have Python and Git installed. I would also recommend using virtual environments to avoid polluting your global site-packages.

Once installed, you can get the full usage by running b2ghaystack --help but I’ll cover most of these by providing the example of taking a real regression identified on our dashboard and narrowing it down using the tool. It’s worth calling out the --dry-run argument though, which will allow you to run the tool without actually triggering any tests.

The tool takes a regression range and determines all of the pushes that took place within the range. It will then look at the tinderbox builds available and try to match them up with the revisions in the pushes. For each of these builds it will trigger a Jenkins job, passing the variables mentioned above (revision, timestamp, location, apps, e-mail address). The tool itself does not attempt to analyse the results, and neither does the Jenkins job. By passing an e-mail address to notify, we can send an e-mail for each build with the test results. It is then up to the recipient to review and act on them. Ultimately we may submit these results to our dashboard, where they can fill in the gaps between the existing results.

The regression I’m going to use in my example was from February, where we actually had an issue preventing the tests from running for a week. When the issue was resolved, the regression presented itself. This is an unusual situation, but it serves as a good example given the very wide regression range.

Below you can see a screenshot of this regression on our B2G dashboard. The regression is also available to see on our generic dashboard.


Performance regression shown on B2G dashboard

It is necessary to determine the last known good and first known bad gecko revisions in order to trigger tests for builds in between these two points. At present, the dashboard only shows the git revisions for our builds, but we need to know the mercurial equivalents (see bug 979826). Both revisions are present in the sources.xml available alongside the builds, and I’ve been using this to translate them.

For our regression, the last known good revision was 07739c5c874f from February 10th, and the first known bad was 318c0d6e24c5 from February 17th. I first ran this against the mozilla-central branch:

b2ghaystack -b mozilla-central --eng -a Settings -u username -p password -j http://localhost:8080 -e dhunt@mozilla.com hamachi b2g.hamachi.perf 07739c5c874f 318c0d6e24c5

-b mozilla-central specifies the target branch to discover tinderbox builds for.
--eng means the builds selected will have the necessary tools to run my tests.
-a Settings limits my test to just the Settings app, as it’s one of the affected apps, and means my jobs will finish much sooner.
-u username and -p password are my credentials for accessing the device builds.
-j http://localhost:8080 is the location of my Jenkins instance.
-e dhunt@mozilla.com is where I want the results to be sent.
hamachi is the device I’m testing against.
b2g.hamachi.perf is the name of the job I’ve set up in Jenkins. Finally, the last two arguments are the good and bad revisions as determined previously.

This discovered 41 builds, but to prevent overloading Jenkins the tool only triggers a maximum of 10 builds (this can be overridden using the -m command line option). The ten builds are interspersed from the 41, and had the range of f98c5c2d6bba:4f9f58d41eac.
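
The interspersing itself is conceptually simple; something like this sketch (not the tool’s actual code) picks an even spread while keeping the endpoints:

def intersperse(builds, maximum=10):
    # Pick at most `maximum` builds spread evenly across the range,
    # always keeping the first and last.
    if len(builds) <= maximum:
        return builds
    step = (len(builds) - 1) / float(maximum - 1)
    return [builds[int(round(i * step))] for i in range(maximum)]

print(intersperse(range(41)))  # [0, 4, 9, 13, 18, 22, 27, 31, 36, 40]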

Here’s an example of what the tool will output to the console:

Getting revisions from: https://hg.mozilla.org/mozilla-central/json-pushes?fromchange=07739c5c874f&tochange=318c0d6e24c5
--------> 45 revisions found
Getting builds from: https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/tinderbox-builds/mozilla-central-hamachi-eng/
--------> 301 builds found
--------> 43 builds within range
--------> 40 builds matching revisions
Build count exceeds maximum. Selecting interspersed builds.
--------> 10 builds selected (f98c5c2d6bba:339f0d450d46)
Application under test: Settings
Results will be sent to: dhunt@mozilla.com
Triggering b2g.hamachi.perf for revision: f98c5c2d6bba (1392094613)
Triggering b2g.hamachi.perf for revision: bd4f1281c3b7 (1392119588)
Triggering b2g.hamachi.perf for revision: 3b3ac98e0dc1 (1392183487)
Triggering b2g.hamachi.perf for revision: 7920df861c8a (1392222443)
Triggering b2g.hamachi.perf for revision: a2939bac372b (1392276621)
Triggering b2g.hamachi.perf for revision: 6687d299c464 (1392339255)
Triggering b2g.hamachi.perf for revision: 0beafa155ee9 (1392380212)
Triggering b2g.hamachi.perf for revision: f6ab28f98ee5 (1392434447)
Triggering b2g.hamachi.perf for revision: ed8c916743a2 (1392488927)
Triggering b2g.hamachi.perf for revision: 339f0d450d46 (1392609108)

None of these builds replicated the issue, so I took the last revision, 4f9f58d41eac and ran again in case there were more builds appropriate but previously skipped due to the maximum of 10:

b2ghaystack -b mozilla-central --eng -a Settings -u username -p password -j http://localhost:8080 -e dhunt@mozilla.com hamachi b2g.hamachi.perf 4f9f58d41eac 318c0d6e24c5

This time no builds matched, so I wasn’t going to be able to reduce the regression range using the mozilla-central tinderbox builds. I moved on to the mozilla-inbound builds and used the original range:

b2ghaystack -b mozilla-inbound --eng -a Settings -u username -p password -j http://localhost:8080 -e dhunt@mozilla.com hamachi b2g.hamachi.perf 07739c5c874f 318c0d6e24c5

Again, no builds matched. This is most likely because we only retain the mozilla-inbound builds for a short time. I moved on to the b2g-inbound builds:

b2ghaystack -b b2g-inbound --eng -a Settings -u username -p password -j http://localhost:8080 -e dhunt@mozilla.com hamachi b2g.hamachi.perf 07739c5c874f 318c0d6e24c5

This found a total of 187 builds within the range 932bf66bc441:9cf71aad6202, and 10 of these ran. The very last one replicated the regression, so I ran again with the new revisions:

b2ghaystack -b b2g-inbound --eng -a Settings -u username -p password -j http://localhost:8080 -e dhunt@mozilla.com hamachi b2g.hamachi.perf e9025167cdb7 9cf71aad6202

This time there were 14 builds, and 10 ran. The penultimate build replicated the regression. Just in case I could narrow it down further, I ran with the new revisions:

b2ghaystack -b b2g-inbound --eng -a Settings -u username -p password -j http://localhost:8080 -e dhunt@mozilla.com hamachi b2g.hamachi.perf b2085eca41a9 e9055e7476f1

No builds matched, so I had my final regression range. The last good build was with revision b2085eca41a9 and the first bad build was with revision e9055e7476f1. This results in a pushlog with just four pushes.

Of these pushes, one stood out as a possible cause for the regression: Bug 970895: Use I/O loop for polling memory-pressure events, r=dhylands The code for polling sysfs for memory-pressure events currently runs on a separate thread. This patch implements this functionality for the I/O thread. This unifies the code base a bit and also safes some resources.

It turns out this was reverted for causing bug 973824, which was a duplicate of bug 973940. So, regression found!

Here’s an example of the notification e-mail content that our Jenkins instance will send:

b2g.hamachi.perf - Build # 54 - Successful:

Check console output at http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.perf/54/ to view the results.

Revision: 2bd34b64468169be0be1611ba1225e3991e451b7
Timestamp: 24 March 2014 13:36:41
Location: https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/tinderbox-builds/b2g-inbound-hamachi-eng/20140324133641/

Results:
Settings, cold_load_time: median:1438, mean:1444, std: 58, max:1712, min:1394, all:1529,1415,1420,1444,1396,1428,1394,1469,1455,1402,1434,1408,1712,1453,1402,1436,1400,1507,1447,1405,1444,1433,1441,1440,1441,1469,1399,1448,1447,1414

Hopefully this tool will be useful for determining the cause for regressions much sooner than we are currently capable of doing. I’m sure there are various improvements we could make to this tool – this is very much a first iteration! Please file bugs and CC or needinfo me (:davehunt), or comment below if you have any thoughts or concerns.

March 26, 2014 10:54 PM

March 25, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

March 25, 2014 06:39 AM

Mark Côté

Moving Bugzilla from Bazaar to Git

Or, how to migrate to git using only three programming languages

Another aspect of Bugzilla has been dragged, kicking & screaming, into the future! On March 11, 2014, the Bugzilla source moved to git.mozilla.org. We’re still mirroring to bzr.mozilla.org (more on that later), but the repository of record is now git, meaning it is the only place we accept new code.

Getting over there was no small feat, so I want to record the adventure in the hopes that it can benefit someone else, and so I can look back some day and wonder why I put myself through these things.

Background

The rationale isn’t the focus of this post, but suffice it to say that Bazaar isn’t very widely used, and many projects are abandoning it. Eric S. Raymond wrote a good post on the Emacs dev list about why they need to move from Bazaar to git. The same rationale applies to Bugzilla: “Sticking to a moribund version-control system will compound and exacerbate the project’s difficulty in attracting new talent.”

So, moving on, I started off scouring the Internet to find the best way to perform this migration. One major complication is the fact that we want to keep mirroring (one-way) to Bazaar for at least a while, since the main suggested way to upgrade Bugzilla is from bzr.mozilla.org. It was deemed unreasonable to require existing installations to switch to git to obtain a small security fix, so we’ll continue to mirror changes to Bazaar for some time.

Initial migration

I found a few posts here and there about people who had done migrations like this, but the most useful was a post by David Roth from last year that detailed how to preserve Bazaar’s commit metadata, specifically bug-tracker metadata, which Bugzilla has used on virtually every commit since switching from CVS. It involves using the --no-plain option with bzr fast-export and then translating the output to something git understands.

Interestingly, Roth’s translation script was written in C#, not my personal first choice for such a task (or any, really, since I don’t generally develop for Windows). However it compiled fine under Mono, so I could run it on a Linux box. Something I learned, though, is to not try this kind of thing on OS X, where, by default, the filesystem is case-insensitive.

As much as I’d prefer to deal with a language with which I am more comfortable, I dislike duplicated effort even more. I used Roth’s C# script as a basis, modifying it a bit for our needs. The metadata is in the form <bug URL> <resolution>. Rather than editing existing commit messages, I just took that string and pasted it to the bottom of the commit message, but only if the bug number was not already in the commit message. This actually revealed a few typos in the “Bug 123456” strings that generally start commit messages.
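
The transformation itself is tiny in any language. A Python rendering of the rule just described (URL shape abbreviated):

def augment_message(message, bug_url, resolution):
    # e.g. bug_url = "https://bugzilla.mozilla.org/show_bug.cgi?id=123456"
    bug_id = bug_url.rsplit("=", 1)[-1]
    if bug_id in message:  # bug already mentioned; leave the message alone
        return message
    return "%s\n\n%s %s" % (message, bug_url, resolution)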

There turned out to be a few other subtle bugs, like the fact that a file which is both renamed and modified in the same commit shows up, in the output from bzr fast-export, as being modified under the original name. Thus if the delete is processed first, it looks like bzr has modified a nonexistent file. Those were easy to see by comparing the contents of every file before and after migration (admittedly just for the last revision).

Since there are a lot of branches on bzr.mozilla.org, I created a bash script to record them all and make sure none were missed. It output the pre-/postmigration diff of md5 sums as well as doing a git repack for each repo, after all branches were migrated.

One thing I forgot was pushing tags via the --tags option to git push; I had to do that manually after the migration. That’s also when I discovered that the same tag existed in several related bzr branches which were all combined into one git repo. This is, of course, not allowed in git. It made me think more about how Bugzilla uses certain tags, like current-stable, which are moved after each release. In git this requires the --force option to git push and is a big no-no if the remote repo is shared. I learned that, in fact, this is also the case in bzr, though perhaps it’s regarded as less of a sin than it is in git. Anyway, I’ve since realized that those should be branches, named appropriately (per branch). Despite them not being branches in the standard sense—they’ll always point to somewhere at or behind a version branch and never fork—it’s perfectly acceptable to move them, as opposed to tags, and since they’ll always be fast-forwarded, they won’t take any more space than a lightweight tag.

Mirroring

This was a harder problem. Originally, I tried to use the bzr-git extension, and it failed when I tried to pull in changes from git. I exchanged some emails with bzr-git’s author, Jelmer Vernooij, and he said that to keep an existing bzr branch in sync with a newly formed git repo is impossible at the moment: “This is essentially the ‘roundtripping’ support that we’ve been trying to achieve with bzr-git for a while. It’s a nontrivial problem (since it requires all semantics from bzr to be preserved when pushing to git).” Considering bzr-git hasn’t had a new release in two years, I won’t be holding my breath.

Luckily (and perhaps somewhat unfortunately) Bugzilla has jumped VCSes before, as I hinted above. With the old bzr-to-cvs script as a starting point, I created a git-to-bzr script—in, of course, Perl, as the original.

This script is essentially an automated way of applying individual commits from a git branch to a bzr branch. For each commit, the entire file tree is copied from a local git clone to a local bzr checkout, bzr add and remove are executed where needed, and the changes committed with the original author, contributor, and date preserved. The script also parses out the standard “Bug X:” commit-message format and passes it to bzr’s --fixes commit option. A file called .gitrev in the bzr repo tracks the associated git commit ID for each bzr commit.

To avoid excessive disk activity, since the script polls git and bzr for changes, the script uses bzr cat to check the contents of the .gitrev file and git ls-remote to get the ID of the last git commit. If they are equal, no further actions are performed.
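
In other words, the cheap "anything new?" test looks roughly like the following (a sketch; the URLs are placeholders and error handling is omitted):

import subprocess

GIT_URL = 'https://git.example.org/bugzilla.git'   # placeholder
BZR_URL = 'bzr://bzr.example.org/bugzilla/trunk'   # placeholder

def latest_git_commit():
    # ask the remote for the tip of master without touching a local clone
    out = subprocess.check_output(['git', 'ls-remote', GIT_URL, 'refs/heads/master'])
    return out.split()[0]

def last_mirrored_commit():
    # read the git commit ID recorded in the bzr branch's .gitrev file
    return subprocess.check_output(['bzr', 'cat', '%s/.gitrev' % BZR_URL]).strip()

if latest_git_commit() == last_mirrored_commit():
    pass  # nothing new; skip the expensive tree copy entirely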

Summing up

And that, folks, is how you can migrate from bzr to git! The initial migration is pretty straightforward, more so if you don’t care about any bzr commit metadata. It was unfortunate that there was no off-the-shelf way to sync the repos afterwards, but the basic idea isn’t too complicated.

For more, there’s the project page on our wiki, and all the scripts used are in a GitHub repo for your perusal. I’m no VCS expert—I’ve never heavily used bzr, and I’m constantly learning new things about git—but feel free to ask me questions if you want our process further clarified.

March 25, 2014 12:39 AM

March 21, 2014

David Burns

Treeclosure stats

As the manager of the sheriffs I am always interested in how often the sheriffs, and anyone else, close the tree. For those who don't know who the Mozilla Sheriffs are, they are the team that manages the code landing in a number of Mozilla trees. If a bad patch lands they are the people who typically back it out. There have been some recent changes in the way the infrastructure does things, which has led to a few extra closures. Not having the data for this, I went and got it (you can see the last year's worth of data for Mozilla-Inbound below)

2013-03
         infra: 14:59:38
         no reason: 5 days, 12:13:31
         total: 6 days, 3:13:09
2013-04
         infra: 22:21:18
         no reason: 3 days, 19:30:21
         total: 4 days, 17:51:39
2013-05
         infra: 1 day, 2:03:08
         no reason: 4 days, 11:30:41
         total: 5 days, 13:33:49
2013-06
         checkin-compilation: 10:04:17
         checkin-test: 1 day, 5:48:15
         infra: 18:44:06
         no reason: 5:05:59
         total: 2 days, 15:42:37
2013-07
         backlog: 22:38:39
         checkin-compilation: 1 day, 13:05:52
         checkin-test: 2 days, 16:43:53
         infra: 1 day, 2:16:02
         no reason: 0:30:13
         other: 1:32:23
         planned: 4:59:09
         total: 6 days, 13:46:11
2013-08
         backlog: 4:13:49
         checkin-compilation: 1 day, 23:49:34
         checkin-test: 1 day, 12:32:35
         infra: 13:06:19
         total: 4 days, 5:42:17
2013-09
         backlog: 0:21:39
         checkin-compilation: 1 day, 8:27:27
         checkin-test: 2 days, 15:17:50
         infra: 15:34:16
         other: 2:02:07
         planned: 3:16:22
         total: 4 days, 20:59:41
2013-10
         checkin-compilation: 15:29:45
         checkin-test: 3 days, 10:41:33
         infra: 16:31:41
         no reason: 0:00:05
         other: 0:09:01
         planned: 2:30:35
         total: 4 days, 21:22:40
2013-11
         checkin-compilation: 1 day, 9:40:25
         checkin-test: 4 days, 18:41:35
         infra: 1 day, 19:11:36
         no reason: 0:05:54
         other: 3:28:40
         planned: 1:50:20
         total: 8 days, 4:58:30
2013-12
         backlog: 5:07:06
         checkin-compilation: 18:49:29
         checkin-test: 1 day, 16:29:16
         infra: 6:30:03
         total: 2 days, 22:55:54
2014-01
         backlog: 1:54:43
         checkin-compilation: 20:52:34
         checkin-test: 1 day, 12:22:01
         infra: 1 day, 5:37:14
         no reason: 1:20:46
         other: 4:53:42
         planned: 3:48:16
         total: 4 days, 2:49:16
2014-02
         backlog: 3:08:18
         checkin-compilation: 1 day, 12:26:35
         checkin-test: 15:30:42
         infra: 19:40:38
         no reason: 0:00:16
         other: 0:47:38
         total: 3 days, 3:34:07
2014-03
         backlog: 8:52:34
         checkin-compilation: 19:27:21
         checkin-test: 1 day, 0:37:55
         infra: 19:47:13
         other: 2:53:21
         total: 3 days, 3:38:24

I created a graph of the data showing Mozilla Inbound since we started using it in August 2012 till now.

Closures by reason on Mozilla Inbound

For the first part of the graph there wasn't any data on specific closure reasons, but the sheriffs changed that in the middle of last year. I am hoping that we can get information like this, and other interesting back-out info, into a "Tree Health Report" in Treeherder (the TBPL replacement the Automation and Tools Team is developing).

March 21, 2014 11:19 PM

Henrik Skupin

Join our first Automation Training days on March 24/26

Building software is fun. Spending countless hours or even days on something to finally get it working. Helping someone who uses your software to speed up their daily workflow. All that is fantastic, and every developer loves it. But don't let the dark side creep up on you, when customers point you to a lot of software defects. Are you still proud, and will you continue to work the way you did before?

Most likely not. Or well, let's say at least not when quality is what you want to ship. So you will start to think about how to test your application. You can do a lot of manual testing based on test plans, or just do exploratory testing. That will work as long as your application is simple enough that testing can be done in a couple of minutes. But once you have to repeat the same steps over and over again for each release, you will find it boring and lose interest or concentration. Mistakes will happen, issues will slip through your testing, and bugs will become part of the new release.

That's something you eventually want to kill off. But how? There is an easy answer to this question! Use test automation! Create tests for each feature you implement, or each regression you get fixed. Over time the growing suite of tests will make you happy, given that you have to spend nearly no time on manual tests, and you get results in a fraction of the time needed before. Releasing new versions of the application can be done much faster.

At some point, when your application is large enough, you might not even be working alone on that product anymore. There will be other developers, or even software testers whose job is to plan and execute the testing strategy. While in the past there was not such a high demand for automation knowledge among testers, job requirements have changed in recent months. Fewer and fewer companies will hire quality assurance engineers who do not have a coding background. This is hard for those testers, given that it can take ages to find a new position. Something has to change for them.

We, the Firefox Automation team at Mozilla, want to help out here. Given our knowledge of automation for various Mozilla-related projects, our goal is to support interested people in building their knowledge of software development and especially test automation. Therefore we are planning to hold automation trainings on a regular basis, all based on our own projects, so you will have the chance to practice the new things you have learned. All of that, of course, depends on the uptake of this offer and the number of participants.

The first two training days will happen on March 24th and 26th, and will mainly take place in our #automation channel on IRC. Given that we have no idea how many of you will join us on those days, or what your background is, we will start with the basics. That means we will guide you through courses on JavaScript, Python, HTML, or CSS. We will collect your feedback and extend the etherpad for automation trainings, so we end up with a wonderful list of getting-started tutorials.

For those of you who already have more experience, we will offer tasks to work on depending on your skills and interests. Please see the aforementioned etherpad for areas of work and the appropriate mentors. We guarantee it will be fun!

We would love to see you next week, and if you have questions, don't hesitate to ask here or on the automation mailing list.

March 21, 2014 04:36 PM

Andrew Halberstadt

Part 1: Sharing code is not always a good thing

Dry versus Wet

As programmers, we are taught early on that code duplication is bad and should be avoided at all cost. It makes code less maintainable, reusable and readable. The DRY principle is very basic and fundamental to how most of us approach software design. If you aren't familiar with the DRY principle, please take a minute to read the wikipedia page on it. The counterpart of DRY is WET (write everything twice). In general, I agree that DRY is good and WET is bad. But I think there is a class of problems where the DRY approach can actually be harmful. For these types of problems, I will make the claim that a WET approach can actually work better.

IFFY Requirements

So what are these problems? They are problems that have continuously evolving unpredictable requirements. Continuously evolving means that the project will continue to receive additional requirements indefinitely. Unpredictable means that you won't know when the requirements will change, how often they'll change, or what they might be.

Hold on a second, you might be thinking. If the requirements are so unpredictable, then shouldn't we be creating a new project to address them instead of trying to morph an old one to meet them? Yes! But there's a catch (hey, and isn't starting a new project just a form of code duplication?). The catch is that the requirements are continuously evolving. They change a little bit at a time over long periods (years). Usually at the beginning of a project it is not possible to tell whether the requirements will be unpredictable, or even if they will be continuously evolving. It isn't until the project has matured, and feature creep vines are firmly entrenched that these facts become apparent and by this time it is often too late. Because "continuously evolving unpredictable requirements" is a mouthful to say, I've invented an acronym to describe them. From here on out I will refer to them as IFFY (in flux for years) requirements.

An IFFY Example

This probably sounds very hand wavy at the moment, so let me give an example of a problem that has IFFY requirements. This example is what I primarily work on day to day and is the motivation behind this post, test harnesses. A test harness is responsible for testing some other piece of software. As that other piece of software evolves, so too must the test harness. If the system under test adds support for a wizzlebang, then the harness must also add support for testing a wizzlebang. Or closer to home, if the system under test suddenly becomes multiprocess, then the harness needs to support running tests in both parent and child processes. Usually the developer working on the test harness does not know when or how the system under test will evolve in a way that requires changes to the harness. The requirements are in flux for as long as the system under test continues to evolve. The requirements are IFFY.

Hopefully you now have some idea about what types of problems might benefit from a WET approach. But so far I haven't talked about why WET might be helpful and why DRY might be harmful. To do this, I'd like to present two case studies. The first is an example of where sticking to the DRY principle went horribly wrong. The second is an example of where duplicating code turned out to be a huge success.

Case Study #1: Mochitest, Reftest, XPCShell, etc.

Most of our test harnesses have life cycles that follow a common pattern. Originally they were pretty simple, consisting of a single file that unsurprisingly ran the test suite in Firefox. But as more harnesses were created, we realized that they all needed to do some common things. For example they all needed to launch Firefox, most of them needed to modify the profile in some way, etc. So we factored out the code that would be useful across test harnesses into a file called automation.py. As Firefox became more complicated, we needed to add more setup steps to the test harnesses. Automation.py became a dumping ground for anything that needed to be shared across harnesses (or even stuff that wasn't shared across harnesses, but in theory might need to be in the future). So far, there wasn't really a huge problem. We were using inheritance to share code, and sure maybe it could have been organized better, but this was more the fault of rushed developers than anything inherently wrong with the design model.

Then Mozilla announced they would be building an Android app. We scrambled to figure out how we could get our test suites running on a mobile device at production scale. We wrote a new Android specific class which inherited from the main Firefox one. This worked alright, but there was a lot of shoe-horning and finagling to get it working properly. At the end, we were sharing a fair amount of code with the desktop version, but we were also overriding and ignoring a lot of code from the desktop version. A year or so later, Mozilla announced that it would be working on B2G, an entire operating system! We went through the same process, again creating a new subclass and trying our darndest to not duplicate any code. The end result was a monstrosity of overrides, subtle changing of state, no separation of concerns, different command line options meaning different things on different platforms, the list goes on. Want to add something to the desktop Firefox version of the test harness? Good luck, chances are you'd break both Fennec and B2G in the process. Want to try and follow the execution path to understand how the harness works? Ha!

At this point you are probably thinking that this isn't the fault of the DRY principle. This is simply a case of not architecting it properly. And I completely agree! But this brings me to a point I'd like to make. Projects that have IFFY requirements are *insanely* difficult to implement properly in a way that adheres to DRY. Fennec and B2G were both massive and unpredictable requirement changes that came out of nowhere. In both cases, we were extremely behind on getting tests running in continuous integration. We had developers and product managers yelling at us to get *something* running as quickly as possible. The longer we took, the bigger the backlog of even newer requirement changes became. We didn't have time to sit down and think about the future, to implement everything perfectly. It was a mad dash to the finish line. The problem is exacerbated when you have many people all working on the same set of issues. Now you've thrown people problems into the mix and it's all but impossible to design anything coherent.

Had we simply started anew for both Fennec and B2G instead of trying to share code with the desktop version of the harness, we would have been much better off.

To re-iterate my opening paragraph, I'm not arguing that DRY is bad, or that code duplication is good. At this point I simply hope to have convinced you that there exist scenarios where striving for DRY can lead you into trouble. Next, I'll try to convince you that there exist scenarios where a WET approach can be beneficial.

Case Study #2: Mozharness

Ask anyone on releng what their most important design considerations are when they approach a problem. My guess is that somewhere on that list you'll see something about "configurability" or "being explicit". This basically means that it needs to be as easy as possible to adapt to a changing requirement. Adaptability is a key skill for a release engineer; they've been dealing with changing requirements since the dawn of computer science. The reality is that most release engineers have already learned the lesson I'm just starting to understand now (a lesson I am only beginning to see because I happen to work pretty closely with a large number of really awesome release engineers).

Hopefully if I'm wrong about anything in this next part, someone from releng will correct me. Releng was in a similar situation as our team, except instead of test harnesses, it was buildbotcustom. Buildbotcustom was where most of the Mozilla-specific buildbot code lived. That is, it was the code responsible for preparing a slave with all of the build systems, harnesses, tests, environment and libraries needed to execute a test or build job. Similar to our test harnesses, changes in requirements quickly made buildbotcustom very difficult to update or maintain (Note: I don't know whether it was DRY or WET, but that's not really important for this case study).

To solve the problem, Aki created a tool called mozharness. At the end of the day, mozharness is just a glorified execution context for running a python script. You pass in some configuration and run a script that uses said configuration. In addition to that, mozharness itself provides lots of "libraries" (yes, shared code) for the scripts to use. But mozharness is genius for a few reasons. First, logging is built into its core. Second, it is insanely configurable. But the third is this concept of actions. An action is just a function, and a script is just a series of actions. Actions are meant to be as atomic and independent as possible. Actions can live as a library in mozharness core, or be defined by the script authors themselves.
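
Mozharness's actual API is richer than this, but the shape of the idea can be sketched in a few lines of Python (everything here is illustrative, not mozharness code):

class TestScript(object):
    # a script is little more than an ordered list of named actions
    def __init__(self, config, actions):
        self.config = config      # everything variable lives in the config
        self.actions = actions    # each action is an independent, atomic step

    def run(self):
        for action in self.actions:
            print('##### Running action: %s' % action.__name__)  # logging built in
            action(self.config)

def clobber(config):
    pass  # remove the old work directory

def download_and_extract(config):
    pass  # fetch the build and tests the config points at

def run_tests(config):
    pass  # invoke the harness with config['harness_args']

TestScript({'harness_args': []}, [clobber, download_and_extract, run_tests]).run()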

What is so special about actions? It allowed us to quickly create a large number of scripts that all did similar, but not quite the same things. Instead of worrying about whether the scripts share the same code or not, we just banged out a new one in ten minutes. Instead of having one complicated script trying to abstract the shared code into one place, we have one script per platform. As you may imagine, many of these scripts look very similar, there is quite a bit of code duplication going on. At first we had intended to remove the duplicated code, since we assumed it would be a pain to maintain. Instead it turned out to be a blessing in disguise.

Like with buildbotcustom and the test harnesses themselves, mozharness scripts also suffer from IFFY requirements. The difference is, now when someone says something like "We need to pass in --whizzlebang to B2G emulator reftests and only B2G emulator reftests", it's easy to make that change. Before we'd need to do some kind of complicated special casing that doesn't scale to make sure none of the other platforms are affected (not to mention B2G desktop in this case). Now? Just change a configuration variable that gets passed into the B2G emulator reftest script. Or worst case scenario, a quick change to the script itself. It is guaranteed that our change won't affect any of the other platforms or test harnesses because the code we need to modify is not shared.
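
Since mozharness configs are plain Python dicts, a per-platform tweak like that really is a one-line change in one file; sketched below, with the hypothetical --whizzlebang option from the example above:

# b2g_emulator_reftest_config.py -- one config per platform, duplication welcome
config = {
    'suite': 'reftest',
    'harness_args': ['--whizzlebang'],  # only B2G emulator reftests get this
}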

We are now able to respond to IFFY requirements really quickly. Instead of setting us back days or weeks, a new requirement might only set us back a few hours or even minutes. With all this extra time, we can focus on improving our infrastructure, rather than always playing catchup. It's true that once in a while we'll need to change duplicated code in more than one location, but in my experience the number of times this happens (in this particular instance at least) is exceedingly rare.

Oh, by the way, remember how I said this was a lesson releng had already learned? Take a look at these files. You'll notice that the same configuration is repeated over and over again, not only for different platforms and test harnesses, but also just different slave types! This may seem like a horrible idea, until you realize that all this duplication allows releng to be extremely flexible about what jobs get run on which branches and which types of slaves. It's a lot less work to maintain some duplication, than it is to figure out a way to share the configuration while maintaining the same level of speed and flexibility when dealing with IFFY requirements.

The Takeaway

Hopefully by now I've convinced you that code duplication is not necessarily a bad thing, and that in some cases it isn't wise to blindly follow the DRY principle. If there's one takeaway from this post, it's to not take design principles for granted. Yes, code duplication is almost always bad, but that's not the same thing as always bad. Just be aware of the distinction, and use the case studies to try to avoid making the same mistakes.

FAQ (actually, no one has ever asked these)

So my project has IFFY requirements, should I just duplicate code whenever possible?

No.

Okay.. How do I know if I should use a DRY or WET approach then?

I don't think there is a surefire way, but I have a sort of litmus test. Anytime you are wondering whether to consolidate some code, ask the question "If I duplicate this code in multiple places and the requirements change, will I need to update the code in every location? Or just one?". If you answered the former, then DRY is what you want. But if you answered the latter, then a WET approach just might be better. This is a difficult question to answer, and the answer is not black and white; usually it falls somewhere in between. But at least the question gets you thinking about the answer in the first place, which is already a big step forward. Another thing to take into consideration is how much time you have to architect your solution properly.

But if you could somehow have a solution that is both DRY and flexible to incoming requirement changes, wouldn't that be better?

Yes! Inheritance is one of the most obvious and common ways to share code, so I was a bit surprised at how horribly it failed us. It turns out that DRY is just one principle. And while we succeeded at not duplicating code, we failed at several other principles (like the Open/closed principle or the Single responsibility principle). I plan on doing a part 2 blog post that explores these other principles and possible implementations further. Stay tuned!

March 21, 2014 02:39 PM

March 18, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

March 18, 2014 06:41 AM

March 16, 2014

William Lachance

Upcoming travels to Japan and Taiwan

Just a quick note that I’ll shortly be travelling from the frozen land of Montreal, Canada to Japan and Taiwan over the next week, with no particular agenda other than to explore and meet people. If any Mozillians are interested in meeting up for food or drink, and discussion of FirefoxOS performance, Eideticker, entropy or anything else… feel free to contact me at wrlach@gmail.com.

Exact itinerary:

I will also be in Taipei the week of March 31st, though I expect most of my time to be occupied with discussions/activities inside the Taipei office about FirefoxOS performance matters (the Firefox performance team is having a work week there, and I’m tagging along to talk about / hack on Eideticker and other automation stuff).

March 16, 2014 09:19 PM

March 14, 2014

William Lachance

It’s all about the entropy

[ For more information on the Eideticker software I'm referring to, see this entry ]

So recently I’ve been exploring new and different methods of measuring things that we care about on FirefoxOS — like startup time or amount of checkerboarding. With Android, where we have a mostly clean signal, these measurements were pretty straightforward. Want to measure startup times? Just capture a video of Firefox starting, then compare the frames pixel by pixel to see how much they differ. When the pixels aren’t that different anymore, we’re “done”. Likewise, to measure checkerboarding we just calculated the areas of the screen where things were not completely drawn yet, frame-by-frame.
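
That "done when the frames stop changing" check is easy to express; here is a minimal sketch, assuming the frames are numpy arrays and using a made-up threshold:

import numpy

def startup_frame(frames, threshold=1000.0):
    # return the index of the first frame after which the capture has
    # essentially stopped changing, i.e. the browser looks "done"
    for i in range(len(frames) - 1):
        diff = numpy.abs(frames[i + 1].astype('float') - frames[i].astype('float'))
        if diff.sum() < threshold:
            return i + 1
    return len(frames) - 1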

On FirefoxOS, where we’re using a camera to measure these things, it has not been so simple. I’ve already discussed this with respect to startup time in a previous post. One of the ideas I talk about there is “entropy” (or the amount of unique information in the frame). It turns out that this is a pretty deep concept, and is useful for even more things than I thought of at the time. Since this is probably a concept that people are going to be thinking/talking about for a while, it’s worth going into a little more detail about the math behind it.

The wikipedia article on information theoretic entropy is a pretty good introduction. You should read it. It all boils down to this formula:

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)

You can see this section of the wikipedia article (and the various articles that it links to) if you want to break down where that comes from, but the short answer is that given a set of random samples, the more different values there are, the higher the entropy will be. Look at it from a probabilistic point of view: suppose you take a random set of data and want to make predictions about what future data will look like. If the data is highly random, it will be harder to predict what comes next. Conversely, if it is more uniform, it is easier to predict what form it will take.

Another, possibly more accessible way of thinking about the entropy of a given set of data would be “how well would it compress?”. For example, a bitmap image with nothing but black in it could compress very well as there’s essentially only 1 piece of unique information in it repeated many times — the black pixel. On the other hand, a bitmap image of completely randomly generated pixels would probably compress very badly, as almost every pixel represents several dimensions of unique information. For all the statistics terminology, etc. that’s all the above formula is trying to say.

So we have a model of entropy, now what? For Eideticker, the question is — how can we break the frame data we’re gathering down into a form that’s amenable to this kind of analysis? The approach I took (on the recommendation of this article) was to create a histogram with 256 bins (representing the number of distinct possibilities in a black & white capture) out of all the pixels in the frame, then run the formula over that. The exact function I wound up using looks like this:


import math

import numpy
from scipy import ndimage


def _get_frame_entropy((i, capture, sobelized)):
    frame = capture.get_frame(i, True).astype('float')
    if sobelized:
        frame = ndimage.median_filter(frame, 3)

        dx = ndimage.sobel(frame, 0)  # horizontal derivative
        dy = ndimage.sobel(frame, 1)  # vertical derivative
        frame = numpy.hypot(dx, dy)  # magnitude
        frame *= 255.0 / numpy.max(frame)  # normalize (Q&D)

    histogram = numpy.histogram(frame, bins=256)[0]
    histogram_length = sum(histogram)
    samples_probability = [float(h) / histogram_length for h in histogram]
    entropy = -sum([p * math.log(p, 2) for p in samples_probability if p != 0])

    return entropy

[Context]

The “sobelized” bit allows us to optionally convolve the frame with a sobel filter before running the entropy calculation, which removes most of the data in the capture except for the edges. This is especially useful for FirefoxOS, where the signal has quite a bit of random noise from ambient lighting that artificially inflates the entropy values even in places where there is little actual “information”.

This type of transformation often reveals very interesting information about what’s going on in an eideticker test. For example, take this video of the user panning down in the contacts app:

If you graph the entropies of the frames of the capture using the formula above, you get a graph like this:

contacts scrolling entropy graph
[Link to original]

The Y axis represents entropy, as calculated by the code above. There is no inherently “right” value for this — it all depends on the application you’re testing and what you expect to see displayed on the screen. In general though, higher values are better as it indicates more frames of the capture are “complete”.

The region at the beginning where it is at about 5.0 represents the contacts app with a set of contacts fully displayed (at startup). The “flat” regions where the entropy is at roughly 4.25? Those are the areas where the app is “checkerboarding” (blanking out waiting for graphics or layout engine to draw contact information). Click through to the original and swipe over the graph to see what I mean.

It’s easy to see what a hypothetical ideal end state would be for this capture: a graph with a smooth entropy of about 5.0 (similar to the start state, where all contacts are fully drawn in). We can track our progress towards this goal (or our deviation from it) by watching the eideticker b2g dashboard and seeing whether the summation of the entropy values for frames over the entire test increases or decreases over time. If we see it generally increase, that probably means we’re seeing less checkerboarding in the capture. If we see it decrease, that might mean we’re now seeing checkerboarding where we weren’t before.
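
The number being tracked thus boils down to something like this one-liner (a sketch; num_frames is an assumed attribute of the capture object):

# one scalar per capture: if this sum climbs over time, fewer frames are blank
total_entropy = sum(_get_frame_entropy((i, capture, True))
                    for i in range(capture.num_frames))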

It’s too early to say for sure, but over the past few days the trend has been positive:

entropy-levels-climbing
[Link to original]

(note that there were some problems in the way the tests were being run before, so results before the 12th should not be considered valid)

So one concept, at least two relevant metrics we can measure with it (startup time and checkerboarding). Are there any more? Almost certainly, let’s find them!

March 14, 2014 11:52 PM

Henrik Skupin

Firefox Automation report – week 5/6 2014

A lot of things were happening in weeks 5 and 6, and we made some good progress regarding the stability of Mozmill.

Highlights

The unexpected and intermittent Jenkins crashes that affected our Mozmill CI system are totally gone now. Most likely the delayed creation of jobs made that possible; it gives Jenkins a bit more breathing room instead of bombarding it with hundreds of API calls.

For the upcoming release of Mozmill 2.0.4, a version bump of the mozdownload package dependency was necessary. So we released mozdownload 1.11. Sadly a newly introduced packaging regression caused us to release mozdownload 1.11.1 a day later.

After a lot of work on stability improvements we were able to release Mozmill 2.0.4. This version carries the largest amount of changes of the last couple of months. Restarts and shutdowns of the application are handled much better by Mozmill now. Sadly we noticed another problem during restarts of the application on OS X (bug 966234), which forced us to fix mozrunner.

Henrik released mozrunner 5.34, which includes a fix for how mozrunner retrieves the state of the application during a restart. It was failing by reporting that the application had quit while it was still running. As a result Mozmill started a new Firefox process, which was not able to access the still-in-use profile. A follow-up Mozmill release was necessary, so we went about testing it.

As another great highlight for community members who are usually not able to attend our Firefox Automation meetings, we have started to record our meetings now. So if you want to replay the meetings please check our archive.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 5 and week 6.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the Firefox Automation meetings of week 5 and week 6.

March 14, 2014 10:59 AM

Firefox Automation report – week 3/4 2014

Due to the high workload and a week of vacation I was not able to give updates on the work done by the Firefox Automation team. In the next days I really want to catch up on the reports and bring you all up to date.

Highlights

After IT set up the staging system for Mozmill CI and Henrik got all the VMs connected, the remaining Mac minis for OS X 10.6 to 10.9 were also delivered. That means our staging system is complete and can be used to test upcoming updates, and to investigate failures.

For the production system the Ubuntu 13.04 machines have been replaced by 13.10 ones. It's again a bit late, but other work stopped us from updating earlier. The next update, to 14.04, should go live faster.

Besides the above news we also had two major blockers. The first one was a top crasher in Firefox caused by the cycle collector. Henrik filed it as bug 956284, and together with Tim Taubert we got it fixed rather quickly. The second one was a critical problem with Mozmill which didn't let us successfully run restart tests anymore. As it turned out, the zombie processes which had been affecting us for a while kept the socks server port open, so the new Firefox process couldn't start its own server. As a result JSBridge failed to establish a connection. Henrik got this fixed in bug 956315.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 3 and week 4.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the Firefox Automation meetings of week 3 and week 4.

March 14, 2014 09:58 AM

March 13, 2014

David Burns

Management is hard

I have been a manager within the A*Team for 6 months now and I wanted to share what I have learnt in that time. The main thing that I have learnt is that being a manager is hard work.

Why has it been hard?

Well, being a manager requires a certain amount of people skills. Being able to speak to people and check they are doing the tasks they are supposed to is trivial; you can be a "good" manager on that basis alone. But being a great manager means knowing how to fix problems when members of your team aren't doing the things that you expect.

It's all about the people

As an engineer you learn how to break down hardware and software problems and solve them. Unfortunately, breaking down problems with people in your team is nowhere near the same. Engineering skills can be taught, even if the person struggles at first, but people skills can't be taught in the same way.

Working at Mozilla I get to work with a very diverse group of people literally from all different parts of the world. This means that when speaking to people, what you say and how you say things can be taken in the wrong way. It can be the simplest of things that can go wrong.

Careers

Being a manager means that you are there to help shape people's careers. Do a great job of it and the people that you manage will go far; do a bad job of it and you will stifle their careers and possibly make them leave the company. The team that I am in is highly skilled and highly sought after in the tech world, so losing them isn't really an option.

Feedback

Part of growing people's careers is asking for feedback and then acting upon that feedback. At Mozilla we have a process of setting goals and then measuring the impact of those goals on others. The "others" part is quite broad, from team mates to fellow paid contributors to unpaid contributors and the wider community. As a manager I need to give feedback on how I think they are doing. Personally, I reach out to people who might be interacting with the people I manage and get their thoughts.

But I don't stop at asking for feedback about the people I manage; I also ask the people I manage for feedback about how I am doing. If you are a manager and have never done this, I highly recommend it. What you can learn about yourself in a 360 review is quite eye opening. You need to set ground rules, like that the feedback is private and confidential AND MUST NOT influence how you interact with that person in the future. Criticism is hard to receive, but a manager is expected to be the adult in the room; if you don't act that way you head into the realm of a bad manager, a place you don't want to be.

So... do I still want to be a manager?

Definitely! Being a manager is hard work, let's not kid ourselves, but seeing your team succeed and the joy on people's faces when they succeed is amazing.

March 13, 2014 03:46 PM

Joel Maher

mochitests and manifests

Of all the tests that are run on tbpl, mochitests are the last ones to receive manifests.  As of this morning, we have landed all the changes needed to define all our tests in mochitest.ini files, and have removed the entries in b2g*.json by moving them into the appropriate mochitest.ini files.

Ahal has done a good job of outlining what this means for b2g in his post.  As mentioned there, this work was done by a dedicated community member, :vaibhav1994, as he continues to write patches, investigate failures, and repeat until success.

For those interested in the next steps, we are looking forward to removing our build-time filtering and starting to filter tests at runtime.  This work is being done by billm in bug 938019.  Once that is landed we can start querying which tests are enabled/disabled per platform and track that over time!
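
Once everything lives in manifests, that kind of query becomes straightforward with manifestparser; roughly like this (the filter values here are illustrative):

from manifestparser import TestManifest

manifest = TestManifest(manifests=['mochitest.ini'])
# keyword values are matched against the manifest's skip-if/run-if conditions
for test in manifest.active_tests(disabled=False, os='b2g'):
    print(test['path'])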


March 13, 2014 02:58 PM

March 12, 2014

David Burns

Don't write "Five Hidden Costs of X" but when you do I will reply

Recently I was shown that Telerik did a "Five Hidden Costs of Selenium". I knew straight away from the title that this was purely a marketing document targeting teams with little to no automation skill. For what it is worth, if you want to do automation you should really hire the right engineers for the job.

My issue with the article is not that it's wrong (there are a few items I disagree with, which are documented below) but that it is trying to sell snake oil and silver bullets. So let's even up the argument a bit. Note I am only comparing the WebDriver parts, since if it were purely Selenium IDE vs Telerik's tool then I think the comments would be fair.

No Build/Scheduling Server

Telerik say we don't have those items, and we don't. We don't want to be working on them, since there are some awesome open source products with many years' worth of engineering effort in them that you can use. These are free and allow a great amount of customisation freedom. They also work really well if you have hybrid systems as part of your test. Have you seen that ThoughtWorks has open sourced Go, a great product from people who have been doing continuous integration for nearly a decade, if not more? Don't want to host it yourself, because managing servers is a hidden cost in all worlds? Then look at the huge number of Continuous Integration as a Service companies out there.

Execution Agents/Parallel Running

It says this is a 3rd-party plugin, which is not true. The Selenium server has a remote-server system built in, and if the correct arguments are passed in it can become a grid with a central hub managing which nodes are being used. This is called Selenium Grid.

The one catch, from the documentation, is that you have to host all these nodes yourself. Does Telerik's tool create hermetic environments to run against when scheduling? Hermetic environments are something every core developer would want, and if we can't deliver that then it's not worth releasing. There are Infrastructure as a Service companies that WebDriver tests can be hooked up to, so you don't need to maintain all the infrastructure yourself. The infrastructure costs can be quite expensive if you're a smallish team; using a service here can help. Unfortunately Telerik don't offer execution nodes as a service, so you'll have to manage them yourself.

Also, it's fine that NUnit doesn't support parallel execution; get your scheduling server to run a task for each browser, and those tasks can be run in parallel.

Reporting

This is best done by the continuous integration server as part of the build and test steps. It takes the output from the tasks it has been told to execute and then reports on it. Having this as a selling point in marketing documentation feels like it is just targeting untrained staff.

Multi-Browser Support

This is where you would think we would be even, but Telerik is stuck with desktop browsers. WebDriver, due to its flexible transport mechanism (it's like we thought about it or something), lets anyone implement the server side and control a browser, and all the language bindings just work. We recently saw that with BlackBerry creating an implementation. We have Selendroid for Android and iOS-Driver for iOS. Mobile is the future, so only supporting the major desktop browsers is going to limit your future greatly.
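
To make that concrete, the wire protocol is just JSON over HTTP; any client that can issue requests like the following sketch (using the requests library against a local Selenium server with the usual /wd/hub endpoint) can drive a browser:

import json
import requests

# create a session; any server speaking the JSON wire protocol will answer
resp = requests.post('http://localhost:4444/wd/hub/session',
                     data=json.dumps({'desiredCapabilities': {'browserName': 'firefox'}}))
session_id = resp.json()['sessionId']

# drive the browser through that session
requests.post('http://localhost:4444/wd/hub/session/%s/url' % session_id,
              data=json.dumps({'url': 'https://www.mozilla.org/'}))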

Jim also mentioned that you would need to build a factory and teach it to get things running against multiple projects. You do have to, but here is a link to the Selenium project's way of handling it. We need to run our tests in multiple environments, and we do it pretty well. This is just a one-time sunk cost.

Maintenance of tests

I might be pulling a Telerik here, but look at the kind of tests the tool produces:

[screenshot of a recorded Telerik test]

lolwat? What is SENDSELECTEDDiv even?

Being able to code as an automation engineer is crucial. Being able to write good tests is useful too! I am biased, but Mozilla has some really good examples of maintainable tests. Tests are trivial to write and update since Mozilla invested in a good pattern for them. Record-and-playback tools have never been able to produce maintainable tests with meaningful API names. Also, calling yourself an automation engineer while only using record-and-playback tools hampers your career (as in, no one will take you seriously).

Now, beware of the snake oil being offered by vendors and, for all that is holy... if you want to do automation, don't do record and replay. My peers and I will not even let you past a phone screen if you don't show enough knowledge of coding and automation.

Also, if Telerik were thinking straight they would wrap WebDriver, and then they would get everything that is happening in the WebDriver world. Knowing your tests will always work in the browser no matter the platform (including mobile) is a huge selling point. And it's standards-based; that feels like a no-brainer, but I am obviously biased.

March 12, 2014 03:59 PM

March 09, 2014

William Lachance

Eideticker for FirefoxOS: Becoming more useful

[ For more information on the Eideticker software I'm referring to, see this entry ]

Time for a long overdue eideticker-for-firefoxos update. Last time we were here (almost 5 months ago! man time flies), I was discussing methodologies for measuring startup performance. Since then, Dave Hunt and I have been doing lots of work to make Eideticker more robust and useful. Notably, we now have a setup in London running a suite of Eideticker tests on the latest version of FirefoxOS on the Inari on a daily basis, reporting to http://eideticker.mozilla.org/b2g.

b2g-contacts-startup-dashboard

There were more than a few false starts, and some of the earlier data is not to be entirely trusted… but it now seems to be chugging along nicely, hopefully providing startup numbers that are a useful counterpoint to the datazilla startup numbers we've already been collecting for some time. There still seem to be some minor problems, but in general I am becoming more and more confident in it as time goes on.

One feature that I am particularly proud of is the detail view, which enables you to see frame-by-frame what’s going on. Click on any datapoint on the graph, then open up the view that gives an account of what eideticker is measuring. Hover over the graph and you can see what the video looks like at any point in the capture. This not only lets you know that something regressed, but how. For example, in the messages app, you can scan through this view to see exactly when the first message shows up, and what exact state the application is in when Eideticker says it’s “done loading”.

Capture Detail View
[link to original]

(apologies for the low quality of the video — should be fixed with this bug next week)

As it turns out, this view has also proven to be particularly useful when working with the new entropy measurements in Eideticker which I’ve been using to measure checkerboarding (redraw delay) on FirefoxOS. More on that next week.

March 09, 2014 08:31 PM

March 06, 2014

Andrew Halberstadt

Add more mach to your B2G

Getting Started

tl;dr - It is possible to add more mach to your B2G repo! To get started, install pip:

$ wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py -O - | python

Install b2g-commands:

$ pip install b2g-commands

To play around with it, cd to your B2G repo and run:

$ git pull                 # make sure repo is up to date
$ ./mach help              # see all available commands
$ ./mach help <command>    # see additional info about a command

Details

Most people who spend the majority of their time working within mozilla-central have probably become acquainted with mach. In case you aren't acquainted, mach is a generic command-dispatching tool. It is possible to write scripts called 'mach targets' which get registered with mach core and transformed into commands. Mach targets in mozilla-central have access to all sorts of powerful hooks into the build and test infrastructure which allow them to do some really cool things, such as bootstrapping your environment, running builds and tests, and generating diagnostics.
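
A minimal mach target looks roughly like this (a sketch based on the decorators mach exposed at the time of writing; the command itself is made up):

from mach.decorators import CommandProvider, Command

@CommandProvider
class HelloCommands(object):
    # each decorated method below is registered with mach core as a command
    @Command('hello', category='misc',
             description='Print a friendly greeting.')
    def hello(self):
        print('Hello from mach!')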

A contributor (kyr0) and I have been working on a side project called b2g-commands to start bringing some of that awesomeness to B2G. At the moment b2g-commands wraps most of the major B2G shell scripts, and provides some brand new ones as well. Here is a summary of its current features:

I feel it's important to re-iterate that this is *not* a replacement for the current build system. You can have b2g-commands installed and still keep your existing workflows if desired. Also important to note is that there's a good chance you'll find bugs (especially related to the bootstrap command on varying platforms), or arguments missing from your favourite commands. In that case please don't hesitate to contact me or file an issue. Or, even better, submit a pull request!

If the feature set feels a bit underwhelming, that's because this is just a first iteration. I think there is a lot of potential here to add some really useful things. Unfortunately, this is just a side project I've been working on and I don't have as much time to devote to it as I would like. So I encourage you to submit pull requests (or at least submit an issue) for any additional functionality you would like to see. In general I'll be very open to adding new features.

Future Plans

In the end, because this module lives outside the build system, it will only ever be able to wrap existing commands or create new ones from scratch. This means it will be somewhat limited in what it is capable of providing. The targets in this module don't have the same low-level hooks into the B2G and gaia repos as the targets for desktop have into gecko. My hope is that if a certain feature in this module turns out to be especially useful and/or widely used, it'll get merged into the B2G repo and be available by default.

Eventually my hope is that we implement some deeper mach integration into the various B2G repos (especially gaia) which would allow us to create even more powerful commands. I guess time will tell.

March 06, 2014 04:36 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

March 06, 2014 08:39 AM

March 05, 2014

Joel Maher

quick tip – when all else fails – “reseat”

While chatting with dminor the other day, he mentioned his camera had stopped working; after a reboot there was no mention of the camera hardware in the logs or via dmesg.  His conclusion: the camera was broken.  Since I have the same hardware and run Ubuntu 13.10 as he does, he wanted a sanity check.  My only suggestion was to turn off the computer, unplug it, take the battery out, wait 30 seconds, then reassemble and power on.

Hey my suggestion worked and now dminor has a working camera again. 

This general concept of reseating hardware is something that is easily forgotten, yet is so effective.


March 05, 2014 02:08 PM

March 04, 2014

Joel Maher

Where did all the good first bugs go?

As this is the short window for Google Summer of Code applications, I have seen a lot of requests for mochitest-related bugs to work on.  Normally, we look for new bugs with the bugs ahoy! tool.  Most of these have been picked through, so I spent some time going through a bunch of mochitest/automation-related bugs.  Many of the bugs I found were outdated, duplicates of other things, or didn't apply to the tools today.

Here is my short list of bugs to get more familiar with automation while fixing bugs which solve real problems for us:

I have added the appropriate tags to those bugs to make them good first bugs.  Please take time to look over the bug and ask questions in the bug to get a full understanding of what needs to be done and how to test it.

Happy hacking!


March 04, 2014 02:31 PM

March 03, 2014

Geoff Brown

Firefox for Android Performance Measures – February check-up

My monthly review of Firefox for Android performance measurements.

February highlights:

- Regressions in tcanvasmark, tcheck2, and tsvgx; improvement in ts-paint.

- Improvements in some eideticker startup measurements.

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android 2.2 opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcanvasmark

This test runs the third-party CanvasMark benchmark suite, which measures the browser’s ability to render a variety of canvas animations at a smooth framerate as the scenes grow more complex. Results are a score “based on the length of time the browser was able to maintain the test scene at greater than 30 FPS, multiplied by a weighting for the complexity of each test type”. Higher values are better.

7800 (start of period) – 7200 (end of period).

Regression on Feb 19 – bug 978958.

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

tcheck2

2.7 (start of period) – 24 (end of period)

Regression on Feb 25: bug 976563.

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

110000 (start of period) – 110000 (end of period)

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

375 (start of period) – 375 (end of period).

tsvgx

An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.

7500 (start of period) – 7600 (end of period).

This test both improved and regressed slightly over the month, for a slight overall regression. Bug 978878.

tp4m

Generic page load test. Lower values are better.

700 (start of period) – 710 (end of period).

No specific regression identified.

ts_paint

Startup performance test. Lower values are better.

4300 (start of period) – 3600 (end of period).

Throbber Start / Throbber Stop

These graphs are taken from http://phonedash.mozilla.org.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

throbberstart

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

throbberstop

:bc has been working on reducing noise in these results — notice the improvement. And there is more to come!

Eideticker

These graphs are taken from http://eideticker.mozilla.org. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

There is some improvement from last month’s startup measurements:

eide1

eide2

eide3

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

I did not notice any big changes this month.


March 03, 2014 09:51 PM

Andrew Halberstadt

A Workflow for using Mach with multiple Object Directories

Mach is an amazing tool which facilitates a large number of common user stories in the mozilla source tree. You can perform initial setup, execute a build, run tests, examine diagnostics, even search Google. Many of these things require an object directory. This can potentially lead to some confusion if you typically have more than one object directory at any given time. How does mach know which object directory to operate on?

It turns out that mach is pretty smart. It takes a very good guess at which object directory you want. Here is a simplification of the steps in order:

  1. If cwd is an objdir or a subdirectory of an objdir, use that
  2. If a mozconfig is detected and MOZ_OBJDIR is in it, use that
  3. Attempt to guess the objdir with build/autoconf/config.guess
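
Sketched in Python, that lookup order amounts to roughly the following (a simplification; the marker file and fallback here are illustrative, the real logic lives in mach itself):

import os

def guess_objdir(cwd, mozconfig):
    # 1. inside an objdir? walk upwards looking for a build marker
    d = cwd
    while d != os.path.dirname(d):
        if os.path.exists(os.path.join(d, 'config.status')):
            return d
        d = os.path.dirname(d)
    # 2. a detected mozconfig that names MOZ_OBJDIR wins next
    if mozconfig and 'MOZ_OBJDIR' in mozconfig:
        return mozconfig['MOZ_OBJDIR']
    # 3. otherwise fall back to guessing via build/autoconf/config.guess
    return None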

The cool thing about this is that there are tons of different workflows that fit nicely into this model. For example, many people put the mach binary on their $PATH and then always make sure to 'cd' into their objdirs before invoking related mach commands.

It turns out that mach works really well with a tool I had written quite a while back called mozconfigwrapper. I won't go into details about mozconfigwrapper here; for more info, see my previous post on it. Now for the sake of example, let's say we have a regular and a debug build called 'regular' and 'debug' respectively. Now let's say I wanted to run the 'mochitest-plain' test suite on each build, one after the other. My workflow would be (from any directory other than an objdir):

$ buildwith regular
$ mach mochitest-plain
$ buildwith debug
$ mach mochitest-plain

How does this work? Very simply, mozconfigwrapper is exporting the $MOZCONFIG environment variable under the hood anytime you call 'buildwith'. Mach will then pick up on this due to the second step listed above.

Your second question might be why bother installing mozconfigwrapper when you can just export MOZCONFIG directly? This is a matter of personal preference, but one big reason for me is the buildwith command has full tab completion, so it is easy to see which mozconfigs you have available to choose from. Also, since they are hidden away in your home directory, you don't need to memorize any paths. There are other advantages as well which you can see in the mozconfigwrapper readme.

I've especially found this workflow useful when building several platforms at once (e.g. firefox and b2g desktop) and switching back and forth between them with a high frequency. In the end, to each their own, and this is just one possible workflow out of many. If you have a different workflow please feel free to share it in the comments.

March 03, 2014 03:47 PM

February 28, 2014

David Burns

WebDriver Face To Face - February 2014

This week saw the latest WebDriver F2F to work on the specification. We held the meeting at the Mozilla San Francisco office.

The agenda for the meeting was placed, as usual, on the W3 Wiki. We had quite a lot to discuss and, as always, it was a very productive meeting.

The meeting notes are available for Tuesday and Wednesday. The most notable items are:

The other amazing thing that happened was that BlackBerry joined the working group, especially after their announcement that they have created an implementation.

And... how can I forget about this...

The specification is getting a lot of attention from the people that we need and want which makes me really excited!

February 28, 2014 06:34 PM

Joel Maher

Hi Vaibhav Agrawal, welcome to the Mozilla Community!

I have had the pleasure to work with Vaibhav for the last 6 weeks as he has joined the Mozilla community as a contributor on the A-Team.  As I have watched him grow in his skills and confidence, I thought it would be useful to introduce him and share a bit more about him.

From Vaibhav:

What is your background?

I currently reside in Pilani, a town in Rajasthan, India. The thing that I like the most about where I live is the campus life. I am surrounded by some awesome and brilliant people, and everyday I learn something new and interesting.

I am a third year student pursuing Electronics and Instrumentation at BITS Pilani. I have always loved and been involved in coding and hacking stuff. My favourite subjects so far have been Computer Programming and Data Structures and Algorithms. These courses have made my basics strong and I find the classes very interesting.

I like to code and hack stuff, and I am an open source enthusiast. Also, I enjoy solving algorithmic problems in my free time. I like following new startups and the technology they are working on, running, playing table tennis, and I am a cricket follower.

How did you get involved with Mozilla?

I have been using Mozilla Firefox for many years now. I recently started contributing to the Mozilla community, after a friend of mine encouraged me to do so. I had no idea how the open source community worked, and I had the notion that people generally do not have time to answer the silly questions and doubts of newcomers. But guess what? I was totally wrong. The contributors in Mozilla are really very helpful and are ready to answer every trivial question that a newbie faces.

Where do you see yourself in 5 years?

I see myself as a software developer solving big problems, building great products and traveling different places!

What advice would you give someone?

Do what you believe in and have the courage to follow your heart and instincts. They somehow know what you truly want to become.

In January :vaibhav1994 popped into IRC and wanted to contribute to a bug that I was a mentor on. That was Talos bug 931337; since that first bug Vaibhav has fixed many others, including finalizing our work to support manifests in mochitest. He also wrote a great little script to generate patches for bugs 971132 and 970725.

Say hi in IRC and keep an eye out for bugs related to automation where he is uploading patches and fixing many outstanding issues!



February 28, 2014 02:07 PM

February 27, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

February 27, 2014 07:11 AM

February 25, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

February 25, 2014 07:23 AM

February 24, 2014

Joel Maher

tracking talos alerts across branches

A year without blogging and I am back. I figured there was some cool stuff to share; here is one tidbit.

In the last year I have picked up looking at Talos results and filing regression bugs for them. This has been useful. What currently happens is that when results are submitted to g.m.o (graph server) we detect a regression and send out an email to the original patch author (if we can determine it) and post to mozilla.dev.tree-management. I have been using dev.tree-management as a starting point for hunting regressions. When things are busy it can eat up a couple of hours in a day. Luckily many developers are responsible in taking action when they receive the emails.

Given that at least half of the regressions are not acted upon by the original developer, it is important to read the newsgroup. One of the things which makes it frustrating is that a single regression can generate multiple alerts (regular builds vs. PGO builds, and again as the patch merges between branches/projects).

To make my life easier, I have taken all the alerts on dev.tree-management and put them in a database (local right now). The final goal is a webUI that lets me easily annotate these alerts, similar to how tbpl handles random test failures. One thing I wanted to do was help identify duplicate alerts. Today, while working on that, I got a clear picture of what the lifecycle of a regression looks like:

mysql> select date,branch,percent,keyrevision from alerts where test='Paint' and platform='WINNT 6.2 x64' order by date ASC;
+---------------------+-------------------------+---------+--------------+
| date                | branch                  | percent | keyrevision  |
+---------------------+-------------------------+---------+--------------+
| 2014-02-14 19:41:38 | Mozilla-Inbound-Non-PGO | 10.1%   | c7802c9d6eec |
| 2014-02-15 01:03:54 | Fx-Team-Non-PGO         | 9.53%   | 7a3adc5aac28 |
| 2014-02-15 21:43:48 | Mozilla-Inbound         | 10.6%   | c7802c9d6eec |
| 2014-02-16 03:46:12 | Firefox-Non-PGO         | 8.88%   | 5d7caa093f4f |
| 2014-02-16 03:46:13 | B2g-Inbound-Non-PGO     | 9.44%   | 071885f79841 |
| 2014-02-16 14:22:38 | Fx-Team                 | 10.4%   | 7a3adc5aac28 |
| 2014-02-17 04:42:57 | B2g-Inbound             | 10.7%   | 071885f79841 |
| 2014-02-18 11:43:54 | Firefox                 | 9.76%   | eac89fb04bb9 |
+---------------------+-------------------------+---------+--------------+
8 rows in set (0.00 sec)

It is really cool to see how one change can generate alerts over four days.
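
As a sketch of the duplicate detection I am after: alerts that share a key revision are really one regression, no matter how many branches they fire on. Everything below is hypothetical except the column layout, which follows the query above:

from collections import defaultdict

def group_alerts(rows):
    """Group alert rows by key revision.

    rows: iterable of (date, branch, percent, keyrevision) tuples,
    e.g. straight out of the query above.
    """
    groups = defaultdict(list)
    for date, branch, percent, keyrevision in rows:
        groups[keyrevision].append((date, branch, percent))
    return groups

# For the 8 rows above this yields 5 groups (e.g. c7802c9d6eec collects both
# the Mozilla-Inbound and Mozilla-Inbound-Non-PGO alerts); mapping merged
# revisions between branches would collapse them further into 1 regression.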

Stay tuned for more information on this and other topics!


February 24, 2014 09:10 PM

February 20, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

February 20, 2014 06:13 AM

February 14, 2014

Byron Jones

markup within bugzilla comments

a response i received multiple times following yesterday’s deployment of comment previews on bugzilla.mozilla.org was “wait, bugzilla allows markup within comments now?”.

yes, and this has been the case for a very long time.  here's a list of all the recognised terms on bugzilla.mozilla.org:


Filed under: bmo, mozilla

February 14, 2014 04:48 AM

February 13, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.

[screenshot: comment preview]


Filed under: bmo, mozilla

February 13, 2014 06:52 AM

February 12, 2014

Geoff Brown

Complete logcats for Android tests

“Logcats” – those Android logs you see when you execute “adb logcat” – are an essential part of debugging Firefox for Android. For a long time, we have included logcats in our Android test logs on tbpl: After a test run, we run logcat on the device, collect the output and dump it to the test log. Sometimes those logcats are very useful; other times, they are too little, too late. A typical problem is that a failure occurs early in a test run, but does not cause the test to fail immediately; by the time the test ends, the fixed-size logcat buffer has filled up and overwritten the earlier, important messages. How frustrating!

Beginning today, Android 4.0 and Android 4.2 x86 test jobs offer “complete logcats”: logcat is run for the duration of the test job, the output is collected continuously, and dumped to a file. At the end of the test job, the file is uploaded to an AWS server, and a link is displayed in tbpl. Here’s a sample of a tbpl summary:

[screenshot: tbpl summary showing the blobuploader link]

Notice the (blobuploader) line? Open that link (yes, it’s long and awkward — there’s a bug for that!) and you have a complete logcat showing what was happening on the device for the duration of the test job.
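
Under the hood the idea is simple: start logcat when the job begins and stream its output to a file until the job ends, so nothing is lost to the fixed-size buffer. A minimal sketch of that pattern in Python; the device serial and file name are made up, and the real harness wires this into the job and the blobber upload:

import subprocess

def start_logcat(serial, log_path):
    """Start a continuous logcat capture for one device or emulator.

    Returns the Popen handle; terminate it when the job ends, then
    upload log_path (blobber does the upload in production).
    """
    log_file = open(log_path, 'w')
    return subprocess.Popen(['adb', '-s', serial, 'logcat'],
                            stdout=log_file,
                            stderr=subprocess.STDOUT)

# e.g.: proc = start_logcat('emulator-5554', 'logcat-emulator-5554.log')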

We have not changed the “old” logcat features in test logs: We still run logcat at the end of most jobs and dump the output to the test log. That might be more convenient in some cases.

Are you wondering what “blobuploader” means? Curious about how the aws upload works? That’s the “blobber” project at work. See http://atlee.ca/posts/blobber-is-live.html and https://air.mozilla.org/intern-presentation-tabara/.

Unfortunately, the Android 2.2 (Tegra) test jobs use an older infrastructure which makes it difficult to implement blobber and complete logcats. There are no logcats-via-blobber for Android 2.2 — it’s only available for Android 4.0 and the newer Android emulator tests.

Happy test debugging!


February 12, 2014 05:08 PM

February 11, 2014

Henrik Skupin

Firefox Automation report – week 1/2 2014

I promised to keep up with our updates, but a major breakage in the freshly released Mozmill 2.0.4 cost me a full week of work to get the fix out. I promise that during this week I will write the reports for the weeks in January.

Highlights

With the new year our team has been reorganized and we are part of the Mozilla QA team again. That means we will have a much closer relationship with the feature owners, and will also work toward bringing more automation knowledge to everyone. The goals for our team are being worked out, and I will present them in one of my next blog posts. As of now you can find our team page on the Mozilla wiki under Firefox Automation.

Since all the new features for Mozmill-CI landed on our staging machine before Christmas, we have not experienced any further crashes of the Jenkins master. Given that, Henrik pushed all the changes to our production system. We are totally happy that the incremental updates made our system that stable, and that Mozilla QA doesn't have to cope with outages.

Henrik and Jarek were both working on the mozfile 1.1 release to make it more stable in terms of removing files when those are still in use or don’t have the right permissions set.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 1 and week 2.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the first Firefox Automation meeting of week 2.

February 11, 2014 10:10 AM

February 06, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

February 06, 2014 07:05 AM

February 03, 2014

Geoff Brown

Firefox for Android Performance Measures – January check-up

My monthly review of Firefox for Android performance measurements.

January highlights:

- only minor Talos regressions

- Eideticker startup regressions

- inconsistent improvement in many awsy measures

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android 2.2 opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcanvasmark

This test runs the third-party CanvasMark benchmark suite, which measures the browser’s ability to render a variety of canvas animations at a smooth framerate as the scenes grow more complex. Results are a score “based on the length of time the browser was able to maintain the test scene at greater than 30 FPS, multiplied by a weighting for the complexity of each test type”. Higher values are better.

7800 (start of period) – 7800 (end of period).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

[graph: tcheck2]

2.5 (start of period) – 2.7 (end of period)

Jan 16 regression – bug 961869.

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

110000 (start of period) – 110000 (end of period)

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

375 (start of period) – 375 (end of period).

tsvgx

An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.

[graph: tsvgx]

7200 (start of period) – 7500 (end of period).

Regression of Jan 7 – bug 958129.

tp4m

Generic page load test. Lower values are better.

700 (start of period) – 700 (end of period).

ts_paint

Startup performance test. Lower values are better.

4300 (start of period) – 4300 (end of period).

Throbber Start / Throbber Stop

These graphs are taken from http://phonedash.mozilla.org.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

[graph: time to throbber start]

There is so much data here, it is hard to see what is happening – bug 967052. I filtered out many of the devices to get this:

[graph: time to throbber start, selected devices]

I think existing, long-running devices are showing no regressions, and some of the new devices are exhibiting a lot of noise — a problem that :bc is working to correct.

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

[graph: time to throbber stop]

A similar story here, I think.

But there was a regression for some devices on Jan 24 – bug 964323.

Eideticker

These graphs are taken from http://eideticker.mozilla.org. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

Let’s look at our startup numbers this month:

[Eideticker startup graphs]

Regressions noted in bugs 964307 and 966580.

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

[awsy graphs]

There seems to be an improvement in several of the measurements, but it is inconsistent — it varies from one test run to the next. I wonder what that’s about.


February 03, 2014 11:46 PM

David Burns

Updated BrowserMob Proxy Python Documentation

After much neglect I have finally put some effort into writing documentation for the BrowserMob Proxy Python bindings. However, I am sure it can be a lot better!

To do this I am going to need your help! If there are specific pieces of information, or examples, that you would like in the documentation, please either raise an issue or, even better, create a pull request with the information or example that you would like to see.
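
If you have never used the bindings, the basic flow looks roughly like this. This is a sketch based on the documented API; the browsermob-proxy path is whatever your local install happens to be:

from browsermobproxy import Server

server = Server('/path/to/browsermob-proxy')  # hypothetical install path
server.start()
proxy = server.create_proxy()

proxy.new_har('example')  # start recording a HAR
# ... point a browser (e.g. via Selenium, using proxy.proxy as the
# proxy address) at some pages ...
har = proxy.har           # the captured HAR as a dict

server.stop()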

February 03, 2014 09:09 PM

January 30, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

the changes to quicksearch are worth highlighting:
the default operator, colon (:), has always performed a substring match of the value. the following operators are now also supported:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

January 30, 2014 05:21 AM

January 24, 2014

Henrik Skupin

Automation Development report – week 51/52 2013

Wow, somehow I totally missed sending out reports for our automation work. Most likely that happened because of the amount of work I had in the past couple of weeks. So for now let's do a final update before the title gets changed to 'Firefox Automation report' for 2014.

Highlights

We have released Mozmill 2.0.3 to fix a couple of issues (see the dependencies on bug 950831) seen with Firefox Metro builds and our Firefox shutdown code. We pushed those changes, together with the releases of mozmill-automation 2.0.3 and the new mozmill-environment files, to our mozmill-ci staging instance for baking.

Henrik was able to finish the work of setting up our new mozmill-ci staging instance in the SCL3 datacenter. Please see bug 947108 for details. With it we have an environment identical to the production instance and can see regressions immediately, not only when we merge to production, which was pretty bad in the past couple of weeks. So RIP, old staging server!

One of our goals for quarter 3 of 2013 was to set up a web-based configuration tool for ondemand testruns in mozmill-ci, which can be used by QA people to trigger testruns for beta and release builds. Cosmin jumped on it and got the first version implemented. You can find a running instance on Github for now. Later we want to make the tool available via http://www.mozqa.com.

To make our mozmill-ci system more stable, Henrik pushed a large set of new features and fixes to the staging instance. Our plan was to let it bake over the Christmas holidays, with the hope that Jenkins will now run much more stably.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 51 and week 52.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the last Automation Development meeting of week 51.

January 24, 2014 03:50 PM

January 23, 2014

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

January 23, 2014 05:30 AM

January 21, 2014

Geoff Brown

Android x86 tests (S4) on tbpl

Today, we started running Robocop and xpcshell tests in an Android x86 emulator environment on tbpl.

Firefox for Android has been running on Android x86 for over a year now [1] and we have had Android x86 builds on tbpl for nearly as long [2], but our attempts to run test suites on Android x86 [3] have been more problematic. (See [4] to get an appreciation of the complications.)

This is the first of our Android test jobs to run in an emulator (Android 2.2 test jobs run on Tegra boards and Android 4.0 jobs run on Panda boards). The Android x86 tests run in Android x86 emulators running Android 4.2. The emulators run on Linux 64-bit in-house machines, and we run up to 4 emulators at a time on each Linux machine.

The new tests are labelled “Android 4.2 x86 Opt” on tbpl and look like this:

[screenshot: Android 4.2 x86 Opt jobs on tbpl]

Each “set” contains up to 4 test jobs, reflecting the set of emulators that are run in parallel on each Linux box. For now, only set S4 is run on trunk trees; S4 contains xpcshell tests and robocop tests, broken up into 3 chunks, robocop-1, robocop-2, and robocop-3:

[screenshot: set S4 test jobs]

Other test suites – mochitests, reftests, etc – run on Android 4.2 x86 Opt only on the Cedar and Ash trees at this time. They mostly work, but have intermittent infrastructure failures that make them too unreliable to run on trunk trees  (bug 927602 is the main issue).


If you need to debug a test failure on this platform, try server support is available, or you can borrow a releng machine and run the mozharness job yourself.

[1] http://starkravingfinkle.org/blog/2012/11/firefox-for-android-running-on-android-x86/

[2] http://oduinn.com/blog/2012/12/20/android-x86-builds-now-on-tbpl-mozilla-org/

[3] http://armenzg.blogspot.ca/2013/09/running-hidden-firefox-for-android-42.html

[4] https://bugzilla.mozilla.org/showdependencytree.cgi?id=891959&hide_resolved=0


January 21, 2014 08:00 PM

January 20, 2014

Byron Jones

renaming mozilla-corporation-confidential to mozilla-employee-confidential

in the early days of bugzilla.mozilla.org there were three bugzilla security groups which covered all mozilla employees: mozilla-corporation-confidential (mozilla corporation employees), mozilla-foundation-confidential (mozilla foundation employees), and mozilla-confidential (both corporation and foundation). as is the way, things change. the mozilla-confidential group got deprecated and eventually disabled. mozilla-corporation-confidential’s usage within bugzilla has expanded and is now the default security group for a large number of products. mozilla itself has made efforts to remove the distinction between the foundation and the corporation.

this has resulted in bugs which should be visible to all employees but were not (see bug 941671).

late on wednesday the 19th of february we will be renaming the mozilla-corporation-confidential group to mozilla-employee-confidential, and will update the group’s membership to include mozilla foundation staff.

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

January 20, 2014 03:51 AM

January 07, 2014

Geoff Brown

Firefox for Android Performance Measures – 2013 in review

Let’s review our performance measures for 2013.

Highlights:

- significant regressions in “time to throbber start/stop” and Eideticker startup tests

- most Talos measurements stable, or regressions addressed

- slight, gradual improvements to frame rate and responsiveness

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android 2.2 opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcanvasmark

This test runs the third-party CanvasMark benchmark suite, which measures the browser’s ability to render a variety of canvas animations at a smooth framerate as the scenes grow more complex. Results are a score “based on the length of time the browser was able to maintain the test scene at greater than 30 FPS, multiplied by a weighting for the complexity of each test type”. Higher values are better.

[graph: tcanvasmark]

7800 (start of period) – 7700 (end of period).

This test was introduced in September and has been fairly stable ever since. There does however seem to be a slight, gradual regression over this period.

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

[graph: tcheck2]

4.4 (start of period) – 2.8 (end of period)

This test saw lots of regressions and improvements over 2013, ending on a stable high note.

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

[graph: trobopan]

14000 (start of period) – 110000 (end of period)

The nature of this test measurement makes it one of the most variable Talos tests. We overcame the worst of the Sept/Oct regression, but still ended the year worse off than we started.

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

[graph: tprovider]

375 (start of period) – 375 (end of period).

This test has hardly ever reported significant change — is it a useful test?

tsvgx

An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.

[graph: tsvgx]

1200 (start of period) – 7200 (end of period).

Introduced in September; the regression of Nov 27 is tracked in bug 944429.

tp4m

Generic page load test. Lower values are better.

[graph: tp4m]

700 (start of period) – 700 (end of period).

This version of tp4m was introduced in September; no significant changes here.

ts_paint

Startup performance test. Lower values are better.

[graph: ts_paint]

4300 (start of period) – 4300 (end of period)

Introduced in September; there are no significant regressions here, but there is a lot of variability, possibly related to the frequent test failures — see bug 781107.

Throbber Start / Throbber Stop

These graphs are taken from http://phonedash.mozilla.org.  Browser startup performance is measured on real phones (a variety of popular devices).

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

[graph: time to throbber start]

There is so much data here, it is hard to see what is happening, but a troubling upward trend over the year is evident.

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

[graph: time to throbber stop]

Again, there is a lot of data here. Here’s another graph that hides the data for all but a few devices:

[graph: time to throbber stop, selected devices]

Evidently we have lost a lot of ground over the year, with an increase in “time to throbber stop” of nearly 80% for some devices.

Eideticker

These graphs are taken from http://eideticker.mozilla.org. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

Eideticker confirms that startup time has regressed over the year:

[Eideticker startup graphs]

Most checkerboarding and frame rate measurements have been steady, or show slight improvement:

[Eideticker checkerboarding and frame rate graph]

Responsiveness — a new measurement added this year — is similarly steady with slight improvement:

[Eideticker responsiveness graphs]

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

[awsy graphs]

There is an upward trend here for many of the measurements, but what I find striking is how often we have managed to keep these numbers stable while adding new features.


January 07, 2014 04:01 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

January 07, 2014 05:59 AM

January 03, 2014

Mark Côté

BMO in 2013

2013 was a pretty big year for BMO! I covered a bit in my last post on BMO, but I want to sum up just some of the things that the team accomplished in 2013 as well as to give you a preview of a few things to come.

We push updates to BMO generally on a weekly basis. The changelog for each push is posted to glob’s blog and linked to from Twitter (@globau) and from BMO’s discussion forum, mozilla.tools.bmo (available via mailing list, Google Group, and USENET).

I’m leaving comments open, but if you have something to discuss, please bring it to mozilla.tools.bmo.

Stats for 2013

BMO Usage:

35 190 new users registered
130 385 new bugs filed
107 884 bugs resolved

BMO Development:

115 code pushes
1 202 new bugs filed
1 062 bugs resolved

Native REST API

2013 saw a big investment in making Bugzilla a platform, not just a tool, for feature and defect tracking (and the other myriad things people use it for!). We completed a native RESTish API to complement the antiquated XMLRPC, JSONRPC, and JSONP interfaces. More importantly, we’ve built out this API to support more and more functionality, such as logging in with tokens, adding and updating flags, and querying the permissions layer.
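
To give a flavour of the native API, fetching a bug is a single GET against /rest. A sketch using the Python requests library; the bug id and fields are arbitrary:

import requests

response = requests.get('https://bugzilla.mozilla.org/rest/bug',
                        params={'id': 35,
                                'include_fields': 'id,status,summary'})
bug = response.json()['bugs'][0]
print(bug['id'], bug['status'], bug['summary'])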

Something worth taking note of is the bzAPI compatibility layer, which will be deployed in early Q1 of 2014. bzAPI is a nice application which implements a REST interface to Bugzilla through a combination of the legacy APIs, CSV renderings, and screen-scraped HTML. It is, however, essentially a proxy service to Bugzilla, so it has both limited functionality and poorer performance than a native API. With the new bzAPI compatibility layer, site admins will just have to change a URL to take advantage of the faster built-in REST API.

We are also planning to take the best ideas from the old APIs, bzAPI, the newly added functionality, and modern REST interfaces to produce an awesome version 2.0.

Project Kick-off Form

The Project Kick-off Form that was conceived and driven by Michael Coates was launched in January. The BMO team implemented the whole thing in the preceding months and did various improvements over the course of 2013.

The Form is now in the very capable hands of Winnie Aoieong. Winnie did a Project Kick-Off Refresher Brown Bag last month if you want, well, a refresher. We’ll be doing more to support this tool in 2014.

Sandstone Skin

BMO finally got a new default look this year. This was the result of some ideas from the “Bugzilla pretty” contest, the Mozilla Sandstone style guide, and our own research and intuition. BMO is still a far cry from a slick Web 2.x (or are we at 3.0 yet?) site, but it’s a small step towards it.

Oh and we have Gravatar support now!

User Profiles

Want to get some quick stats about a Bugzilla user—how long they’ve been using Bugzilla, the length of their review queue, or the areas in which they’ve been active? Click on a user’s name and select “Profile”, or go directly to your user profile page and enter a name or email into the search field.

File bugs under bugzilla.mozilla.org :: Extensions: UserProfile if there are other stats you think might be useful.

Review Suggestions and Reminders

Code reviews were a big topic at Mozilla in 2013. The BMO team implemented a couple of related features:

System Upgrade

When we upgraded† BMO to Bugzilla 4.2, IT also moved BMO from older hardware in Phoenix to new, faster hardware in SCL3. BMO was then set up anew in Phoenix and is now the failover location in case of an outage in SCL3.

† The BMO team regularly backports particularly useful patches from later upstream Bugzilla versions and trunk, but we fully upgraded to version 4.2 in the spring of 2013.

Other Stuff

We added user and product dashboards, implemented comment tagging, improved bug update times, and added redirects for GitHub pull-request reviews.

And then there were various bits of internal plumbing largely (by design!) invisible to users, such as the great tracking-flags migration; tonnes of little fixes here and there; and of course daily administration.

Plans for 2014

We’re already at work planning and implementing new features to start 2014 off right.

Our quarterly goals and other major work items are tracked on the BMO wiki page. You can also check out our road map for some vague ideas of plans into the future; these are ideas based on our current understanding of the Mozillaverse and will almost certainly change to some degree.

January 03, 2014 04:20 PM

December 18, 2013

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

December 18, 2013 11:24 AM

December 17, 2013

Henrik Skupin

Automation Development report – week 49/50 2013

It’s getting closer to Christmas, so here is the second-to-last automation development report for 2013. Also please note that all team members, including myself, who were dedicated to Firefox Desktop automation have been transitioned back from the A-Team into the Mozilla QA team. This will enable us to have a better relationship with QA feature owners, and to get them trained in writing automated tests for Firefox. Therefore my posts will be named “Firefox Automation” in the future.

Highlights

With the latest release of Firefox 26.0, a couple of merges in our mozmill-tests repository had to be done. The work involved was all done by Andreea and Andrei on bug 945708.

To get rid of failed attempts to remove files after a testrun with Mozmill, Henrik was working on a new version of mozfile, which now includes a remove() method. It should be used by any code that removes files or directories, given that it retries multiple times when access is denied on Windows systems.
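
Usage is deliberately simple; a minimal sketch, with a made-up path:

from mozfile import remove

# Retries the deletion if Windows reports access denied, instead of
# failing the testrun on the first attempt.
remove('/tmp/stale-profile')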

We released Mozmill 2.0.2 with a couple of minor fixes and the above-mentioned file removal fixes. Besides that, mozmill-automation 2.0.2 and new mozmill-environments have been released.

We are still working on the remaining issues with Mozmill 2.0 and are hoping to get them fixed as soon as possible, so that an upgrade to the 2.0.x branch can happen.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 49 and week 50.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the last two Automation Development meetings of week 49 and week 50.

December 17, 2013 10:57 AM

December 16, 2013

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

December 16, 2013 04:55 AM

December 10, 2013

Byron Jones

comment tagging deployed to bmo

i’ve been working on a bugzilla enhancement, deployed today, which allows you to tag individual comments with arbitrary strings.

comment tagging features:

automatic collapsing of comments

the bugzilla administrator can configure a list of comment tags which will result in those comments being collapsed by default when a bug is loaded.

this allows obsolete or irrelevant comments to be hidden from the information stream.

comment grouping/threading

bugzilla shows a list of all comment tags in use on the bug, and clicking on a tag will expand those comments while collapsing all others.

this allows for simple threading of comments without diverging significantly from the current bugzilla user interface, api, and schema. you’ll be able to tag all comments relating to the same topic, and remove comments no longer relevant to that thread by removing the tag.

highlighting important comments

on bugs with a lot of information, it can be time consuming for people not directly involved in the bug to find the relevant comments.  applying comment tags to the right comments assists this, and may negate the need for information to be gathered outside of bugzilla.

for example:

implementation notes


Filed under: bmo, mozilla

December 10, 2013 04:55 PM

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

more information about comment tagging can be found in this blog post.

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

December 10, 2013 07:20 AM

December 06, 2013

Henrik Skupin

Automation Development report – week 47/48 2013

This is again an overview of the highlights of our work in weeks 47 and 48 of this year. Sorry for the delay of a couple of days, but some critical work kept me from writing this post.

Highlights

Henrik and Dave were working on a couple of Mozmill-CI updates, which have been pushed to production. The first batch of features and bug fixes landed in week 47. It included the monitoring plugin for Jenkins, which will hopefully help us figure out the reasons for the Java crashes. Also, we can finally run project branches via our CI now, even if it can only be done manually as of today. This is important for our upcoming tests for the Holly branch (Firefox Nightly without the Australis feature). The second batch, which landed last week, was intended to upgrade our system to Mozmill 2.0.1. Sadly it failed due to a couple of other failures which we hadn't seen before on our staging server, so we partly reverted the latest commits for production, and we are all working hard on getting those issues fixed.

With the failures detected while upgrading to Mozmill 2.0.1, Henrik noticed that one of them existed because of an incompatibility of mozbase with Python 2.6. See bug 944361 for details. To solve this we upgraded our OS X 10.6 boxes to Python 2.7.3, so all machines are running the same version of Python now. As a very nice side effect we noticed a speed improvement of about 25% when running our Mozmill tests!

Henrik pushed a couple of releases to pypi, which include mozprofile 0.17 with a lot of add-on manager related bug fixes, mozdownload 1.10 (see details), and mozmill 2.0.1 (see details)

To be prepared for executing the first Metro tests with Mozmill we had to prepare the mozmill-tests repository for handling multiple applications. Therefore Andreea refactored nearly the whole repository. You can find details on bug 927397.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 47 and week 48.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the last two Automation Development meetings of week 47 and week 48.

December 06, 2013 10:51 AM

November 29, 2013

Geoff Brown

Firefox for Android Performance Measures – November check-up

My monthly review of Firefox for Android performance measurements.

November highlights:

- significant improvement in tcheck2

- significant regression in tsvgx (SVG-ASAP) – bug 944429

- more devices added to phonedash

Talos

This section tracks Perfomatic graphs from graphs.mozilla.org for mozilla-central builds of Native Fennec (Android 2.2 opt). The test names shown are those used on tbpl. See https://wiki.mozilla.org/Buildbot/Talos for background on Talos.

tcanvasmark

This test runs the third-party CanvasMark benchmark suite, which measures the browser’s ability to render a variety of canvas animations at a smooth framerate as the scenes grow more complex. Results are a score “based on the length of time the browser was able to maintain the test scene at greater than 30 FPS, multiplied by a weighting for the complexity of each test type”. Higher values are better.

7800 (start of period) – 7800 (end of period).

tcheck2

Measure of “checkerboarding” during simulation of real user interaction with page. Lower values are better.

[graph: tcheck2]

30 (start of period) – 2.5 (end of period)

The regressions of October were more than reversed.

trobopan

Panning performance test. Value is square of frame delays (ms greater than 25 ms) encountered while panning. Lower values are better.

140000 (start of period) – 140000 (end of period)

tprovider

Performance of history and bookmarks’ provider. Reports time (ms) to perform a group of database operations. Lower values are better.

375 (start of period) – 375 (end of period).

tsvgx

An svg-only number that measures SVG rendering performance. About half of the tests are animations or iterations of rendering. This ASAP test (tsvgx) iterates in unlimited frame-rate mode thus reflecting the maximum rendering throughput of each test. The reported value is the page load time, or, for animations/iterations – overall duration the sequence/animation took to complete. Lower values are better.

[graph: tsvgx]

1400 (start of period) – 7500 (end of period).

Regression of Nov 27 – bug 944429


tp4m

Generic page load test. Lower values are better.

700 (start of period) – 740 (end of period)

Slight regression of Nov 27 – bug 944429


ts_paint

Startup performance test. Lower values are better.

4300 (start of period) – 4300 (end of period)

Throbber Start / Throbber Stop

These graphs are taken from http://phonedash.mozilla.org.  Browser startup performance is measured on real phones (a variety of popular devices).

[graph: time to throbber start]

“Time to throbber start” measures the time from process launch to the start of the throbber animation. Smaller values are better.

Time to throbber start was generally stable during this period.

[graph: time to throbber stop]

“Time to throbber stop” measures the time from process launch to the end of the throbber animation. Smaller values are better.

Time to throbber stop was generally stable during this period.

Note that many new devices were added this month.

Eideticker

These graphs are taken from http://eideticker.mozilla.org. Eideticker is a performance harness that measures user perceived performance of web browsers by video capturing them in action and subsequently running image analysis on the raw result.

More info at: https://wiki.mozilla.org/Project_Eideticker

Many of the eideticker graphs were generally stable this month.

[Eideticker graph]

awsy

See https://www.areweslimyet.com/mobile/ for content and background information.

[awsy graph]

awsy graphs were generally stable this month.



November 29, 2013 06:01 PM

Henrik Skupin

Mozmill speed improvements after upgrading Python from 2.6 to 2.7.3

Yesterday we tried to upgrade our mozmill-ci cluster to the previously released Mozmill 2.0.1. Sadly we failed on the OS X 10.6 machines and had to revert this change. After some investigation I found out that incompatibilities between Python 2.6 and 2.7.3 were causing this problem in mozprofile. Given the unclear status of Python 2.6 support in mozbase, and a talk in the #ateam IRC channel, I was advised to upgrade those machines to Python 2.7. I did so after some testing, also because all other machines are running Python 2.7.3 already, so I didn't expect any fallout. The first post-upgrade tests have proven this.

The interesting fact I would like to highlight here is the speed improvement we now see when running our tests. Previously a functional testrun on 10.6 took about 15 minutes; after the upgrade it is down to only 11 minutes. That's an improvement of nearly 27% with Mozmill 1.5.24. With Mozmill 2.0.1 there is a similar drop, from 8 minutes to 6 minutes.

Given all that, and the upcoming upgrade (hopefully soon) of our mozmill-ci system to Mozmill 2.0.1, we will see an overall improvement of 60% (15 minutes -> 6 minutes) per testrun!! This is totally stunning and allows us to run 2.5 times as many tests in the same timespan. With it we can, as a next step, further increase our locale coverage from 20 to 40 locales for beta and release candidate builds.

November 29, 2013 10:13 AM

November 28, 2013

William Lachance

mozregression now supports inbound builds

Just wanted to send out a quick note that I recently added inbound support to mozregression for desktop builds of Firefox on Windows, Mac, and Linux.

For the uninitiated, mozregression is an automated tool that lets you bisect through builds of Firefox to find out when a problem was introduced. You give it the last known good date, the last known bad date and off it will go, automatically pulling down builds to test. After each iteration, it will ask you whether this build was good or bad, update the regression range accordingly, and then the cycle repeats until there are no more intermediate builds.
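
If you have never run it, an invocation looks something like this (the dates here are arbitrary; check mozregression --help for the options your version supports):

$ mozregression --good=2013-11-01 --bad=2013-11-28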

Previously, it would only use nightlies, which meant a one-day granularity, and thus pretty wide regression ranges, made wider in the last year by the fact that so much more is now going into the tree over the course of a day. However, with inbound support (using the new inbound archive) we now have the potential to get a much tighter range, which should be super helpful for developers. Best of all, mozregression doesn't require any particularly advanced skills to use, which means everyone in the Mozilla community can help out.

For anyone interested, there’s quite a bit of scope to improve mozregression to make it do more things (FirefoxOS support, easier installation…). Feel free to check out the repository, the issues list (I just added an easy one which would make a great first bug) and ask questions on irc.mozilla.org#ateam!

November 28, 2013 06:14 PM

Mark Côté

VMware Tools in Ubuntu

I went about the seemingly simple task of sharing a directory in OS X with an Ubuntu VMware box so that I could code in my main desktop and run under Linux. The simple sharing dialog is of course only the beginning of the work; after that, I needed to refresh VMware tools, since I had done several kernel upgrades. Well that turned into a few hours of flailing at a command line.

For whatever reason, the kernel headers aren’t automatically found by the VMware Tools installation program, and even when you give the direct path, it still denies that they exist. Some web trawling told me that it looks for version.h, which isn’t in the root header directory, so I made a symlink. Then the installer found it but got compiler warnings a short time later.

I'm writing this post to tell you not to bother with any of that. Installing the open-vm-tools package from the multiverse repository is actually all you need! It's amazing how many different searches it took for me to finally figure that out. I had to unshare and then reshare my directory for it to finally work, but now it's working great. Let's see what happens the next time I upgrade my kernel though…
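
In other words, skip the installer entirely and just run:

$ sudo apt-get install open-vm-tools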

November 28, 2013 04:22 PM

November 26, 2013

Mark Côté

ReviewBoard

There’s been a lot of interest in improving Mozilla’s code-review process lately, so in that vein the BMO team has set up a ReviewBoard instance at https://reviewboard.allizom.org for testing and evaluation.

ReviewBoard is a lot more useful than Splinter, so I suggest you try it out. One of the features I think will be most adored is proper interdiff support, made possible by the fact that ReviewBoard knows about the repo you’re working in. Tightly related is the ability to extend the context of the patch from the repo. Check out the ReviewBoard site for more.

Review apps, like most tools, are fairly contentious, so we wanted to give Mozillians a chance to try it out before we commit to it. Other options, like Phabricator, have been suggested; we happened to have been working on ReviewBoard. I’d like to hear from the greater public before settling on one for at least a few years.

For this experimentation phase, we’ve only done minimal integration with Bugzilla, namely, having ReviewBoard use Bugzilla’s accounts. You log into ReviewBoard with your Bugzilla account, and ReviewBoard’s username autosuggest is linked to Bugzilla (similarly, reviewboard-dev uses bugzilla-dev’s user database). (Note that this version of ReviewBoard doesn’t support Persona, but it should be there soon.) There’s a lot more we could do; some examples are in bug 515210. Again I’d like to hear feedback in order to prioritize our work.

I suggest using ReviewBoard much like we use GitHub pull requests. Start a review, then paste the URL as an attachment on a Bugzilla bug. Bug 922226 is on file to get redirects working for ReviewBoard reviews the way they do for pull requests.

For now, please don’t use ReviewBoard for any non-public (e.g. security-related) or really critical reviews. While the security team has gone over ReviewBoard, we’re still considering this an evaluation phase. We’ll also have to put some work into ensuring that only the right people can see non-public reviews; Bugzilla’s security system is rather fine-grained and complicated, so this will take some thought and possibly some modifications to ReviewBoard itself (don’t worry, we have several ReviewBoard developers in house!).

Finally, to get your repo added—having a linked repo is where you really see the value of ReviewBoard—either file a bug or drop by #bmo on IRC.

Please direct all feedback to mozilla.tools.bmo. To reinforce that, I’ve disabled comments on this post.

November 26, 2013 05:37 PM

November 21, 2013

Henrik Skupin

Automation Development report – week 45/46 2013

After the last report covering a two-week cycle, I have to follow up with another one for weeks 45 and 46. Due to my move I had limited availability and had to take care of some other important things. So hopefully this is the last report covering a two-week period for the near future.

Highlights

To be able to release Mozmill 2.0.1 as soon as possible, Henrik had to fix a lot of existing bugs in mozprofile's add-on manager class. Those fixes were necessary because our restart tests were broken due to an inappropriate clean-up of add-ons after closing Firefox. In the end, 10 bugs were fixed.

We are close to get our first Metro tests running with Mozmill on Windows 8 and 8.1. But before we can really do that, the whole mozmill-tests repository has to be refactored to support multiple applications. Here both Andreea and Andrei are doing most of the work.

Dave and Henrik were both working on a couple of Mozmill-CI issues, which will help us to better diagnose the memory issues and random crashes of the Jenkins Java process. Everything has been merged to our staging server and has to bake a bit before a push to production will happen.

Henrik added a new feature to Mozmill-CI which allows us to run tests for builds off project branches. This will come into play really soon when we have to execute daily tests for the upcoming holly branch, which is mozilla-central without the Australis UX.

Dave released gaiatest 0.19, b2gpopulate 0.11, and b2gperf 0.12 to resolve leaks in performance tests.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 45 and week 46.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the last two Automation Development meetings of week 45 and week 46.

November 21, 2013 09:55 PM

November 20, 2013

David Burns

TPAC 2013 - WebDriver Face To Face and more

Last week I was at W3 TPAC for week of face to face meeting to discuss WebDriver and other W3C specifications that other working groups are working on.

Our initial agenda went up just before the meeting and we were lucky enough to get through all the items. If you would like to read the notes for the meeting, they are available for Monday and Tuesday.

Highlights from the meeting are:

There are other actions from the meeting that still need to be done, but I think the items above cover the main points, at least for me, that came out of the meeting.

I found the rest of the week really useful, both from a networking perspective and from a learning perspective. I have a lot of changes that I need to put into the WebDriver spec, and I have been getting feedback about where I am doing it wrong, which is great!

November 20, 2013 02:08 PM

Byron Jones

happy bmo push day!

the following changes have been pushed to bugzilla.mozilla.org:

discuss these changes on mozilla.tools.bmo.


Filed under: bmo, mozilla

November 20, 2013 08:29 AM

November 19, 2013

David Burns

Test The Web Forward - Shenzhen, China

Just over a week ago I was in Shenzhen, China, helping with Test The Web Forward, a great initiative to get anyone and everyone to write tests for browser companies to use. The tests are W3C conformance tests.

In the next couple of weeks I am going to document what is needed to help out and how you, yes you, can help! The WebDriver open source test suite needs to be checked against the WebDriver Specification and then ported over, or bugs raised against the spec or implementations. There will be bugs, which is a great thing!

P.S. We had 4 patches with ~10 tests from people who had never used WebDriver, so I consider that a great success. We also noticed a number of pain points we need to fix before we get more people involved.

November 19, 2013 09:47 PM