Planet Thunderbird

July 31, 2014

Kent James

Thunderbird’s Future: the TL;DR Version

In the next few months I hope to do a series of blog posts about Mozilla’s Thunderbird email client and its future. Here’s the TL;DR version (though still pretty long). These are my personal views; I have no authority to speak for Mozilla or for the Thunderbird project.

Current Status

Where We Need to Go

How We Get There

The Thunderbird team is currently planning to get together in Toronto in October 2014, and Mozilla staff are trying to plan an all-hands meeting sometime soon. Let’s discuss the future in conjunction with those events, to make sure that in 2015 we have a sustainable plan for the future.

 

July 31, 2014 08:16 PM

July 25, 2014

Instantbird

Linux nightly builds back!

Back in March, we posted that we had started building nightly builds from mozilla-central/comm-central, but because the version of CentOS we had been using was too old, we were unable to continue providing Linux nightly builds. That has now changed and (as of today) we have both 32-bit and 64-bit Linux nightlies! Since this involved installing a new operating system (CentOS 6.2) and tweaking some of the build configuration for Linux, please let us know if you see any issues! Additionally, some more up-to-date features that have been available in Mozilla Firefox for a while should now be available in Instantbird (e.g. D-Bus and PulseAudio support), and even some minor bugs were fixed!

Sorry that this took so long, but go grab your updated copy now!

July 25, 2014 01:07 PM

July 22, 2014

Calendar

Lightning 3.3 is Out the Door

I am happy to announce that Lightning 3.3, a new major release, is out the door. Here are a few release highlights:

There have also been a lot of changes in the backend that are not visible to the user. This includes better testing framework support, which will help avoid regressions in the future. A total of 103 bugs have been fixed since Lightning 2.6.

When installing or updating to Thunderbird 31, you should automatically receive the upgrade to Lightning 3.3. If something goes wrong, you can get the new versions here:

Should you be using SeaMonkey, you will have to wait for the 2.28 release, which is postponed as per this thread.

If you encounter any major issues, please comment on this blog post. Support issues are handled on support.mozilla.org. Feature requests and bug reports can be made on bugzilla.mozilla.org in the product Calendar. Be sure to search for existing bugs before you file them.

July 22, 2014 05:43 PM

July 17, 2014

Kent James

Following Wikipedia, Thunderbird Could Raise $1,600,000 in annual donations

What will it take to keep Thunderbird stable and vibrant? Although there is a dedicated, hard-working team of volunteers trying hard to keep Thunderbird alive, there has been very little progress on improvements since Mozilla drastically reduced their funding. I’ve been an advocate for some time that Thunderbird needs income to fulfill its potential, and that the best way to generate that income would be to appeal directly to its users for donations.

One internet organization that has done this successfully is Wikipedia. How much income could Thunderbird generate if it received the same income per user as Wikipedia? Surely our users, who rely on Thunderbird for critical daily communications, are at least as willing to donate as Wikipedia users.

Estimates of income from Wikipedia’s annual fundraising drive are around $20,000,000 per year. Wikipedia recently reported 11,824 million pageviews per month and 5 pageviews per user, which works out to a daily user count of roughly 78 million. Thunderbird, by contrast, has about 6 million daily users (counting daily hits to update checks), or about 8% of Wikipedia’s daily users.

If Thunderbird were willing to directly engage users and ask for donations at the same rate per user as Wikipedia, there is the potential to raise about $1,600,000 per year. That would certainly be enough income to maintain a serious team to move forward.
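
To make the arithmetic explicit, here is a minimal sketch of that back-of-envelope estimate. The 30-day month and the reading of "5 pageviews per user" as a per-day figure are my assumptions; the input numbers are simply the ones quoted above.

```python
# Back-of-envelope estimate, assuming a 30-day month and that
# "5 pageviews per user" is a per-day figure (both assumptions).
wikipedia_annual_donations = 20_000_000      # USD, estimated
wikipedia_pageviews_per_month = 11_824e6
pageviews_per_user_per_day = 5

wikipedia_daily_users = wikipedia_pageviews_per_month / 30 / pageviews_per_user_per_day
donation_per_daily_user = wikipedia_annual_donations / wikipedia_daily_users

thunderbird_daily_users = 6_000_000          # from update-check hits
thunderbird_potential = thunderbird_daily_users * donation_per_daily_user

print(f"Wikipedia daily users: {wikipedia_daily_users / 1e6:.0f} million")
print(f"Thunderbird potential: ${thunderbird_potential:,.0f} per year")
# Prints roughly 79 million daily users and about $1.5M per year,
# in the same ballpark as the $1,600,000 figure above.
```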

Wikipedia’s donation requests were fairly intrusive, with large banners at the top of all Wikipedia pages. When Firefox did a direct appeal to users early this year, the appeal was very subtle (did you even notice it?). I tried to scale the Firefox results to Thunderbird, and estimated that a similarly subtle appeal might raise $50,000 – $100,000 per year for Thunderbird. That is not sufficient to make a significant impact. We would have to be willing to be a little intrusive, like Wikipedia, if we are going to be successful. This will generate pushback, as Wikipedia’s campaign has, so we would have to be willing to live with that pushback.

But is it really in the best interest of our users to spare them an annual, slightly intrusive appeal for donations, while letting the product that they depend on each day slowly wither away? I believe that if we truly care about our users, we will take the necessary steps to ensure that we give them the best product possible, including undertaking fundraising to keep the product stable and vibrant.

July 17, 2014 06:31 AM

July 11, 2014

Kent James

The Thunderbird Tree is Green!

For the first time in a while, the Thunderbird build tree is all green. That means that all platforms are building, and passing all tests:

The Thunderbird build tree is green!

Many thanks to Joshua Cranmer for all of his hard work to make it so!

July 11, 2014 07:05 PM

Ludovic Hirlimann

Thunderbird 31 coming soon to you and needs testing love

We just released the second beta of Thunderbird 31. Please help us improve Thunderbird quality by uncovering bugs now in Thunderbird 31 beta so that developers have time to fix them.

There are two ways you can help

- Use Thunderbird 31 beta in your daily activities. For problems that you find, file a bug report that blocks our tracking bug 1008543.

- Use Thunderbird 31 beta to do formal testing. Use the Moztrap testing system: choose "Run Tests", find the Thunderbird product, and choose the 31 test run.

Visit https://etherpad.mozilla.org/tbird31testing for additional information, and to post your testing questions and results.

Thanks for contributing and helping!

Ludo for the QA team

Updated links

July 11, 2014 10:39 AM

July 09, 2014

Rumbling Edge - Thunderbird

2014-07-08 Calendar builds

Common (excluding Website bugs)-specific: (2)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

July 09, 2014 07:21 AM

2014-07-08 Thunderbird comm-central builds

Thunderbird-specific: (14)

MailNews Core-specific: (17)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

July 09, 2014 07:20 AM

July 07, 2014

Mike Conley

DocShell in a Nutshell – Part 1: Original Intents

I think in order to truly understand what the DocShell currently is, we have to find out where the idea of creating it came from. That means going way, way back to its inception, and figuring out what its original purpose was.

So I’ve gone back, peered through various archived wiki pages, newsgroup and mailing list posts, and I think I’ve figured out that original purpose.[1]

The original purpose can be, I believe, summed up in a single word: embedding.

Embedding

Back in the late ’90s, sometime after the Mozilla codebase was open-sourced, it became clear to some folks that the web was “going places”. It was the “bee’s knees”. It was the “cat’s pajamas”. As such, it was likely that more and more desktop applications were going to need to be able to access and render web content.

The thing is, accessing and rendering web content is hard. Really hard. One does not simply write web browsing capabilities into their application from scratch hoping for the best. Heartbreak is in that direction.

Instead, the idea was that pre-existing web engines could be embedded into other applications. For example, Steam, Valve’s game distribution platform, displays a ton of web content in its user interface. All of those Steam store pages? Those are web pages! They’re using an embedded web engine in order to display that stuff.[2]

So making Gecko easily embeddable was, at the time, a real goal, and a real project.

nsWebShell

The problem was that embedding Gecko was painful. The top-level component that embedders needed to instantiate and communicate with was called “nsWebShell”, and it was pretty unwieldy. Lots of knowledge about the internal workings of Gecko leaked through the nsWebShell component, and its interface changed far too often.

It was also inefficient – the nsWebShell didn’t just represent the top-level “thing that loads web content”. Instances of nsWebShell were also used recursively for subdocuments within those documents – for example, (i)frames within a webpage. These nested nsWebShells formed a tree. That’s all well and good, except for the fact that there were things that the nsWebShell loaded or did that only the top-level nsWebShell really needed to load or do. So there was definitely room for some performance improvement.

In order to correct all of these issues, a plan was concocted to retire nsWebShell in favour of several new components and a slew of new interfaces. Two of those new components were nsDocShell and nsWebBrowser.

nsWebBrowser

nsWebBrowser would be the thing that embedders would drop into the applications – it would be the browser, and would do all of the loading / doing of things that only the top-level web browser needed to do.

The interface for nsWebBrowser would be minimal, just exposing enough so that an embedder could drop one into their application with little fuss, point it at a URL, set up some listeners, and watch it dance.

nsDocShell

nsDocShell would be… well, everything else that nsWebBrowser wasn’t. So that dumping ground that was nsWebShell would get dumped into nsDocShell instead. However, a number of new, logically separated interfaces would be created for nsDocShell.

Examples of those interfaces were:

So instead of a gigantic, ever-changing interface, you had lots of smaller interfaces, many of which could eventually be frozen over time (which is good for embedders).

These interfaces also made it possible to shield embedders from various internals of the nsDocShell component that embedders shouldn’t have to worry about.

Ok, but… what was it?

But I still haven’t answered the question – what was the DocShell at this point? What was it supposed to do now that it was created?

This ancient wiki page spells it out nicely:

This class is responsible for initiating the loading and viewing of a document.

This document also does a good job of describing what a DocShell is and does.

Basically, any time a document is to be viewed, a DocShell needs to be created to view it. We create the DocShell, and then we point that DocShell at the URL, and it does the job of kicking off communications via the network layer, and dealing with the content once it comes back.

So it’s no wonder that it was (and still is!) a dumping ground – when it comes to loading and displaying content, nsDocShell is the central nexus point of communications for all components that make that stuff happen.

I believe that was the original purpose of nsDocShell, anyhow.

And why “shell”?

This is a simple question that has been on my mind since I started this. What does the “shell” mean in nsDocShell?

Y’know, I think it’s actually a fragment left over from the embedding work, and that it really has no meaning anymore. Originally, nsWebShell was the separation point between an embedder and the Gecko web engine – so I think I can understand what “shell” means in that context – it’s the touch-point between the embedder, and the embedee.

I think nsDocShell was given the “shell” moniker because it did the job of taking over most of nsWebShell’s duties. However, since nsWebBrowser was now the touch-point between the embedder and embedee… maybe shell makes less sense. I wonder if we missed an opportunity to name nsDocShell something better.

In some ways, “shell” might make some sense because it is the separation between various documents (the root document, any sibling documents, and child documents)… but I think that’s a bit of a stretch.

But now I’m racking my brain for a better name (even though a rename is certainly not worth it at this point), and I can’t think of one.

What would you rename it, if you had the chance?

What is nsDocShell doing now?

I’m not sure what’s happened to nsDocShell over the years, and that’s the point of the next few posts in this series. I’m going to be going through the commits hitting nsDocShell from 1999 until the present day to see how nsDocShell has changed and evolved.

Hold on to your butts.

Further reading

The above was gleaned from the following sources:


  1. I’m very much prepared to be wrong about any / all of this. I’m making assertions and drawing conclusions by reading and interpreting things that other people have written about DocShell – and if the telephone game is any indication, this indirect analysis can be lossy. If I have misinterpreted, misunderstood, or completely missed the point in any of the above, please don’t hesitate to comment, and I will correct it forthwith. 

  2. They happen to be using WebKit, the same web engine that powers Safari, and (until recently) Chromium. According to this, they’re using the Chromium Embedding Framework to display this web content. There are a number of applications that embed Gecko. Firefox is the primary consumer of Gecko. Thunderbird is another obvious one – when you display HTML email, it’s using the Gecko web engine in order to lay it out and display it. WINE uses Gecko to allow Windows-binaries to browse the web. Making your web engine embeddable, however, has a development cost, and over the years, making Gecko embeddable seems to have become less of a priority. Servo is a next-generation web browser engine from Mozilla Research that aims to be embeddable. 

July 07, 2014 05:12 PM

Blake Winton

Figuring out where things are in an image.

People love heatmaps.

They’re a great way to show how much various UI elements are used in relation to each other, and are much easier to read at a glance than a table of click counts would be. They can also reveal hidden patterns of usage based on the locations of elements, let us know if we’re focusing our efforts on the correct elements, and tell us how effective our communication about new features is. Because they’re so useful, one of the things I am doing in my new role is setting up the framework to provide our UX team with automatically updating heatmaps for both Desktop and Android Firefox.
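
The underlying idea is simply to map per-widget click counts onto the screen regions those widgets occupy. The sketch below is a generic Python illustration of that mapping with made-up positions and counts; the actual study described in this post kept positions in a separate data file and rendered the heatmap with web tooling, not with this code.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up widget bounding boxes (x, y, width, height) and click counts,
# purely for illustration of how counts get painted onto element regions.
positions = {
    "back-button": (10, 5, 30, 30),
    "urlbar":      (50, 5, 400, 30),
    "menu-button": (470, 5, 30, 30),
}
clicks = {"back-button": 1200, "urlbar": 800, "menu-button": 300}

# Paint each widget's click count into the region it occupies.
heat = np.zeros((40, 520))
for name, (x, y, w, h) in positions.items():
    heat[y:y + h, x:x + w] += clicks.get(name, 0)

plt.imshow(heat, cmap="hot")
plt.colorbar(label="clicks")
plt.savefig("heatmap.png")
```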

Unfortunately, we can’t just wave our wands and have a heatmap magically appear. Creating them takes work, and one of the most tedious processes is figuring out where each element starts and stops. Even worse, we need to repeat the process for each platform we’re planning on displaying. This is one of the primary reasons we haven’t run a heatmap study since 2012.

In order to not spend all my time generating the heatmaps, I had to reduce the effort involved in producing these visualizations.

Being a programmer, my first inclination was to write a program to calculate them, and that sort of worked for the first version of the heatmap, but there were some difficulties. To collect locations for all the elements, we had to display all the elements.

Firefox in the process of being customized

Customize mode (as shown above) was an obvious choice since it shows everything you could click on almost by definition, but it led people to think that we weren’t showing which elements were being clicked the most, but instead which elements people customized the most. So that was out.

Next we tried putting everything in the toolbar, or the menu, but those were a little too cluttered even without leaving room for labels, and too wide (or too tall, in the case of the menu).

A shockingly busy toolbar

Similarly, I couldn’t fit everything into the menu panel either. The only solution was to resort to some Photoshop-trickery to fit all the buttons in, but that ended up breaking the script I was using to locate the various elements in the UI.

A surprisingly tall menu panel

Since I couldn’t automatically figure out where everything was, I figured we might as well use a nicely-laid out, partially generated image, and calculate the positions (mostly-)manually.

The current version of the heatmap (Note: This is not the real data.)

I had foreseen the need for different positions for the widgets when the project started, and so I put the widget locations in their own file from the start. This meant that I could update them without changing the code, which made it a little nicer to see what’s changed between versions, but still required me to reload the whole page every time I changed a position or size, which would just have taken way too long. I needed something that could give me much more immediate feedback.

Fortunately, I had recently finished watching a series of videos from Ian Johnson (@enjalot on twitter) where he used a tool he made called Tributary to do some rapid prototyping of data visualization code. It seemed like a good fit for the quick moving around of elements I was trying to do, and so I copied a bunch of the code and data in, and got to work moving things around.

I did encounter a few problems: Tributary wasn’t functional in Firefox Nightly (but I could use Chrome as a workaround), and occasionally trying to move the cursor would change the value slider instead. Even with these glitches it only took me an hour or two to get from set-up to having all the numbers for the final result! And the best part is that since it’s all open source, you can take a look at the final result, or fork it yourself!

July 07, 2014 03:53 PM

June 30, 2014

Mike Conley

Alice in DocShell Land

I’ve been reading a book called The Annotated Alice. In this book, the late and great Martin Gardner shows us the stories of Alice’s Adventures in Wonderland and Through the Looking-Glass but supplies copious footnotes to illustrate the puns, wordplay, allusions, logic problems and satire going on beneath the text. Some of these footnotes delve into pure conjecture (there are still people to this day who theorize about various aspects of the stories), and other footnotes show quite clearly that Carroll wrote these stories with a sophisticated wit and whimsy that isn’t immediately obvious at first glance.

And it’s clear that Gardner (and others like him) have spent hours upon hours thinking and theorizing about these stories. A purposeful misspelling gets awarded a two page footnote here, and a mention of a mirror sends us off talking about matter and anti-matter and other matters (ha) of quantum physics.

So much thinking and effort to interpret these stories, and what you get out of it is a fascinating tapestry of ideas and history.

Needless to say, I’ve been finding the whole thing fascinating. It’s a hell of a read.

While reading it, I’ve wondered what it’d be like to apply the same practice to source code. Take some relatively mysterious piece of source code that only a few people feel comfortable with, and explode it out. Go through the source control history, and all of the old bugs, and see where this code came from. What was its purpose to begin with? What is its purpose now? What are the battle scars?

After much thinking, I’ve decided to try this, and I’m going to try it on a piece of Gecko called “DocShell”.

I think I just heard Ms2ger laughing somewhere.

It’s become pretty clear having talked to a few seasoned Mozilla hackers that DocShell is not well understood. The wiki page on it makes that even more clear – it starts:

The goal of this page is to serve as a dumping/organization ground for docshell docs. When someone finds out something, it should be added here in a reasonable way. By the time this gets unwieldy, hopefully we will have enough material for several actual docs on what docshell does and why.

So, I’m going to attempt to figure out what DocShell was supposed to do, and figure out what it currently does. I’m going to dig through source code, old bugs, and old CVS commits, back to the point where Netscape first open-sourced the Mozilla code-base.

It’s not going to be easy. It’s definitely going to be a multiple month, multiple post effort. I’m likely to get things wrong, or only partially correct. I’ll need help in those cases, so please comment.

And I might not succeed in figuring out what DocShell was supposed to do. But I’m pretty confident I can get a grasp on what it currently does.

So in the end, if I’m lucky, we’ll end up with a few things:

  1. A greater shared understanding of DocShell
  2. Materials that can be used to flesh out the DocShell wiki
  3. Better inline documentation for DocShell maybe?

I’ve also asked bz to forward me feedback requests for DocShell patches, so that way I get another angle of attack on understanding the code.

So, deep breath. Here goes. Watch this space.

June 30, 2014 02:41 AM

June 29, 2014

Mike Conley

Australis Performance Post-mortem Summary

Over the last few months, I’ve been talking about all of the work we put into making Australis feel fast when it shipped in Firefox 29.

I talked about where we started with our performance work, and how we grappled with the ts_paint and tpaint performance (“talos”) tests. After that, I talked a bit about the excellent tools we have (and ones we developed ourselves) to make finding our performance bottlenecks easier.

After a brief delay, I rounded out the series by talking about our tab animation performance work, and the customization transition performance work.

I think over the course of working on these things, I’ve learned quite a bit about performance work in general. If I had to distill it down to a few tidbits, it’d be:

So that’s it on the series. Enjoy your zippy Firefox!

June 29, 2014 02:26 AM

June 19, 2014

Mike Conley

Australis Performance Post-mortem Part 5: The Customize Mode Transition

Another new thing that came out with Firefox 29 is a sexy new customization interface. We wanted to make UI customization something that anybody would feel comfortable doing, instead of something only a few mighty power users might do.

The new customization mode is accessible by pressing the Menu Button (☰), and then clicking Customize.

Give it a shot now if you’ve never done it.

BAM, did you see that? What you just saw was a full-on mode switch in the browser chrome to indicate to you that you’ve put the browser into a different state – specifically, a state where much of the browser UI is malleable. You can drag and drop things in your toolbars and the menu, enable toolbars, etc.

Going in and out of customization mode is not something that most people will do frequently, but we still invested a bunch of time trying to make it smooth. I still think there is more we can do on that front, and I’ll get into that at the end of this post.

Anyhow, as with any performance-related project, we started with a way of measuring. In this case, our trusty performance team re-purposed the TART test that we used for tab animation, and pointed it at the customize mode transition. We called this new test CART (customization animation regression test, natch).

This is slightly different from the tab animation stuff, because we didn’t really have a baseline measurement to compare against – there was no old customization mode transition to try to match or beat. So we just had to do our best to do all of our processing under 16ms in order to draw the transition at 60fps.

That means that instead of comparing two sets of data over time, we’ll only be looking at a single series of points, and how they change over time.

So let’s see where we started, and where we finished, and how we got there.

CART results – pre-optimizations

So in an effort to get this blog post done, I’ve cut some corners and combined the data from a number of different platforms into a single graph. I apologize if this makes it difficult to interpret for some of you – but I’ll do my best to explain what the graph means.

I chose a representative sample of the platforms that we measured – we’ve got OS X (10.6, 10.8), Ubuntu 12.04 (32-bit), Windows XP, Windows 7, and Windows 8.

The X-axis is obviously plotting the time (we started gathering CART data right at the end of January 2014). The Y-axis plots the “final CART value”. CART, like TART, measures a number of things, and we “boil that number down” into a single value that we can plot. Each of the subtests are measured in milliseconds, so I guess you could say the Y-axis is also in milliseconds – but as it’s an aggregate, that’s not too meaningful. It’s not the greatest for detecting small shifts in the subtests (we use Datazilla for that), but it allows me to illustrate the changes over time in a pretty simple way.

So it looks like my sampled platforms are all floating around the 40s and 50s. Is that good or bad? Well, it’s neither really. It’s just our starting point, and we wanted to try to improve on that.

And so we set out trying to move those dots downwards (for this data, without going into grand detail, lower milliseconds = faster and smoother animation).

As with TART, we did a lot of profiling, and used a lot of the tools I mentioned in this post. Here are the bugs that seemed to move the needle the most.

The Rogue’s Gallery

As before, this is not a complete list, but captures some of the more interesting bugs we worked on.

Bug 932963 – Break customize mode transition into several phases

This change allowed us to break the transition in and out of customization mode into some phases:

  1. Not in customization mode (default)
  2. Entering customization mode
  3. Entered customization mode
  4. Exiting customization mode

Phases 2, 3 and 4 have attributes on the main browser window that we can target with CSS selectors. We were then able to “lighten” the CSS during phases 2 and 4, in order to optimize the frames during the transition. For example, we don’t display the semi-transparent grid texture on the main window unless we’re in phase 3.

Bug 972485 – Find out why we’re doing a bunch of synchronous file reading at the start of the customize mode transition

This was a rather surprising one – using the Gecko Profiler, it looked like we were doing sync file IO as the blank about:customizing document was loading.

For context, we have a page registered at about:customizing that’s a blank document. When we detect that about:customizing has been loaded or switched to, we enter customize mode.

So this sync file IO was causing some jank at the start of the customization mode transition.

Strangely, it turned out that the XHTML file we were loading for the blank about:customizing document was synchronously loading a bunch of MathML localization stuff as it loaded.

We switched the document from XHTML to XUL, and the sync IO load went away. We filed a follow-up bug to investigate why exactly we were doing sync file IO on loading XHTML files, because that’s a bad thing to do.

Anyhow, this was a small but significant win for almost all platforms.

Bug 975552 – Preload about:customizing like we do with about:newtab

I think this was the biggest win we achieved during the performance work. That browser that we load the blank about:customizing page in is not free – a bunch of stuff gets instantiated in order for a working browser to be created, and all of that expense is wasted on a blank document. That expense also causes enough main thread thinking that it reduces the smoothness of the customization entering transition.

So the solution was a hack that takes after the same strategy we use for about:newtab. Essentially, we preload the about:customizing browser and document in the background at what seems like a logical time (right after the user has opened the menu). This allows us to front-load all of the expense in creating the browser, and we end up with something much smoother.

Here’s a before video and an after video from my Windows 7 machine, for reference.

This was a pretty bodacious hack, and in the future, I’d like to remove it completely, and find some way of sidestepping all of the browser internal loading for the about:customizing document (or, even better, find a way of not using a browser element at all!). That’s filed here.

Bug 977796 – Disable subpixel anti-aliasing during customize mode transition.

A profile on Windows 8 showed that we were spending an inordinate amount of time rendering text during the customize mode transition, and this has to do with subpixel anti-aliasing in the menu that animates in.

So I landed a patch that temporarily disables subpixel anti-aliasing on that element during the customize mode transition, and that bought us a huge win for Windows 8 (about 20%). Not much of a win for any of the other platforms.

So where did we end up?

CART results – post-optimizations

Those numbers are indeed lower!

Now, before you lose your mind about that epic cross-platform win around February 20th, I investigated that changeset range and didn’t find anything we worked on directly in there.

There are some platform patches in there (specifically in Graphics) that might account for this win, but I’m pretty certain it’s because of bug 974621, which updated our test runners to use a different version of Talos that included the patch in bug 967186. 967186 altered the CART test to be more accurate in its measurements, so actually a lot of our initial data was erroneous. That blows a bit because it adds some unnecessary noise to our graph, but it’s also good because accuracy is a thing we definitely want.

Future work

I still think there’s more wins to be made in the customize mode transition. Finding a way of getting rid of the about:customizing preloading hack and replace it with something smarter (like a thin browser that doesn’t need to instantiate much of its backend, or no browser at all) is probably the first step. More profiling might find more wins. I’m learning more and more about platform as I work on Electrolysis, so maybe I’ll come back to this problem with some new skills and information, and I can get it performing reliably on all platforms as it really should: 60fps, smooth as silk.

June 19, 2014 05:51 PM

June 16, 2014

Calendar

Only 5 Weeks Until Thunderbird 31 – Testing Needed

Thunderbird and Lightning do major releases about every 42 weeks, and by now there are only 5 weeks left until Thunderbird 31 is released. Your matching calendaring extension version is Lightning 3.3. Betas for both products are available, and now it’s up to you: we need all the manpower we can get to make sure there are no regressions and everything still works as expected. Over a million active daily users will thank you for this!

To get started, download the beta builds and follow the standard installation procedure.

Although I doubt there will be any issues, it’s always a good idea to back up your profile. I myself have been using each of the beta builds and haven’t run into any data loss issues.

Here is a list of areas that have changed and that you should especially look into:

There is one known issue if you use a localized version of Thunderbird and Lightning. Certain locales have not been fully translated, but are included in the beta builds nevertheless. We will fix this issue for the second beta build soon.

If you notice any issues please leave a comment here as soon as possible. The more time we have to fix the issue the better. If you have a bugzilla account (or don’t mind creating one for free) please comment on an existing bug or file a new one if needed.

June 16, 2014 07:26 PM

June 13, 2014

Rumbling Edge - Thunderbird

2014-06-11 Calendar builds

Common (excluding Website bugs)-specific: (3)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

June 13, 2014 12:12 AM

2014-06-11 Thunderbird comm-central builds

Thunderbird-specific: (33)

MailNews Core-specific: (19)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

June 13, 2014 12:11 AM

May 27, 2014

Joshua Cranmer

Why email is hard, part 6: today's email security

This post is part 6 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. This part discusses how email security works in practice.

Email security is a rather wide-ranging topic, and one that I've wanted to cover for some time, well before several recent events that have made it come up in the wider public knowledge. There is no way I can hope to cover it in a single post (I think it would outpace even the length of my internationalization discussion), and there are definitely parts for which I am underqualified, as I am by no means an expert in cryptography. Instead, I will be discussing this over the course of several posts of which this is but the first; to ease up on the amount of background explanation, I will assume passing familiarity with cryptographic concepts like public keys, hash functions, as well as knowing what SSL and SSH are (though not necessarily how they work). If you don't have that knowledge, ask Wikipedia.

Before discussing how email security works, it is first necessary to ask what email security actually means. Unfortunately, the layman's interpretation is likely going to differ from the actual precise definition. Security is often treated by laymen as a boolean interpretation: something is either secure or insecure. The most prevalent model of security to people is SSL connections: these allow the establishment of a communication channel whose contents are secret to outside observers while also guaranteeing to the client the authenticity of the server. The server often then gets authenticity of the client via a more normal authentication scheme (i.e., the client sends a username and password). Thus there is, at the end, a channel that has both secrecy and authenticity [1]: channels with both of these are considered secure and channels without these are considered insecure [2].

In email, the situation becomes more difficult. Whereas an SSL connection is between a client and a server, the architecture of email is such that email providers must be considered as distinct entities from end users. In addition, messages can be sent from one person to multiple parties. Thus secure email is a more complex undertaking than just porting relevant details of SSL. There are two major cryptographic implementations of secure email [3]: S/MIME and PGP. In terms of implementation, they are basically the same [4], although PGP has an extra mode which wraps general ASCII (known as "ASCII-armor"), which I have been led to believe is less recommended these days. Since I know the S/MIME specifications better, I'll refer specifically to how S/MIME works.

S/MIME defines two main MIME types: multipart/signed, which contains the message text as a subpart followed by data indicating the cryptographic signature, and application/pkcs7-mime, which contains an encrypted MIME part. The important things to note about this delineation are that only the body data is encrypted [5], that it's theoretically possible to encrypt only part of a message's body, and that the signing and encryption constitute different steps. These factors combine to make for a potentially infuriating UI setup.
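
To make that layout concrete, here is a rough sketch (my own illustration, not drawn from any particular client) of how a multipart/signed message is structured, assembled with Python's standard email package. The signature bytes below are a placeholder standing in for real DER-encoded PKCS#7 data.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Subpart 1: the message text that actually gets signed.
body = MIMEText("Meeting moved to 3pm.")

# Subpart 2: the detached signature. A real client would put
# DER-encoded PKCS#7 signature bytes here (placeholder below).
signature = MIMEApplication(b"<pkcs7-signature-bytes>",
                            _subtype="pkcs7-signature",
                            name="smime.p7s")

signed = MIMEMultipart("signed",
                       protocol="application/pkcs7-signature",
                       micalg="sha-256")
signed.attach(body)
signed.attach(signature)

# Prints a multipart/signed message whose first part is the readable
# body and whose second part is the (base64-encoded) signature blob.
print(signed.as_string())
```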

How does S/MIME tackle the challenges of encrypting email? First, rather than encrypting using recipients' public keys, the message is encrypted with a symmetric key. This symmetric key is then encrypted with each of the recipients' keys and then attached to the message. Second, by only signing or encrypting the body of the message, the transit headers are kept intact for the mail system to retain its ability to route, process, and deliver the message. The body is supposed to be prepared in the "safest" form before transit to avoid intermediate routers munging the contents. Finally, to actually ascertain what the recipients' public keys are, clients typically passively pull the information from signed emails. LDAP, unsurprisingly, contains an entry for a user's public key certificate, which could be useful in large enterprise deployments. There is also work ongoing right now to publish keys via DNS and DANE.
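
The "encrypt the body once with a symmetric key, then wrap that key for each recipient" pattern can be sketched with the Python cryptography package. This shows only the general hybrid-encryption idea, not the actual CMS/PKCS#7 structures S/MIME uses, and the freshly generated RSA keys stand in for keys that would normally come from recipients' certificates.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Stand-ins for the recipients' certificate keys.
recipients = [rsa.generate_private_key(public_exponent=65537, key_size=2048)
              for _ in range(3)]

body = b"The quarterly numbers are attached."

# 1. Encrypt the body once with a random symmetric (content) key.
content_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(content_key).encrypt(nonce, body, None)

# 2. Wrap that content key separately for each recipient's public key;
#    all wrapped copies travel with the single shared ciphertext.
wrapped_keys = [r.public_key().encrypt(content_key, oaep) for r in recipients]

# A recipient unwraps their copy of the content key with their private
# key, then decrypts the shared ciphertext with it.
recovered_key = recipients[0].decrypt(wrapped_keys[0], oaep)
assert AESGCM(recovered_key).decrypt(nonce, ciphertext, None) == body
```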

I mentioned before that S/MIME's use can present some interesting UI design decisions. I ended up actually testing some common email clients on how they handled S/MIME messages: Thunderbird, Apple Mail, Outlook [6], and Evolution. In my attempts to create a surreptitious signed part to confuse the UI, Outlook decided that the message had no body at all, and Thunderbird decided to ignore all indication of the existence of said part. Apple Mail managed to claim the message was signed in one of these scenarios, and Evolution took the cake by always agreeing that the message was signed [7]. It didn't even bother questioning the signature if the certificate's identity disagreed with the easily-spoofable From address. I was actually surprised by how well people did in my tests—I expected far more confusion among clients, particularly since the will to maintain S/MIME has clearly been relatively low, judging by poor support for "new" features such as triple-wrapping or header protection.

Another fault of S/MIME's design is that it rests on the mistaken belief that composing a signing step and an encryption step is equivalent in strength to a simultaneous sign-and-encrypt. Another page describes this in far better detail than I have room to; note that this flaw is fixed via triple-wrapping (which has relatively poor support). This creates yet more UI burden in how to adequately describe all the various minutiae in differing security guarantees. Considering that users already have a hard time even understanding that just because a message says it's from example@isp.invalid doesn't actually mean it's from example@isp.invalid, trying to develop UI that both adequately expresses the security issues and is understandable to end-users is an extreme challenge.

What we have in S/MIME (and PGP) is a system that allows for strong guarantees, if certain conditions are met, yet is also vulnerable to breaches of security if the message handling subsystems are poorly designed. Hopefully this is a sufficient guide to the technical impacts of secure email in the email world. My next post will discuss the most critical component of secure email: the trust model. After that, I will discuss why secure email has seen poor uptake and other relevant concerns on the future of email security.

[1] This is a bit of a lie: a channel that does secrecy and authentication at different times isn't as secure as one that does them at the same time.
[2] It is worth noting that authenticity is, in many respects, necessary to achieve secrecy.
[3] This, too, is a bit of a lie. More on this in a subsequent post.
[4] I'm very aware that S/MIME and PGP use radically different trust models. Trust models will be covered later.
[5] S/MIME 3.0 did add a provision stating that if the signed/encrypted part is a message/rfc822 part, the headers of that part should override the outer message's headers. However, I am not aware of a major email client that actually handles these kind of messages gracefully.
[6] Actually, I tested Windows Live Mail instead of Outlook, but given the presence of an official MIME-to-Microsoft's-internal-message-format document which seems to agree with what Windows Live Mail was doing, I figure their output would be identical.
[7] On a more careful examination after the fact, it appears that Evolution may have tried to indicate signedness on a part-by-part basis, but the UI was sufficiently confusing that ordinary users are going to be easily confused.

May 27, 2014 12:32 AM

May 17, 2014

Rumbling Edge - Thunderbird

2014-05-17 Calendar builds

Common (excluding Website bugs)-specific: (15)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

May 17, 2014 09:47 PM

2014-05-17 Thunderbird comm-central builds

Thunderbird-specific: (45)

MailNews Core-specific: (19)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

May 17, 2014 09:46 PM

May 15, 2014

Mike Conley

Australis Performance Post-mortem Part 4: On Tab Animation

Whoops, forgot that I had a blog post series going on here where I talk about the stuff we did to make Australis blazing fast. In that time, we’ve shipped the thing (Firefox 29 represent!), so folks are actually feeling the results of this performance work, which is pretty excellent.

I ended the last post on an ominous note – something about how we were clear to land Australis on Nightly, or so we thought.

This next bit is all about timing.

There is a Performance Team at Mozilla, who are charged with making our products crazy-fast and crazy-smooth. These folks are geniuses at wringing out every last millisecond possible from a computation. It’s what they do all day long.

A pretty basic principle that I’ve learned over the years is that if you want to change something, you first have to develop a system for measuring the thing you’re trying to change. That way, you can determine if what you’re doing is actually changing things in the way you want them to.

If you’ve been reading this Australis Performance Post-mortem posts, you’ll know that we have some performance tests like this, and they’re called Talos tests.

Just about as we were finishing off the last of the tpaint and ts_paint regressions, the front-end team was suddenly made aware of a new Talos test being developed by the Performance team. This test was called TART, which stands for Tab Animation Regression Test. The purpose of this test is to exercise various tab animation scenarios and measure the time it takes to paint each frame and to proceed through the entire animation of a tab open and a tab close.

The good news was that this new test was almost ready for running on our Nightly builds!

The bad news was that the UX branch, which Australis was still on at the time, was regressing this test. And since we cannot land if we regress performance like this… it meant we couldn’t land.

Bad news indeed.

Or was it? At the time, the lot of us front-end engineers were groaning because we’d just slogged through a ton of other performance regressions. Investigating and fixing performance regressions is exhausting work, and we weren’t too jazzed that another regression had just shown up.

But thinking back, I’m somewhat glad this happened. The test showed that Australis was regressing tab animation performance, and tabs are opened every day by almost every Firefox user. Regressing tab performance is simply not a thing one does lightly. And this test caught us before we landed something that regressed those tabs mightily!

That was a good thing. We wouldn’t have known otherwise until people started complaining that their tabs were feeling sluggish when we released it (since most of us run pretty beefy development machines).

And so began the long process of investigating and fixing the TART regressions.

So how bad were things?

Let’s take a look at the UX branch in comparison with mozilla-central at the time that we heard about the TART regressions.

Here are the regressing platforms. I’ll start with Windows XP:

Where we started with TART on Windows XP

Forgive me, I couldn’t get the Graph Server to swap the colours of these two datasets, so my original silent pattern of “red is the regressor” has to be dropped. I could probably spend some time trying to swap the colours through various tricky methods, but I honestly don’t think anybody reading this will care too much.

So here we can see the TART scores for Windows XP, and the UX branch (green) is floating steadily over mozilla-central (red). Higher scores are bad. So here’s the regression.

Now let’s see OS X 10.6.

Where we started with TART on OS X 10.6

Same problem here – the UX nodes (tan) are clearly riding higher than mozilla-central. This was pretty similar to OS X 10.7 too, so I didn’t include the graph.

On OS X 10.8, things were a little bit better, but not too much:

Where we started with TART on OS X 10.8

Here, the regression was still easily visible, but not as large in magnitude.

Ubuntu was in the same boat as OS X 10.6/10.7 and Windows XP:

Where we started with TART on Ubuntu

But what about Windows 7 and Windows 8? Well, interesting story – believe it or not, on those platforms, UX seemed to perform better than mozilla-central:

Windows 7 (the blue nodes are mozilla-central, the tan nodes are UX)

Where we started with TART on Windows 7

Windows 8 (the green nodes are mozilla-central, the red nodes are UX)

Where we started with TART on Windows 8

So what the hell was going on?

Well, we eventually figured it out. I’ll lay it out in the next few paragraphs. The following is my “rogue’s gallery” of regressions. This list does not include many false starts and red herrings that we followed during the months working on these regressions. Think of this as “getting to the good parts”.

Backfilling

The problem with having a new test, and having mozilla-central better than UX, is not knowing where UX got worse; there were no historical measurements that we could look at to see where the regression was introduced.

MattN, smart guy that he is, got us a few talos loaner machines, and wrote some scripts to download the Nightlies for both mozilla-central and UX going back to the point where UX split off. Then, he was able to run TART on these builds, and supply the results to his own custom graph server.

So basically, we were able to backfill our missing TART data, and that helped us find a few points of regression.

With that data, now it was time to focus in on each platform, and figure out what we could do with it.

Windows XP

We started with XP, since on the regressing platforms, that’s where most of our users are.

Here’s what we found and fixed:

Bug 916946 – Stop animating the back-button when enabling or disabling it.

During some of the TART tests, we start with a single tab, open a new tab, and then close the new tab, and repeat. That first tab has some history, so the back button is enabled. The new tab has no history, so the back button is disabled.

Apparently, we had some CSS that was causing us to animate the back-button when we were flipping back and forth between the enabled and disabled states. That CSS got introduced in the patch that bound the back/forward/stop/reload buttons to the URL bar. It seemed to affect Windows only. Fixing that CSS gave us our first big win on Windows XP, and gave us more of a lead on Windows 7 and Windows 8!

Bug 907544 - Pass the D3DSurface9 down into Cairo so that it can release the DC and LockRect to get at the bits

I don’t really remember how this one went down (and I don’t want to really spend the time swapping it back in by reading the bug), but from my notes it looks like the Graphics team identified this possible performance bottleneck when I showed them some profiles I gathered when running TART.

The good news on this one, was that it definitely gave the UX branch a win on Windows XP. The bad news is that it gave the same win to mozilla-central. This meant that while overall performance got better on Windows XP, we still had the same regression preventing us from landing.

Bug 919541 - Consider not animating the opacity for Australis tabs

Jeff Muizelaar helped me figure this out while we were using paint flashing to analyze paint activity while opening and closing tabs. When we slowed down the transitions, we noticed that the closing tabs were causing paints even though they weren’t visible. Closing tabs aren’t visible because with Australis we don’t show the tab shape around tabs when they’re not selected – and closing a tab automatically unselects it in favour of another tab.

For some reason, our layout and graphics code still wanted to paint this transition even though the element was not visible. We quickly nipped that in the bud, and got ourselves a nice win on tab close measurements for all platforms!

Bug 921038 – Move selected tab curve clip-paths into SVG-as-an-image so it is cached.

This was the final nail in the coffin for the TART regression on Windows XP. Before this bug, we were drawing the linear-gradients in the tab shape using CSS, and the clipping for the curve background colour was being pulled off using clip-path and an SVG curve defined in the browser.xul document.

In this bug, we moved from clipping a background to create the curve, to simply drawing a filled curve using SVG, and putting the linear-gradient for the texture in the “stroke” image (the image that overlays the border on the tab curve).

That by itself was not enough to win back the regression – but thankfully, Seth Fowler had been working on SVG caching, and with that cache backend, our patch here knocked the XP and Ubuntu regressions out! It also took out a chunk from OS X. Things were looking good!

OS X

Bug 924415 - Find out why setting chromemargin to 0,-1,-1,-1 is so expensive for TART on UX branch on OS X.

I don’t think I’ll ever forget this bug.

I had gotten my hands on a Mac Mini that (after some hardware modifications) matched the specs of our 10.6 Talos test machines. That would prove to be super useful, as I was easily able to reproduce the regression on that machine, and we could debug and investigate locally, without having to remote in to some loaner device.

With this machine, it didn’t take us long to identify the drawing of the tabs in the titlebar as the main culprit in the OS X regression. But the “why” eluded us for weeks.

It was clear I wasn’t going to be able to solve it on my own, so Jeff Muizelaar from the Graphics team joined in to help me.

We looked at OpenGL profiles, we looked at apitraces, we looked at profiles using the Gecko Profiler, and we looked at profiles from Instruments – the profiler that comes included with XCode.

It seemed like the performance bottleneck was coming from the operating system, but we needed to prove it.

Jeff and I dug and dug. I remember going home one day, feeling pretty deflated by another day of getting nowhere with this bug, when as I was walking into my apartment, I got a phone call.

It was Jeff. He told me he’d found something rather interesting – when the titlebar of the browser overlapped the titlebar of another window, he was able to reproduce the regression. When it did not overlap, the regression went away.

Talos tests open a small window before they open any test browser windows. That little test window stays in the background, and is (from my understanding) a dispatch point for making talos tests occur. That little window has a titlebar, and when we opened new browser windows, the titlebars would overlap.

Jeff suggested I try modifying the TART test to move the browser approximately 22px (the height of a standard OS X titlebar) so that they would no longer overlap. I set that up, triggered a bunch of test runs, and went to bed.

I wasn’t able to sleep. Around 4AM, I got out of bed to look at the results – SUCCESS! The regression had gone away! Jeff was right!

I slept like a baby the rest of the night.

We closed this bug as a WONTFIX due to it being way outside our control.

Comment 31 and onward in that bug are the ones that describe our findings.

Eat it TART, your tears are delicious

Those were the big regressions we fixed for TART. It was a long haul, but we got there – and in the end, it means faster and smoother tab animation for our users, which means a better experience – and it’s totally worth it.

I’m particularly proud of the work we did here, and I’m also really happy with the cross-team support and collaboration we had – from Performance, to Layout, to Graphics, to Front-end – it was textbook teamwork.

Here’s an e-mail I wrote about us beating TART.

Where did we end up?

After the TART regression was fixed, we were set to land on mozilla-central! We didn’t just land a more beautiful browser, we also landed a more performant one.

Noice.

Stay tuned for Part 5 where I talk about CART.

May 15, 2014 01:46 AM

May 12, 2014

Robert Kaiser

Hacking Day And Linuxwochen

In the last weeks, I spent some time on developer events, which I always embrace a lot because the "hallway tracks" are a constant great flow of interesting information and discussions - and of course the official talks are often quite interesting as well! :)

First up was the Mozilla Hacking Day in Berlin on Saturday, April 26. I met Arpad Borsos, fellow Mozillian from Vienna, at Vienna airport on Friday, as we happened to be hopping over from the Austrian to the German capital on the same plane. Later in Berlin, we met more Mozillians at the hotel and went out for dinner together.
At the actual Hacking Day, we had ~70 people streaming in for some introductory talks, and then people split up into hacking sessions and some more in-depth talks (like the one in the picture, where fellow "Desktop QA" team member Henrik talked about automated testing), which were followed by even more hands-on hacking. I spent most of the time talking to various people about different topics, showing off Firefox OS a bit (after I had reflashed my Geeksphone Peak and got it into a working state again during the talks) and my actual "hacking" achievement: together with Georg Fritzsche, I found out that an issue with my Lantea Maps app freezing was a WebGL-related bug in Gecko, found a stack, reported it as a new bug and set a needinfo flag for a gfx developer. (Yesterday, I found a hacky workaround so Lantea Maps will soon work again despite that bug.)
On Sunday, I also was able to squeeze in some sightseeing and photo-taking, marching through Berlin for ~3.5h before returning to the airport.

And last week, May 8-10, it was time for the yearly Linuxwochen conference in Vienna. As we do not have an organized group of Mozillians here, we didn't have a Mozilla booth, but I usually share the OpenStreetMap booth (that's why you'll spot that in the pictures), as I'm pretty active in the local community there, and just run around with Mozilla T-shirts and talk about Mozilla stuff in addition to OSM things. I also had submitted talks about Firefox OS and the Makey Makey, but as you'll see in my speaker profile, I was additionally "coerced" into a privacy and security panel discussion last-minute as well. I was put there more or less as a representative of Mozilla and, more generally, of those making end-user software. I tried to drive home a few points about how strongly Mozilla works to respect user privacy, but also pointed out how it is an interesting field that sometimes has tradeoffs with enabling features that people out there really want, and finding a balance can be hard.
The Firefox OS talk was quite appreciated by people attending, and we had quite a few good questions asked there as well, and interest from people afterwards to actually see and try my FxOS phones - I remember how some had a hard time believing how well the ZTE Open worked with just 256MB of RAM and an all-web UI. :)
In the "hallway track", I did get a lot of questions about Mozilla on all three days, given my T-shirts clearly pointed out my affiliation with the project. Some of those were Firefox OS, mostly about strategy and availability, where I had to explain a few times how we target features phone converts in emerging markets for now. A number of the concerns and questions were around Australis, with quite a few of the very technical folks there not liking how we messed up their customizations and preferences, including tops-on-bottom or the add-ons bar, or how esp. our tabs looked "just like Chrome" now, but I even heard one or two comments on how awesome the new design was. Some of those I could ease with explanation and pointing to the Classic Theme Restorer add-on, but some of them are just unhappy (and it's not helpful that the very helpful Tour does not run when NoScript is installed and active). And then there were questions about our finances (the "Google dependency" issue) and about the fact that new Sync clashes with Master Password (which also protects casual "friends" from reading through passwords in your password manager), among others. Overall, a lot of talk about Mozilla, and I hope I could make most of those people feel better about us than before, wherever they stood then.

Both events cost me a lot of sleep and energy right there, but they feel like they were worth the investment for sure, and meeting and talking to people also gives me energy of a different kind. ;-)

May 12, 2014 04:42 PM

May 10, 2014

Ludovic Hirlimann

The next major release of Thunderbird is around the corner and needs some love

We just released the first beta of Thunderbird 30. There will be two betas for 30 and probably two or more for 31. We need to start uncovering bugs now so that developers have time to fix things.

Now is the time to get the betas and use them as you do with the current release, and file bugs. Make these bugs block our tracking bug: 1008543.

For the next beta we will need more people to do formal testing - we will use moztrap and eventbrite to track this. The more participants in this (and in others during the 31 beta period), the higher the quality. Follow this blog or subscribe to the Thunderbird-tester mailing list if you wish to make 31 a great release.

Ludo for the QA team

May 10, 2014 10:16 AM

May 02, 2014

Jonathan Protzenko

Shutting down my (now redundant) comm-central clone on github

This is just a public service announcement: I'm shutting down the comm-central repository on github I've been maintaining since 2011. Please use the official version: it shares the same hashes, so it's just a matter of modifying the url in your .git/config. Plus, the official repo has all the cool tags baked in.

(Turns out that due to yet another hg repository corruption, the repo has been broken since early March. Since no one complained, I suspect no one's really using this anymore. I'm pretty sure the folks at Mozilla are much better at avoiding hg repository corruptions than I am, so here's the end of it!)

May 02, 2014 08:51 PM

April 30, 2014

Robert Kaiser

Finding an Openly Licensed JS Graph Library

As I mentioned at the end of the recent post on stability graphs, I was looking into doing JS-based, dynamic versions of those graphs. Those would always have current data without my manually copying stuff around (as I've done previously) and should be available to more people in the Mozilla community.

That said, even though I tried previously to do some canvas magic to create graphs in JS, it's tedious to create all that code by oneself, so I set out to find a library that can help me.

My set of ideal requirements was:
So, I went on a web search and tried to find libraries that would fulfill my requirements. Interestingly, I saw a number of commercially licensed JS libraries floating around, I guess there's some business in creating graphs out there.
The list below shows the libraries I took a closer look at (before you ask, flot is not on the list because it requires jQuery and therefore violates my "graphs only" and "lightweight" goals right away):
So none of those turned out to be completely ideal in the light of my goals/preferences. That said, dygraphs looked decent enough so I went with that in the end and have dynamic versions of my graphs implemented.
(Ping me on IRC if you want a link, I don't want to link them on the blog as the uninformed could come to wrong or hasty conclusions looking at the data).
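In case it helps to see what I ended up with, here's a rough sketch of the kind of dygraphs setup I mean - the element ID, CSV file name and labels below are made up for illustration, not taken from my actual pages:

// Minimal dygraphs sketch (hypothetical element ID and CSV URL, just to show the shape of the API).
// Assumes the dygraphs script has already been loaded on the page.
var graph = new Dygraph(
  document.getElementById("stability-graph"),  // container element for the graph
  "crashes-per-adi.csv",                       // CSV file: first column is the date, the rest are data series
  {
    title: "Crashes per 100 ADI",
    ylabel: "crashes / 100 ADI",
    legend: "always",
    showRangeSelector: true                    // gives viewers a small slider to zoom into a date range
  }
);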

Of course, if you are looking for a graph library for your project, you might choose completely differently, but maybe the list is helpful to find what fits your needs. ;-)

April 30, 2014 11:12 PM

Mike Conley

Electrolysis Code Spelunking: How links open new windows in Firefox

Hey. I’ve started hacking on Electrolysis bugs. I’m normally a front-end engineer working on Firefox desktop, but I’ve been temporarily loaned out to help get Electrolysis ready to be enabled by default on Nightly.

I’m working on bug 989501. Basically, when you click on a link that targets “_blank” or uses window.open, we open a new tab instead. That’s no good – assuming the user’s profile is set to allow it, we should open the link in a new window.

In order to fix this, I need a clearer picture on what happens in the Firefox platform when we click on one of these links.

This isn’t really a tutorial – I’m not going to go out of my way to explain much here. Think of this more as a public posting of my notes during my exploration.

So, here goes.

(Note that the code in this post was current as of revision 400a31da59a9 of mozilla-central, so if you’re reading this in the future, it’s possible that some stuff has greatly changed).

I know for a fact that once the link is clicked, we eventually call mozilla::dom::TabChild::ProvideWindow. I know this because of conversations I’ve had with smaug, billm and jdm in and out of Bugzilla, IRC, and meatspace.

Because I know this, I can hook up gdb to see how I get to that call. I have some notes here on how to hook up gdb to the content process of an e10s window.

Once that’s hooked up, I set a breakpoint on mozilla::dom::TabChild::ProvideWindow, and click on a link somewhere with target=”_blank”.

I hit my breakpoint, and I get a backtrace. Ready for it? Here we go:

#0  mozilla::dom::TabChild::ProvideWindow (this=0x109afb400, aParent=0x10b098820, aChromeFlags=4094, aCalledFromJS=false, aPositionSpecified=false, aSizeSpecified=false, aURI=0xffe, aName=@0x0, aFeatures=@0x0, aWindowIsNew=0x10b098820, aReturn=0x7fff5fbfb648) at TabChild.cpp:1201
#1  0x00000001018682e4 in nsWindowWatcher::OpenWindowInternal (this=0x10b05b540, aParent=0x10b098820, aUrl=<value temporarily unavailable, due to optimizations>, aName=<value temporarily unavailable, due to optimizations>, aFeatures=<value temporarily unavailable, due to optimizations>, aCalledFromJS=false, aDialog=<value temporarily unavailable, due to optimizations>, aNavigate=<value temporarily unavailable, due to optimizations>, _retval=<value temporarily unavailable, due to optimizations>) at nsWindowWatcher.cpp:601
#2  0x0000000101869544 in non-virtual thunk to nsWindowWatcher::OpenWindow2(nsIDOMWindow*, char const*, char const*, char const*, bool, bool, bool, nsISupports*, nsIDOMWindow**) () at nsWindowWatcher.cpp:417
#3  0x0000000100e5dc63 in nsGlobalWindow::OpenInternal (this=0x10b098800, aUrl=@0x7fff5fbfbf90, aName=@0x7fff5fbfc038, aOptions=@0x103d77320, aDialog=false, aContentModal=false, aCalleePrincipal=<value temporarily unavailable, due to optimizations>, aJSCallerContext=<value temporarily unavailable, due to optimizations>, aReturn=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/dom/base/nsGlobalWindow.cpp:11498
#4  0x0000000100e5e3a4 in non-virtual thunk to nsGlobalWindow::OpenNoNavigate(nsAString_internal const&, nsAString_internal const&, nsAString_internal const&, nsIDOMWindow**) () at /Users/mikeconley/Projects/mozilla-central/dom/base/nsGlobalWindow.cpp:7463
#5  0x000000010184d99d in nsDocShell::InternalLoad (this=<value temporarily unavailable, due to optimizations>, aURI=0x113eed200, aReferrer=0x1134c0fe0, aOwner=0x114a69070, aFlags=0, aWindowTarget=0x10b098820, aLoadType=<value temporarily unavailable, due to optimizations>, aSHEntry=<value temporarily unavailable, due to optimizations>, aSourceDocShell=<value temporarily unavailable, due to optimizations>, aDocShell=<value temporarily unavailable, due to optimizations>, aRequest=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:9079
#6  0x0000000101855758 in nsDocShell::OnLinkClickSync (this=0x10b075000, aContent=0x112865eb0, aURI=0x113eed3c0, aTargetSpec=<value temporarily unavailable, due to optimizations>, aFileName=@0x106f27f10, aPostDataStream=0x0, aDocShell=<value temporarily unavailable, due to optimizations>, aRequest=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:12699
#7  0x0000000101857f85 in mozilla::Maybe<mozilla::AutoCxPusher>::~Maybe () at /Users/mikeconley/Projects/mozilla-central/obj-x86_64-apple-darwin12.5.0/dist/include/nsCxPusher.h:12499
#8  0x0000000101857f85 in nsCxPusher::~nsCxPusher () at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:41
#9  0x0000000101857f85 in nsCxPusher::~nsCxPusher () at /Users/mikeconley/Projects/mozilla-central/obj-x86_64-apple-darwin12.5.0/dist/include/nsCxPusher.h:66
#10 0x0000000101857f85 in OnLinkClickEvent::Run (this=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:12502
#11 0x0000000100084f60 in nsThread::ProcessNextEvent (this=0x106f245e0, mayWait=false, result=0x7fff5fbfc947) at nsThread.cpp:715
#12 0x0000000100023241 in NS_ProcessPendingEvents (thread=<value temporarily unavailable, due to optimizations>, timeout=20) at nsThreadUtils.cpp:210
#13 0x0000000100d41c47 in nsBaseAppShell::NativeEventCallback (this=0x1096e8660) at nsBaseAppShell.cpp:98
#14 0x0000000100cfdba1 in nsAppShell::ProcessGeckoEvents (aInfo=0x1096e8660) at nsAppShell.mm:388
#15 0x00007fff86adeb31 in __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ ()
#16 0x00007fff86ade455 in __CFRunLoopDoSources0 ()
#17 0x00007fff86b017f5 in __CFRunLoopRun ()
#18 0x00007fff86b010e2 in CFRunLoopRunSpecific ()
#19 0x00007fff8ad65eb4 in RunCurrentEventLoopInMode ()
#20 0x00007fff8ad65c52 in ReceiveNextEventCommon ()
#21 0x00007fff8ad65ae3 in BlockUntilNextEventMatchingListInMode ()
#22 0x00007fff8cce1533 in _DPSNextEvent ()
#23 0x00007fff8cce0df2 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#24 0x0000000100cfd266 in -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (self=0x106f801a0, _cmd=<value temporarily unavailable, due to optimizations>, mask=18446744073709551615, expiration=0x422d63c37f00000d, mode=0x7fff7205e1c0, flag=1 '\001') at nsAppShell.mm:165
#25 0x00007fff8ccd81a3 in -[NSApplication run] ()
#26 0x0000000100cfe32b in nsAppShell::Run (this=<value temporarily unavailable, due to optimizations>) at nsAppShell.mm:746
#27 0x000000010199b3dc in XRE_RunAppShell () at /Users/mikeconley/Projects/mozilla-central/toolkit/xre/nsEmbedFunctions.cpp:679
#28 0x00000001002a0dae in MessageLoop::AutoRunState::~AutoRunState () at message_loop.cc:229
#29 0x00000001002a0dae in MessageLoop::AutoRunState::~AutoRunState () at /Users/mikeconley/Projects/mozilla-central/ipc/chromium/src/base/message_loop.h:197
#30 0x00000001002a0dae in MessageLoop::Run (this=0x0) at message_loop.cc:503
#31 0x000000010199b0cd in XRE_InitChildProcess (aArgc=<value temporarily unavailable, due to optimizations>, aArgv=<value temporarily unavailable, due to optimizations>, aProcess=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/toolkit/xre/nsEmbedFunctions.cpp:516
#32 0x0000000100000f1d in main (argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbff4d8) at /Users/mikeconley/Projects/mozilla-central/ipc/app/MozillaRuntimeMain.cpp:149

Oh my. Well, the good news is, we can chop off a good chunk of the lower half because that’s all message / event loop stuff. That’s going to be in every single backtrace ever, pretty much, so I can just ignore it. Here’s the more important stuff:

#0  mozilla::dom::TabChild::ProvideWindow (this=0x109afb400, aParent=0x10b098820, aChromeFlags=4094, aCalledFromJS=false, aPositionSpecified=false, aSizeSpecified=false, aURI=0xffe, aName=@0x0, aFeatures=@0x0, aWindowIsNew=0x10b098820, aReturn=0x7fff5fbfb648) at TabChild.cpp:1201
#1  0x00000001018682e4 in nsWindowWatcher::OpenWindowInternal (this=0x10b05b540, aParent=0x10b098820, aUrl=<value temporarily unavailable, due to optimizations>, aName=<value temporarily unavailable, due to optimizations>, aFeatures=<value temporarily unavailable, due to optimizations>, aCalledFromJS=false, aDialog=<value temporarily unavailable, due to optimizations>, aNavigate=<value temporarily unavailable, due to optimizations>, _retval=<value temporarily unavailable, due to optimizations>) at nsWindowWatcher.cpp:601
#2  0x0000000101869544 in non-virtual thunk to nsWindowWatcher::OpenWindow2(nsIDOMWindow*, char const*, char const*, char const*, bool, bool, bool, nsISupports*, nsIDOMWindow**) () at nsWindowWatcher.cpp:417
#3  0x0000000100e5dc63 in nsGlobalWindow::OpenInternal (this=0x10b098800, aUrl=@0x7fff5fbfbf90, aName=@0x7fff5fbfc038, aOptions=@0x103d77320, aDialog=false, aContentModal=false, aCalleePrincipal=<value temporarily unavailable, due to optimizations>, aJSCallerContext=<value temporarily unavailable, due to optimizations>, aReturn=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/dom/base/nsGlobalWindow.cpp:11498
#4  0x0000000100e5e3a4 in non-virtual thunk to nsGlobalWindow::OpenNoNavigate(nsAString_internal const&, nsAString_internal const&, nsAString_internal const&, nsIDOMWindow**) () at /Users/mikeconley/Projects/mozilla-central/dom/base/nsGlobalWindow.cpp:7463
#5  0x000000010184d99d in nsDocShell::InternalLoad (this=<value temporarily unavailable, due to optimizations>, aURI=0x113eed200, aReferrer=0x1134c0fe0, aOwner=0x114a69070, aFlags=0, aWindowTarget=0x10b098820, aLoadType=<value temporarily unavailable, due to optimizations>, aSHEntry=<value temporarily unavailable, due to optimizations>, aSourceDocShell=<value temporarily unavailable, due to optimizations>, aDocShell=<value temporarily unavailable, due to optimizations>, aRequest=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:9079
#6  0x0000000101855758 in nsDocShell::OnLinkClickSync (this=0x10b075000, aContent=0x112865eb0, aURI=0x113eed3c0, aTargetSpec=<value temporarily unavailable, due to optimizations>, aFileName=@0x106f27f10, aPostDataStream=0x0, aDocShell=<value temporarily unavailable, due to optimizations>, aRequest=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:12699
#7  0x0000000101857f85 in mozilla::Maybe<mozilla::AutoCxPusher>::~Maybe () at /Users/mikeconley/Projects/mozilla-central/obj-x86_64-apple-darwin12.5.0/dist/include/nsCxPusher.h:12499
#8  0x0000000101857f85 in nsCxPusher::~nsCxPusher () at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:41
#9  0x0000000101857f85 in nsCxPusher::~nsCxPusher () at /Users/mikeconley/Projects/mozilla-central/obj-x86_64-apple-darwin12.5.0/dist/include/nsCxPusher.h:66
#10 0x0000000101857f85 in OnLinkClickEvent::Run (this=<value temporarily unavailable, due to optimizations>) at /Users/mikeconley/Projects/mozilla-central/docshell/base/nsDocShell.cpp:12502

That’s a bit more manageable.

So we start inside something called a docshell. I’ve heard that term bandied about a lot, and I can’t say I’ve ever been too sure what it means, or what a docshell does, or why I should care.

I found some documents that make things a little bit clearer.

Basically, my understanding is that a docshell is the thing that takes incoming stuff from some URI (this could be web content, or it might be a XUL document that’s loading the browser UI…), and connects it to the things that make stuff show up on your screen.

So, pretty important.

It seems to be a place where some utility methods and functions go as well, so it’s kind of this abstract thing that seems to have multiple purposes.

But the most important thing for the purposes of this post is this: every time you load a document, you have a docshell taking care of it. All of these docshells are structured in a tree which is rooted with a docshell owner. This will come into play later.
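(Not from the platform code we’re spelunking, but just to make the tree idea concrete: here’s roughly what poking at that tree looks like from chrome-privileged JS. The function and variable names are mine, not anything in the tree.)

// Rough sketch: given a content window, grab its docshell and walk up the
// docshell tree to the root item and its tree owner.
// (Chrome-privileged JS; interface names as they existed around this revision.)
var Ci = Components.interfaces;

function describeDocShellTree(contentWin) {
  // The window can hand us the docshell that loaded it, via nsIInterfaceRequestor.
  var docShell = contentWin.QueryInterface(Ci.nsIInterfaceRequestor)
                           .getInterface(Ci.nsIWebNavigation)
                           .QueryInterface(Ci.nsIDocShellTreeItem);

  // rootTreeItem is the top of the docshell tree this document lives in,
  // and its treeOwner is the "docshell owner" I mentioned above.
  var root = docShell.rootTreeItem;
  dump("this docshell: " + docShell.name +
       ", root docshell: " + root.name +
       ", root has a tree owner: " + !!root.treeOwner + "\n");
}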

So one thing that a docshell does is notice when a link was clicked inside of its content. That’s nsDocShell.cpp’s OnLinkClickEvent::Run, and that eventually makes its way over to nsDocShell::OnLinkClickSync.

After some initial checks and balances to ensure that this thing really is a link we want to travel to, we get sent off to nsDocShell::InternalLoad.

Inside there, there’s some more checking… there’s a policy check to make sure we’re allowed to open a link. Lots of security going on. Eventually I see this:

if (aWindowTarget && *aWindowTarget)

That’s good. aWindowTarget maps to the target=”_blank” attribute in the anchor. So we’ll be entering this block.

    if (aWindowTarget && *aWindowTarget) {
        // Locate the target DocShell.
        nsCOMPtr<nsIDocShellTreeItem> targetItem;
        rv = FindItemWithName(aWindowTarget, nullptr, this,
                              getter_AddRefs(targetItem));

So now we’re looking for the right docshell to load this new document in. That makes sense – if you have a link where target=”foo”, subsequent links from the same origin targeted at “foo” will open in the same window or tab or what have you. So we’re checking to see if we’ve opened something with the name inside aWindowTarget already.

So now we’re in nsDocShell::FindItemWithName, and I see this:

        else if (name.LowerCaseEqualsLiteral("_blank"))
        {
            // Just return null.  Caller must handle creating a new window with
            // a blank name himself.
            return NS_OK;
        }

Ah hah, so target=”_blank”, as we already knew, is special-cased – and this is where it happens. There’s no existing docshell for _blank because we know we’re going to be opening a new window (or tab if the user has preffed it that way). So we don’t return a pre-existing docshell.
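(Quick aside with a sketch of my own, since “preffed it that way” is doing a lot of work in that sentence – the pref I mean is browser.link.open_newwindow, and from chrome JS it reads like this:)

// My own illustration, not code from the tree: reading the divert-links pref.
// 1 = open in the current tab/window, 2 = open a new window, 3 = divert into a new tab.
Components.utils.import("resource://gre/modules/Services.jsm");
var whereToOpen = Services.prefs.getIntPref("browser.link.open_newwindow");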

So we’re back in nsDocShell::InternalLoad.

        rv = FindItemWithName(aWindowTarget, nullptr, this,
                              getter_AddRefs(targetItem));
        NS_ENSURE_SUCCESS(rv, rv);

        targetDocShell = do_QueryInterface(targetItem);
        // If the targetDocShell doesn't exist, then this is a new docShell
        // and we should consider this a TYPE_DOCUMENT load
        isNewDocShell = !targetDocShell;

Ok, so now targetItem is nullptr, targetDocShell is also nullptr, and so isNewDocShell is true.

There seems to be more policy checking going on in InternalLoad after this… but eventually, I see this:

   if (aWindowTarget && *aWindowTarget) {
        // We've already done our owner-inheriting.  Mask out that bit, so we
        // don't try inheriting an owner from the target window if we came up
        // with a null owner above.
        aFlags = aFlags & ~INTERNAL_LOAD_FLAGS_INHERIT_OWNER;
        
        bool isNewWindow = false;
        if (!targetDocShell) {
            // If the docshell's document is sandboxed, only open a new window
            // if the document's SANDBOXED_AUXILLARY_NAVIGATION flag is not set.
            // (i.e. if allow-popups is specified)
            NS_ENSURE_TRUE(mContentViewer, NS_ERROR_FAILURE);
            nsIDocument* doc = mContentViewer->GetDocument();
            uint32_t sandboxFlags = 0;

            if (doc) {
                sandboxFlags = doc->GetSandboxFlags();
                if (sandboxFlags & SANDBOXED_AUXILIARY_NAVIGATION) {
                    return NS_ERROR_DOM_INVALID_ACCESS_ERR;
                }
            }

            nsCOMPtr<nsPIDOMWindow> win =
                do_GetInterface(GetAsSupports(this));
            NS_ENSURE_TRUE(win, NS_ERROR_NOT_AVAILABLE);

            nsDependentString name(aWindowTarget);
            nsCOMPtr<nsIDOMWindow> newWin;
            nsAutoCString spec;
            if (aURI)
                aURI->GetSpec(spec);
            rv = win->OpenNoNavigate(NS_ConvertUTF8toUTF16(spec),
                                     name,          // window name
                                     EmptyString(), // Features
                                     getter_AddRefs(newWin));

So we check again to see if we’re targeted at something, and check if we’ve found a target docshell for it. We hadn’t, so we do some security checks, and then … what the hell is nsPIDOMWindow? I’m used to things being called nsIBlahBlah, but now nsPIBlahBlah… what does the P mean?

It took some asking around, but I eventually found out that the P is supposed to be for Private – as in, this is a private XPIDL interface, and non-core embedders should stay away from it.

Ok, and we also see do_GetInterface. This is not the same as QueryInterface, believe it or not. The difference is subtle, but basically it’s this: QueryInterface says “you implement X, but I think you also implement Y. If you do, please return a pointer to yourself that makes you seem like a Y.” GetInterface is different – GetInterface says “I know you know about something that implements Y. It might be you, or more likely, it’s something you’re holding a reference to. Can I get a reference to that please?”. And if successful, it returns it. Here’s more documentation about GetInterface.

It’s a subtle but important difference.
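Here’s a tiny JS illustration of that difference (my own sketch, not code from the tree – assume docShell is any docshell reference you’ve gotten your hands on in chrome code, like in the earlier snippet):

var Ci = Components.interfaces;

// QueryInterface: "you're a docshell, but I think you're *also* a tree item" --
// same underlying object, just viewed through a different interface.
var treeItem = docShell.QueryInterface(Ci.nsIDocShellTreeItem);

// getInterface: "I know you know about a DOM window -- hand me a reference to it" --
// a *different* object that the docshell knows about, namely its content window.
var win = docShell.QueryInterface(Ci.nsIInterfaceRequestor)
                  .getInterface(Ci.nsIDOMWindow);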

So this docshell knows about a window, and we’ve now got a handle on that window using the private interface nsPIDOMWindow. Neat.

So eventually, we call OpenNoNavigate on that nsPIDOMWindow. That method is pretty much like nsIDOMWindow::Open, except that OpenNoNavigate doesn’t send the window anywhere – it just returns it so that the caller can send it to a URI.

Through the magic of do_GetInterface, nsDocShell::GetInterface, EnsureScriptEnvironment, and NS_NewScriptGlobalObject, I know that the nsPIDOMWindow is being implemented by nsGlobalWindow, and that’s where I should go to find the OpenNoNavigate implementation.

So off we go!

nsGlobalWindow::OpenNoNavigate just seems to forward the call, after some argument setting, to nsGlobalWindow::OpenInternal, like this:

  return OpenInternal(aUrl, aName, aOptions,
                      false,          // aDialog
                      false,          // aContentModal
                      true,           // aCalledNoScript
                      false,          // aDoJSFixups
                      false,          // aNavigate
                      nullptr, nullptr,  // No args
                      GetPrincipal(),    // aCalleePrincipal
                      nullptr,           // aJSCallerContext
                      _retval);

Having a glance around at the rest of the nsGlobalWindow::Open[foo] methods, it looks like they all call into OpenInternal. It’s the big-mamma opening method.

This method does a few things, including making sure that we’re not being abused by web content that’s trying to spam the user with popups.

Eventually, we get to this:

      rv = pwwatch->OpenWindow2(this, url.get(), name_ptr, options_ptr,
                                /* aCalledFromScript = */ false,
                                aDialog, aNavigate, aExtraArgument,
                                getter_AddRefs(domReturn));

and return the domReturn pointer back after a few more checks to our caller. Remember that the caller is going to take this new window, and navigate it to some URI.

Ok, so, pwwatch. What is that? Well, that appears to be a private interface to nsWindowWatcher, which gives us access to the OpenWindow2 method.

After prepping some arguments, much like nsGlobalWindow::OpenNoNavigate did, we forward the call over to nsWindowWatcher::OpenWindowInternal.

And now we’re almost done – we’re almost at the point where we’re actually going to open a window!

Some key things need to happen though. First, we do this:

nsCOMPtr<nsIDocShellTreeOwner>  parentTreeOwner;  // from the parent window, if any
...
GetWindowTreeOwner(aParent, getter_AddRefs(parentTreeOwner));

So what that does is it tries to get the docshell owner of the docshell that’s attempting to open the window (and that’d be the docshell that we clicked the link in).

After a few more things, we check to see if there’s an existing window with that target name which we can re-use:

  // try to find an extant window with the given name
  nsCOMPtr<nsIDOMWindow> foundWindow = SafeGetWindowByName(name, aParent);
  GetWindowTreeItem(foundWindow, getter_AddRefs(newDocShellItem));

And if so, we set it to newDocShellItem.

After some more security stuff, we check to see if newDocShellItem exists. Because name is nullptr (since we had target=”_blank”, and nsDocShell::FindItemWithName returned nullptr), newDocShellItem is null.

Because it doesn’t exist, we know we’re opening a brand new window!

More security things seem to happen, and then we get to the part that I’m starting to focus on:

      nsCOMPtr<nsIWindowProvider> provider = do_GetInterface(parentTreeOwner);
      if (provider) {
        NS_ASSERTION(aParent, "We've _got_ to have a parent here!");

        nsCOMPtr<nsIDOMWindow> newWindow;
        rv = provider->ProvideWindow(aParent, chromeFlags, aCalledFromJS,
                                     sizeSpec.PositionSpecified(),
                                     sizeSpec.SizeSpecified(),
                                     uriToLoad, name, features, &windowIsNew,
                                     getter_AddRefs(newWindow));

We ask the parentTreeOwner to get us something that it knows about that implements nsIWindowProvider. In the Electrolysis / content process case, that’d be TabChild. In the normal, non-Electrolysis case, that’s nsContentTreeOwner.

The nsIWindowProvider is the thing that we’ll use to get a new window from! So we call ProvideWindow on it to give us a pointer to a new nsIDOMWindow, assigned to newWindow.

Here’s TabChild::ProvideWindow:

NS_IMETHODIMP
TabChild::ProvideWindow(nsIDOMWindow* aParent, uint32_t aChromeFlags,
                        bool aCalledFromJS,
                        bool aPositionSpecified, bool aSizeSpecified,
                        nsIURI* aURI, const nsAString& aName,
                        const nsACString& aFeatures, bool* aWindowIsNew,
                        nsIDOMWindow** aReturn)
{
    *aReturn = nullptr;

    // If aParent is inside an <iframe mozbrowser> or <iframe mozapp> and this
    // isn't a request to open a modal-type window, we're going to create a new
    // <iframe mozbrowser/mozapp> and return its window here.
    nsCOMPtr<nsIDocShell> docshell = do_GetInterface(aParent);
    if (docshell && docshell->GetIsInBrowserOrApp() &&
        !(aChromeFlags & (nsIWebBrowserChrome::CHROME_MODAL |
                          nsIWebBrowserChrome::CHROME_OPENAS_DIALOG |
                          nsIWebBrowserChrome::CHROME_OPENAS_CHROME))) {

      // Note that BrowserFrameProvideWindow may return NS_ERROR_ABORT if the
      // open window call was canceled.  It's important that we pass this error
      // code back to our caller.
      return BrowserFrameProvideWindow(aParent, aURI, aName, aFeatures,
                                       aWindowIsNew, aReturn);
    }

    // Otherwise, create a new top-level window.
    PBrowserChild* newChild;
    if (!CallCreateWindow(&newChild)) {
        return NS_ERROR_NOT_AVAILABLE;
    }

    *aWindowIsNew = true;
    nsCOMPtr<nsIDOMWindow> win =
        do_GetInterface(static_cast<TabChild*>(newChild)->WebNavigation());
    win.forget(aReturn);
    return NS_OK;
}

The docshell->GetIsInBrowserOrApp() is basically asking “are we b2g?”, to which the answer is “no”, so we skip that block, and go right for CallCreateWindow.

CallCreateWindow is using the IPC library to communicate with TabParent in the UI process, which has a corresponding function called AnswerCreateWindow. Here it is:

bool
TabParent::AnswerCreateWindow(PBrowserParent** retval)
{
    if (!mBrowserDOMWindow) {
        return false;
    }

    // Only non-app, non-browser processes may call CreateWindow.
    if (IsBrowserOrApp()) {
        return false;
    }

    // Get a new rendering area from the browserDOMWin.  We don't want
    // to be starting any loads here, so get it with a null URI.
    nsCOMPtr<nsIFrameLoaderOwner> frameLoaderOwner;
    mBrowserDOMWindow->OpenURIInFrame(nullptr, nullptr,
                                      nsIBrowserDOMWindow::OPEN_NEWTAB,
                                      nsIBrowserDOMWindow::OPEN_NEW,
                                      getter_AddRefs(frameLoaderOwner));
    if (!frameLoaderOwner) {
        return false;
    }

    nsRefPtr<nsFrameLoader> frameLoader = frameLoaderOwner->GetFrameLoader();
    if (!frameLoader) {
        return false;
    }

    *retval = frameLoader->GetRemoteBrowser();
    return true;
}

So after some checks, we call mBrowserDOMWindow’s OpenURIInFrame with (among other things) nsIBrowserDOMWindow::OPEN_NEWTAB. So that’s why we’ve got a new tab opening instead of a new window.

mBrowserDOMWindow is a reference to this thing implemented in browser.js:

function nsBrowserAccess() { }

nsBrowserAccess.prototype = {
  QueryInterface: XPCOMUtils.generateQI([Ci.nsIBrowserDOMWindow, Ci.nsISupports]),

  _openURIInNewTab: function(aURI, aOpener, aIsExternal) {
    let win, needToFocusWin;

    // try the current window.  if we're in a popup, fall back on the most recent browser window
    if (window.toolbar.visible)
      win = window;
    else {
      let isPrivate = PrivateBrowsingUtils.isWindowPrivate(aOpener || window);
      win = RecentWindow.getMostRecentBrowserWindow({private: isPrivate});
      needToFocusWin = true;
    }

    if (!win) {
      // we couldn't find a suitable window, a new one needs to be opened.
      return null;
    }

    if (aIsExternal && (!aURI || aURI.spec == "about:blank")) {
      win.BrowserOpenTab(); // this also focuses the location bar
      win.focus();
      return win.gBrowser.selectedBrowser;
    }

    let loadInBackground = gPrefService.getBoolPref("browser.tabs.loadDivertedInBackground");
    let referrer = aOpener ? makeURI(aOpener.location.href) : null;

    let tab = win.gBrowser.loadOneTab(aURI ? aURI.spec : "about:blank", {
                                      referrerURI: referrer,
                                      fromExternal: aIsExternal,
                                      inBackground: loadInBackground});
    let browser = win.gBrowser.getBrowserForTab(tab);

    if (needToFocusWin || (!loadInBackground && aIsExternal))
      win.focus();

    return browser;
  },

  openURI: function (aURI, aOpener, aWhere, aContext) {
    ... (removed for brevity)
  },

  openURIInFrame: function browser_openURIInFrame(aURI, aOpener, aWhere, aContext) {
    if (aWhere != Ci.nsIBrowserDOMWindow.OPEN_NEWTAB) {
      dump("Error: openURIInFrame can only open in new tabs");
      return null;
    }

    var isExternal = (aContext == Ci.nsIBrowserDOMWindow.OPEN_EXTERNAL);
    let browser = this._openURIInNewTab(aURI, aOpener, isExternal);
    if (browser)
      return browser.QueryInterface(Ci.nsIFrameLoaderOwner);

    return null;
  },

  isTabContentWindow: function (aWindow) {
    return gBrowser.browsers.some(function (browser) browser.contentWindow == aWindow);
  },

  get contentWindow() {
    return gBrowser.contentWindow;
  }
}

So nsBrowserAccess’s openURIInFrame only supports opening things in new tabs, and then it just calls _openURIInNewTab on itself, which does the job of returning the tab’s remote browser after the tab is opened.

I might follow this up with a post about how nsContentTreeOwner opens a window in the non-Electrolysis case, and how we might abstract some of that out for re-use here. We’ll see.

And that’s about it. Hopefully this is useful to future spelunkers.

April 30, 2014 08:14 PM

Calendar

Icedove and Iceowl-extension compatibility

I have noticed a few reports that Lightning 2.6.4 is not working with the Debian variant of Thunderbird, called Icedove. Together with one of our users (thanks Arie!) this was reported to the Debian package maintainers.

There are slight differences in how Icedove and Thunderbird are compiled, with the result that the required Linux symbol versions are different. The maintainers are working on fixing this; in the meantime, all you need to do is make sure you are using only one source for the packages:

Unfortunately, this difference in compilation has cost us some negative reviews. If everything is working for you or this post has helped you to use the right versions, please show some love and write a review ♥.

 

April 30, 2014 09:49 AM

April 25, 2014

Mike Conley

Electrolysis: Debugging Child Processes of Content for Make Benefit Glorious Browser of Firefox

Here’s how I’m currently debugging Electrolysis stuff on OS X using gdb. It involves multiple terminal windows. I live with that.

# In Terminal Window 1, I execute my Firefox build with MOZ_DEBUG_CHILD_PROCESS=1.
# That environment variable makes it so that the parent process spits out the child
# process ID as soon as it forks out. I also use my e10s profile so as to not muck up
# my default profile.

MOZ_DEBUG_CHILD_PROCESS=1 ./mach run -P e10s

# So, now my Firefox is spawned up and ready to go. I have
# browser.tabs.remote.autostart set to "true" in my about:config, which means I'm
# using out-of-process tabs by default. That means that right away, I see the
# child process ID dumped into the console. Maybe you get the same thing if
# browser.tabs.remote.autostart is false. I haven't checked.

CHILDCHILDCHILDCHILD
  debug me @ 45326

# ^-- so, this is what comes out in Terminal Window 1.

So, the next step is to open another terminal window. This one will connect to the parent process.

# Maybe there are smarter ways to find the firefox process ID, but this is what I
# use in my new Terminal Window 2.
ps aux | grep firefox

# And this is what I get back:

mikeconley     45391  17.2  5.3  3985032 883932   ??  S     2:39pm   1:58.71 /Applications/FirefoxAurora.app/Contents/MacOS/firefox
mikeconley     45322   0.0  0.4  3135172  69748 s000  S+    2:36pm   0:06.48 /Users/mikeconley/Projects/mozilla-central/obj-x86_64-apple-darwin12.5.0/dist/Nightly.app/Contents/MacOS/firefox -no-remote -foreground -P e10s
mikeconley     45430   0.0  0.0  2432768    612 s002  R+    2:44pm   0:00.00 grep firefox
mikeconley     44878   0.0  0.0        0      0 s000  Z    11:46am   0:00.00 (firefox)

# That second one is what I want to attach to. I can tell, because the executable
# path lies within my local build's objdir. The first row is my main Firefox I just
# use for work browsing. I definitely don't want to attach to that. The third line
# is just me looking for the process with grep. Not sure what that last one is.

# I use sudo to attach to the parent because otherwise, OS X complains about permissions
# for process attachment. I attach to the parent like this:

sudo gdb firefox 45322

# And now I have a gdb for the parent process. Easy peasy.

And finally, to debug the child, I open yet another terminal window.

# That process ID that I got from Terminal Window 1 comes into play now.

sudo gdb firefox 45326

# Boom - attached to child process now.

Setting breakpoints for things like TabChild::foo or TabParent::bar can be done like this:

# In Terminal Window 3, attached to the child:

b mozilla::dom::TabChild::foo

# In Terminal Window 2, attached to the parent:

b mozilla::dom::TabParent::bar

And now we’re cookin’.

April 25, 2014 07:26 PM

April 24, 2014

Rumbling Edge - Thunderbird

2014-04-23 Calendar builds

Common (excluding Website bugs)-specific: (34)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds: Official Windows

Linux builds: Official Linux (i686), Official Linux (x86_64)

Mac builds: Official Mac

April 24, 2014 07:00 AM

2014-04-23 Thunderbird comm-central builds

Thunderbird-specific: (37)

MailNews Core-specific: (18)

Windows builds: Official Windows, Official Windows installer

Linux builds: Official Linux (i686), Official Linux (x86_64)

Mac builds: Official Mac

April 24, 2014 06:59 AM

April 23, 2014

Calendar

The 2014 Google Summer of Code Students are here!

I am particularly excited to announce that this year the Calendar Project has received two slots in Google Summer of Code 2014. Both projects target our backend code. This means users won’t have a chance to complain about user interface changes and instead will be blown away by performance and interoperability improvements.

I would like to take a few minutes to introduce our awesome new students to the community, please join me in giving them a warm welcome!

Reid Anderson: Improve Calendar Provider Backends

This project is about performance and stability for our calendar storage. Here is what Reid has to say:

I have been a student at the University of Minnesota since 2012 studying Chemistry and Computer Science. Outside of the classroom, I spend a lot of my time both watching and playing a variety of sports. I also enjoy reading, talking to friends, or playing a quick game of Civilization. I heard about Google Summer of Code before I entered college, and participating in the program had always been a goal of mine.

I was introduced to the Mozilla community when I started submitting patches to Songbird, a desktop media manager built on the Mozilla framework. Throughout the entire community I saw a consistent message of an open web powered by open technology and open software. This is something that I am excited to be a part of, and I am looking forward to contributing. My project is to improve the cached mode for online calendar providers to the point where it can be used as the default setting. This should allow Lightning to function effectively in an offline environment, while also bringing significant performance improvements. Hopefully these will be useful contributions to the community, and I’m looking forward to getting started.

Isn’t that wonderful? I’m particularly excited about the performance improvements this will bring. Further project updates will be done on his blog and we will of course be posting major updates here too.

Malintha Fernando: Update Invitations to the latest Specification

This project will not only improve our invitations support, it will also guard us from future regressions through more tests. Here is an excerpt from Malintha’s blog post:

My name is Malintha Fernando and I am a student developer from Sri Lanka, currently studying at University of Moratuwa. I started contributing to Mozilla some months back (Still got a lot to learn) as my first contribution in open source and glad to be a part of the Lightning project in GSoC 2014.

The objective of this project is to improve Lightning’s scheduling system by updating the available features to the latest RFC specifications. As we know, most of Lightning’s implementation was done referring to draft version 4 of RFC 6638, so there are some features lagging behind the final RFC document.

Do you remember the mess we had when 2.6.x was released? At least one of the bugs we had to fix quickly was a regression in the invitations code. With Malintha’s help this won’t happen again!

What’s next?

According to Google, we are currently in the “Community Bonding Period”. This means we have a little time to set things up and make preparations. Coding officially begins on May 19th. You can follow progress on the projects as mentioned above, and I will also blog about major updates here as we get closer to completion. Let’s have some fun with this and continue to make Lightning better. It’s about time!

April 23, 2014 07:46 PM

April 21, 2014

Instantbird

Google Summer of Code 2014 has Commenced!

Instantbird again has the pleasure of participating in Google Summer of Code under the Mozilla umbrella. In the past we’ve had a variety of exciting projects and this year is no different. Three students will be working with us this summer:

Saurabh Anand (sawrubh), mentored by Patrick Cloke (clokep), will aim to add support for reliable file transfer to Instantbird using FileLink as a fallback to standard file transfer.

Mayank Kumar (mayanktg), mentored by Benedikt P. (Mic), will be adding voice and video support to Instantbird by integrating WebRTC for XMPP. WebRTC makes it easy for us to have real-time communication without the use of additional plugins.

Nihanth Subramanya (nhnt11), who last year added the “awesometab“, will be looking to improve loading of conversations and history under the guidance of aleth. He will work on adding the ability to search across all logs of a contact and loading the previous context of a conversation when scrolling (“infinite scroll”).

Please feel free to stop by #instantbird on irc.mozilla.org to say hello and congratulate our students! Thanks again to Mozilla for allowing us to participate in Google Summer of Code with them!

April 21, 2014 07:00 PM

April 07, 2014

Mike Conley

Much Ado About Brendan (or As I’ve Seen It)

Since Brendan Eich’s resignation, I’ve been struggling to articulate what I think and feel about the matter. It’s been difficult. I haven’t been able to find what I wanted to say. Many other better, smarter, and more qualified Mozillians have written things about this, and I was about to let it go. I didn’t just want to say “me too”.

I felt I had nothing of substance to contribute. I feebly wrote something about Brendan Eich and the Kobayashi Maru, but it became a rambling mess, and the analogy fell apart quite quickly. I was about to call it quits on contributing my thoughts.

And then this post happened.

Don’t ask me where this came from. A muse woke me up in the night to write it (it’s just past 4AM for crying out loud – muse, let me sleep). Maybe through the lens of this nonsense, some real sense will prevail. I’m not hopeful, but this muse is nodding emphatically (and grinning like a lunatic).

Please believe that I’m not at all trying to trivialize, oversimplify, or make light of the events of the past few weeks by writing this. I’m just trying to understand it, and view it with a looking glass I have at least a little familiarity with.

And maybe it’s mostly catharsis.

I also apologize that it’s not really told like a story from the Bard. I think that’d be too long winded (no offense, Shakey). I’m pretty sure the narrator / stage directions have the most lines. It’s actually quite criminal.

I also want to point out that the only “real world names” in this little travesty are Brendan Eich's, Jay Sullivan's, and Mitchell Baker's. The rest are from the world of Shakespeare.

And I also apologize that it’s not in iambic pentameter – that’d probably be more appropriate, but I have neither the wit nor the patience to pull this off with that much verisimilitude.

Oooh! Verisimilitude! Fancy words! Enough apologies, let’s get started.

Much Ado About Brendan (or As I’ve Seen It)

Prologue

Venice, Italy. Sometime during the Renaissance. This glorious city is composed of many families – the Montagues, the Capulets, the Macbeths, the MacDuffs, the Aguecheeks, the Fortinbrases, the Whitmores, and many, many more. Too many to name or count.

Many of these families argue and disagree about things. There’s almost always one thing that one family does or thinks that another family just cannot abide by.

It is in this turbulent city of families that we find The Merchant's Building. The Merchants of Venice are selling their wares, lending or selling books, playing music, and much more – and people are constantly streaming in and out. It's a marketplace of endless possibility.

In one section of The Merchant's Building is the Mozilla booth. Mozilla does and makes many things – but it's probably best known for its Firefox jewelry. Mozilla is one of a small number of merchants giving away jewelry – and jewelry, in this building, is special: the more people wear your jewelry, the more of a voice you have at the Merchant's Weekly Meeting, where the rules of the building are written and refined.

So what is special about this Mozilla merchant? Why should we wear their jewelry? There are certainly other merchants giving away jewelry a few booths down. What does Mozilla bring to the table?

For one thing, the jewelry is beautiful. And it makes you walk faster. And it’s got the latest features. And it makes it harder for sketchy people to follow you. And it doesn’t have a built-in tracking device recording which merchants you’re visiting. And you can add cool charms to it, and make it look exactly how you want it.

And another thing that’s unique to the Mozilla booth is that they’re composed of members of every single family in Venice. Every single family has at least one member working in the Mozilla booth. And what’s more – a bunch of these workers are volunteering their time and efforts to make this stuff!

Why? Why do they volunteer? And why do these family members work side by side with people their families might balk at, or sneer at?

Well, in the very center of the Mozilla booth, overhanging the whole thing, is… The Mission. The Mission is the guiding principles upon which the Mozilla booth operates. This is what these family members bury their gauntlets for. They work, sweat and bleed side by side for this mission. This is their connective tissue. This is what guides them when they vote and argue for things at the Merchant's Weekly Meeting.

The other truly unique thing about the Mozilla booth is that there are no walls to it! You can walk right in, and watch the craftspeople make jewelry! Heck, you can sit right down at a bench and somebody will show you how to make some yourself. They’ll guide you, and they’ll critique you, and soon, somebody will be wearing a piece of jewelry that you made.

The greatest debates also occur within the Mozilla booth. People stand on soap boxes and give their opinions about jewelry, or other merchandise – or merchandise practices. People say what they think out loud, and perhaps print it on a t-shirt and wear it. Sometimes, discussions get heated, but level thinking usually prevails because these Mozillians are an unusually bright bunch.

ACT I

There is a leadership selection underway. Someone needs to be the Chief of Business Affairs (or CBA) in the Mozilla booth. The current chief, Jay, has been holding the position as an interim chief, and the Board of Business Affairs is trying to select someone to take the position permanently.

Two members of this board already have their bags packed – for a while now, they’ve been neglecting other interests of theirs, and after this chief is selected, they feel they need to do other things.

Enter Brendan Eich. Brendan Eich is chief craftsperson of the makers of jewelry in the Mozilla booth. He’s a brilliant and widely respected craftsperson himself, having invented some of the amazing techniques that are used by all serious jewelry makers. He is also one of the founders of the Mozilla booth, having set it up with Mitchell Baker.

The Board of Business Affairs selects Brendan to be the next Chief of Business Affairs.

They announce this, and there is much applause! People clap Brendan on the back. Many craftspeople are pleased that one of their own will be in charge.

The two board members, as they’ve agreed to, take their bags, salute, and walk off out of the booth and on to other things.

A third board member leaves as well, but for reasons not related to what I describe below.

Suddenly, several Montagues and Montague supporters in the Mozilla booth grow concerned. They recall that several years ago, Brendan had donated $1000 to a law that supported Capulet values – a law which impacted their rights. The Montagues and Montague supporters worry that someone who supports this Capulet law is not fit to be Chief of a booth that houses all of the families, Montagues included.

Several of these Montagues raise these concerns out loud. This is not unusual in the Mozilla booth, as most concerns are raised out loud – and, as usual, debate begins. Brendan states that he will 100% abide by the Mozilla participation guidelines, and what’s more, began supporting a project that a Montague in the Mozilla booth has been working on – to bring more Montagues into the booth.

Vigorous debate continues, as is the Mozilla booth custom.

However, as the booth lets anybody in, and the debate can be heard outside of the booth, several Montagues and Montague supporters hear these concerns and start passing the message along to one another – a Capulet has been selected to be the CBA!

Many of these Montagues are reasonable, and say and write reasonable arguments about why they are concerned, and why Brendan may not be the right choice as CBA.

ACT II

A few meters away, the Cupid booth overhears all of this concern from the Montagues. Perhaps they really are Montague supporters (or, more likely, they just wanted to perk up business), but they suddenly decide to take a stand. People who try to come into their booth wearing Firefox jewelry have to read a big sign that tells them why the Cupid booth believes that restricting the rights of Montagues is terrible, and that the Mozilla booth is terrible for making a Capulet the CBA. They tell the people wearing Firefox jewelry that they should probably wear other things.

And so some people start to take off their Firefox jewelry. Some Montagues take it off angrily, and smash it into the ground – stomping it with their feet, creating a big dust cloud.

Enter Iago, and his team of writers. There are many writers and story-sellers in the Merchant’s Building, but Iago is one of those writers that just wants people to listen to him. He likes to twist words and make things up, or to insinuate things that are not true. He saw the board members leaving the Mozilla booth and concocts some headlines, insinuating that they left in protest of Brendan’s support of the Capulet laws. He also writes about how all of the Mozillians in the booth were not supporting Brendan’s appointment as CBA (which is not true – it’s true that some were concerned and questioned the wisdom of his appointment, but certainly not all). He writes and he writes, and his messengers pass copies and leaflets around. Montagues and Montague supporters read these leaflets, or hear people talking about them, and they grow very concerned. More Montagues start to take off their Firefox jewelry.

Some Montagues start to engage with Mozillians and try to figure out what is happening. As always, each family has calm and reasonable people to converse with – and that’s always welcome in the Mozilla booth.

However, every family also has their groundlings. The groundlings are the members of a family who are always looking for a fight. Always looking for blood. Always hoping an actor will forget their lines, and will shout distracting things at them to make it happen. They always have a bag of rotten fruit and vegetables with them to throw. Some of them just like to make trouble.

Every family has their groundlings. You’ve probably met some yourself.

The groundlings start to hear these rumors that Iago has been spreading around, copied and recopied, distorted and mutilated – and they see the signs at the Cupid booth.

And they rush the Mozilla booth! They start throwing rotten fruit and vegetables, and they tear off their Firefox jewelry, and swear to never wear it again! They gnash their teeth, and they rip out their own hair in a rage, and they scream and yell and make so much noise – it’s almost impossible for the craftspeople in the Mozilla booth to work!

A tempest of Montague rage was upon the Mozilla booth.

ACT III

After several hours of this, Brendan addresses the crowd outside, and speaks to some storytellers (Iago and his team are among them – he always is).

They ask him if he renounces Capulet ways, or if he will apologize for the Montague rights that were impacted by the Capulet law that he helped fund.

And Brendan says something along the lines of “I don’t think that’s helpful to discuss. I don’t think that’s relevant here. I’m not going to run this booth as if everybody in here were Capulets – I helped make this booth, I know that it’s composed of many families, and I know how it operates.”

But Iago and the groundlings were not satisfied. They put up signs and placards claiming that anybody wearing Firefox jewelry is supporting the Capulets!

The Mozillians look at all of the broken and stomped-on jewelry on the market ground. All their work, being trampled. If this continues, their ability to improve things for all families at the Merchant’s Weekly Meeting will fade. Their ability to enact their Mission will fade. They are agitated, discouraged, upset, angry, sad, anxious, confused – a cocktail of emotion playing pretty much the entire spectrum.

Brendan’s speech had not done anything to quell the groundlings. And Iago could smell blood, and was not going to stop writing about Brendan or Mozilla.

The other leaders look to Brendan. What will we do?

And Brendan said, “This noise is getting absurdly loud. How are we supposed to work under these conditions? There’s no way we can enact the mission like this.”

And Brendan steps onto the proscenium, and says:

To leave, or not to leave, that is the question—
Whether ’tis Nobler in the mind to suffer
The Slings and Arrows of outrageous Fortune,
Or to take Arms against a Sea of troubles,
And by opposing end them?

And so, after much thought, he takes arms. He sacrifices, and he chooses to leave the booth – the booth he helped plant into the ground over 15 years ago. The booth he helped build, the jewelry and techniques he helped craft.

“I think if I leave, you folks might have a chance to keep the mission going.”

And so he leaves, to the heartbreak of many Mozillians, and to the cheering of the Montague groundlings outside.

ACT IV

Several of the more sensible Montagues watch Brendan leave and wonder if perhaps the groundlings in their family have made them look petty and vindictive. Some of them are also sad that Brendan left the Mozilla booth – all they wanted from him was an apology, they say. That would have sufficed, they say. They didn’t expect or want him to leave the whole booth.

But the damage is done, and Brendan has left. There is no chief craftsperson, and there is no CBA. Holy shit.

The Mozillians in the booth start to get back to work, since the cheers of the Montagues outside are much easier to work against as a backdrop than the booing, hissing and food-throwing. A bunch of Montagues dust off their stomped Firefox jewelry (or grab new copies!) and put them back on proudly. Others are happy with the new jewelry they got, and don’t care about the Mission. Still others never took off the Firefox jewelry, but said they did. And now they wear it publicly again, proudly.

But suddenly, the Capulets and Capulet supporters in and around the Mozilla booth look at this gaping void where Brendan was and sense injustice. This was wrong, they cry! This man should not have been chased out of here!

Vigorous debate begins, as is the Mozilla booth custom.

And reasonable Capulets say and write reasonable things about why they think it was wrong for Brendan to have left.

And Iago, who never really left the area, hears all of this, and smells more blood in the air. He takes his poison pen, and writes stories about how Brendan was forcibly removed from the Mozilla booth by an angry mob of Montagues. He writes that, like Julius Caesar, Brendan was heard gasping “Et tu, Brute?” as he was stabbed by his fellow senators – or, like King Hamlet, poisoned and betrayed by the people closest to him.

But as usual, Iago gets this completely wrong. Not that he cares or bothers to check. What a douche. And LOUD too, holy smokes. And people listen to Iago, and read what he writes, and hear what he says, and the rumours abound!

And a second tempest starts to brew.

ACT V

Many reasonable Capulets, both inside and outside of the Mozilla booth, are concerned about what this means for them. Does this mean that Capulets aren't allowed to become CBAs? That's certainly against the inclusiveness guidelines, is it not? And much debate resonated, as is the Mozilla way.

But, as you recall, every family has their groundlings, and the Capulets are no exception. The Capulet groundlings heard the rumours that Iago and his ilk were slinging, and they gnashed their teeth, and they pulled out their hair.

“YOU KILLED BRENDAN”, the groundlings howled at the Mozilla booth.

“No, he left on his own accord to save us and the mission,” some Mozillians said with sadness.

“NO HE DIDN’T, HE WAS BETRAYED AND MURDERED BY HIS CLOSEST ALLIES!” the groundlings yelled back.

“No, that’s simply not true. He left on his own accord in an attempt to save the booth and the mission.”

And the reasonable Capulets understood this, and they understood the mindblowing complexities of this whole clusterfuck. And they spoke with reason and passion.

The Mozillian craftspeople got up from their work making jewelry to talk to these Capulets, and to the supporters of the Capulets. And many were very reasonable and calm – but the groundlings among them were vicious and yelled and made so much noise. In some ways, their rage was indistinguishable from the Montague groundling rage, which I believe is some kind of irony.

And, as you’d expect, the Capulet groundlings, like all groundlings, love blood. They love a fight. And they tore off their Firefox jewelry, and they stamped it into the ground. Vegetables and rotten fruit started to be thrown at the Mozilla booth. Again.

And the Mozillians in the booth looked at each other. They looked at the gaping void where Brendan used to stand. They all hugged one another, and comforted one another, as the jeers and boos of the groundlings got louder and louder, and as rotten fruit and vegetables slammed into them and their works.

And this is where we currently are, I believe.

Epilogue

If these ramblings have offended,
Think but this, and all is mended,
That you have but slumber’d here
While these visions did appear.
And this weak and idle theme,
No more yielding but a dream,
Gentles, do not reprehend:
if you pardon, we will mend:
And, as I am an honest Mike,
I do yet miss this Brendan Eich.
Now to ‘scape the serpent’s tongue,
We will make amends ere long;
Else the Mike a liar call;
So, good night unto you all.
Give me your hands, if we be friends,
And Robin shall restore amends.

For a less silly and more sober analysis of what happened, I suggest reading this next.

April 07, 2014 08:30 AM

April 06, 2014

Jonathan Protzenko

Freedom of speech in the Mozilla Community

I read this morning Andrew Truong's blog on planet:

I guess it's okay to speak out about how we truly feel when somebody resigns over a controversial topic but not to speak out during the controversy? We should ALWAYS speak out. Freedom of Speech.

The reason I didn't speak up is, I suspect just like many other Mozillians, fear of retribution. Seeing the attacks perpetrated on Brendan, a well-respected member of the Mozilla community, I can only imagine the torrent of hatemail that I'm going to get for publishing this blog post. (Update: so far, I didn't get any hatemail. It seems like the fears were unjustified.) The other reason is, I didn't want to further encourage a debate which I hoped would fade off after a few days. I somehow hoped we could go back to "business as usual". We were unable to, and thus I see no reason to hold back this blog post anymore.

I live in Europe. We tend to have a different opinion on these matters. But in any case, before you go any further down, I should just mention that I do support the right for anyone to marry the one they love, and I voted according to that belief for the last few elections in my country. I am a straight person, though.

I'm going to quote a few blog posts that resonate with me, and add a few comments.

Daniel wrote:

Who said "Mozilla Community"? Who said Openness? Pfffff. I've been a Mozillian for fourteen years and I'm not even sure I still recognize myself in today's Mozilla Community. Well done guys, well done. What's the next step? 100% political correctness? Is it still possible to have a legally valid personal opinion while being at Mozilla and express it in public?

From private conversations I had with other (mostly European) Mozillians, I know for certain that people have opinions that they are afraid to express in the Mozilla Community. Some people are religious, and will take great care _not_ to reveal that fact. Some people may have other beliefs that do not align with the dominant, Silicon-Valley progressive ideology. They also make sure that these are not apparent. Andrew Truong mentions freedom of speech. I believe there is freedom of speech in the Mozilla community as long as you happen to have the right opinions.

In my personal case, I fortunately happen to side with the prevalent ideology for most points, but I am now very afraid of slipping and expressing an opinion that is not considered progressive enough. I am now afraid of what is going to happen to me then: will I be kicked out? Will people call out for my name being removed from about:credits? Will people call on Twitter for my being ousted from Mozillians.org?

Ben Moskowitz wrote:

For the record, I don’t believe Brendan Eich is a bigot. He’s stubborn, not hateful. He has an opinion. It’s certainly not my opinion, but it was the opinion of 52% of people who voted on Prop 8 just six years ago, and the world is changing fast.

Again, my European point of view on the matter is simple: I interact in my daily life with many people who do not support same-sex marriage. I'm pretty sure some bosses up my hierarchy do not support that. I believe that, ultimately, there will be less and less such people, and that society will change enough that being against same-sex marriage will be a thing of the past. I also believe that as long as my bosses do their jobs, I'm fine with that. I do not care about the personal life of my president; I just care for him to run the country according to the values that his party adheres to. Same thing goes for Brendan.

Christie, who is in a much better position than I am to speak about that, says:

Certainly it would be problematic if Brendan’s behavior within Mozilla was explicitly discriminatory, or implicitly so in the form of repeated microagressions. I haven’t personally seen this (although to be clear, I was not part of Brendan’s reporting structure until today). To the contrary, over the years I have watched Brendan be an ally in many areas and bring clarity and leadership when needed. Furthermore, I trust the oversight Mozilla has in place in the form of our chairperson, Mitchell Baker, and our board of directors.

The way people demanded a public apology reminded me of the glorious times of Soviet Russia and Communist China. What next? Should Brendan be photoshopped out of all the pictures? If we leave the matter as is, the only reasonable thing left to do is to add an extra round of interviews when hiring people: the political interview. There, we should make sure that the people we hire share the "right" political opinion. Otherwise, it seems like there is no space for them in the Mozilla Community.

Andrew Sullivan wrote:

If this is the gay rights movement today – hounding our opponents with a fanaticism more like the religious right than anyone else – then count me out. If we are about intimidating the free speech of others, we are no better than the anti-gay bullies who came before us.

While being a strong supporter of the movement, I really feel sad today: what legitimacy is there now for people who've been dragging an opponent through the mud, instead of treating them like a decent human being? Is this what Mozilla is about?

(Comments are disabled on this post.)

April 06, 2014 09:25 PM

Andrew Sutherland

webpd: a Polymer-based web UI for the beets music library manager

beets webpd filtered artists list

beets is the extensible music database tool every programmer with a music collection has dreamed of writing.  At its simplest it’s a clever tagger that can normalize your music against the MusicBrainz database and then store the results in a searchable SQLite database.  But with plugins it can fetch album art, use the Discogs music database for tagging too, calculate ReplayGain values for all your music, integrate meta-data from The Echo Nest, etc.  It even has a Music Player Daemon server-mode (bpd) and a simple HTML interface (web) that lets you search for tracks and play them in your browser using the HTML5 audio tag.

I’ve tried a lot of music players through the years (alphabetically: amarok, banshee, exaile, quodlibet, rhythmbox).  They all are great music players and (at least!) satisfy the traditional Artist/Album/Track hierarchy use-case, but when you exceed 20,000 tracks and you have a lot of compilation CDs, that frequently ends up not being enough. Extending them usually turned out to be too hard / not fun enough, although sometimes it was just a question of time and seeking greener pastures.

But enough context; if you’re reading my blog you probably are on board with the web platform being the greatest platform ever.  The notable bits of the implementation are:

beets webpd madonna and morrissey

“What’s with all those tastefully chosen colors?” is what you are probably asking yourself.  The answer?  Two things!

  1. A visualization of albums/releases in the database by time, heat-map style.
    • We bin all of the albums that beets knows about by year.  In this case we assume that 1980 is the first interesting year and so put 1979 and everything before it (including albums without a year) in the very first bin on the left.  The current year is the rightmost bucket.
    • We vertically divide the albums into “albums” (red), “singles” (green), and “compilations” (blue).  This is accomplished by taking the MusicBrainz Release Group / Types and mapping them down to our smaller space.
    • The more albums in a bin, the stronger the color.
  2. A scatter-plot using the echo nest‘s acoustic attributes for the tracks where:
    • the x-axis is “danceability”.  Things to the left are less danceable.  Things to the right are more danceable.
    • the y-axis is “valence” which they define as “the musical positiveness conveyed by a track”.  Things near the top are sadder, things near the bottom are happier.
    • the colors are based on the type of album the track is from.  The idea was that singles tend to have remixes on them, so it’s interesting if we always see a big cluster of green remixes to the right.
    • tracks without the relevant data all end up in the upper-left corner.  There are a lot of these.  The echo nest is extremely generous in allowing non-commercial use of their API, but they limit you to 20 requests per minute and at this point the beets echonest plugin needs to upload (transcoded) versions of all my tracks since my music collection is apparently more esoteric than what the servers already have fingerprints for.

Together these visualizations let us infer:

Code is currently in the webpd branch of my beets fork although I should probably try and split it out into a separate repo.  You need to enable the webpd plugin like you would any other plugin for it to work.  There’s still a lot lot lot more work to be done for it to be usable, but I think it’s neat already.  It definitely works in Firefox and Chrome.
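To make the year-binning behind the heat map concrete, here is a minimal JavaScript sketch of the idea described in point 1 above; the album object shape and the 1980 cutoff are assumptions taken from the description, not code from the webpd branch.

    // Bin albums by year, heat-map style: 1979 and everything before it
    // (including albums without a year) share the leftmost bin, the current
    // year is the rightmost bin, and fuller bins get a stronger color.
    function binAlbumsByYear(albums, firstYear = 1980) {
      const lastYear = new Date().getFullYear();
      const bins = new Array(lastYear - firstYear + 2).fill(0); // +1 extra bin for "before firstYear"
      for (const album of albums) {
        const year = album.year || 0; // missing year -> first bin
        bins[year < firstYear ? 0 : year - firstYear + 1]++;
      }
      const max = Math.max(...bins, 1);
      return bins.map(count => ({ count, alpha: count / max })); // alpha drives color strength
    }

    // Example: binAlbumsByYear([{ year: 1975 }, { year: 2013 }, {}]);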

April 06, 2014 04:56 PM

April 05, 2014

Joshua Cranmer

Announcing jsmime 0.2

Previously, I've been developing JSMime as a subdirectory within comm-central. However, after discussions with other developers, I have moved the official repository of record for JSMime to its own repository, now found on GitHub. The repository has been cleaned up and the feature set for version 0.2 has been selected, so that the current tip on JSMime (also the initial version) is version 0.2. This contains the feature set I imported into Thunderbird's source code last night, which is to say support for parsing MIME messages into the MIME tree, as well as support for parsing and encoding email address headers.

Thunderbird doesn't actually use the new code quite yet (as my current tree is stuck on a mozilla-central build error, so I haven't had time to run those patches through a last minute sanity check before requesting review), but the intent is to replace the current C++ implementations of nsIMsgHeaderParser and nsIMimeConverter with JSMime instead. Once those are done, I will be moving forward with my structured header plans which more or less ought to make those interfaces obsolete.

Within JSMime itself, the pieces which I will be working on next will be rounding out the implementation of header parsing and encoding support (I have prototypes for Date headers and the infernal RFC 2231 encoding that Content-Disposition needs), as well as support for building MIME messages from their constituent parts (a feature which would be greatly appreciated in the depths of compose and import in Thunderbird). I also want to implement full IDN and EAI support, but that's hampered by the lack of a JS implementation I can use for IDN (yes, there's punycode.js, but that doesn't do StringPrep). The important task of converting the MIME tree to a list of body parts and attachments is something I do want to work on as well, but I've vacillated on the implementation here several times and I'm not sure I've found one I like yet.

JSMime, as its name implies, tries to work in as pure JS as possible, augmented with several web APIs as necessary (such as TextDecoder for charset decoding). I'm using ES6 as the base here, because it gives me several features I consider invaluable for implementing this in JavaScript: Promises, Map, generators, let. This means it can run on an unprivileged web page—I test JSMime using Firefox nightlies and the Firefox debugger where necessary. Unfortunately, it only really works in Firefox at the moment because V8 doesn't support many ES6 features yet (such as destructuring, which is annoying but simple enough to work around, or Map iteration, which is completely necessary for the code). I'm not opposed to changing it to make it work on Node.js or Chrome, but I don't realistically have the time to spend doing it myself; if someone else has the time, please feel free to contact me or send patches.
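As a small illustration of the kind of web API the library leans on (this is not JSMime's actual interface, just a sketch), TextDecoder is enough to turn a body part's raw bytes into a string for a declared charset:

    // Decode raw body-part bytes using a declared charset, falling back to
    // UTF-8 if the label isn't recognized. Function name and fallback choice
    // are illustrative only.
    function decodeBodyPart(bytes, declaredCharset) {
      try {
        return new TextDecoder(declaredCharset).decode(bytes);
      } catch (e) {
        return new TextDecoder("utf-8").decode(bytes); // unknown label
      }
    }

    const bytes = new Uint8Array([0x48, 0x61, 0x6c, 0x6c, 0x6f, 0x20, 0xe4]);
    console.log(decodeBodyPart(bytes, "iso-8859-1")); // "Hallo ä"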

April 05, 2014 05:18 PM

April 03, 2014

Joshua Cranmer

If you want fast code, don't use assembly

…unless you're an expert at assembly, that is. The title of this post was obviously meant to be an attention-grabber, but it is much truer than you might think: poorly-written assembly code will probably be slower than an optimizing compiler on well-written code (note that you may need to help the compiler along for things like vectorization). Now why is this?

Modern microarchitectures are incredibly complex. A modern x86 processor will be superscalar and use some form of compilation to microcode to do that. Desktop processors will undoubtedly have multiple instruction issues per cycle, forms of register renaming, branch predictors, etc. Minor changes—a misaligned instruction stream, a poor order of instructions, a bad instruction choice—could kill the ability to take advantage of these features. There are very few people who could accurately predict the performance of a given assembly stream (I myself wouldn't attempt it if the architecture can take advantage of ILP), and these people are disproportionately likely to be working on compiler optimizations. So unless you're knowledgeable enough about assembly to help work on a compiler, you probably shouldn't be hand-coding assembly to make code faster.

To give an example to elucidate this point (and the motivation for this blog post in the first place), I was given a link to an implementation of the N-queens problem in assembly. For various reasons, I decided to use this to start building a fine-grained performance measurement system. This system uses a high-resolution monotonic clock on Linux and runs the function 1000 times to warm up caches and counters and then runs the function 1000 more times, measuring each run independently and reporting the average runtime at the end. This is a single execution of the system; 20 executions of the system were used as the baseline for a t-test to determine statistical significance as well as visual estimation of normality of data. Since the runs observed about a constant 1-2 μs of noise, I ran all of my numbers on the 10-queens problem to better separate the data (total runtimes ended up being in the range of 200-300μs at this level). When I say that some versions are faster, the p-values for individual measurements are on the order of 10^-20—meaning that there is a 1-in-100,000,000,000,000,000,000 chance that the observed speedups could be produced if the programs take the same amount of time to run.
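The measurement harness itself is C++ on Linux; purely to illustrate the warm-up/measure/average shape described above, a rough JavaScript (Node.js) equivalent could look like this, with the function under test and the iteration counts as stand-ins:

    // Warm up caches and predictors, then time each run independently and
    // report the average, mirroring the methodology described above.
    function measure(fn, warmupRuns = 1000, measuredRuns = 1000) {
      for (let i = 0; i < warmupRuns; i++) fn();
      const samples = [];
      for (let i = 0; i < measuredRuns; i++) {
        const start = process.hrtime.bigint(); // high-resolution monotonic clock
        fn();
        const end = process.hrtime.bigint();
        samples.push(Number(end - start) / 1e3); // nanoseconds -> microseconds
      }
      const average = samples.reduce((a, b) => a + b, 0) / samples.length;
      return { average, samples };
    }

    // measure(() => solveTenQueens()) -- solveTenQueens is a placeholder here.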

The initial assembly version of the program took about 288μs to run. The first C++ version I coded, originating from the same genesis algorithm that the author of the assembly version used, ran in 275μs. A recursive program beat out a hand-written assembly block of code... and when I manually converted the recursive program into a single loop, the runtime improved to 249μs. It wasn't until I got rid of all of the assembly in the original code that I could get the program to beat the derecursified code (at 244μs)—so it's not the vectorization that's causing the code to be slow. Intrigued, I started to analyze why the original assembly was so slow.

It turns out that there are three main things that I think cause the slow speed of the original code. The first one is alignment of branches: the assembly code contains no instructions to align basic blocks on particular branches, whereas gcc happily emits these for some basic blocks. I mention this first as it is mere conjecture; I never made an attempt to measure the effects for myself. The other two causes are directly measured from observing runtime changes as I slowly replaced the assembly with code. When I replaced the use of push and pop instructions with a global static array, the runtime improved dramatically. This suggests that the alignment of the stack could be to blame (although the stack is still 8-byte aligned when I checked via gdb), which just goes to show you how much alignments really do matter in code.

The final, and by far most dramatic, effect I saw involves the use of three assembly instructions: bsf (find the index of the lowest bit that is set), btc (clear a specific bit index), and shl (left shift). When I replaced the use of these instructions with a more complicated expression int bit = x & -x and x = x - bit, the program's speed improved dramatically. And the rationale for why the speed improved won't be found in latency tables, although those will tell you that bsf is not a 1-cycle operation. Rather, it's in minutiae that's not immediately obvious.
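For readers who do not think in x86, the expression pair above has a direct JavaScript analogue; this toy loop over the set bits of a 32-bit mask is the portable replacement for the bsf/btc pair:

    // bit = x & -x isolates the lowest set bit; x -= bit clears it.
    function forEachSetBit(mask, callback) {
      let x = mask | 0; // force 32-bit integer semantics
      while (x !== 0) {
        const bit = x & -x;              // e.g. 0b10110 -> 0b00010
        callback(31 - Math.clz32(bit));  // index of that bit, if wanted
        x -= bit;                        // clear it and continue
      }
    }

    forEachSetBit(0b10110, i => console.log(i)); // logs 1, 2, 4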

The original program used the fact that bsf sets the zero flag if the input register is 0 as the condition to do the backtracking; the converted code just checked if the value was 0 (using a simple test instruction). The compare and the jump instructions are basically converted into a single instruction in the processor. In contrast, the bsf does not get to do this; combined with the higher latency of the instruction intrinsically, it means that empty loops take a lot longer to do nothing. The use of an 8-bit shift value is also interesting, as there is a rather severe penalty for using 8-bit registers in Intel processors as far as I can see.

Now, this isn't to say that the compiler will always produce the best code by itself. My final code wasn't above using x86 intrinsics for the vector instructions. Replacing the _mm_andnot_si128 intrinsic with an actual and-not on vectors caused gcc to use other, slower instructions instead of the vmovq to move the result out of the SSE registers for reasons I don't particularly want to track down. The use of the _mm_blend_epi16 and _mm_srli_si128 intrinsics can probably be replaced with __builtin_shuffle instead for more portability, but I was under the misapprehension that this was a clang-only intrinsic when I first played with the code so I never bothered to try that, and this code has passed out of my memory long enough that I don't want to try to mess with it now.

In short, compilers know things about optimizing for modern architectures that many general programmers don't. Compilers may have issues with autovectorization, but the existence of vector intrinsics allow you to force compilers to use vectorization while still giving them leeway to make decisions about instruction scheduling or code alignment which are easy to screw up in hand-written assembly. Also, compilers are liable to get better in the future, whereas hand-written assembly code is unlikely to get faster in the future. So only write assembly code if you really know what you're doing and you know you're better than the compiler.

April 03, 2014 04:52 PM

Andrew Sutherland

monitoring gaia travis build status using webmail LED notifiers

usb LED webmail notifiers showing build status

For Firefox OS the Gaia UI currently uses Travis CI to run a series of test jobs in parallel for each pull request.  While Travis has a neat ember.js-based live-updating web UI, I usually find myself either staring at my build watching it go nowhere or forgetting about it entirely.  The latter is usually what ends up happening since we have a finite number of builders available, we have tons of developers, each build takes 5 jobs, and some of those jobs can take up to 35 minutes to run when they finally get a turn to run.

I recently noticed ThinkGeek had a bunch of Dream Cheeky USB LED notifiers on sale.  They’re each a USB-controlled tri-color LED in a plastic case that acts as a nice diffuser.  Linux’s “usbled” driver exposes separate red/green/blue files via sysfs that you can echo numbers into to control them.  While the driver and USB protocol inherently support a range of 0-255, it seems like 0-63 or 0-64 is all they give.  The color gamut isn’t amazing but is quite respectable and they are bright enough that they are useful in daylight.  I made a node.js library at https://github.com/asutherland/gaudy-leds that can do some basic tricks and is available on npm as “gaudy-leds”.  You can tell it to do things by doing “gaudy-leds set red green blue purple”, etc.  I added a bunch of commander sub-commands, so “gaudy-leds --help” should give a lot more details than the currently spartan readme.
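If you want to poke the LEDs without the library, the usbled driver's sysfs interface is just three writable files; a bare-bones Node.js sketch could look like the following, where the device path is a placeholder that varies per machine (gaudy-leds discovers it for you):

    // Set a notifier's color by writing to the usbled driver's red/green/blue
    // sysfs files. LED_DIR is a placeholder path.
    const fs = require("fs");
    const path = require("path");

    const LED_DIR = "/sys/bus/usb/drivers/usbled/3-1:1.0"; // placeholder

    function setColor(red, green, blue) {
      // The hardware only honors roughly 0-63 per channel (see above).
      for (const [file, value] of [["red", red], ["green", green], ["blue", blue]]) {
        fs.writeFileSync(path.join(LED_DIR, file), String(value));
      }
    }

    setColor(0, 63, 0); // solid green: the build is happy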

I couldn’t find any existing tools/libraries to easily watch a Travis CI build and invoke commands like that (though I feel like they must exist) so I wrote https://github.com/asutherland/travis-build-watcher.  While the eventual goal is to not have to manually activate it at all, right now I can point it at a Travis build or a github pull request and it will poll appropriately so it ends up at the latest build and updates the state of the LEDs each time it polls.

Relevant notes / context:

April 03, 2014 01:58 PM

April 01, 2014

Robert Kaiser

How Effective is the Mozilla Stability Program?

One of my goals for last quarter was to get some basic metrics for the effectiveness of Mozilla's stability program. This can most easily be determined by measuring how often Firefox Desktop and Firefox for Android crash over time. Below you'll find some graphs and discussion on the data I could gather on that topic so far.

The Crash Rate

The crash rate is our primary stability measure used at Mozilla. We measure this rate in "crashes per 100 active daily installations (ADI)" or "crashes / 100 ADI". (ADI is the number of daily requests sent by Firefox Desktop and Firefox for Android to update their copy of our add-on blocklist. This value is considered a good enough estimation for usage for our purposes.)

Challenges for a Long-Term Rate

In our daily work, we tend to look at crash rates in terms of short-term changes within a single version, esp. development versions, so we can determine regressions and then dig deeper into what those are. For determining long-term program efficiency, it makes sense though to look at cross-version crash rates instead, so we know how our releases (or betas) improved. So it might make sense to look at all users on the release "channel", i.e. anyone using a stable release. On the other hand, we sometimes have leftover users of old and unsupported versions producing a lot of crashes, but those are not really relevant to our current effectiveness of the stability program, so I wanted some way to age out old versions from this overall rate. To take all that into account, I needed some way to more or less "concatenate" the stability rate graphs of a series of versions. Also, people updating to or installing a release very soon after it's published tend to have somewhat different usage patterns than those installing it only after some time, and therefore different crash rates from those updating late in the cycle, so I needed to find some way to smooth over that as well and ideally make this into an algorithm that can be automatically requested and put into an SQL query (as the data I base this on is in a PostgreSQL database).

Used Methodology

So, I began to think we could always sum up the crash and ADI numbers of the most recent two releases, or the ones that have the most users. But sometimes we release two adjacent versions 6 weeks apart and sometimes we do a fast update after a week and when the second of those is released, the one before might not have a lot of people updated to it yet so taking only those two might only cover a small portion of users and skew the numbers. So in the end, I decided to go with a moving window that always counts all versions where the builds have been created within the last 12 weeks for the Release channel, and the last 4 weeks for the Beta channel (I had 9 and 3 in the beginning but extended that to make numbers smooth over the impact of the 2-week hiatus we had over New Year's this year). The data we have in usable form goes back to the last few days of September 2011, so that's what I could use for the graphs (I'm trying to get some older data but that is harder to dig out).
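The data lives in PostgreSQL, but the windowing itself is simple; as a hedged sketch (hypothetical record shape, not the real crash-stats schema), the 12-week moving window boils down to something like this in JavaScript:

    // Cross-version crash rate for one day: sum crashes and ADI over all
    // versions whose builds were created within the last `windowDays` days,
    // then report crashes per 100 ADI.
    function crashRate(records, day, windowDays = 12 * 7) {
      const cutoff = day - windowDays * 24 * 60 * 60 * 1000;
      let crashes = 0, adi = 0;
      for (const rec of records) {
        // rec = { day, buildDate, crashes, adi } -- assumed fields, one row
        // per product version and day.
        if (rec.day === day && rec.buildDate >= cutoff) {
          crashes += rec.crashes;
          adi += rec.adi;
        }
      }
      return adi > 0 ? (100 * crashes) / adi : 0;
    }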

Graphs & Discussion

So, here are some screen copies of the graphs I have created out of the data collected with that algorithm (includes data up to March 5, which was current when I originally wrote up this post):

Image No. 23207

The first graph, with data from the Firefox desktop release channel, shows three lines, as the legend says the include crashes of the browser process, those of a plugin process (the vast majority of the plugin processes are Adobe Flash), and so-called "hangs" where we kill the plugin process after it doesn't react to contact from the browser process for a long time (by default, 45 seconds).
For one thing, you'll see that weekends have higher crash rates than weekdays. This could for example be because the ADI data isn't as reliable/accurate as one would hope or because people using Firefox on weekends do things that are more crash-prone (including work/home usage pattern and possibly machine differences).
In this graph we can also clearly see the results of known stability events in this time frame: For example, it nicely shows the Google Doodle crash of August 2012, where almost every startup of the browser crashed when Google was set as the home page, and where we scrambled to get a fix out in very short time (and Google helped us by putting a workaround in their doodles as well). It's also easy to see a few other sharp spikes where we had ADI (upwards) or crash submission (downwards) issues, as well as the crash-and-hang-rich Flash 11.3 release in June of 2012 and subsequent fixes for Flash, including the concerted efforts between Adobe and us to get down to the old levels with fixes on both sides in May/June 2013. For most of the time on the graph, you'll see that the browser crash rate didn't change very much (other than the sharp spikes mentioned). In January of 2013, though, it's possible to see the rise in crashes that caused us to ship Firefox 18.0.2 with a fix for that. Right following that, at the end of February, you'll see the sharp rise in crashes when we released Firefox 19.0, triggered by a bug in certain AMD CPUs, which we worked around by rebuilding and releasing a 19.0.1. Those examples, like anything that shows up significantly in that graph and isn't a data error, have pretty intricate stories behind them; any of those could make up a separate blog post.
That said, the fact that we could keep the crash rate pretty much at 1.0 browser crashes / 100 ADI over that whole time (and even slightly improve to just below that with the Firefox 26 release in December 2013) is a statement on how effective the Mozilla Stability Program is on keeping Firefox crashes down even though a whole lot of code has been added to support a ton of new features that the web has gained over that time.

Now, let's see how Firefox Beta looks in comparison:

Image No. 23208

At the end of 2012, we apparently did manage to improve base level stability of the Beta channel, but you'll see that this channel is more noisy - which is expected as here we still see regressions and work on fixes before the issues hit release. For example, you can see that Firefox 27 Beta regressed stability in December 2013. We fixed that only very late in the cycle so that you don't see 27 being worse on the Release channel, but 28 had other regressions in the beginning and a rather large one in 28 Beta 4 (mid February 2014) - once we fixed that, you see that we come down to the 1.0 line in the last one or two weeks, so that looks pretty good for the 28 release, which was to be released ~2 weeks after the end of that data.
Also, you'll see that the plugin improvements of early 2013 are about 6 weeks earlier in Beta than in Release, which shows pretty well that there were actual patches in our code that helped with Flash hangs and crashes (as our code is on a 6-week cadence while Adobe's releases hit both channels pretty much at the same time).

Now, let's see how the picture looks when we look at a product that was newly created while we already had the mechanisms in place to record this data, like the current "native UI" Firefox for Android:

Image No. 23211

The early releases had higher crash rates, but we significantly improved over time due to our efforts in the Stability Program. You also can make out that the sharper changes happen pretty exactly at the edges of the 6-week release cycles. Also, you'll see that Firefox 23 for Android in September 2013 was pretty good but we became worse in the following months. Because of that, we started a renewed effort to improve stability of Firefox for Android this January. The current Firefox 27 for Android release is somewhat better than the one before, but it's not where we want to be yet, obviously. We didn't have too much time to pound on issues from the start of the year until 27 was released, but Beta can show us if our newer efforts are pointing in the right direction:

Image No. 23212

Now this graph looks pretty nice, doesn't it? When we started off putting this product on Beta the first time, we were seeing the usual churn of exposing a new product to a wider audience for the first time, but we burned down the issues pretty well. Then we had a big regression, fixed it, and burned down bugs slowly over multiple months again. The regressions of late 2013 look even more dramatic here as we had even worse issues there but could actually fix the worst parts of those so that the regressions on the Release channel weren't as bad as the first Betas we had there. Many of the 6-week cycles in this graph look like burn-down charts, high in the beginning, going down over the cycle as we push for bugs being fixed. It's also pretty awesome to see how the efforts since the start of this year have really paid off and current Beta is rivaling the best Beta numbers we had so far - you can imagine how I was looking forward to Firefox 28 for Android hitting Release based on that data! :)

All that said, we know there's more we can do on both products, and while holding crash rates pretty stable over a long time while adding a ton of features is awesome, we strive for improving overall stability. Those graphs are one part of measuring the effectiveness of the stability program. I hope we will be able to put them up in a more dynamic and daily updating form at some point (right now I manually construct them in LibreOffice).

And in case you're interested in digging deeper into the source of the graphs, the code to pull the data from the crash-stats DB is in my crash-report-tools repo and the JSON coming out of that and powering my charts is in my directory on crash-analysis (F*-bytype.json files). Also feel free to contact me for more details.

April 01, 2014 05:58 PM

March 29, 2014

Robert Kaiser

Lantea Maps conversion to WebGL

I blogged about Lantea Maps 18 months ago. As its marketplace listing describes, the app's purpose is displaying maps as well as recording and displaying GPS tracks.

I wrote this app both to scratch an itch (I wanted an OpenStreetMap-based app to record GPS tracks) and to learn a bit more of JavaScript and web app development. As maps are a 2D problem and the track display requires drawing various lines and possibly other location indicators, I wrote this app based on 2D canvas. I started off with some base code on drawing bitmap tile maps to canvas, and wrote the app around that, doing quite some rewriting on the little bit of code I started from. I also ended up splitting map and track into separate canvases so I wouldn't need to redraw everything when deleting the track or when moving the indicator of the last location or similar. Over time, I did a lot of improvements in various areas of the app, from the tile cache in IndexedDB via OpenStreetMap upload of tracks to pinch zooming on touch screens.

Still, performance of the map canvas was not good - on phones (esp. the small 320x480 screens like the ZTE Open), where you only have a handful of 256x256 map tiles to draw, panning was slightly chunky, but on larger screens, like my Android tablet or even my pretty fast desktop, it ranged from bad to awful (like, noticeably waiting from any movement until you saw any drawing of a moved map). Also, as it takes a while until images are loaded (cached from IndexedDB or out from the web) and that's all called asynchronously, the positions the images ended up being drawn at often weren't completely correct any more at the time of drawing them. I tried some optimizations with actually grabbing the pixels from the canvas, setting them in the new positions and only actually redrawing the images on the borders, but that only helped slightly on small screens while making large ones even worse in performance.

Given what I read and heard about how today's graphics chips and pipelines work, I figured that the problem was with the drawImage() calls to draw the tiles to the canvas as well as the getImageData()/putImageData() calls to move the pixels in the optimizations. All those copy image data between JS and graphics memory, which is slow, and doing it a lot doesn't really fit well with how graphics stacks work nowadays. The only way I heard that should improve that a lot would be to switch from 2D canvas to WebGL (or go to the image-based tile maps that many others are using, but that wouldn't be as much fun). I don't remember all sources for that, but just did get another pointer to a Mozilla Hacks post that explains some of it. And as Google also seems to being moving their Maps site to WebGL (from image-based tiles, mind you), it can't be a really wrong move. :)

So, I set out to try and learn the pieces of WebGL I needed for this app. You'd guess that Mozilla, who invented that API together with Khronos, would have ample docs on it, but the WebGL MDN page only has one tutorial for an animated 3D cube and a list of external links. I meanwhile have filed a bug on a WebGL reference so may improve this further in the future, but I started off first trying with the tutorial that MDN has. I didn't get a lot to work there except some basics, and a lot of the commands in there were not very well explained, but the html5rocks tutorial helped me to get things into a better shape, and some amount of trying around and the MSDN WebGL reference helped to understand more and get things actually right.
One thing that was pretty useful there as well was separating the determination of what tiles should be visible and loading them into textures from the actual drawing of the textures to the canvas. By doing the drawing itself on requestAnimationFrame and this being the only thing done when we pan as long as I have all tiles loaded into textures, I save work and should improve performance.
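A stripped-down sketch of that separation (not the actual Lantea Maps code) shows the pattern: tiles are uploaded to textures whenever they arrive, and the requestAnimationFrame handler only ever draws textures that are already there.

    // Tile textures live in a cache keyed by tile coordinates; loading is
    // asynchronous, drawing happens per animation frame and only touches
    // textures that are ready.
    const textureCache = new Map();

    function ensureTileTexture(gl, key, url) {
      if (textureCache.has(key)) return;
      const entry = { texture: gl.createTexture(), ready: false };
      textureCache.set(key, entry);
      const img = new Image();
      img.onload = () => {
        gl.bindTexture(gl.TEXTURE_2D, entry.texture);
        gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, img);
        gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
        entry.ready = true;
      };
      img.src = url;
    }

    function drawFrame(gl, visibleTiles, drawTile) {
      for (const tile of visibleTiles) {
        const entry = textureCache.get(tile.key);
        if (entry && entry.ready) drawTile(gl, entry.texture, tile);
      }
      requestAnimationFrame(() => drawFrame(gl, visibleTiles, drawTile));
    }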

Image No. 23214 Image No. 23213
2D Canvas (left) and WebGL (right) version of Lantea Maps on the ZTE Open

As you can see from the images, the 2D canvas and WebGL versions of Lantea Maps do not look different - but then, that was not intended, as the map is the map after all. Right now, you can actually test both versions, though: I have not moved the WebGL to production yet, so lantea.kairo.at still uses 2D canvas, while the staging version lantea-dev.kairo.at already is WebGL. You'll notice that panning the map is way more fluid in the new version and the tile distortions that could happen with delayed loading in the old one do not happen. I still wonder though why it sometimes happens that you have to wait very long for tiles to load, esp. after zooming. I still need to figure that out at some point, but things render after waiting, so I found it OK for now. Also, I found the WebGL app to work fine on Firefox desktop (Linux), Firefox for Android, as well as Firefox OS (1.1, 1.2, and 1.5/Nightly).

So, I'm happy I did manage the conversion and learn some WebGL, though there's still a lot to be done. And as always, the code to Lantea Maps is available in my public git as well as GitHub if you want to learn or help. ;-)

March 29, 2014 12:02 AM

March 14, 2014

Joshua Cranmer

Understanding email charsets

Several years ago, I embarked on a project to collect the headers of all the messages I could reach on NNTP, with the original intent of studying the progression of the most common news clients. More recently, I used this dataset to attempt to discover the prevalence of charsets in email messages. In doing so, I identified a critical problem with the dataset: since it only contains headers, there is very little scope for actually understanding the full, sad story of charsets. So I've decided to rectify this problem.

This time, I modified my data-collection scripts to make it much easier to mass-download NNTP messages. The first script effectively lists all the newsgroups, and then all the message IDs in those newsgroups, stuffing the results in a set to remove duplicates (cross-posts). The second script uses Python's nntplib package to attempt to download all of those messages. Of the 32,598,261 messages identified by the first set, I succeeded in obtaining 1,025,586 messages in full or in part. Some messages failed to download due to crashing nntplib (which appears to be unable to handle messages of unbounded length), and I suspect my newsserver connections may have just timed out in the middle of the download at times. Others failed due to expiring before I could download them. All in all, 19,288 messages were not downloaded.

Analysis of the contents of messages was hampered due to a strong desire to find techniques that could mangle messages as little as possible. Prior experience with Python's message-parsing libraries leads me to believe that they are rather poor at handling some of the crap that comes into existence, and the errors in nntplib suggest they haven't fixed them yet. The only message parsing framework I truly trust to give me the level of finesse I need is the JSMime that I'm writing, but that happens to be in the wrong language for this project. After reading some blog posts of Jeffrey Stedfast, though, I decided I would give GMime a try instead of trying to rewrite ad-hoc MIME parser #N.

Ultimately, I wrote a program to investigate the following questions on how messages operate in practice:

While those were the questions I originally sought answers to, I did come up with others as I worked on my tool, some in part due to what information I was basically already collecting. The tool I wrote primarily uses GMime to convert the body parts to 8-bit text (no charset conversion), as well as parse the Content-Type headers, which are really annoying to do without writing a full parser. I used ICU to handle charset conversion and detection. RFC 2047 decoding is done largely by hand since I needed very specific information that I couldn't convince GMime to give me. All code that I used is available upon request; the exact dataset is harder to transport, given that it is some 5.6GiB of data.

Other than GMime being built on GObject and exposing a C API, I can't complain much, although I didn't try to use it to do magic. Then again, in my experience (and as this post will probably convince you as well), you really want your MIME library to do charset magic for you, so in doing well for my needs, it's actually not doing well for a larger audience. ICU's C API similarly makes me want to complain. However, I'm now very suspicious of the quality of its charset detection code, which is the main reason I used it. Trying to figure out how to get it to handle the charset decoding errors also proved far more annoying than it really should have been.

Some final background regards the biases I expect to crop up in the dataset. As the approximately 1 million messages were drawn from the python set iterator, I suspect that there's no systematic bias towards or away from specific groups, excepting that the ~11K messages found in the eternal-september.* hierarchy are completely represented. The newsserver I used, Eternal September, has a respectably large set of newsgroups, although it is likely to be biased towards European languages and under-representing East Asians. The less well-connected South America, Africa, or central Asia are going to be almost completely unrepresented. The download process will be biased away from particularly heinous messages (such as exceedingly long lines), since nntplib itself is failing on them.

This being news messages, I also expect that use of 8-bit will be far more common than would be the case in regular mail messages. On a related note, the use of 8-bit in headers would be commensurately elevated compared to normal email. What would be far less common is HTML. I also expect that undeclared charsets may be slightly higher.

Charsets

Charset data is mostly collected on the basis of individual body parts within body messages; some messages have more than one. Interestingly enough, the 1,025,587 messages yielded 1,016,765 body parts with some text data, which indicates that either the messages on the server had only headers in the first place or the download process somehow managed to only grab the headers. There were also 393 messages that I identified having parts with different charsets, which only further illustrates how annoying charsets are in messages.

The aliases in charsets are mostly uninteresting in variance, except for the various labels used for US-ASCII (us - ascii, 646, and ANSI_X3.4-1968 are the less-well-known aliases), as well as the list of charsets whose names ICU was incapable of recognizing, given below. Unknown charsets are treated as equivalent to undeclared charsets in further processing, as there were too few to merit separate handling (45 in all).

For the next step, I used ICU to attempt to detect the actual charset of the body parts. ICU's charset detector doesn't support the full gamut of charsets, though, so charset names not claimed to be detected were instead processed by checking if they decoded without error. Before using this detection, I detect if the text is pure ASCII (excluding control characters, to enable charsets like ISO-2022-JP, and +, if the charset we're trying to check is UTF-7). ICU has a mode which ignores all text in things that look like HTML tags, and this mode is set for all HTML body parts.
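My reading of that pre-check, rendered as a sketch in JavaScript rather than the tool's actual C++:

    // "Pure ASCII" here means printable ASCII plus ordinary whitespace.
    // Control characters fail the check so ISO-2022-JP (which uses ESC
    // sequences) still reaches the charset detector, and '+' fails it when
    // the declared charset is UTF-7, since '+' opens a UTF-7 shifted run.
    function isPureAscii(bytes, declaredCharset) {
      const checkingUtf7 = /^utf-7$/i.test(declaredCharset || "");
      for (const b of bytes) {
        if (b === 0x09 || b === 0x0a || b === 0x0d) continue; // tab, LF, CR
        if (b < 0x20 || b > 0x7e) return false;               // control or 8-bit byte
        if (checkingUtf7 && b === 0x2b) return false;         // '+'
      }
      return true;
    }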

I don't quite believe ICU's charset detection results, so I've collapsed the results into a simpler table to capture the most salient feature. The correct column indicates the cases where the detected result was the declared charset. The ASCII column captures the fraction which were pure ASCII. The UTF-8 column indicates if ICU reported that the text was UTF-8 (it always seems to try this first). The Wrong C1 column refers to an ISO-8859-1 text being detected as windows-1252 or vice versa, which is set by ICU if it sees or doesn't see an octet in the appropriate range. The other column refers to all other cases, including invalid cases for charsets not supported by ICU.

Declared | Correct | ASCII | UTF-8 | Wrong C1 | Other | Total
ISO-8859-1 | 230,526 | 225,667 | 883 | 8,119 | 1,035 | 466,230
Undeclared | - | 148,054 | 1,116 | - | 37,626 | 186,796
UTF-8 | 75,674 | 37,600 | - | - | 1,551 | 114,825
US-ASCII | - | 98,238 | 0 | - | 304 | 98,542
ISO-8859-15 | 67,529 | 18,527 | 0 | - | - | 86,056
windows-1252 | 21,414 | 4,370 | 154 | 3,319 | 130 | 29,387
ISO-8859-2 | 18,647 | 2,138 | 70 | 71 | 2,319 | 23,245
KOI8-R | 4,616 | 424 | 2 | - | 1,112 | 6,154
GB2312 | 1,307 | 59 | 0 | - | 112 | 1,478
Big5 | 622 | 608 | 0 | - | 174 | 1,404
windows-1256 | 343 | 10 | 0 | - | 45 | 398
IBM437 | 84 | 257 | 0 | - | - | 341
ISO-8859-13 | 311 | 6 | 0 | - | - | 317
windows-1251 | 131 | 97 | 1 | - | 61 | 290
windows-1250 | 69 | 69 | 0 | 14 | 101 | 253
ISO-8859-7 | 26 | 26 | 0 | 0 | 131 | 183
ISO-8859-9 | 127 | 11 | 0 | 0 | 17 | 155
ISO-2022-JP | 76 | 69 | 0 | - | 3 | 148
macintosh | 67 | 57 | 0 | - | - | 124
ISO-8859-16 | 0 | 15 | - | - | 101 | 116
UTF-7 | 51 | 4 | 0 | - | - | 55
x-mac-croatian | 0 | 13 | - | - | 25 | 38
KOI8-U | 28 | 2 | 0 | - | - | 30
windows-1255 | 0 | 18 | 0 | 0 | 6 | 24
ISO-8859-4 | 23 | 0 | 0 | - | - | 23
EUC-KR | 0 | 3 | 0 | - | 16 | 19
ISO-8859-14 | 14 | 4 | 0 | - | - | 18
GB18030 | 14 | 3 | 0 | 0 | - | 17
ISO-8859-8 | 0 | 0 | 0 | 0 | 16 | 16
TIS-620 | 15 | 0 | 0 | - | - | 15
Shift_JIS | 8 | 4 | 0 | - | 1 | 13
ISO-8859-3 | 9 | 1 | - | - | 1 | 11
ISO-8859-10 | 10 | 0 | 0 | - | - | 10
KSC_5601 | 3 | 6 | 0 | - | - | 9
GBK | 4 | 2 | 0 | - | - | 6
windows-1253 | 0 | 3 | 0 | 0 | 2 | 5
ISO-8859-5 | 1 | 0 | 0 | - | 3 | 4
IBM850 | 0 | 4 | 0 | - | - | 4
windows-1257 | 0 | 3 | 0 | - | - | 3
ISO-2022-JP-2 | 2 | 0 | 0 | - | - | 2
ISO-8859-6 | 0 | 1 | 0 | 0 | - | 1
Total | 421,751 | 536,373 | 2,226 | 11,523 | 44,892 | 1,016,765

The most obvious thing shown in this table is that the most common charsets remain ISO-8859-1, Windows-1252, US-ASCII, UTF-8, and ISO-8859-15, which is to be expected, given an expected prior bias to European languages in newsgroups. The low prevalence of ISO-2022-JP is surprising to me: it means a lower incidence of Japanese than I would have expected. Either that, or Japanese have switched to UTF-8 en masse, which I consider very unlikely given that Japanese have tended to resist the trend towards UTF-8 the most.

Beyond that, this dataset has caused me to lose trust in the ICU charset detectors. KOI8-R is recorded as being 18% malformed text, with most of that ICU believing to be ISO-8859-1 instead. Judging from the results, it appears that ICU has a bias towards guessing ISO-8859-1, which means I don't believe the numbers in the Other column to be accurate at all. For some reason, I don't appear to have decoders for ISO-8859-16 or x-mac-croatian on my local machine, but running some tests by hand appear to indicate that they are valid and not incorrect.

Somewhere between 0.1% and 1.0% of all messages are subject to mojibake, depending on how much you trust the charset detector. The cases of UTF-8 being misdetected as non-UTF-8 could potentially be explained by having very few non-ASCII sequences (ICU requires four valid sequences before it confidently declares text UTF-8); someone who writes a post in English but has a non-ASCII signature (such as myself) could easily fall into this category. Despite this, however, it does suggest that there is enough mojibake around that users need to be able to override charset decisions.

The undeclared charsets are described, in descending order of popularity, by ISO-8859-1, Windows-1252, KOI8-R, ISO-8859-2, and UTF-8, describing 99% of all non-ASCII undeclared data. ISO-8859-1 and Windows-1252 are probably over-counted here, but the interesting tidbit is that KOI8-R is used half as much undeclared as it is declared, and I suspect it may be undercounted. The practice of using locale-default fallbacks that Thunderbird has been using appears to be the best way forward for now, although UTF-8 is growing enough in popularity that using a specialized detector that decodes as UTF-8 if possible may be worth investigating (3% of all non-ASCII, undeclared messages are UTF-8).
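That "decode as UTF-8 if possible, otherwise fall back" idea is easy to express with TextDecoder's fatal mode; a minimal sketch, with windows-1252 standing in for whatever the locale default would be:

    // Try strict UTF-8 first; if the bytes are not valid UTF-8, decode with
    // the locale-dependent fallback charset instead.
    function decodeWithFallback(bytes, fallbackCharset = "windows-1252") {
      try {
        return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
      } catch (e) {
        return new TextDecoder(fallbackCharset).decode(bytes);
      }
    }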

HTML

Unsurprisingly (considering I'm polling newsgroups), very few messages contained any HTML parts at all: there were only 1,032 parts in the total sample size, of which only 552 had non-ASCII characters and were therefore useful for the rest of this analysis. This means that I'm skeptical of generalizing the results of this to email in general, but I'll still summarize the findings.

HTML, unlike plain text, contains a mechanism to explicitly identify the charset of a message. The official algorithm for determining the charset of an HTML file can be described simply as "look for a <meta> tag in the first 1024 bytes. If it can be found, attempt to extract a charset using one of several different techniques depending on what's present or not." Since doing this fully properly is complicated in library-less C++ code, I opted to look first for a <meta[ \t\r\n\f] production, guess the extent of the tag, and try to find a charset= string somewhere in that tag. This appears to be an approach which is more reflective of how this parsing is actually done in email clients than the proper HTML algorithm. One difference is that my regular expressions also support the newer <meta charset="UTF-8"/> construct, although I don't appear to see any use of this.
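Rendered as regular expressions (my own approximation of the approach, not the exact code), the email-client-style sniff looks roughly like this:

    // Look at the first 1024 bytes only, find a <meta ...> tag, and pull a
    // charset= value out of it; this also catches <meta charset="UTF-8">.
    function sniffHtmlCharset(html) {
      const head = html.slice(0, 1024);
      const meta = /<meta[ \t\r\n\f][^>]*/i.exec(head);
      if (!meta) return null;
      const charset = /charset\s*=\s*["']?([\w.:-]+)/i.exec(meta[0]);
      return charset ? charset[1] : null;
    }

    sniffHtmlCharset('<meta http-equiv="Content-Type" ' +
      'content="text/html; charset=ISO-8859-2">'); // "ISO-8859-2"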

I found only 332 parts where the HTML declared a charset. Only 22 parts had a case where both a MIME charset and an HTML charset and the two disagreed with each other. I neglected to count how many messages had HTML charsets but no MIME charsets, but random sampling appeared to indicate that this is very rare on the data set (the same order of magnitude or less as those where they disagreed).

As for the question of who wins: of the 552 non-ASCII HTML parts, only 71 messages did not have the MIME type be the valid charset. Then again, 71 messages did not have the HTML type be valid either, which strongly suggests that ICU was detecting the incorrect charset. Judging from manual inspection of such messages, it appears that the MIME charset ought to be preferred if it exists. There are also a large number of HTML charset specifications saying unicode, which ICU treats as UTF-16, which is most certainly wrong.

Headers

In the data set, 1,025,856 header blocks were processed for the following statistics. This is slightly more than the number of messages since the headers of contained message/rfc822 parts were also processed. The good news is that 97% (996,103) headers were completely ASCII. Of the remaining 29,753 headers, 3.6% (1,058) were UTF-8 and 43.6% (12,965) matched the declared charset of the first body part. This leaves 52.9% (15,730) that did not match that charset, however.

Now, NNTP messages can generally be expected to have a higher 8-bit header ratio, so this is probably exaggerating the setup in most email messages. That said, the high incidence is definitely an indicator that even non-EAI-aware clients and servers cannot blindly presume that headers are 7-bit, nor can EAI-aware clients and servers presume that 8-bit headers are UTF-8. The high incidence of mismatching the declared charset suggests that fallback-charset decoding of headers is a necessary step.

RFC 2047 encoded-words is also an interesting statistic to mine. I found 135,951 encoded-words in the data set, which is rather low, considering that messages can be reasonably expected to carry more than one encoded-word. This is likely an artifact of NNTP's tendency towards 8-bit instead of 7-bit communication and understates their presence in regular email.

Counting encoded-words can be difficult, since there is a mechanism to let them continue in multiple pieces. For the purposes of this count, a sequence of such words count as a single word, and I indicate the number of them that had more than one element in a sequence in the Continued column. The 2047 Violation column counts the number of sequences where decoding words individually does not yield the same result as decoding them as a whole, in violation of RFC 2047. The Only ASCII column counts those words containing nothing but ASCII symbols and where the encoding was thus (mostly) pointless. The Invalid column counts the number of sequences that had a decoder error.

CharsetCountContinued2047 ViolationOnly ASCIIInvalid
ISO-8859-156,35515,6104990
UTF-836,56314,2163,3112,7049,765
ISO-8859-1520,6995,695400
ISO-8859-211,2472,66990
windows-12525,1743,075260
KOI8-R3,5231,203120
windows-125676556800
Big551146280171
ISO-8859-71652603
windows-12511573020
GB2312126356051
ISO-2022-JP10285049
ISO-8859-13784500
ISO-8859-9762100
ISO-8859-471200
windows-1250682100
ISO-8859-5662000
US-ASCII3810380
TIS-620363400
KOI8-U251100
ISO-8859-16221022
UTF-7172183
EUC-KR174409
x-mac-croatian103010
Shift_JIS80003
Unknown7207
ISO-2022-KR70000
GB1803061001
windows-12554000
ISO-8859-143000
ISO-8859-32100
GBK20002
ISO-8859-61100
Total135,95143,3603,3613,33810,096

This table somewhat mirrors the distribution of regular charsets, with one major class of differences: charsets that represent non-Latin scripts (particularly Asian scripts) appear to be overdistributed compared to their corresponding use in body parts. The exception to this rule is GB2312 which is far lower than relative rankings would presume—I attribute this to people using GB2312 being more likely to use 8-bit headers instead of RFC 2047 encoding, although I don't have direct evidence.

Clearly continuations are common, which is to be relatively expected. The sad part is how few people bother to try to adhere to the specification here: out of 14,312 continuations in languages that could violate the specification, 23.5% of them violated the specification. The mode-shifting versions (ISO-2022-JP and EUC-KR) are basically all violated, which suggests that no one bothered to check if their encoder "returns to ASCII" at the end of the word (I know Thunderbird's does, but the other ones I checked don't appear to).

The number of invalid UTF-8 decoded words, 26.7%, seems impossibly high to me. A brief check of my code indicates that this is working incorrectly in the face of invalid continuations, which certainly exaggerates the effect but still leaves a value too high for my tastes. Of more note are the elevated counts for the East Asian charsets: Big5, GB2312, and ISO-2022-JP. I am not an expert in charsets, but I believe that Big5 and GB2312 in particular are a family of almost-but-not-quite-identical charsets and it may be that ICU is choosing the wrong candidate of each family for these instances.

There is a surprisingly large number of encoded words that encode only ASCII. When searching specifically for the ones that use the US-ASCII charset, I found that these can be divided into three categories. One set comes from a few people who apparently have an unsanitized whitespace (space and LF were the two I recall seeing) in the display name, producing encoded words like =?us-ascii?Q?=09Edward_Rosten?=. Blame 40tude Dialog here. Another set encodes some basic characters (most commonly = and ?, although a few other interpreted characters popped up). The final set of errors were double-encoded words, such as =?us-ascii?Q?=3D=3FUTF-8=3FQ=3Ff=3DC3=3DBCr=3F=3D?=, which appear to be all generated by an Emacs-based newsreader.

One interesting thing when sifting the results is finding the crap that people produce in their tools. By far the worst single instance of an RFC 2047 encoded-word that I found is this one: Subject: Re: [Kitchen Nightmares] Meow! Gordon Ramsay Is =?ISO-8859-1?B?UEgR lqZ VuIEhlYWQgVH rbGeOIFNob BJc RP2JzZXNzZW?= With My =?ISO-8859-1?B?SHVzYmFuZ JzX0JhbGxzL JfU2F5c19BbXiScw==?= Baking Company Owner (complete with embedded spaces), discovered by crashing my ad-hoc base64 decoder (due to the spaces). The interesting thing is that even after investigating the output encoding, it doesn't look like the text is actually correct ISO-8859-1... or any obvious charset for that matter.

I looked at the unknown charsets by hand. Most of them were actually empty charsets (looked like =??B?Sy4gSC4gdm9uIFLDvGRlbg==?=), and all but one of the outright empty ones were generated by KNode and really UTF-8. The other one was a Windows-1252 generated by a minor newsreader.

Another important aspect of headers is how to handle 8-bit headers. RFC 5322 blindly hopes that headers are pure ASCII, while RFC 6532 dictates that they are UTF-8. Indeed, 97% of headers are ASCII, leaving just 29,753 headers that are not. Of these, only 1,058 (3.6%) are UTF-8 per RFC 6532. Deducing which charset they are is difficult because the large amount of English text for header names and the important control values will greatly skew any charset detector, and there is too little text to give a charset detector confidence. The only metric I could easily apply was testing Thunderbird's heuristic as "the header blocks are the same charset as the message contents"—which only worked 45.2% of the time.

Encodings

While developing an earlier version of my scanning program, I was intrigued to know how often various content transfer encodings were used. I found 1,028,971 parts in all (1,027,474 of which are text parts). The transfer encoding of binary did manage to sneak in, with 57 such parts. Using 8-bit text was very popular, at 381,223 samples, second only to 7-bit at 496,114 samples. Quoted-printable had 144,932 samples and base64 only 6,640 samples. Extremely interesting are the presence of 4 illegal transfer encodings in 5 messages, two of them obvious typos and the others appearing to be a client mangling header continuations into the transfer-encoding.

Conclusions

So, drawing from the body of this data, I would like to make the following conclusions as to using charsets in mail messages:

  1. Have a fallback charset. Undeclared charsets are extremely common, and I'm skeptical that charset detectors are going to get this stuff right, particularly since email can more naturally combine multiple languages than other bodies of text (think signatures). Thunderbird currently uses a locale-dependent fallback charset, which roughly mirrors what Firefox and I think most web browsers do. (See the short sketch after this list.)
  2. Let users override charsets when reading. By the same token, mojibake text, while not particularly common, is common enough to make declared charsets sometimes unreliable. It's also possible that the fallback charset is wrong, so users may need to override the chosen charset.
  3. Testing is mandatory. In this set of messages, I found base64 encoded words with spaces in them, encoded words without charsets (even UNKNOWN-8BIT), and clearly invalid Content-Transfer-Encodings. Real email messages that are flagrantly in violation of basic spec requirements exist, so you should make sure that your email parser and client can handle the weirdest edge cases.
  4. Non-UTF-8, non-ASCII headers exist. EAI notwithstanding, 8-bit headers are a reality. Combined with a predilection for saying ASCII when text is really ASCII, this means that there is often no good in-band information to tell you what charset is correct for headers, so you have to go back to a fallback charset.
  5. US-ASCII really means ASCII. Email clients appear to do a very good job of only emitting US-ASCII as a charset label if it's US-ASCII. The sample size is too small for me to grasp what charset 8-bit characters should imply in US-ASCII.
  6. Know your decoders. ISO-8859-1 actually means Windows-1252 in practice. Big5 and GB2312 are actually small families of charsets with slightly different meanings. ICU notably disagrees with some of these realities, so be sure to include various charset edge cases in your tests so you know that the decoders are correct.
  7. UTF-7 is still relevant. Of the charsets I found not mentioned in the WHATWG encoding spec, IBM437 and x-mac-croatian are in use only due to specific circumstances that limit their generalizable presence. IBM850 is too rare. UTF-7 is common enough that you need to actually worry about it, as abominable and evil a charset as it is.
  8. HTML charsets may matter—but MIME matters more. I don't have enough data to say if charsets declared in HTML are needed to do proper decoding. I do have enough to say fairly conclusively that the MIME charset declaration is authoritative if HTML disagrees.
  9. Charsets are not languages. The entire reason x-mac-croatian is used at all can be traced to Thunderbird displaying the charset as "Croatian," despite it being pretty clearly not a preferred charset. Similarly most charsets are often enough ASCII that, say, an instance of GB2312 is a poor indicator of whether or not the message is in English. Anyone trying to filter based on charsets is doing a really, really stupid thing.
  10. RFCs reflect an ideal world, not reality. This is most notable in RFC 2047: the specification may state that encoded words are supposed to be independently decodable, but the evidence is pretty clear that more clients break this rule than uphold it.
  11. Limit the charsets you support. Just because your library lets you emit a hundred charsets doesn't mean that you should let someone try to do it. You should emit US-ASCII or UTF-8 unless you have a really compelling reason not to, and those compelling reasons don't require obscure charsets. Some particularly annoying charsets should never be written: EBCDIC is already basically dead on the web, and I'd like to see UTF-7 die as well.
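To illustrate the first two conclusions, here is a minimal sketch of decoding with a fallback. The choice of windows-1252 as the fallback is just an assumption for the example; as noted above, Thunderbird's actual fallback is locale-dependent:

  def decode_text_part(raw, declared=None, fallback="windows-1252"):
      # Try the declared charset, then UTF-8, then the fallback; never fail outright,
      # since the user should still be able to override the chosen charset afterwards.
      for charset in (declared, "utf-8", fallback):
          if not charset:
              continue
          try:
              return raw.decode(charset)
          except (LookupError, UnicodeDecodeError):
              continue
      return raw.decode(fallback, errors="replace")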

When I have time, I'm planning on taking some of the more egregious or interesting messages in my dataset and packaging them into a database of emails to help create testsuites on handling messages properly.

March 14, 2014 04:17 AM

March 10, 2014

Rumbling Edge - Thunderbird

2014-03-09 Calendar builds

Common (excluding Website bugs)-specific: (27)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

March 10, 2014 08:16 AM

2014-03-09 Thunderbird comm-central builds

Thunderbird-specific: (31)

MailNews Core-specific: (26)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

March 10, 2014 08:15 AM

March 07, 2014

Ludovic Hirlimann

Thunderbird 28.0b1 is out and why you should care

We’ve just released another beta of Thunderbird. We are now in the middle of the release cycle that leads up to the next major version shipping to our millions of daily users (we’ve fixed 200+ bugs since the last major release, version 24). We currently have less than 1% of our users on the beta, and that’s not enough to catch regressions: because Thunderbird offers mail, newsgroups and RSS feeds, we can’t cover all the ways our user base uses it. Also, many companies out there sell extensions for spam filtering, virus protection and so forth. The QA community just doesn’t have the time to try all of these and run them with Thunderbird betas to find issues.

And that’s where you, dear reader, can help. How, you might ask? Well, here is a list of examples of how you can help:

If you find issues, let us know either through Bugzilla or through the support forums, so we can try to address them.

PS: the current download page says English only because of a bug in our build infrastructure for Windows. Linux and Mac builds are available localized.

March 07, 2014 02:55 PM

March 05, 2014

David Ascher

Product Thinking

I have a new job!  Still with Mozilla, still doing a lot of what I’ve done in the past, just hopefully more/better/faster.  The group I’m joining has a great culture of active blogging, so I’m hoping the peer pressure there will help me blog more often.

What’s the gig you ask? My new focus is to help the Mozilla Foundation make our products as adoptable as possible.

MoFo (as we affectionately call that part of the Mozilla organization) has a few main ways in which we’re hoping to change the world — some of those are programs, like Open News and the Science Lab, some are products. In a program, the change we’re hoping to effect happens by connecting brains together, either through fellowship programs, events, conferences, things like that. That work is the stuff of movement-building, and it’s fascinating to watch my very skilled colleagues at work — there is a distinctive talent required to attract autonomous humans to a project, get them excited about both what you’re doing and what they could do, and empower them to help themselves and others.

Alongside these programmatic approaches, MoFo has for a while been building software whose use is itself impactful.  Just like getting people to use Firefox was critical to opening up the web, we believe that using products like the Webmaker tools or BadgeKit will have direct impact and help create the internet the world needs.

And that’s where I come in!  Over the last few years, various smart people have kept labeling me a “product person”, and I’ve only recently started to understand what they meant, and that indeed, they are right — “product” (although the word is loaded with problematic connotations) is central for me.

I’ll write a lot more about that over the coming months, but the short version is that I am particularly fascinated by the process that converts an idea or a pile of code into something that intelligent humans choose to use and love to use.  That translation to me is attractive because it requires a variety of types of thinking: business modeling, design, consumer psychology, and creative application of technology.  It is also compelling to me in three other aspects: it is subversive, it is humane, and it is required for impact.

It is subversive because I think if we do things right, we use the insights from billions of dollars worth of work by “greedy, evil, capitalist corporations” who have figured out how to get “eyeballs” to drive profit and repurpose those techniques for public benefit — to make it easy for people to learn what they want to learn, to allow people to connect with each other, to amplify the positive that emerges when people create.  It is humane because I have never seen a great product emerge from teams that treat people as hyper-specialized workers, without recognizing the power of complex brains who are allowed to work creatively together.  And it is required for impact because software in a repo or an idea in a notebook can be beautiful, but is inert.  To get code or an idea to change the world, we need multitudes to use it; and the best way I know to get people to use software is to apply product thinking, and make something people love.

I am thrilled to say that I have as much to learn as I have to teach, and I hope to do much of both in public.  I know I’ll learn a lot from my colleagues, but I’m hoping I’ll also get to learn on this blog.

I’m looking forward to this new phase, it fits my brain.

March 05, 2014 12:09 AM

March 03, 2014

Instantbird

Pardon the Interruption! Instantbird nightly builds are back!

As of today, March 3rd, 2014, Instantbird nightly builds (1.6a1pre) are being built again. We last had nightly builds on January 9th, 2014, and they have been broken since then due to a series of large infrastructure changes we’ve been going through to merge the Instantbird Bugzilla and code repository with Mozilla’s. Unfortunately, getting nightly builds working again took us longer than expected as it involved many related issues: updating Instantbird to work with newer versions of Mozilla, reconfiguring our buildbot and working on getting libpurple to build as an extension.

The result of this is that Instantbird is now building out of the “comm-central” code repository (the same place the code for Thunderbird is stored). What does this mean for you?

Again, sorry for any interruption. Regular development should be continuing now. Thanks for all the concerned emails telling us our builds had stopped!

March 03, 2014 06:21 PM

March 01, 2014

Ludovic Hirlimann

Fixing email ...

A recent Twitter thread about innovation in email annoyed me:

That got me annoyed because I always hear that we need to innovate with email, and Gmail is always taken as the reference in terms of email innovation.

But if you look at it, Gmail brought the following things to the world:

  1. unlimited “brokenish IMAP”
  2. DKIM/SPF to fight spammers
  3. Good/very good spam filters (using the sender trust level)
  4. a nice new UI for webmail

And that’s it. I’ve been involved with a MUA long enough now that I think email can’t be fixed by ‘just’ UI work or client work. Some things need to happen at the spec level and be implemented on both client and servers.

Unfortunately email has been around for 30ish years now and has been working for all that time - so fixing and simplifying can’t happen overnight.

Let’s make a list of what needs to be fixed client side first:

On the server side there are way more things that need to be fixed:

  1. SMTP needs to be replaced with something that would reduce spam and preserve the identity of the sender (things done today with tools like DKIM/SPF/OpenPGP).
  2. Mailing lists need to be rethought in the way you interact with them - e.g. digest mode / subscribe / unsubscribe / vacation / cross-posting.
  3. Rich format email

Of course the last thing to fix is users: why oh why on earth do I get a Microsoft Word document as an attachment when I could have gotten the content directly in the email itself?

March 01, 2014 10:56 AM

February 27, 2014

Robert Kaiser

Preserving Software: Emulators

It's been a while since I wrote a post here, and even longer since I wrote about preserving software. But there's two more topics I have on my list to write about the event I attended last May. This is one of them.

One problem for preserving software is that the original hardware that the software ran on might not survive very long. Some people are still keeping old machines like the C64, Apple ][ and others running, but at some point there won't be many left as the original ones wear out or get damaged, and other hardware might not be usable any more even at this point. And for sure, those machines are not available broadly to the public. Ideally, we'd have the hardware and recreate the full experience, e.g. how you connected the machine to your own TV in the living room and played or worked with it there - but that is pretty unlikely or at least hard to do, esp. with the hardware being less and less available, as I mentioned.

But there's one way to bring at least part of the experience to users: We can emulate the old machines and let the preserved software run within that emulator. That doesn't give us the living-room-TV experience, but there's a better chance in both preserving that way of running the old pieces of software for a long time and making the experience broadly available. Now, it's not always easy to get emulators running well, but there are a number of projects out there, and we heard about a few interesting solutions in the preserving software event at the LoC, but one was particularly appealing to us as Mozillians.



I blogged about The Internet Archive (archive.org) and Jason Scott already some time ago, and it was he who mentioned this very appealing kind of emulator called JSMESS. What hides behind that name is the multi-platform MESS emulator, cross-compiled into JavaScript via Emscripten, a project that should be well-known here at Mozilla. :)



Since the event in May, a lot of work has been flowing into JSMESS, and as Jason has blogged about, there are a thousand cartridges available now in the Historical Software Collection of The Internet Archive, and performance is pretty decent within the browser now.

With that, a whole lot of old software is available for everyone, at any time, to try and experience within their own browser!

That's a powerful way to preserve software for the current world and upcoming generations, isn't it?

February 27, 2014 01:40 AM

February 10, 2014

Ludovic Hirlimann

PGP Key Signing Party in London, March 25th 2014

On March 25th, 2014, I’m organizing a PGP key signing party in the Mozilla London office after working hours.

If you value privacy and email, you should probably attend. In order to organize the meeting I need to know the number of participants; for that, I’m requesting that people register using Eventbrite.

For those who don’t know how PGP key signing parties work, you should read this: http://sietch-tabr.tumblr.com/post/61392468398/pgp-key-signing-party-at-mozilla-summit-2013-details.

February 10, 2014 01:59 PM

February 09, 2014

Rumbling Edge - Thunderbird

2014-02-09 Calendar builds

Common (excluding Website bugs)-specific: (6)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

February 09, 2014 11:55 PM

2014-02-09 Thunderbird comm-central builds

Thunderbird-specific: (85)

MailNews Core-specific: (32)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

February 09, 2014 11:53 PM

February 04, 2014

Ludovic Hirlimann

Fosdem 2014 report

This year I attended a full FOSDEM, after missing it last year and only doing half of FOSDEM the two years before. This year I attended more than a couple of talks outside of the Mozilla room. I also helped set up the Mozilla booth on the first day.

Mozilla booth being set up at FOSDEM

I attended the keynote, “How we found 10 million errors in Wikipedia” - which turned out to be only 1 million once false positives were accounted for. Overall a great presentation of LanguageTool, a nice grammar-checking tool written in Java. They have an extension for Firefox/Thunderbird that will let you grammar check after spell checking is done. They also have a recent-changes page for Wikipedia so anyone can check grammar errors in “realtime” on Wikipedia. Last but not least, they were looking for help to provide the tool in JS instead of Java.

Next I went to the Mozilla room to get updates on Fennec and on how Google Summer of Code worked - basically, very interesting numbers.

I then listened to a talk about scaling Dovecot:

At the Dovecot talk at FOSDEM

Very, very interesting. Of course I asked what the shittiest client was. The answer was Thunderbird, because it tries to implement more. After the talk I went and asked Timo Sirainen if he could give me a list of such bugs. The answer was “search for my name in Bugzilla”. My Bugzilla fu being what it is, I was of course unable to find any bugs :(

The next talk I listened to was also about email, but Postfix this time - very, very nice and informative about spam and how to scale for spam. Full of interesting data.

My turn to talk came. You’ll find the slides here.

Sunday was less busy. I loosely listened to a talk about OTR, and to a good talk from the people at Kaltura on HTML5 and the video tag. It was very interesting because the lack of a file format standard really complicates things.

I then went to the largest PGP signing party. I will only sign 91 keys because I had to leave before the end.

Last but not least, IPv6 support was very strong on the FOSDEM network. Next year it will be v6 only.

I took a bunch of pictures with my phone as I wanted to travel light, without my heavy camera.

http://www.flickr.com/photos/lhirlimann/sets/72157640364126833/

Update: Of course someone from #tb-qa with the proper Bugzilla foo gave me the list of related Dovecot bugs.

February 04, 2014 02:33 PM

February 01, 2014

Joshua Cranmer

Why email is hard, part 5: mail headers

This post is part 5 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. This post discusses the more general problem of email headers.

Back in my first post, Ludovic kindly posted, in a comment, a link to a talk of someone else's email rant. And the best place to start this post is with a quote from that talk: "If you want to see an email programmer's face turn red, ask him about CFWS." CFWS is an acronym that stands for "comments and folded whitespace," and I can attest that the mere mention of CFWS is enough for me to start ranting. Comments in email headers are spans of text wrapped in parentheses, and the folding of whitespace refers to the ability to continue headers on multiple lines by inserting a newline before (but not in lieu of) a space.

I'll start by pointing out that there is little advantage to adding in free-form data to headers which are not going to be manually read in the vast majority of cases. In practice, I have seen comments used for only three headers on a reliable basis. One of these is the Date header, where a human-readable name of the timezone is sometimes included. The other two are the Received and Authentication-Results headers, where some debugging aids are thrown in. There would be no great loss in omitting any of this information; if information is really important, appending an X- header with that information is still a viable option (that's where most spam filtration notes get added, for example).

For this feature of questionable utility in the first place, the impact it has on parsing message headers is enormous. RFC 822 is specified in a manner that is familiar to anyone who reads language specifications: there is a low-level lexical scanning phase which feeds tokens into a secondary parsing phase. Like programming languages, comments and white space are semantically meaningless [1]. Unlike programming languages, however, comments can be nested—and therefore lexing an email header is not regular [2]. The problems of folding (a necessary evil thanks to the line length limit I keep complaining about) pale in comparison to comments, but it's extra complexity that makes machine-readability more difficult.
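To make the "not regular" point concrete: stripping comments needs a depth counter, which a regular expression cannot give you. A simplified sketch in Python (it deliberately ignores quoted strings, which a real parser must not) might look like this:

  def strip_comments(value):
      # Remove (possibly nested) RFC 5322 comments, honoring backslash escapes.
      out, depth, i = [], 0, 0
      while i < len(value):
          c = value[i]
          if c == "\\" and i + 1 < len(value):   # quoted-pair: skip the escaped char
              if depth == 0:
                  out.append(value[i:i + 2])
              i += 2
              continue
          if c == "(":
              depth += 1
          elif c == ")" and depth > 0:
              depth -= 1
          elif depth == 0:
              out.append(c)
          i += 1
      return "".join(out)

  print(strip_comments("Fri, 21 Nov 1997 09:55:06 -0600 (CST)"))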

Fortunately, RFC 2822 made a drastic change to the specification that greatly limited where CFWS could be inserted into headers. For example, in the Date header, comments are allowed only following the timezone offset (and whitespace in a few specific places); in addressing headers, CFWS is not allowed within the email address itself [3]. One unanticipated downside is that it makes reading the other RFCs that specify mail headers more difficult: any version that predates RFC 2822 uses the syntax assumptions of RFC 822 (in particular, CFWS may occur between any listed tokens), whereas RFC 2822 and its descendants all explicitly enumerate where CFWS may occur.

Beyond the issues with CFWS, though, syntax is still problematic. The separation of distinct lexing and parsing phases means that you almost see what may be a hint of uniformity which turns out to be an ephemeral illusion. For example, the header parameters defined in RFC 2045 for Content-Type and Content-Disposition set a tradition of ;-separated param=value attributes, which has been picked up by, say, the DKIM-Signature or Authentication-Results headers. Except a close look indicates that Authentication-Results allows two param=value pairs between semicolons. Another side effect was pointed out in my second post: you can't turn a generic 8-bit header into a 7-bit compatible header, since you can't tell without knowing the syntax of the header which parts can be specified as 2047 encoded-words and which ones can't.
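The Content-Type and Content-Disposition parameter syntax, at least, is uniform enough that a generic MIME parser handles it; Python's standard email module, for instance, will happily split the parameters out (the header value below is just an illustration). Authentication-Results, as noted, only looks like it follows the same grammar:

  from email import message_from_string

  msg = message_from_string(
      'Content-Type: text/plain; charset="ISO-8859-1"; format=flowed\n\n'
  )
  print(msg.get_content_type())   # text/plain
  print(msg.get_params())         # [('text/plain', ''), ('charset', 'ISO-8859-1'), ('format', 'flowed')]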

There's more to headers than their syntax, though. Email headers are structured as a somewhat-unordered list of headers; this genericity gives rise to a very large number of headers, and that's just the list of official headers. There are unofficial headers whose use is generally agreed upon, such as X-Face, X-No-Archive, or X-Priority; other unofficial headers are used for internal tracking such as Mailman's X-BeenThere or Mozilla's X-Mozilla-Status headers. Choosing how to semantically interpret these headers (or even which headers to interpret!) can therefore be extremely daunting.

Some of the headers are specified in ways that would seem surprising to most users. For example, the venerable From header can represent anywhere between 0 mailboxes [4] to an arbitrarily large number—but most clients assume that only one exists. It's also worth noting that the Sender header is (if present) a better indication of message origin as far as tracing is concerned [5], but its relative rarity likely results in filtering applications not taking it into account. The suite of Resent-* headers also experiences similar issues.
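If you want to see how your own code copes with this, a From header with several mailboxes is trivial to construct (the addresses below are made up). Python's email.utils.getaddresses parses out the full list, whereas many clients would silently keep only the first entry:

  from email.utils import getaddresses

  header = "Alice Example <alice@example.com>, Bob Example <bob@example.com>"
  print(getaddresses([header]))
  # [('Alice Example', 'alice@example.com'), ('Bob Example', 'bob@example.com')]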

Another impact of email headers is the degree to which they can be trusted. RFC 5322 gives some nice-sounding platitudes to how headers are supposed to be defined, but many of those interpretations turn out to be difficult to verify in practice. For example, Message-IDs are supposed to be globally unique, but they turn out to be extremely lousy UUIDs for emails on a local system, even if you allow for minor differences like adding trace headers [6].

More serious are the spam, phishing, etc. messages that lie as much as possible so as to be seen by end-users. Assuming that a message is hostile, the only header that can be actually guaranteed to be correct is the first Received header, which is added by the final user's mailserver [7]. Every other header, including the Date and From headers most notably, can be a complete and total lie. There's no real way to authenticate the headers or hide them from snoopers—this has critical consequences for both spam detection and email security.

There's more I could say on this topic (especially CFWS), but I don't think it's worth dwelling on. This is more of a preparatory post for the next entry in the series than a full compilation of complaints. Speaking of my next post, I don't think I'll be able to keep up my entirely-unintentional rate of posting one entry this series a month. I've exhausted the topics in email that I am intimately familiar with and thus have to move on to the ones I'm only familiar with.

[1] Some people attempt to be too zealous in following RFCs and ignore the distinction between syntax and semantics, as I complained about in part 4 when discussing the syntax of email addresses.
[2] I mean this in the theoretical sense of the definition. The proof that balanced parentheses is not a regular language is a standard exercise in use of the pumping lemma.
[3] Unless domain literals are involved. But domain literals are their own special category.
[4] Strictly speaking, the 0 value is intended to be used only when the email has been downgraded and the email address cannot be downgraded. Whether or not these will actually occur in practice is an unresolved question.
[5] Semantically speaking, Sender is the person who typed the message up and actually sent it out. From is the person who dictated the message. If the two headers would be the same, then Sender is omitted.
[6] Take a message that's cross-posted to two mailing lists. Each mailing list will generate copies of the message which end up being submitted back into the mail system and will typically avoid touching the Message-ID.
[7] Well, this assumes you trust your email provider. However, your email provider can do far worse to your messages than lie about the Received header…

February 01, 2014 03:57 AM

January 24, 2014

Joshua Cranmer

Charsets and NNTP

Recently, the question of charsets came up within the context of necessary decoder support for Thunderbird. After much hemming and hawing about how to find this out (which included a plea to the IMAP-protocol list for data), I remembered that I actually had this data. Long-time readers of this blog may recall that I did a study several years ago on the usage share of newsreaders. After that, I was motivated to take my data collection to the most extreme way possible. Instead of considering only the "official" Big-8 newsgroups, I looked at all of them on the news server I use (effectively, all but alt.binaries). Instead of relying on pulling the data from the server for the headers I needed, I grabbed all of them—the script literally runs HEAD and saves the results in a database. And instead of a month of results, I grabbed the results for the entire year of 2011. And then I sat on the data.
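The collection step is conceptually simple; a stripped-down sketch of it (the server, group and database names are placeholders, and the real script needs batching, error handling and the ability to resume) would look roughly like this, using the standard library's nntplib:

  import nntplib
  import sqlite3

  db = sqlite3.connect("headers.sqlite")
  db.execute("CREATE TABLE IF NOT EXISTS headers (msgid TEXT PRIMARY KEY, raw BLOB)")

  with nntplib.NNTP("news.example.com") as nntp:
      _, count, first, last, name = nntp.group("example.group")
      for number in range(first, last + 1):
          try:
              _, info = nntp.head(number)          # HEAD returns the raw header lines
          except nntplib.NNTPTemporaryError:
              continue                             # article expired or missing
          raw = b"\r\n".join(info.lines)
          db.execute("INSERT OR IGNORE INTO headers VALUES (?, ?)",
                     (info.message_id, raw))
  db.commit()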

After recalling Henri Sivonen's pesterings about data, I decided to see the suitability of my dataset for this task. For data management reasons, I only grabbed the data from the second half of the year (about 10 million messages). I know from memory that the quality of Python's message parser (which was used to extract data in the first place) is surprisingly poor, which introduces bias of unknown consequence to my data. Since I only extracted headers, I can't identify charsets for anything which was sent as, say, multipart/alternative (which is more common than you'd think), which introduces further systematic bias. The end result is approximately 9.6M messages that I could extract charsets from and thence do further research.

Discussions revealed one particularly surprising tidbit of information. The most popular charset not accounted for by the Encoding specification was IBM437. Henri Sivonen speculated that the cause was some crufty old NNTP client on Windows using that encoding, so I endeavored to build a correlation database to check that assumption. Using the wonderful magic of d3, I produced a heatmap comparing distributions of charsets among various user agents. Details about the visualization may be found on that page, but it does refute Henri's claim when you dig into the data (it appears to be caused by specific BBS-to-news gateways, and is mostly localized in particular BBS newsgroups).

Also found on that page are some fun discoveries of just what kind of crap people try to pass off as valid headers. Some of those User-Agents are clearly spoofs (Outlook Express and family used the X-Newsreader header, not the User-Agent header). There also appears to be a fair amount of mojibake in headers (one of them appeared to be venerable double mojibake). The charsets also have some interesting labels to them: the "big5\n" and the "(null)" illustrate that some people don't double check their code very well, and not shown are the 5 examples of people who think charset names have spaces in them. A few people appear to have mixed up POSIX locales with charsets as well.

January 24, 2014 12:53 AM

January 15, 2014

Ludovic Hirlimann

For those of you attending FOSDEM

In a few weeks it will be FOSDEM weekend. Something I’ve been attending since 2004.

This year I’d like to tell people who care about email privacy that FOSDEM has the biggest PGP key signing party in Europe. If you use PGP or GnuPG, you might want to join the party.

To do so you’ll need to register before the 30th of January and follow the detailed instructions at https://fosdem.org/2014/keysigning/ .

Update: People are sending plenty of keys; it’s going to be a great event.

January 15, 2014 10:42 AM

January 08, 2014

Meeting Notes

Thunderbird: 2014-01-07

Thunderbird meeting notes 2014-01-07
Minute taker – don’t forget to save a revision of the pad before clearing it for next use.
Please don’t forget to post on wiki.mozilla.org after the end of the meeting so that they will go public in the meeting notes blog.

Attendees

  • rkent, florian, jcranmer, Roland, JosiahOne, mkmelin

Action items from last meetings

  • [mconley] Put together an Etherpad for contributor badge categories and graphics

    • Make sure each badge has an explicit, one-sentence goal.

Friends of the tree

Current status and discussions

  • mconley still did not put together this etherpad!

  • Chat updates in brief:
    • Instantbird now uses bugzilla.mozilla.org

    • instantbird front-end code will be merged into comm-central in a folder named im/
    • GPL code (libpurple + glue code) would be in a separate repository
      • This would make it easier to develop the add-on for TB as well

Upcoming

Round Table

clokep/florian

  • Finally completed the 6th chat/ merge of IB -> c-c (bug 920801)

    • Dealt with a couple of minor bugs from this (bugs 956487 and 956767)
  • Planning to do another chat/ merge of IB -> c-c soon
  • Merged the Instantbird bugzilla into Mozilla’s bugzilla (bug 749586): sorry for the bugspam
    • Includes a “Chat core” component for code shared between Thunderbird and Instantbird

      • Need to resolve duplicates and move bugs to their proper components
  • Filed bug to merge the Instantbird UI into c-c (bug 956609)
    • Hoping to finish this merge by the end of the week

    • Will likely close the tree for a few hours once we are ready to land

jcranmer

  • Windows builds almost work on Alder (down to a packaging failure)

    • Pymake’s shell quoting is annoying

    • bug 957720 is the tracking bug for work needed to get windows builds green
  • Goal is to finish cc-rework by the end of January:
    • Still need to ensure unit test configs work in new setup

    • Unresolved OS X Universal builds failure
    • Unresolved Windows packaging failure
    • Need to land bug 957720, bug 944952
  • More talk about killing RDF, so returning to working on bug 441437
  • Had wonderful winter weather in Chicago followed by a bitterly cold weekend here in Urbana

JosiahOne (Left meeting early)

  • Did a lot of reviews, still have more to do. (Paenglab’s been knocking out bugs left and right.)

    • If you have flagged me for review, I plan to finish reviews this week.
  • Created several shared CSS files for theme with no regressions so far.
  • Landed a bunch of minor theme improvements
    • Removed the grain texture that has been causing issues. Bug 935023.
  • My plan now is actually to start doing less theme stuff and move to Cocoa and JS-based Front end.

mkmelin

  • bug 953426 expose remote content per-host privileges

  • bug 956586 get rid of FillInHTMLTooltip which is now in a binding (and bug 331772)

Question Time

Other

  • Q12014: We are (finally) moving the Knowledge Base from support.mozillamessaging.com to support.mozilla.org & from getsatisfaction to support.mozilla.org forums [roland]

January 08, 2014 04:00 AM

December 20, 2013

Instantbird

Instantbird 1.5 Released!

Instantbird 1.5 has been released: go grab your copy now! There are a ton of new features and bugs fixed for this new release, but we’d like to highlight a couple of new features below.

An exciting new feature you’ll find in Instantbird 1.5 is the New Conversation tab. It displays a list of your contacts, ordered based on how frequently and recently you’ve talked to them. Starting a conversation has never been easier! No longer will you have to open a separate window and scroll through your contact list to find a person. Just click the “+” button or press Ctrl/Cmd+T, start typing the name of the contact, and you should see your contact appear at the top of the list after typing only a few letters! You can then press enter and your conversation opens! The first time you open the tab, Instantbird will load your chat logs and learn who you talk to most often in order to offer accurate suggestions. New friends might not show up at the top immediately, but keep talking to them and they’ll reorder themselves. Don’t worry though, this ranking data is kept only on your own computer and is not transmitted or shared in any way!

Additionally, if you use Instantbird for IRC, the New Conversation tab will automatically query your servers to download the list of channels that are available to you. (This is generally known as LIST in IRC jargon.) Just like with your contacts, you can type in the name of a channel and it’ll bubble to the top of the list. Sometimes you don’t always know the channel name (that’s why you’re searching, right?): we’ve got you covered there too! Instantbird will search the channel topics in addition to channel names so you can quickly find new channels to join!

A very visible user interface improvement that was included for Instantbird 1.5 is redone tooltips that fit more into the visual style of the rest of the user interface. They should be immediately familiar to Instantbird users as they’re modeled after the conversation header! Hopefully this will help you find information quickly and easily whether conversing with your contacts or just checking their status.

For Linux users out there, we are still only offering 32-bit builds, although we hope to change that soon! If you are running a 64-bit Linux distribution, previously you’d have to install the ia32-libs (see our FAQ), but this has changed in recent versions of Ubuntu which no longer offer this package. The procedure now is to run:

sudo apt-get install libgtk2.0-0:i386 libpangox-1.0-0:i386 libpangoxft-1.0-0:i386 libidn11:i386 libglu1-mesa:i386 libxt-dev:i386 libasound-dev:i386

If you’d like to see a complete list of what’s new in Instantbird 1.5, please view the release notes.

December 20, 2013 09:12 PM

December 19, 2013

Robert Kaiser

LCARStrek and Australis

The latest version of my LCARStrek theme does not just support the latest SeaMonkey and Firefox releases. As I'm using it myself on Nightly, I'm trying to keep it working in an experimental way with that as well - and with that, a pretty huge challenge came up in the last weeks: A redesign of the Firefox interface code-named "Australis".

I blogged a month ago about how it may affect my customizations, and I have dealt with those to a good degree by now, though not quite as drastically as I thought when writing that blog post. As always, more will follow. It took me some time until I actually switched over, as I wanted to keep using my theme, but it was naturally not compatible with such a huge redesign.

But after a lot of hours of my free time in the last few weeks, I have experimental support for Australis working in LCARStrek. The new changes living together with support for pre-Australis Firefox in the same theme require quite a few hacks to have a number of styles only apply on one side or the other. But then, I have been doing theme design for long enough (about 14 years now) that I know a few tricks and could use those - thankfully, there are a few changes in attributes set on the main toolbox, for example.

There's still a lot to be done in this area to fix some details (and I see a painting issue that is triggered in the submenus of the new main menu but is probably Linux-specific and connected to transparency used in the arrowpanel), but the main things seem to work decently now. See this screenshot:
[Screenshot: LCARStrek running with Australis support]

Given that I'm using it every day, I hope starting now gives me enough experience with it that I can deliver a really decent theme when Australis finally ships, probably with Firefox 29. :)

December 19, 2013 10:43 PM

Rumbling Edge - Thunderbird

2013-12-17 Calendar builds

Common (excluding Website bugs)-specific: (14)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

December 19, 2013 09:49 AM

2013-12-17 Thunderbird comm-central builds

Thunderbird-specific: (32)

MailNews Core-specific: (10)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

December 19, 2013 09:48 AM

December 14, 2013

Mike Conley

Australis Performance Post-mortem Part 3: As Good As Our Tools

While working on the ts_paint and tpaint regressions, we didn’t just stab blindly at the source code. We had some excellent tools to help us along the way. We also MacGyver‘d a few of those tools to do things that they weren’t exactly designed to do out of the box. And in some cases, we built new tools from scratch when the existing ones couldn’t cut it.

I just thought I’d write about those.

MattN’s Spreadsheet

I already talked about this one in my earlier post, but I think it deserves a second mention. MattN has mad spreadsheet skills. Also, it turns out you can script spreadsheets on Google Docs to do some pretty magical things – like pull down a bunch of talos data, and graph it for you.

I think this spreadsheet was amazingly useful in getting a high-level view of all of the performance regressions. It also proved very, very useful in the next set of performance challenges that came along – but more on those later.

MattN’s got a blog post up about his spreadsheet that you should check out.

The Gecko Profiler

This is a must-have for Gecko hackers who are dealing with some kind of performance problem. The next time I hit something performance related, this is the first tool I’m going to reach for. We used a number of tools in this performance work, but I’m pretty sure this was the most powerful one in our arsenal.

Very simply, Gecko ships with a built-in sampling profiler, and there’s an add-on you can install to easily dump, view and share these profiles. That last bit is huge – you click a button, it uploads, and bam – you have a link you can send to someone over IRC to have them look at your profile. It’s sheer gold.

We also built some tools on top of this profiler, which I’ll go into in a few paragraphs.

You can read up on the Gecko Profiler here at the official documentation.

Homebrew Profiler

At one point, jaws built a very simple profiler for the CustomizableUI component, to give us a sense of how many times we were entering and exiting certain functions, and how much time we were spending in them.

Why did we build this? To be honest, it’s been too long and I can’t quite remember. We certainly knew about the Gecko Profiler at this point, so I imagine there was some deficiency with the profiler that we were dealing with.

My hypothesis is that this was when we were dealing strictly with the ts_paint / tpaint regression on Windows XP. Take a look at the graphs in my last post again. Notice how UX (red) and mozilla-central (green) converge at around July 1st on Ubuntu? And how OS X finally converges on t_paint around August 1st?

I haven’t included the Windows 7 and 8 platform graphs, but I’m reasonably certain that at this point, Windows XP was the last regressing platform on these tests.

And I know for a fact that we were having difficulty using the Gecko Profiler on Windows XP, due to this bug.

Basically, on Windows XP, the call tree wasn’t interleaving the Javascript and native-code calls properly, so we couldn’t trust the order of the tree, making the profile really useless. This was a serious problem, and we weren’t sure how to work around it at the time.

And so I imagine that this is what prompted jaws to write the homebrew profiler. And it worked – we were able to find sections of CustomizableUI that were causing unnecessary reflow, or taking too long doing things that could be shortcutted.

I don’t know where jaws’ homebrew profiler is – I don’t have the patch on my machine, and somehow I doubt he does too. It was a tool of necessity, and I think we moved past it once we sorted out the Windows XP stack interleaving thing.

And how did we do that, exactly?

Using the Gecko Profiler on Windows XP

jaws’ profiler got us some good data, but it was limited in scope, since it only paid attention to CustomizableUI. Thankfully, at some point, Vladan from the Perf team figured out what was going wrong with the Gecko Profiler on Windows XP, and gave us a workaround that lets us get proper profiles again. I have since updated the Gecko Profiler MDN documentation to point to that workaround.

Reflow Profiles

This is where we start getting into some really neat stuff. So while we were hacking on ts_paint and tpaint, Markus Stange from the layout team wrote a patch for Gecko to take “reflow profiles”. This is a pretty big deal – instead of telling us what code is slow, a reflow profile tells us what things take a long time to layout and paint. And, even better, it breaks it down by DOM id!

This was hugely powerful, and I really hope something like this can be built into the Gecko Profiler.

Markus’ patch can be found in this bug, but it’ll probably require de-bitrotting. If and when you apply it, you need to run Firefox with an environment variable MOZ_REFLOW_PROFILE_FILE pointing at the file you’d like the profile written out to.

Once you have that profile, you can view it on Markus’ special fork of the Gecko Profiler viewer.

This is what a reflow profile looks like:

[Screenshot: a reflow profile in the profile viewer]

I haven’t linked to one I’ve shared because reflow profiles tend to be very large – too large to upload. If you’d like to muck about with a real reflow profile, you can download one of the reflow profiles attached to this bug and upload it to Markus’ Gecko Profiler viewer.

These reflow profiles were priceless throughout all of the Australis performance work. I cannot stress that enough. They were a way for us to focus on just a facet of the work that Gecko does – layout and painting – and determine whether or not our regressions lay there. If they did, that meant that we had to find a more efficient way to paint or layout. And if the regressions didn’t show up in the reflow profiles, that was useful too – it meant we could eliminate graphics and layout from our pool of suspects.

Comparison Profiles

Profiles are great, but you know what’s even better? Comparison profiles. This is some more Markus Stange wizardry.

Here’s the idea – we know that ts_paint and tpaint have regressed on the UX branch. We can take profiles of both the UX and mozilla-central. What if we can somehow use both profiles and find out what UX is doing that’s uniquely different and uniquely slow?

Sound valuable? You’re damn right it is.

The idea goes like this – we take the “before” profile (mozilla-central), and weight all of its samples by -1. Then, we add the samples from the “after” profile (UX).

The stuff that is positive in the resulting profile is an indicator that UX is slower in that code path. The stuff that is negative means that UX is faster.
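The arithmetic behind this is tiny; the real work lives in the scripts mentioned below, but the core idea, reduced to a sketch (the stack "signatures" mapped to sample counts are an assumption of mine, not the actual profile format), is just a weighted sum:

  from collections import Counter

  def comparison_profile(before_samples, after_samples):
      # before_samples / after_samples: mapping of call-stack signature -> sample count.
      combined = Counter()
      for stack, count in before_samples.items():
          combined[stack] -= count                 # "before" weighted by -1
      for stack, count in after_samples.items():
          combined[stack] += count                 # "after" added as-is
      # Positive totals: paths where "after" (UX) spent more time; negative: faster.
      return combined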

How did we do this? Via these scripts. There’s a script in this repository called create_comparison_profile.py that does all of the work in generating the final comparison profile.

Here’s a comparison profile to look at, with mozilla-central as “before” and UX as “after”.

Now I know what you’re thinking – Mike – the root of that comparison profile is a negative number, so doesn’t that mean that UX is faster than mozilla-central?

That would seem logical based on what I’ve already told you, except that talos consistently returns the opposite opinion. And here’s where I expose some ignorance on my part – I’m simply not sure why that root node is negative when we know that UX is slower. I never got a satisfying answer to that question. I’ll update this post if I find out.

What I do know is that drilling into the high positive numbers of these comparison profiles yielded very valuable results. It allowed us to quickly determine what was uniquely slow about UX.

And in performance work, knowing is more than half the battle – knowing what’s slow is most of the battle. Fixing it is often the easy part – it’s the finding that’s hard.

Oh, and I should also point out that these scripts were able to generate comparison profiles for reflow profiles as well. Outstanding!

Profiles from Talos

Profiling locally is all well and good, but in the end, if we don’t clear the regressions on the talos hardware that run the tests, we’re still not good enough. So that means gathering profiles on the talos hardware.

So how do we do that?

Talos is not currently baked into the mozilla-central tree. Instead, there’s a file called testing/talos/talos.json that knows about a talos repository and a revision in that repository. The talos machines then pull talos from that repository, check out that revision, and execute the talos suites on the build of Firefox they’ve been given.

We were able to use this configuration to our advantage. Markus cloned the talos repository, and modified the talos tests to be able to dump out both SPS and reflow profiles into the logs of the test runs. He then pushed those changes to his user repository for talos, and then simply modified the testing/talos/talos.json file to point to his repo and the right revision.

The upshot being that Try would happily clone Markus’ talos, and we’d get profiles in the test logs on talos hardware! Brilliant!

Extracting and symbolicating those profiles would be handled by more of Markus’ scripts – see get_profiles.py.

Now we were cooking with gas – reflow and SPS profiles from the test hardware. Could it get better?

Actually, yes.

Getting the Good Stuff

When the talos tests run, the stuff we really care about is the stuff being timed. We care about how long it takes to paint the window, but not how long it takes to tear down the window. Unfortunately, things like tearing down the window get recorded in the SPS and reflow profiles, and that adds noise.

Wouldn’t it be wonderful to get samples just from the stuff we’re interested in? Just to get samples only when the talos test has its stopwatch ticking?

It’s actually easier than it sounds. As I mentioned, Markus had cloned the talos tests, and he was able to modify tpaint and ts_paint to his liking. He made it so that just as these tests started their stopwatches (waiting for the window to paint), an SPS profile marker was added to the sample taken at that point. A profile marker simply allows us to decorate a sample with a string. When the stopwatch stopped (the window has finished painting), we added another marker to the profile.

With that done, the extraction scripts simply had to exclude all samples that didn’t occur between those two markers.
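Conceptually, the filtering step is as simple as it sounds. Here is a sketch of the idea (the sample and marker representation is made up for illustration; the real logic lives in Markus’ extraction scripts):

  def samples_between_markers(samples, start_marker, end_marker):
      # Keep only the samples taken while the talos stopwatch was running.
      keep, recording = [], False
      for sample in samples:
          if sample.get("marker") == start_marker:
              recording = True
          if recording:
              keep.append(sample)
          if sample.get("marker") == end_marker:
              recording = False
      return keep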

The end result? Super concentrated profiles. It’s just the stuff we care about. Markus made it work for reflow profiles too – it was really quite brilliant.

And I think that pretty much covers it.

Lessons

So with these amazing tools we were eventually able to grind down our ts_paint and tpaint regressions into dust.

And we celebrated! We were very happy to clear those regressions. We were all clear to land!

Or so we thought. Stay tuned for Part 4.

December 14, 2013 05:00 PM

December 12, 2013

Calendar

Finally, A New Website!

This is the moment I have been waiting for so long! I can’t say this big enough because I am so excited:

Check out this awesome:

[Screenshot: the new calendar website]

The new site is clear and concise. Not too many links, all important links on one page. The download button gives you the latest version directly from addons.mozilla.org.

There are some features still lacking that we will be adding later; it was more important to just get this thing out in the open first. The holiday calendars page is still missing. Ideally the new holiday calendars area will auto-generate itself from the existing holiday files. We also wanted to show the most recent blog headlines.

The new site is part of bedrock, Mozilla's shiny new website framework. This means we get a lot of stuff for free, one of which is localization. I haven't found out how this works yet, but it will be possible to translate the page into any language.

Is there any information you are missing? Let us know what you think!

December 12, 2013 10:51 PM

Lightning 2.6.x Version Recap

As you may have read in the previous post, there have been quite a few issues with Lightning 2.6.x. I wanted to explain what happened and what we can do to avoid these issues in the future.

The Lightning build process is closely coupled with Thunderbird's. Every time Thunderbird does a release, we get builds for Lightning for free. This means we mostly depend on them doing a release; otherwise I have to patch the final builds manually, which is a little more work. Luckily, each of the releases between 2.6 and 2.6.4 has been done together with a Thunderbird build.

Google Calendar Issues via CalDAV

Just before Lightning 2.6 was released, we made some last-minute changes to accommodate the fact that Google Calendar had changed its CalDAV URL. Not only that, they also implemented a specification for faster synchronization of CalDAV. We already supported this specification, but only an older version. A quick fix was done to take care of this. In total, there were some authentication issues and an error loading calendars. We knew we had to release a 2.6.1, but we didn't know it would have to happen so fast…

Version compatibility issues

When Thunderbird 24.0.1 was released, Lightning 2.6 did not work on Linux. The reason for this was a regression in the Mozilla platform around Thunderbird 23. The binary component we ship was built with a specific compiler flag whose parameter was too strict. It basically said “this binary component is only for version 24.0.1”. The fix was easy, change it to “this binary component is for version 24.*”, but it took a while for that fix to be completed and committed to all branches. Lightning 2.6.1 was quickly released as a workaround specifically compatible with Thunderbird 24.0.1, and Lightning 2.6.2 was needed for Thunderbird 24.1.0.

Another reason this was so hard for users to figure out is that some Linux distributions decided to skip the minor releases and only ship 24.0, 24.1.0, 24.2.0 and so on. There were complaints because the latest Lightning version wasn't working when 24.1.1 was missing from the distribution repositories. We still needed to release subsequent Lightning versions though, otherwise users of the stock builds would complain.

Lightning 2.6.3: Issues with CalDAV

Unfortunately, one of the patches for 2.6.1 had an error in it. We decided there needed to be a quick fix, and it made it just in time for Lightning 2.6.3. The binary compatibility bug had been fixed by now, so this should also be the first version that is compatible with any version of Thunderbird from 24.1.1 and up.

Lightning 2.6.4: Yet another one

Now this is the release that really annoyed me. First of all, I did a bad job on one of the patches. The other one was a minor issue with servers that don’t have a certain XML element in their response. These are the kinds of issues we could have easily figured out before the release with more and better unit tests. We might have even saved another release.

Conclusion

We probably could have known about all of these issues beforehand if we had tests to catch them. Just running any of the tests using the build machinery would have caught the binary compatibility issue. If we had at least some manual tests for CalDAV servers, we could have run them against a few public demo servers and caught all of the CalDAV regressions. Both of these have been on my list for quite some time, but given all the other things coming up I never got around to them.

Integrating the tests with the build system is unfortunately something only a trusted Mozillian can do, but if you want to help us write some unit tests, that would be marvelous. The cool new things to use are promises and tasks, which allow writing really easy-to-read asynchronous code. I have some demo code that's not quite working but is ready for someone to pick up.

If you want to help in some other way, please contact me! Even if you are not a developer, there is a lot that can be done for someone with a little initiative.

 

December 12, 2013 10:24 PM

December 11, 2013

Meeting Notes

Thunderbird: 2013-12-10

Thunderbird meeting notes 2013-12-10
Minute taker – don’t forget to save a revision of the pad before clearing it for next use.
Please don’t forget to post on wiki.mozilla.org after the end of the meeting so that they will go public in the meeting notes blog.

Attendees

  • mconley, irving, jcranmer, paenglab, rolandtanglao, aceman, mkmelin, sshagarwal, javirueda

Action items from last meetings

  • none

Friends of the tree

  • nobody nominated, but I am 100% sure somebody did something awesome

Current status and discussions

  • mconley: can we create an open contributor badge for TB contributors?

    • Friends of the tree, new contributor, I lost the list because I’m not a fast typer

    • Need graphics or something
    • Roland knows how to do it, so give him the graphics that we need. Goals, graphics, and people to award the badges to.

Critical Issues

  • Bug 948555 – Core graphics fix broke us in the last merge, but jcranmer is on it – we’ve essentially lost all OS X Mozmill tests. :/

Upcoming

Round Table

mconley

  • Still no takers on my Ensemble call for help – although I’ve responded to email asking how it currently works

  • Took another chunk out of my review queue this past weekend! \o/ I’ll have another opportunity this evening or tomorrow evening.
    • Part of my review queue includes needinfo?’s from tessarakt’s questions from the last meeting

jcranmer

  • Review queue is backed up, apologies

  • Hoping to land bug 842632… when bug 948555 is fixed (:-( @ m-c bustage)
  • Alder is green on 5/8 trees!
    • OS X opt broken due to universal build + nsinstall + ldap headaches

    • Windows requires a lot more investigation to fix
  • Will have spotty internet access and work times from ~Dec 21-Dec 29, and likely to be overloaded until Jan 14 with school stuff

Paenglab (no Mic)

  • Used TB a bit with a dark Persona and found two bugs:

    • Bug 948384 – No chat icon when unreadMessages=”true” with dark Personas.

    • Bug 948568 – Use inverted chat toolbar icons with dark Personas.
  • Around Christmas, will look at updating TB’s tabs to the Fx Australis implementation.

Question Time

Other

  • Q12014: We are (finally) moving the Knowledge Base from support.mozillamessaging.com to support.mozilla.org & from getsatisfaction to support.mozilla.org forums [roland]

Action Items

  • [mconley] Put together an Etherpad for contributor badge categories and graphics

    • Make sure each badge has an explicit, 1-sentence goal.

December 11, 2013 04:00 AM

December 06, 2013

Robert Kaiser

Firefox OS DevTreff Vienna

Last month, I was contacted within a few days by a local "open mobile devices" enthusiast, a Mozilla events manager and a fellow German-speaking Mozilla Rep, all of them pointing to an event here in Vienna called Firefox OS DevTreff Austria.

While the local enthusiast just asked me if I'd go there, the Mozilla contacts had been asked by the organizers for a speaker to open up the event. We were trying to get someone more used to talking about Firefox OS, but everyone's busy this time of year, so in the end we settled on me doing this keynote.

Now, I have given presentations at various occasions and events over the last years, but I have never actually keynoted anything, so that made me somewhat nervous. The other talks lined up for the evening were about app development, partly about very concrete pieces of it, so I figured I should give that some framing and introduce people to Firefox OS, starting with why we are doing it, moving to what and where it is, and giving a bit of a glance at where we want to take it. So I came up with "Firefox OS: Reasons, Status & Plans" as the title (my slides are behind the link).

[Two photos from the event]

The audience was supposed to be about 50 people, I guess 30-35 really showed up (the pictures, taken "in style" with Firefox OS on my Peak, only show one part of the room), but those were an awesome bunch. They were really into the topic, asked interesting questions, and the talks following me were showing that we really had capable developers in the room, from those that do JS in their free time to those who earn their bread and butter by doing apps.
We also had two Mozillians, both of whom I had not met in person before, even though I have spent a lot of time in this city in the last decade!

As the event was going on, I was often the voice in the room who would have answers from the Mozilla side or could explain our point of view and initiatives - and in quite a few cases, I could loop back to something I said in my keynote. It was really great to see how apparently I had touched exactly on the right things there and gave everything else a good base to build on. Interestingly, there was quite a bit of interest in the DeviceStorage API, probably because accessing local files is something people can refer better to than storing items in-app. I was thankful someone did a talk on our Marketplace and in-app payment API/Services as that's one area I'm actually weak in, but it also sparked quite a bit of interest. The permission model did also get a few questions.

We surely had people with Firefox OS app experience in there, but I think more of those people might pick up web app development, esp. if more similar events come around, which would be cool. And maybe someone should tell them how to do simple apps without larger libraries or frameworks, and explain app manifests in more detail. I hope they will organize more of those and the chance for that will come along!

December 06, 2013 04:16 AM

December 04, 2013

Joshua Cranmer

Why email is hard, part 4: Email addresses

This post is part 4 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. This post discusses the problems with email addresses.

You might be surprised that I find email addresses difficult enough to warrant a post discussing only this single topic. However, this is a surprisingly complex topic, and one which is made much harder by the presence of a very large number of people purporting to know the answer who then proceed to do the wrong thing [1]. To understand why email addresses are complicated, and why people do the wrong thing, I pose the following challenge: write a regular expression that matches all valid email addresses and only valid email addresses. Go ahead, stop reading, and play with it for a few minutes, and then you can compare your answer with the correct answer.

 

 

 

Done yet? So, if you came up with a regular expression, you got the wrong answer. But that's because it's a trick question: I never defined what I meant by a valid email address. Still, if you're hoping for partial credit, you may be able to get some by correctly matching one of the purported definitions I give below.

The most obvious definition meant by "valid email address" is text that matches the addr-spec production of RFC 822. No regular expression can match this definition, though—and I am aware of the enormous regular expression that is often purported to solve this problem. This is because comments can be nested, which means you would need to solve the "balanced parentheses" language, which is easily provable to be non-regular [2].

Matching the addr-spec production, though, is the wrong thing to do: the production dictates the possible syntax forms an address may have, when you arguably want a more semantic interpretation. As a case in point, the two email addresses example@test.invalid and example @ test . invalid are both meant to refer to the same thing. When you ignore the actual full grammar of an email address and instead read the prose, particularly of RFC 5322 instead of RFC 822, you'll realize that matching comments and whitespace are entirely the wrong thing to do in the email address.

Here, though, we run into another problem. Email addresses are split into local-parts and the domain, the text before and after the @ character; the format of the local-part is basically either a quoted string (to escape otherwise illegal characters in a local-part), or an unquoted "dot-atom" production. The quoting is meant to be semantically invisible: "example"@test.invalid is the same email address as example@test.invalid. Normally, I would say that the use of quoted strings is an artifact of the encoding form, but given the strong appetite for aggressively "correct" email validators that attempt to blindly match the specification, it seems to me that it is better to keep the local-parts quoted if they need to be quoted. The dot-atom production matches a sequence of atoms (spans of text excluding several special characters like [ or .) separated by . characters, with no intervening spaces or comments allowed anywhere.

RFC 5322 only specifies how to unfold the syntax into a semantic value, and it does not explain how to semantically interpret the values of an email address. For that, we must turn to SMTP's definition in RFC 5321, whose semantic definition clearly imparts requirements on the format of an email address not found in RFC 5322. On domains, RFC 5321 explains that the domain is either a standard domain name [3], or it is a domain literal which is either an IPv4 or an IPv6 address. Examples of the latter two forms are test@[127.0.0.1] and test@[IPv6:::1]. But when it comes to the local-parts, RFC 5321 decides to just give up and admit no interpretation except at the final host, advising only that servers should avoid local-parts that need to be quoted. In the context of email specification, this kind of recommendation is effectively a requirement to not use such email addresses, and (by implication) most client code can avoid supporting these email addresses [4].

The prospect of internationalized domain names and email addresses throws a massive wrench into the state of affairs, however. I've talked at length in part 2 about the problems here; the lack of a definitive decision on Unicode normalization means that the future here is extremely uncertain, although RFC 6530 does implicitly advise that servers should accept that some (but not all) clients are going to do NFC or NFKC normalization on email addresses.

At this point, it should be clear that asking for a regular expression to validate email addresses is really asking the wrong question. I did it at the beginning of this post because that is how the question tends to be phrased. The real question that people should be asking is "what characters are valid in an email address?" (and more specifically, the left-hand side of the email address, since the right-hand side is obviously a domain name). The answer is simple: among the ASCII printable characters (Unicode is more difficult), all the characters but those in the following string: " \"\\[]();,@". Indeed, viewing an email address like this is exactly how HTML 5 specifies it in its definition of a format for <input type="email">.
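
To make the character-set view concrete, here is a rough sketch in Python (purely illustrative, not code from any mail client): it splits at the last @ and applies the excluded-character rule above to the local-part, leaving real domain validation as its own problem.

EXCLUDED = set(' "\\[]();,@')       # the characters ruled out of a local-part above

def plausible_address(addr):
    # Split at the last '@'; everything before it is the local-part.
    local, sep, domain = addr.rpartition('@')
    if not sep or not local or not domain:
        return False
    # Local-part: printable ASCII only (Unicode is harder), minus the excluded set.
    if not all(33 <= ord(c) <= 126 and c not in EXCLUDED for c in local):
        return False
    # Real domain validation is a separate problem; just require non-empty labels.
    return all(domain.split('.'))

print(plausible_address("example@test.invalid"))    # True
print(plausible_address("ex ample@test.invalid"))   # False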

Another, much easier, more obvious, and simpler way to validate an email address relies on zero regular expressions and zero references to specifications. Just send an email to the purported address and ask the user to click on a unique link to complete registration. After all, the most common reason to request an email address is to be able to send messages to that email address, so if mail cannot be sent to it, the email address should be considered invalid, even if it is syntactically valid.
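
A hedged sketch of that approach in Python, where the sender address, the SMTP relay (localhost) and the confirmation URL are all placeholders I made up:

import secrets
import smtplib
from email.message import EmailMessage

PENDING = {}   # token -> address awaiting confirmation (a real app would persist this)

def send_confirmation(address):
    token = secrets.token_urlsafe(32)
    PENDING[token] = address
    msg = EmailMessage()
    msg["From"] = "noreply@example.invalid"        # placeholder sender
    msg["To"] = address
    msg["Subject"] = "Confirm your address"
    msg.set_content("Click to confirm: https://example.invalid/confirm?token=" + token)
    with smtplib.SMTP("localhost") as smtp:        # assumes a local SMTP relay
        smtp.send_message(msg)

def confirm(token):
    # Called by the endpoint behind the link; returns the now-verified address.
    return PENDING.pop(token, None)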

Unfortunately, people persist in trying to write buggy email validators. Some are too simple and ignore valid characters (or valid top-level domain names!). Others are so focused on trying to match the RFC addr-spec syntax that, while they will happily accept most or all addr-spec forms, they also accept email addresses which are very likely to wreak havoc if you pass them to another system to send email; cause various forms of SQL injection, XSS injection, or even shell injection attacks; and which are likely to confuse tools as to what the email address actually is. This can be ameliorated with complicated normalization functions for email addresses, but none of the email validators I've looked at actually do this (which, again, goes to show that they're missing the point).

Which brings me to a second quiz question: are email addresses case-insensitive? If you answered no, well, you're wrong. If you answered yes, you're also wrong. The local-part, as RFC 5321 emphasizes, is not to be interpreted by anyone but the final destination MTA server. A consequence is that it does not specify if they are case-sensitive or case-insensitive, which means that general code should not assume that it is case-insensitive. Domains, of course, are case-insensitive, unless you're talking about internationalized domain names [5]. In practice, though, RFC 5321 admits that servers should make the names case-insensitive. For everyone else who uses email addresses, the effective result of this admission is that email addresses should be stored in their original case but matched case-insensitively (effectively, code should be case-preserving).
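
In code, case-preserving just means storing the original spelling while comparing on a case-folded key; a small illustrative sketch (mine, not any client's code):

class AddressBook:
    """Store addresses in their original case, match them case-insensitively."""

    def __init__(self):
        self._by_key = {}

    def add(self, address):
        # The lowered form is only a lookup key; the original spelling is what we keep.
        self._by_key.setdefault(address.lower(), address)

    def find(self, address):
        return self._by_key.get(address.lower())

book = AddressBook()
book.add("John.Doe@Example.COM")
print(book.find("john.doe@example.com"))   # -> John.Doe@Example.COM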

Hopefully this gives you a sense of why email addresses are frustrating and much more complicated than they first appear. There are historical artifacts of email addresses I've decided not to address (the roles of ! and % in addresses), but since they only matter to some SMTP implementations, I'll discuss them when I pick up SMTP in a later part (if I ever discuss them). I've avoided discussing some major issues with the specification here, because they are much better handled as part of the issues with email headers in general.

Oh, and if you were expecting regular expression answers to the challenge I gave at the beginning of the post, here are the answers I threw together for my various definitions of "valid email address." I didn't test or even try to compile any of these regular expressions (as you should have gathered, regular expressions are not what you should be using), so caveat emptor.

RFC 822 addr-spec
Impossible. Don't even try.
RFC 5322 non-obsolete addr-spec production
([^\x00-\x20()\[\]:;@\\,.]+(\.[^\x00-\x20()\[\]:;@\\,.]+)*|"(\\.|[^\\"])*")@([^\x00-\x20()\[\]:;@\\,.]+(\.[^\x00-\x20()\[\]:;@\\,.]+)*|\[(\\.|[^\\\]])*\])
RFC 5322, unquoted email address
.*@([^\x00-\x20()\[\]:;@\\,.]+(\.[^\x00-\x20()\[\]:;@\\,.]+)*|\[(\\.|[^\\\]])*\])
HTML 5's interpretation
[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*
Effective EAI-aware version
[^\x00-\x20\x80-\x9f()\[\]:;@\\,]+@[^\x00-\x20\x80-\x9f()\[\]:;@\\,]+, with the caveats that a dot does not begin or end the local-part, nor do two dots appear consecutively, the local-part is in NFC or NFKC form, and the domain is a valid domain name.

[1] If you're trying to find guides on valid email addresses, a useful way to eliminate incorrect answers are the following litmus tests. First, if the guide mentions an RFC, but does not mention RFC 5321 (or RFC 2821, in a pinch), you can generally ignore it. If the email address test (not) @ example.com would be valid, then the author has clearly not carefully read and understood the specifications. If the guide mentions RFC 5321, RFC 5322, RFC 6530, and IDN, then the author clearly has taken the time to actually understand the subject matter and their opinion can be trusted.
[2] I'm using "regular" here in the sense of theoretical regular languages. Perl-compatible regular expressions can match non-regular languages (because of backreferences), but even backreferences can't solve the problem here. It appears that newer versions support a construct which can match balanced parentheses, but I'm going to discount that because by the time you're going to start using that feature, you have at least two problems.
[3] Specifically, if you want to get really technical, the domain name is going to be routed via MX records in DNS.
[4] RFC 5321 is the specification for SMTP, and, therefore, it is only truly binding for things that talk SMTP; likewise, RFC 5322 is only binding on people who speak email headers. When I say that systems can pretend that email addresses with domain literals or quoted local-parts don't exist, I'm excluding mail clients and mail servers. If you're writing a website and you need an email address, there is no need to support email addresses which don't exist on the open, public Internet.
[5] My usual approach to seeing internationalization at this point (if you haven't gathered from the lengthy second post of this series) is to assume that the specifications assume magic where case insensitivity is desired.

December 04, 2013 11:24 PM

December 01, 2013

Mike Conley

Australis Performance Post-mortem Part 2: ts_paint and t_paint

Continued from Part 1.

So we’d just gotten Talos data in, and it looked like we were regressing on ts_paint and tpaint right across the board.

Speaking just for myself, up until this point, Talos had been a black box. I vaguely knew that Talos tests were run, and I vaguely understood that they measured certain performance things, but I didn’t know what those things were nor where to look at the results.

Luckily, I was working with some pretty seasoned veterans. MattN whipped up an amazing spreadsheet that dynamically pulled in the Talos test data for each platform so that we could get a high-level view of all of the regressions. This would turn out to be hugely useful.

Here’s a link to a read-only version of that spreadsheet in all of its majesty. Or, if that link is somehow broken in the future, here’s a screenshot:

Numbers!

So now we had a high-level view of the regressions. The next step was determining what to do about it.

I should also mention that these regressions, at this point, were the only big things blocking us from landing on mozilla-central. So naturally, a good chunk of us focused our attention on this performance stuff. We quickly organized a daily standup meeting time where we could all get together and give reports on what we were doing to grind down the performance issues, and what results we were getting from our efforts.

That chunk of team, however, didn’t initially include me. I believe Gijs, Unfocused, mikedeboer and myself kept hacking on customization and widget bugs while jaws and MattN dug at performance. As time went on though, a few more of us eventually joined MattN and jaws in their performance work.

The good news in all of this is that ts_paint and tpaint are related – both measure the time it takes from issuing the command to open a browser window to actually painting it on the screen. ts_paint is concerned with the very first Firefox window from a cold-start, and tpaint is concerned with new windows from an already-running Firefox. It was quite possible that there was some overlap in what was making us slow on these two tests, which was somewhat encouraging.

The following bugs are just a subset of the bugs we filed and landed to improve our ts_paint and tpaint performance. Looking back, I’m pretty sure these are the ones that made the most difference, but the full list can be found as dependencies of these bugs.

Bug 890105 - TabsInTitleBar._update should group measurements and style changes to avoid unnecessary reflows

After a bit of examination, MattN dealt the first blow when he filed Bug 890105. The cross-platform code that figures out how best to place the tabs in the titlebar (while taking into account things like the system font size) is run before the window first paints, and it was being inefficient.

By inefficient, I mean it was causing more reflows than necessary. Here’s some information on reflows. The MDN page states that the article is obsolete, but the page still does a pretty good job of explaining what a reflow is.

The code would take a measurement of something on the page (causing a reflow), update that thing’s size (causing a reflow), and then repeat the process. MattN found we could cluster the measurements into a single pass, and then do all of the changes one after another. This reduced the number of reflows, which helped speed up both ts_paint and tpaint.

And boom, we saw our first win for both ts_paint and tpaint!

Bug 892532 – Add an optional fast-path to CustomizableUI.isWidgetRemovable

jaws found the next big win using a home-brewed profiler. The home-brewed profiler simply counted the number of times we entered and exited various functions in the CustomizableUI code, and recorded the time it took from entering to exiting.

I can’t really recall why we didn’t use the SPS profiler at this point. We certainly knew about it, but something tells me that at this point, we were having a hard time getting useful data from it.
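
The home-brew idea is easy to reproduce in any language. Purely as an illustration (the original was ad-hoc JavaScript inside the CustomizableUI code, not this), a Python decorator that counts calls and accumulates time per function looks roughly like this:

import functools
import time
from collections import defaultdict

STATS = defaultdict(lambda: [0, 0.0])      # function name -> [call count, total seconds]

def profiled(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            entry = STATS[fn.__qualname__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper

def report():
    # Print the hottest functions first.
    for name, (calls, total) in sorted(STATS.items(), key=lambda kv: -kv[1][1]):
        print(f"{name}: {calls} calls, {total * 1000:.1f} ms total")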

Anyhow, with the home-brew profiler, jaws determined that we had the opportunity to fast-path a section of our code. Basically, we had a function that takes the ID of a widget, looks for and retrieves the widget, and returns whether or not that widget can be removed from its current location. There were some places that called this function during window start-up, and those places already had the widget that was to be found. jaws figured we could fast-path the function by being able to pass the widget itself rather than the ID, and skip the look-up.
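
The fast-path itself is just an optional shortcut: accept either the ID or the object, and only pay for the lookup when you have to. A sketch with made-up names (this is not the real CustomizableUI API):

def is_widget_removable(widget_or_id, lookup_widget):
    """Accept either the widget object or its ID; only pay for the lookup
    when the caller could not hand us the widget directly."""
    if isinstance(widget_or_id, str):
        widget = lookup_widget(widget_or_id)    # slow path: search by ID
    else:
        widget = widget_or_id                   # fast path: caller already has it
    return widget is not None and getattr(widget, "removable", True)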

Bug 891104 – Skip calling onOverflow during startup if there wasn’t any overflowed content before the toolbar is fully initialized

It was MattN’s turn again – this time, he found that the overflow toolbar code for the nav-bar (this is the stuff that handles putting widgets into the overflow panel if the window gets too small) was running the overflow handler as soon as the nav-bar was initted, regardless of whether anything was overflowed. This was causing a reflow because a measurement was taken on the overflowable toolbar to see if items needed to be moved into the overflow panel.

Originally, the automatic call of the overflow handler was to account for the case where the nav-bar is overflowed from the very beginning – but jaws made it smarter by attaching an overflow handler before the CSS attribute that made the toolbar overflowable was applied. That meant that the nav-bar would only call the overflow handler if it really needed to, as opposed to every time.

Bug 898126 – Cache client hit test values

Around this time, a few more people started to get involved in Australis performance work. Gijs and mstange got a bug filed to investigate if there was a way to make start-up faster on Windows XP and 7. Here’s some context from mstange in that bug in comment 9:

It turns out that Windows XP sends about 200 WM_NCHITTEST events per second when we open a new window. All these events have the same position – possibly the current mouse position. And all the ClientMarginHitTestPoint optimizations we’ve been playing with only make a difference because that function is called so often during the test – one invocation is unnoticeably quick, but it starts to add up if we call it so many times.

This patch makes sure that we only send one hittest event per second if the position doesn’t change, and returns a cached value otherwise.

After some fiddling about with cache invalidation times, the patch landed, and we saw a nice win on Windows XP and 7!
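
The shape of that fix is a tiny time-invalidated cache. The real change lives in the C++ widget code; this is only a rough Python sketch of the idea:

import time

class HitTestCache:
    """Return a cached hit-test result while the position is unchanged and the
    cached value is younger than max_age seconds."""

    def __init__(self, compute, max_age=1.0):
        self._compute = compute        # the expensive hit-test function
        self._max_age = max_age
        self._key, self._value, self._stamp = None, None, 0.0

    def hit_test(self, x, y):
        now = time.monotonic()
        if (x, y) == self._key and now - self._stamp < self._max_age:
            return self._value         # same position, cache still fresh
        self._key, self._value, self._stamp = (x, y), self._compute(x, y), now
        return self._value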

Bug 906075 – Only send toolbars through buildArea if they’re not in their default state

It was around now that I started to get involved with performance work. One of my first successful bugs was to only run a toolbar through CustomizableUI’s buildArea function if the toolbar was not starting in a default state. The buildArea function’s job is to populate a customizable area with only the things that the user has moved into the area, and remove the things that the user has taken out. That involves cycling through the nodes in the area to see if they belong, and that takes time. I wrote a patch that cached a “dirty” state on a toolbar to indicate that it’d been customized in the past, and if we didn’t see that value, we didn’t run the toolbar through the function. Easy as pie, and we saw a little win on both ts_paint and tpaint on all platforms.

Bug 905695 – Skip checking for tab overflows if there is only one tab open

This was another case where we had an unnecessary reflow during start-up. And, like bug 891104, it involved an overflow event handler running when it really didn’t need to. jaws writes:

If only one tab is opened and we show the left/right arrows, we are actually removing quite a bit of space that could have been used to show the tab. Scrolling the tabbox in this state is also quite useless, since all the user can do is scroll to see the other parts of the *only* tab.

If we make this change, we can skip a synchronous reflow for new windows that only have one tab.

Which means we could skip a reflow for all new windows. Are you starting to notice a pattern? Sections of our code had been designed to operate the same way, regardless of whether or not it was in the default, common case. We were finding ways of detecting the default case, and fast-pathing them.

Chalk up another win!

Bug 907787 – Australis: toolbar overflow button should be hidden by default

Yet another example where we could fast-path the default case. The overflow button in the nav-bar is only supposed to be displayed if there are too many items in the nav-bar, resulting in some getting put into the overflow panel, which anchors on the overflow button.

If nothing is being overflowed and the panel is empty, the button should not be displayed.

We were, however, displaying the button by default, and then hiding it when we determined that nothing was overflowed. Bug 907787 inverted that logic, and hid the button by default, and only showed it when things got overflowed (which was not the default case).

We were getting really close to performance parity with mozilla-central…

Bug 908326 – default the navbar to overflowable to avoid needless reflowing

Once again, an example of us not greasing the default-path. Our overflowable toolbar code applies an overflowable attribute to the nav-bar in order to apply some CSS styles to give the toolbar its overflowing properties. Adding that attribute dynamically means a reflow.

Instead, we just added the attribute to the node’s definition in browser.xul, and dropped that unnecessary reflow like a hot brick.

So how far had we come?

Let’s take a look at the graphs, shall we? Remember, in these graphs, the red points represent UX, and the green represent mozilla-central. Up is bad, and down is good. Our goal was to sink the red dots down into the noise of the green dots, which would give us performance parity.

ts_paint

Windows XP – ts_paint improvements

Ubuntu – ts_paint improvements

OSX 10.6 – ts_paint improvements

You might be wondering what that big jump is for ts_paint for OSX 10.6 at the end of the graph. This thread explains.

tpaint

Windows XP – tpaint improvements

 

Ubuntu – tpaint improvements

OSX 10.6 – tpaint improvements

Looking good.

The big lessons

I think the big lesson here is to identify the common, default case, and optimize it as best you can. By definition, this is the path that’s going to be hit the most, so you can special-case it, and build in fast paths for it. Your users will thank you.

Close the feedback loop as much as you can. To test our theories, we’d push our patches to the try server and use compare-talos to compare our tpaint and ts_paint numbers against baseline pushes to see if we were making improvements. It takes several hours for the try builds to complete, which is super slow. Release Engineering was awesome and lent us some Windows XP Talos slaves to experiment on, and that helped us close the feedback loop a lot. Don’t be afraid to ask Release Engineering for Talos slaves.

Also note that while it’s easy for me to rattle off bug numbers and explain where we were being slow, all of that investigation and progress occurred over several months. Performance work can be really slow. The bottleneck is not making the slow code faster – the bottleneck is identifying where the slow code is. Profiling is the key here. If you’re not using some kind of profiler while doing performance work, you’re seriously impeding yourself. If you don’t have a profiler, build a simple one. If you don’t know how to build a simple one, find someone who can.

I mentioned Gecko’s built-in SPS profiler a few paragraphs back. The SPS profiler was instrumental (pun intended) in getting our performance back up to snuff. We also built a number of tools alongside the SPS profiler to help us in our analyses.

Read up about those tools we built in Part 3…

December 01, 2013 08:24 PM

November 21, 2013

Mike Conley

Australis Performance Post-mortem Part 1: Where We Started

Getting to the merge

Last Monday, November 18th, Australis merged into our Nightly release channel, meaning lots of people are getting to try it and give us feedback. It’s been an exciting week, and we’re all very pleased with the response so far!

Up until then, if you wanted to try Australis, you had to use the Nightlies from the UX branch. If you followed along on the UX branch, you’ll know that the tabs and the customization work have been in a pretty steady state for the last few months.

So what was the hold up? Why did it take so long to get to the merge?

Gather round folks, I have a story to tell.

Some terminology

I’m going to be batting around a few terms here, and some people will understand them right away, and some people won’t, so I’ll just spell them out here, in no particular order:

Australis
If at this point you’re still not sure what I mean by Australis, you might want to check out this blog post and the accompanying video.
mozilla-central
mozilla-central, in this instance, refers to code that did not have the Australis changes in it. In the grand scheme of things, mozilla-central was where non-Australis code went, and then we’d merge those changes into the UX branch.
UX branch
The UX branch was where we were storing all of the Australis code.
Talos
Talos is a series of tests that we can run against a build of Firefox to measure the performance of different things – for example, how long it takes for a window to be opened. As of this writing, Talos tests for Desktop Firefox are run on Ubuntu Linux 12.04, OS X (10.6, 10.7 and 10.8), and Windows (XP, 7 and 8).

Where we started from

Let’s rewind a bunch of months. Let’s go to about early June, 2013. At this time, the curvy tab work was essentially finished on Windows, and had been ported to OS X and Linux. The customization code was still being hacked on, but we felt like we were in a pretty decent place – the team felt like we were ready to merge into mozilla-central to get some real user feedback and testing.

The problem was that up until that point, we hadn’t been running the Talos tests on the UX Branch, which means we didn’t really have a good idea about how we were performing in comparison to mozilla-central.

And then we turned the Talos tests on. Data started to flow in, and it wasn’t happy data. In particular, we were regressing pretty badly on two tests: ts_paint and tpaint.

ts_paint
this test measures how long it takes for Firefox to paint the first window on startup.
tpaint
this test measures how long it takes for Firefox to paint a newly opened window from a Firefox that is already running

Before I show you this data, I should clear some things up:  as mentioned above, we run these Talos tests on a bunch of operating systems, and a variety of operating system versions. I don’t want to bog this post down with too many charts, so I’m going to extract a chart for each operating system, and forgo breaking it down by operating system version. Suffice it to say that the regressions were pretty consistent from version to version.

Also, in each of these graphs, green represents mozilla-central, and red represents the UX branch. Up is bad (slower). Down is good (faster).

Anyhow, here’s what we saw:

ts_paint

Windows XP – ts_paint regression

Ubuntu 12.04 – ts_paint regression

OSX 10.6 – ts_paint regression

tpaint

Windows XP – tpaint regression

Ubuntu 12.04 – tpaint regression

OSX 10.6 – tpaint regression

Ouch

The team has been working like crazy to make Firefox look and feel faster. Hitting a regression like this blows.

It’s also flat out unacceptable to have a regression like this unless there’s a really really good reason for it.

So we had to investigate. What was making us slow? What had we done wrong?

Find out in Part 2.

November 21, 2013 06:13 PM

November 20, 2013

Joshua Cranmer

Why email is hard, part 3: MIME

This post is part 3 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. This post discusses MIME, the mechanism by which email evolves beyond plain text.

MIME, which stands for Multipurpose Internet Mail Extensions, is primarily dictated by a set of 5 RFCs: RFC 2045, RFC 2046, RFC 2047, RFC 2048, and RFC 2049, although RFC 2048 (which governs registration procedures for new MIME types) was updated with newer versions. RFC 2045 covers the format of related headers, as well as the format of the encodings used to convert 8-bit data into 7-bit for transmission. RFC 2046 describes the basic set of MIME types, most importantly the format of multipart/ types. RFC 2047 was discussed in my part 2 of this series, as it discusses encoding internationalized data in headers. RFC 2049 describes a set of guidelines for how to be conformant when processing MIME; as you might imagine, these are woefully inadequate for modern processing anyways. In practice, it is only the first three documents that matter for building an email client.

There are two main contributions of MIME, which actually makes it a bit hard to know what is meant when people refer to MIME in the abstract. The first contribution, which is of interest mostly to email, is the development of a tree-based representation of email which allows for the inclusion of non-textual parts to messages. This tree is ultimately how attachments and other features are incorporated. The other contribution is the development of a registry of MIME types for different types of file contents. MIME types have promulgated far beyond just the email infrastructure: if you want to describe what kind of file a binary blob is, you can refer to it by either a magic header sequence, a file extension, or a MIME type. Searching for terms like MIME libraries will sometimes refer to libraries that actually handle the so-called MIME sniffing process (guessing a MIME type from a file extension or the contents of a file).

MIME types are decomposable into two parts, a media type and a subtype. The type text/plain has a media type of text and a subtype of plain, for example. IANA maintains an official repository of MIME types. There are very few media types, and I would argue that there ought to be fewer. In practice, degradation of unknown MIME types means that there are essentially three "fundamental" types: text/plain (which represents plain, unformatted text and to which unknown text/* types degrade), multipart/mixed (the "default" version of multipart messages; more on this later), and application/octet-stream (which represents unknown, arbitrary binary data). I can understand the separation of the message media type for things which generally follow the basic format of headers+body akin to message/rfc822, although the presence of types like message/partial that don't follow the headers+body format and the requirement to downgrade to application/octet-stream mars usability here. The distinction between image, audio, video and application is petty when you consider that in practice, the distinction isn't going to be able to make clients give better recommendations for how to handle these kinds of content (which really means deciding if it can be displayed inline or if it needs to be handed off to an external client).
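
That degradation rule is simple enough to write down; a small illustrative sketch of it in Python (the "known" set stands in for whatever a given client can actually render):

def degrade(mime_type, known):
    """Map a type the client does not handle onto a fundamental fallback."""
    mime_type = mime_type.lower()
    if mime_type in known:
        return mime_type
    media = mime_type.split("/", 1)[0]
    if media == "text":
        return "text/plain"                 # unknown text/* degrades to plain text
    if media == "multipart":
        return "multipart/mixed"            # unknown multipart/* is treated as mixed
    return "application/octet-stream"       # everything else is opaque binary

KNOWN = {"text/plain", "text/html", "multipart/mixed", "image/png"}
print(degrade("text/x-vcard", KNOWN))       # text/plain
print(degrade("application/pdf", KNOWN))    # application/octet-stream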

Is there a better way to label content types than MIME types? Probably not. X.400 (remember that from my first post?) uses OIDs, in line with the rest of the OSI model, and my limited experience with other systems that use these OIDs is that they are obtuse, effectively opaque identifiers with no inherent semantic meaning. People use file extensions in practice to distinguish between different file types, but not all content types are stored in files (such as multipart/mixed), and MIME types offer a finer granularity for distinguishing content when you need to guess the type from the start of a file. My only complaints about MIME types are petty and marginal, not about the idea itself.

No, the part of MIME that I have serious complaints with is the MIME tree structure. This allows you to represent emails in arbitrarily complex structures… onto which the standard view of email as a body with associated attachments is poorly mapped. The heart of this structure is the multipart media type, for which the most important subtypes are mixed, alternative, related, signed, and encrypted. The last two types are meant for cryptographic security definitions [1], and I won't cover them further here. All multipart types have a format where the body consists of parts (each with their own headers) separated by a boundary string. There is space before the first part and after the last part, consisting of semantically meaningless text sometimes containing a message like "This is a MIME message." meant to be displayed to the now practically-non-existent crowd of people who use clients that don't support MIME.

The simplest type is multipart/mixed, which means that there is no inherent structure to the parts. Attachments to a message use this type: the type of the message is set to multipart/mixed, a body is added as (typically) the first part, and attachments are added as parts with types like image/png (for PNG images). It is also not uncommon to see multipart/mixed types that have a multipart/mixed part within them: some mailing list software attaches footers to messages by wrapping the original message inside a single part of a multipart/mixed message and then appending a text/plain footer.
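
Python's standard email package builds exactly this structure, which makes for a handy illustration (the attachment filename here is just a placeholder):

from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.invalid"
msg["To"] = "recipient@example.invalid"
msg["Subject"] = "Report attached"
msg.set_content("The report is attached.")             # the text/plain body part

with open("report.png", "rb") as f:                     # placeholder attachment
    msg.add_attachment(f.read(), maintype="image", subtype="png",
                       filename="report.png")

print(msg.get_content_type())                           # multipart/mixed
for part in msg.iter_parts():
    print(" ", part.get_content_type())                 # text/plain, then image/png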

multipart/related is intended to refer to an HTML page [2] where all of its external resources are included as additional parts. Linking all of these parts together is done by use of a cid: URL scheme. Generating and displaying these messages requires tracking down all URL references in an HTML page, which of course means that email clients that want full support for this feature also need robust HTML (and CSS!) knowledge, and future-proofing is hard. Since the primary body of this type appears first in the tree, it also makes handling this datatype in a streaming manner difficult, since the values to which URLs will be rewritten are not known until after the entire body is parsed.

In contrast, multipart/alternative is used to satisfy the plain-text-or-HTML debate by allowing one to provide a message that is either plain text or HTML [3]. It is also the third-biggest failure of the entire email infrastructure, in my opinion. The natural expectation would be that the parts should be listed in decreasing order of preference, so that streaming clients can reject all the data after it finds the part it will display. Instead, the parts are listed in increasing order of preference, which was done in order to make the plain text part be first in the list, which helps increase readability of MIME messages for those reading email without MIME-aware clients. As a result, streaming clients are unable to progressively display the contents of multipart/alternative until the entire message has been read.
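
The ordering quirk is easy to see with Python's standard email package: add_alternative appends parts, so the plain text ends up first and the most-preferred HTML part last, and a displaying client walks the parts backwards. A short illustration (the "displayable" set is my own stand-in):

from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Plain text fallback.")                              # least preferred, first
msg.add_alternative("<p><b>HTML</b> version.</p>", subtype="html")   # most preferred, last

parts = list(msg.iter_parts())
print(msg.get_content_type())                    # multipart/alternative
print([p.get_content_type() for p in parts])     # ['text/plain', 'text/html']

# A displaying client scans from the end for the first type it understands.
DISPLAYABLE = {"text/html", "text/plain"}
best = next(p for p in reversed(parts) if p.get_content_type() in DISPLAYABLE)
print(best.get_content_type())                   # text/html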

Although multipart/alternative states that all parts must contain the same contents (to varying degrees of degradation), you shouldn't be surprised to learn that this is not exactly the case. There was a period in time when spam filterers looked at only the text/plain side of things, so spammers took to putting "innocuous" messages in the text/plain half and displaying the real spam in the text/html half [4] (this technique appears to have died off a long time ago, though). In another interesting case, I received a bug report with a message containing an image/jpeg and a text/html part within a multipart/alternative [5].

To be fair, the current concept of emails as a body with a set of attachments did not exist when MIME was originally specified. The definition of multipart/parallel plays into this a lot (it means what you think it does: show all of the parts in parallel… somehow). Reading between the lines of the specification also indicates a desire to create interactive emails (via application/postscript, of course). Given that email clients have trouble even displaying HTML properly [6], and the fact that interactivity has the potential to be a walking security hole, it is not hard to see why this functionality fell by the wayside.

The final major challenge that MIME solved was how to fit arbitrary data into a 7-bit format safe for transit. The two encoding schemes they came up with were quoted-printable (which retains most printable characters, but emits non-printable characters in a =XX format, where the Xs are hex characters), and base64 which reencodes every 3 bytes into 4 ASCII characters. Non-encoded data is separated into three categories: 7-bit (which uses only ASCII characters except NUL and bare CR or LF characters), 8-bit (which uses any character but NUL, bare CR, and bare LF), and binary (where everything is possible). A further limitation is placed on all encodings but binary: every line is at most 998 bytes long, not including the terminating CRLF.
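
Both encodings ship in Python's standard library, which makes the trade-off easy to demonstrate (quoted-printable keeps mostly-ASCII text readable, base64 costs a flat one-third overhead):

import base64
import quopri

mostly_text = "Grüße aus Wien".encode("utf-8")    # mostly ASCII, a little not
binary_blob = bytes(range(256))                   # arbitrary binary data

# Quoted-printable: ASCII stays readable, other bytes become =XX escapes.
print(quopri.encodestring(mostly_text).decode("ascii"))

# Base64: every 3 input bytes become 4 output characters (344 for 256 here).
print(len(base64.b64encode(binary_blob)), "bytes encoded from", len(binary_blob))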

A side-effect of these requirements is that all attachments must be considered binary data, even if they are textual formats (like source code), as end-of-line autoconversion is now considered a major misfeature. To make matters even worse, body text written in scripts that don't use spaces (such as Japanese or Chinese) can sometimes be prohibited from using the 8-bit transfer format due to overly long lines: you can reach the end-of-line limit in as few as 249 characters (UTF-8, non-BMP characters, although Chinese and Japanese typically take three bytes per character). So a single long paragraph can force a message to be entirely encoded in a format with 33% overhead. There have been suggestions for a binary-to-8-bit encoding in the past, but no standardization effort has been made for one [7].

The binary encoding has none of these problems, but no one claims to support it. However, I suspect that violating maximum line length, or adding 8-bit characters to a quoted-printable part, are likely to make it through the mail system, in part because not doing so either increases your security vulnerabilities or requires more implementation effort. Sending lone CR or LF characters is probably fine so long as one is careful to assume that they may be treated as line breaks. Sending a NUL character I suspect could cause some issues due to lack of testing (but it also leaves room for security vulnerabilities to ignore it). In other words, binary-encoded messages probably already work to a large degree in the mail system. Which makes it extremely tempting (even for me) to ignore the specification requirements when composing messages; small wonder then that blatant violations of specifications are common.

This concludes my discussion of MIME. There are certainly many more complaints I have, but this should be sufficient to lay out why building a generic MIME-aware library by itself is hard, and why you do not want to write such a parser yourself. Too bad Thunderbird has at least two different ad-hoc parsers (not libmime or JSMime) that I can think of off the top of my head, both of which are wrong.

[1] I will be covering this in a later post, but the way that signed and encrypted data is represented in MIME actually makes it really easy to introduce flaws in cryptographic code (which, the last time I surveyed major email clients with support for cryptographic code, was done by all of them).
[2] Other types are of course possible in theory, but HTML is all anyone cares about in practice.
[3] There is also text/enriched, which was developed as a stopgap while HTML 3.2 was being developed. Its use in practice is exceedingly slim.
[4] This is one of the reasons I'm minded to make "prefer plain text" do degradation of natural HTML display instead of showing the plain text parts. Not that cleanly degrading HTML is easy.
[5] In the interests of full disclosure, the image/jpeg was actually a PNG image and the HTML claimed to be 7-bit UTF-8 but was actually 8-bit, and it contained a Unicode homograph attack.
[6] Of the major clients, Outlook uses Word's HTML rendering engine, which I recall once reading as being roughly equivalent to IE 5.5 in capability. Webmail is forced to do their own sanitization and sandboxing, and the output leaves something to be desired; Gmail is the worst offender here, stripping out all but inline style. Thunderbird and SeaMonkey are nearly alone in using a high-quality layout engine: you can even send a <video> in an email to Thunderbird and have it work properly. :-)
[7] There is yEnc. Its mere existence does contradict several claims (for example, that adding new transfer encodings is infeasible due to install base of software), but it was developed for a slightly different purpose. Some implementation details are hostile to MIME, and although it has been discussed to death on the relevant mailing list several times, no draft was ever made that would integrate it into MIME properly.

November 20, 2013 07:54 PM

November 18, 2013

Robert Kaiser

Bye, Bye, My Customizations - Hello New Ones?

Australis is landing on Nightly and therefore in my builds, and so it's time to review my customizations.
The illustrations below are from my pre-Australis state (though with my custom LCARStrek theme removed to make it easier to see what the customizations are). I still need to figure out the post-Australis state fully.

[Screenshot: bookmarks toolbar setup]

For one thing, I like having everything bookmark-like bundled in the bookmarks toolbar. That starts with the home button (that goes to my custom start page), the bookmarks menubutton, and then quick-access bookmark items like a menu with Bugzilla query live feeds and frequently visited sites. All those items, including the home and bookmarks buttons, have the same layout of an icon with text next to it and the same small height. I'm using the home button quite frequently to get to that site, same for the bookmark items, and I'm using the bookmarks list in the menubutton for retrieving sites I rarely visit (so they're often falling off the awesomebar) but want to go to every now and then (say, those sites where I buy karaoke songs every few months - I might not remember the exact name but I know where to find them in my bookmarks hierarchy). Australis removes the text from the buttons and actually makes them larger in height (which would make the whole toolbar higher), so I cannot place them in that place any more without breaking design. I might remove the home button completely and replace it with just a bookmark item pointing to the page, which should do the same job nicely (though the home button might not actually increase height, I need to test that a bit more). For the bookmarks button, I'm not sure. I don't feel like I want it at the right of any bar where it's far away from the bookmarks, and I don't want the size of the bookmarks bar to grow. Maybe I'll also hide it away completely, possibly place it in the "Hamburger" menu, as I don't use the starring feature too much anyhow, and move my whole bookmarks hierarchy to a folder on the bookmarks bar. That might seem strange in the logic of bookmark hierarchies, but it should do the job.

[Screenshot: tab bar and navigation bar setup]

The throbber button at the right of my tab bar and the "Nightly" button on its left need to go as well, so I'll get rid of the former (even though it's an old friend from days gone by - but it's not even available for customization any more), and the latter is being morphed into the "Hamburger" button at the right of the navigation bar, its default location. While we're at the right of the navigation bar, I might think about removing the search field, actually. I find it annoying that it reminds me for hours of the last search I did because its content never goes away, and I do my Google searches from the location bar anyhow - and other searches by first going to the respective site and searching from there. Maybe the old "Search Tabs" idea comes along one time and gives me nicer ways to use alternate search engines.

[Screenshot: add-ons bar setup]

And then, there's add-ons. I have/had the Sync, Lightbeam and Diaspora EasyShare buttons as well as the MemChaser display in my add-ons bar. The Sync button only shows if Sync is in progress and lets me trigger a sync intentionally; I will probably remove it as I can easily live without it. Diaspora EasyShare might move to the bookmarks or even navigation bar; I'm not using it a lot, but it actually looks very handy and I do use Diaspora quite a bit as my only social network. MemChaser will need to go away, unfortunately. It's nice to look at those numbers every now and then, esp. when problems might occur, but there's just no place in Australis where I can place a large and constantly updating thing like that without constant distraction. The bottom border of the screen/window was perfect for that. But OK, I'll probably end up removing or at least disabling the add-on completely. Lightbeam will either go away or move to the "Hamburger" menu - given how rarely I use it, it has no place at the top of my window in primary UI. While I'm at it, it might make sense to go through my add-ons list and do some cleanup there in general.
And once I get a feeling if I can do my work with the new "Hamburger" menu or not, I will see if I'll need to turn on the menu bar again or leave it hidden, like I've had it for a while now - but first I want to see how well the new stuff works for me so I can really evaluate it.
Unfortunately we don't have the capability in desktop FHR to track add-on behavior changes or uninstalls that might come with Australis (mobile has a new format that can track that), otherwise it would be interesting to see if it's just me or if others are disabling or uninstalling add-ons as well with that change.

What changes will this mean for you? Are you able to cope (like me), excited about them, or distracted by them?

November 18, 2013 07:19 PM

November 17, 2013

Rumbling Edge - Thunderbird

2013-11-16 Calendar builds

Common (excluding Website bugs)-specific: (14)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

November 17, 2013 02:05 AM

2013-11-16 Thunderbird comm-central builds

Thunderbird-specific: (37)

MailNews Core-specific: (24)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

November 17, 2013 02:04 AM

November 07, 2013

Robert Kaiser

Internet Archive Fire: Donate to Rebuild

I just got word that a fire destroyed the Internet Archive Scanning Center in San Francisco.



I blogged about what the archive has and can do a few months ago, and I will probably mention it again when I get to more posts on preserving software.

I think it's in the best interest of everyone, esp. us as Mozillians, to keep this organization going and to keep the history of the Internet, and more, openly available to current and future generations.

Please help them to rebuild and continue on their way and make a Donation. I will for sure.

November 07, 2013 06:04 PM

November 01, 2013

Robert Kaiser

Badges for Stability Contributions?

Various groups at Mozilla have been thinking about awarding Open Badges to contributors in the last year or so (there's even a contributor badges site for easily awarding them). I again and again wonder if we could do that for stability contributions as well, but it's a tough nut to crack.



One idea would be to award badges automatically to people who leave their email in a crash report (i.e. a badge for taking part in our efforts by delivering data through crash reports) - but that would probably end up being a bad badge because you'd award people for crashing Firefox (and not award people who don't see crashes). We really do not want people to search for ways to crash Firefox just so they can get a badge - after all, our mission is to avoid crashes, not provoke them. So, no badges for submitting crashes (thanks to Benjamin for pointing me to that issue right away when I was rolling that idea around).

So, how can we potentially award a badge for Stability/"CrashKill" work without rewarding bad behavior?

One thing I could think of was "filed x crash bugs that ended up fixed". That should be relatively easy to get out of Bugzilla data and I think there's no doubt that filing bugs that end up with a fixed resolution is a good thing.
This idea would also open up the avenue for different badges for different numbers of such bugs (similar to the webdev badges), or to create a badge for developers who fixed x crash bugs, and similar things.
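
If someone wanted to prototype that, a query against the Bugzilla REST search API would probably do; a hedged sketch in Python, assuming the standard /rest/bug search parameters (creator, keywords, resolution):

import json
from urllib.parse import urlencode
from urllib.request import urlopen

def fixed_crash_bugs_filed_by(email):
    """Count crash-keyword bugs filed by `email` that ended up RESOLVED FIXED."""
    params = urlencode({
        "creator": email,
        "keywords": "crash",
        "keywords_type": "allwords",
        "resolution": "FIXED",
        "include_fields": "id",
    })
    with urlopen("https://bugzilla.mozilla.org/rest/bug?" + params) as resp:
        return len(json.load(resp)["bugs"])

# e.g. award a badge once fixed_crash_bugs_filed_by(address) reaches some threshold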

What do you think? Which ideas do you have for awarding badges for contributions to the Mozilla Stability program?

November 01, 2013 05:14 PM

October 30, 2013

Meeting Notes

Thunderbird: 2013-10-29

Remember to use headphones and mute yourself when not talking

Feel free to ask questions in the meeting either by speaking up or by asking them in #maildev on IRC.

Other ways to get in touch with us can be found on our communications page

Meeting Changes

Attendees

aceman, clokep, jcranmer, JosiahOne, sshagarwal, mconley, Paenglab, Fallen

Action items from last meetings

  • [Standard8] Are WADA, Aryx, Chiaki Ishikawa, JosiahOne and jcranmer from previous nomination covered already for swag?

    • Standard8 is on it, but needs to dig through some info in his inbox.
  • [mconley]
    • Thunderbird usage data isn’t public. Standard8 was tasked with getting that information released. What’s the status on that?

      • Metrics is still needinfo’d on getting non-staff community added to usage metrics report emails

Critical Issues

  • [rkent] Binary add-on issue with TB 24?

    • jcranmer and irving suggest speaking with Fallen – they suspect this is fixed.

    • The builds that came out for 24.1 timeframe release did not fix the bug that rkent is talking about – but extra numbers have been added to AMO to mark compatibility (you’ll need to do rebuilds to side-step the compat issue).

Round Table

  • jcranmer

    • Working on converting JSMime into a separate repository (thanks to James Burke for being patient with me and my packaging issues!)

      • The separate repository will be JSMime 0.2, as I have decided to rework the API to be more idiot-proof

      • Tests are now up-to-date for comm-central, need to port tests for RFC 2047 and address header encoding/decoding
      • Not announcing this more widely public yet, since the conversion process is very messy and I want to embed reviews from the start.
    • Alder is now reserved for the final stages of ccrework, and I should be in contact with releng on getting this done.
    • Some minor build-engineering work with moz.build conversions or ontology
    • Still awaiting review from Neil.
  • JosiahOne (No outgoing audio)
    • Been busy with exams last week, so I didn’t accomplish as much as I wished.

    • I have started re-organizing the directory/file/image structure inside themes/.
    • I plan to create a shared themes directory similar to Firefox unless the build peers/owners are against it.
    • Waiting on review from Mike Conley for the animated thunderbird tabs.
  • clokep
    • Still working on the IB->c-c chat/ update, waiting on reviews from Florian now.

    • Was at the GSoC Mentors Summit ([1])
      • A lot of people told me they were happy we were still working on Thunderbird!

      • A lot of people thought Thunderbird had been killed -> more on this in my blog post
  • mconley
    • Patch to allow us to use nsIPermissionManager to store mail content preferences is up for review. \o/

    • I wrote some mail to tb-planning about the AMO to Marketplace move, and what that means for Thunderbird (and SeaMonkey).
    • Did more reviews, but they’re starting to stack up again
  • Paenglab
    • Waiting for feedback from mconley for bug 925746

Question Time

  • [clokep] Do we still need to use tb-planning? I’m STILL moderated on it after an insane amount of time using it.

    • Can we switch back to using m.d.a.thunderbird? If we have issues with “idiots”, we can always moderate it. (I want my nntp back.)

    • Alternately, can we freely give out posting privileges to people.

Action Items

  • [mconley] Ask jorgev if Marketplace merge would accept patches to make Thunderbird work there…

  • [mconley] Can we pull jb into this AMO discussion? Can we get an advocate higher up the Mozilla-chain? jb+gerv?
  • [mconley] Talk to bwinton and see if we can get “the regulars” whitelisted on tb-planning.

Thunderbird Meeting Details :

October 30, 2013 04:00 AM

October 23, 2013

Kent James

Real Men do Build Engineering (A Real Threat to Thunderbird)

When I was first getting involved in Thunderbird, I recall reading a post from early leader Scott MacGregor that puzzled me. When the project was essentially a two-person project, he said that the next person they needed was a build engineer. I had always thought of that as a backwater for people who couldn’t do real coding.

How my thinking has changed! As we look to the future of the Thunderbird project, it is clear that the main threat at the moment is losing control of the build process. These days, it takes a PhD candidate in computer science (Joshua Cranmer) to understand building Thunderbird, and keep it building. I’m not sure we are winning the battle, though  Joshua assures me that he and Standard8 have a Grand Plan to make it all better.

As for me, I’ve committed myself to improving my record in doing code reviews, which so far has been dismal. I’d like to give 20% of my time back to the core project, and a significant fraction of that should be doing code reviews. But I can’t keep comm-central compiling in 8 hours per week, which makes it very hard to do reviews.

<rant>It seems there is some project over on mozilla-central to replace the build process with some python-based thingy. “Just do mach (insert short command)” is the answer to everything in mozilla-central world. If it works, great. If it does not work (and it never works on comm-central), then you have to call your PhD-candidate build expert to make any sense of the problem. In the old days, when standard unix tools were used, you at least had a hope of googling your problem, and getting some hint of the issue. Now you have to pore over multiple files of python with multiple levels of abstracted variables, none of which I understand.</rant>

Here is today’s typical error:

Reticulating splines...
Traceback (most recent call last):
  File "./config.status", line 881, in <module>
    config_status(**args)
  File "c:\tb\1-central\src\mozilla\build\ConfigStatus.py", line 126, in config_status
    summary = backend.consume(definitions)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\base.py", line 194, in consume
    self.consume_object(obj)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\recursivemake.py", line 333, in consume_object
    CommonBackend.consume_object(self, obj)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\common.py", line 87, in consume_object
    self._test_manager.add(test, flavor=obj.flavor)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\common.py", line 68, in add
    assert path.startswith(self.topsrcdir)
AssertionError
configure: error: c:/tb/1-central/src/mozilla/configure failed for mozilla
*** Fix above errors and then restart with               "c:/mozilla-build/python/python.exe c:/tb/1-central/src/mozilla
/build/pymake/pymake/../make.py -f client.mk build"

So, are there any Real Men out there (or even Real Women) that want to be a hero, and save our project? Or am I going to have to become a Python build expert to do anything at all in Thunderbird?

 

October 23, 2013 06:09 PM