Planet Thunderbird

April 24, 2014

Rumbling Edge - Thunderbird

2014-04-23 Calendar builds

Common (excluding Website bugs)-specific: (34)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

April 24, 2014 07:00 AM

2014-04-23 Thunderbird comm-central builds

Thunderbird-specific: (37)

MailNews Core-specific: (18)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

April 24, 2014 06:59 AM

April 23, 2014

Calendar

The 2014 Google Summer of Code Students are here!

I am particularly excited to announce that this year the Calendar Project has received two slots in Google Summer of Code 2014. Both projects target our backend code. This means users won’t have a chance to complain about user interface changes and instead will be blown away by performance and interoperability improvements.

I would like to take a few minutes to introduce our awesome new students to the community. Please join me in giving them a warm welcome!

Reid Anderson: Improve Calendar Provider Backends

This project is about performance and stability for our calendar storage. Here is what Reid has to say:

I have been a student at the University of Minnesota since 2012 studying Chemistry and Computer Science. Outside of the classroom, I spend a lot of my time both watching and playing a variety of sports. I also enjoy reading, talking to friends, or playing a quick game of Civilization. I heard about Google Summer of Code before I entered college, and participating in the program had always been a goal of mine.

I was introduced to the Mozilla community when I started submitting patches to Songbird, a desktop media manager built on the Mozilla framework. Throughout the entire community I saw a consistent message of an open web powered by open technology and open software. This is something that I am excited to be a part of, and I am looking forward to contributing. My project is to improve the cached mode for online calendar providers to the point where it can be used as the default setting. This should allow Lightning to function effectively in an offline environment, while also bringing significant performance improvements. Hopefully these will be useful contributions to the community, and I’m looking forward to getting started.

Isn’t that wonderful? I’m particularly excited about the performance improvements this will bring.

Malintha Fernando: Update Invitations to the latest Specification

This project will not only improve our invitations support, it will also guard us from future regressions through more tests. Here is an excerpt from Malintha’s blog post:

My name is Malintha Fernando and I am a student developer from Sri Lanka, currently studying at University of Moratuwa. I started contributing to Mozilla some months back (Still got a lot to learn) as my first contribution in open source and glad to be a part of the Lightning project in GSoC 2014.

The objective of this project is to improve Lightning’s scheduling system by updating the available features to the latest RFC specifications. As we know, most of Lightning’s implementation was done referring to draft version 4 of RFC 6638, so some features lag behind the final RFC document.

Do you remember the mess we had when 2.6.x was released? At least one of the bugs we had to fix quickly was a regression in the invitations code. With Malintha’s help this won’t happen again!

What’s next?

According to Google, we are currently in the “Community Bonding Period”. This means we have a little time to set things up and make preparations. Coding officially begins on May 19th. You can follow progress on the projects as mentioned above; I will also blog about major updates here as we get closer to completion. Let’s have some fun with this and continue to make Lightning better. It’s about time!

April 23, 2014 07:46 PM

April 22, 2014

Jonathan Protzenko

Shutting down my (now redundant) comm-central clone on github

This is just a public service announcement: I'm shutting down the comm-central repository on github I've been maintaining since 2011. Please use the official version: they share the same hashes so it's just a matter of modifying the url in your .git/config. Plus, the official repo has all the cool tags baked in.

(Turns out that due to yet-another hg repository corruption, the repo has been broken since early March. Since no one complained, I suspect no one's really using this anymore. I'm pretty sure the folks at Mozilla are much better at avoiding hg repository corruptions than I am, so here's the end of it!).

April 22, 2014 07:33 AM

April 21, 2014

Instantbird

Google Summer of Code 2014 has Commenced!

Instantbird again has the pleasure of participating in Google Summer of Code under the Mozilla umbrella. In the past we’ve had a variety of exciting projects and this year is no different. Three students will be working with us this summer:

Saurabh Anand (sawrubh), mentored by Patrick Cloke (clokep), will aim to add support for reliable file transfer to Instantbird using FileLink as a fallback to standard file transfer.

Mayank Kumar (mayanktg), mentored by Benedikt P. (Mic), will be adding voice and video support to Instantbird by integrating WebRTC for XMPP. WebRTC makes it easy for us to have real-time communication without the use of additional plugins.

Nihanth Subramanya (nhnt11), who last year added the “awesometab”, will be looking to improve loading of conversations and history under the guidance of aleth. He will work on adding the ability to search across all logs of a contact and loading the previous context of a conversation when scrolling (“infinite scroll”).

Please feel free to stop by #instantbird on irc.mozilla.org to say hello and congratulate our students! Thanks again to Mozilla for allowing us to participate in Google Summer of Code with them!

April 21, 2014 07:00 PM

April 07, 2014

Mike Conley

Much Ado About Brendan (or As I’ve Seen It)

Since Brendan Eich’s resignation, I’ve been struggling to articulate what I think and feel about the matter. It’s been difficult. I haven’t been able to find what I wanted to say. Many other better, smarter, and more qualified Mozillians have written things about this, and I was about to let it go. I didn’t just want to say “me too”.

I felt I had nothing of substance to contribute. I feebly wrote something about Brendan Eich and the Kobayashi Maru, but it became a rambling mess, and the analogy fell apart quite quickly. I was about to call it quits on contributing my thoughts.

And then this post happened.

Don’t ask me where this came from. A muse woke me up in the night to write it (it’s just past 4AM for crying out loud – muse, let me sleep). Maybe through the lens of this nonsense, some real sense will prevail. I’m not hopeful, but this muse is nodding emphatically (and grinning like a lunatic).

Please believe that I’m not at all trying to trivialize, oversimplify, or make light of the events of the past few weeks by writing this. I’m just trying to understand it, and view it with a looking glass I have at least a little familiarity with.

And maybe it’s mostly catharsis.

I also apologize that it’s not really told like a story from the Bard. I think that’d be too long winded (no offense, Shakey). I’m pretty sure the narrator / stage directions have the most lines. It’s actually quite criminal.

I also want to point out that the only “real world names” in this little travesty are Brendan Eich’s, Jay Sullivan’s, and Mitchell Baker’s. The rest are from the world of Shakespeare.

And I also apologize that it’s not in iambic pentameter – that’d probably be more appropriate, but I have neither the wit nor the patience to pull this off with that much verisimilitude.

Oooh! Verisimilitude! Fancy words! Enough apologies, let’s get started.

Much Ado About Brendan (or As I’ve Seen It)

Prologue

Venice, Italy. Sometime during the Renaissance. This glorious city is composed of many families – the Montagues, the Capulets, the Macbeths, the MacDuffs, the Aguecheeks, the Fortinbras, the Whitmores, and many many more. Too many to name or count.

Many of these families argue and disagree about things. There’s almost always one thing that one family does or thinks that another family just cannot abide by.

It is in this turbulent city of families that we find The Merchant’s Building. The Merchants of Venice are selling their wares, lending or selling books, playing music, and much more – and people are constantly streaming in and out. It’s a marketplace of endless possibility.

In one section of The Merchant’s Building is the Mozilla booth. Mozilla does and makes many things – but it’s probably best known for its Firefox jewelry. Mozilla is one of a small number of merchants giving away jewelry – and jewelry, in this building, is special: the more people wear your jewelry, the more of a voice you have at the Merchant’s Weekly Meeting, where the rules of the building are written and refined.

So what is special about this Mozilla merchant? Why should we wear their jewelry? There are certainly other merchants giving away jewelry a few booths down. What does Mozilla bring to the table?

For one thing, the jewelry is beautiful. And it makes you walk faster. And it’s got the latest features. And it makes it harder for sketchy people to follow you. And it doesn’t have a built-in tracking device recording which merchants you’re visiting. And you can add cool charms to it, and make it look exactly how you want it.

And another thing that’s unique to the Mozilla booth is that they’re composed of members of every single family in Venice. Every single family has at least one member working in the Mozilla booth. And what’s more – a bunch of these workers are volunteering their time and efforts to make this stuff!

Why? Why do they volunteer? And why do these family members work side by side with people their families might balk at, or sneer at?

Well, in the very center of the Mozilla booth, overhanging the whole thing, is… The Mission. The Mission is the guiding principles upon which the Mozilla booth operates. This is what these family members bury their gauntlets for. They work, sweat and bleed side by side for this mission. This is their connective tissue. This is what guides them when they vote and argue for things at the Merchant’s Weekly Meeting.

The other truly unique thing about the Mozilla booth is that there are no walls to it! You can walk right in, and watch the craftspeople make jewelry! Heck, you can sit right down at a bench and somebody will show you how to make some yourself. They’ll guide you, and they’ll critique you, and soon, somebody will be wearing a piece of jewelry that you made.

The greatest debates also occur within the Mozilla booth. People stand on soap boxes and give their opinions about jewelry, or other merchandise – or merchandise practices. People say what they think out loud, and perhaps print it on a t-shirt and wear it. Sometimes, discussions get heated, but level thinking usually prevails because these Mozillians are an unusually bright bunch.

ACT I

There is a leadership selection underway. Someone needs to be the Chief of Business Affairs (or CBA) in the Mozilla booth. The current chief, Jay, has been holding the position as an interim chief, and the Board of Business Affairs is trying to select someone to take the position permanently.

Two members of this board already have their bags packed – for a while now, they’ve been neglecting other interests of theirs, and after this chief is selected, they feel they need to do other things.

Enter Brendan Eich. Brendan Eich is chief craftsperson of the makers of jewelry in the Mozilla booth. He’s a brilliant and widely respected craftsperson himself, having invented some of the amazing techniques that are used by all serious jewelry makers. He is also one of the founders of the Mozilla booth, having set it up with Mitchell Baker.

The Board of Business Affairs selects Brendan to be the next Chief of Business Affairs.

They announce this, and there is much applause! People clap Brendan on the back. Many craftspeople are pleased that one of their own will be in charge.

The two board members, as they’ve agreed to, take their bags, salute, and walk off out of the booth and on to other things.

A third board member leaves as well, but for reasons not related to what I describe below.

Suddenly, several Montagues and Montague supporters in the Mozilla booth grow concerned. They recall that several years ago, Brendan had donated $1000 to a law that supported Capulet values – a law which impacted their rights. The Montagues and Montague supporters grow concerned that someone who supports this Capulet law is not fit to be Chief of a booth that houses all of the families, Montagues included.

Several of these Montagues raise these concerns out loud. This is not unusual in the Mozilla booth, as most concerns are raised out loud – and, as usual, debate begins. Brendan states that he will 100% abide by the Mozilla participation guidelines, and what’s more, begins supporting a project that a Montague in the Mozilla booth has been working on – to bring more Montagues into the booth.

Vigorous debate continues, as is the Mozilla booth custom.

However, as the booth lets anybody in, and the debate can be heard outside of the booth, several Montagues and Montague supporters hear these concerns and start passing the message along to one another – a Capulet has been selected to be the CBA!

Many of these Montagues are reasonable, and say and write reasonable arguments about why they are concerned, and why Brendan may not be the right choice as CBA.

ACT II

A few meters away, the Cupid booth overhears all of this concern from the Montagues. Perhaps they really are Montague supporters (or, more likely, they just wanted to perk up business), but they suddenly decide to take a stand. For people who try to come into their booth wearing Firefox jewelry, they have to read a big sign that tells them about why the Cupid booth believes that restricting the rights of Montagues is terrible, and that the Mozilla booth is terrible for making a Capulet the CBA. They tell the people wearing Firefox jewelry that they should probably wear other things.

And so some people start to take off their Firefox jewelry. Some Montagues take it off angrily, and smash it into the ground – stomping it with their feet, creating a big dust cloud.

Enter Iago, and his team of writers. There are many writers and story-sellers in the Merchant’s Building, but Iago is one of those writers that just wants people to listen to him. He likes to twist words and make things up, or to insinuate things that are not true. He saw the board members leaving the Mozilla booth and concocts some headlines, insinuating that they left in protest of Brendan’s support of the Capulet laws. He also writes about how all of the Mozillians in the booth were not supporting Brendan’s appointment as CBA (which is not true – it’s true that some were concerned and questioned the wisdom of his appointment, but certainly not all). He writes and he writes, and his messengers pass copies and leaflets around. Montagues and Montague supporters read these leaflets, or hear people talking about them, and they grow very concerned. More Montagues start to take off their Firefox jewelry.

Some Montagues start to engage with Mozillians and try to figure out what is happening. As always, each family has calm and reasonable people to converse with – and that’s always welcome in the Mozilla booth.

However, every family also has their groundlings. The groundlings are the members of a family who are always looking for a fight. Always looking for blood. Always hoping an actor will forget their lines, and will shout distracting things at them to make it happen. They always have a bag of rotten fruit and vegetables with them to throw. Some of them just like to make trouble.

Every family has their groundlings. You’ve probably met some yourself.

The groundlings start to hear these rumors that Iago has been spreading around, copied and recopied, distorted and mutilated – and they see the signs at the Cupid booth.

And they rush the Mozilla booth! They start throwing rotten fruit and vegetables, and they tear off their Firefox jewelry, and swear to never wear it again! They gnash their teeth, and they rip out their own hair in a rage, and they scream and yell and make so much noise – it’s almost impossible for the craftspeople in the Mozilla booth to work!

A tempest of Montague rage was upon the Mozilla booth.

ACT III

After several hours of this, Brendan addresses the crowd outside, and speaks to some storytellers (Iago and his team are among them – he always is).

They ask him if he renounces Capulet ways, or if he will apologize for the Montague rights that were impacted by the Capulet law that he helped fund.

And Brendan says something along the lines of “I don’t think that’s helpful to discuss. I don’t think that’s relevant here. I’m not going to run this booth as if everybody in here were Capulets – I helped make this booth, I know that it’s composed of many families, and I know how it operates.”

But Iago and the groundlings were not satisfied. They put up signs and placards claiming that anybody wearing Firefox jewelry is supporting the Capulets!

The Mozillians look at all of the broken and stomped-on jewelry on the market ground. All their work, being trampled. If this continues, their ability to improve things for all families at the Merchant’s Weekly Meeting will fade. Their ability to enact their Mission will fade. They are agitated, discouraged, upset, angry, sad, anxious, confused – a cocktail of emotion playing pretty much the entire spectrum.

Brendan’s speech had not done anything to quell the groundlings. And Iago could smell blood, and was not going to stop writing about Brendan or Mozilla.

The other leaders look to Brendan. What will we do?

And Brendan said, “This noise is getting absurdly loud. How are we supposed to work under these conditions? There’s no way we can enact the mission like this.”

And Brendan steps onto the proscenium, and says:

To leave, or not to leave, that is the question—
Whether ’tis Nobler in the mind to suffer
The Slings and Arrows of outrageous Fortune,
Or to take Arms against a Sea of troubles,
And by opposing end them?

And so, after much thought, he takes arms. He sacrifices, and he chooses to leave the booth – the booth he helped plant into the ground over 15 years ago. The booth he helped build, the jewelry and techniques he helped craft.

“I think if I leave, you folks might have a chance to keep the mission going.”

And so he leaves, to the heartbreak of many Mozillians, and to the cheering of the Montague groundlings outside.

ACT IV

Several of the more sensible Montagues watch Brendan leave and wonder if perhaps the groundlings in their family have made them look petty and vindictive. Some of them are also sad that Brendan left the Mozilla booth – all they wanted from him was an apology, they say. That would have sufficed, they say. They didn’t expect or want him to leave the whole booth.

But the damage is done, and Brendan has left. There is no chief craftsperson, and there is no CBA. Holy shit.

The Mozillians in the booth start to get back to work, since the cheers of the Montagues outside are much easier to work against as a backdrop than the booing, hissing and food-throwing. A bunch of Montagues dust off their stomped Firefox jewelry (or grab new copies!) and put them back on proudly. Others are happy with the new jewelry they got, and don’t care about the Mission. Still others never took off the Firefox jewelry, but said they did. And now they wear it publicly again, proudly.

But suddenly, the Capulets and Capulet supporters in and around the Mozilla booth look at this gaping void where Brendan was and sense injustice. This was wrong, they cry! This man should not have been chased out of here!

Vigorous debate begins, as is the Mozilla booth custom.

And reasonable Capulets say and write reasonable things about why they think it was wrong for Brendan to have left.

And Iago, who never really left the area, hears all of this, and smells more blood in the air. He takes his poison pen, and writes stories about how Brendan was forcibly removed from the Mozilla booth by an angry mob of Montagues. He writes that, like Julius Caesar, Brendan was heard gasping “Et tu, Brute?” as he was stabbed by his fellow senators – or, like King Hamlet, poisoned and betrayed by the people closest to him.

But as usual, Iago gets this completely wrong. Not that he cares or bothers to check. What a douche. And LOUD too, holy smokes. And people listen to Iago, and read what he writes, and hear what he says, and the rumours abound!

And a second tempest starts to brew.

ACT V

Many reasonable Capulets, both inside and outside of the Mozilla booth, are concerned about what this means for them. Does this mean that Capulets aren’t allowed to become CBAs? That’s certainly against the inclusiveness guidelines, is it not? And much debate resonated, as is the Mozilla way.

But, as you recall, every family has their groundlings, and the Capulets are no exception. The Capulet groundlings heard the rumours that Iago and his ilk were slinging, and they gnashed their teeth, and they pulled out their hair.

“YOU KILLED BRENDAN”, the groundlings howled at the Mozilla booth.

“No, he left of his own accord to save us and the mission,” some Mozillians said with sadness.

“NO HE DIDN’T, HE WAS BETRAYED AND MURDERED BY HIS CLOSEST ALLIES!” the groundlings yelled back.

“No, that’s simply not true. He left of his own accord in an attempt to save the booth and the mission.”

And the reasonable Capulets understood this, and they understood the mindblowing complexities of this whole clusterfuck. And they spoke with reason and passion.

The Mozillian craftspeople got up from their work making jewelry to talk to these Capulets, and the supporters of the Capulets. And many were very reasonable and calm – but the groundlings among them were vicious and yelled and made so much noise. In some ways, their rage was indistinguishable from the Montague groundling rage, which I believe is some kind of irony.

And, as you’d expect, the Capulet groundlings, like all groundlings, love blood. They love a fight. And they tore off their Firefox jewelry, and they stamped it into the ground. Vegetables and rotten fruit started to be thrown at the Mozilla booth. Again.

And the Mozillians in the booth looked at each other. They looked at the gaping void where Brendan used to stand. They all hugged one another, and comforted one another, as the jeers and boos of the groundlings got louder and louder, and as rotten fruit and vegetables slammed into them and their works.

And this is where we currently are, I believe.

Epilogue

If these ramblings have offended,
Think but this, and all is mended,
That you have but slumber’d here
While these visions did appear.
And this weak and idle theme,
No more yielding but a dream,
Gentles, do not reprehend:
If you pardon, we will mend:
And, as I am an honest Mike,
I do yet miss this Brendan Eich.
Now to ’scape the serpent’s tongue,
We will make amends ere long;
Else the Mike a liar call;
So, good night unto you all.
Give me your hands, if we be friends,
And Robin shall restore amends.

For a less silly and more sober analysis of what happened, I suggest reading this next.

April 07, 2014 08:30 AM

April 06, 2014

Jonathan Protzenko

Freedom of speech in the Mozilla Community

I read this morning Andrew Truong's blog on planet:

I guess it's okay to speak out about how we truly feel when somebody resigns over a controversial topic but not to speak out during the controversy? We should ALWAYS speak out. Freedom of Speech.

The reason why I didn't speak up is, just like many other Mozillians I suspect, for fear of retribution. Seeing the attacks perpetrated on Brendan, a well-respected member of the Mozilla community, I can only imagine the torrent of hatemail that I'm going to get for publishing this blog post. (Update: so far, I didn't get any hatemail. It seems like the fears were unjustified.) The other reason is, I didn't want to further encourage a debate which I hoped would fade off after a few days. I somehow hoped we could go back to "business as usual". We were unable to, and thus I see no reason to hold back this blog post anymore.

I live in Europe. We tend to have a different opinion on these matters. But in any case, before you go any further down, I should just mention that I do support the right for anyone to marry the one they love, and I voted according to that belief for the last few elections in my country. I am a straight person, though.

I'm going to quote a few blog posts that resonate with me, and add a few comments.

Daniel wrote:

Who said "Mozilla Community"? Who said Openness? Pfffff. I've been a Mozillian for fourteen years and I'm not even sure I still recognize myself in today's Mozilla Community. Well done guys, well done. What's the next step? 100% political correctness? Is it still possible to have a legally valid personal opinion while being at Mozilla and express it in public?

From private conversations I had with other (mostly European) Mozillians, I know for certain that people have opinions that they are afraid to express in the Mozilla Community. Some people are religious, and will take great care _not_ to reveal that fact. Some people may have other beliefs that do not align with the dominant, Silicon-Valley progressive ideology. They also make sure that these are not apparent. Andrew Truong mentions freedom of speech. I believe there is freedom of speech in the Mozilla community as long as you happen to have the right opinions.

In my personal case, I fortunately happen to side with the prevalent ideology for most points, but I am now very afraid of slipping and expressing an opinion that is not considered progressive enough. I am now afraid of what is going to happen to me then: will I be kicked out? Will people call out for my name being removed from about:credits? Will people call on Twitter for my being ousted from Mozillians.org?

Ben Moskowitz wrote:

For the record, I don’t believe Brendan Eich is a bigot. He’s stubborn, not hateful. He has an opinion. It’s certainly not my opinion, but it was the opinion of 52% of people who voted on Prop 8 just six years ago, and the world is changing fast.

Again, my European point of view on the matter is simple: I interact in my daily life with many people who do not support same-sex marriage. I'm pretty sure some bosses up my hierarchy do not support that. I believe that, ultimately, there will be fewer and fewer such people, and that society will change enough that being against same-sex marriage will be a thing of the past. I also believe that as long as my bosses do their jobs, I'm fine with that. I do not care about the personal life of my president; I just care for him to run the country according to the values that his party adheres to. Same thing goes for Brendan.

Christie, who is in a much better position than I am to speak about that, says:

Certainly it would be problematic if Brendan’s behavior within Mozilla was explicitly discriminatory, or implicitly so in the form of repeated microaggressions. I haven’t personally seen this (although to be clear, I was not part of Brendan’s reporting structure until today). To the contrary, over the years I have watched Brendan be an ally in many areas and bring clarity and leadership when needed. Furthermore, I trust the oversight Mozilla has in place in the form of our chairperson, Mitchell Baker, and our board of directors.

The way people demanded a public apology reminded me of the glorious times of Soviet Russia and Communist China. What next? Should Brendan be photoshopped out of all the pictures? If we leave the matter as is, the only reasonable thing left to do is to add an extra round of interviews when hiring people: the political interview. There, we should make sure that the people we hire share the "right" political opinion. Otherwise, it seems like there is no space for them in the Mozilla Community.

Andrew Sullivan wrote:

If this is the gay rights movement today – hounding our opponents with a fanaticism more like the religious right than anyone else – then count me out. If we are about intimidating the free speech of others, we are no better than the anti-gay bullies who came before us.

While being a strong supporter of the movement, I really feel sad today: what legitimacy is there now for people who've been dragging an opponent through the mud, instead of treating them like a decent human being? Is this what Mozilla is about?

(Comments are disabled on this post.)

April 06, 2014 09:25 PM

Andrew Sutherland

webpd: a Polymer-based web UI for the beets music library manager

[Screenshot: beets webpd filtered artists list]

beets is the extensible music database tool every programmer with a music collection has dreamed of writing.  At its simplest it’s a clever tagger that can normalize your music against the MusicBrainz database and then store the results in a searchable SQLite database.  But with plugins it can fetch album art, use the Discogs music database for tagging too, calculate ReplayGain values for all your music, integrate meta-data from The Echo Nest, etc.  It even has a Music Player Daemon server-mode (bpd) and a simple HTML interface (web) that lets you search for tracks and play them in your browser using the HTML5 audio tag.

I’ve tried a lot of music players through the years (alphabetically: amarok, banshee, exaile, quodlibet, rhythmbox).  They all are great music players and (at least!) satisfy the traditional Artist/Album/Track hierarchy use-case, but when you exceed 20,000 tracks and you have a lot of compilation CDs, that frequently ends up not being enough. Extending them usually turned out to be too hard / not fun enough, although sometimes it was just a question of time and seeking greener pastures.

But enough context; if you’re reading my blog you probably are on board with the web platform being the greatest platform ever.  The notable bits of the implementation are:

[Screenshot: beets webpd madonna and morrissey]

“What’s with all those tastefully chosen colors?” is what you are probably asking yourself.  The answer?  Two things!

  1. A visualization of albums/releases in the database by time, heat-map style.
    • We bin all of the albums that beets knows about by year.  In this case we assume that 1980 is the first interesting year and so put 1979 and everything before it (including albums without a year) in the very first bin on the left.  The current year is the rightmost bucket.
    • We vertically divide the albums into “albums” (red), “singles” (green), and “compilations” (blue).  This is accomplished by taking the MusicBrainz Release Group / Types and mapping them down to our smaller space.
    • The more albums in a bin, the stronger the color.
  2. A scatter-plot using The Echo Nest’s acoustic attributes for the tracks where:
    • the x-axis is “danceability”.  Things to the left are less danceable.  Things to the right are more danceable.
    • the y-axis is “valence” which they define as “the musical positiveness conveyed by a track”.  Things near the top are sadder, things near the bottom are happier.
    • the colors are based on the type of album the track is from.  The idea was that singles tend to have remixes on them, so it’s interesting if we always see a big cluster of green remixes to the right.
    • tracks without the relevant data all end up in the upper-left corner.  There are a lot of these.  The Echo Nest is extremely generous in allowing non-commercial use of their API, but they limit you to 20 requests per minute and at this point the beets echonest plugin needs to upload (transcoded) versions of all my tracks since my music collection is apparently more esoteric than what the servers already have fingerprints for.
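The year binning behind the heat-map can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual webpd code: albums are plain dicts with `year` and `albumtype` keys, and `ROW_FOR_TYPE`, `bin_index`, `bin_albums` and `intensity` are hypothetical names made up for the example. Only the 1980 cutoff, the three rows and the "more albums, stronger color" rule come from the description above.

```python
from collections import Counter
from datetime import date

FIRST_YEAR = 1980  # first "interesting" year, as described in the post

# Hypothetical mapping from release types down to the three rows of the
# heat-map; the real mapping used by webpd is not shown in the post.
ROW_FOR_TYPE = {
    "album": "albums",
    "single": "singles",
    "ep": "singles",
    "compilation": "compilations",
}


def bin_index(year, first_year=FIRST_YEAR, current_year=None):
    """Return the column for a year: 0 holds everything before first_year
    (and albums without a year), the last column is the current year."""
    if current_year is None:
        current_year = date.today().year
    last = current_year - first_year + 1
    if not year or year < first_year:
        return 0
    return min(year - first_year + 1, last)


def bin_albums(albums, current_year=None):
    """Count albums per (row, column) cell of the heat-map."""
    counts = Counter()
    for album in albums:
        kind = (album.get("albumtype") or "").lower()
        row = ROW_FOR_TYPE.get(kind, "albums")  # assumption: unknown types count as albums
        col = bin_index(album.get("year"), current_year=current_year)
        counts[(row, col)] += 1
    return counts


def intensity(counts):
    """Scale counts to 0..1 so that fuller bins get a stronger color."""
    peak = max(counts.values(), default=0)
    return {cell: n / peak for cell, n in counts.items()} if peak else {}
```

Passing `current_year` explicitly keeps the result reproducible; leaving it out makes the rightmost bucket follow today's date.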

Together these visualizations let us infer:

Code is currently in the webpd branch of my beets fork although I should probably try and split it out into a separate repo.  You need to enable the webpd plugin like you would any other plugin for it to work.  There’s still a lot lot lot more work to be done for it to be usable, but I think it’s neat already.  It definitely works in Firefox and Chrome.

April 06, 2014 04:56 PM

April 05, 2014

Joshua Cranmer

Announcing jsmime 0.2

Previously, I've been developing JSMime as a subdirectory within comm-central. However, after discussions with other developers, I have moved the official repository of record for JSMime to its own repository, now found on GitHub. The repository has been cleaned up and the feature set for version 0.2 has been selected, so that the current tip on JSMime (also the initial version) is version 0.2. This contains the feature set I imported into Thunderbird's source code last night, which is to say support for parsing MIME messages into the MIME tree, as well as support for parsing and encoding email address headers.

Thunderbird doesn't actually use the new code quite yet (as my current tree is stuck on a mozilla-central build error, so I haven't had time to run those patches through a last minute sanity check before requesting review), but the intent is to replace the current C++ implementations of nsIMsgHeaderParser and nsIMimeConverter with JSMime instead. Once those are done, I will be moving forward with my structured header plans which more or less ought to make those interfaces obsolete.

Within JSMime itself, the pieces which I will be working on next will be rounding out the implementation of header parsing and encoding support (I have prototypes for Date headers and the infernal RFC 2231 encoding that Content-Disposition needs), as well as support for building MIME messages from their constituent parts (a feature which would be greatly appreciated in the depths of compose and import in Thunderbird). I also want to implement full IDN and EAI support, but that's hampered by the lack of a JS implementation I can use for IDN (yes, there's punycode.js, but that doesn't do StringPrep). The important task of converting the MIME tree to a list of body parts and attachments is something I do want to work on as well, but I've vacillated on the implementation here several times and I'm not sure I've found one I like yet.

JSMime, as its name implies, tries to work in as pure JS as possible, augmented with several web APIs as necessary (such as TextDecoder for charset decoding). I'm using ES6 as the base here, because it gives me several features I consider invaluable for implementing JavaScript: Promises, Map, generators, let. This means it can run on an unprivileged web page—I test JSMime using Firefox nightlies and the Firefox debugger where necessary. Unfortunately, it only really works in Firefox at the moment because V8 doesn't support many ES6 features yet (such as destructuring, which is annoying but simple enough to work around, or Map iteration, which is completely necessary for the code). I'm not opposed to changing it to make it work on Node.js or Chrome, but I don't realistically have the time to spend doing it myself; if someone else has the time, please feel free to contact me or send patches.

April 05, 2014 05:18 PM

April 03, 2014

Joshua Cranmer

If you want fast code, don't use assembly

…unless you're an expert at assembly, that is. The title of this post was obviously meant to be an attention-grabber, but it is much truer than you might think: poorly-written assembly code will probably be slower than an optimizing compiler on well-written code (note that you may need to help the compiler along for things like vectorization). Now why is this?

Modern microarchitectures are incredibly complex. A modern x86 processor will be superscalar and use some form of compilation to microcode to do that. Desktop processors will undoubtedly have multiple instruction issues per cycle, forms of register renaming, branch predictors, etc. Minor changes (a misaligned instruction stream, a poor order of instructions, a bad instruction choice) could kill the ability to take advantage of these features. There are very few people who could accurately predict the performance of a given assembly stream (I myself wouldn't attempt it if the architecture can take advantage of ILP), and these people are disproportionately likely to be working on compiler optimizations. So unless you're knowledgeable enough about assembly to help work on a compiler, you probably shouldn't be hand-coding assembly to make code faster.

To give an example to elucidate this point (and the motivation for this blog post in the first place), I was given a link to an implementation of the N-queens problem in assembly. For various reasons, I decided to use this to start building a fine-grained performance measurement system. This system uses a high-resolution monotonic clock on Linux and runs the function 1000 times to warm up caches and counters and then runs the function 1000 more times, measuring each run independently and reporting the average runtime at the end. This is a single execution of the system; 20 executions of the system were used as the baseline for a t-test to determine statistical significance as well as visual estimation of normality of data. Since the runs showed a roughly constant 1-2 μs of noise, I ran all of my numbers on the 10-queens problem to better separate the data (total runtimes ended up being in the range of 200-300μs at this level). When I say that some versions are faster, the p-values for individual measurements are on the order of 10^-20, meaning that there is a 1-in-100,000,000,000,000,000,000 chance that the observed speedups could be produced if the programs take the same amount of time to run.
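The shape of such a measurement harness can be sketched in Python. This is an illustration of the warm-up / measure / repeat structure only, not the harness used for the numbers in this post: the function names are hypothetical, and Python's own overhead would swamp microsecond-level differences.

```python
import statistics
import time


def measure(func, warmup=1000, runs=1000):
    """Return the mean runtime of func() in microseconds.

    The warm-up calls are made but not timed; each measured call is timed
    on its own with a monotonic high-resolution clock.
    """
    for _ in range(warmup):
        func()
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return statistics.fmean(samples) / 1000.0


def baseline(func, executions=20, **kwargs):
    """Repeat the whole measurement, giving one mean per execution."""
    return [measure(func, **kwargs) for _ in range(executions)]
```

The list returned by `baseline` for each variant is what would be fed into a two-sample t-test; the test itself is left out here.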

The initial assembly version of the program took about 288μs to run. The first C++ version I coded, originating from the same genesis algorithm that the author of the assembly version used, ran in 275μs. A recursive program beat out a hand-written assembly block of code... and when I manually converted the recursive program into a single loop, the runtime improved to 249μs. It wasn't until I got rid of all of the assembly in the original code that I could get the program to beat the derecursified code (at 244μs)—so it's not the vectorization that's causing the code to be slow. Intrigued, I started to analyze why the original assembly was so slow.

It turns out that there are three main things that I think cause the slow speed of the original code. The first one is alignment of branches: the assembly code contains no instructions to align basic blocks on particular branches, whereas gcc happily emits these for some basic blocks. I mention this first as it is mere conjecture; I never made an attempt to measure the effects for myself. The other two causes are directly measured from observing runtime changes as I slowly replaced the assembly with code. When I replaced the use of push and pop instructions with a global static array, the runtime improved dramatically. This suggests that the alignment of the stack could be to blame (although the stack is still 8-byte aligned when I checked via gdb), which just goes to show you how much alignments really do matter in code.

The final, and by far most dramatic, effect I saw involves the use of three assembly instructions: bsf (find the index of the lowest bit that is set), btc (clear a specific bit index), and shl (left shift). When I replaced the use of these instructions with a more complicated expression int bit = x & -x and x = x - bit, the program's speed improved dramatically. And the rationale for why the speed improved won't be found in latency tables, although those will tell you that bsf is not a 1-cycle operation. Rather, it's in minutiae that's not immediately obvious.
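For readers who have not met the `x & -x` idiom, here is a small Python sketch of it inside a bitmask N-queens counter. It is a generic textbook formulation written for illustration, not the C++ or assembly code discussed in this post, and the function names are made up.

```python
def lowest_set_bit(x):
    """Isolate the lowest set bit of x (0 if x == 0) without a bit scan."""
    return x & -x


def count_queens(n):
    """Count N-queens solutions with the usual bitmask backtracking."""
    full = (1 << n) - 1

    def place(cols, diag_l, diag_r):
        if cols == full:
            return 1
        total = 0
        free = full & ~(cols | diag_l | diag_r)
        while free:                      # a plain zero test drives the loop
            bit = lowest_set_bit(free)   # int bit = x & -x
            free -= bit                  # x = x - bit
            total += place(cols | bit,
                           ((diag_l | bit) << 1) & full,
                           (diag_r | bit) >> 1)
        return total

    return place(0, 0, 0)
```

The loop condition is an ordinary comparison against zero, which is the property that matters for the discussion that follows.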

The original program used the fact that bsf sets the zero flag if the input register is 0 as the condition to do the backtracking; the converted code just checked if the value was 0 (using a simple test instruction). The compare and the jump instructions are basically converted into a single instruction in the processor. In contrast, the bsf does not get to do this; combined with the intrinsically higher latency of the instruction, it means that empty loops take a lot longer to do nothing. The use of an 8-bit shift value is also interesting, as there is a rather severe penalty for using 8-bit registers in Intel processors as far as I can see.

Now, this isn't to say that the compiler will always produce the best code by itself. My final code wasn't above using x86 intrinsics for the vector instructions. Replacing the _mm_andnot_si128 intrinsic with an actual and-not on vectors caused gcc to use other, slower instructions instead of the vmovq to move the result out of the SSE registers for reasons I don't particularly want to track down. The use of the _mm_blend_epi16 and _mm_srli_si128 intrinsics can probably be replaced with __builtin_shuffle instead for more portability, but I was under the misapprehension that this was a clang-only intrinsic when I first played with the code so I never bothered to try that, and this code has passed out of my memory long enough that I don't want to try to mess with it now.

In short, compilers know things about optimizing for modern architectures that many general programmers don't. Compilers may have issues with autovectorization, but the existence of vector intrinsics allows you to force compilers to use vectorization while still giving them leeway to make decisions about instruction scheduling or code alignment which are easy to screw up in hand-written assembly. Also, compilers are liable to get better in the future, whereas hand-written assembly code is unlikely to get faster in the future. So only write assembly code if you really know what you're doing and you know you're better than the compiler.

April 03, 2014 04:52 PM

Andrew Sutherland

monitoring gaia travis build status using webmail LED notifiers

usb LED webmail notifiers showing build status

For Firefox OS the Gaia UI currently uses Travis CI to run a series of test jobs in parallel for each pull request.  While Travis has a neat ember.js-based live-updating web UI, I usually find myself either staring at my build watching it go nowhere or forgetting about it entirely.  The latter is usually what ends up happening since we have a finite number of builders available, we have tons of developers, each build takes 5 jobs, and some of those jobs can take up to 35 minutes to run when they finally get a turn to run.

I recently noticed ThinkGeek had a bunch of Dream Cheeky USB LED notifiers on sale.  They’re each a USB-controlled tri-color LED in a plastic case that acts as a nice diffuser.  Linux’s “usbled” driver exposes separate red/green/blue files via sysfs that you can echo numbers into to control them.  While the driver and USB protocol inherently support a range of 0-255, it seems like 0-63 or 0-64 is all they give.  The color gamut isn’t amazing but is quite respectable and they are bright enough that they are useful in daylight.  I made a node.js library at https://github.com/asutherland/gaudy-leds that can do some basic tricks and is available on npm as “gaudy-leds”.  You can tell it to do things by doing “gaudy-leds set red green blue purple”, etc.  I added a bunch of commander sub-commands, so “gaudy-leds --help” should give a lot more details than the currently spartan readme.
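Writing to those sysfs files amounts to something like the following Python sketch. It is not how gaudy-leds is implemented; the function name is hypothetical and the device directory is deliberately left as a parameter, since the actual sysfs location depends on the machine.

```python
from pathlib import Path

CHANNELS = ("red", "green", "blue")


def set_led(device_dir, red, green, blue):
    """Write one value per color channel into the LED's sysfs directory.

    `device_dir` is whatever directory holds the red/green/blue files for
    the device; its real location is an assumption left to the caller.
    """
    values = dict(zip(CHANNELS, (red, green, blue)))
    # Validate everything first so a bad value never leaves a half-set LED.
    for name, value in values.items():
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must be in 0-255, got {value}")
    for name, value in values.items():
        (Path(device_dir) / name).write_text(f"{value}\n")
```

Because the directory is a parameter, the function can be tried against a scratch directory before pointing it at real hardware.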

I couldn’t find any existing tools/libraries to easily watch a Travis CI build and invoke commands like that (though I feel like they must exist) so I wrote https://github.com/asutherland/travis-build-watcher.  While the eventual goal is to not have to manually activate it at all, right now I can point it at a Travis build or a github pull request and it will poll appropriately so it ends up at the latest build and updates the state of the LEDs each time it polls.

Relevant notes / context:

April 03, 2014 01:58 PM

April 01, 2014

Robert Kaiser

How Effective is the Mozilla Stability Program?

One of my goals for last quarter was to get some basic metrics for the effectiveness of Mozilla's stability program. This can most easily be determined by measuring how often Firefox Desktop and Firefox for Android crash over time. Below you'll find some graphs and discussion on the data I could gather on that topic so far.

The Crash Rate

The crash rate is our primary stability measure used at Mozilla. We measure this rate in "crashes per 100 active daily installations (ADI)" or "crashes / 100 ADI". (ADI is the number of daily requests sent by Firefox Desktop and Firefox for Android to update their copy of our add-on blocklist. This value is considered a good enough estimation for usage for our purposes.)

Challenges for a Long-Term Rate

In our daily work, we tend to look at crash rates in terms of short-term changes within a single version, esp. development versions, so we can determine regressions and then dig deeper into what those are. For determining long-term program efficiency, it makes sense though to look at cross-version crash rates instead, so we know how our releases (or betas) improved. So it might make sense to look at all users on the release "channel", i.e. anyone using a stable release. On the other hand, we sometimes have leftover users of old and unsupported versions producing a lot of crashes, but those are not really relevant to the current effectiveness of the stability program, so I wanted some way to age out old versions from this overall rate. To take all that into account, I needed some way to more or less "concatenate" the stability rate graphs of a series of versions. Also, people updating to or installing a release very soon after it's published tend to have somewhat different usage patterns, and therefore different crash rates, than those updating late in the cycle, so I needed to find some way to smooth over that as well and ideally make this into an algorithm that can be automatically requested and put into an SQL query (as the data I base this on is in a PostgreSQL database).

Used Methodology

So, I began to think we could always sum up the crash and ADI numbers of the most recent two releases, or the ones that have the most users. But sometimes we release two adjacent versions 6 weeks apart and sometimes we do a fast update after a week and when the second of those is released, the one before might not have a lot of people updated to it yet so taking only those two might only cover a small portion of users and skew the numbers. So in the end, I decided to go with a moving window that always counts all versions where the builds have been created within the last 12 weeks for the Release channel, and the last 4 weeks for the Beta channel (I had 9 and 3 in the beginning but extended that to make numbers smooth over the impact of the 2-week hiatus we had over New Year's this year). The data we have in usable form goes back to the last few days of September 2011, so that's what I could use for the graphs (I'm trying to get some older data but that is harder to dig out).
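To make the moving window concrete, here is a rough Python sketch of the per-day calculation. It is not the SQL query actually used; the function name and the shapes of the two input dictionaries are assumptions made for the example. Only the 12-week and 4-week windows and the "crashes / 100 ADI" unit come from the text.

```python
from datetime import date, timedelta

WINDOW_WEEKS = {"release": 12, "beta": 4}  # window sizes given in the post


def crash_rate(day, builds, daily, channel="release"):
    """Crashes per 100 ADI on `day`, over recently built versions only.

    builds: {version: build_date}             (hypothetical shape)
    daily:  {(day, version): (crashes, adi)}  (hypothetical shape)
    """
    cutoff = day - timedelta(weeks=WINDOW_WEEKS[channel])
    crashes = adi = 0
    for version, built in builds.items():
        # Only versions whose builds fall inside the window are counted,
        # which ages out old and unsupported versions automatically.
        if cutoff <= built <= day:
            c, a = daily.get((day, version), (0, 0))
            crashes += c
            adi += a
    return 100.0 * crashes / adi if adi else None
```

Summing crashes and ADI before dividing means a freshly released version with few users barely moves the rate until its ADI grows.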

Graphs & Discussion

So, here are some screen copies of the graphs I have created out of the data collected with that algorithm (includes data up to March 5, which was current when I originally wrote up this post):

Image No. 23207

The first graph, with data from the Firefox desktop release channel, shows three lines. As the legend says, they cover crashes of the browser process, those of a plugin process (the vast majority of the plugin processes are Adobe Flash), and so-called "hangs" where we kill the plugin process after it doesn't react to contact from the browser process for a long time (by default, 45 seconds).
For one thing, you'll see that weekends have higher crash rates than weekdays. This could for example be because the ADI data isn't as reliable/accurate as one would hope or because people using Firefox on weekends do things that are more crash-prone (including work/home usage pattern and possibly machine differences).
In this graph we can also clearly see the results of known stability events in this time frame: For example, it nicely shows the Google Doodle crash of August 2012, where almost every startup of the browser crashed when Google was set as the home page, and where we scrambled to get a fix out in very short time (and Google helped us by putting a workaround in their doodles as well). It's also easy to see a few other sharp spikes where we had ADI (upwards) or crash submission (downwards) issues, as well as the crash-and-hang-rich Flash 11.3 release in June of 2012 and subsequent fixes for Flash, including the concerted efforts between Adobe and us to get down to the old levels with fixes on both sides in May/June 2013. For most of the time on the graph, you'll see that the browser crash rate didn't change very much (other than the sharp spikes mentioned). In January of 2013, though, it's possible to see the rise in crashes that caused us to ship Firefox 18.0.2 with a fix for that. Right following that, at the end of February, you'll see the sharp rise in crashes when we released Firefox 19.0, triggered by a bug in certain AMD CPUs, which we worked around by rebuilding and releasing a 19.0.1. Those examples, like anything that shows up significantly in that graph and is not a data error, have pretty intricate stories; any of them could make up a separate blog post.
That said, the fact that we could keep the crash rate pretty much at 1.0 browser crashes / 100 ADI over that whole time (and even slightly improve to just below that with the Firefox 26 release in December 2013) is a statement on how effective the Mozilla Stability Program is on keeping Firefox crashes down even though a whole lot of code has been added to support a ton of new features that the web has gained over that time.

Now, let's see how Firefox Beta looks in comparison:

Image No. 23208

At the end of 2012, we apparently did manage to improve base level stability of the Beta channel, but you'll see that this channel is more noisy - which is expected as here we still see regressions and work on fixes before the issues hit release. For example, you can see that Firefox 27 Beta regressed stability in December 2013. We fixed that only very late in the cycle so that you don't see 27 being worse on the Release channel, but 28 had other regressions in the beginning and a rather large one in 28 Beta 4 (mid February 2014) - once we fixed that, you see that we come down to the 1.0 line in the last one or two weeks, so that looks pretty good for the 28 release, which was to be released ~2 weeks after the end of that data.
Also, you'll see that the plugin improvements of early 2013 are about 6 weeks earlier in Beta than in Release, which shows pretty well that there were actual patches in our code that helped with Flash hangs and crashes (as our code is on a 6-week cadence while Adobe's releases hit both channels pretty much at the same time).

Now, let's see how the picture looks when we look at a product that was newly created while we already had the mechanisms in place to record this data, like the current "native UI" Firefox for Android:

Image No. 23211

The early releases had higher crash rates, but we significantly improved over time due to our efforts in the Stability Program. You also can make out that the sharper changes happen pretty exactly at the edges of the 6-week release cycles. Also, you'll see that Firefox 23 for Android in September 2013 was pretty good but we became worse in the following months. Because of that, we started a renewed effort to improve stability of Firefox for Android this January. The current Firefox 27 for Android release is somewhat better than the one before, but it's not where we want to be yet, obviously. We didn't have too much time to pound on issues from the start of the year until 27 was released, but Beta can show us if our newer efforts are pointing in the right direction:

Image No. 23212

Now this graph looks pretty nice, doesn't it? When we started off putting this product on Beta the first time, we were seeing the usual churn of exposing a new product to a wider audience for the first time, but we burned down the issues pretty well. Then we had a big regression, fixed it, and burned down bugs slowly over multiple months again. The regressions of late 2013 look even more dramatic here as we had even worse issues there but could actually fix the worst parts of those so that the regressions on the Release channel weren't as bad as the first Betas we had there. Many of the 6-week cycles in this graph look like burn-down charts, high in the beginning, going down over the cycle as we push for bugs being fixed. It's also pretty awesome to see how the efforts since the start of this year have really paid off and current Beta is rivaling the best Beta numbers we had so far - you can imagine how I was looking forward to Firefox 28 for Android hitting Release based on that data! :)

All that said, we know there's more we can do on both products, and while holding crash rates pretty stable over a long time while adding a ton of features is awesome, we strive for improving overall stability. Those graphs are one part of measuring the effectiveness of the stability program. I hope we will be able to put them up in a more dynamic and daily updating form at some point (right now I manually construct them in LibreOffice).

And in case you're interested in digging deeper into the source of the graphs, the code to pull the data from the crash-stats DB is in my crash-report-tools repo and the JSON coming out of that and powering my charts is in my directory on crash-analysis (F*-bytype.json files). Also feel free to contact me for more details.

April 01, 2014 05:58 PM

March 29, 2014

Robert Kaiser

Lantea Maps conversion to WebGL

I blogged about Lantea Maps 18 months ago. As its marketplace listing describes, the app's purpose is displaying maps as well as recording and displaying GPS tracks.

I wrote this app both to scratch an itch (I wanted an OpenStreetMap-based app to record GPS tracks) and to learn a bit more of JavaScript and web app development. As maps are a 2D problem and the track display requires drawing various lines and possibly other location indicators, I wrote this app based on 2D canvas. I started off with some base code on drawing bitmap tile maps to canvas, and wrote the app around that, doing quite some rewriting on the little bit of code I started from. I also ended up splitting map and track into separate canvases so I wouldn't need to redraw everything when deleting the track or when moving the indicator of the last location or similar. Over time, I did a lot of improvements in various areas of the app, from the tile cache in IndexedDB via OpenStreetMap upload of tracks to pinch zooming on touch screens.

Still, performance of the map canvas was not good - on phones (esp. the small 320x480 screens like the ZTE Open), where you only have a handful of 256x256 map tiles to draw, panning was slightly chunky, but on larger screens, like my Android tablet or even my pretty fast desktop, it ranged from bad to awful (like, noticeably waiting from any movement until you saw the moved map being drawn). Also, as it takes a while until images are loaded (from the IndexedDB cache or from the web) and that's all called asynchronously, the positions at which the images ended up being drawn often weren't completely correct any more at the time of drawing them. I tried some optimizations with actually grabbing the pixels from the canvas, setting them in the new positions and only actually redrawing the images on the borders, but that only helped slightly on small screens while making large ones even worse in performance.

Given what I read and heard about how today's graphics chips and pipelines work, I figured that the problem was with the drawImage() calls to draw the tiles to the canvas as well as the getImageData()/putImageData() calls to move the pixels in the optimizations. All those copy image data between JS and graphics memory, which is slow, and doing it a lot doesn't really fit well with how graphics stacks work nowadays. The only way I heard that should improve that a lot would be to switch from 2D canvas to WebGL (or go to the image-based tile maps that many others are using, but that wouldn't be as much fun). I don't remember all sources for that, but just did get another pointer to a Mozilla Hacks post that explains some of it. And as Google also seems to be moving their Maps site to WebGL (from image-based tiles, mind you), it can't be a really wrong move. :)

So, I set out to try and learn the pieces of WebGL I needed for this app. You'd guess that Mozilla, who invented that API together with Khronos, would have ample docs on it, but the WebGL MDN page only has one tutorial for an animated 3D cube and a list of external links. I meanwhile have filed a bug on a WebGL reference, so this may improve further in the future, but I started off by trying the tutorial that MDN has. I didn't get a lot to work there except some basics, and a lot of the commands in there were not very well explained, but the html5rocks tutorial helped me to get things into a better shape, and some amount of trying around and the MSDN WebGL reference helped to understand more and get things actually right.
One thing that was pretty useful there as well was separating the determination of what tiles should be visible and loading them into textures from the actual drawing of the textures to the canvas. By doing the drawing itself on requestAnimationFrame and this being the only thing done when we pan as long as I have all tiles loaded into textures, I save work and should improve performance.

Image No. 23214 Image No. 23213
2D Canvas (left) and WebGL (right) version of Lantea Maps on the ZTE Open

As you can see from the images, the 2D canvas and WebGL versions of Lantea Maps do not look different - but then, that was not intended, as the map is the map after all. Right now, you can actually test both versions, though: I have not moved the WebGL to production yet, so lantea.kairo.at still uses 2D canvas, while the staging version lantea-dev.kairo.at already is WebGL. You'll notice that panning the map is way more fluid in the new version and the tile distortions that could happen with delayed loading in the old one do not happen. I still wonder though why it sometimes happens that you have to wait very long for tiles to load, esp. after zooming. I still need to figure that out at some point, but things render after waiting, so I found it OK for now. Also, I found the WebGL app to work fine on Firefox desktop (Linux), Firefox for Android, as well as Firefox OS (1.1, 1.2, and 1.5/Nightly).

So, I'm happy I did manage the conversion and learn some WebGL, though there's still a lot to be done. And as always, the code to Lantea Maps is available in my public git as well as GitHub if you want to learn or help. ;-)

March 29, 2014 12:02 AM

March 14, 2014

Joshua Cranmer

Understanding email charsets

Several years ago, I embarked on a project to collect the headers of all the messages I could reach on NNTP, with the original intent of studying the progression of the most common news clients. More recently, I used this dataset to attempt to discover the prevalence of charsets in email messages. In doing so, I identified a critical problem with the dataset: since it only contains headers, there is very little scope for actually understanding the full, sad story of charsets. So I've decided to rectify this problem.

This time, I modified my data-collection scripts to make it much easier to mass-download NNTP messages. The first script effectively lists all the newsgroups, and then all the message IDs in those newsgroups, stuffing the results in a set to remove duplicates (cross-posts). The second script uses Python's nntplib package to attempt to download all of those messages. Of the 32,598,261 messages identified by the first set, I succeeded in obtaining 1,025,586 messages in full or in part. Some messages failed to download due to crashing nntplib (which appears to be unable to handle messages of unbounded length), and I suspect my newsserver connections may have just timed out in the middle of the download at times. Others failed due to expiring before I could download them. All in all, 19,288 messages were not downloaded.

Analysis of the contents of messages was hampered by a strong desire to find techniques that would mangle messages as little as possible. Prior experience with Python's message-parsing libraries leads me to believe that they are rather poor at handling some of the crap that comes into existence, and the errors in nntplib suggest they haven't fixed them yet. The only message parsing framework I truly trust to give me that level of finesse is the JSMime that I'm writing, but that happens to be in the wrong language for this project. After reading some blog posts of Jeffrey Stedfast, though, I decided I would give GMime a try instead of trying to rewrite ad-hoc MIME parser #N.

Ultimately, I wrote a program to investigate the following questions on how messages operate in practice:

While those were the questions I sought the answers to originally, I did come up with others as I worked on my tool, some in part due to what information I was basically already collecting. The tool I wrote primarily uses GMime to convert the body parts to 8-bit text (no charset conversion), as well as parse the Content-Type headers, which are really annoying to do without writing a full parser. I used ICU to handle charset conversion and detection. RFC 2047 decoding is done largely by hand since I needed very specific information that I couldn't convince GMime to give me. All code that I used is available upon request; the exact dataset is harder to transport, given that it is some 5.6GiB of data.
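As a point of reference for what RFC 2047 decoding involves, the charset label of each encoded-word in a header can be pulled out with Python's standard library. This is only an illustration using `email.header.decode_header`; the tool described here does its decoding by hand, and the helper name below is made up.

```python
from email.header import decode_header


def header_charsets(raw_value):
    """Return the charset labels declared by encoded-words in a header.

    Parts without an encoded-word carry no charset and are skipped.
    """
    return [charset for _, charset in decode_header(raw_value) if charset]
```

A header with no encoded-words yields an empty list, which is exactly the case where any 8-bit octets present have no declared charset at all.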

Other than GMime being built on GObject and exposing a C API, I can't complain much, although I didn't try to use it to do magic. Then again, in my experience (and as this post will probably convince you as well), you really want your MIME library to do charset magic for you, so in doing well for my needs, it's actually not doing well for a larger audience. ICU's C API similarly makes me want to complain. However, I'm now very suspicious of the quality of its charset detection code, which is the main reason I used it. Trying to figure out how to get it to handle the charset decoding errors also proved far more annoying than it really should be.

Some final background regards the biases I expect to crop up in the dataset. As the approximately 1 million messages were drawn from the Python set iterator, I suspect that there's no systematic bias towards or away from specific groups, excepting that the ~11K messages found in the eternal-september.* hierarchy are completely represented. The newsserver I used, Eternal September, has a respectably large set of newsgroups, although it is likely to be biased towards European languages and to under-represent East Asian ones. The less well-connected South America, Africa, or central Asia are going to be almost completely unrepresented. The download process will be biased away from particularly heinous messages (such as exceedingly long lines), since nntplib itself is failing.

This being news messages, I also expect that use of 8-bit will be far more common than would be the case in regular mail messages. On a related note, the use of 8-bit in headers would be commensurately elevated compared to normal email. What would be far less common is HTML. I also expect that undeclared charsets may be slightly higher.

Charsets

Charset data is mostly collected on the basis of individual body parts within body messages; some messages have more than one. Interestingly enough, the 1,025,587 messages yielded 1,016,765 body parts with some text data, which indicates that either the messages on the server had only headers in the first place or the download process somehow managed to only grab the headers. There were also 393 messages that I identified having parts with different charsets, which only further illustrates how annoying charsets are in messages.

The aliases in charsets are mostly uninteresting in variance, except for the various labels used for US-ASCII (us - ascii, 646, and ANSI_X3.4-1968 are the less-well-known aliases), as well as the list of charsets whose names ICU was incapable of recognizing, given below. Unknown charsets are treated as equivalent to undeclared charsets in further processing, as there were too few to merit separate handling (45 in all).

For the next step, I used ICU to attempt to detect the actual charset of the body parts. ICU's charset detector doesn't support the full gamut of charsets, though, so charset names not claimed to be detected were instead processed by checking if they decoded without error. Before using this detection, I detect if the text is pure ASCII (excluding control characters, to enable charsets like ISO-2022-JP, and +, if the charset we're trying to check is UTF-7). ICU has a mode which ignores all text in things that look like HTML tags, and this mode is set for all HTML body parts.
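The pure-ASCII pre-check described above could look roughly like this in Python. This is a sketch, not the scanner's actual code: the function name is_pure_ascii is made up, and allowing tab, LF and CR while rejecting every other control character (including ESC and DEL) is an assumption about where the line was drawn.

```python
def is_pure_ascii(data: bytes, declared_charset: str = "") -> bool:
    """Return True if the octets can be treated as plain ASCII text."""
    checking_utf7 = declared_charset.strip().lower() == "utf-7"
    for octet in data:
        if octet >= 0x80:
            return False  # 8-bit data always needs a real charset
        # Assumption: tab, LF and CR are harmless; any other control
        # character (notably ESC, which starts ISO-2022-JP shift
        # sequences) disqualifies the text from the ASCII shortcut.
        if (octet < 0x20 and octet not in (0x09, 0x0A, 0x0D)) or octet == 0x7F:
            return False
        # "+" introduces a shifted sequence in UTF-7.
        if checking_utf7 and octet == ord("+"):
            return False
    return True
```

Only parts that fail this check would then be handed to the charset detector.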

I don't quite believe ICU's charset detection results, so I've collapsed the results into a simpler table to capture the most salient feature. The correct column indicates the cases where the detected result was the declared charset. The ASCII column captures the fraction which were pure ASCII. The UTF-8 column indicates if ICU reported that the text was UTF-8 (it always seems to try this first). The Wrong C1 column refers to an ISO-8859-1 text being detected as windows-1252 or vice versa, which is set by ICU if it sees or doesn't see an octet in the appropriate range. The other column refers to all other cases, including invalid cases for charsets not supported by ICU.

Declared         Correct     ASCII   UTF-8  Wrong C1    Other      Total
ISO-8859-1       230,526   225,667     883     8,119    1,035    466,230
Undeclared             -   148,054   1,116         -   37,626    186,796
UTF-8             75,674    37,600       -         -    1,551    114,825
US-ASCII               -    98,238       0         -      304     98,542
ISO-8859-15       67,529    18,527       -         -        0     86,056
windows-1252      21,414     4,370     154     3,319      130     29,387
ISO-8859-2        18,647     2,138      70        71    2,319     23,245
KOI8-R             4,616       424       2         -    1,112      6,154
GB2312             1,307        59       0         -      112      1,478
Big5                 622       608       0         -      174      1,404
windows-1256         343        10       0         -       45        398
IBM437                84       257       -         -        0        341
ISO-8859-13          311         6       -         -        0        317
windows-1251         131        97       1         -       61        290
windows-1250          69        69       0        14      101        253
ISO-8859-7            26        26       0         0      131        183
ISO-8859-9           127        11       0         0       17        155
ISO-2022-JP           76        69       0         -        3        148
macintosh             67        57       -         -        0        124
ISO-8859-16            0        15       -         -      101        116
UTF-7                 51         4       -         -        0         55
x-mac-croatian         0        13       -         -       25         38
KOI8-U                28         2       -         -        0         30
windows-1255           0        18       0         0        6         24
ISO-8859-4            23         0       -         -        0         23
EUC-KR                 0         3       0         -       16         19
ISO-8859-14           14         4       -         -        0         18
GB18030               14         3       0         -        0         17
ISO-8859-8             0         0       0         0       16         16
TIS-620               15         0       -         -        0         15
Shift_JIS              8         4       0         -        1         13
ISO-8859-3             9         1       -         -        1         11
ISO-8859-10           10         0       -         -        0         10
KSC_5601               3         6       -         -        0          9
GBK                    4         2       -         -        0          6
windows-1253           0         3       0         0        2          5
ISO-8859-5             1         0       0         -        3          4
IBM850                 0         4       -         -        0          4
windows-1257           0         3       -         -        0          3
ISO-2022-JP-2          2         0       -         -        0          2
ISO-8859-6             0         1       0         -        0          1
Total            421,751   536,373   2,226    11,523   44,892  1,016,765
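The Wrong C1 distinction comes down to one octet range. As a rough Python illustration (not ICU's code; both function names are hypothetical): octets 0x80 to 0x9F are C1 control characters in ISO-8859-1 but mostly printable characters such as curly quotes in windows-1252, so their presence is what tips the guess one way or the other.

```python
def has_c1_octets(data: bytes) -> bool:
    """True if any octet falls in 0x80-0x9F, the C1 control range."""
    return any(0x80 <= octet <= 0x9F for octet in data)


def latin1_or_windows1252(data: bytes) -> str:
    # Text that uses the C1 range is almost certainly windows-1252,
    # since ISO-8859-1 only defines invisible control codes there.
    return "windows-1252" if has_c1_octets(data) else "ISO-8859-1"
```

A message declared as ISO-8859-1 that contains 0x93 and 0x94 (curly quotes in windows-1252) would land in the Wrong C1 column under this rule.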

The most obvious thing shown in this table is that the most common charsets remain ISO-8859-1, Windows-1252, US-ASCII, UTF-8, and ISO-8859-15, which is to be expected, given an expected prior bias to European languages in newsgroups. The low prevalence of ISO-2022-JP is surprising to me: it means a lower incidence of Japanese than I would have expected. Either that, or Japanese users have switched to UTF-8 en masse, which I consider very unlikely given that they have tended to resist the trend towards UTF-8 the most.

Beyond that, this dataset has caused me to lose trust in the ICU charset detectors. KOI8-R is recorded as being 18% malformed text, most of which ICU believes to be ISO-8859-1 instead. Judging from the results, it appears that ICU has a bias towards guessing ISO-8859-1, which means I don't believe the numbers in the Other column to be accurate at all. For some reason, I don't appear to have decoders for ISO-8859-16 or x-mac-croatian on my local machine, but running some tests by hand appears to indicate that they are valid and not incorrect.

Somewhere between 0.1% and 1.0% of all messages are subject to mojibake, depending on how much you trust the charset detector. The cases of UTF-8 being misdetected as non-UTF-8 could potentially be explained by having very few non-ASCII sequences (ICU requires four valid sequences before it confidently declares text UTF-8); someone who writes a post in English but has a non-ASCII signature (such as myself) could easily fall into this category. Despite this, however, it does suggest that there is enough mojibake around that users need to be able to override charset decisions.

The undeclared charsets are described, in descending order of popularity, by ISO-8859-1, Windows-1252, KOI8-R, ISO-8859-2, and UTF-8, describing 99% of all non-ASCII undeclared data. ISO-8859-1 and Windows-1252 are probably over-counted here, but the interesting tidbit is that KOI8-R is used half as much undeclared as it is declared, and I suspect it may be undercounted. The practice of using locale-default fallbacks that Thunderbird has been using appears to be the best way forward for now, although UTF-8 is growing enough in popularity that using a specialized detector that decodes as UTF-8 if possible may be worth investigating (3% of all non-ASCII, undeclared messages are UTF-8).
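The "decode as UTF-8 if possible" idea can be sketched in a few lines of Python. The function name decode_undeclared and the windows-1252 default are illustrative assumptions; a real client would take the fallback from the user's locale.

```python
from typing import Tuple


def decode_undeclared(data: bytes, fallback: str = "windows-1252") -> Tuple[str, str]:
    """Decode a body part that has no declared charset.

    Returns the decoded text and the charset that was actually used.
    """
    try:
        # Valid UTF-8 is very unlikely to occur by accident in text
        # written in a legacy 8-bit charset, so try it first.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Otherwise use the locale-style fallback and never fail:
        # undecodable octets become U+FFFD.
        return data.decode(fallback, errors="replace"), fallback
```

Pure ASCII input also takes the UTF-8 branch, which is harmless because ASCII decodes identically under both charsets.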

HTML

Unsurprisingly (considering I'm polling newsgroups), very few messages contained any HTML parts at all: there were only 1,032 parts in the total sample size, of which only 552 had non-ASCII characters and were therefore useful for the rest of this analysis. This means that I'm skeptical of generalizing the results of this to email in general, but I'll still summarize the findings.

HTML, unlike plain text, contains a mechanism to explicitly identify the charset of a message. The official algorithm for determining the charset of an HTML file can be described simply as "look for a <meta> tag in the first 1024 bytes. If it can be found, attempt to extract a charset using one of several different techniques depending on what's present or not." Since doing this fully properly is complicated in library-less C++ code, I opted to look first for a <meta[ \t\r\n\f] production, guess the extent of the tag, and try to find a charset= string somewhere in that tag. This appears to be an approach which is more reflective of how this parsing is actually done in email clients than the proper HTML algorithm. One difference is that my regular expressions also support the newer <meta charset="UTF-8"/> construct, although I don't appear to see any use of this.
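A Python approximation of that shortcut follows. The original was written in C++ and its exact regular expressions are not shown, so the two patterns and the name sniff_meta_charset here are assumptions that merely follow the description: find a meta tag, guess that it ends at the next ">", and look for charset= inside it.

```python
import re

# "<meta" followed by one whitespace character, up to the next ">".
META_TAG = re.compile(rb"<meta[ \t\r\n\f][^>]*>", re.IGNORECASE)
# charset=VALUE, optionally quoted; matches both the http-equiv form
# and the newer <meta charset="..."> form.
CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9._:+-]+)", re.IGNORECASE)


def sniff_meta_charset(html: bytes, limit: int = 1024):
    """Return the charset named by a <meta> tag in the first bytes, or None."""
    head = html[:limit]
    for tag in META_TAG.finditer(head):
        match = CHARSET.search(tag.group(0))
        if match:
            return match.group(1).decode("ascii")
    return None
```

This deliberately ignores the finer points of the official algorithm, such as attribute parsing and comments.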

I found only 332 parts where the HTML declared a charset. Only 22 parts had both a MIME charset and an HTML charset that disagreed with each other. I neglected to count how many messages had HTML charsets but no MIME charsets, but random sampling appeared to indicate that this is very rare in the data set (the same order of magnitude or less as those where they disagreed).

As for the question of who wins: of the 552 non-ASCII HTML parts, only 71 had a MIME charset that was not the valid charset. Then again, 71 had an HTML charset that was not valid either, which strongly suggests that ICU was detecting the incorrect charset. Judging from manual inspection of such messages, it appears that the MIME charset ought to be preferred if it exists. There are also a large number of HTML charset specifications saying unicode, which ICU treats as UTF-16, which is most certainly wrong.

Headers

In the data set, 1,025,856 header blocks were processed for the following statistics. This is slightly more than the number of messages since the headers of contained message/rfc822 parts were also processed. The good news is that 97% (996,103) of the headers were completely ASCII. Of the remaining 29,753 headers, 3.6% (1,058) were UTF-8 and 43.6% (12,965) matched the declared charset of the first body part. This leaves 52.9% (15,730) that did not match that charset, however.

Now, NNTP messages can generally be expected to have a higher 8-bit header ratio, so this probably exaggerates the situation in most email messages. That said, the high incidence is definitely an indicator that even non-EAI-aware clients and servers cannot blindly presume that headers are 7-bit, nor can EAI-aware clients and servers presume that 8-bit headers are UTF-8. The high incidence of mismatching the declared charset suggests that fallback-charset decoding of headers is a necessary step.

RFC 2047 encoded-words are also an interesting statistic to mine. I found 135,951 encoded-words in the data set, which is rather low, considering that messages can be reasonably expected to carry more than one encoded-word. This is likely an artifact of NNTP's tendency towards 8-bit instead of 7-bit communication and understates their presence in regular email.

Counting encoded-words can be difficult, since there is a mechanism to let them continue in multiple pieces. For the purposes of this count, a sequence of such words counts as a single word, and the Continued column gives the number of sequences that had more than one element. The 2047 Violation column counts the number of sequences where decoding words individually does not yield the same result as decoding them as a whole, in violation of RFC 2047. The Only ASCII column counts those words containing nothing but ASCII symbols and where the encoding was thus (mostly) pointless. The Invalid column counts the number of sequences that had a decoder error.

Charset           Count  Continued  2047 Violation  Only ASCII  Invalid
ISO-8859-1       56,355     15,610               -         499        0
UTF-8            36,563     14,216           3,311       2,704    9,765
ISO-8859-15      20,699      5,695               -          40        0
ISO-8859-2       11,247      2,669               -           9        0
windows-1252      5,174      3,075               -          26        0
KOI8-R            3,523      1,203               -          12        0
windows-1256        765        568               -           0        0
Big5                511         46              28           0      171
ISO-8859-7          165         26               -           0        3
windows-1251        157         30               -           2        0
GB2312              126         35               6           0       51
ISO-2022-JP         102          8               5           0       49
ISO-8859-13          78         45               -           0        0
ISO-8859-9           76         21               -           0        0
ISO-8859-4           71          2               -           0        0
windows-1250         68         21               -           0        0
ISO-8859-5           66         20               -           0        0
US-ASCII             38         10               -          38        0
TIS-620              36         34               -           0        0
KOI8-U               25         11               -           0        0
ISO-8859-16          22          1               -           0       22
UTF-7                17          2               1           8        3
EUC-KR               17          4               4           0        9
x-mac-croatian       10          3               -           0       10
Shift_JIS             8          0               0           0        3
Unknown               7          2               -           0        7
ISO-2022-KR           7          0               0           0        0
GB18030               6          1               0           0        1
windows-1255          4          0               -           0        0
ISO-8859-14           3          0               -           0        0
ISO-8859-3            2          1               -           0        0
GBK                   2          0               0           0        2
ISO-8859-6            1          1               -           0        0
Total           135,951     43,360           3,361       3,338   10,096

This table somewhat mirrors the distribution of regular charsets, with one major class of differences: charsets that represent non-Latin scripts (particularly Asian scripts) appear to be overdistributed compared to their corresponding use in body parts. The exception to this rule is GB2312 which is far lower than relative rankings would presume—I attribute this to people using GB2312 being more likely to use 8-bit headers instead of RFC 2047 encoding, although I don't have direct evidence.

Clearly continuations are common, which is to be relatively expected. The sad part is how few people bother to try to adhere to the specification here: out of 14,312 continuations in languages that could violate the specification, 23.5% of them violated the specification. The mode-shifting versions (ISO-2022-JP and EUC-KR) are basically all violated, which suggests that no one bothered to check if their encoder "returns to ASCII" at the end of the word (I know Thunderbird's does, but the other ones I checked don't appear to).
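The violation test itself (decode each word alone, decode the concatenated octets, compare) can be sketched in Python as below. This is not the code that produced the table: the names word_bytes and violates_rfc2047 are hypothetical, the sketch assumes every word in a sequence shares the first word's charset, and sequences that fail to decode as a whole are simply reported as non-violations on the assumption that they are counted as invalid elsewhere.

```python
import base64
import binascii
import re

ENCODED_WORD = re.compile(r"=\?([^?]*)\?([bBqQ])\?([^?]*)\?=")


def word_bytes(encoding: str, text: str) -> bytes:
    """Undo the B or Q transfer encoding of one encoded-word."""
    if encoding.upper() == "B":
        return base64.b64decode(text)
    # header=True makes "_" decode to a space, as RFC 2047 requires.
    return binascii.a2b_qp(text.encode("ascii"), header=True)


def violates_rfc2047(sequence: str) -> bool:
    """True if the words only decode correctly when glued together."""
    words = ENCODED_WORD.findall(sequence)
    if not words:
        return False
    charset = words[0][0]
    try:
        chunks = [word_bytes(enc, text) for _, enc, text in words]
        whole = b"".join(chunks).decode(charset)
    except (ValueError, LookupError):
        return False  # undecodable as a whole: invalid, not a violation
    try:
        pieces = "".join(chunk.decode(charset) for chunk in chunks)
    except UnicodeDecodeError:
        return True  # a character was split across two words
    return pieces != whole
```

A UTF-8 character split across two words trips the second decode, while a mode-shifting charset that fails to return to ASCII shows up as a mismatch between the two results.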

The number of invalid UTF-8 decoded words, 26.7%, seems impossibly high to me. A brief check of my code indicates that this is working incorrectly in the face of invalid continuations, which certainly exaggerates the effect but still leaves a value too high for my tastes. Of more note are the elevated counts for the East Asian charsets: Big5, GB2312, and ISO-2022-JP. I am not an expert in charsets, but I believe that Big5 and GB2312 in particular are a family of almost-but-not-quite-identical charsets and it may be that ICU is choosing the wrong candidate of each family for these instances.

There is a surprisingly large number of encoded words that encode only ASCII. When searching specifically for the ones that use the US-ASCII charset, I found that these can be divided into three categories. One set comes from a few people who apparently have an unsanitized whitespace (space and LF were the two I recall seeing) in the display name, producing encoded words like =?us-ascii?Q?=09Edward_Rosten?=. Blame 40tude Dialog here. Another set encodes some basic characters (most commonly = and ?, although a few other interpreted characters popped up). The final set of errors were double-encoded words, such as =?us-ascii?Q?=3D=3FUTF-8=3FQ=3Ff=3DC3=3DBCr=3F=3D?=, which appear to be all generated by an Emacs-based newsreader.
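For the double-encoded case, one possible client-side coping strategy is to decode repeatedly until the value stops changing. The Python sketch below uses the standard library's email.header module; the helper name decode_repeatedly and the cap of three passes are assumptions, and whether a client should unwrap such words at all is a judgment call.

```python
from email.header import decode_header, make_header


def decode_repeatedly(value: str, max_passes: int = 3) -> str:
    """Decode RFC 2047 words, then decode again if more words appear."""
    for _ in range(max_passes):
        decoded = str(make_header(decode_header(value)))
        if decoded == value:
            break  # nothing left to decode
        value = decoded
    return value


# The double-encoded example from the text unwraps in two passes.
print(decode_repeatedly("=?us-ascii?Q?=3D=3FUTF-8=3FQ=3Ff=3DC3=3DBCr=3F=3D?="))
```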

One interesting thing when sifting the results is finding the crap that people produce in their tools. By far the worst single instance of an RFC 2047 encoded-word that I found is this one: Subject: Re: [Kitchen Nightmares] Meow! Gordon Ramsay Is =?ISO-8859-1?B?UEgR lqZ VuIEhlYWQgVH rbGeOIFNob BJc RP2JzZXNzZW?= With My =?ISO-8859-1?B?SHVzYmFuZ JzX0JhbGxzL JfU2F5c19BbXiScw==?= Baking Company Owner (complete with embedded spaces), discovered by crashing my ad-hoc base64 decoder (due to the spaces). The interesting thing is that even after investigating the output encoding, it doesn't look like the text is actually correct ISO-8859-1... or any obvious charset for that matter.
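A decoder that survives embedded spaces does not need much. The following Python sketch (the name tolerant_b64decode is made up) drops all whitespace, repairs the padding, and reports failure with None instead of crashing; it makes no attempt to guess what a damaged word was meant to say.

```python
import base64
import binascii


def tolerant_b64decode(text: str):
    """Decode base64 that may contain stray whitespace or bad padding."""
    compact = "".join(text.split())  # remove spaces, tabs, newlines
    compact = compact.rstrip("=")
    compact += "=" * (-len(compact) % 4)  # restore a legal length
    try:
        return base64.b64decode(compact)
    except binascii.Error:
        return None  # still malformed: let the caller decide
```

Returning None rather than raising keeps one bad Subject line from aborting a scan of a million messages.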

I looked at the unknown charsets by hand. Most of them were actually empty charsets (looked like =??B?Sy4gSC4gdm9uIFLDvGRlbg==?=), and all but one of the outright empty ones were generated by KNode and really UTF-8. The other one was a Windows-1252 generated by a minor newsreader.

Another important aspect of headers is how to handle 8-bit headers. RFC 5322 blindly hopes that headers are pure ASCII, while RFC 6532 dictates that they are UTF-8. Indeed, 97% of headers are ASCII, leaving just 29,753 headers that are not. Of these, only 1,058 (3.6%) are UTF-8 per RFC 6532. Deducing which charset they are is difficult because the large amount of English text for header names and the important control values will greatly skew any charset detector, and there is too little text to give a charset detector confidence. The only metric I could easily apply was testing Thunderbird's heuristic as "the header blocks are the same charset as the message contents"—which only worked 45.2% of the time.
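Put together, a header decoder along these lines might try a short list of candidates in order. The sketch below is an assumption about how such a chain could be arranged, not Thunderbird's implementation: the function name, the order (ASCII, then UTF-8 per RFC 6532, then the body charset, then a fallback) and the windows-1252 default are all illustrative.

```python
from typing import Optional, Tuple


def decode_header_octets(raw: bytes, body_charset: Optional[str] = None,
                         fallback: str = "windows-1252") -> Tuple[str, str]:
    """Decode a raw 8-bit header block; return (text, charset used)."""
    candidates = ["ascii", "utf-8"]
    if body_charset:
        candidates.append(body_charset)
    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown label: try the next one
    # Last resort: never fail, replace what cannot be decoded.
    return raw.decode(fallback, errors="replace"), fallback
```

Given the figures above, the third step succeeds for fewer than half of the non-ASCII headers, so the final fallback matters in practice.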

Encodings

While developing an earlier version of my scanning program, I was intrigued to know how often various content transfer encodings were used. I found 1,028,971 parts in all (1,027,474 of which are text parts). The transfer encoding of binary did manage to sneak in, with 57 such parts. Using 8-bit text was very popular, at 381,223 samples, second only to 7-bit at 496,114 samples. Quoted-printable had 144,932 samples and base64 only 6,640 samples. Extremely interesting is the presence of 4 illegal transfer encodings in 5 messages, two of them obvious typos and the others appearing to be a client mangling header continuations into the transfer-encoding.
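A tally like this is easy to reproduce with Python's standard email package. The sketch below is not the scanning program used here (the function name count_transfer_encodings is hypothetical); it counts every leaf part and treats a missing header as 7bit, which is the default RFC 2045 specifies.

```python
from collections import Counter
from email import message_from_bytes


def count_transfer_encodings(raw_messages):
    """Tally Content-Transfer-Encoding values over all leaf MIME parts."""
    counts = Counter()
    for raw in raw_messages:
        for part in message_from_bytes(raw).walk():
            if part.is_multipart():
                continue  # containers carry no payload of their own
            cte = part.get("Content-Transfer-Encoding", "7bit")
            # Normalise case and whitespace; typos and mangled values
            # stay visible as their own keys.
            counts[str(cte).strip().lower()] += 1
    return counts
```

Because nothing is validated, illegal values show up as extra keys in the result instead of being silently dropped.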

Conclusions

So, drawing from the body of this data, I would like to make the following conclusions as to using charsets in mail messages:

  1. Have a fallback charset. Undeclared charsets are extremely common, and I'm skeptical that charset detectors are going to get this stuff right, particularly since email can more naturally combine multiple languages than other bodies of text (think signatures). Thunderbird currently uses a locale-dependent fallback charset, which roughly mirrors what Firefox and I think most web browsers do.
  2. Let users override charsets when reading. By the same token, mojibake text, while not particularly common, is common enough to make declared charsets sometimes unreliable. It's also possible that the fallback charset is wrong, so users may need to override the chosen charset.
  3. Testing is mandatory. In this set of messages, I found base64 encoded words with spaces in them, encoded words without charsets (even UNKNOWN-8BIT), and clearly invalid Content-Transfer-Encodings. Real email messages that are flagrantly in violation of basic spec requirements exist, so you should make sure that your email parser and client can handle the weirdest edge cases.
  4. Non-UTF-8, non-ASCII headers exist. EAI notwithstanding, 8-bit headers are a reality. Combined with a predilection for saying ASCII when text is really ASCII, this means that there is often no good in-band information to tell you what charset is correct for headers, so you have to go back to a fallback charset.
  5. US-ASCII really means ASCII. Email clients appear to do a very good job of only emitting US-ASCII as a charset label if it's US-ASCII. The sample size is too small for me to grasp what charset 8-bit characters should imply in US-ASCII.
  6. Know your decoders. ISO-8859-1 actually means Windows-1252 in practice. Big5 and GB2312 are actually small families of charsets with slightly different meanings. ICU notably disagrees with some of these realities, so be sure to include in your tests various charset edge cases so you know that the decoders are correct.
  7. UTF-7 is still relevant. Of the charsets I found not mentioned in the WHATWG encoding spec, IBM437 and x-mac-croatian are in use only due to specific circumstances that limit their generalizable presence. IBM850 is too rare. UTF-7 is common enough that you need to actually worry about it, as abominable and evil a charset as it is.
  8. HTML charsets may matter—but MIME matters more. I don't have enough data to say if charsets declared in HTML are needed to do proper decoding. I do have enough to say fairly conclusively that the MIME charset declaration is authoritative if HTML disagrees.
  9. Charsets are not languages. The entire reason x-mac-croatian is used at all can be traced to Thunderbird displaying the charset as "Croatian," despite it being pretty clearly not a preferred charset. Similarly most charsets are often enough ASCII that, say, an instance of GB2312 is a poor indicator of whether or not the message is in English. Anyone trying to filter based on charsets is doing a really, really stupid thing.
  10. RFCs reflect an ideal world, not reality. This is most notable in RFC 2047: the specification may state that encoded words are supposed to be independently decodable, but the evidence is pretty clear that more clients break this rule than uphold it.
  11. Limit the charsets you support. Just because your library lets you emit a hundred charsets doesn't mean that you should let someone try to do it. You should emit US-ASCII or UTF-8 unless you have a really compelling reason not to, and those compelling reasons don't require obscure charsets. Some particularly annoying charsets should never be written: EBCDIC is already basically dead on the web, and I'd like to see UTF-7 die as well.
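Several of these points (unknown labels fall back, ISO-8859-1 is decoded as Windows-1252, the odd US-ASCII aliases are recognized) meet in the step that turns a declared label into a decoder. A minimal Python sketch follows, with a hypothetical override table and function name; the alias list only covers labels mentioned in this post.

```python
import codecs

# Assumption: a small hand-written table, not a complete registry.
LABEL_OVERRIDES = {
    "iso-8859-1": "windows-1252",  # what senders mean in practice
    "us-ascii": "ascii",
    "646": "ascii",
    "ansi_x3.4-1968": "ascii",
}


def pick_decoder(label, fallback: str = "windows-1252") -> str:
    """Map a declared charset label to a codec name Python can use."""
    if not label:
        return fallback  # undeclared
    name = label.strip().lower()
    name = LABEL_OVERRIDES.get(name, name)
    try:
        codecs.lookup(name)
    except LookupError:
        return fallback  # unknown labels are treated as undeclared
    return name
```

Tests for a real client would feed this kind of function the strange labels found in the wild, such as the empty charset and unicode.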

When I have time, I'm planning on taking some of the more egregious or interesting messages in my dataset and packaging them into a database of emails to help create testsuites on handling messages properly.

March 14, 2014 04:17 AM

March 10, 2014

Rumbling Edge - Thunderbird

2014-03-09 Calendar builds

Common (excluding Website bugs)-specific: (27)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

March 10, 2014 08:16 AM

2014-03-09 Thunderbird comm-central builds

Thunderbird-specific: (31)

MailNews Core-specific: (26)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

March 10, 2014 08:15 AM

March 07, 2014

Ludovic Hirlimann

Thunderbird 28.0b1 is out and why you should care

We’ve just released another beta of Thunderbird. We are now in the middle of the release cycle leading up to the next major version, which will be released to our millions of daily users (we’ve fixed 200+ bugs since the last major release, version 24). Currently less than 1% of our users are using the beta, and that’s not enough to catch regressions: because Thunderbird offers mail, newsgroups and RSS feeds, we can’t cover the usage of our user base. Also, many companies out there sell extensions for spam filtering, for virus protection and so forth. The QA community just doesn’t have the time to try all of these and run them with Thunderbird betas to find issues.

And that’s where you, dear reader, can help. How, you might ask? Well, here is a list of examples of how you can help:

If you find issues, let us know either through Bugzilla or through the support forums, so we can try to address them.

PS: the current download page says English only because of a bug in our build infrastructure for Windows. Linux and Mac builds are available localized.

March 07, 2014 02:55 PM

March 05, 2014

David Ascher

Product Thinking

I have a new job!  Still with Mozilla, still doing a lot of what I’ve done in the past, just hopefully more/better/faster.  The group I’m joining has a great culture of active blogging, so I’m hoping the peer pressure there will help me blog more often.

What’s the gig you ask? My new focus is to help the Mozilla Foundation make our products as adoptable as possible.

MoFo (as we affectionately call that part of the Mozilla organization) has a few main ways in which we’re hoping to change the world — some of those are programs, like Open News and the Science Lab, some are products. In a program, the change we’re hoping to effect happens by connecting brains together, either through fellowship programs, events, conferences, things like that. That work is the stuff of movement-building, and it’s fascinating to watch my very skilled colleagues at work — there is a distinctive talent required to attract autonomous humans to a project, get them excited about both what you’re doing and what they could do, and empowering them to help themselves and others.

Alongside these programmatic approaches, MoFo has for a while been building software whose use is itself impactful.  Just like getting people to use Firefox was critical to opening up the web, we believe that using products like the Webmaker tools or BadgeKit will have direct impact and help create the internet the world needs.

And that’s where I come in!  Over the last few years, various smart people have kept labeling me a “product person”, and I’ve only recently started to understand what they meant, and that indeed, they are right — “product” (although the word is loaded with problematic connotations) is central for me.

I’ll write a lot more about that over the coming months, but the short version is that I am particularly fascinated by the process that converts an idea or a pile of code into something that intelligent humans choose to use and love to use.  That translation to me is attractive because it requires a variety of types of thinking: business modeling, design, consumer psychology, and creative application of technology.  It is also compelling to me in three other aspects: it is subversive, it is humane, and it is required for impact.

It is subversive because I think if we do things right, we use the insights from billions of dollars worth of work by “greedy, evil, capitalist corporations” who have figured out how to get “eyeballs” to drive profit and repurpose those techniques for public benefit — to make it easy for people to learn what they want to learn, to allow people to connect with each other, to amplify the positive that emerges when people create.  It is humane because I have never seen a great product emerge from teams that treat people as hyper-specialized workers, without recognizing the power of complex brains who are allowed to work creatively together.  And it is required for impact because software in a repo or an idea in a notebook can be beautiful, but is inert.  To get code or an idea to change the world, we need multitudes to use it; and the best way I know to get people to use software is to apply product thinking, and make something people love.

I am thrilled to say that I have as much to learn as I have to teach, and I hope to do much of both in public.  I know I’ll learn a lot from my colleagues, but I’m hoping I’ll also get to learn on this blog.

I’m looking forward to this new phase, it fits my brain.

March 05, 2014 12:09 AM

March 03, 2014

Instantbird

Pardon the Interruption! Instantbird nightly builds are back!

As of today, March 3rd, 2014, Instantbird nightly builds (1.6a1pre) are being built again. We last had nightly builds on January 9th, 2014 and they have been broken since due to a series of large infrastructure changes we’ve been going through to merge the Instantbird Bugzilla and code repository with Mozilla’s. Unfortunately, getting nightly builds working again took us longer than expected as it involved many related issues: updating Instantbird to work with newer versions of Mozilla, reconfiguring our buildbot and working on getting libpurple to build as an extension.

The result of this is that Instantbird is now building out of the “comm-central” code repository (the same place the code for Thunderbird is stored). What does this mean for you?

Again, sorry for any interruption. Regular development should be continuing now. Thanks for all the concerned emails telling us our builds had stopped!

March 03, 2014 06:21 PM

March 01, 2014

Ludovic Hirlimann

Fixing email ...

A recent Twitter thread about innovation in email annoyed me:

That got me annoyed, because I always hear that we need to innovate with email, and Gmail is always taken as a reference in terms of email innovation.

But if you look at it, Gmail brought the following things to the world:

  1. unlimited “brokenish imap”
  2. Dkim/spf to fight spammers
  3. Good/very good spam filters (using the sender trust level)
  4. new nice UI to webmail

And that’s it. I’ve been involved with a MUA long enough now that I think email can’t be fixed by ‘just’ UI work or client work. Some things need to happen at the spec level and be implemented on both client and servers.

Unfortunately, email has been around for 30ish years now and has been working for all that time, so fixing and simplifying it can’t happen overnight.

Let’s make a list of what needs to be fixed client side first :

On the server side there are way more things that need to be fixed

  1. SMTP needs to be replaced with something that would reduce spam and try to keep the identity of the sender (things done today with things like DKIM/SPF/OpenPGP).
  2. mailing lists need to be rethought in the way you interact with them - e.g. digest mode / subscribe / unsubscribe / vacation / cross-posting
  3. Rich Format email

Of course the last thing to fix is users: why oh why on earth do I get a Microsoft Word document as an attachment, when I could have got the content directly in the email itself?

March 01, 2014 10:56 AM

February 27, 2014

Robert Kaiser

Preserving Software: Emulators

It's been a while since I wrote a post here, and even longer since I wrote about preserving software. But there's two more topics I have on my list to write about the event I attended last May. This is one of them.

One problem for preserving software is that the original hardware that the software ran on might not survive very long. Some people are still keeping some old machines like C64, Apple ][ and others running, but at some point there won't be many left as the original ones wear out or get damaged, and other hardware might already be unusable at this point. And for sure, those machines are not available broadly to the public. Ideally, we'd have the hardware and recreate the full experience, e.g. how you connected the machine to your own TV in the living room and played or worked with it there - but that is pretty unlikely or at least hard to do, esp. with the hardware being less and less available, as I mentioned.

But there's one way to bring at least part of the experience to users: We can emulate the old machines and let the preserved software run within that emulator. That doesn't give us the living-room-TV experience, but there's a better chance in both preserving that way of running the old pieces of software for a long time and making the experience broadly available. Now, it's not always easy to get emulators running well, but there are a number of projects out there, and we heard about a few interesting solutions in the preserving software event at the LoC, but one was particularly appealing to us as Mozillians.



I blogged about The Internet Archive (archive.org) and Jason Scott already some time ago, and it was he who mentioned this very appealing kind of emulator called JSMESS. What hides behind that name is the multi-platform MESS emulator, cross-compiled into JavaScript via Emscripten, a project that should be well-known here at Mozilla. :)



Since the event in May, a lot of work has been flowing into JSMESS, and as Jason has blogged about, there are a thousand cartridges available now in the Historical Software Collection of The Internet Archive, and performance is pretty decent within the browser now.

With that, a whole lot of old software is available for everyone, at any time, to try and experience within their own browser!

That's a powerful way to preserve software for the current world and upcoming generations, isn't it?

February 27, 2014 01:40 AM

February 10, 2014

Ludovic Hirlimann

PGP Key Signing Party in London, March 25th 2014

On March 25th 2014 I’m organizing a PGP key signing party in the Mozilla London office after working hours.

If you value privacy and email you should probably attend. In order to organize the meeting I need to know the number of participants; for that I’m requesting that people register using Eventbrite.

For those who don’t know how a PGP key signing party works, you should read this: http://sietch-tabr.tumblr.com/post/61392468398/pgp-key-signing-party-at-mozilla-summit-2013-details.

February 10, 2014 01:59 PM

February 09, 2014

Rumbling Edge - Thunderbird

2014-02-09 Calendar builds

Common (excluding Website bugs)-specific: (6)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

February 09, 2014 11:55 PM

2014-02-09 Thunderbird comm-central builds

Thunderbird-specific: (85)

MailNews Core-specific: (32)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

February 09, 2014 11:53 PM

February 04, 2014

Ludovic Hirlimann

Fosdem 2014 report

This year I attended a full FOSDEM, after missing it last year and only having half a FOSDEM the two years before. This year I attended more than a couple of talks outside of the Mozilla room. I also helped set up the Mozilla booth on the first day.

Mozilla booth being setup at fosdem

I attended the keynote, "How we found 10 million errors in Wikipedia", which turned out to be only 1 million, with false positives. Overall a great presentation of LanguageTool, a nice grammar checking tool written in Java. They have an extension for Firefox/Thunderbird that will let you grammar check after spell checking is done. They also have a recent changes page for Wikipedia so anyone can check grammar errors in “realtime” on Wikipedia. Last but not least, they were looking for help to provide the tool in JS instead of Java.

Next I went to the mozilla room to get updates on Fennec, and how the google summer of code worked. Basically very interesting numbers.

I then listened to a talk about scaling Dovecot:

at the dovecot talk at fosdem

Very very interesting. Of course I asked what the shittiest client was. The answer was Thunderbird, because it tries to implement more. After the talk I went and asked Timo Sirainen if he could give me a list of such bugs. The answer was "search for my name in Bugzilla". My Bugzilla-fu being what it is, I was of course unable to find any bugs :(

The next talk I listened to was also about email, but Postfix this time - very very nice and informative about spam, and how to scale for spam. Full of interesting data.

My turn to talk came. You’ll find the slides here.

Sunday was less busy. I loosely listen to a talk about OTR, a good talk from the people at kaltura on HTML5 nad the video tag. I say very interesting because the lack of a file format standard really complicate things.

I then went to the largest PGP key signing party. I will only sign 91 keys because I had to leave before the end.

Last but not least, ipv6 support was very strong on the fosdem network. Next year it will be full v6 only.

I took a bunch of pictures with my phone as I wanted to travel light without my heavy camera.

http://www.flickr.com/photos/lhirlimann/sets/72157640364126833/

Update: Of course someone from #tb-qa with the proper bugzilla fu gave me the list of dovecot-related bugs.

February 04, 2014 02:33 PM

February 01, 2014

Joshua Cranmer

Why email is hard, part 5: mail headers

This post is part 5 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. This post discusses the more general problem of email headers.

Back in my first post, Ludovic kindly posted, in a comment, a link to a talk of someone else's email rant. And the best place to start this post is with a quote from that talk: "If you want to see an email programmer's face turn red, ask him about CFWS." CFWS is an acronym that stands for "comments and folded whitespace," and I can attest that the mere mention of CFWS is enough for me to start ranting. Comments in email headers are spans of text wrapped in parentheses, and the folding of whitespace refers to the ability to continue headers on multiple lines by inserting a newline before (but not in lieu of) a space.
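To make the folding half of CFWS concrete, here is a minimal Python sketch of unfolding a raw header block. The function name `unfold_headers` is invented for this illustration, and the sketch assumes well-formed input; it makes no attempt to handle obsolete syntax.

```python
import re

def unfold_headers(raw: str) -> list[str]:
    """Join continuation lines (starting with space or tab) onto the previous header."""
    # A fold is a line break followed by whitespace. Unfolding drops only the
    # line break and keeps the whitespace that follows it.
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", raw)
    return [line for line in re.split(r"\r?\n", unfolded) if line]

raw = (
    "Subject: a long\r\n"
    " subject line\r\n"
    "Date: Sat, 1 Feb 2014 03:57:00 -0600 (CST)\r\n"
)
for header in unfold_headers(raw):
    print(header)
```

Note that the space after the line break survives, which is the "before (but not in lieu of) a space" rule in action.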

I'll start by pointing out that there is little advantage to adding in free-form data to headers which are not going to be manually read in the vast majority of cases. In practice, I have seen comments used for only three headers on a reliable basis. One of these is the Date header, where a human-readable name of the timezone is sometimes included. The other two are the Received and Authentication-Results headers, where some debugging aids are thrown in. There would be no great loss in omitting any of this information; if information is really important, appending an X- header with that information is still a viable option (that's where most spam filtration notes get added, for example).

For this feature of questionable utility in the first place, the impact it has on parsing message headers is enormous. RFC 822 is specified in a manner that is familiar to anyone who reads language specifications: there is a low-level lexical scanning phase which feeds tokens into a secondary parsing phase. Like programming languages, comments and white space are semantically meaningless [1]. Unlike programming languages, however, comments can be nested—and therefore lexing an email header is not regular [2]. The problems of folding (a necessary evil thanks to the line length limit I keep complaining about) pale in comparison to comments, but it's extra complexity that makes machine-readability more difficult.
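Because comments nest, a plain regular expression cannot find where a comment ends; something has to count depth. The following is a rough Python sketch of that idea, not a full RFC 5322 lexer. The function name `strip_comments` is made up for this example, and it assumes the header value has already been unfolded.

```python
def strip_comments(value: str) -> str:
    """Remove (possibly nested) parenthesized comments from an unfolded header value."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # parentheses inside a quoted string are not comments
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value) and (depth > 0 or in_quotes):
            # A backslash escapes the next character inside comments and quoted strings.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
            if ch == '"':
                in_quotes = True
        i += 1
    return "".join(out)

print(strip_comments("Sat, 1 Feb 2014 03:57:00 -0600 (CST (Central (US)))").strip())
```

The `depth` counter is exactly the piece of state that a finite automaton lacks, which is why the lexical grammar is not regular.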

Fortunately, RFC 2822 made a drastic change to the specification that greatly limited where CFWS could be inserted into headers. For example, in the Date header, comments are allowed only following the timezone offset (and whitespace in a few specific places); in addressing headers, CFWS is not allowed within the email address itself [3]. One unanticipated downside is that it makes reading the other RFCs that specify mail headers more difficult: any version that predates RFC 2822 uses the syntax assumptions of RFC 822 (in particular, CFWS may occur between any listed tokens), whereas RFC 2822 and its descendants all explicitly enumerate where CFWS may occur.

Beyond the issues with CFWS, though, syntax is still problematic. The separation of distinct lexing and parsing phases means that you almost see what may be a hint of uniformity which turns out to be an ephemeral illusion. For example, the header parameters defined in RFC 2045 for Content-Type and Content-Disposition set a tradition of ;-separated param=value attributes, which has been picked up by, say, the DKIM-Signature or Authentication-Results headers. Except a close look indicates that Authentication-Results allows two param=value pairs between semicolons. Another side effect was pointed out in my second post: you can't turn a generic 8-bit header into a 7-bit compatible header, since you can't tell without knowing the syntax of the header which parts can be specified as 2047 encoded-words and which ones can't.

There's more to headers than their syntax, though. Email headers are structured as a somewhat-unordered list of headers; this genericity gives rise to a very large number of headers, and that's just the list of official headers. There are unofficial headers whose use is generally agreed upon, such as X-Face, X-No-Archive, or X-Priority; other unofficial headers are used for internal tracking such as Mailman's X-BeenThere or Mozilla's X-Mozilla-Status headers. Choosing how to semantically interpret these headers (or even which headers to interpret!) can therefore be extremely daunting.

Some of the headers are specified in ways that would seem surprising to most users. For example, the venerable From header can represent anywhere from 0 mailboxes [4] to an arbitrarily large number—but most clients assume that only one exists. It's also worth noting that the Sender header is (if present) a better indication of message origin as far as tracing is concerned [5], but its relative rarity likely results in filtering applications not taking it into account. The suite of Resent-* headers also experiences similar issues.

Another impact of email headers is the degree to which they can be trusted. RFC 5322 gives some nice-sounding platitudes to how headers are supposed to be defined, but many of those interpretations turn out to be difficult to verify in practice. For example, Message-IDs are supposed to be globally unique, but they turn out to be extremely lousy UUIDs for emails on a local system, even if you allow for minor differences like adding trace headers [6].

More serious are the spam, phishing, etc. messages that lie as much as possible so as to be seen by end-users. Assuming that a message is hostile, the only header that can be actually guaranteed to be correct is the first Received header, which is added by the final user's mailserver [7]. Every other header, including the Date and From headers most notably, can be a complete and total lie. There's no real way to authenticate the headers or hide them from snoopers—this has critical consequences for both spam detection and email security.

There's more I could say on this topic (especially CFWS), but I don't think it's worth dwelling on. This is more of a preparatory post for the next entry in the series than a full compilation of complaints. Speaking of my next post, I don't think I'll be able to keep up my entirely-unintentional rate of posting one entry in this series a month. I've exhausted the topics in email that I am intimately familiar with and thus have to move on to the ones I'm only familiar with.

[1] Some people attempt to be too zealous in following RFCs and ignore the distinction between syntax and semantics, as I complained about in part 4 when discussing the syntax of email addresses.
[2] I mean this in the theoretical sense of the definition. The proof that balanced parentheses is not a regular language is a standard exercise in use of the pumping lemma.
[3] Unless domain literals are involved. But domain literals are their own special category.
[4] Strictly speaking, the 0 value is intended to be used only when the email has been downgraded and the email address cannot be downgraded. Whether or not these will actually occur in practice is an unresolved question.
[5] Semantically speaking, Sender is the person who typed the message up and actually sent it out. From is the person who dictated the message. If the two headers would be the same, then Sender is omitted.
[6] Take a message that's cross-posted to two mailing lists. Each mailing list will generate copies of the message which end up being submitted back into the mail system and will typically avoid touching the Message-ID.
[7] Well, this assumes you trust your email provider. However, your email provider can do far worse to your messages than lie about the Received header…

February 01, 2014 03:57 AM

January 24, 2014

Joshua Cranmer

Charsets and NNTP

Recently, the question of charsets came up within the context of necessary decoder support for Thunderbird. After much hemming and hawing about how to find this out (which included a plea to the IMAP-protocol list for data), I remembered that I actually had this data. Long-time readers of this blog may recall that I did a study several years ago on the usage share of newsreaders. After that, I was motivated to take my data collection to the most extreme way possible. Instead of considering only the "official" Big-8 newsgroups, I looked at all of them on the news server I use (effectively, all but alt.binaries). Instead of relying on pulling the data from the server for the headers I needed, I grabbed all of them—the script literally runs HEAD and saves the results in a database. And instead of a month of results, I grabbed the results for the entire year of 2011. And then I sat on the data.

After recalling Henri Sivonen's pesterings about data, I decided to see the suitability of my dataset for this task. For data management reasons, I only grabbed the data from the second half of the year (about 10 million messages). I know from memory that the quality of Python's message parser (which was used to extract data in the first place) is surprisingly poor, which introduces bias of unknown consequence to my data. Since I only extracted headers, I can't identify charsets for anything which was sent as, say, multipart/alternative (which is more common than you'd think), which introduces further systematic bias. The end result is approximately 9.6M messages that I could extract charsets from and thence do further research.
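For readers who want a feel for this kind of extraction, here is a small Python sketch that tallies the declared charset from header-only message data. It is not the script used for the study; the function name `tally_charsets` and the bucket labels are invented here, and it relies only on the standard library's `email.parser.HeaderParser`.

```python
from collections import Counter
from email.parser import HeaderParser

def tally_charsets(header_blocks):
    """Count the charset parameter of the top-level Content-Type of each message."""
    counts = Counter()
    parser = HeaderParser()
    for block in header_blocks:
        msg = parser.parsestr(block)
        if msg.get_content_maintype() == "multipart":
            # The charsets live in the body parts, which header-only data lacks.
            counts["(unknown: multipart)"] += 1
        else:
            counts[msg.get_content_charset() or "(none declared)"] += 1
    return counts

samples = [
    "Content-Type: text/plain; charset=IBM437\n\n",
    "Content-Type: multipart/alternative; boundary=x\n\n",
    "Subject: no content type at all\n\n",
]
for label, count in tally_charsets(samples).items():
    print(label, count)
```

The multipart bucket shows where the systematic bias comes from: with headers alone there is simply no charset to read for those messages.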

Discussions revealed one particularly surprising tidbit of information. The most popular charset not accounted for by the Encoding specification was IBM437. Henri Sivonen speculated that the cause was some crufty old NNTP client on Windows using that encoding, so I endeavored to build a correlation database to check that assumption. Using the wonderful magic of d3, I produced a heatmap comparing distributions of charsets among various user agents. Details about the visualization may be found on that page, but it does refute Henri's claim when you dig into the data (it appears to be caused by specific BBS-to-news gateways, and is mostly localized in particular BBS newsgroups).

Also found on that page are some fun discoveries of just what kind of crap people try to pass off as valid headers. Some of those User-Agents are clearly spoofs (Outlook Express and family used the X-Newsreader header, not the User-Agent header). There also appears to be a fair amount of mojibake in headers (one of them appeared to be venerable double mojibake). The charsets also have some interesting labels to them: the "big5\n" and the "(null)" illustrate that some people don't double check their code very well, and not shown are the 5 examples of people who think charset names have spaces in them. A few people appear to have mixed up POSIX locales with charsets as well.

January 24, 2014 12:53 AM

January 15, 2014

Ludovic Hirlimann

For those of you attending FOSDEM

In a few weeks it will be Fosdem week-end. Something I’ve been attending since 2004.

This year I’d like to tell people that care about email privacy that fosdem has the biggest pgp key signing party in Europe. If you use pgp, or gnupg you might want to join the party.

To do so you’ll need to register before the 30th of January and follow the detailed instructions at https://fosdem.org/2014/keysigning/ .

update: People are sending plenty of keys; it’s going to be a great event.

January 15, 2014 10:42 AM

January 08, 2014

Meeting Notes

Thunderbird: 2014-01-07

Thunderbird meeting notes 2014-01-07
Minute taker – don’t forget to save a revision of the pad before clearing it for next use.
Please don’t forget to post on wiki.mozilla.org after the end of the meeting so that they will go public in the meeting notes blog.

Attendees

  • rkent, florian, jcranmer, Roland, JosiahOne, mkmelin

Action items from last meetings

  • [mconley] Put together an Etherpad for contributor badge categories and graphics

    • Make sure each badge has an explicit, one-sentence goal.

Friends of the tree

Current status and discussions

  • mconley still did not put together this etherpad!

  • Chat updates in brief:
    • Instantbird now uses bugzilla.mozilla.org

    • instantbird front-end code will be merged into comm-central in a folder named im/
    • GPL code (libpurple + glue code) would be in a separate repository
      • This would make it easier to develop the add-on for TB as well

Upcoming

Round Table

clokep/florian

  • Finally completed the 6th chat/ merge of IB -> c-c (bug 920801)

    • Dealt with a couple of minor bugs from this (bugs 956487 and 956767)
  • Planning to do another chat/ merge of IB -> c-c soon
  • Merged the Instantbird bugzilla into Mozilla’s bugzilla (bug 749586): sorry for the bugspam
    • Includes a “Chat core” component for code shared between Thunderbird and Instantbird

      • Need to resolve duplicates and move bugs to their proper components
  • Filed bug to merge the Instantbird UI into c-c (bug 956609)
    • Hoping to finish this merge by the end of the week

    • Will likely close the tree for a few hours once we are ready to land

jcranmer

  • Windows builds almost work on Alder (down to a packaging failure)

    • Pymake’s shell quoting is annoying

    • bug 957720 is the tracking for work needed to get windows builds green
  • Goal is to finish cc-rework by the end of January:
    • Still need to ensure unit test configs work in new setup

    • Unresolved OS X Universal builds failure
    • Unresolved Windows packaging failure
    • Need to land bug 957720, bug 944952
  • More talk about killing RDF, so returning to working on bug 441437
  • Had wonderful winter weather in Chicago followed by a bitterly cold weekend here in Urbana

JosiahOne (Left meeting early)

  • Did a lot of reviews, still have more to do. (Paenglab’s been knocking out bugs left and right.)

    • If you have flagged me for review, I plan to finish reviews this week.
  • Created several shared CSS files for theme with no regressions so far.
  • Landed a bunch of minor theme improvements
    • Removed the grain texture that has been causing issues. Bug 935023.
  • My plan now is actually to start doing less theme stuff and move to Cocoa and JS-based Front end.

mkmelin

  • bug 953426 expose remote content per-host privileges

  • bug 956586 get rid of FillInHTMLTooltip which is now in a binding (and bug 331772)

Question Time

Other

  • Q1 2014: We are (finally) moving the Knowledge Base from support.mozillamessaging.com to support.mozilla.org & from getsatisfaction to support.mozilla.org forums [roland]

January 08, 2014 04:00 AM

December 20, 2013

Instantbird

Instantbird 1.5 Released!

Instantbird 1.5 has been released: go grab your copy now! There are a ton of new features and bugs fixed for this new release, but we’d like to highlight a couple of new features below.

An exciting new feature you’ll find in Instantbird 1.5 is the New Conversation tab. It displays a list of your contacts, ordered based on how frequently and recently you’ve talked to them. Starting a conversation has never been easier! No longer will you have to open a separate window and scroll through your contact list to find a person. Just click the “+” button or press Ctrl/Cmd+T, start typing the name of the contact, and you should see your contact appear at the top of the list after typing only a few letters! You can then press enter and your conversation opens! The first time you open the tab, Instantbird will load your chat logs and learn who you talk to most often in order to offer accurate suggestions. New friends might not show up at the top immediately, but keep talking to them and they’ll reorder themselves. Don’t worry though, this ranking data is kept only on your own computer and is not transmitted or shared in any way!

Additionally, if you use Instantbird for IRC, the New Conversation tab will automatically query your servers to download the list of channels that are available to you. (This is generally known as LIST in IRC jargon.) Just like with your contacts, you can type in the name of a channel and it’ll bubble to the top of the list. Sometimes you don’t always know the channel name (that’s why you’re searching, right?): we’ve got you covered there too! Instantbird will search the channel topics in addition to channel names so you can quickly find new channels to join!

A very visible user interface improvement that was included for Instantbird 1.5 is redone tooltips that fit more into the visual style of the rest of the user interface. They should be immediately familiar to Instantbird users as they’re modeled after the conversation header! Hopefully this will help you find information quickly and easily whether conversing with your contacts or just checking their status.

For Linux users out there, we are still only offering 32-bit builds, although we hope to change that soon! If you are running a 64-bit Linux distribution, previously you’d have to install the ia32-libs (see our FAQ), but this has changed in recent versions of Ubuntu which no longer offer this package. The procedure now is to run:

sudo apt-get install libgtk2.0-0:i386 libpangox-1.0-0:i386 libpangoxft-1.0-0:i386 libidn11:i386 libglu1-mesa:i386 libxt-dev:i386 libasound-dev:i386

If you’d like to see a complete list of what’s new in Instantbird 1.5, please view the release notes.

December 20, 2013 09:12 PM

December 19, 2013

Robert Kaiser

LCARStrek and Australis

The latest version of my LCARStrek theme does not just support the latest SeaMonkey and Firefox releases. As I'm using it myself on Nightly, I'm trying to keep it working in an experimental way with that as well - and with that, a pretty huge challenge came up in the last weeks: A redesign of the Firefox interface code-named "Australis".

I blogged a month ago about how it may affect my customizations and I have dealt with those to a good degree by now, even though not yet even as drastically as I thought when writing that blog post. As always, more will follow. It took me some time until I switched over actually, as I wanted to keep using my theme, but it was naturally not compatible with such a huge redesign.

But after a lot of hours of my free time in the last few weeks, I have experimental support for Australis working in LCARStrek. The new changes living together with support for pre-Australis Firefox in the same theme require quite a few hacks to have a number of styles only apply on one side or the other. But then, I have been doing theme design for long enough (about 14 years now) that I know a few tricks and could use those - thankfully, there are a few changes in attributes set on the main toolbox, for example.

There's still a lot to be done in this area to fix some details (and I see a painting issue that is triggered in the submenus of the new main menu but is probably Linux-specific and connected to transparency used in the arrowpanel), but the main things seem to work decently now. See this screenshot:
Image No. 23159

Given that I'm using it every day, I hope starting now gives me enough experience with it that I can deliver a really decent theme when Australis finally will ship, probably with Firefox 29. :)

December 19, 2013 10:43 PM

Rumbling Edge - Thunderbird

2013-12-17 Calendar builds

Common (excluding Website bugs)-specific: (14)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

December 19, 2013 09:49 AM

2013-12-17 Thunderbird comm-central builds

Thunderbird-specific: (32)

MailNews Core-specific: (10)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

December 19, 2013 09:48 AM

December 14, 2013

Mike Conley

Australis Performance Post-mortem Part 3: As Good As Our Tools

While working on the ts_paint and tpaint regressions, we didn’t just stab blindly at the source code. We had some excellent tools to help us along the way. We also MacGyver‘d a few of those tools to do things that they weren’t exactly designed to do out of the box. And in some cases, we built new tools from scratch when the existing ones couldn’t cut it.

I just thought I’d write about those.

MattN’s Spreadsheet

I already talked about this one in my earlier post, but I think it deserves a second mention. MattN has mad spreadsheet skills. Also, it turns out you can script spreadsheets on Google Docs to do some pretty magical things – like pull down a bunch of talos data, and graph it for you.

I think this spreadsheet was amazingly useful in getting a high-level view of all of the performance regressions. It also proved very, very useful in the next set of performance challenges that came along – but more on those later.

MattN’s got a blog post up about his spreadsheet that you should check out.

The Gecko Profiler

This is a must-have for Gecko hackers who are dealing with some kind of performance problem. The next time I hit something performance related, this is the first tool I’m going to reach for. We used a number of tools in this performance work, but I’m pretty sure this was the most powerful one in our arsenal.

Very simply, Gecko ships with a built-in sampling profiler, and there’s an add-on you can install to easily dump, view and share these profiles. That last bit is huge – you click a button, it uploads, and bam – you have a link you can send to someone over IRC to have them look at your profile. It’s sheer gold.

We also built some tools on top of this profiler, which I’ll go into in a few paragraphs.

You can read up on the Gecko Profiler here at the official documentation.

Homebrew Profiler

At one point, jaws built a very simple profiler for the CustomizableUI component, to give us a sense of how many times we were entering and exiting certain functions, and how much time we were spending in them.
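The original tool is lost (see below), and it was written for Firefox front-end JavaScript, so the following is only a Python sketch of the general idea: wrap the functions you care about, count how often they are entered, and accumulate the time spent inside them. All names here (`stats`, `profiled`, `build_area`) are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# function name -> [number of calls, total seconds spent inside]
stats = defaultdict(lambda: [0, 0.0])

def profiled(func):
    """Record how often func is entered and how long each call takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on normal return and on exceptions, so every exit is counted.
            entry = stats[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper

@profiled
def build_area(n):
    # Stand-in for some expensive piece of UI construction.
    return sum(range(n))

for _ in range(3):
    build_area(10_000)

for name, (calls, seconds) in stats.items():
    print(f"{name}: {calls} calls, {seconds:.6f}s total")
```

Unlike a sampling profiler, this only sees the functions you explicitly wrap, which matches the limited scope described later in this post.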

Why did we build this? To be honest, it’s been too long and I can’t quite remember. We certainly knew about the Gecko Profiler at this point, so I imagine there was some deficiency with the profiler that we were dealing with.

My hypothesis is that this was when we were dealing strictly with the ts_paint / tpaint regression on Windows XP. Take a look at the graphs in my last post again. Notice how UX (red) and mozilla-central (green) converge at around July 1st on Ubuntu? And how OS X finally converges on tpaint around August 1st?

I haven’t included the Windows 7 and 8 platform graphs, but I’m reasonably certain that at this point, Windows XP was the last regressing platform on these tests.

And I know for a fact that we were having difficulty using the Gecko Profiler on Windows XP, due to this bug.

Basically, on Windows XP, the call tree wasn’t interleaving the JavaScript and native-code calls properly, so we couldn’t trust the order of the tree, making the profile really useless. This was a serious problem, and we weren’t sure how to work around it at the time.

And so I imagine that this is what prompted jaws to write the homebrew profiler. And it worked – we were able to find sections of CustomizableUI that were causing unnecessary reflow, or taking too long doing things that could be shortcutted.

I don’t know where jaws’ homebrew profiler is – I don’t have the patch on my machine, and somehow I doubt he does either. It was a tool of necessity, and I think we moved past it once we sorted out the Windows XP stack interleaving thing.

And how did we do that, exactly?

Using the Gecko Profiler on Windows XP

jaws’ profiler got us some good data, but it was limited in scope, since it only paid attention to CustomizableUI. Thankfully, at some point, Vladan from the Perf team figured out what was going wrong with the Gecko Profiler on Windows XP, and gave us a workaround that lets us get proper profiles again. I have since updated the Gecko Profiler MDN documentation to point to that workaround.

Reflow Profiles

This is where we start getting into some really neat stuff. So while we were hacking on ts_paint and tpaint, Markus Stange from the layout team wrote a patch for Gecko to take “reflow profiles”. This is a pretty big deal – instead of telling us what code is slow, a reflow profile tells us what things take a long time to lay out and paint. And, even better, it breaks it down by DOM id!

This was hugely powerful, and I really hope something like this can be built into the Gecko Profiler.

Markus’ patch can be found in this bug, but it’ll probably require de-bitrotting. If and when you apply it, you need to run Firefox with an environment variable MOZ_REFLOW_PROFILE_FILE pointing at the file you’d like the profile written out to.

Once you have that profile, you can view it on Markus’ special fork of the Gecko Profiler viewer.

This is what a reflow profile looks like:

Screen Shot 2013-12-13 at 11.49.34 PM

I haven’t linked to one I’ve shared because reflow profiles tend to be very large – too large to upload. If you’d like to muck about with a real reflow profile, you can download one of the reflow profiles attached to this bug and upload it to Markus’ Gecko Profiler viewer.

These reflow profiles were priceless throughout all of the Australis performance work. I cannot stress that enough. They were a way for us to focus on just a facet of the work that Gecko does – layout and painting – and determine whether or not our regressions lay there. If they did, that meant that we had to find a more efficient way to paint or layout. And if the regressions didn’t show up in the reflow profiles, that was useful too – it meant we could eliminate graphics and layout from our pool of suspects.

Comparison Profiles

Profiles are great, but you know what’s even better? Comparison profiles. This is some more Markus Stange wizardry.

Here’s the idea – we know that ts_paint and tpaint have regressed on the UX branch. We can take profiles of both the UX and mozilla-central. What if we can somehow use both profiles and find out what UX is doing that’s uniquely different and uniquely slow?

Sound valuable? You’re damn right it is.

The idea goes like this – we take the “before” profile (mozilla-central), and weight all of its samples by -1. Then, we add the samples from the “after” profile (UX).

The stuff that is positive in the resulting profile is an indicator that UX is slower in that code path. The stuff that is negative means that UX is faster.
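To make the arithmetic concrete, here is a toy Python sketch. It is not Markus’ script, and it does not use the real profile format; it assumes each sample is just a call stack (a tuple of frame names) and that both profiles were captured at the same sampling rate over comparable runs. The function name `comparison_profile` is made up for this example.

```python
from collections import Counter

def comparison_profile(before_samples, after_samples):
    """Return net sample weight per call stack: positive means more time in 'after'."""
    weights = Counter()
    for stack in before_samples:
        weights[tuple(stack)] -= 1   # every "before" sample counts as -1
    for stack in after_samples:
        weights[tuple(stack)] += 1   # every "after" sample counts as +1
    # Stacks that cancel out exactly are not interesting, so drop them.
    return {stack: w for stack, w in weights.items() if w != 0}

before = [("main", "paint")] * 3 + [("main", "layout")] * 2
after = [("main", "paint")] * 5 + [("main", "layout")] * 1

for stack, weight in comparison_profile(before, after).items():
    print(" > ".join(stack), weight)
```

In this toy data, the paint stack comes out at +2 (slower in "after") and the layout stack at -1 (faster in "after").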

How did we do this? Via these scripts. There’s a script in this repository called create_comparison_profile.py that does all of the work in generating the final comparison profile.

Here’s a comparison profile to look at, with mozilla-central as “before” and UX as “after”.

Now I know what you’re thinking – Mike – the root of that comparison profile is a negative number, so doesn’t that mean that UX is faster than mozilla-central?

That would seem logical based on what I’ve already told you, except that talos consistently returns the opposite opinion. And here’s where I expose some ignorance on my part – I’m simply not sure why that root node is negative when we know that UX is slower. I never got a satisfying answer to that question. I’ll update this post if I find out.

What I do know is that drilling into the high positive numbers of these comparison profiles yielded very valuable results. It allowed us to quickly determine what was uniquely slow about UX.

And in performance work, knowing is more than half the battle – knowing what’s slow is most of the battle. Fixing it is often the easy part – it’s the finding that’s hard.

Oh, and I should also point out that these scripts were able to generate comparison profiles for reflow profiles as well. Outstanding!

Profiles from Talos

Profiling locally is all well and good, but in the end, if we don’t clear the regressions on the talos hardware that run the tests, we’re still not good enough. So that means gathering profiles on the talos hardware.

So how do we do that?

Talos is not currently baked into the mozilla-central tree. Instead, there’s a file called testing/talos/talos.json that knows about a talos repository and a revision in that repository. The talos machines then pull talos from that repository, check out that revision, and execute the talos suites on the build of Firefox they’ve been given.

We were able to use this configuration to our advantage. Markus cloned the talos repository, and modified the talos tests to be able to dump out both SPS and reflow profiles into the logs of the test runs. He then pushed those changes to his user repository for talos, and then simply modified the testing/talos/talos.json file to point to his repo and the right revision.

The upshot being that Try would happily clone Markus’ talos, and we’d get profiles in the test logs on talos hardware! Brilliant!

Extracting and symbolicating those profiles would be handled by more of Markus’ scripts – see get_profiles.py.

Now we were cooking with gas – reflow and SPS profiles from the test hardware. Could it get better?

Actually, yes.

Getting the Good Stuff

When the talos tests run, the stuff we really care about is the stuff being timed. We care about how long it takes to paint the window, but not how long it takes to tear down the window. Unfortunately, things like tearing down the window get recorded in the SPS and reflow profiles, and that adds noise.

Wouldn’t it be wonderful to get samples just from the stuff we’re interested in? Just to get samples only when the talos test has its stopwatch ticking?

It’s actually easier than it sounds. As I mentioned, Markus had cloned the talos tests, and he was able to modify tpaint and ts_paint to his liking. He made it so that just as these tests started their stopwatches (waiting for the window to paint), an SPS profile marker was added to the sample taken at that point. A profile marker simply allows us to decorate a sample with a string. When the stopwatch stopped (the window has finished painting), we added another marker to the profile.

With that done, the extraction scripts simply had to exclude all samples that didn’t occur between those two markers.
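The filtering step can be sketched in a few lines of Python. This is an illustration only: the sample layout (a dict with an optional "markers" list) and the marker strings are assumptions made for this example, not the actual SPS profile format or the names the modified talos tests used.

```python
def samples_between_markers(samples, start_marker, stop_marker):
    """Keep only samples from the one carrying start_marker through the one carrying stop_marker."""
    kept = []
    recording = False
    for sample in samples:
        markers = sample.get("markers", [])
        if start_marker in markers:
            recording = True          # the stopwatch has started
        if recording:
            kept.append(sample)
        if stop_marker in markers:
            recording = False         # the stopwatch has stopped
    return kept

samples = [
    {"frame": "startup"},
    {"frame": "open window", "markers": ["paint-start"]},
    {"frame": "paint"},
    {"frame": "painted", "markers": ["paint-stop"]},
    {"frame": "teardown"},
]
for sample in samples_between_markers(samples, "paint-start", "paint-stop"):
    print(sample["frame"])
```

Startup and teardown samples fall outside the two markers and are dropped, leaving only the timed interval.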

The end result? Super concentrated profiles. It’s just the stuff we care about. Markus made it work for reflow profiles too – it was really quite brilliant.

And I think that pretty much covers it.

Lessons

So with these amazing tools we were eventually able to grind down our ts_paint and tpaint regressions into dust.

And we celebrated! We were very happy to clear those regressions. We were all clear to land!

Or so we thought. Stay tuned for Part 4.

December 14, 2013 05:00 PM

December 12, 2013

Calendar

Finally, A New Website!

This is the moment I have been waiting for so long! I can’t say this big enough because I am so excited:

Check out this awesome:

new-website

The new site is clear and concise. Not too many links, all important links on one page. The download button gives you the latest version directly from addons.mozilla.org.

There are some features still lacking that we will be adding in later; it was more important to just get this thing out in the open first. The holiday calendars page is still missing. Ideally the new holiday calendars area will auto-generate itself from the existing holiday files. We also wanted to show the most recent blog headlines.

The new site is part of bedrock, Mozilla’s shiny new website framework. This means we get a lot of stuff for free, one of them is localization. I haven’t found out how this works yet, but it will be possible to translate the page into any language.

Is there any information you are missing? Let us know what you think!

December 12, 2013 10:51 PM

Lightning 2.6.x Version Recap

As you may have read in the previous post, there have been quite a few issues with Lightning 2.6.x. I wanted to explain what happened and what we can do to avoid these issues in the future.

The Lightning build process is closely coupled with Thunderbird. Every time Thunderbird does a release, we get builds for Lightning for free. This means we mostly depend on them doing a release; otherwise I have to patch the final builds manually, which is a little more work. Luckily, each of the releases between 2.6 and 2.6.4 has been done together with a Thunderbird build.

Google Calendar Issues via CalDAV

Just before Lightning 2.6 was released, we made some last-minute changes to accommodate the fact that Google Calendar had changed their CalDAV URL. Not only that, they also implemented a specification for faster synchronization of CalDAV. We already supported this specification, but only an older version. A quick fix was done to take care of this. In total, there were some authentication issues and an error loading calendars. We knew we had to release a 2.6.1, but we didn’t know it had to happen so fast…

Version compatibility issues

When Thunderbird 24.0.1 was released, Lightning 2.6 did not work on Linux. The reason for this was a regression in the Mozilla Platform around Thunderbird 23. The binary component we have was built with a specific compiler flag with a parameter that was too strict. It basically said “this binary component is only for version 24.0.1”. The fix was easy: change it to “this binary component is for version 24.*”. But it took a while for that fix to be completed and accepted on all branches. Lightning 2.6.1 was quickly released as a workaround specifically compatible with Thunderbird 24.0.1; Lightning 2.6.2 was needed for Thunderbird 24.1.0.

Another reason this was so hard for users to figure out is that some Linux distributions decided to skip the minor releases and only do 24.0, 24.1.0, 24.2.0 and so on. There were complaints because the latest Lightning version wasn’t working when 24.1.1 was missing from the distribution repositories. We still needed to release subsequent Lightning versions though, otherwise users using the stock builds would complain.
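To make the difference between the strict and the relaxed setting concrete, here is a small sketch. It is not the real Mozilla version comparator or the actual build flag; version_matches is a hypothetical helper that only illustrates how an exact version differs from a "24.*" wildcard.

```python
def version_matches(version, pattern):
    """Hypothetical check: a '*' component in pattern matches anything after it."""
    version_parts = version.split(".")
    pattern_parts = pattern.split(".")
    for index, expected in enumerate(pattern_parts):
        if expected == "*":
            return True
        if index >= len(version_parts) or version_parts[index] != expected:
            return False
    # Without a wildcard, the versions must be identical.
    return len(version_parts) == len(pattern_parts)


strict_ok = version_matches("24.1.0", "24.0.1")   # the too-strict setting
relaxed_ok = version_matches("24.1.0", "24.*")    # the intended setting
```

With the strict pattern every Thunderbird point release needs a matching Lightning release, which is what forced 2.6.1 and 2.6.2.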

Lightning 2.6.3: Issues with CalDAV

Unfortunately, one of the patches for 2.6.1 had an error in it. We decided there needed to be a quick fix, and it was just in time for Lightning 2.6.3. The binary compatibility bug had been fixed by now, so this should also be the first version that is compatible with any version of Thunderbird 24.1.1 and up.

Lightning 2.6.4: Yet another one

Now this is the release that really annoyed me. First of all, I did a bad job on one of the patches. The other one was a minor issue with servers that don’t have a certain XML element in their response. These are the kinds of issues we could have easily figured out before the release with more and better unit tests. We might have even saved another release.

Conclusion

We probably could have known about all of these issues beforehand if we had tests to catch them. Just running any of the tests using the build machinery would have caught the binary compatibility issue. If we had at least some manual tests to test CalDAV servers, we could have started them for a few public demo servers and caught all of the CalDAV regressions. Both of these have been on my list for quite some time, but given all the other things coming up I never got around to it.

Integrating the tests with the build system is unfortunately something only someone with Mozillians trust can do, but if you want to help us write some unit tests, that would be marvelous. The cool new thing to use is promises and tasks, which allow writing really easy to read asynchronous code. I have some demo code that’s not quite working but is ready for someone to pick up.

If you want to help in some other way, please contact me! Even if you are not a developer, there is a lot that can be done by someone with a little initiative.

 

December 12, 2013 10:24 PM

December 11, 2013

Meeting Notes

Thunderbird: 2013-12-10

Thunderbird meeting notes 2013-12-10

Attendees

  • mconley, irving, jcranmer, paenglab, rolandtanglao, aceman, mkmelin, sshagarwal, javirueda

Action items from last meetings

  • none

Friends of the tree

  • nobody nominated but i am 100% somebody did something awesome

Current status and discussions

  • mconley: can we create an open contributor badge for TB contributors?

    • Friends of the tree, new contributor, I lost the list of because I’m not a fast typer

    • Need graphics or something or thing
    • Roland knows how to do it, so give him the graphics that we need. Goals, graphics, and people to award the badges to.

Critical Issues

  • Bug 948555 – Core graphics fix broke us in the last merge, but jcranmer is on it – we’ve essentially lost all OS X Mozmill tests. :/

Upcoming

Round Table

mconley

  • Still no takers on my Ensemble call for help – although I’ve responded to email asking how it currently works

  • Took another chunk out of my review queue this past weekend! \o/ I’ll have another opportunity this evening or tomorrow evening.
    • Part of my review queue includes needinfo?’s from tessarakt’s questions from the last meeting

jcranmer

  • Review queue is backed up, apologies

  • Hoping to land bug 842632… when bug 948555 is fixed (:-( @ m-c bustage)
  • Alder is green on 5/8 trees!
    • OS X opt broken due to universal build + nsinstall + ldap headaches

    • Windows requires a lot more investigation to fix
  • Will have spotty internet access and work times from ~Dec 21-Dec 29, and likely to be overloaded until Jan 14 with school stuff

Paenglab (no Mic)

  • Used TB a bit with a dark Persona and found two bugs:

    • Bug 948384 – No chat icon when unreadMessages=”true” with dark Personas.

    • Bug 948568 – Use inverted chat toolbar icons with dark Personas.
  • Around Christmas will look to update TB’s tabs to Fx Australis implementation.

Question Time

Other

  • Q1 2014: We are (finally) moving the Knowledge Base from support.mozillamessaging.com to support.mozilla.org & from getsatisfaction to support.mozilla.org forums [roland]

Action Items

  • [mconley] Put together an Etherpad for contributor badge categories and graphics

    • Make sure each badge has an explicit, 1-sentence goal.

December 11, 2013 04:00 AM

December 06, 2013

Robert Kaiser

Firefox OS DevTreff Vienna

Last month, I was contacted within a few days by a local "open mobile devices" enthusiast, a Mozilla events manager and a fellow German-speaking Mozilla Rep, all of them pointing to an event here in Vienna called Firefox OS DevTreff Austria.

While the local just asked me if I'd go there, the Mozilla contacts had been asked by the organizers for a speaker to open up the event. We were trying to get someone more used to talking about Firefox OS, but everyone's busy this time of year, so in the end we settled on me doing this keynote.

Now, I have been giving presentations on different occasions and events in the last years, but I never have actually keynoted anything, so that made me somewhat nervous. The other talks that were lined up for the evening were about app development, to some part about very concrete pieces of it, so I figured I should give that some frame and introduce people to Firefox OS, starting with why we are doing it, moving to what and where it is and giving a bit of glance onto where we want to take it. So I came up with "Firefox OS: Reasons, Status & Plans" as the title (my slides are behind the link).

Image No. 23157 Image No. 23158

The audience was supposed to be about 50 people, I guess 30-35 really showed up (the pictures, taken "in style" with Firefox OS on my Peak, only show one part of the room), but those were an awesome bunch. They were really into the topic, asked interesting questions, and the talks following me were showing that we really had capable developers in the room, from those that do JS in their free time to those who earn their bread and butter by doing apps.
We also had two Mozillians, both of whom I had not met in person before, even though I spent a lot of time in this city in the last decade!

As the event was going on, I was often the voice in the room who would have answers from the Mozilla side or could explain our point of view and initiatives - and in quite a few cases, I could loop back to something I said in my keynote. It was really great to see how apparently I had touched exactly on the right things there and gave everything else a good base to build on. Interestingly, there was quite a bit of interest in the DeviceStorage API, probably because accessing local files is something people can refer better to than storing items in-app. I was thankful someone did a talk on our Marketplace and in-app payment API/Services as that's one area I'm actually weak in, but it also sparked quite a bit of interest. The permission model did also get a few questions.

We surely had people with Firefox OS app experience in there, but I think more of those people might pick up web app development, esp. if more similar events come around, which would be cool. And maybe someone should tell them how to do simple apps without larger libraries or frameworks, and explain app manifests in more detail. I hope they will organize more of those and the chance for that will come along!

December 06, 2013 04:16 AM

December 04, 2013

Joshua Cranmer

Why email is hard, part 4: Email addresses

This post is part 4 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. This post discusses the problems with email addresses.

You might be surprised that I find email addresses difficult enough to warrant a post discussing only this single topic. However, this is a surprisingly complex topic, and one which is made much harder by the presence of a very large number of people purporting to know the answer who then proceed to do the wrong thing [1]. To understand why email addresses are complicated, and why people do the wrong thing, I pose the following challenge: write a regular expression that matches all valid email addresses and only valid email addresses. Go ahead, stop reading, and play with it for a few minutes, and then you can compare your answer with the correct answer.

 

 

 

Done yet? So, if you came up with a regular expression, you got the wrong answer. But that's because it's a trick question: I never defined what I meant by a valid email address. Still, if you're hoping for partial credit, you may be able to get some by correctly matching one of the purported definitions I give below.

The most obvious definition meant by "valid email address" is text that matches the addr-spec production of RFC 822. No regular expression can match this definition, though—and I am aware of the enormous regular expression that is often purported to solve this problem. This is because comments can be nested, which means you would need to solve the "balanced parentheses" language, which is easily provable to be non-regular [2].
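To see why nesting is the sticking point, consider what it takes just to check that the comments in an address are well formed. The sketch below is illustrative only: it tracks parentheses and backslash escapes and deliberately ignores quoted strings and the rest of the grammar. The point is the depth counter, which is unbounded state of the kind a true regular expression cannot keep.

```python
def comments_balanced(text):
    """Check that '(' and ')' nest correctly, honoring backslash escapes."""
    depth = 0
    escaped = False
    for char in text:
        if escaped:
            escaped = False      # the escaped character is taken literally
        elif char == "\\":
            escaped = True
        elif char == "(":
            depth += 1           # entering a (possibly nested) comment
        elif char == ")":
            depth -= 1
            if depth < 0:
                return False     # closed a comment that was never opened
    return depth == 0


nested_ok = comments_balanced("user(a (nested) comment)@test.invalid")
unclosed = comments_balanced("user(a (nested comment)@test.invalid")
```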

Matching the addr-spec production, though, is the wrong thing to do: the production dictates the possible syntax forms an address may have, when you arguably want a more semantic interpretation. As a case in point, the two email addresses example@test.invalid and example @ test . invalid are both meant to refer to the same thing. When you ignore the actual full grammar of an email address and instead read the prose, particularly of RFC 5322 instead of RFC 822, you'll realize that matching comments and whitespace is entirely the wrong thing to do in the email address.

Here, though, we run into another problem. Email addresses are split into local-parts and the domain, the text before and after the @ character; the format of the local-part is basically either a quoted string (to escape otherwise illegal characters in a local-part), or an unquoted "dot-atom" production. The quoting is meant to be semantically invisible: "example"@test.invalid is the same email address as example@test.invalid. Normally, I would say that the use of quoted strings is an artifact of the encoding form, but given the strong appetite for aggressively "correct" email validators that attempt to blindly match the specification, it seems to me that it is better to keep the local-parts quoted if they need to be quoted. The dot-atom production matches a sequence of atoms (spans of text excluding several special characters like [ or .) separated by . characters, with no intervening spaces or comments allowed anywhere.

RFC 5322 only specifies how to unfold the syntax into a semantic value, and it does not explain how to semantically interpret the values of an email address. For that, we must turn to SMTP's definition in RFC 5321, whose semantic definition clearly imparts requirements on the format of an email address not found in RFC 5322. On domains, RFC 5321 explains that the domain is either a standard domain name [3], or it is a domain literal which is either an IPv4 or an IPv6 address. Examples of the latter two forms are test@[127.0.0.1] and test@[IPv6:::1]. But when it comes to the local-parts, RFC 5321 decides to just give up and admit no interpretation except at the final host, advising only that servers should avoid local-parts that need to be quoted. In the context of email specification, this kind of recommendation is effectively a requirement to not use such email addresses, and (by implication) most client code can avoid supporting these email addresses [4].

The prospect of internationalized domain names and email addresses throws a massive wrench into the state of affairs, however. I've talked at length in part 2 about the problems here; the lack of a definitive decision on Unicode normalization means that the future here is extremely uncertain, although RFC 6530 does implicitly advise that servers should accept that some (but not all) clients are going to do NFC or NFKC normalization on email addresses.

At this point, it should be clear that asking for a regular expression to validate email addresses is really asking the wrong question. I did it at the beginning of this post because that is how the question tends to be phrased. The real question that people should be asking is "what characters are valid in an email address?" (and more specifically, the left-hand side of the email address, since the right-hand side is obviously a domain name). The answer is simple: among the ASCII printable characters (Unicode is more difficult), all the characters but those in the following string: " \"\\[]();,@". Indeed, viewing an email address like this is exactly how HTML 5 specifies it in its definition of a format for <input type="email">.
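That character-based view is short enough to sketch without a regular expression. The function name is made up for illustration, and the check covers only the ASCII case described above: it does not look at dot placement, length limits, or Unicode.

```python
# The excluded characters, taken from the string in the paragraph above:
# space, double quote, backslash, square brackets, parentheses, ';', ',' and '@'.
FORBIDDEN = set(' "\\[]();,@')


def local_part_chars_ok(local_part):
    """True if every character is printable ASCII and not in FORBIDDEN."""
    if not local_part:
        return False
    return all(
        " " < char <= "~" and char not in FORBIDDEN
        for char in local_part
    )


plain = local_part_chars_ok("first.last+tag")
spaced = local_part_chars_ok("not valid")
```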

Another, much easier, more obvious, and simpler way to validate an email address relies on zero regular expressions and zero references to specifications. Just send an email to the purported address and ask the user to click on a unique link to complete registration. After all, the most common reason to request an email address is to be able to send messages to that email address, so if mail cannot be sent to it, the email address should be considered invalid, even if it is syntactically valid.

Unfortunately, people persist in trying to write buggy email validators. Some are too simple and ignore valid characters (or valid top-level domain names!). Others are so focused on trying to match the RFC addr-spec syntax that, while they will happily accept most or all addr-spec forms, they also let through email addresses which are very likely to wreak havoc if you pass them to another system to send email; which can cause various forms of SQL injection, XSS injection, or even shell injection attacks; and which are likely to confuse tools as to what the email address actually is. This can be ameliorated with complicated normalization functions for email addresses, but none of the email validators I've looked at actually do this (which, again, goes to show that they're missing the point).

Which brings me to a second quiz question: are email addresses case-insensitive? If you answered no, well, you're wrong. If you answered yes, you're also wrong. The local-part, as RFC 5321 emphasizes, is not to be interpreted by anyone but the final destination MTA server. A consequence is that it does not specify if they are case-sensitive or case-insensitive, which means that general code should not assume that it is case-insensitive. Domains, of course, are case-insensitive, unless you're talking about internationalized domain names [5]. In practice, though, RFC 5321 admits that servers should make the names case-insensitive. For everyone else who uses email addresses, the effective result of this admission is that email addresses should be stored in their original case but matched case-insensitively (effectively, code should be case-preserving).
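Case-preserving storage with case-insensitive matching can be sketched as below. The class is hypothetical, and it assumes plain ASCII addresses: str.lower() is not an answer to the internationalized case discussed earlier.

```python
class AddressBook:
    """Stores addresses as typed, but looks them up case-insensitively."""

    def __init__(self):
        self._by_key = {}

    def add(self, address):
        # Keep the first spelling seen; the lowercased form is only the key.
        self._by_key.setdefault(address.lower(), address)

    def find(self, address):
        return self._by_key.get(address.lower())


book = AddressBook()
book.add("John.Doe@Example.COM")
found = book.find("john.doe@example.com")
```

The lookup succeeds regardless of case, but what comes back is the address exactly as the user originally wrote it.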

Hopefully this gives you a sense of why email addresses are frustrating and much more complicated than they first appear. There are historical artifacts of email addresses I've decided not to address (the roles of ! and % in addresses), but since they only matter to some SMTP implementations, I'll discuss them when I pick up SMTP in a later part (if I ever discuss them). I've avoided discussing some major issues with the specification here, because they are much better handled as part of the issues with email headers in general.

Oh, and if you were expecting regular expression answers to the challenge I gave at the beginning of the post, here are the answers I threw together for my various definitions of "valid email address." I didn't test or even try to compile any of these regular expressions (as you should have gathered, regular expressions are not what you should be using), so caveat emptor.

RFC 822 addr-spec
Impossible. Don't even try.
RFC 5322 non-obsolete addr-spec production
([^\x00-\x20()\[\]:;@\\,.]+(\.[^\x00-\x20()\[\]:;@\\,.]+)*|"(\\.|[^\\"])*")@([^\x00-\x20()\[\]:;@\\,.]+(\.[^\x00-\x20()\[\]:;@\\,.]+)*|\[(\\.|[^\\\]])*\])
RFC 5322, unquoted email address
.*@([^\x00-\x20()\[\]:;@\\,.]+(\.[^\x00-\x20()\[\]:;@\\,.]+)*|\[(\\.|[^\\\]])*\])
HTML 5's interpretation
[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*
Effective EAI-aware version
[^\x00-\x20\x80-\x9f()\[\]:;@\\,]+@[^\x00-\x20\x80-\x9f()\[\]:;@\\,]+, with the caveats that a dot does not begin or end the local-part, nor do two dots appear consecutively, the local part is in NFC or NFKC form, and the domain is a valid domain name.
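If you want to poke at one of these, the HTML 5 pattern listed above is the easiest to try. The sketch below wraps it in Python's re.fullmatch, which stands in for anchoring the pattern at both ends; the helper name is made up, and the same caveat emptor applies.

```python
import re

# The HTML 5 pattern from the list above, compiled unchanged.
HTML5_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def looks_like_html5_email(address):
    """True if the whole string matches the HTML 5 pattern."""
    return HTML5_EMAIL.fullmatch(address) is not None


simple = looks_like_html5_email("example@test.invalid")
spaced = looks_like_html5_email("example @ test . invalid")
quoted = looks_like_html5_email('"example"@test.invalid')
```

Note that this definition rejects both the whitespace form and the quoted local-part, even though the RFC grammar allows them.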

[1] If you're trying to find guides on valid email addresses, a useful way to eliminate incorrect answers is to apply the following litmus tests. First, if the guide mentions an RFC, but does not mention RFC 5321 (or RFC 2821, in a pinch), you can generally ignore it. If the email address test (not) @ example.com would be valid, then the author has clearly not carefully read and understood the specifications. If the guide mentions RFC 5321, RFC 5322, RFC 6530, and IDN, then the author clearly has taken the time to actually understand the subject matter and their opinion can be trusted.
[2] I'm using "regular" here in the sense of theoretical regular languages. Perl-compatible regular expressions can match non-regular languages (because of backreferences), but even backreferences can't solve the problem here. It appears that newer versions support a construct which can match balanced parentheses, but I'm going to discount that because by the time you're going to start using that feature, you have at least two problems.
[3] Specifically, if you want to get really technical, the domain name is going to be routed via MX records in DNS.
[4] RFC 5321 is the specification for SMTP, and, therefore, it is only truly binding for things that talk SMTP; likewise, RFC 5322 is only binding on people who speak email headers. When I say that systems can pretend that email addresses with domain literals or quoted local-parts don't exist, I'm excluding mail clients and mail servers. If you're writing a website and you need an email address, there is no need to support email addresses which don't exist on the open, public Internet.
[5] My usual approach to seeing internationalization at this point (if you haven't gathered from the lengthy second post of this series) is to assume that the specifications assume magic where case insensitivity is desired.

December 04, 2013 11:24 PM

December 01, 2013

Mike Conley

Australis Performance Post-mortem Part 2: ts_paint and t_paint

Continued from Part 1.

So we’d just gotten Talos data in, and it looked like we were regressing on ts_paint and tpaint right across the board.

Speaking just for myself, up until this point, Talos had been a black box. I vaguely knew that Talos tests were run, and I vaguely understood that they measured certain performance things, but I didn’t know what those things were nor where to look at the results.

Luckily, I was working with some pretty seasoned veterans. MattN whipped up an amazing spreadsheet that dynamically pulled in the Talos test data for each platform so that we could get a high-level view of all of the regressions. This would turn out to be hugely useful.

Here’s a link to a read-only version of that spreadsheet in all of its majesty. Or, if that link is somehow broken in the future, here’s a screenshot:

Numbers!

So now we had a high-level view of the regressions. The next step was determining what to do about it.

I should also mention that these regressions, at this point, were the only big things blocking us from landing on mozilla-central. So naturally, a good chunk of us focused our attention on this performance stuff. We quickly organized a daily standup meeting time where we could all get together and give reports on what we were doing to grind down the performance issues, and what results we were getting from our efforts.

That chunk of the team, however, didn’t initially include me. I believe Gijs, Unfocused, mikedeboer and I kept hacking on customization and widget bugs while jaws and MattN dug at performance. As time went on though, a few more of us eventually joined MattN and jaws in their performance work.

The good news in all of this is that ts_paint and tpaint are related – both measure the time it takes from issuing the command to open a browser window to actually painting it on the screen. ts_paint is concerned with the very first Firefox window from a cold-start, and tpaint is concerned with new windows from an already-running Firefox. It was quite possible that there was some overlap in what was making us slow on these two tests, which was somewhat encouraging.

The following bugs are just a subset of the bugs we filed and landed to improve our ts_paint and tpaint performance. Looking back, I’m pretty sure these are the ones that made the most difference, but the full list can be found as dependencies of these bugs.

Bug 890105 - TabsInTitleBar._update should group measurements and style changes to avoid unnecessary reflows

After a bit of examination, MattN dealt the first blow when he filed Bug 890105. The cross-platform code that figures out how best to place the tabs in the titlebar (while taking into account things like the system font size) is run before the window first paints, and it was being inefficient.

By inefficient, I mean it was causing more reflows than necessary. Here’s some information on reflows. The MDN page states that the article is obsolete, but the page still does a pretty good job of explaining what a reflow is.

The code would take a measurement of something on the page (causing a reflow), update that thing’s size (causing a reflow), and then repeat the process. MattN found we could cluster the measurements into a single pass, and then do all of the changes one after another. This reduced the number of reflows, which helped speed up both ts_paint and tpaint.
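Here is a toy model of that change. It is not the TabsInTitleBar code: FakeLayout is an invented stand-in that assumes a pending style change gets flushed (one reflow) the next time something is measured, which is enough to show why the ordering matters.

```python
class FakeLayout:
    """Toy layout engine: measuring after a style change forces a reflow."""

    def __init__(self, widths):
        self.widths = dict(widths)
        self.dirty = False
        self.reflows = 0

    def measure(self, name):
        if self.dirty:
            self.reflows += 1   # flush the pending change before reading
            self.dirty = False
        return self.widths[name]

    def set_width(self, name, width):
        self.widths[name] = width
        self.dirty = True


def interleaved(layout, names):
    # Measure, change, measure, change: each read flushes the previous write.
    for name in names:
        layout.set_width(name, layout.measure(name) + 10)


def batched(layout, names):
    # Take all measurements in one pass, then apply all changes.
    measured = {name: layout.measure(name) for name in names}
    for name, width in measured.items():
        layout.set_width(name, width + 10)


names = ["tab-a", "tab-b", "tab-c"]
slow = FakeLayout({name: 100 for name in names})
fast = FakeLayout({name: 100 for name in names})
interleaved(slow, names)
batched(fast, names)
```

Both versions end with the same widths, but only the interleaved one forces reflows along the way.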

And boom, we saw our first win for both ts_paint and tpaint!

Bug 892532 – Add an optional fast-path to CustomizableUI.isWidgetRemovable

jaws found the next big win using a home-brewed profiler. The home-brewed profiler simply counted the number of times we entered and exited various functions in the CustomizableUI code, and recorded the time it took from entering to exiting.

I can’t really recall why we didn’t use the SPS profiler at this point. We certainly knew about it, but something tells me that at this point, we were having a hard time getting useful data from it.

Anyhow, with the home-brew profiler, jaws determined that we had the opportunity to fast-path a section of our code. Basically, we had a function that takes the ID of a widget, looks for and retrieves the widget, and returns whether or not that widget can be removed from its current location. There were some places that called this function during window start-up, and those places already had the widget that was to be found. jaws figured we could fast-path the function by being able to pass the widget itself rather than the ID, and skip the look-up.
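The shape of that fast path looks roughly like this. It is a sketch of the idea only, not the CustomizableUI implementation: Widget, Registry and the lookup counter are invented for the example.

```python
class Widget:
    def __init__(self, widget_id, removable):
        self.widget_id = widget_id
        self.removable = removable


class Registry:
    def __init__(self, widgets):
        self._widgets = {widget.widget_id: widget for widget in widgets}
        self.lookups = 0

    def find(self, widget_id):
        self.lookups += 1   # stands in for the comparatively slow search
        return self._widgets[widget_id]

    def is_widget_removable(self, widget_or_id):
        # Fast path: the caller already has the widget, so skip the look-up.
        if isinstance(widget_or_id, Widget):
            widget = widget_or_id
        else:
            widget = self.find(widget_or_id)
        return widget.removable


back_button = Widget("back-button", removable=False)
registry = Registry([back_button])
by_id = registry.is_widget_removable("back-button")
by_widget = registry.is_widget_removable(back_button)
```

Callers that only have an ID keep working as before; callers that already hold the widget no longer pay for finding it again.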

Bug 891104 – Skip calling onOverflow during startup if there wasn’t any overflowed content before the toolbar is fully initialized

It was MattN’s turn again – this time, he found that the overflow toolbar code for the nav-bar (this is the stuff that handles putting widgets into the overflow panel if the window gets too small) was running the overflow handler as soon as the nav-bar was initted, regardless of whether anything was overflowed. This was causing a reflow because a measurement was taken on the overflowable toolbar to see if items needed to be moved into the overflow panel.

Originally, the automatic call of the overflow handler was to account for the case where the nav-bar is overflowed from the very beginning – but jaws made it smarter by attaching an overflow handler before the CSS attribute that made the toolbar overflowable was applied. That meant that the nav-bar would only call the overflow handler if it really needed to, as opposed to every time.

Bug 898126 – Cache client hit test values

Around this time, a few more people started to get involved in Australis performance work. Gijs and mstange got a bug filed to investigate if there was a way to make start-up faster on Windows XP and 7. Here’s some context from mstange in that bug in comment 9:

It turns out that Windows XP sends about 200 WM_NCHITTEST events per second when we open a new window. All these events have the same position – possibly the current mouse position. And all the ClientMarginHitTestPoint optimizations we’ve been playing with only make a difference because that function is called so often during the test – one invocation is unnoticeably quick, but it starts to add up if we call it so many times.

This patch makes sure that we only send one hittest event per second if the position doesn’t change, and returns a cached value otherwise.

After some fiddling about with cache invalidation times, the patch landed, and we saw a nice win on Windows XP and 7!
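The caching idea from mstange's comment can be sketched in a few lines. This is not the Windows widget code: CachedHitTest, the one-second default and the injectable clock are assumptions for illustration.

```python
import time


class CachedHitTest:
    """Reuses the last result while the position is unchanged and still fresh."""

    def __init__(self, hit_test, max_age=1.0, clock=time.monotonic):
        self._hit_test = hit_test
        self._max_age = max_age
        self._clock = clock
        self._cached = None   # (position, timestamp, result)

    def __call__(self, position):
        now = self._clock()
        if self._cached is not None:
            cached_position, cached_at, result = self._cached
            if cached_position == position and now - cached_at < self._max_age:
                return result
        result = self._hit_test(position)
        self._cached = (position, now, result)
        return result


calls = []


def slow_hit_test(position):
    calls.append(position)
    return "client"


fake_now = [0.0]   # fake clock so the example does not need to sleep
cached = CachedHitTest(slow_hit_test, clock=lambda: fake_now[0])
for _ in range(200):
    cached((10, 20))
calls_for_same_position = len(calls)
fake_now[0] = 1.5
cached((10, 20))   # cache entry is too old, so the real test runs again
cached((11, 20))   # position changed, so the real test runs again
```

Two hundred identical events cost one real hit test instead of two hundred.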

Bug 906075 – Only send toolbars through buildArea if they’re not in their default state

It was around now that I started to get involved with performance work. One of my first successful bugs was to only run a toolbar through CustomizableUI’s buildArea function if the toolbar was not starting in a default state. The buildArea function’s job is to populate a customizable area with only the things that the user has moved into the area, and remove the things that the user has taken out. That involves cycling through the nodes in the area to see if they belong, and that takes time. I wrote a patch that cached a “dirty” state on a toolbar to indicate that it’d been customized in the past, and if we didn’t see that value, we didn’t run the toolbar through the function. Easy as pie, and we saw a little win on both ts_paint and tpaint on all platforms.
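In outline, the “dirty” check works like the sketch below. The names are invented and build_area here is a trivial stand-in, not CustomizableUI’s buildArea; the example only shows how a cached flag lets the default case skip the work entirely.

```python
class Toolbar:
    def __init__(self, default_items):
        self.items = list(default_items)
        self.customized = False   # the cached "dirty" state


def build_area(toolbar, placements):
    # Stand-in for the expensive pass over every node in the area.
    toolbar.items = list(placements)


def restore_toolbar(toolbar, placements, stats):
    if not toolbar.customized:
        return   # fast path: the toolbar is still in its default state
    stats["builds"] += 1
    build_area(toolbar, placements)


stats = {"builds": 0}
default_bar = Toolbar(["back", "urlbar", "menu"])
custom_bar = Toolbar(["back", "urlbar", "menu"])
custom_bar.customized = True

restore_toolbar(default_bar, ["back", "urlbar", "menu"], stats)
restore_toolbar(custom_bar, ["urlbar", "menu"], stats)
```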

Bug 905695 – Skip checking for tab overflows if there is only one tab open

This was another case where we had an unnecessary reflow during start-up. And, like bug 891104, it involved an overflow event handler running when it really didn’t need to. jaws writes:

If only one tab is opened and we show the left/right arrows, we are actually removing quite a bit of space that could have been used to show the tab. Scrolling the tabbox in this state is also quite useless, since all the user can do is scroll to see the other parts of the *only* tab.

If we make this change, we can skip a synchronous reflow for new windows that only have one tab.

Which means we could skip a reflow for all new windows. Are you starting to notice a pattern? Sections of our code had been designed to operate the same way, regardless of whether or not they were in the default, common case. We were finding ways of detecting the default case, and fast-pathing it.

Chalk up another win!

Bug 907787 – Australis: toolbar overflow button should be hidden by default

Yet another example where we could fast-path the default case. The overflow button in the nav-bar is only supposed to be displayed if there are too many items in the nav-bar, resulting in some getting put into the overflow panel, which anchors on the overflow button.

If nothing is being overflowed and the panel is empty, the button should not be displayed.

We were, however, displaying the button by default, and then hiding it when we determined that nothing was overflowed. Bug 907787 inverted that logic, and hid the button by default, and only showed it when things got overflowed (which was not the default case).

We were getting really close to performance parity with mozilla-central…

Bug 908326 – default the navbar to overflowable to avoid needless reflowing

Once again, an example of us not greasing the default-path. Our overflowable toolbar code applies an overflowable attribute to the nav-bar in order to apply some CSS styles to give the toolbar its overflowing properties. Adding that attribute dynamically means a reflow.

Instead, we just added the attribute to the node’s definition in browser.xul, and dropped that unnecessary reflow like a hot brick.

So how far had we come?

Let’s take a look at the graphs, shall we? Remember, in these graphs, the red points represent UX, and the green represent mozilla-central. Up is bad, and down is good. Our goal was to sink the red dots down into the noise of the green dots, which would give us performance parity.

ts_paint

Windows XP – ts_paint improvements

Ubuntu – ts_paint improvements

OSX 10.6 ts_paint improvements

You might be wondering what that big jump is for ts_paint for OSX 10.6 at the end of the graph. This thread explains.

tpaint

Windows XP – tpaint improvements

Ubuntu – tpaint improvements

OSX 10.6 tpaint improvements

Looking good.

The big lessons

I think the big lesson here is to identify the common, default case, and optimize it as best you can. By definition, this is the path that’s going to be hit the most, so you can special-case it, and build in fast paths for it. Your users will thank you.

Close the feedback loop as much as you can. To test our theories, we’d push our patches to try and use compare-talos to compare our tpaint and ts_paint numbers to baseline pushes to see if we were making improvements. This requires several hours for the try builds to complete. This is super slow. Release Engineering was awesome and lent us some Windows XP talos slaves for us to experiment on, and that helped us close the feedback loop a lot. Don’t be afraid to ask Release Engineering for talos slaves.

Also note that while it’s easy for me to rattle off bug numbers and explain where we were being slow, all of that investigation and progress occurred over several months. Performance work can be really slow. The bottleneck is not making the slow code faster – the bottleneck is identifying where the slow code is. Profiling is the key here. If you’re not using some kind of profiler while doing performance work, you’re seriously impeding yourself. If you don’t have a profiler, build a simple one. If you don’t know how to build a simple one, find someone who can.
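As a rough illustration of what "build a simple one" could mean, here is a minimal sketch of a hand-rolled timing profiler in Python. It is not how the SPS profiler works (that one is built into Gecko), and the names `SimpleProfiler`, `section`, `build_toolbar` and `paint` are made up for this example. It only accumulates wall-clock time per labelled section, which is often enough to see where the time is going.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SimpleProfiler:
    """Accumulates wall-clock time per labelled section (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)  # label -> total seconds
        self.counts = defaultdict(int)    # label -> number of times entered

    @contextmanager
    def section(self, label):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the wrapped code raises.
            self.totals[label] += time.perf_counter() - start
            self.counts[label] += 1

    def report(self):
        # Slowest sections first.
        return sorted(self.totals.items(), key=lambda item: item[1], reverse=True)


profiler = SimpleProfiler()
for _ in range(3):
    with profiler.section("build_toolbar"):
        sum(i * i for i in range(10_000))  # stand-in for real work
with profiler.section("paint"):
    time.sleep(0.01)                       # stand-in for real work

for label, seconds in profiler.report():
    print(f"{label}: {seconds * 1000:.2f} ms over {profiler.counts[label]} call(s)")
```

Wrapping suspected hot spots in `section(...)` blocks and comparing the totals between two builds gives a much tighter feedback loop than waiting hours for a full test run.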

I mentioned Gecko’s built-in SPS profiler a few paragraphs back. The SPS profiler was instrumental (pun intended) in getting our performance back up to snuff. We also built a number of tools alongside the SPS profiler to help us in our analyses.

Read up about those tools we built in Part 3…

December 01, 2013 08:24 PM

November 21, 2013

Mike Conley

Australis Performance Post-mortem Part 1: Where We Started

Getting to the merge

Last Monday, November 18th, Australis merged into our Nightly release channel, meaning lots of people are getting to try it and give us feedback. It’s been an exciting week, and we’re all very pleased with the response so far!

Up until then, if you wanted to try Australis, you had to use the Nightlies from the UX branch. If you followed along on the UX branch, you’ll know that the tabs and the customization work have been in a pretty steady state for the last few months.

So what was the hold up? Why did it take so long to get to the merge?

Gather round folks, I have a story to tell.

Some terminology

I’m going to be batting around a few terms here, and some people will understand them right away, and some people won’t, so I’ll just spell them out here, in no particular order:

Australis
If at this point you’re still not sure what I mean by Australis, you might want to check out this blog post and the accompanying video.
mozilla-central
mozilla-central, in this instance, refers to code that did not have the Australis changes in it. In the grand scheme of things, mozilla-central was where non-Australis code went, and then we’d merge those changes into the UX branch.
UX branch
The UX branch was where we were storing all of the Australis code.
Talos
Talos is a series of tests that we can run against a build of Firefox to measure the performance of different things – for example, how long it takes for a window to be opened. As of this writing, Talos tests for Desktop Firefox are run on Ubuntu Linux 12.04, OS X (10.6, 10.7 and 10.8), and Windows (XP, 7 and 8).

Where we started from

Let’s rewind a bunch of months. Let’s go to about early June, 2013. At this time, the curvy tab work was essentially finished on Windows, and had been ported to OS X and Linux. The customization code was still being hacked on, but we felt like we were in a pretty decent place – the team felt like we were ready to merge into mozilla-central to get some real user feedback and testing.

The problem was that up until that point, we hadn’t been running the Talos tests on the UX Branch, which means we didn’t really have a good idea about how we were performing in comparison to mozilla-central.

And then we turned the Talos tests on. Data started to flow in, and it wasn’t happy data. In particular, we were regressing pretty badly on two tests: ts_paint and tpaint.

ts_paint
this test measures how long it takes for Firefox to paint the first window on startup.
tpaint
this test measures how long it takes for Firefox to paint a newly opened window from a Firefox that is already running

Before I show you this data, I should clear some things up:  as mentioned above, we run these Talos tests on a bunch of operating systems, and a variety of operating system versions. I don’t want to bog this post down with too many charts, so I’m going to extract a chart for each operating system, and forgo breaking it down by operating system version. Suffice it to say that the regressions were pretty consistent from version to version.

Also, in each of these graphs, green represents mozilla-central, and red represents the UX branch. Up is bad (slower). Down is good (faster).

Anyhow, here’s what we saw:

ts_paint

Windows XP – ts_paint regression

Ubuntu 12.04 – ts_paint regression

OSX 10.6 – ts_paint regression

tpaint

Windows XP – tpaint regression

Ubuntu 12.04 – tpaint regression

OSX 10.6 – tpaint regression

Ouch

The team has been working like crazy to make Firefox look and feel faster. Hitting a regression like this blows.

It’s also flat out unacceptable to have a regression like this unless there’s a really really good reason for it.

So we had to investigate. What was making us slow? What had we done wrong?

Find out in Part 2.

November 21, 2013 06:13 PM

November 20, 2013

Joshua Cranmer

Why email is hard, part 3: MIME

This post is part 3 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. This post discusses MIME, the mechanism by which email evolves beyond plain text.

MIME, which stands for Multipurpose Internet Mail Extensions, is primarily dictated by a set of 5 RFCs: RFC 2045, RFC 2046, RFC 2047, RFC 2048, and RFC 2049, although RFC 2048 (which governs registration procedures for new MIME types) was updated with newer versions. RFC 2045 covers the format of related headers, as well as the format of the encodings used to convert 8-bit data into 7-bit for transmission. RFC 2046 describes the basic set of MIME types, most importantly the format of multipart/ types. RFC 2047 was discussed in my part 2 of this series, as it discusses encoding internationalized data in headers. RFC 2049 describes a set of guidelines for how to be conformant when processing MIME; as you might imagine, these are woefully inadequate for modern processing anyways. In practice, it is only the first three documents that matter for building an email client.

There are two main contributions of MIME, which actually makes it a bit hard to know what is meant when people refer to MIME in the abstract. The first contribution, which is of interest mostly to email, is the development of a tree-based representation of email which allows for the inclusion of non-textual parts in messages. This tree is ultimately how attachments and other features are incorporated. The other contribution is the development of a registry of MIME types for different types of file contents. MIME types have spread far beyond just the email infrastructure: if you want to describe what kind of file a binary blob is, you can refer to it by either a magic header sequence, a file extension, or a MIME type. Searching for terms like "MIME libraries" will sometimes turn up libraries that actually handle the so-called MIME sniffing process (guessing a MIME type from a file extension or the contents of a file).

MIME types are decomposable into two parts, a media type and a subtype. The type text/plain has a media type of text and a subtype of plain, for example. IANA maintains an official repository of MIME types. There are very few media types, and I would argue that there ought to be fewer. In practice, degradation of unknown MIME types means that there are essentially three "fundamental" types: text/plain (which represents plain, unformatted text and to which unknown text/* types degrade), multipart/mixed (the "default" version of multipart messages; more on this later), and application/octet-stream (which represents unknown, arbitrary binary data). I can understand the separation of the message media type for things which generally follow the basic format of headers+body akin to message/rfc822, although the presence of types like message/partial that don't follow the headers+body format and the requirement to downgrade to application/octet-stream mars usability here. The distinction between image, audio, video and application is petty when you consider that in practice, the distinction isn't going to be able to make clients give better recommendations for how to handle these kinds of content (which really means deciding if it can be displayed inline or if it needs to be handed off to an external client).
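The degradation rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_mime_type` and `degrade` and the `KNOWN` set are hypothetical, and the input is assumed to be a bare type with no parameters such as `charset`.

```python
def split_mime_type(mime_type):
    """Split 'text/plain' into ('text', 'plain'); MIME types are case-insensitive."""
    media_type, _, subtype = mime_type.strip().lower().partition("/")
    return media_type, subtype


def degrade(mime_type, known_types):
    """Return the type itself if handled, else the 'fundamental' type it degrades to."""
    media_type, subtype = split_mime_type(mime_type)
    normalized = f"{media_type}/{subtype}"
    if normalized in known_types:
        return normalized
    if media_type == "text":
        return "text/plain"
    if media_type == "multipart":
        return "multipart/mixed"
    # Everything else, including unknown message/* types, is opaque binary data.
    return "application/octet-stream"


# Hypothetical set of types a client knows how to handle.
KNOWN = {"text/plain", "text/html", "multipart/mixed", "multipart/alternative", "image/png"}

for example in ("text/x-markdown", "multipart/parallel", "message/partial", "Image/PNG"):
    print(example, "->", degrade(example, KNOWN))
```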

Is there a better way to label content types than MIME types? Probably not. X.400 (remember that from my first post?) uses OIDs, in line with the rest of the OSI model, and my limited experience with other systems that use these OIDs is that they are obtuse, effectively opaque identifiers with no inherent semantic meaning. People use file extensions in practice to distinguish between different file types, but not all content types are stored in files (such as multipart/mixed), and MIME types provide finer granularity than guessing the type from the start of a file. My only complaints about MIME types are petty and marginal, not about the idea itself.

No, the part of MIME that I have serious complaints with is the MIME tree structure. This allows you to represent emails in arbitrarily complex structures… and onto which the standard view of email as a body with associated attachments is poorly mapped. The heart of this structure is the multipart media type, for which the most important subtypes are mixed, alternative, related, signed, and encrypted. The last two types are meant for cryptographic security definitions [1], and I won't cover them further here. All multipart types have a format where the body consists of parts (each with their own headers) separated by a boundary string. There is space before the first part and after the last part which consists of semantically-meaningless text, sometimes containing a message like "This is a MIME message." meant to be displayed to the now practically-non-existent crowd of people who use clients that don't support MIME.
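To make the boundary layout concrete, here is a small sketch using the parser from Python's standard `email` package (this is not libmime or JSMime). The message itself, the addresses, and the boundary string `XYZ` are invented for the example. The parser exposes the text before the first part as `preamble` and the text after the closing boundary as `epilogue`.

```python
from email import message_from_string

# A tiny hand-written multipart message; everything in it is made up.
raw = "\n".join([
    "From: alice@example.com",
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XYZ"',
    "",
    "This is a MIME message.",                 # text before the first part
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "Hello!",
    "--XYZ",
    "Content-Type: application/octet-stream",
    "Content-Transfer-Encoding: base64",
    "",
    "AAEC",
    "--XYZ--",                                 # closing boundary
    "Trailing text.",                          # text after the last part
    "",
])

msg = message_from_string(raw)
parts = msg.get_payload()                      # list of sub-messages for multipart types

print(msg.is_multipart(), len(parts))
print(repr(msg.preamble))
print(repr(msg.epilogue))
for part in parts:
    # decode=True undoes the Content-Transfer-Encoding and returns bytes.
    print(part.get_content_type(), part.get_payload(decode=True))
```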

The simplest type is multipart/mixed, which means that there is no inherent structure to the parts. Attachments to a message use this type: the type of the message is set to multipart/mixed, a body is added as (typically) the first part, and attachments are added as parts with types like image/png (for PNG images). It is also not uncommon to see multipart/mixed types that have a multipart/mixed part within them: some mailing list software attaches footers to messages by wrapping the original message inside a single part of a multipart/mixed message and then appending a text/plain footer.
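As a sketch of the body-plus-attachment layout just described, the following uses `EmailMessage` from Python's standard library. The addresses, subject, filename, and the placeholder bytes standing in for a PNG are all invented for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Report"

# The body is set first, so it becomes the first part once the message is multipart.
msg.set_content("The report is attached.")

# Placeholder bytes, not a real image; only the declared type matters here.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
msg.add_attachment(fake_png, maintype="image", subtype="png", filename="chart.png")

print(msg.get_content_type())
for part in msg.iter_parts():
    print(part.get_content_type(), part.get_content_disposition(), part.get_filename())
```

Adding the attachment converts the message into a multipart/mixed container, with the original text/plain body moved into the first part.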

multipart/related is intended to refer to an HTML page [2] where all of its external resources are included as additional parts. Linking all of these parts together is done by use of a cid: URL scheme. Generating and displaying these messages requires tracking down all URL references in an HTML page, which of course means that email clients that want full support for this feature also need robust HTML (and CSS!) knowledge, and future-proofing is hard. Since the primary body of this type appears first in the tree, it also makes handling this datatype in a streaming manner difficult, since the values to which URLs will be rewritten are not known until after the entire body is parsed.

In contrast, multipart/alternative is used to satisfy the plain-text-or-HTML debate by allowing one to provide a message that is either plain text or HTML [3]. It is also the third-biggest failure of the entire email infrastructure, in my opinion. The natural expectation would be that the parts should be listed in decreasing order of preference, so that streaming clients can reject all the data after it finds the part it will display. Instead, the parts are listed in increasing order of preference, which was done in order to make the plain text part be first in the list, which helps increase readability of MIME messages for those reading email without MIME-aware clients. As a result, streaming clients are unable to progressively display the contents of multipart/alternative until the entire message has been read.
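The consequence of that ordering can be sketched as follows, again with Python's standard `email` package. The helper `choose_alternative` and the `displayable` set are hypothetical. Because the preferred part comes last, the loop cannot stop at the first match and has to look at every part before it knows which one to show.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Plain text version")                        # least preferred, listed first
msg.add_alternative("<p>HTML version</p>", subtype="html")   # most preferred, listed last


def choose_alternative(message, displayable):
    """Pick the last part whose type the client can display (hypothetical helper)."""
    chosen = None
    for part in message.iter_parts():
        if part.get_content_type() in displayable:
            chosen = part  # keep going: a later part is preferred over an earlier one
    return chosen


html_capable = choose_alternative(msg, {"text/plain", "text/html"})
text_only = choose_alternative(msg, {"text/plain"})
print(msg.get_content_type())
print(html_capable.get_content_type(), text_only.get_content_type())
```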

Although multipart/alternative states that all parts must contain the same contents (to varying degrees of degradation), you shouldn't be surprised to learn that this is not exactly the case. There was a period in time when spam filterers looked at only the text/plain side of things, so spammers took to putting "innocuous" messages in the text/plain half and displaying the real spam in the text/html half [4] (this technique appears to have died off a long time ago, though). In another interesting case, I received a bug report with a message containing an image/jpeg and a text/html part within a multipart/alternative [5].

To be fair, the current concept of emails as a body with a set of attachments did not exist when MIME was originally specified. The definition of multipart/parallel plays into this a lot (it means what you think it does: show all of the parts in parallel… somehow). Reading between the lines of the specification also indicates a desire to create interactive emails (via application/postscript, of course). Given that email clients have trouble even displaying HTML properly [6], and the fact that interactivity has the potential to be a walking security hole, it is not hard to see why this functionality fell by the wayside.

The final major challenge that MIME solved was how to fit arbitrary data into a 7-bit format safe for transit. The two encoding schemes they came up with were quoted-printable (which retains most printable characters, but emits non-printable characters in a =XX format, where the Xs are hex characters), and base64 which reencodes every 3 bytes into 4 ASCII characters. Non-encoded data is separated into three categories: 7-bit (which uses only ASCII characters except NUL and bare CR or LF characters), 8-bit (which uses any character but NUL, bare CR, and bare LF), and binary (where everything is possible). A further limitation is placed on all encodings but binary: every line is at most 998 bytes long, not including the terminating CRLF.
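Both encodings are available in Python's standard library, which makes the trade-off easy to see. This is a small sketch; the sample string is arbitrary.

```python
import base64
import quopri

data = "naïve café".encode("utf-8")   # 12 bytes, two of the characters are non-ASCII

qp = quopri.encodestring(data)        # ASCII stays readable, other bytes become =XX
b64 = base64.b64encode(data)          # every 3 input bytes become 4 ASCII characters

print(qp)
print(b64)
print(len(data), len(qp), len(b64))

# Both encodings are lossless.
assert quopri.decodestring(qp) == data
assert base64.b64decode(b64) == data

# The 3-to-4 expansion of base64 in its smallest form.
print(len(base64.b64encode(b"abc")))  # 4
```

Quoted-printable is the better fit for mostly-ASCII text, since only the unusual bytes grow; base64 has a fixed expansion regardless of content.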

A side-effect of these requirements is that all attachments must be considered binary data, even if they are textual formats (like source code), as end-of-line autoconversion is now considered a major misfeature. To make matters even worse, body text for formats with text written in scripts that don't use spaces (such as Japanese or Chinese) can sometimes be prohibited from using 8-bit transfer format due to overly long lines: you can reach the end of a line in as few as 249 characters (UTF-8, non-BMP characters, although Chinese and Japanese typically take three bytes per character). So a single long paragraph can force a message to be entirely encoded in a format with 33% overhead. There have been suggestions for a binary-to-8-bit encoding in the past, but no standardization effort has been made for one [7].
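The arithmetic behind those numbers is short enough to check directly. In this sketch the constant and function names are made up, and the text is assumed to be UTF-8 with CRLF line endings.

```python
MAX_LINE_BYTES = 998  # per-line limit, not counting the terminating CRLF


def longest_line_bytes(text, charset="utf-8"):
    """Length in bytes of the longest CRLF-delimited line."""
    return max(len(line.encode(charset)) for line in text.split("\r\n"))


def fits_8bit(text):
    """True if no line is too long for the 8-bit transfer encoding."""
    return longest_line_bytes(text) <= MAX_LINE_BYTES


print(MAX_LINE_BYTES // 4)  # 249: non-BMP characters take 4 bytes each in UTF-8
print(MAX_LINE_BYTES // 3)  # 332: typical Chinese and Japanese characters take 3 bytes

print(fits_8bit("漢" * 332))          # 996 bytes on one line
print(fits_8bit("漢" * 333))          # 999 bytes on one line
print(fits_8bit("\U0001F600" * 250))  # 1000 bytes on one line

# base64 turns 3 bytes into 4 characters, hence roughly 33% overhead.
print(round((4 / 3 - 1) * 100))       # 33
```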

The binary encoding has none of these problems, but no one claims to support it. However, I suspect that violating maximum line length, or adding 8-bit characters to a quoted-printable part, are likely to make it through the mail system, in part because not doing so either increases your security vulnerabilities or requires more implementation effort. Sending lone CR or LF characters is probably fine so long as one is careful to assume that they may be treated as line breaks. Sending a NUL character I suspect could cause some issues due to lack of testing (but it also leaves room for security vulnerabilities to ignore it). In other words, binary-encoded messages probably already work to a large degree in the mail system. Which makes it extremely tempting (even for me) to ignore the specification requirements when composing messages; small wonder then that blatant violations of specifications are common.

This concludes my discussion of MIME. There are certainly many more complaints I have, but this should be sufficient to lay out why building a generic MIME-aware library by itself is hard, and why you do not want to write such a parser yourself. Too bad Thunderbird has at least two different ad-hoc parsers (not libmime or JSMime) that I can think of off the top of my head, both of which are wrong.

[1] I will be covering this in a later post, but the way that signed and encrypted data is represented in MIME actually makes it really easy to introduce flaws in cryptographic code (which, the last time I surveyed major email clients with support for cryptographic code, was done by all of them).
[2] Other types are of course possible in theory, but HTML is all anyone cares about in practice.
[3] There is also text/enriched, which was developed as a stopgap while HTML 3.2 was being developed. Its use in practice is exceedingly slim.
[4] This is one of the reasons I'm minded to make "prefer plain text" do degradation of natural HTML display instead of showing the plain text parts. Not that cleanly degrading HTML is easy.
[5] In the interests of full disclosure, the image/jpeg was actually a PNG image and the HTML claimed to be 7-bit UTF-8 but was actually 8-bit, and it contained a Unicode homograph attack.
[6] Of the major clients, Outlook uses Word's HTML rendering engine, which I recall once reading as being roughly equivalent to IE 5.5 in capability. Webmail is forced to do its own sanitization and sandboxing, and the output leaves something to be desired; Gmail is the worst offender here, stripping out all but inline style. Thunderbird and SeaMonkey are nearly alone in using a high-quality layout engine: you can even send a <video> in an email to Thunderbird and have it work properly. :-)
[7] There is yEnc. Its mere existence does contradict several claims (for example, that adding new transfer encodings is infeasible due to install base of software), but it was developed for a slightly different purpose. Some implementation details are hostile to MIME, and although it has been discussed to death on the relevant mailing list several times, no draft was ever made that would integrate it into MIME properly.

November 20, 2013 07:54 PM

November 18, 2013

Robert Kaiser

Bye, Bye, My Customizations - Hello New Ones?

Australis is landing on Nightly and therefore in my builds, and so it's time to review my customizations.
The illustrations below are from my pre-Australis state (though with my custom LCARStrek theme removed to make it easier to see what the customizations are). I still need to figure out the post-Australis state fully.

Image No. 23154

For one thing, I like having everything bookmark-like bundled in the bookmarks toolbar. That starts with the home button (that goes to my custom start page), the bookmarks menubutton, and then quick-access bookmark items like a menu with Bugzilla query live feeds and frequently visited sites. All those items, including the home and bookmarks buttons, have the same layout of an icon with text next to it and the same small height. I'm using the home button quite frequently to get to that site, same for the bookmark items, and I'm using the bookmarks list in the menubutton for retrieving sites I rarely visit (so they're often falling off the awesomebar) but want to go to every now and then (say, those sites where I buy karaoke songs every few months - I might not remember the exact name but I know where to find them in my bookmarks hierarchy). Australis removes the text from the buttons and actually makes them larger in height (which would make the whole toolbar higher), so I cannot place them in that place any more without breaking design. I might remove the home button completely and replace it with just a bookmark item pointing to the page, which should do the same job nicely (though the home button might not actually increase height, I need to test that a bit more). For the bookmarks button, I'm not sure. I don't feel like I want it at the right of any bar where it's far away from the bookmarks, and I don't want the size of the bookmarks bar to grow. Maybe I'll also hide it away completely, possibly place it in the "Hamburger" menu, as I don't use the starring feature too much anyhow, and move my whole bookmarks hierarchy to a folder on the bookmarks bar. That might seem strange in the logic of bookmark hierarchies, but it should do the job.

Image No. 23155

The throbber button at the right of my tab bar and the "Nightly" button on its left need to go as well, so I'll get rid of the former (even though it's an old friend from days gone by - but it's not even available for customization any more), and the latter is being morphed into the "Hamburger" button at the right of the navigation bar, its default location. While we're at the right of the navigation bar, I might think about removing the search field, actually. I find it annoying that it reminds me for hours of the last search I did because its content never goes away, and I do my Google searches from the location bar anyhow - and other searches by first going to the respective site and searching from there. Maybe the old "Search Tabs" idea comes along one time and gives me nicer ways to use alternate search engines.

Image No. 23156

And then, there's add-ons. I have/had the Sync, Lightbeam and Diaspora EasyShare buttons as well as the MemChaser display in my add-ons bar. The Sync button only shows whether Sync is in progress and lets me trigger a sync intentionally; I will probably remove it, as I can easily live without it. Diaspora EasyShare might move to the bookmarks or even navigation bar, I'm not using it a lot but it actually looks very handy and I do use Diaspora quite a bit as my only social network. MemChaser will need to go away, unfortunately. It's nice to look at those numbers every now and then, esp. when problems might occur, but there's just no place in Australis where I can place a large and constantly updating thing like that without constant distraction. The bottom border of the screen/window was perfect for that. But OK, I'll probably end up removing or at least disabling the add-on completely. Lightbeam will either go away or move to the "Hamburger" menu - given how rarely I use it, it has no place at the top of my window in primary UI. While I'm at it, it might make sense to go through my add-ons list and do some cleanup there in general.
And once I get a feeling if I can do my work with the new "Hamburger" menu or not, I will see if I'll need to turn on the menu bar again or leave it hidden, like I've had it for a while now - but first I want to see how well the new stuff works for me so I can really evaluate it.
Unfortunately we don't have the capability in desktop FHR to track add-on behavior changes or uninstalls that might come with Australis (mobile has a new format that can track that), otherwise it would be interesting to see if it's just me or if others are disabling or uninstalling add-ons as well with that change.

What changes does this mean for you? Are you able to cope (like me), excited about them, or distracted by them?

November 18, 2013 07:19 PM

November 17, 2013

Rumbling Edge - Thunderbird

2013-11-16 Calendar builds

Common (excluding Website bugs)-specific: (14)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

November 17, 2013 02:05 AM

2013-11-16 Thunderbird comm-central builds

Thunderbird-specific: (37)

MailNews Core-specific: (24)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

November 17, 2013 02:04 AM

November 07, 2013

Robert Kaiser

Internet Archive Fire: Donate to Rebuild

I just got word that a fire destroyed the Internet Archive Scanning Center in San Francisco.



I blogged about what the archive has and can do a few months ago, and I will probably mention it again when I get to more posts on preserving software.

I think it's in the best interest of everyone, esp. us as Mozillians, to keep this organization going and make the history of the Internet and more openly available to current and future generations.

Please help them to rebuild and continue on their way and make a Donation. I will for sure.

November 07, 2013 06:04 PM

November 01, 2013

Robert Kaiser

Badges for Stability Contributions?

Various groups at Mozilla have been thinking about awarding Open Badges to contributors over the past year or so (there's even a contributor badges site for easily awarding them). I again and again wonder if we could do that for stability contributions as well, but it's a tough nut to crack.



One idea would be to award badges automatically to people who leave their email in a crash report (i.e. a badge for taking part in our efforts by delivering data through crash reports) - but that would probably end up being a bad badge because you'd award people for crashing Firefox (and not award people who don't see crashes). We really do not want people to search for how they can crash so that they can get a badge - after all our mission is to avoid crashes, not provoke them. So, no badges for submitting crashes (thanks to Benjamin for pointing me to that issue right away when I was rolling that idea).

So, how can we potentially award a badge for Stability/"CrashKill" work without rewarding bad behavior?

One thing I could think of was "filed x crash bugs that ended up fixed". That should be relatively easy to get out of Bugzilla data and I think there's no doubt that filing bugs that end up with a fixed resolution is a good thing.
This idea would also open up the avenue for different badges for different amounts of bugs (similar to the webdev badges), or to create a badge for developers who fixed x crash bugs, and similar things.

What do you think? Which ideas do you have for awarding badges for contributions to the Mozilla Stability program?

November 01, 2013 05:14 PM

October 30, 2013

Meeting Notes

Thunderbird: 2013-10-29

Remember to use headphones and mute yourself when not talking

Feel free to ask questions in the meeting either by speaking up or by asking them in #maildev on IRC.

Other ways to get in touch with us can be found on our communications page

Meeting Changes

Attendees

aceman, clokep, jcranmer, JosiahOne, sshagarwal, mconley, Paenglab, Fallen

Action items from last meetings

  • [Standard8] Are WADA, Aryx, Chiaki Ishikawa, JosiahOne and jcranmer from previous nomination covered already for swag?

    • Standard8 is on it, but needs to dig through some info in his inbox.
  • [mconley]
    • Thunderbird usage data isn’t public. Standard8 was tasked with getting that information released. What’s the status on that?

      • Metrics is still needinfo’d on getting non-staff community added to usage metrics report emails

Critical Issues

  • [rkent] Binary add-on issue with TB 24?

    • jcranmer and irving suggest speaking with Fallen – they suspect this is fixed.

    • The builds that came out for 24.1 timeframe release did not fix the bug that rkent is talking about – but extra numbers have been added to AMO to mark compatibility (you’ll need to do rebuilds to side-step the compat issue).

Round Table

  • jcranmer

    • Working on converting JSMime into a separate repository (thanks to James Burke for being patient with me and my packaging issues!)

      • The separate repository will be JSMime 0.2, as I have decided to rework the API to be more idiot-proof

      • Tests are now up-to-date for comm-central, need to port tests for RFC 2047 and address header encoding/decoding
      • Not announcing this more widely public yet, since the conversion process is very messy and I want to embed reviews from the start.
    • Alder is now reserved for the final stages of ccrework, and I should be in contact with releng on getting this done.
    • Some minor build-engineering work with moz.build conversions or ontology
    • Still awaiting review from Neil.
  • JosiahOne (No outgoing audio)
    • Been busy with exams last week, so I didn’t accomplish as much as I wished.

    • I have started re-organizing the directory/file/image structure inside themes/.
    • I plan to create a shared themes directory similar to Firefox unless the build peers/owners are against it.
    • Waiting on review from Mike Conley for the animated thunderbird tabs.
  • clokep
    • Still working on the IB->c-c chat/ update, waiting on reviews from Florian now.

    • Was at the GSoC Mentors Summit ([1])
      • A lot of people told me they were happy we were still working on Thunderbird!

      • A lot of people thought Thunderbird had been killed -> more on this in my blog post
  • mconley
    • Patch to allow us to use nsIPermissionManager to store mail content preferences is up for review. \o/

    • I wrote some mail to tb-planning about the AMO to Marketplace move, and what that means for Thunderbird (and SeaMonkey).
    • Did more reviews, but they’re starting to stack up again
  • Paenglab
    • Waiting for feedback from mconley for bug 925746

Question Time

  • [clokep] Do we still need to use tb-planning? I’m STILL moderated on it after an insane amount of time using it.

    • Can we switch back to using m.d.a.thunderbird? If we have issues with “idiots”, we can always moderate it. (I want my nntp back.)

    • Alternately, can we freely give out posting privileges to people.

Action Items

  • [mconley] Ask jorgev if Marketplace merge would accept patches to make Thunderbird work there…

  • [mconley] Can we pull jb into this AMO discussion? Can we get an advocate higher up the Mozilla-chain? jb+gerv?
  • [mconley] Talk to bwinton and see if we can get “the regulars” whitelisted on tb-planning.

Thunderbird Meeting Details :

October 30, 2013 04:00 AM

October 23, 2013

Kent James

Real Men do Build Engineering (A Real Threat to Thunderbird)

When I was first getting involved in Thunderbird, I recall reading a post from early leader Scott MacGregor that puzzled me. When the project was essentially a two-person project, he said that the next person they needed was a build engineer. I had always thought of that as a backwater for people who couldn’t do real coding.

How my thinking has changed! As we look to the future of the Thunderbird project, it is clear that the main threat at the moment is losing control of the build process. These days, it takes a PhD candidate in computer science (Joshua Cranmer) to understand building Thunderbird, and keep it building. I’m not sure we are winning the battle, though Joshua assures me that he and Standard8 have a Grand Plan to make it all better.

As for me, I’ve committed myself to improving my record in doing code reviews, which so far has been dismal. I’d like to give 20% of my time back to the core project, and a significant fraction of that should be doing code reviews. But I can’t keep comm-central compiling in 8 hours per week, which makes it very hard to do reviews.

<rant>It seems there is some project over on mozilla-central to replace the build process with some python-based thingy. “Just do mach (insert short command)” is the answer to everything in mozilla-central world. If it works, great. If it does not work (and it never works on comm-central), then you have to call your PhD-candidate build expert to make any sense of the problem. In the old days, where standard unix tools were used, you at least had a hope of googling your problem, and getting some hint of the issue. Now you have to pore over multiple files of python with multiple levels of abstracted variables, none of which I understand.</rant>

Here is today’s typical error:

Reticulating splines...
Traceback (most recent call last):
  File "./config.status", line 881, in <module>
    config_status(**args)
  File "c:\tb\1-central\src\mozilla\build\ConfigStatus.py", line 126, in config_status
    summary = backend.consume(definitions)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\base.py", line 194, in consume
    self.consume_object(obj)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\recursivemake.py", line 333, in consume_object
    CommonBackend.consume_object(self, obj)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\common.py", line 87, in consume_object
    self._test_manager.add(test, flavor=obj.flavor)
  File "c:\tb\1-central\src\mozilla\python\mozbuild\mozbuild\backend\common.py", line 68, in add
    assert path.startswith(self.topsrcdir)
AssertionError
configure: error: c:/tb/1-central/src/mozilla/configure failed for mozilla
*** Fix above errors and then restart with               "c:/mozilla-build/python/python.exe c:/tb/1-central/src/mozilla/build/pymake/pymake/../make.py -f client.mk build"

So, are there any Real Men out there (or even Real Women) that want to be a hero, and save our project? Or am I going to have to become a Python build expert to do anything at all in Thunderbird?

 

October 23, 2013 06:09 PM

October 19, 2013

Rumbling Edge - Thunderbird

2013-10-18 Calendar builds

Common (excluding Website bugs)-specific: (11)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

October 19, 2013 05:40 PM

2013-10-18 Thunderbird comm-central builds

Thunderbird-specific: (48)

MailNews Core-specific: (21)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

October 19, 2013 05:39 PM

October 17, 2013

Kent James

The proxy debate over paid addons

I recently received a review of my ExQuilla addon for Thunderbird with the glowing report “It works well, I’ll give you that. No more, no less”. Should be 5 stars, right? No, the addon got 2.

This is the proxy battle over whether there should be paid addons in the Mozilla ecosystem, fought by trashing the ratings of paid addons. The reviewer gave me 2 stars because he believes “Please find purchasers from paid email software, not in an open source software.” He also complained about my response to an earlier reviewer who gave ExQuilla one star: even though it “works great”, “i’m really angry against Mozilla for not highlitning (sic) the paying extensions when you download it”. (My earlier comment was: “It’s unfortunate that ExQuilla reviews are being used as a proxy for a debate about the philosophy of paid extensions. A better venue would be to send emails to amo-editors@mozilla.org”.)

ExQuilla provides support to allow Thunderbird to work with Exchange server. Anyone using Exchange server and ExQuilla is already embedded in the paid software world.

So what do you think, Mozilla? Is it really such a sin to seek to earn some income? After all, I’ve spent years doing patches in Thunderbird without pay, and I have plenty of other free addons for Thunderbird. “I don’t believe we should subsidize businesses who want free software” was how Mitchell put the situation to me privately when Mozilla drastically reduced their support for Thunderbird, and I agree with that completely. Is it so terrible to ask users to pay for their software with cash instead of indirectly by selling their privacy (which is what most so-called “free” open source software apps do)?

October 17, 2013 11:17 PM

October 14, 2013

Ludovic Hirlimann

Encryption refused :(

Hum, today I replied to a mailing list post, and as usual I signed the email (well, I sign less but still try to sign emails every time I send one). I got the following auto-reply:

The IT department has automatically stopped an email sent by you to xxxx@yyyyy.ch because it contained an encrypted or password-protected document. The use of encryption contravenes the company’s email Acceptable Usage Policy.

If you require further information, please contact the IT helpdesk.

This is the first time I have seen this, and it feels weird. I would understand if my email had been encrypted, but it was just signed!!!

Ludo

October 14, 2013 03:18 PM

October 13, 2013

Calendar

Using Lightning 2.6.* on Linux? Be sure you are using the exact compatible Thunderbird Version

As you may have noticed, Lightning is no longer working with Thunderbird 24.0.1. This is totally unexpected for us; it seems Thunderbird 24.0 and 24.0.1 are not binary compatible. We will be releasing Lightning 2.6.1 this week to fix the issue and afterwards find out how this could have happened.

If you are using Lightning 2.6, please downgrade to Thunderbird 24.0 for now and you will regain access to Lightning and your calendars.

I’m sorry for the inconvenience. Here is the compatibility table:

Thunderbird Version              Lightning Version
Thunderbird 24.0                 Lightning 2.6
Thunderbird 24.0.1               Lightning 2.6.1
Thunderbird 24.1.0               Lightning 2.6.2
Thunderbird 24.1.1               Lightning 2.6.3
Thunderbird 24.2.0 (unreleased)  Lightning 2.6.3 and up

Update 1: This seems to be a Linux-only issue. Windows and Mac users can safely upgrade to Thunderbird 24.0.1!

Update 2: You can get the English version of Thunderbird 24.0 for Linux here. For other languages, please see the release directory on the ftp server.

Update 3: Lightning 2.6.1 is the version compatible with Thunderbird 24.0.1. To date it has not been reviewed by the Mozilla Addons Team, but you can still get it manually using the Other Versions page.

Update 4: Lightning 2.6.1 is now public. On Linux, it is compatible ONLY with Thunderbird 24.0.1, so go ahead and upgrade now.

Update 5: To be more clear: If you are using Thunderbird 24.0 on Linux you MUST continue to use Lightning 2.6. If you are using Thunderbird 24.0.1 on Linux, you MUST use Lightning 2.6.1. Thunderbird 24.1.0 (no typo) will be released soon; you MUST use the upcoming Lightning 2.6.2 with it.

Update 6: If you cannot use the newer Lightning versions yet and want to disable addon updates: Go to the Addons Manager → right click on Lightning → Show More Information → Disable Automatic Updates.

Update 7: Thunderbird 24.1.0 (not .0.1) has just been pushed to the mirrors. On Linux you will need Lightning 2.6.2 together with it. I have heard from some packagers for the Linux distributions that, in contrast to 24.0.1, this version will be made available. If you need Lightning 2.6.2 now, you can get it from the Other Versions page until it has been reviewed.

Update 8: I have now been able to set up the version compatibility correctly. If all goes well, Thunderbird 24.0 users should NOT be getting upgrades to Lightning 2.6.1 anymore, and all other versions should work correctly too.

Update 9: Updated for 24.1.1 release. The problems are now going away :) I will recap as soon as I get to it.

October 13, 2013 10:16 AM

October 11, 2013

Joshua Cranmer

Why email is hard, part 2: internationalization

This post is part 2 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure, as well as the issues I have with it. This post discusses internationalization, specifically supporting non-ASCII characters in email.

Internationalization is not a simple task, even if the consideration is limited to "merely" the textual aspect [1]. Languages turn out to be incredibly diverse in their writing systems, so software that tries to support all writing systems equally well ends up running into several problems that admit no general solution. Unfortunately, I am ill-placed to be able to offer personal experience with internationalization concerns [2], so some of the information I give may well be wrong.

A word of caution: this post is rather long, even by my standards, since the problems of internationalization are legion. To help keep this post from being even longer, I'm going to assume passing familiarity with terms like ASCII, Unicode, and UTF-8.

The first issue I'll talk about is Unicode normalization, and it's an issue caused largely by Unicode itself. Unicode has two ways of making accented characters: precomposed characters (such as U+00F1, ñ) or a character followed by a combining character (U+006E, n, followed by U+0303, ◌̃). The display of both is the same: ñ versus ñ (read the HTML), and no one would disagree that they share the same meaning. To let software detect that they are the same, Unicode prescribes four algorithms to normalize them. These four algorithms are defined on two axes: whether to prefer composed characters (like U+00F1) or prefer decomposed characters (U+006E U+0303), and whether to normalize by canonical equivalence (noting that, for example, U+212A Kelvin sign is equivalent to the Latin majuscule K) or by compatibility (e.g., superscript 2 to a regular 2).
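To make the two axes concrete, here is a minimal Python sketch using the standard unicodedata module. The variable names are only illustrative; the four algorithms are exposed there under the names NFC, NFD, NFKC, and NFKD.

```python
import unicodedata

precomposed = "\u00f1"   # ñ as a single code point
decomposed = "n\u0303"   # n followed by COMBINING TILDE

# The two strings render alike but are different code point sequences.
assert precomposed != decomposed

# Composed (NFC) versus decomposed (NFD) canonical forms.
assert unicodedata.normalize("NFC", decomposed) == precomposed
assert unicodedata.normalize("NFD", precomposed) == decomposed

# Canonical equivalence: KELVIN SIGN (U+212A) becomes the Latin letter K.
kelvin_nfc = unicodedata.normalize("NFC", "\u212a")

# Compatibility equivalence: SUPERSCRIPT TWO survives NFC but not NFKC.
sup2_nfc = unicodedata.normalize("NFC", "\u00b2")
sup2_nfkc = unicodedata.normalize("NFKC", "\u00b2")
```

Comparing two strings only after normalizing both to the same form is the usual way to treat them as equal.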

Another issue is one that mostly affects display. Western European languages all use a left-to-right, top-to-bottom writing order. This isn't universal: Semitic languages like Hebrew or Arabic use right-to-left, top-to-bottom; Japanese and Chinese prefer a top-to-bottom, right-to-left order (although it is sometimes written left-to-right, top-to-bottom). It thus becomes an issue as to the proper order to store these languages using different writing orders in the actual text, although I believe the practice of always storing text in "start-to-finish" order, and reversing it for display, is nearly universal.

Now, both of those issues mentioned so far are minor in the grand scheme of things, in that you can ignore them and they will still probably work properly almost all of the time. Most text that is exposed to the web is already normalized to the same format, and web browsers have gotten away with not normalizing CSS or HTML identifiers with only theoretical objections raised. All of the other issues I'm going to discuss are things that cause problems and illustrate why properly internationalizing email is hard.

Another historical mistake of Unicode is one that we will likely be stuck with for decades, and I need to go into some history first. The first Unicode standard dates from 1991, and its original goal then was to collect all of the characters needed for modern transmission, which was judged to need only a 16-bit set of characters. Unfortunately, the needs of ideographic-centric Chinese, Japanese, and Korean writing systems, particularly rare family names, turn out to rather fill up that space. Thus, in 1996, Unicode was changed to permit more characters: 17 planes of 65,536 characters each, of which the original set was termed the "Basic Multilingual Plane" or BMP for short. Systems that chose to adopt Unicode in those intervening 5 years often adopted a 16-bit character model as their standard internal format, so as to keep the benefits of fixed-width character encodings. However, with the change to a larger format, their fixed-width character encoding is no longer fixed-width.

This issue plagues anybody who works with systems that considered internationalization in that unfortunate window, which notably includes prominent programming languages like C#, Java, and JavaScript. Many cross-platform C and C++ programs implicitly require UTF-16 due to its pervasive inclusion into the Windows operating system and common internationalization libraries [3]. Unsurprisingly, non-BMP characters tend to quickly run into all sorts of hangups by unaware code. For example, right now, it is possible to coax Thunderbird to render these characters unusable in, say, your subject string if the subject is just right, and I suspect similar bugs exist in a majority of email applications [4].
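A small Python sketch of why 16-bit code is tripped up by non-BMP characters: a single character becomes two UTF-16 code units, and cutting between them leaves invalid text. The variable names are mine, chosen for illustration.

```python
pile = "\U0001F4A9"                # one non-BMP character
utf16 = pile.encode("utf-16-be")   # big-endian, no byte-order mark

# Split the bytes into 16-bit code units: there are two of them,
# a high surrogate followed by a low surrogate.
code_units = [
    int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)
]

# Truncating after the first code unit leaves an unpaired surrogate,
# which a strict decoder rejects.
try:
    utf16[:2].decode("utf-16-be")
    truncated_ok = True
except UnicodeDecodeError:
    truncated_ok = False
```

Code that counts or slices by 16-bit units (as JavaScript string indexing does) can produce exactly this kind of truncation.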

For all of the flaws of Unicode [5], there is a tacit agreement that UTF-8 should be the character set to use for anyone not burdened by legacy concerns. Unfortunately, email is burdened by legacy concerns, and the use of 8-bit characters in headers that are not UTF-8 is more prevalent than it ought to be, RFC 6532 notwithstanding. In any case, email explicitly provides for handling a wide variety of alternative character sets without saying which ones should be supported. The official list [6] contains about 200 of them (including the UNKNOWN-8BIT character set), but not all of them see widespread use. In practice, the ones that definitely need to be supported are the ISO 8859-* and ISO 2022-* charsets, the EUC-* charsets, Windows-* charsets, GB18030, GBK, Shift-JIS, KOI8-{R,U}, Big5, and of course UTF-8. There are two other major charsets that don't come up directly in email but are important for implementing the entire suite of protocols: UTF-7, used in IMAP (more on that later), and Punycode (more on that later, too).

The suite of character sets falls into three main categories. First is the set of fixed-width character sets, most notably ASCII and the ISO 8859 suite of charsets, as well as UCS-2 (2 bytes per character) and UTF-32 (4 bytes per character). Since the major East Asian languages are all ideographic, which require a rather large number of characters to be encoded, fixed-width character sets are infeasible. Instead, many choose to do a variable-width encoding: Shift-JIS lets some characters (notably ASCII characters and half-width katakana) remain a single byte and uses two bytes to encode all of its other characters. UTF-8 can use between 1 byte (for ASCII characters) and 4 bytes (for non-BMP characters) for a single character. The final set of character sets, such as the ISO 2022 ones, use escape sequences to change the interpretation of subsequent characters. As a result, taking the substring of an encoded string can change its interpretation while remaining valid. This will be important later.
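The substring hazard of mode-switching charsets can be seen with Python's built-in iso2022_jp codec. This is only a sketch; the sample text and variable names are illustrative.

```python
text = "日本語"
encoded = text.encode("iso2022_jp")

# The encoder brackets the text with escape sequences:
# ESC $ B switches to JIS X 0208 and ESC ( B switches back to ASCII.
switch_in, switch_out = b"\x1b$B", b"\x1b(B"

# Slicing off the escape sequences leaves bytes that are still valid
# (they are all plain ASCII bytes) but that now mean something else.
payload = encoded[len(switch_in):-len(switch_out)]
misread = payload.decode("iso2022_jp")
```

Here the substring decodes without error, yet it yields ASCII punctuation and letters rather than the original Japanese text.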

Two more problems related to character sets are worth mentioning. The first is the byte-order mark, or BOM, which is used to distinguish whether UTF-16 is written on a little-endian or big-endian machine. It is also sometimes used in UTF-8 to indicate that the text is UTF-8 versus some unknown legacy encoding. It is also not supposed to appear in email, but I have done some experiments which suggest that people use software that adds it without realizing that this is happening. The second issue, unsurprisingly [7], is that for some character sets (Big5 in particular, I believe), not everyone agrees on how to interpret some of the characters.

The largest problem of internationalization that applies in a general sense is the problem of case insensitivity. The 26 basic Latin letters all map nicely to case, having a single uppercase and a single lowercase variant for each letter. This practice doesn't hold in general—languages like Japanese lack even the notion of case, although it does have two kana variants that hold semantic differences. Rather, there are three basic issues with case insensitivity which showcase enough of its problems to make you want to run away from it altogether [8].

The simplest issue is the Greek sigma. Greek has two lowercase variants of the sigma character: σ and ς (the "final sigma"), but a single uppercase variant, Σ. Thus mapping a string s to uppercase and back to lowercase is not equivalent to mapping s directly to lowercase in some cases. Related to this issue is the story of the German ß character. This character evolved as a ligature of a long and short 's', and its uppercase form is generally held to be SS. The existence of a capital form is in some dispute, and Unicode only recently added it (ẞ, if your software supports it). As a result, merely interconverting between uppercase and lowercase versions of a string does not necessarily lead to a simple fixed point. The third issue is the Turkish dotless i (ı), which is the lowercase variant of the ASCII uppercase I character to those who speak Turkish. So it turns out that case insensitivity isn't quite the same across all locales.
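These three cases are easy to reproduce with Python's built-in string methods, which apply the default (locale-independent) Unicode case mappings. The sketch below only records the results; the variable names are illustrative.

```python
# Greek: the final sigma does not survive an upper/lower round trip.
final_sigma = "\u03c2"                            # ς
sigma_round_trip = final_sigma.upper().lower()    # goes via Σ

# German: ß uppercases to two letters and never comes back.
eszett_upper = "ß".upper()
eszett_round_trip = eszett_upper.lower()

# Turkish: the default mappings pair I with i, not with dotless ı,
# so the Turkish-specific pairing is lost without locale information.
dotless_i_upper = "\u0131".upper()                # ı
ascii_i_lower = "I".lower()
```

Because these methods know nothing about the user's locale, a Turkish-aware comparison would need locale-specific logic that this sketch does not attempt.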

Again unsurprisingly in light of the issues, the general tendency towards case-folding or case-insensitive matching in internationalization-aware specifications is to ignore the issues entirely. For example, asking for clarity on the process of case-insensitive matching for IMAP folder names, the response I got was "don't do it." HTML and CSS moved to the cumbersomely-named variant known as "ASCII-subset case-insensitivity", where only the 26 basic Latin letters are mapped to their (English) variants in case. The solution for email is also a verbose variant of "unspecified," but that is only tradition for email (more on this later).
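A minimal sketch of what ASCII-subset case-insensitive matching amounts to, in Python. The helper names ascii_lower and ascii_case_insensitive_equal are hypothetical, not taken from any specification or library.

```python
import string

# Translation table that touches only the 26 basic Latin capitals.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s):
    """Lowercase A-Z only; every other character is left alone."""
    return s.translate(_ASCII_LOWER)


def ascii_case_insensitive_equal(a, b):
    """Compare two strings, ignoring case for ASCII letters only."""
    return ascii_lower(a) == ascii_lower(b)
```

Under this rule "INBOX" matches "inbox", but Σ does not match σ, which sidesteps every issue described above by declining to handle it.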

Now that you have a good idea of the general issues, it is time to delve into how the developers of email rose to the challenge of handling internationalization. It turns out that the developers of email have managed to craft one of the most perfect and exquisite examples I have seen of how to completely and utterly fail. The challenges of internationalized emails are so difficult that buggier implementations are probably more common than fully correct implementations, and any attempt to ignore the issue is completely and totally impossible. In fact, the faults of RFC 2047 are my personal least favorite part of email, and implementing it made me change the design of JSMime more than any other feature. It is probably the single hardest thing to implement correctly in an email client, and it is so broken that another specification was needed to be able to apply internationalization more widely (RFC 2231).

The basic problem RFC 2047 sets out to solve is how to reliably send non-ASCII characters across a medium where only 7-bit characters can be reliably sent. The solution that was set out in the original version, RFC 1342, is to encode specific strings in an "encoded-word" format: =?charset?encoding?encoded text?=. The encoding can either be a 'B' (for Base64) or a 'Q' (for quoted-printable). Except the quoted-printable encoding in this format isn't quite the same quoted-printable encoding used in bodies: the space character is encoded via a '_' character instead, as spaces aren't allowed in encoded-words. Naturally, the use of spaces in encoded-words is common enough to get at least one or two bugs filed a year about Thunderbird not supporting it, and I wonder if this subtle difference between two quoted-printable variants is what causes the prevalence of such emails.
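As a rough sketch of the format, the following Python decodes a single well-formed encoded-word. The function name decode_encoded_word and the regular expression are my own simplifications; real-world decoders have to cope with far messier input.

```python
import base64
import quopri
import re

# =?charset?encoding?encoded text?=  (simplified pattern)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")


def decode_encoded_word(word):
    """Decode one well-formed encoded-word into a str."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        raise ValueError("not an encoded-word: %r" % word)
    charset, encoding, payload = match.groups()
    if encoding.upper() == "B":
        raw = base64.b64decode(payload)
    else:
        # header=True selects the header variant, where '_' means space.
        raw = quopri.decodestring(payload.encode("ascii"), header=True)
    return raw.decode(charset)


q_form = decode_encoded_word("=?UTF-8?Q?caf=C3=A9_au_lait?=")
b_form = decode_encoded_word("=?UTF-8?B?Y2Fmw6k=?=")
```

Note that a literal space inside the Q payload would already fail the pattern match in stricter parsers, which is the kind of input the bug reports mentioned above are about.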

One of my great hates with regard to email is the strict header line length limit. Since the encoded-word form can get naturally verbose, particularly when you consider languages like Chinese that are going to have little whitespace amenable for breaking lines, the ingenious solution is to have adjacent encoded-word tokens separated only by whitespace be treated as the same word. As RFC 6857 kindly summarizes, "whitespace behavior is somewhat unpredictable, in practice, when multiple encoded words are used." RFC 6857 also suggests that the requirement to limit encoded words to only 75 characters in length is also rather meaningless in practice.

A more serious problem arises when you consider the necessity of treating adjacent encoded-word tokens as a single unit. This one is so serious that it reaches the point where all of your options would break somebody. When implementing an RFC 2047 encoding algorithm, how do you write the code to break up a long span of text into multiple encoded words without ever violating the specification? The naive way of doing so is to encode the text once in one long string, and then break it into chunks which are then converted into the encoded-word form as necessary. This is, of course, wrong, as it breaks two strictures of RFC 2047. The first is that you cannot split the middle of multibyte characters. The second is that mode-switching character sets must return to ASCII by the end of a single encoded-word [9]. The smarter way of building encoded-words is to encode words by trying to figure out how much text can be encoded before needing to switch, and breaking the encoded-words when length quotas are exceeded. This is also wrong, since you could end up violating the return-to-ASCII rule if you don't double-check your converters. Also, if UTF-16 is used as the basis for the string before charset conversion, the encoder stands a good chance of splitting up a surrogate pair, creating unpaired surrogates and a giant mess as a result.

For JSMime, the algorithm I chose to implement is specific to UTF-8, because I can use a property of the UTF-8 implementation to make encoding fast (every octet is looked at exactly three times: once to convert to UTF-8, once to count to know when to break, and once to encode into base64 or quoted-printable). The property of UTF-8 is that the second, third, and fourth octets of a multibyte character all start with the same two bits, and those bits never start the first octet of a character. Essentially, I convert the entire string to a binary buffer using UTF-8. I then pass through the buffer, keeping counters of the length that the buffer would be in base64 form and in quoted-printable form. When both counters are exceeded, I back up to the beginning of the character, and encode that entire buffer in a word and then move on. I made sure to test that I don't break surrogate characters by making liberal use of the non-BMP character U+1F4A9 [10] in my encoding tests.
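The UTF-8 property can be sketched in Python as follows. This is not JSMime's code: it is a simplified illustration that uses only base64 and a fixed byte budget instead of the two running counters, and the names split_utf8 and encode_words are hypothetical.

```python
import base64


def split_utf8(data, max_bytes):
    """Split UTF-8 bytes into chunks without cutting a character."""
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + max_bytes, len(data))
        # Continuation octets look like 10xxxxxx; back up until the
        # octet at 'end' starts a new character.
        while start < end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            raise ValueError("max_bytes is too small for one character")
        chunks.append(data[start:end])
        start = end
    return chunks


def encode_words(text, max_bytes=45):
    """Encode text as base64 encoded-words; 45 bytes gives 72 chars."""
    return [
        "=?UTF-8?B?" + base64.b64encode(chunk).decode("ascii") + "?="
        for chunk in split_utf8(text.encode("utf-8"), max_bytes)
    ]
```

With the default budget, each word is 12 characters of framing plus at most 60 characters of base64, which stays within the 75-character limit on encoded-words.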

The sheer ease of writing a broken encoder for RFC 2047 means that broken encodings exist in the wild, so an RFC 2047 decoder needs to support some level of broken RFC 2047 encoding. Unfortunately, to "fix" different kinds of broken encodings requires different support for decoders. Treating adjacent encoded-words as part of the same buffer when decoding makes split multibyte characters work properly but breaks non-return-to-ASCII issues; if they are decoded separately the reverse is true. Recovering from isolated surrogates is at best time-consuming and difficult and at worst impossible.
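The split-multibyte half of that trade-off can be shown with a short Python sketch. The two payloads below stand in for the decoded contents of two adjacent encoded-words from a hypothetical broken encoder; the helper name decodes_cleanly is mine.

```python
# "é" is two octets in UTF-8; a broken encoder put one in each word.
first_payload = "é".encode("utf-8")[:1]
second_payload = "é".encode("utf-8")[1:]


def decodes_cleanly(raw):
    """Return True if raw is valid UTF-8 on its own."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Decoding each word separately fails for both halves...
separate_ok = decodes_cleanly(first_payload) and decodes_cleanly(second_payload)

# ...while concatenating the buffers first recovers the character.
joined = (first_payload + second_payload).decode("utf-8")
```

The reverse failure, a mode-switching charset that never returns to ASCII, is not shown here; joining buffers would then carry the wrong mode into the next word.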

Yet another problem with the way encoded-words are defined is that they are defined as specific tokens in the grammar of structured address fields. This means that you can't hide RFC 2047 encoding or decoding as a final processing step when reading or writing messages. Instead you have to do it during or after parsing (or during or before emission). So the parser as a result becomes fully intertwined with support for encoded-words. Converting a fully UTF-8 message into a 7-bit form is thus a non-trivial operation: there is a specification solely designed to discuss how to do such downgrading, RFC 6857. It requires deducing what structure a header has, parsing that header, and then reencoding the parsed header. This sort of complicated structure makes it much harder to write general-purpose email libraries: the process of emitting a message basically requires doing a generic UTF-8-to-7-bit conversion. Thus, what is supposed to be a mere implementation detail of how to send out a message ends up permeating the entire stack.

Unfortunately, the developers of RFC 2047 were a bit too clever for their own good. The specification limits the encoded-words to occurring only inside of phrases (basically, display names for addresses), unstructured text (like the subject), or comments (…). I presume this was done to avoid requiring parsers to handle internationalization in email addresses themselves or possibly even things like MIME boundary delimiters. However, this list leaves out one common source of internationalized text: filenames of attachments. This was ultimately patched by RFC 2231.

RFC 2231 is by no means a simple specification, since it attempts to solve three problems simultaneously. The first is the use of non-ASCII characters in parameter values. As with RFC 2047, the excessively low header line length limit causes the second problem, the need to wrap parameter values across multiple lines. As a result, the encoding is complicated (it takes more lines of code to parse RFC 2231's new features alone than it does to parse the basic format [11]), but it's not particularly difficult.

The third problem RFC 2231 attempts to solve is a rather different issue altogether: it tries to conclusively assign a language tag to the encoded text and also provides a "fix" for this to RFC 2047's encoded-words. The stated rationale is to be able to have screen readers read the text aloud properly, but the other (much more tangible) benefit is to ameliorate the issues of Unicode's Han unification by clearly identifying if the text is Chinese, Japanese, or Korean. While it sounds like a nice idea, it suffers from a major flaw: there is no way to use this data without converting internal data structures from using flat strings to richer representations. Another issue is that actually setting this value correctly (especially if your goal is supporting screen readers' pronunciations) is difficult if not impossible. Fortunately, this is an entirely optional feature; though I do see very little email that needs to be concerned about internationalization, I have yet to find an example of someone using this in the wild.
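For a feel of the syntax, here is a minimal Python sketch that decodes a single RFC 2231 extended parameter value of the form charset'language'percent-encoded-text. The function name decode_extended_value is hypothetical, and the sketch deliberately ignores the continuation syntax used for wrapping long values.

```python
from urllib.parse import unquote


def decode_extended_value(value):
    """Decode charset'language'text into (text, language or None)."""
    charset, language, encoded = value.split("'", 2)
    text = unquote(encoded, encoding=charset, errors="strict")
    return text, (language or None)


with_language = decode_extended_value("UTF-8'en'na%C3%AFve%20file.txt")
without_language = decode_extended_value("iso-8859-1''%A3%20rates")
```

Returning the language tag alongside the text is exactly the "richer representation" problem described above: a caller that only wants a flat string has nowhere to put it.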

If you're the sort of person who finds properly writing internationalized text via RFC 2231 or RFC 2047 too hard (or you don't realize that you need to actually worry about this sort of stuff), and you don't want to use any of the several dozen MIME libraries to do the hard stuff for you, then you will become the bane of everyone who writes email clients, because you've just handed us email messages that have 8-bit text in the headers. At which point everything goes mad, because we have no clue what charset you just used. Well, RFC 6532 says that headers are supposed to be UTF-8, but with the specification being only 19 months old and part of a system which is still (to my knowledge) not supported by any major clients, this should be taken with a grain of salt. UTF-8 has the very nice property that text that is valid UTF-8 is highly unlikely to be any other charset, even if you start considering the various East Asian multibyte charsets. Thus you can try decoding under the assumption that it is UTF-8 and switch to a designated fallback charset if decoding fails. Of course, knowing which designated fallback to use is a different matter entirely.
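The try-UTF-8-then-fall-back strategy is short enough to sketch in Python. The function name decode_header_bytes and the choice of cp1252 as the fallback are assumptions made for illustration; picking the right fallback is the hard part and is not solved here.

```python
def decode_header_bytes(raw, fallback="cp1252"):
    """Decode raw 8-bit header bytes: UTF-8 first, then a fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the designated legacy charset.
        return raw.decode(fallback, errors="replace")


from_utf8 = decode_header_bytes("café".encode("utf-8"))
from_legacy = decode_header_bytes("café".encode("cp1252"))
```

Both inputs decode to the same text even though their bytes differ, because the legacy bytes are not valid UTF-8 and so trigger the fallback.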

Stepping outside email messages themselves, internationalization is still a concern. IMAP folder names are another well-known example. RFC 3501 specified that mailbox names should be in a modified version of UTF-7 in an awkward compromise. To my knowledge, this is the only remaining significant use of UTF-7, as many web browsers disabled support due to its use in security attacks. RFC 6855, another recent specification (6 months old as of this writing), finally allows UTF-8 mailbox names here, although it too is not yet in widespread usage.
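Python has no built-in codec for IMAP's modified UTF-7, so the following is a minimal encoder sketch written from the rules in RFC 3501: printable ASCII stands for itself, '&' is written as '&-', and everything else is UTF-16BE in a base64 variant that uses ',' instead of '/' and drops the padding. The function names are hypothetical.

```python
import base64


def _encode_run(run):
    """Encode a run of non-ASCII characters as &...- ."""
    b64 = base64.b64encode(run.encode("utf-16-be")).decode("ascii")
    return "&" + b64.rstrip("=").replace("/", ",") + "-"


def imap_utf7_encode(name):
    """Encode a mailbox name in IMAP's modified UTF-7."""
    out, run = [], []
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            if run:                      # flush pending non-ASCII run
                out.append(_encode_run("".join(run)))
                run = []
            out.append("&-" if ch == "&" else ch)
        else:
            run.append(ch)
    if run:
        out.append(_encode_run("".join(run)))
    return "".join(out)
```

For example, a German "Entwürfe" (drafts) folder is sent over the wire as "Entw&APw-rfe".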

You will note missing from the list so far is email addresses. The topic of email addresses is itself worthy of lengthy discussion, but for the purposes of a discussion on internationalization, all you need to know is that, according to RFCs 821 and 822 and their cleaned-up successors, everything to the right of the '@' is a domain name and everything to the left is basically an opaque ASCII string [12]. It is here that internationalization really runs headlong into an immovable obstacle, for the email address has become the de facto unique identifier of the web, and everyone has their own funky ideas of what an email address looks like. As a result, the motto of "be liberal in what you accept" really breaks down with email addresses, and the amount of software that needs to change to accept internationalization extends far beyond the small segment interested only in the handling of email itself. Unfortunately, the relative newness of the latest specifications and corresponding lack of implementations means that I am less intimately familiar with this aspect of internationalization. Indeed, the impetus for this entire blogpost was a day-long struggle with trying to ascertain when two email addresses are the same if internationalized email addresses are involved.

The email address is split nicely by the '@' symbol, and internationalization of the two sides happens at two different times. Domains were internationalized first, by RFC 3490, a specification with the mouthful of a name "Internationalizing Domain Names in Applications" [13], or IDNA2003 for short. I mention the proper name of the specification here to make a point: the underlying protocol is completely unchanged, and all the work is intended to happen at roughly the level of getaddrinfo—the internal DNS resolver is supposed to be involved, but the underlying DNS protocol and tools are expected to remain blissfully unaware of the issues involved. That I mention the year of the specification should tell you that this is going to be a bumpy ride.

An internationalized domain name (IDN for short) is a domain name that has some non-ASCII characters in it. Domain names, according to DNS, are labels terminated by '.' characters, where each label may consist of up to 63 characters. The repertoire of characters is the ASCII alphanumerics and the '-' character, and labels are of course case-insensitive like almost everything else on the Internet. Encoding non-ASCII characters into this small subset while meeting these requirements is difficult for other contemporary schemes: UTF-7 uses Base64, which means 'A' and 'a' are not equivalent; percent-encoding eats up characters extremely quickly. So IDNs use a different specification for this purpose, called Punycode, which allows for a dense but utterly unreadable encoding. The basic algorithm of encoding an IDN is to take the input string, apply case-folding, normalize using NFKC, and then encode with Punycode.
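Python's standard library ships both a punycode codec and an idna codec (the latter implements the IDNA2003 rules of RFC 3490), so the pipeline can be observed directly. The domain below is just a sample, and the variable names are illustrative.

```python
label = "bücher"

# Punycode alone: the ASCII letters come first, then an opaque suffix.
puny = label.encode("punycode")

# The idna codec applies the preparation steps per label and adds the
# "xn--" prefix to labels that needed Punycode.
ace_lower = "bücher.de".encode("idna")
ace_upper = "BÜCHER.de".encode("idna")
```

The uppercase and lowercase spellings produce the same ASCII form, which is the case-folding step at work.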

Case folding, as I mentioned several paragraphs ago, turns out to have some issues. The ß and ς characters were the ones that caused the most complaints. You see, if you were to register, say, www.weiß.de, you would actually be registering www.weiss.de. As there is no indication of Punycode involved in the name, browsers would show the domain in the ASCII variant. One way of fixing this problem would be to work with browser vendors to institute a "preferred name" specification for websites (much like there exists one for the little icons next to page titles), so that the world could know that the proper capitalization is of course www.GoOgle.com instead of www.google.com. Instead, the German and Greek registrars pushed for a change to IDNA, which they achieved in 2010 with IDNA2008.
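The ß behaviour can be reproduced with Python's built-in idna codec, which follows IDNA2003. This is only a sketch of that codec's behaviour, not of what any particular registrar or browser does.

```python
# Under IDNA2003 the ß is mapped to "ss" before encoding, so no
# Punycode is involved and the two names become indistinguishable.
eszett_domain = "www.weiß.de".encode("idna")
plain_domain = "www.weiss.de".encode("idna")
```

Because the result carries no "xn--" prefix, nothing in the encoded name records that the original spelling used ß.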

IDNA2008 is defined principally in RFCs 5890-5895 and UTS #46. The principal change is that the normalization step no longer exists in the protocol and is instead supposed to be done by applications, in a possibly locale-specific manner, before looking up the domain name. One reason for doing this was to eliminate the hard dependency on a specific, outdated version of Unicode [14]. It also helps fix things like the Turkish dotless I issue, in theory at least. However, this different algorithm causes some domains to be processed differently from IDNA2003. UTS #46 specifies a "compatibility mode" which changes the algorithm to match IDNA2003 better in the important cases (specifically, ß, ς, and ZWJ/ZWNJ), with a note expressing the hope that this will eventually become unnecessary. To handle the lack of normalization in the protocol, registrars are asked to automatically register all classes of equivalent domain names at the same time. I should note that most major browsers (and email clients, if they implement IDN at all) are still using IDNA2003: an easy test of this fact is to attempt to go to ☃.net, which is valid under IDNA2003 but not IDNA2008.

Unicode text processing is often vulnerable to an attack known as the "homograph attack." In most fonts, the Greek omicron and the Latin minuscule o will be displayed in exactly the same way, so an attacker could pretend to be from, say, Google while instead sending you to Gοogle (I used Latin in the first word and Greek in the second). The standard solution is to only display the Unicode form (and not the Punycode form) where this is not an issue; Firefox and Opera display Unicode only for a whitelist of registrars with acceptable policies, Chrome and Internet Explorer only permit scripts that the user claims to read, and Safari only permits scripts that don't permit the homograph attack (i.e., not Cyrillic or Greek). (Note: this information I've summarized from Chromium's documentation; forward any complaints of out-of-date information to them).
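A small Python sketch shows how the two look-alike strings differ underneath. The scripts helper is hypothetical and crude: it treats the first word of each character's Unicode name as its script, which is only a heuristic and not how browsers actually decide.

```python
import unicodedata

genuine = "Google"
spoofed = "G\u03bfogle"   # second letter is GREEK SMALL LETTER OMICRON


def scripts(s):
    """Crude heuristic: first word of each character's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in s}


genuine_scripts = scripts(genuine)
spoofed_scripts = scripts(spoofed)
```

A label that mixes scripts, as the spoofed one does, is the kind of signal a display policy can use to fall back to the Punycode form.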

IDN satisfies the needs of internationalizing the second half of an email address, so a working group was commissioned to internationalize the first one. The result is EAI, which was first experimentally specified in RFCs 5335-5337, and the standards themselves are found in RFCs 6530-6533 and 6855-6858. The primary difference between the first, experimental version and the second, to-be-implemented version is the removal of attempts to downgrade emails in the middle of transit. In the experimental version, provisions were made to specify with every internationalized address an alternate, fully ASCII address to which a downgraded message could be sent if SMTP servers couldn't support the new specifications. These were removed after the experiment found that such automatic downgrading didn't work as well as hoped.

With automatic downgrading removed from the underlying protocol, the onus is on people who generate the emails—mailing lists and email clients—to figure out who can and who can't receive messages and then downgrade messages as appropriate for the recipients of the message. However, the design of SMTP is such that it is impossible to automatically determine if the client can receive these new kinds of messages. Thus, the options are to send them and hope that it works or to rely on the (usually clueless) user to inform you if it works. Clearly an unpalatable set of options, but it is one that can't be avoided due to protocol design.

The largest change of EAI is that the local parts of addresses are specified as a sequence of UTF-8 characters, omitting only the control characters [15]. The working group responsible for the specification adamantly refused to define a Unicode-to-ASCII conversion process, and thus a mechanism to make downgrading work smoothly, for several reasons. First, they didn't want to specify a prefix which could change the meaning of existing local-parts (the structure of local-parts is much less discoverable than the structure of all domain names). Second, they felt that the lack of support for displaying the Unicode variants of Punycode meant that users would have a much worse experience. Finally, the transition period would be hopefully short (although messy), so designing a protocol that supports that short period would worsen it in the long term. Considering that, at the moment of writing, only one of the major SMTP implementations has even a bug filed to support it, I think the working group underestimates just how long transition periods can take.
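The asymmetry between the two halves of an address can be sketched as follows: the domain has an ASCII form via IDN, while a non-ASCII local-part has no defined ASCII equivalent. This is an illustrative sketch only. It ignores quoted local-parts, uses Python's IDNA2003 codec for the domain, and the function names and sample address are made up.

```python
def split_address(address: str):
    """Split at the last '@'. Quoted local-parts are ignored for simplicity."""
    local, _, domain = address.rpartition("@")
    return local, domain


def downgrade_report(address: str) -> dict:
    """Report which parts of an address can be expressed in ASCII."""
    local, domain = split_address(address)
    return {
        # The domain always has an ASCII form thanks to IDN/Punycode.
        "ascii_domain": domain.encode("idna").decode("ascii"),
        # EAI defines no Unicode-to-ASCII conversion for the local-part,
        # so a non-ASCII local-part cannot be downgraded automatically.
        "local_part_is_ascii": local.isascii(),
    }


report = downgrade_report("jos\u00e9@b\u00fccher.example")
print(report)
```

For this sample address the domain converts cleanly, but the report shows that the local-part is stuck: there is nothing to downgrade it to.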

As far as changes to the message format go, that change is the only real change, considering how much effort is needed to opt-in. Yes, headers are now supposed to be UTF-8, but, in practice, every production MIME parser needs to handle 8-bit characters in headers anyways. Yes, message/global can have MIME encoding applied to it (unlike message/rfc822), but, in practice, you already need to assume that people are going to MIME-encode message/rfc822 in violation of the specification. So, in practice, the changes needed to a parser are to add message/global as an alias to message/rfc822 [16] and possibly to tweak some charset detection heuristics to prefer UTF-8. I would very much have liked the restriction on header line length removed, but, alas, the working group did not feel moved to make those changes. Still, I look forward to the day when I never have to worry about encoding text into RFC 2047 encoded-words.

IMAP, POP, and SMTP are also all slightly modified to take account of the new specifications. Specifically, internationalized headers are supposed to be opt-in only: SMTP is supposed to reject sending these messages to a server that doesn't support them in the first place, and IMAP and POP servers are supposed to downgrade messages when requested unless the client asks for them not to be. As there are no major server implementations yet, I don't know how well these requirements will be followed, especially given that most of the changes already need to be tolerated by clients in practice. The experimental version of internationalization specified a format which would have wreaked havoc on many current parsers, so I suspect some of the strict requirements may be a holdover from that version.
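The opt-in on the SMTP side is signalled through the `SMTPUTF8` keyword in a server's EHLO response (RFC 6531). A minimal sketch of checking for it is shown below; the function name and the sample response lines are invented for illustration, and a real client would take the lines from its SMTP library rather than a hard-coded list.

```python
def advertises_smtputf8(ehlo_lines) -> bool:
    """Return True if any EHLO response line carries the SMTPUTF8 keyword."""
    for line in ehlo_lines:
        # Lines look like "250-KEYWORD params" or "250 KEYWORD params",
        # so the keyword starts after the 3-digit code and separator.
        text = line[4:] if len(line) > 4 else ""
        parts = text.split()
        keyword = parts[0] if parts else ""
        if keyword.upper() == "SMTPUTF8":
            return True
    return False


# Hypothetical responses from two servers.
modern = ["250-mail.example Hello", "250-8BITMIME", "250-SMTPUTF8", "250 SIZE 10485760"]
legacy = ["250-mail.example Hello", "250-8BITMIME", "250 SIZE 10485760"]

print(advertises_smtputf8(modern), advertises_smtputf8(legacy))
```

Note that this only describes the next hop. It says nothing about whether the final recipient's server or mail client can cope, which is exactly the gap left once automatic downgrading was dropped.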

And thus ends my foray into email internationalization, a collection of bad solutions to hard problems. I have probably done a poor job of covering the complete set of inanities involved, but what I have covered are the ones that annoy me the most. This certainly isn't the last I'll talk about the impossibility of message parsing either, but it should be enough at least to convince you that you really don't want to write your own message parser.

[1] Date/time, numbers, and currency are the other major aspects of internationalization.
[2] I am a native English speaker who converses with other people almost completely in English. That said, I can comprehend French, although I am not familiar with the finer points that come with fluency, such as collation concerns.
[3] C and C++ have a built-in internationalization and localization API, derived from POSIX. However, this API is generally unsuited to the full needs of people who actually care about these topics, so it's not really worth mentioning.
[4] The basic algorithm to encode RFC 2047 strings for any charset is to try to shift characters into the output string until you hit the maximum word length. If the internal character set for Unicode conversion is UTF-16 instead of UTF-32 and the code is ignorant of surrogate concerns, then this algorithm could break surrogates apart. This is exactly how the bug is triggered in Thunderbird.
[5] I'm not discussing Han unification, which is arguably the single most controversial aspect of Unicode.
[6] Official list here means the official set curated by IANA as valid for use in the charset="" parameter. The actual set of values likely to be acceptable to a majority of clients is rather different.
[7] If you've read this far and find internationalization inoperability surprising, you are either incredibly ignorant or incurably optimistic.
[8] I'm not discussing collation (sorting) or word-breaking issues as this post is long enough already. Nevertheless, these also help very much in making you want to run away from internationalization.
[9] I actually, when writing this post, went to double-check to see if Thunderbird correctly implements return-to-ASCII in its encoder, which I can only do by running tests, since I myself find its current encoder impenetrable. It turns out that it does, but it also looks like if we switched conversion to ICU (as many bugs suggest), we may break this part of the specification, since I don't see the ICU converters switching to ASCII at the end of conversion.
[10] Chosen as a very adequate description of what I think of RFC 2047. Look it up if you can't guess it from context.
[11] As measured by implementation in JSMime, comments and whitespace included. This is biased by the fact that I created a unified lexer for the header parser, which rather simplifies the implementation of the actual parsers themselves.
[12] This is, of course, a gross oversimplification, so don't complain that I'm ignoring domain literals or the like. Email addresses will be covered later.
[13] A point of trivia: the 'I' in IDNA2003 is expanded as "Internationalizing" while the 'I' in IDNA2008 is for "Internationalized."
[14] For the technically-minded: IDNA2003 relied on a hard-coded list of banned codepoints in processing, while IDNA2008 derives its lists directly from Unicode codepoint categories, with a small set of hard-coded exceptions.
[15] Certain ASCII characters may require the local-part to be quoted, of course.
[16] Strictly speaking, message/rfc822 remains all-ASCII, and non-ASCII headers need message/global. Given the track record of message/news, I suspect that this distinction will, in practice, not remain for long.
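
The failure mode in footnote [4] can be sketched in a few lines. This is not Thunderbird's encoder, just a hypothetical illustration of what happens when text is cut into fixed-size runs of UTF-16 code units without regard for surrogate pairs; the function names are made up.

```python
def naive_chunks(text: str, units_per_chunk: int):
    """Split text by UTF-16 code units, ignoring surrogate pairs (the buggy approach)."""
    data = text.encode("utf-16-be")
    step = units_per_chunk * 2  # two bytes per UTF-16 code unit
    return [data[i:i + step] for i in range(0, len(data), step)]


def is_valid_utf16(chunk: bytes) -> bool:
    """Check whether a chunk decodes as well-formed UTF-16 on its own."""
    try:
        chunk.decode("utf-16-be")
        return True
    except UnicodeDecodeError:
        return False


text = "ab\U0001F600cd"  # U+1F600 needs a surrogate pair in UTF-16

# A three-unit limit falls between the two halves of the surrogate pair.
broken = [is_valid_utf16(c) for c in naive_chunks(text, 3)]

# A two-unit limit happens to keep the pair together.
intact = [is_valid_utf16(c) for c in naive_chunks(text, 2)]

print(broken, intact)
```

Each chunk would become its own encoded-word, so a split pair produces two words that are individually malformed.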

October 11, 2013 04:07 AM

October 03, 2013

Meeting Notes

Thunderbird: 2013-10-02

October 03, 2013 03:00 AM

October 02, 2013

Meeting Notes

Thunderbird: 2013-10-01

Remember to use headphones and mute yourself when not talking

Feel free to ask questions in the meeting either by speaking up or by asking them in #maildev on IRC.

Other ways to get in touch with us can be found on our communications page

Meeting Changes

Attendees

aceman, clokep, jcranmer, mconley, mkmelin, rolandtanglao, rkent, JosiahOne, sshagarwal

Action items from last meetings

  • [Standard8] mconley nominates Chiaki Ishikawa for friend of the tree. Swag?

  • [Standard8] Are WADA and Aryx from previous nomination covered already?
    • I sent mail for both of these, but I think Standard8 is traveling.

Friends of the tree

  • rkent nominates jcranmer for nearly getting the tree green

Current status and discussions

Critical Issues

  • Roland: no support issues as far as i can tell (other than gmail condstore and other known issues)

  • mconley: add-on reviews seem to be quite behind.

Upcoming

Round Table

  • mconley

  • clokep
    • Quentin Headen has finished up his GSoC — Yahoo chat protocol

    • Currently merging Instantbird changes into c-c in bug 920801
      • Invalid cert handling

      • Yahoo prpl
      • Better character counting in Twitter, IRC
  • jcranmer
    • Awaiting review from Neil on several JSMime patches

    • Planning for EAI and IDN
  • JosiahOne
    • Continuing to knock out theme bugs

    • Animated TB tabs are almost done, just waiting on review.
  • rkent
    • Exquilla is fully out and working reasonably well! (Using TB as an email-client for Exchange Web Services).

      • The hacks are incredible. [mconley says] You have no idea what kind of hacks rkent has to do. They’re unreal. rkent wants to reduce the number of hacks that he has to use in order to reduce fragility.

      • [mconley says] rkent would like to be able to use a separate database for message storage, because Mork is insanity.
      • [mconley says] rkent thinks TB needs sources of monetization. irving loves to work for free, but some of us like to be paid, so that’s what I am working on, to monetize Exquilla.
  • aceman
    • Working to fix breakages that are candidates for TB 24.0.1, e.g. bug 922614, bug 882901, bug 921410.

Question Time

  • [jcranmer]: With Thunderbird 31 it makes sense to have some kind of new feature that we can put into the What’s New page that is easily visible to users. What feature is this?

    • [jcranmer]: Internationalized email addresses? [aceman says] What is not working with them? There were some patches in the past to allow sending to IDN I think (ask mkmelin). (but only email addresses, not server hostnames).

    • [jcranmer]: new addressbook?
    • [jcranmer]: Recording a video message, and sending it through external file storage service as a link (like Filelink)
    • [jcranmer]: OTR chat
  • [jcranmer says] Wanted to talk with asuth and squib about making Gaia email client and Thunderbird share more code. Will send email this afternoon.
  • [rkent]: When is the information going to be publicly available for TB usage

Action Items

  • mconley

    • Thunderbird usage data isn’t public. Standard8 was tasked with getting that information released. What’s the status on that?

Thunderbird Meeting Details :

October 02, 2013 03:00 AM

September 30, 2013

Philipp Kewisch

Thunderbird Developer Tools Wrapup

In my earlier two posts I showed you my work on the Google Summer of Code 2013 Project to bring the Developer Tools to Thunderbird. The method for doing so makes use of Firefox’s remote debugging protocol, which allows the web developer tools available in Firefox to be used to manipulate Thunderbird. More details are covered in the earlier posts. The Summer of Code has now come to an end, so I would like to tell you about my progress, the goals I’ve reached and those my mentor and I have decided are out of scope.

First of all, let me tell you about the remaining features I have implemented since the last post. One of the features is support for the remote inspector. This was pretty easy to do, although support for it is still preliminary. There are still a few quirks, but it’s mostly usable.  You can see here I’ve changed an attribute value:

Devtools Inspector in Action

Next up is support for scratchpad, which is still work in progress on the client side but is almost complete. Here is a screenshot:

Devtools Scratchpad in Action

Also, there is the app manager. This is, as far as I’ve understood, still in a beta stage and aims to be a central place for managing remote devices. Thunderbird is one of these “remote devices”. The app manager shows some information about Thunderbird like its resolution and allows taking screenshots:

app-manager

Finally, I’ve made progress packaging the glue code required for the debugger server into an extension. This is mostly a build system change that allows packaging the code as a restartless addon which I can distribute on addons.mozilla.org. The extension has an option dialog which allows starting and stopping the remote connection. From within Thunderbird this extension is not needed, but it is helpful for other applications based on the Mozilla Platform, like those based on XULRunner. I will post an update when the extension is available on addons.mozilla.org. Here is a screenshot of the options dialog:

Devtools Server Extension Dialog in Action

In my original milestone planning there were a few features considered a bonus. Some of these were not completed. It turns out those extra features are a substantial amount of effort, possibly even worth their own Summer of Code Project.

The first of these two is adding a way to inspect IMAP connections in the network monitor. This requires providing a specific interface in the IMAP channel implementation which makes it possible to inspect the content even after the request has been sent. It also needs to mimic certain aspects of an HTTP channel, specifically the concept of request and response headers. In Thunderbird, the IMAP channel implementation is heavily cached. Hooking up the channel interface to the network monitor would cause cached requests to be displayed as separate requests. Also, this would only fix it for IMAP connections. A better way would be to add a general mechanism in the Mozilla Platform to be able to inspect TCP connections. This requires some changes very deep down in the networking platform and is probably not easy to carry out. I have filed a bug to solve this, but it won’t be a part of the Summer of Code.

The next feature that is missing is gcli, also known as the “Developer Toolbar”, that small black bar you can open in the Web Developer menu that allows executing text commands. The problem here is that the code has a lot of dependencies on Firefox code. A substantial number of files need to be moved from the directory containing Firefox code to a directory common to all XUL applications. Some files also need to be split up. As this feature is a nice-to-have, but not considered vital functionality for Thunderbird Developer Tools, we have decided to postpone it. If you see a need for this feature, please leave a comment describing what you want to do with it. In the meanwhile you can follow the bug on bugzilla.

With this I have covered all the features I have proposed. I’d say it was a very successful Summer of Code. I have managed to reduce the code needed in Thunderbird and made most of the changes inside the Developer Tools code. This makes sure that support for Thunderbird will work in the future without needing updates. Also, new remote features will automatically work, given there is no Firefox-specific code in them.

If you want to jump right in and try it, I have to appeal to your patience. Some of the patches required for functionality are still in review by my mentor and the Mozilla developer tools team. I will let you know once everything is in place. I’m pretty sure we will be able to get all code into the tree by the end of the current cycle.


Tagged: firefox, Inspector, scratchpad, thunderbird, web developer tools

September 30, 2013 06:29 AM

Thunderbird Style Editor, Web Console, Network Monitor and Profiler

As you can see in my earlier post, I have worked on giving the Firefox developer tools access to Thunderbird using the remote protocol. Ultimately this means you will be able to debug and profile scripts, edit styles, view network traffic and view console messages.

Previously, I was able to make the debugger work, which was the most important feature to my mind. Now that the mid-terms are coming up soon, I thought I’d give you an update on what I have achieved. After fixing a few bugs in the developer tools code, I was able to add the remaining actors needed for the profiler, web console and style editor. The network monitor actually worked out of the box.

The code is not yet reviewed or pushed to the tree, so you cannot test it just yet. To bridge the gap, I’d like to present a few screenshots where you can see it working.

This is the web console in action. As you can see, evaluated JavaScript in the web console in Firefox executes in Thunderbird:

Devtools Web Console in Action

This is the style editor. You can do everything you can do in Firefox: Disable style-sheets, change style rules, or add new ones. Here I’ve changed the #today-pane-panel background color to red:

Devtools Style Editor in Action

This is the network monitor. You can see the Lightning calendaring extension connecting to a CalDAV server. In this case I have just added the New Event, which sends a PUT request to the server. Pure socket connections like IMAP are not visible yet, but anything that is HTTP will show up. I will be looking into adding socket connections to the network monitor after the midterms:

Devtools Network Monitor in Action

Finally, the profiler in action. I was able to start the profiler in Firefox, then I did some random actions in Thunderbird. The profiler analyzed which functions were executed, how often, and how long they took, and I could inspect the result in Firefox:

Devtools Profiler in Action

If you want to track my progress on a more detailed level, head over to this issue on bugzilla and put yourself on the CC list. There you will also see what is left to do. I will add a comment there when the current patches are pushed and usable in the nightly builds.


September 30, 2013 06:29 AM

The Thunderbird (Remote) Debugger is alive!

For quite some time now, I have been forced to use printf-style debugging for any work on the Mozilla Calendar Project. In most cases, it’s a real pain. Evaluating variables without restarting is so much more comfortable. There used to be Venkman, but due to ongoing “improvements” in the Mozilla Platform and Firefox, Venkman is broken and is no longer doing the job. When support for the first version of the Javascript Debugger interface (JSD1) is removed, that will be the final nail in the coffin of Venkman.

So it looks like we need an alternative. I’ve heard of lots of interest in creating alternatives, but the deal breaker is often the lack of time to actually work on such a project. In the meanwhile, Mozilla is investing time and resources to add native developer tools to Firefox. Maybe there is some way we can make use of these resources? Yes there is! The developer tools team is doing a great job. And by great I mean outstanding. Thanks to Firefox for Android and Firefox OS, the team designed the debugger in a client-server constellation. The Mozilla Platform provides a debugger server component that is (almost) free of Firefox-specific code. Then there is the very Firefox-specific developer tools client you know from the Firefox Tools Menu.

It became obvious to me that using this debugger server in Thunderbird would be a very future-safe method. In contrast to copying the debugger UI into its own extension and making that compatible with Thunderbird, we just need to ensure that the already very general debugger server is kept clean of hardcoded Firefox-isms. For this reason I have applied to the Google Summer of Code as a student to make it happen.

Although the Summer has just started, I am proud to present a first success. With the latest nightly builds of Thunderbird 24.0a1 and a matching Firefox 24.0a1 nightly, it’s possible to debug Thunderbird code right from your browser. Here is how:

  1. Download a Firefox nightly build.
  2. Download a Thunderbird nightly build.
  3. Start Thunderbird, select Tools → Allow Remote Debugging
  4. Start Firefox, open about:config, set devtools.debugger.remote-enabled to true and restart Firefox
  5. In Firefox, select Tools → Web Developer → Connect…
  6. Fill in connection details in case you changed anything, otherwise localhost port 6000 should be fine
  7. Now you should get a list with “Main Process”. Click on that

And that’s it! Now switch to the debugger tab in Firefox, and after a short load you will start seeing scripts and can set breakpoints. I will be improving support during the next weeks, so other tools can also be used. Track my progress in bug 876636.
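
If Firefox cannot connect in step 5, one quick sanity check is whether anything is listening on the debugger port at all. The sketch below is a generic TCP probe, assuming the default of localhost port 6000 mentioned in step 6; the function name is made up, and it does not speak the debugging protocol itself.

```python
import socket


def debugger_port_open(host: str = "localhost", port: int = 6000,
                       timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # Open and immediately close a connection; no data is exchanged.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or otherwise unreachable.
        return False
```

Calling `debugger_port_open()` after selecting Tools → Allow Remote Debugging in Thunderbird should return `True` if the server started; a `False` result suggests the menu item was not enabled or a different port is configured.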

As I’ve used the term “Remote Debugging” more than once in this post and it has already come up on the bugtracker, I will also tell you a little about privacy. It may sound like we are opening doors here so that anyone who might like to connect to your Thunderbird instance can control it. That is not at all true.

First of all, remote debugging is turned off by default. If you don’t do anything about it, then you won’t even notice it’s there, nor will any attacker. If you do enable remote debugging via the menu, either on purpose or by accident, there is another preference guarding you called devtools.debugger.force-local. The default value for this preference is true, which means that even with “Remote Debugging” enabled, only connections from localhost (i.e. your computer) will be accepted. If you decide to circumvent this too by setting that preference to false, there is yet another wall to save you: If a remote debugger attempts to access your computer, you are presented with a dialog to accept, decline or even disable remote debugging. If you decline or disable, no harm is done.
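
The three layers described above can be summarised as a small decision function. This is a model of the behaviour as described in this post, not the actual debugger server code; the function and parameter names are hypothetical, and it assumes the confirmation dialog only applies to non-local peers, which the description does not state explicitly.

```python
def connection_allowed(remote_enabled: bool, force_local: bool,
                       peer_is_local: bool, user_accepts: bool) -> bool:
    """Model whether an incoming debugger connection would be accepted."""
    # Layer 1: remote debugging is off by default, so nothing is listening.
    if not remote_enabled:
        return False
    # Layer 2: with devtools.debugger.force-local at its default (true),
    # only connections from localhost are accepted.
    if force_local and not peer_is_local:
        return False
    # Layer 3 (assumed to apply to non-local peers only): the user must
    # accept the connection in the dialog.
    if not peer_is_local and not user_accepts:
        return False
    return True


# Defaults: nothing gets in, local or not.
print(connection_allowed(False, True, peer_is_local=False, user_accepts=True))
# Enabled via the menu, default force-local: localhost only.
print(connection_allowed(True, True, peer_is_local=True, user_accepts=False))
```

An outside connection only succeeds when all three guards have been deliberately lowered or accepted.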

If you have any further concerns regarding privacy, please do comment or contact me.


Tagged: developer tools, javascript debugger, lightning, Mozilla, venkman

September 30, 2013 06:29 AM

September 26, 2013

Kent James

Mesquilla is releasing a Calendar EWS Provider addon compatible with ExQuilla

Recently we learned that the “Exchange 2007/2010/2013 Provider” addon is no longer being developed, and the final version did not fully support Thunderbird 24 and Exchange Server 2013 & Office365. While eventually we intend to fully integrate Calendar and Task support into ExQuilla, at the moment that support is incomplete and experimental, so many of our customers had relied on the now-abandoned addon. So as a convenience to our customers, we have forked and updated that addon to provide a version that works with current Thunderbird and Microsoft Exchange, as well as includes better compatibility with ExQuilla. We call this new fork the “Calendar EWS Provider”. Since the original addon is released GPL, so is this one, with code on Bitbucket here.

This addon is currently considered to be beta quality (version 3.2.0-Beta47 as of today, continuing the versioning of the predecessor addon), but you may download the current version from this link:

http://mesquilla.net/ewscalendar-currentrelease.xpi

Calendars configured using the previous addon will not work with this fork, so if you are a current user of the Exchange Provider addon, you should remove any calendars from that and uninstall it before installing the “Calendar EWS Provider” addon (also called ewscalendar for short).

Calendar EWS Provider differs from its predecessor in the following ways:

  1. Because ExQuilla already includes Contacts and GAL support, that was removed.
  2. The storage of passwords has been made compatible with ExQuilla, so if you use ExQuilla on the same account the password is only stored once.
  3. Naming has been changed to remove support links to the previous developer, and replace them with current links.
  4. Various issues and bugs that we have uncovered have been fixed, including issues of compatibility with Office365 and Thunderbird 24. The new xpi file claims compatibility with both Thunderbird 17 and Thunderbird 24.
  5. Because this is a fork with changed capabilities and not a continuation of the existing addon, we felt that internal identifiers needed to be changed to reduce the possibility of conflict with the original addon, though we still do not recommend that you attempt to run both at once.
  6. When creating new calendar entries, the EWS server parameters are copied from the equivalent ExQuilla values if an ExQuilla email is selected. So you should not have to do autodiscover again or manually enter server parameters when adding calendars to existing ExQuilla accounts.
  7. Addon updates now follow the standard Mozilla addon UI (while the previous addon had a custom solution), with updates coming automatically from the mesquilla.net site if you download from there.

Although it would be convenient if the Calendar EWS Provider addon was distributed directly with ExQuilla, the GPL license prevents that, so users must do a separate download and installation. But at the same time, ExQuilla is not required for its operation, so you are free to use this addon for Calendaring without using ExQuilla.

We would appreciate testing of this addon by potential users and reporting of experiences. However, this is new code for us, and considered to be a temporary solution (where “temporary” is approximately 1 year), so we may not be able to immediately attend to all reported issues. Please use the new “Calendar EWS Provider” forum here for any comments or support requests.

September 26, 2013 10:43 PM

September 22, 2013

Calendar

Lightning 2.6 has been released

I am happy to announce the release of Lightning 2.6, compatible with Thunderbird 24 and SeaMonkey 2.21. The release went live on Tuesday, September 17th and is the next major release after Thunderbird 17 and Lightning 1.9. If you are using the Provider for Google Calendar, you will also have to upgrade to version 0.25.

You may not have received the updates automatically because of server side throttling. You can either wait until the update occurs, force checking for updates via the About Thunderbird dialog, or grab them manually:

Before upgrading, be sure to backup your profile so you can restore in case something goes wrong unexpectedly.

The release notes can be found on addons.mozilla.org. An important note for users of Google Calendar via CalDAV (not via Provider for Google Calendar): Due to a server side change at Google, you must update the URL of the calendar, as described in this post.

Should you be experiencing any issues, here are some steps that might resolve them:

  • Make sure that you are running Thunderbird 24 or SeaMonkey 2.21.
  • Redownload and install Lightning using the download links above.
  • If you are using the Provider for Google Calendar, make sure you upgrade to version 0.25.

If you are experiencing issues (Lightning not installing or the calendar not working at all), try removing the Lightning addon and doing a fresh install. Your calendar data will be kept intact, as it is contained in your profile. To be sure, create a profile backup as described above.

If you enjoy this update or want to thank us for the hard work we have done, feel free to leave a review at addons.mozilla.org. If you have issues upgrading, please don’t misuse the reviews. Leave a comment here and I’ll try to get back to you soon! If you are sure you have found a bug, you can also search for it on bugzilla or file a new one if it doesn’t yet exist.

September 22, 2013 10:12 PM

September 21, 2013

Kent James

ExQuilla as a Model for “Thunderbird Professional” aka SwanFox

About a year ago, when we on the Thunderbird team were having very active discussions about the future of Thunderbird after Mozilla’s drastic cutback in funding, Axel Grude and I were minority voices promoting the importance of developing funding sources for Thunderbird if it were to prosper in the future. In contrast, the majority viewpoint, as I understand it, was some combination of 1) We don’t need funding, Thunderbird is fine the way it is, and eager volunteers will move things forward, and 2) Mozilla Messaging tried for years to develop funding and failed, so it is probably impossible to do so.

At the time I floated a proposal, with the code name SwanFox, to put together a commercial entity to develop and package what might be called “Thunderbird Professional” that would combine new features and better support, but would require some sort of payment from users. That entity would have had some sort of preferential relationship with Mozilla that would involve promotion within the community version of Thunderbird and licensing of the Thunderbird trademark, along with the SwanFox team offering resources back to Thunderbird to continue the development of the community version. Ultimately though I concluded that there was inadequate support from the powers-that-be for such an entity, such that any benefits of a formal relationship with Mozilla would come with so many restrictions that it would not be worthwhile, so I dropped public discussions of the concept.

But that does not mean that my opinions have changed, nor that I have been inactive in that period. In fact, the next few months will be crucial as a testing ground for some of the concepts behind the SwanFox proposal, because I am in the process of transitioning my very large ExQuilla addon (which provides email and contacts support for Exchange Web Services) into a pay model. The success or failure of that process will  provide very significant data about the viability of a pay model for something like SwanFox (possibly with ExQuilla forming the foundation).

There are cultural, technical, and financial issues at stake here.

First, the cultural issues. Mozilla tries to maintain a delicate balance between the commercial and open-source worlds, but clearly the public persona leans toward the open source world. There is a natural base of strong opposition to commercialization of any Mozilla-related product. Good examples of that are some of the reactions I have received to the steps I have taken towards commercialization. A few months ago, users of ExQuilla were given a prompt that says “We are transitioning to a pay model soon, but if you want a free six month license click here”. That announcement alone was enough to trigger a slew of what I lovingly call the “I hate you” one-star reviews, with comments like “If I have to pay then I’ll just buy Outlook”, “This crosses the line into scam and I am reporting it to mozilla”, “it pops up a message asking for payment. Complete bullshit”. Meanwhile the discussions with users that I have on my support site are quite encouraging. After all, email is job-critical to most people, and $10 a year is really not very much money to maintain job productivity. What I expect here is to continue to see an aggressive backlash within a minority of the Mozilla community, but overall positive support from the actual users. I expect this backlash to get even worse when in a few months the addon will stop working for non-paying users.

Second, the technical issues. One thing that ExQuilla is proving is that you really can do a massive rewrite of portions of Thunderbird through a binary addon. The core changes required to get Thunderbird to accept a new email protocol have been surprisingly few, while the hacks required within ExQuilla have been enormous. As a result, I am convinced that the way forward for a SwanFox product would be as a massive addon to the core Thunderbird code. That avoids the forking alternative that Postbox took, which left them frozen outside of newer Thunderbird features and security updates.

But the addon approach has its own problems, mostly Mozilla culture related. The Mozilla addon review process is painfully slow (ExQuilla compatible with Thunderbird 24 has been waiting for a month for a review, such that current users have a crisis when Thunderbird updates). This is really not workable for a mission-critical addon. There are also stubborn biases against being too commercially-friendly in Mozilla, such as the issue of dealing with reviews of proprietary code. Although Mozilla has a process to review proprietary code, they are not willing to sign non-disclosure agreements, which creates a major unacceptable legal risk to any serious commercial venture.  It’s pretty clear to me that SwanFox would need to do private-label builds of Thunderbird for the sole purpose of maintaining an independent update channel that they control, rather than relying on Mozilla channels with their commercially-unfriendly features. This is also the approach that ExQuilla takes, maintaining a separate update channel for users independently of Mozilla as well as allowing downloads and updates from the Mozilla addons site.

Finally, the financial issues. ExQuilla is being offered on an annual subscription basis for $10 per year. As users transition to a pay model over the next six months, it’s a big question mark what fraction of current users will convert, and how much resistance future users will have. But preliminary indications are quite positive. In any case, even the most pessimistic estimations of the annual income that ExQuilla will provide, with its tiny user base, are significantly more than the $10,000 per year income that Mozilla earns from Thunderbird. (I’ve tried unsuccessfully to get Mozilla to publicly report their Thunderbird income, but all I have received has been this private estimate). But currently I am quite optimistic that this financial basis will provide a solid future for ExQuilla. If that same $10 per year model were extended to the 20,000,000 estimated users of Thunderbird, even a modest 1% conversion to a SwanFox/Thunderbird Professional would generate $2,000,000 per year, which would be sufficient to ensure a robust future for the Thunderbird project.
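As a quick check of that arithmetic, here is a minimal Python sketch. The user count, price, and 1% rate are the figures quoted above; the two lower conversion rates are hypothetical values added only for comparison, and the function name is illustrative.

```python
# Back-of-the-envelope subscription revenue, using the figures quoted above.
ESTIMATED_USERS = 20_000_000  # estimated Thunderbird users (from the post)
PRICE_PER_YEAR = 10           # dollars per user per year (ExQuilla's price)


def annual_revenue(users: int, conversion_rate: float, price: float) -> float:
    """Yearly income if `conversion_rate` of `users` pay `price` each."""
    return users * conversion_rate * price


if __name__ == "__main__":
    # 1% is the rate used in the text; 0.1% and 0.5% are hypothetical.
    for rate in (0.001, 0.005, 0.01):
        revenue = annual_revenue(ESTIMATED_USERS, rate, PRICE_PER_YEAR)
        print(f"{rate:.1%} conversion: ${revenue:,.0f} per year")
```

At the 1% rate this reproduces the $2,000,000 figure above.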

Contrast the ExQuilla plans with the situation of a sister addon, the GPL-licensed Exchange Provider for Calendar. There was an original version released by Simon Schubert, but that effort was mostly abandoned and replaced by a different version by Michel Verbraak. Recently Michel announced that he too is abandoning development. It is not reasonable to expect a major business-focused feature to be supported indefinitely by a volunteer. Businesses understand this and are willing to pay, but things like the GPL and Mozilla culture sometimes stand in the way.

So I hope that users and developers who care about Thunderbird are cheering for the financial success of ExQuilla, even if you do not use Microsoft Exchange Server, as ExQuilla may end up being the precursor of a Thunderbird Professional that could finally take business users seriously, while providing the financial basis for a robust future for Thunderbird.

September 21, 2013 08:05 PM

September 18, 2013

Meeting Notes

Thunderbird: 2013-09-17

Remember to use headphones and mute yourself when not talking

Feel free to ask questions in the meeting either by speaking up or by asking them in #maildev on IRC.

Other ways to get in touch with us can be found on our communications page

Meeting Changes

Attendees

Usul, jcranmer, Irving, clokep (hearing), mconley, tessarakt, aceman (hearing), sshagarwal

Friends of the tree

  • mconley nominates Chiaki Ishikawa++

Current status and discussions

  • Thunderbird 24 is out! (throttled) – FIREWORKS!!!

  • Thunderbird 17.0.9ESR is out! (unthrottled)
  • \o/

Action items from last meetings

Upcoming

Round Table

  • mconley

    • GSoC is wrapping up

      • Jon Demelo’s connector is pretty much done, and we’ve tested it against Radicale, and it seems to work as advertised. Ensemble now has theoretical connection support for CardDAV. \o/

      • Fallen’s got patches up for Thunderbird dev tools. He’s got Scratchpad working, as well as the Inspector, Web Console…and others. It’s really _really_ awesome. See https://bugzilla.mozilla.org/show_bug.cgi?id=876636
    • Chewed through some reviews yesterday. Still have to do more.
  • tessarakt
  • jcranmer
    • Getting reviews on the new header parser! Should land this week

    • Fixed the perma-bustage on our builds! Just down to the click-to-play plugin bustage
    • Probably busy for the next few days due to school
  • clokep
    • JS-Yahoo is ALMOST preff’ed on by default in Instantbird, working with qheaden to get this done in the next week

    • Will attempt to get this merged into Daily ASAP.
  • wsmwk
    • crashes – still no obvious big bad boys in beta, nor so far ~2 hr into release – will probably be a different story in a couple days. But it’s fascinating to see some early crashes that repeatedly hit the same user multiple times, that don’t have reported bugs – see http://tinyurl.com/omgxfnk
  • Usul
    • had semi organized test – which was a big fail. (maybe too many holidays?)

    • busy and burned out by day job – slowly following on stuff
  • JosiahOne (not at meeting)
    • Very busy with classes now, however I still managed to land a few more theme patches in the past week.

    • New composer UI for Linux and Windows has been slow to work with for a bunch of reasons, however things seem to be functioning properly now. So hopefully it’ll be done within a couple weeks.

Question Time

  • Is there anything TB related planned for Summit?

    • That’s a really good question. I think the agenda is still very fluid.

    • I (clokep) would at least like to know who’ll be in each place and meet them!
      • clokep, mconley, jcranmer, JosiahOne, irving, wsmwk, will be in Toronto

      • A lot of Instantbird people will be in Brussels (aleth, Florian, Even)

Action Items

  • [Standard8] mconley nominates Chiaki Ishikawa for friend of the tree. Swag?

  • [Standard8] Are WADA and Aryx from previous nomination covered already?

Thunderbird Meeting Details :

September 18, 2013 03:00 AM

September 17, 2013

Thunderbird Blog

Updated Thunderbird released today

Mozilla, a global, nonprofit organization dedicated to making the Web better, today released new versions of Mozilla Thunderbird, its free and open source email application, available for Windows, Mac, and Linux.

We are happy to announce that the new Thunderbird release includes several new features and improvements, such as improved message thread management options, as well as the support of email addresses using the newly adopted standard for international domain names.

Thunderbird is now more accessible with the ability to magnify the compose window. This release also includes many bug and security fixes. More details can be found in the release notes.

To get the latest versions of Thunderbird for Windows, Mac or Linux, download Thunderbird from getthunderbird.com or go into the About dialog to upgrade to the latest version.

September 17, 2013 04:47 PM

September 16, 2013

Calendar

Google is changing the Location URL of their CalDAV Calendars

Google has decided to change the authentication mechanism for their CalDAV calendars to OAuth, which required some changes in Lightning to accommodate.

Due to these changes, the URL to access the calendar has also changed. The old endpoint will stop working after September 16th (today!). This affects only Google calendars using CalDAV protocol.

Calendars using the Provider for Google Calendar or iCal read-only access won’t be affected.

How do I know if I’m affected?

Open your calendar’s properties by right-clicking on the calendar name and check if the location starts with https://www.google.com/calendar/dav/. If it does, you are using CalDAV and need to set up your calendar again with the new URL.
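If you have many calendars and would rather check a list of location URLs in one go, the same prefix test can be sketched in a few lines of Python. This is only an illustration: the function name and the example addresses in the usage note are made up, and you would still need to copy the locations out of each calendar’s properties yourself.

```python
# Prefix of the old Google CalDAV endpoint, as quoted above.
OLD_CALDAV_PREFIX = "https://www.google.com/calendar/dav/"


def uses_old_google_caldav(location: str) -> bool:
    """Return True if a calendar location points at the old CalDAV endpoint."""
    return location.strip().startswith(OLD_CALDAV_PREFIX)
```

For example, `uses_old_google_caldav("https://www.google.com/calendar/dav/someone@example.com/events")` is true, while a location under `https://www.google.com/calendar/ical/` is not affected.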

How do I set up the new calendar?

To use the new CalDAV endpoint, you will need Lightning 2.6 and Thunderbird 24, which will be released tomorrow, September 17th. There will be a blog post on the release tomorrow, so please use the navigation to view the new post when it’s there to get the download links. Anyway, here are the steps:

More details on how to set up from Google can be found here. After setting up the new calendar you can safely remove the old one from Lightning. This will only remove it from the calendar list; your events will not be deleted from Google’s servers.

But I want it to work now!

In that case, you will have to use a beta version of Thunderbird and Lightning. You can get Thunderbird 24.0b3 here and the corresponding Lightning 2.6b2 here. Please use these versions only if you really can’t wait; once the release version is out, there will be no support for beta versions.

One more thing…

It’s a little unfortunate that Google is shutting down the API just as the release comes out. I’m pretty sure a few people will blame Lightning’s new version for this (understandably, since that is the only thing they knowingly changed). This means we will get a few support requests disguised as bad reviews; I already saw one today! If you found this blog post useful, please stop by addons.mozilla.org and give us a few stars.

Update on @googlemail.com addresses

If you created your Google account with an @googlemail.com email address, you will need to use this email address in the URL, even if you’ve switched to @gmail.com now! Otherwise it will look like everything works, until you try to add/modify events. I will notify the Google folks so they can get this bug fixed on their end.

Update: Still not working? Here is why!

Unfortunately there are two more problems. First of all, you may be experiencing an instant failure; if you inspect the logs, you might see an error 400. Aside from the location change, Google has also introduced support for webdav-sync. In theory this makes the synchronization process faster, but Google does not implement the fallback mechanism we rely on. We can probably fix this on our side though by updating to a newer version of the webdav-sync draft.

Next up is a bug on our side. When the OAuth token expires, there is an error refreshing the token. I will take care of fixing this for Lightning 2.6.1.

But I really need it to work now!

Ok then, I have one option left for you. For the time being you could switch to the Provider for Google Calendar, an extension that uses the Google Calendar API to connect. You will have to set up the calendar a bit differently, using the new “Google Calendar” option in the new calendar wizard and the XML address as the location URL.

Thanks to Merike Sell for drafting this blog post

September 16, 2013 05:36 PM

Ludovic Hirlimann

PGP key signing party at Mozilla Summit 2013: details

So the summit is coming soon and as promised earlier here are the details on what you need to participate.

Prepare

Make sure you have an official ID (passport, ID card etc …)

Have your PGP fingerprint printed on paper. The printout needs to show your fingerprint and your email address. Here is a little tutorial on how to prepare keys for signing:

1) find your key id :

gpg --list-keys ludovic

pub   3072D/6B17EA1E 2010-02-04
uid                  Ludovic Hirlimann (Work key) <lhirlimann@mozilla.com>
uid                  Ludovic Hirlimann (New stronger key) <ludovic@hirlimann.net>
uid                  [jpeg image of size 28447]
uid                  Ludovic Hirlimann (Work alias) <ludovic@mozilla.com>
sub   4096g/95C69506 2010-02-04

2) print your fingerprint :

Oulan:releng ludo$ gpg --fingerprint 6B17EA1E
pub   3072D/6B17EA1E 2010-02-04
 Empreinte de la clef = AD36 071B AF5B C0A2 72A3  CACC 025A B010 6B17 EA1E
uid                  Ludovic Hirlimann (Work key) <lhirlimann@mozilla.com>
uid                  Ludovic Hirlimann (New stronger key) <ludovic@hirlimann.net>
uid                  [jpeg image of size 28447]
uid                  Ludovic Hirlimann (Work alias) <ludovic@mozilla.com>
sub   4096g/95C69506 2010-02-04


3) copy and paste the above into a document multiple times, print it on paper and cut the copies out.

At the summit


On the 4th of October at 5:30 PM we’ll meet in the hotel lobby in Santa Clara and in Toronto. In Brussels, stay at the summit venues and meet at the entrance. Find Justdave in Toronto, Otto in Brussels and me in Santa Clara. Bring an ID and the papers you’ve prepared as explained above.

We’ll then line up to check IDs and exchange fingerprints.

After the key party

Sign keys and upload them to keyservers. Caff is a nice tool to use to do that on both Linux and OSX.

September 16, 2013 08:03 AM

September 14, 2013

Joshua Cranmer

Why email is hard, part 1: architecture

Which is harder, writing an email client or writing a web browser? Several years ago, I would have guessed the latter. Having worked on an email client for several years, I am now more inclined to guess that email is harder, although I never really worked on a web browser, so perhaps it's just bias. Nevertheless, HTML comes with a specification that tells you how to parse crap that pretends to be HTML; email messages come with no such specification, which forces people working with email to guess based on other implementations and bug reports. To vent some of my frustration with working with email, I've decided to post some of my thoughts on what email did wrong and why it is so hard to work with. Since there is so much to talk about, instead of devoting one post to it, I'll make it an ongoing series with occasional updates (i.e., updates will come out when I feel like it, so don't bother asking).

First off, what do I mean by an email client? The capabilities of, say, Outlook versus Gaia Email versus Thunderbird are all wildly different, and history has afforded many changes in support. I'll consider anything that someone might want to put in an email client as fodder for discussion in this series (so NNTP, RSS, LDAP, CalDAV, and maybe even IM stuff might find discussions later). What I won't consider are things likely to be found in a third-party library, so SSL, HTML, low-level networking, etc., are all out of scope, although I may mention them where relevant in later posts. If one is trying to build a client from scratch, the bare minimum one needs to understand first is the basic message formatting, MIME (which governs attachments), SMTP (email delivery), and either POP or IMAP (email receipt). Unfortunately, each of these requires cross-referencing a dozen RFCs individually when you start considering optional or not-really-optional features.

The current email architecture we work with today doesn't have a unique name, although "Internet email" [1] or "SMTP-based email" are probably the most appropriate appellations. Since there is only one in use in modern times, there is no real need to refer to it by anything other than "email." The reason for the use of SMTP in lieu of any other major protocol to describe the architecture is because the heart of the system is motivated by the need to support SMTP, and because SMTP is how email is delivered across organizational boundaries, even if other protocols (such as LMTP) are used internally.

Some history of email, at least the part that led up to SMTP, is in order. In the days of mainframes, mail generally only meant communicating between different users on the same machine, and so a bevy of incompatible systems started to arise. These incompatible systems grew to support connections with other computers as networking computers became possible. The ARPANET project brought with it an attempt to standardize mail transfer on ARPANET, separated into two types of documents: those that standardized message formats, and those that standardized the message transfer. These would eventually culminate in RFC 822 and RFC 821, respectively. SMTP was designed in the context of ARPANET, and it was originally intended primarily to standardize the messages transferred only on this network. As a result, it was never intended to become the standard for modern email.

The main competitor to SMTP-based email that is worth discussing is X.400. X.400 was at one time expected to be the eventual global email interconnect protocol, and interoperability between SMTP and X.400 was a major focus in the 1980s and 1990s. SMTP has a glaring flaw, to those who work with it, in that it is not so much designed as evolved to meet new needs as they came up. In contrast, X.400 was designed to account for a lot of issues that SMTP hadn't dealt with yet, and included arguably better functionality than SMTP. However, it turned out to be a colossal failure, although theories differ as to why. The most convincing to me boils down to X.400 being developed at a time of great flux in computing (the shift from mainframes to networked PCs) combined with a development process that was ill-suited to reacting quickly to these changes.

I mentioned earlier that SMTP eventually culminates in RFC 821. This is a slight lie, for one of the key pieces of the Internet, and a core of the modern email architecture, didn't exist. That is DNS, which is the closest thing the Internet has to X.500 (a global, searchable directory of everything). Without DNS, figuring out how to route mail via SMTP is a bit of a challenge (hence why SMTP allowed explicit source routing, deprecated post-DNS in RFC 2821). The documents which lay out how to use DNS to route are RFC 974, RFC 1035, and RFC 1123. So it's fair to say that RFC 1123 is really the point at which modern SMTP was developed.

But enough about history, and on to the real topic of this post. The most important artifact of SMTP-based architecture is that different protocols are used to send email from the ones used to read email. This is both a good thing and a bad thing. On the one hand, it's easier to experiment with different ways of accessing mailboxes, or only supporting limited functionality where such is desired. On the other, the need to agree on a standard format still keeps all the protocols more or less intertwined, and it makes some heavily-desired features extremely difficult to implement. For example, there is still, thirty years later, no feasible way to send a mail and save it to a "Sent" folder on your IMAP mailbox without submitting it twice [2].

The greatest flaws in the modern architecture, I think, lie in a bevy of historical design mistakes which remain unmitigated to this day, in particular in the base message format and MIME. Changing these specifications is not out of the question, but the rate at which the changes become adopted is agonizingly slow, to the point that changing is generally impossible unless necessary. Sending outright binary messages was proposed as experimental in 1995, proposed as a standard in 2000, and still remains relatively unsupported: the BINARYMIME SMTP keyword only exists on one of my 4 SMTP servers. Sending non-ASCII text is potentially possible, but it is still not used in major email clients to my knowledge (searching for "8BITMIME" leads to the top results generally being "how do I turn this off?"). It will be interesting to see how email address internationalization is handled, since it's the first major overhaul to email since the introduction of MIME, making it the first major overhaul in 16 years. Intriguingly enough, the NNTP and Usenet communities have shown themselves to be more adept at change: sending 8-bit Usenet messages generally works, and yEnc would have been a worthwhile addition to MIME if its author had ever attempted to push it through. His decision not to (with the weak excuses he claimed) is emblematic of the resistance of the architecture to change, even in cases where such change would be pretty beneficial.

My biggest complaint with the email architecture isn't actually really a flaw in the strict sense of the term but rather a disagreement. The core motto of email could perhaps be summed up with "Be liberal in what you accept and conservative in what you send." Now, I come from a compilers background, and the basic standpoint in compilers is, if a user does something wrong, to scream at them for being a bloody idiot and to reject their code. Actually, there's a tendency to do that even if they do something technically correct but possibly unintentionally wrong. I understand why people dislike this kind of strict checking, but I personally consider it to be a feature, not a bug. My experience with attempting to implement MIME is that accepting what amounts to complete crap not only means that everyone has to worry about parsing the crap, but it actually ends up encouraging it. The attitude people get in bugs starts becoming "this is supported by <insert other client>, and your client is broken for not supporting it," even when pointed out that their message is in flagrant violation of the specification. As I understand it, HTML 5 has the luxury of specifying a massively complex parser that makes /dev/urandom in theory reliably parsed across different implementations, but there is no such similar document for the modern email message. But we still have to deal with the utter crap people claim is a valid email message. Just this past week, upon sifting through my spam folder, I found a header which is best described as =?UTF-8?Q? ISO-8859-1, non-ASCII text ?= (spaces included). The only way people are going to realize that their tools are producing this kind of crap is if their tools stop working altogether.

These two issues come together most spectacularly when RFC 2047 is involved. This is worth a blog post by itself, but the very low technically-not-but-effectively-mandatory limit on the header length (to aid people who read email without clients) means that encoded words need to be split up to fit on header lines. If you're not careful, you can end up splitting multibyte characters between different encoded words. This unfortunately occurs in practice. Properly handling it in my new parser required completely reflowing the design of the innermost parsing function and greatly increasing implementation complexity. I would estimate that this single attempt to "gracefully" handle a wrong-but-of-obvious-intent scenario is worth 15% or so of the total complexity of decoding RFC 2047-encoded text.
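To make the split-character problem concrete, here is a minimal Python sketch. It is not the parser discussed in this post: the function names and the sample header are made up for illustration, it ignores unencoded text and RFC 2231 language suffixes, and it only shows the general idea of joining the raw bytes of adjacent encoded words that share a charset before decoding them.

```python
import base64
import re

# Matches one RFC 2047 encoded word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=")


def payload_bytes(encoding: str, text: str) -> bytes:
    """Turn the payload of one encoded word back into raw bytes."""
    if encoding.upper() == "B":
        return base64.b64decode(text)
    # Q encoding: "_" stands for a space, "=XX" for the byte with hex value XX.
    raw = text.replace("_", " ").encode("ascii")
    return re.sub(rb"=([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), raw)


def decode_each_word(header: str) -> str:
    """Strict reading: decode every encoded word on its own."""
    return "".join(
        payload_bytes(enc, text).decode(charset, errors="replace")
        for charset, enc, text in ENCODED_WORD.findall(header)
    )


def decode_joined(header: str) -> str:
    """Lenient reading: join bytes of adjacent same-charset words first."""
    out = []
    pending = b""
    pending_charset = None
    for charset, enc, text in ENCODED_WORD.findall(header):
        charset = charset.lower()
        if pending_charset is not None and charset != pending_charset:
            # Charset changed: flush what has been collected so far.
            out.append(pending.decode(pending_charset, errors="replace"))
            pending = b""
        pending_charset = charset
        pending += payload_bytes(enc, text)
    if pending_charset is not None:
        out.append(pending.decode(pending_charset, errors="replace"))
    return "".join(out)


# Hypothetical header: the two UTF-8 bytes of "é" (C3 A9) are split
# across two encoded words.
SPLIT_HEADER = "=?UTF-8?Q?caf=C3?= =?UTF-8?Q?=A9?="
```

With `SPLIT_HEADER`, `decode_each_word` yields "caf" followed by two replacement characters, while `decode_joined` recovers "café". Even this toy version needs extra state to carry bytes across word boundaries, which hints at where the added complexity comes from.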

There are other issues with modern email, of course, but all of the ones that I've collected so far are not flaws in the architecture as a whole but rather flaws of individual portions of the architecture, so I'll leave them for later posts.

[1] The capital 'I' in "Internet email" is important, as it's referring to the "Internet" in "Internet Standard" or "Internet Engineering Task Force." Thus, "Internet email" means "the email standards developed for the Internet/by the IETF" and not "email used on the internet."
[2] Yes, I know about BURL. It doesn't count. Look at who supports it: almost nobody.

September 14, 2013 10:50 PM

September 13, 2013

Rumbling Edge - Thunderbird

2013-09-13 Calendar builds

Common (excluding Website bugs)-specific: (4)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

September 13, 2013 08:33 PM

2013-09-13 Thunderbird comm-central builds

Thunderbird-specific: (20)

MailNews Core-specific: (17)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

September 13, 2013 08:32 PM

September 08, 2013

Robert Kaiser

Wanted Apps: Simple IRC

At Mozilla, as well as a lot of other Free / Open Source Software projects, IRC (Internet Relay Chat) is the backbone of real-time communication within the project.

The beauty of this chat service is that it's a really simple and lightweight protocol and there's a ton of different clients to access it, including for example ChatZilla, which is written completely in JavaScript, and multiple web-based clients. The latter would make nice Firefox OS apps, one might think, but their major downsides are that they are all running on the server-side and not really on the device you installed the app in - and their UI doesn't really fit the phone form factor.

Now, what I'd really like to see, though, is an app that runs locally on the Firefox OS device in its entirety, and which has a UI that is useful and nice on a phone. Especially the latter might mean not implementing all the fancy functionality that many IRC clients have, but only those parts required for some simple chatting.

We have the technology to run a full IRC client on a Firefox OS phone with the TCPSocket API, and the simplicity of the IRC protocol would make it a nice reason for someone who wants to play with this API.
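To give a feel for how simple the protocol is, here is a rough sketch of parsing a single IRC line and answering a server PING. It is written in Python purely for illustration; a real Firefox OS app would be JavaScript on top of TCPSocket, which this sketch does not touch. The class and function names are made up, and the line layout follows the basic "[:prefix] command params [:trailing]" form from RFC 1459.

```python
from typing import List, NamedTuple, Optional


class IrcMessage(NamedTuple):
    prefix: Optional[str]  # sender, e.g. "nick!user@host", if present
    command: str           # e.g. "PRIVMSG" or "PING"
    params: List[str]      # middle parameters plus the trailing text


def parse_irc_line(line: str) -> IrcMessage:
    """Split one raw IRC line into prefix, command and parameters."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The optional prefix runs up to the first space.
        prefix, _, line = line[1:].partition(" ")
    if " :" in line:
        # Everything after " :" is a single trailing parameter with spaces.
        head, _, trailing = line.partition(" :")
        params = head.split()
        params.append(trailing)
    else:
        params = line.split()
    if not params:
        raise ValueError("IRC line has no command")
    command = params.pop(0).upper()
    return IrcMessage(prefix, command, params)


def pong_reply(message: IrcMessage) -> str:
    """Build the reply a client sends to keep the connection alive."""
    return "PONG :" + message.params[-1] + "\r\n"
```

Parsing `":nick!user@host PRIVMSG #channel :running late"` gives the command `PRIVMSG` with the parameters `#channel` and `running late`, which is most of what a simple chat UI needs to display.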

The UI, OTOH, would make a very interesting challenge for someone who likes UX design, as on a phone, you need to be way more minimalistic, and you probably need to consciously decide what functionality and which elements to leave out, or implement completely differently than what we might be used to.

I'd really love to be able to have an easy way to tell my manager via IRC that I'll be late for a 1:1 while I'm on my way, or be able to make a quick inquiry in a chat channel while I'm traveling.

Anyone up for the challenge?

September 08, 2013 12:57 AM

September 06, 2013

Ludovic Hirlimann

Extra summit activity #part 1 Santa Clara Photowalks

Like I did at the previous summits and previous mozcamps, I’ll be going with my friend Roland on photo walks early in the morning in Santa Clara. We haven’t planned much yet and we usually don’t. Feel free to join us at 6:30 am on Friday @ the Hotel reception. We’ll figure out times better for the other days, but 6:30 gives us a good hour of walking and leaves us time for breakfast before the summit activity starts.

September 06, 2013 05:30 AM

August 29, 2013

Calendar

Lightning 2.6b2 has been uploaded, some locales broken

Lightning 2.6 is due on September 17th and will be a major release. The last stable release was Lightning 1.9 and the only testing that has been done since is a little testing on the beta channel. I say this is not enough!

One problem we have is that a handful of locales are broken (only one bug report, by the way!). This is mostly due to some changes we had to undertake regarding the l10n dashboard, making it hard for localizers to figure out where to sign off.

If you can read a language other than English, please take a few minutes for the following steps:

  1. Download Lightning 2.6b2 from addons.mozilla.org together with Thunderbird 24
  2. After starting the app, check the error console for anything that looks bad. Warnings are probably ok. Clear the error console.
  3. Open every dialog you know of (print, event, task, summary, preferences, calendar properties, …). Make sure it looks normal.
  4. Set up a remote calendar of your choice, make sure you can connect.
  5. Check the error console again to see if anything alarming was added.

You can of course still help if English is your only language. Please do a few tests to make sure everything works. You might as well upgrade your original Lightning installation, because this is what you will receive in about 3 weeks anyway. The earlier we find issues, the higher the chance we can fix them before they reach everyone.

Thank you for your support!

August 29, 2013 07:58 PM

August 24, 2013

Rumbling Edge - Thunderbird

2013-08-24 Calendar builds

Common (excluding Website bugs)-specific: (11)

Sunbird will no longer be actively developed by the Calendar team.

Windows builds Official Windows

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

August 24, 2013 07:23 PM

2013-08-24 Thunderbird comm-central builds

Thunderbird-specific: (22)

MailNews Core-specific: (28)

Windows builds Official Windows, Official Windows installer

Linux builds Official Linux (i686), Official Linux (x86_64)

Mac builds Official Mac

August 24, 2013 07:22 PM