## December 09, 2013

### Sean McArthur — Tally

Tally:

I had need of a simple counter application on my phone, and I scoured the Play Store for a good one. They all have atrocious UIs, with far too many buttons and features. I found one that was nicely designed, Tap Counter, but the touch target was too small, only directly in the center, making it very easy to miss when counting people and not looking at the screen.

So, over the weekend, I made one.

It’s a simple, beautiful Holo design. It picks one of the Android Holo colors for the circle upon app launch, so it will change each time you use it. There’s only one button, and it takes up the whole screen. To reset, simply hold down that button, and you’ll get a fun animation.

That’s it. That’s Tally.

### Armen Zambrano Gasparnian — Killing ESR17 and Thunderbird-ESR17

As I raised on dev.planning last week, today we will be disabling ESR17. The post is reproduced below (with some small fixes):

Hello all,

Next week, we will have our next merge date [1] on Dec. 9th, 2013.

As part of that merge day we will be killing the ESR17 [2][3] builds and tests on tbpl.mozilla.org as well as Thunderbird-Esr17 [4]. This is part of our normal process where two merge days after the creation of the latest ESR release (e.g. ESR24 [5]) we obsolete the last one (e.g. ESR17).

On an unrelated note to this post, we will be creating updates from ESR17 to ESR24 even after that date, however, we will have no more builds and tests on check-in.

Please let me know if you have any questions.

regards,
Armen

#############
Zambrano Gasparnian, Armen (armenzg)
Mozilla Senior Release Engineer
https://mozillians.org/en-US/u/armenzg/
http://armenzg.blogspot.ca

[1] https://wiki.mozilla.org/RapidRelease/Calendar
[2] https://wiki.mozilla.org/Enterprise/Firefox/ExtendedSupport:Proposal
[3] https://tbpl.mozilla.org/?tree=Mozilla-Esr17
[4] https://tbpl.mozilla.org/?tree=Thunderbird-Esr17
[5] https://tbpl.mozilla.org/?tree=Mozilla-Esr24

This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

### Aki Sasaki — LWR (job scheduling) part iv: drilling down into the dependency graph

I already wrote a bit about the dependency graph here, and :catlee wrote about it here. While I was writing part 2, it became clear that

1. I had a lot more ideas about the dependency graph, enough for its own blog post, and
2. since I want to tackle writing the dependency graph first, solidifying my ideas about it beforehand would be beneficial to writing it.

I've been futzing around with graphviz with :hwine's help. Not half as much fun as drawings on napkins, but hopefully they make sense. I'm still thinking things through.

#### jobs and graphs

A quick look at TBPL was enough to convince me that the dependency graph would be complex enough just describing the relationships between jobs. The job details should be separate. Per-checkin, nightly, and periodic-PGO dependency graphs trigger overlapping sets of jobs, so avoiding duplicate job definitions is a plus.

We'll need to submit both the dependency graph and the associated job definitions to LWR together. More on how I think jobs and graphs could work in the db below in part 5.

For phase 1, I think job definitions will only cover enough to feed into buildbot and have them work.
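
As a very rough sketch of what that combined submission could look like, here is one possible shape; every field name below is hypothetical, not a settled format:

    # Hypothetical payload: one dependency graph plus the job definitions
    # it references, submitted to LWR together in a single API call.
    submission = {
        "graph": {
            "name": "per-checkin",
            "nodes": {
                "build-linux64": {"job": "linux64-build", "upstream": []},
                "mochitest-1": {
                    "job": "linux64-mochitest-1",
                    "upstream": ["build-linux64"],
                },
            },
        },
        "jobs": {
            # phase 1: just enough detail to feed into buildbot
            "linux64-build": {"buildername": "linux64 build"},
            "linux64-mochitest-1": {"buildername": "linux64 mochitest-1"},
        },
    }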

#### dummy jobs

• In my initial dependency graph thoughts, I mentioned breakpoint jobs as a throwaway idea, but it's stuck with me.

We could use these at the beginning of graphs that we want to view or edit in the web app before proceeding. Or if we submit an experimental patch to Try and want to verify the results after a specific job or set of jobs before proceeding further. Or if we want to represent QA signoff in a release graph, and allow them to continue the release via the web app.

I imagine we would want a request timeout on this breakpoint, after which it's marked as timed out, and all child jobs are skipped. I imagine we'd also want to set an ACL on at least a subset of these, to limit who can sign off on releases.

Also in releases, we have simple notification jobs that send email when the release has passed certain milestones. We could later potentially support IRC pings and bug comments.

A highly simplified representation of part of a release:

We currently continue the release via manual Release Engineering intervention, after we see an email "go". It would be great to represent it in the dependency graph and give the correct group of people access. Less RelEng bottleneck.
• We could also have timer jobs that pause the graph until either cancelled or the timeout is hit. So if you want this graph to run at 7pm PST, you could schedule the graph with an initial timer job that marks itself successful at 7, triggering the next steps in the graph.
• In buildbot, we currently have a dummy factory that sleeps for 5 seconds and exits successfully. We used this back in the dark ages to skip certain jobs in a release, since we could only restart the release from the beginning; by replacing long-running jobs with dummy jobs, we could start at the beginning and still skip the previously successful portions of the release.

We could use dummy jobs to:

1. simplify the relationships between jobs. In the above graph, we avoided a many-to-many relationship by inserting a notification job in between the linux jobs and the updates.
2. trigger when certain groups of jobs finish (e.g. all linux64 mochitests), so downstream jobs can watch for the dummy job in Pulse rather than having to know how many chunks of mochitests we expect to run, and keep track as each one finishes.
3. quickly test dependency graph processing: instead of waiting for a full build or test, replace it with a dummy job. For instance, we could set all the jobs of a type to "success" except one "timed out; retry" to test max retry limits quickly. This assumes we can set custom exit statuses for each dummy job, as well as potentially pointing at pre-existing artifact manifest URLs for downstream jobs to reference.

Looking at this list, it appears to me that timer and breakpoint jobs are pretty close in functionality, as are notification and dummy (status?) jobs. We might be able to define these in one or two job types. And these jobs seem simple enough that they may be runnable on the graph processing pool, rather than calling out to SlaveAPI/MozPool for a new node to spawn a script on.
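
For example, a unified "control job" type might be described with just a handful of fields. This is only a sketch, and all of the names below are made up:

    # A breakpoint: waits for a human (or a timeout) before the graph
    # continues. All field names below are hypothetical.
    breakpoint_job = {
        "type": "breakpoint",
        "request_timeout": 4 * 3600,   # seconds to wait before giving up
        "on_timeout": "timed out",     # then all child jobs are skipped
        "acl": ["release-signoff"],    # who may mark this job finished
    }

    # A timer: marks itself successful at a given time, triggering
    # the next steps in the graph.
    timer_job = {
        "type": "timer",
        "fire_at": "2013-12-09T19:00:00-08:00",  # absolute clock time
        "on_fire": "completed successful",
    }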

#### statuses

At first glance, it's probably easiest to reuse the set of TBPL statuses: success, warning, failure, exception, retry. But there are also the grey statuses 'pending' and 'running'; the pink status 'cancelled'; and the statuses 'timed out' and 'interrupted', which are subsets of the first five.

Some statuses I've brainstormed:

• inactive (skipped during scheduling)
• request cancelled
• pending blocked by dependencies
• pending blocked by infrastructure limits
• skipped due to coalescing
• skipped due to dependencies
• request timed out
• running
• interrupted due to user request
• interrupted due to network/infrastructure/spot instance interrupt
• interrupted due to max runtime timeout
• interrupted due to idle time timeout (no output for x seconds)
• completed successful
• completed warnings
• completed failure
• retried (auto)
• retried (user request)

The "completed warnings" and "completed failure" statuses could be split further into "with crash", "with memory leak", "with compilation error", etc., which could be useful to specify, but are job-type-specific.

If we continue as we have been, some of these statuses are only detectable by log parsing. Differentiating these statuses allows us to act on them in a programmatic fashion. We do have to strike a balance, however. Adding more statuses to the list later might force us to revisit all of our job dependencies to ensure the right behavior with each new status. Specifying non-useful statuses at the outset can lead to unneeded complexity and cruft. Perhaps 'state' could be separated from 'status', where 'state' is in the set ('inactive', 'pending', 'running', 'interrupted', 'completed'); we could also separate 'reasons' and 'comments' from 'status'.
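
A tiny sketch of what that separation might look like (the names here are mine, not a spec):

    # 'state' comes from a small fixed set; 'status', 'reasons', and
    # 'comments' carry the finer-grained detail separately.
    STATES = ('inactive', 'pending', 'running', 'interrupted', 'completed')

    job_status = {
        'state': 'completed',             # one of STATES
        'status': 'warnings',             # success / warnings / failure / ...
        'reasons': ['with memory leak'],  # job-type-specific detail
        'comments': 'leak threshold exceeded on mochitest-3',
    }

    def is_finished(job_status):
        # dependency logic can branch on the coarse state alone,
        # without enumerating every fine-grained status
        return job_status['state'] == 'completed'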

Timeouts are split into request timeouts or runtime timeouts (idle timeouts, max job runtime timeouts). If we hit a request timeout, I imagine the job would be marked as 'skipped'. I also imagine we could mark it as 'skipped successful' or 'skipped failure' depending on configuration: the former would work for timer jobs, especially if the request timeout could be specified by absolute clock time in addition to relative seconds elapsed. I also think both graphs and jobs could have request timeouts.

I'm not entirely sure how to coalesce jobs in LWR, or if we want to. Maybe we leave that to graph and job prioritization, combined with request timeouts. If we did coalesce jobs, that would probably happen in the graph processing pool.

For retries, we need to track max [auto] retries, as well as job statuses per run. I'm going to go deeper into this below in part 5.

#### relationships

For the most part, I think relationships between jobs can be shown by the following flowchart:

If we mark job 2 as skipped-due-to-dependencies, we need to deal with that somehow if we retrigger job 1. I'm not sure if that means we mark job 2 as "pending-blocked-by-dependencies" if we retrigger job 1, or if the graph processing pool revisits skipped-due-to-dependencies jobs after retriggered jobs finish. I'm going to explore this more in part 5, though I'm not sure I'll have a definitive answer there either.

It should be possible, at some point, to block the next job until we see a specific job status:

• don't run until this dependency is finished/cancelled/timed out
• don't run unless the dependency is finished and marked as failure
• don't run unless the dependency is finished and there's a memory leak or crash

For the most part, we should be able to define all of our dependencies with this type of relationship: block this job on (job X1 status Y1, job X2 status Y2, ...). A request timeout with a predefined behavior-on-expiration would be the final piece.
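
Expressed as data, each job could carry a list of (upstream job, acceptable statuses) pairs. A minimal sketch of the check, with hypothetical structures:

    def runnable(job, current_statuses):
        # `job['blocked_on']` is a list of (upstream_name, wanted_statuses)
        # pairs; `current_statuses` maps job names to their latest status.
        return all(
            current_statuses.get(upstream) in wanted
            for upstream, wanted in job['blocked_on']
        )

    # e.g. "don't run until this dependency is finished, whatever the result":
    job = {'blocked_on': [
        ('build-linux64', {'completed successful',
                           'completed warnings',
                           'completed failure'}),
    ]}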

I could see more powerful commands, like "cancel the rest of the [downstream?] jobs in this graph", "retrigger this other job in the graph", or "increase the request timeout for this other job", being potentially useful. Perhaps we could add those to dummy status jobs. I could also see them significantly increasing the complexity of graphs, including the potential for infinite recursion in some constructs.

I think I should mark any ideas that potentially introduce too much complexity as out of scope for phase 1.

#### branch specific definitions

Since job and graph definitions will be in-tree, riding the trains, we need some branch-specific definitions. Is this a PGO branch? Are nightlies enabled on this branch? Are all products and platforms enabled on this branch?

This branch definition config file could also point at a revision in a separate, standalone repo for its dependency graph + job definitions, so we can easily refer to different sets of graph and job definitions by SHA. I'm going to explore that further in part 5.

I worry about branch merges overwriting branch-specific configs. The inbound and project branches have different branch configs than mozilla-central, so it's definitely possible. I think the solution here is a generic branch-level config, and an optional branch-named file. If that branch-named file doesn't exist, use the generic default. (e.g. generic.json, mozilla-inbound.json) I know others disagree with me here, but I feel pretty strongly that human decisions need to be reduced or removed at merge time.
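
The fallback itself could be a few lines; a sketch, assuming the configs live together in one directory:

    import os
    import json

    def load_branch_config(branch, config_dir="branch_configs"):
        """Use the branch-named config if it exists, else the generic default.

        e.g. mozilla-inbound.json if present, otherwise generic.json;
        nothing for a human to decide at merge time.
        """
        specific = os.path.join(config_dir, "%s.json" % branch)
        if not os.path.exists(specific):
            specific = os.path.join(config_dir, "generic.json")
        with open(specific) as f:
            return json.load(f)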

#### graphs of graphs

I think we need to support graphs-of-graphs. B2G jobs are completely separate from Firefox desktop or Fennec jobs; they only start with a common trigger. Similarly, win32 desktop jobs have no real dependencies on macosx desktop jobs. However, it's useful to refer to them as a single set of jobs, so if graphs can include other graphs, we could create a superset graph that includes the appropriate product- and platform- specific graphs, and trigger that.

If we have PGO jobs defined in their own graph, we could potentially include it in the per-checkin graph with a branch config check. On a per-checkin-PGO branch, the PGO graph would be included and enabled in the per-checkin graph. Otherwise, the PGO graph would be included, but marked as inactive; we could then trigger those jobs as needed via the web app. (On a periodic-PGO branch, a periodic scheduler could submit an enabled PGO graph, separate from the per-checkin graph.)

It's not immediately clear to me if we'll be able to depend on a specific job in a subgraph, or if we'll only be able to depend on the entire subgraph finishing. (For example: can an external graph depend on the linux32 mochitest-2 job finishing, or would it need to wait until all linux32 jobs finish?) Maybe named dummy status jobs will help here: graph1.start, graph1.end, graph1.builds_finished, etc. Maybe I'm overthinking things again.

We need a balancing act between ease of reading and ease of writing; ease of use and ease of maintenance. We've seen the mess a strong imbalance can cause, in our own buildbot configs. The fact that we're planning on making the final graph easily viewable and testable without any infrastructure dependencies helps, in this regard.

#### graphbuilder.py

I think graphbuilder.py, our [to be written] dependency graph generator, may need to cover several use cases:

• Create a graph in an api-submittable format. This may be all we do in phase 1, but the others are tempting...
• Combine graphs as needed, with branch-specific definitions, and user customizations (think TryChooser and per-product builds).
• Verify that this is a well-formed graph.
• Run other graph unit tests, as needed.
• Potentially output graphviz files for user-friendly local graph visualization?
• It's unclear if we want it to also do the graph+job submitting to the api.
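
For the "well-formed graph" check, a minimal sketch might verify that every node references a known job definition and that the graph is acyclic; the data shapes here match the hypothetical payload sketched earlier, and none of this is settled:

    def is_well_formed(nodes, jobs):
        # `nodes` maps node name -> {"job": ..., "upstream": [...]};
        # `jobs` maps job names to definitions. Hypothetical structures.
        # 1. every node must reference a known job definition
        for node, info in nodes.items():
            if info["job"] not in jobs:
                return False
        # 2. the dependency graph must be acyclic (Kahn's algorithm)
        indegree = dict((n, len(info["upstream"])) for n, info in nodes.items())
        ready = [n for n, d in indegree.items() if d == 0]
        seen = 0
        while ready:
            node = ready.pop()
            seen += 1
            for n, info in nodes.items():
                if node in info["upstream"]:
                    indegree[n] -= 1
                    if indegree[n] == 0:
                        ready.append(n)
        return seen == len(nodes)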

I think the per-checkin graph would be good to build first; the nightly and PGO graphs, as well as the branch-specific defines, might also be nice to have in phase 1.

I have 4 more sections I wrote skeletons for. Since those sections are more db-oriented, I'm going to move those into a part 5.

In part 1, I covered where we are currently, and what needs to change to scale up.
In part 2, I covered a high level overview of LWR.
In part 3, I covered some hand-wavy LWR specifics, including what we can roll out in phase 1.
In part 5, I'm going to cover some dependency graph db specifics.
Now I'm going to meet with the A-team about this, take care of some vcs-sync tasks, and start writing some code.


## December 08, 2013

### K Lars Lohn — Socorro Support Classifiers

Socorro has a new feature: classifiers inside the processors. It is a programmatic way of tagging a crash. The Support Classifiers are based on the TransformRule system added to the Socorro Processors way back in February of 2012.

Think of the term “Support” as a category for a set of tags.  Another term for “category” in this context is “facet”. Support Classifiers are intended for helping with user support.  For example, let's say we're getting crashes from installations of Firefox for which there are known support articles.  This tagging system will make it simpler to associate the crash with the support article.  Eventually, we could implement a system where the user could be automatically directed to a support article based on how the processor categorized the crash.

Classifications are defined by a list of rules. The first rule to match gets to assign the classification. Rules are in two parts: a predicate and an action:
• predicate: a Python function that implements a test to see if a condition within the raw and/or processed crash is True
• action: a Python function that will attach a tag to the appropriate place within a processed crash
In the initial implementation, there is only one Support Classifier rule. It is called the BitguardClassifier. It tests to see if the list of loaded modules contains “bitguard.dll”. If that module is present, then the classification “bitguard” is assigned to the Support classification. Since that is the only rule defined so far, “bitguard” is the only possible value. We hope to add more as this feature becomes more well known and we move toward engaging our users about crashes.

Support Classifiers are the second implementation of Classifiers within the processor. The first was the experimental SkunkClassifiers. We can add as many Classifiers as we wish. While both Support and Skunk classifiers define a single facet with a single value, more complex rules could add multiple classifications. Sets of rules can work together in many ways: apply all rules, apply rules until one fails, apply rules until one succeeds, etc.

Support Classifiers are pretty simple, since they can only assign one value to the facet “support”. Each rule is tried one at a time, and the first one to succeed gets to assign the value. The Skunk Classifiers work the same way. Future Signature Classifiers could include alternate or experimental signature generation algorithms.
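
A simplified sketch of that first-match loop, using the _predicate/_action method names described below; this is an illustration, not the actual Socorro implementation:

    def apply_until_one_succeeds(rules, raw_crash, processed_crash, processor):
        # Try each rule in order; the first rule whose predicate matches
        # gets to run its action and assign the classification.
        for rule in rules:
            if rule._predicate(raw_crash, processed_crash, processor):
                rule._action(raw_crash, processed_crash, processor)
                return True
        return False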

Do you have a Classifier that you'd like to see applied to crashes? There are two ways that you can get your idea implemented in the processor.

1. decide if your classifier is one for Support or some other categorization.
2. define, in plain English, what you want your predicate and action to be. For example:
• predicate: if the user put a comment in the crash submission AND they specified an email address AND the crash has the signature “EnterBaseline”
• action: add a Support Classification: “contact about EnterBaseline”
3. Enter a bug in Bugzilla with the topic New Classifier with your classification category as well as the predicate and action.  Make sure that you CC :lars so I can vet your work.
4. pending approval, your classifier will be implemented by someone on the Socorro team and pushed to production with the next release.
How do I search in the UI for crashes with a certain classification?
At the moment, the UI for Socorro does not support searching for classifiers. See Bug 947723 for the current status of adding classifications to the UI.

Want to try your hand at writing your own Support classifier?

Classifier functions are implemented as methods _predicate and _action in a class derived from the base class SupportClassifierBase.

The predicate is a function that accepts references to the raw and processed crashes as well as a reference to a Socorro Processor object itself. It returns a boolean value. The purpose is to determine if the rule is eligible to be applied. For example, the rule could test to see if the crash is a specific product and version. If the test is true, the predicate returns true and execution passes to the action function. If the test returns False, then the action is skipped and we move on to the next rule.

The action is a function that also accepts a copy of the raw and processed crashes as well as a reference to the Socorro Processor object. The action is generally to just add the classification to the processed crash.

All together, a support classifier should look like this:

    from socorro.processor.support_classifiers import SupportClassificationRule

    class WriteToThisPersonSupportClassifier(SupportClassificationRule):

        def version(self):
            return '1.0'

        def _predicate(self, raw_crash, processed_crash, processor):
            # implement the predicate as a boolean expression to be
            # returned by this method
            return (
                raw_crash.UserComment is not None
                and raw_crash.EmailAddress is not None
                and processed_crash.signature == 'EnterBaseline'
            )

        def _action(self, raw_crash, processed_crash, processor):
            self._add_classification(
                # add the support classification to the processed_crash
                processed_crash,
                # choose the value of your classification on the next line
                'contact about EnterBaseline',
                # any extra data about the classification (if any)
                None,
                # the place to log that a classification has been assigned
                processor.config.logger,
            )

What's inside the raw_crash and processed_crash that my classifier can access?

The raw_crash and the processed_crash are represented in persistent storage in the form of a JSON-compatible mapping.  When passed to a classifier, they are in the form of a DotDict, a DOM-like structure accessed with '.' notation.  There are no "out of bounds" fields within the crash.  The classifier code runs in a privileged environment, so all classifiers must be fully vetted before they can be put into production.

    raw_crash.UserComment
    raw_crash.ProductName
    raw_crash.BuildID
    processed_crash.json_dump.system_info.OS
    processed_crash.json_dump.crash_info.crash_address
    processed_crash.json_dump.threads[thread_number][frame_number].function
    processed_crash.upload_file_minidump_flash1.json_dump.modules[1].filename

Here's the form of a raw_crash.  This is what Socorro receives from crashing instances of Firefox.

{   "AdapterDeviceID" : "0x104a",   "AdapterVendorID" : "0x10de",   "Add-ons" : "%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D:25.0.1",   "BuildID" : "20131112160018",   "CrashTime" : "1386534180",   "EMCheckCompatibility" : "true",   "EmailAddress": "...@..."   "FlashProcessDump" : "Sandbox",   "id" : "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}",   "InstallTime" : "1384605839",   "legacy_processing" : 0,   "Notes" : "AdapterVendorID: ... ",   "PluginContentURL" : "http://www.j...",   "PluginFilename" : "NPSWF32_11_7_700_169.dll",   "PluginName" : "Shockwave Flash",   "PluginVersion" : "11.7.700.169",   "ProcessType" : "plugin",   "ProductID" : "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}",   "ProductName" : "Firefox",   "ReleaseChannel" : "release",   "StartupTime" : "1386531556",   "submitted_timestamp" : "2013-12-08T20:23:08.450870+00:00",   "Theme" : "classic/1.0",   "throttle_rate" : 10,   "timestamp" : 1386534188.45089,   "Vendor" : "Mozilla",   "Version" : "25.0.1",   "URL" : "http://...",   "UserComment" : "Horrors!",   "Winsock_LSP" : "MSAFD Tcpip ... "}

Here's the form of the processed_crash:

{   "additional_minidumps" : [      "upload_file_minidump_flash2",      "upload_file_minidump_browser",      "upload_file_minidump_flash1"   ],   "addons" : [      [         "testpilot@labs.mozilla.com",         "1.2.3"      ],      ...  more addons ...   ],   "addons_checked" : true,   "address" : "0x77a1015d",   "app_notes" : "AdapterVendorID: 0x8086, ...",   "build" : "20131202182626",   "classifications" : {      "support" : {         "classification_version" : "1.0",         "classification_data" : null,         "classification" : "contact about EnterBaseline"      },      "skunk_works" : {         "classification_version" : "0.1",         "classification_data" : null,         "classification" : "not classified"      },      ... additional classifiers ...   },   "client_crash_date" : "2013-12-05 23:59:13.000000",   "completeddatetime" : "2013-12-05 23:59:56.119158",   "cpu_info" : "GenuineIntel family 6 model 42 stepping 7 | 4",   "cpu_name" : "x86",   "crashedThread" : 0,   "crash_time" : 1386287953,   "date_processed" : "2013-12-05 23:59:38.160492",   "distributor" : null,   "distributor_version" : null,   "dump" : "OS|Windows NT|6.1.7601 Service Pack 1\n... PIPE DUMP ...,   "email" : null,   "exploitability" : "none",   "flash_version" : "11.9.900.152",   "hangid" : "fake-e167ea3d-8732-4bae-a403-352e32131205",   "hang_type" : -1,   "install_age" : 680,   "json_dump" : {      "system_info" : {         "os_ver" : "6.1.7601 Service Pack 1",         "cpu_count" : 4,         "cpu_info" : "GenuineIntel family 6 model 42 stepping 7",         "cpu_arch" : "x86",         "os" : "Windows NT"      },      "crashing_thread" : {         "threads_index" : 0,         "total_frames" : 55,         "frames" : [            {               "function_offset" : "0x15",               "function" : "NtWaitForMultipleObjects",               "trust" : "context",               "frame" : 0,               "offset" : "0x77a1015d",               "normalized" : "NtWaitForMultipleObjects",               "module" : "ntdll.dll",               "module_offset" : "0x2015d"            },            ... more frames ...         ]      },      "thread_count" : 10,      "status" : "OK",      "threads" : [         {            "frame_count" : 55,            "frames" : [               {                  "function_offset" : "0x15",                  "function" : "NtWaitForMultipleObjects",                  "trust" : "context",                  "frame" : 0,                  "module" : "ntdll.dll",                  "offset" : "0x77a1015d",                  "module_offset" : "0x2015d"               },               ...  more frames ...            ]         },         ...  more threads ...      ],      "modules" : [         {            "end_addr" : "0x12e6000",            "filename" : "plugin-container.exe",            "version" : "26.0.0.5084",            "debug_id" : "8385BD80FD534F6E80CF65811735A7472",            "debug_file" : "plugin-container.pdb",            "base_addr" : "0x12e0000"         },         ... more modules ...      
],      "sensitive" : {         "exploitability" : "none"      },      "crash_info" : {         "crashing_thread" : 0,         "address" : "0x77a1015d",         "type" : "EXCEPTION_BREAKPOINT"      },      "main_module" : 0   },   "java_stack_trace" : null,   "last_crash" : null,   "os_name" : "Windows NT",   "os_version" : "6.1.7601 Service Pack 1",   "PluginFilename" : "NPSWF32_11_9_900_152.dll",   "pluginFilename" : null,   "pluginName" : null,   "PluginName" : "Shockwave Flash",   "PluginVersion" : "11.9.900.152",   "processor_notes" : "processor03_mozilla_com.89:2012; HybridCrashProcessor",   "process_type" : "plugin",   "product" : "Firefox",   "productid" : "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}",   "reason" : "EXCEPTION_BREAKPOINT",   "release_channel" : "beta",   "ReleaseChannel" : "beta",   "signature" : "hang | F_1152915508___________________________________",   "startedDateTime" : "2013-12-05 23:59:49.142604",   "success" : true,   "topmost_filenames" : "F_388496358________________________________________",   "truncated" : false,   "uptime" : 668,   "url" : "https://...",   "user_comments" : null,   "user_id" : "",   "uuid" : "e167ea3d-8732-4bae-a403-352e32131205",   "version" : "26.0",   "Winsock_LSP" : "...",      ... the next entries are additional crash dumps included in this crash ...      "upload_file_minidump_browser" : {      "address" : null,      "cpu_info" : "GenuineIntel family 6 model 42 stepping 7 | 4",      "cpu_name" : "x86",      "crashedThread" : null,      "dump" : "... same form as "dump" key above ... ",      "exploitability" : "ERROR: something went wrong"      "flash_version" : "11.9.900.152",      "json_dump" : { ... same form as "json_dump" key above ... }      "os_name" : "Windows NT",      "os_version" : "6.1.7601 Service Pack 1",      "reason" : "No crash",      "success" : true,      "truncated" : false,      "topmost_filenames" : [         ... same form as "topmost_filenames" above ...      ],      "signature" : "hang | whatever",   },      "upload_file_minidump_flash1" : {      ... same form as "upload_file_minidump_browser" above ...   },      "upload_file_minidump_flash2" : {      ... same form as "upload_file_minidump_browser" above ...   },}

### Pascal Finette — On Stage With A Legend

A few weeks ago I had the incredible fortune to be on stage with Mike Watt, punk rock legend, founding member of bands such as Minutemen, Dos and fIREHOSE, bassist for the Stooges (Iggy Pop's band). You get the idea – the man is a true legend.

My dear friends at CASH Music, a non-profit that builds tools allowing musicians to represent themselves on the web, organized one of their amazing Summits; this one was in LA, and it had me on stage with Mike talking about punk rock, open source and everything in-between. It was epic. Here’s the video:

### Christian Heilmann — An open talk proposal – Accidental Arrchivism

People who have seen me speak at one of the dozens of conferences I covered in the last year know that I am passionate about presenting and that I love covering topics from a different angle instead of doing a sales pitch or going through the motions of delivering a packaged talk over and over again.

For a few months now I have been pondering a quite different talk than the topics I normally cover – the open web, JavaScript and development – and I’d love to pitch this talk to the unknown here to find a conference it might fit. If you are organising a conference around digital distribution, tech journalism or publishing, I’d love to come around to deliver it. Probably a perfect setting would be a TEDx or Wired-like event. Without further ado, here is the pitch:

### Accidental arrchivism

The subject of media and software piracy is covered in mainstream media with a lot of talk about greedy, unpleasant people who use their knowledge to steal information and make money with it: the image of the arrogant computer nerd, as perfectly displayed in Jurassic Park. There is also no shortage of poster children who fit this bill, and it is easy to bring up numbers that show how piracy is hurting a whole industry.

This kind of piracy, however, is just the tip of the iceberg when it comes to the whole subject matter. If you dig deeper you will find a complex structure of hierarchies, rules, quality control mechanisms and distribution formats in the piracy scene. These are in many cases superior to those of legal distributors and much more technologically and socially advanced.

In this talk Chris Heilmann will show the results of his research into the matter and present a more faceted view of piracy, one that publishers and distributors could learn from. He will also show positive, if accidental, results of piracy and explain which needs, as yet unfilled by legal release channels, the pirates cover and owe their success to; not all of them are about things becoming “free”. You cannot kill piracy by making it illegal and applying scare tactics: its decentralised structure and its very nature of already being illegal make that impossible. A lot of piracy happens based on convenience of access. If legal channels embraced and understood some of the ways pirates work and the history of piracy, and offered a similar service, a lot of it would be rendered unnecessary.

If you are a conference organiser who’d be interested, my normal presentation rules apply:

• I want this to be a keynote, or closing keynote, not a talk in a side track in front of 20 people
• I want a good quality recording to be published after the event. So far I have been most impressed with what Baconconf delivered on that front with the recording of my “Helping or Hurting” presentation.
• I’d like to get my travel expenses back. If your event is in London, Stockholm or the valley, this could be zero as I keep staying in these places

If you are a fan of what I do right now and you’d be interested in seeing this talk, spread this pitch far and wide and give it to conference organisers. Thanks.

## December 07, 2013

### David Humphrey — An Hour of Code spawns hours of coding

One of the topics my daughters and two of their friends asked me to do this year in our home school is programming.  They call it Code Club, and we have been learning HTML, JavaScript, CSS together.  We’ve been using Thimble to make things like this visualization of Castles around the world.  It’s been a lot of fun.

This past week I introduced them to the Processing programming language using this great new interactive tutorial.  It was made for Computer Science Education Week and the Hour of Code, in which many organizations (including Mozilla Webmaker, who is also participating) have put together tutorials and learning materials for new programmers.

One of the things I love about the Processing Hour of Code tutorial is that it was made using two projects I worked on for Mozilla, namely Processing.js and Popcorn.js, and does something I always wanted to see people do with them: build in-browser, interactive, rich media, web programming curriculum.  Everything you need to learn and code is all in one page, with nothing to install.

I decided to use the Processing tutorial for Code Club this past week, and let the girls try it out.  I was a bit worried it would be too hard for them, but they loved it, and were able to have a great time with it, and understand how things worked.  Here’s the owl two of the girls made:

The other girls made a one-eyed Minion from Despicable Me.  As they were preparing to show one another their creations, disaster struck, and the code for the Minion was lost.  Some tears were shed, and we agreed to work on making it again.

Today we decided to see if we could fix the issue that caused us to lose our work in the first place.  The Processing Hour of Code was designed to inspire new programmers to try their hand at programming, and what better way than to write some real code that will help other people and improve the tutorial?

What follows is a log I wrote as the girls worked on the problem.  Without giving them all the answers, I gave them tips, explained things they didn’t understand, and let them have an organic experience of debugging and fixing a real bug.  All three of us loved it, and these are the steps my daughters took along the way:

1) Go to http://hello.processing.org, click on “Click Here To Begin” and go to the Introduction.

2) Try to make the bug happen again.  We tried many things to see if we could make the bug happen again, and lose our work.  After experimenting for a while, we discovered that you could make the bug happen by doing the following:

• Go to the “Color” section
• Click “Jump to Exercise” on the video.
• Change the code in the code editor.
• Click on the blue outside the code editor
• Press the Delete/Backspace key
• The web page goes back to the beginning, and now we’ve lost our code

3) Find a way to stop this from happening.  We went to Google and tried to research some solutions.  Here are some of the searches we tried, and what we found:

• “How do you make a web page not go back?”
• “How to keep the delete key from triggering the back button?”
• “How to stop a web page from losing my work hitting backspace”
• “what is the code to stop a web page from losing my work hitting backspace”

Most of these searches gave us the wrong info.  Either we found ways to stop a web browser from doing Back when you press Delete, or else we found complicated code for ignoring the Delete key.  We tried another search:

• “stop the page from closing”

This brought us to a question on a site called Stack Overflow, with an answer that looked interesting.  It talked about using some code called window.onbeforeunload.  We had never heard of this, so we did another search for it:

• “window.onbeforeunload”

Many pages came back with details on how to use it, and one of them was Mozilla’s page, which we read.  In it we found a short example of how to use it, which we copied into our clipboard:

    window.onbeforeunload = function(e) {
        return 'Dialog text here.';
    };

We tried pasting this into the Processing code editor, but that didn’t work.  Instead we needed to change the code for the hello.processing.org web page itself.  Our dad showed us one way to do it.

4) We used the Developer Tools to modify the code for the hello.processing.org web page by pasting the code from Mozilla’s example into the Web Console (Tools > Web Developer > Web Console).

Now we tried again to trigger our bug, and it asked us if we wanted to leave or stay on the page.  We fixed it!!!

We opened another tab and loaded hello.processing.org again, and we see that this version still has the old bug.  We now need to make this fix for all versions.

5) We want to fix the actual code for the site.  “Dad, how do we fix it?  Doesn’t he have to fix it? How can we fix it for the whole world from here?”  Great questions!  First we have to find the code so we can make our change.  We look at the site’s About page, and see the names of the people who made this web page listed under Credits.  We do a search for their names and “Hello Processing”:

• “daniel shiffman scott garner scott murray hello processing”

In the results, we find a site where they were talking about the project, and Dan Shiffman was announcing it.  In his announcement he says, “We would love as many folks as possible to test out the tutorial and find bugs.” He goes on to say: “You can file bug reports here: https://github.com/scottgarner/Processing-Hour-Of-Code/issues?state=open”  We are glad to read that he wants people to test it and tell him if they hit bugs.  Now we know where to tell them about our bug.

6) At https://github.com/scottgarner/Processing-Hour-Of-Code we find the list of open issues (there were 5), and none of them mentioned our problem–maybe they don’t know about it.  We also see all of the code for their site, and there is a lot of it.

7) We create a new account on Github so that we can tell them about the issue, and also fix their code.  We then fork their project, and get our own version at https://github.com/threeamigos/Processing-Hour-Of-Code

8) Now we have to figure out where to put our code.  It’s overwhelming just looking at it.  Our dad suggests that we put the code in the page that caused our error, which was http://hello.processing.org/editor/index.html.  On github, we see that editor/index.html is an HTML file https://github.com/threeamigos/Processing-Hour-Of-Code/blob/gh-pages/editor/index.html.

9) Next we have to find the right place in this HTML file to paste our code.  Our dad tells us we need to find a <script>...</script> block, and we quickly locate a bunch of them.  We don’t understand how all of them work, but notice there is one at the bottom of the file, and decide to put our code there.

10) We clicked “Edit” on the file in the Github web page, and made the following change:

https://github.com/threeamigos/Processing-Hour-Of-Code/commit/1cf2be198f7a1db12c35e45b8bc1e0edd73f8e6c#diff-d05c1452bfe9809c27d44ee7c8df31d1

11) Finally, we made a Pull Request from our code to the original.  We told them about the bug, and how it happens, and also that we’d fixed it, and that they could use our code.  We’re excited to see their reply, and we hope they will use our code, and that it will help other new programmers too.

### Arky — Mozilla Taiwan Localization Sprint

Last week I traveled to Taipei for a localization sprint with the Mozilla Taiwan community. The community translates various Mozilla projects into Chinese (Traditional) (zh-TW). The goal of a localization sprint is to bring together new and experienced translators under one roof. Such events help promote knowledge sharing through peer learning and mentorship. Special thanks to Michael Hung, Estela Liu and Natasha Ma for making the Mozilla space available and for providing pizzas.

The event began with a short introduction to localization by Peter Chen, followed by a brief overview of various translation projects such as translating Mozilla Support (SUMO) by Ernest Chiang, translating Webmaker by Peter Chen, translating addons.mozilla.org by Toby, translating Mozilla Developer Network(MDN) articles by Carl, translating Mozilla videos with Amara tool by Irvin and translating Mozilla Links by Chung-Hui Fang. The speakers then organized participants into topic specific working groups, based on each individual's interest.

It was interesting to see how people used various tools such as Narro, Pootle, Transifex and even Google Docs for translation. It gave me an opportunity to observe and note some of the potential problems in the translation process. At the end of the day, everyone gathered to share and present their group's work. The speakers also took time to answer questions that participants had. All in all it was a very productive and enjoyable event. Mozilla badges were issued to recognize the participants' contributions.

Check out the event photos and etherpad for additional details. The Mozilla Taiwan community will continue to translate during their weekly MozTwLab meetups, and a follow-up sprint is planned for 2014.

## December 06, 2013

### Jason Orendorff — How EgotisticalGiraffe was fixed

In October, Bruce Schneier reported that the NSA had discovered a way to attack Tor, a system for online anonymity.

The NSA did this not by attacking the Tor system or its encryption, but by attacking the Firefox web browser bundled with Tor. The particular vulnerability, code-named “EgotisticalGiraffe”, was fixed in Firefox 17, but the Tor browser bundle at the time included an older version, Firefox 10, which was vulnerable.

I’m writing about this because I’m a member of Mozilla’s JavaScript team and one of the people responsible for fixing the bug.

I still don’t know exactly what vulnerability EgotisticalGiraffe refers to. According to Mr. Schneier’s article, it was a bug in a feature called E4X. The security hole went away when we disabled E4X in Firefox 17.

You can read a little about this in Mozilla’s bug-tracking database. E4X was disabled in bugs 753542, 752632, 765890, and 778851, and finally removed entirely in bugs 833208 and 788293. Nicholas Nethercote and Ted Shroyer contributed patches. Johnny Stenback, Benjamin Smedberg, Jim Blandy, David Mandelin, and Jeff Walden helped with code reviews and encouragement. As with any team effort, many more people helped indirectly.

Thank you.

Now I will write as an American. I don’t speak for Mozilla on this or any topic. The views expressed here are my own and I’ll keep my political opinions out of it.

The NSA has twin missions: to gather signals intelligence and to defend American information systems.

From the outside, it appears the two functions aren’t balanced very well. This could be a problem, because there’s a conflict of interest. The signals intelligence folks are motivated to weaponize vulnerabilities in Internet systems. The defense folks, and frankly everyone else, would like to see those vulnerabilities fixed instead.

It seems to me that fixing them is better for national security.

In the particular case of this E4X vulnerability, mainly only Tor users were vulnerable. But it has also been reported that the NSA has bought security vulnerabilities “from private malware vendors”.

### Bitcoin Mining

While I’ve had friends doing it for a while, on a lark I picked up a couple of ASIC-based bitcoin miners as dedicated hardware a while ago. I managed to get them, along with buying a few bitcoins directly, before the massive recent increase in prices. I look on it as an experiment and one that I don’t take very seriously. “Never gamble money you can’t afford to lose” is a good motto. If I lost everything that I put in, I would call it a lesson learned but so far I’m actually looking to break even on the cost of the gear in about two months (including the costs of power). My main complaint with it so far is that the miners are in my home office because they need decent network connectivity and I also work in there. It is kind of like working next to a pair of hairdryers that you never turn off (on a plus note, I’m not cold). I’ve had to find various places for doing my videoconferencing as the noise can be a bit burdensome. I’m quite interested in where bitcoin may wind up going but I really don’t have any expectations.

### Tabletop Role-Playing Games

The short explanation is always “You remember Dungeons and Dragons? Well, it is like that except we don’t play D&D.” Right now there is a renaissance of independent role-playing games going on (for most of a decade now, but really kicked up further by things like Kickstarter). I was in an RPG group that met once a month, then twice a month, and now we have a weekly pickup game with people who feel like playing, along with two Sundays a month of regular sessions. The weekly games have been used as an opportunity for us to play one or two-off games that are either interesting concept pieces or just intriguing, without any kind of commitment to regular play. I and two of the cohort have plotted out an extended scenario/game, using the very simple Lady Blackbird rules as a basis, involving the shift from Pulp Era Heroes (think “The Shadow”) to Golden Age Superheroes (think “Superman”). We’re going to do some design work on this and playtest it with our group before releasing it under some kind of Creative Commons license.

Beyond all of this, I’m on the board once again of my local hackerspace, the aforementioned Ace Monster Toys, and it continues to thrive. I may also be going to Japan for a week or two in March on matters Buddhist related but nothing has been set in stone as of yet. My work is still focused on being a program manager for security over at Mozilla (though I largely focus on Firefox efforts).

### Austin King — Book Review – Feedback Control for Computers

If you write code for fun or for a livelihood, I recommend you check out my friend Philipp Janert’s newest book Feedback Control for Computer Systems.

Feedback Control is a topic well known to mechanical engineers, but not so much in our industry. Feedback Control is about making smarter systems that can cope with dynamic environments. Many knobs that we build into configuration can actually be automated with feedback loops.

Examples given early in the book:

• A Cache by tracking hit rate and changing the cache size
• A Server Farm by tracking request latency and changing number of deployed server nodes
• A Queueing System by tracking wait time and changing the number of workers
• A Graphics Library by tracking memory consumption and changing the output resolution

The book is well written. It starts out with practical examples and working code. It later introduces the deep theory and drops some math bombs. Don’t worry, there is Python code for everything and you don’t have to understand the math.
It gives solid advice, like don’t blindly use Feedback Control for optimization; optimization needs a higher level strategy guiding the process.

Lastly, there are references for further reading, if you do want to work through more of the theory.

It also sets realistic expectations. You’ll control one metric by changing one variable. This is no silver bullet.
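
To make the one-metric, one-variable idea concrete, here is a toy proportional-integral controller in the spirit of the book's cache example; the gains and the simulated hit-rate "sensor" below are invented for illustration, not taken from the book:

    import random

    def measure_hit_rate(size, capacity=1000.0):
        # Stand-in for a real measurement: hit rate grows with cache
        # size, with a little noise. Purely a simulation for this demo.
        return min(1.0, size / capacity) * 0.95 + random.uniform(-0.02, 0.02)

    kp, ki = 60.0, 15.0      # invented gains; a real loop needs tuning
    target = 0.85            # the one metric we track: hit rate
    cache_size, integral = 100, 0.0

    for step in range(50):
        error = target - measure_hit_rate(cache_size)
        integral += error
        # the one variable we change: cache size
        cache_size = max(1, int(cache_size + kp * error + ki * integral))
        print(step, cache_size)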

The term Enterprise is thrown about; don’t let this scare you away. This is a valuable book for many types of software problems. A couple of ideas I’ve brainstormed:

• Controlling difficulty of a video game, to react to how skilled a player is
• Controlling aspects of an animation
• Controlling polling of APIs for fresh data
• Driving load testing to find different scaling points (errors, high latency, etc)

I haven’t had much time to put these ideas into practice yet… so please don’t throw too many tomatoes at these wacky ideas.

Update:

• There is also a Blog series on the topic
• Let’s port the Python examples to JavaScript including a JS port of Gnuplot

### Tantek Çelik — Toward A People Focused Mobile Communication Experience

Focus-enhancing note: all in-line hyperlink citations[number] are listed in the References footer of this post. You may procrastinate clicking them until the end for a more in-flow reading experience.

#### Smart and dumber

Remember when phones were dumb and people were smart?

While smart phones become smarter, their push notification interruptions fuel a mobile dopamine[1] addiction[2] that's making us dumber[3].

#### App focus and notification distractions

These devices and their app store interfaces have also trained us to install, organize, and curate ever more mobile "apps" on our home screens. Thanks to designers' obsession over attention, retention, returning eye-balls, and the need to compete with all those other apps, those apps ever more aggressively demand our attention.

Their push notifications insist that they're more important than anything else we could possibly be doing. We miss things right in front of us, or we overreact, overmute, and miss important things. Not things. People.

Virtual notifications distract us from real people.

This is a broader systemic design problem beyond smart phones: Hospitals look to reduce danger of 'alarm fatigue'[4].

Take a moment to recover your focus after skimming or bookmarking those links.

#### App centric interfaces cause dopamine fueled distraction

Right now we have screenfuls of apps to communicate and interact with people online. Screenfuls like:

The problems with this current state of the mobile interface:

1. Person A wants to communicate with person B
2. Person A has to pick (absent B's context!) a communication app
3. Person A launches the specific app
4. The app immediately shows Person A:
• Missed calls from others!
• Unread count of new messages!
• Actual new messages from others!

Every one of those exclamation points (!) is a dopamine squirt (you probably got a little one even just reading about it happening).

Consequence: Person A is distracted by the missed calls, unread counts, new messages beckoning their attention - ooh someone reached out to me! I better check if that's important before I communicate with, who was it I was going to communicate with?

Worse yet: the dopamine reinforces this distraction based behavior, turning it into a habit of reacting and losing our context, rather than acting and keeping our flow.

#### What if we focused on people first?

What if our mobile devices focused on people first and apps second?

Remember when they used to? When you looked up a person first, and decided to txt, call, or email them second?

What if we put people first in our mobile interfaces, before choosing an app to interact?

Could we grow a culture of adding icons of each other to our home screens instead of application intermediaries?

What if we organized screenfuls of icons of people that matter to us rather than apps that distract us?

If we could organize screenfuls of icons of people, what would it look like?

An interface with a bunch of faces[5] certainly feels a lot more human.

#### How do we organize screenfuls of icons of people?

The above is an actual screenshot. The answer, today, is to go to their personal sites, tap the action icon (or choose add bookmark), and then "Add to Home Screen".

Yes, this is why you should make sure your personal site has an icon of you that people can add to their home screen.

#### Why would someone want an icon of you on their home screen?

In short, human-focused rather than app-focused communication.

• You want to catch up on someone's site (recent writings, activities) just before meeting up with them in person.
• You miss someone and are wondering what they're up to.
• Communicating with a person - person first, method second:
• if they happen to have sms: mailto: tel: etc. links on their home page, then their home page becomes the way you can contact them.
• Your home page becomes your communication protocol.
• callee-preferred comm apps icon UI
• What if you provided icons for each of those yourself as if they were apps, e.g. in a pane on your home page?
• Like a Contact folder that when tapped would open up a row of icons of the ways you could be contacted, maybe even in your order of preference!

Would it be too disruptive to the mobile experience and ecosystem to focus on people rather than apps?

#### User experience flow

How would a person use this?

• Go to someone's domain, e.g. tap their icon from home screen
• See their personal home page, with methods of contact shown as a list or icons in the order that they prefer to be contacted.
• Go across and down that list until you see something you can (and want to) use to communicate, and tap/click it.
• The browser takes you to a website or "native" app to open the communication channel / new message.

Thus after tapping the person you want to communicate with, just one more tap to open a new IM, email, or audio/video call.

Note that there was no distraction by unread IM/email or new activity counts beckoning your attention away from your desire to communicate with a specific person.

#### UX flow with identification

By identifying yourself to the personal site, the site can provide progressive enhancement of communication options:

• Go to someone's domain, e.g. tap their icon from home screen
• Identify yourself to the site (e.g. with IndieAuth, or perhaps your device/browser automatically detects IndieAuth and identifies you if you've been to the site before)
• Now their personal site provides more (or possibly fewer!) communication options based on who you are.
• Again pick the first method of communication you see that you want to use
• You're again routed to either a website or "native" app to start communicating.

Thus after going to someone's personal site, with one tap you can perhaps SMS, Facetime, or Skype as well.

#### Context Enabled Presence

Someone's personal site could even do presence detection (some personal sites already show live IM presence status), and show/hide communication options in accordance with their apparent availability. E.g. some combination based on determining if they are:

• Asleep?
• In an area with poor network reception?
• In a meeting (or noisy location)?
• Or otherwise pre-occupied?
• Running or otherwise in motion
• Have IM/Skype client open (for more than 10 minutes)

Then their site could enable/disable various things by either hiding or disabling (dimming or greyscaling) the respective icons for:

• realtime interactive audio/video (AKA "phone" calls)
• IM busy/idle/away/active status

User-friendly privacy: such context-based selection should be seamless enough and yet coarse enough that you cannot necessarily determine from the (un)availability of various methods of communication, what their actual context (asleep, busy, in motion etc.) is.
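
Server-side, that coarse mapping could be as simple as the sketch below; every signal name here is hypothetical, and the point is that several different contexts collapse onto the same small set of visible options:

    def visible_contact_methods(context):
        # `context` holds whatever presence signals the site trusts
        # (calendar, motion, IM idle time); all keys are hypothetical.
        always = ['mailto', 'sms']
        if context.get('asleep') or context.get('in_meeting'):
            return always                    # no realtime interruptions
        if context.get('poor_network'):
            return always + ['im']           # skip audio/video calls
        return always + ['im', 'tel', 'facetime', 'skype']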

#### Solving the "Can we talk" problem

Perhaps this is the solution to the "Can we talk?"[7] problem.

Nevermind all this "what should I ..."

Domains (or proxies thereof) work as identity.

Just share domain names when you meet, add their icon to your home screen and you're done.

Or even share Twitter handles (purely as a quicker-easier-to-say discovery mechanism for domain names), add their icon and you're done.

The rest is automatically handled when you tap their icon.

#### How do you make this work on your site?

How do you make this work for when someone taps an icon to your site?

By adding this to your personal site:

• aim:, mailto:, etc. hyperlinks (add rel=me to them)
• platform familiar icons and grid layout (combining elements of adaptive and responsive design)
• IndieAuth support - to allow visitors to identify themselves
• Conditionally show more (or fewer) hyperlinks based on whitelists, i.e. check their identity against a whitelist or two and then provide e.g. sms:, facetime:, skype: (callto:?) links.

Optionally have your site passively (or in response) check your meeting schedule, your Foursquare location, perhaps even native app specific presence (e.g. IM), and cache/show/hide links accordingly.

#### Who has done this?

Nobody so far - this is a user experience brainstorm.

#### Can we do this?

Yes. Some communication protocols are supported in today's mobile operating systems / browsers:

• iOS Mobile Safari[8][9]: facetime, mailto, skype, sms, tel
• Android[10]: tel
• Firefox OS Browser[11]: mailto, sms, tel

I couldn't easily find specific references for protocol support in Android Chrome and Firefox for Android browsers. My guess is that the various mobile browsers likely support more communication protocols than the above (and the reference documents) claim. It's probably worth some testing to expand the above lists.

Even maps.apple.com/?q= links are supported on iOS[8] (and "geo:" links on Android[10]) as a way to launch the native maps app; perhaps a person could, for some identified visitors, have a geo/map link that showed exactly (or roughly) where the person was, if and when they chose to.

There's a whole wiki page of URL protocols supported on iOS and iOS apps[12] and here's a blog post providing clickable examples of Special links: phone calls, sms, e-mails, iPhone and Android apps,...[13] (ht: Ryan Barrett for both). Both are quite useful, especially for instant messaging / telephony protocols. However keep in mind that it may be better to use mobile web app URLs where possible instead of app-specific protocols, e.g.:

Because the mobile web URL is more robust, platform/device independent, and never mind that the twitter: protocol[12] lacks a way to open messages (or a new message) to a specific person.

In addition, I feel I can better depend on DNS to go to twitter.com as intended, whereas it seems like it could be easier for a malevolent native app to hijack "twitter:" URLs by claiming to handle them.

#### What next?

Next, it's time to try prototyping this on our personal sites and mobile devices to see if we can make it work and how the interaction feels.

If this kind of out-of-the-app-box thinking and personal site focused hackery appeals to you, try it out yourself, and come on by tonight's Homebrew Website Club meeting and show what you got working!

#### Previously

Previous posts and notes related to focus (and distraction), and specifically to the human interface designs and processes that improve (or reduce) them, respectively.

#### References

In order of appearance:

1. Psychology Today: Why We're All Addicted to Texts, Twitter and Google
2. Computerworld: Nerd, interrupted: Inside a smartphone addiction treatment center
3. Mashable: How is Facebook Addiction Affecting Our Minds?
4. SFGate: Hospitals look to reduce danger of 'alarm fatigue'
5. IndieWebCamp: icon FAQ: Should you use a photo of your face
6. Event Homebrew Website Club Meeting
7. WIRED: Can We Talk?
8. Apple: Apple URL Scheme Reference
9. Max Firtman: How to create click-to-call links for mobile browsers
10. Google: Intents List: Invoking Google Applications on Android Devices
11. Mozilla: Bug 805282 - MailtoProtocolHandler.js, SmsProtocolHandler.js and TelProtocolHandler.js in package-manifest.in
12. Akosma Software: wiki page of URL protocols supported on iOS and iOS apps
13. Adrian Ber: Special links: phone calls, sms, e-mails, iPhone and Android apps, …

#### Comments

1. Robert O'Callahan: WebRTC And People-Oriented Communications

### Bill McCloskey — Multiprocess Firefox

Since this January, David Anderson and I have been working on making Firefox use multiple processes. Tom Schuster (evilpie on IRC) joined us as an intern this summer, and Felipe Gomes, Mark Hammond, and others have made great contributions. Now we’re interested to hear what people think of the work so far.

Firefox has always used a single process. When Chrome was released, it used one UI process and separate “content” processes to host web content. Other browsers have adopted similar strategies since then. (Correction: Apparently IE 8 used multiple processes 6 months before Chrome was released.) Around that time, Mozilla launched a far-reaching effort, called Electrolysis, to rewrite Firefox and the Gecko engine to use multiple processes. Much of this work bore fruit. Firefox OS relies heavily on the multiprocessing and IPC code introduced during Electrolysis. However, the main effort of porting the desktop Firefox browser to use multiple processes was put on hold in November 2011 so that shorter-term work to increase responsiveness could proceed more quickly. A good summary of that decision is in this thread.

This blog entry covers some of the technical issues involved in multiprocess Firefox. More information is available on the project wiki page, which includes links to tracking bugs and mailing lists. Tom Schuster’s internship talk gives a great overview of his work.

### Why are we doing this?

There are a couple reasons for using multiple processes.

Performance. Most performance work at Mozilla over the last two years has focused on responsiveness of the browser. The goal is to reduce “jank”—those times when the browser seems to briefly freeze when loading a big page, typing in a form, or scrolling. Responsiveness tends to matter a lot more than throughput on the web today. Much of this work has been done as part of the Snappy project. The main focuses have been:

• Moving long-running actions to a separate thread so that the main thread can continue to respond to the user.
• Doing I/O asynchronously or on other threads so that the main thread isn’t blocked waiting for the disk.
• Breaking long-running code into shorter pieces and running the event loop in between. Incremental garbage collection is an example of this.

Much of the low-hanging fruit in these areas has already been picked. The remaining issues are difficult to fix. For example, JavaScript execution and layout happen on the main thread, and they block the event loop. Running these components on a separate thread is difficult because they access data, like the DOM, that are not thread-safe. As an alternative, we’ve considered allowing the event loop to run in the middle of JavaScript execution, but doing so would break a lot of assumptions made by other parts of Firefox (not to mention add-ons).

Running web content in a separate process is a nice alternative to these approaches. Like the threaded approach, Firefox is able to run its event loop while JavaScript and layout are running in a content process. But unlike threading, the UI code has no access to content DOM or other content data structures, so there is no need for locking or thread-safety. The downside, of course, is that any code in the Firefox UI process that needs to access content data must do so explicitly through message passing.

We feel this tradeoff makes sense for a few reasons:

• It’s not all that common for Firefox code to access content DOM.
• Code that is shared with Firefox OS already uses message passing.
• In the multiprocess model, Firefox code that fails to use message passing to access content will fail in an obvious, consistent way. In the threaded model, code that accesses content without proper locking will fail in subtle ways that are difficult to debug.

The last point is, in my opinion, the most compelling. None of the approaches for improving responsiveness of JavaScript code are easy to implement. The multiprocess approach at least has the advantage that errors are reproducible.

Security. Right now, if someone discovers an exploitable bug in Firefox, they’re able to take over users’ computers. There are a lot of techniques to mitigate this problem, but one of the most powerful is sandboxing. Technically, sandboxing doesn’t require multiple processes. However, a sandbox that covered the current (single) Firefox process wouldn’t be very useful. Sandboxes are only able to prevent processes from performing actions that a well-behaved process would never do. Unfortunately, a well-behaved Firefox process (especially one with add-ons installed) needs access to much of the network and file system. Consequently, a sandbox for single-process Firefox couldn’t restrict much.

In multiprocess Firefox, content processes will be sandboxed. A well-behaved content process won’t access the filesystem directly; it will have to ask the main process to perform the request. At that time, the main process can verify that the request is safe and that it makes sense. Consequently, the sandbox for content processes can be quite restrictive. Our hope is that this arrangement will make it much harder to craft exploitable security holes for Firefox.

Stability. Although Firefox usually doesn’t crash very often, multiple processes will make crashes much less annoying. Rather than killing the entire browser, the crash will only take down the content process that crashed.

### Trying it out

We’ve been doing all of our development on mozilla-central, which means that all of our code is available in Firefox nightlies. You can try out multiprocess Firefox with the following steps:

• Update to a current nightly.
• We strongly recommend you create a new profile!
• Set the browser.tabs.remote preference to true.
• Restart Firefox.
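If you prefer a file-based approach, the same pref can also be set from a user.js file in the new profile. A minimal sketch; the pref name is the one given in the list above, and user.js is the standard per-profile prefs override file:

// user.js in the new profile: enable multiprocess mode on startup.
user_pref("browser.tabs.remote", true);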

Here’s what Firefox looks like in multiprocess mode. Notice that the title of the tab is underlined. This means that the content for the tab is being rendered in a separate process.

This screenshot shows what happens when a content process crashes. The main Firefox process remains alive, along with any tabs being rendered in other processes.

### What works?

The best way to find out is to try it! All basic browsing functionality should work at this time. In particular: back and forward navigation, the URL bar and search bar, the context menu, bookmarks, tabs, Flash (except on Macs), the findbar, and even Adblock Plus. Some less-used features are still missing: developer tools, printing, and saving pages to disk. Add-on support is pretty spotty—some add-ons work, but most don’t.

To be conservative, we’re using only one content process for all web content. To get all of the performance, security, and stability benefits of multiple processes, we’ll need to use multiple content processes.

### When will it be released?

We simply don’t know. It’s a large project and any predictions at this point would be foolhardy. There are a lot of concerns about compatibility with add-ons and plugins that will need to be settled before we have any sense of a release date.

### How much memory will it use?

Many people who hear about the project are concerned that multiple processes will cause the memory usage of Firefox to balloon. There’s a fairly widespread belief that more processes will require a lot more memory. I don’t believe this will be the case though.

The actual overhead of an empty process is very small. A normal desktop system can easily support thousands of processes. Even when all the code for Firefox is loaded into each process, this memory is shared, so not much more memory is used than in the single-process case.

Using multiple processes does take more memory if data that could be shared between tabs in a single process must be duplicated in a multiprocess browser. There are many different caches and other shared data structures in Firefox where this might be a problem. We have plans to mitigate these issues. For example, it should be possible for processes to store some of this data in read-only shared memory. A process would be able to observe that it is holding local data that is identical to some data already stored in the read-only cache. In that case, it would drop its own copy of the data and use the shared version instead. Periodically, processes could add new data to the cache and mark it read-only. Processes would want to avoid sharing data that might include sensitive information like credit card numbers or bank data. However, it would probably be safe to share image data, JavaScript scripts, and the like.

Admittedly, it will take some time to optimize the memory usage of multiprocess Firefox. We already have about:memory working for multiple processes, which should make the work go more quickly. Nevertheless, until we feel that multiprocess Firefox is competitive with the single-process version, we will probably use a single content process to render all tabs. That model brings most of the security benefits and some of the performance and stability advantages we’re seeking. It also takes only a small amount more memory than single-process Firefox. I’ve taken some simple measurements using about:memory and Gregor Wagner’s MemBench benchmark. The benchmark opens 50 tabs and then measures memory usage.

| Memory usage | Single-process | Multiprocess |
| --- | --- | --- |
| UI process | 974 MB | 124 MB |
| Content process | n/a | 860 MB |
| Total | 974 MB | 984 MB |

The total memory usage for multiprocess Firefox is only 10MB greater than single-process Firefox. We should be able to shrink this difference with some effort.

### How will it work?

At a very high level, multiprocess Firefox works as follows. The process that starts up when Firefox launches is called the parent process. Initially, this process works similarly to single-process Firefox: it opens a window displaying browser.xul, which contains all the principal UI elements for Firefox. Firefox has a flexible GUI toolkit called XUL that allows GUI elements to be declared and laid out declaratively, similar to web content. Just like web content, the Firefox UI has a window object, which has a document property, and this document contains all the XML elements from browser.xul. All the Firefox menus, toolbars, sidebars, and tabs are XML elements in this document. Each tab element contains a <browser> element to display web content.

The first place where multiprocess Firefox diverges from single-process Firefox is that each <browser> element has a remote="true" attribute. When such a browser element is added to the document, a new content process is started. This process is called a child process. An IPC channel is created that links the parent and child processes. Initially, the child displays about:blank, but the parent can send the child a command to navigate elsewhere.

In the remainder of this section, I will discuss the most important aspects of multiprocess Firefox.

Drawing. Somehow, displayed web content needs to get from the child process to the parent and then to the screen. Multiprocess Firefox depends on a new Firefox feature called off main thread compositing (OMTC). In brief, each Firefox window is broken into a series of layers, somewhat similar to layers in Photoshop. Each time Firefox draws, these layers are submitted to a compositor thread that clips and translates the layers and combines them together into a single image that is then drawn.

Layers are structured as a tree. The root layer of the tree is responsible for the entire Firefox window. This layer contains other layers, some of which are responsible for drawing the menus and tabs. One subtree displays all the web content. Web content itself may be broken into multiple layers, but they’re all rooted at a single “content” layer.

In multiprocess Firefox, the content layer subtree is actually a shim. Most of the time, it contains a placeholder node that simply keeps a reference to the IPC link with the child process. The content process retains the actual layer tree for web content. It builds and draws to this layer tree. When it’s done, it sends the structure of its layer tree to the parent process via IPC. Backing buffers are shared with the parent either through shared memory or GPU memory. References to this memory are sent as part of the layer tree. When the parent receives the layer tree, it removes its placeholder content node and replaces it with the actual tree from content. Then it composites and draws as normal. When it’s done, it puts the placeholder back.

The basic architecture of how OMTC works with multiple processes has existed for some time, since it is needed for Firefox OS. However, Matt Woodrow and David Anderson have done a lot of work to get everything working properly on Windows, Mac, and Linux. One of the big challenges for multiprocess Firefox will be getting OMTC enabled on all platforms. Right now, only Macs use it by default.

User input. Events in Firefox work the same way as they do on the web. Namely, there is a DOM tree for the entire window, and events are threaded through this tree in capture and bubbling phases. Imagine that the user clicks on a button on a web page. In single-process Firefox, the root DOM node of the Firefox window gets the first chance to process the event. Then, nodes lower down in the DOM tree get a chance. The event handling proceeds down through to the XUL <browser> element. At this point, nodes in the web page’s DOM tree are given a chance to handle the event, all the way down to the button. The bubble phase follows, running in the opposite order, all the way back up to the root node of the Firefox window.

With multiple processes, event handling works the same way until the <browser> element is hit. At that point, if the event hasn’t been handled yet, it gets sent to the child process by IPC, where handling starts at the root of the content DOM tree. There are still some problems with how this works. Ideally, the parent process would wait to run its bubbling phase until the content process had finished handling the event. Right now, though, the two happen simultaneously, which can cause problems if the content process called stopPropagation() on the event. Bug 862519 covers this problem. The usual manifestation is that hitting a key combination like Cmd-Left on a Mac will go back even if a textbox is in focus.

Inter-process communication. All IPC happens using the Chromium IPC libraries. Each child process has its own separate IPC link with the parent. Children cannot communicate directly with each other. To prevent deadlocks and to ensure responsiveness, the parent process is not allowed to sit around waiting for messages from the child. However, the child is allowed to block on messages from the parent.

Rather than directly sending packets of data over IPC as one might expect, we use code generation to make the process much nicer. The IPC protocol is defined in IPDL, which sort of stands for “inter-* protocol definition language”. A typical IPDL file is PNecko.ipdl. It defines a set of messages and their parameters. Parameters are serialized and included in the message. To send a message M, C++ code just needs to call the method SendM. To receive the message, it implements the method RecvM.

IPDL is used in all the low-level C++ parts of Gecko where IPC is required. In many cases, IPC is just used to forward actions from the child to the parent. This is a common pattern in Gecko:

void AddHistoryEntry(param) {
  if (XRE_GetProcessType() == GeckoProcessType_Content) {
    // If we're in the child, ask the parent to do this for us.
    SendAddHistoryEntry(param);
    return;
  }

  // Actually add the history entry...
}

bool RecvAddHistoryEntry(param) {
  // Got a message from the child. Do the work for it.
  AddHistoryEntry(param);
  return true;
}


When AddHistoryEntry is called in the child, we detect that we’re inside the child process and send an IPC message to the parent. When the parent receives that message, it calls AddHistoryEntry on its side.

For a more realistic illustration, consider the Places database, which stores visited URLs for populating the awesome bar. Whenever the user visits a URL in the content process, we call this code. Notice the content process check followed by the SendVisitURI call and an immediate return. The message is received here; this code just calls VisitURI in the parent.

The code for IndexedDB, the places database, and HTTP connections all runs in the parent process, and they all use roughly the same proxying mechanism in the child.

Content scripts. IPDL takes care of passing messages in C++, but much of Firefox is actually written in JavaScript. Instead of using IPDL directly, JavaScript code relies on the message manager to communicate between processes. To use the message manager in JS, you need to get hold of a message manager object. There is a global message manager, message managers for each Firefox window, and message managers for each <browser> element. A message manager can be used to load JS code into the child process and to exchange messages with it.

As a simple example, imagine that we want to be informed every time a load event triggers in web content. We’re not interested in any particular browser or window, so we use the global message manager. The basic process is as follows:

// Get the global message manager.
let mm = Cc["@mozilla.org/globalmessagemanager;1"]
           .getService(Ci.nsIMessageListenerManager);

// Wait for load event.
mm.addMessageListener("GotLoadEvent", function (msg) {
  dump("Received load event: " + msg.data.url + "\n");
});

// Load code into the child process to listen for the event.
mm.loadFrameScript("chrome://content/content-script.js", true);


For this to work, we also need to have a file content-script.js:

// Listen for the load event.
addEventListener("load", function (e) {
  // Inform the parent process.
  let docURL = content.document.documentURI;
  sendAsyncMessage("GotLoadEvent", {url: docURL});
}, false);


This file is called a content script. When the loadFrameScript function call runs, the code for the script is run once for each <browser> element. This includes both remote browsers and regular ones. If we had used a per-window message manager, the code would only be run for the browser elements in that window. Any time a new browser element is added, the script is run automatically (this is the purpose of the true parameter to loadFrameScript). Since the script is run once per browser, it can access the browser’s window object and docshell via the content and docShell globals.
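For contrast with the global message manager used above, here is a hedged sketch of the narrower per-browser scope mentioned in this section. It assumes browser refers to an existing <browser> element and that its message manager is exposed as browser.messageManager:

// Load the same content script into just one <browser>, and listen for
// its messages on that browser's message manager only.
let browserMM = browser.messageManager;
browserMM.loadFrameScript("chrome://content/content-script.js", true);
browserMM.addMessageListener("GotLoadEvent", function (msg) {
  dump("Load event from this one browser: " + msg.data.url + "\n");
});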

The great thing about content scripts is that they work in both single-process and multiprocess Firefox. So if you write your code using the message manager now, it will be forward-compatible with multiprocessing. Tim Taubert wrote a more comprehensive look at the message manager in his blog.

Cross-process APIs. There are a lot of APIs in Firefox that cross between the parent and child processes. An example is the webNavigation property of XUL <browser> elements. The webNavigation property is an object that provides methods like loadURI, goBack, and goForward. These methods are called in the parent process, but the actions need to happen in the child. First I’ll cover how these methods work in single-process Firefox, and then I’ll describe how we adapted them for multiple processes.

The webNavigation property is defined using the XML Binding Language (XBL). XBL is a declarative language for customizing how XML elements work. Its syntax is a combination of XML and JavaScript. Firefox uses XBL extensively to customize XUL elements like <browser> and <tabbrowser>. The <browser> customizations reside in browser.xml. Here is how browser.webNavigation is defined:

<field name="_webNavigation">null</field>

<property name="webNavigation" readonly="true">
  <getter>
    <![CDATA[
      if (!this._webNavigation)
        this._webNavigation = this.docShell.QueryInterface(Components.interfaces.nsIWebNavigation);
      return this._webNavigation;
    ]]>
  </getter>
</property>


This code is invoked whenever JavaScript code in Firefox accesses browser.webNavigation, where browser is some <browser> element. It checks if the result has already been cached in the browser._webNavigation field. If it hasn’t been cached, then it fetches the navigation object based off the browser’s docshell. The docshell is a Firefox-specific object that encapsulates a lot of functionality for loading new pages, navigating back and forth, and saving page history. In multiprocess Firefox, the docshell lives in the child process. Since the webNavigation accessor runs in the parent process, this.docShell above will just return null. As a consequence, this code will fail completely.

One way to fix this problem would be to create a fake docshell in C++ that could be returned. It would operate by sending IPDL messages to the real docshell in the child to get work done. We may eventually take this route. For now, we decided to do the message passing in JavaScript instead, since it’s easier and faster to prototype things there. Rather than change every docshell-using accessor to test if we’re using multiprocess browsing, we decided to create a new XBL binding that applies only to remote <browser> elements. It is called remote-browser.xml, and it extends the existing browser.xml binding.

The remote-browser.xml binding returns a JavaScript shim object whenever anyone uses browser.webNavigation or other similar objects. The shim object is implemented in its own JavaScript module. It uses the message manager to send messages like "WebNavigation:LoadURI" to a content script loaded by remote-browser.xml. The content script performs the actual action.
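The post links to the actual implementation; purely as an illustration of the shape of such a shim, here is a rough sketch. Apart from the WebNavigation:LoadURI message name mentioned above, all names are invented, and mm is assumed to be the remote browser's message manager:

// Parent-process sketch: a shim standing in for browser.webNavigation.
let WebNavigationShim = {
  loadURI: function (uri, flags) {
    // The real docshell lives in the child; forward the request there.
    mm.sendAsyncMessage("WebNavigation:LoadURI", { uri: uri, flags: flags });
  }
};

// Content-script sketch: perform the navigation on the real docshell.
addMessageListener("WebNavigation:LoadURI", function (msg) {
  let webNav = docShell.QueryInterface(Components.interfaces.nsIWebNavigation);
  webNav.loadURI(msg.data.uri, msg.data.flags, null, null, null);
});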

The shims we provide emulate their real counterparts imperfectly. They offer enough functionality to make Firefox work, but add-ons that use them may find them insufficient. I’ll discuss strategies for making add-ons work in more detail later.

Cross-process object wrappers. The message manager API does not allow the parent process to call sendSyncMessage; that is, the parent is not allowed to wait for a response from the child. It’s detrimental for the parent to wait on the child, since we don’t want the browser UI to be unresponsive because of slow content. However, converting Firefox code to be asynchronous (i.e., to use sendAsyncMessage instead) can sometimes be onerous. As an expedient, we’ve introduced a new primitive that allows code in the parent process to access objects in the child process synchronously.

These objects are called cross-process object wrappers—frequently abbreviated CPOWs. They’re created using the message manager. Consider this example content script:

addEventListener("load", function (e) {
  let doc = content.document;
  sendAsyncMessage("GotLoadEvent", {}, {document: doc});
}, false);


In this code, we want to be able to send a reference to the document to the parent process. We can’t use the second parameter to sendAsyncMessage to do this: that argument is converted to JSON before it is sent up. The optional third parameter allows us to send object references. Each property of this argument becomes accessible in the parent process as a CPOW. Here’s what the parent code might look like:

let mm = Cc["@mozilla.org/globalmessagemanager;1"]
           .getService(Ci.nsIMessageListenerManager);

mm.addMessageListener("GotLoadEvent", function (msg) {
  let uri = msg.objects.document.documentURI;
  dump("Received load event: " + uri + "\n");
});
mm.loadFrameScript("chrome://content/content-script.js", true);


It’s important to realize that we’re sending object references. The msg.objects.document object is only a wrapper. The access to its documentURI property sends a synchronous message down to the child asking for the value. The dump statement only happens after a reply has come back from the child.

Because every property access sends a message, CPOWs can be slow to use. There is no caching, so 1,000 accesses to the same property will send 1,000 messages.
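For example, if the same CPOW property is needed repeatedly, it is worth copying it into a local variable once. A small illustrative sketch; checkURI is a hypothetical local function:

mm.addMessageListener("GotLoadEvent", function (msg) {
  // Each CPOW property access would be one synchronous round trip,
  // so read documentURI once and reuse the local copy.
  let uri = msg.objects.document.documentURI;
  for (let i = 0; i < 1000; i++) {
    checkURI(uri); // hypothetical local function; no further IPC traffic
  }
});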

Another problem with CPOWs is that they violate some assumptions people might have about message ordering. Consider this code:

mm.addMessageListener("GotLoadEvent", function (msg) {
  mm.sendAsyncMessage("ChangeDocumentURI", {newURI: "hello.com"});
  let uri = msg.objects.document.documentURI;
  dump("Received load event: " + uri + "\n");
});


This code sends a message asking the child to change the current document URI. Then it accesses the current document URI via a CPOW. You might expect the value of uri to come back as "hello.com". But it might not. In order to avoid deadlocks, CPOW messages can bypass normal messages and be processed first. It’s possible that the request for the documentURI property will be processed before the "ChangeDocumentURI" message, in which case uri will have some other value.

For this reason, it’s best not to mix CPOWs with normal message manager messages. It’s also a bad idea to use CPOWs for anything security-related, since you may not get results that are consistent with surrounding code that might use the message manager.

Despite these problems, we’ve found CPOWs to be useful for converting certain parts of Firefox to be multiprocess-compatible. It’s best to use them in cases where users are less likely to notice poor responsiveness. As an example, we use CPOWs to implement the context menu that pops up when users right-click on content elements. Whether this code is asynchronous or synchronous, the menu cannot be displayed until content has responded with data about the element that has been clicked. The user is unlikely to notice if, for example, tab animations don’t run while waiting for the menu to pop up. Their only concern is for the menu to come up as quickly as possible, which is entirely gated on the response time of the content process. For this reason, we chose to use CPOWs, since they’re easier than converting the code to be asynchronous.

It’s possible that CPOWs will be phased out in the future. Asynchronous messaging using the message manager gives a user experience that is at least as good as, and often strictly better than, CPOWs. We strongly recommend that people use the message manager over CPOWs when possible. Nevertheless, CPOWs are sometimes useful.

### How will add-ons be affected?

The effect of multiple processes on add-ons is a huge topic. The most important thing I want to emphasize is that we intend to go to great lengths to ensure compatibility with add-ons. We realize that add-ons are extremely important to Firefox users, and we have no intention of abandoning or disrupting add-ons. At the same time, we feel strongly that users will appreciate the security and responsiveness benefits of multiprocess Firefox, so we’re willing to work very hard to get add-ons on board. We’re very interested in working with add-on developers to ensure that their add-ons work well in multiprocess Firefox. I hope to cover this topic in greater depth in future blog posts.

By default, add-on code runs in the parent process. Some add-ons have no need to access content objects, and so they will continue to function without any need for changes. For the remaining add-ons, we have a number of ideas and proposals to ensure compatibility.

The best way to ensure that add-ons are multiprocess-compatible is for them to use the message manager to touch content objects. This is what we’ve been doing for Firefox itself. It’s the most efficient option, and it gives the best user experience. Developers can use the message manager in their code right now; it was added in Firefox 4. We plan to convert the add-on SDK to use the message manager, which should allow most Jetpack add-ons to work with multiple processes without any additional changes. If you’re writing a new add-on, you would be doing Mozilla a great service if you use the message manager or the add-on SDK from the start.

Realistically, though, it will be a long time before all add-ons are converted to use the message manager. Until then, we have two strategies for coping with incompatible add-ons:

• Any time an add-on accesses a content object (like a DOM node), we can give it a CPOW instead. For the most part, CPOWs behave exactly like the objects they’re wrapping. The main difference is that only primitive values can be passed as arguments when calling CPOW methods. This problem usually surfaces when the add-on calls element.addEventListener("load", function (e) {...}). This won’t work if element is a CPOW because the second argument is a function, which is not a primitive. We have some ideas for mitigating this restriction (see the sketch after this list). Despite the argument problem, CPOWs make it possible for some add-ons to work with multiple processes. It’s possible to use Adblock Plus with multiprocess Firefox right now. As our CPOW techniques become better, we expect that more and more add-ons will be compatible.
• For add-ons that don’t use the message manager or the add-on SDK and that don’t work with CPOWs, we can fall back to single-process Firefox. There are a few difficulties with this approach (what happens to your existing tabs if you install an incompatible add-on?), but we think they’re tractable.
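To make the CPOW restriction from the first strategy concrete, here is a hedged sketch of the failure and a message-manager alternative; the frame script, message name, and chrome URL are invented for illustration:

// Fails if element is a CPOW: the listener function is not a primitive,
// so it cannot be passed as an argument to a CPOW method.
//   element.addEventListener("load", function (e) { /* ... */ });

// Alternative: listen inside the content process via a frame script and
// report back asynchronously over the message manager.
mm.loadFrameScript("chrome://myaddon/content/frame-script.js", true);
mm.addMessageListener("MyAddon:ContentLoaded", function (msg) {
  dump("Content finished loading: " + msg.data.url + "\n");
});

// frame-script.js (runs in the content process):
//   addEventListener("load", function (e) {
//     sendAsyncMessage("MyAddon:ContentLoaded",
//                      { url: content.document.documentURI });
//   }, true);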

In the future, I’m planning to update some popular add-ons to work with multiprocess Firefox. I’ll blog about difficulties I run into and possible solutions. I hope this material will be useful to people trying to use the message manager in their add-ons.

### More information

Tim Taubert and David Rajchenbach-Teller have written great blog posts about message passing and the new process model.

### Sean McArthur — www.persona.org

www.persona.org:

We launched a site that better describes Persona, with the goal of having decent landing pages for developers and users.

Shiny!

## December 04, 2013

### Joshua Cranmer — Why email is hard, part 4: Email addresses

This post is part 4 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. This post discusses the problems with email addresses.

You might be surprised that I find email addresses difficult enough to warrant a post discussing only this single topic. However, this is a surprisingly complex topic, and one which is made much harder by the presence of a very large number of people purporting to know the answer who then proceed to do the wrong thing [0]. To understand why email addresses are complicated, and why people do the wrong thing, I pose the following challenge: write a regular expression that matches all valid email addresses and only valid email addresses. Go ahead, stop reading, and play with it for a few minutes, and then you can compare your answer with the correct answer.

Done yet? So, if you came up with a regular expression, you got the wrong answer. But that's because it's a trick question: I never defined what I meant by a valid email address. Still, if you're hoping for partial credit, you may be able to get some by correctly matching one of the purported definitions I give below.

The most obvious definition meant by "valid email address" is text that matches the addr-spec production of RFC 822. No regular expression can match this definition, though—and I am aware of the enormous regular expression that is often purported to solve this problem. This is because comments can be nested, which means you would need to solve the "balanced parentheses" language, which is easily provable to be non-regular [2].

Matching the addr-spec production, though, is the wrong thing to do: the production dictates the possible syntax forms an address may have, when you arguably want a more semantic interpretation. As a case in point, the two email addresses example@test.invalid and example @ test . invalid are both meant to refer to the same thing. When you ignore the actual full grammar of an email address and instead read the prose, particularly of RFC 5322 instead of RFC 822, you'll realize that matching comments and whitespace are entirely the wrong thing to do in the email address.

Here, though, we run into another problem. Email addresses are split into local-parts and the domain, the text before and after the @ character; the format of the local-part is basically either a quoted string (to escape otherwise illegal characters in a local-part), or an unquoted "dot-atom" production. The quoting is meant to be semantically invisible: "example"@test.invalid is the same email address as example@test.invalid. Normally, I would say that the use of quoted strings is an artifact of the encoding form, but given the strong appetite for aggressively "correct" email validators that attempt to blindly match the specification, it seems to me that it is better to keep the local-parts quoted if they need to be quoted. The dot-atom production matches a sequence of atoms (spans of text excluding several special characters like [ or .) separated by . characters, with no intervening spaces or comments allowed anywhere.

RFC 5322 only specifies how to unfold the syntax into a semantic value, and it does not explain how to semantically interpret the values of an email address. For that, we must turn to SMTP's definition in RFC 5321, whose semantic definition clearly imparts requirements on the format of an email address not found in RFC 5322. On domains, RFC 5321 explains that the domain is either a standard domain name [3], or it is a domain literal which is either an IPv4 or an IPv6 address. Examples of the latter two forms are test@[127.0.0.1] and test@[IPv6:::1]. But when it comes to the local-parts, RFC 5321 decides to just give up and admit no interpretation except at the final host, advising only that servers should avoid local-parts that need to be quoted. In the context of email specification, this kind of recommendation is effectively a requirement to not use such email addresses, and (by implication) most client code can avoid supporting these email addresses [4].

The prospect of internationalized domain names and email addresses throws a massive wrench into the state of affairs, however. I've talked at length in part 2 about the problems here; the lack of a definitive decision on Unicode normalization means that the future here is extremely uncertain, although RFC 6530 does implicitly advise that servers should accept that some (but not all) clients are going to do NFC or NFKC normalization on email addresses.

At this point, it should be clear that asking for a regular expression to validate email addresses is really asking the wrong question. I did it at the beginning of this post because that is how the question tends to be phrased. The real question that people should be asking is "what characters are valid in an email address?" (and more specifically, the left-hand side of the email address, since the right-hand side is obviously a domain name). The answer is simple: among the ASCII printable characters (Unicode is more difficult), all the characters but those in the following string: " \"\$]();,@". Indeed, viewing an email address like this is exactly how HTML 5 specifies it in its definition of a format for <input type="email">.

Another, much easier, more obvious, and simpler way to validate an email address relies on zero regular expressions and zero references to specifications. Just send an email to the purported address and ask the user to click on a unique link to complete registration. After all, the most common reason to request an email address is to be able to send messages to that email address, so if mail cannot be sent to it, the email address should be considered invalid, even if it is syntactically valid.

Unfortunately, people persist in trying to write buggy email validators. Some are too simple and ignore valid characters (or valid top-level domain names!). Others are so focused on trying to match the RFC addr-spec syntax that, while they will happily accept most or all addr-spec forms, they also accept email addresses which are very likely to wreak havoc if you pass them to another system to send email; cause various forms of SQL injection, XSS injection, or even shell injection attacks; and which are likely to confuse tools as to what the email address actually is. This can be ameliorated with complicated normalization functions for email addresses, but none of the email validators I've looked at actually do this (which, again, goes to show that they're missing the point).

Which brings me to a second quiz question: are email addresses case-insensitive? If you answered no, well, you're wrong. If you answered yes, you're also wrong. The local-part, as RFC 5321 emphasizes, is not to be interpreted by anyone but the final destination MTA server. A consequence is that it does not specify if they are case-sensitive or case-insensitive, which means that general code should not assume that it is case-insensitive. Domains, of course, are case-insensitive, unless you're talking about internationalized domain names [5]. In practice, though, RFC 5321 admits that servers should make the names case-insensitive. For everyone else who uses email addresses, the effective result of this admission is that email addresses should be stored in their original case but matched case-insensitively (effectively, code should be case-preserving); see the sketch below.

Hopefully this gives you a sense of why email addresses are frustrating and much more complicated than they first appear. There are historical artifacts of email addresses I've decided not to address (the roles of ! and % in addresses), but since they only matter to some SMTP implementations, I'll discuss them when I pick up SMTP in a later part (if I ever discuss them). I've avoided discussing some major issues with the specification here, because they are much better handled as part of the issues with email headers in general.
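Here is the promised sketch of that case-preserving advice, as a trivial illustration; the helper name is mine, not from any specification:

// Store addresses exactly as given (for sending), but compare them
// case-insensitively (for matching), per the RFC 5321 discussion above.
function sameAddress(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

var stored = "John.Doe@Example.COM";          // preserve original case
sameAddress(stored, "john.doe@example.com");  // true for matching purposes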
Oh, and if you were expecting regular expression answers to the challenge I gave at the beginning of the post, here are the answers I threw together for my various definitions of "valid email address." I didn't test or even try to compile any of these regular expressions (as you should have gathered, regular expressions are not what you should be using), so caveat emptor.

RFC 822 addr-spec
Impossible. Don't even try.

RFC 5322 non-obsolete addr-spec production
([^\x00-\x20()\[$:;@\\,.]+(\.[^\x00-\x20():;@\\,.]+)*|"(\\.|[^\\"])*")@([^\x00-\x20():;@\\,.]+(.[^\x00-\x20():;@\\,.]+)*|$(\\.|[^\\$])*\])

RFC 5322, unquoted email address
.*@([^\x00-\x20():;@\\,.]+(\.[^\x00-\x20():;@\\,.]+)*|$(\\.|[^\\$])*\])

HTML 5's interpretation
[a-zA-Z0-9.!#%&'*+/=?^_{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*

Effective EAI-aware version
[^\x00-\x20\x80-\x9f]():;@\\,]+@[^\x00-\x20\x80-\x9f():;@\\,]+, with the caveats that a dot does not begin or end the local-part, nor do two dots appear consecutively, the local-part is in NFC or NFKC form, and the domain is a valid domain name.

[1] If you're trying to find guides on valid email addresses, a useful way to eliminate incorrect answers is the following set of litmus tests. First, if the guide mentions an RFC, but does not mention RFC 5321 (or RFC 2821, in a pinch), you can generally ignore it. If the email address test (not) @ example.com would be valid, then the author has clearly not carefully read and understood the specifications. If the guide mentions RFC 5321, RFC 5322, RFC 6530, and IDN, then the author clearly has taken the time to actually understand the subject matter and their opinion can be trusted.

[2] I'm using "regular" here in the sense of theoretical regular languages. Perl-compatible regular expressions can match non-regular languages (because of backreferences), but even backreferences can't solve the problem here. It appears that newer versions support a construct which can match balanced parentheses, but I'm going to discount that because by the time you're going to start using that feature, you have at least two problems.

[3] Specifically, if you want to get really technical, the domain name is going to be routed via MX records in DNS.

[4] RFC 5321 is the specification for SMTP, and, therefore, it is only truly binding for things that talk SMTP; likewise, RFC 5322 is only binding on people who speak email headers. When I say that systems can pretend that email addresses with domain literals or quoted local-parts don't exist, I'm excluding mail clients and mail servers. If you're writing a website and you need an email address, there is no need to support email addresses which don't exist on the open, public Internet.

[5] My usual approach to seeing internationalization at this point (if you haven't gathered from the lengthy second post of this series) is to assume that the specifications assume magic where case insensitivity is desired.

### Jared Wein — Australis-Styled Widgets Presentation

I've been pretty quiet this semester about the work that a team of students have been focused on. However, don't let my quietness be a representation of how hard they have worked. We're now reaching the end of the semester and the students have put together a video of their work throughout the semester.

The students were tasked with creating three add-ons for the upcoming Australis version of Firefox. The goal of the project was to get feedback on the new Australis add-on APIs before it became too late to make significant changes. Through the process some bugs were filed, but none that caused us to have to go back and rethink our initial direction.

The three add-ons that the students were asked to create were a weather add-on, a music add-on, and a Bugzilla add-on. Please watch the video below to get an overview of their capabilities. I'll be posting links to the source code repositories and download links for the add-ons sometime later this week.

Tagged: australis, capstone, firefox, msu, planet-mozilla

### Karl Dubost — Future Fail and User Agent Sniffing

Very often I use the expression "UA detection is a future fail strategy".
It's a quick sentence with punch that angers some of the people in the user agent detection business. They try to do a good job at providing the most complete, up-to-date database of user agent strings and their relative capabilities. The issue is not the intent of their database. The issue is often how the identification is used and how the code is shaped with regards to this identification.

It is not an issue only related to UA databases. We see it every day in small pieces of code. Today I was checking the code of http://login.yahoo.com/ which is basically the portal for people to get identified when they have to access one of the Yahoo Web properties where it is required to have a login and password. I found this little piece of code:

if (navigator.userAgent.indexOf('Firefox/6') > 0) {
    style = '';
} else {
    style = "left:70px;";
}

Quite a simple piece of code. Let's see on my current desktop in the Web Console:

> navigator.userAgent
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
> navigator.userAgent.indexOf('Firefox/6')
-1
> var ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/6.0"
undefined
> ua.indexOf('Firefox/6')
69

Logical and working as expected. But now let's go back to the future… with Firefox 60.0:

> var ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/60.0"
undefined
> ua.indexOf('Firefox/6')
69

Oooops. It means that the script will kick in for Firefox versions 60 to 69 and 600… you get the idea. This code is probably not used anymore; it's still there because people have forgotten about it, but one day the condition it was supposed to meet (or not meet) will come back to bite, for any number of reasons.

Otsukare!

### Justin Wood — Android Mobile Marketshare, and how we (Mozilla) stack up. – Part 1

So, I should preface this with a few big caveats!

• I am not a metrics guy, nor do I pretend to be.
• This post will not give you absolute numbers of users.
• This post is not meant to show any sense of penetration into the market.
• It ignores all things Firefox OS for the purposes herein.
• I present this as an attempt at data only, with no pre-judging of where we should go, nor what we should do.
• I am explicitly trying to avoid color commentary here, allowing the reader to draw their own conclusions based on the static data.¹

What this series will attempt to show is:

• Where the current marketshare is on Android OS's (with citations where possible)
• Where our (Firefox for Android) userbase is
• Where we invest in builds/tests (due to length, this will be in Part 2; will link from here once published)
• How what we do in Release Engineering correlates to our known market (based on these stats) (also in Part 2)

Now to the juicy bits!

### Google's own stats on Android Marketshare

Currently Android has a pretty healthy marketshare on the later OS's, and the earlier ones are seeing very, very diminishing returns.

Android usage share on Dec 3, 2013, from http://developer.android.com/about/dashboards/index.html:

| Version | Codename | API | Distribution |
| --- | --- | --- | --- |
| 2.2 | Froyo | 8 | 1.6% |
| 2.3.3 - 2.3.7 | Gingerbread | 10 | 24.1% |
| 3.2 | Honeycomb | 13 | 0.1% |
| 4.0.3 - 4.0.4 | Ice Cream Sandwich | 15 | 18.6% |
| 4.1.x | Jelly Bean | 16 | 37.4% |
| 4.2.x | Jelly Bean | 17 | 12.9% |
| 4.3 | Jelly Bean | 18 | 4.2% |
| 4.4 | KitKat | 19 | 1.1% |

This data was all from http://developer.android.com/about/dashboards/index.html so feel free to see updated data as of whenever you are reading this.
### Stats from Google Play for “Firefox for Android”

Before we begin with the data, I need to clarify something readers may not be aware of at first glance: “What is an install?” An install in this context is any currently active device which has Firefox installed on it. It does not actually indicate frequent use.

#### Installs by OS – Release (org.mozilla.firefox)

Yes, you see that right: 81.4% of our GA users are on some version of Android 4.0+.

#### Installs by OS – Beta (org.mozilla.firefox_beta)

Our beta audience is pretty similar, with 83.59% on Android 4.x or higher.

#### Installs by Chipset

Selecting by chipset is a bit harder, since to do so we have to take a factor of how Releng does our Play Store releases (different buildIDs factor out to different chipsets). I am doing this via a feature of the Play Store, namely “Export as CSV”, which gives buildID info. So with that in mind, here is the data:

| | Arm V7 | Arm V6 | x86 |
| --- | --- | --- | --- |
| Firefox Beta | 96.19% | 0.90% | 2.91% |
| Firefox GA | 98.61% | 1.39% | N/A |

The caveats to note are that I only gathered data from today's Google Play installs, and I aggregated all installs over all versions, even ones that are multiple years old. We also do not have x86 released officially yet, so we only have beta users using that version.

### Coming in Part 2:

• Where does Mozilla invest build and test resources for Android?
• How does this compare to Mozilla's testing infrastructure?

¹ - This post as-is is indeed intended to be data without analysis/commentary. I don't feel I'm greatly suited for the latter compared to other people possibly reading. In Part 2 I intend to show some corollaries-as-data to what we are doing inside Release Engineering as it compares to our users at large. I'm currently hoping to also write it devoid of any assertions/commentary. The underlying reasoning here is to help spur thoughts and commentary in others in order to further our mission using data, while at the same time not inserting my own opinions or biases into what I am presenting with this two-part series. I do not yet know if I will do a commentary piece referencing these posts; I may do so, I just don't yet plan to.

### Pete Moore — Weekly review 2013-12-04

Accomplishments & status:

Bug 905742 - Provide B2G Emulator builds for Darwin x86

We have finally this week managed to get B2G emulator builds working on OS X 10.7.2. Please note last week we only had it working on 10.7.4, and we really didn't want to upgrade the OS if we didn't have to, due to the other builders that run on these machines.
There are still a couple of loose ends that need to be resolved properly, but essentially these are the requirements:

1) We install the Command Line Tools from Apple from late March 2012: MD5 (cltools_lion_latemarch12.dmg) = 3f999cabd47936eb1d8d656ab6425286
2) We curl gnupg-1.4.15.tar.bz2 into /Library/Caches/Homebrew/gnupg-1.4.15.tar.bz2 as a prerequisite for step 3 (since brew cannot download it through the firewall - this caches it ready for brew)…
4) Create a symbolic link gcc in /usr/local/bin for gcc-4.6
5) Change owner of /usr/local/lib/perl5/site_perl and subdirectories to cltbld user and admin group
6) Upgrade git: i) remove the link /usr/local/bin/git, ii) brew install git
7) Upgrade python: i) remove the link /usr/local/bin/python2.7, ii) brew install python
8) Upgrade tar: i) brew install gnu-tar, ii) create symbolic link tar in /usr/local/bin for gtar

After all this, the environment is set up and we are ready to build…

1) We prepare a buildprops.json file, and export the PROPERTIES_FILE env variable to point to it
2) Add /usr/local/bin to the front of PATH
3) unset GIT_DIR
4) hash -d gcc tar
5) hg clone http://hg.mozilla.org/build/mozharness, and check out the production branch
6) Patch b2g_build.py to fix the “script” command to work on darwin, and remove the mock setup steps in three places
7) Create the /builds/git-shared/git directory
8) Run mozharness: 'scripts/scripts/b2g_build.py' '--target' 'generic' '--config' 'b2g/releng-emulator.py' '--b2g-config-dir' 'emulator' '--gaia-languages-file' 'locales/languages_dev.json' '--gecko-languages-file' 'gecko/b2g/locales/all-locales'

In parallel we have one more fix we (currently) have to make:

1) We create the directory build-dir/build/out/host/darwin-x86/usr while the build is running

### Patrick Cloke — GSoC Lessons: Part Deux: The Arms Race

This post title might be a little excessive, but I'll blame The Sum of All Fears that I was watching last night.

This is the second part of a set of posts about ideas I heard at the Google Summer of Code 2013 Mentor Summit (you can read the first part about the application process). This will explore an interesting anecdote I heard about the interaction between applicants from another organization that, on reflection, seemed to resonate somewhat with what I had seen in my corner of the Mozilla community.

The organization these students were applying to required patches to be fixed for a student's application to be accepted (as discussed in my previous post). For a particular project there existed multiple highly motivated and skilled students, but only one slot. Thus, a "patch race" of sorts occurred where the students competed by continually providing more patches that were increasingly complex. (Note that this wasn't in response to a challenge from community members; it was a spontaneous situation.) Once a single student started to submit extra patches, the other students felt they must also submit more patches to be considered equal/superior (hence my allusion to an "arms race"). Interestingly, they would also sometimes work on the same bug in a sort of race to see who could fix it first.

There's a couple things I took away from this:

1. Great, the project just had a lot of things fixed!
2. The students were investing escalating amounts of time during the application phase.
3. The students were not working in an open manner.

I won't really expand much more on the first point; it's always good to fix things.
Although submitting patches might showcase a student's skill, it also relates to how much time the student is willing and able to put into the application period. This, in particular, matters since different areas of the world end their school year at different times. A student that has already finished his semester during the application period may have a lot of free time to attempt to get a GSoC slot (but will most likely not have as much time during the actual summer!). This is something that mentors should keep in mind while reviewing applications. A downside of increasing amounts of time invested is that the rejection is that much harder, both for the mentor (especially if the student is now part of the community!) and for the student, who has now invested a large amount of time in the project.

The realization that actually upset me, however, is that these students were not working in an open manner! Instead of collaborating, they were competing! To me, this would set a very poor tone for the rest of GSoC. In fact, one of the biggest challenges I've had with GSoC students is getting them to work in the open (i.e. "show me the code"; anyone in #instantbird is probably tired of hearing me say that).

At this point you might think this is a hypothetical case I made up! Upon letting it sink in and reflecting on it... I realized I had actually seen similar situations during the application periods I've been involved with. This year, we found a bug in Instantbird's IRC code (CTCP quoting and dequoting); after referencing some specifications, I was pretty quickly able to figure out the vague areas where people should look for a fix. A couple of GSoC students in the room started looking into it and exhibited a greatly reduced form of the behavior I discussed above. The students were sharing information, but were not comfortable sharing code. Unfortunately, this led to some very vague questions which I was unable to answer (or answered incorrectly) and led to me coining my catchphrase from above.

I by no means think this reflects poorly on our students! I think this is somewhat natural and expected for most students unfamiliar with open development. (Extrapolating from my experiences in school...) Students generally work individually (or in small groups) on projects and are directly competing for grades (at least if the course is graded on a curve). This would foster a sense of competition as opposed to cooperation! Luckily the students working with us understood (with very little prompting, I might add!) that we'd prefer they work together and help each other. We were able to successfully fix the dequoting bug (which then caused a bug in the quoting code to be visible... sigh...).

My short takeaway from all this: remember that students are not yet a community, and they're competing with each other until they've been accepted. (And that they're used to competing, e.g. on homework and exams, not collaborating!) I don't really know whether I feel the above situation is good or bad, but it's certainly an interesting effect of the way the GSoC process works.

### Wladimir Palant — Links not working on a website? You can fix that!

Adblock Plus users who decided to disable tracking have been complaining about severe issues on some websites for a while already. On the websites in question, clicking a link simply wouldn't do anything unless one disables Adblock Plus. Our investigation has shown that bugs in Adobe SiteCatalyst are to blame for this issue.
SiteCatalyst is a tracking solution that Adobe acquired from Omniture. A “forced link tracking” feature introduced recently is the source of these issues. Originally it was enabled for Google Chrome only, and in a follow-up version for Mozilla Firefox as well.

#### What can be done about it?

Have you found a website with broken links? You can report it in the EasyList/EasyPrivacy forum (no registration required). The filter authors will add a new filter to block Adobe SiteCatalyst on this website. Unfortunately, blocking it on all websites by a generic filter isn't possible because website owners tend to install it under different names.

You might also be able to block that script yourself. In Firefox you can click the Adblock Plus icon on the problematic web page and choose “Open blockable items” from the menu. Enter “script” as the search term to list scripts only. Typically, the SiteCatalyst script will have “omniture” somewhere in its name, or it will be called “s_code.js”. If the corresponding line isn't red then the script isn't blocked; you can double-click the line to open the filter assistant and block it. Make sure to select the first option in the “Look for pattern” section of the filter assistant: you want to block only this one script, not the entire folder.

If you are a website owner using SiteCatalyst, you can disable link tracking in the configuration of your SiteCatalyst instance with the following statement:

s.useForcedLinkTracking = false;

#### Did you try to contact Adobe?

Yes, we did. We contacted Adobe mid-June and listed the bugs in SiteCatalyst and how they can be fixed. We were then asked for exact steps to reproduce, and we delivered those: an explanation of how the issue can be reproduced on adobe.com. Our contact confirmed that this was exactly what they needed, and then there was silence. We sent a reminder three months later, but the current state is still that Adobe is for some reason unable or unwilling to fix the bugs in their scripts. So Adobe SiteCatalyst continues to cause issues on websites of Adobe's paying customers, and even on Adobe's own website.

#### What are these bugs exactly? Are they hard to fix?

No, fixing these bugs is trivial. The main issue concerns the way data is transmitted. When a link is clicked on the website, SiteCatalyst “swallows” the click and creates an image instead to send tracking data. It then generates a fake link click when this image loads. The code looks like this:

if (!im)
  im = s.wd[imn] = new Image;
im.s_l = 0;
im.onload = new Function('e', '...');

The issue here is that the image can fail to load, something that might be caused by Adblock Plus but also by connectivity issues or a server failure. There will be no load event in that scenario but rather an error event, and so the fake link click will never happen. The solution is to add the handler for both events so that it is always executed when the request is done:

im.onload = im.onerror = new Function('e', '...');

Now SiteCatalyst isn't meant to track more than one link click. Normally only the first link click should be swallowed by the bug above; the subsequent clicks would be handled as intended. The code looks like this:

if (!s.useForcedLinkTracking)
  ...
else
  s.b.removeEventListener("click", s.bc, false);

This should remove the click listener after it is triggered for the first time. The problem is that the code adding the click listener looks like this:

s.b.addEventListener('click', s.bc, true)

Can you spot the bug?
This isn't really rocket science. The bugs can be fixed with minimal changes that wouldn't take any developer more than an hour to implement and test. The bug report we sent to Adobe had the same level of detail; it outlined exactly what went wrong and how it can be fixed. Why wasn't Adobe able to fix their bugs then? Beats me.

### Nick Cameron — OMTC for Windows nightly users

I just landed a patch flipping the switch for all Windows users with HWA (d3d9/10/11) to use off-main-thread compositing. This is a fairly big change to our rendering pipeline, so if you notice rendering issues on Windows, please file bugs. For now, only nightly users will get this change. Riding the trains depends on bugs 913503 and 904890, and general stability. We wanted to land this early to get some extra testing time, and because without being tested it has been rotting super quickly. I will arrange for a branch to keep testing main-thread composition asap.

One known issue is windowed plugins lagging during scrolling (913503), so please ignore that (for now) if you observe it. OMTC can be disabled by setting the 'layers.offmainthreadcomposition.enabled' pref to false. If there are more problems than we can fix relatively quickly, we can do this for all users very easily.

This is the culmination of months of work, so it is really exciting to see it finally land. I fully expect, however, to have to turn it off due to a tidal wave of unforeseen bugs, but such is life.

### Jared Wein — Download a Holly nightly today

Continuing with the Australis work that much of the Firefox front-end team has been laser-focused on recently, we now have automatically updating nightly builds of Holly (Windows, OS X, Linux). Holly is the version of Nightly that doesn't include the Australis changes.

We are running this special "backout" branch of Nightly because Australis won't be ready to make the move to Firefox Aurora by the December 9th merge date. We will continue to work on Australis in the Nightly 29 train, with the goal of Australis merging to Firefox Aurora 29. In the meantime, the Holly branch is what will be merged to Firefox Aurora 28.

It is very important that we have nightly testers who use Holly to help the Firefox community make sure that we have good code coverage over the changes that will be making their way to our Aurora population. If you'd like to help test out the Holly branch, you can now download an auto-updating nightly build of Holly (Windows, OS X, Linux). Again, these will be very similar to the official Firefox Nightly builds, with the exception that they don't include the Australis user interface changes.

Tagged: australis, firefox, mozilla, planet-mozilla

### Brian Birtles — Web Animations @ html5j 2013

Over the weekend I had the chance to speak about Web Animations at the HTML5 conference 2013 in Tokyo. I put a fair bit of work into the presentation, so I thought I'd put up an English version of the slides (including videos of the demos) in case they're useful to someone else looking for a gentle introduction to Web Animations. I ran out of steam when producing the last few slides, so it kind of ends with a fizzle, but I put a fair bit of work into the other ones, so hopefully it's entertaining.
Although you can't tell from the slideshare version, most of the slides include animation somewhere, and most of the pictures are made with SVG, so I think it looked pretty.

View the presentation slides

In the rare chance that you're reading this blog directly, or the syndicator didn't eat this bit, you can view the slides right here. (If you're curious, there's the HTML version too. But be warned that it doesn't have explanatory notes like the slideshare version; it won't work in Chrome since it makes heavy use of scoped styles; and one animation relies on enabling mix-blend-mode, which is only in Aurora or Nightly, see bug 902525.)

(Of course there are Japanese materials as well! See the event broadcast, the cleaned-up slides, and the original HTML version of the slides.)

### Jared Wein — Good read: The value of Ignite

Recently I got a shout-out from Tricia Broderick in her blog post, "The value of Ignite". The post was a great reminder of what someone can accomplish when they step out of their comfort zone and try something they've never done before.

We held several of these "Ignite" events at TechSmith. At each event we had about eight presenters who covered various work and non-work related topics. The twist to the presentations is that each slide can only be on-screen for 15 seconds (auto-advancing) and you only get 20 slides. These turned out to be great activities for people to learn more about their coworkers as well as get practice presenting.

See Tricia's blog post for her take on the event.

Tagged: planet-mozilla, presentation, techsmith

### Nigel Babu — Mozilla Summit 2013 - Connections

There have already been several excellent blog posts about the Summit. I want to talk about the biggest opportunity that the Summit provided – in-person connections. I've been involved with Mozilla since 2011, and this is the third Mozilla event I've attended. Compared to the previous events, Mozilla Summit 2013 was a sensory overload, in a pleasant way of course.

On Wednesday, I met pleia2 at Union Square. We walked around and had dinner at her favorite burger place, which had a beautiful view of Union Square. The next day, I was at the Mozilla Space in San Francisco. I spent most of the day working on HTML parsing for "Who Owns What". It turned out that Rob was headed to Santa Clara via Caltrain and stopped by the office to say hi. I love trains, so I joined Rob, and we had a great conversation going all the way to the hotel.

At the hotel, I was excited to say hi to Ben; we've known each other from the Ubuntu and Mozilla communities. I accidentally got into the wrong elevator and met Wes in it. That evening, a hilarious confusion happened, which is now a running joke among those who know Ashish and me. Jen and a few others walked up to Ashish and asked if he was Nigel. When I finally did meet Jen, Sole, and jbuck, sole amended my nametag to say "The real nigelb". I believe Ashish later had "Nigel Babu*" written on one side of his name tag, with the * expanded below to "*Not".

That evening, I met Jessica Ledbetter and James Tantum, who I know primarily from the Ubuntu community, for dinner at a nice Greek restaurant. Over the course of the Summit, I met glob, bhearsum, dolske, edmorley, Dino, Gen, sid0, peterbe, Kaitlin, Kate, Hilary, Ludovic, and lots of Mozillians from the Asian and especially Indian community who were familiar from the Mozcamps. On Friday evening, after the Firefox OS dinner, I met morgamic for the first time! It was definitely an exciting moment for me. Later, philikon was talking to morgamic and he looked familiar.
I asked him his IRC nick and had an aha! moment. I'm grateful to have met all the folks from Mozilla Webdev who were in Santa Clara – Ben, Erik Rose, Luke, David Walsh, Jen, Sole, Owen, James, Craig, Peter, Lars, Rob, zalun, and others whose names I can't recall well enough to make a proper list.

After the Summit, I went to the Pinterest office to meet Dave Dash. He was my mentor when I first started contributing to Mozilla, and again, it was great to meet him in person. As I think back to the Summit, all the people I've met are my most treasured memory.

Note: If I haven't mentioned your name, it's because I've forgotten it. These few weeks have been a bit stressful, and it's been more than two months since the Summit.

### James Long — A Deep Dive into Asynchronous Templating

I wrote a JavaScript templating engine called Nunjucks. Recently it reached 1.0, and one of the new features is asynchronous templating. You may be wondering, like I was a few months ago, what does that even mean? I tend to prioritize feature requests by popularity, and one of the features that kept coming up was asynchronous templates. It took me a while to figure out what people meant by that, and I think the result is quite interesting.

Nunjucks does a lot of things, like loading templates, calling filters, and more. All of this is synchronous by default (which isn't a problem for loading templates, since they are loaded once and cached forever). This limits what you can do in filters and template loaders, since you can't use any async functions. Asynchronous templates can be paused in the middle of rendering and resumed later. This hasn't been a problem for a long time, and for most people it never will be: you don't want to mix too much logic with your templates, so you usually do all the complicated async work in a controller and pass the data to the template. However, I can imagine sites that are heavily template-driven, with developers wanting to wrap up some behavior that depends on an async operation into a filter or custom tag. Nunjucks is built to let people add logic to their templates as needed, and works great for large content-heavy projects where not everybody is familiar with the backend.

Regardless, I think asynchronous control is an interesting feature that doesn't adversely affect existing templates, so I decided to dig into it. Here's what I came up with.

### A Basic Example

In nunjucks, you can define filters that are used in templates like so:

```
Hello {{ user | formatName }}!
```

The way you create filters looks like this:

```js
var env = nunjucks.configure('views');

env.addFilter('formatName', function(user) {
  return user.firstname + ' ' + user.lastname;
});
```

This means that you can define points in the template which call out to custom JavaScript code. There are two other places this can happen: extensions, which let you create custom tags that process content at run-time, and loaders, which let you handle how templates are loaded when a block like include or extends is hit.

The problem is that if you want to use any asynchronous API in your custom code, you can't. The previous nunjucks API only supported synchronous functions which returned a value at the end. For example, let's say you wanted to load a value from a database in a filter:

```js
env.addFilter('getCategory', function(item) {
  db.get('item-category-' + item.id, function(err, res) {
    return res;
  });
});
```

That won't work. The getCategory filter returns undefined because nothing is actually returned, so nothing gets rendered.
The async call is just ignored because there's nothing it can do in the callback. This is the technical reason why asynchronous templates are necessary. If we want to support asynchronous behavior in custom code, everything up the stack needs to be asynchronous as well. This means that all of the template code becomes asynchronous, so template rendering can be "paused" and resumed at a later time.

### The Solution

As of nunjucks 1.0, you can write asynchronous filters, extensions, and loaders. Because async work might happen, all of the API calls must be async as well, such as render. Here's an example that creates an async filter and renders a template:

```js
env.addFilter('getCategory', function(item, cb) {
  db.get('item-category-' + item.id, cb);
}, true);

env.render('foo.html', function(err, res) {
  // ...
});
```

Asynchronous style is completely optional in nunjucks. I made it that way because I believe 99% of templates will not use it, and it sucks to enforce such a big change for a rarely used feature. That's why you need to pass true to env.addFilter as the last argument, which tells nunjucks to give you a callback for async work. Otherwise the system will assume your filter is synchronous. Note that env.render now takes a callback instead of returning the rendered template. Everything up the stack has to be asynchronous as well for templates to be paused/resumed. Extensions and loaders have similar ways to mark them as async.

Since everything is implicitly synchronous, the async work is marked explicitly. Nunjucks is able to take advantage of this for performance, as you will see in the next section. If you never use any asynchronous filters, extensions, or loaders, you can still just write var res = env.render('foo.html').

### Implementation Details

Nunjucks has always been a really fast templating engine because it compiles templates to straightforward code. For example, look at this template:

```
{% for item in items %}
  {{ item.name }} last seen {{ item.id | getLastSeen }}
{% endfor %}
```

This compiles to:

```js
function root(env, context, frame, runtime, cb) {
  var lineno = null;
  var colno = null;
  var output = "";
  try {
    output += "\n";
    frame = frame.push();
    var t_3 = runtime.contextOrFrameLookup(context, frame, "items");
    if(t_3) {
      for(var t_1=0; t_1 < t_3.length; t_1++) {
        var t_4 = t_3[t_1];
        frame.set("item", t_4);
        output += "\n  ";
        output += runtime.suppressValue(runtime.memberLookup((t_4),"name", env.autoesc), env.autoesc);
        output += " last seen ";
        output += runtime.suppressValue(env.getFilter("getLastSeen").call(context, runtime.memberLookup((t_4),"id", env.autoesc)), env.autoesc);
        output += "\n";
      }
    }
    frame = frame.pop();
    output += "\n";
    cb(null, output);
  } catch (e) {
    cb(runtime.handleError(e, lineno, colno));
  }
}
```

While there is a bunch of boilerplate to handle scoping, autoescaping, and other features, it basically boils down to a simple for loop and string concatenation. The philosophy of nunjucks has been to compile out to unsurprising JavaScript, which makes it really fast. But to support asynchronous behavior, we need to radically transform the generated code so that the template can be "paused" at any point and then picked up later when the async work is done. Performance would suffer greatly from that kind of code, unfortunately, as every operation needs to be wrapped in some kind of delayed fashion. Imagine trying to pause it in the middle of the for loop; you can't, so you have to use a custom iteration mechanism to control it, and you lose simplicity and performance.
Worse, this major (unbenchmarked but obvious) performance hit is for a feature that most people won't use. There is a key insight that will solve the performance problem, though.

Before we dig into nunjucks, it's worth mentioning dust.js, which is the only other templating engine I know of that is asynchronous. It's easy to see how it works if you look at the example on the homepage:

```
Hello {name}! You have {count} new messages.
```

compiles to:

```js
(function() {
  dust.register("demo", body_0);
  function body_0(chk, ctx) {
    return chk.write("Hello ")
      .reference(ctx.get("name"), ctx, "h")
      .write("! You have ")
      .reference(ctx.get("count"), ctx, "h")
      .write(" new messages.");
  }
  return body_0;
})();
```

The code it generates chains together every single step of the rendering, so nothing is eagerly evaluated. It has its own iterator for looping and isn't able to take advantage of JavaScript optimizations. However, dust.js is a very cool templating language, and the performance might be fine for you. It's able to do lots of cool stuff like streaming templates because of how it's structured. However, nunjucks templates tend to be large and very fast, and I wanted to keep it that way.

#### Key Insight

There is a particular characteristic of asynchronous nunjucks templates that we can take advantage of: asynchronous work can only be triggered within filters, extensions, and loaders that are explicitly marked asynchronous. That means that only at those places do we need to worry about asynchronous transformations; everything else can be synchronous. You'll see the great benefits we can reap from this property below.

#### Transformation

So what kind of generated code do we need to produce? Let's start with a basic example and go from there.

```
Hello {{ user.name }}, last logged in {{ user.id | getLastSeen }}
```

This template compiles to:

```js
function root(env, context, frame, runtime, cb) {
  var lineno = null;
  var colno = null;
  var output = "";
  try {
    output += "\nHello ";
    output += runtime.suppressValue(
      runtime.memberLookup(runtime.contextOrFrameLookup(context, frame, "user"),"name", env.autoesc),
      env.autoesc
    );
    output += ", last logged in ";
    output += runtime.suppressValue(
      env.getFilter("getLastSeen").call(context, runtime.memberLookup((runtime.contextOrFrameLookup(context, frame, "user")),"id", env.autoesc)),
      env.autoesc
    );
    output += "\n";
    cb(null, output);
  } catch (e) {
    cb(runtime.handleError(e, lineno, colno));
  }
}
```

In this template, we only need to worry about the getLastSeen filter being asynchronous. The code above calls it synchronously and expects it to return a value. What if we changed the compiler to generate the following code?

```js
function root(env, context, frame, runtime, cb) {
  var lineno = null;
  var colno = null;
  var output = "";
  try {
    output += "\nHello ";
    output += runtime.suppressValue(
      runtime.memberLookup((runtime.contextOrFrameLookup(context, frame, "user")),"name", env.autoesc),
      env.autoesc
    );
    output += ", last logged in ";
    env.getFilter("getLastSeen").call(
      context,
      runtime.memberLookup((runtime.contextOrFrameLookup(context, frame, "user")),"id", env.autoesc),
      function(t_1,hole_0) {
        if(t_1) { cb(t_1); return; }
        output += runtime.suppressValue(hole_0, env.autoesc);
        output += "\n";
        cb(null, output);
      }
    );
  } catch (e) {
    cb(runtime.handleError(e, lineno, colno));
  }
}
```

Now it calls the getLastSeen filter with a callback, which renders the rest of the template. I know the code is a little dense, but I want to keep it as real compiled code from nunjucks so you can really see how it works.
It's important to see that the callback contains the entire code for the rest of the template. You can see it better if I add more stuff to the template:

```
Hello {{ user.name }}, last logged in {{ user.id | getLastSeen }}. Today is {{ day }}!
```

The filter call would become:

```js
env.getFilter("getLastSeen").call(
  context,
  runtime.memberLookup((runtime.contextOrFrameLookup(context, frame, "user")),"id", env.autoesc),
  function(t_1,hole_0) {
    if(t_1) { cb(t_1); return; }
    output += runtime.suppressValue(hole_0, env.autoesc);
    output += ". Today is ";
    output += runtime.suppressValue(runtime.contextOrFrameLookup(context, frame, "day"), env.autoesc);
    output += "!\n";
    cb(null, output);
  }
);
```

Since we only have to watch out for filters, extensions, and loaders, we can add asynchronous support rather easily into our existing linear code. Internally, as the compiler emits sequential statements, it keeps track of a current "scoping level" so it knows how many functions to close at the end of the template.

Here's a really high-level overview. Previously nunjucks simply walked through a list of expressions and generated code for each of them, so it was sequential like this:

```
output += expr1
output += expr2
output += expr3
output += expr4
output += expr5
output += expr6
```

Now, if expr2 and expr4 are asynchronous, we generate the opening of a callback function, add a scoping level so it is closed at the end, and continue generating code:

```
output += expr1
expr2(function(err, res) {
  output += res
  output += expr3
  expr4(function(err, res) {
    output += res
    output += expr5
    output += expr6
  })
})
```

Although the asynchronous expressions generate slightly different code now, the rest of the expressions are generated exactly the same as before. It just so happens that syntactically they are wrapped in the callback. In this way we defer the rest of the template by sticking it all into the callback function. It works just as well if there are multiple callbacks (produced by multiple asynchronous forms).

#### Iteration

So we've successfully transformed the generated code to support asynchronous control (the above technique can also be triggered by an async extension or loader)! Unfortunately, it breaks down if you do anything async inside a for loop. The plague of asynchronous behavior is that everything must be asynchronous. You can't call an async API inside a normal JavaScript for loop; there's no way to "pause" the iteration. That means we can't use for loops anymore.
Nunjucks will generate code that uses our own iterator, asyncEach:

```
{% for item in items %}
  {{ item.name }} last seen {{ item.id | getLastSeen }}
{% endfor %}
```

```js
function root(env, context, frame, runtime, cb) {
  var lineno = null;
  var colno = null;
  var output = "";
  try {
    output += "\n";
    frame = frame.push();
    var t_3 = runtime.contextOrFrameLookup(context, frame, "items");
    runtime.asyncEach(t_3, 1, function(item, t_1, t_2,next) {
      frame.set("item", item);
      output += "\n  ";
      output += runtime.suppressValue(runtime.memberLookup((item),"name", env.autoesc), env.autoesc);
      output += " last seen ";
      env.getFilter("getLastSeen").call(context, runtime.memberLookup((item),"id", env.autoesc), function(t_4,hole_0) {
        if(t_4) { cb(t_4); return; }
        output += runtime.suppressValue(hole_0, env.autoesc);
        output += "\n";
        next(t_1);
      });
    }, function(t_6,t_5) {
      if(t_6) { cb(t_6); return; }
      frame = frame.pop();
      output += "\n";
      cb(null, output);
    });
  } catch (e) {
    cb(runtime.handleError(e, lineno, colno));
  }
}
```

asyncEach calls a callback with a few arguments, most notably next, which is called when it should move to the next item. We use the same technique of playing around with scoping levels, but in general we are still just generating sequential statements that render the template.
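The post doesn't show asyncEach itself, but the idea is simple enough to sketch. Here is a simplified, hypothetical version of such an iterator (nunjucks' real one lives in its runtime and passes extra loop metadata): each step runs only when the previous one calls next, and errors short-circuit to the final callback.

```js
// arr: the array to iterate; iter(item, index, next): the loop body;
// done(err): called once, after the last item or on the first error.
function asyncEach(arr, iter, done) {
  var i = -1;
  function next(err) {
    if (err) { done(err); return; }
    i++;
    if (i < arr.length) {
      iter(arr[i], i, next);
    } else {
      done(null);
    }
  }
  next(null);
}

// Usage: each step may do async work before advancing.
asyncEach(["a", "b", "c"], function(item, i, next) {
  setTimeout(function() { console.log(i, item); next(); }, 10);
}, function(err) {
  if (!err) console.log("all done");
});
```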
#### Lifting Expressions

So we're done, right? Not exactly. Nunjucks supports complex expressions like this one:

```
Hey {{ foo(1, 2, username | title ) }}
```

This compiles out mostly to a normal JavaScript function call, and our transformation would break because it expects to be at the top level. It would generate something like foo(1, 2, getFilter('title').call(this, username, function(err, res) {). Even if it were syntactically valid, the filter call wouldn't return anything. We need to convert the whole expression to be asynchronous.

Sound complicated? It's actually pretty easy to fix. I know this post is quite dense, but dogs make everything better, right? And if you skipped down here, seriously, go back up!

What we need to do is lift all the asynchronous filters into the outer scope, and then evaluate the expression. We can do this because it's not valid to mutate anything within an expression, so we can guarantee the same effect if we evaluate all the async stuff first and then simply fill in the original locations with the results. If we were in JavaScript, the transformation would look like this:

```js
foo(1, 2, title(username, function(err, _username) {}));

// into

title(username, function(err, _username) {
  foo(1, 2, _username)
});
```

Indeed, you can see this pattern in the generated code for the original expression:

```js
function root(env, context, frame, runtime, cb) {
  var lineno = null;
  var colno = null;
  var output = "";
  try {
    output += "\nHey ";
    env.getFilter("title").call(context, runtime.contextOrFrameLookup(context, frame, "username"), function(t_1,hole_0) {
      if(t_1) { cb(t_1); return; }
      output += runtime.suppressValue(
        runtime.callWrap(runtime.contextOrFrameLookup(context, frame, "foo"), "foo", [1, 2, hole_0]),
        env.autoesc
      );
      output += "\n";
      cb(null, output);
    });
  } catch (e) {
    cb(runtime.handleError(e, lineno, colno));
  }
}
```

The title filter is called first, and then in the callback foo is called with hole_0, which is the result of the title filter. You can lift as many asynchronous filters as needed, as long as you evaluate them in the same order as they are found.

The lifting step introduces a new phase in the compiler: transforming. Previously a template was parsed into an AST, and then the AST was compiled. Now, after the parser makes an AST, it is passed through a transformer which does all the lifting, and then the compiler takes the final AST and compiles it to JavaScript.

#### Optimizing for the Common Use Case

At this point, we finally have robust asynchronous templates. But hold on now, didn't I bemoan the loss of real for loops and code simplicity? Indeed, a quick benchmark of our new code shows a big drop in performance! (I don't remember how much, but it was somewhere around a 2x-3x drop.) This is sad. Since most people won't even do asynchronous work, what if we could generate asynchronous code only when actual asynchronous filters/extensions/loaders are used?

If we require asynchronous filters and extensions to be known at compile time, we can be very optimistic with the generated code. Let's ignore loaders for now, as they have some edge cases that aren't worth discussing. Let's take the basic example again:

```
Hello {{ user | formatName }}!
```

If we have a list of the names of all the asynchronous filters, we can check whether formatName is asynchronous. If it is not, the compiler can generate fast synchronous code and forgo the callback mess. This is groundbreaking because suddenly we can deduce whether a whole chunk of code is asynchronous or not. For example, look at this example again:

```
{% for item in items %}
  {{ item.name }} last seen {{ item.id | getLastSeen }}
{% endfor %}
```

We can scan the entire code within the for loop and check whether any asynchronous filters are used. If they aren't, we can fall back to a normal (and highly performant) JavaScript for loop! You can see this happening here in the AST transformer. When it hits an if or for, it scans all the nodes inside and checks for any async nodes. If it finds any, it converts the if or for into an IfAsync or AsyncEach node, which generates async code instead, and continues walking up the AST. Now the generated code is synchronous (and fast!) by default, just like it was before any of this happened, but you can trigger asynchronous code generation when you need it.

#### And We're Done!

That was a whirlwind tour of how I implemented asynchronous templating in nunjucks. I thought it was an interesting exercise, and I was happy that I was able to keep normal synchronous templates (which are by far the most common) as fast as they've always been.

### Parallel Execution

Now that we have asynchronous ability, we should take advantage of it. There is a lot more nunjucks could do, but I'm taking it slowly to see how users use it. The nice thing is that you can abstract away complex asynchronous scenarios that would otherwise result in complicated code. Take an asynchronous map, for example. If you have an array of items and want to do something asynchronous to all of them in parallel, it gets complex with error handling (promises help, but it's still verbose). Maybe you can just use the new nunjucks tag, asyncAll, which renders all items in parallel:

```
{% asyncAll item in items %}
  {{ item.id | lookupName }}
{% endall %}
```

It's exactly like for, but it fires off the rendering for each item in parallel, and when all of them are finished it renders the completed output in the right order. If lookupName is asynchronous, you'll get a nice speedup doing this in parallel. If you don't do anything asynchronous inside the loop, it just renders sequentially.
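The pattern behind a tag like asyncAll is the classic "fan out, collect by index, join in order" callback dance. A rough sketch of that pattern (not nunjucks' actual implementation):

```js
// renderItem(item, cb) does the possibly-async work for one item;
// done(err, output) fires once, with the outputs joined in input order.
function renderAll(items, renderItem, done) {
  var results = new Array(items.length);
  var pending = items.length;
  var failed = false;

  if (pending === 0) { done(null, ""); return; }

  items.forEach(function(item, i) {
    renderItem(item, function(err, output) {
      if (failed) return;
      if (err) { failed = true; done(err); return; }
      results[i] = output;   // slot results by index, not by arrival order
      if (--pending === 0) {
        done(null, results.join(""));
      }
    });
  });
}
```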
We could possibly implement streaming templates, more powerful parallel execution, and all kinds of things, but I'm not sure those needs are a good fit for nunjucks. In the future, they might be.

### Conclusion

I hope you enjoyed this, and you can read more specific details about asynchronous support in the docs. As always, I'm happy to answer questions on the mailing list.

## December 03, 2013

### John O'Duinn — Infrastructure load for November 2013

• We're back to typical load again in November.

• #checkins-per-month: We had 7,601 checkins in November 2013. This is our 2nd-heaviest load on record, and is back in the expected range. For the curious, our heaviest month on record was August 2013 (7,771 checkins), and our previous 2nd-heaviest month was September 2013 (7,580 checkins).

• #checkins-per-day: Overall load was consistently high throughout the month, with a slight dip for US Thanksgiving. In November, 18 of 30 days had over 250 checkins-per-day, 13 of 30 days had over 300 checkins-per-day, and 1 of 30 days had over 400 checkins-per-day. Our heaviest day had 431 checkins on 18nov, close to our single-day record of 443 checkins on 26aug2013.

• #checkins-per-hour: Checkins are still mostly mid-day PT/afternoon ET. For 10 of every 24 hours, we sustained over 11 checkins per hour. Our heaviest load time this month was 11am-12noon PT, at 15.6 checkins-per-hour (a checkin every 3.8 minutes!) – slightly below our record of 15.73 checkins-per-hour.

mozilla-inbound, b2g-inbound, fx-team:

• mozilla-inbound had 16.6% of all checkins. This continues to be heavily used as an integration branch. As developers use other *-inbound branches, the use of mozilla-inbound has declined over recent months and is stabilizing around the mid-teens of overall usage.

• b2g-inbound had 11.5% of all checkins. This continues to be a successful integration branch, with usage slightly increased over last month's 10.3%, a sign that usage of this branch is also stabilizing.

• fx-team had 6% of all checkins. This continues to be a very active third integration branch for developers. Usage is almost identical to last month, and shows that usage of this branch is also stabilizing.

• The combined total of these 3 integration branches is 34.1%, which is slightly higher than last month yet fairly consistent. Put another way, sheriff-moderated branches consistently handle approx 1/3 of all checkins (while Try handles approx 1/2 of all checkins). The use of multiple *-inbounds is clearly helping improve bottlenecks (see pie chart below), and congestion on mozilla-inbound is being reduced significantly as people switch to using other *-inbound branches instead. Overall, this configuration reduces stress and backlog headaches for sheriffs and developers, which is good. All very cool to see working at scale like this.

mozilla-aurora, mozilla-beta, mozilla-b2g18, gaia-central: Of our total monthly checkins:

• 2.6% landed into mozilla-central, slightly lower than last month. As usual, most people land on sheriff-assisted branches instead of landing directly on mozilla-central.

• 1.4% landed into mozilla-aurora, lower than last month's abnormally high load. This is consistent with the B2G branching, which had B2G v1.2 checkins landing on mozilla-aurora and has now moved to mozilla-b2g26_v1_2.

• 0.9% landed into mozilla-beta, slightly higher than last month.

• 0.0% landed into mozilla-b2g18, slightly lower than last month. This dropped to almost zero (a total of 8 checkins) as we move B2G to gecko26.
• 3.3% landed into mozilla-b2g26_v1_2, as part of the B2G v1.2 branching involving Firefox 25. As predicted, this is significantly more than last month, and is expected to continue until we move focus to B2G v1.3 on gecko28.

• Note: gaia-central, and all other gaia-* branches, are not counted here anymore. For details, see here.

misc other details: As usual, our build pool handled the load well, with >95% of all builds consistently being started within 15mins. Our test pool is getting up to par, and we're seeing more test jobs being handled with better response times. Trimming out obsolete builds and tests continues. As always, if you know of any test suites that no longer need to be run per-checkin, please let us know so we can immediately reduce the load a little. Also, if you know of any test suites which are perma-orange and hidden on tbpl.m.o, please let us know – those are the worst of both worlds, using up scarce CPU time *and* not being displayed for people to make use of. We'll make sure to file bugs to get tests fixed – or disabled – every little bit helps put scarce test CPU to better use.

### Karl Dubost — Spelling, Forms and Browsers

With the rise of mobile phones, typing long text with one finger, letter by letter, has become a straining task. Mobile implementers have developed a number of strategies to help people type their text faster while minimizing mistakes. These include things like:

• Spell checking
• Predictive typing
• Auto-completion of words
• Auto-capitalization

Here is an example of capitalization in Safari on iOS, and an example of autocorrect in Safari on Desktop, where I typed "whattever you say" and the system replaced it with "whatever you say".

These features can become annoying or even counter-productive when the user is typing a password, an email address, a domain name, etc. So it becomes necessary to be able to deactivate them.

### Many devices, many browsers, many options, no Web standard

#### Mozilla Firefox OS

Firefox OS has an undocumented attribute, x-inputmode, based on the inputmode attribute in HTML5. The inputmode content attribute is an enumerated attribute that specifies what kind of input mechanism would be most helpful for users entering content into the form control. The specific value verbatim is defined as "Alphanumeric Latin-script input of non-prose content, e.g. usernames, passwords, product codes." When this attribute is used on form input or textarea elements, the browser deactivates any predictive typing and does not propose any automatic completion of words.

Markup example:

```html
<input x-inputmode="verbatim"/>
```

Firefox OS is currently the only browser supporting x-inputmode. The current documentation on Mozilla is not totally consistent or clear. There is a page with a mention of inputmode, but no details are given about it. Following your nose from there turns up a few related pages, but none of them really leads to a real explanation of how it is actually implemented.

#### Apple iOS Safari

Apple writes extensively in the iOS documentation about the autocorrect and autocapitalize features: "Set the autocorrect attribute to on if you want automatic correction and the autocapitalize attribute to a value if you want automatic capitalization. If you do not set these attributes, then the browser chooses whether or not to use automatic correction or capitalization. For example, Safari on iOS turns the autocorrect and autocapitalize attributes off in login fields and on in normal text fields."

Markup example:

```html
<input autocorrect="off"/>
```

In my own tests on my version of Safari, only autocapitalize and autocorrect seemed to be working. I will be happy to get more feedback about this.

#### Google Blink

Google has an open bug for implementing inputmode as described in the specification. But the situation is awkward with regard to autocorrect, which is indeed not supported in any version of Chrome.

#### Microsoft IE

Finally, Microsoft seems to have implemented a spellcheck attribute since IE10, but I'm not sure exactly what it does. The documentation doesn't say whether it also blocks the autocorrect or auto-completion features.

Markup example:

```html
<input spellcheck="false"/>
```
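Since no single standard attribute covers all of these browsers yet, a site that wants to keep these helpers away from, say, a username field has to stack the vendor-specific attributes. A combined example using only the attributes discussed above (each honored by the browsers noted for it):

```html
<!-- A username field opting out of typing helpers:
     x-inputmode="verbatim"        - Firefox OS
     autocorrect / autocapitalize  - Safari
     spellcheck                    - IE10 and other browsers supporting it -->
<input type="text" name="username"
       x-inputmode="verbatim"
       autocorrect="off"
       autocapitalize="off"
       spellcheck="false"/>
```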
### Test page and some results

I created a test page with all the cases that are supposed to have been implemented in some fashion in a Web page. For people typing on mobile, this address is shorter: http://la-grange.net/forms. Do not hesitate to contact me on twitter @karlpro if you have other findings.

Otsukare!

### Kat Braybrooke — Building Cultural Heritage: A RemixJam with Tate Britain

The open web has presented cultural heritage institutions with big opportunities to engage global audiences and make their collections more discoverable (and shareable!) than ever before. A few weeks ago I headed to the Tate Britain to deliver a skills-sharing workshop (at Mozilla, we call this method Train the Trainer) to prepare their Gallery Collectives on the use of digital remix tools ahead of the Tate's public housewarming party. Our goal? Use Webmaker's open source webpage remix tool XRay Goggles to engage makers of all ages with the process of digital curation and licensing for cultural heritage institutions.

During the session, sitting in the Tate's brand new 'digital studio', the group realised there was a real need to create a public, remixable curriculum kit that other heritage institutions could use to engage audiences critically with their digital collections (and importantly, with the open web and the cultural commons). The result? This Cultural Remixjam Teaching Kit.

The Cultural Remixjam, in a nutshell, is a hands-on 3-hour workshop that introduces participants to the co-design process (a real passion for those of us who run the ODandH collective), webmaking, and the commons in the context of digitized cultural collections and archives. Activities are participatory, creative, and critical, encouraging participants to form a conceptual understanding of the web not only as a creative medium but also as a place where cultural heritage can thrive if given the right kinds of digital circumstances allowing for use and re-use of shared public assets.

This Remixjam is still in its infancy, and it needs further edits and critical review. But the process of training, discussing, and then putting together its co-design challenges for the use of the Tate's Digital Studio has proven to be a very powerful example of what interest-led, co-designed curriculum development can look like, especially in the context of the increasingly salient Web Literacy Standard, connected learning, and the open design theories underlying this work. This is one of the first instances in the UK where we've engaged in a process of interest-based curriculum design with a heritage partner, so your feedback, criticism, and thoughts would be really useful, especially coming from different local contexts.
Please do share it all here — constructive criticism would be much appreciated, and will help make the next iteration of this Remixjam even more applicable for a wide group of cultural heritage organisations to engage with and build upon.

### Gen Kanai — Shanghai Community meeting December 8

Please join me at the upcoming Mozilla community meeting in Shanghai on the afternoon/evening of Sunday, December 8th. My presentation will be in English on the topic of community building strategies, but I think the bulk of the meeting will be in Chinese. A draft agenda is as follows (this may change):

• 3:00 - 3:10 pm: Short introduction to Mozilla/Firefox l10n-related work
• 3:10 - 3:30 pm: Introduction to the translation guidelines
• 3:30 - 3:45 pm: Break
• 3:45 - 4:45 pm: Firefox OS and Firefox Marketplace
• 4:45 - 5:00 pm: Break and free discussion
• 5:00 - 6:00 pm: AMO, MDN, SUMO translation, l10n sprint
• 6:05 - 6:45 pm: Gen's speech & Q&A
• 6:45 - 7:15 pm: Pizza dinner
• 7:15 - 8:00 pm: Movie "Code Rush"

Event venue: 上海市静安区昌平路990号8号楼 联合创业办公社 (延平智阁). Google Maps link.

Please feel free to either show up at the event itself or, if you'd like, leave a comment and we'll know to look for you. Hope you can join us!

### Matthew Noorenberghe — Introducing a Google spreadsheet library to fetch Talos results

As part of the performance investigation for Australis (the in-progress Firefox desktop redesign), we wanted to be able to easily track Talos performance benchmark numbers across all of the relevant platforms without having to open 52 graph server links (Datazilla wasn't ready at the time). My goal was to use the graph server API to pull in the data and display the regression percentage for all of the tests we were tracking in an overview. Rather than writing a new tool from scratch, I decided to look into Google Apps Script, which I had only heard about before, and it seemed to be able to help implement what I wanted on top of Google Spreadsheets. The result is a shared Talos Google Apps Script Library [Google login required] (revision log) (API docs) that anyone can use. You can see it in use for TART and other tests for Australis, along with the demo spreadsheet.

### How to use the Talos library in a Google spreadsheet

Also see Google's documentation on Libraries and the demo spreadsheet.

1. In your Google Spreadsheet, choose Tools > Script Editor.
2. In the new tab, click Close to skip the tutorials and templates.
3. In the menu, choose Resources > Manage Libraries. If you are asked to save your project, do so.
4. In the "Included Libraries" window, paste the following in the "Find a Library" textbox and click Select: MCurDezZ1B01NQX34URNDye9_pJh_2yq6
5. The "Talos" library should have appeared above. Choose a version of the library (the latest is probably best), and click Save.
6. Replace the contents of the default script file ("Code.gs" at the time of writing) with pass-through wrappers for every function you want to use (example; see also the sketch after this list). Let me know if there is a better way to do this, as it's not ideal.
7. Call your wrapper functions from within your spreadsheet, e.g. =getTalosResult(226, 59, 25, "ca7577238ef4"). You can get test, branch, and platform IDs from the compare.py script or from the URLs output on TBPL.

Now you can use the power of spreadsheets to slice and dice the data as you please. Perhaps you like custom graphs?
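For step 6, a pass-through wrapper can be as small as a one-line delegation. A minimal sketch, assuming the library was imported under the identifier "Talos" and exposes getTalosResult with the signature used in step 7 (check the linked API docs for the real surface):

```js
// In the spreadsheet's own script file (e.g. Code.gs): custom functions must
// live here to be callable from cells, so we just delegate to the library.
function getTalosResult(testId, branchId, platformId, changeset) {
  return Talos.getTalosResult(testId, branchId, platformId, changeset);
}
```

With that in place, =getTalosResult(226, 59, 25, "ca7577238ef4") in a cell calls straight through to the library.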
### Caveats

• Google seems to cache the return values of functions for given arguments, so you should only request results once the Talos runs are done and have as many re-triggers as desired. Otherwise, you'll need to change one of the arguments to the functions or add a new cache-breaker argument.

• Occasionally you will get an error that the script took too long to execute. I haven't found documentation on what the upper bound is, and I'm not sure of the cause. Since there is caching in the library as well, recalculating the whole document within the script's cache period (currently 12 minutes) normally allows the scripts to finish, since many of the rows and/or columns can use the cached data.

Eventually the library should switch to using Datazilla, but this works for now since Talos results are being reported to both services. If you would like to contribute changes to the library, let me know.

### Carla Casilli — Badge pathways: part 0, the prequel

This prequel blog post is part of an ongoing trilogy. The trilogy consists of three posts—the prequel, the "quel" and the sequel—plus a bonus paraquel post. The first post to appear, the paraquel, can be found here; the "quel" post can be found here; the prequel is the post you're reading right now; and the sequel post is in process. All of these posts provide a window into our thoughts about pathways—past, present and future.

You may have noticed that these posts have come out of order. Why is this so? For a simple reason: because they've occurred to me in this order. And somewhat poetically, their order underscores the exact idea that I argue in all of these linked posts—that there are few simple linear trajectories, even with blog posts.

A long time ago in a galaxy far, far away

We started down the road toward making Open Badges a reality about 3 years ago, so it's possible (and useful!) for us to take a look back at our inception to make sense of the past and find clues about where we might head.

Episode IV: A NEW HOPE

In the beginning, the Open Badge Infrastructure (OBI) was focused on the development of software that allowed people to develop their own badges—badges without traditional definitions or parameters—and with little to no input from socially prevalent hierarchical organizations. Mozilla cheered badge systems that did not hew to limiting linear learning paths, badge systems that investigated new and dynamic ways to recognize learning regardless of where and when and how it occurred. And yet, in those early days we spoke about the OBI only as a sort of plumbing, as a tool that would privilege the earner rather than the badge issuer. By linking people who wanted to create badges with people who chose to earn badges with people who wanted to display and consume badges, we gambled that a meaningful marketplace would arise. This marketplace would foster new types of skill, learning, and competency acknowledgement and encourage new forms of assessment. And all of this would begin to occur in a new way thanks to the space of possibility created by this new tool, the OBI. And so it has.

The force is strong, or the power of disjunctive and conjunctive tasks

In retrospect, it's easy to see that in addition to creating a dynamic and effective tool, we were creating a community-driven movement as well. How did we arrive at that social movement? By alternately marching to the drumbeat outlined above and finding serendipitous alignments with other folks seeking similar objectives. Through the confluence of disjunctive / conjunctive tasks. But what exactly are disjunctive and conjunctive tasks? The organizational theorist I.D.
Steiner distinguished between disjunctive tasks, those in which only one person needs to succeed, and conjunctive tasks, those in which everyone's contribution is critical (Page, 2007, p. xv).

The OBI began as a disjunctive task. In other words, the disjunctive nature of the task required that Mozilla succeed at developing a functional technical implementation of the OBI. The success of the OBI as a tool was of primary importance. And I'm pleased to say that we have built a robust, dynamic, fully functioning tool. And yet, Open Badges operates as both a tool (and soon a series of tools) and an ecosystem—an ecosystem that houses a series of other systems: individual badge systems created by many different issuing organizations as well as a variety of badge-consuming organizations. Each of those systems acts in a conjunctive way in reference to the larger open badges ecosystem. They're important for the growth, continuity, and development of the ecosystem.

A single badge system, consisting of a number of badges.

Wheel within wheels

Given that they're conjunctive for the ecosystem, here's a bit of a mindbender: each of the individual badge systems operates as a disjunctive task. They need to depend only on their own systemic integrity to thrive. Consequently, those systems are free to explore, consider, and attempt various criteria, assessments, and systems designs. Even more of a mindbender? All of those badge systems are, in turn, conjunctive: their success or failure is dependent upon the individual badges—which are their own disjunctive tasks. And yes, this can all seem a bit fractal.

Similar types of badge systems begin to coalesce into a rough typology.

Indeed, this systemic plasticity creates a space of possibility and is one of the primary reasons why we (Mozilla) encourage so much developmental experimentation, and why we support so many alternative approaches to assessment. The Open Badges ecosystem can accommodate significant speculative load. All this is to say that together, as a community, we've developed a truly distributed information project.

Setting the stage for growth

Or: how we rely on the kindness of our community members to develop, improve, and police our system.

As the economic historian Paul David pointed out to [Scott Page], one of the great challenges in constructing distributed organizations is transforming conjunctive tasks into disjunctive tasks. For example, success in open-source software development requires an initial template that modularizes the problem into a collection of disjunctive parts. (Page, 2007, p. xvi)

Dawning of the open badges ecosystem: many types of disjunctive badge systems begin to form.

Et voilà! Here you have the Open Badge Infrastructure: a loosely designed system rooted in this precise theory, distributed co-creation. And by direct and indirect extension, really any badge system that operates within the open badges parameters and framework. As badge systems increase within the ecosystem, system strengths and network ties appear.

Resilience as a result of a conjunctive system

It may seem obvious, but on the off chance that it's not, let's discuss what we've been somewhat indirectly addressing here: resilience. As I've noted in previous blog posts, there is great value in having an extremely resilient system. In its current iteration, the larger system (the Open Badges ecosystem) can accommodate failure because all of the systems can act both independently and interdependently.
We might consider the open badges ecosystem's ability to withstand failure—its resilience—to be one of its absolute strengths. Some of this may have come from extremely savvy planning, some of it may have come from working with the community to build an agreeable tool, and some of it may have come from luck. To quote George Lucas: "when Star Wars first came out, I didn't know where it was going either. The trick is to pretend you've planned the whole thing out in advance."

The open badges ecosystem continues to evolve, developing systemic resilience.

All this talk about what's come before—what about pathways?

As noted above, these posts are stitching together our experiences thus far, seeking a narrative for our ecosystem pathway. Along similar lines, we've been finding some resonance with Bitcoin (open source P2P money) as an analogue to the development of a new system possessing social value. Of course, that product also includes actual financial value and so is a whole other kettle of fish. (As for the conceptual trajectory Bitcoin has been tracing, now there's an interesting pathway worth examining closely. Possibly more about that in a future post.)

To be continued…

Distributed problem solving can be thought of as a form of innovation. This opening up of innovation activities is sometimes called distributed co-creation. The diverse toolboxes that people bring to problems enable large populations to enable novel breakthroughs. (Page, 2007, p. xvii)

The thriving open badges ecosystem contains various types of badge systems: an expansive, inclusive universe.

Using distributed problem solving as our lodestone, we'll continue to move ahead. We're creating new opportunities as we go, charting new directions for other organizations to follow, and encouraging the badge universe to continue to expand. We're embracing emergence and encouraging novelty. Much more soon.

References:

Page, S. (2007). The difference: How the power of diversity creates better groups, firms, schools and societies. Princeton, NJ: Princeton University Press. Available from: http://press.princeton.edu/titles/8757.html

Hibbard, J. (2010). George Lucas sends a letter to Lost. Hollywood Reporter. Retrieved from Wikipedia: http://en.wikipedia.org/wiki/Star_Wars#Prequel_trilogy

Tagged: badge system design, dmlbadges, drumbeat, mozilla, OBI, open source, openbadges, software, system design

### Patrick Cloke — GSoC Lessons: Part 1: Application Period

I briefly talked about my experiences at the Google Summer of Code 2013 Mentor Summit. I've been pretty remiss in sharing what was actually discussed there, and for that I must apologize! This will hopefully be the first of a few posts about what I learned and discussed at the Summit. The first part I'd like to talk about is the application period: welcoming students, requirements for student applications, etc. Much of what I say here is just ideas I've heard other organizations implement (with my personal opinion on them; please don't think this represents what Mozilla is suggesting students do, or even what I'm suggesting Mozilla should ask students to do!).

I had many separate conversations about what is required for an application to be accepted. It seems that Mozilla is actually on the side of the easier organizations to apply to. We don't (to my knowledge) require that students have contributed to the community at all beforehand.
It is possible that some smaller communities inside of Mozilla require more than just an application, but there does not seem to be any rule across Mozilla. I said I wouldn't offer my opinion above... but I lied: I think Mozilla should make it clearer to applicants what is expected of them before the application. There seem to be a variety of things different organizations "require" before accepting a student application, for example:

• A patch / pull request
• IRC / email involvement / idling
• File a bug (I mean this in the "Mozilla" sense: an actual bug, a feature request, etc.)
• Fix a bug / make a commit

I think all of these have pros and cons, and making any of them a hard and fast rule would probably be a bad idea. Personally, for Instantbird, we greatly encourage students to idle on IRC and get to know us, and to fix a minor bug or two or three. What I'm always looking for is: use, passion, and skill.

Asking for a patch / pull request (I include these together since they really just depend on how an organization accepts changes) can be a bit intimidating for a new user. I think this can be a pretty rough thing to ask of new contributors who might not want to share their work publicly with a large group of people (on a mailing list, public bug tracker, etc.) where they might be wrong. Even after becoming part of the community, I find that GSoC students are often very unwilling to publicly share code unless it's "perfect", but I digress. Anyway, if you're considering "requiring" this, I think it should be pretty clear that the changeset doesn't need to be perfect; it just needs to show that the student is able to read code, understand a bug report, provide a fix, and test it.

I think it's perfectly reasonable to ask students to idle on IRC and join mailing lists. They should definitely be trying to understand the community before attempting to join it. It isn't just a matter of whether the community thinks the student would be a good fit; the student must also ensure they can fit into the community.

Filing a bug is a great way for a student to show a few different things: they've used your software; they've used your software enough to find a bug in it (and there most likely is one!); and they're able to express themselves in a clear and concise manner. If you're lucky, they'll find something that actually annoys them and fix it themselves!

I have "fix a bug" listed last. You might ask how this differs from submitting a patch... and it does! Fixing a bug requires a patch to go through whatever review process your project uses, so it builds upon just submitting a patch. My thoughts on this are pretty similar to just submitting a patch, but it depends on how large the bug is.

Something I found interesting is that almost everyone I talked to didn't treat their GSoC students any differently than they would treat a new contributor to their project. They still had to prove they were worthy of commit access, etc.

Is there anything else you ask of your students before they apply to GSoC? I'd love to hear it! Some other topics I'll hopefully find some time to write about include: community lessons, and handling a failing student. The community one will be very much not GSoC-focused and could apply to just trying to incorporate new contributors... but I'll include it in this series.

### Chris Pearce — Why does the HTML fullscreen API ask for approval after entering fullscreen, rather than before?
The HTML fullscreen API is a little different from other JS APIs that require permission, in that it doesn't ask permission before entering fullscreen; it asks forgiveness *after* entering fullscreen.

Firefox's fullscreen approval dialog, which asks "forgiveness" rather than permission.

The rationale for having our fullscreen API implementation ask forgiveness rather than request permission is to make it easier on script authors. When the original API was designed, we had a number of HTML/JS APIs, like the geolocation API, that would ask permission. The user was prompted to approve, deny, or ignore the request, though they could bring the request back later from an icon in the URL bar and approve it at a later time.

Geolocation approval dialog, from Dive Into HTML's geolocation example.

The problem with this design for script authors is that they can't tell whether the user has ignored the approval request or is just about to go back and approve it by bringing up the geolocation door-hanger again. This model of requesting permission has been seen to cause problems for web apps in the wild using the geolocation API. Often if a user ignores the geolocation permission request, the web app doesn't work right, and if you approve the request some time later, the site often doesn't start working correctly. The app just doesn't know whether it should throw up a warning or whether it's about to be granted permission.

So the original developers of the fullscreen spec (Robert O'Callahan, and later I and others were involved) opted to solve this problem by having our implementation ask forgiveness. Once you've entered fullscreen, the user is asked to confirm the action. This forces the user to approve or deny the request immediately, and this means that script will immediately know whether fullscreen was engaged, so script will know whether it needs to take its fallback path or not.

Note that the specification for requestFullscreen() defines that most of the requestFullscreen() algorithm should run asynchronously, so there is scope to change the fullscreen approval dialog to a permission request before entering fullscreen instead, if future maintainers or other implementers/browsers wish to do so.
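From a page's point of view, the outcome is observable through the fullscreen events. A small sketch using the prefixed API Firefox shipped at the time ("player" is a hypothetical element id):

```js
var elem = document.getElementById("player");

document.addEventListener("mozfullscreenchange", function() {
  if (document.mozFullScreenElement === elem) {
    // We entered fullscreen; the approval dialog is shown over it.
  } else {
    // We left fullscreen again, e.g. because the user denied the dialog:
    // take the fallback path now.
  }
});

document.addEventListener("mozfullscreenerror", function() {
  // The request itself was refused (e.g. it wasn't triggered by a user
  // action): take the fallback path now.
});

// Fullscreen requests must come from a user-initiated event handler.
elem.mozRequestFullScreen();
```

Either way, the script finds out promptly; there is no limbo state like the ignored geolocation prompt.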
L10NBASEDIR=~/moz/hg/l10n

mkdir -p $L10NBASEDIR
pushd $L10NBASEDIR
while read line; do hg clone "http://hg.mozilla.org/releases/l10n/mozilla-aurora/$line"; done < $LOCALES
popd

### Augmenting your .mozconfig

Add the following lines:

# Make this match your checkouts.
mk_add_options 'export MOZ_CHROME_MULTILOCALE=en-US cs da de es-ES fi fr ja ko it nb-NO nl pl pt-BR pt-PT ru sk sv-SE zh-CN zh-TW'

# Use absolute paths.
mk_add_options 'export L10NBASEDIR=/Users/rnewman/moz/hg/l10n'
ac_add_options --with-l10n-base=/Users/rnewman/moz/hg/l10n

### Install compare-locales

pip install compare-locales

### Build and package

This step should be improved when we fix Bug 934196. Personally, I’ve just dumped the extra stuff in a locales.sh script and moved on with my life.

cd $MOZILLA_CENTRAL
./mach build && \
pushd objdir-droid/mobile/android/locales && \
for loc in $(cat ../../../../mobile/android/locales/maemo-locales); do \
  make merge-$loc LOCALE_MERGEDIR=$PWD/merge-$loc; \
  make chrome-$loc LOCALE_MERGEDIR=$PWD/merge-$loc; \
done && \
popd && \
./mach package

Note that the new stuff, compared to a plain build, is everything between ./mach build and ./mach package: the pushd, the merge/chrome loop, and the popd.

Once this completes (assuming no errors), you’ll have an APK that contains multiple locales. Install it on your device!
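If you use adb, installing the fresh build looks something like this (the exact APK filename varies by version and architecture, so the path here is an assumption; adjust the glob to match your objdir):

# -r reinstalls over an existing install, keeping app data.
adb install -r objdir-droid/dist/fennec-*.apk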

### Updating your l10n checkouts

Every now and then, do something like this:

cd $L10NBASEDIR
for loc in $(cat $MOZILLA_CENTRAL/mobile/android/locales/maemo-locales); do \
  pushd $loc && hg pull && hg up -C && popd; \
done

### Testing locale switching

Until we ship a UI for this, you’ll need to use a trivial testing add-on. That add-on puts menu items in the Tools menu; pick one, and it’ll switch your app locale.

The code for this add-on is on GitHub. You can also install the XPI directly. Then you’ll see the Tools menu full of locales, like this:

Try it out… and whenever you make a change to UI code, use it to make sure you haven’t broken anything!

### Richard Newman — New locale-related work in Firefox for Android

I recently landed the first steps towards a new way of choosing your language and locale in Firefox for Android. This is of interest as a feature, of course, but it also means some new capabilities and obligations for Fennec front-end developers, so I thought I’d put pen to paper.

### Context

Right now, Firefox on Android — like most Android apps — displays its UI and web content in the locale you’ve selected in Android’s settings. In short: if your phone is set to use es_ES (Español [España]), then so is Firefox… and without digging around in about:config, you will also get web content in Spanish by default.

That’s not ideal for a number of reasons. Firstly, carriers tend to restrict the locales that you can select, sometimes to as few as four. If you only speak (or prefer to speak) a language that your carrier doesn’t let you choose, that’s a bad scene. Secondly, the Mozilla community extends beyond the locales that Android itself supports. The only way to address these two issues is to decouple Firefox’s locale selection from Android.

The work that just landed to do so is Bug 936756, upon which we will build two selection UIs — Bug 917480 for the app locale, and Bug 881510 for choosing which locales you wish to use when browsing the web.

### What does this mean for users?

Quite simply: once a UI has been layered on top, you’ll be able to switch between each of Firefox for Android’s supported locales without restarting your browser, and maintain that selection independently of the Android OS locale. If you’re happy continuing to use Android’s settings to make that choice, that will continue to work, too.

### How does it work?

It works by persisting a selected language in SharedPreferences, manipulating both the Gecko locale prefs and the Android/Java Locale and Resources frameworks to impose that language. To do so we hook into onConfigurationChanged events. For more details, read the bug!
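To make the moving parts concrete, here is a minimal sketch of the general approach in Java; this is illustrative only, not Fennec's actual code, and the LocaleManager class and "app.locale" preference key are hypothetical names:

import java.util.Locale;

import android.content.Context;
import android.content.SharedPreferences;
import android.content.res.Configuration;
import android.content.res.Resources;

// Hypothetical sketch of the technique, not the real implementation.
public class LocaleManager {
    // Called at startup and again from onConfigurationChanged, so an
    // OS-level locale switch doesn't clobber the app's own selection.
    public static void applyStoredLocale(Context context) {
        SharedPreferences prefs =
            context.getSharedPreferences("app-locales", Context.MODE_PRIVATE);
        String code = prefs.getString("app.locale", null);
        if (code == null) {
            return; // No override stored: follow the Android OS locale.
        }

        Locale locale = new Locale(code);
        Locale.setDefault(locale);

        // Impose the locale on the Resources framework so newly inflated
        // UI picks up strings in the selected language.
        Resources res = context.getResources();
        Configuration config = res.getConfiguration();
        config.locale = locale;
        res.updateConfiguration(config, res.getDisplayMetrics());
    }
}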

(This is also a small step toward supporting l20n on Android. More on that at a later date.)

### What does this mean for developers?

Historically, there’s been a sizable rift between day-to-day development and the final localized builds that users see. We front-end developers build en_US-only builds for testing, while our localization communities work with Aurora, 6-12 weeks later. Partly that’s because locale switching itself is cumbersome (watch Android restart everything under the sun!). Partly it’s because building a multi-locale APK has been difficult.

Unfortunately, that results in a failure to detect even obvious l10n issues during development. Take, for example, Bug 933272. Between Firefox 23 and 28, we displayed all plugin-related text in English, regardless of your selected locale. This is the kind of thing that’s easy to find by testing a local build in a non-English locale.

With switchable locales, two things are true:

• It’s now easy to switch locales, so you can (and should!) routinely test with multiple locales, just as we test with tablets and phones, with screens rotated, etc.
• You must test locale switching if you’re working on front-end code that includes strings, so that your new or changed feature doesn’t break locale switching!

I hope that’s a fair trade. Next post: how to build a multi-locale APK, without too much disruption to your existing toolchain.

### Chelsea Novak — Firefox Affiliates: Time for an update

The Firefox Affiliates program was launched in 2011, acting as a hub for banners that Firefox and Mozilla supporters can add to their websites, blogs and Facebook walls. The system is available in more than 15 languages and affiliate buttons are available in more than 30 (Yes, Mozilla localizers are the best).

The site has driven millions of download clicks over the last two years. So it is with great delight that I share that the time has come to update and enhance Firefox Affiliates. Affiliates have been writing in with suggestions for how we can improve the system, and now we can finally put that feedback to use.

It is also time for you to speak up and let us know what would make the Affiliates experience great for you. Requests so far include adding more types of banners and links to the system, gamification for participation, a better mobile experience and options for banner customization.

It would be hugely helpful if you could fill out this survey to help us prioritize where we want to put our energies first.

There are about 100,000 users registered as Firefox Affiliates, so another priority will be surfacing other opportunities for them to support Firefox and Mozilla beyond hosting a banner. They have already taken some action as supporters, so we will be doing a lot of thinking about what other interesting activities to invite them to participate in.

We will also be updating the visual design of the program, bringing it in line with the Sandstone branding that all Mozilla sites use.

All in all, 2014 is going to be an exciting time for the Affiliates program. If you have thoughts, please fill out the form or leave them in the comments. You can also find us in #affiliates on IRC.

### Niko Matsakis — Thoughts on DST, Part 4

Over the Thanksgiving break I’ve been devoting a lot of time to thinking about DST and Rust’s approach to vector and object types. As before, this is very much still churning in my mind so I’m just going to toss out some semi-structured thoughts.

### Brief recap

Treating vectors like any other container. Some time back, I wrote up a post about how we could treat vectors like any other container, which would (to some extent) avoid the need for DST.

Dynamically sized types (DST). In Part 1 of the series, I sketched out how “Dynamically Sized Types” might work. In that scheme, [T] is interpreted as an existential type like exists N. [T, ..N], and Trait is interpreted as exists T:Trait. T. The type system ensures that DSTs always appear behind one of the builtin pointer types, and those pointer types become fat pointers:

• Advantage. Impls for objects and vectors work really well.
• Disadvantage. Hard to square with user-defined smart pointers like RC<[int]>. The problem is worse than I presented in that post; I’ll elaborate a bit more.

Statically sized types (SST). In Part 2 of the series, I sketched out an alternative scheme that I later dubbed “Statically Sized Types”. In this scheme, in some ways similar to today, [T] and Trait are not themselves types, but rather shorthands for existential types where the exists qualifier is moved outside the smart pointer. For example, ~[T] becomes exists N. ~[T, ..N]. The scheme does not involve fat pointers; rather, the existential type carries the length, and the thin pointer is embedded within the existential type.

• Advantage. It is easy to create a type like RC<[int]> from an existing RC<[int, ..N]> (and, similarly, an RC<Trait> from an existing RC<T>).
• Disadvantage. Incompatible with monomorphization except via virtual calls. I described part of the problem in Part 3 of the series. I’ll elaborate a bit more here.

### Where does that leave us?

So, basically, we are left with two flawed schemes. In this post I just want to elaborate on some of the thoughts I had over Thanksgiving. Roughly speaking, there are three:

1. DST and smart pointer interaction is even less smooth than I thought, but workable for RC at least.
2. SSTs, vectors, and smart pointers are just plain unworkable.
3. SSTs, objects, and smart pointers work out reasonably well.

At the end, I suggest two plausible solutions that seem workable to me at this point:

### Making DST work with RC requires some contortions

In part 1, I gave the example of how we could adapt an RC type to use smart pointers. I defined the RC type as follows:

struct RC<T> {
priv data: *T,
priv ref_count: uint,
}


Unfortunately, as Patrick pointed out on reddit, this simply doesn’t work. The ref count needs to be shared amongst all clones of the RC pointer. Embarrassing. Anyway, the correct definition for RC is more like the following:

struct RCData<T> {
priv ref_count: uint,
priv t: T,
}

struct RC<T> {
priv data: *mut RCData<T>
}


In order to be sure that I’m not forgetting details, permit me to sketch out roughly how an RC implementation would look in actual code. To start, here is the code to allocate a new RC pointer, based on an initial value. I’m going to allocate the memory using a direct call to malloc, both so as to express the “maximally customized” case and because this will be necessary later on.

impl<T> RC<T> {
pub fn new(t: T) -> RC<T> {
unsafe {
let data: *mut RCData<T> =
transmute(malloc(sizeof::<RCData<T>>()));

// Intrinsic init initializes memory that contains
// uninitialized data to begin with:
init(&mut *data, RCData { ref_count: 1, t: t });

RC { data: data }
}
}
}


One could dereference and clone an RC pointer as follows:

impl<T> Deref for RC<T> {
fn deref<'a>(&'a self) -> &'a T {
unsafe { &self.data.t }
}
}

impl<T> Clone for RC<T> {
fn clone(&self) -> RC<T> {
unsafe {
self.data.ref_count += 1;
*self
}
}
}


The destructor for an RC<T> would be written:

impl<T> Drop for RC<T> {
fn drop(&mut self) {
unsafe {
let rc = self.data.ref_count;
if rc > 1 {
self.data.ref_count = rc - 1;
return;
}

// Intrinsic drop that frees memory:
drop::<T>(&mut self.data.t);

free(self.data);
}
}
}


OK, everything seems reasonable. Only one problem – this whole scheme is incompatible with DST! To see why, consider again the type RCData:

struct RCData<T> {
priv ref_count: uint,
priv t: T,
}


And, as you can see here, it references T by itself, without using any kind of pointer indirection. But for T to be unsized, it must always appear behind a *T or something similar. This is precisely the example that I showed in the section Limitation: DSTs must appear behind a pointer in Part 1.

Now, it turns out we could rewrite RC to make it DST compatible. The idea is to use the standard trick of storing the reference count at a negative offset. Let’s write up an RC1 type that shows what I mean:

struct RC1Header {
priv ref_count: uint,
}

struct RC1<unsized T> {
priv data: *mut T
}


In this scheme, we have a pointer data directly to a *mut T. This means that the compiler could “coerce” an RC1<[int, ..3]> into a RC1<[int]> by expanding data into a fat pointer. It does have the side-effect of making the code to allocate an RC and manipulate its ref count a bit more complex, since more pointer arithmetic is involved.

Here is the code to allocate an RC1 instance. Hopefully it’s fairly clear. One interesting aspect is that, for allocation, we don’t need to accept unsized types T, since at allocation time the full type is known. However, later on, we may “forget” the precise type of T and convert it into an unsized, existential type like [U] or Trait. In that case, we still need to be able to find the reference count, even without knowing the size or alignment of T. Therefore, we must be conservative and do our calculations based on the maximal possible alignment requirement for the platform.

static MAXIMAL_ALIGNMENT: uint = 16; // platform specific

impl<T> RC1<T> {
pub fn new(t: T) -> RC1<T> {
unsafe {
// We need to be able to compute size of header
// without knowing T, so be conservative:
assert!(MAXIMAL_ALIGNMENT > sizeof::<uint>());
let header_size = MAXIMAL_ALIGNMENT;

// Allocate memory for header + data.
let size = header_size + sizeof::<T>();
let alloc: *mut u8 = malloc(size) as *mut u8;

// Initialize the reference count.
let header: *mut RC1Header = alloc as *mut RC1Header;
(*header).ref_count = 1;

// Initialize the data itself.
let data: *mut T = (alloc + header_size) as *mut T;
init(&mut *data, t);

// Construct the GC value.
RC1 { data: data }
}
}
}


Here is a helper to obtain a pointer to the ref count from an RC1 instance. Note that it is carefully written to be compatible with an unsized T.

impl<unsized T> RC1<T> {
fn header(&self) -> *mut RC1Header {
let data: *mut u8 = self.data as *mut u8;
let header_size = MAXIMAL_ALIGNMENT;
(data - header_size) as *mut RC1Header
}
}


Based on this we can rewrite deref, clone, and drop in a fairly obvious way. All of them are compatible with unsized types.

impl<unsized T> Deref for RC1<T> {
fn deref<'a>(&'a self) -> &'a T {
unsafe { &*self.data }
}
}

impl<unsized T> Clone for RC1<T> {
fn clone(&self) -> RC1<T> {
unsafe {
self.header().ref_count += 1;
*self
}
}
}

impl<unsized T> Drop for RC1<T> {
fn drop(&mut self) {
unsafe {
let rc = self.header().ref_count;
if rc > 1 {
self.header().ref_count = rc - 1;
return;
}

// Intrinsic drop that frees memory:
drop::<T>(&mut *self.data);

free(self.data);
}
}
}


OK, so we can see that DST does permit RC<[int]>, but only barely. It makes me nervous. Is this a general enough solution to scale to future smart pointers? It’s certainly not universal.

### Why SST just doesn’t work with vector types.

The SST approach does not employ fat pointers in the same sense and thus is largely free of the limitations on smart pointer layout that DST imposes. But not entirely. In part 3 I described the problem of finding the correct monomorphized instance of deref(). In general, this is not possible, though in many instances the compiler could deduce that it doesn’t matter which type of pointee deref() is specialized to – I thus proposed that a solution might lie in formalizing this idea by permitting a type parameter T to be labeled erased, which would cause the compiler to guarantee that the generated code will be identical no matter what type T is instantiated with. This seems nice, but there are many complications in practice. Let me sketch them out.

First, it is rare that a type can be entirely erased, even in dereference routines. For example, consider the straightforward RC type that I sketched out before, where the header was made explicit in the representation, rather than being stored at a negative offset. Here is the Deref routine:

impl<T> Deref for RC<T> {
fn deref<'a>(&'a self) -> &'a T {
unsafe { &self.data.t }
}
}


At first, it appears that the precise type T is irrelevant, but in fact we must know its alignment to compute the offset of the field t. This precise situation is why the alternative scheme RC1 made conservative assumptions about the alignment of t. We could address this, though, by manually annotating the alignment of the t field (something we do not yet support, but ought to in any case):

struct RCData<T> {
priv ref_count: uint,

#[alignment(maximum)]
priv t: T,
}


A deeper problem lies with the drop routine. The destructor for an RC<T> needs to do three things, and in a particular order:

1. Decrement ref count, returning if it is not yet zero.
2. Drop the value of T that we encapsulate.
3. Drop the memory we allocated.

The tricky part is that step 2 requires knowledge of T. I thought at first we might be able to finesse this problem by having the destructor run after the contained data had been freed, but that doesn’t work because in this case the data is found at the other end of an unsafe pointer, and the compiler doesn’t traverse that – and worse, we don’t always want to free the T value of an RC<T>, only if the ref count is zero.

Despite all the problems with Drop, it’s possible to imagine that we define some super hacky custom drop protocol for smart pointers that makes this work. But that’s not enough. There are other operations that make sense for RC<[T]> types beyond indexing, and they have the same problems. For example, perhaps I’d like to compare two values of type RC<[T]> for equality:

fn foo(x: RC<[int]>, y: RC<[int]>) {
if x == y { ... }
}


This seems reasonable, but we immediately hit the same problem: what Eq implementation should we use? Can Eq be defined in an “erased” way? Let’s not forget that Eq is currently defined only between instances of equal type. This winds up being basically the same problem as drop – we can only circumvent it by adding a bunch of specialized logic for comparing existential types.

Another problem lies in the case where the length of a vector is not statically known. The underlying assumption of all this work is that a type like ~[T] corresponds to a vector whose length was once statically known but has been forgotten. We were going to move the “dynamic length” case to a type like Vec<T>, that supports push() and so on. But the idea was that Vec<T> should be convertible to a ~[T] – frozen, if you will – once we were done building it. And that doesn’t work at all.

Finally, even if we could, we don’t want to generate those monomorphized variants anyhow. Even if we could overcome all the above challenges, it’s still silly to have a type like RC<[int]> delegate to some specific destructor for [int, ..N] for whatever length N it happens to be. That implies we’re generating code for every length of the vector that occurs in practice. Not good, and DST wouldn’t have this problem.

OK, so I hope I’ve convinced you that SST and vector types just do not mix.

### Why SST could work for object types.

You’ll note I was careful not to toss out the baby with the bathwater. Although SST doesn’t work well with vector types, I think it still has potential for object types. There are a couple of crucial differences here:

1. With object types, we carry a vtable, permitting us to make crucial operations – like drop – virtual calls.
2. Object types like RC<Trait> support a much more limited set of operations:
• drop;
• invoke methods offered by Trait.

There are many ways we could make RC<Trait> work. Here is one possible scheme that is maximally flexible and does not require the notion of erased type parameters. When you cast an RC<T> to an RC<Trait>, we pair it with a vtable. This vtable contains an entry for drop and an entry for each of the methods in Trait. These entries are set up to take an RC<T> as input and to handle the dereferencing etc. themselves, delegating to a monomorphic variant specialized to T. Let me explain by example. First let’s create a simple trait:

trait Mobile {
fn hit_points(&self) -> int;
}

struct PC { ... }
impl Mobile for PC { ... }

struct NPC { ... }
impl Mobile for NPC { ... }


Now imagine I have a routine like:

fn interact(pc: RC<PC>, npc: RC<NPC>) {
let pc_mob: RC<Mobile> = pc as RC<Mobile>; // convert to object type
let npc_mob: RC<Mobile> = npc as RC<Mobile>; // convert to object type
}


The idea would be to package up the RC<Mobile> with a vtable containing adapter routines. These routines would be auto-generated by the compiler, and would look roughly similar to:

fn RC_PC_drop(r: *RC<PC>) {
drop(*r)
}

fn RC_PC_hit_points(r: *RC<PC>) -> uint {
let pc: &PC = (*r).deref();
pc.hit_points()
}


Thus, when we convert an RC<PC> to an RC<Mobile>, we would pair the RC pointer with a vtable consisting of RC_PC_drop and RC_PC_hit_points. There are some minor complications to work out around the various self pointer types, but that seems relatively straightforward (famous last words). Anyway, the key idea here is to specialize the vtable routines to the smart pointer type, by moving the required deref into the generated method itself. This avoids the need for us to ever invoke code in an erased fashion.

If we added the erased keyword, it could still be used to permit the reuse of these adaptor methods across distinct pointer types. But this can also be done without a special keyword as an optimization (unlike before, it’s not necessary for the type to be erased, merely helpful).

### Squaring the circle

I think we could maybe make DST work, but I still worry it is too magical. It has some real advantages though so perhaps the right thing is to try and elaborate more examples of smart pointer types we anticipate and see whether they can be made to work.

Another solution is to remove vectors from the language, treat them like any other container, and use the SST approach for object types. But there are lots of micro-decisions to be made there, many of which boil down to usability things. For example, what is the meaning of the literal syntax and so on? I’ll leave those thoughts for another day.

### Jess Klein — Brainstorming on Dashboards

During the summer, I spent some time thinking about badge directories and dashboards. The general idea was to prototype a tool for badge earners to make sense of the larger badge ecosystem and in turn to create an integrated dashboard that would help them to collect, maintain and analyze their personal data on their learning, goals and skill acquisition.

Initially I had come up with a few ideas for this dashboard:

a. focusing and personalizing skill search. Here, the user might type in skills that are of interest to them, and then popular badges would appear. I like the idea of incorporating some narrative elements into this framework: instead of just a search box, you have a statement of intent.

b. pathways focused - This mock-up lays out skills you already have, and then upon click/hover you can see where your skills could lead you. This is a personalized approach, so once you log in, it will display a visualization of skills that relate to your data. However, if you are not logged in, it could display popular or trending skills or even... geolocate badges based on your location.

c. Toggle vision - this gives you the chance to explore what is available in the ecosystem as well as in your personal badge library - as a list, as a visual display, and on a map.

d. Whimsical Exploration - still playing with the theme of exploration, discovery and happenstance - this is kind of like a wheel of fortune. Each node coming out of the circle lists skills and then if you are logged in and in fact have the skill, it will be notated. There is a natural progression from this view to a more sophisticated learning pathways exploration.

e. Your (data) garden - This one is a little crazy - but imagine that all of the trees below represent your skills at different stages of growth. You can have "community" gardens as well as "secret gardens" - giving you the ability to curate what data you are in fact sharing. Here you can also set goals, be informed about your "garden health" - which might just equate to giving feedback on various goals that you have set up for yourself - and tool tips - which could be mentorship or coaching based on your goals. There's a lot of metaphor going on here and it probably would be a brain game to figure out how to design it .... but this is just a sketch, so wha ha ha haaaa.

So - the Summer came and went and I thought about these prototypes a little bit more. More specifically, I started to consider what a dashboard and a discovery feature would mean in the context of something like BadgeKit. The goals of the dashboard would, by nature, change to accommodate an issuer (as opposed to an earner). I think that these explorations are still totally valid, and even hackable to modify for this new lens. Some of the goals for the user might include:
• keeping track of the badges that they are offering
• getting notifications regarding badge assessment needs
• analyzing trends for badge earners.
• sharing rights for admin functionality.

While I push forward on the thinking some more, I was reminded of the video The Powers of Ten when thinking about the display of badge data and badge ecosystem information. Upon first glance, a user might want a 1000 ft view of their world, but perhaps they want the ability to easily navigate back and forth through the levels of detail that their data could be providing. Maybe an issuer only wants to view their goals, or their assessments; but perhaps they want to see trends, data about individual badges, and individual users. Maybe an issuer wants to connect their data to algorithms... the possibilities are endless! I'll update as we start to think about this more.

### Ben Hearsum — Contribution opportunity: Release Engineering systems

Release Engineering runs a vast array of infrastructure and systems that do much of the continuous integration and releases for Mozilla. Many of our systems are small in their scope but must be able to scale up to support the incredible load that developers put on them. Other systems receive millions of requests every day from live Firefox, Fennec, and Thunderbird installations.

Do you want to help improve developer productivity, or get releases into users’ hands more quickly and efficiently? Do you want to gain experience working on systems that must work at scale? If so, Release Engineering is a great place to look. Below are a few interesting bugs that could use some attention. If you’re interested in working on any of them, I’m interested in mentoring you. You should be familiar with Python, but you don’t need to be an expert. Have a look below and contact me directly if anything interests you.

• Partial update generation service: Arguably, updates are the most important part of the release process. Partial updates in particular help us keep a good user experience by reducing the amount of data a user needs to download, which means they update more quickly. We generate many of these already, but creating this service would allow much more flexibility over what and when we generate partial updates. This project would involve writing the service from scratch, most likely in Python.
• Update Balrog schema to support multiple partials: Balrog is the code name of our new update server (which I’ve previously blogged about). Its original design came about before we supported serving partial updates to users on multiple older versions of Firefox. In order to start using Balrog for Betas and Releases we need to add this feature. Balrog is written in Python, and this will mostly involve server-side changes to it.
• Improve update verify output: “Update verify” is a very important test that we run as part of our release automation. Its job is to make sure that all users, regardless of where they’re coming from, end up in the same state after updating to the latest release. Its output currently consists of thousands and thousands of lines of text, with test results interspersed. This bug is about finding and implementing a way to make the output easier for a human to make sense of and parse upon failure. The update verify scripts are written in bash, but this could be implemented by modifying them or by post-processing the output.
• Store history of machine actions requested through API: We recently deployed a new system that helps us manage our thousands of build and test machines. It aims to be a single entry point for information gathering and common operations on them. Currently, the data in it is volatile — all history of operations is lost when the server is restarted. This bug will involve adding permanent storage (maybe SQL, maybe something else) to that server, which is written in Python.

## December 01, 2013

### Mike Conley — Australis Performance Post-mortem Part 2: ts_paint and t_paint

Continued from Part 1.

So we’d just gotten Talos data in, and it looked like we were regressing on ts_paint and tpaint right across the board.

Speaking just for myself, up until this point, Talos had been a black box. I vaguely knew that Talos tests were run, and I vaguely understood that they measured certain performance things, but I didn’t know what those things were nor where to look at the results.

Luckily, I was working with some pretty seasoned veterans. MattN whipped up an amazing spreadsheet that dynamically pulled in the Talos test data for each platform so that we could get a high-level view of all of the regressions. This would turn out to be hugely useful.

Here’s a link to a read-only version of that spreadsheet in all of its majesty. Or, if that link is somehow broken in the future, here’s a screenshot:

Numbers!

So now we had a high-level view of the regressions. The next step was determining what to do about it.

I should also mention that these regressions, at this point, were the only big things blocking us from landing on mozilla-central. So naturally, a good chunk of us focused our attention on this performance stuff. We quickly organized a daily standup meeting time where we could all get together and give reports on what we were doing to grind down the performance issues, and what results we were getting from our efforts.

That chunk of the team, however, didn’t initially include me. I believe Gijs, Unfocused, mikedeboer and I kept hacking on customization and widget bugs while jaws and MattN dug at performance. As time went on, though, a few more of us eventually joined MattN and jaws in their performance work.

The good news in all of this is that ts_paint and tpaint are related – both measure the time it takes from issuing the command to open a browser window to actually painting it on the screen. ts_paint is concerned with the very first Firefox window from a cold-start, and tpaint is concerned with new windows from an already-running Firefox. It was quite possible that there was some overlap in what was making us slow on these two tests, which was somewhat encouraging.

The following bugs are just a subset of the bugs we filed and landed to improve our ts_paint and tpaint performance. Looking back, I’m pretty sure these are the ones that made the most difference, but the full list can be found as dependencies of these bugs.

#### Bug 890105 - TabsInTitleBar._update should group measurements and style changes to avoid unnecessary reflows

After a bit of examination, MattN dealt the first blow when he filed Bug 890105. The cross-platform code that figures out how best to place the tabs in the titlebar (while taking into account things like the system font size) is run before the window first paints, and it was being inefficient.

By inefficient, I mean it was causing more reflows than necessary. Here’s some information on reflows. The MDN page states that the article is obsolete, but the page still does a pretty good job of explaining what a reflow is.

The code would take a measurement of something on the page (causing a reflow), update that thing’s size (causing a reflow), and then repeat the process. MattN found we could cluster the measurements into a single pass, and then do all of the changes one after another, as the sketch below shows. This reduced the number of reflows, which helped speed up both ts_paint and tpaint.
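The general measure-then-mutate pattern looks something like this (an illustrative sketch, not the actual patch):

// Illustrative example of batching reads before writes.
function resizeAll(elements) {
  // Phase 1: measure everything up front. At most one reflow happens here.
  var widths = elements.map(function (el) {
    return el.getBoundingClientRect().width;
  });

  // Phase 2: apply every style change. No reads occur between the writes,
  // so the browser can coalesce them instead of reflowing per iteration.
  elements.forEach(function (el, i) {
    el.style.width = (widths[i] / 2) + "px";
  });
}

Interleaving the reads and writes instead would force the engine to flush layout on every iteration of the loop.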

And boom, we saw our first win for both ts_paint and tpaint!

#### Bug 892532 – Add an optional fast-path to CustomizableUI.isWidgetRemovable

jaws found the next big win using a home-brewed profiler. The home-brewed profiler simply counted the number of times we entered and exited various functions in the CustomizableUI code, and recorded the time it took from entering to exiting.

I can’t really recall why we didn’t use the SPS profiler at this point. We certainly knew about it, but something tells me that at this point, we were having a hard time getting useful data from it.

Anyhow, with the home-brew profiler, jaws determined that we had the opportunity to fast-path a section of our code. Basically, we had a function that takes the ID of a widget, looks for and retrieves the widget, and returns whether or not that widget can be removed from its current location. There were some places that called this function during window start-up, and those places already had the widget that was to be found. jaws figured we could fast-path the function by being able to pass the widget itself rather than the ID, and skip the look-up.
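In sketch form, the idea is simply to accept either the ID or the widget itself; the names here are hypothetical, not the real CustomizableUI internals:

var widgetRegistry = {}; // id -> widget; stand-in for the real lookup table

function isWidgetRemovable(widgetOrId) {
  // Fast path: callers that already hold the widget pass it directly
  // and skip the registry lookup entirely.
  var widget = (typeof widgetOrId == "string") ? widgetRegistry[widgetOrId]
                                               : widgetOrId;
  return widget.removable;
}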

#### Bug 891104 – Skip calling onOverflow during startup if there wasn’t any overflowed content before the toolbar is fully initialized

It was MattN’s turn again – this time, he found that the overflow toolbar code for the nav-bar (this is the stuff that handles putting widgets into the overflow panel if the window gets too small) was running the overflow handler as soon as the nav-bar was initted, regardless of whether anything was overflowed. This was causing a reflow, because a measurement was made on the overflowable toolbar to see if items needed to be moved into the overflow panel.

Originally, the automatic call of the overflow handler was to account for the case where the nav-bar is overflowed from the very beginning – but jaws made it smarter by attaching an overflow handler before the CSS attribute that made the toolbar overflowable was applied. That meant the nav-bar would only call the overflow handler if it really needed to, as opposed to every time.

#### Bug 898126 – Cache client hit test values

Around this time, a few more people started to get involved in Australis performance work. Gijs and mstange got a bug filed to investigate if there was a way to make start-up faster on Windows XP and 7. Here’s some context from mstange in that bug in comment 9:

It turns out that Windows XP sends about 200 WM_NCHITTEST events per second when we open a new window. All these events have the same position – possibly the current mouse position. And all the ClientMarginHitTestPoint optimizations we’ve been playing with only make a difference because that function is called so often during the test – one invocation is unnoticeably quick, but it starts to add up if we call it so many times.

This patch makes sure that we only send one hittest event per second if the position doesn’t change, and returns a cached value otherwise.

After some fiddling about with cache invalidation times, the patch landed, and we saw a nice win on Windows XP and 7!

#### Bug 906075 – Only send toolbars through buildArea if they’re not in their default state

It was around now that I started to get involved with performance work. One of my first successful bugs was to only run a toolbar through CustomizableUI’s buildArea function if the toolbar was not starting in a default state. The buildArea function’s job is to populate a customizable area with only the things that the user has moved into the area, and remove the things that the user has taken out. That involves cycling through the nodes in the area to see if they belong, and that takes time. I wrote a patch that cached a “dirty” state on a toolbar to indicate that it’d been customized in the past, and if we didn’t see that value, we didn’t run the toolbar through the function. Easy as pie, and we saw a little win on both ts_paint and tpaint on all platforms.
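Sketched out (with hypothetical names; the real change lives in CustomizableUI), the fast path amounts to something like this:

// Illustrative sketch of the "dirty" fast path, not the actual patch.
function maybeBuildArea(toolbar, buildArea) {
  // Never-customized toolbars carry no marker, so we skip the expensive
  // node-by-node reconciliation that buildArea performs.
  if (!toolbar.hasAttribute("customizationDirty")) {
    return;
  }
  buildArea(toolbar);
}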

#### Bug 905695 – Skip checking for tab overflows if there is only one tab open

This was another case where we had an unnecessary reflow during start-up. And, like bug 891104, it involved an overflow event handler running when it really didn’t need to. jaws writes:

If only one tab is opened and we show the left/right arrows, we are actually removing quite a bit of space that could have been used to show the tab. Scrolling the tabbox in this state is also quite useless, since all the user can do is scroll to see the other parts of the *only* tab.

If we make this change, we can skip a synchronous reflow for new windows that only have one tab.

Which means we could skip a reflow for all new windows. Are you starting to notice a pattern? Sections of our code had been designed to operate the same way, regardless of whether or not it was in the default, common case. We were finding ways of detecting the default case, and fast-pathing them.

Chalk up another win!

#### Bug 907787 – Australis: toolbar overflow button should be hidden by default

Yet another example where we could fast-path the default case. The overflow button in the nav-bar is only supposed to be displayed if there are too many items in the nav-bar, resulting in some getting put into the overflow panel, which anchors on the overflow button.

If nothing is being overflowed and the panel is empty, the button should not be displayed.

We were, however, displaying the button by default, and then hiding it when we determined that nothing was overflowed. Bug 907787 inverted that logic, and hid the button by default, and only showed it when things got overflowed (which was not the default case).

We were getting really close to performance parity with mozilla-central…

#### Bug 908326 – default the navbar to overflowable to avoid needless reflowing

Once again, an example of us not greasing the default-path. Our overflowable toolbar code applies an overflowable attribute to the nav-bar in order to apply some CSS styles to give the toolbar its overflowing properties. Adding that attribute dynamically means a reflow.

Instead, we just added the attribute to the node’s definition in browser.xul, and dropped that unnecessary reflow like a hot brick.

### So how far had we come?

Let’s take a look at the graphs, shall we? Remember, in these graphs, the red points represent UX, and the green represent mozilla-central. Up is bad, and down is good. Our goal was to sink the red dots down into the noise of the green dots, which would give us performance parity.

#### ts_paint

Windows XP – ts_paint improvements

Ubuntu – ts_paint improvements

OSX 10.6 ts_paint improvements

You might be wondering what that big jump is for ts_paint on OSX 10.6 at the end of the graph. This thread explains.

#### tpaint

Windows XP – tpaint improvements

Ubuntu – tpaint improvements

OSX 10.6 tpaint improvements

Looking good.

### The big lessons

I think the big lesson here is to identify the common, default case, and optimize it as best you can. By definition, this is the path that’s going to be hit the most, so you can special-case it, and build in fast paths for it. Your users will thank you.

Close the feedback loop as much as you can. To test our theories, we’d push our patches to try and use compare-talos to compare our tpaint and ts_paint numbers to baseline pushes to see if we were making improvements. This requires several hours for the try builds to complete. This is super slow. Release Engineering was awesome and lent us some Windows XP talos slaves for us to experiment on, and that helped us close the feedback loop a lot. Don’t be afraid to ask Release Engineering for talos slaves.

Also note that while it’s easy for me to rattle off bug numbers and explain where we were being slow, all of that investigation and progress occurred over several months. Performance work can be really slow. The bottleneck is not making the slow code faster – the bottleneck is identifying where the slow code is. Profiling is the key here. If you’re not using some kind of profiler while doing performance work, you’re seriously impeding yourself. If you don’t have a profiler, build a simple one. If you don’t know how to build a simple one, find someone who can.

I mentioned Gecko’s built-in SPS profiler a few paragraphs back. The SPS profiler was instrumental (pun intended) in getting our performance back up to snuff. We also built a number of tools alongside the SPS profiler to help us in our analyses.

Read up about those tools we built in Part 3 (to be published soon).

## November 29, 2013

### Karl Dubost — Thanks Giving Design For Planet Web Compatibility

Because of Thanksgiving in the USA, it's borderline useless to try to contact Web site owners. Everyone is replying that they are busy slaughtering turkeys instead of Web sites. Lawrence announced that we recently launched Planet Web Compatibility. It is a news aggregator on the specific topic of Web Compatibility on the Web. Think of it as The Daily Mail for Web site issues. The worst things about the Web are aggregated there. The ugly grey design that you can currently see on the planet is mine. Yes, I admit, I'm as bad at design as I am at jokes. (You see what I just did here.)

So today, everyone else being busy, I decided to give it another shot and do another ugly design. Let's hope it is slightly less ugly, like by… 1 point. Anyway, the code is online. You can shoot a Pull Request at the repo to add yourself to the config file if you are talking about Web Compatibility on your blog. You can also fix the design if it really hurts your eyes.

It should be fairly responsive, if I didn't butcher it too much.

And we will see how it withstands each individual's local CSS configuration.

Have a good turkey

Otsukare!

### Nicholas Nethercote — DMD now works on Windows

DMD is our tool for improving Firefox’s memory reporting.  It helps identify where new memory reporters need to be added in order to reduce the “heap-unclassified” value in about:memory.

DMD has always worked well on Linux, and moderately well on Mac (it is crashy for some people).  And it works on Android and B2G.  But it has never worked on Windows.

So I’m happy to report that DMD now does work on Windows, thanks to the excellent efforts of Catalin Iacob.  If you’re on Windows and you’ve been seeing high “heap-unclassified” values, and you’re able to build Firefox yourself, please give DMD a try.

### John O'Duinn — Proposed changes to RelEng’s OSX build and test infrastructure

tl;dr: In order to improve our osx10.6 test capacity and to quickly start osx10.9 testing, we’re planning to make the following changes to our OSX-build-and-test-infrastructure.

1) convert all 10.7 test machines into 10.6 test machines in order to increase our 10.6 capacity. Details in bug#942299.
2) convert all 10.8 test machines into 10.9 test machines.
3) do most 10.7 builds as osx-cross-compiling-on-linux-on-AWS, and repurpose the 10.7 builder machines as additional 10.9 test machines. This cross-compiler work is ongoing; it will take time to complete, and it will take time to transition into production, hence it is listed last. The curious can follow bug#921040.

Each of these items is a large stand-alone project involving the same people across multiple groups, so we’ll roll each out in the aforementioned sequence.

Additional details:
1) Removing specific versions of an OS from our continuous integration systems based on vendor support and/or usage data is not a new policy. We have done this several times in the past. For example, we have dropped WinXPsp0/sp1/sp2 for WinXPsp3; dropped WinVista for Win7; dropped Win7 x64 for Win8 x64; and soon we will drop Win8.0 for Win8.1; …
** Note for the record that this does *NOT* mean that Mozilla is dropping support for osx10.7 or 10.8; it just means we think *automated* testing on 10.6,10.9 is more beneficial.

2) To see Firefox’s minimum OS requirements see: https://www.mozilla.org/en-US/firefox/25.0.1/system-requirements

3) Apple is offering osx10.9 as a free upgrade to all users of osx10.7 and osx10.8. Also, note that 10.9 runs on any machine that can run 10.7 or 10.8. Because the osx10.9 release is a free upgrade, users are quickly upgrading. We are seeing a drop in both 10.7 and 10.8 users and in just a month since the 10.9 release, we already have more 10.9 users than 10.8 users.

4) Distribution of Firefox users from the most to the least (data from 15-nov-2013):
10.6 – 34%
10.7 – 23% – slightly decreasing
10.8 – 21% – notably decreasing
10.9 – 21% – notably increasing
more info: http://armenzg.blogspot.ca/2013/11/re-thinking-our-mac-os-x-continuous.html

5) Apple is no longer providing security updates for 10.7; any user looking for OS security updates will need to upgrade to 10.9. Because OSX10.9 is a free upgrade for 10.8 users, we expect 10.8 to be in similar situation soon.

6) If a developer lands a patch that works on 10.9, but it fails somehow on 10.7 or 10.8, it is unlikely that we would back out the fix; we would instead tell users to upgrade to 10.9 anyway, for the security fixes.

7) It is no longer possible to buy any more of the 10.6 machines (known as revision 4 minis), as they are long desupported. Recycling the 10.7 test machines means that we can continue to support osx10.6 at scale without needing to buy and rack new hardware and recalibrate test and performance results.

8) Like all other large OS changes, this change would ride the trains. Most 10.7 and 10.8 test machines would be reimaged when we make these changes live on mozilla-central and try, while we’d leave a few behind. The few remaining would be reimaged at each 6-week train migration.

If we move quickly, this reimaging work can be done by IT before they all get busy with the 650-Castro -> Evelyn move.

For further details, see armen’s blog http://armenzg.blogspot.ca/2013/11/re-thinking-our-mac-os-x-continuous.html. To make sure this is not missed, I’ve cross-posted this to dev.planning, dev.platform and also this blog. If you know of anything we have missed, please reply in the dev.planning thread.

John.

[UPDATED 29-nov-2013 with link to bug#942299, as the 10.7->10.6 portion of this work just completed.]

### Gervase Markham — Between Three And Six What?

The other way the project can lower tensions around release planning is to make releases fairly often. When there’s a long time between releases, the importance of any individual release is magnified in everyone’s minds; people are that much more crushed when their code doesn’t make it in, because they know how long it might be until the next chance. Depending on the complexity of the release process and the nature of your project, somewhere between every three and six months is usually about the right gap between releases, though maintenance lines may put out micro releases a bit faster, if there is demand for them.

– Karl Fogel, Producing Open Source Software

### Peter Bengtsson — Wish List Granted on Hacker News report

On Wednesday this week, I managed to get a link to Wish List Granted onto Hacker News. It had enough upvotes to be featured on the front page for a couple of hours. I'm very grateful for the added traffic but not quite so impressed with the ultimate conversions.

• 4,428 unique visitors
• 43 Wish Lists created
• 2 Usersnap pieces of constructive feedback
• 0 payments made

So that's a 1% conversion rate for setting up a wish list. But it's kinda disappointing that nobody ever made a payment. Actually, one friend did make a payment. But he's a colleague and a friend, so not a stranger who stumbled onto it from Hacker News.

Also, it's now been 3 days since those 43 wish lists were created and still no payments. That's kinda disappointing too.

I'm starting to fear that Wish List Granted is one of those ideas that people think is great but have no interest in actually using.

### Soledad Penades — Guest on “ñerds” special 004

This week I took part in a (video|pod)cast!

We talked, among other things, about my path into computing, the various projects I'm involved in, Firefox OS, node.js versus other "traditional" environments, and how to make the web much cooler ;-)

There are links to everything on the episode page, including a link to the podcast for offline listening (which starts with… my song coffee rulez).

I'm still recovering from a dreadful cold that has taken it out on my respiratory system, so I speak even more quietly than usual, and making it through the whole podcast without erupting into a coughing fit was the most extraordinary thing that happened to me this week.

Thanks to Mauricio and Roberto for the experience. It was fun!

And now in English: I was in a podcast this week. It was fun, and in a mixture of South American Spanish vs Spanish Spanish with a hint of Spanglish. Feel free to listen to it if curious or just trying to learn some Spanish ;)

### Chris Lord — Efficient animation for games on the (mobile) web

Drawing on some of my limited HTML5 games experience, and marginally less limited general games and app writing experience, I’d like to write a bit about efficient animation for games on the web. I usually prefer to write about my experiences rather than just giving straight advice, so I apologise profusely for how condescending this will likely sound. I’ll try to improve in the future.

There are a few things worth knowing that will really help your game (or indeed app) run better and use less battery life, especially on low-end devices. I think it’s worth getting some of these things down, as there’s evidence to suggest (in popular and widely-used UI libraries, for example) that it isn’t necessarily common knowledge. I’d also love to know if I’m just being delightfully/frustratingly naive in my assumptions.

First off, let’s get the basic stuff out of the way.

### Help the browser help you

If you’re using DOM for your UI, which I’d certainly recommend, you really ought to use CSS transitions and/or animations, rather than JavaScript-powered animations. Though JS animations can be easier to express at times, unless you have a great need to synchronise UI animation state with game animation state, you’re unlikely to be able to do a better job than the browser. The reason for this is that CSS transitions/animations are much higher level than JavaScript, and express a very specific intent. Because of this, the browser can make some assumptions that it can’t easily make when you’re manually tweaking values in JavaScript. To take a concrete example, if you start a CSS transition to move something from off-screen so that it’s fully visible on-screen, the browser knows that the related content will end up completely visible to the user and can pre-render that content. When you animate position with JavaScript, the browser can’t easily make that same assumption, and so you might end up causing it to draw only the newly-exposed region of content, which may introduce slow-down. There are signals at the beginning and end of animations that allow you to attach JS callbacks and provide a rudimentary form of synchronisation (though there are no guarantees on how promptly these callbacks will happen).

Speaking of assumptions the browser can make, you want to avoid causing it to have to relayout during animations. In this vein, it’s worth trying to stick to animating only transform and opacity properties. Though some browsers make some effort for other properties to be fast, these are pretty much the only ones semi-guaranteed to be fast across all browsers. Something to be careful of is that overflow may end up causing relayouting, or other expensive calculations. If you’re setting a transform on something that would overlap its container’s bounds, you may want to set overflow: hidden on that container for the duration of the animation.
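For example, a transition that sticks to the cheap properties, and clips its container for the duration, might look like this (class names here are illustrative, not from any particular project):

/* Animate only transform and opacity: the properties that are
   semi-guaranteed to be fast across browsers. */
.panel {
  transition: transform 300ms ease-out, opacity 300ms ease-out;
}
.panel.offscreen {
  transform: translateX(100%);
  opacity: 0;
}
/* Applied only while the panel animates, so content overlapping the
   container's bounds can't trigger extra overflow work. */
.panel-container.animating {
  overflow: hidden;
}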

### Use requestAnimationFrame

When you’re animating canvas content, or when your DOM animations absolutely must synchronise with canvas content animations, do make sure to use requestAnimationFrame. Assuming you’re running in an arbitrary browsing session, you can never really know how long the browser will take to draw a particular frame. requestAnimationFrame causes the browser to redraw and call your function before that frame gets to the screen. The downside of using this vs. setTimeout, is that your animations must be time-based instead of frame-based. i.e. you must keep track of time and set your animation properties based on elapsed time. requestAnimationFrame includes a time-stamp in its callback function prototype, which you most definitely should use (as opposed to using the Date object), as this will be the time the frame began rendering, and ought to make your animations look more fluid. You may have a callback that ends up looking something like this:

var startTime = -1;
var animationLength = 2000; // Animation length in milliseconds

function doAnimation(timestamp) {
// Calculate animation progress
var progress = 0;
if (startTime < 0) {
startTime = timestamp;
} else {
progress = Math.min(1.0, (timestamp - startTime) /
animationLength);
}

// Do animation ...

if (progress < 1.0) {
requestAnimationFrame(doAnimation);
}
}

// Start animation
requestAnimationFrame(doAnimation);

You’ll note that I set startTime to -1 at the beginning, when I could just as easily set the time using the Date object and avoid the extra code in the animation callback. I do this so that any setup or processes that happen between the start of the animation and the callback being processed don’t affect the start of the animation, and so that all the animations I start before the frame is processed are synchronised.

To save battery life, it’s best to only draw when there are things going on, so that would mean calling requestAnimationFrame (or your refresh function, which in turn calls that) in response to events happening in your game. Unfortunately, this makes it very easy to end up drawing things multiple times per frame. I would recommend keeping track of when requestAnimationFrame has been called and only having a single handler for it. As far as I know, there aren’t solid guarantees of what order things will be called in with requestAnimationFrame (though in my experience, it’s in the order in which they were requested), so this also helps cut out any ambiguity. An easy way to do this is to declare your own refresh function that sets a flag when it calls requestAnimationFrame. When the callback is executed, you can unset that flag so that calls to that function will request a new frame again, like this:

function redraw() {
drawPending = false;

// Do drawing ...
}

var drawPending = false;
function requestRedraw() {
if (!drawPending) {
drawPending = true;
requestAnimationFrame(redraw);
}
}

Following this pattern, or something similar, means that no matter how many times you call requestRedraw, your drawing function will only be called once per frame.

Remember that when you do drawing in requestAnimationFrame (and in general), you may be blocking the browser from updating other things. Try to keep unnecessary work outside of your animation functions. For example, it may make sense for animation setup to happen in a timeout callback rather than a requestAnimationFrame callback, and likewise if you have a computationally heavy thing that will happen at the end of an animation. Though I think it’s certainly overkill for simple games, you may want to consider using Worker threads. It’s worth trying to batch similar operations, and to schedule them at a time when screen updates are unlikely to occur, or when such updates are of a more subtle nature. Modern console games, for example, tend to prioritise framerate during player movement and combat, but may prioritise image quality or physics detail when compromise to framerate and input response would be less noticeable.

### Measure performance

One of the reasons I bring this topic up, is that there exist some popular animation-related libraries, or popular UI toolkits with animation functions, that still do things like using setTimeout to drive their animations, drive all their animations completely individually, or other similar things that aren’t conducive to maintaining a high frame-rate. One of the goals for my game Puzzowl is for it to be a solid 60fps on reasonable hardware (for the record, it’s almost there on Galaxy Nexus-class hardware) and playable on low-end (almost there on a Geeksphone Keon). I’d have liked to use as much third party software as possible, but most of what I tried was either too complicated for simple use-cases, or had performance issues on mobile.

How I came to this conclusion is more important than the conclusion itself, however. To begin with, my priority was to write the code quickly to iterate on gameplay (and I’d certainly recommend doing this). I assumed that my own, naive code was making the game slower than I’d like. To an extent this was true, and I found plenty to optimise in my own code, but it got to the point where I knew what I was doing ought to perform quite well, and I still wasn’t quite there. At this point, I turned to the Firefox JavaScript profiler, and it told me almost exactly what low-hanging fruit was left to address to improve performance. As it turned out, I suffered from some of the things I’ve mentioned in this post: my animation code had some corner cases where it could cause redraws to happen several times per frame, some of my animations caused Firefox to need to redraw everything (they were fine in other browsers, as it happens – that particular issue is now fixed), and some of the third-party code I was using was poorly optimised.

### A take-away

To help combat poor animation performance, I wrote Animator.js. It’s a simple animation library, and I’d like to think it’s efficient and easy to use. It’s heavily influenced by various parts of Clutter, but I’ve tried to avoid scope-creep. It does one thing, and it does it well (or adequately, at least). Animator.js is a fire-and-forget style animation library, designed to be used with games, or other situations where you need many, synchronised, custom animations. It includes a handful of built-in tweening functions, the facility to add your own, and helper functions for animating object properties. I use it to drive all the drawing updates and transitions in Puzzowl, by overriding its requestAnimationFrame function with a custom version that makes the request, but appends the game’s drawing function onto the end of the callback, like so:

animator.requestAnimationFrame = function(callback) {
  requestAnimationFrame(function(t) {
    callback(t);
    redraw();
  });
};

My game’s redraw function does all drawing, and my animation callbacks just update state. When I request a redraw outside of animations, I just check the animator’s activeAnimations property first, to avoid mistakenly drawing multiple times in a single animation frame (sketched below). This gives me nice, synchronised animations at very low cost. Puzzowl isn’t out yet, but there’s a little screencast of it running on a Nexus 5.
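
That guard might look something like this (a sketch only; activeAnimations is the Animator.js property mentioned above, treated here as a count):

function requestRedrawOutsideAnimations() {
  // While animations are in flight, the overridden requestAnimationFrame
  // above already appends redraw() to every frame; requesting another
  // draw here would render the same frame twice.
  if (animator.activeAnimations > 0) {
    return;
  }
  requestRedraw();
}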

### Henrik Skupin — Mozmill speed improvements after upgrading Python from 2.6 to 2.7.3

Yesterday we tried to upgrade our mozmill-ci cluster to the previously released Mozmill 2.0.1. Sadly we failed on the OS X 10.6 machines and had to revert the change. After some investigation I found that incompatibility issues between Python 2.6 and 2.7.3 in mozprofile were causing the problem. Given the unclear status of Python 2.6 support in mozbase, and after a conversation in the #ateam IRC channel, I was advised to upgrade those machines to Python 2.7. I did so after some testing, not least because all our other machines already run Python 2.7.3, so I didn’t expect any fallout – and the first post-upgrade tests have proven this.

The interesting fact I would like to highlight here is the speed improvement we now see when running our tests. Previously a functional testrun on 10.6 took about 15 minutes; after the upgrade it is down to only 11 minutes. That’s an improvement of nearly 27% with Mozmill 1.5.24. With Mozmill 2.0.1 there is a similar drop, from 8 minutes to 6 minutes.

Given all that, and the upcoming (hopefully soon) upgrade of our mozmill-ci system to Mozmill 2.0.1, we will see an overall improvement of 60% (15 minutes -> 6 minutes) per testrun! This is stunning, and lets us run 2.5 times as many tests in the same timespan. As a next step, it means we can increase our locale coverage for beta and release candidate builds from 20 to 40 locales.

### Andy McKay — Default private browsing

Yesterday I found that Firefox was behaving oddly. Sites repeatedly acted as if cookies weren't being passed to them correctly, and old tabs wouldn't load when reloaded. When I tried to go into private browsing mode, there was no purple indicator in the top right.

Going to the File menu, I noticed that the keyboard shortcut next to the private browsing menu item had changed to ⌘N instead of shift-⌘P.

But that shortcut still worked and seemed to open new non-private windows.

What was going on? I tried removing add-ons, cleaning out jetpacks and anything I could change. Finally I went to about:config and searched for "private" and then found this setting:

browser.privatebrowsing.autostart

...was set to true. When this is set, all windows, including new ones, are in private browsing mode, but with no notification or warning of that.

I changed that back to false and all was good. I'm not sure what toggled it, but having both normal and private browsing again meant I could go back to running my two Google accounts at once, getting past newspaper paywalls, and easily testing my sites.
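
As a convenience, the same pref can be checked and flipped from Firefox's Browser Console using chrome-privileged JavaScript (editing it in about:config works just as well):

// In the Browser Console (Tools > Web Developer > Browser Console):
Services.prefs.getBoolPref("browser.privatebrowsing.autostart");        // true when stuck
Services.prefs.setBoolPref("browser.privatebrowsing.autostart", false); // back to normal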

I was tempted to file a bug about this, but then realised this was probably all intended. But here's a post that the search engines can find for the future.

### Honza Bambas — Building mozilla code directly from Visual Studio IDE

Yes, it’s possible!  With a single key press you can build and have a nice list of errors in the Error List window, clickable to get to the bad source code location easily.  It was a fight, but here it is.  Tested with Visual Studio Express 2013 for Windows Desktop, but I believe this all can be adapted to any version of the IDE.

• Create a shell script; you will (have to) use it every time to start Visual Studio from mozilla-build’s bash prompt:

export MOZ__INCLUDE=$INCLUDE
export MOZ__LIB=$LIB
export MOZ__LIBPATH=$LIBPATH
export MOZ__PATH=$PATH
export MOZ__VSINSTALLDIR=$VSINSTALLDIR
# This is for a standard installation of Visual Studio 2013 Desktop;
# alter the paths to your desired/installed IDE version
cd "/c/Program Files (x86)/Microsoft Visual Studio 12.0/Common7/IDE/"
./WDExpress.exe &

• Create a solution ‘mozilla-central’ located at the parent directory of your mozilla-central repository clone. Say you have a structure like C:\Mozilla\mozilla-central, where mozilla-central is the root source folder containing .hg, configure.in and all the modules’ sub-dirs. Then C:\Mozilla\ is the parent directory.
• In that solution, create a Makefile project ‘mozilla-central’, again located at the parent directory. It will, a bit unexpectedly, be created where you probably want it – in C:\Mozilla\mozilla-central.
• Let the Build Command Line for this project be (use the multi-line editor to copy & paste: the combo-like arrow on the right, then the <Edit…> command):

call "$(MOZ__VSINSTALLDIR)\VC\bin\vcvars32.bat"
set INCLUDE=$(MOZ__INCLUDE)
set LIB=$(MOZ__LIB)
set LIBPATH=$(MOZ__LIBPATH)
set PATH=$(MOZ__PATH)
set MOZCONFIG=c:\optional\path\to\your\custom\mozconfig
cd $(SolutionDir)
python mach --log-no-times build binaries

Now when you make a modification to a C/C++ file, just build the ‘mozilla-central’ project to run the great build binaries mach feature and quickly build your changes right from the IDE. Compilation and link errors, as well as warnings, will be nicely caught in the Error List.

BE AWARE: There is one problem – when there is a typo/mistake in an exported header file, it’s opened as a new file in the IDE from the _obj/dist/include location. If you miss that and modify that file, your changes will be overwritten on the next build! (I’ll ask Bas Schouten, as Chris Pearce suggests, if there is some solution.)

With these scripts you can use the Visual Studio 2013 IDE but build with any other version of VC++ of your choice. It’s independent; just run the start-up script from a different VS configuration’s mozilla-build prompt.

I personally also create projects for modules (like /netwerk, /docshell, /dom) I often use. Just create a Makefile project located at the source root directory, named after the module directory. The project file will then be located in the module – I know, not really what one would expect. Switch Solution Explorer for that project to show all files, include them all in the project, and you are done.

A few other tweaks:

• Assuming you properly use an object dir, change the Output Directory to point e.g. to $(SolutionDir)\<your obj dir>\$(Configuration)\. Similarly, set the Intermediate Directory to <your obj dir>\$(Configuration)\. The logging and other crap then won’t be created in your source repository.
• Add:
^.*\.vcproj.*
^.*\.vcxproj.*
.sln$
.suo$
.ncb$
.sdf$
.opensdf$

to your custom hg ignore file, to prevent the Visual Studio project and solution files from interfering with Mercurial. The same is suggested for git, if you prefer it.

Note: you cannot use this for a clobbered build because of an undisclosed Python Windows-specific bug. See here for why. Do clobbered builds from a console, or you may experiment with clobber + configure from a console and then build from the IDE.

## November 28, 2013

### Tantek Çelik — Homebrew Website Club Newsletter Volume 1 Issue 1

Edited by Tantek Çelik

Are you building your own website? Indie reader? Personal publishing web app? Or some other digital magic-cloud proxy? If so, you might like to come to a gathering of people with likeminded interests. Exchange information, swap ideas, talk shop, help work on a project, whatever...

This announcement, accompanying blog post, and event note brought nine of us together on short notice in the 7th floor main meeting area at Mozilla's San Francisco office.

After brief introductions we went around the room in a "broadcast" phase. Everyone introduced themselves and described the personal website successes and challenges they were experiencing. All already had a personal website of some sort, yet also expressed a yearning for something more.

Opinions and passion were generally dominated by user-centered perspectives: giving users (especially themselves) control over their own content/narrative, and focusing on user experience first.

Four in the group actively post content on their own site (typically in a "blog" format), and three more on secondary domains, Blogspot, or Tumblr. Two in the group already had tweeting from their personal website up and running using the POSSE method (Publish on your Own Site, Syndicate Elsewhere). And one had an ownCloud setup working with an SSL certificate.

We got into a short dispute over whether to focus on public or private content first, until it was pointed out that public-first is simpler and can inform private content design. There was a PESOS vs. POSSE debate, especially for quantified-self / fitness data.

Many in the group lamented the lack of support for Activity Streams in services and devices, until one participant noted he'd built a proxy that turns interactions from Facebook, Twitter, and G+ (e.g. comments, likes) into Activity Streams.

Frustrations were shared about services that show promise yet have odd awkwardnesses, like Path and Mint. On the open source side, concerns were raised about monoculture, and especially the open source community's default culture of assuming one codebase to rule them all.

There was much praise for the ease of use, beauty, and customization of Tumblr, especially as a good bar to compare against for efforts to build personal websites and provide user interfaces for future indieweb onboarding experiences. Despite their beauty or convenience, there was a sense that Tumblr, Blogger, and other content hosting silos will all rot.

We split up into small groups for the "peer-to-peer" part of the meeting. Kevin Marks did an excellent job of live-tweeting a lot of the conversation, and posted a summary on his site while still at the meeting!

At 20:00 we closed the meeting and announced that the next meeting would be in two weeks:

#### NEXT MEETING

WEDNESDAY, , at Mozilla's First Floor Common Area, Embarcadero & Harrison, San Francisco, CA.

Are you building your own website? Indie reader? Personal publishing web app? Or some other digital magic-cloud proxy?
If so, come on by and join a gathering of people with likeminded interests. Bring your friends that want to start a personal web site. Exchange information, swap ideas, talk shop, help work on a project, whatever...

This newsletter is placed into the public domain with a CC0 dedication. (With apologies to Homebrew Computer Club Newsletter number one.)

### William Lachance — mozregression now supports inbound builds

Just wanted to send out a quick note that I recently added inbound support to mozregression for desktop builds of Firefox on Windows, Mac, and Linux.

For the uninitiated, mozregression is an automated tool that lets you bisect through builds of Firefox to find out when a problem was introduced. You give it the last known good date and the last known bad date, and off it goes, automatically pulling down builds to test. After each iteration, it asks you whether the build was good or bad, updates the regression range accordingly, and the cycle repeats until there are no more intermediate builds.

Previously, it would only use nightlies, which meant a one-day granularity. That made for pretty wide regression ranges, made wider in the last year by the fact that so much more now goes into the tree over the course of a day. However, with inbound support (using the new inbound archive) we now have the potential to get a much tighter range, which should be super helpful for developers. Best of all, mozregression doesn't require any particularly advanced skills to use, which means everyone in the Mozilla community can help out.

For anyone interested, there's quite a bit of scope to improve mozregression to make it do more things (FirefoxOS support, easier installation…). Feel free to check out the repository, the issues list (I just added an easy one which would make a great first bug) and ask questions on irc.mozilla.org#ateam!

### Frédéric Harper — Firefox OS loves at the Athens App Days

Yesterday I was invited to help support the Athens App Days. I did the first technical talk of the day, and my goal was to excite developers about the platform, and to show them all the possibilities they have for building their application. I was quite impressed by the dedication of developers throughout the hackathon: they were hard at work for a chance to win one of the amazing prizes we had!

As usual, there is also a recording of my presentation.

I hope you enjoyed the presentation, and let me know if you still need help with the development of your Firefox OS app: I can't wait to see them in the marketplace! It was a real pleasure to be part of this event, and of course, to visit Athens for the first time.

--
Firefox OS loves at the Athens App Days is a post on Out of Comfort Zone from Frédéric Harper

### Aki Sasaki — LWR (job scheduling) part ii: a high level overview

compute farm

I think, of all the ideas we've brainstormed, the one I'm most drawn to is the idea that our automation infrastructure shouldn't just be a build farm feeding into a test farm. It should be a compute farm, capable of running a superset of tasks including, but not restricted to, builds and tests.

Once we made that leap, it wasn't too hard to imagine the compute farm running its own maintenance tasks, or doing its own dependency scheduling. Or running any scriptable task we need it to.

This perspective also guides the schematics: generic scheduling, generic job running. This job only happens to be a Firefox desktop build, a Firefox mobile l10n repack, or a Firefox OS emulator test.
This graph only happens to be the set of builds and tests that we want to spawn per-checkin. But it's not limited to that.

Currently, when we detect a new checkin, we kick off new builds. When they successfully upload, they create new dependent jobs (tests), in a cascading waterfall scheduling method. This works, but is hard to predict, and it doesn't lend itself to backfilling of unscheduled jobs, or knowing when the entire set of builds and tests has finished. Instead, if we create a graph of all builds and tests at the beginning, with dependencies marked, we get these nice properties:

• Scheduling changes can be made, debugged, and verified without actually needing to hook them up into a full system; the changes will be visible in the new graph.
• It becomes much easier to answer the question of what we expect to run, when, and where.
• If we initially mark certain jobs in the graph as inactive, we can backfill those jobs very easily, by later marking them as active.
• We are able to create jobs that run at the end of full sets of builds and tests, to run analyses or cleanup tasks. Or "smoketest" jobs that run before any other tests, to make sure what we're testing is worth testing further. Or "breakpoint" jobs that pause the graph before proceeding, until someone or something marks that job as finished.
• If the graph is viewable and editable, it becomes much easier to toggle specific jobs on or off, or requeue a job with or without changes. Perhaps in a web app.

web app

The dependency graph could potentially be edited, either before it's submitted, or as runtime changes to pending or re-queued jobs. Given a user-friendly web app that allows you to visualize the graph, and drill down into each job to modify it, we can make scheduling even more flexible.

• TryChooser could go from a checkin-comment-based set of flags to something viewable and editable before you submit the graph. Per-job toggles, certainly (just mochitest-3 on windows64 debug, please, but mochitest-2 through 4 on the other platforms).
• If the repository + revision were settable fields in the web app, we could potentially get rid of the multi-headed Try repository altogether (point to a user repo and revision, and build from there).
• Some project branches might not need per-checkin or nightly jobs at all, given a convenient way to trigger builds and tests against any revision at will.
• Given the ability to specify where the job logic comes from (e.g., mozharness repo and revision), people working on the automation itself can test their changes before rolling them out, especially if there are ways to send the output of jobs (job status, artifact uploads, etc.) to an alternate location. This vastly reduces the need for a completely separate "staging" area that quickly falls out of date. Faster iteration on automation, faster turnaround.

Releng-as-a-Service

Release Engineering is a bottleneck. I think Ted once said that everyone needs something from RelEng; that's quite possibly true. We've been trying to reverse this trend by empowering others to write or modify their own mozharness scripts: the A-team, :sfink, :gaye, :graydon have all been doing so. More bandwidth. Less bottleneck.

We've already established that compute load on a small subset of servers doesn't work as well as moving it to the massively scalable compute farm. This video on leadership says the same thing in terms of people: empowering the team makes for more brain power than bottlenecking the decisions and logic on one person.
Similarly, empowering other teams to update their automation at their own pace will scale much better than funneling all of those tasks into a single team. We could potentially move towards a BYOS (bring your own script) model, since other teams know their workflow, their builds, their tests, and their processes better than RelEng ever could. :catlee's been using the term Releng-as-a-Service for a while now. I think it would scale.

I would want to allow any arbitrary script to run on our compute farm (within the realms of operational, security, and fiscal sanity, of course). Comparing talos performance numbers looking for regressions? Parsing logs for metrics? Trying to find patterns in support feedback? Have a whole new class of thing to automate? Run it on the compute farm. We'll help you get started. But first, we have to make it less expensive and complex to schedule arbitrary jobs.

This is largely what we talked about, on a high level, both during our team week and over the years. A lot of this seems very blue sky. But we're so much closer to this as a reality than we were when I was first brainstorming about replacing buildbot, 4-5 years ago. We need to work towards this, in phases, while also keeping on top of the rest of our tasks.

In part 1, I covered where we are currently, and what needs to change to scale up. In part 3, I'm going to go into some hand-wavy LWR specifics, including what we can roll out in phase 1. In part 4, I'm going to drill down into the dependency graph. Then I'm going to start writing some code.

## November 27, 2013

### Soledad Penades — A few drawings from CascadiaJS 2013

I wasn't in any condition to draw anything the first day of the conference, what with being jetlagged and speaking too. But on the second day I was able to sketch five of the speakers before I got exhausted again (probably coinciding with my body clock starting to shout at me that it was late in the night – in the London night, 8 timezones away – and it wanted me to SLEEP).

You'll notice that the quality of the drawings quickly degenerates, and I'm quite sorry, because the last two speakers I drew were amazing and I don't feel like I rendered them as accurately as I'd wanted to. Maybe next time, friends, maybe next time! :-)

So there we go:

### Charles Bihis

“When dealing with money, use integers” (video)

### David Bruant

“In the ideal world the specs come first but in the real world they come last” (video)

### Matthew Bergman

“Why should you trust me to store your data without reading it?” (video)

### C J Silverio

“Time passed, and Moore’s Law did its thing… as it always does!” (video)

### Raquel Vélez (and her Batbot)

“After working with robots for a while you start giving them names and having conversations…” (video)

### Christian Heilmann — Help me write a Developer Evangelism/Advocacy guide

A few years ago now, I spent two afternoons writing down all I knew at the time about Developer Evangelism/Advocacy, and created the Developer Evangelism Handbook. It has been a massive success in terms of readers, and from what I hear it has helped a lot of people find a new role in their current company or a new job elsewhere. I know for a fact that the handbook is used in a few companies as training material, and I am very happy about that. I have also been thanked by a lot of people not in such a role who learned something from the handbook. This made me even happier.

With the role of developer evangelist/advocate now widespread, and no longer a fringe part of what we do in IT, I think it is time to give the handbook some love and brush it up into a larger “Guide to Developer Evangelism/Advocacy” by re-writing parts of it and adding new, more interactive features.

For this, I am considering starting a Kickstarter project, as I will have to do it in my free time, and I see people making money with things they learned. Ads on the page do not cut it – at all (a common issue when you write content for people who use ad-blockers). That’s why I want to test the waters now and see what you’d want this guide to be like, to make it worth you supporting me.

In order to learn, I put together a small survey about the Guide, and I’d very much appreciate literally 5 minutes of your time to fill it out. No, you can’t win an iPad or a holiday in the Caribbean; this is a legit survey.

Let’s get this started; I’d love to hear what you think. Got comments? Please tell me on Google+ or Facebook or Twitter.

### Niko Matsakis — Thoughts on DST, Part 3

After posting part 2 of my DST series, I realized that I had been focusing too much on the pure “type system” aspect and ignoring some of the more mundane semantics, and in particular the impact of monomorphization. I realize now that – without some further changes – we would not be able to compile and execute the second proposal (which I will dub statically sized types (SST) from here on out). Let me first explain the problem and then share my first thoughts on how it might be addressed.

This is part 3 of a series:

### The problem

The problem with the SST solution becomes apparent when you think about how you would compile a dereference *rc of a value rc that has type exists N. RC<[int, ..N]> (written long-hand). Typing this dereference is relatively straightforward, but when you think about the actual code that we generate, things get more complicated. In particular, imagine the Deref impl I showed before:

impl<T> Deref<T> for RC<T> {
    fn deref<'a>(&'a self) -> &'a T {
        &*self.data
    }
}

The problem here is that, the way monomorphization currently works, there will be a different impl generated for RC<[int, ..2]> and RC<[int, ..3]> and RC<[int, ..4]> and so on. So if we actually try to generate code, we’ll need to know which of those versions of deref we ought to call. But all we know is that we have a RC<[int, ..N]> for some unknown N, which is not enough information.

What’s frustrating, of course, is that it doesn’t actually matter which version we call – they all generate precisely the same code, and in fact they would generate the same code regardless of the type T. In some cases, as an optimization, LLVM or the backend might even collapse these functions into one, since the code is identical, but we have no way at present to guarantee that it would do so or to ensure that the generated code is identical.

### A solution

One possible solution would be to permit users to mark type parameters as erased. If a type parameter T is marked erased, the compiler would enforce restrictions that guarantee that the generated code will be the same no matter what type T is bound to. This in turn means the code generator can guarantee that there will only be a single copy of any function parameterized over T (presuming, of course, that the function is not parameterized over other, non-erased type parameters).

If we apply this notion, then we might rewrite our Deref implementation for RC as follows:

impl<erased T> Deref<T> for RC<T> {
    fn deref<'a>(&'a self) -> &'a T {
        &*self.data
    }
}

It would be illegal to perform the following actions on an erased parameter T:

• Drop a value of type T – that would require that we know what type T is so we can call the appropriate destructor.
• Assign to an lvalue of type T – that would require dropping the previous value.
• Invoke methods on values of type T – in other words, erased parameters can have no bounds.
• Take an argument of type T or have a local variable of type T – that would require knowing how much space to allocate on the stack.
• Probably a few other things.

### But maybe that erases too much…?

For the most part those restrictions are OK, but one in particular sticks in my craw: how can we handle drops? For example, imagine we have a value like RC<[~int]>. If this gets dropped, we’ll need to recursively free all of the ~int values that are contained in the vector. Presumably this is handled by having RC<T> invoke the appropriate “drop glue” (Rust-ese for destructor) for its type T – but if T is erased, we can’t know which drop glue to run. And if T is not erased, then when RC<[~int]> is dropped, we won’t know whether to run the destructor for RC<[~int, ..5]> or RC<[~int, ..6]> etc. And – of course – it’s wildly wasteful to have distinct destructors for each possible length of an array.

### Erased is the new unsized?

This erased annotation should of course remind you of the unsized annotation in DST. The two are very similar: they guarantee that the compiler can generate code even in ignorance of the precise characteristics of the type in question. The difference is that, with unsized, the compiler was still generating code specific to each distinct instantiation of the parameter T; it’s just that one valid instantiation would be an unsized type [U] (that is, exists N. [U, ..N]). The compiler knew it could always find the length for any instance of [U] and thus could generate drop glue and so on.

So perhaps the solution is not to have erased, which says “code generation knows nothing about T”, but rather some sort of partial erasure (similar to the way that we erase lifetimes from types at code generation, so the code generator can’t distinguish the lifetimes of two borrowed pointers).

### Conclusion

This naturally throws a wrench in the works. I still lean towards the SST approach, but we’ll have to find the correct variation on erased that preserves enough type info to run destructors, but not so much as to require distinct copies of the same function for every distinct vector length. And it seems clear that we don’t get SST “for free” with no annotation burden at all on smart pointer implementors. As a positive, having a smarter story about type erasure will help cut down on code duplication caused by monomorphization.

UPDATE: I realize what I’m writing here isn’t enough. To actually drop a value of existential type, we’ll need to make use of the dynamic info – i.e., the length of the vector, or the vtable for the object. So it’s not enough to say that the type parameter is erased during drop – or rather, drop can’t possibly work with the type parameter being erased. However, what is somewhat helpful is that user-defined drops are always a “shallow” drop. In other words, it’s the compiler’s job (typically) to drop the fields of an object. And the compiler knows the length of the array etc.

In any case, I think with some effort we can make this work, but it’s not as simple as erasing type parameters – we have to be able to tweak the drop protocol, or perhaps convert “partially erased” type parameters into a dynamic value (that would be the length, vtable, or just () for non-existential types) that can be used to permit calls to drop and so on.

### Peter Bengtsson — Welcome to the world, Wish List Granted

I built something. It's called Wish List Granted. It's a mash-up using Amazon.com's Wish List functionality. What you do is hook up your Amazon wish list onto wishlistgranted.com and pick one item. Then you share that page with friends and family, and they can each contribute a small amount. When the full amount is reached, Wish List Granted will purchase the item and send it to you. The Rules page has more details if you're interested.

The problem it tries to solve is that you have friends who want something, and even for a good friend you might be hesitant to spend $50 on a gift. I'm sure you can afford it, but if you have many friends it gets impractical. However, spending $5 is another matter. Hopefully Wish List Granted solves that problem.

Wish List Granted started as one of those insomnia-fueled late-night projects. I first wrote a scraper using pyQuery, then a couple of Django models and views, and then tied it all up by integrating Balanced Payments. It was actually working on the first night. Flawed, but working, start to finish.

When it all started, I used Persona to require people to authenticate before setting up a Wish List. After some thought I decided to ditch that and use "email authentication", meaning they have to enter an email address and click a secure link I send to them.
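
Wish List Granted itself is Django, but the signed-link idea generalizes to any stack. A hypothetical Node-style sketch (SECRET, makeLoginLink, and verifyToken are all made-up names, not the site's actual code):

var crypto = require('crypto');

// Hypothetical server-side secret; never sent to the client.
var SECRET = 'keep-me-on-the-server';

function sign(payload) {
  return crypto.createHmac('sha256', SECRET).update(payload).digest('hex');
}

// Emailed to the user: carries no password, only a verifiable signature.
// A real implementation would also sign an expiry timestamp.
function makeLoginLink(email) {
  return 'https://example.com/auth?email=' + encodeURIComponent(email) +
         '&token=' + sign(email);
}

// Checked when the link is clicked.
function verifyToken(email, token) {
  return sign(email) === token;
}

Verification only needs the server-side secret, which is one way a site can avoid storing passwords entirely.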

One thing I'm very proud of about Wish List Granted is that it does NOT store any passwords, credit cards, or personal shipping addresses. Despite being so totally void of personal data, I thought it'd look nicer if the whole site were on HTTPS.

More information on the Help & Frequently Asked Questions page.

### Tantek Çelik — How To Lose Your Data In Apple iOS Notes In Five Easy Steps

Or, why is it so hard to get syncing of simple text notes right?

1. Create a note in Notes on your iOS device (e.g. iPod) with some text like "Note to self: do not save anything valuable in Notes".
2. Connect it to your Mac and (Sync) with iTunes, disconnect
3. Open Mail.app and delete the note on your laptop
4. Add something to the note on your iOS device, e.g. "Note 2: Well maybe I'll try adding something anyway."
5. Reconnect them and (Sync) with iTunes again, disconnect

At this point you've lost the entire note on both devices, including additions you made to it on your iOS device after you deleted the copy of it on your laptop.

If you're doubly unfortunate, iTunes has also thrown away the previous backup for your iOS device, keeping only the most recent "backup", with, you guessed it, the deletion of that note.

The very act of attempting to back up your iOS device by explicitly syncing it with iTunes was responsible for losing your data.

iTunes treated the copy of the note on your laptop (or rather the tombstone left in place after you deleted it) as more authoritative than the note on your iOS device - ignoring the fact that you explicitly added to the note on your iOS device after (time-wise) you deleted the copy of it in Mail.app.

iTunes treated the data you added as less important than your older act of deletion and threw away your data.

What should it have done instead? How should it have resolved a seeming conflict between a deletion and an addition, both of which happened after the most recent sync?

#### Principles

There are a couple of user interface principles to consider here. Quoting from Apple's own OSX Human Interface Guidelines:

##### Forgiveness

People need to feel that they can try things without damaging the system or jeopardizing their data. Create safety nets, such as the Undo and Revert to Saved commands, so that people will feel comfortable learning and using your product.

Warn users when they initiate a task that will cause irreversible loss of data.

Using iTunes sync jeopardized data. It removed the safety net of the previous iOS device backup. There is no command to Undo a sync or Revert to notes from before. iTunes did not warn before it itself caused an irreversible loss of data.

Next principle, same source:

##### User Control

The key is to provide users with the capabilities they need while helping them avoid dangerous, irreversible actions. For example, in situations where the user might destroy data accidentally, you should always provide a warning, but allow the user to proceed if they choose.

iTunes provided no warning before an irreversible deletion of the note on the iOS device, which, yes, did destroy data.

#### Instead of Deleting

There are several approaches that Apple could have taken; any one of these would have been better than the irrecoverable deletion that occurred.

• Treat the later adding to the note as the user intending to never have deleted the note in the first place and recreate it on the laptop.
• If a note is edited in one place and deleted in another (in any time order), treat the edit as more important than the deletion (see the sketch after this list).
• Keep a browsable backup of any deleted notes
• Provide the ability to undo a sync
• Provide the ability to recover deleted notes, on either device
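
To make the edit-versus-delete rule above concrete, here is a minimal sketch of such a merge; the shape of the change records and all names are hypothetical, not Apple's implementation:

// Each side's change record for a note since the last sync:
// { editedAt: <ms>, text: '...' } for an edit, { deletedAt: <ms> } for a
// deletion, or null if that device left the note untouched.
function mergeNote(baseText, localChange, remoteChange) {
  var edit = (localChange && localChange.editedAt) ? localChange :
             (remoteChange && remoteChange.editedAt) ? remoteChange : null;
  var deletion = (localChange && localChange.deletedAt) ? localChange :
                 (remoteChange && remoteChange.deletedAt) ? remoteChange : null;

  if (edit) {
    // An edit beats a deletion, in any time order: keep (or resurrect)
    // the edited note on both devices.
    return edit.text;
  }
  if (deletion) {
    // Uncontested deletion: delete, but keep a recoverable backup copy.
    return null;
  }
  return baseText; // untouched on both sides
}

Run against the scenario at the top of this post, the later edit on the iOS device would win, and the note would be recreated on the laptop rather than silently destroyed.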

#### Why not iOS7 and MacOS Mavericks

Perhaps you're wondering what versions of operating systems (and iTunes) were used: iOS 6, MacOS 10.7, iTunes 10.

Why not upgrade to iOS 7 and MacOS Mavericks (and thus iTunes 11)?

Because then you lose the ability to sync directly between your devices. As noted previously:

[...] If you use OS X Mavericks v.10.9 or later, your contacts, calendars, and other info are updated on your computers and iOS devices via iCloud. [...]

If Apple can't get syncing right between just two devices/destinations:

iOS device <---> Mac laptop

Why should anyone have any expectation that they can get it right among three?

iOS device <---> iCloud <---> Mac laptop

#### Towards Indie Note Editing And Sync

This episode has illustrated that we cannot trust even Apple to follow its own user interface guidelines, or to implement those guidelines in its own software; nor can we trust Apple's syncing solutions not to lose our data.

I am now investigating alternatives (preferably open source) for:

• editing simple text notes on a mobile (e.g. iOS) device
• syncing them with a mac laptop
• possibly editing them in both places
• syncing them again without loss of data

Preferably without having to use "the cloud" (otherwise known as the internet or the web). That being said, perhaps an open source indie web sync solution could be another path forward. If I could sync any number of my own devices either with each other or directly with my own personal web site, that might work too.

Suggestions welcome. Some discussion already documented on indiewebcamp.com/sync.