One of the other cool things that came out of the Firefox
                  Desktop work week that I mentioned in my previous post was
                  increased exposure to a handy tool written by
                  taras and
                  bsmedberg:
                  jydoop.
                  Benjamin has
                  a post explaining what it is and how it works, but I wanted to describe one specific use: measuring
                  main-thread responsiveness issues in the wild.
                
                First, a bit of backstory!
                
                  One of my team’s overarching goals for a long time has
                  been eliminating all sources of main-thread file I/O.[1] One of the worst main-thread I/O offenders in Firefox code
                  is our
                  SQLite Storage API. A few years ago we went a bit overboard using a version of
                  this API that is synchronous, and does its file I/O work on
                  the main thread. So recently we’ve been focused a lot on
                  eliminating all users of that API.
                
                
                  One consumer of this API that we
                  fixed in Firefox 23
                  is the
                  Form History code
                  (responsible for remembering information entered into web
                  forms and offering to re-fill it).
                  Neil Deakin,
                  markh and
                  mak worked together to
                  fix the core service (based on earlier work from a previous
                  Mozilla intern, Felix),
                  but after landing that improvement, taras noticed that there
                  were
                  still some remaining issues. We eventually determined that the likely primary culprit
                  for the remaining Form History main-thread I/O was Firefox
                  Sync’s use of the older Form History API.
                  markh and
                  rnewman worked
                  together to
                  fix Sync, but we weren’t entirely sure whether that was
                  sufficient to eliminate all of the main thread I/O we were
                  seeing reported by telemetry.
                  
                    Now, to the jydoop part
                
                
                  Taras had identified that our initial Form History fix
                  wasn’t complete using the
                  SlowSQL telemetry dashboard. This gives a high-level overview of slow SQL statements
                  occurring in the wild, and he noticed that after we landed the
                  original fix for form history, some formhistory.sqlite
                  statements were still showing up. Unfortunately, the SlowSQL
                  dashboard only offers two time ranges: previous week, and the
                  previous month, and it aggregates data across all builds from
                  a specific channel (e.g. “nightly”), which means
                  that I couldn’t easily use it to confirm whether our
                  Sync fix was effective.
                
                
                  That’s where jydoop came into the picture. It allows
                  much more sophisticated analysis of the SlowSQL data. I wrote
                  a couple of jydoop scripts:
                  one
                  to summarize the slowSQL reports for formhistory.sqlite from
                  builds before the
                  Sync fix
                  (i.e. Nightly builds with a build ID < 20130623) and
                  another
                  to do the same for builds containing the fix. I then ran those
                  scripts against Telemetry reports from the period from
                  2013-06-21 to 2013-06-24 (inclusive, i.e. roughly two days on
                  either side of the landing of the fix).
                
                The results!
                
                  The
                  results
                  are pretty conclusive: the reports of slow SQL statements for
                  formhistory.sqlite from builds including the Sync fix were
                  dramatically reduced. The most frequently slow statement from
                  builds before the fix (“COMMIT TRANSACTION”) was
                  responsible for a total of 417 seconds of main-thread
                  unresponsiveness (174993 occurences of slowness at an average
                  of 143ms of “jank” at a time), over that period.
                  The Sync fix prevented that statement (and many other slow
                  statements) from occurring at all, thus removing a significant
                  cause of main-thread responsiveness issues experienced by
                  Firefox users in the wild.
                
                Next steps
                
                  If you look at the results in more detail, they reveal that
                  there are still some slow SQL main-thread operations happening
                  on formhistory.sqlite – they’re less frequent and
                  severe, and are tracked by
                  other bugs. I’m going to be using jydoop to gather more data on
                  the improvements my team’s already made, as well as
                  gauge the potential for improvement in areas we’re still
                  working on. I also want others on the team to learn to use
                  jydoop, so that people can make their own measurements to
                  validate the improvements they’re working on.
                
                
                  [1] Main-thread file I/O is a
                    term used to describe trying to read from or write to the
                    disk on the same thread (the “main thread”) as
                    you’re trying to respond to user input (e.g. mouse/key
                    events) and do other things like drawing the UI. Reading
                    from or writing to the disk can take a relatively long time,
                    particularly if you have a slow hard drive, so we want to
                    always do it on a background thread so that those other
                    important responsiveness tasks (responding to input and
                    updating the UI) aren’t affected.