Software by Steven

My First Bug

After identifying and analyzing various performance issues I noted related to hardware acceleration in Firefox 4 nightlies, it was now time to make my findings public. Bugzilla would be the end result, but there were quite a few steps in between. First, there was consultation. I’d never filed a bug before and as punctual as I try to be, I had the great luck of a one-time schedule conflict occurring when we were covering bug filing in class. I already knew a bit from earlier classes and my own playing around with Bugzilla, but like most of my colleagues I consulted with Dave on IRC before submitting. That’s when things got interesting.

IRC is yet another new thing to me in DPS909. I’ve used chat clients like MSN before, but nothing prepared me for this: within 5 minutes of conversing with Dave, he had pulled a Senior Software Developer from Mozilla into #seneca, where we were talking. After a short while, once we figured that the performance issue was related to Direct2D, he pulled in a “Platform Graphics Guy” from Mozilla as well. Bam, bam, bam. Within 10 minutes of consulting with Dave on IRC, I’d spoken with and been helped by two extremely knowledgeable guys located in various places around North America. The connectivity enabled by IRC still boggles my mind.

With all of their help, I now had some guidance as to how to go about filing this performance bug. I knew it needed to be filed in the first place. I knew it was in some way related to Direct2D. And I knew to CC Dave on it and add the “[chromeexperiments]” tag to it. With all this new information to process, I consider myself lucky to have remembered 75% of it (thanks for adding the [chromeexperiments] tag, Dave!). After filing the bug, the Open Source wheels churned to life.

Just as Linus’ Law applies to bug identification and resolution, it seems to apply to bug spotting as well. Between the four viewers on Bugzilla and another two on IRC, those six sets of eyes were able to spot a duplicate bug, this one filed four months prior. A dialogue is now under way, and some of the best programmers in the world are looking at resolving this through a collaborative effort. Release early, release often indeed.

2010-09-26
Getting ready to file my first bug using xperf

If you’ll remember from my last blog entry, I’ve been stress-testing Minefield using Chromeexperiments in an attempt to identify bugs and bottlenecks. Without being able to identify any crashes or unusual non-performance related behaviour, I then had the task of investigating profiling tools to find the issue. And then things got messy.

I thought things would be as simple as starting the tool and running the test, and the tool would tell me exactly what was wrong. That assumed two things: 1) that I knew how to use the tool and 2) that the tool knew how to identify problems. It turns out I was wrong on both counts. I had never used a profiler before, and the Windows SDK-supplied xperf had a lot of options. Mozilla was great at giving me the setup, but there was more to learn, first and foremost that profilers are not sentient and that their results still require analysis.

The idea behind a profiler is that it generally runs in the background, periodically taking samples of CPU and memory state (or other conditions, like registry accesses or file I/O) while all processes are running. Good ones like xperf can be configured to hook into kernel events and record a stack trace each time one fires, showing which chain of function calls led to it.

This tool was great for discounting some of the theories I had formed earlier using the Windows-default perfmon tool. The ability of xperf to monitor both CPU and memory with stack walking enabled disproved some of my earlier causal theories and showed me that in most cases, the CPU time and memory were being eaten up within DirectDraw drivers.

Figuring out where to find the information I wanted required a fair bit of googling. Even something as simple as viewing a stack walk took a bit of detective work. For those looking for the Coles Notes, here’s how to capture stack traces for both heap allocations and CPU samples (courtesy of the Mozilla and MSDN links above):

xperf -on latency -stackwalk profile
call xperf -start heapsession -heap -PidNewProcess "%platform% %siteToGo% %args%" -stackwalk HeapAlloc+HeapRealloc -BufferSize 512 -MinBuffers 128 -MaxBuffers 512

%platform% would be the absolute path for the executable to run (in this case, Minefield)

%siteToGo% is the web site you wish to test (saves collecting unnecessary data from loading the home page or Google)

%args% would be any other arguments you wish to pass to the application. In this case, I opened Minefield with “-P Testing -no-remote” (no quotes) to use my test profile. The test profile was a very stripped-down profile to eliminate any background work in Firefox which may have thrown in extra data. The complete script can be found here.

Various options can be specified in the “-on” parameter when calling xperf, as Richard Russell shows, but I went with the latency option suggested above by Mozilla. The resulting files were large (1.7 GB for 5 minutes), so perhaps for CPU monitoring I would have done well to stick with the “PROC_THREAD” option. Most of the size was heap data, the result of the second call to xperf, but every bit helps. Once finished, I stopped the profiler instances and merged the results:

xperf -stop heapsession -d heap.etl
xperf -d main.etl
xperf -merge main.etl heap.etl result.etl
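For anyone who would rather drive this from a script than a batch file, here is a minimal sketch in Python of the same sequence of calls. It only reuses the xperf flags shown above. The function name, the default file names as parameters, and the example path and URL are my own assumptions, not part of the original script.

```python
def xperf_commands(platform, site, args,
                   main_etl="main.etl", heap_etl="heap.etl",
                   result_etl="result.etl"):
    """Return (start, stop) lists of xperf argument lists, in call order.

    platform, site and args play the roles of %platform%, %siteToGo%
    and %args% in the batch version.
    """
    # -PidNewProcess takes the whole command line to launch as ONE argument.
    target = " ".join(part for part in (platform, site, args) if part)
    start = [
        # Kernel session: latency group with stack walking on profile samples.
        ["xperf", "-on", "latency", "-stackwalk", "profile"],
        # Heap session: launches the browser and records allocation stacks.
        ["xperf", "-start", "heapsession", "-heap", "-PidNewProcess", target,
         "-stackwalk", "HeapAlloc+HeapRealloc",
         "-BufferSize", "512", "-MinBuffers", "128", "-MaxBuffers", "512"],
    ]
    stop = [
        ["xperf", "-stop", "heapsession", "-d", heap_etl],
        ["xperf", "-d", main_etl],
        ["xperf", "-merge", main_etl, heap_etl, result_etl],
    ]
    return start, stop


if __name__ == "__main__":
    # Hypothetical install path and test URL, for illustration only.
    start, stop = xperf_commands(r"C:\Minefield\firefox.exe",
                                 "http://example.com/",
                                 "-P Testing -no-remote")
    for command in start + stop:
        print(command)
```

Each inner list could be handed to subprocess.run on a machine with xperf on the PATH: the start commands before the test run, the stop commands after it.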

Once the report was compiled, I loaded it and saw some complex graphs. It took some getting used to where everything was, but after a long while I learned the following best practices:

To load symbols (so as to view symbolic function names), enter them under “Trace -> Configure Symbol Paths” and then select “Trace -> Load Symbols” to actually load the symbols. From here, the app has probably become non-responsive for a short time while it associates everything and you’re left still looking at the same graphs. Here’s where the magic happens:

Right-click on a graph (“CPU Sampling by Thread” and “Heap Total Allocation Size” work well) and select “Summary Table” or “Simple Summary Table”. This brings up a tree view that allows a drill-down from process to thread to DLL to individual function calls. Richard Russell’s blog entry did such a good job of describing the GUI that I’ll link it here as well.
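To make the idea behind that drill-down a little more concrete, here is a rough sketch in Python of how a pile of stack samples can be rolled up into a tree of inclusive counts. This is not how xperf itself is implemented. The function names and the sample data (process, thread, module and function labels) are invented purely for illustration.

```python
from collections import defaultdict


def summarize(samples):
    """Count how many samples passed through each call-path prefix.

    Each sample is a sequence ordered from outermost (process) to
    innermost (function). The count for a prefix is its inclusive weight.
    """
    counts = defaultdict(int)
    for stack in samples:
        for depth in range(1, len(stack) + 1):
            counts[tuple(stack[:depth])] += 1
    return dict(counts)


def print_tree(counts, total):
    """Print the prefixes as an indented tree with their share of samples."""
    # Sorting tuples lexicographically puts every parent before its children.
    for path in sorted(counts):
        indent = "  " * (len(path) - 1)
        share = 100.0 * counts[path] / total
        print(f"{indent}{path[-1]}: {counts[path]} samples ({share:.0f}%)")


if __name__ == "__main__":
    # Made-up samples: (process, thread, module, function).
    samples = [
        ("browser.exe", "thread 1", "graphics.dll", "draw"),
        ("browser.exe", "thread 1", "graphics.dll", "blend"),
        ("browser.exe", "thread 1", "script.dll", "run"),
    ]
    print_tree(summarize(samples), len(samples))
```

Reading the tree top-down is the same exercise as expanding rows in the summary table: follow the branch with the largest share until it stops narrowing.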

In the end, xperf helped identify some choke-points on some of the experiments I was testing, specifically isolating the need to test both with and without hardware acceleration. Speaking of which, hardware acceleration in Minefield and the upcoming Firefox 4 can be adjusted by use of the following options (enter about:config into your Firefox address bar to adjust):

gfx.direct2d.disabled
gfx.direct2d.force-enabled
gfx.font_rendering.directwrite.enabled
layers.accelerate-all
layers.accelerate-none

The first two are for Direct2D, which handles 2D graphics, and the third is for DirectWrite, which handles text, while the last two are for Direct3D, which handles 3D drawing. That is a lot of switches to juggle, so Joe, the graphics guru from Mozilla, has an easier way to test hardware acceleration (Windows only at present).

At around 848 words this seems to be by far my largest blog post, which given that I’m blogging about what I’ve learned through bug filing must be a good thing. Ah yes, bug filing: the ends to these means. That’s the next entry in this blog post queue.

2010-09-24
Performance testing Minefield and Chrome using Chrome Experiments

Now that I have my Firefox testing environment, it’s time for my first task: Performance Testing! For this, I needed to know what to test (in this case, nightly builds of Minefield and Chromium), how to test (the Chrome Experiments) and what to determine (speed, smoothness, and responsiveness). There are quite a few experiments, but as a class of 12 we split the 120 up fairly easily. I was tasked with #71-80.

To begin testing, I had to think about which variables could be eliminated. I decided to launch all experiments directly from the command line in an attempt to standardize application startup, and wrote a batch script with a few parameters to do so. I also used Windows’ built-in perfmon tool to handle performance measurement, leaving me free to stress-test the application. With this combination, I felt I could gather information quickly and efficiently, and perfmon also provided lots of great graphs for later analysis. It almost seemed like overkill at the time, but it gave me measurable results.

Given the target implementation platform (Chrome) for these experiments, I wasn’t surprised to find that a few of them weren’t performing quite as well on Minefield as on Google’s browser… yet. What did surprise me was the creativity and range of the experiments.

Now that I had my data, it was time to analyze further and see if I could spot any large discrepancies, and to try and figure out why they occurred. My testing measures focused on the Working Set (Memory) and CPU utilization for the browser process. The jsCanvasBike, Animated Harmonograph and Liquid Particles seemed to require the most tuning, so my attention turned now to profiling rather than monitoring. As perfmon shows, not all of these problems were visible simply by monitoring the CPU.
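For comparing runs by numbers rather than by eyeballing graphs, a counter log exported to CSV can be boiled down to an average and a peak per counter. The following is only a sketch in Python. It assumes the log has one header row naming the counters and one row per sample, and the function name is my own. Real exports will have longer counter names, so the column is matched by substring.

```python
import csv
import io


def summarize_counter(csv_text, counter_substring):
    """Return (mean, peak) for the first column whose header contains
    counter_substring. Blank cells (missed samples) are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    matches = [i for i, name in enumerate(header) if counter_substring in name]
    if not matches:
        raise ValueError("no column matching " + repr(counter_substring))
    column = matches[0]

    values = []
    for row in reader:
        cell = row[column].strip() if column < len(row) else ""
        if cell:
            values.append(float(cell))
    if not values:
        raise ValueError("no samples for " + repr(counter_substring))
    return sum(values) / len(values), max(values)


if __name__ == "__main__":
    # Tiny made-up log: time stamp, CPU percentage, working set in bytes.
    log = (
        "Time,% Processor Time,Working Set\n"
        "t0,12.5,104857600\n"
        "t1,48.0,157286400\n"
        "t2, ,131072000\n"
    )
    print(summarize_counter(log, "Processor Time"))
    print(summarize_counter(log, "Working Set"))
```

Running it once per browser on the same experiment gives a pair of numbers per counter that can go straight into a comparison table.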

Luckily, the Windows SDK comes with its own profiling tool (xperf), which offers extremely detailed kernel-level tracing with stack traces. Mozilla even has its own help page with instructions tailored specifically for use with Firefox. The results can look a little daunting, but it appears to be a very powerful tool and I was glad to have so many resources available. I feel as if I’ve entered a whole new realm of testing, but there is still a great deal more to experiment with.

2010-09-18
Setting up a Firefox testing environment

In preparation for jumping into open source development, I first decided to explore Firefox and learn a bit about it, its architecture and how to set up a testing environment for it.

First things first: downloading a prerelease version of Firefox. I had two choices: download the current (but less stable) nightly build or the more stable but less current beta. I chose to test a nightly build, but could have chosen both and simply installed them to different directories. Installation was simple and straightforward, just like release versions of Firefox. One thing of note was that pre-release versions are installed in a separate location from release versions, under the name Minefield. Genius: this way I could have both installed concurrently!

From here, I ran into an issue: my release version of Firefox had been heavily customized by me, but what if I wanted to test the default configuration? After some research I changed the properties of the shortcut for Minefield to add the command-line argument -P (apparently identical to the -ProfileManager argument), which opened up a profile manager for me to create a secondary profile. Firefox allows different profiles (which group configuration, extensions and more) to be used to run it, but there is always a default. After I made my new profile (named Testing), Firefox opened normally. I then closed Firefox and modified the command line in the shortcut again to include the new profile name in quotes after the -P, like so: -P "Testing"

This was all well and good, except that Firefox is designed so that by default only one instance can run at once. By that I mean that if I run my release version, minimize it and attempt to open Minefield, another window of the already-running release version pops up. In a lecture with David Humphrey, I learned that the way to resolve this is simply to add the -no-remote command-line option alongside the -P option (after the profile name), which allows multiple instances to run under different profiles. With this little tweak, I now have completely independent testing environments set up for concurrently testing various Firefox builds.
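The same launch can be scripted instead of baked into a shortcut. Below is a small sketch in Python that assembles the command line from the -P and -no-remote options described above. The function name, the install path and the URL are placeholders of my own, so substitute whatever matches your setup.

```python
def launch_command(executable, profile, url=None, no_remote=True):
    """Build the argument list for starting a Firefox build with a
    named profile, optionally isolated from other running instances."""
    command = [executable]
    if url:
        command.append(url)          # page to open on startup
    command += ["-P", profile]       # run under the named profile
    if no_remote:
        command.append("-no-remote") # do not hand off to a running instance
    return command


if __name__ == "__main__":
    # Hypothetical path and URL, for illustration only.
    print(launch_command(r"C:\Minefield\firefox.exe", "Testing",
                         url="http://example.com/"))
```

The resulting list can be passed to subprocess.Popen to start the browser. Building it as a list rather than one long string avoids having to quote the profile name by hand.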

2010-09-14
Hello (Open Source) world!

As a 7th semester student in Seneca’s Bachelor of Software Development program, I’ve long considered starting up a blog for documenting and generally sharing the projects I am currently involved with. Fittingly for a software student who has been reluctant to buy a laptop, this web developer has also been slow to make the leap into Web 2.0. It’s amazing how a class, in this case DPS909, can change things about you.

In keeping with my aforementioned old-fashioned ways, I’ve long wondered how large projects (and even business models) can grow and thrive while remaining so disconnected and independent. Reading Eric Raymond’s paper The Cathedral and the Bazaar actually showed me that open source is neither. Following the story of fetchmail, the paper demonstrates how each project is disconnected only geographically, with the internet serving to connect all contributors and bring their independent selves to work together.

Another thought of mine which is slowly being changed is how businesses work. How could Mozilla possibly turn a profit when it gives away products like Firefox for free? Many websites which don’t sell a product can turn a profit by advertising, but Firefox uses none. A recent article in the NY Times showed me how, through a “community development model” and partnerships with companies like Google. The film Revolution OS brought together both of the above for me by telling the story of the genesis of the Free Software Foundation, Linux and the first commercial enterprises.

Having watched and studied these, I feel I have a much deeper understanding of Open Source philosophy and its effect on development, distribution, and rights, not to mention the business inherent in today’s world.

2010-09-09