Software by Steven

Blog

More subtitle formats for Popcorn.js
What began as a custom subtitle command is growing into a larger body of work on Popcorn.js. While the initial work was to add TTXT subtitle support, support for many more standardized formats will soon be popping into Popcorn. This is inspired by the Universal Subtitles project, which supports a variety of formats. In a combined effort, Simon and I are also working on parsers for:
- SRT
- WebSRT, now known as WebVTT
- SSA
- ASS
- SBV
- TTML
One caveat in this wide support is that, like TTXT before them, the initial implementation will ignore much of the style information. This won’t affect SBV or standard-compliant SRT files, but the remaining formats will have their styling information stripped from them… for now. We hope to support style information soon, however. In the meantime, those formats that support HTML-compliant in-text styling will work courtesy of the browser’s native display capabilities.

For example, since many media players have expanded on the basic SRT support to handle underlining with the <u></u> tags, existing SRT files with these tags will not break. However, SSA-style commands, which follow a syntax like this:

{\commandArg1} or {\command(Arg1,Arg2,…)}

are also found in SRT files. Even though styling these tags is not immediately supported, the commands still won’t show up as part of the subtitle text: they will be hidden from view. So if you have a subtitle file and want to use it, what do you do? Just include popcorn, the parser plugin and the subtitle plugin in the web page like so:
```
<head>
   <script src="../../popcorn.js"></script>
   <script src="../../plugins/subtitle/popcorn.subtitle.js"></script>
   <script src="popcorn.parserSRT.js"></script>
</head>
```
These files will register all the appropriate code with popcorn, meaning all that’s left is to specify the subtitle file. This is done on the video element via the data-timeline-sources attribute:
```
<video id='video' controls data-timeline-sources="data/data.srt" >
  <source id='mp4' src="../../test/trailer.mp4" type='video/mp4; codecs="avc1, mp4a"'>
  <source id='ogv' src="../../test/trailer.ogv" type='video/ogg; codecs="theora, vorbis"'>
</video>
```
And with that bit of work, Popcorn will do what it does best, which is to take care of the rest.
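For the curious, here is a minimal sketch of what a text-only SRT parser with hidden SSA-style commands could look like. It is not Popcorn’s actual parser code: the function names (`stripSsaCommands`, `srtTimeToSeconds`, `parseSrt`) and the `{ start, end, text }` object shape are assumptions made for illustration. HTML-style tags like `<u>` are left alone so the browser can render them.

```javascript
// Hypothetical helper names; this is a sketch, not Popcorn's real parser.

// Remove SSA-style override blocks such as {\i1} or {\pos(10,20)}.
function stripSsaCommands(text) {
  return text.replace(/\{\\[^}]*\}/g, "");
}

// Convert an SRT timestamp ("00:01:02,500") to seconds.
function srtTimeToSeconds(stamp) {
  const match = /^(\d+):(\d{2}):(\d{2})[,.](\d{1,3})$/.exec(stamp.trim());
  if (!match) {
    throw new Error("Bad SRT timestamp: " + stamp);
  }
  const [, h, m, s, ms] = match;
  return Number(h) * 3600 + Number(m) * 60 + Number(s) +
    Number(ms.padEnd(3, "0")) / 1000;
}

// Parse SRT text into { start, end, text } objects (styling is ignored).
function parseSrt(data) {
  const cues = [];
  const blocks = data.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    // Line 1 is the cue counter, line 2 is "start --> end", the rest is text.
    if (lines.length < 3) continue;
    const times = lines[1].split("-->");
    if (times.length !== 2) continue;
    cues.push({
      start: srtTimeToSeconds(times[0]),
      end: srtTimeToSeconds(times[1]),
      text: stripSsaCommands(lines.slice(2).join("\n"))
    });
  }
  return cues;
}
```

Feeding it a cue whose text is `{\i1}Hello <u>world</u>{\i0}` yields the text `Hello <u>world</u>`: the SSA commands vanish while the underline tags survive.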
2011-02-06
Cluster size, Cache Lines and Why Old Video Games had 3-letter High Scores
Like many other software developers, I’m constantly trying to deepen my understanding of things, to learn more and push myself further. As such, I took it upon myself recently to try and make a high score system for a 3D game I’d made last year. I wasn’t content with just a simple “cout << score”, however; I wanted to make this an academic exercise. I wanted the “perfect” high score system.

So I started researching. Given that this dealt with IO, I wanted to make sure I adhered to a few good practices, and to learn about others. For this, I had to consider the low-level implementation details. First consideration…

Cluster Size

Windows file systems (NTFS, FAT and exFAT) allocate space on the hard drive according to something called cluster size. File allocation on the hard drive must be a multiple of the cluster size. Files up to the cluster size (let’s say 4 KB) will take up that much space on the disk. If the file contents amount to less (let’s say a 2 KB text file), the rest of the cluster is padded. This is done for speed reasons. Files larger than the cluster size (6 KB) will have extra clusters allocated to allow for the entire file to be stored (8 KB on disk). The relationship can be generalized as:

disk space = round up ( file size / cluster size ) × cluster size
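As a quick sanity check of the examples above, here is a small sketch: the rounded-up cluster count multiplied by the cluster size gives the bytes taken on disk. The function name `sizeOnDisk` is just an illustrative choice, and the sketch ignores file system details such as metadata overhead.

```javascript
// Bytes occupied on disk for a file of fileSize bytes,
// given a file system cluster size in bytes.
function sizeOnDisk(fileSize, clusterSize) {
  if (fileSize < 0 || clusterSize <= 0) {
    throw new RangeError("fileSize must be >= 0 and clusterSize > 0");
  }
  return Math.ceil(fileSize / clusterSize) * clusterSize;
}

const KB = 1024;
console.log(sizeOnDisk(2 * KB, 4 * KB)); // 4096 (2 KB file, one 4 KB cluster)
console.log(sizeOnDisk(6 * KB, 4 * KB)); // 8192 (6 KB file, two 4 KB clusters)
```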

One downside of this is that every file which isn’t equal to exactly the cluster size has wasted space. Another is that since the OS reads in information in entire clusters, performance suffers if file size is less than the cluster size (only some of the data on the cluster is relevant). While the age of terabyte hard drives makes the first downside negligible, the second should still be a concern. However, given this being an academic exercise, I opted to make my high scores file be the cluster size.

So just how big might the cluster size be? Optimal cluster size varies by file system, operating system and hard drive size, but by and large a reliable figure is 4KB. According to Microsoft this is the default on NTFS for any Windows NT 4.0 machine, or anything newer than Windows 2000 with a hard drive size under 16 TB. Given that the game will require DirectX 9 and will likely run on a home computer, the Windows 4-6 family is almost certain to be the host OS, and NTFS is a near-guarantee for the file system. So until 16 TB hard drives are common, I should be safe with a 4KB file size 🙂

Cache Line Size

Here’s where I had to begin to take a lot of consideration and care. With varying types of memory in computers (registers, L1 Cache, L2 Cache etc) I wanted to try and make my data structures as accommodating as possible. For this, I wanted to be sure they would fit in the L1 Cache as cleanly as possible. This way, more data could be kept there for quick transfer to the CPU rather than having to retrieve values from slower main memory (RAM). Turns out, both AMD and Intel recommend that for this, structure size should be an integral multiple of the cache line size, and aligned on that byte boundary. In other words, an L1 Cache with a cache line size of 64 bytes works best on a 64 (or 128) byte structure stored in memory at an address divisible by 64.

Turns out, nearly all modern processors are 64-bit and work off a 64-byte L1 Cache line. It’s not a guarantee that an n-bit processor will have an n-byte L1 Cache line size, but based on my research in the Pentium family, it’s a fair assumption. So I started trying to juggle how to work with these 64 bytes. I didn’t see score going above 4 billion, so 4 bytes seemed sufficient for this. I wanted to store a date as well, for which an 8 byte value seemed enough. With 12 bytes down and 52 to go, it seemed 64 was a bit much. Aiming for 32, I then gave an extra 4 bytes for a bit field. 16 bytes down, 16 to go. And I needed a name. 16 bytes makes for an array of 8 wchar_t variables (2 bytes each on Windows), which meant a name of 7 letters could be supported (plus 1 null terminator on the end). Why wchar_t? I wanted to go beyond the 1-byte “char” to not limit myself to ASCII characters.
```
#include <stdint.h>                    // int64_t
#include <wchar.h>                     // wchar_t

#define CACHE_LINE_SIZE 32
// Name slots left over after the fixed-size members (score, flags, date)
#define NAME_SIZE ((CACHE_LINE_SIZE - sizeof(int)*2 - sizeof(int64_t)) / sizeof(wchar_t))
typedef struct score_t {               // Engineered to be CACHE_LINE_SIZE bytes in size
 int score;                            // Most commonly used member first, used in ordering (4 bytes)
 int flags;                            // Score flags (4 bytes), helps natural alignment of next item (the date)
 int64_t date;                         // Time stamp (8 bytes; a plain long is only 4 bytes on Windows)
 wchar_t name[NAME_SIZE];              // Player's name, takes up remainder. Place last to ease natural alignment
} score_t;
```
With this structure, I was able to fit 2 scores in one cache line. Upon seeing if I could fit 4, I realized something. At a 16 byte struct size, I would’ve stripped out the bit field, leaving 4 bytes for the name, which if you do the math (and drop back to 1-byte chars) would allow for 3 letters plus a null terminator to be specified as the earner of the high score. That sounded familiar…

Pac-Man Hi Score

Notice how there’s only three letters in the name field? A lot of old 8-bit games only allowed for 3 letters… Pac-man (above), Galaga, Contra, and Mario among them. Given the knowledge that they ran on an 8-bit processor and the assumption that an 8-byte cache line size was involved, along with an assumption that the character set was likely limited to ASCII characters, you have the following variation on the above:
```
#define CACHE_LINE_SIZE 8
#define NAME_SIZE (CACHE_LINE_SIZE - sizeof(int))
typedef struct score_t {               // Engineered to be CACHE_LINE_SIZE bytes in size
 int score;                            // Most commonly used member first, used in ordering (4 bytes)
 char name[NAME_SIZE];                 // Player's name, takes up remainder. Place last to ease natural alignment
} score_t;
```
Do the math and it works out that there is just enough room for 3 letters to be given. Sure, the size of an int wouldn’t’ve been 4 bytes on that processor, but I’m going to guess that based on the perfect Pac-man high score being realized at 3,333,360 that 4 bytes were allocated for the score field all the same. Maybe first, middle and last initials weren’t the real reason for having a 3-letter high score entry.
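The arithmetic behind these name lengths can be sketched in a few lines. This is only an illustration of the sums in this post; `nameCapacity` is a made-up helper, and the field sizes passed in are the ones assumed above (4-byte score, 4-byte flags, 8-byte date).

```javascript
// Letters that fit in a record once the fixed-size fields are subtracted,
// keeping one character slot free for the null terminator.
function nameCapacity(recordBytes, fixedBytes, charBytes) {
  const slots = Math.floor((recordBytes - fixedBytes) / charBytes);
  return Math.max(slots - 1, 0);
}

console.log(nameCapacity(32, 4 + 4 + 8, 2)); // 7 (32-byte record, 2-byte wide chars)
console.log(nameCapacity(8, 4, 1));          // 3 (8-byte record, 1-byte chars)
console.log(nameCapacity(16, 4 + 8, 1));     // 3 (16-byte record, no flags, 1-byte chars)
console.log(nameCapacity(16, 4 + 8, 2));     // 1 (same record with 2-byte wide chars)
```

The last line shows why the 3-letter result depends on 1-byte characters: with 2-byte wide characters a 16-byte record only leaves room for a single letter.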

At the end of it all, I used the first code snippet in my game: 4,096 bytes of high scores at 32 bytes/score makes for a hefty 128 high scores. However the real joy I took out of this wasn’t the code but gaining new insight into some of my first games, with a like-mindedness of trying to eke out every cycle possible. Code on.
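To make the 4,096 / 32 = 128 layout concrete, here is a rough sketch of the same record packed into a cluster-sized buffer. It is a JavaScript stand-in for the C struct, not the game’s code: the function names are hypothetical, the score is treated as unsigned, the date is assumed to be an integer timestamp, and little-endian byte order is assumed.

```javascript
const CLUSTER_SIZE = 4096; // whole high score file, one cluster
const RECORD_SIZE = 32;    // one score record
const NAME_CHARS = 8;      // 7 letters + terminator, 2 bytes each

function createScoreTable() {
  return new ArrayBuffer(CLUSTER_SIZE);
}

// Layout: score (4 bytes) | flags (4) | date (8) | name (16).
function writeScore(buffer, index, { score, flags, date, name }) {
  const maxRecords = buffer.byteLength / RECORD_SIZE;
  if (index < 0 || index >= maxRecords) {
    throw new RangeError("index must be between 0 and " + (maxRecords - 1));
  }
  const view = new DataView(buffer, index * RECORD_SIZE, RECORD_SIZE);
  view.setUint32(0, score, true);
  view.setUint32(4, flags, true);
  view.setBigInt64(8, BigInt(date), true);
  for (let i = 0; i < NAME_CHARS; i++) {
    // Names are truncated to 7 UTF-16 code units; unused slots are zeroed.
    const inRange = i < NAME_CHARS - 1 && i < name.length;
    view.setUint16(16 + i * 2, inRange ? name.charCodeAt(i) : 0, true);
  }
}

function readScore(buffer, index) {
  const view = new DataView(buffer, index * RECORD_SIZE, RECORD_SIZE);
  let name = "";
  for (let i = 0; i < NAME_CHARS; i++) {
    const code = view.getUint16(16 + i * 2, true);
    if (code === 0) break; // null terminator
    name += String.fromCharCode(code);
  }
  return {
    score: view.getUint32(0, true),
    flags: view.getUint32(4, true),
    date: Number(view.getBigInt64(8, true)),
    name
  };
}

const table = createScoreTable();
console.log(table.byteLength / RECORD_SIZE); // 128
```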
2011-02-05
Popping up TTXT Support for Popcorn.js

In a past blog post, I discussed some work I’ve done with GPAC’s TTXT subtitle standard. I’ve begun working on Popcorn.js and after not too much work, Popcorn is beginning to gain TTXT support. While this would’ve been much more work before the 0.2 re-engineering, plugins are now supported and it’s as simple as registering a parser function for a specific file extension.

What does this mean?

For developers, it’s easy to add onto popcorn. Their page covers just about everything you could need to know, and once you learn the intuitive API it’s all systems go. For this initial TTXT support it was as simple as using the existing subtitle plugin and writing a custom parser to convert a TTXT file into subtitle objects. Probably about 40 executable lines of code needed in all.

For Popcorn.js users, I can see this being quite advantageous as well. With the ease of adding functionality, new features are able to be developed quickly and with minimal impact on what’s already in use by Popcorn.js. This means the web pages using Popcorn can only get more dynamic as time goes on.

Back to TTXT: It is a robust standard. I’ve studied the spec and am beginning to learn it all, but thus far only the text information from a file is processed, not the styling information. One of the benefits of TTXT is its rich support for styles down to a per-subtitle, per-character level. This means one subtitle can be half 12pt Arial font, half 36pt Courier. Another subtitle could be all Verdana and in red, except for a desired letter to be blue. This styling information is what I’m least familiar with about the standard.
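To give a feel for the text-only approach, here is a rough sketch of turning TTXT samples into subtitle objects. It is not the parser that went into Popcorn. It assumes a simplified TTXT file in which each `TextSample` element carries a `sampleTime` attribute and a `text` attribute, it uses regular expressions rather than a real XML parser, it does not decode XML entities, and the function names and `{ start, end, text }` shape are made up for illustration. All styling is ignored.

```javascript
// Convert a TTXT timestamp ("00:00:01.000") to seconds.
function ttxtTimeToSeconds(stamp) {
  const match = /^(\d+):(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!match) {
    throw new Error("Bad TTXT timestamp: " + stamp);
  }
  const [, h, m, s, ms] = match;
  return Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
}

// Text-only pass over a (simplified) TTXT document.
function parseTtxtTextOnly(xml) {
  const samplePattern = /<TextSample\b([^>]*)>/g;
  const samples = [];
  let match;
  while ((match = samplePattern.exec(xml)) !== null) {
    const attrs = match[1];
    const time = /\bsampleTime="([^"]*)"/.exec(attrs);
    const text = /\btext="([^"]*)"/.exec(attrs);
    if (!time) continue;
    samples.push({
      start: ttxtTimeToSeconds(time[1]),
      text: text ? text[1] : ""
    });
  }

  // A sample stays on screen until the next sample starts;
  // empty samples only serve to clear the previous text.
  const subtitles = [];
  for (let i = 0; i < samples.length; i++) {
    if (samples[i].text === "") continue;
    const next = samples[i + 1];
    subtitles.push({
      start: samples[i].start,
      end: next ? next.start : Infinity,
      text: samples[i].text
    });
  }
  return subtitles;
}
```

The resulting objects could then be handed to whatever displays subtitles; how that hand-off works in Popcorn is outside this sketch.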

My friend Anna has already put the call out if anyone has some more complicated TTXT files to contribute to be studied and run by our parser. I’d like to re-issue the call here, and would be interested in hearing from any developers or film makers who have more familiarity with this standard.

2011-01-26
Visualization and Emulation of the 6502 Processor in HTML 5

As a student and as a software developer, I’m constantly trying to learn more, dig deeper and further educate myself on anything related to my field. What has my attention right now is lower-level software development and optimization targeted for specific hardware. There are a great many papers out there written on the subject, specifically optimization practices on Intel hardware and the importance of keeping cache size and cache line size in mind. What really caught my eye in this research was a mixture of the old and the new.

In the 1970s, there was an 8-bit CPU war going on between (amongst others) Intel and Motorola. Motorola’s team in charge of development of a new chip soon moved to the small-time MOS company. To make a long story short, the team ended up producing the 6502, a competing 8-bit processor at 1/6 the cost of Intel’s and Motorola’s leading designs, sparking “a series of computer projects that would eventually result in the home computer revolution of the 1980s”. How influential was the 6502? For starters, it came bundled with the wildly popular Commodore, Apple and Atari home computers of the time, with variants of the 6502 also gracing Nintendo’s NES and Atari’s 2600 video game systems.

Moving forward in history, we now have cutting edge HTML 5 running on web browsers on computers with 1,000,000 times the memory that the Atari 2600 could support (8 KB could still do amazing things). And now, the group from Visual6502.org has made a great visualization of the processor in action using HTML 5’s canvas. The demo comes with a custom program which can be executed step-by-step with each instruction lighting up different transistors on the processor.

The best part? There’s a github repo: it’s open source.

2011-01-18
Managing js dependencies and the importance of global-scoped libraries
In performing some cleanup, abstraction and upkeep on the 0.3 release of the BBB player (soon to be 0.3.1), I’ve been working on expanding the testing functionality. In doing so, I’m pushing the test code from the testing page (test.html) into its own external JavaScript file (test.js). Trouble is, I want to keep this code separate from the main BBB.js library, but I need to address the fact that test.js is dependent on BBB.js. It’s possible to put them all in the same file, but that’s a lot of overhead for the 99% of the time that the testing framework isn’t required. So we can put them in different files and trust that on every HTML page that test.js is loaded, BBB.js is loaded before it. This is bad and error-prone!

What to do?

Turns out, we can load BBB.js dynamically as needed from other js files. As many js developers know, you can check if a variable has been used by doing this:
```
typeof(bbb) === 'undefined'
```
This statement will return true if bbb has not been declared or assigned to. So with this, we’re already part-way there: we know if our library (bbb) was not included on our test page. So from here, we just need to load it. Guess we’ll have to make another XMLHttpRequest to get it!
```
if (typeof(bbb) === 'undefined') { // No record of global BBB, must've been forgotten on test page
  var xhr = new XMLHttpRequest();

  xhr.onreadystatechange = function() {
    if (xhr.readyState===4) {
      // Execute BBB.js and populate in bbb variable
    }
  }
  xhr.open("GET", '../BBB.js', false); // Not async, test code below depends on having global bbb object
  xhr.send(null);
}
```
So now we have a way of getting BBB.js if the HTML page doesn’t do it. However, BBB.js comes back as text. If put at the top of a file, the code above will simply retrieve BBB.js from the server and have it sit there. So we’ll have to execute the text as JavaScript. And for that… we use EVAL.

Long-touted as lazy, dangerous and inefficient, eval is great for executing text as JavaScript code. It’s often misused, and security and performance almost always take a hit. If you tell someone you’re using eval in your library, the natural reaction to expect is “REALLY?”. This, of course, after the typical recoil in horror. Turns out, resolving dependencies is one of the few things it’s good for. When added to the code above, it will execute the requested js file and (in the case of a library like BBB) will have a variable stocked full with your code and ready to use. Package the whole thing into a function and you have something like this:
```
function requireBBB(fileRef) {
  if (typeof(bbb) === 'undefined') { // No record of global BBB, must've been forgotten on test page
    var xhr = new XMLHttpRequest();

    xhr.onreadystatechange = function() {
      if (xhr.readyState===4) {
        eval(xhr.responseText); // Execute BBB.js
      }
    }
    xhr.open("GET", fileRef, false); // Not async, test code below depends on having global bbb object
    xhr.send(null);
  }
}

requireBBB('../BBB.js');
```
And this works. Or at least it should. It didn’t for me. Convinced that it was something wrong with this function, I plugged alerts on every other line of code. I used Firebug. I used everything at my disposal to figure it out. Finally, after far too long, I got a hunch. I pulled up the BBB.js file and found the first line to be this:
```
var bbb = (function(){
```
Seems innocent. Run a self-executing anonymous function and store the result in the variable bbb, which var scopes to wherever the code happens to be running. Wait, what? We’d never run BBB.js outside of global page scope, so that scope was always window and it had always worked. However, when eval’ing the library code from inside the xhr.onreadystatechange handler, bbb isn’t put on the global window because that’s not the scope the code is executing in. Instead, bbb became a local variable of the handler function. The function then finished and bbb blipped out of existence as quick as it blipped in. A change in BBB.js to this:
```
bbb = (function(){
```
And everything works. The variable is declared as implicitly global and can be accessed elsewhere. The moral of the story: always store your library at global scope if you want to manage dependencies or otherwise avoid scope-related issues.
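The scoping behaviour can be reproduced without any XHR at all. The sketch below is not what BBB.js does; it swaps in a different technique, an indirect eval call, which runs the code at global scope so that even a `var` declaration lands on the global object. The library source strings and variable names are invented for the demo.

```javascript
function loadWithDirectEval(source) {
  // Direct eval: `var` declarations stay local to this function
  // (or to the eval itself in strict mode), so nothing leaks out.
  eval(source);
}

function loadWithIndirectEval(source) {
  // Indirect eval: calling eval through another reference
  // runs the code at global scope instead of the caller's scope.
  (0, eval)(source);
}

loadWithDirectEval("var demoLibDirect = { name: 'direct' };");
console.log(typeof globalThis.demoLibDirect);   // "undefined"

loadWithIndirectEval("var demoLibIndirect = { name: 'indirect' };");
console.log(typeof globalThis.demoLibIndirect); // "object"
```

Either approach (dropping `var`, or evaluating at global scope) gets the library onto the global object; note that implicit globals like `bbb = ...` throw a ReferenceError in strict mode, so an explicit `window.bbb = ...` is the more defensive spelling.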

There is a downside to dynamic loading: performance. Every XMLHttpRequest means a hit on performance; ideally you want to put all js code in its own file, and the same for CSS. In fact, using this on-demand js-loading breaks #1 on Yahoo’s list of “Best Practices for Speeding Up Your Web Site”. However, since users won’t really be needing the testing framework (and it will just be extra JS for the browser to work with if included in BBB.js) this should be okay. While I have yet to try it, I hope to do a bevy of performance testing using (amongst other tools) the Firefox extension YSlow before formally pushing out 0.3.1. Even without formal profiling, we’ve already caught a few other minor improvements to performance!

Stay tuned for 0.3.1…

UPDATE (Jan 20, 2011): After working on popcorn and learning a few things, there is a much better way than the use of eval. Seems JSONP makes use of dynamic script tags as a way of issuing cross-origin requests. This works equally well to dynamically load libraries (like jQuery) and looks much cleaner than XHR:
```
var head = document.getElementsByTagName('head')[0];
var script = document.createElement('script');
script.src = "http://code.jquery.com/jquery-latest.min.js";
script.type = "text/javascript";
head.insertBefore( script, head.firstChild );
```
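One thing to keep in mind is that a script tag loads asynchronously, so code that depends on the library has to wait for it. A minimal sketch of wrapping the snippet above in a helper with a load callback follows. The name `loadScript` and its parameters are assumptions for illustration; the document object is passed in so the helper can be tried outside a browser, and older versions of IE signalled script loading through `onreadystatechange` rather than `onload`, which this sketch does not handle.

```javascript
// Insert a <script> tag for `src` and call `onLoad` once it has loaded.
// `doc` is normally the global `document`.
function loadScript(doc, src, onLoad) {
  const head = doc.getElementsByTagName("head")[0];
  const script = doc.createElement("script");
  script.type = "text/javascript";
  script.onload = function () {
    if (typeof onLoad === "function") {
      onLoad(script);
    }
  };
  script.src = src;
  head.insertBefore(script, head.firstChild);
  return script;
}

// Browser usage (assumes the page has a <head> element):
// loadScript(document, "../BBB.js", function () {
//   // the library's global is available from here on
// });
```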
2010-12-15
BBB Player 0.3
With December upon us, it’s time for the third iteration over the BBB Player. And this one shan’t disappoint. My partner’s release notes forthcoming, I’ll be focused on discussing my own tasks. If you’ll recall, the original list of items tasked to me looked like this:
- Hooking up dummy server calls on chapter addition/deletion
- Toggle subtitles on/off
- Refactor Bookmark object to optimize for memory consumption
- Time In/Out Buttons for setting chapters rather than manual typing
- Popcorn-formatted metadata generation (will be sent to same dummy server)
The first three were completed by my last BBB-related update. Since then, the list grew by a few more to include:
- Refactor: Automate library loading via DOMContentLoaded event
- Refactor: Create “storage” module from existing functions
With these 7 tasks (plus minor fixes here and there) the code base underwent quite a few changes. However, the core API remains compatible with existing code. For example, while the library can hook into the DOMContentLoaded event to load automatically, one can still manually call the init() function from the onload event of the body to load all internal values. See both options below:
```
bbb.setupWhenReady({playerId: "player", tocId: "tblOfContents", chapterStorage: "server.php", formDivId: "formDiv", statistics: true, watermark: true});

// OR
<body onload="bbb.init({playerId: 'player', tocId: 'tblOfContents', chapterStorage: 'server.php', statistics: true, watermark: true});">
```
Internally, setupWhenReady() calls the init() function with the passed parameters, but the function may be called anywhere within the HTML document. Also, setupWhenReady() will call the bbb.onReady() function when finished. Override this method to execute page-specific logic once the library has loaded. Another callback function, bbb.onChangeVideo(currChap), is called when changing chapters. Taking one argument (the new chapter), it can be used to update parts of the page with internal information about the playing video. If overriding, expect to receive a Bookmark object. See it all in use when it’s used to autopopulate the chapter creation form.
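Overriding the two callbacks might look something like the sketch below. To keep it runnable on its own, it uses a tiny stand-in object for the library: only the names `setupWhenReady`, `init`, `onReady` and `onChangeVideo` come from the description above, while the stand-in’s internals, the `title` field on the chapter and the `pageLog` array are invented for illustration. The stand-in also calls `init()` immediately instead of waiting for DOMContentLoaded.

```javascript
// Stand-in for the real library (hypothetical internals).
const bbb = {
  onReady() {},                // default: do nothing
  onChangeVideo(currChap) {},  // default: do nothing
  init(options) {
    this.options = options;
  },
  setupWhenReady(options) {
    this.init(options);
    this.onReady();
  }
};

// Page-specific overrides.
const pageLog = [];
bbb.onReady = function () {
  pageLog.push("ready");
};
bbb.onChangeVideo = function (currChap) {
  // `title` is an assumed field; the real Bookmark object may differ.
  pageLog.push("now playing: " + currChap.title);
};

bbb.setupWhenReady({ playerId: "player", tocId: "tblOfContents" });
bbb.onChangeVideo({ title: "Intro" }); // the real library would make this call itself
console.log(pageLog.join(" | "));      // ready | now playing: Intro
```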

Another great usability feature now implemented is optional subtitles. I discussed this at length in my last BBB blog post, but the VideoJS library we’re using (at least version 1.4) would not allow for subtitles to be optional. If a file was specified, they would play. After a bit of work on Video.js however, they’re now togglable. As part of implementing this, a bug was also found and fixed: subtitles now display properly when scrolling backwards through the video. Both of these fixes have since also been forwarded to the VideoJS team.

The big thing about this release for me, however, is the creation of popcorn metadata (in the bbb.generator module). Presently supporting 7 command types (Wikipedia, Flickr, Google News, LastFM, Twitter, Video Tag, and Footnote), the library internally handles all form generation and functionality, however placement can be determined by supplying the library with the id of a div on the page in which to place the form. Also, using this div ID and CSS Selectors, it’s easy to style the generated form at the page level. Unfortunately, hosting limitations to the tune of no file IO in JavaScript, no PUT requests to servers, and no file writing means it’s undemoable on its current host, but feel free to check it out all the same!

And so that being that, my portion of the 0.3 release of the BBB player is complete. Minor fixes or structural changes for this release include:
- Moving free-floating page-level setCookie, getCookie functions into bbb.storage module
- Beginning of a bbb.chapters module for storing and working with chapters (though functions from 0.2 remain outside at this time to maintain a consistent API)
- Error checking, validation and output formatting for chapter and popcorn metadata creation.
See the code base on my github, especially the 0.3 branch. While this work was all done through a class with Dave, I don’t want to be done quite yet! Hopefully there’s much more work, development and contribution opportunities ahead. Luckily, with open source, that’s always the case 🙂
2010-12-08
Writing and Running Automated Firefox Browser Chrome Tests
As a followup to making my first patch for Firefox, it came time to write some automated tests for it. Turns out Mozilla has an entire suite of automated testing tools for their platforms. All thoroughly documented, they range from the ultra low-level compiled code tests to “record and play” macro testing for human UI interactions.

Selecting the Test Suite

The hardest part was figuring out what test suite would best complement my patch, but given that my patch affected the Browser Chrome I went with that section of the Mochitest suite. Browser Chrome tests are really just JavaScript files running at elevated chrome-level privileges. This means that in them you get to access the tab browser API, more window objects, and other goodies. They’re really simple to write, just follow one convention: put everything in function test(). This is the main line of your test mini-application. You can add more functions, but they’ll be ignored by the tester unless test() calls them.

At first I’d put my js file in the source tree directory only ($srcRoot/browser/base/content/test) but turns out that builds compiled with testing enabled pull from a different directory ($objDir/_tests/testing/mochitest/browser/browser/base/content/test). Oh yeah that’s right: Firefox has to be built with the --enable-tests option. This pushes the test code from the source into the appropriate sub-directory of your output object directory (notice how both end in “/browser/base/content/test”?). Since I had written my test after compiling, I had to copy it over myself!

With the right location, it was time to write the test. Simple enough, I just looked at existing tests. The browser_allTabsPanel.js one looked very similar to what I was trying to do, so I used it to get a feel for how to write a test.

Writing the Test

When writing a test you must work by assertions, specifying a message to output when the expected value matches the actual. A few assertion functions are:
```
ok(val, "val exists!");                        // Tests truthiness (null, 0, "" and undefined all fail)
is(val1, val2, "val1 equals val2");            // Tests equality
isnot(val1, val2, "val1 does not equal val2"); // Tests if not equal
```
For a successful test run, you want all of the above comparisons (if used) to be true. As for writing the code itself: I wanted to test the addTab function I’d modified in tabbrowser.xml. To do this, I used the gBrowser object, which seems to be the global object holding the current browser (and giving access to functions in tabbrowser.xml). Here was my test function:
```
function test() {
 var stubTab = gBrowser.addTab();

 var tabs = gBrowser.tabs;
 var owner;

 is(tabs.length, 2, "2 tabs are open");
 is(gBrowser.selectedTab._tPos, 0, "First tab is selected");

 var newTab = gBrowser.addTab();

 is(gBrowser.selectedTab, tabs[0], "First tab is still selected");
 is(gBrowser.selectedTab._tPos+1, newTab._tPos, "Was inserted at #2");
 is(newTab._tPos+1, stubTab._tPos, "Old #2 shifted to #3");

 gBrowser.moveTabTo(newTab,2);
 is(newTab._tPos-1,stubTab._tPos, "Successfully moved new tab, older one shifted down");

 gBrowser.removeTab(stubTab);
 is(newTab, tabs[1],"Successfully deleted stub tab, new tab moved down");

 var newTab2 = gBrowser.addTab();
 is(newTab, tabs[2],"Successfully moved newTab down on newTab2 creation");
 is(newTab2, tabs[1],"Successfully added newTab2 adjacent to selected tab");

 gBrowser.selectTabAtIndex(1);
 var newTab3 = gBrowser.addTab();
 is(newTab3, tabs[2],"Successfully added newTab3 in middle of list");

 while (tabs.length > 1) {
  gBrowser.removeCurrentTab();
 }
}
```
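To see why those assertions should hold, the tab behaviour can be modelled with a plain array. This is only a toy model of "insert the new tab beside the selected one", not Firefox code: `createTabStrip` and its methods are stand-ins that borrow the names used in the test, and details such as removing the selected tab are not handled.

```javascript
// Toy model of a tab strip where new tabs open beside the selected tab.
function createTabStrip() {
  const first = { label: "tab0" };
  let counter = 1;
  return {
    tabs: [first],
    selectedTab: first,
    addTab() {
      const tab = { label: "tab" + counter++ };
      const selectedPos = this.tabs.indexOf(this.selectedTab);
      this.tabs.splice(selectedPos + 1, 0, tab); // right of the selected tab
      return tab;
    },
    moveTabTo(tab, index) {
      this.tabs.splice(this.tabs.indexOf(tab), 1);
      this.tabs.splice(index, 0, tab);
    },
    removeTab(tab) {
      this.tabs.splice(this.tabs.indexOf(tab), 1);
    },
    selectTabAtIndex(index) {
      this.selectedTab = this.tabs[index];
    },
    positionOf(tab) {
      return this.tabs.indexOf(tab);
    }
  };
}

const strip = createTabStrip();
const stubTab = strip.addTab(); // opens at position 1
const newTab = strip.addTab();  // also opens at position 1, pushing stubTab to 2
console.log(strip.tabs.map((tab) => tab.label).join(", ")); // tab0, tab2, tab1
```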
Running the Test

With the test written, it was time to run it. For that, I had to go back to command line and from the source root run
```
make -C "$OBJDIR" mochitest-browser-chrome
```
This ran the entire suite! I didn’t want to wait around for my one test to run, so instead I specified the TEST_PATH variable before running it:
```
TEST_PATH=browser/base/content/test/browser_tab_addBeside.js make -C "$OBJDIR" mochitest-browser-chrome
```
Much better, it only runs my test. And 11/11 passed! Perfect.
2010-12-07
InnerHTML vs. DOM Manipulation
In working on pushing out my 0.3 Release for the BBB Player, I’ve had to look a great deal into element generation and the best ways to do it. But how exactly do you measure “best”? Is best the best performance, the fewest lines of code, or the most compatible? And if we go by all three, what’s the answer then?

Usage and Support

Turns out, this has been a long-lasting debate on the interwebs. Internet Explorer’s handy but non-standard innerHTML property lets you do things quick and painlessly by specifying the HTML code as a string right in the Javascript. You can build this dynamically then via simple string concatenation. It’s short and quick, but not guaranteed to be supported everywhere (though it happens to be by just about every browser out there). Seems to be a carry over from IE5.5 when IE had 95% of the market. What were other browser makers to do to have their browser work with a web site but to implement innerHTML? Besides though, watch how friendly it is to use.
```
var mySpanId = "spanID";
var myDiv = document.getElementById("myDivId");
myDiv.innerHTML = '<span id="'+mySpanId+'">Embedded Content</span>';
```
A quirk about it though is that, as simple as it is, it will not work with tables in IE, where innerHTML is read-only on table elements. A price to pay for simplicity, but unfortunately quirks happen when the specifications are not standardized. Now let’s compare DOM Manipulation:
```
var mySpanId = "spanID";
var myDiv = document.getElementById("myDivId");
var mySpan = document.createElement('span');
var spanContent = document.createTextNode('Embedded Content');
mySpan.id = mySpanId;
mySpan.appendChild(spanContent);
myDiv.appendChild(mySpan);
```
Longer code, that’s for sure. But this is guaranteed to work on every browser, now and forever (well, almost). Though there are still a few cross-browser quirks. Turns out you can create input elements both ways, but IE does not support the DOM methods outlined above; it balks if you try and set the type of the newly created input element like so:
```
var myInput = document.createElement('input');
myInput.type = 'text';
```
And there is no pure DOM way around this. No using myInput["type"] = 'text', no nothing. In IE, you simply can’t change the type of an input element once it’s been created. As John Resig notes, not even jQuery is immune, and so he had to implement a workaround in his API. Turns out you must specify the entire element in HTML markup for it to work:
```
// Old-IE-only syntax: standards-compliant browsers throw when given markup here
var myInput = document.createElement('<input type="text" />');
```
So there are compatibility concerns on both sides of the fence, but one thing DOM does which innerHTML doesn’t is allow for easy mixing and merging of Document Fragments: collections of nodes outside of the main document structure which can be built, appended and removed as children in any order desired. Contrast this with innerHTML where you need to know what you’re concatenating as strings before doing it, unless you want to worry about complicated substringing later on.

While in the end it’s personal preference, the standard compliance and dynamic changing nature of Document fragments gives DOM the nod in my book.

Performance

So with that being that, how do things compare in performance? Turns out there’s been a lot of benchmarking done on this. John Resig has done some studies comparing the performance of appending Document fragments as opposed to appending many individual nodes. Testing on a wide variety of browsers, he showed that using Document Fragments can be two to four times faster. So lesson learned: Favour Document Fragment usage. They result in only a single redraw operation for the browser, since there is only one modification of the displayed nodes.

Before going on to other studies though, another interesting difference between document fragments and individual nodes: changes to individual nodes after appending them as children persist to the appended-to node, but modifying a document fragment after appending will not. In fact, appending a document fragment as a child will result in clearing the nodes from it in the process of appending to the parent. Clearly, this is the result of pointers and linked lists being worked with, but it’s still good to know that upon appending a document fragment, the parent element is considered the sole owner of the child nodes.
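A minimal sketch of the fragment pattern described above: build everything off-document, append once, and note that the fragment is empty afterwards. The function name `appendItems` and the choice of list items are just for illustration, and the document object is passed in as `doc` so the function can be tried outside a browser; in a page you would pass the global `document`.

```javascript
// Build one <li> per label inside a fragment, then attach them in one go.
// Returns how many nodes the fragment still holds after the append.
function appendItems(doc, parent, labels) {
  const fragment = doc.createDocumentFragment();
  for (const label of labels) {
    const item = doc.createElement("li");
    item.appendChild(doc.createTextNode(label));
    fragment.appendChild(item); // off-document, so no redraw yet
  }
  parent.appendChild(fragment); // single modification of the live tree
  return fragment.childNodes.length; // 0 in a real DOM: the nodes moved to parent
}

// Browser usage (assumes a <ul id="chapters"> exists on the page):
// appendItems(document, document.getElementById("chapters"), ["Intro", "Credits"]);
```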

On to other studies! Several years ago Peter Paul Koch conducted his own, comparing IE5.5, 6 and 7 to FF 2 and 3 beta, Opera 9.5 and Safari 3. At that time, innerHTML won. It makes sense in a way: IE was the monster browser of the time, so why not optimize a browser to use IE’s innerHTML well? Well, a little bit later, once a full iteration of browser development had occurred (including the emergence of Google Chrome), Andrew Hedges conducted his own studies. Turns out he was able to show DOM pulling much closer, to the point that the difference was negligible. Perhaps this was the document fragment factor or the result of other optimizations not included in PPK’s method?

In the end, they are now almost interchangeable in speed. So which to use? Andrew Hedges clearly prefers DOM, but as for you and me, pick your pony. I’ll be sticking with DOM: it’s a bit more code but a lot less compatibility to worry about in years to come.
```
myDiv
```
2010-12-07
Modding Firefox
As the semester is drawing to a close and we the open source crew are feeling comfortable with the JavaScript language, Dave took us to the next level. From the content scripts we had been working on, we got our first exposure to chrome scripts: JavaScript that doesn’t just run in the browser, it runs the browser. With elevated privileges (like hard drive access) and a larger object model (like controlling the tabs), things were getting interesting. And how else would we work with this than by rewriting a very small part of Firefox to give it some added spark?

So that’s exactly what we had to do. Something simple: changing the default behaviour of new tabs from being appended to the end of the tab collection to appearing just to the right of the currently active tab. It could be done in just a few lines of code, he said. He was right, but it took sifting through millions to find them.

Enter MXR: Mozilla Cross-Reference. It’s a powerful web tool that allows you to search through and jump all over the source code through cross-referenced links. Function calls and definitions, regular expression and plain text searching: everything was made easier with this tool. The hard part was finding a good starting point. Kenneth suggested in class that we search for “New Tab”. This didn’t get us exactly what we wanted, but it was the best we could hope for considering our ignorance, and we ended up following a quick chain.
- We started with the UI descriptive text “New Tab” label, stored in tabCmd.label
- From there a search for tabCmd.label brought us to a menu description, which mapped the label to a command (cmd_newNavigatorTab)
- From there, it was easy to find the mapping of the command to the function (BrowserOpenTab();)
- Now we’re getting into code. A search for the function quickly found the definition…
- …which held the setting of the currently selected tab
- …and also a call to the addTab function, inside of which the new tab was appended to a master tab list
- A little further down from there was the real location, setting the tab position
All of this was done by the class, but conveniently we ran out of time before we could get right to the bottom of how to do this mini-assignment. Though we ended on the cusp of discovery, some research of my own into the rest of tabbrowser.xml quickly found me a solution. Two lines of code later, I had something that seemed to be working. After generating a diff file using Mercurial…
```
hg diff -p -U 8 > changes.diff
```
I had a portable way to model and ship my changes. I had been working off an older code base (current as of when I had last built Firefox), so line numbers are slightly skewed with respect to the current code. It seemed to work on my computer, but I had to test it on others’. So I got experimenting some more and found out I’d stopped just shy when building Firefox last time: I had never packaged it!
```
make -C objdir/browser/installer                # ZIPs/Tarballs to objdir/dist
make -C objdir/browser/installer installer      # Builds MSI to objdir/dist/install/sea
```
After a few quick commands from the source root (and replacing “objdir” with my object dir), I was ready. I gave the ZIP file to a friend on a USB stick and she tried running it. Error. Hmm. Turns out there were a few missing DLLs. My compiler (Visual Studio 10) had a few dependent DLLs in C:\Windows\system32 that hadn’t been archived. Once I remedied that, though, everything was a-ok.

What did I take away from this? I felt this was a great way to learn about chrome scripts, and drilling down from the UI “New Tab” label to the code that made it happen gave me an appreciation not just for the size of the Mozilla code base, but for its architecture as well. I learned more about the various tools at public disposal that make the 500-foot dive into the Mozilla project seem more like 5. And most of all, I learned the excitement of shaping Firefox.
2010-12-01
Unravelling Win32 Threads
For the 3D game we’re working on, I’ve also taken it upon myself to help alleviate load times. We’re using Collada models and it turns out they can get very, very large. Collada files are XML formatted and, being text, take a while to read in. A common alternative is to convert these static XML files to binary files for quicker reading, but unfortunately we don’t have the time to implement this. So we decided to investigate multithreading and asynchrony. Turns out, there’s a lot in the Windows API to learn.

Windows has two large APIs for threading: managed and unmanaged. Managed ties into the Common Language Runtime (CLR) and is used in .NET or Managed C++ applications. This was not what I wanted; I wanted the raw, low-level control. Being familiar with the CLR implementation though, I found a Managed -> Unmanaged API Mapping useful. Unfortunately, this only mapped out the basics.

From here, I had other decisions to contend with. A lot of the basic functionality exists in the old Windows NT 4.0 functions, but Vista shipped with a whole bunch of specialized functions like a revamped ThreadPool family. Since we’re using an older version of DirectX (DirectX 9 mostly), I wanted to be sure things would be backwards-compatible to XP. So as neat as they were, they too were out. I did manage to find another great mapping, this one of the Original ThreadPool API -> New ThreadPool API.

For now though, I only want a single thread, so I stuck with CreateThread(). Next, I wanted some way to test the thread’s state (whether it was signaled as finished, aborted, etc). There’s a set of functions called WaitForSingleObjectEx and WaitForMultipleObjectsEx which block for up to a given timeout and return a status code indicating whether the thread handle(s) became signaled (that is, the thread finished) or the wait timed out. Once a thread has finished, you can get its return code by using the GetExitCodeThread() function.

This was straightforward enough, but I wanted some way to execute a function once the remote thread was done. The reason for this was to notify the parent thread. I began looking into the QueueUserAPC function. No matter what I tried though, it would always execute at the beginning of the thread, not the end. This was despite heeding documentation warnings about being sure the thread has started before queuing the callback. As I’d later find out from a forum post by someone else in my position, this was not the right way to do things.

Turns out there are different types of threading models, and Windows implements the “pull” model rather than the “push” model. This means that rather than child threads pushing a message to the parent thread to respond to right then, they must queue it up in a message queue, just like how it works with Windows GUI events. In other words, if Windows is a librarian and a child thread is an employee tasked with fetching a book for a customer, the employee must not interrupt the librarian when returning with the book. This would be bad, as the librarian may be conversing with and helping the customer. Rather, the employee must signal to the librarian that they have the book, and the librarian will take the book when appropriate. There are two ways to do this: PostMessage and PostThreadMessage. The former takes an HWND handle for the window to post to, while the latter takes a DWORD threadId. I used PostMessage because I was posting back to the GUI thread, but if I hadn’t been, I would’ve had to force the creation of a message queue. This is done by calling the following in the parent thread:
```
MSG msg;  // receives the peeked message, if any
PeekMessage(&msg, NULL, WM_USER, WM_USER, PM_NOREMOVE);  // forces creation of this thread's message queue
```
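For illustration, the pull model itself can be sketched without any Windows calls. What follows is a minimal portable C++ stand-in, not the Win32 message queue: the names MessageQueue, post, poll, kLoadDone and demoPull are all hypothetical, and a mutex-guarded std::queue is only assumed to be a reasonable analogy for what PostMessage and the message loop do.
```cpp
#include <cassert>
#include <mutex>
#include <queue>

const unsigned int kLoadDone = 1;  // hypothetical message id, not a real WM_* value

// Hypothetical stand-in for a thread's message queue.
class MessageQueue {
public:
    // Worker side: queue a message and return immediately.
    void post(unsigned int message) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push(message);
    }

    // Owner side: take the oldest message if one is waiting.
    // Returns false when nothing is queued, so the owner is never interrupted.
    bool poll(unsigned int& message) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.empty()) return false;
        message = pending_.front();
        pending_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<unsigned int> pending_;
};

// Usage: the owner sees nothing until the worker has posted and the owner chooses to poll.
bool demoPull() {
    MessageQueue queue;
    unsigned int message = 0;
    assert(!queue.poll(message));  // nothing queued yet
    queue.post(kLoadDone);         // the worker reports that it has finished
    return queue.poll(message) && message == kLoadDone;
}
```
The point of the design is that post never runs any of the owner’s code; the owner decides when to look at the queue, which is the librarian taking the book when appropriate.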
With a way to notify the parent thread, I was ready. Here was my plan:
- Start up a loading screen
- Begin loading model information on a background thread
- Store all information in a pointer in memory
- Notify the GUI thread when done
- Have the GUI thread use the pointer
At first, I tried to be creative. Using the thread’s exit code, which the parent reads with the GetExitCodeThread function, I tried to return the pointer for reference by the GUI thread. This was possible because, in a 32-bit build, both memory addresses and exit codes are 32-bit words. I could just treat the address held by the pointer as a numeric value (cast the pointer to a DWORD, i.e. an unsigned long), return that value from the child thread’s function, and then cast it back again on the parent thread. This worked in debug mode, but for some reason I received Access Violation errors when running in release mode. I’m not sure why; perhaps thread permissions are altered for speed as part of the release build compilation options. At any rate, I needed to think differently.
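
The cast trick can be sketched portably. This is an illustration rather than the original code: ExitCode is a stand-in for the Win32 DWORD, and the function names are hypothetical. It also spells out the assumption the trick relies on, namely that the whole address fits in 32 bits. It is not offered as an explanation of the release-mode crash.
```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t ExitCode;  // same width as a Win32 DWORD

// True if the address can be stored in a 32-bit exit code without losing bits.
bool fitsInExitCode(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) <= 0xFFFFFFFFu;
}

// Worker side: return the pointer's address as the thread's exit code.
ExitCode pointerToExitCode(void* p) {
    assert(fitsInExitCode(p));  // otherwise the upper address bits are silently dropped
    return static_cast<ExitCode>(reinterpret_cast<std::uintptr_t>(p));
}

// Parent side: turn the exit code back into a pointer.
void* exitCodeToPointer(ExitCode code) {
    return reinterpret_cast<void*>(static_cast<std::uintptr_t>(code));
}
```
In a 32-bit build fitsInExitCode is always true, which is why the round trip can work there at all; in a 64-bit build it generally is not.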

Enter a storage class. I named mine ThreadMarshaller. It’s basically just an array of void pointers with two functions: set(void*, int) and get(int). The child thread sets a value at an index, which is then “getted” by the parent thread after it has been signaled. Simple, easy, and it works in both debug and release mode. I had working asynchrony.
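
A class along these lines might look like the sketch below. Only the name ThreadMarshaller and the set(void*, int) / get(int) shape come from the description above; the slot count, the mutex, the Model struct and loadModelInBackground are assumptions made for the example, and std::thread is used here as a portable stand-in for CreateThread().
```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical reconstruction: a fixed-size table of void pointers shared between threads.
class ThreadMarshaller {
public:
    static constexpr std::size_t kSlots = 8;  // slot count is an assumption

    // Called by the child thread to publish a result.
    void set(void* value, int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < kSlots);
        std::lock_guard<std::mutex> lock(mutex_);
        slots_[static_cast<std::size_t>(index)] = value;
    }

    // Called by the parent thread once it has been notified.
    void* get(int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < kSlots);
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_[static_cast<std::size_t>(index)];
    }

private:
    std::mutex mutex_;
    std::array<void*, kSlots> slots_{};  // every slot starts out as nullptr
};

struct Model { int vertexCount; };  // placeholder for the loaded model data

// Loads a (fake) model on a background thread and hands it back through the marshaller.
Model* loadModelInBackground(ThreadMarshaller& marshaller, int slot) {
    std::thread worker([&marshaller, slot] {
        Model* model = new Model{1024};  // stands in for parsing a Collada file
        marshaller.set(model, slot);
    });
    worker.join();  // stands in for waiting on the "done" notification
    return static_cast<Model*>(marshaller.get(slot));
}
```
The caller owns whatever comes back out of get and has to cast it to the right type, so both threads need to agree on which slot holds what.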

I feel like there was a much better way to do this, though. The SetEvent() function, for example, is used to signal an event object, the same kind of signal the above-mentioned WaitForSingleObjectEx function checks for. There’s also something called Thread Local Storage which seems to be a native Windows version of my ThreadMarshaller. Also, using ThreadPool rather than manual thread handling would likely provide for a more scalable solution. Alas, I wasn’t able to play with them all. Hopefully soon 🙂
2010-11-29