Unravelling Win32 Threads

For the 3D game we’re working on, I’ve also taken it upon myself to help cut load times. We’re using Collada models, and it turns out they can get very, very large. Collada files are XML, and parsing that much text takes a while. A common alternative is to convert these static XML files to a binary format for quicker reading, but unfortunately we don’t have the time to implement this. So we decided to investigate multithreading and asynchrony. Turns out, there’s a lot in the Windows API to learn.

Windows has two large APIs for threading: managed and unmanaged. Managed ties into the Common Language Runtime (CLR) and is used in .NET or Managed C++ applications. This was not what I wanted; I wanted the raw, low-level control. Being familiar with the CLR implementation, though, I found a Managed -> Unmanaged API Mapping useful. Unfortunately, it only mapped out the basics.

From here, I had other decisions to contend with. A lot of the basic functionality exists in the old Windows NT 4.0 functions, but Vista shipped with a whole bunch of specialized functions like a revamped ThreadPool family. Since we’re using an older version of DirectX (mostly DirectX 9), I wanted to be sure things would stay backwards-compatible with XP. So as neat as they were, the new functions were out too. I did manage to find another great mapping, this one of the Original ThreadPool API -> New ThreadPool API.

For now, though, I only want a single thread, so I stuck with CreateThread(). Next, I wanted some way to test the thread’s state (whether it had finished, been aborted, etc.). There’s a set of functions, WaitForSingleObjectEx and WaitForMultipleObjectsEx, which block until the thread handle(s) become signaled or a timeout you specify elapses, and return a status code telling you which of the two happened. Once a thread has finished, you can get its return code with the GetExitCodeThread() function.

This was straightforward enough, but I wanted some way to execute a function once the child thread was done, in order to notify the parent thread. I began looking into the QueueUserAPC function. No matter what I tried, though, the callback would always execute at the beginning of the thread, not the end. This was despite heeding the documentation’s warnings about making sure the thread has started before queuing the callback. As I’d later find out from a forum post by someone else in my position, this was not the right way to do things.

Turns out there are different types of threading models, and Windows implements the “pull” model rather than the “push” model. This means that rather than a child thread pushing a message to the parent thread and making it respond right then, the child must place the message in the parent’s message queue, just like Windows GUI events, and the parent picks it up when it next processes messages.

In other words, if the parent thread is a librarian and a child thread is an employee tasked with fetching a book for a customer, the employee must not interrupt the librarian when returning with the book. This would be bad, as the librarian may be conversing with and helping the customer. Rather, the employee must signal to the librarian that they have the book, and the librarian will take the book when appropriate.

There are two ways to do this: PostMessage and PostThreadMessage. The former takes an HWND for the window to post to, while the latter takes a DWORD thread ID. I used PostMessage because I was posting back to the GUI thread, which already has a message queue. Had I been posting to a thread without one, I would’ve had to force the creation of a message queue first by calling this in the receiving (parent) thread:

PeekMessage(&msg, NULL, WM_USER, WM_USER, PM_NOREMOVE);

With a way to notify the parent thread, I was ready. Here was my plan:

  • Start up a loading screen
  • Begin loading model information on a background thread
  • Store all information in a pointer in memory
  • Notify the GUI thread when done
  • Have the GUI thread use the pointer

At first, I tried to be creative. I tried to return the pointer as the thread’s exit code and pick it up on the GUI thread with GetExitCodeThread. This was possible because, in a 32-bit build, both memory addresses and exit codes are 32-bit words. I could treat the address held by the pointer as a numeric value (cast the pointer to a DWORD, i.e. an unsigned long), return that value from the child thread’s function, and then cast it back to a pointer on the parent thread. This worked in debug mode, but for some reason I received Access Violation errors when running in release mode. I’m not sure why. My guess at the time was that thread permissions are altered for speed as part of the release build’s compilation options, though a lifetime or uninitialized-memory bug that the debug build happened to mask seems at least as likely. At any rate, I needed to think differently.
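Whatever caused the release-mode crash, the trick is fragile for a separate reason: it only round-trips when a pointer fits in 32 bits. The sketch below is a portable illustration of that, with std::uint32_t standing in for DWORD and a hypothetical helper name. It does not reproduce or explain the Access Violation itself.

```cpp
#include <cstdint>

// Mimics "cast the pointer to a DWORD exit code, then cast it back".
// std::uint32_t stands in for DWORD, which is 32 bits wide on Windows.
// Returns true only if the address survives the round trip unchanged.
bool SurvivesExitCodeRoundTrip(const void* p)
{
    std::uintptr_t full = reinterpret_cast<std::uintptr_t>(p);
    std::uint32_t asExitCode = static_cast<std::uint32_t>(full);  // may drop high bits
    return static_cast<std::uintptr_t>(asExitCode) == full;
}
```

In a 32-bit build this returns true for every address. In a 64-bit build it returns false for any address above 4 GB, and the pointer reconstructed on the parent thread would point somewhere else entirely.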

Enter a storage class. I named mine ThreadMarshaller. It’s basically just an array of void pointers with two member functions: set(void*, int) and get(int). The child thread sets a value at an index, and the parent thread gets it from that index after being signaled. Simple, easy, and it works in both debug and release mode. I had working asynchrony.
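A rough sketch of what such a class could look like. Only the name and the set(void*, int) / get(int) shape come from the description above. The fixed capacity of 16, the bounds checks, and the std::mutex guard are assumptions added for the example, and the mutex needs a C++11 compiler.

```cpp
#include <mutex>

// A fixed-size table of void pointers shared between threads.
class ThreadMarshaller
{
public:
    static const int kSlots = 16;  // assumed capacity, not from the original

    // Stores a pointer at the given index. Returns false if out of range.
    bool set(void* value, int index)
    {
        if (index < 0 || index >= kSlots)
            return false;
        std::lock_guard<std::mutex> lock(mutex_);
        slots_[index] = value;
        return true;
    }

    // Returns the pointer at the given index, or nullptr if the slot is
    // empty or the index is out of range.
    void* get(int index)
    {
        if (index < 0 || index >= kSlots)
            return nullptr;
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_[index];
    }

private:
    std::mutex mutex_;            // assumption: guards concurrent set/get
    void* slots_[kSlots] = {};    // all slots start out as nullptr
};
```

The class only hands the pointer over. Whoever calls get() still has to know the real type to cast back to, and the pointed-to data has to stay alive until the parent thread is done with it.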

I feel like there’s a much better way to do this, though. The SetEvent() function, for example, signals an event object, and a signaled object is exactly what the above-mentioned WaitForSingleObjectEx function waits for. There’s also something called Thread Local Storage, which at first glance looks like a native Windows version of my ThreadMarshaller, although TLS slots hold a separate value per thread, so it isn’t meant for handing data from one thread to another. Also, using a thread pool rather than manual thread handling would likely make for a more scalable solution. Alas, I wasn’t able to play with them all. Hopefully soon 🙂
