If you look at the documentation for the GEM AES Scrap Library, two things soon become very obvious.

First, it’s clear that the designers of GEM at Digital Research recognized the need for and utility of a well-designed system clipboard that applications could use to quickly and easily exchange data.

Second, it’s even more clear that within no more than a couple of minutes after sitting down to map out what sort of library functions were needed and what sort of predefined data exchange formats should be used, something distracted them to such a degree that they put down their engineering comp books, got up from their desks, and left the office.

I’m guessing the distraction was lunch time. Maybe the food truck rolled into the parking lot and tooted its horn.

A third thing becomes obvious soon after. After lunch was over, they never came back to finish what they’d started.

Three Functions

The GEM AES scrap library consists of three functions.

You heard me. Three functions.  Here they are:

  • scrp_write allowed an application to specify a directory which would contain file(s) containing the data being placed on the clipboard.
  • scrp_read allowed an application to retrieve the directory containing clipboard files.
  • scrp_clear deleted all files in the clipboard directory.

Actually, I lied. The scrp_clear function was only on PC, not Atari. I can only imagine that two or three minutes into the five minute task of bringing the PC code over, the guy doing it said, “screw it — nobody’s gonna use this anyway” and went off for a long weekend.

And Yet, Not Very Useful Functions

The first problem with this setup is that it makes individual applications responsible for deciding where the clipboard resides, rather than the system. That’s simply a recipe for disaster, especially on a hard drive where you could easily end up with a separate clipboard folder for each and every application.

The next problem is that it doesn’t provide a definitive way for an application to find out if there is data on the clipboard or not.  It has to read the directory returned by scrp_read to see if there are any files there or not.  That doesn’t sound terrible at first, but there’s no mechanism for an application to be notified when the clipboard status is changed by another application.  So, as a practical matter, an application has to check again every time it loses and then regains focus.  On a hard drive system, with a fixed system-defined directory, this would be moderately painful.  With the possibility of the directory changing every single time you check, it’s much less moderate.  And that’s on a hard drive system.

On a floppy drive system… well, shit, that set of problems deserves its own section of the article.

Not Remotely Practical Without Hard Drive

Another problem can be neatly summed up in three words: floppy disk system.

Well, not so much a problem in its own right as a problem magnifier.  Trying to use this clipboard setup on a floppy disk system basically takes every problem and makes it worse. When the user has to use one disk to boot the system, another to launch an application, and yet another to load documents, there’s a lot of disk swapping going on.  So, let’s make it worse by having the clipboard on floppy disk also.  We could use the document disk for that, or we could use a separate disk just for the clipboard.  Let’s assume that last option.

Now imagine that you need to copy some formatted text from your word processor into your favorite graphics program.  Let’s step through it.

Boot system on one disk.  Switch disks to launch word processor.  Switch disks to load the document you want.  Switch disks again to clipboard floppy.  Copy/cut the text, meaning the word processor writes out one or more files with the currently selected text.  Now, maybe, switch back to the document floppy to save changes to the file.  Exit the word processor.  Switch disks to the graphics program floppy and launch that program.  Switch to the floppy with your graphics files and load your graphics file.  Once that’s loaded, switch disks to the clipboard floppy.

Now, remember what we said earlier about an application having to check the clipboard folder every time it loses and regains focus?  That didn’t actually happen here.  The application stayed in focus the whole time, but we switched floppy disks.  How is the program supposed to know this so it can, or should, check to see if clipboard data is now available?  Should the program have to monitor floppy drive media change just to use the clipboard? Apparently so, even though that’s just ridiculous.

Anyway, let’s just skip past the details and assume the program has checked again for clipboard files.  Maybe there’s a menu item for “refresh clipboard” or something equally inelegant but functional.  So now the graphics program finds the clipboard data with the text saved earlier by the word processor.

But now there’s a new problem.  The word processor saved out plain ASCII into CLPBOARD.ASC and formatted text into CLPBOARD.WUP, because the word processor is WordUp and WUP is that program’s native file format.  But the graphics program, like every other application in the known universe other than WordUp, has no clue whatsoever how to read a WUP file and extract anything useful from it.

Data Formats

There was no problem with the idea of WordUp saving its own native WUP format to the clipboard.  The problem was that it didn’t also save the information using a more universally recognized file format. There was zero documentation to indicate anything about what kind of data could (or should) be exchanged via this whole process, or what file formats should be used.

I’m not exaggerating. When I say “zero” I mean exactly that.  The official GEM 1.0 documentation says nothing whatsoever about clipboard data formats. I would imagine the guys at DRI assumed GEM metafiles would be used for vector graphics and IMG files for bitmapped graphics, but they didn’t actually put those ideas into the docs.

And what about other data? How should formatted text be transferred? What about a block of cells from a spreadsheet?  How about MIDI sequencer data?

The answer ultimately was, your guess is as good as mine. Is it any wonder that application programmers basically ignored the scrap library?

The real tragedy of this situation is that it’s entirely a documentation problem. There’s got to be probably no more than 8 or 10 basic, simple types of data that would cover probably 95% of all clipboard requirements. If the guys at DRI, or at Atari for that matter, had written a couple of pages of documentation saying “use this format for this kind of data” then maybe applications would have supported the clipboard.

As with many other GEM shortcomings, there would be 3rd party attempts to fix the broken situation. The PC version of GEM eventually included some documentation about data formats, but it was way too late, and limited mostly to very obvious things like TXT, IMG, and CSV.  And for whatever reason this information never really circulated to the Atari side of things at the time.

The Simple Fix, Part 1

Making the clipboard genuinely useful would have been fairly simple, had anybody been paying attention at the time. The first fix would have been to define a reasonable list of standard data exchange formats, and to make sure that list wasn’t completely GEM-centric.  For example, here are a few just off the top of my head:

  • GEM – GEM metafile vector graphics
  • IMG – Bitmapped graphics
  • ASC or TXT – Plain ASCII text.
  • CSV – Comma separated values, text file containing one or more data records with fields separated by commas.
  • RTF – Rich Text Format formatted text.
  • MID – MIDI Data

Each “standard” format would be assigned a 4-byte value like “_ASC” or “_IMG” that will be used as an identifier.  In most cases the code would correspond roughly to the filename extension commonly used.
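To make the identifier idea concrete, here is a minimal sketch of how four ASCII characters could be packed into one 32-bit value. The function names scrp_format_id and scrp_format_name are made up for this illustration, and putting the first character in the most significant byte is my assumption (it matches how the 68000 would lay out a 4-character constant in memory).

```c
#include <stdint.h>

/* Hypothetical helper: pack a 4-character tag such as "_ASC" into a
   32-bit identifier, first character in the most significant byte. */
uint32_t scrp_format_id(const char tag[4])
{
    return ((uint32_t)(unsigned char)tag[0] << 24) |
           ((uint32_t)(unsigned char)tag[1] << 16) |
           ((uint32_t)(unsigned char)tag[2] <<  8) |
            (uint32_t)(unsigned char)tag[3];
}

/* Hypothetical helper: unpack an identifier into a NUL-terminated
   string, handy for debugging or for building a filename extension. */
void scrp_format_name(uint32_t id, char out[5])
{
    out[0] = (char)((id >> 24) & 0xFF);
    out[1] = (char)((id >> 16) & 0xFF);
    out[2] = (char)((id >>  8) & 0xFF);
    out[3] = (char)(id & 0xFF);
    out[4] = '\0';
}
```

Comparing two formats then becomes a single 32-bit integer compare, which is cheap even on an 8 MHz machine.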

Keep in mind that there are some file formats in common use today that weren’t around in 1984. For example, the TIFF format for graphics files was introduced in 1986.  JPEG came out in the early 90’s just barely in time for the creation of the WWW.

The choice of IMG files for bitmapped graphics seems unavoidable, yet there is a critical flaw in that IMG doesn’t allow color palette information to be saved with the bitmap.  DRI seemed to think you’d save out a .GEM metafile defining the palette, but that conflicts with another likely scenario: if your source data is vector graphics, then saving out a rendered bitmap version on the clipboard is entirely reasonable.

This is just a short simple list that serves to illustrate the point.

The Simple Fix, Part 2

After defining the basic data formats, the main thing would be to allow for the clipboard to be either disk or RAM-based.  Most copy/paste operations could be done with a fairly small RAM-based clipboard, but it would have been easy to accommodate disk-based clipboard data when needed.

The existing functions scrp_read and scrp_write are history in this scenario as far as I’m concerned.  In their place, I would have suggested a pair of functions for saving & retrieving RAM-based clipboard data.  Something like this:

int scrp_copy( CLIPBOARDDATA *cDataSaved );       /* returns a result code */
int scrp_paste( CLIPBOARDDATA *cDataRequested );  /* returns a result code */

Yeah, I know. We started with two functions and I’m proposing replacing them with two functions. Kind of ironic, but keep reading and I think you’ll agree I get a lot more mileage out of my two functions than AES got out of the original scrap library functions.


The CLIPBOARDDATA structure defines the information being saved to the clipboard or retrieved from it. It looks like this:

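typedef struct {
    WORD    formatCount;   /* number of entries in each array below */
    DWORD   *formats;      /* 32-bit format identifier codes        */
    UCHAR   **data;        /* one data pointer per format           */
    DWORD   *lengths;      /* length of each data item              */
} CLIPBOARDDATA;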

The formatCount field specifies the size of the arrays pointed to by the formats, data, and lengths parameters.  When writing data to the clipboard, it specifies how many types of data the application is saving.  When requesting data from the clipboard, it specifies how many formats the application is listing as ones it knows how to process.  When the system returns clipboard data to the application, it indicates how many formats were actually returned. Note that an application can request one format at a time, or many.

Each of the formats specified should represent the same basic data in different formats. For example, a word processor might save formatted text in its own native file format like WUP, but also using both RTF (Rich Text Format) and plain ASCII TXT.  A graphics application might save vector graphics as EPS and GEM, but also a rendered bitmap version as IMG and TIFF.

The formats parameter is a pointer to an array of 32-bit ASCII format identifier codes (like “_IMG” or “_GEM”)  as mentioned earlier.  Each code represents a single type of data, like ASCII text, IMG bitmap, etc.  The list we defined earlier would form the core of this.

The data parameter is a pointer to an array of UCHAR pointers (WORD aligned), with formatCount elements,  that point to the data.  This field is ignored as input to the scrp_paste function.

Finally, the lengths parameter points to an array, with formatCount elements, where each element specifies the length of the data items pointed to by the data parameter.   This field is ignored as input to the scrp_paste function.

The API Functions

The scrp_copy function would save data, as defined in the CLIPBOARDDATA structure, to the clipboard, in each of the specified formats. Depending on the length of the data, it will be copied from the application space to a buffer allocated and maintained by AES, or saved out as a file to the system clipboard directory.  Or possibly both.

Calling scrp_copy would cause AES to send a message to all open applications indicating that the clipboard had been updated.  (Some later 3rd party revision of GEM added a message named SC_CHANGED for this purpose but it was not part of GEM’s original specification.)

The scrp_paste function would retrieve the current data on the clipboard, if any is available in the requested formats.  On input, the CLIPBOARDDATA structure specifies the data formats the application can accept.  This allows the AES to ignore clipboard formats the program won’t be using.  On output, the structure will be updated to indicate the data formats being returned and the actual data.

The resultcode value returned is 0 for no error,  or various negative values to indicate different errors.
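Here is a rough sketch of how the RAM-based half of this proposal could behave. Everything in it is illustrative: the error codes, the SCRP_MAXFORMATS limit, and the decision to hand back pointers into clipboard-owned buffers are all my assumptions, and the disk fallback and the "clipboard changed" broadcast are left out entirely.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int16_t       WORD;
typedef uint32_t      DWORD;
typedef unsigned char UCHAR;

typedef struct {
    WORD   formatCount;
    DWORD  *formats;
    UCHAR  **data;
    DWORD  *lengths;
} CLIPBOARDDATA;

/* Hypothetical result codes and limits for this sketch. */
#define SCRP_OK          0
#define SCRP_EINVAL    (-1)
#define SCRP_ENOMEM    (-2)
#define SCRP_ENOFORMAT (-3)
#define SCRP_MAXFORMATS  8

/* The "AES-owned" buffer: one private copy of the data per format. */
static struct { DWORD format; UCHAR *data; DWORD length; } clip[SCRP_MAXFORMATS];
static int clipCount = 0;

static void clip_reset(void)
{
    for (int i = 0; i < clipCount; i++)
        free(clip[i].data);
    clipCount = 0;
}

int scrp_copy(CLIPBOARDDATA *cDataSaved)
{
    if (cDataSaved->formatCount < 0 || cDataSaved->formatCount > SCRP_MAXFORMATS)
        return SCRP_EINVAL;
    clip_reset();                               /* new copy replaces the old contents */
    for (int i = 0; i < cDataSaved->formatCount; i++) {
        DWORD len = cDataSaved->lengths[i];
        UCHAR *copy = malloc(len ? len : 1);
        if (copy == NULL) {
            clip_reset();
            return SCRP_ENOMEM;
        }
        memcpy(copy, cDataSaved->data[i], len);
        clip[clipCount].format = cDataSaved->formats[i];
        clip[clipCount].data   = copy;
        clip[clipCount].length = len;
        clipCount++;
    }
    return SCRP_OK;
}

int scrp_paste(CLIPBOARDDATA *cDataRequested)
{
    int found = 0;
    /* Walk the caller's list in order, so earlier entries act as preferences. */
    for (int i = 0; i < cDataRequested->formatCount; i++) {
        for (int j = 0; j < clipCount; j++) {
            if (clip[j].format == cDataRequested->formats[i]) {
                cDataRequested->formats[found] = clip[j].format;
                cDataRequested->data[found]    = clip[j].data;
                cDataRequested->lengths[found] = clip[j].length;
                found++;
                break;
            }
        }
    }
    cDataRequested->formatCount = (WORD)found;  /* how many formats came back */
    return found > 0 ? SCRP_OK : SCRP_ENOFORMAT;
}
```

In this sketch a graphics program asking for "_IMG" and "_ASC" after a word processor copied "_RTF" and "_ASC" would get back a formatCount of 1, with the plain text in slot 0. The caller still has to supply data and lengths arrays with formatCount elements so there is somewhere to put the results.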

Do We Need More Than That?

We could stop right there and the result would be a billion times more useful than the original scrap library.  There is at least one more function which would be nice to have, though.

  • scrp_bufferinfo – Retrieve information about the RAM-based clipboard data buffer, like maximum size.

Something like this would have been easy to create, fairly compact, and it would have made the GEM clipboard a genuinely useful tool from day one.

Next Time

I think we’ll be headed back to VDI topics for next time around but it’s still up in the air.   Don’t be afraid to comment and let me know what topics you’d like!

Other Articles In This Series

In the last few episodes, we’ve talked about how GEM’s event processing model could have been a bit better, and how it could have better facilitated more cooperation in the cooperative multitasking environment. Then we discussed how the event handling changed a bit under MultiTOS when there was preemptive multitasking.

This time, we’re going to talk about how GEM AES defined and managed GUI elements like windows, buttons, text boxes, and so forth. As we have been doing, we’ll continue to compare GEM to how Microsoft Windows does things.

And once again, to be clear, I’ve chosen Windows to compare against not because I think it’s the standard by which everything else should be judged, but rather because it first came out about the same time as GEM, and because it’s familiar to the greatest number of people.

If you aren’t reasonably familiar with programming for Microsoft Windows, and you haven’t read the previous entry in this series, you might want to do it now. In particular, make sure you’ve read the “What is a Window Class” sidebar.

GEM AES Lacks Consistency

Consistency is an important foundation of how Microsoft Windows works, going all the way back to v1.0. Every UI element is defined by a window class, and they all follow the same basic strategy for how they’re created, how they process events, and how they’re used as components of a greater whole. The really important thing, ultimately, is that everything in Windows works this way. Every UI element, from menu bars or menu items to buttons, combo boxes, or whatever else, is either an object defined by a window class, or is managed by such an object. This means everything works in a consistent manner. You don’t have to learn one set of rules for one part of the user interface and a different set of rules for something else.

By comparison, perhaps the biggest design flaw about GEM AES is how it lacks consistency in the way its UI (user interface) elements are defined, how they work, and how they’re put together to create a complete user interface for an application. GEM doesn’t have anything like window classes or a single, unified approach to everything. There are basically three different ways to do things.

  • Windows
  • Dialog Boxes
  • Menu Bars

Well, maybe it’s really more accurate to say two and a half. There’s some overlap in the way dialog boxes and menu bars are defined, but also some very fundamental differences in how they’re used.

Overall, the GUI features of GEM break down into two categories, which we’ll call The Elements and The Windows.

The Elements

First let’s talk about the category we called The Elements.  We’re talking about User Interface (UI) elements like buttons, check boxes, list boxes, editable text fields, and so forth.  These simple UI elements are defined via a simple data structure known as an OBJECT.  That’s an unfortunate choice of name by modern standards, but it was applied a few years before object-oriented programming really started to become much of a thing outside of computer science labs.

These elements are normally used in groups, not individually.  Such a group might be used as a dialog box, or a menu bar.

We won’t get into the minute details here, but let’s go over some of the basics of the OBJECT structure.  It was fairly small, just 24 bytes, as you can see below. You can probably guess the function of most of the fields from the names.

typedef struct {
   int16_t    ob_next;   /* The next object            */
   int16_t    ob_head;   /* First child                */
   int16_t    ob_tail;   /* Last child                 */
   uint16_t   ob_type;   /* Object type                */
   uint16_t   ob_flags;  /* Manipulation flags         */
   uint16_t   ob_state;  /* Object status              */
   int8_t     *ob_spec;  /* Type specific data pointer */
   int16_t    ob_x;      /* X-coordinate of the object */
   int16_t    ob_y;      /* Y-coordinate of the object */
   int16_t    ob_width;  /* Width of the object        */
   int16_t    ob_height; /* Height of the object       */
} OBJECT;

To combine multiple UI elements into a larger, more complex UI structure like a dialog box, you used an array of OBJECT structures, also known as an OBJECT tree.

The first three fields of an OBJECT were used to create a hierarchy for items within the tree, such that certain objects could contain other objects.
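To show how those three link fields work in practice, here is a small sketch that walks from an object up to its parent and adds up coordinates along the way. It relies on the usual GEM conventions as I remember them: index 0 is the root, -1 (NIL) means "none", and the last child's ob_next points back at its parent. The helper names obj_parent and obj_offset and the four-object demo_tree are made up for the example; the AES has its own objc_offset call for the second job.

```c
#include <stdint.h>

typedef struct {
    int16_t  ob_next, ob_head, ob_tail;
    uint16_t ob_type, ob_flags, ob_state;
    int8_t  *ob_spec;
    int16_t  ob_x, ob_y, ob_width, ob_height;
} OBJECT;

#define NIL      (-1)
#define G_BOX     20
#define G_BUTTON  26

/* Hypothetical tree: a box holding a button and a second box,
   which in turn holds one more button. */
const OBJECT demo_tree[4] = {
    /* next head tail  type     flg st spec   x   y    w    h */
    {  NIL,   1,   2, G_BOX,     0, 0, 0,  100, 50, 200, 100 },
    {    2, NIL, NIL, G_BUTTON,  0, 0, 0,   10, 10,  80,  20 },
    {    0,   3,   3, G_BOX,     0, 0, 0,   10, 40, 180,  50 },
    {    2, NIL, NIL, G_BUTTON,  0, 0, 0,    5,  5,  60,  20 },
};

/* Follow ob_next until we land on an object whose ob_tail is the
   object we just came from; that object is the parent. */
int16_t obj_parent(const OBJECT *tree, int16_t obj)
{
    int16_t prev = obj;
    int16_t cur;

    if (obj == 0)
        return NIL;                      /* the root has no parent */
    cur = tree[obj].ob_next;
    while (cur != NIL && tree[cur].ob_tail != prev) {
        prev = cur;
        cur = tree[cur].ob_next;
    }
    return cur;
}

/* Object coordinates are relative to the parent, so the screen
   position is the sum of ob_x/ob_y up the chain to the root. */
void obj_offset(const OBJECT *tree, int16_t obj, int16_t *x, int16_t *y)
{
    *x = 0;
    *y = 0;
    while (obj != NIL) {
        *x = (int16_t)(*x + tree[obj].ob_x);
        *y = (int16_t)(*y + tree[obj].ob_y);
        obj = obj_parent(tree, obj);
    }
}
```

With the demo tree, object 3 sits at (5,5) inside object 2 at (10,40) inside the root at (100,50), so its screen position works out to (115,95).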

The ob_type field specified what sort of UI element was represented. There were about 15 or so standard types which included simple UI elements like “button” or “editable text field”. This field not only indicated what the element was supposed to look like, but also how user interaction should be managed. Other fields contained flags that would indicate differences in appearance or behavior, like if the element is selectable, or if it was the default button, and so forth. There were other fields to hold the current object state, and of course, basic details like the object’s location and size.

Some object types required extra data like text strings or a bitmap. Extra data like that was stored elsewhere and pointed to via the ob_spec field.

Note that the OBJECT structure contains no pointers to code of any kind, like a message handler.

Such an array of OBJECT structures, along with the text, bitmap, or other data that goes with it, is known as an Object Tree, and more generally as a Resource. An individual resource might be part of a larger collection of resources loaded from a Resource File by the program at startup time.

Windows Also Has Resources

In Windows, “resource” is a much broader concept than with GEM, but one similar aspect is that a Windows resource file can contain definitions of UI structures like a dialog box, made up of a list of the individual UI elements required.

In GEM, the resource contains the actual data structures for the UI elements, but in Windows, it contains just a list of the parameters required to create each element. And although Windows UI elements have code associated with them, the resource does not contain that code.

In order to distinguish one type of UI element from another, the resource uses the name of the element’s window class. If it’s not a standard type, it’s presumed the application will load the appropriate library or otherwise initialize the window class before the resource is referenced.

This means that Windows can benefit from a relatively compact and simple description of the UI elements required, yet also allow the code for managing those elements be as simple or as complex as they need to be.

GEM AES Objects Are Just Data

The OBJECT structure defines what an individual element is supposed to look like, sort of. That is, it tells GEM, “I’m a button. Draw whatever you think a button should look like.”

The OBJECT structure also defines what an individual element is supposed to do, sort of. That is, it tells GEM, “This is a button. When the user interacts with me, do whatever sort of actions you think a button should do.”

Ultimately, in either case, because the OBJECT is just data, it really has no control over the final result. There has to be some code to interpret the OBJECT and make sense of it all. In GEM, this is done by the AES forms library and object library. The forms library is responsible for managing complete structures like dialog boxes, while the object library is responsible for manipulating or drawing UI elements either individually or as a group.

Under Windows, there is nothing that closely corresponds to the GEM AES forms or object libraries. The necessary code for UI elements to do their thing is specified when the window class for each type of element is registered with Windows. So, each UI element is ultimately a reference to a block of code that knows how to create and display the element, and how to deal with any user interaction. And all of the basic “built-in” UI elements like buttons, checkboxes, etc., are defined in their own library, separate from the rest of Windows, so that even Windows ends up using them in the same way as regular applications.

Showing A Dialog Box

To do a dialog box in GEM, you call the AES form library form_do function, in effect saying, “Here’s a list of UI elements. Draw it, monitor the user’s interaction, and tell me what happens after it’s all over.”

The form_do function calls the object library function objc_draw to draw the UI elements specified in the resource tree passed to it, then it monitors the user’s interaction with those elements until the user hits an item with the mouse that is marked as an exit or touchexit item. At that point, control returns to the application.

But that doesn’t mean the dialog box is finished. Now the application has a chance to find out what the user did, by accessing the OBJECT structures and checking the various bits of status information. Depending on what it finds, the application has the option of updating the object tree in some fashion.  It might disable a button, clear a checkbox, or maybe update a list of selectable items.  Then once all that’s done, it can call form_do again for another round of interaction. Eventually, it can call other functions that signify the end of the dialog box, which will release the screen, send redraw messages to whatever was underneath, etc.
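As a rough illustration of that "check the status bits, tweak the tree, go around again" step, here is the kind of function an application might run each time form_do returns. The dialog layout (the DLG_* indices) is invented for the example. The SELECTED and DISABLED bit values are the standard ob_state definitions as I recall them, and the masking of bit 15 in the exit object is a precaution based on my memory of form_do flagging double-clicks there.

```c
#include <stdint.h>

typedef struct {
    int16_t  ob_next, ob_head, ob_tail;
    uint16_t ob_type, ob_flags, ob_state;
    int8_t  *ob_spec;
    int16_t  ob_x, ob_y, ob_width, ob_height;
} OBJECT;

/* ob_state bits */
#define SELECTED 0x0001
#define DISABLED 0x0008

/* Hypothetical object indices for an imaginary "save options" dialog. */
enum { DLG_ROOT, DLG_BACKUP_CHECK, DLG_BROWSE_BTN, DLG_OK_BTN };

/* Called after form_do returns, before deciding whether to call it again.
   exit_obj is the value form_do handed back. */
void dialog_update(OBJECT *tree, int16_t exit_obj)
{
    exit_obj &= 0x7FFF;   /* drop the (assumed) double-click marker in bit 15 */

    /* The exit button is left selected, so clear it or it will be
       drawn highlighted on the next round. */
    tree[exit_obj].ob_state &= (uint16_t)~SELECTED;

    /* The "Browse" button only makes sense when the checkbox is ticked. */
    if (tree[DLG_BACKUP_CHECK].ob_state & SELECTED)
        tree[DLG_BROWSE_BTN].ob_state &= (uint16_t)~DISABLED;
    else
        tree[DLG_BROWSE_BTN].ob_state |= DISABLED;
}
```

Note that this only changes the data. The application would still have to redraw the affected objects itself before handing the tree back to form_do, which is exactly the kind of per-dialog busywork being described.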

It should be clear that for anything other than very simple dialogs, you end up writing a lot of custom code that is unique to that specific dialog box. And all that still assumes you’re using only standard, vanilla UI elements. If you need any customization at all, you probably need to avoid calling the AES form_do function and instead, create your own block of code that does more or less the same thing, plus whatever custom functionality you require.

With Windows, creating a dynamic, interactive dialog is a far less involved process. You simply identify which events will require special attention, and you write handlers for those specific events. For example, let’s say that clicking an item in a list box should make certain buttons elsewhere in the dialog become enabled, disabled, or selected. All you have to do is attach a piece of code to the “item selected” event, and have that code configure the buttons as needed.

This is much simpler, yes?

Dialog Boxes Aren’t Windows, They’re Object Trees

In Windows, a dialog box is just another kind of window. It uses the same exact event processing model as anything else. In most cases the only significant difference for a dialog box is that the window is marked as being modal, meaning that you have to dismiss it before things like mouse events or keyboard events will be given to other windows. And even that is optional.

In GEM, a dialog box isn’t a regular window. Or any other kind of window for that matter. It’s a completely different animal. Instead of being a window, a dialog box is essentially a list of objects arranged in a hierarchical fashion, an object tree as we discussed way back towards the start of this article.

A dialog box object tree will probably start with a G_BOX rectangle object used as an overall container.  Walking the tree from there, you’ll find text label objects, button objects, more G_BOX objects, editable text field objects, and other such UI elements.

A dialog box is typically defined by a resource tree within the program’s resource file. It could also be generated at runtime programmatically, although this would mostly be an exercise in masochism unless your program’s main function was being a resource editor.

To manage the user’s interaction with a dialog box, the AES provides the form_do function. This function uses a specialized event handler loop that knows how to do things like navigate the linked list of object structures in the resource tree to figure out which button was clicked on, or which editable text field, etc.
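The "figure out which object was clicked" part can be sketched as a recursive walk of the tree. This is not the AES code (the real call for this job is objc_find), just my own minimal take on the technique. The function name find_object_at and the demo_tree are invented, and I’m assuming the usual conventions: -1 means no link, the last child’s ob_next points back to the parent, and the HIDETREE flag hides an object along with its children.

```c
#include <stdint.h>

typedef struct {
    int16_t  ob_next, ob_head, ob_tail;
    uint16_t ob_type, ob_flags, ob_state;
    int8_t  *ob_spec;
    int16_t  ob_x, ob_y, ob_width, ob_height;
} OBJECT;

#define NIL       (-1)
#define HIDETREE  0x0080   /* ob_flags bit */
#define G_BOX     20
#define G_BUTTON  26

/* Hypothetical dialog: root box, a button, and an inner box with a button. */
const OBJECT demo_tree[4] = {
    {  NIL,   1,   2, G_BOX,     0, 0, 0,  100, 50, 200, 100 },
    {    2, NIL, NIL, G_BUTTON,  0, 0, 0,   10, 10,  80,  20 },
    {    0,   3,   3, G_BOX,     0, 0, 0,   10, 40, 180,  50 },
    {    2, NIL, NIL, G_BUTTON,  0, 0, 0,    5,  5,  60,  20 },
};

/* Return the deepest visible object containing screen point (mx,my),
   or NIL if the point misses obj entirely. parent_x/parent_y is the
   screen position of obj's parent (0,0 when obj is the root). */
int16_t find_object_at(const OBJECT *tree, int16_t obj,
                       int parent_x, int parent_y, int mx, int my)
{
    int x = parent_x + tree[obj].ob_x;
    int y = parent_y + tree[obj].ob_y;
    int16_t hit, c;

    if (tree[obj].ob_flags & HIDETREE)
        return NIL;
    if (mx < x || my < y ||
        mx >= x + tree[obj].ob_width || my >= y + tree[obj].ob_height)
        return NIL;

    hit = obj;
    /* The child chain ends when ob_next loops back to the parent. */
    for (c = tree[obj].ob_head; c != NIL && c != obj; c = tree[c].ob_next) {
        int16_t sub = find_object_at(tree, c, x, y, mx, my);
        if (sub != NIL)
            hit = sub;     /* later siblings are drawn on top */
    }
    return hit;
}
```

Calling find_object_at(demo_tree, 0, 0, 0, 120, 100) on the demo tree lands on object 3, the button inside the inner box, while a point that only touches the outer box returns 0.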

When the user performs some action that indicates the dialog box is finished, the form_do function exits. For most dialog boxes, that’s the end of the process, but more sophisticated ones might update something and jump back into form_do again.

Menu Bars

The next part of the GEM AES trifecta of different ways to do things is the menu bar.  Menu bars are object trees, like a dialog box, but they’re managed by the system fairly automatically.  Once you’ve told GEM “Here’s my menu bar!” the AES will display it at the top of the screen and allow the user to interact with it.

Under MultiTOS, the menu bar shown at any given moment is that which belongs to whatever application owns the top-most window on screen.

Once the menu bar is in place, things are fairly automatic as far as your program is concerned.  You don’t have to do anything except wait for the user to select a menu item. When that happens, the AES sends your application an MN_SELECTED message which indicates which item was selected.

Your program can dynamically change certain things about the current menu, like individual items being enabled or disabled, or you can update the item text, as long as the object tree for the menu bar doesn’t change when the user could be interacting with it.

Menu Bars Aren’t Modal, Except When They Are

Normally, one thinks of interacting with a menu bar as being a non-modal operation, and in the overall broad sense that’s true. But there are parts of the process that are modal. For example, before drawing a menu, GEM AES saves the appropriate portion of the screen to an offscreen buffer.  When the user is done with the menu, it restores the original screen contents.  This is done to eliminate the need to send redraw messages to whatever was underneath the menu.

But it’s also a modal operation.  That is, AES locks down the screen when the user interacts with the menu bar.  This includes blocking any application that is currently waiting for an event library call to return.  This normally has little impact, but it can affect programs which are attempting to maintain some sort of live, animated display, as this will probably freeze when the user interacts with the menu bar. At least, if they’re doing it right when they refresh the window for the animation.

Customizing Menus

Although a menu bar is a standard object tree, you can’t get away with placing any sort of OBJECT into a menu. While you’d probably not expect things like editable text fields to make much sense, certain more basic things like icons don’t really work either.  At least not as you’d expect.

When I was working on the 2nd revision to my FONTZ! font editor application, I wanted to be able to have hierarchical submenus in my menu bar.  The first problem I had was that the resource editor programs didn’t understand that idea.  But I managed to put it together.

I managed to get it to draw and interact with the mouse properly.  It didn’t happen automatically, but I did it using only standard AES & VDI functions.  I had to save the screen area underneath the submenu myself, and restore it afterwards.

But even after I got it to draw and track with the mouse, the submenu didn’t generate a message when the user selected an item.  Eventually I ended up doing it by tracking it myself and sending myself a MN_SELECTED message, instead of expecting GEM to do it.

Later revisions of GEM would have support for such submenus built-in, but as far as I know I was the first to do it using 100% legal AES functions before that.

Menus & Event Processing

In our last installment, we talked about how GEM’s event processing could sometimes, at least theoretically, mean that your program received and/or processed messages in a different order from which they occurred.

Menu item selection is a good example of how this can happen.  Suppose a program has a tool bar at the top of the window and it contains a “Quit” button.  So what happens if a user goes into the menu bar, selects the “Save” item, but when the menu goes away the mouse is right on top of the button and it gets clicked too? These might get separated, but it’s possible for both events to be returned by evnt_multi at the same time.

So now the program returns from evnt_multi with a message event for MN_SELECTED and a mouse event for the button click.  The program has no idea which event happened first, so it could SAVE then QUIT, or it could just QUIT and never process the SAVE request.
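Here is a stripped-down sketch of why the order gets lost. The event bits returned by evnt_multi are just a mask, so the order in which the program happens to test them is the order in which things get handled. The MU_BUTTON and MU_MESAG values are the standard AES bit definitions as I recall them; dispatch_events and its one-letter log are invented for the illustration.

```c
#include <stdint.h>

/* Event bits as returned by evnt_multi */
#define MU_BUTTON 0x0002
#define MU_MESAG  0x0010

/* Hypothetical dispatcher: records the order in which handlers would
   run ('M' = message such as MN_SELECTED, 'B' = mouse button) and
   returns how many ran. */
int dispatch_events(uint16_t events, char order[3])
{
    int n = 0;

    /* Nothing in the mask says which event came first. This program
       simply checks messages before buttons, so "Save" wins. Swap the
       two tests and "Quit" runs first. */
    if (events & MU_MESAG)
        order[n++] = 'M';
    if (events & MU_BUTTON)
        order[n++] = 'B';
    order[n] = '\0';
    return n;
}
```

Checking messages first is a reasonable defensive habit, but it is a guess about what the user meant, not information the AES actually provided.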

That’s probably a worst-case scenario, but it’s not hard to imagine other situations where things would be done out of order.

The Windows

The last point on the GEM GUI triangle is the basic application window.

Windows In GEM Aren’t Made Of Objects

Remember earlier when we talked about telling GEM, “Here’s a list of UI elements. Draw it, monitor the user’s interaction, and tell me what happens after it’s all over”?

Well that only applies to menu bars and dialog boxes. Windows aren’t a type of OBJECT, nor are they a resource tree of multiple OBJECTs. Windows are just… windows. They are essentially monolithic entities unto themselves.

You create and open a window by specifying a collection of flags that indicate if individual window elements like scrollbars, close buttons, etc., should be present or not.  You would think such elements would be part of the standard collection used for dialog boxes etc., but no. You also specify things like the position and size where it should go on screen.

When the window is created, you get back an integer window handle that is used thereafter to refer to that window. GEM keeps track of which window handles belong to each application.

But GEM doesn’t really manage the whole window. It tracks the user’s interactions with the outer perimeter, the frame, but not what happens in the window’s client area.

GEM AES Windows and Events

Most window-related events are pretty easy to deal with, but some require a lot of code to handle properly.  There are two reasons for this. First, GEM puts most of the burden for dealing with things like scrollbars onto the application. Second, the AES doesn’t handle screen coordinates within a window at all: you always deal with global screen coordinates.  This connects with the VDI’s lack of ability to do any sort of coordinate system translation, as we discussed in an earlier episode.

You get mouse events for things that happen in a window’s client area, but the information you get from the event won’t directly reference the window at all.  It’ll be up to the program to determine which window was at the mouse position by calling the wind_find function. The possibilities include the desktop as well as any open windows belonging to the application.

Once you determine that a mouse event happened inside one of your application’s windows, then you’ll probably have to translate the mouse coordinates from global screen space into something relative to the window’s client area.  This is done using the wind_get function.

Then you’ll have to factor in any offsets represented by the window’s current scroll bar positions. That last part is further complicated by the fact that scrollbar positions and sizes in GEM are always set to a range of zero to 1000, regardless of whether you have a 400 pixel window showing a 410 pixel document or a 10000 pixel document.
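Here is a rough sketch of that slider arithmetic, assuming the zero-to-1000 range described above. The function names slider_size, slider_pos and scroll_offset are made up for illustration; they are not AES calls, and a real program would pass the results to or from wind_set and wind_get.

```c
/* Hypothetical helpers: map between document pixels and GEM's
   fixed 0..1000 slider range. None of these are AES functions. */

/* Slider size: the visible fraction of the document, scaled to 1000. */
int slider_size(long visible, long total)
{
    if (total <= visible)
        return 1000;                /* the whole document fits */
    return (int)((visible * 1000L) / total);
}

/* Slider position: how far we have scrolled through the part of the
   document that does NOT fit in the window, scaled to 1000. */
int slider_pos(long offset, long visible, long total)
{
    long range = total - visible;   /* largest possible scroll offset */
    if (range <= 0)
        return 0;
    if (offset < 0)
        offset = 0;
    if (offset > range)
        offset = range;
    return (int)((offset * 1000L) / range);
}

/* Inverse: turn a slider position back into a pixel offset. */
long scroll_offset(int pos, long visible, long total)
{
    long range = total - visible;
    if (range <= 0)
        return 0;
    return ((long)pos * range) / 1000L;
}
```

With a 400 pixel window on a 410 pixel document, one slider unit is a hundredth of a pixel. With a 10000 pixel document, one unit is almost ten pixels, so the same 0 to 1000 value means very different things.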

And if your application has a “zoom factor” it can apply to what it displays, well, then you’ll have to factor that in at some point.

After all that, you’ll have a set of coordinates relative to the “document” being displayed and you can take whatever action is indicated by the mouse event.
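The whole chain of translations can be sketched roughly like this. It assumes the window’s client area has already been fetched (a real program would ask wind_get for it, typically with WF_WORKXYWH) and that the scroll offset is already in document pixels. The Rect and DocPoint types and the screen_to_doc name are invented for this example, and whether zoom is applied before or after the scroll offset depends on how your application stores its scroll position.

```c
/* Hypothetical types for illustration; none of these names come from GEM. */
typedef struct { int x, y, w, h; } Rect;
typedef struct { long x, y; } DocPoint;

/* Convert a global screen position into document coordinates.
   work     : the window's client area, in screen coordinates
   scroll_x : current scroll offset, in document pixels
   scroll_y
   zoom_pct : zoom factor in percent (100 means 1:1)
   Returns 1 on success, 0 if the point is outside the client area
   or the zoom factor is not usable. */
int screen_to_doc(Rect work, long scroll_x, long scroll_y, int zoom_pct,
                  int mx, int my, DocPoint *out)
{
    long cx, cy;

    if (zoom_pct <= 0)
        return 0;
    if (mx < work.x || mx >= work.x + work.w ||
        my < work.y || my >= work.y + work.h)
        return 0;

    /* Step 1: global screen space -> window client space */
    cx = (long)mx - work.x;
    cy = (long)my - work.y;

    /* Step 2: undo the zoom so we are back in document pixels */
    cx = (cx * 100L) / zoom_pct;
    cy = (cy * 100L) / zoom_pct;

    /* Step 3: add the scroll offset */
    out->x = cx + scroll_x;
    out->y = cy + scroll_y;
    return 1;
}
```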

Other than mouse events, the main thing that gets complicated is a redraw message.  When your program gets a redraw message, it will indicate the overall rectangle of the “dirty” area that needs to be redrawn.  In screen coordinates, of course, so you’ll have to jump through the same hoops we mentioned a paragraph or two ago to get an offset for your window’s client area.

And then you can’t just redraw the rectangle in the message.  Turns out, that is the overall bounding rectangle of a list of smaller “dirty” rectangles, which may or may not be contiguous.  You’ve got to use wind_get to get the first such rectangle, set the clipping and redraw it, then repeat the process until wind_get tells you that you’ve reached the end of the rectangle list.

And of course, you’ll have to be translating the coordinates back and forth between global screen space and window client space as needed.
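To make the rectangle-list dance concrete, here is a sketch that runs without GEM. The list is modelled as a plain array ending in an empty rectangle; a real program would pull each entry from wind_get (WF_FIRSTXYWH, then WF_NEXTXYWH) and set the VDI clipping rectangle before drawing. The Rect type, the DrawFn callback and the function names are all assumptions made for the example.

```c
typedef struct { int x, y, w, h; } Rect;

/* Intersect two rectangles. Returns 1 and stores the overlap in *out
   if they overlap, 0 otherwise. */
int rect_intersect(Rect a, Rect b, Rect *out)
{
    int x1 = a.x > b.x ? a.x : b.x;
    int y1 = a.y > b.y ? a.y : b.y;
    int x2 = (a.x + a.w < b.x + b.w) ? a.x + a.w : b.x + b.w;
    int y2 = (a.y + a.h < b.y + b.h) ? a.y + a.h : b.y + b.h;

    if (x2 <= x1 || y2 <= y1)
        return 0;
    out->x = x1;
    out->y = y1;
    out->w = x2 - x1;
    out->h = y2 - y1;
    return 1;
}

/* The application's own drawing code, told which area it may touch. */
typedef void (*DrawFn)(Rect clip, void *ctx);

/* Walk a rectangle list that ends with an empty rectangle, redrawing
   only where each entry overlaps the dirty area from the message.
   Returns the number of pieces that were redrawn. */
int redraw_window(const Rect *list, Rect dirty, DrawFn draw, void *ctx)
{
    int count = 0;
    Rect clip;

    for (; list->w > 0 && list->h > 0; list++) {
        if (rect_intersect(*list, dirty, &clip)) {
            draw(clip, ctx);    /* real code: set clipping, then draw */
            count++;
        }
    }
    return count;
}

/* Stand-in drawing routine: just totals the area it was asked to paint. */
void add_area(Rect clip, void *ctx)
{
    *(long *)ctx += (long)clip.w * clip.h;
}
```

The point of the intersection step is that each rectangle in the list describes a visible piece of the window, while the message rectangle describes what is dirty. Only the overlap of the two actually needs painting.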

By comparison, when a Windows UI element gets a WM_PAINT message, telling it to redraw something, the (0,0) position of the coordinate system is, by default, set to the top left corner of the element’s client area, with the scrollbar position already factored in. Plus, the graphics library’s clipping is already set to the dirty area being redrawn.  All your paint function has to do is a straightforward redraw of the window contents. If there are multiple “dirty” areas, it’s no big deal because you get a separate WM_PAINT message for each.

Mixing Objects & Windows

The AES manages the process of drawing menu bars and tracking user interaction, once you give it the address of a menu bar resource. It does it for a resource arranged as a dialog box when you call the form_do function. But if you want to use OBJECTs and resource trees in a regular window, your application is going to have to watch over them and make it work. You can’t call form_do because that would block off access to anything other than the object tree. Likewise if you want a dialog box to have additional functionality beyond what GEM AES normally provides. In either case, your program has to supply the code to capture events, traverse through a tree of OBJECT structures, and figure out how to apply the events to the OBJECTs.

Mostly, you’ll be replicating what GEM AES does, just so you have the ability to change one or two things somewhere. Essentially, your program is going to have to implement the functionality of the form_do function and integrate that with whatever other event processing your window may require. Once developers got sufficiently ambitious that they were trying to do this regularly, Atari released a cleaned-up version of the source to the form_do function to make life easier.
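As a taste of what “traverse through a tree of OBJECT structures” means in practice, here is a minimal hit-testing sketch. The structure below follows the field names of the standard OBJECT structure but uses plain C types, and find_object is an invented name, not an AES function. It ignores object flags such as hidden subtrees, so treat it as an illustration of the tree links rather than a replacement for the real thing.

```c
#define NIL (-1)

/* Simplified stand-in for the AES OBJECT structure. */
typedef struct {
    short ob_next, ob_head, ob_tail;        /* tree links (indices) */
    unsigned short ob_type, ob_flags, ob_state;
    long ob_spec;
    short ob_x, ob_y, ob_width, ob_height;  /* relative to the parent */
} OBJECT;

/* Return the index of the deepest object under (px, py), or NIL.
   'obj' is the root of the subtree to search; (ox, oy) is the screen
   position of that object's parent. */
int find_object(const OBJECT *tree, int obj, int ox, int oy, int px, int py)
{
    int x = ox + tree[obj].ob_x;
    int y = oy + tree[obj].ob_y;
    int child, hit, found;

    if (px < x || px >= x + tree[obj].ob_width ||
        py < y || py >= y + tree[obj].ob_height)
        return NIL;

    found = obj;
    /* Children are chained from ob_head through ob_next. The last
       child's ob_next points back at the parent, which ends the loop. */
    for (child = tree[obj].ob_head;
         child != NIL && child != obj;
         child = tree[child].ob_next) {
        hit = find_object(tree, child, x, y, px, py);
        if (hit != NIL)
            found = hit;    /* later siblings win over earlier ones */
    }
    return found;
}
```

Note that the coordinates of every object are relative to its parent, so even this small routine has to carry a running screen offset down the tree.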

Unlike Windows or other systems, there is no way in GEM for a program to create new types of UI element and drop them into a dialog box or menu bar alongside the predefined ones, mainly because GEM wouldn’t know what to do with an unknown ob_type value. It wouldn’t know how to draw it, or how to handle events for it. If you wanted to manage those details for yourself, then you could provide your own code to do it. Along with the code required to handle all the regular pre-defined object types that might be mixed in there too. Basically your code is all or nothing when it comes to UI elements.

Next Time Around

Our next AES-related article will talk about the scrap library, aka the clipboard. See you then!


After part 7 of this series came out, I got some interesting feedback and a question in particular stood out. Milan Kovac asked how did MiNT handle things differently regarding applications waiting for evnt_multi() to return?

To clarify, he’s referring to MultiTOS, of which MiNT was the core, and how GEM AES behaved differently in that environment.

That question was sort of out of the scope of the original topic, but it got me thinking and I realized it sort of touched on a few other issues with AES we hadn’t talked about yet.  So here we go, and Milan, if you read through the whole thing your question gets answered eventually.

On a side note… when I write these articles, I often have the GEM source code open in a window in the background so that I can make sure I’m not remembering something incorrectly. Once again I’ve noticed how the original GEM source code is very terse and poorly commented. Function names are generally no more than 6 or 7 characters long, even with an underscore taking up a spot somewhere. Names of variables or structure elements are about the same. For example:

EVSPEC mwait(mask)
EVSPEC mask;
    rlr->p_evwait = mask;
    if ( !(mask & rlr->p_evflg) )
        rlr->p_stat |= WAITIN;

Of course, that’s not really that unusual for code written back in those days.  But gosh, it often seems like the GEM source code takes things to extremes. Someone really ought to dump this stuff into a modern IDE and refactor the source code to give things meaningful names.

What is MiNT and MultiTOS?

Just in case anybody doesn’t know, MiNT is a multitasking kernel created by Eric Smith while he was a university student. He was trying to port over some GNU libraries and utility software to the Atari ST computers, and the problem was that the TOS operating system on the Atari was lacking certain functionality required by the code.  At first, he modified the individual GNU programs and libraries as needed, but eventually decided that instead of changing the libraries and programs, it’d be easier overall to create an extension to TOS to add the required functions. MiNT was the result.

Originally the name stood for MiNT Is Not TOS.  It basically hooked into the BIOS and GEMDOS and provided the ability to do preemptive multitasking, among other things.

MiNT caught the attention of the programmers in the TOS development group at Atari Corp., including Allan Pratt, the programmer who maintained the GEMDOS portion of TOS. He was impressed with MiNT and eventually started talking with Smith about incorporating it into a new, preemptive multitasking version of TOS. Smith was later hired by Atari in 1992, and in 1993, after a lot more work on everything, MultiTOS was released.

Now the name stood for MiNT Is Now TOS.

Unfortunately, the release of MultiTOS came only shortly before Atari decided to focus all of its development efforts on the new Jaguar game console, effectively ending the active life cycle of the ST computer series.  But that’s a story for another day.

Multitasking 101

Let’s cover a couple of basic concepts regarding multitasking that we haven’t talked about before, or to refresh your memory briefly.

There are two main types of multitasking, cooperative and preemptive.  Task switching is what we call it when the system stops one program’s execution and starts another one. Task switching back and forth quickly enough makes it look like all the programs are actually running at the same time. And in fact, on a modern computer with multiple processor cores, your programs probably are actually running at the same time.

Vanilla GEM features cooperative multitasking. “Cooperative” means that the system doesn’t automatically switch from one program to the next. Programs have to cooperate by doing some specific operation before task switching occurs. In vanilla GEM, that specific operation is making a call to the AES event library. Every time a GEM application calls an event library function, it may end up waiting for the event to occur, and it may end up waiting for other applications as well.

MiNT features the other flavor, preemptive multitasking.  The main thing that’s different about preemptive multitasking is that the program doesn’t have to do anything special to facilitate task switching.  It can happen at any time, regardless of what the program is doing at the moment. (There are exceptions to that which we’ll ignore for now.)

Under a preemptive multitasking system, each program, also known as a process, has at least one thread, which is what we call a distinct piece of code being executed.  A process may own more than one thread, but it always has at least one.

Preemptive multitasking systems typically operate using a timer-based interrupt.  Each thread is given a certain maximum amount of time to execute before the system stops it and gives control to another thread. Each chance that a thread gets to run is called a “time slice”.

There are a couple of things that can happen to make a thread end its time slice early, or to make the system skip a turn for a particular thread.  For example, a thread can voluntarily end its time slice early.  There are a variety of reasons it might do this, but waiting for an asynchronous task to finish is a common example. Threads can also choose to sleep and wait for a certain amount of time to pass.  This is a bit different from simply ending a time slice, as it also means that the system will skip past that thread for future time slices, until the requested duration has passed.

It’s also possible for a thread to be blocked waiting for a semaphore or MUTEX (mutual exclusion) object.  These are software mechanisms that are used to allow a thread to wait for a certain condition, or to control access to something in the system that can only be safely accessed by one thread at a time.  A good example would be something trying to send data to the printer port.  If you had several programs trying to send output to the printer at the same time, the result would be a lot of wasted time and paper.

The idea of a MUTEX is that a process has to ask for exclusive access to such items before it can use them.  Upon receiving such a request, the system will then do one of the following:

  • Grant access if the requested item is currently available. The item now belongs to the requesting process until it releases it or until that process ends.
  • If the item is already in use by another process, then the system will block the thread until the item is released.  In some cases, you can optionally have the system return an error indicating that the item is not currently available.
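The two outcomes above can be sketched with a tiny ownership table. This models only the “return an error” variant, since actually blocking a thread needs a real scheduler. Every name here (ResourceTable, res_try_acquire and so on) is invented for the example and does not correspond to any MiNT or AES call.

```c
#define NO_OWNER (-1)
#define MAX_RES  4

enum { RES_GRANTED = 0, RES_BUSY = 1, RES_BAD = 2 };

/* One slot per shared item (printer, serial port, ...), holding the
   id of the process that currently owns it. */
typedef struct {
    int owner[MAX_RES];
} ResourceTable;

void res_init(ResourceTable *t)
{
    int i;
    for (i = 0; i < MAX_RES; i++)
        t->owner[i] = NO_OWNER;
}

/* Ask for exclusive access. Granted if the item is free (or already
   ours); otherwise report that it is busy instead of blocking. */
int res_try_acquire(ResourceTable *t, int res, int pid)
{
    if (res < 0 || res >= MAX_RES)
        return RES_BAD;
    if (t->owner[res] != NO_OWNER && t->owner[res] != pid)
        return RES_BUSY;
    t->owner[res] = pid;
    return RES_GRANTED;
}

/* Give the item back. Only the current owner may release it. */
void res_release(ResourceTable *t, int res, int pid)
{
    if (res >= 0 && res < MAX_RES && t->owner[res] == pid)
        t->owner[res] = NO_OWNER;
}

/* Called when a process ends: everything it held becomes free again. */
void res_release_all(ResourceTable *t, int pid)
{
    int i;
    for (i = 0; i < MAX_RES; i++)
        if (t->owner[i] == pid)
            t->owner[i] = NO_OWNER;
}
```

In a blocking version, the RES_BUSY branch would instead put the calling thread to sleep until res_release runs for that item.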

All this means that threads don’t always execute in the same order and frequency. It’s not always A-B-C-A-B-C, etc. It actually gets even more complicated when you consider things like thread priority settings, but that’s another left turn down this tangential road so let’s not.

There are many other important aspects to multitasking, but they are pretty much beyond the scope of this article.

GEM Applications Under Vanilla GEM

Under vanilla GEM, the core of the cooperative multitasking system was contained in the AES event library.  Whenever program “A” calls an event library function like evnt_multi, and there’s no event of the requested type in the queue waiting to be processed, the event library calls a dispatcher function that checks to see if events are waiting for any other GEM applications, and if so, performs a task switch.

That is, incidentally, the purpose of the mwait function shown above as an example of the GEM source code. As simple as it is, that function is where GEM makes the decision to pass control back to the same program, or task switch to another.  It’s called by each of the various public functions of the AES event library, like evnt_multi or evnt_mouse and so forth.

The mask parameter indicates the types of events the application is requesting, and this function compares that against the events that are available.  If nothing is available, it calls the dsptch function, which is responsible for vanilla GEM AES’s cooperative task switching.

If the dsptch function found events waiting for program “B”, which by definition in vanilla GEM would currently be waiting for an event library function to return, then it would perform a task switch to that application so the events could be processed. Eventually, program “B” would make another call to an event library function, and maybe this time program “A” gets control back, or perhaps program “C” is called instead, depending on what’s waiting in the queue of unprocessed events. In this way, all of the applications currently loaded into the system would get a chance to process their events and interact with the user.
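Here is a toy model of that decision. It is not the real dispatcher, which works on the AES’s internal process lists; the App structure, the round-robin order and the next_app name are all assumptions chosen to keep the example short. The only idea taken from the source is comparing the mask of requested events against the events that are available.

```c
typedef unsigned int EventMask;

/* Toy stand-in for per-application state. */
typedef struct {
    EventMask waiting_for;  /* events the app asked the event library for */
    EventMask pending;      /* events currently queued for the app */
} App;

/* Decide who runs when app 'current' calls the event library.
   Returns 'current' if a requested event is already available,
   otherwise the next app (in round-robin order) that has one,
   or -1 if nobody has anything to process yet. */
int next_app(const App *apps, int count, int current)
{
    int i, idx;

    if (apps[current].waiting_for & apps[current].pending)
        return current;             /* no task switch needed */

    for (i = 1; i < count; i++) {
        idx = (current + i) % count;
        if (apps[idx].waiting_for & apps[idx].pending)
            return idx;             /* task switch to this app */
    }
    return -1;                      /* everyone is idle */
}
```

Notice that nothing in this model can take control away from an app that never calls the event library, which is exactly the weakness of the cooperative approach.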

This sort of task switching is essentially the same general process that’s used by preemptive multitasking systems, except that it relies on programs making calls to the AES event library. Note that non-GEM applications couldn’t be included in this setup, since they don’t make calls to the event library. Whenever you ran a non-GEM application, it essentially blocked GEM applications until it exited.

GEM Applications Under MultiTOS (MiNT)

A well-designed GEM application that handles events properly and doesn’t try to draw to parts of the screen that it doesn’t “own” should work fine under MultiTOS.  In fact, programs which occasionally need to suspend event processing while doing something else will arguably work better under MultiTOS, since they will no longer freeze up the whole system.  The program’s own UI will be blocked until it starts making event library calls again, but other programs will continue to operate normally.

But as to how it works…

Quite a lot about GEM AES was changed for MultiTOS, but we’re only going to talk about certain things here.

Under MultiTOS, the MiNT kernel is now responsible for handling task switching between applications, rather than the AES event library.  Each application has at least one thread, including non-GEM applications.  Additionally, the AES maintains its own process that corresponds to the “original” single process in vanilla GEM, which is responsible for managing the user’s interaction with UI elements like the menu bar, window frames, etc.

So, if the event library is no longer doing its own task switching, what happens if program “A” calls the event library to request an event, and the desired event is not available?

Instead of doing its own task switch, AES will tell MiNT “I’m done for now” for the current thread’s time slice, prompting MiNT to perform a task switch.  The AES code is actually shorter and simpler than under vanilla GEM.

On the next time slice for program “A”, the first thing it will do is check again to see if the desired event is available. If not, then it will once again release the time slice. This will repeat until the event becomes available. Thus, programs which are waiting for events use very little CPU time; just enough to see if there are events pending.
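The check-and-release cycle described above boils down to a very small loop. This sketch assumes two hooks, one to poll for pending events and one to give up the rest of the time slice; both are passed in as function pointers so the example can run anywhere. The FakeSystem type and the fake_poll and fake_yield routines are test scaffolding invented here, not anything from MiNT or the AES.

```c
typedef unsigned int EventMask;

typedef EventMask (*PollFn)(void *ctx);   /* which events are pending? */
typedef void (*YieldFn)(void *ctx);       /* give up the time slice */

/* Wait until at least one wanted event is pending. Each time nothing
   is available, release the time slice and check again on the next one.
   Returns the matching events; *yields counts the slices given up. */
EventMask wait_for_events(EventMask wanted, PollFn poll, YieldFn yield,
                          void *ctx, int *yields)
{
    EventMask ready;

    *yields = 0;
    while ((ready = (poll(ctx) & wanted)) == 0) {
        yield(ctx);
        (*yields)++;
    }
    return ready;
}

/* Scaffolding: a pretend system where an event arrives after a
   fixed number of time slices. */
typedef struct {
    int slices_until_event;
    EventMask event;
} FakeSystem;

EventMask fake_poll(void *ctx)
{
    FakeSystem *s = (FakeSystem *)ctx;
    return s->slices_until_event <= 0 ? s->event : 0;
}

void fake_yield(void *ctx)
{
    FakeSystem *s = (FakeSystem *)ctx;
    s->slices_until_event--;
}
```

Each pass through the loop costs one quick check, which is why a waiting program uses so little CPU time.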


We talked about MUTEX items earlier. While it doesn’t use that terminology, GEM AES has always had something that acts as a MUTEX, and it’s something all GEM programmers should know about.  When an application does a window update, the process is wrapped with calls to wind_update. This is intended to block any other application from starting a window update while another one is already happening.  It’s intended to provide exclusive access to the screen to a single application at a time.

To accomplish this, the original vanilla GEM code for wind_update ties into the event library.  It adds a special “mutex released” item to the list of requested events so that ending the update has to occur before another application can be called.

Under vanilla GEM, the wind_update function didn’t actually check to see if an application had locked down the screen.  It relied solely on the mutex event to block other applications from being able to do anything, since they wouldn’t be running until AES had events for them to process. However, under MultiTOS, another application might not have been waiting for an event to occur.  In that case, the application will keep on doing whatever it was doing. Unfortunately, this could eventually include a window refresh, so under MultiTOS, the wind_update call gets significantly more complicated than it was under vanilla.
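From the application’s side, the pattern looks something like the sketch below. BEG_UPDATE and END_UPDATE are the usual mode values for wind_update, but the wind_update defined here is only a mock that counts nesting so the example compiles without GEM; the real call may block or task switch. The locked_redraw and check_locked names are invented for the example.

```c
#define END_UPDATE 0
#define BEG_UPDATE 1

/* Mock of the AES call, for illustration only. It tracks how many
   updates are in progress and never blocks. */
static int update_depth = 0;

int wind_update(int mode)
{
    if (mode == BEG_UPDATE)
        update_depth++;
    else if (mode == END_UPDATE && update_depth > 0)
        update_depth--;
    return 1;
}

typedef void (*PaintFn)(void *ctx);

/* Run the application's drawing code with the screen locked, and
   always release the lock afterwards. */
void locked_redraw(PaintFn paint, void *ctx)
{
    wind_update(BEG_UPDATE);
    paint(ctx);
    wind_update(END_UPDATE);
}

/* Example paint routine: records whether the lock was held. */
void check_locked(void *ctx)
{
    *(int *)ctx = (update_depth > 0);
}
```

The important habit is that every BEG_UPDATE is paired with an END_UPDATE, because a program that forgets the second call leaves every other application locked out of the screen.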

Don’t Cross The Beams!  Whoops!

Finally, we’ve come around to another flaw in vanilla GEM AES.  From day one, GEM was supposed to be a multitasking system, but other than using the wind_update function to manage, somewhat imperfectly, screen output, it didn’t include any sort of a general purpose MUTEX or semaphore library so that applications could avoid stepping on each other when they all wanted to use the same resources at the same time.

It always amazed me that this was never revealed to be the big problem it had the potential to be.  I guess users were really just so used to interacting with just one program at a time that it rarely came up. But consider how many things in the system could fail if more than one application wanted access.  Just to name a few:

  • Serial ports.  What if you had a FAX program and a telecommunications terminal program going at the same time?  One as the main application, the other as a desk accessory?  Until the Mega STE and TT030 machines, there was only one serial port so this would have definitely resulted in a conflict as both programs tried to access the same port and modem at the same time.
  • Printer port.  Two programs trying to print something at the same time could step on each other unless both were doing it through GDOS.  Under GDOS,  once the printer workstation is opened by one application, any attempts by other applications to open it will fail.  Unfortunately, the application won’t have any idea why it failed because VDI doesn’t return error codes.
  • MIDI ports.  Basically the same problem as the serial ports, except with different kinds of program.
  • Sound.  Sound on the ST computers was mainly done by banging on the sound chip, either directly on the hardware registers, or via the XBIOS DoSound call. Either way, two programs trying to do this at the same time would produce some interesting results that would hurt your ears.

When it came to these items and other similar system resources, AES basically relied on the idea that a program would start using the item and finish with it between one event library call and the next, when no other programs could be called and start trying to use the same resource.

That sounds pretty risky, but actually it more or less worked most of the time.

MultiTOS and Mutex

When MultiTOS came out, MiNT added the basic capability needed to create mutex objects, but except for defining a couple of specific hardware resources like the SCSI or ACSI ports, which were used by the system itself, there were no preset definitions for anything that applications could rely on.

Now that it had the low-level functionality to do the job, you would think that someone would have added some functions to GEM AES to do basic application-level resource management.

You’d think that, but you’d be wrong.  AES continued to ignore the problem under MultiTOS.


Don’t get me wrong… I loved MultiTOS when it finally got to be more or less stable, and I used it on a daily basis long before it even got to that point.

Of course, at the time, my machine at the office was a TT030 with maxed-out RAM, and a big 320 MB hard drive.  And it was reasonably usable on the Falcon030 too.  What about on the older machines running at 8 MHz?  Even with the max 4 MB of RAM, I avoided ever really using that setup. So I really couldn’t tell you how badly it sucked.  I was just pretty sure it did.

And by the way, 320 MB was big for a hard drive back then.  Honest.  But even so, even with a relatively nice system like what I was using, we all knew how easy it really was to do something that just plain wouldn’t work.

Maybe if Atari had kept going with development on the ST series, some of those issues would have gotten fixed.  We weren’t unaware of them, in many cases, but there was only so much we could do with the manpower and time available.  And then, of course, the Jaguar came along and we all shifted gears to focus on it.

It’s really kind of ironic, because the last two or three years worth of TOS development had seen far more improvements and new functionality added to the system than the previous six years had.