Way back in 1985, I started my “professional” career as a software guy as a developer for the brand new Atari ST computer.  After a few years as a 3rd party developer, I was hired by Atari to provide developer support to ST developers in the USA. 

Part of what made me a good choice for that role was that I had a really good in-depth understanding of GEM.   For example, when I worked on the WordUp word processor for Neocept, I wrote more than a dozen GDOS printer drivers for various printers, including color, that Atari’s drivers didn’t support.  Quite a lot of that information is still burned deep into my brain, even though it’s been many years since I actually wrote any code for the Atari.

These days, when something reminds me of GEM, the things that mainly come to mind are the problems, glitches, and workarounds.  This article is going to be mainly about the design flaws in GEM, the workarounds for them, and how they impacted development.

GEM – The Origins

In the mid 80’s, just as computers were starting to break out of their character-based screens into more graphically oriented environments, Digital Research came out with GEM, or the Graphics Environment Manager.  The idea was to offer a graphic-based environment for applications that could compete with the brand new Macintosh computer, and Microsoft’s new Windows product.

GEM started life in the late 70’s and early 80’s as the GSX graphics library.  This was a library that could run on different platforms and provide a common API for applications to use, regardless of the underlying graphics hardware.  This was a pretty big deal at the time, since the standard for graphics programming was to write directly to the video card’s registers.  And since every video card did things a little differently, it often meant that a given application would only support one or two specific video cards.  The GSX library would later become the basis of the VDI portion of GEM, responsible for graphics device management and rendering.

GEM was basically a marriage of two separate APIs.  The VDI (Virtual Device Interface) was responsible for all interaction with graphics devices of any sort, while the AES (Application Environment Services) was responsible for creating and managing windows, menu bars, dialog boxes, and all the other basic GUI components that an application might use.

GEM was first demoed on an IBM PC with an 8086 processor, running on top of MS-DOS.  However, various references in the documentation to the Motorola 68000 processor and integration with their own CP/M-68K operating system as the host make it seem clear that DR intended GEM to be available for multiple processors at a relatively early stage of development.

Ironically, the PC version of GEM never really took off.  Other than being bundled as a runtime for Ventura Publisher, there were never any major applications written for the PC version.  Overall, it was the Atari ST series where GEM found its real home.

Overview of GEM VDI

In case you never programmed anything for GEM VDI, let me give you a brief overview of how it worked.  The first thing you do in order to use a device is open a workstation, which returns a variety of information about the device’s capabilities.  Once the workstation is open, another API call will give you additional information about the device’s capabilities.  With an open workstation, you can execute the appropriate VDI calls to draw graphics onto the device’s raster area.
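
As a rough illustration, here’s what opening a printer workstation and asking about its capabilities might look like with the standard C bindings (a minimal sketch; the variable names are mine, and the usual work_in defaults are assumed):

short work_in[11];          /* device ID plus default attribute settings */
short work_out[57];         /* filled in with device capabilities */
short handle, i;

for( i = 1; i < 10; i++ )
    work_in[i] = 1;         /* default line, marker, fill, and text attributes */
work_in[10] = 2;            /* use raster coordinates */
work_in[0] = 21;            /* device ID, e.g. 21 for the printer */

v_opnwk( work_in, &handle, work_out );   /* open the physical workstation */
vq_extnd( handle, 1, work_out );         /* extended inquire: more capability info */
/* ...draw with the usual VDI calls, then... */
v_clswk( handle );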

Most devices aren’t meant to be shared, so you can only have one workstation open at a time.  However, in order to support multitasking with multiple GEM applications and desk accessories running together, you need to be able to share the display.  Therefore, the VDI supports the notion of opening a “virtual” workstation, which is basically a drawing context for the underlying physical workstation.

GEM VDI Design Issues

The VDI has a number of huge design flaws that are easily recognized today.  I’m generally not talking about missing features, either.  I’m sure we could come up with a long list of things that might have been added to the VDI given enough time and resources.  I’m talking about flaws in the intended functionality.  Many of these issues were a common cause of complaint from day one.

Also, let me be clear about this: when I suggest some fix to one of these flaws, I’m not saying someone should find the sources and do it now.  I’m saying it should have been done back in 1983 or 1984 when Digital Research was creating GEM in the first place.  Any of these flaws should have been noticeable at the time…  most of them are simply a matter of short-sightedness.

No Device Enumeration

Until the release of FSMGDOS in 1991, 6 years after the ST’s initial release, there was no mechanism for an application to find out what GEM devices were available, other than going through the process of attempting to open each possible device number and seeing what happened.  This was slow and inefficient, but the real problem underneath it all is a bit more subtle.  Even once FSMGDOS hit the scene, the new vqt_devinfo() function still required you to test every possible device ID.

The fix here would have been simple.  There should have been a VDI call that enumerated available devices.  Something like this:

typedef struct
{
    /* defined in VDI.H - various bits of device info */
} VDIDeviceInfo;

VDIDeviceInfo deviceinfo[100];
int numdevices = 0;
int dev_id = 0;

/* keep asking for the next device after dev_id until there are no more */
while( (dev_id = vq_device( dev_id, &deviceinfo[numdevices] )) != 0 )
    numdevices++;

The idea here is that the vq_device() function would return information about the next available device with a number higher than the dev_id parameter passed into it.   So if you pass in zero, it gives you info on device #1 and returns 1 as a result.  When it returns zero, you’ve reached the end of the list.

Device ID Assignments

Related to the basic problem of device enumeration is the way device IDs were handled overall.  GEM graphics devices were managed by a configuration text file named assign.sys that lived in the root directory of your boot volume.  The file would look something like this:

PATH=C:\SYS\GDOS
01 screen.sys
scrfont1.fnt
21 slm.sys
font1.fnt
font2.fnt
font3.fnt

The first line specifies the path where device driver files and device-specific bitmapped fonts were located.  The rest of the file specifies the available devices and the fonts that go with them.  For example, device 21 is the “slm.sys” driver, and “font1.fnt”, “font2.fnt” and “font3.fnt” are bitmapped font files for that device.

The device ID number is not completely arbitrary.  There are different ranges of values for different device types.  For example, devices 1-10 were considered to be screen devices, 11-20 were pen plotters, 21-30 were printers, and so forth.  Oddly complicating things in a few places was Digital Research’s decision to mix input devices like touch tablets in with output devices like screens and printers.

The way device IDs worked was mainly a contributing factor in other situations, rather than a problem in its own right.  For example, because there was no easy way to enumerate available devices, many applications simply made the assumption that the printer was always going to be device 21 and that the metafile driver was device 31.  And in most cases, that’s all they would support.

The bigger problem, however, was that while the specific ID assigned to most devices was essentially arbitrary, the ID for the display screen was anything but.

Getting The Screen Device ID

Remember earlier when I explained how applications would open a “virtual” workstation for the screen?  Well, in order to do that, you have to know the handle of the physical workstation.  That’s something you get from the GEM AES function graf_handle().  One would think, since the physical workstation is already open, that you shouldn’t need to tell VDI the device ID, right?  Wrong.  Even though the physical workstation for the screen device is already opened by the GEM AES, you still need to pass the device ID number as one of the parameters when you open a virtual workstation.  So how do you get the device ID for the screen device that’s already open?  Well, there really isn’t a good answer to that question, and therein lies the chocolaty center of this gooey mess. 

On the Atari, the recommended method was to call the XBIOS function Getrez() and add 2 to the returned value.  The first problem with this idea is that there is no direct correlation between that value and anything like the screen resolution or number of colors available.  And even if there were some correlation, there are far more screen modes than you can fit into the device ID range of 1-10.

Furthermore, this method only really worked for the video modes supported by the built-in hardware.  Add-on video cards not only needed a driver, they also needed to install a patch so that Getrez() returned the desired value when their video modes were in use.
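
For reference, here’s the typical idiom as it was written back then, using the standard AES and VDI bindings (a sketch; the variable names are mine):

short work_in[11], work_out[57];
short wcell, hcell, wbox, hbox;
short handle, i;

handle = graf_handle( &wcell, &hcell, &wbox, &hbox );   /* handle of the AES’s physical screen workstation */

for( i = 1; i < 10; i++ )
    work_in[i] = 1;             /* default attributes */
work_in[10] = 2;                /* raster coordinates */
work_in[0] = Getrez() + 2;      /* the “recommended” guess at the screen device ID */

v_opnvwk( work_in, &handle, work_out );   /* on return, handle refers to our virtual workstation */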

This pissed me off then, in large part because developers didn’t universally follow the recommended method, and their code broke when Atari or third parties introduced new hardware.  In fact, the very first article that I wrote for the ATARI.RSC Developer newsletter after I started at Atari was about this very subject. 

Looking back, the thing that pisses me off the most about this is the fact that I can think of at least three really easy fixes.  Any one of them would have avoided the situation, but all three are things that probably should have been part of GEM from day one.

The first, and most obvious, is that opening a virtual workstation shouldn’t require a device ID as part of the input.  The VDI should be able to figure it out from the physical workstation handle.  Seriously… what’s the point?  The device is already open!

Another option would have been adding a single line of code to the GEM AES function graf_handle() to make it also return the device ID number, rather than just the handle of the physical workstation.  If you’re going to insist on passing it as a parameter to open a virtual workstation, this is what makes sense.  After all, this function’s whole purpose is to provide you with information about the physical workstation!

Lastly, and independent of the other two ideas, there probably should have been a VDI function that would accept a workstation handle as a parameter and return information about the corresponding physical workstation, including the device ID.  This arguably comes under the heading of “new” features, but I prefer to think that it’s an essential yet “missing” feature.

Palette-Based Graphics

Perhaps the biggest flaws in GEM VDI stem from the fact that the VDI is built around the idea of a palette-based raster area.  This is where each “pixel” of the raster is an index into a table containing the actual color values that are shown.  Moreover, it’s not even a generic bit-packed raster.  The native bitmap format understood by GEM VDI is actually the same multiple-bitplane format that most VGA video cards used. 
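
To make the distinction concrete, here’s an illustrative sketch (not VDI code, just the idea) of what it takes to read one pixel from an interleaved 4-plane raster like the ST’s low-resolution screen, versus a packed-pixel raster where the stored value simply is the color:

unsigned short planar_pixel( unsigned short *line, int x )
{
    int word = (x >> 4) * 4;        /* four interleaved plane words per 16 pixels */
    int bit = 15 - (x & 15);        /* leftmost pixel lives in bit 15 */
    int plane;
    unsigned short value = 0;

    for( plane = 0; plane < 4; plane++ )
        value |= ((line[word + plane] >> bit) & 1) << plane;

    return value;                   /* an index into the 16-entry palette */
}

unsigned char chunky_pixel( unsigned char *line, int x )
{
    return line[x];                 /* the pixel value directly represents the color */
}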

Considering that the goal of the VDI was to create an abstract, virtual graphics device that could be mirrored onto an arbitrary actual piece of hardware, this is hard to forgive.

At the very least, the VDI should have acknowledged the idea of raster formats where the pixel value directly represents the color being displayed.  I’ve often wondered if this failure represents short-sightedness or a lack of development resources.

One might make the argument that “true color” video cards were still a few years away from common usage, and that’s undoubtedly part of the original thinking, but the problem is that this affects more than just the display screen.  Many other devices don’t use palette-based graphics.  For example, most color printers that were available back then had a selection of fixed, unchangeable colors.

Inefficient Device Attribute Management

Quite a lot of the VDI library consists of functions to set attributes like line thickness, line color, pattern, fill style, fill color, etc.  There’s an equally impressive list of functions whose purpose is to retrieve the current state of these attributes.

For the most part, these attributes are set one at a time.  That is, to set up the attributes for drawing a red box with a green hatched fill pattern, you have to do the following:

vsl_type( screenhandle, 1 );            // set solid line style
vsl_width( screenhandle, 3 );           // set line thickness of 3 pixels
vsl_color( screenhandle, linecolor );   // set line color (red)
vsf_color( screenhandle, fillcolor );   // set fill color (green)
vsf_interior( screenhandle, 3 );        // set fill interior type: hatch
vsf_style( screenhandle, 3 );           // set hatch pattern index

By the way, we’re making the assumption here that the linecolor and fillcolor variables have already been set to values that represent red and green colors in the current palette.  That’s not necessarily a trivial assumption but let’s keep this example modest.

At first glance you might say, “Well, six lines of code… I see how it could be improved, but that’s really not that terrible.”

It really is… if you know how GEM VDI calls work, you’ll recognize how it’s horribly, horribly bad in a way that makes you want to kill small animals if you think about it too much.  Each one of those functions is ultimately doing nothing more than storing a single 16-bit value into a table, but there’s so much overhead involved in making even a simple VDI function call that it takes a few hundred cycles of processor time for each of these calls.

First, the C compiler has to push the parameters onto the stack and call the function binding.  The binding reads the parameters off the stack and saves them into the GEM VDI parameter arrays.  Then it loads up the address of the parameter block and executes the 68000 processor’s trap #2 instruction.  That involves a context switch from user mode to supervisor mode, meaning that processor registers and flags have to be saved on entry and restored on exit.  From there, GEM picks up the parameters, grabs the appropriate function pointer out of a table, and passes control to that function.  At that point, the very, very special 16-bit value we cared about in the first place is lovingly deposited into the appropriate slot within the table that the VDI has allocated for that particular workstation handle.  Then the function exits and starts making its way back up to your code, with much saving and restoring of 32-bit registers along the way.  Those are uncached reads and writes on most ST systems, by the way.
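
To see where the cycles go, here’s roughly what the binding for one of those attribute calls looks like under the hood (a sketch; array sizes vary, and vdi() stands in for however your compiler’s bindings load d0/d1 and issue the trap):

short contrl[12], intin[128], ptsin[128], intout[128], ptsout[128];

void *vdi_params[5] = { contrl, intin, ptsin, intout, ptsout };   /* the VDI parameter block */

void vsl_color( short handle, short color )
{
    contrl[0] = 17;         /* VDI opcode for vsl_color */
    contrl[1] = 0;          /* number of coordinate pairs in ptsin */
    contrl[3] = 1;          /* number of values in intin */
    contrl[6] = handle;
    intin[0]  = color;
    vdi( vdi_params );      /* loads d0/d1 and executes trap #2 */
}

All of that machinery, plus the trip through the trap handler and the dispatch table on the other side, just to store one 16-bit value.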

The bottom line is that for things like this, GEM was simply horribly inefficient.  And the really bizarre part is that it could have been quite easily avoided.

The way that 68000-based programs make GEM VDI calls is to load a magic code into the 68000’s d0 register, and the address of the VDI parameter block in the 68000’s d1 register, and then make a trap #2 call.  The parameter block is simply a list of pointers to the 5 arrays that GEM VDI uses to pass information back and forth with the application.  My idea is simply to add another pointer to the VDI parameter block, pointing to a structure that maintains all of the current drawing attributes of the workstation, including the handle and the device ID.

Suppose that opening a physical workstation (for device #21 in this example) looked something like this:

int v_opnwk( int devID, VDIWorkstation *dev, VDIContext *context );

VDIWorkstation printerDevice;
int handle = v_opnwk( 21, &printerDevice, v_getcontext(0L) );

Opening a virtual workstation is similar, except that we specify the handle for the open physical workstation instead of the device ID:

int v_opnvwk( int physHandle, VDIWorkstation *dev, VDIContext *context );

VDIWorkstation screenDevice;
int handle = v_opnvwk( physHandle, &screenDevice, v_getcontext(0L) );

Thereafter, VDI calls look much the same, except that instead of passing the handle of your workstation as a parameter, you pass a pointer to the desired VDIWorkstation structure:

v_ellipse( &screenDevice, x, y, xrad, yrad );

instead of:

v_ellipse( handle, x, y, xrad, yrad );

The VDIWorkstation structure would look something like this:

typedef struct {
         VDIContext *ws;
         int *control;
         int *intin;
         int *ptsin;
         int *intout;
         int *ptsout;
} VDIWorkstation;

typedef struct {
         int contextSize;
         int handle;
         int deviceID;
         int lineType;
         int lineWidth;
         int lineColor;
     /* other various attribute fields listed here */
} VDIContext;

The heavy lifting is really done by the addition of the VDIContext structure.  The first member would be a size field so the structure could be extended as needed.  And a new function called v_getcontext() would be used to allocate and initialize a context structure that resides in the application’s memory space.

With this setup, you would be able to change simple things like drawing attributes by direct manipulation of that context structure.  Let’s return to the example of setting up the attributes to draw a red rectangle with green hatch fill pattern.  Instead of the lines of code we saw earlier, we could instead have something like this:

screenDevice.ws->lineType = 1;          // set solid line style
screenDevice.ws->lineWidth = 3;         // set line thickness of 3 pixels
screenDevice.ws->lineColor = linecolor;
screenDevice.ws->fillColor = fillcolor;
screenDevice.ws->fillInterior = 3;      // hatch fill
screenDevice.ws->fillStyle = 3;         // hatch pattern index

This requires no function calls, no 68000 trap #2 call, no pushing or popping a ton of registers onto and off of the stack.  This entire block of code would take fewer cycles than just one line of code from the first example, by a pretty big margin.

The one thing that this does impact is the creation of metafiles, since attribute setting would no longer generate entries in the output file.  But that is easily solved by creating a new function, let’s call it vm_updatecontext(), which would simply take all the parameters from the context structure and output them to the metafile all at once.

These are relatively simple changes from an implementation standpoint, but they would have had a significant impact on the performance of GEM on the 68000, and I suspect the difference would be comparable on the 808x processors as well.

More coming in part 2

In part 2 of this, written whenever I get around to it, I’ll talk more about the VDI including more stuff about true color support, and outline font support — too little, too late?

July 1st, 2009 by Mike Fulton
Categories: PlayStation, Tech

Recently, I was asked by David at the online PlayStation Museum website to share some of my memories about my time at Sony Computer Entertainment America doing developer support for the PlayStation. After we were done, it occurred to me that it would make an interesting post here as well.  This was originally a series of back-and-forth email messages, but I’ve edited it all into a single interview.

David: As director of the PlayStation Museum, I am researching and documenting the history of the PlayStation. I intend to include a chapter on how easy it was to program for the PlayStation compared to other competing platforms and how SCEA offered developers superb support. Can you tell me what exactly your roles and responsibilities were while at SCEA? I will be able to ask proper questions if I know that. I do know that you assisted developers with programming questions because your replies were recorded in the BBS archives.

Mike Fulton: I was the “Senior Development Support Engineer” for the PlayStation. Aside from things like answering developers’ questions on phone calls and online, I also contributed to the sample code libraries and was the primary editor when we did our first big documentation revision. I was also the development tools guru and a major proponent of coming up with GUI-based tools instead of running command line tools in a DOS box.

Mainly, though, I was the guy that helped developers with code optimization. To many I was known as “Mr. Program Analyzer”.  This nickname refers to a device called the “program analyzer” which was essentially a PS with a logic analyzer and signal tracer/recorder glued on top of it and with everything mounted together in a PC-style case.

Some years later, the whole thing was redone as a plug-in PC card and made more widely available but this was right after the initial launch of the system, and at the time it was all hand built and there were just 5 of these devices in existence. I once heard that they cost about $100k each to create. We had 2 of them at SCEA. One sat in my cubicle, and the other sat in a special office we had setup just for hosting visiting developers who brought in a project to be analyzed.

I started at SCEA in early 1996, just a few months after the PlayStation was first launched. My first month or so was basically an immersion into learning the development kit and everything else I could. Then I was introduced to the program analyzer and told to become its master. To that end, I underwent training with the SCEI engineer who wrote the software, and beyond that I just spent hours with the machine until it all made sense.

The performance analyzer came with a Windows application that allowed you to capture a recording of up to about 7/60ths of a second that included all of the important signals in the machine. You would run the game and press a button wired up to the analyzer when you got to the point where you wanted to record.

By analyzing the recording, you could determine when memory was being accessed (indicating a cache miss), when the GPU was active, when information was being sent to the sound chip, etc. This allowed us to make determinations such as: “The game is running at 20 fps, but just barely; there’s only a tiny amount of idle time before the GPU finishes processing and the next vertical blank. It would take a lot more optimization to hit 30 fps, but the main worry is that if that code runs even a little bit longer, the frame rate will drop to 15 fps.”  Or, we might say, “The game is running at 20 fps, but that’s because the GPU is finishing processing the frame right AFTER the vertical blank and it has to wait for the next one to page flip. In other words, a little code optimization could bump the frame rate up to 30 frames per second.”

And then, having determined that, we’d look at other parts of the analysis and look for certain patterns. If there seemed to be an unusually high number of cache misses, we could look at the memory being accessed and compare it to a symbols listing from the program.  This would let us figure out things like function A and function B are both called all of the time, but because of their relative positions in memory, they’re always bumping each other out of the CPU cache.  Rearranging the order of functions in your source code and the order in which object files were linked was a common optimization for the PS.

Because the first analyzers were hand-built and fragile, they didn’t leave our HQ. So that meant developers came to me. Typically the way it worked was that they would send me a build of their project along with instructions for how to play the game up to the point where analysis was desired, and then I would use the machine and generate a report which I’d send back to them. Then a week or two later, in many cases, the developers themselves would come to visit me in the office and we’d spend a day or two doing additional runs on the analyzer and going over their code. Sometimes they’d make changes after each run and we’d go back and forth with new versions.

Ultimately, the trick was to correctly interpret what all of this information meant and turn that into a plan for what to change in your project’s source code.

Case in point: Tomb Raider for the PS. The *ORIGINAL* game, that is, was running at about 5 fps when the developers brought it in for analysis. It was essentially unplayable, and the developers were beginning to get worried that it wasn’t going to get any better. Keep in mind that this is still just a few months after the machine first launched. Many developers were still working on their first title and didn’t really know what sort of performance they could expect from the machine. After doing our analysis, a few simple optimizations brought the frame rate up to 20 fps.

David: I did not realize that Sony had developed the program analyzer tool back in 1996 and the story about Tomb Raider was very interesting. Do you have pictures of the old hand-built analyzer? I agree with you that a gui was much better than using a command line. I was never able to get the later software versions to run on the H2000 so I was forced to use command line.

Mike Fulton: I do have a few old pix from Sony days but I’ll have to look around for them and see what’s there.  As I answer these questions, keep in mind that it’s been 10-12 years so I may not remember all the details of some things perfectly. 

David:  How was support given to developers? Was it mainly through a forum/BBS on a secure website, telephone support and face-to-face meetings? (Would developers ftp you their code?)

Mike Fulton: Mainly through telephone and face to face, or via direct EMAIL, I’d say. My recollection is that the BBS was kind of underutilized by developers.

If I remember correctly… the BBS was an older style direct-dial system, not a web-based message forum. I’m hazy about when that transition might have occurred. Unfortunately, the whole concept of calling into a BBS had largely been replaced by the web by that time, but we didn’t have much of a REAL web presence for developers at that point… our developer web pages were more about marketing to developers than providing technical information and help. We (the developer support group) wanted to expand our web presence, but the web stuff at that point was handled by Sony corporate and we didn’t have direct control over the developer web pages.

David:  What would you say were the benefits for developers using a forum/BBS? (I assume most questions and answers were posted for all to see and learn from as well as respond to)

Mike Fulton: As I said above, I think the BBS was underutilized. Most game developers were very proprietary about things like their source code and their methodology. For the most part, they barely wanted to share technical details of their projects with us, let alone with each other. It’s ironic, because I saw source code from dozens of different games, but there wasn’t one bit of it that ever really stood out for any reason. I mean, maybe when you look at the big 60,000 lines of source code picture, things might be impressive in one way or another, but when you’re bouncing back and forth in a few hundred lines of code trying to fix a bug or optimize something, it’s kind of hard to see.

David:  How do you think developer support from SCEA differed from Sega, Nintendo, or Atari? What are some possible reasons that developers preferred to program for the PlayStation rather than the Saturn or N64?

Mike Fulton: Developer support at Atari was me, actually. I was the developer support manager for the Jaguar and the Atari ST computers before that. For the Jaguar, my main purview was the tools and sample code. We also had Normen Kowalewski and Scott Sanders providing support.

I can’t say I really had any direct experience of Sega or Nintendo’s programs, so I can’t answer to that. But as to Atari vs. Sony, I have to say that the differences in support were greatly colored by the fact that from 1993-1995, console development underwent a major paradigm shift. It started with the Atari Jaguar, and ended with the PlayStation.

The Jaguar was the first console where 3D graphics were more than an occasional novelty. Prior to the Jaguar, console games were almost all sprites and tiles and had been for a decade or more. Development cycles for the Super Nintendo or Sega Genesis averaged 6 months. But once 3D graphics entered the picture, everything changed. For one thing, the industry had to create a whole new generation of tools for generating 3D artwork and animation, and developers and artists had to learn to use those tools. That all had a big impact on the length of a project.

Also, while the Jaguar had the first real hint of 3D acceleration, the two 64-bit processors weren’t really C-language friendly, so it took a master-class assembly programmer to get good performance out of the system. So a lot of projects included a lot of time for code optimization.

As for 3D processing, the Jaguar GPU was great at doing 3D vector math, but you had to basically hand-feed the blitter individual scan line segments of each polygon at a time to do rendering. Projects involving 3D graphics had to devote at least a couple of months to the development of a good 3D rendering pipeline, to say nothing of the 3D game engine required to feed that pipeline polygons to draw.

Then came the PlayStation. The PlayStation was the first console where developers were expected to use supplied libraries rather than programming for the bare metal. Jaguar programming was definitely bare-metal, but PlayStation programming was anything but.

There’s been a lot of debate about the motivation for Sony making developers use libraries, but to me the reason is clear. Without the Sony-supplied libraries for accessing the hardware, game projects would have taken 6 months longer or more. The PlayStation is the machine that completed the 3D acceleration equation. Like the Jaguar, it had a co-processor that handled 3D vector math quite efficiently. However, unlike the Jaguar, the PlayStation had a DMA-driven graphics primitive renderer (the GPU). Set up a display list of primitives and set the GPU off and running, and the CPU could do other things while the graphics were being rendered asynchronously.

Because of the asynchronous nature of the way the PlayStation’s hardware worked, once you pointed the hardware at a primitive list, it processed it as fast as possible. This made it possible to get good performance without going through quite so many hoops as with previous console hardware. Combined with the relatively fast (for the time) CPU, it made it generally unnecessary for most projects to worry so much about extreme code optimization. Console development had largely been assembly-language based up to that point, but with the PlayStation it was possible to do entire projects in C or C++ without writing any assembly language code at all.

Thus, the PlayStation finished the paradigm shift that the Jaguar had started: to games with 3D graphics and the corresponding tools for artwork and animation, it added asynchronous processing, the use of code libraries, and the shift to higher-level languages for development.

David: I didn’t realize that you WERE Atari Jaguar support. Seems like a few ex-Atari employees joined Sony (Rehbock, Jessop, J Patton…). Did you know that Ken Kutaragi was given a private demonstration of a prototype Jaguar at a CES (may have been 1992?). I find the story of how he was shown the Jag interesting.

Mike Fulton: I think the Jaguar was first shown behind closed doors at the Jan. 1993 CES, but to be perfectly honest I’m not that sure about the date. Sam Tramiel was probably there, the demo itself would probably have been done by his brother Leonard. Or maybe by Bill Rehbock. Bill is who I heard the story from, but I don’t remember the exact details. I’ll try to remember to ask him about it and see what HE remembers.

David:  Did you hear any complaints about developer support (either SCEA or SCEI)?

Mike Fulton: There were complaints about policies having to do with what information we could release, and some of that may have been directed at developer support even though we weren’t directly responsible. I’m sure there were grumblings from time to time but I can’t think of any particularly big issue.

It’s worth pointing out that we at SCEA didn’t really interact that much with the developer support engineers at SCEI. When we needed information, we went directly to the hardware engineers and library guys for the most part. (Although, to be fair, the line between development and developer support at SCEI was occasionally a bit blurred. That is no doubt the source of some of the things you’ve heard.)

David:  Do you believe the libraries limited some developers?

Mike Fulton: If you want to say “some developers”, then sure. I’m sure you could find a couple that might have gotten better results from going to the bare metal. But those guys are really the exception, not the rule. For every developer who might have had the time and resources to develop their own codebase for the hardware, you’d find another 10 developers who would get much more accomplished by using the supplied libraries. I think the libraries enabled developers to get decent games together much more quickly than they otherwise might have done. For the most part, anyway. The one big exception was when they finally decided to release the information for directly programming the GTE co-processor. But a lot of that was just explicitly documenting what many developers had already figured out anyway.

David:  Were there times when Japanese developers were receiving technical help/information from SCEI that may not have been provided to SCEA in order to better help US developers?

Mike Fulton: The question was raised a few times, but nobody could ever provide an example of which developer and what information were involved.

There were times that I felt like SCEI didn’t give SCEA any MORE information than they might give a developer, but I never felt they gave us any LESS information about anything. I certainly never heard of any specific instance of a developer getting information we didn’t get.

There were perhaps a few things they wouldn’t tell you until you’d actually asked about it, but they weren’t really trying to keep it a secret.

David:  Why wouldn’t Sony allow developers to write code ‘to the metal’? Did developers complain about using libs when they thought they could write more efficient or faster code? I heard two reasons: to ensure future compatibility and also because Sony wanted to protect their architecture and didn’t want people poking around. I’m not sure how true the latter is.

Mike Fulton: I’m convinced that the purpose behind the libraries was simply to get games out the door months earlier than would otherwise be the case. They certainly didn’t provide any great barrier to people “poking around”. Keep in mind that the libraries weren’t encrypted or anything… any developer could extract all of the object modules from any library and disassemble them using the tools provided to them by Sony. And many developers did just that. In fact, so many developers had reverse engineered the way that the GTE coprocessor worked that Sony eventually decided to simply add the low-level info to the documentation.

I don’t think future compatibility was really a big factor regarding library development. Library code is still code. The only way to future proof something would be for all of the actual hardware access to be strictly limited to ROM-based code and use fixed ROM vectors for everything. That was certainly not the case with the PS libraries.

David:  Did SCEA dev support have a level of support goal? For example, did they monitor how long it took for a developer to get a response to a question/problem?

Mike Fulton: We didn’t have anything formal in place, but we did have weekly meetings where all of the support engineers would give updates on what issues and what developers they’d been dealing with. I don’t think we ever really had any major issues with problems going unsolved for any great length of time except for those issues where we were unable to come up with answers ourselves and then would end up waiting on information from SCEI.

David:  Did you give support to the Net Yaroze as well? I assume SCEA devoted very little time to assist with Yaroze, after all developers were spending thousands of dollars and should rightfully get support.

Mike Fulton: Yes, I did some work on Net Yaroze. Pre-launch I did some of the sample code we provided, and thereafter we did some direct support – mainly answering questions online. The latter was never really a requirement… it was more like, if you saw an interesting question and felt like responding, you did. It wasn’t a daily thing by any means.

David: I asked John Phua why SCEA released the first analog controller without a vibration function and he couldn’t remember. Do you know why it was released WITHOUT the vibration in the US? Do you think it had to do with SCEI pulling the controller off the market (and why did they do that)?

Mike Fulton: As I understand it, the original version of the vibration controller turned out to have some patent conflict that needed to be resolved. It was never actually sold at retail in the US, and I don’t think it was on shelves for very long in Japan.

David: What was the most common problem that developers had?

Mike Fulton: Three things come to mind. First, I would just say code optimization, in general.

Second, the way the PlayStation’s 3D setup worked, you wanted to avoid having large polygons with significant differences in Z-value (depth from viewer position) from vertex to vertex, because the texture sampling did not do pixel-by-pixel scaling. Such polygons would result in odd texture distortion. This was quite common on things like terrain close to the camera. The recommended method around this was to subdivide your polygons. This would allow the GPU to scale the texture more appropriately for each resulting subsection.

Polygon subdivision wasn’t really especially hard, but it was a pain in the butt and did represent a potential performance hit since you had to spend CPU cycles doing it and then spend more GPU cycles on a longer list of primitives. As a result, developers often put it off until the last minute and in a lot of earlier games, just avoided doing it at all.
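
The idea itself is simple enough. Here’s an illustrative sketch (my own structures, using floats for readability rather than the actual PlayStation library primitives, which were fixed-point): split a large textured quad into four smaller ones, so each piece spans a smaller range of depth and the affine texture mapping has less room to distort.

typedef struct {
    float x, y, z;      /* transformed position (z = depth from the viewer) */
    float u, v;         /* texture coordinates */
} Vertex;

typedef struct {
    Vertex v[4];        /* corners in clockwise order */
} Quad;

static Vertex midpoint( Vertex a, Vertex b )
{
    Vertex m;
    m.x = (a.x + b.x) * 0.5f;  m.y = (a.y + b.y) * 0.5f;  m.z = (a.z + b.z) * 0.5f;
    m.u = (a.u + b.u) * 0.5f;  m.v = (a.v + b.v) * 0.5f;
    return m;
}

/* Split one quad into four. A game would typically recurse while a quad is
   still large on screen or spans a big range of z, then send the small
   quads to the GPU as ordinary primitives. */
void subdivide( const Quad *in, Quad out[4] )
{
    Vertex top    = midpoint( in->v[0], in->v[1] );
    Vertex right  = midpoint( in->v[1], in->v[2] );
    Vertex bottom = midpoint( in->v[2], in->v[3] );
    Vertex left   = midpoint( in->v[3], in->v[0] );
    Vertex center = midpoint( top, bottom );

    out[0].v[0] = in->v[0];  out[0].v[1] = top;       out[0].v[2] = center;    out[0].v[3] = left;
    out[1].v[0] = top;       out[1].v[1] = in->v[1];  out[1].v[2] = right;     out[1].v[3] = center;
    out[2].v[0] = center;    out[2].v[1] = right;     out[2].v[2] = in->v[2];  out[2].v[3] = bottom;
    out[3].v[0] = left;      out[3].v[1] = center;    out[3].v[2] = bottom;    out[3].v[3] = in->v[3];
}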

Third would be optimizing data access. Actually, I’m not sure this strictly fits the question, because this was really a problem that developers in general simply avoided addressing at all.

Considering that the PS only had 3.5 MB of RAM total, it really shouldn’t ever have taken more than about 12 seconds to load anything from the CD.  Sure, it might take more time to process that data in some fashion, but the actual data read time shouldn’t be much more than about 12 seconds, since that’s how long it takes to fill 3.5 MB of RAM at the 300 KB/sec “double-speed” data rate.

The recommended method of getting data off the CD was to simply read raw sectors. We had tools that allowed you to put your data together and figure out the sector offset for any given item, but doing all this was a lot of extra work for the programmer.

It was much easier to use the available standard C library functions for file reading, but it was extremely inefficient since there was no caching for the file system. For example, reading two small 2 kilobyte files could take several seconds because of the need to seek back and forth between the disc’s ISO directory info and the actual data. Now multiply that idea by the amount of time needed to fill up the machine with all the data for a game level.

Ultimately, the slower method was an order of magnitude more convenient and therefore used in many projects until “it’s too late to change now!”  I lobbied for changes to the library to add some basic caching for the ISO directory data, but it never happened. It was my impression that the guys at SCEI were disinclined to spend time trying to “fix” a problem that was caused mainly by game developers not doing things the optimal way. The other problem was that it would have taken a reasonable chunk of memory to do it right.
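
What I had in mind looked roughly like this (a sketch only; cd_read_sectors(), iso_root_directory_sector(), and parse_iso_directory() are hypothetical stand-ins for whatever low-level access the library provides): read the ISO directory once, cache every file’s start sector, and do all later loads as straight raw-sector reads with no directory seeks.

#include <string.h>

#define SECTOR_SIZE  2048
#define MAX_FILES    256

typedef struct {
    char name[32];
    long start_sector;
    long length;            /* in bytes */
} CachedEntry;

static CachedEntry cache[MAX_FILES];
static int cache_count = 0;

/* Stand-ins for whatever low-level access the CD library provides. */
extern long iso_root_directory_sector( void );
extern int  cd_read_sectors( long start_sector, void *dest, int count );
extern int  parse_iso_directory( const unsigned char *buf, CachedEntry *out, int max );

/* Read the directory once at startup and remember where every file starts. */
void build_directory_cache( void )
{
    static unsigned char dirbuf[16 * SECTOR_SIZE];

    cd_read_sectors( iso_root_directory_sector(), dirbuf, 16 );
    cache_count = parse_iso_directory( dirbuf, cache, MAX_FILES );
}

/* Later loads are a single seek-and-read of raw sectors. */
int load_file( const char *name, void *dest )
{
    int i;
    for( i = 0; i < cache_count; i++ )
        if( strcmp( cache[i].name, name ) == 0 )
            return cd_read_sectors( cache[i].start_sector, dest,
                                    (int)((cache[i].length + SECTOR_SIZE - 1) / SECTOR_SIZE) );
    return -1;   /* not found */
}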

David: Someone just submitted SCEA’s QA review of Superman to the Museum. I don’t think I would have fully understood the problem had I not read your email beforehand.

“Consulting with SCEA’s developer support is encouraged. Several textures used display texture warping. As most do not contain horizontal lines, sub-division looks to be the best solution to address this issue. Some textures such as grates or spider webs that block tunnels bend severely when viewed up close. Objects are not subdivided in runtime. We can additionally subdivide objects that are experiencing texture warping. This will be done during final tweaking after all other code and art optimizations have been done.”

I’m a little surprised that Sony’s QA would bring up this concern. I’ll have to ask QA if they would have rejected a title solely because of texture warping.

Mike Fulton: It’s hard to tell from a simple text description, but the texture warping thing could be pretty bad in places. It was sometimes a really extreme glitch.

David: What was the most difficult challenge you encountered?

Mike Fulton: I would say learning the mysteries of the performance analyzer. The main issue was that the software for it was very primitive, and updates were infrequent. The software was very good at graphing the signals and showing the raw information, but it didn’t do any real analysis of the data in terms of telling you something like “this program is missing the cache about twice as often as the average.” You had to look at the raw numbers and put that idea together yourself.

To make the whole process more streamlined, I ended up creating a program that took the data saved from the analyzer software and did a bunch of additional analysis on it. Plus, it maintained a database of all the programs I had done this for, so I could more systematically figure out how programs did on average in various areas. This allowed me to do my preliminary report fairly automatically, which really improved my throughput.

David: I would like to write a summary about the different hardware that developers utilized. The H500 was contained in a mainframe box, H2000 was a dual ISA PC Plugin module, H2500 was a PCI Plugin module, and H2700 was an ISA PC Plugin module with performance analyzer. Were you familiar with all four standard development kits? If so, what were the main differences between the four (pros/cons)?

Mike Fulton: Except for the DTL-H2700, for the most part, it was just a matter of how much clutter it all took up on your desktop. The H500 was pretty much gone by the time I joined SCEA. In fact, I can only remember ever seeing like maybe 1-2 machines there. More were in evidence at SCEI but I think they were mainly just collector’s items at that point.

The main functional difference, if I remember right, was that the DTL-H2000 required a fairly massive external CD drive (DTL-H2010) if you wanted to run an actual disc. The DTL-H2500 on the other hand, had a connector for connecting the DTL-H2510 CDROM drive that could also be mounted in the same PC. (The DTL-H2510, if I remember correctly, was basically a regular Sony CDROM drive but there might have been some wacky custom firmware in it.)

The DTL-H2700 was long awaited and something I was looking forward to, because it meant I could start transitioning from doing program analysis for developers to training them how to do it for themselves. However, while it was much cheaper than the old PA setup, it was still not cheap and supplies were initially a bit of a trickle. We were really just at the beginning of the transition when I left SCEA.

David: PlayStation was initially regarded as a machine that could not handle 2D games. Do you think that’s true or was it because libraries weren’t available early on that would address 2D?

Mike Fulton: All the initial titles were 3D, so I imagine that at some point someone asked “hey, what about 2D games, can’t it do that?” And the answer, of course, was “yes” but it just wasn’t the focus of the developers at first.

We didn’t offer a “sprite library” per se, but the standard 3D graphics libraries lent themselves to doing sprites just fine. In fact, the PlayStation’s GPU was considered to really be a 2D renderer by many, since it didn’t do the per-pixel (or per scanline) z-value scaling on polygon rendering as I mentioned earlier. Everything it rendered was basically a 2D polygon.

David: You mentioned that you assisted with Tomb Raider and helped the frame rate go from 5 to 20 fps. Can you recall any other games that you used the performance analyzer on that had a tremendous improvement as a result?

Mike Fulton: Tomb Raider was unusual in that it was fairly close to completely finished in most respects but not performing remotely close to what was needed. Most games I saw that demonstrated that level of performance were much earlier in their development cycle.

Most of the games we dealt with were generally playable, like running at 12-15 fps most of the time, and the developers were hoping to figure out how to boost that up to 20-30 fps. That sounds like a really big jump, and it is, but you have to consider that when your game loop is tied to a vertical blank, going just a few cycles over your sync drops your frame rate down to the next step. That means you go from 30 fps to 20 fps in one step, and then to 15 fps, and then to 12 fps.

Ultimately, a game running at 20fps was taking somewhere between 2/60 and 3/60 of a second to process one pass through their rendering / game loop. If the actual time was really close to 2/60, then a little optimization would bump it down and the game would then run at 30 fps instead of 20. On the flip side, if the game loop was taking closer to 3/60, it would take significantly more work to improve the frame rate.
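
The arithmetic, as a tiny sketch: on a 60 Hz display with a vblank-locked loop, the effective frame rate is simply 60 divided by the number of whole vblank periods one pass through the loop consumes.

#include <math.h>

int effective_fps( double frame_time_seconds )
{
    int vblanks = (int)ceil( frame_time_seconds * 60.0 );   /* vblank periods consumed */
    if( vblanks < 1 )
        vblanks = 1;
    return 60 / vblanks;    /* 1 -> 60, 2 -> 30, 3 -> 20, 4 -> 15, 5 -> 12 fps */
}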

BTW, the rendering loop and game logic loop were the same thing back in those days because games were virtually all single-threaded. Today many games, if not most, would be multi-threaded.

David: Did you test out any games that were never released because the developers couldn’t get the code running efficiently?

Mike Fulton: Oh, gosh there were a bunch of games I saw that ultimately never saw the light of day. But I couldn’t tell you if poor performance was the ultimate reason. There could have been a lot of reasons for a game to get killed.

David: Were you introduced to any interesting peripherals or development hardware that didn’t have an official release? For example, a prototype keyboard adaptor and modem were developed, but were libraries created to support these tools?

Mike Fulton: Well, we saw such goodies but until SCEI was ready to release them we didn’t typically worry about them too much.

One exception was the first version of the vibration controller. The one that was sent out to developers but never sold. When we got it from SCEI the only software was an extremely low-level driver that basically just turned the motor on or off. To actually do anything meaningful you had to turn the motor on and off repeatedly with some specific timing pattern.

We didn’t see developers spending a lot of time figuring out how to create vibration patterns, so I wrote a high-level library for games to use. It was actually included in (I think) two games, one of which was Wing Commander IV, which was in the very last pre-release stage when SCEI pulled the plug on that first version of the controller. They didn’t want to pull the code completely out of the game at that late stage, so it’s still in the release version of the game. I think there’s a secret code to enable it.

David: Do you have access to or know someone who has access to a development kit and could compile code? About five years ago the Museum was donated code to an unreleased game. The programmer assured me that the code works but he didn’t have a dev kit to compile it. The game is based on the best selling Atari ST game Oids. Being a huge supporter of the ST, I have always wanted to get this code working. Even though the Museum has development kits, I don’t have the know-how to get it working.

Mike Fulton: I don’t have one myself, but you shouldn’t need an actual development kit just to build the project. You just need a Programmer Tools CD. Remind me to see if I can dig one up in a week or so.

May 4th, 2009 by Mike Fulton
Categories: Apple, iPhone, Macintosh, Tech

Once upon a time, Steve Jobs was the leader of a company called Apple.  Apple was known for being a technology leader, and their latest products were the envy of the industry.  Sadly, though, Apple’s sales figures didn’t seem to be able to keep pace with their reputation.  The board of directors of Apple, thinking that another style of management might be the way to go, decided that they’d had enough of Steve and handed him his walking papers.  The year was 1985.

Steve’s response to the situation was to start another computer company, called NeXT.  The Apple Macintosh was supposed to be the “computer for the rest of us” but with NeXT, it seemed Jobs’ goal was to create the “computer for the best of us”.  Largely inspired by his experience with getting the Macintosh into the education market, the NeXT Computer was going to be a powerful workstation designed to meet the needs of the scientific and higher educational community.  At the heart of this new computer was going to be NeXTStep, an object-oriented multi-tasking operating system that included tightly integrated development tools to aid users in quickly creating custom applications.

NeXTStep’s Language Of Choice

At the heart of NeXTStep was a fairly new programming language known as Objective C.  It was basically an extension of the C language to add Smalltalk-style messaging and other OOP features.  Conceptually it’s not too far off from where C++ was at the time, but the syntax is fairly different.  However, that simply didn’t matter at the time because most programmers hadn’t done much, if anything, with C++.

In 1985, any sort of object oriented programming was a relatively new thing to most programmers.  Modern languages like Java and C# were still years in the future, and C++ was still largely an experiment, with no standard in place and drastic differences from one implementation to the next.  In fact, most C++ solutions at the time were based on AT&T’s CFront program, which converted C++ code into standard C code that would then be compiled by a standard compiler.  It would be a few years yet before native C++ compilers became commonplace.

There were other OOP languages around, like Smalltalk or Lisp, but they were largely considered academic languages, not something you’d use to create shrink-wrapped products.

Since there simply wasn’t any better solution, the choice of Objective C for NeXTStep was completely reasonable at the time.

What Happened NeXT

The first version of NeXTStep was released in Sept. 1989.  Over the next few years, the NeXT computer and NeXTStep made a number of headlines and gained a lot of respect in the industry, but failed to become a major player in terms of sales.  In late 1996, NeXT had just teamed up with Sun Microsystems to create a cross-platform version called OpenStep, but before that really took off, something else happened.

In 1996, Apple was floundering.  Their stock price was down.  They’d had layoffs.  They had no clear plan for the future in place, and they were in serious danger of losing their place as the master of the graphic user interface.  Microsoft had just released Windows 95, which was a huge leap forward from Windows 3.1 in virtually every way, and PC video cards offering 24-bit and 32-bit color modes had become easily affordable.

Apple CEO Gil Amelio was fairly sure that updating the Mac to use some sort of object-oriented operating system was key to Apple’s future success, but Apple’s internal development had thus far failed to pay off.  Likewise Apple’s investment in Taligent, a company formed in partnership with IBM for the sole purpose of developing an object oriented operating system.  But then Amelio struck a bargain to purchase NeXT Computer and the NeXTStep operating system, bringing NeXT CEO Steve Jobs back into the fold, first as an advisor and then as CEO several months later when Amelio was shown the door.

It took Apple nearly 4 years to integrate their existing operating system with the NeXTStep tools and libraries, but ultimately NeXTStep formed the basis of the new Macintosh OS X operating system, released in March 2001.

Mac Development Tool History

When the Macintosh was first released in early 1984, you pretty much used either 68000 assembly language or Pascal to create programs.  Pascal had always been a popular language with the Apple crowd.  Apple had a set of development tools known as the Macintosh Programmer’s Workshop, which was essentially a GUI wrapper around a variety of command-line tools, including the 68000 assembler and the Pascal compiler.

It didn’t take long for the C language to become available for the Mac.  Apple released a version for MPW, but C on the Mac really took off with the release of LIGHTSPEED C (later renamed THINK C), which had a GUI IDE of the sort that would be completely recognizable as such even today, almost 25 years later.  Think’s compiler quickly became the de facto standard development environment for the Mac.  Support for C++ would be added in 1993 with version 6.0, after the product was acquired by Symantec.

Unfortunately, when Apple made the transition from the Motorola 680x0 processor family to the PowerPC processor in 1994 & 1995, Symantec C/C++ failed to keep pace.  It wasn’t until version 8, released in 1997, that their compiler was able to generate native PowerPC code.

Fortunately, a new player in the game appeared to save the day.  When Symantec bought out Think, some members of the Think C development team started a new company called Metrowerks.  While Symantec was struggling to bring out a PowerPC compiler, Metrowerks released their new CodeWarrior C/C++ environment.  In many ways, CodeWarrior was like an upgrade to the Symantec product, and it quickly supplanted Symantec among developers.  CodeWarrior would remain at the top of the heap until Apple released OS X.

The NeXT Development Tool

When Apple released Mac OS X in 2001, there were two big paradigm shifts for developers.  The first was that Apple now included their development tools with the operating system, at no additional charge.  After nearly two decades of charging premium prices for their tools, this was a big change.  Plus, the new XCode environment was an actual IDE, unlike the old Macintosh Programmer’s Workshop environment, with support for Objective C, C, C++, and Java.

The second paradigm shift was that everything you knew about programming the Mac was now old news.  You could continue to use an existing C/C++ codebase with the new Carbon libraries providing a bridge to the new OS, but this did not allow you to use new tools such as the Interface Builder.  If you wanted to take full advantage of Apple’s new tools and the Cocoa libraries, you needed to use Objective C instead of the familiar C or C++.

Objectionable C

I had been a Mac programmer since getting my first machine in 1986, and when Apple released Mac OS X in 2001, I was fully expecting to continue that tradition.  However, while I had no problems whatsoever with the idea of learning a new set of API calls, or learning new tools, I saw no good reason why it should be necessary to learn a new programming language.  Still, at one time in my younger days I had enjoyed experimenting with different programming languages, so I figured why not give Objective C a try?

Upon doing so, my first thought was, this was an UGLY language.  My second thought was, why did they change certain bits of syntax around for no good reason?  There were things where the old-style C syntax would have gotten the job done, but they changed it anyway.  The third thing that occurred to me was that this was a REALLY UGLY language.

After a few brief experiments, I pretty much stopped playing around with Cocoa and Objective C.  I started playing around with Carbon.  My first project was to rebuild an old project done in C++.  But the first thing I ran into was frustration that I couldn’t use the new tools like the Interface Builder.  It wasn’t too long before I decided I wasn’t getting paid enough to deal with all this BS.  Objective C had sucked all the fun out of Mac programming for me.

The shift to Objective C marked the end of Macintosh development for many other programmers I’ve talked to as well.  One can only conclude from their actions that Apple simply doesn’t care… if one programmer drops the platform, another will come around.  I’m sure there are plenty of other programmers around who either like Objective C just fine or who simply don’t care one way or the other.

As far as I’m concerned, Objective C is an ugly language, an ugly failed experiment that simply has no place in the world today.  It offers nothing substantial that we can’t get from other languages like C++, C#, or Java.  Nothing, that is, except for access to Apple’s tools and libraries.

Some Mac developers would tell you that the Cocoa libraries depend on some of Objective C’s capabilities like late-binding, delegates (as implemented in Cocoa), and the target-action pattern.  My response is that these people are confusing cause and effect.   The Cocoa libraries depend on those Objective C features because that was the best way to implement things with that language.  However, I have no doubt whatsoever that if Apple wanted to have a  C++ version of the Cocoa library, they could figure out a way to get things done without those Objective C features.

A Second Look

A few years later when I got my first Intel-based Mac, I decided to revisit the development tools.  I wrote a few simple programs.  I’d heard a few people express the opinion that Objective C was sort of like the Ugly Duckling… as I used it more and became familiar with it, it would grow into a beautiful swan.  Nope.  Uh-uh.  Wrong.  No matter what I did, no matter what I do, Objective C remains just as frickin’ ugly as it was when I started.

I really wanted not to hate Objective C with a fiery vengeance that burned from the bottom of my soul, but what are ya gonna do?  Personally, I’m looking into alternatives like using C# with the Mono libraries.  No matter how non-standard these alternatives are, they can’t be any more icky than using Objective C.

Could It Be That Apple Doesn’t Care About Making Life Easier For Developers? 

The real question here is why the hell hasn’t Apple created a C++ version of the Cocoa library?  It’s been 12 years since Apple bought out NeXT.  Why hasn’t Apple made an effort in all that time to adapt the NeXTStep tools to use C++?  Or other modern languages like C#?  Microsoft may have invented the C# language, but even the Linux crowd has adopted it for gosh sakes!

Or why not annoy Sun and make a native-code version of Java with native Apple libraries?

Could it be they are trying to avoid the embarrassment that would occur when developers abandon Objective C en-masse as soon as there is a reasonable replacement?

Does Apple think developers are happy with Objective C?  Personally, I’ve yet to find a single programmer who actually even likes the language.  The only argument I’ve ever heard anybody put forth for using it has always been that it was necessary because it was the only choice that Apple offered.  I know that’s the only reason I use it.

Why does Apple continue to insist on inflicting Objectionable C on us?  I can only come to the conclusion that Apple simply doesn’t care if developers would rather use some other language.  It’s their way, or the highway.
