Why are Black Boxes so Hard to Reuse?

Toward a New Model of Abstraction 
in the Engineering of Software

Gregor Kiczales
Xerox Palo Alto Research Center

This talk was presented at OOPSLA '94 and ICSE-17.
The transcript below is from the OOPSLA presentation.



Show slide

I believe that our field is about to go through a very exciting revolution. I believe that what we are about to do is reinvent how we think about and use the concept of abstraction in this field. What I'm going to say is that a number of people -- many of them in the object-oriented community, but also in operating systems, programming languages, communication, and other areas -- are dealing with a category of problem and designing a kind of system architecture that isn't very well supported by the current way we think about abstraction, and this work is driving us toward a new abstraction framework.

The purpose of this talk is to try to provide an initial set of words that better support thinking about those problems and those kinds of systems. Now the thing I want to stress is that the problems I'm going to be talking about aren't new problems, and the kinds of systems I'm going to be talking about aren't entirely new systems either. What is new here, I think, is a set of words for framing these problems, a set of words that can help us better understand the existing solutions and, more importantly, can help us better understand a new kind of solution and design better systems in the future.

In addition to what you'll see during the talk, which is that I am going to reference the work of a lot of other people, I want to make it clear that the development of this talk and the ideas in it was in no way done entirely by myself. There is a group of people at PARC who have been working on this problem that I am fortunate enough to work with every day. And there is also a growing community of other people who have been tackling this and helping to come up with these words. So I'm the one who's lucky enough to be up here talking about it, but a number of other people have worked on this.

Before I go on, I do want to give you a little bit of my own background so you will have some sense of what's driving me personally in this work, because I think as we go on, it will help you understand where I'm coming from and where some of the work is coming from. When I first started hacking computers, it was in high school. I was programming in Basic on a PDP-11 running RSTS. I don't know how other people got into this field, but you know, when I walked up to this thing and started playing with it, I just thought it was the coolest thing in the world. I was writing these programs, and it was just great -- I just enjoyed it a tremendous amount. And I did that for a couple of years, and started writing more programs, and you know, you can even do systems programming in Basic on RSTS. But, after doing it for a while, I stopped and I went off to do theater lighting instead. And I wish I could tell you that the reason I stopped was because my basic sensibilities of elegance and modularity had been offended. But I'm not sure I had those sensibilities at that time. (I'm sure that if I had, they would have been.) But I did quit computers for a while, and went off to do other stuff.


Show slide

Now, when I went to college, I went to an engineering school. And I started off studying mechanical engineering. The basic theme in all of my classes was that we were going to learn to "think like engineers."

What they kept saying was that the fundamental challenge in engineering is to cope with the complexity of the systems that we are trying to build. And they said that the principal tools that engineers use for doing this are abstraction and decomposition.

After studying mechanical engineering for a while, and sort of on a lark, I went back to try computer science again. And I got to my computer science classes and they were using these very same words. They were talking about engineering and abstraction and decomposition and the control of complexity. It was fun and the programs were wonderful and elegant. I could write these twenty-page programs that were clean and understandable. It was great fun and so I switched majors.

But it wasn't very long before other problems began to crop up again, and my programs got gross and complicated again. The place where the rubber really hit the road was when I worked on the Common Lisp Object System (CLOS) project. Now, if you believe in the principles of high-level modularity and black box abstraction that our field teaches, getting to work on a high-level programming language, it's like dying and going to heaven. Because the claim there is that you are going to make this very high-level abstraction that insulates people from the details of the underlying machine. And you are going to provide it to them as users and they are going to be happy using it and life is going to be great. I had gone through all this education in high-level abstraction and modularity and stuff, and so I was very excited about this project.

But on this project I was in a very interesting position because not only was I working on the design, but I was also trying to implement this language CLOS. In particular I was trying to do a portable implementation of it that ran on top of another high-level language, Common Lisp. And so the interesting thing is that I was the provider and customer of my own story because I was going to make a high-level language for other people to use, using a high-level language that other people had provided.

So, here's the rub -- you can write an implementation of CLOS on top of Common Lisp in ten pages of code; if you want it to have good error messages, maybe twenty pages of code. My implementation, PCL, which ended up being very fast, is about 350 pages of code, and it has to reach around the underlying Common Lisp at that. So, while I don't claim that that example is 100 percent typical, that's the sort of example that is driving me. I want to know why it is that if you do things the "right way", you don't get it to go as fast as you want, and if you get it to go as fast as you want, it in no way resembles the right way. And what I'm going to say is that it is the "right way" that is wrong. That is, our existing words of black box abstraction -- the existing words we use for designing our systems -- aren't driving us toward the kinds of solutions we want, they are actually standing in the way of the kinds of solutions we want.


Show slide

So those are some big words. Let me see if I can try to put some meat on them. Let's imagine for a minute that what we are going to do is build the display portion of a spreadsheet implementation. So, everybody knows how you would design this. The actual code that implements the spreadsheet display would sit on top of a window system which would sit on top of a programming language, and that would sit on top of the operating system. This would be the way that we would decompose this system. In fact this decomposition is so typical and so accepted, that many of these components are standardized, so we have standard window systems and standard languages and standard operating systems.


Show slide

Now if we look at that a little bit closer, what's going on is that each of those components is something that we often call a black box abstraction. The idea is that each component presents an interface -- this little blue line is the interface -- that provides useful functionality. In the case of the window system, it provides the functionality of windowing. The actual implementation is hidden inside of this black box.

Show slide

We do this because the implementation inside is a horrible and frightening thing. And we don't want our clients to be scared away by that or to have to deal with it on a day-to-day basis.

Show slide

So we cover it up like this, and we just present the clean interface of the system functionality. And then our client code -- I'm going to use these very thin, neat, elegant, crisp blue lines to denote a client of a nicely designed abstraction that gets to write nice simple code -- is very simple, because it sits on top of this interface and is hidden from the horrors of the implementation underneath it.


Show slide

Now, I think a lot of us take this kind of idea for granted, but this kind of system architecture wasn't the way it always was. This way of thinking about system design wasn't always the prevailing approach. While it is actually very difficult to track down the origins of this style of system architecture (aka black box abstraction), this famous paper by Parnas is clearly one of the major sources. In this paper he says that essentially "every module is characterized by its knowledge of a design decision which it hides from all others. Its interface was chosen to reveal as little as possible about its inner workings." This paper, plus some other papers by him, introduced the notion of "hiding" which plays such a crucial role in the black box style of system architecture.


Show slide

So that's all fine and well. Let's get back to the spreadsheet case. Now if you believe in this style of architecture, and you're going to design the spreadsheet display code, it should be a simple problem. Everybody knows what a spreadsheet looks like, it's a bunch of little boxes that sit on the screen and you need to click in them and display in them. And everybody knows what functionality the window system interface exposes. It's a bunch of little square boxes that you can click in and display in.

So everybody knows how to implement a spreadsheet display on top of a window system. You just make yourself a little 100-by-100 array of windows and plop them up there on the screen. And it's all done.
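To make that concrete, here is a rough sketch of the naive program in C++. The Window class here is a hypothetical stand-in for a window-system API, stubbed out so the sketch is self-contained; it is not any real library's interface.

    // Naive spreadsheet display: one window per cell, reusing the
    // window system wholesale. Window is a hypothetical stand-in.
    #include <vector>

    struct Window {
        Window(int x, int y, int w, int h) { /* would create a real window */ }
        void show() { /* would map the window onto the screen */ }
    };

    const int kRows = 100, kCols = 100;    // the spreadsheet grid
    const int kCellW = 60, kCellH = 20;    // cell size in pixels

    std::vector<Window> cells;

    void build_spreadsheet_display() {
        for (int r = 0; r < kRows; ++r)
            for (int c = 0; c < kCols; ++c) {
                cells.push_back(Window(c * kCellW, r * kCellH, kCellW, kCellH));
                cells.back().show();
            }
    }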

Now no one thinks that's going to work. And we will talk about why it's not going to work. But let me make two points here.

The first is that this is really the glory of black box abstraction because that little blue program there is very nice and simple and clean and you understand what it is supposed to do. And the second is that this is software reuse in a very big way because I get to reuse the whole complexity of the window system to plop my little boxes up on the screen.


Show slide

So, if it worked, it would be great, but it doesn't. And the question is why. Well, the problem is that even though the interface can be designed to do a good job of hiding the implementation, the implementation is still sitting under there. And the implementation has performance characteristics and in this case they come shining right through the interface, making it not possible to write the code this way.


Show slide

Let's look at that even more closely. What's going on is that not all of the implementation issues that we tried to hide were really details. Some of them are instead strategy questions that affect the performance of the resulting system differentially, based on the client's patterns of use. I want to propose the term "mapping dilemma" for those strategy questions. So in this particular case, what happens is that the window system implementor has to decide whether windows should be very lightweight data structures and mouse polling / tracking should try to take advantage of the geometry, or whether windows should be very heavyweight data structures and mouse polling should use more general-purpose mechanisms. And in this case, the reason why this spreadsheet display doesn't work is that most window system implementors choose the latter set of mapping decisions.

Let me say a bit about this term "mapping dilemma". The reason I like this term is that I have this sense that when you are implementing a functionality, what you are really doing is mapping that functionality down onto a lower level of functionality. So that's the sense of the term mapping. And dilemma comes from the fact that when you, as the implementor of some general-purpose module, face one of these strategy questions, you have this dilemma over which clients you are going to make happy. In this case, if you choose the heavyweight windows, which everybody does, the spreadsheet implementor can't use the system the way they would like to.

Let me also introduce two related terms. When an implementor makes a decision in the face of a mapping dilemma, we are going to call that a mapping decision. A mapping conflict arises when the implementor of a module chooses one way, but the client of the module would have preferred that they chose the other way.

The key point here is that the interface locks in the mapping decision, but it doesn't hide it.

One thing you might say is that while this is a very nice example, it may be too extreme. Let me make three points about that. The first is that the rhetoric that we teach -- exposing functionality and hiding implementation -- doesn't account for this kind of problem. Part of the goal here is to develop rhetoric that does account for it. The second one is that it's really a shame that this didn't work because it would have been nice to be able to implement the spreadsheet that way. And, in fact, an only slightly different window system would have let us implement the spreadsheet that way. And the third point is that I have some less extreme examples.


Show slide

Here's another one: virtual memory. This is a classic example. Everybody knows what the simple black box abstraction of virtual memory is. There's memory and you can allocate it and you can read it and you can write it. (And, depending on your language, you may also have to (get to) free it.) But hidden behind that are some very real mapping dilemmas. The primary one has to do with page replacement policy. The normal way that people implement virtual memory is they pick a least recently used page replacement policy, an LRU policy. And that's great much of the time. But there are programs that spread their working set around in a way that runs into trouble with that.

The classic example is a database system. A database system wants part of its memory to be LRU, but there's another part of its memory that it does sequential scans through, and it would really prefer that that part be managed MRU (most recently used). This mapping conflict can really cause trouble for database people trying to use a virtual memory system.
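To give a concrete sense of what the database implementor would like to be able to say, here is a minimal sketch using the BSD/Unix madvise call, which later systems did provide as exactly this kind of hint about a region's access pattern; error handling and the setup of the region are elided.

    // The database tells the VM system that this region will be scanned
    // sequentially, so read-ahead is worthwhile and LRU retention of the
    // already-scanned pages is wasted effort.
    #include <sys/mman.h>
    #include <cstddef>

    void advise_sequential_scan(void* region, std::size_t length) {
        madvise(region, length, MADV_SEQUENTIAL);   // an advisory hint only
    }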

Another example is computer graphics. If you are trying to do rendering, you are going to run through your data structures in a certain order, plopping them up on the screen. It's terrible to take a page fault in the middle of that. And you know what order you are going to go in, but if the virtual memory system is prefetching the wrong way against you, you're in trouble trying to use the virtual memory system. The standard solution for that problem is to buy more memory. Garbage collectors are another classic example of clients that have problems with virtual memory.


Show slide

Here are two more examples from programming languages.

Everyone's familiar with the procedure abstraction that many programming languages have. (Here I've used a commonly accepted syntax.) Now this is just this procedure MAX that you can call like this, but there's a very, very real mapping dilemma hiding behind that abstraction, which is whether the compiler writer should implement procedures inline or out of line. And sometimes, if you want to use the procedure abstraction, it really makes a difference what the compiler writer decided. In certain performance-critical situations, you can't use the procedure abstraction if it is going to call out of line.
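The transcript doesn't reproduce the slide, so here is a guess, in C++, at the kind of definition and call it shows:

    // The procedure abstraction: a definition and a call. The mapping
    // dilemma is invisible in the source.
    int max_of(int a, int b) {
        return a > b ? a : b;
    }

    int example(int x, int y) {
        return max_of(x, y);   // expanded inline, or a real out-of-line
    }                          // call? The client can't see the answer,
                               // but can certainly feel it.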

Here's another example which is very familiar to people who do scientific computing. (This is Fortran syntax.) You allocate a big array, and the question is how is that array going to be laid out in memory? How is it going to be blocked? It really matters depending on how you are going to access it. This is a very similar problem to the virtual memory problem. In particular, on a parallel machine there's a question as to how the compiler should distribute the array across the different processors of the parallel machine. And that can really cause problems for clients of the array abstraction.
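Although the slide uses Fortran, the same mapping decision exists in C++, where the language fixes row-major layout on behalf of the client. A minimal sketch of why the decision matters (assuming the vector holds N*N elements):

    // Two loops computing the same sum over an N x N row-major array.
    // The first walks with the layout (stride 1); the second walks
    // against it (stride N), thrashing the cache -- and, once the array
    // exceeds physical memory, the paging system too.
    #include <vector>

    const int N = 2048;

    double sum_with_layout(const std::vector<double>& a) {
        double s = 0;
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                s += a[i * N + j];
        return s;
    }

    double sum_against_layout(const std::vector<double>& a) {
        double s = 0;
        for (int j = 0; j < N; ++j)
            for (int i = 0; i < N; ++i)
                s += a[i * N + j];
        return s;
    }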


Show slide

Now, these are just some simple examples, but what I want to claim is that this is not just a simple academic or theoretical problem. I think that a tremendous amount of the complexity in our systems comes from exactly these kinds of problems. (I think that several hundred of the pages of bloat in my PCL system come from these problems.) I want to talk for a minute about two basic categories of resulting complexity; there are probably others as well.

The first I want to call a hematoma of duplication. (It's been pointed out to me that hematoma is not a good word because hematomas actually aren't very painful.) This is what would happen in the spreadsheet case. Everybody knows how you actually implement a spreadsheet. You make one big window, and then you draw a bunch of vertical lines and you draw a bunch of horizontal lines and you keep track of where all the lines are. And you track the mouse and . . . . In other words, you implement your very own little window system sitting on top of the original window system.

The second I want to call coding between the lines. That's when you find some way to trick the underlying implementation into doing the thing you want it to do. This is another solution to the graphics problem. First you write your program so that it allocates the data structures in the natural order. And then you spend night after night after night, or you get some graduate student to spend night after night after night, finding some way to allocate the objects in a different order so that they just happen to fall on the pages in the right way and the right thing happened. And that I call coding between the lines.


Show slide

An important thing to notice is that all of this is particularly a problem in the case of multiple clients. Because, you see, if only one client is going to use an abstraction, an implementation, then maybe it can be tuned just right for that client. So, for example, most window systems are tuned reasonably well for writing word processors. But then another client comes along, the spreadsheet client comes along, and it's not tuned right for them, and they have to write a hematoma. And it gets worse and worse as more clients with different demands come along.


Show slide

Now another word for this concept of multiple clients is reuse. A bunch of people want to reuse this substrate for different reasons. It concerns me a little bit that if the people who have been designing our system software, like operating systems and window systems -- our self-appointed best and brightest -- can't make software that's truly reusable, it's going to be very, very hard for the rest of us to do that.

Just to give you one more example, one of the major database vendors on Highway 101 in the Bay area says that 35% of their product is basically stuff like this, because they have to sit on top of multiple operating systems and they have to retune the operating system functionality to work for them. 35% of a database is a very big number. So I think these really are very real problems that we need to deal with. 


Show slide

Now as I said, these problems aren't new and people have been pointing to pieces of them for quite some time. Here's a paper from 1980 by Mary Shaw and Bill Wulf, which talks about this problem specifically in the context of programming languages, and what they say is ``traditionally, the designers and implementors of programming languages have made a number of decisions about the nature and representation of various language features that the authors feel are unnecessarily pre-emptive.'' In other words, they say the language implementor and designer makes a mapping decision which causes problems for the client. This word pre-emptive is a very nice way of talking about the consequences of mapping conflicts. It basically means that if the implementor chooses the wrong way, the client can't do what they might have otherwise done. (In effect, reuse is preempted.)

Show slide

They go on to say that ``both the designer and implementor have only notions of typical use available; they are making the decisions too soon.'' This concept of the implementor making the decision too soon is something we will see throughout this talk.
Show slide

Finally they go on to say that ``although most people now agree that the use of high-level languages is desirable, the fact remains that many major systems are still written in assembly language.'' And of course that's still true today, although the syntax is worse and now it's an ANSI standard.


Show slide

Okay, so let me do a quick summary. What I have said so far is that we are engineers and we have to struggle with complexity. We are building extremely complex systems. We want our clients to use them without being overwhelmed by that complexity. And the fundamental tools engineers have for doing that are abstraction and decomposition. And we have been using a particular approach to abstraction and decomposition called black box abstraction. And while that seems very nice, there is this problem that mapping decisions show through and cause mapping conflicts. You can't really hide all the implementation decisions. And that winds up producing hematomas and coding between the lines. Somehow or another, the actual client of the service needs some control over the mapping decisions. So the question is how we are going to do that.


Show slide

Well, here are some possibilities -- none of these will be the one, of course! One possibility would be to avoid the mapping dilemmas. What that means is to design a system which is so low level in functionality that there are no real decisions to make in the process of implementing it. I'll come back to that in a second. Another possibility would be to pretend that it isn't a problem. You might laugh at those two, but I'll show you a slide in a minute that says that those are the two approaches that the community has traditionally fissioned itself into. The next three are just for fun. You could document the mapping decision, but that clearly won't work because just knowing that you are losing doesn't make you win. You could just give people the sources -- we know what that does. Or you could just give up.


Show slide

Let's get a little more perspective on these first two because I think it teaches us something. Here's a real famous guy who, in 1974, said about programming languages: "In fact, I found that a large number of programs perform poorly because of the language's tendency to hide 'what is going on' with the misguided intention of 'not bothering the programmer with details'." And he basically goes on at great length to say that programming language designers should design languages that have no mapping dilemmas in them. It's a clear argument that says if the compiler writer can't make the best choice for everybody, the features shouldn't be in the language.

What I'll claim to you is that here is the fork in the road at which the C community and the Lisp community split. The C community said, well, we'll follow this path. And the Lisp community -- I was originally from the Lisp community, so this is of a little historical interest for me -- took the second approach, which is they said we will put in the language the stuff that people wish was there, and we will worry about how to implement it later. Now we know who won commercially. And rather than continue that fight -- I have no interest in that fight anymore -- what I want to say is that in the absence of a clear, principled solution to the problem of mapping dilemmas and mapping conflicts, the Dijkstra choice, the C choice, was the right choice. And what I am hoping is that this talk will provide us a way of thinking about a different kind of choice.


Show slide

Okay, so look. I said that these were old problems, and in fact, there are some old solutions. Let's look at some of them. In the case of programming languages, there has been this idea called -- different people call it different things -- compiler switches, compiler pragmas, compiler declarations, but basically what you do is you put this little widget on the side that lets the programmer tell the compiler how to make the mapping decision. So in this case, you can say to the compiler, look, inline this procedure. Make that mapping decision this way. In the case of high-performance Fortran, you can tell the compiler, look, distribute this array in memory this way.
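In C++ terms, the switch might look like this. The [[gnu::...]] attributes below are real GCC/Clang spellings of the request; other compilers spell it differently, and this is only a sketch of the general idea:

    // The client tells the compiler how to make the inlining mapping
    // decision, without changing what the procedure means.
    [[gnu::always_inline]] inline int max_inline(int a, int b) {
        return a > b ? a : b;    // expand at every call site
    }

    [[gnu::noinline]] int max_outline(int a, int b) {
        return a > b ? a : b;    // always a real out-of-line call
    }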


Show slide

And there's a similar sort of solution that people have taken for the window system problem, which is that you just introduce this new kind of window, a lightweight window -- different people call it different things. But many window systems have an approach much like this, in that you basically again let the client tell the implementation: make this mapping decision this way.
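A sketch of that approach, with hypothetical class names: the mapping decision is selected by which class the client instantiates.

    // Two classes, the same windowing functionality, different mapping
    // decisions baked into the class names (hypothetical API).
    class Window {              // heavyweight: full-featured data structure,
    public:                     // general-purpose mouse tracking
        Window(int x, int y, int w, int h) { /* ... */ }
    };

    class LightweightWindow {   // cheap data structure, geometric tracking
    public:
        LightweightWindow(int x, int y, int w, int h) { /* ... */ }
    };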


Show slide

And if you analyze these for a moment, they really are getting at the problem in a reasonable way. They are giving the client programmer control over the mapping decision. In these cases, the client programmer can control procedure implementation or array layout or the windowing strategy. And because of this, some kinds of hematomas and coding between the lines are avoided for clients of these systems.

These two examples I showed you have a very declarative nature which makes them very reliable. But their declarative nature also means they have limited power because the client can only flip the switches that are provided.


Show slide

Let me now turn to another solution, a family of solutions to the virtual memory problem that has been evolving over pretty much the last ten years. These solutions are object-oriented in nature, so here in the OOPSLA context of course we'd expect them to be much more powerful! This line of work started with this paper from SOSP 87. This is the first paper on the Mach external pager and then those dots mean that there are a bunch of other papers that build on this list. And what they say here is ``an important component of the Mach design is the use of memory objects which . . . allows applications . . . to participate in decisions [I would say mapping decisions] regarding secondary storage management and page replacement.'' So they are basically going to use an object-oriented mechanism to let the client of the virtual memory abstraction control mapping decisions.


Show slide

I'm just going to walk through this very quickly because everyone in this room understands object-oriented programming. Starting with the original virtual memory black box, you have this interface that presents malloc, read and write, and you have this hidden implementation.
 

Show slide

What's going on there is that that implementation maps malloc, read and write down onto physical memory pages and the disk.
 

Show slide

Looking more closely, inside the implementation, there is this amorphous hunk of code that implements all of these operations, and it has some data structures like page tables that tell it for each memory address where that address really is, whether it's on a physical page or whether it's on the disk or whatever. (Note that a couple of these operations tend to actually be implemented in hardware, but that's a detail we don't need to worry about right now.)
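As a sketch of that hidden data structure (real page tables are hardware-defined and considerably more involved; these names are illustrative only):

    // For each virtual page: is it on a physical frame or out on disk?
    #include <cstdint>
    #include <unordered_map>

    enum class Residence { InMemory, OnDisk };

    struct PageEntry {
        Residence where;
        std::uint64_t frame_or_block;   // physical frame number or disk block
    };

    // virtual page number -> where its contents currently live
    std::unordered_map<std::uint64_t, PageEntry> page_table;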

Show slide

So the question is, if you want different clients to be able to have different paging policies, how might you do that? Well, the first idea would be to say, well, we will let people duplicate that whole amorphous hunk of code and stick their very own copy of it down in there. And so now some memory addresses will be handled by this virtual memory system and some memory addresses will be handled by that virtual memory system. That solves the problem in a kind of theoretical sense, but it doesn't really solve it, because in order for the client to use it, they have to write their very own virtual memory system from scratch to get what they want.

Show slide

So, here's the more object-oriented solution. What you do is you break the regions of memory up into `region of memory objects' and you document a nicely defined protocol among them. And there's a default class of region of memory that implements all the normal virtual memory operations. And then, because there is this documented protocol, a client who wants to can make a subclass of the virtual memory system, and if all they want to do is change the page replacement policy, they will only have to write one method. By just defining that subclass -- only one method -- they have a whole new virtual memory system that has this one crucial different piece of implementation strategy. And I think that is quite familiar to all of us.

Show slide

I'm not going to show you the details of this, but in a later paper, actually a paper from OOPSLA last year, Krueger et al. showed that in 50 lines of code you can get yourself an MRU paging policy instead of an LRU paging policy, if the internal protocol of the pager is designed well. And now people have done some other nice things in a very small number of lines of code, for example, specializing the pager so that they can get compressed paging.
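In the spirit of that work, here is a hedged C++ sketch of the shape of the solution. All names are hypothetical, and the real external-pager protocols differ in their details:

    // The default region-of-memory object implements the whole protocol;
    // a client subclasses it and overrides only victim selection to get
    // an MRU policy for one region, inheriting everything else.
    #include <cstdint>

    class MemoryRegion {
    public:
        virtual ~MemoryRegion() {}
        virtual void handle_fault(std::uintptr_t addr) { /* default paging */ }
        virtual std::uintptr_t choose_victim() {
            return least_recently_used();    // default LRU policy
        }
    protected:
        std::uintptr_t least_recently_used() { /* tail of use list */ return 0; }
        std::uintptr_t most_recently_used()  { /* head of use list */ return 0; }
    };

    class MruRegion : public MemoryRegion {
    public:
        std::uintptr_t choose_victim() override {
            return most_recently_used();     // evict the MRU page instead
        }
    };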


Show slide

This paper, which is from 1975, is actually the precursor to that kind of work in the virtual memory and the whole operating system community. And the very first sentence of the paper says: ``The extent to which resource allocation policies are entrusted to user-level software determines in large part the degree of flexibility present in an operating system.'' Basically this paper summarizes, in the specific domain of operating systems, much of the argument I have given you so far, which is that we have to let clients control mapping decisions. They use somewhat different words than I am using, but I think the comparison is instructive.


Show slide

There's a whole bunch of other examples of systems like this. I've just shown you a few, but given this framework for thinking about it, it is easy to find a lot more. Almost every programming language has compiler switches of this kind. Operating systems as far back as VM370 had ways for the client to control the paging policy, in fact, even older than that. In the operating system community, there's a whole bunch of exciting new work going on giving clients this kind of control. In the OO community, I think we are all familiar with class libraries that work this way.

Now, thinking back for a moment, what I have said so far is that we needed an abstraction framework to work with and we picked this one called black box abstraction. And gee, it tends to make us hide issues that are actually real important.

So now what I want to do is show you a video tape that in a kind of perverse way may make us feel a little better about the fact that we didn't get our abstraction framework right. This famous tape shows the collapse of the Tacoma Narrows bridge due to resonance vibration. What happened here is that bridge design at the time this bridge was built (approximately 1940) was using an abstraction framework that, by conscious decision, ignored dynamic properties. And one day it got windy, in just the bad way, and this is what the bridge started doing. As you can see, this was a bad thing. So it's an interesting thing that happened here. The field as a whole made a decision to abstract away the dynamic properties of the structure, even though they knew from experience with smaller scale structures like aircraft wings that you could run into problems with resonance in structures like this. So they picked what was not quite the right abstraction framework and they got caught. (In fact, there's a lot of other bridges built around this time that were also built ignoring the dynamic properties; they just didn't quite get caught as badly as these people got caught.)

So, I think what is important about that is it gives us perspective on how hard it can be to get the right abstraction framework. Mechanical and civil engineering are much more mature disciplines than ours is now. And even so, they were missing some crucial issues.


Show slide

So, I have now finished the motivation part of the talk. What I said is, first we have to control complexity, and we tried to do it by hiding the implementation. But that doesn't always work, because stuff shows through. So now we have to take some new approach; we have to let clients control mapping decisions. And, in fact, I showed you that some people have already started doing that. But we have a real problem, which is if we are going to add to the amount of complexity that we expose to clients, the clients' brains haven't gotten any bigger. So somehow we have to provide some principled way of not overwhelming them with that complexity. This imperative is still the fundamental imperative of engineering. So we have to find some way of dealing with that.

I propose that to do that we can borrow another principle from engineering. It's a basic principle, like abstraction, that says if you can't conquer, you should at least divide. That is just the situation we are in -- we couldn't conquer the implementation -- so maybe we can divide the way in which the client gets control over it. But there's an interesting thing, which is that this principle by itself doesn't really say any more than abstraction says by itself. Because it doesn't tell us what we should divide from what.

In order to get a handle on what we should divide from what, I think we can borrow some insights from the community that does work on computational reflection.

Now reflection is a word that has been danced around a lot, and it's scared some people. But reflection is just a very simple concept, like object-oriented programming. Reflection is based on an observation about the world, and it's a claim that we can use that observation as a metaphor for designing certain kinds of systems. Just like OO, that's all it is. In this case, the observation is that people in the world maintain various kinds of contracts that are constantly subject to discussion and re-negotiation. Sometimes the contracts and negotiations are formal, sometimes they are informal, but this idea of contracts, or terms, that can be renegotiated is prevalent.

I will claim that this metaphor is applicable in this case, where we have clients that are trying to renegotiate mapping decisions with services.


Show slide

So let me just walk through the reflection metaphor. The idea is that a client and a provider have some contract, and their day-to-day conversation is governed by that contract. So this person calls up and says I need a thousand more widgets. And the contract says what a widget is, and it says how quickly they are going to arrive, and it says what they are going to cost . . . And most of the time that works just fine for them, but every now and then, things change. And when that happens, this person calls up and says look, I want to renegotiate the contract.


Show slide

This shift in the conversation is called going meta, and now instead of talking about widgets, they are talking about the contract itself. So it's a very different conversation because the subject matter of the conversation isn't widgets -- it's the previous conversation. The reflection community uses words like reification and reflection and introspection, that make it possible to talk about subtle aspects of the shift back-and-forth to meta-conversation.


Show slide

I'm not going to go into those today, but the reflection community gives us another thing which I think is just what we need for addressing the problems I have been outlining, which is the notion of a reflective module. It's based on this metaphor in the same way that object-oriented programming is based on the object metaphor. What it says is, look, we will have a base interface that provides the normal functionality, the day-to-day functionality. And then we will have this separate meta interface which the client can use to go meta and renegotiate the base interface. And in particular, we, dealing with this problem of mapping decisions, can use this concept as the dividing principle we need to achieve separation in the interfaces. We can put the normal black box functionality in the base interface, and we can put the mechanism for controlling the mapping decisions in the meta interface.


Show slide

So this is a somewhat busy slide; it is also one of the most important slides in the whole talk. What I previously said is that we have to expose control over mapping decisions to the client. And now what this slide says is that if we are going to do that, and we want it to work in a principled way, this is the ideal: try to get two separate interfaces. Instead of "hiding the implementation," take this as your fundamental design principle: "separating control over mapping decisions".

The claim is that to the extent that we, as the providers of reusable modules, do this well, it lets our clients divide their attention into two areas. On the one hand, they can program optimistically on top of the black box and then when they need to, they can go renegotiate the mapping decisions. So the key is that we want the client programmer to be able to divide their attention so they can focus on either the functionality of the system or on adjusting the mapping decisions.


Show slide

So that is very nice; let's think about what that means if we bring it down to something concrete. I'm going to revisit the examples we talked about before. Well, the two programming language examples already have very good base/meta separation. If you look first at the procedure example, here's the traditional functional interface, and then here's the meta control over the mapping decision. And the separation is good because if you cover up the red, you can read this program and understand it without really having to think about the red. You won't capture all the performance issues, but this program is comprehensible. And the same is true of high-performance Fortran. High-performance Fortran adds this declaration, and in fact that's a comment character -- this whole line can be ignored. So these two examples have very nice base/meta separation. And if you think back to the virtual memory case -- I won't take us all the way back there -- it also has this very nice separation, because you can either use the virtual memory system as it is, or you can specialize it to get a new one.


Show slide

Here's an example which I've borrowed from the framework people, from Taligent in particular, which also has pretty much exactly this same separation. Here's what they call the "client API", which I draw in blue and call base, and then here's what they call the "specialization API", which I draw in red and call the meta interface. So this is another instance of the separation principle I am advocating, existing in nature, so to speak, and pretty well done.
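The same shape shows up in C++ frameworks as a public client API delegating to protected virtual hooks. A minimal sketch with hypothetical names (this is the general pattern, not Taligent's actual interface):

    // Client API (base interface) in public; specialization API (meta
    // interface) in protected virtuals that subclasses override.
    class View {
    public:
        virtual ~View() {}
        void draw() {              // client API: callers use this
            set_up_clipping();
            do_draw();             // defers to the specialization API
        }
    protected:
        virtual void do_draw() {}  // specialization API: override this
    private:
        void set_up_clipping() { /* transforms, clip regions, ... */ }
    };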

Now you can use this separation not just to talk about existing design, but to improve designs. That's much more useful, of course.


Show slide

Let's go back to the window system case. I would claim that the window system example I showed you before was badly designed, that it had bad base/meta separation, because what it did was it confused the name space of functionality the window should provide with the name space of selecting mapping decisions. If you remember before, the "lightweight" was stuck in the class name. That is bad separation because the two concepts are being stuck in the same name space. You can get a combinatorial explosion because as you get n of these things and m of these things -- well, we all know what combinatorial explosion does.

So here is a redesign of that window system case, which I think learns from this principle of base/meta separation and is more elegant. In fact, there is a very nice paper in the next session which shows exactly how to do this in a number of cases -- set classes, in particular. They show how to pull control over mapping decisions out of the class name -- that's what they have done here. Here's the class name, and then here's control over mapping decisions.
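A sketch of the redesigned interface, again with hypothetical names: one class name for the windowing functionality, and a separate argument that carries the mapping decision, so the two name spaces never mix and there's no combinatorial explosion of class names.

    // The mapping decision is reified as a value in its own name space,
    // instead of being fused into the class name.
    enum class Weight { Heavy, Light };

    class Window {
    public:
        Window(int x, int y, int w, int h,
               Weight impl = Weight::Heavy) { /* ... */ }
        // display, click handling, ... (the base interface)
    };

    // Client: the spreadsheet asks for cheap windows without leaving
    // the Window abstraction, e.g. Window cell(0, 0, 60, 20, Weight::Light);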

So this base/meta separation concept is something we can use once we have decided to provide clients with control over mapping decisions to better modularize that control. So that's what I want to claim is the first design principle of giving clients control over mapping decisions.


Show slide

Let me quickly go through some other design principles for these kinds of systems. One of the concepts we have seen -- we saw this particularly in the virtual memory case -- is this notion of incrementality. What it says is that if the client wants to renegotiate the mapping decision, they would like to have to do as little work as possible. The first virtual memory solution, where you had to write a whole new virtual memory system, was bad incrementality. These cases where you just sort of put "lightweight" on it, that's very good incrementality -- the client doesn't do much work.

Another is a concept of scope control and what that means is that if one meta-program which changes one mapping decision sits here, it should be possible to restrict the scope of its effect. So if I walk up to the virtual memory system and say look, change the page replacement policy this way, some other client should keep the one that they have. If you are going to let people go down and change the substrate, scope control is a crucial factor.

Interoperability is another factor -- I won't belabor that, but if you take, for example, the array case, it ought to be possible to intermix the use of arrays that are mapped one way with arrays that are mapped another way.


Show slide

So these are three more very important design properties, and I think all of us in the OO community are familiar with how you can use object-oriented programming to achieve them.
Show slide

OO helps with incrementality because it lets us give a module this kind of internal structure of objects, and clients can replace one or another of them and have fine-grained control.
Show slide

OO also helps with scope control. Here's the virtual memory case redrawn. What's going on here is that the vertical axis is the individual regions of memory and the horizontal axis is the individual operations. And what OO did in that case was give a kind of cartesian coordinate system for being able to say that one particular region of memory should have a different implementation of one (or more) particular operation(s). So that's OO for two different kinds of locality, in some sense.


Show slide

I'm not going to go into detail today about what a metaobject protocol really is, but basically metaobject protocols are a technique for providing client control over the internal workings of a system, with a clear base/meta separation and, in particular, using OOP techniques for scope control and incrementality. So, metaobject protocols are one particular way of building these kinds of architectures and achieving these kinds of design goals. There are other ways; I'll return to that. One thing I want to stress is that we've placed tremendous emphasis in this work on coming up with words for design and words for design goals that are independent of the particular technology used to achieve them.


Show slide

Two more quick design goals -- efficiency is certainly one. Since I started off by saying that we want to make stuff go fast, we better not make it go slow in the process of making it go fast. And one thing you might wonder is gosh if I let people come all the way inside and change the way the thing works, is that going to cost me performance? I'm not going to say much about that, but here's one basic observation.


Show slide

This whole game is a game about binding time. What I've been arguing is that the later you make the decision, the better mapping decision you are going to make, because the client can participate in it and the client knows a lot about what they want.

The traditional approach has been to make the mapping decisions very early because then it's easier to compile them out. And this has sort of been a long standing tension.


Show slide

There's a bunch of recent work, starting with some of the work in Smalltalk and then some of the work in SELF, which basically says: look, I can do very, very late code generation. Late code generation is a very old idea -- it dates from at least the '60s, actually maybe earlier than that. But recently people have actually managed to get it to work. What it lets them do is generate the code very late, so they don't do it until it's needed, and then it can be highly customized. And in particular, you can give the client control over the mapping decision. And there's this magical triad of partial evaluation, lazy evaluation, and run-time code generation playing in there.

I think everyone in this room is already familiar with these two pieces of work, Smalltalk and SELF.


Show slide

Let me also point you to this other fantastic paper. They have this idea called incremental specialization. What they say is "we have introduced the principles of code synthesis . . . frequently executed Synthesis kernel calls are `compiled' and optimized at runtime . . ." I'm not going to go into it, but basically what they do is, on the fly, they generate a version of the Unix read call which is optimized for this particular file at this particular moment in time from this particular page . . . This kind of incremental specialization can play a key role in opening up control over mapping decisions. This paper is a fun read; it's in SOSP 89.


Show slide

So here's another summary; the beginning is familiar by now. We had to control complexity; we tried to do it with black box abstraction; we ran into various kinds of problems; something of a dead end. We need to give clients control over mapping decisions, and then, as I said, if you are going to do that, it's pretty important to get a kind of separation of control: "divide if you can't conquer." And I talked about not only that design goal, but some other design goals. And then I said that one place to look for help with this was reflection. Another place to look for help was object-oriented programming. And some of these compiler techniques I just talked about were another place to look for help. And case studies! There are systems out there that to varying degrees work this way, and you could take this analysis and reapply it to those systems and learn more about the systems, more about the analysis, and, most important, more about future systems to build. So this is a summary of the basic argument structure.


Show slide

Now, I want to say something quick about the value of having better words for talking about these problems. I'm not going to say very much about this, but one of the ways that you can use these terms is you can go back and do an analysis of your existing code. And you could say, for example, well this is a hematoma and it comes from this mapping conflict. So that's one way you can use this. Another thing you can do is you can call up your favorite substrate vendor, or perhaps your not so favorite substrate vendor, but your substrate vendor in any case, and say look, I need control over this mapping decision. And then if they propose to give you some hook or some feature, you can ask some of these kind of questions to help negotiate the functionality you need. So that's some of the value I see in the words that have been presented here today.

The other value is that they can help us as a community to talk together and flesh out the rest of this abstraction framework.


Show slide

This is really just the beginning. There's a whole lot of other work to do. Let me just touch on some of the remaining work very very quickly.


Show slide

One thing is how do you identify the mapping dilemmas? People ask this all the time. They say, well yeah this seems pretty motivating; clients need control over the mapping dilemmas. How do I decide what to put in the meta interface? And the answer is that we don't have a methodology for this yet. We need to learn one. One intuition is that it is hard to know what the mapping dilemmas are the first time out. I think this is going to be very iterative. I think that the community of people who work on participatory design can offer us a lot of help because what they can do is teach us how to go look at our existing clients and really see the problems that they are having.


Show slide

Another issue is that when we design a meta interface, we need to tell the client something about how the box works inside, but not too much how it works inside. The meta interface has to be an abstraction of the implementation, not the real implementation. And so we need to find a way of saying enough, but not too much. The frameworks people are hard at work on how to specify those kinds of protocols, so they can help us a lot. The adaptive software work by Karl Lieberherr and his group is also very useful here because it's about telling you something about the way the internal data structures are arranged, but not too much about how they are arranged.

There's a whole other category of issue -- remember I said objects are not the only way to do this. Great surprise, but objects are not the be all and end all of software. And one particular problem that objects have is that they are too brittle, they carve the world up too sharply. A bunch of people are working on that. The workshop on subjective programming talks about that. We at PARC have done some work on a very dynamic object language called Traces, and again, adaptive software plays in here because it is about being looser about the carving of the world.


Show slide

Another issue has to do with methodology of use -- people say meta interfaces are really powerful, maybe it's going to be too dangerous. I think we need to learn how to partition who uses the base interface and who uses the meta interface.


Show slide

Verification. Sometimes people get concerned that if you add a meta-interface to a system it won't be possible to verify it. I just want to make one basic observation about this. Existing verification techniques already make it possible to verify a system based on the verification of its subcomponents. As we've seen, a client meta-program is, in some sense, a sub-component of the module it specializes. So, it ought to be possible to adapt existing verification techniques so that they can base their verification on the verification of a client-supplied meta-program.

Let me make a very important point here. Some people are going to say that they don't want to put a meta interface on their system because it is a powerful thing and the client might shoot themselves in the foot. In dealing with this response, I suggest going all the way back to the hematoma picture. If clients get a system that has inappropriate mapping decisions for them, one way or another they are going to find a way around that problem. The intuition behind providing a meta-interface is that maybe we can replace a 200-page hematoma with a ten-page meta program.

So yes, adding a meta-interface adds verification problems. But this is an issue worth dealing with.


Show slide

Here's a major piece of work that needs doing. I gave this whole talk in terms of clients needing access to mapping decisions for reasons of performance. But there is another kind of reason why clients need access to the inside, which is to control the behavior of the module. That's very familiar to all of us in this room. But I think we need to come up with a clearer account of that and merge it with this. That's a real piece of work. We really saw in our work on the CLOS metaobject protocol that many clients used the meta interface to change the language semantics.

And, of course, that's just a partial list. Because one way or another everything we do in software practice is based on the black box abstraction framework. If we are going to go all the way down there and change that framework -- and I've argued that we are -- a lot of other aspects of how we work will be affected. So that's really great because it means that there is some exciting work we can do here!

So look, before I finish, since one doesn't get to talk up here every day, I'd like to say one final thing that I find interesting. Up to now I have been talking about a change that has been happening and that's happening now. We are coming to a new awareness of a certain category of problem, and that is pushing us to provide clients of modules with a new kind of control, and there are some nice ways of thinking about that.

But now let me look a little farther into the future, to talk about where I think we might be going in the longer term.


Show slide

If you go all the way back to where I started out, we ran into this problem that we tried to hide implementation, and it showed through anyway. And that really shouldn't have surprised us at all, because we knew that the original interface, the original abstraction, was only a partial description of what was going on. We knew that these programs, these thin blue lines, were just a partial description of what was going on. We designed it that way because we can't cope with the true complexity of what's going on. We know that what is really going on is that memory is being consumed, the disks are going around and around, the disk heads are going back and forth, and electrons are spinning... Just a ton of stuff. So, it shouldn't have surprised us that our simple abstraction didn't capture the entire situation. We designed it not to capture the entire situation.

And the same is true for other engineering disciplines. Suppose that I want to build this simple wooden bridge. This too is an extraordinarily complex system. Like any real system, it is too complex to think about all at once. 


Show slide

I can draw this nice picture, this abstract description, which captures some of what's going on in this little bridge. But it doesn't capture all of what's going on. There are shear forces and creep and quantum mechanics and stability and corrosion and dynamics and aesthetic elegance and warping in the rain and all sorts of stuff that goes on with this thing.


Show slide

Now if you look at the other engineering disciplines, the way they deal with the problem that a single description can't capture everything that's going on is that they don't try to make a single description capture everything that is going on. What they do is they have multiple descriptions. Some of them capture more or less detail, which is something that we do too. Our layers of abstraction are about capturing more or less detail.

But then they have other descriptions that are about different issues entirely. So one description might be about the dynamic properties of the bridge, whereas another would be about the static properties. They are about different sets of issues, not about more or less detail. What they do is they draw these different descriptions that talk about the different issues and then they have elaborate social conventions for mediating among these different descriptions.

So they have this wonderful ability to take descriptions that capture different sets of issues, so they don't get blindsided by having a description that only focuses on one issue.


Show slide

Now going back to us, we have had a different special ability. Our programs, like the pictures we've been looking at, are a partial abstract description of the situation. Many things that are important about the situation aren't captured in the program. But our abstract descriptions have a wonderful property that the other engineers' don't have, which is that we can ``run'' them. We can run them, and even though they are partial descriptions, we get the whole behavior. The reason that happens is that the substrate implementation somehow fills in what we didn't say. This is arguably the sexy thing that we as engineers of software (as I prefer to call us, rather than software engineers) have that the other engineers haven't had. We have abstract descriptions that automatically run.


Show slide

So now you can see where I am going. What would it mean to be the best of both kinds of engineers?

Well, it would mean that we would have abstract descriptions because the systems are very complex and we can't focus on everything about the system all at once. (This is something both kinds of engineers have had.)

It would mean that in addition to having descriptions that were more or less detailed, we would have descriptions that were about different aspects of the system, because no one kind of description is going to capture everything that matters. (This is something the other engineers have had.)

Finally, it would mean having descriptions that ``automatically run.'' That is, the resulting overall behavior arises from automatic processing of the abstract descriptions. This is what has been cool about software. You just write it down and you throw some switches and off it goes.

Now the key point in getting all of this to work is to arrange the descriptions so that we can take advantage of what one description doesn't say, to turn around and say it in a different description. That's what the base/meta thing is about, in part. It tells us what to say, and what not to say, in each description.

This is a very long-term goal, I think. But this is (part of) my sense of what it would be for us to really be engineers. I believe that the kinds of architectures I talked about today, and the abstraction framework for thinking about those architectures that I presented today, are a step in the direction of this long-term goal. Because what we have is two separate descriptions that are both abstract, that capture different aspects of the situation, but that automatically come together to produce the final behavior.

That's all I have to say. Thank you very much.

 
Some other useful links -
 

http://www.reusability.com/papers2.html
http://www.cio.com/archive/030197_intro.html
http://www.cs.uow.edu.au/people/nabg/ASWEC96/aswec96.html
http://www.toa.com/shnn?edsbibpage
http://www.cis.ohio-state.edu/rsrg/index.html