James Gosling talks with Bill Venners about his current research project, code-named Jackpot, which builds annotated parse trees for programs and can help you analyze, visualize, and refactor your program.
For the past several years, Java's creator James Gosling has been working at Sun Labs, researching ways to analyze and manipulate programs represented as annotated parse trees, a project called Jackpot. Compilers have long built parse trees when they translate source code into binary. But traditionally, programmers have worked with source code primarily by manipulating text with editors. The goal of the Jackpot project is to investigate the value of treating the parse tree as the program at development time, not just at compile time.
In this interview, which will be published in multiple installments, James Gosling talks about many aspects of programming. In this first installment, Gosling describes the ways in which Jackpot can help programmers analyze, visualize, and refactor their programs.
Treating Programs as Algebraic Structures
Bill Venners: What's the state of Jackpot, your current research project?
James Gosling: Jackpot has been really cool lately. It's what I'm spending most of my time on, and it's been a lot of fun. I was really hoping to have something I could hand out at JavaOne this year, but I've been finding too many entertaining things to do.
It's a very different world when a program is an algebraic structure rather than a bag of characters, when you can actually do algebra on programs rather than just swizzling characters around. A lot of things become possible.
Bill Venners: Like what?
James Gosling: If you look at any of the refactoring books, most of those refactoring actions become much more straightforward, in ways that are fairly deep.
Moving a method isn't just cutting and pasting text. It's a lot more than renaming the parameters and swizzling them around, because you really want to be able to do things like construct forwarding methods. When you construct forwarding methods, they're different from the original methods.
You can't just replace all uses of the forwarding method by uses of the moved method, because they actually behave slightly differently. The difference is usually around what happens when the pivot parameter is null. That can lead you into a deep morass of essentially theorem proving about properties of the code fragments that you're moving, to understand how they behave with respect to null. And you can treat all kinds of code manipulation that way.
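To make the point about null concrete, here is a minimal Java sketch. It is not Jackpot code; the classes `Account` and `AccountUtil` and their methods are hypothetical, and the "ignore null" behavior of the original static method is an assumption chosen for illustration. It shows why a forwarding method and the moved method are not interchangeable at every call site.

```java
class ForwardingDemo {
    static class Account {
        private int balance;

        // The moved method: the former parameter is now the receiver (the "pivot").
        void deposit(int amount) {
            balance += amount;
        }

        int balance() {
            return balance;
        }
    }

    static class AccountUtil {
        // Hypothetical original static method, kept as a forwarding method
        // for existing callers. It tolerates a null account.
        static void deposit(Account account, int amount) {
            if (account == null) {
                return; // assumed original behavior: silently ignore null
            }
            account.deposit(amount); // forward to the moved method
        }
    }

    public static void main(String[] args) {
        Account a = new Account();
        AccountUtil.deposit(a, 10); // via the forwarding method
        a.deposit(5);               // via the moved method
        System.out.println(a.balance()); // prints 15

        Account none = null;
        AccountUtil.deposit(none, 10); // no-op: the forwarder checks for null
        try {
            none.deposit(10);          // the moved method cannot check its receiver
        } catch (NullPointerException e) {
            System.out.println("moved method throws on a null pivot");
        }
    }
}
```

Rewriting `AccountUtil.deposit(x, n)` to `x.deposit(n)` is only safe at call sites where `x` can be shown to be non-null, which is the kind of property a tool would have to prove before making the change.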
So Jackpot has a baby theorem prover, or algebraic simplifier, that knows an awful lot about data flow and the implications of values. And it really does treat your program as a piece of algebra to be simplified and transformed. It can do an awful lot of interesting analysis that pays off when you want to make fairly significant pervasive changes to very large programs. That analysis pays off, for example, when you want to replace one API with another API that is almost the same. Often "almost the same" is actually harder than "radically different." I spent most of the last four months working on this baby theorem prover, and that's been a lot of fun.
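As a rough illustration of what "treating a program as a piece of algebra" can mean, the following Java sketch simplifies a tiny expression tree using a few arithmetic identities. It is a toy under stated assumptions, not Jackpot's simplifier: the node types `Const`, `Var`, and `BinOp` are hypothetical, only `+` and `*` on integers are handled, and there is no data-flow reasoning of the kind Gosling describes.

```java
class SimplifierDemo {
    // A tiny expression tree: constants, variables, and binary operations.
    interface Expr {}

    static final class Const implements Expr {
        final int value;
        Const(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final class BinOp implements Expr {
        final char op; // only '+' and '*' are supported in this sketch
        final Expr left, right;
        BinOp(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        @Override public String toString() {
            return "(" + left + " " + op + " " + right + ")";
        }
    }

    static boolean isConst(Expr e, int v) {
        return e instanceof Const && ((Const) e).value == v;
    }

    // Simplify bottom-up: fold constants, then apply x+0, x*1, and x*0 identities.
    static Expr simplify(Expr e) {
        if (!(e instanceof BinOp)) return e;
        BinOp b = (BinOp) e;
        Expr l = simplify(b.left);
        Expr r = simplify(b.right);
        if (l instanceof Const && r instanceof Const) {
            int x = ((Const) l).value, y = ((Const) r).value;
            return new Const(b.op == '+' ? x + y : x * y);
        }
        if (b.op == '+') {
            if (isConst(l, 0)) return r;
            if (isConst(r, 0)) return l;
        } else {
            if (isConst(l, 0) || isConst(r, 0)) return new Const(0);
            if (isConst(l, 1)) return r;
            if (isConst(r, 1)) return l;
        }
        return new BinOp(b.op, l, r);
    }

    public static void main(String[] args) {
        Expr e = new BinOp('+',
                new BinOp('*', new Var("x"), new Const(1)),
                new BinOp('*', new Const(0), new Var("y")));
        System.out.println(e + "  =>  " + simplify(e)); // ((x * 1) + (0 * y))  =>  x
    }
}
```

The same bottom-up rewriting idea applies to real program trees, but there the rules must also account for side effects, types, and null, which is where the theorem-proving effort goes.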
Creating Visual Representations of Programs
Bill Venners: I read that Jackpot can create interesting graphical representations of a program. What is that about?
James Gosling: Jackpot can take this underlying algebraic structure, which is really the annotated parse tree, and generate a visual representation from that. Our internal notion of the truth is not text. But once it's not text, all of a sudden you can display it in really interesting ways.
We've got an underlying rule engine that's able to do structural pattern matching very efficiently. We can go from the structural patterns it sees in your code to visual representations. So you can write what is kind of like a reverse grammar, where you associate structural patterns with what you can think of almost as TeX descriptions of how to represent the patterns graphically. What you see on the screen has been generated from this pattern matching. So we can, on a user chosen basis, turn various program structures into all kinds of visual representations.
You can, for example, turn the square root function into the obvious mathematical notation. You can turn the identifier theta into the Greek letter theta. You can turn division into the horizontal bar with numbers stacked. And we've done experiments with wackier things, such as trying to generate real time flow charts. That's kind of goofy, but entertaining. Other things like hiding block contents, doing interesting typography, doing typography on comments, all actually work out reasonably well.
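The idea of associating structural patterns with display rules can be sketched in a few lines of Java. This is only an illustration under assumptions: the node types `Ident` and `Call` are hypothetical, the "rules" are hard-coded rather than written in a pattern language, and the output is a one-line Unicode string, so two-dimensional layouts such as a stacked fraction are out of scope.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NotationDemo {
    interface Node {}

    // An identifier such as "theta" or "x".
    static final class Ident implements Node {
        final String name;
        Ident(String name) { this.name = name; }
    }

    // A function call such as Math.sqrt(theta).
    static final class Call implements Node {
        final String function;
        final List<Node> args;
        Call(String function, List<Node> args) {
            this.function = function;
            this.args = args;
        }
    }

    // Match structural patterns and map each one to a display form.
    static String render(Node n) {
        if (n instanceof Ident) {
            String name = ((Ident) n).name;
            // Rule: the identifier "theta" is shown as the Greek letter.
            return name.equals("theta") ? "\u03B8" : name;
        }
        if (n instanceof Call) {
            Call c = (Call) n;
            String args = c.args.stream()
                    .map(NotationDemo::render)
                    .collect(Collectors.joining(", "));
            // Rule: a one-argument square root call is shown with a radical sign.
            if (c.function.equals("Math.sqrt") && c.args.size() == 1) {
                return "\u221A(" + args + ")";
            }
            return c.function + "(" + args + ")"; // default: plain call syntax
        }
        throw new IllegalArgumentException("unknown node: " + n);
    }

    public static void main(String[] args) {
        Node tree = new Call("Math.sqrt", Arrays.asList(new Ident("theta")));
        System.out.println(render(tree));
    }
}
```

Because the rendering is derived from the tree, the underlying program is unchanged; only the displayed form differs, and a user could switch individual rules on or off.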
Bill Venners: At previous JavaOnes, I have seen some visualization tools that I thought were useful. One of them analyzed your code and drew diagrams that showed the coupling between packages. I felt those diagrams could help you realize that you've got a lot of coupling going from one package to another, which you may not realize by looking at individual source files. Visualization tools like that can help, I think, but a lot of tools that draw graphical representations from source don't seem to help much. Let's say you analyze a big program and generate inheritance charts, with thousands of boxes and lines going all over the place. That often looks as confusing as the source code does.
James Gosling: Yes, doing that kind of visualization is a real challenge.
Coupling Analysis with Action
Bill Venners: What kind of analysis does Jackpot do?
James Gosling: We've got a bunch of hooks in Jackpot for plugging in analysis modules. We want not only to be able to do analysis, but to be able to act on that analysis. We have some pieces that do pretty interesting things.
For example, often it's considered bad form to have public instance variables. One piece of analysis, therefore, is to find all the public instance variables. But we can find them and also make them private, add all the setters and getters, and account for what it means to actually access the variables via setters and getters. We can also do things like find all methods whose natural home is not the class they are actually in.
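The public-instance-variable example can be sketched in Java as follows. This is an assumption-laden illustration, not how Jackpot works: it uses runtime reflection on compiled classes instead of a parse tree, and the classes `PointBefore` and `PointAfter` are hypothetical. It shows the analysis step (finding the fields) and what the result of the action step (encapsulating them) might look like.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class FieldAuditDemo {
    // Before: a public instance variable.
    static class PointBefore {
        public int x;
    }

    // After: the field is private and accessed through a getter and setter.
    static class PointAfter {
        private int x;
        public int getX() { return x; }
        public void setX(int x) { this.x = x; }
    }

    // Analysis step: list the public, non-static fields a class declares.
    static List<String> publicInstanceFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            int m = f.getModifiers();
            if (Modifier.isPublic(m) && !Modifier.isStatic(m)) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(publicInstanceFields(PointBefore.class)); // prints [x]
        System.out.println(publicInstanceFields(PointAfter.class));  // prints []
    }
}
```

The harder part is the one the reflection sketch cannot do: every use such as `p.x = p.x + 1` has to be rewritten to `p.setX(p.getX() + 1)`, which requires working on the program's tree rather than on its compiled form.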
Bill Venners: How do you detect that?
James Gosling: If you look at the analysis books, there are various ways of detecting that. For instance, if you've got a static method that takes an object as a parameter, and it modifies that object, then somebody probably just slapped that method in there because it was easy. They were editing that file, so they put the method there. But they really should have put it someplace else. We've got something that will find those methods and actually move them, then change all the uses of that method to do the right thing. So we're trying to couple analysis with action.
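A much weaker version of this heuristic can be sketched with Java reflection. The sketch below is an assumption, not Jackpot's analysis: it only flags static methods whose first parameter is an object of some other non-JDK class, and it cannot tell whether the method actually modifies that object, which would require inspecting the method body. The classes `Order` and `ReportScreen` are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class MisplacedMethodDemo {
    static class Order {
        int total;
    }

    static class ReportScreen {
        // Candidate: static, takes an Order, and modifies it.
        static void applyDiscount(Order order, int amount) {
            order.total -= amount;
        }

        // Not a candidate: the first parameter is a primitive.
        static int twice(int n) {
            return 2 * n;
        }
    }

    // Flag static methods whose first parameter is an object of another class.
    static List<String> candidates(Class<?> type) {
        List<String> found = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            Class<?>[] params = m.getParameterTypes();
            if (m.isSynthetic() || !Modifier.isStatic(m.getModifiers()) || params.length == 0) {
                continue;
            }
            Class<?> first = params[0];
            boolean libraryType = first.getName().startsWith("java.");
            if (!first.isPrimitive() && !first.isArray() && first != type && !libraryType) {
                // Suggest the first parameter's class as the method's natural home.
                found.add(m.getName() + " -> " + first.getSimpleName());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(candidates(ReportScreen.class)); // prints [applyDiscount -> Order]
    }
}
```

Acting on such a finding means moving `applyDiscount` into `Order` as an instance method and updating every caller, with the same care about null receivers that applies to any moved method.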
Visualizing with JavaDoc
Bill Venners: What I mostly use for visualization is JavaDoc, because it's an abstract view of the public interface. I generate JavaDoc a lot as I'm designing and developing. I look at the HTML pages generated by JavaDoc and think, well, this looks kind of confusing. And I go back and make some changes to the code. I may not be able to see that it is confusing just by looking at the code. So I think having different ways to visualize code and designs as it is being developed can help guide the design.
James Gosling: Jackpot's editor component tries essentially to do what amounts to real time JavaDoc. JavaDoc is a funny thing. When I did the original JavaDoc in the original compiler, even the people close around me pretty soundly criticized it. And it was interesting, because the usual criticism was: a good tech writer could do a lot better job than the JavaDoc does. And the answer is, well, yeah, but how many APIs are actually documented by good tech writers? And how many of them actually update their documentation often enough to be useful?
Bill Venners: For me JavaDoc doesn't just serve as a way to document the design, it serves as a way to visualize the design. If I see huge classes with 300 public methods, or dozens of packages with only a few classes in each, I know there's a problem. It's more obvious when you're looking at the JavaDoc than at the source.
James Gosling: Right. JavaDoc has been enormously successful and enormously powerful. It's really been quite wonderful. And it's also been interesting to see the way that professional tech writers have taken to JavaDoc. A lot of their early criticism was about things like formatting. That's largely been solved by the sophisticated doclets that people have been able to write. But the tech writers seem to now spend a lot more time just documenting the semantics of what's going on, and a lot less time fussing with structure and formatting. And it actually feels like tech writers end up being more productive.
Bill Venners: So the tech writers are in there changing the code also, and checking in their changes?
James Gosling: Yeah. That's certainly what happens around here. The tech writers are intimately involved in the engineering. And actually I've always found that to be a really good thing to do. One of my general design principles is that it's really helpful to have a good tech writer on the engineering team early on. If you're building something and you have a tech writer trying to document it, and the tech writer walks into your office and says, "I don't know how to describe this," it means one of two things. Either you've got a really stupid tech writer who you should fire. Or much more likely, you've got a bad piece of design and you ought to rethink it. You have to rethink, because an API that isn't comprehensible isn't usable.
Bill Venners: So the tech writer is giving you feedback on your design. One of the values of design reviews is that programmers give you feedback, and that's useful if it's an API because the users of APIs are programmers.
James Gosling: The problem with programmers as reviewers, and especially programmers that have been involved in the program for a while, is that they are kind of oblivious to the complexity. And lots of engineers are complexity junkies. Complexity is in many ways just evil. Complexity makes things harder to understand, harder to build, harder to debug, harder to evolve, harder to just about everything. And yet complexity is often much easier than simplicity. There's that really famous Blaise Pascal letter, where he starts, "I apologize for this long letter. I didn't have the time to make it any shorter." And that's really true.
Bill Venners: I always think the designer's job is not only to create something that will work correctly and efficiently, but something that is also easy for the client to understand and use.
James Gosling: Yeah, you've always got a customer on the other side, whether that's some high end engineer, or some chemist in the lab who's writing a piece of code. Often you've got people whose real job is something other than software, and software is their tool. They don't necessarily get off on all of the complexity. Making things simple can be a real challenge.