This is essentially extended feedback on the use of Java as the programming language for this course, which obviously doesn’t fit into the comment box of the evaluation. As a student I will not be able to profit from any changes, but I would still like to argue the case. This started out as a post about the usage of instanceof in this project, but I realized midway that the problem wasn’t instanceof, but rather Java. I kept that part in, because it highlights the problem I perceive.
instanceof in the Compiler Project
(Throughout I refer to instanceof with pattern matching, NOT instanceof combined with a cast.)
I already completed the Software Engineering Lab (the lecture formerly known as Software-Praktikum), and instanceof was similarly maligned there. Since I was learning Java at the time I didn’t think to argue against it, because maybe the alternatives provided really were better, but alas.
I searched a bit on Stack Overflow and most of the questions regarding this topic strike a similar tone. The answers are usually a combination of:
- “It’s a code smell”. (= “Hell, if I know!”)
- “Use inheritance instead.”
- “Use the visitor pattern instead.”
- “It violates the Open-Closed principle.”
- “It violates the Liskov-Substitution principle.”
From my understanding, the two principles basically urge us to:
- minimize the number of changes you have to make to existing code in order to add new classes/functionality (Open-Closed principle), and
- make sure that new subclasses do not break the contracts of old classes (Liskov-Substitution principle).
Now if I have functionality which behaves differently on different subtypes of a class and I use instanceof to essentially switch on these behaviors (i.e. a potentially cascaded if-then-else construct), that violates the above principles in the sense that:
- If I add a new class/functionality then I have to go everywhere I discriminate on type and check that I don’t need to add a new case. If I used inheritance instead the static analysis would complain that a method is not overridden. (See the next list for a problem with this)
- Similarly I have to check whether an existing check subsumes the new type I added and make sure that the behavior in that case is compatible.
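The first point can be sketched as follows. This is only an illustration: the Node hierarchy and all names in it are invented, and it assumes Java 16 or newer for instanceof patterns.

```java
// Hypothetical node hierarchy, invented for illustration only.
abstract class Node {}

class NumberNode extends Node {
    final int value;
    NumberNode(int value) { this.value = value; }
}

class NameNode extends Node {
    final String name;
    NameNode(String name) { this.name = name; }
}

// Added later; nobody revisited describe() below.
class CallNode extends Node {}

class NodeDescriber {
    // Cascaded if/else over instanceof patterns (Java 16+).
    static String describe(Node n) {
        if (n instanceof NumberNode num) {
            return "number " + num.value;
        } else if (n instanceof NameNode nm) {
            return "name " + nm.name;
        }
        // The compiler does not complain that CallNode ends up here.
        return "unknown node";
    }
}
```

Adding CallNode compiles without any warning, so every such cascade has to be found and checked by hand.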
Note that inheritance and the visitor pattern come with their own set of problems:
- If a method in a superclass has a default implementation (such as simply doing nothing, or returning false) then I will have to remember to override that behavior, because semantic analysis doesn’t inform me about it.
- Visitor forces me to create a new class hierarchy just for adding a single piece of functionality. Every member of the target hierarchy requires me to copy and paste boilerplate accept methods as well.
- A single behavior is scattered across many files (inheritance) or many methods in a different file. This is less painful if the behavior is complex but becomes completely ridiculous when the behavior is small.
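To make the visitor overhead concrete, here is a minimal sketch for a two-member hierarchy. All names (Expr, ConstExpr, VarExpr, ExprVisitor, PrintVisitor) are made up for this example and are not taken from the project.

```java
// Hypothetical two-member hierarchy; every name here is invented.
interface ExprVisitor<R> {
    R visitConst(ConstExpr e);
    R visitVar(VarExpr e);
}

abstract class Expr {
    abstract <R> R accept(ExprVisitor<R> v);
}

class ConstExpr extends Expr {
    final String token;
    ConstExpr(String token) { this.token = token; }
    // Boilerplate accept method, repeated in every subclass.
    @Override <R> R accept(ExprVisitor<R> v) { return v.visitConst(this); }
}

class VarExpr extends Expr {
    final String token;
    VarExpr(String token) { this.token = token; }
    // Same boilerplate again, only the visit method name differs.
    @Override <R> R accept(ExprVisitor<R> v) { return v.visitVar(this); }
}

// One extra class just to hold a single, very small behavior.
class PrintVisitor implements ExprVisitor<String> {
    @Override public String visitConst(ConstExpr e) { return "Const" + e.token; }
    @Override public String visitVar(VarExpr e) { return "Var" + e.token; }
}
```

A call then looks like `new VarExpr("x").accept(new PrintVisitor())`. In exchange for the extra interface, class and accept methods, the compiler does flag a missing visit method when a new member is added to the visitor interface.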
I do not see a clear winner here.
In the compiler project you might want to implement the toString method for primary expressions (abstract class PrimaryExpression) by either:
- Using inheritance and overriding the method in each case, which will always be of the form return "Const" + getToken() or return "Var" + getToken(). For the three constant primary expressions I could add a new class (yay!) just to bundle that ONE behavior. This means #unrelated_methods + 5 methods (including prototype) OR, if I bundle constants, #unrelated_methods + 3 + 1 methods (of course you need a constructor for your fancy new constant class, even if it has NO fields).
- instanceof in the abstract class produces a one-liner that is easy to read and bundles this very simple functionality in a single place. It needs #unrelated_methods + 1 methods.
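A rough sketch of the instanceof variant, under some assumptions: PrimaryExpression, getToken and the "Const"/"Var" prefixes come from the description above, while the subclass names and the constructor are hypothetical stand-ins for whatever the project actually uses.

```java
// PrimaryExpression and getToken() are named in the text; the rest is assumed.
abstract class PrimaryExpression {
    private final String token;

    PrimaryExpression(String token) { this.token = token; }

    String getToken() { return token; }

    // The whole behavior lives in one method in one place.
    @Override
    public String toString() {
        return (this instanceof Variable ? "Var" : "Const") + getToken();
    }
}

// Hypothetical subclasses: three constants and one variable.
class IntConstant extends PrimaryExpression {
    IntConstant(String token) { super(token); }
}

class BoolConstant extends PrimaryExpression {
    BoolConstant(String token) { super(token); }
}

class NullConstant extends PrimaryExpression {
    NullConstant(String token) { super(token); }
}

class Variable extends PrimaryExpression {
    Variable(String token) { super(token); }
}
```

The inheritance variant would instead add one toString override to each of the four subclasses, on top of the abstract declaration in PrimaryExpression.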
This same reasoning applies to my methods producing a formula out of an expression and, to a degree, to type checking. If I pulled those into the PrimaryExpression class, the subclasses would merely be a way of enforcing that variants of PrimaryExpression exist. I could just as well give PrimaryExpression an enum that determines which variant we are dealing with and switch over that, and not much flexibility would be lost. (Now people wouldn’t complain about instanceof, because that would be replaced by switch. Rather they would say that it is not good OOP style…)
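The enum idea could look roughly like this. It is a sketch under assumptions: the class name TaggedPrimaryExpression and the Kind constants are invented, and the switch expression needs Java 14 or newer.

```java
// Hypothetical single-class replacement for the subclass hierarchy.
final class TaggedPrimaryExpression {
    // Invented variant names; the real project may distinguish other kinds.
    enum Kind { INT_CONST, BOOL_CONST, NULL_CONST, VAR }

    private final Kind kind;
    private final String token;

    TaggedPrimaryExpression(Kind kind, String token) {
        this.kind = kind;
        this.token = token;
    }

    @Override
    public String toString() {
        // A switch expression over an enum without a default branch must
        // list every constant, so a newly added Kind fails to compile here.
        return switch (kind) {
            case VAR -> "Var" + token;
            case INT_CONST, BOOL_CONST, NULL_CONST -> "Const" + token;
        };
    }
}
```

Compared to the instanceof cascade, this version at least gets a compile error when a new variant is added and the switch is not updated.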
Essentially the natural way to implement something so simple is frowned upon, because… this is Java and your program is bad if it doesn’t have at least 500 classes.
Shortcomings of Java
Every time I ask myself whether there is a simple way to do something in Java without huge drawbacks, the answer is to use a complicated and costly workaround that doesn’t behave like I would expect. All of the following things are useful and efficient ways to solve simple problems in other languages, and they suck in Java:
- Optionals: Tacked on, forces boxing of primitives. I could find no guarantees that it doesn’t use up more space and time than simply dealing with null. Since Java lacks pattern matching you have to use methods to unpack them, which makes the code more verbose.
- Generics: Tacked on, forces boxing of primitives. Due to (premature) type erasure you can’t do some of the most interesting things with those. Since primitives must be boxed (= performance hit) for the use of generics, you are forced to essentially write a non-generic class for every primitive you need in addition to the generic implementation you provide. A prime example of this is the existence of
- Streams: Tacked on. map f l. There is hardly anything more natural than applying some function to each element of a list/array/whatever, but why make it clear and concise when you could use 10 times as many symbols to say the same thing.
- 404 not found (I realize that streams and optionals and all of that jazz are technically monads, but they are handcrafted by JVM experts and are still bad. Anything that I “handcraft” will be a million times worse.)
- Pattern Matching: It’s coming this fall! As a preview! And then it becomes a full feature in a year, maybe! Of course you won’t be able to switch on multiple things at once. And until it’s released you can use visitors and inheritance to bloat your codebase instead.
- Let Expressions: I guess this is already here, but they don’t really advertise it. let x = (a, b) in vs. x instanceof Pair, kinda.
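A few of these points can be illustrated with a small sketch. The class WishListDemo, its methods and the Pair record are invented for this example; the library calls are the standard java.util ones, and records plus instanceof patterns assume Java 16 or newer.

```java
import java.util.List;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.stream.Collectors;

// Hypothetical pair type, standing in for a tuple (a, b).
record Pair(int first, int second) {}

class WishListDemo {
    // Optional<Integer> holds a boxed Integer; unpacking goes through a method call.
    static int unpackBoxed(Optional<Integer> maybe) {
        return maybe.orElse(-1);
    }

    // OptionalInt is a separate, non-generic class for the primitive case.
    static int unpackPrimitive(OptionalInt maybe) {
        return maybe.orElse(-1);
    }

    // The Java counterpart of OCaml's `List.map (fun x -> x * x) l`.
    static List<Integer> squares(List<Integer> l) {
        return l.stream().map(x -> x * x).collect(Collectors.toList());
    }

    // The instanceof pattern binds p, somewhat like a let expression would.
    static int sum(Object o) {
        return (o instanceof Pair p) ? p.first() + p.second() : 0;
    }
}
```

The split between Optional<Integer> and OptionalInt is one instance of the "non-generic class for every primitive" pattern described in the Generics item.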
I am aware that this is a huge wish list and that implementing those features correctly and at no cost to the programmer using them requires expert knowledge of language and compiler design as well as a historical outlook on what was already done and how it panned out. (Incidentally I realize how little expertise I have in those fields, thanks to the compiler project.)
Of course the people maintaining Java and the JVM have those prerequisites but what they work on is outdated and was never meant to provide any of these features in the first place. This is why all of the features above that are in the pipeline (or were in the pipeline) take forever to arrive. Once they do arrive it turns out that some compromise was necessary to make them work and thus they ended up worse than their counterparts in other languages.
Sadly, Java is the only language I really know (except the token OCaml and C I got in Prog1 and Prog2 and some Python I learned on my own: import ...) so maybe this is an instance of the grass being greener on the other side. In any case I can’t make a strong case for any other language, but I will try.
Basically the idea seems to be to introduce people to a functional language and then to show them some of the lower level stuff that goes on (MIPS, C, SysArch) and after that you show them Java.
After Prog2 students don’t hear much about functional programming again (not as far as I can tell from the curriculum) unless they choose the logic/Coq path. In Prog2 functional concepts appear mostly in the kNobel tutorial. So fewer than ten people get useful information about functional features in Java (or the lack thereof) and everyone else gets to not know about it, unless they stumble over it by chance.
As for alternatives to this I think there are definitely some candidates, like C++, Python or Rust. I don’t think any of them has all of the features mentioned above, but all of them have a good portion of them. I realize that Python doesn’t have the same typing standards as Java, Rust and C++, but there seems to be a type annotation functionality which fixes this problem. What these three languages have in common is that they are actually multiparadigm, not just single paradigm + some other features.
Java enforces a single paradigm, and if you want to do something in a more elegant/efficient fashion you are reprimanded for it. The same goes for OCaml in Prog1, where the mere mention of a for-loop makes you a pariah.
Why not teach people a modern multiparadigm language over a year (culminating in the Software Engineering Lab) so they know all the tools they have at their disposal in the real world, instead of thinking that there are a bunch of distinct toolboxes and that only one toolbox is applicable to each problem?
I know this might not change anything, but I am sufficiently frustrated with Java to write this much and I doubt I am the only one. If no one ever complains (and how would we know if that happens, the evaluations are anonymous!) then whoever is in charge might think everything is fine.
Thanks for reading!