enough nonsense.

2014-10-27

# To check or not to check, what is your Exception

Several years ago, Robert C. Martin declared in Clean Code that the debate over the use of checked versus unchecked exceptions in Java is settled. I bought into this opinion for some time, until it recently got in my way: an otherwise nice library that simplifies JDBC use catches the SQLException deep within and converts it into a general RuntimeException. Since I wanted to catch the SQLException to perform a rollback, I was forced to dig into the code and follow all possible paths to see which kinds of unchecked exceptions may be thrown, and then decide whether it is sensible to perform the rollback.

While the library declares its own UncheckedSqlException, I was still forced to look through the code to find out. And in case I missed one of the conversions to unchecked, the only safe way is to catch RuntimeException and then figure out whether it is a real runtime exception that should bubble up or just an unchecked one that should trigger the rollback.

"What do you mean by real runtime exception?", I hear you ask. Well, the Java documentation puts it this way: runtime exceptions represent problems that are the result of a programming problem, the canonical example being the NullPointerException. You could just as well call them the result of a programmer's mistake. This seems clear-cut enough, doesn't it?

Then let's try this seemingly simple definition on an example: Pattern.compile(regex) throws an unchecked PatternSyntaxException if the provided string does not parse as a regular expression. So let's consider the code

      Pattern varname = Pattern.compile("[A-Za-z][0-9");

which is missing a close bracket in the regular expression. Clearly a programming problem, a programmer's mistake, so the unchecked exception is well deserved.

But now consider reading regular expressions from a configuration file. The programmer has no control over the strings found there and whether they are well-formed regular expressions or not. For a pattern not to compile now is not a programming problem, but completely normal behavior, because the programmer has no means to check whether a string will compile other than by trying it. And we certainly do not want an additional method isWellFormedRegex(String s), because checking for well-formedness is as expensive as just trying to compile the string.
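For illustration, this is what handling such configuration input looks like today: the unchecked PatternSyntaxException must be caught explicitly, just as if it were checked (class and method names here are made up for this sketch):

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ConfigRegex {
    // Compile a pattern that came from configuration; a syntax error
    // here is expected input, not a programmer's mistake, so we must
    // catch the unchecked exception explicitly.
    static Pattern compileOrNull(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            System.err.println("ignoring malformed pattern: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(compileOrNull("[a-z]+") != null);
        System.out.println(compileOrNull("[A-Za-z][0-9") != null);
    }
}
```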

This example shows that, under the definition from the Oracle tutorials, for some APIs neither checked nor unchecked exceptions are correct for all use cases. Interestingly, parsing a date with DateFormat.parse() throws a checked exception, while the new LocalDate.parse() again throws an unchecked exception. For date parsing in particular, I think it happens much more often on input data than on static strings, so I would prefer a checked exception.
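Both behaviors can be seen side by side in a small sketch using the real JDK APIs (SimpleDateFormat is the usual concrete DateFormat):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

public class DateParsing {
    public static void main(String[] args) {
        // The old API forces a checked exception on every caller ...
        try {
            new SimpleDateFormat("yyyy-MM-dd").parse("not a date");
        } catch (ParseException e) {
            System.out.println("checked: " + e.getMessage());
        }
        // ... while the new API throws an unchecked one that the
        // compiler never reminds you about.
        try {
            LocalDate.parse("not a date");
        } catch (DateTimeParseException e) {
            System.out.println("unchecked: " + e.getMessage());
        }
    }
}
```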

Now what? When even the creators of the Java core class libraries are unsure, how am I supposed to know which kind of exception to use? The string parsing examples show it quite well. When parsing static strings, a parse error is really a programming problem and deserves an unchecked exception. When parsing input data, however, a parse error is completely normal business. Wait, what did I just say, normal business? Why do the methods throw an exception at all, then, when nothing exceptional is going on?

In C there are no exceptions (just core dumps). Functions that want to signal that they cannot return the expected value do so by returning a special value, often -1, and set a global error code. Not that I want to go back there, but the idea to either return a value or a description of what went wrong, in particular when it is normal business that something goes "wrong", can be implemented in Java quite elegantly. Java 8 gave us Optional, but it only provides a value, or not. It does not allow passing an explanation of why a value is not available, which is what exceptions do. But we can roll our own.

We need something like Optional which does not only hold a value, but can also provide an explanation if no value is available. Since we are so used to exceptions, the explanation could be an exception with the full stack trace. Another alternative could be just a message, but let's go for the exception first. In German I would not mind calling the class Üei, short for Überraschungsei (= Kinder Surprise), but let's call it ValEx, because it can contain a value or an exception.

    public class ValEx<T, E extends Exception> {
        private final T t;
        private final E e;
        public ValEx(T t) { this.t = t; this.e = null; }
        public ValEx(E e) { this.t = null; this.e = e; }
    }

This class can be instantiated with either a value or an exception. Now for the methods. Naturally we must be able to get the value.

    public class ValEx<T, E extends Exception> {
        ...
        public T get() throws E {
            if (t == null) throw e;
            return t;
        }
    }

But we also want to be able to test whether a value is available, to avoid the exception.

    public class ValEx<T, E extends Exception> {
        ...
        public boolean isEmpty() {
            return t == null;
        }
    }

Of course we can easily add more convenience methods, like a get(T defaultValue) and a getter for the exception. Now suppose that a parsing method for regular expressions were declared to return a ValEx object like this:

      ValEx<Pattern,PatternSyntaxException> compile(String s);

Then we could now call it with a static string like so:

      Pattern p = Pattern.compile("[a-z]+").get();

Since PatternSyntaxException is an unchecked exception, we get what we deserve if we pass a string that is not a regular expression. On the other hand, if we are compiling a property value we got from a file, we would use

      String regex = props.get("pattern");
      ValEx<Pattern, PatternSyntaxException> p = Pattern.compile(regex);
      if (p.isEmpty()) {
          // do what is necessary, like logging the wrong pattern
          LOG.warn("ignoring wrong pattern " + regex, p.getException());
          return;
      }
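Putting the pieces together, here is a compilable sketch of the whole class, with a hypothetical compile wrapper standing in for the imagined Pattern API above (ValEx and the wrapper are our own invention, not part of the JDK):

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ValEx<T, E extends Exception> {
    private final T t;
    private final E e;
    public ValEx(T t) { this.t = t; this.e = null; }
    public ValEx(E e) { this.t = null; this.e = e; }

    public T get() throws E {
        if (t == null) throw e;
        return t;
    }
    public T get(T defaultValue) { return t == null ? defaultValue : t; }
    public boolean isEmpty() { return t == null; }
    public E getException() { return e; }

    // Hypothetical wrapper giving Pattern.compile the imagined signature.
    static ValEx<Pattern, PatternSyntaxException> compile(String s) {
        try {
            return new ValEx<>(Pattern.compile(s));
        } catch (PatternSyntaxException ex) {
            return new ValEx<>(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(compile("[a-z]+").isEmpty());       // a value
        System.out.println(compile("[A-Za-z][0-9").isEmpty()); // an exception
    }
}
```

Note that the compiler only forces a catch on get() when E is instantiated with a checked exception type; for an unchecked E like PatternSyntaxException the call stays as convenient as before.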

Whether the explanation for a missing value should always be an exception is debatable. Creating exceptions with their whole stack trace is not a lightweight operation (so I heard). Another aspect of this implementation is that even on the fast, normal path of the computation we always create the ValEx object, which is ready to be garbage collected very soon after.

2014-10-07

# Java equals and canEqual

I stumbled upon canEqual in Scala code and immediately wondered why it is needed. Searching for Java, equals and canEqual brings up tons of hits on Google, but most of them have some relation to Scala. Reading the first Google hit makes it clear that the idea of canEqual has its place in Java too.

The article is well written and describes the problem along examples. I decided to take a more compact and formal approach to concisely describe what an equals method may and may not do, in particular taking inheritance into consideration.

## The equals contract

The Javadoc specifies that equals must implement an equivalence relation, which is a relation that is reflexive ($a=a$), symmetric ($a=b\Rightarrow b=a$) and transitive ($a=b\wedge b=c \Rightarrow a=c$).

Ok, symmetric, hmmm? Calling an object's method, as in a.equals(b), is inherently non-symmetric, since it is a's equals that is called and b is "only" a parameter. Suppose

    A a = new A(...)

and of course when you implemented A you took utmost care to make it symmetric.

Comes along your colleague, half a year later, and writes:

    class B extends A {...}

You forgot to make equals a final method to ensure that no derived class can ruin your well-crafted equals method. And your colleague thinks that B deserves its own, specific equals method. What are the constraints?

## Forced to call super.equals

Symmetry, oooohkeeey!? With

    B b = new B(...)

when he implements equals such that

b.equals(a) $\to$ true

which uses B.equals, he has to make sure that also

a.equals(b) $\to$ true

which uses the "old" A.equals of your class. Now consider some arbitrary

A a1 = new A(...) such that a.equals(a1) $\to \alpha$ and
B b1 = new B(...) such that a.equals(b1) $\to \beta$.

The transitivity requirement forces your colleague to make sure that comparing b to a1 and b1 returns the exact same results $\alpha$ and $\beta$ as when comparing a. Since the two were arbitrary objects of A and B, this is true for all elements of these two classes. The safest way to get this result is to make sure that b.equals(...) calls super.equals(...). To summarize

Conclusion 1: If an object B b = new B(...) of a subclass of A shall have b.equals(a)$\to$true for at least one a of A, then for this b the parent implementation super.equals() should be called to treat b as if it were genuinely of class A.

## The small room for a new (in)equality

If every b has at least one a of A with which it shall be equal, then obviously super.equals would always be called. But that would mean we don't need to implement B.equals in the first place.

Consequently there is at least one B bx = new B(...) which has

bx.equals(a) $\to$ false for all A a = new A(...)

Conclusion 2: If a subclass B overrides equals of its parent class A, its objects belong to one of two disjoint sets:
1. those which have at least one B.equals partner from A, and
2. those that are not equal to any element of the superclass.
The second set must not be empty, since otherwise the derived equals would not need to be implemented.

## Enforcing the inequality for the superclass

Due to symmetry, a.equals(bx) must also return false for all objects a created with new A(...). But how can the implementation of A.equals, which was written when no bx existed yet, do just the right thing when some such new type of object comes along? What A.equals typically does is if (bx instanceof A), but this returns true for objects of the derived class and does not distinguish between objects of A and B.
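A minimal sketch of the problem (field names are illustrative): A.equals accepts any instance of A, including objects of the later subclass B, while B.equals rejects plain A objects, so symmetry breaks even though B dutifully calls super.equals:

```java
class A {
    final int x;
    A(int x) { this.x = x; }
    @Override public boolean equals(Object o) {
        // accepts any A -- including objects of subclasses of A
        return o instanceof A && ((A) o).x == x;
    }
    @Override public int hashCode() { return x; }
}

class B extends A {
    final int y;
    B(int x, int y) { super(x); this.y = y; }
    @Override public boolean equals(Object o) {
        // rejects everything that is not a B
        return o instanceof B && super.equals(o) && ((B) o).y == y;
    }
    @Override public int hashCode() { return 31 * super.hashCode() + y; }
}

public class SymmetryDemo {
    public static void main(String[] args) {
        A a = new A(1);
        B b = new B(1, 2);
        System.out.println(a.equals(b)); // true: A.equals only sees an A
        System.out.println(b.equals(a)); // false: a is no B, symmetry broken
    }
}
```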

The solution is the canEqual method featured in the title. But since the mentioned article describes it so well, I don't need to repeat this here.

2014-09-21

# Photon contained in its own Schwarzschild radius

As a followup to my previous post, where I showed that the Schwarzschild radius $r_s$ of a photon with wave length $$\lambda_o =2\sqrt{2\pi}\, l_p = 2\sqrt{\frac{Gh}{c^3}}$$ is $\lambda_o/2$, I want to add a few simple fun calculations.

Here $l_p$ is the Planck length, $G$ the gravitational constant, $c$ the speed of light and $h$ Planck's constant.

The energy of a photon of frequency $\nu$ is $E_p = h\nu$, where $\nu=c/\lambda$ for a given wave length $\lambda$. With the specific $\lambda_o$ we get \begin{align*} E_o &= h c/\lambda_o \\ &= \frac{1}{2} hc \sqrt{\frac{c^3}{Gh}} \\ &= \frac{1}{2} \sqrt{\frac{h^2 c^5}{Gh}} = \frac{1}{2} \sqrt{\frac{h c^5}{G}} \end{align*}

Using Einstein's formula $E=mc^2$ relating energy $E$ and mass $m$, the mass of this photon is \begin{align*} m_o &= E_o/c^2 \\ &= \frac{1}{2} \sqrt{\frac{h c^5}{c^4 G}} = \frac{1}{2} \sqrt{\frac{h c}{G}} \end{align*} Replacing $h$ by $2\pi\hbar$ we arrive at $$m_o = \frac{1}{2}\sqrt{\frac{2\pi\hbar c}{G}} = \frac{1}{2} \sqrt{2\pi}\, m_p$$ where $m_p$ is the Planck mass.

So the photon for which one wave length "fits" into its Schwarzschild radius sphere has a wave length of $2\sqrt{2\pi}$ times the Planck length and a mass of $\frac{1}{2}\sqrt{2\pi}$ times the Planck mass. And I wonder if I messed up something here to be left with these silly factors?

I also wanted to know the numerical value of the frequency of such a photon to see where it is located in the electromagnetic spectrum. The frequency is $$\nu_o = c/\lambda_o = \frac{1}{2} \sqrt{\frac{c^5}{G h}}$$ which Google happily computes for us without the need to type in all those digits for the constants to be $$\nu_o = 3.70003533\cdot 10^{42}\, \text{Hz} .$$ It is fun to note that this contains the Answer to the Ultimate Question of Life, the Universe, and Everything.
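As a sanity check of that number, here is a tiny snippet (Java used merely as a calculator, with rounded values for the constants):

```java
public class PhotonFrequency {
    public static void main(String[] args) {
        double c = 299_792_458.0; // speed of light, m/s
        double G = 6.674e-11;     // gravitational constant, m^3/(kg s^2)
        double h = 6.626e-34;     // Planck's constant, J s

        // nu_o = (1/2) sqrt(c^5 / (G h))
        double nu = 0.5 * Math.sqrt(Math.pow(c, 5) / (G * h));
        System.out.printf("nu_o = %.3e Hz%n", nu); // about 3.70e42 Hz
    }
}
```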

A more serious note is that this frequency is 23 orders of magnitude beyond gamma rays, where Wikipedia's description of the electromagnetic spectrum ends. My hunch is that we are not going to generate such a photon anytime soon.

2014-09-19

# Wavelength and Schwarzschild Radius of a Photon

Inspired by the question whether there is a smallest length, I wondered what it takes, at least formally, to have a photon that might contain itself in its own black hole.

Mass is able to deflect the path of light, i.e. photons. The more mass there is, the stronger the deflection. If a given mass $m$ is compressed into a sphere smaller than its Schwarzschild radius, it is no longer only a deflection: the light cannot escape from that sphere anymore. The formula for the Schwarzschild radius $r_s$ of $m$ is $$r_s(m) = \frac{2Gm}{c^2}$$ where $G\approx 6.6\times 10^{-11}\frac{m^3}{kg\cdot s^2}$ is the gravitational constant and $c=299\,792\,458\frac{m}{s}$ is the speed of light. For a photon with frequency $\nu$, its energy is $h\nu$, where $h\approx 6.6\times 10^{-34}\,Js$ is Planck's constant. This energy can be related to a mass using Einstein's famous formula $E=mc^2$ to get $$m = h\nu/c^2 .$$ Due to the fixed relation $c=\lambda\nu$ between the frequency $\nu$ and the wave length $\lambda$ of a photon, we can express the mass also as $$m = \frac{h}{\lambda c} .$$ We can insert this relation into the formula for $r_s(m)$ and get $$r_s(m) = \frac{2Gh}{\lambda c^3} .$$

The interesting bit is that $r_s$ as well as $\lambda$ have the unit of length, so we are relating the wave length of a photon to its Schwarzschild radius. Further, as we decrease the wave length $\lambda$ of a photon, its frequency and thereby its energy increases, as does its Schwarzschild radius. Consequently we can ask when $\lambda$ and $r_s$ are equal. Or, rather, we can ask when a photon of wave length $\lambda$ "fits" into a sphere of radius $r_s$, i.e. $\lambda = 2r_s$ or $\lambda/2=r_s$.

This is the case when $r_s = a = \lambda/2$, where $a$ is an arbitrary new variable which we now enter into the last equation for $\lambda/2$ and $r_s(m)$ to get $$a = \frac{Gh}{ac^3} .$$ We solve this for $a$ and get $$a = \sqrt{\frac{Gh}{c^3}}.$$ So the wave length $\lambda$ of a photon "fits" into a sphere the size of its Schwarzschild radius $r_s(m)$ when both are equal to $a$, which is $$r_s = \sqrt{\frac{Gh}{c^3}} = \lambda/2 .$$ This may not look very interesting, but physicists will recognize this square root as something they know, but not quite. Looking up the Planck length $$l_p = \sqrt{\frac{G \hbar}{c^3}}$$ and knowing that $h = \hbar\cdot 2\pi$, we see that $$r_s = \lambda/2 = \sqrt{\frac{G\cdot2\pi\hbar}{c^3}} = \sqrt{2\pi}\, l_p .$$
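A quick numerical check of this identity, again with rounded values for the constants (the Planck length value is the CODATA figure, assumed here):

```java
public class SchwarzschildCheck {
    public static void main(String[] args) {
        double c  = 299_792_458.0; // speed of light, m/s
        double G  = 6.674e-11;     // gravitational constant, m^3/(kg s^2)
        double h  = 6.626e-34;     // Planck's constant, J s
        double lp = 1.616255e-35;  // Planck length, m

        // r_s = sqrt(G h / c^3) should equal sqrt(2 pi) * l_p
        double rs = Math.sqrt(G * h / Math.pow(c, 3));
        System.out.printf("r_s            = %.4e m%n", rs);
        System.out.printf("sqrt(2 pi) l_p = %.4e m%n",
                Math.sqrt(2 * Math.PI) * lp);
    }
}
```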

Does this mean that when we confine a photon of wave length $\lambda=2\sqrt{2\pi}\,l_p$ into a sphere of radius $\lambda/2$, it cannot escape and in particular cannot spread out of this sphere?

I wonder whether this quite simple result is trivial, given the definitions of the Planck units, or whether it is a deeper consequence of the theory behind the Schwarzschild radius.

2014-09-14

# Java unmodifiable vs. immutable vs. recursively immutable

During my current experiments with abstract polynomials for Java, I thought it would be good to implement them as immutable and so searched the Internet for an immutable list for Java. What I found were blogs that use immutable and unmodifiable synonymously, as well as at least one blog which clearly draws the distinction, as also explained in this stackoverflow answer.

For the sake of clarity, let me try to define three related concepts:

unmodifiable
shall denote an object that itself has no methods that change its state,
immutable
shall denote an object that is unmodifiable and, in addition, makes defensive shallow copies of incoming and outgoing objects stored in fields,
recursively immutable
shall denote an object that is immutable and has only fields with recursively immutable content. We leave perfidious changes to the object by reflection out of the picture.

The problem with Java is that immutable objects no longer fit into the collections framework. This is most obvious with Collections.singletonList(). While it returns an immutable list, as the documentation says, this list can be considered broken for the simple reason that it is immutable. Although the List interface explicitly allows operations on lists to throw an UnsupportedOperationException, this can lead to bugs which are hard to track down. The list will be passed around in the program from one place to the next, and eventually some code tries to add an element to the list, because this is what one typically expects to be able to do with a list: boom, you get an UnsupportedOperationException out of nowhere. And it is even an unchecked exception, to be even more surprising.
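The surprise is easy to reproduce in a few lines:

```java
import java.util.Collections;
import java.util.List;

public class SingletonListSurprise {
    public static void main(String[] args) {
        List<String> list = Collections.singletonList("only");
        try {
            list.add("more"); // looks like a perfectly normal list operation ...
        } catch (UnsupportedOperationException e) {
            // ... but blows up at runtime, possibly far away from
            // where the list was created
            System.out.println("boom: " + e);
        }
    }
}
```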

Making objects immutable by implementing an interface for mutable objects only halfway and throwing RuntimeExceptions from the mutating methods really looks like a hack. Some people argue that to fix this, Java's mutable collection interfaces need to inherit from immutable ones. That would require sneaking an ImmutableCollection in as a parent interface of Collection. But looking at the Scala approach, this might not be needed. A completely new hierarchy of immutable collections, however, would indeed be necessary.

2014-09-01

# Everything new here

Now the time has come: I can generate my website statically from a few XML templates. Until two months ago I used a set of XSL transformations for this, which I had laboriously pieced together years ago. Then I wanted to change one tiny little thing and had to realize for the umpteenth time that XSL is simply nonsense: there are too many things that a programmer is used to being possible in every programming language, but which in XSL either do not work at all, or only via rather odd constructions.

I had had enough. The next attempt would have been ready-made software for generating static websites, but none of the various Google results appealed to me. So I wrote it myself, based on and with the help of Xmldego, a package I had already set up as an experiment in 2009. The result is a few simple Java classes with which I insert my HTML-formatted texts into simple templates, out of which the complete website then emerges. I will publish the package here shortly. The advantages over XSL: Java is a real programming language and not a caricature, and I have taken back control.

2014-08-24

# Method calls over the network (RPC, RMI)

Remote Procedure Call, Remote Method Invocation, or whatever you want to call it: sometimes it is helpful to be able to distribute the processing of your data across several machines.

First I looked at Storm, a complete framework for distributed computing. I made a serious effort to use it, but then gave up: for one thing, in my application it amounts to hunting sparrows with cannons; for another, it requires a zoo of obscure class libraries; and thirdly, some versions of the less obscure ones collided with the versions I needed.

I feared something similar from Hadoop, and in particular I do not need map-reduce, so I did not even look at it.

My requirements are quite simple:

• Java
• calling a method in a server on another machine
• simple

In the end I took a closer look at the following candidates:

So far I have written working code with RMI and Hessian and was pleasantly surprised that this is basically simple; still, both technologies have their quirks in my eyes. Let me try to give an overview.

|                                | RMI        | Hessian           | Google PB | Thrift   |
|--------------------------------|------------|-------------------|-----------|----------|
| IDL compiler or reflection     | reflection | reflection        | compiler  | compiler |
| server                         | in the JDK | servlet container | –         | built in |
| multiple programming languages | –          | yes               | yes       | yes      |

For myself, I drew the following conclusions:

• With Apache Thrift and Google Protocol Buffers I have to compile the interface from an interface description.
• With Google PB you apparently have to write your own server, since it is "merely" a serialization protocol.
• Hessian always needs a servlet container such as Tomcat or Jetty. Depending on the application, that can be quite a nuisance.
• RMI really only works for Java. Moreover, it always uses at least two TCP ports: the first is used for a registry, which then assigns each service object its own port. Through a firewall that can get difficult. On top of that, RMI depends on knowing its own host name correctly. It is not enough for the client to be able to reach the target server; the target server must also know the name under which the client sees it, because that name is then handed to the client as the target for the actual service object.

Since I am only looking for a solution for Java, RMI would basically be entirely sufficient for me. But the fuss with that registry would still have to be gotten rid of somehow. Simply specifying the IP address and port of the target machine and off you go: that would be my favorite. No registry, no servlet container and, above all, no IDL compiler. But that does not seem to exist at the moment.

In the end it became Hessian. Once Tomcat is configured, it works without problems. Measurements by a colleague show that with the large data blocks we transfer, the HTTP overhead is negligible.

My recent experiment is an HTML app showing Open Street Map maps. It is particularly targeted at mobile devices with JavaScript support for geolocation.

Here is the map.