Tag archive for 'Java'

Autocasting in Java – An approach

I’m writing some Java code, and for the thousand-and-first time I end up with something like this:

public void someMethod(AnyType a) {
 if (a instanceof DerivateOfAnyType) {
  DerivateOfAnyType d=(DerivateOfAnyType)a;
  d.someMethodOfDerivate();
 }
}

Even though present-day Java has generics, annotations and all that other cool stuff, you still need such casts, and I’m pretty sure that if you replace "AnyType" with "Object", you’ll find such occurrences in your code, too (if you are a Java programmer).

Every time I write such stuff, I think to myself: "Why does the Java compiler not perform the cast on its own?" I mean: after a successful "instanceof" check, I can be sure that the object is an instance of the type in question. Why do I have to name that type explicitly again? The code would become much more readable, and no more confusing, if it were changed into


public void someMethod(AnyType a) {
 if (a instanceof DerivateOfAnyType) {
  a.someMethodOfDerivate(); // ((DerivateOfAnyType)a).someMethodOfDerivate();
 }
}

A quick Google search led me to a blog entry by Stephen Colebourne, who had precisely this idea more than five years ago. The discussion in the comments of that article revealed two problems:

  1. It can be ambiguous to refer to the automatically cast object by its old name ("a" in the example). The proposed solution was to define an explicit new name within the instanceof statement:

    public void someMethod(AnyType a) {
     if (a instanceof DerivateOfAnyType as d) {
      d.someMethodOfDerivate();
     }
    }
  2. Even with this syntax, there is still room for ambiguity. Examples were
    boolean b=(a instanceof DerivateType as d);
    or
    if (a instanceof Type1 as t1 || a instanceof Type2 as t2)
    The discussion of this problem ended inconclusively.
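To see why the second example is troublesome, it helps to write out by hand what the compiler would have to generate for it. The following is only a sketch in current Java; the classes AnyType, Type1 and Type2 and their methods are hypothetical stand-ins for the names used above.

```java
// Hypothetical types standing in for "Type1" and "Type2" from the example.
class AnyType {}

class Type1 extends AnyType {
    String methodOfType1() { return "type1"; }
}

class Type2 extends AnyType {
    String methodOfType2() { return "type2"; }
}

class DisjunctionSketch {

    // Hand-written stand-in for the proposed
    //   if (a instanceof Type1 as t1 || a instanceof Type2 as t2) { ... }
    static String describe(AnyType a) {
        Type1 t1 = (a instanceof Type1) ? (Type1) a : null;
        Type2 t2 = (a instanceof Type2) ? (Type2) a : null;
        if (t1 != null || t2 != null) {
            // Inside the block only ONE of the two names refers to an object.
            // The other one is null, so the block needs another check anyway.
            return (t1 != null) ? t1.methodOfType1() : t2.methodOfType2();
        }
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Type1()));   // prints "type1"
        System.out.println(describe(new Type2()));   // prints "type2"
        System.out.println(describe(new AnyType())); // prints "neither"
    }
}
```

With "||", the block is entered as soon as one of the checks succeeds, so neither t1 nor t2 can be trusted without testing it again. That second test is exactly what the auto-cast was supposed to save.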

After reading that, I thought the auto-casting idea through a bit further. The examples above are somewhat pathological. Normally, the whole instanceof check concerns only one type and appears only within an if-construct. So it all comes down to syntactic sugar that saves you from writing down the only sensible cast explicitly. Which brought me to this rule set:

  • The "instanceof" operator is extended by an (optional) "as" part. So it is legal to write a instanceof TypeB as b.
  • An instanceof operator extended this way creates a new name "b" for the object instance referenced by "a". The new name references the same instance, automagically typed as "TypeB".
  • The new name "b" is only available within the block bound to the statement containing the extended instanceof.
  • The extended instanceof syntax is only allowed within a conjunctive, non-negated logical expression (that is, one combined with && only, without !).
  • Furthermore, it is only allowed if it is the only "instanceof" within the logical expression.

These definitions would make the auto-casting unambiguous by restricting it to the case that is actually needed. Things like "if (a!=null && a instanceof B as b)" or "while (a instanceof B as b)" would still be possible, while "if (!(a instanceof B as b))" or "if (a instanceof B as b || a instanceof C as c)" would not. To me, it seems that this definition of an extended instanceof operator would fix the annoying casting syntax of present-day Java for this case while avoiding all the pathological cases described in the blog article and its discussion mentioned above.

Furthermore, note that all the usual rules for casting and for assigning new content to variable names still apply. So, writing something like "a=new TypeA()" would create a new object and leave "b" pointing to the old object. In the same way, "b=new TypeB()" would create a new object, just as it would if "b" had been declared in present-day Java with the explicit cast at the beginning of the "if" block.
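Since the proposed "as" syntax is not valid Java, the rules can only be illustrated by writing out what the compiler would generate. The following is a minimal sketch of that expansion in current Java; TypeA, TypeB and the method names are hypothetical and merely mirror the names used above.

```java
// Hypothetical types standing in for "TypeA" and "TypeB" from the text.
class TypeA {
    String describe() { return "TypeA"; }
}

class TypeB extends TypeA {
    @Override
    String describe() { return "TypeB"; }

    String onlyInB() { return "method of TypeB"; }
}

class AutocastSketch {

    // Proposed:  if (a != null && a instanceof TypeB as b) { ... }
    // Expanded by hand: the cast becomes the first statement of the block.
    static String callIfTypeB(TypeA a) {
        if (a != null && a instanceof TypeB) {
            TypeB b = (TypeB) a; // what the "as b" part would generate
            return b.onlyInB();
        }
        return "not a TypeB";
    }

    // Reassigning "a" inside the block leaves "b" pointing to the old object.
    static boolean reassignmentKeepsB(TypeA a) {
        TypeA original = a;
        if (a instanceof TypeB) {
            TypeB b = (TypeB) a;
            a = new TypeA(); // "a" now refers to a fresh object
            return b == original && a != original;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(callIfTypeB(new TypeB()));        // prints "method of TypeB"
        System.out.println(callIfTypeB(new TypeA()));        // prints "not a TypeB"
        System.out.println(reassignmentKeepsB(new TypeB())); // prints "true"
    }
}
```

Because "b" is an ordinary local variable in this expansion, its scope ends with the block, which matches the third rule.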

Of course, all of this is only true if I have not missed something in my reasoning. But if it works out, I think it would be a nice little extension to the Java syntax that eradicates a rather cumbersome shortcoming. And, of course, it is all compiler-only. Does anyone know how to write a JSR?

Jython memory leak/Out Of Memory problem

Note: This article sat unpublished for quite some time. Its main parts date back to December 2007. Therefore, if some version numbers seem outdated: I am referring to the state of things back then.

We have a Java application with an embedded Jython scripting engine. The Jython scripts do mass computations on data sets. So far, we had at most 3000 to 4000 data sets in one chunk. Now a new customer is starting and will have 8000 or more data sets.

"No big deal," I thought, and started the test run one day before our customer is to have the whole thing running on the production system for the first time. Computation starts: 1000… 2000… 3000… 4000… 5000… bang: "Out of memory. You should try to increase heap size". The server application halts completely and without any further warning.

I’m a bit shocked. The big problems always arise in places where one would definitely not expect them. I try workarounds: Splitting the run into smaller chunks does not work. Reinitializing the Jython environment periodically makes things worse. Replacing our rather outdated (but otherwise functional) Jython 2.1 with the then-current Jython 2.2.1 makes no difference. I do not seem to have a chance to cycle through the script more than about 5200 times before I hit an "Out of memory" situation or have to restart the whole server process.

Weird. What should I tell the customer? "Well, you cannot run your computations in one step. Start with the first half, then call us, we will restart the server, then do the second half"? Not a really professional way of doing things. Even more surprising: looking at the memory situation with Runtime.freeMemory() and friends shows that there is no shortage of memory at all. Actually, when the application crashes, it has used no more than 600 MB of its 2048 MB heap space, and more than 50 MB are marked as "free". This is not precisely what I would summarize as "out of memory"…
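For reference, such a memory check can be sketched as follows. The class name MemoryReport is made up for this illustration; the Runtime methods are standard. Note that freeMemory() reports the free space within the currently allocated heap (totalMemory()), not within the maximum heap size (maxMemory()), so a small "free" value alone does not mean the heap limit has been reached.

```java
class MemoryReport {

    static final long MB = 1024L * 1024L;

    // Builds a one-line summary of the JVM heap situation.
    static String report() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper limit the heap may grow to (-Xmx)
        long total = rt.totalMemory(); // heap currently allocated by the JVM
        long free = rt.freeMemory();   // unused part of the allocated heap
        long used = total - free;
        return "max " + (max / MB) + " MB, allocated " + (total / MB)
                + " MB, used " + (used / MB) + " MB, free " + (free / MB) + " MB";
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```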

Finally, poking Google once more brings the solution. I find an article about a very similar problem. Fortunately, it offers a fix and even explains what’s going on: Jython keeps an internal mapping of PyXXX wrappers to Java objects. The default configuration uses normal (strong) references, which keeps these mappings from being garbage collected. Due to mechanisms I do not fully understand, this leads to enormous growth of the mapping set and finally to an out-of-memory situation in the internal resource management.
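The general difference between normal and weak references can be shown in isolation, outside Jython. The following sketch is not Jython's actual implementation; it only uses the standard HashMap and WeakHashMap classes as stand-ins for a table with strong and with weak keys, and the class name MappingSketch is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

class MappingSketch {

    // Registers "count" short-lived key objects in the given table.
    // Nothing else keeps a reference to the keys after this method returns.
    static int fill(Map<Object, String> table, int count) {
        for (int i = 0; i < count; i++) {
            table.put(new Object(), "wrapper " + i);
        }
        return table.size();
    }

    public static void main(String[] args) {
        Map<Object, String> strong = new HashMap<>();
        Map<Object, String> weak = new WeakHashMap<>();

        fill(strong, 100000);
        fill(weak, 100000);

        System.gc(); // only a hint; the JVM decides whether and when to collect

        // The strong table keeps every entry alive, so it can only grow.
        System.out.println("strong table entries: " + strong.size());
        // The weak table may shrink, because its keys are collectable.
        System.out.println("weak table entries:   " + weak.size());
    }
}
```

How many entries the weak table still holds depends on the garbage collector, so the second number varies from run to run. The first number always equals the number of inserted objects.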

Fortunately, the solution is as simple as putting a

System.setProperty("python.options.internalTablesImpl","weak");

somewhere in the code before the Jython subsystem is initialized. Then, the internal table is built with weak references and suddenly, everything runs smoothly. The 8000 data sets are no problem any more and I can deliver the application as expected. Lucky me.
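One way to make sure the property is set early enough is to do it in a dedicated start-up step that runs before any Jython class is touched. This is only a sketch of that ordering: the class name JythonBootstrap is hypothetical, and the actual interpreter creation is left out because it requires Jython on the classpath.

```java
class JythonBootstrap {

    static final String TABLES_PROPERTY = "python.options.internalTablesImpl";

    // Must be called before the application touches any Jython class.
    static void configure() {
        System.setProperty(TABLES_PROPERTY, "weak");
    }

    public static void main(String[] args) {
        configure();
        // Only after this point would the application create its Jython
        // interpreter (omitted here).
        System.out.println(TABLES_PROPERTY + " = " + System.getProperty(TABLES_PROPERTY));
    }
}
```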

There is only one question remaining: What kind of parapsychological abilities are developers expected to have to find such a solution without the luck of stumbling upon an article describing it? And: Why the heck does Jython not use weak references by default? I could not find any problems or even speed penalties.