Tag archive for 'Java'

Autocasting in Java – An approach

I'm just writing some Java code, and for the thousand-and-first time there is something like

public void someMethod(AnyType a) {
 if (a instanceof DerivateOfAnyType) {
  DerivateOfAnyType d = (DerivateOfAnyType) a;
  // ... work with d ...
 }
}

Even though present-day Java has generics, annotations and all that other cool stuff, you still need such casts, and I'm pretty sure that if you replace "AnyType" with "Object", you'll find such occurrences in your own code, too (if you are a Java programmer).

Every time I write such stuff, I think to myself: "Why does the Java compiler not perform the cast on its own?" I mean: once an "instanceof" check has succeeded, I can be sure that the object is an instance of the type in question. Why do I have to name it explicitly again? The code would become much more readable, and no more confusing, if it were changed into

public void someMethod(AnyType a) {
 if (a instanceof DerivateOfAnyType) {
  a.someMethodOfDerivate(); // today: ((DerivateOfAnyType) a).someMethodOfDerivate();
 }
}

Searching with Google quickly led me to a blog entry by Stephen Colebourne, who had precisely this idea more than five years ago. The discussion in the comments of that article revealed two problems:

  1. It can be ambiguous to refer to the auto-cast object by its old name ("a" in the example). The proposed solution was to define an explicit new name within the instanceof statement:

    public void someMethod(AnyType a) {
     if (a instanceof DerivateOfAnyType as d) {

  2. Even with this syntax, there is still room for ambiguity. Examples were

    boolean b = (a instanceof DerivateType as d);
    if (a instanceof Type1 as t1 || a instanceof Type2 as t2)

    For this problem, the discussion ended inconclusively.
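
To see why the disjunction is troublesome, here is a small sketch in present-day Java. The names DisjunctionDemo, Type1, Type2 and describe are made up for illustration. Inside the outer if block only one of the two checks is known to have succeeded, so neither "t1" nor "t2" could be bound safely and the code has to test again.

```java
public class DisjunctionDemo {
    // Hypothetical types, only for illustration.
    static class Type1 { String one() { return "one"; } }
    static class Type2 { String two() { return "two"; } }

    static String describe(Object a) {
        if (a instanceof Type1 || a instanceof Type2) {
            // At this point we only know that ONE of the checks succeeded.
            // A name bound by the first check would be meaningless if the
            // second one matched, so each case has to be tested again.
            if (a instanceof Type1) {
                return ((Type1) a).one();
            }
            return ((Type2) a).two();
        }
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Type1()));
        System.out.println(describe(new Type2()));
        System.out.println(describe("something else"));
    }
}
```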

After reading that, I elaborated on the auto-casting idea a bit. The examples above are a bit pathological. Normally, the whole instanceof check is for only one type and only within an if construct. So it all comes down to a piece of syntactic sugar that saves you from writing down the only sensible cast explicitly. This brought me to the following rule set:

  • The "instanceof" operator is extended by an (optional) "as" part. So it is legal to write a instanceof TypeB as b.
  • An instanceof operator extended this way creates a new name "b" for the object referenced by "a"; "b" refers to the same instance, automagically typed as "TypeB".
  • The new name "b" is only available within the block bound to the statement containing the extended instanceof.
  • The extended instanceof syntax is only allowed within a conjunctive, non-negated logical expression.
  • Furthermore, it is only allowed if it is the only "instanceof" within that logical expression.

These definitions would make auto-casting unambiguous by restricting it to the case where it is actually needed. Things like "if (a != null && a instanceof B as b)" or "while (a instanceof B as b)" would still be possible, while "if (!(a instanceof B as b))" or "if (a instanceof B as b || a instanceof C as c)" would not. To me, it seems that this definition of an extended instanceof operator would get rid of the annoying casting syntax of present-day Java for this case while avoiding all the pathological cases described in the blog article and its discussion mentioned above. Furthermore, note that all the usual rules for casting and for assigning new content to variable names still apply. So writing something like "a = new TypeA()" would create a new object and leave "b" pointing to the old one. In the same way, "b = new TypeB()" would create a new object, just as it would if "b" had been declared in present-day Java with the explicit cast at the beginning of the "if" block.
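
As a minimal sketch of what I have in mind, this is how a compiler could translate the extended instanceof into present-day Java. The class names AutoCastDesugar, TypeA and TypeB and the methods handle and onlyInB are assumptions made up for this example. The sketch also shows the reassignment rule: rebinding "a" leaves "b" pointing to the original object.

```java
public class AutoCastDesugar {
    // Hypothetical types standing in for "TypeA" and "TypeB".
    static class TypeA { }
    static class TypeB extends TypeA {
        String onlyInB() { return "B-specific"; }
    }

    // Proposed source:  if (a != null && a instanceof TypeB as b) { ... }
    // Below is the present-day Java the compiler would generate for it.
    static String handle(TypeA a) {
        if (a != null && a instanceof TypeB) {
            TypeB b = (TypeB) a;   // the cast that the "as b" part stands for
            a = new TypeA();       // rebinding "a" does not touch "b"
            return b.onlyInB();    // "b" still refers to the original object
        }
        return "not a TypeB";
    }

    public static void main(String[] args) {
        System.out.println(handle(new TypeB()));
        System.out.println(handle(new TypeA()));
    }
}
```

Since "b" is declared inside the if block, it is only visible there, which matches the scoping rule from the list.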

Of course, all of this is only true if I am not missing something in my reasoning. But if it works out, I think it would be a nice little extension to the Java syntax that eradicates a rather cumbersome shortcoming. And, of course, it is all compiler-only. Does anyone know how to write a JSR?

Jython memory leak/Out Of Memory problem

Note: This article sat unpublished for quite some time. Its main parts date back to December 2007. Therefore, if some version numbers seem outdated: I am referring to the state of things back then.

We have a Java application with an embedded Jython scripting engine. The Jython scripts do mass computations on data sets. So far, we had 3000-4000 data sets in one chunk at most. Now a new customer is starting and will have 8000 or more data sets.

"No big deal," I thought, and started the test run one day before our customer was due to run the whole thing on the production system for the first time. Computation starts: 1000… 2000… 3000… 4000… 5000… bang: "Out of memory. You should try to increase heap size". The server application halts completely and without any further warning.

I'm a bit shocked. The big problems always arise in places where one would definitely not expect them. I start trying workarounds: Splitting the run into smaller chunks – does not work. Reinitializing the Jython environment periodically – makes things worse. Replacing our rather outdated (but otherwise functional) Jython 2.1 with the then-current Jython 2.2.1 – makes no difference. I do not seem to have a chance to cycle through the script more than about 5200 times before I hit an "Out of memory" situation – or have to restart the whole server process.

Weird. What should I tell the customer? "Well, you cannot run your computations in one step. Start with the first half, then call us, we will restart the server, then do the second half."??? Not a really professional way of doing things. Even more surprising, looking at the memory situation with Runtime.freeMemory() and friends shows that there is no shortage of memory at all. Actually, when the application crashes, it has used no more than 600 MB of the 2048 MB heap space, and more than 50 MB are marked as "free". This is not exactly what I would call "out of memory"…
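
For reference, a minimal sketch of the kind of check I mean by "Runtime.freeMemory() and friends". The class name MemoryReport and the helper snapshot are made up for this example; the three Runtime calls are standard Java. Used memory is simply the currently allocated heap minus its free part.

```java
public class MemoryReport {
    /** Returns {used, free, total, max} in bytes, as reported by the JVM. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently allocated by the JVM
        long free = rt.freeMemory();   // unused part of that allocated heap
        long max = rt.maxMemory();     // upper limit the heap may grow to
        return new long[] { total - free, free, total, max };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] m = snapshot();
        System.out.println("used  " + m[0] / mb + " MB");
        System.out.println("free  " + m[1] / mb + " MB");
        System.out.println("total " + m[2] / mb + " MB");
        System.out.println("max   " + m[3] / mb + " MB");
    }
}
```

Note that "free" only refers to the heap allocated so far. The real headroom is larger as long as "total" is still below "max".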

Finally, poking Google once more brings the solution. I find an article about a very similar problem. Fortunately, it has a solution and even explains what is going on: Jython keeps an internal mapping of PyXXX wrappers to Java objects. The default configuration uses normal (strong) references, which makes these mappings immune to garbage collection. Due to mechanisms I do not fully understand, this leads to enormous growth of the mapping set and finally to an out-of-memory situation in the internal resource management.
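
The general effect can be illustrated with plain Java collections. This is not Jython's internal table, just a generic sketch with made-up names (WrapperTableDemo, fillAndCount), assuming java.util.HashMap as the strongly referencing table and java.util.WeakHashMap as the weakly referencing one. A HashMap keeps every key alive, while a WeakHashMap allows the garbage collector to drop entries whose keys are no longer referenced elsewhere.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

public class WrapperTableDemo {
    /** Puts n throwaway keys into the table and returns how many entries remain. */
    static int fillAndCount(Map<Object, String> table, int n) {
        for (int i = 0; i < n; i++) {
            // The key is referenced only by the table once put() returns.
            table.put(new Object(), "wrapper-" + i);
        }
        System.gc(); // only a hint; the JVM is free to ignore it
        return table.size();
    }

    public static void main(String[] args) {
        int n = 100000;
        System.out.println("strong: " + fillAndCount(new HashMap<Object, String>(), n));
        System.out.println("weak:   " + fillAndCount(new WeakHashMap<Object, String>(), n));
    }
}
```

The strong table always reports all n entries. How many entries the weak table still holds depends on when the garbage collector runs, so the second number varies between runs.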

Fortunately, the solution is as simple as putting a one-line setting somewhere in the code before the Jython subsystem is initialized. Then the internal table is built with weak references and suddenly everything runs smoothly. The 8000 data sets are no problem any more and I can deliver the application as expected. Lucky me.
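
A sketch of what such a setting can look like. Treat the details as assumptions: I am assuming that the switch is the Jython registry key "python.options.internalTablesImpl" with the value "weak", and that Jython picks it up from the Java system properties. The class name JythonWeakTables and the method configure are made up for this example.

```java
public class JythonWeakTables {
    // Assumed registry key and value; please check them against your Jython version.
    static final String KEY = "python.options.internalTablesImpl";
    static final String VALUE = "weak";

    /** Has to be called before any Jython class is initialized. */
    static void configure() {
        System.setProperty(KEY, VALUE);
    }

    public static void main(String[] args) {
        configure();
        // Only from here on would the embedded interpreter be created.
        System.out.println(KEY + " = " + System.getProperty(KEY));
    }
}
```

The important part is the ordering: the property has to be in place before the interpreter starts up, because the internal table is created during initialization.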

There is only one question remaining: What kind of parapsychological abilities are developers expected to have to find such a solution without the luck of stumbling over an article describing it? And: Why the heck does Jython not use weak references by default? I could not find any problems or even speed penalties.