Jython memory leak / Out Of Memory problem

Note: This article has been unpublished for quite some time. Its main parts date back to December 2007. Therefore, if some version numbers seem to be outdated, I am referring to the state they had back in those days.

We have a Java application with an embedded Jython scripting engine. The Jython scripts do mass computations on data sets. So far, we had at most 3000–4000 data sets in one chunk. Now a new customer starts and will have 8000 and more data sets.

“No big deal,” I thought, and started the test run one day before our customer would have the whole thing running on the production system for the first time. Computation starts: 1000… 2000… 3000… 4000… 5000… bang: “Out of memory. You should try to increase heap size.” The server application halts completely and without any further warning.

I’m a bit shocked. The big problems always arise at places where one would definitely not expect them. I start circumvention attempts: splitting the run into smaller chunks – does not work. Reinitializing the Jython environment periodically – makes things worse. Replacing our rather outdated (but otherwise functional) Jython 2.1 with the then-current Jython 2.2.1 – does not matter. I do not seem to have a chance to cycle more than about 5200 times through the script before I run into an “Out of memory” situation – or have to restart the whole server process.

Weird. What should I tell the customer? “Well, you cannot run your computations in one step. Start with the first half, then call us, we will restart the server, then do the second half.” Not a really professional way of doing things. Even more surprising: looking at the memory situation with Runtime.freeMemory() and friends shows that there is no shortage of memory at all. Actually, when the application crashes, it has used no more than 600 MB out of 2048 MB of heap space, and more than 50 MB are marked as “free”. This is not precisely what I would summarize as “out of memory”…
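For reference, this is the kind of heap introspection meant above – a minimal, self-contained sketch using the standard Runtime calls (the class name HeapStats is mine, not from our application):

```java
// Minimal sketch: logging JVM heap usage with Runtime.freeMemory() and
// friends, the same calls used to rule out a plain heap shortage.
public class HeapStats {

    // Used heap in bytes: currently allocated minus currently free.
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Print a one-line heap summary in megabytes.
    public static void logHeap() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.printf("heap: used=%dMB free=%dMB total=%dMB max=%dMB%n",
                usedBytes() / mb, rt.freeMemory() / mb,
                rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) {
        logHeap();
    }
}
```

In our case such a log line printed right before the crash showed plenty of headroom – which is exactly why the error message was so misleading.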

Finally, poking Google once more brings the solution. I find an article about just such a problem. Fortunately, it has a solution and even explains what’s going on: Jython keeps an internal mapping of PyXXX wrappers to Java objects. The default configuration uses normal (strong) references, which makes these mappings resistant to garbage collection. Due to mechanisms I do not fully understand, this leads to enormous growth of the mapping set and finally to an out-of-memory situation in the internal resource management.
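The strong-vs-weak distinction is the whole point here, so a tiny stdlib demonstration may help (this is plain java.lang.ref, not Jython’s actual table code, and the names are mine): an object held only by a WeakReference can be garbage collected, while a strongly referenced one cannot.

```java
import java.lang.ref.WeakReference;

// Illustrates why a weakly referenced mapping entry can be garbage
// collected while a strongly referenced one keeps its object alive.
public class WeakDemo {

    // Trigger GC repeatedly (bounded) until the weak reference is cleared.
    public static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object wrapper = new Object();  // stands in for one PyXXX wrapper
        WeakReference<Object> weak = new WeakReference<>(wrapper);

        // While a strong reference exists, GC must keep the object alive.
        System.gc();
        System.out.println("alive while strongly referenced: " + (weak.get() != null));

        wrapper = null;  // drop the only strong reference
        System.out.println("collected after GC: " + clearedAfterGc(weak));
    }
}
```

With strong references in the internal table, every wrapper ever created stays reachable – which matches the unbounded growth we saw.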

Fortunately, the solution is as simple as putting a

    System.setProperty("python.options.internalTablesImpl", "weak");

somewhere in the code before the Jython subsystem is initialized. Then the internal table is built with weak references and suddenly everything runs smoothly. The 8000 data sets are no problem any more and I can deliver the application as expected. Lucky me.

There is only one question remaining: what kind of parapsychological abilities are developers expected to have to find such a solution without the luck of stumbling over an article describing it? And: why the heck does Jython not use weak references by default? I could not find any problems or even speed penalties.


5 thoughts on “Jython memory leak / Out Of Memory problem”

  • spacekadety

Thank you very much for this hint, we are having the same issue and have been looking for this solution for a while ;-). Happy coding everyone!

  • Denis Haskin

Interesting. We’re running into what sounds like a very similar problem, and I tried applying your fix, which didn’t seem to have any effect (and Jim’s comment implies that it wouldn’t have had any effect, anyway).

We have to create and store many Python objects (~2 million) before we can process them, and we seem to be getting 10x as many ConcurrentHashMap$HashEntry objects. I can’t say for sure, since I’m running on OS X and I’m having trouble getting hprof to show me a deeper stack trace for its traces. We’re probably going to refactor somewhat and store the objects solely in Java; we can then post-process them in smaller batches via the embedded Python script.