Jython memory leak/Out Of Memory problem

Note: This article has been unpublished for quite some time. Its main parts date back to December 2007. Therefore, if some version numbers seem outdated: I am referring to the state of things back then.

We have a Java application with an embedded Jython scripting engine. The Jython scripts do mass computations on data sets. So far, we had at most 3000-4000 data sets in one chunk. Now a new customer is starting who will have 8000 data sets and more.

“No big deal,” I thought, and started the test run one day before our customer was due to run the whole thing on the production system for the first time. Computation starts: 1000… 2000… 3000… 4000… 5000… bang: “Out of memory. You should try to increase heap size”. The server application halts completely and without any further warning.

I’m a bit shocked. The big problems always arise in places where one would definitely not expect them. I start looking for workarounds: Splitting the run into smaller chunks – does not work. Reinitializing the Jython environment periodically – makes things worse. Replacing our rather outdated (but otherwise functional) Jython 2.1 with the then-current Jython 2.2.1 – makes no difference. I do not seem to have a chance to cycle more than about 5200 times through the script before I hit an “Out of memory” situation – or have to restart the whole server process.

Weird. What should I tell the customer? “Well, you cannot run your computations in one step. Start with the first half, then call us, we will restart the server, then do the second half.”??? Not a really professional way of doing things. Even more surprising, looking at the memory situation with Runtime.freeMemory() and friends shows that there is no shortage of memory at all. Actually, when the application crashes, it has used no more than 600 MB out of 2048 MB heap space, and more than 50 MB are marked as “free”. This is not precisely what I would summarize as “out of memory”…
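For reference, a minimal sketch of such a check using only the standard `java.lang.Runtime` methods. The class name `HeapSnapshot` and the output format are made up for illustration; this is not the code from our application.

```java
public class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    /** Formats the JVM's own view of the heap, in megabytes. */
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper limit the heap may grow to (-Xmx)
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of the reserved heap
        long used = total - free;      // what is actually occupied right now
        return "used=" + used / MB + "MB free=" + free / MB
                + "MB total=" + total / MB + "MB max=" + max / MB + "MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Note that `freeMemory()` only describes the heap reserved so far; the real headroom is roughly `maxMemory() - totalMemory() + freeMemory()`.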

Finally, poking Google once more brings the solution. I find an article about a very similar problem. Fortunately, it has a solution and even explains what’s going on: Jython has an internal mapping of PyXXX wrappers to Java objects. The default configuration uses normal references, which makes these mappings resistant to garbage collection. Due to mechanisms I do not fully understand, this leads to enormous growth of the mapping set and finally to an out-of-memory situation in the internal resource management.
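The difference between normal and weak references can be illustrated with plain JDK collections. The following is only a sketch of the general idea, assuming a table keyed by the wrapped Java object. It is not Jython's actual implementation, and all names in it are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

public class WrapperTableSketch {
    /** Fills the table with n short-lived keys and returns its size afterwards. */
    public static int fill(Map<Object, String> table, int n) {
        for (int i = 0; i < n; i++) {
            Object javaObject = new Object();      // stand-in for a wrapped Java object
            table.put(javaObject, "wrapper-" + i); // stand-in for its PyXXX wrapper
        }
        System.gc(); // only a hint; the JVM is free to ignore it
        return table.size();
    }

    public static void main(String[] args) {
        int n = 100_000;
        // Strong keys: every entry stays reachable through the table itself.
        System.out.println("strong table: " + fill(new HashMap<>(), n));
        // Weak keys: entries whose key is otherwise unreachable may be dropped.
        System.out.println("weak table:   " + fill(new WeakHashMap<>(), n));
    }
}
```

The strong table always reports `n` entries, because the table itself keeps every key alive. The weak table may report far fewer, depending on when the garbage collector runs.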

Fortunately, the solution is as simple as putting a single configuration statement somewhere in the code before the Jython subsystem is initialized. Then, the internal table is built with weak references and suddenly, everything runs smoothly. The 8000 data sets are no problem any more and I can deliver the application as expected. Lucky me.
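For illustration, a minimal sketch of what such a statement could look like. The property name `python.options.internalTablesImpl` and the value `weak` are my assumption, based on the Jython 2.2 registry options, and are not quoted from the text above. The class name is made up. It also assumes that Jython picks up JVM system properties when it initializes.

```java
public class JythonWeakTables {
    // Assumed property name (Jython 2.1/2.2 registry option); not taken from the article text.
    static final String KEY = "python.options.internalTablesImpl";

    /** Must run before the first Jython class (e.g. the interpreter) is initialized. */
    public static void configure() {
        // "weak" asks for an internal table built on weak references (assumed value).
        System.setProperty(KEY, "weak");
    }

    public static void main(String[] args) {
        configure();
        // Only after this point would the embedding application create its interpreter.
        System.out.println(KEY + "=" + System.getProperty(KEY));
    }
}
```

Setting the property after Jython has already started would presumably have no effect, since the internal table is created during initialization.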

There is only one question remaining: What kind of parapsychological abilities are developers expected to have to find such a solution without having the luck to find an article describing it? And: Why the heck does Jython not use weak references by default? I could not find any problems or even speed penalties.

5 Responses to “Jython memory leak/Out Of Memory problem”

  • Thank you very much for this hint, we are having the same issue and have been looking for this solution for a while ;-). Happy coding, everyone

  • Fırat KÜÇÜK

    The same problem occurred. I am searching for a solution.

  • Interesting. We’re running into what sounds like a very similar problem, and I tried applying your fix, which didn’t seem to have any effect (and Jim’s comment implies that it wouldn’t have had any effect, anyway).

    We have to create & store many (~2 million) Python objects before we can process them, and we seem to be getting 10x ConcurrentHashMap$HashEntry objects. I can’t say for sure, since I’m running on OSX and I’m having trouble getting hprof to show me a deeper stack trace for its traces. We’re probably going to refactor somewhat and store the objects solely in Java; we can then post-process in smaller batches via the embedded Python script.

  • InternalTables is completely gone in 2.5, so you should not be running into problems with uncollectable objects now.

  • THX, memorized for future use *smile*
