too many open files

joleenf
Posts: 1123
Joined: Mon Jan 19, 2009 7:16 pm

too many open files

Post by joleenf »

Hi,

I wrote a script so that I could get data values at various points in a GOES-16 image. In this code, I select small lat/lon regions, open the grid, write the data value and then move on to the next lat/lon point. Eventually, I process the next file. Along the way, I usually end up with an error like...

java.io.FileNotFoundException: <datapath>/OR_ABI-L1b-RadF-M3C01_G16_s20170320106247_e20170320117013_c20170320117060.nc (Too many open files)

Should I run collectGarbage() after every file is processed, or is there a better way to remove the grid from memory?

Code is at...
https://gitlab.ssec.wisc.edu/joleenf/ca ... beShell.py

Joleen

Re: too many open files

Post by joleenf »

Actually, collectGarbage() does not remove the data object, and boomstick() does not seem to either.

Joleen
Jon
Posts: 192
Joined: Fri Jan 09, 2009 8:44 pm
Location: Madison, WI

Re: too many open files

Post by Jon »

Hi Joleen,

I'm not able to view your GitLab link (and FWIW I'm logged in), but this seems like something that may involve the open-file limit set through ulimit. IIRC "ulimit -n" reports the maximum number of open files a process may use. If you're opening the files yourself (i.e. you have something to close), you might try using Python's "with" statement or just closing the file manually.
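
As a minimal sketch of the "with" suggestion above, the snippet below uses an ordinary Python file rather than a McIDAS-V grid. The temporary file and the variable names are illustrative assumptions, not anything from the original script.

```python
import os
import tempfile

# Hypothetical demo file, created only for this example.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(demo_path, "w") as out:
    out.write("first line\n")

# The with statement closes the file when the block exits,
# even if an exception is raised inside the block.
with open(demo_path) as handle:
    first = handle.readline()

# After the block, the handle is closed and its descriptor is released.
closed_after_block = handle.closed

os.remove(demo_path)
```

The same idea applies to any object that exposes a close method: the point is that the close happens once per file, rather than being left to garbage collection.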

Re: too many open files

Post by joleenf »

Hi Jon,

I confused myself a bit. boomstick() actually does remove all the grid files opened and listed in the field selector, which I believe addresses the problem with the error reported above. However, I am not sure how I can verify this other than running my script and noting the absence of a "too many files open" error. boomstick() does not remove the data object returned by the last loadGrid call. I am not sure if that is a problem. I have been told that re-using an old data object name could be a problem. However, I don't want to make a new name each time I run, especially if I cannot clear the data to free memory once it is no longer needed.

Thanks,
Joleen
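
One possible way to check whether file handles are actually being released, sketched here under the assumption of a Linux or macOS system that exposes the process's descriptors under /proc/self/fd or /dev/fd. The helper name count_open_files is hypothetical, and the temporary file only stands in for a grid file left open.

```python
import os
import tempfile

def count_open_files():
    # Assumption: Linux lists descriptors under /proc/self/fd and
    # macOS under /dev/fd. Returns None where neither exists.
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(fd_dir):
            return len(os.listdir(fd_dir))
    return None

before = count_open_files()
handle = tempfile.TemporaryFile()   # stands in for a grid file left open
during = count_open_files()
handle.close()
after = count_open_files()
```

The count covers the whole process, so the trend between loop iterations is more informative than the absolute number: a count that keeps climbing from file to file suggests handles are not being closed.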

Re: too many open files

Post by joleenf »

Even with boomstick() I can still hit the "too many files open" problem.

Joleen

Re: too many open files

Post by Jon »

Hi Joleen,

I apologize for the delayed reply, but I come bearing gifts! Well, that's the intent at least. If you put the following snippet into your Jython library, you should be able to do things like "with managedLoadGrid(...) as foo:" and it should automatically try calling netCDF's close method once you exit the "with block". managedLoadGrid just passes all its parameters off to loadGrid, so hopefully any changes required will be minimal.


Code:

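from contextlib import contextmanager

@contextmanager
def managedLoadGrid(*args, **kwargs):
    # Pass every argument straight through to loadGrid.
    grid = loadGrid(*args, **kwargs)
    try:
        yield grid
    finally:
        # Runs on normal exit and when an exception propagates,
        # so the underlying netCDF file is always closed.
        grid.gridDataset.close()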
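
To illustrate the behaviour outside McIDAS-V, here is a self-contained sketch of the same pattern. FakeDataset, FakeGrid, fake_load_grid and managed_fake_load_grid are hypothetical stand-ins for the real loadGrid machinery and exist only so the example can run anywhere.

```python
from contextlib import contextmanager

class FakeDataset(object):
    # Hypothetical stand-in for the object behind grid.gridDataset.
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

class FakeGrid(object):
    # Hypothetical stand-in for what loadGrid returns.
    def __init__(self):
        self.gridDataset = FakeDataset()

def fake_load_grid(*args, **kwargs):
    return FakeGrid()

@contextmanager
def managed_fake_load_grid(*args, **kwargs):
    grid = fake_load_grid(*args, **kwargs)
    try:
        yield grid
    finally:
        grid.gridDataset.close()

# Normal exit: the dataset is open inside the block, closed after it.
with managed_fake_load_grid(filename="a.nc") as grid_ok:
    open_inside = not grid_ok.gridDataset.closed

# Exit through an exception: the dataset is still closed.
try:
    with managed_fake_load_grid(filename="b.nc") as grid_err:
        raise ValueError("simulated failure while reading a point")
except ValueError:
    pass
```

In a loop over many files, wrapping each load in its own "with" block should mean that at most one file is open at a time, provided the close call really releases the handle.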

Re: too many open files

Post by joleenf »

Hi Jon,

I am sorry it took me a while to respond.

When I tried adding the code and using "with managedLoadGrid()", I saw an error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'generator' object has no attribute '__exit__'


Joleen
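
One situation that can produce an error of this kind, though the thread does not confirm it was the cause here, is a generator function used in a "with" statement without the @contextmanager decorator applied. The sketch below uses hypothetical function names to show the difference.

```python
from contextlib import contextmanager

def plain_generator():
    # No @contextmanager: calling this returns a bare generator,
    # which does not provide __enter__ and __exit__.
    yield "grid"

@contextmanager
def decorated_generator():
    # With the decorator, the call returns a context manager.
    yield "grid"

try:
    with plain_generator() as value:
        pass
    failure = None
except (AttributeError, TypeError) as err:
    # Older interpreters raise AttributeError; newer Python 3
    # releases raise TypeError instead.
    failure = type(err).__name__

with decorated_generator() as value:
    decorated_value = value
```

If the function was pasted without its decorator line, or the decorator import failed, the bare generator would be what reaches the "with" statement.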
bobc
Posts: 990
Joined: Mon Nov 15, 2010 5:57 pm

Re: too many open files

Post by bobc »

Hi Joleen -

Apologies for the delayed response to this. I've been attempting to replicate your error message but I've been unsuccessful thus far. If this is still a problem for you, can you please let us know the following:

  1. Did you put the managedLoadGrid function (the code above from Jon) in your Jython Library or is it directly in your script? It should be in the Jython Library.
  2. Are you seeing this error right away, or does it appear at some point in the middle of the script?
  3. Can you please post the updated version of your script that uses managedLoadGrid? Perhaps it is being implemented in a way that isn't working correctly.

Thanks -
Bob

Re: too many open files

Post by joleenf »

Hi Bob,

I just checked this again with a quick script:

Code:

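import os
from glob import glob

# Build the path to the data directory.
home=expandpath('~')
fileDirectory=os.path.join(home,'data','goesr','checkout','CMIPM2')

# Take the first file that matches the pattern.
fileGlob=os.path.join(fileDirectory,'*M3C11*s201710915*')
myFile=glob(fileGlob)[0]

loadParms=dict(
    field='CMI',
    stride=20
)

# managedLoadGrid closes the file when the with block exits.
with managedLoadGrid(filename=myFile,**loadParms) as data:
    print data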


This worked, so this is no longer a problem. Sorry for the extra work. I am not sure why this was failing originally: I did not save the original attempt, and I have recently updated to a newer nightly build on both machines that I use.

Joleen