Hi,
I wrote a script so that I could get data values at various points in a GOES-16 image. In this code, I select small lat/lon regions, open the grid, write the data value and then move on to the next lat/lon point. Eventually, I process the next file. Along the way, I usually end up with an error like...
java.io.FileNotFoundException: <datapath>/OR_ABI-L1b-RadF-M3C01_G16_s20170320106247_e20170320117013_c20170320117060.nc (Too many open files)
Should I run collectGarbage() after every file is processed, or is there a better way to remove the grid from memory?
Code is at...
https://gitlab.ssec.wisc.edu/joleenf/ca ... beShell.py
Joleen
too many open files
Re: too many open files
Actually, collectGarbage() does not remove the data object, and it does not seem that boomstick() does either.
Joleen
Re: too many open files
Hi Joleen,
I'm not able to view your GitLab link (and FWIW I'm logged in), but this seems like something that may require ulimit. IIRC "ulimit -n" will report the maximum number of open files a process may use. If you're opening the files yourself (i.e. you have something to close), you might try using Python's "with statement" or just close the file manually.
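A minimal sketch of both suggestions, assuming a plain CPython interpreter on a Unix-like system (the standard "resource" module may not be available under Jython). It reads the per-process open-file limit that "ulimit -n" reports, then uses a with statement so a file handle is closed as soon as the block exits. The temporary file and the helper name open_file_limit are made up for illustration.

```python
import os
import resource
import tempfile


def open_file_limit():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limit()
print("soft limit: %s, hard limit: %s" % (soft, hard))

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)

# The with statement closes the handle on exit, even if the body raises.
with open(path, "w") as handle:
    handle.write("example\n")

print(handle.closed)
os.remove(path)
```

The soft limit is the one a running process actually hits; closing each file promptly keeps a long loop under it no matter how many files are processed in total.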
Re: too many open files
Hi Jon,
I confused myself a bit. boomstick() actually does remove all the grid files opened and listed in the field selector, which I believe addresses the problem with the error reported above. However, I am not sure how I can verify this other than running my script and noting the absence of a "too many files open" error. boomstick() does not remove the data object which was returned by the last loadGrid executed. I am not sure if that is a problem. I have been told that re-using an old data object name could be a problem. However, I don't want to make a new name each time I run, especially if I cannot clear the data to free memory after the data is not needed.
Thanks,
Joleen
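On the question of verifying that files are really being released: one possible approach, sketched here under the assumption of a Linux-style /proc filesystem, is to count the process's open file descriptors before and after a step. The helper name count_open_fds is hypothetical, and the temporary file only stands in for a data file; this has not been tried inside McIDAS-V.

```python
import os
import tempfile


def count_open_fds():
    """Count this process's open file descriptors, or return None if unknown.

    Assumes a Linux-style /proc filesystem; other systems get None.
    """
    fd_dir = "/proc/self/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


before = count_open_fds()
fd, path = tempfile.mkstemp()  # hold one extra descriptor open
during = count_open_fds()
os.close(fd)                   # release it again
os.remove(path)
after = count_open_fds()

print("open descriptors: before=%s during=%s after=%s" % (before, during, after))
```

If the count keeps climbing from one processed file to the next, something is still holding the files open.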
Re: too many open files
Even with boomstick() I can still hit the "too many files open" problem.
Joleen
Re: too many open files
Hi Joleen,
I apologize for the delayed reply, but I come bearing gifts! Well, that's the intent at least. If you put the following snippet into your Jython library, you should be able to do things like "with managedLoadGrid(...) as foo:" and it should automatically try calling netCDF's close method once you exit the "with block". managedLoadGrid just passes all its parameters off to loadGrid, so hopefully any changes required will be minimal.
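Code:
from contextlib import contextmanager

@contextmanager
def managedLoadGrid(*args, **kwargs):
    # Pass every argument straight through to loadGrid.
    grid = loadGrid(*args, **kwargs)
    try:
        yield grid
    finally:
        # Close the underlying netCDF dataset when the "with block" exits,
        # whether or not the body raised an exception.
        grid.gridDataset.close()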
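To see the pattern in isolation, here is a small sketch that runs outside McIDAS-V. FakeDataset and managed are hypothetical stand-ins (not part of McIDAS-V or netCDF); the point is only that the finally clause closes the resource both on a normal exit and when the body raises.

```python
from contextlib import contextmanager


class FakeDataset(object):
    """Hypothetical stand-in for an object whose close() releases a file."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def managed(resource):
    try:
        yield resource
    finally:
        # Runs on normal exit and when the body of the with block raises.
        resource.close()


ok = FakeDataset()
with managed(ok) as ds:
    pass  # normal exit

failed = FakeDataset()
try:
    with managed(failed) as ds:
        raise ValueError("simulated failure while reading")
except ValueError:
    pass  # the exception still propagates after close() has run

print(ok.closed, failed.closed)
```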
Re: too many open files
Hi Jon,
I am sorry it took me a while to respond.
When I added the code and tried using "with managedLoadGrid()", I saw an error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'generator' object has no attribute '__exit__'
Joleen
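One way to produce an error of this shape, offered as a guess rather than a diagnosis of the script in question: if the @contextmanager line is missing, or an older definition without it is still loaded, the function returns a plain generator, which the with statement cannot use. The function name below is hypothetical. Depending on the Python version, the exception is an AttributeError or a TypeError, and the message wording differs.

```python
def undecorated():
    # No @contextmanager here, so calling this returns a bare generator.
    yield "resource"


error = None
try:
    with undecorated() as value:
        pass
except (AttributeError, TypeError) as exc:
    error = exc

print(type(error).__name__, error)
```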
Re: too many open files
Hi Joleen -
Apologies for the delayed response to this. I've been attempting to replicate your error message but I've been unsuccessful thus far. If this is still a problem for you, can you please let us know the following:
- Did you put the managedLoadGrid function (the code above from Jon) in your Jython Library or is it directly in your script? It should be in the Jython Library.
- Are you seeing this error right away, or does it appear at some point in the middle of the script?
- Can you please post the updated version of your script that uses managedLoadGrid? Perhaps it is being implemented in a way that isn't working correctly.
Thanks -
Bob
Re: too many open files
Hi Bob,
I just checked this again with a quick script:
Code:
import os
from glob import glob

home = expandpath('~')
fileDirectory = os.path.join(home, 'data', 'goesr', 'checkout', 'CMIPM2')
fileGlob = os.path.join(fileDirectory, '*M3C11*s201710915*')
myFile = glob(fileGlob)[0]
loadParms = dict(
    field='CMI',
    stride=20
)
with managedLoadGrid(filename=myFile, **loadParms) as data:
    print data
This worked, so this is no longer a problem. Sorry for the extra work; I am not sure why this was failing originally. I did not save the original attempt, and I have recently updated to a newer nightly build on both machines that I use.
Joleen