Hello Marvin,

I think one of the challenges here is that POI will try to keep a Java object representation of the spreadsheet under construction in memory until the entire HSSFWorkbook object is written out and can be GCed. Based on the spreadsheet size indicated by your original post (assuming all cells populated) there will be 100+ million object instances that need to live at the same time. I don't think periodic saves will help because the written sheets, etc. will still be in memory until the whole workbook is reclaimed. Extra GC would certainly help collect any objects created by JDBC and other processing activity, but the accumulation of spreadsheet data will diminish the heap space available until either it runs out altogether or the user runs out of patience waiting for memory thrashing.

So I suspect the only way to deal with this is either to grow the heap enough to hold the entire result and make sure it gets enough physical memory (i.e. pool size) to prevent too much thrashing. Otherwise you might need to do as Bill suggests and split it out so that you can run this job in multiple parts.

Would it be possible to write it to CSV instead and the open that in excel? If formatting is not an issue then this would be much more memory efficient as you can just stream to file and forget each row as it passes by...

t.

On 20-Oct-08, at 12:59 PM, Blalock, Bill wrote:

This could be a garbage collection problem.

Java objects which are no longer needed hang around in memory until
garbage collection sweeps them up and puts them back into the memory
pool. The JVM does garbage collection when it has nothing else to do
... when it is waiting something else.

It sounds like your programs are running full throttle until heap space
is depleted and forces garbage collection. I could be wrong about that.

You have some control over garbage collection for Java programs executed
by RUNJVA:
GCHINL (garbage collection initial size)
GCHMAX (garbage collection maximum size)

Prompt up RUNJVA, get into extended help, and you can read about garbage
collection.

When Java is run from QSH you can indirectly effect garbage collection
by setting the initial and maximum heap sizes. This will effect when
Java start to run short of heap memory (if that is the problem).

In the Java code itself you can use the method
System.gc();
to "suggest" to the JVM that it should collect the garbage.

I hope that helps.

Bill Blalock

-----Original Message-----
From: java400-l-bounces@xxxxxxxxxxxx
[mailto:java400-l-bounces@xxxxxxxxxxxx] On Behalf Of Marvin Radding
Sent: Monday, October 20, 2008 11:21 AM
To: java400-l@xxxxxxxxxxxx
Subject: A question about JAVA JVM and large EXCEL spreadsheet creation

I have created a command (CVTXLS) to convert a file into an EXCEL
spreadsheet using the HSSF/POI classes from Jakarta project. (Thanks to
Scott Klement for his documentation)

I am using this command to convert a 466,000 record file into a
spreadsheet. It has 113 columns of mostly numeric data. The code already
is able to break every 65k records and start a new tab.

The problem is after processing over 35k records quite quickly, it is
now working very slowly and I was wondering why? Is this a memory
allocation problem? Can I speed thinks up by saving the file every now
and then?

Can anyone tell me why it suddenly when from 30k records in the first
hour to only a few hundred records in the next hour? But more
importantly, is there anything I can do short of multiple files, to help
the program process the file faster?

Thanks,

Marvin


--
This is the Java Programming on and around the iSeries / AS400
(JAVA400-L) mailing list
To post a message email: JAVA400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/java400-l
or email: JAVA400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/java400-l.

_____________

The information contained in this message is proprietary and/or confidential. If you are not the
intended recipient, please: (i) delete the message and all copies; (ii) do not disclose,
distribute or use the message in any manner; and (iii) notify the sender immediately. In addition,
please be aware that any message addressed to our domain is subject to archiving and review by
persons other than the intended recipient. Thank you.
_____________
--
This is the Java Programming on and around the iSeries / AS400 (JAVA400-L) mailing list
To post a message email: JAVA400-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/java400-l
or email: JAVA400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/java400-l.


As an Amazon Associate we earn from qualifying purchases.

This thread ...

Replies:

Follow On AppleNews
Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].

Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.