Jon:
See embedded remarks below.
(Note that the following comments do not apply to programs running
within PASE.)
HTH,
Mark S. Waterbury
> On 7/12/2015 5:17 AM, Wilson, Jonathan wrote:
> An odd question occurred to me while playing around with some code.
> How does the i handle individual invocations of a list of programs with
> regard to a simple call versus all of the programs compiled into one
> massive single program?
There is always some "overhead" to "activate" each program or service
program within a job or activation group. And so, there is also
additional overhead for the OS to activate multiple programs, versus
activating just one large program where all of the *MODULEs are bound
together into a single *PGM object. However, as POWERx architecture
RISC processors keep getting faster and faster, this "overhead" becomes
less and less noticeable. This was part of the rationale for ILE
providing an ability to bind one or more *MODULEs to create a single
*PGM or *SRVPGM object, using the CRTPGM or CRTSRVPGM commands. (When
ILE was designed, OS/400 ran on much slower IMPI technology and CISC
hardware, so the overhead of activating multiple programs versus one
large program was far more significant and very noticeable.)
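
To make that concrete, here is a minimal ILE C sketch (the program,
module, and library names are made up for illustration) contrasting a
dynamic program call, which carries activation overhead the first time
the target *PGM is called in an activation group, with a bound procedure
call to a *MODULE that was linked into the same *PGM at CRTPGM time:

    /* Dynamic call: WWCUST is a separate *PGM object that the OS must   */
    /* activate on its first call within this job/activation group.      */
    #pragma linkage(WWCUST, OS)         /* ILE C form for calling a *PGM */
    void WWCUST(void);

    /* Bound call: show_cust_types() is exported from another *MODULE    */
    /* that was bound into this same program, e.g.                       */
    /*   CRTPGM PGM(MYLIB/MENU) MODULE(MYLIB/MENU MYLIB/CUSTTYPES)       */
    /* so the call resolves at bind time -- no separate activation.      */
    void show_cust_types(void);

    int main(void)
    {
        WWCUST();            /* dynamic: activation overhead on 1st call */
        show_cust_types();   /* bound: an ordinary procedure call        */
        return 0;
    }
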
> If I have a program that calls another program that calls another
> program (say a menu program calling WW-customers calling ww-customer
> types) I now have 3 programs on the call stack. Does/can the i decide
> that, as the first two programs are no longer actively running and are
> effectively paused until the most recently called program returns, they
> can be good candidates for the i equivalent of paging?
IBM i and OS/400, like System/38 CPF before them, are true
"demand-paging" virtual memory systems. The idea of "single level
storage" means that all
objects reside on DASD, and all of real main storage (memory) is just a
(large) "cache" for any objects that are currently (or recently) "in use".
> Does the i even care what part of a program or variable is in or out of
> memory, or does it use some kind of "well that bit hasn't run for a
> little while/long while/very long while" and just shift accordingly,
> having no knowledge of the state of a program within a stack of calls?
With any "demand paged" virtual memory, if a page goes unreferenced for
a long enough period of time, eventually, its "page frame" (or "slot" in
real memory) may be needed to make room for some other page that needs
to be brought in, and so it might need to get paged out. Note that
modern IBM i POWER6/7/8 systems have vastly larger main storage sizes in
the tens or even hundreds of Gigabytes, compared with IBM i systems of
just a few years ago (POWER4, POWER5, etc.) -- the larger the real main
storage "cache" for holding pages of the single-level storage objects in
real memory, the less likely that other pages need to get "paged out"
(or swapped out, aka "stolen"), to make room to bring in other pages of
other objects that are needed by the same or other jobs.
The hardware (page table entry) maintains a "referenced" and "changed"
bit for each page frame or "slot." Whenever a program fetches from a
page (e.g. reads the value of a variable or loads executable
instructions into the processor, etc.), the "referenced" bit is set on
for that page slot. Whenever a program alters the value in a variable,
the "changed" bit gets set on for that page in memory. Then,
periodically, the OS scans the page tables to determine which pages are
going "unreferenced" and also notices which pages have changed, and then
those bits are reset to 0. Then, if a slot is needed for another page,
and the page previously in that slot was "changed" it must first be
written back out to "backing store" (DASD), before another page can be
loaded into that slot. On the other hand, if a slot is needed and that
page has not changed, it can simply be overwritten with the new
contents, without first having to save the previous contents. So, the OS
will usually prefer to replace such unchanged "read-only" pages before
forcing changed pages out to "backing store," because that requires
additional I/O operations that slow down the whole process: the old page
contents must be saved before the new page contents can be loaded.
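
To make that replacement preference concrete, here is a small, purely
conceptual C sketch (not IBM i's actual paging code; the structure and
function names are invented) of how referenced and changed bits drive
the choice of which frame to steal:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct frame {
        bool referenced;   /* set by hardware on any fetch from the page */
        bool changed;      /* set by hardware on any store into the page */
    };

    /* Choose a frame to steal: an unreferenced, unchanged page is the   */
    /* cheapest victim (just overwrite it); a changed page must first be */
    /* written back to backing store, costing an extra I/O operation.    */
    static size_t pick_victim(const struct frame *f, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            if (!f[i].referenced && !f[i].changed)
                return i;             /* overwrite directly              */
        for (size_t i = 0; i < n; i++)
            if (!f[i].referenced)
                return i;             /* must page out old contents 1st  */
        return 0;                     /* everything was recently used    */
    }

    /* Periodic scan: note usage, then clear the referenced bits so the  */
    /* next interval starts fresh.                                       */
    static void age_frames(struct frame *f, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            f[i].referenced = false;
    }

    int main(void)
    {
        struct frame frames[4] = {
            { true,  true  },   /* hot, dirty                            */
            { false, true  },   /* cold, dirty                           */
            { false, false },   /* cold, clean -- ideal victim           */
            { true,  false },   /* hot, clean                            */
        };
        printf("victim: frame %zu\n", pick_victim(frames, 4)); /* -> 2   */
        age_frames(frames, 4);
        return 0;
    }
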
> If, however, I was to write the programs as say three service programs
> or three modules and then link them into one massive "pc application"
> style program how much difference would there be as to how the i would
> handle such a program in comparison to the first example?
Service programs (*SRVPGM) are very much like Dynamic Link Libraries
(DLLs) in OS/2 or Microsoft Windows, or like "shared libraries" in Unix
or Linux. You can bind one or more *MODULEs to create one large
*SRVPGM, and then many *PGMs can use it and call procedures within the
same *SRVPGM, and within the same job, there is only one "activation" of
that *SRVPGM (per activation group). In some ways, this can give you
many of the benefits of "linking everything into one large application
program" as you suggested. But, there are also maintenance advantages to
using service programs, versus linking many of the same *MODULEs into
many different *PGMs. Consider what happens when you need to make
changes to one of these *MODULEs. You would then need to hunt down
every *PGM it is bound into, and then replace it there (e.g. using
UPDPGM). With *SRVPGMs, you can strive to ensure that each *MODULE is
only ever bound into one *SRVPGM, and then all client programs
dynamically call the procedures of that *MODULE within that one *SRVPGM,
and that way, when you need to make changes to that *MODULE, you have
just one *SRVPGM object to update (via the UPDSRVPGM command).
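
As a sketch of that maintenance pattern (the library, object, and
procedure names below are invented for illustration), the module itself
is ordinary ILE C, and the CL build and refresh steps, shown in
comments, are why only the one *SRVPGM needs updating when the module
changes:

    /* custtypes.c -- compiled into a module with, for example:          */
    /*   CRTCMOD MODULE(MYLIB/CUSTTYPES) SRCFILE(MYLIB/QCSRC)            */
    /* and bound once into a single service program:                     */
    /*   CRTSRVPGM SRVPGM(MYLIB/CUSTSRV) MODULE(MYLIB/CUSTTYPES)         */
    /*             EXPORT(*ALL)                                          */
    /* Client programs bind to the service program by reference:         */
    /*   CRTPGM PGM(MYLIB/WWCUST) MODULE(MYLIB/WWCUST)                   */
    /*          BNDSRVPGM(MYLIB/CUSTSRV)                                 */
    /* After changing this module, only the one *SRVPGM is refreshed:    */
    /*   UPDSRVPGM SRVPGM(MYLIB/CUSTSRV) MODULE(MYLIB/CUSTTYPES)         */

    /* Exported procedure that every client calls through the *SRVPGM.   */
    const char *cust_type_desc(int type)
    {
        switch (type) {
        case 1:  return "Retail";
        case 2:  return "Wholesale";
        default: return "Unknown";
        }
    }

In practice you would usually also use binder source (EXPORT(*SRCFILE)
with a binder language member) so the service program's signature stays
stable and client programs do not have to be re-created after an update.
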
> Also the start up of "a program". I can't recall if the i loads all of
> the program and initialises all of its variable memory when a program is
> called, or if it just loads chunks of code and then initialises the
> variables the first time it comes across them, or even some combination
> - perhaps something totally different - of the above.
This is called "activation". What happens is, some virtual memory is
allocated for any static storage (variables) for that *PGM or *SRVPGM,
within the activation group that it is running under. This is completely
separate from the "executable code" which simply resides in single-level
storage (virtual memory) and gets paged-in and paged-out on demand.
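
A tiny C illustration of that separation (just a sketch; the names are
hypothetical): the static variable below lives in storage that is
allocated and set to its initial value when the program is activated in
the job's activation group and then persists across calls, while the
executable code itself simply resides in single-level storage and is
paged in on demand:

    #include <stdio.h>

    /* Static storage: one copy per activation (per job/activation       */
    /* group), allocated and initialized to 0 at activation time.        */
    static int times_called = 0;

    void do_work(void)
    {
        int local = 42;             /* automatic storage: per invocation */
        times_called++;
        printf("call %d, local %d\n", times_called, local);
    }

    int main(void)
    {
        do_work();   /* prints: call 1, local 42                         */
        do_work();   /* prints: call 2, local 42 -- static value kept    */
        return 0;
    }
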
> I understand, as far as I can recall, the concepts of single level
> storage where everything is just an address, with the OS handling the
> details of whether the address is in memory or on disk etc. But at some
> point, for a newly started program, the variables used must be set up to
> be unique to the user/job/etc. running the program, so until that point
> the memory for the program's variables has no address... after that the
> variables can be dealt with by the single level storage, in memory, in
> NVRAM, on SSDs or on good old spinning disks.
With single-level storage, everything (both data and executable code)
resides in the same vast address space, and gets "demand paged" into
memory when it is first referenced. During program activation, any
initialized static storage will also be initialized to its "initial
value" at that time.
> I'm guessing, but might be wrong, that the actual program (real good old
> honest logic code) in its non-active state looks no different from the
> program once it's in an active state, so the program "code" fits nicely
> into the single level storage concept and there is no overhead of
> un-packing code from its on-disk structure to its in-memory footprint.
All compilers on IBM i or OS/400 always generate "reentrant code" so
just one copy of the executable code can always be shared across any
number of jobs and processes. This is also in the same single-level
storage virtual memory, and so the code gets paged in and paged out in
the same way as any other virtual memory pages. Note however that if
multiple jobs are using the same code, it will be getting referenced far
more often, and so become far less likely to ever get selected to be
"paged out."
> I also can't remember if the "loaded" program is loaded once but points
> to unique data per job, or if a job running a program has a unique copy
> of both the code and the variable memory.
The "activation" mentioned above is the part the is unique per job or
per activation group. So each job has a "local" copy of the data and
variables, but the executable code is always shared.
> While writing the above, something niggled at the back of my memory that
> said the first time a variable is accessed it gets a fault saying
> something like "not initialised", which differs from a "not in
> memory/page" fault... but I might be wrong; it was so long ago that I
> read up on it.
For more detailed information, see Dr. Frank Soltis' books: "Inside the
AS/400" and "Fortress Rochester." You can find used copies on e-bay or
various used bookstores on-line, such as www.abebooks.com, or you might
even find copies at a public or university library.
> Jon.