Re: Different Physical File Attributes. -- MIDRANGE-L

Vern, Rob, Charles.  Thanks so much for the input.  I have a much better
understanding of DB2/400 now (and some of its history).

Vern, sorry about using "SQL" generally.  Please note that I only slipped up
once (I think).  ;-)
Earlier in the thread I made sure to say "MS SQL" or "SQL Server".  In the
end I might have been typing too fast.

Thanks again.

<vhamberg@xxxxxxxxxxx> wrote in message
news:041220062241.8357.443D8227000383A8000020A52200750784099D0A0D030E0890@xxxxxxxxxxxxxx
> Hi Ryan
>
> I have to smile a bit (with no intended offense to be sure) at the
questions - because these details are things we never think about much if at
all when working on the iSeries. Lots of DBA stuff is just handled by the
OS. And most of these details are not general public knowledge anyway.
>
> Things like "strongly favoring CHAR" have more to do with the heritage
than any SQL decision or attempt to prevent fragmentation. I mean, by
definition, everything is fragmented on the iSeries - all objects are
striped across multiple arms (each arm is a disk drive in PC terms) and this
is to give optimum performance for multiple users.
>
> JDE came from long ago on the iSeries - its tables (physical files) were
originally designed before there was much SQL on the machine or its
predecessor, the System/38. Variable length data types simply did not exist.
The implementation of VARCHARs involves 2 varieties of low-level objects
called spaces. As you might have noticed, iSeries has an extra attribute for
VARCHARs called allocated length. This is the amount of space in the
fixed-length main data space. As Charles, I think, said, data longer than
the allocated length is stored in what is called an auxiliary space. Each
space is 16MB in size and there can be several of these as needed.
>
> These spaces are the primary building block of lots of things on the
iSeries - we don't deal with them as such, however. Physical files (how SQL
tables are implemented) are made up of several kinds of spaces - there are
several subtypes. Again, these are not things we normally mess with.
>
> Having said that, it is possible to see the structures, but without
documentation that few people have, it is difficult to traverse. Suffice to
say, I did this while contracted to the database performance lab in
Rochester. The main data space(s) consist of space headers that have some of
their characteristics. The actual data comes in rows with (for those without
variable length and without null-capable) a single byte that contains the
deleted flag. Deleted records still exist, they are just not processed. You
asked about RGZPFM - yes deleted records are compressed out. I've always
thought of it as a copy of active records to a temp copy, then emptying out
the original and copying back in. Don't know about page split pointers - if
such a thing is used, they are resolved. But I cannot assume that what you
mean by page is the same thing as used on iSeries.
>
> There is much more to be said - and most of it is not useful to know on
the iSeries, IMO.  Sorry, but the matter of database agnosticism has little
to do, I think, with the low-level implementation of the database that you
seem to be describing.
>
> Just a bit more - you use the term "SQL" - I have a feeling you mean "SQL
Server" - I think it is best to remember that "SQL" is a language, "SQL
Server" is an RDBMS that uses SQL as the language for data definition and
manipulation. The same confusion is rife where I work, too, esp. among those
who have not come from the iSeries. Not a criticism, it is just confusing
semantically for me.
>
> On another matter, the indexes on iSeries are always separate from the
tables. DB2 on AIX has the option, I believe, of interleaving the data with
the index records, which might seem to offer some performance advantages. Is
this like the clustered indexes you describe? But indexes on iSeries are
very efficient modified B-trees. Also, if the columns being retrieved are
all in the index, then the data file is usually not used to get the data -
known as index-only access method - and this gets rid of the "2 reads" IO
problem with indexed access.
>
> Let me suggest a link that might be helpful -
> http://www-03.ibm.com/servers/enable/site/db2/porting.html
> has several links that might be interesting. There are porting guides for
coming from, e.g., Oracle and SQL Server. Another link that is especially
good, as it discusses the general architecture of iSeries, is labeled
"General iSeries Porting Information".
>
> This has been interesting, hope it is helpful.
> Vern
>
> -------------- Original message -------------- 
> From: "Ryan Hunt" <ryan.hunt@xxxxxxxxxxxxx>
>
> > Charles, thanks for the correction.
> >
> > Since this "heap" like format is so common on the iSeries, I'm inclined
to
> > think that iSeries tables are not as apt to suffer from the table
> > fragmentation and the forward-pointer random I/O issue as described in
the
> > previously included link.
> >
> > However, with that being said, JDE table design does STRONGLY favor the
use
> > of CHAR (or rather, strongly dis-favors the use of all variable lengh
data
> > types) which pads values with spaces as necessary. This padding may
waste
> > some drive space, but it does make it MUCH less likely that records will
> > change record length after an update. This in turn will decrease page
> > splits (which in SQL would prevent a forward-pointer). Maybe this is
> > another tactic used to be RDBMS agnostic - or maybe iSeries performance
> > suffers from page splits as well.
> >
> > Would you be so gracious as to entertain yet another one of my
questions?
> > Also, if forward-pointers do exist on the iSeries, does the RGZPFM of an
> > object remove deleted records and resolve page split pointers?
> >
> > "Wilt, Charles" wrote in
> > message
> >
news:F520B5C51DB10041B239BC06383A7EDC01C48967@xxxxxxxxxxxxxxxxxxxxxxxxxx
> > > > -----Original Message----- 
> > > > From: midrange-l-bounces@xxxxxxxxxxxx
> > > > [mailto:midrange-l-bounces@xxxxxxxxxxxx]
On
> > Behalf Of Ryan Hunt
> > > > Sent: Wednesday, April 12, 2006 2:42 PM
> > > > To: midrange-l@xxxxxxxxxxxx
> > > > Subject: Re: Different Physical File Attributes.
> > > >
> > > > Charles, thanks for all the info. While you've been learning
> > > > SQL server
> > > > you've expanded my knowledge of the iSeries.
> > > >
> > > > I still have a question about the "Arrival" sequence tables.
> > > > "Arrival"
> > > > sequence tables on the iSeries are synonymous (or most
> > > > synonymous) with heap
> > > > tables in SQL Server (as you've stated).
> > >
> > > You misunderstood me.
> > >
> > > Arrival sequence tables on the iSeries are _not_ synonymous with heap
> > > tables in SQL server.
> > >
> > > In fact, both "arrival" and "keyed" iSeries tables work like SQL
server
> > > heap tables.
> > >
> > > On the iSeries, it doesn't matter if you're dealing with a "keyed
> > > tabled" or an "arrival" table, the physical layout of the data is
> > > generally in arrival sequence. I say generally, because if all you do
> > > are ADDs and UPDATEs then the data is in arrival sequence. If you do
> > > DELETEs and ADDs and the table is set to reuse deleted records, by
> > > default tables created using SQL instead of DDS are so configured;
then
> > > there is no real order to the physical data.
> > >
> > > > In the SQL Server
> > > > world, heaps are
> > > > notoriious for page fragmentation and forward-pointer issues.
> > > >
> > > > Rather than writing a big explanation, check out the section titled:
> > > > "Problems with not having a clustered index " on the followint page:
> > > > http://www.quest-pipelines.com/newsletter-v4/1103_B.htm
> > >
> > > Actually, that's the page where I found out the info about SQL server.
> > >
> > > >
> > > > Do iSeries arrival sequence tables suffer from the same or
> > > > similar maladies?
> > >
> > > No. As far forward pointers go, the iSeries doesn't use them. Rows are
> > > never normally moved. The example given for SQL server, expanding a
> > > VARCHAR column, doesn't apply as the iSeries allocates a fix amount of
> > > space (>=0) for VARCHAR (and BLOB,CLOB) fields in the standard row
data
> > > space. Anything more than the allocated space is stored outside the
> > > standard row data space.
> > >
> > > As far as fragmentation, I'm not sure that it's a big deal for heap
> > > tables in SQL server, "Fragmentation matters little in heap tables
> > > because the data rows do not have to be kept in order." from
> > >
http://www.dba-sql-server.com/sql_server_tips/t_hpsdba_41_indexdefrag_db
> > > cc_dbreindex.htm
> > >
> > > As far as the iSeries goes, the above quote certainly applies since
> > > iSeries tables don't have an order.
> > >
> > > Fragmentation in general on the iSeries isn't usually a concern.
Single
> > > level store is part of the reason. Plus the fact that the iSeries,
like
> > > its predecessors were designed from the ground up to be scalable
> > > multi-user systems. Part of that design was a RAID 0 disk subsystem.
> > > The OS is highly optimized for dealing with pieces of objects
scattered
> > > all over the place. Like memory, the more disks you add to a iSeries
> > > the better it performs.
> > >
> > > >
> > > > Thanks for all the insite.
> > > >
> > > > Ryan
> > > >
> > >
> > > HTH,
> > >
> > > Charles Wilt
> > > -- 
> > > iSeries Systems Administrator / Developer
> > > Mitsubishi Electric Automotive America
> > > ph: 513-573-4343
> > > fax: 513-398-1121
> > >
> > >
> > > -- 
> > > This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing
> > list
> > > To post a message email:
> > MIDRANGE-L@xxxxxxxxxxxx
> > > To subscribe, unsubscribe, or change list options,
> > > visit: http://lists.midrange.com/mailman/listinfo/midrange-l
> > > or email: MIDRANGE-L-request@xxxxxxxxxxxx
> > > Before posting, please take a moment to review the archives
> > > at http://archive.midrange.com/midrange-l.
> > >
> > >
> >
> >
> >
> > -- 
> > This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing
list
> > To post a message email:
MIDRANGE-L@xxxxxxxxxxxx
> > To subscribe, unsubscribe, or change list options,
> > visit: http://lists.midrange.com/mailman/listinfo/midrange-l
> > or email: MIDRANGE-L-request@xxxxxxxxxxxx
> > Before posting, please take a moment to review the archives
> > at http://archive.midrange.com/midrange-l.
> >
> -- 
> This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing
list
> To post a message email:
MIDRANGE-L@xxxxxxxxxxxx
> To subscribe, unsubscribe, or change list options,
> visit: http://lists.midrange.com/mailman/listinfo/midrange-l
> or email: MIDRANGE-L-request@xxxxxxxxxxxx
> Before posting, please take a moment to review the archives
> at http://archive.midrange.com/midrange-l.
>
>