20.3 How to Design for Minimum Main Storage Use (especially with Java, C, C++)
The iSeries family has added popular languages whose usage continues to increase -- Java, C, C++.
These languages frequently use a different kind of storage -- heap storage.
Many iSeries programmers with a background in RPG or COBOL are unaware of the influence this may
have on storage consumption. Why? Simply because RPG and COBOL, by their nature, make little
if any use of the heap. Meanwhile, C, C++, and Java typically do.
The implications can be very profound. Many programmers are unclear about the tradeoffs and, when
reducing memory usage, frequently attack the wrong problem. It is surprisingly easy, with these
languages, to consume many megabytes and even hundreds of megabytes of main storage without really
understanding how or why it happened.
Conversely, with the right understanding of heap storage, a programmer might be able to solve a much
larger problem on the identical machine.
Theory -- and Practice
This is one place where theory really matters. Often, programmers wonder whether a theory applies in
practice. After surveying a set of applications, we have concluded that the theory of memory usage
applies very widely in practice.
In computer science theory, programmers are taught to think about how many “entities” there are, not
how big each entity is. It turns out that controlling the number of entities matters most in terms of
controlling main storage -- and even processor usage (it costs some CPU, after all, to have and
initialize storage in the first place). This is largely a function of design, but also of storage layout. It is
also a matter of knowing which storage is critical and which is not. Formally, the literature talks about:
Order(1) -- about one entity per system
Order(N) -- about “N” entities, where “N” is something like the number of data base records, Java
objects, and like items.
Order(N log N) -- this can arise because there is a data base and it has an accompanying index.
Order(N squared) -- data base joins of two data bases can produce this level of storage cost
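
To give a feel for how differently these orders grow, here is a minimal Java sketch that counts entities for one value of N. It rests on assumptions not found in the text above: the class and method names are hypothetical, the logarithm is taken base 2 and rounded down, and every constant factor is taken as 1. Real storage costs depend on the constants, so only the relative growth is meaningful.

```java
public class OrderGrowthSketch {
    /** Integer floor of log base 2 (assumption: base 2; requires n >= 1). */
    static long floorLog2(long n) {
        return 63 - Long.numberOfLeadingZeros(n);
    }

    /** Order(N): one entity per element of the problem. */
    static long orderN(long n) {
        return n;
    }

    /** Order(N log N): for example, records plus an accompanying index. */
    static long orderNLogN(long n) {
        return n * floorLog2(n);
    }

    /** Order(N squared): for example, a join of two data bases of N records each. */
    static long orderNSquared(long n) {
        return n * n;
    }

    public static void main(String[] args) {
        long n = 1024; // illustrative problem size, chosen arbitrarily
        System.out.println("Order(1)         = 1");
        System.out.println("Order(N)         = " + orderN(n));
        System.out.println("Order(N log N)   = " + orderNLogN(n));
        System.out.println("Order(N squared) = " + orderNSquared(n));
    }
}
```

With N = 1024, the Order(N squared) count is about a thousand times the Order(N) count, which is why the order of a design matters before any byte sizes are considered.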
Note the emphasis on “about.” It is the number of entities in relation to the elements of the problem that
counts. An element of the problem is not a program or a subsystem description. Those are Order(1) costs.
It is a data base record, an object allocated from the heap inside a loop, or anything like these examples.
In practice, Order(N) storage predominates, so this paper will concentrate on Order(N).
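
The difference between an Order(1) and an Order(N) number of heap entities can be sketched in Java as follows. Everything in this sketch is hypothetical: the Item class, the method names, and the allocation counter exist only for illustration and do not measure actual main storage. The first method allocates one object per record inside the loop and retains all of them. The second allocates a single object and reuses it.

```java
import java.util.ArrayList;
import java.util.List;

public class EntityCountSketch {
    /** Hypothetical record type; each instance is one heap entity. */
    static final class Item {
        static long allocated = 0; // counts constructor calls, for illustration only
        long amount;

        Item(long amount) {
            this.amount = amount;
            allocated++;
        }
    }

    /** Order(N): one heap object per record is allocated in the loop and kept. */
    static long totalKeepingAll(int n) {
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            items.add(new Item(i)); // allocation inside the loop
        }
        long total = 0;
        for (Item item : items) {
            total += item.amount;
        }
        return total;
    }

    /** Order(1): a single object is allocated once and reused for every record. */
    static long totalReusingOne(int n) {
        Item scratch = new Item(0); // the only allocation
        long total = 0;
        for (int i = 0; i < n; i++) {
            scratch.amount = i;
            total += scratch.amount;
        }
        return total;
    }

    public static void main(String[] args) {
        Item.allocated = 0;
        long first = totalKeepingAll(1000);
        System.out.println("kept all:   total=" + first + ", objects=" + Item.allocated);

        Item.allocated = 0;
        long second = totalReusingOne(1000);
        System.out.println("reused one: total=" + second + ", objects=" + Item.allocated);
    }
}
```

Both methods compute the same total, but the first creates 1000 heap entities where the second creates one. Whether the second form is possible depends on the design: it applies only when the records do not all need to be held at once.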
Of course, one must eventually get down to actual sizes. Thus, one ends up estimating actual costs like
this, where “a” is the fixed Order(1) cost in bytes and “b” is the cost in bytes of each of the N entities:
ActualCostForOrder(1) = a
ActualCostInBytes(N) = a + (b x N)
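
As a worked illustration of the estimate a + (b x N), here is a small Java sketch. The values chosen for a and b are invented for the example and are not measurements of any real program. The class and method names are likewise hypothetical.

```java
public class StorageCostSketch {
    /** Estimated bytes: fixed cost a plus per-entity cost b times entity count n. */
    static long actualCostInBytes(long a, long b, long n) {
        return a + b * n;
    }

    public static void main(String[] args) {
        long a = 4096; // assumed fixed (Order(1)) cost in bytes
        long b = 48;   // assumed cost in bytes per entity
        for (long n : new long[] {1_000L, 100_000L, 10_000_000L}) {
            System.out.println("N=" + n + " -> " + actualCostInBytes(a, b, n) + " bytes");
        }
    }
}
```

Under these assumed values, ten million entities cost 480,004,096 bytes, of which the fixed cost a contributes only 4,096. Trimming b from 48 to 32 bytes would save 160,000,000 bytes, whereas eliminating a entirely would save almost nothing. This is the sense in which the number of entities, and the per-entity cost multiplied by it, dominates.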
IBM i 6.1 Performance Capabilities Reference - January/April/October 2008
© Copyright IBM Corp. 2008
Chapter 20 - General Tips and Techniques