
Table 6.3   i5/OS V5R4 Web Serving Relative Capacity for Static (varied sizes)

Relative Capacity Metrics

Transaction Type:         | 1K Bytes | 1K Bytes | 10K Bytes | 10K Bytes | 100K Bytes | 100K Bytes
KeepAlive                 | Off      | On       | Off       | On        | Off        | On
Static Page - IFS         | 1.558    | 2.016    | 1.347     | 1.793     | 0.830      | 1.068
Static Page - Local Cache | 2.407    | 3.538    | 2.095     | 3.044     | 0.958      | 1.243
Static Page - FRCA        | 11.564   | 34.730   | 7.691     | 13.539    | 1.873      | 2.622

Notes/Disclaimers:

• These results are relative to each other and do not scale with other environments.

• IBM System i CPU features without an L2 cache will have lower web server capacities than the CPW value would indicate.
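Because the Table 6.3 figures are only meaningful relative to each other, the most useful way to read them is as ratios. The following is a minimal Python sketch that does this arithmetic. The dictionary layout and the function names `keepalive_gain` and `gain_over_ifs` are illustrative choices, not part of any IBM tool, and the pairing of each value with KeepAlive On or Off assumes the column order recovered from the flattened table.

```python
# Relative capacities transcribed from Table 6.3 (assumed column pairing:
# each page size has one KeepAlive On value and one KeepAlive Off value).
TABLE_6_3 = {
    "IFS": {
        "1K": {"on": 2.016, "off": 1.558},
        "10K": {"on": 1.793, "off": 1.347},
        "100K": {"on": 1.068, "off": 0.830},
    },
    "Local Cache": {
        "1K": {"on": 3.538, "off": 2.407},
        "10K": {"on": 3.044, "off": 2.095},
        "100K": {"on": 1.243, "off": 0.958},
    },
    "FRCA": {
        "1K": {"on": 34.730, "off": 11.564},
        "10K": {"on": 13.539, "off": 7.691},
        "100K": {"on": 2.622, "off": 1.873},
    },
}


def keepalive_gain(kind: str, size: str) -> float:
    """Ratio of KeepAlive On capacity to KeepAlive Off capacity."""
    cell = TABLE_6_3[kind][size]
    return cell["on"] / cell["off"]


def gain_over_ifs(kind: str, size: str, keepalive: str = "on") -> float:
    """Capacity of a serving method relative to plain IFS serving."""
    return TABLE_6_3[kind][size][keepalive] / TABLE_6_3["IFS"][size][keepalive]


if __name__ == "__main__":
    # Print both ratios for every serving method and page size.
    for kind, sizes in TABLE_6_3.items():
        for size in sizes:
            print(
                f"{kind:12s} {size:>5s}  "
                f"KeepAlive gain x{keepalive_gain(kind, size):.2f}  "
                f"vs IFS x{gain_over_ifs(kind, size):.2f}"
            )
```

Read this way, the table shows FRCA with KeepAlive on serving 1K pages at roughly 17 times the IFS capacity, and KeepAlive roughly tripling FRCA capacity at 1K. Both advantages shrink as the page size grows toward 100K.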

Figure 6.3   i5/OS V5R4 Web Serving Relative Capacity for Static Pages and FRCA

[Bar chart: HTTP Server (powered by Apache) for i5/OS, V5R4 Relative Capacities for Static Pages by Size. Series: IFS, Local Cache, FRCA. X-axis: page size (1KB, 10KB, 100KB), shown first with KeepAlive on and then with KeepAlive off. Y-axis: Relative Capacity, 0 to 35.]

Web Serving Performance Tips and Techniques:

1. HTTP software optimizations by release:

IBM i 6.1 Performance Capabilities Reference - January/April/October 2008. © Copyright IBM Corp. 2008. Chapter 6 - Web Server and WebSphere.

Summary of Contents for 170 Servers

Page 1: ...erstand the performance and tuning factors in IBM i operating system 6 1 and earlier where applicable For the latest updates and for the latest on IBM i performance information please refer to the Per...

Page 2: ...uments are viewable downloadable in Adobe Acrobat pdf format Approximately 1 to 2 MB download Adobe Acrobat reader plug in is available at http www adobe com To request the CISC version V3R2 and earli...

Page 3: ...SD Performance Behavior updates 34 2 13 iSeries for Domino and Dedicated Server for Domino Performance Behavior 33 2 12 Upgrade Considerations for Interactive Capacity 31 2 11 Migration from Tradition...

Page 4: ...JVM to Use 132 Bytecode Verification 131 Garbage Collection 129 JIT Compiler 129 7 4 Classic VM 64 bit 128 Garbage Collection 128 Native Code 127 7 3 IBM Technology for Java 32 bit and 64 bit 126 7 2...

Page 5: ...Ideas 178 13 1 Summary 178 Chapter 13 Linux on iSeries Performance 176 12 4 Conclusions Recommendations and Tips 176 12 3 Test Description and Results 175 12 2 Performance Improvements for WebSphere...

Page 6: ...ss Considerations 220 14 5 1 3 Specific VIOS Configuration Recommendations Traditional non blade Machines 217 14 5 1 2 Generic Configuration Concepts 216 14 5 1 1 Generic Concepts 216 14 5 1 General V...

Page 7: ...IOPLess IOA 270 15 23 9406 MMA DVD RAM 268 15 21 5XX DVD RAM and Optical Library 267 15 20 5XX Tape Device Rates with 571E 571F Storage IOAs and 4327 U320 Disk Units 265 15 19 5XX Tape Device Rates 26...

Page 8: ...Storage Sizing Guidelines 304 19 2 Dynamic Priority Scheduling 302 19 1 Public Benchmarks TPC C SAP NotesBench SPECjbb2000 VolanoMark 302 Chapter 19 Miscellaneous Performance Information 301 18 5 Summ...

Page 9: ...xx Servers 353 C 8 V5R2 Additions February May July 2003 351 C 7 1 IBM i5 Servers 351 C 7 V5R3 Additions May July August October 2004 July 2005 349 C 6 V5R4 Additions January May August 2006 and Janua...

Page 10: ...M future direction and intent are subject to change or withdrawal without notice and represent goals and objectives only Contact your local IBM office or IBM authorized reseller for the full text of t...

Page 11: ...C Novell Inc Netware Intersolve Inc Q E Intersolve Inc INTERSOLV Hewlett Packard Corporation HP 9000 Hewlett Packard Corporation HP UX Gaphics Software Publishing Corporation Harvard Business Applicat...

Page 12: ...y Engine Support Websphere Application Server including WAS V6 1 both with the Classic VM and the IBM Technology for Java 32 bit VM WebSphere Host Access Transformation Services HATS including the IBM...

Page 13: ...mainframe inspired reliability and availability features flexible capacity upgrades and innovative virtualization technologies New 5 0GHz and 4 4GHz POWER6 processors use the very latest 64 bit IBM P...

Page 14: ...c performance information web site is found at http www ibm com systems i advantages perfmgmt index html IBM i 6 1 Performance Capabilities Reference January April October 2008 Copyright IBM Corp 2008...

Page 15: ...cessing OLTP replaces the term interactive when referencing interactive CPW or interactive capacity Also new in 2003 when ordering a iSeries server the customer must choose between a Standard Package...

Page 16: ...ks Note the system s interactive capacity utilization may not be equal to the utilization of all interactive tasks Reasons for this are discussed in Section 2 10 Managing Interactive Capacity With the...

Page 17: ...tasks Highlights of this new algorithm include the following y As interactive users exceed the installed interactive CPW capacity the response times of those applications may significantly lengthen a...

Page 18: ...ith journaling and commitment control Traditional non server AS 400 system models had a single CPW value which represented the maximum workload that can be applied to that model This CPW value was app...

Page 19: ...nterrupts etc In general a single interactive job will not cause a significant impact to client server performance Microcode task CFINTn for all iSeries models handles interrupts task switching and ot...

Page 20: ...and hence the knee of the curve for workload interaction is at a different point which offers a much higher interactive workload capability compared to the standard server models For the server model...

Page 21: ...Model behavior 0 1 3 Int CPW Full Int CPW Fraction of Interactive CPW 0 20 40 60 80 100 Available CPU available CFINT interactive Server Model CPU Distribution vs Interactive Utilization Available fo...

Page 22: ...he same amount The following example will illustrate this performance capacity interplay 0 20 40 60 80 100 117 of Published Interactive CPU 0 20 40 60 80 100 Available CPU available CFINT interactive...

Page 23: ...an use the entire interactive capacity with no impacts to client server batch workload response times On the current model 170s if the published interactive capacity is exceeded system overhead grows...

Page 24: ...imately 6 7 of the maximum interactive CPW the knee of the curve the client server processing performance of the system becomes increasingly impacted Once the interactive workload reaches the maximum...

Page 25: ...of 70 The interactive CPU percent at the knee equals 70 CPW 240 CPW or 29 2 The maximum interactive CPU percent 7 6 of the Interactive CPW equals 81 7 CPW 240 CPW or 34 Now if the interactive CPU is...

Page 26: ...ty causes the interactive jobs to slow down and more processing power to be allocated to the client server processing As the interactive jobs receive less processing time their impact on client server...

Page 27: ...resulting in server behavior more like V3R6 and V3R7 That is once the knee is exceeded interactive priority is automatically decreased Assuming non interactive is set at priority 50 interactive could...

Page 28: ...5 0 1 3 Int CPW Full Int CPW Fraction of Interactive CPW 0 20 40 60 80 100 Available CPU available interactive Server Dynamic Tuning High Server Demand Knee Available for Client Server 0 1 3 Int CPW F...

Page 29: ...ogram product The monitor collects system data as well as data for each job on the system including the CPU consumed and the type of job By examining the reports generated by the Performance Tools pro...

Page 30: ...0 workstation communications path somewhere within the job It may be a 5250 data stream that is translated into html or sent to a PC for graphical display but the work on the iSeries is fundamentally...

Page 31: ...substantial amount of interactive CPU on a uniprocessor may easily exceed the threshold even though the normal work on the system is well under it On the other hand the same job on a 12 way can use at...

Page 32: ...5 Work with 6 Release 7 Display message 8 Work with spooled files 13 Disconnect Opt Subsystem Job User Type CPU Function Status __ BATCH QSYS SBS 0 DEQW __ QCMN QSYS SBS 0 DEQW __ QCTL QSYS SBS 0 DEQ...

Page 33: ...nt Most systems have peak workload environments Care must be taken to ensure that peaks can be contained in server model environments Some environments could have peak workloads that exceed the intera...

Page 34: ...st AS 400e series with an acceptable CPW rating 49 7 Note that interactive and client server CPWs are not additive Interactive workloads which exceed even briefly the knee of the curve will consume a...

Page 35: ...nce behavior as V5R1 on the Dedicated Server for Domino models The following discussion of the V5R1 Domino complimentary behavior is applicable to V5R2 Five new DSD models were announced with V5R1 The...

Page 36: ...Activity WRKSYSACT command which is part of the IBM Performance Tools for iSeries via the Overall DB CPU util statistic y Management Central by starting a monitor to collect the CPU Utilization Datab...

Page 37: ...ing in each logical partition in order for non Domino work to be treated as complementary processing Other DSD processing requirements such as the 15 DB2 processing guidelines and the 15 non Domino pr...

Page 38: ...is not present in the Linux logical partition By providing support for running Linux logical partitions on the Dedicated Server it allows customers to run Linux based applications such as internet fir...

Page 39: ...ame system consuming less CPU Conclusions Recommendations y For CPU intensive batch applications run time scales inversely with Relative Performance Rating CPWs This assumes that the number synchronou...

Page 40: ...y require slightly more main storage on RISC Therefore with larger memory sizes in conjunction with using Expert Cache these applications may achieve significant performance gains by decreasing the nu...

Page 41: ...receives Consider blocking data in the application Try to place the application on the same system as the frequently accessed data BMC Software the BMC Software logos and all other BMC Software produ...

Page 42: ...y NLSS CCSID translation between columns y User defined table functions y Sort sequence y Lateral correlation y UPPER LOWER functions y UTF8 16 Normalization support NORMALIZE_DATA INI option of YES y...

Page 43: ...MAX MIN SUM CHAR TRUN 35 improved to 45 15 improved to 400 Arithmetic 0 to 15 0 to 15 Select POWER6 Processor POWER5 Processor Query Attribute Figure 4 1 Processing time degradation with decimal float...

Page 44: ...plan already exists in the plan cache the full open time was reduced by up to 30 In addition to the optimization and full open performance improvements for V6R1 there was a comprehensive effort to red...

Page 45: ...they allow for better performance The implementation of queries which require live data may require temporary indexes for example queries that run with a sensitive cursor or with ALWCPYDTA NO In the c...

Page 46: ...gy preview of materialized query tables Also an April 2005 addition to the DB2 FOR i5 OS V5R3 support was query optimizer support for recognizing and using materialized query tables MQTs also referred...

Page 47: ...seconds performance improvements are nominal For subsecond queries there is little to no improvement for most queries As the runtime increases the reduction in runtime and CPU time become more substan...

Page 48: ...to improved locality of reference for the desired records When used incorrectly table partitioning may degrade the performance of queries by an order of magnitude or more particularly when a large nu...

Page 49: ...Delete Support As developers have moved from native I O to embedded SQL they often wonder why a Clear Physical File Member ClrPfm command is faster than the SQL equivalent of DELETE FROM table The re...

Page 50: ...higher peak disk utilization The effects of the SQE enhancements on SQL query performance will vary greatly depending on many factors Among these factors are hardware configuration processor memory si...

Page 51: ...where T1 A T2 A and T1 B VALUE1 and T2 C VALUE2 Database characteristics indexes on T1 A and T2 A exist NO column statistics T1 has 100 million rows T2 has 10 million rows T1 is 1 GB and T2 0 1 GB Si...

Page 52: ...query optimizer can scan EVIs and automatically build dynamic on the fly bitmaps much more quickly than from traditional indexes y EVIs can be built much faster and are much smaller than traditional i...

Page 53: ...complex DSS queries may run an hour or longer The CPU required to run a DSS query can easily be 100 times greater than the CPU required for a typical OLTP transaction Thus it is very important to choo...

Page 54: ...in a separate pool may dramatically reduce query runtime 4 8 Journaling and Commitment Control Journaling The primary purpose of journal management is to provide a method to recover database files Add...

Page 55: ...An estimation of how long access path recovery will take is provided by SMAPP and SMAPP provides a setting for the acceptable length of recovery SMAPP is shipped enabled with a default recovery time F...

Page 56: ...led SMAPP will place the implicit journal entries in the same place SMAPP automatically manages the system journal For the user journal receivers used by SMAPP RCVSIZOPT RMVINTENT as specified on the...

Page 57: ...ds will generally not result in much improvement and may actually degrade performance due to the added communications overhead y Choose a partitioning key that has many different values This will help...

Page 58: ...information on Referential Integrity see the chapter Ensuring Data Integrity with Referential Constraints in DB2 Universal Database for System i Database Programming manual and the redbook Advanced F...

Page 59: ...r applications and can possibly help improve overall system performance particularly in the case of applying changes to remote systems However some care needs to be used in designing triggers for good...

Page 60: ...amount of spill data is below 60 Also by minimizing the size of the file the performance of operations such as CPYF Copy File will also be improved y When using a variable length field as a join fiel...

Page 61: ...ion of Function This section discusses the support for reuse of deleted record space This database support provides the customer a way of placing newly added records into previously deleted record spa...

Page 62: ...ct that reuse by default blocks up records for disk I Os as much as possible y Increasing the number of indexes over a file will cause degradation for all insert operations regardless of whether reuse...

Page 63: ...ery Optimization manual This document contains detailed information on access methods the query optimizer and optimizing query performance including using database monitor to monitor queries using QAQ...

Page 64: ...output adapters IOAs 573A and 576A These IOAs do not require an input output processor IOP to be installed in conjunction with the IOA Instead the IOA can be plugged into a PCI bus slot and the IOA is...

Page 65: ...cades and TCP IP over Ethernet has grown with it We currently have arrived where different factors influence the capabilities of the Ethernet Some of these influences can come from the cabling and ada...

Page 66: ...s Each System i server is configured as a single LPAR system with one dedicated CPU Each communication test was performed between the two systems and the 10 Gigabit IOAs were installed in the 266 MHz...

Page 67: ...cols The real world perspective offered by the Workload Estimator can be valuable for projecting overall system capacity 5 3 Communication and Storage observations With the continued progress in both...

Page 68: ...EA P P U T Partition to Partition Unicast Traffic or internal switch 16 Gbps per port group 8 4 Processor 7998 61X Blade 9 All measurements are performed with Full Duplex Ethernet 11972 3 8553 0 8 992...

Page 69: ...erver Performance There are also capacity planning examples in that chapter 5 5 TCP IP Secure Performance With the growth of communication over public network environments like the Internet securing t...

Page 70: ...size of the public key the type of encryption and the size of the symmetric key These results may be used to estimate a system s potential transaction rate at a given CPU utilization assuming a partic...

Page 71: ...ure and each variation of security policy y The table data reflects System i as a server not a client y VPN measurements used transport mode TDES AES128 or RC4 with 128 bit key symmetric cipher and MD...

Page 72: ...red mode This is workload dependent y Host Ethernet Adapters require 40 to 56 MB for memory per logical port to vary on y IBM Power 550 9409 M50 May show 2 to 5 percent increase over IBM Power 520 940...

Page 73: ...n s send and receive requests This is the amount of data that the application transfers with a single sockets API call Because sockets does not block up multiple application sends it is important to b...

Page 74: ...The CPU usage for high speed connections is similar to slower speed lines running the same type of work As the speed of a line increases from a traditional low speed to a high speed performance chara...

Page 75: ...ression increases the CPU time by up to 9 times RLE compression uses less CPU time than LZ9 compression MODD parameters ICF and CPI C have very similar performance for small data transfers ICF allows...

Page 76: ...Mbps Ethernet network the following average response times were observed on the system not including the time required to start a SNA session and allocate a conversation 5 23 min 5 40 min 5 16 min 5 1...

Page 77: ...e Communications d Socket Programming for the Sockets Programming guide Information about Ethernet cards can be found at the IBM Systems Hardware Information Center The link for this information cente...

Page 78: ...inks y capacity and caching characteristics of any proxy servers y the responsiveness of any other related remote servers e g payment gateways y congestion of network resources 3 System i Web Server a...

Page 79: ...and the connection is ended If the browser has multiple file requests for the same HTTP server it is possible to get the multiple requests with one connection This feature is known as persistent conn...

Page 80: ...l also chapter 23 The following tables provide a summary of the measured performance data for both static and dynamic Web server transactions These charts should be used in conjunction with the rest o...

Page 81: ...is listed here n a 34 730 Static Page FRCA 2 235 3 538 Static Page Local Cache 1 481 2 016 Static Page IFS Secure Non secure Relative Capacity Metrics Transaction Type Table 6 1 i5 OS V5R4 Web Serving...

Page 82: ...cities that what is listed here 0 436 0 475 CGI Named Activation 0 090 0 092 CGI New Activation Secure Non secure Relative Capacity Metrics Transaction Type Table 6 2 i5 OS V5R4 Web Serving Relative C...

Page 83: ...ytes 1K Bytes Transaction Type Relative Capacity Metrics Table 6 3 i5 OS V5R4 Web Serving Relative Capacity for Static varied sizes 1KB 10KB 100KB 1KB 10KB 100KB 0 5 10 15 20 25 30 35 Relative Capacit...

Page 84: ...and interacts closely with the HTTP Server powered by Apache FRCA greatly improves Web server performance for serving static content refer to Table 6 3 and Figure 6 3 For best performance FRCA should...

Page 85: ...e b The down side If persistent requests are used the Web server thread associated with that series of requests is tied up only if the Web Server directive AsyncIO is turned Off If there is a shortage...

Page 86: ...t likely reduce the overall number of transmissions and therefore increase the potential capacity of the CPU and the IOP The MTU on the interface should be set to the frame size LIND The MTU on the ro...

Page 87: ...he authorization checking that is performed The HTTP Server serves the pages in ASCII so make sure that the files have the correct format else the HTTP Server needs to convert the pages which will res...

Page 88: ...rsistent connections to the database server jobs This allowed the connection handle to be preserved after the transaction completed future incoming transactions re use the same connection handle The w...

Page 89: ...hin the same job that the PHP script is executing in If a specific userid and password are used database access occurs via a QSQSRVR job which is called server mode processing In all tests using ibm_d...

Page 90: ...your PHP application you ll find that there are two alternative connections db2_connect which establishes a new connection each time and db2_pconnect which uses persistent connections The main advanta...

Page 91: ...workload running under DB2_I5_TXN_READ_COMMITTED reduced the overall capacity by about 5 However a given application might never update the underlying data or run with other concurrent updaters and DB...

Page 92: ...l to size a new system to size an upgrade to an existing system or to size a consolidation of several systems The Estimator allows measurement input to best reflect your current workload and provides...

Page 93: ...same partition on a 2 core 2 2Ghz System i partition using Version 6 1 of WebSphere Application Server and IBM Technology for Java VM Although many of the improvements are applicable to 3 tier enviro...

Page 94: ...0 improvement in throughput However as the default is still to use inet sockets you will need to ensure that the class path specified in the JDBC provider is set to use the jt400native jar file not th...

Page 95: ...ropriate link www ibm com software webservers appserv was library Although some capacity planning information is included in these documents please use the IBM Systems Workload Estimator as the primar...

Page 96: ...3 Otherwise the implementation and workflow of the Trade application remains unchanged Trade 6 also supports the recent DB2 V8 2 and Oracle 10g databases The new design of Trade 6 enables performance...

Page 97: ...lls methods on Quote Account and Holdings Entity EJBs to execute the sell as a single transaction y The results of the transaction including the new current balance total sell price and other data are...

Page 98: ...running on small systems with relatively low memory demands this could offer a substantially smaller memory footprint Performance tests have shown approximately 40 smaller Java Heap sizes when using I...

Page 99: ...70 7758 system WebSphere 6 1 using IBM Technology for Java was measured on V5R4 on a 2 way LPAR 570 7758 system Notes Disclaimers WebSphere Application Server Trade Results IBM i 6 1 Performance Capab...

Page 100: ...ful in the Rochester lab for release to release comparison tests to determine if a degradation occurs between releases and what areas to target performance improvements Table 6 1 describes all of the...

Page 101: ...statment PingJDBCWrite PingJDBCRead tests fundamental servlet to JDBC access to a database performing a single row read using a prepared SQL statment PingJDBCRead PingHTTPSession3 large session objec...

Page 102: ...2 WebSphere Trade 3 Primitives PingHtml PingServlet PingServletWriter PingServlet2Servlet PingJSP PingServlet2JSP PingHTTPSession1 PingHTTPSession2 PingHTTPSession3 PingJDBCRead PingJDBCWrite PingServ...

Page 103: ...r for System i With regards to capacity Figure 6 5 shows the 600 CPW model accelerated to 3100 CPW increases capacity 5 5 times Additionally the 1200 CPW model accelerated to 3800 CPW increases capaci...

Page 104: ...nd idea to note is that the presence of L3 cache has little effect on the response time of a single user Of course there are benefits of L3 cache however the absence of L3 cache does not imply poorer...

Page 105: ...tion you need to ensure that you select an XA compliant JDBC provider For WebSphere on the System i platform you have two options depending on if you are running in a two tier environment application...

Page 106: ...the messaging engine and the EJB container you need to ensure that you increase the number of connections allocated to the connection pool To optimize for one phase commit transactions refer to the fo...

Page 107: ...is a special case of option 2 above and is the recommended approach to run GreenScreen applications For the first time the page is requested SFLPAG rows will be returned If the user performs a page d...

Page 108: ...s restoring a complete window of data when it was not required Therefore it is difficult to give a generalized performance comparison between the same application written to a 5250 device and that app...

Page 109: ...ry requirements for Webfacing V5 0 versus V4 0 This memory savings helps reduce the total memory required by the WebSphere Application Server which is referred to as the JVM Heap Size The amount of me...

Page 110: ...as a starting point for setting the cache size The default size if no size is specified would be 600 record data definitions To set the cache size to something other than the default size you need to...

Page 111: ...e url pattern CacheDumper url pattern servlet mapping This servlet can then be invoked with a URL like http server port webapp CacheDumper Then a Web page like that shown below will be displayed Notic...

Page 112: ...finition Loader As a companion to the Cache Content Viewer tool there is also a Record Definition Cache Loader tool which is also referred to as the Bean Loader This servlet can be used to pre load th...

Page 113: ...play name BeanLoader display name servlet class com ibm etools iseries webfacing diags BeanLoader servlet class init param param name FileName param name param value cachedbeannames lst param value in...

Page 114: ...ill realize significantly improved response times see chart below If the end users are attached via a 512K connection evaluate whether the realized response time improvements offset the increased CPU...

Page 115: ...running WebSphere Express V5 0 on an xSeries Server With the IBM WebFacing Tool V5 0 compression is turned on by default This should be turned off if compression is configured in Apache or if the LAN...

Page 116: ...r expains how to help optimize WebFaced Applications on IBM System i servers Requests for the paper require user registration there are no charges http www 919 ibm com servers eserver iseries develope...

Page 117: ...tion and requires very little technical skill or customization Unless you do explicit customization for an application the default HATS rules will be used to transform the application interface dynami...

Page 118: ...are unchanged Moderate An average of 30 of the screens have been customized Advanced All screens have been customized IBM i 6 1 Performance Capabilities Reference January April October 2008 Copyright...

Page 119: ...Administration Tool Setup new or manage existing LDAP directories for business application data Please see the following for more information http publib boulder ibm com iseries v5r3 ic2924 info rzahy...

Page 120: ...lications You should use IBM Systems Workload Estimator http www 912 ibm com wle EstimatorServlet to determine the system requirements for additional web applications IBM i 6 1 Performance Capabilitie...

Page 121: ...press 6 0 Custom applications PDM and WCM The Estimator is available at http www ibm com systems support tools estimator Extensive descriptions and help text for the Portal workloads are available in...

Page 122: ...er when creating a PaymentServerClient When this parameter is specified the overhead of sending the entire IBMPaymentServer dtd file with each response is avoided The dtdPath parameter should contain...

Page 123: ...oth requests and response flows The intention of this workload is to drive the server with a heavy load and to quantify the performance of Connect for iSeries Measurement Results One of the main focal...

Page 124: ...oidable e g trouble shooting problems 4 Management Central Logging This feature will log transaction data to be queried and viewed with Management Central Performance is impacted with this feature on...

Page 125: ...de to improve performance Today s Java applications however typically rely on a variety of system services such as JDBC encryption and security provided by i5 OS the Java Virtual Machine VM and WebSph...

Page 126: ...ive applications which include most Java applications Since their introduction in V5R3 System i5 servers employing POWER5 processors models 520 550 570 and 595 have a proven record of providing excell...

Page 127: ...erhead than they did with the Classic VM offering a performance improvement for some applications The performance impact for JNI method calls to ILE will depend on the frequency of JNI calls and the c...

Page 128: ...ap size often provides reasonable performance Keep in mind that the maximum heap size for the 32 bit VM is 3328 MB Attempting to use a larger value for the initial or maximum heap size will result in...

Page 129: ...R2 time frame for most applications and has continued to improve at a faster rate In V6R1 support for DE was eliminated so the JIT will be used for all Java applications Despite the improvements to JI...

Page 130: ...eshold value GCHINL or Xms often referred to as the initial heap size is the most important value to tune The default size for V5R3 and later is 16 MB Using larger values for this parameter will allow...

Page 131: ...ion If these caches are allowed to grow too large they may consume more memory than is physically available on the system Using smaller cache sizes may improve the performance of your application Byte...

Page 132: ...the source for a cached JVAPGM is changed the currently cached version will simply age out since its class will no longer be a byte for byte match and a new JVAPGM will be silently created and cached...

Page 133: ...e 32 bit VM has a maximum heap size of 3328 MB although most applications will have a practical limit of 3 GB or less Applications which require a larger heap should use 64 bit IBM Technology for Java...

Page 134: ...SQL to access the database while traditional iSeries applications tend to use less expensive data access methods like Record Level Access Therefore Java applications will continue to require more pro...

Page 135: ...y Java tends to require more main storage memory than other languages especially when using the Classic VM The 64 bit VMs both Classic and IBM Technology for Java will also tend to require more memor...

Page 136: ...gh cost of starting a new job Other factors which make Java startup slow include class loading bytecode verification and JIT compilation As a result it is far better to use long running Java programs...

Page 137: ...erformance Tips Due to advances in JIT technology many common code optimizations which were critical for performance a few years ago are no longer as necessary in modern JVMs Even today these techniqu...

Page 138: ...n Some common synchronization patterns are easily illustrated with Java s built in String classes Most other Java classes including user written classes will follow one of these patterns Each has diff...

Page 139: ...class test1 int myarray 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 10 11 3 4 5 6 7 8 9 10 11 12 4 5 6 7 8 9 10 11 12 13 5 6 7 8 9 10 11 12 13 14 class test2 static final int myarray2 1 2 3 4 5 6 7 8 9 10 2...

Page 140: ...than dynamically building query strings with literal data This will enable reuse of the PreparedStatement with new parameter values Avoid placing the prepareStatement inside of loops e g just before...

Page 141: ...n on performance tuning and analysis when using IBM Technology for Java Most of the document applies to all platforms using IBM s Java VM in addition one chapter is written specifically for i5 OS info...

Page 142: ...ptographic accelerator function in a single PCI X card y 5722 AC3 Cryptographic Access Provider withdrawn This product is no longer required to enable data encryption y Cryptographic Services API func...

Page 143: ...WER5 hardware system which provides Simultaneous Multi Threading The tools used to obtain this data are in some cases only single threaded single instruction stream applications which don t take advan...

Page 144: ...575 27 172 28 005 446 27 349 1024 256 10 AES 34 350 709 524 110 892 831 1 692 65536 128 10 AES 35 754 190 34 916 31 137 523 30 408 1024 128 10 AES 22 614 607 345 72 782 397 1 111 65536 256 1 AES 23 31...

Page 145: ...Cryptographic API Performance This section provides information on the hardware based cryptographic offload solution IBM 4764 PCI X Cryptography Coprocessor Feature Code 4806 This solution will improv...

Page 146: ...sion thus freeing the server for other processing For cryptographic accelerator applications the 4764 Cryptographic Coprocessor is a replacement for the 2058 Cryptographic Accelerator feature code 480...

Page 147: ...resource of server only authentication RSA authentication requests can be offloaded to an IBM 4764 Cryptographic Coprocessor y With the use of Collection Services you can count the SSL TLS handshake...

Page 148: ...ireless Security and PKI services are intended to help customers build trusted electronic relationships with employees customers and business partners These general IBM security services are described...

Page 149: ...re used when accessing using files in the root QOpenSys and user defined file systems UDFS See the iSeries NetServer articles in the iSeries Information Center for more information iSeries NetServer P...

Page 150: ...r File Serving 150 environment can be obtained by sending an email to llhirsch us ibm com Throughput 0 000 50 000 100 000 150 000 200 000 250 000 1 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 Clients M...

Page 151: ...at when customers upgrade to V5R4 they can expect to see an improvement in throughput and response time when using iSeries NetServer...

Page 152: ...mance Tips for both SQL programming and JDBC tuning techniques to improve performance are included here y In general when accessing a database it takes less time to retrieve smaller amounts of data Th...

Page 153: ...t block size for the application block size specifies the amount of data to retrieve from the server and cache on the client For the Toolbox driver block size specifies the transfer size in kilobytes...

Page 154: ...The ODBC performance parameters discussed in detail are y Prefetch y ExtendedDynamic y RecordBlocking y BlockSizeKB y LazyClose y LibraryView Prefetch The Prefetch option is a performance enhancement...

Page 155: ...brary name or all but the last 3 characters of the package name can be changed RecordBlocking The RecordBlocking switch allows users to control the conditions under which the driver will retrieve mult...

Page 156: ...B2 Universal Database for System i SQL Call Level Interface ODBC is found under the System i Information Center under Printable PDFs and Manuals y The System i Information Center Http publib boulder i...

Page 157: ...n CPW ratings are used to compare Domino performance across hardware models With the introduction of the POWER6 models it is less necessary to provide separate MCU and CPW ratings Appendix C provides...

Page 158: ...5162 April 2002 y iNotes Web Access on the IBM eServer iSeries Server SG24 6553 February 2002 11 1 Domino Workload Descriptions The Mail and Calendaring Users workload and the Domino Web Access mail...

Page 159: ...erformance with Domino 7 and Domino 8 have been published in a 2 part series of articles The following links refer to these articles y IBM Lotus Notes V8 workloads Taking performance to a new level Se...

Page 160: ...the Domino Domain Monitoring facility which provides a means to monitor and determine the health of an entire domain at a single location and quickly resolve problems Some of the System i guidelines...

Page 161: ...utilization as the Domino 5 0 11 test In Domino 6 new memory caching techniques are being used for the Notes client to improve response time and may require additional memory Both comparisons shown i...

Page 162: ...tant to use appropriate rating metrics see Appendix C or a sizing tool such as the IBM Systems Workload Estimator The POWER4 and POWER5 processors have been designed to run at significantly higher MHz...

Page 163: ...Time 2 X 255Mhz 170 2409 450 Mhz 270 2423 540Mhz 270 2452 Megahertz and Response Time Relationship Figure 11 3 Response Time and Megahertz relationship 11 6 Collaboration Edition and Domino Edition of...

Page 164: ...ini server document settings y Mail box setting Setting the number of mail boxes to more than 1 may reduce contention and reduce the CPU utilization Setting this to 2 3 or 4 should be sufficient for...

Page 165: ...before the server routes mail For our large server runs we set this to 20 Overall this decreased the cpu utilization by approximately 10 by allowing the router to deliver more messages when it makes...

Page 166: ...Don t overwrite free space Select Don t overwrite free space in the advanced properties section of Database properties if system security can be maintained through other means such as the default of...

Page 167: ...ncrease over time you should increase the number of active MAIL BOX files and continue to monitor the statistics 11 9 Domino Subsystem Tuning The objects needed for making subsystem changes to Domino...

Page 168: ...l Users Outbound Inbound Hits Lookups Hits Lookups Rcv Sec 07 58 1515420 18001 39 50 18001 23 0 0 264 0 0 0 07 59 1550099 18001 37 25 18001 23 0 0 0 0 0 0 08 00 1536840 18001 31 95 18001 24 0 0 0 0 0...

Page 169: ...mprove the performance of applications that read and write data in logical I O sizes smaller than 16k Conversely it may slightly degrade the performance of applications that read and write data with a...

Page 170: ...e in Figure 11 4 above that as the base pool decreased in size moving to the right on the chart the page faulting increased for all settings of main storage option Using the DYNAMIC and NORMAL attribu...

Page 171: ...lower response times As is the case with many performance settings your mileage will vary for the use of DYNAMIC and MINIMIZE Depending on the relationship between the CPU disk and memory resources o...

Page 172: ...enhancement to defining and handling Domino Clustering activity Be sure to note the redpaper Sizing Large Scale Domino Workloads on iSeries which is available at http www redbooks ibm com redpapers pd...

Page 173: ...on describes the new Accelerator offerings which provide improved performance characteristics for the i520 models In particular note Figure 6 6 to observe potential response time differences for a 500...

Page 174: ...WebSphere MQ V5 3 CSD6 WebSphere MQ V5 3 CSD6 introduces substantial performance improvements at queue manager start and during journal maintenance Queue Manager Start Following an Abnormal End WebSp...

Page 175: ...ions and tips techniques to consider for WebSphere MQ for iSeries More details are available in the previously mentioned support pacs y MQ V5 3 shows an improvement in peak throughput over MQ V5 2 for...

Page 176: ...rnal receiver used by MQ Series on a user ASP in order to ensure best overall performance MQ defaults to creating the receiver on the system ASP In addition the disk arms and IOPs in the user ASP shou...

Page 177: ...oduct literature to make sure there is support for their desired combination y Linux and other Open Source tools are almost all constructed from a single Open Source compiler known as gcc Therefore th...

Page 178: ...s OS 400 or now Linux to load two jobs tasks threads Linux processes etc into the CPU The CPU itself will then alternate execution between the two tasks if one task waits on a hardware resource such a...

Page 179: ...It could then talk to a second providing the inner fire wall and then the second Linux partition could use virtual LAN to talk to OS 400 to obtain OS 400 services like data base This could be done as...

Page 180: ...on t know about any other partition much less a Linux one Tasks representing Licensed Internal Code may show more activity but attributing this to Linux is not straightforward y If the OS 400 partitio...

Page 181: ...e community and is independent of the CPU architecture Generally for integer based applications general commercial y OS 400 PASE xlc gives the fastest integer performance y ILE C C is usually next y L...

Page 182: ...Performance on the same hardware with other IBM JVMs will be roughly equal except that newer JVMs will often arrive a bit later on Linux The IBM JVM is almost always much faster than the typical open...

Page 183: ...compiler option O3 The current gcc compiler is used for a great fraction of Linux applications and the Linux kernel At this writing the current gcc version is ordinarily 2 95 but this will change over...

Page 184: ...of Virtual LAN as is always the case varies based on items like average IP packet size and so on However in typical use we ve observed speeds of 200 to 400 megabits per second on 600 MHz processors Th...

Page 185: ...for block writes 3 4 MB sec for block reads y Virtual Disk OS 400 1 600 MHz CPU 112 MB sec for block writes 97 MB sec for block reads As noted this is not an absolute comparison Linux has some file s...

Page 186: ...imating system requirements Consult the latest version of Workload Estimator including its on line help text when specifying a system containing relevant Linux partitions The workload estimator can be...

Page 187: ...0 the overall limit of 32 partitions on the one hand and the larger number of processors on the other begins to make shared processors less interesting as a strategy y Use IBM s JVM not the default Ja...

Page 188: ...performance functional and cost leverage to support such uses Remember that some models do not support Shared Processors y Use spread_lpevents n when using multiple Virtual Processors from a Shared Pr...

Page 189: ...ox on Native LAN through the partition with the Native LAN and then moving to a second partition via Virtual LAN then to another...

Page 190: ...ations per second but an activity counter in the workload itself No LPAR s were used all system resources were dedicated to the testing The workload is batch and I O intensive small block reads and wr...

Page 191: ...571E 574F 320 3 18 RAID5 4 18 RAID6 90 MB 5737 5776 0648 571B 320 NA NA 5736 5775 0647 571A 300 3 18 RAID5 4 18 RAID6 175 MB 5679 57B8 Aux cache card 57B7 NA 3 8 40 MB 5727 5728 9510 573D Write cache...

Page 192: ...larger capacity drives can appear to be faster than lower capacity drives in the same environment running the same workload in the same size database That perceived improvement can disappear or even...

Page 193: ...ting to the same 15 30 and 45 DASD units at the same time So the number of I O DASD operations are double when saving to SAVF This was not meant to show what can be expected from a backup environment...

Page 194: ...5K 35GB DASD The system CPU used with and without an IOP was basically the same for the 571B with our workload tests...

Page 195: ...e storage The charts try to point out that there may be performance considerations even when the space isn t needed 14 1 4 1 14 1 4 2...

Page 196: ...D units in a 5094 enclosure and a maximum of 12 DASD units in a 5095 enclosure The 2757 and 2780 can support up to 18 DASD units with the same performance characteristics as they display with the 10 D...

Page 197: ...n with all workloads Also note the 571E 574F requires the auxiliary cache card to turn on RAID and the 571F 575B has the function included in its double wide card packaging for better system protectio...

Page 198: ...s environments so we have chosen to display results from only the 571E 574F 14 1 6 1 14 1 6 2...

Page 199: ...ormance boundaries of RAID6 on the 571E 574F is about the same as the performance boundaries of our 2780 574F configured using RAID5 so better protection could be achieved at current performance level...

Page 200: ...RAID 5 compared to Mirroring 571E 574F 15 35 GB DASD IOPLess RAID 5 571E 14 70 G...

Page 201: ...571F 575B Scaling 571F 575B 15 DASD IOPLess RAID 5 571F 575B 1 IOA 18 DASD 3 cages each off 1 571F port IOPLess RAID 5 571F 575B 24 DASD IOPL...

Page 202: ...nts The next limit will be the buses in a single tower We are using a large file concurrent RSTLIB operations from multiple virtual tape drives located on the DASD in the target HSL loop to try to hel...

Page 203: ...wer in an HSL Loop 1_Tower 81_DASD 3_571E 1_571F 1_Tower 117_DASD 3_571E 2_571F 1_Tower 153_DASD 3_571E 3_571F Large Block READs on Multiple 5094 Towers in a Single HSL Loop 1_Tower 3_571E_ _3_571F 15...

Page 204: ...you must ensure the system is configured optimally to achieve the increased performance documented above This is because some card slots or backplanes may only support the PCI protocol versus the PCI...

Page 205: ...0 1000 1200 9406 MMA 4 way 6 433B 70 GB DASD Mirrored No Cache 9406 570 4 way 6 4327 70 GB DASD Mirrored No Cache 9406 570 4 way 6 4327 70 GB DASD Mirrored With Cache Workload Throughput System Respon...

Page 206: ...r 14 DASD Performance 206 0 02 0 05 0 08 0 11 0 14 5000 6000 7000 8000 9000 10000 11000 12000 13000 9406 570 4 way 24 4328 140 GB RAID5 24 active 9406 570 4 way 24 4328 140 GB RAID5 22 active 2 Hot Sp...

Page 207: ...on 8 IOAs I moved the 12X loop to the other 12X GX adapter in the CEC and ran the test again and saw no difference in the testing between the two loops The 12X loop is rated for more throughput than t...

Page 208: ...n Encrypted ASP vs Encrypted ASP 0 0 05 0 1 0 15 0 2 2000 4000 6000 8000 10000 12000 14000 DASD IO Workload Throughput System Response Time sec 9406 MMA 4 Way 571F with 24 DASD Non Encrypted ASP 9406...

Page 209: ...Non Encrypted ASP vs Encrypted ASP Workload Throughput CPU 9406 MMA 4 Way 571F with 24 DASD Non...

Page 210: ...ich can be RAID5 6 or protected with mirroring...

Page 211: ...POWER6 520 57B8 57B7 6 RAID5 DASD in CEC 12 RAID...

Page 212: ...DASD IO Workload Throughput System Response Time ms 940...

Page 214: ...ge options you can get more information through your IBM representative and the white papers that are available at the following location https www 304 ibm com systems support...

Page 215: ...o so and do not think most customers will need to either The tuning available from IVM proved sufficient and should be preferred for its ease of use when it is workable Customers should strongly consi...

Page 216: ...y have a range of sizes like 522 524 and others Confusingly for us the industry has gone away from strictly 512 byte sectors for some devices They too have headers that consume extra bytes However as...

Page 217: ...than to take on the performance problems of doing this under IBM i operating system RAID recovery procedures will have to be pursued outside of IBM i operating system in any event so the protection m...

Page 218: ...native IBM i operating system internal disks It should be exceptional to use VIOS for internal disks 5 Prefer RAID 1 or RAID 10 to RAID 5 We are now beginning to generally recommend RAID 1 mirroring o...

Page 219: ...the various internal and external buses etc To the extent practical the user should strive for even numbers of items 10 In general do not share the same physical disk with multiple partitions Only If...

Page 220: ...Active Array Member 35 1GB pdisk4 07 08 00 6 0 Active Array Member 35 1GB pdisk5 07 08 01 0 0 Active Array Member 35 1GB pdisk6 07 08 01 1 0 Active Array Member 35 1GB Here it turns out that these par...

Page 221: ...r In fact it might be an experiment worth running even if you have shared processors configured generally 5 VIOS and memory VIOS arranges for the DMA to go directly to the IBM i operating system memor...

Page 222: ...gement challenges Reference The following is a link to an LPAR white paper http www ibm com systems i solutions perfmgmt pdf lparperf pdf For most of our testing we only utilized one IBM i operating s...

Page 223: ...e 223 VIOS IBM i operating system JS22 Express DS4800 90 DDMs Commercial Performance Workload 0 001 0 01 0 1 1 10 0 10000 20000 30000 40000 50000 60000 Transactions Minute Response Time sec IBM i oper...

Page 224: ...n in the purple line optimized our environment best As I increased from 6 virtual processors I started losing performance until I had increased to the 28 virtual processors available to me shown in th...

Page 225: ...10 20 30 40 50 60 70 80 90 100 20000 25000 30000 35000 40000 45000 50000 55000 60000 Transactions Minute i5 OS CPU 3 Dedicated Processors IBM i operating system Partition 1 Processor VIOS 3 Shared Pro...

Page 226: ...periments Commercial Performance Workload 0 0001 0 1001 0 2001 0 10000 20000 30000 40000 50000 60000 Transactions Minute Response Time Seconds 1 of 2 i5 OS 1 7 Processor Partitions on a 6 Processor VI...

Page 227: ...14000 16000 18000 20000 Transactions Minute Response Time Seconds BladeCenter S JS12 12 BladeCenter SAS DASD IBM i operating system mirroring 16GB Memory 1 8 Processors i5 2 VIOS BladeCenter H JS22 12...

Page 228: ...oices here This is the most important consideration as it is difficult to change later Consult also any available Best Practices manuals for a given SAN attached storage server 2 The VIOS partition sh...

Page 229: ...end user might be looking for ease of use and choose to create one array with multiple LUNs where another end user might consider performance to be a more critical issue and select to create multiple...

Page 230: ...Arrays in DB ASP DS3400 JS22 4 WAY 9 LUNs on 9 4DDM RAID5 Arrays in DB ASP DS4800 JS22 4 WAY 9 LUNs on 9 4DDM RAID10 Arrays in DB ASP Blade Center H with a JS22 4 Way Commercial Performance Workload...

Page 231: ...supported Linux virtual SCSI The performance considerations that we detail in this section must be balanced against the savings made on the overall system cost For example the smallest physical disk...

Page 232: ...ire ASP or a portion of an ASP If the server partition provides the client with a partition of a drive then the server decides the area of the drive to serve to the client when the network storage spa...

Page 233: ...n be measured from a single thread or from a set of threads executing concurrently Though many applications are more sensitive to latency than bandwidth bandwidth is crucial for many typical operation...

Page 234: ...ent and virtual SCSI server On the Hardware Management Console the terms virtual SCSI server adapter and virtual SCSI client adapter are used They refer to the same thing When describing the client se...

Page 235: ...ive depending on the read cache performance 14 6 2 2 Virtual SCSI Bandwidth Multiple Network Storage Spaces Figure 2 shows a comparison of measured bandwidth while scaling network storage spaces with...

Page 236: ...storage space should be attached to its own network storage description NWSD Read Scaling 0 20 40 60 80 100 120 Small Transactions 4k 16k Medium Transactions 32k 64k Large Transactions 12...

Page 237: ...100 150 200 250 15 Disk 30 Disk 45 Disk Disk 1 NWSD 2 NWSD 4 NWSD 8 NWSD 16 NWSD 24 NWSD Read Performance Large Transactions 0 200 400 600 15 Disk 30 Disk 45 Disk Disk 1 NWSD 2 NWSD 4 NWSD 8 NWSD 16 N...

Page 238: ...dicated Processors One sizing method is to size the Virtual SCSI server to the maximum I O rate of the attached storage subsystem The sizing could be biased to small I Os or large I Os Sizing to maxim...

Page 239: ...ted from the table above 10 000 34 34 34 of a total CPU 1 000 000 The total CPU required for a workload which performs 10 000 16k read transactions per second would be 34 of a 2 2Ghz POWER5 processor...
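
The sizing arithmetic in this excerpt can be sketched as a small calculation. The snippet is truncated, so the units here are an assumption: the per-transaction figure (34) is treated as CPU microseconds per 16k read, which is inferred only from the divisor of 1 000 000. The function name is hypothetical.

```python
def cpu_fraction(transactions_per_second, cpu_microseconds_per_transaction):
    """Fraction of one processor consumed by the I/O load.

    Assumption: the table value is CPU time in microseconds per
    transaction, so dividing by 1,000,000 converts to processor-seconds
    per second (a fraction of one processor).
    """
    return transactions_per_second * cpu_microseconds_per_transaction / 1_000_000


# Figures quoted in the excerpt: 10 000 16k read transactions per second at 34 each.
required = cpu_fraction(10_000, 34)
print(f"{required:.0%} of one processor")
```

With the quoted figures this reproduces the 34 percent result stated in the text; other transaction sizes would need their own values from the table, which is not reproduced here.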

Page 240: ...partition be configured as uncapped so it can take advantage of unused capacity of other partitions it is possible to get more processor time to service I O Because I O latency with Virtual SCSI vari...

Page 241: ...cs using F10 The number of I O requests per second should lower and your throughput to the IBM i operating system Virtual SCSI server should increase Continue adding memory to the IBM i operating syst...

Page 242: ...with larger loads it may be advantageous to keep the I O server as a dedicated processor Extensive information can be found at the System i Information Center web site at http publib boulder ibm com i...

Page 243: ...n this document the rates are used to help determine possible performance A study of some customer data showed that compaction on their database file data occurred at a ratio of approximately 2 8 to 1...

Page 244: ...gnificantly lower CPU utilization and the backup device will perform more efficiently Data Compression DTACPR Data compression is the ability to compress strings of identical characters and mark the b...

Page 245: ...agement functions where they keep the IFS space cleaned up and compressed And the fact that the objects tend to be smaller by nature or are mail documents HTML files or graphic objects that don t comp...

Page 246: ...here The different workloads have different overheads different compaction rates and the backup devices use different buffer sizes and different compaction algorithms The attempt here is to group thes...

Page 247: ...FromWorkLoad Compaction Factor 5 0 0 95 4 75 2 0 9 5 MB S 3600 34200 MB HR 34 GB HR 15 7 Ultra High Performing Backup Devices High speed backup devices are designed to perform best on large files The...
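
The backup-rate estimate in this excerpt appears to chain three factors and a unit conversion. The sketch below is a reading of the flattened figures (5.0, 0.95, 2.0, 9.5 MB/S, 3600, 34200 MB/HR, 34 GB/HR); the parameter names are hypothetical, and treating 1 GB as 1000 MB is an assumption chosen because it matches the quoted result.

```python
def estimated_backup_rate_gb_per_hour(device_rate_mb_s, workload_factor, compaction_factor):
    """Estimate a backup rate in GB/HR from a device rate in MB/S.

    Assumptions: the three inputs are simply multiplied together, and
    1 GB is taken as 1000 MB.
    """
    mb_per_second = device_rate_mb_s * workload_factor * compaction_factor
    mb_per_hour = mb_per_second * 3600  # seconds per hour
    return mb_per_hour / 1000


# Figures quoted in the excerpt: 5.0 MB/S, workload factor 0.95, compaction factor 2.0.
rate = estimated_backup_rate_gb_per_hour(5.0, 0.95, 2.0)
print(f"about {rate:.0f} GB/HR")
```

The intermediate values (4.75, 9.5 MB/S, 34200 MB/HR) match the numbers visible in the excerpt, which is the only basis for this interpretation.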

Page 248: ...cts to be processed at the same time from different jobs making better use of the backup devices and the system For systems with a large quantity of data and a few very large database files whether in...

Page 249: ...tion would tell us that we can only flow so much data across a single HSL The total number of 3580 002 tape drives we believe we could put on a link was something a little greater than 2 but the 3rd t...

Page 250: ...0 GB HR 340 GB HR R 5 21 TB HR 5 14 TB HR 4 90 TB HR 4 63 TB HR 4 15 TB HR 2 88 TB HR 1 45 TB HR 1 09 TB HR 730 GB HR 365 GB HR S 320 GB DB file with 80 4 GB members 16 15 14 13 12 8 4 3 2 1 3580 002...

Page 251: ...caling of tape drives on the system along with trying to locate any saturation points that might help our customers identify limitations in their own environment 3 65 TB HR 3 55 TB HR 3 35 TB HR 3 14...

Page 252: ...HR 69 GB HR R 1010 GB HR 995 GB HR 965 GB HR 932 GB HR 858 GB HR 782 GB HR 699 GB HR 627 GB HR 504 GB HR 399 GB HR 272 GB HR 140 GB HR S 12 GB total Library size workload was used for modeling this a...

Page 253: ...GB HR Save Restore Operation 1 Processor to 2 Processors Backup Operations User Mix Work...

Page 254: ...guration in order to determine if it is possible to use multiple high speed devices on the system and still get the most out of these devices No matter what you determine is possible we advocate sprea...

Page 255: ...g helps to show that even if your workload is large file you may not gain anything in your back up window even using the virtual tape drives If your tape drive uses smaller block sizes your virtual ta...

Page 256: ...me performance impact Hours to Save 1 TB of Da...

Page 257: ...tore Performance 257 0 500 1000 1500 2000 2500 3000 3500 GB HR Save Restore Operation Parallel Virtual Tape for Large File 570 16 way 128GB Memory 800 DASD units for Virtual Tape Drives 1 Virtual Tape...

Page 258: ...Performance 258 0 500 1000 1500 2000 2500 3000 3500 GB HR Save Restore Operation Concurrent Virtual Tape for Large File 570 16 way 128GB Memory 800 DASD units for Virtual Tape Drives 1 Virtual Tape D...

Page 259: ...om 6 DASD and scaled up to 1 5 TB HR on the 108 DASD The bottleneck will be limited to where you are writing and how many DASD are available to the write operation...

Page 260: ...guide Large File Save 6 DASD 12 DASD 1...

Page 261: ...DASD 42 DASD 48 DASD 54 DASD 60 DASD 66 DASD 72 DASD 78 DASD 84 DASD 90 DASD GB HR RAID5 SAVE RAID6 SAVE MIRRORING SAVE User Mix Restores 0 20 40 60 80 100 120...

Page 262: ...fc 5704 card can be added according to the locations recommended above if needed v Spread tape fibre cards across as many HSL s as possible with maximums as follows y On Loops running at 1 GByte e g all...

Page 263: ...icant impact on save times but only a minor impact to restore times...

Page 264: ...yption 9406 MMA 4way NON Encrypted ASP RSTLIBBRM With Software Encryption 9406 570 4way NON Encrypted ASP RSTLIBBRM NO Software Encryption 9406 570 4way NON Encrypted ASP RSTLIBBRM With Software...

Page 265: ...0 60 60 50 16 14 47 30 13 13 12 R 70 65 65 65 65 65 65 35 40 27 25 23 S 1 Directory Many Objects 1530 1340 830 570 510 R 1700 1420 890 580 525 S Large File 320GB 1500 1340 830 560 500 390 330 175 68 R...

Page 266: ...5 R 17 22 S Source File 1GB iV5R4 iV5R4M0 Release Measurements were done SLR60 from table 15 18 1 6258 4MM tape Drive Workload S Save R Restore Table 15 19 2 iV5R4M0 Measurements on an 5XX 1 way syste...

Page 267: ...bjects 65 45 60 60 50 R 80 90 80 80 65 S 1 Directory Many Objects 1240 760 785 550 510 R 1420 650 635 525 525 S Large File 320GB 1230 760 785 550 500 R 1380 650 585 510 500 S Large File 64GB 195 182 1...

Page 268: ...6 9 0 2 6 7 5 2 0 6 0 1 8 S User Mix 3GB 4 5 4 5 21 0 9 0 21 0 9 8 21 0 9 2 R 5 3 6 14 0 3 0 12 0 2 2 9 0 1 8 S Source File 1GB V5R3 V5R3 V5R3 V5R3 V5R3 V5R3 V5R3 V5R3 Release Measurements were done 3...

Page 269: ...24 way system doesn t affect the software compression scenario 65 39 Restore 3 1 6 6 Save iV5R2 Using API DTACPR HIGH 31 23 Restore 2 7 1 27 26 Save iV5R2 Using API DTACPR MED 57 37 Restore 1 5 1 108...

Page 270: ...13 4 3 0 S Source File 1GB iV5R4M5 iV5R4M5 Release Measurements were done SAS 6331 DTACPR YES 5X Media SAS 6331 DTACPR NO 5X Media Workload S Save R Restore Table 15 23 1 iV5R4M5 Measurements on an 9...

Page 271: ...6R1M0 V6R1M0 V6R1M0 V6R1M0 IBM i Release Two High Speed Tape Drives on a Single 576B IOA using both ports concurrently Virtual Tape 120 DASD in ASP2 Virtual Tape 60 DASD in ASP2 3592E06 Fiber 576B 2 P...

Page 272: ...ape Drive 4 3592E 5 VXA 320 6 High Ultrium 2 7 IFS Restore Improvements for the Directory Workloads 8 5761 4Gb Fiber Adapter TIPS 1 Backup devices are affected by the media type For most backup device...

Page 273: ...ed causing recovery processing to be done during the IPL The amount of processing is determined by the system activities at the time the system terminates y For an abnormal IPL the benchmark consists...

Page 274: ...database load causes a long directory recovery 9406 MMA 7061 16 way 512 GB Mainstore DASD 1000 70GB 15K rpm arms 3 ASP s defined 196 Nonconfigured DASD 120 RAID5 DASD in ASP1 612 RAID5 DASD in ASP2 7...

Page 275: ...hase is composed of C1xx xxxx C3xx xxxx and C7xx xxxx SLIC is composed of C200 xxxx and C600 xxxx OS 400 is composed of C900 xxxx SRCs to the System i server console sign on 16 5 9406 MMA IPL Performa...

Page 276: ...ing storage allocations The duration of this recovery step will depend on the type of recovery performed and on the size of the directories In most cases a subset directory recovery SRC C6004250 will...

Page 277: ...irectory recovery 595 7499 32 way 384 GB Mainstore DASD 1125 35GB arms 15K rpm arms RAID protected 3 ASP s defined majority of the DASD in ASP2 Mainstore dump was to ASP 2 y This system was tested wit...

Page 278: ...xxx and C7xx xxxx on the 5xx systems SLIC is composed of C200 xxxx and C600 xxxx OS 400 is composed of C900 xxxx SRCs to the IBM i operating system console sign on 16 9 5XX IPL Performance Measurement...

Page 279: ...e you have the time to wait for the table to compress y Reduce the number of device descriptions by removing any obsolete device descriptions y Control the level of hardware diagnostics by setting the...

Page 280: ...t network connections and provides i5 OS system management and disk consolidation for guest System x and BladeCenter platforms The iSCSI HBA solution provides an extensive scalability range from conne...

Page 281: ...i native CPU memory and storage subsystems The rest of this chapter describes some of the performance and memory resource impacts 17 2 1 IXS IXA Disk I O Operations The integrated xSeries servers use...

Page 282: ...eme test loads it s unlikely the IOP will saturate due to disk activity When multiple IXS IXA servers are attached under the same System i partition the partition software imposes a cap on the aggrega...

Page 283: ...on basic disks If a disk has been converted to a dynamic disk the DISKPART command creates a new partition and configures a spanned set across the partitions With iSCSI the second partition may exper...

Page 284: ...with the IXS IXA and iSCSI attached servers y The Point to point virtual Ethernet is primarily used for the controlling partition to communicate with the integrated server This network is called poin...

Page 285: ...nts may alter this according to the active load characteristics But there are base memory requirements needed to support the hardware and set of adapters used by the i5 OS partition You can refer to t...

Page 286: ...s 130 800 1000 104 CPWs out of the host processor CPW capacity While it is always better to project the performance of an application from measurements based on that same application it is not always...
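
The CPW figure in this excerpt looks like a simple rate calculation. Because the start of the sentence is truncated, the interpretation below is an assumption: 130 is read as the host CPW cost per 1000 disk operations per second and 800 as the server's disk operations per second. The function and parameter names are hypothetical.

```python
def host_cpw_for_disk_ops(cpw_per_1000_ops, disk_ops_per_second):
    """Host CPW consumed by an integrated server's disk activity.

    Assumption: the cost is quoted per 1000 disk operations per second,
    so the operation rate is scaled by 1/1000 before multiplying.
    """
    return cpw_per_1000_ops * disk_ops_per_second / 1000


# Figures visible in the excerpt: 130, 800 and 1000, giving 104 CPWs.
print(host_cpw_for_disk_ops(130, 800), "CPWs out of the host processor CPW capacity")
```

The result would then be compared against the CPW rating of the host processor to judge how much capacity remains for other work.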

Page 287: ...by the IXS or IXA is 32k Thus any Windows disk operations greater than 32k will result in the Windows operating system splitting the operation into 2 or more sequential operations y The IXS IXA cost...

Page 288: ...tion provides better overall performance in Write and Read scenarios While the sector based architecture of the current IXA provides slightly better performance in small block Write operations that d...

Page 289: ...n Port Based VE refers to a port based connection between two guest servers in the same partition VLAN based VE refers to a Virtual LAN based connection between two guest servers in the same partition...

Page 290: ...i jumbo iscsi standard IXS IXA external NICs VE Internal port based 0 100 200 300 400 500 600 700 800 900 1000 16 64 256 1024 4096 16384 Transaction Size bytes Throughput MBits sec iscsi jumbo iscsi s...

Page 291: ...17 7 File Level Backup Performance The Integrated Server support allows you to save integrated server data files directories shares and the Windows registry to tape optical or disk SAVF in conjunctio...

Page 292: ...iSCSI HBA addition in V5R4 increases Integrated server configuration flexibility and performance scalability As part of the preparation for integrated server installations care should be taken to est...

Page 293: ...grated xSeries solution for Linux. Information on an IXS or attached xSeries server. Microsoft Hardware Compatibility Test URL: See http://www.microsoft.com/whdc/hcl/search.mspx; search on IBM for product t...

Page 294: ...tion on LPAR performance. It is located at the following website: http://www-1.ibm.com/servers/eserver/iseries/perfmgmt/pdf/lparperf.pdf. V5R2 Additions: In V5R2, some significant items may affect one's LPAR...

Page 295: ...s may be particularly useful for Linux see Linux chapter 18 2 Considerations This section provides some guidelines to be used when sizing partitions versus stand alone systems The actual results measu...

Page 296: ...n summary the measurements on the 4 way system indicate that when a workload can be logically split between two systems using LPAR to configure two systems will result in system capacities that are gr...

Page 297: ...ines for partitioned systems the standard AS 400 Commercial Processing Workload CPW was run in several environments to better understand two things First how does the sum of the capacity of each parti...

Page 298: ...arge system was also compared to the capacity of an equally sized stand alone system If all the partitions except the partition running the CPW are idle or at low utilization the capacity of the parti...

Page 299: ...ause of a reduction in contention within the CPW workload itself. That is, the measurement of the standalone 12-way system required a larger number of users to drive the system's CPU to 70 percent than...

Page 300: ...increases will range from 5% to 26%. The capacity increase will depend on the number of processors partitioned and on the number of partitions. In general, the greater the number of partitions, the greater...

Page 301: ...this configuration by coupling multiple database servers together in a clustered environment The benchmark is designed in such a way that these clusters scale far better than might be expected in a r...

Page 302: ...rid password through the Notesbench organization Click on Site Registration at the above address An alternate is to refer to the ideasInternational web site listed above For more information on iSerie...

Page 303: ...he affinity field is no longer used in a N way multiprocessor machine Thus on an N way multiprocessor machine a job will have equal affinity for all processors based only on delay cost A new system va...

Page 304: ...lower priority jobs When the CPU utilization is at a point of saturation the lower priority jobs are climbing quite a way up the curve and interacting with other curves all the time This is when the D...

Page 305: ...50 28 9 21 9 Priority 20 CPU Intensive Job CPU 0 75 0 32 RAMP C Average Response Time 56951 60845 RAMP C Transactions per Hour 82 2 77 6 Interactive CPU Utilization 97 8 93 9 Total CPU Utilization QD...

Page 306: ...storage will be available on AS 400 Advanced System model 530 The impact of the 4KB page size on main storage utilization varies by workload The impact of the 4KB page size is dependent on the way dat...

Page 307: ...if you know times when abrupt changes in memory are likely to be required such as a difference between daytime operations and nighttime operations or when you want to always have memory available for...

Page 308: ...set to either 2 or 3 the pool that is used to hold the object should be a private pool so that the dynamic adjustment algorithms do not shrink the pool because of the lack of job activity in the pool...

Page 309: ...storage to the interactive pool may be beneficial see NOTE below Batch 1 flts sum of database and non database faults per second during a meaningful sample interval for the batch pool 2 flt flts diskR...

Page 310: ...ent PCs • Time to collect hardware inventory from client PCs. The figures below illustrate the time it takes to collect software and hardware inventory from various numbers of client PCs. This test was...

Page 311: ...42 minutes. Figure 19.1 AS/400 NetFinity Software Inventory Performance. [Chart axes: Number of PC Clients; Total Collection Time (min)] AS/400 NetFinity Hardware Inventory Pe...

Page 312: ...on tends to be more chatty on the LAN than software collection depending on the hardware features 4 The communications protocol IPX TCP IP or SNA is not a limitation 5 Collected data is automatically...

Page 313: ...as Lotus Domino or especially Java With Java generally and with certain applications it will be commonplace to have multiple threads in a job That means taking a closer look at some old friends MAXACT...

Page 314: ...s will substantially increase. • Note carefully that this can happen as a result of an upgrade. If you have just purchased a new machine and it runs slower instead of faster, it may be because you're usi...

Page 315: ...lt the manuals in case of uncertainty. Generally: • OPTIMIZE(10) is the lowest and most debuggable. • OPTIMIZE(20) is a trade-off between rapid compilation and some minimal optimization. • OPTIMIZE(30) provi...

Page 316: ...nk about how many entities there are not how big the entity is It turns out that controlling the number of entities matters most in terms of controlling main storage and even processor usage it costs...

Page 317: ...ly adjusts b in the above equation to account for what is picked for N System Level Considerations In terms of the computer science textbooks we are largely done But for someone in charge of commercia...

Page 318: ...ytes, or reducing the currency table from a cost of N squared, where N is the number of countries, to 2 times N. There are two obvious implementations of the currency table: 1. Implement the table as a two...
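
The N squared versus 2 times N comparison on page 318 can be made concrete with a short sketch. The excerpt is truncated before it describes either implementation, so the approach below is an assumption rather than the manual's method: it keeps one rate into and one rate out of a common reference currency per country. All names and rates are hypothetical.

```python
def full_table_entries(n_countries: int) -> int:
    """Entries in a pairwise table: one rate per (source, target) pair."""
    return n_countries * n_countries


def reduced_table_entries(n_countries: int) -> int:
    """Entries when each country stores only two rates (to and from a reference)."""
    return 2 * n_countries


# Hypothetical rates, chosen only so the arithmetic is easy to follow.
TO_REFERENCE = {"A": 1.0, "B": 2.0, "C": 0.5}    # units of reference per unit of currency
FROM_REFERENCE = {"A": 1.0, "B": 0.5, "C": 2.0}  # units of currency per unit of reference


def convert(amount: float, source: str, target: str) -> float:
    """Convert by going through the reference currency (assumed design)."""
    return amount * TO_REFERENCE[source] * FROM_REFERENCE[target]


if __name__ == "__main__":
    n = len(TO_REFERENCE)
    print("pairwise entries:", full_table_entries(n))
    print("reduced entries:", reduced_table_entries(n))
    print("10 B in C:", convert(10, "B", "C"))
```

The gap widens quickly: for 100 countries the pairwise table needs 10,000 entries while the reduced form needs 200, at the price of one extra multiplication per conversion.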

Page 319: ...e such item So where are the savings The above recommendations will save 9 bytes per record If you write the code in RPG this does not seem like much That would be 9 bytes times the number of jobs use...

Page 320: ...l A Final Thought About Memory and Competitiveness The currency storage reduction example remains a good one just at the wrong level of granularity Avoiding a SQL join that produces N squared records...

Page 321: ...l CPU both controlled by hardware It was created by replicating key registers including another instruction counter Generally there is a distinction between the one physical processor and its two logi...

Page 322: ...to be reported on a physical CPU basis. • SMT complicates the question of measuring CPU utilization. • CPU utilization measurements are not greatly affected by HMT. • SMT allows multiple streams of executi...

Page 323: ...d what occurs when the DIMMs of different sizes are mixed we used a Power6 520 9408 M25 F C 5635 a fully enabled system with one partition using all the available resources Mixed here means the DIMMs...

Page 324: ...of the data. Although such a specification exists for Binary Floating Point operands, the processor designs have the option of allowing free alignment of Binary Floating Point operands as well. The Po...

Page 325: ...iable will be aligned on a doubleword 1 boundary Each access of this second floating point variable will result in an interrupt on Power6 processors The second of these structures allows the compiler...

Page 326: ...se server or application server automatically switches to a backup system due to the failure of the primary server Partition A cluster event where communication is lost between one or more nodes in th...

Page 327: ...onments available make it difficult to characterize a typical high availability environment The following section provides a simple description of the high availability test environment used in our la...

Page 328: ...ep the number of database objects in SYSBAS low on both systems Larger number of objects in SYSBAS can slow the switchover 21 2 Geographic Mirroring A variety of scenarios exist in Cross Site Mirrorin...

Page 329: ...d from the time the role change is issued from the GUI or the CHGCRGPRI until the new primary system's IASP is available. Inactive Switchover: Once the geographic mirror copy is synchronized, the switchov...

Page 330: ...ngMap switch 4Gigabit Lines 4Gigabit Lines SystemASP iASP SystemASP iASP Hardware Configuration 64 Gig 64Gig Memory 35 Gig 35 Gig Size of Dasd 2757 2757 Type of IOA s 15k 15k Speed of Dasd 50 50 of Ar...

Page 331: ...scenario. An environment with objects the size of the objects used in this test caused synchronization times 4x larger. Switchable Towers using Geographic Mirroring: The following data shows the time r...

Page 332: ...degrade the applications running on the system during the synchronization process Multiple TCP lines should be configured using TCP routes Failure to use TCP routes will lead to a single line on the...

Page 333: ...The visualize solution function can be used to better understand the recommendation in terms of time intervals and virtualization The Estimator can also be optionally linked to the System Planning Too...

Page 334: ...If this is your first time using PM data with the Estimator it is recommended that you take a few minutes to read the Measured Workload Integration tutorial found on the help tutorial tab in the Esti...

Page 335: ...n response times it does adhere to the policy of giving recommendations that abide by generally accepted utilization thresholds This means that the recommendation will likely have acceptable response...

Page 336: ...expended in each transaction because more work is done at the application level instead of at the IBM licensed internal code level A 1 Commercial Processing Workload CPW The CPW rating of a system is...

Page 337: ...approximately the same disk and memory resources per simulated user on all systems. • Public benchmarks tend to stress extreme levels of scaling at very high CPU utilizations for very limited applicati...

Page 338: ...listed here are probable placements but not absolute guarantees The importance of having the two measures is to show that different workloads react differently to changes in the configuration IBM s W...

Page 339: ...pgrades whenever possible Increasing the MHz of the processor also helps but you should not expect performance to scale directly with MHz unless other aspects of the system are equally improved An exa...

Page 340: ...e The user can also model the effect of changing a single job into multiple jobs running concurrently. It can be found at http://www.ibm.com/servers/eserver/iseries/perfmgmt/sizing.html • PATROL for iSer...

Page 341: ...you always want to generate the database you can configure Collection Services to run CRTPFRDTA as a low priority batch job while data is being collected Separating the collection of the data from th...

Page 342: ...nalyze the model and provide results for various what if conditions Individual batch job run time and overall batch window run times will be reported by this tool BCHMDL Output description 1 Configura...

Page 343: ...ibrary list and start the tool by using the STRBCHMDL command Tips disclaimers and general help are available in the QBCHMDL README file It is recommended that you work closely with your IBM Technical...

Page 344: ...t utilized or provided here. • Compute Intensive Workload (CIW): For a detailed description, refer to Appendix A. CIW values are no longer utilized or provided here. • User-based Licensing: Many newer models...

Page 345: ...L3 cache 1 per chip Chip Speed GHz Processor Feature Model Processor CPW Table C 1 2 CPW values for Power System Models Note 1 These models have a dedicated L2 cache per processor core and share the L...

Page 346: ...some slight variations in performance difference between models 4 CPW values for Power System models introduced in October 2008 were based on IBM i 6 1 plus enhancements in post release PTFs C 2 V6R1...

Page 347: ...a 0 2 core VIOS partition 3 The value listed is unconstrained CPW there is sufficient I O such that the processor would be the first constrained resource The I O constrained CPW value for a 12 disk c...

Page 348: ...Feature Model Table C 5 1 System i models Note 1 These models have a dedicated L2 cache per processor core and share the L3 cache between two processor cores 2 This is the Edition Feature for the mod...

Page 349: ...36MB 2200 NA 7758 9406 570 35500 67500 Per Processor 16700 31100 4 8 1 9 36MB 2200 NA 7748 9406 570 35500 67500 Per Processor 16700 31100 4 8 1 9 36MB 2200 NA 7764 5 9406 570 67500 130000 0 31100 5850...

Page 350: ...The 64-way CPW value reflects two 32-way partitions. 9. These models are accelerator models. The base CPW or MCU value is the capacity with the default processor feature. The max CPW or MCU value is th...

Page 351: ...0930 7491 14100 26600 12000 6350 12000 2 4 36 MB 1 9 MB 1650 570 0921 7560 5 14100 26600 0 6350 12000 2 4 36 MB 1 9 MB 1650 570 0921 7494 14100 26600 12000 6350 12000 2 4 36 MB 1 9 MB 1650 570 0921 7...

Page 352: ...0 17400 1570 2890 0 3600 6600 3 6 1 41 MB 1100 825 2473 7416 8700 17400 1570 2890 Max 3600 6600 3 6 1 41 MB 1100 825 2473 7418 20200 29600 3600 5280 Max 7700 11500 5 8 1 41 MB 1300 870 2489 7433 20200...

Page 353: ...rver systems with 0 interactive capability Standard Models represent systems that have interactive features available and also may have Capacity Upgrade on Demand Capability See Chapter 2 iSeries RISC...

Page 354: ...1 41 MB 1300 890 2488 1587 84100 108900 12900 16700 10000 29300 37400 24 32 1 41 MB 1300 890 2488 1585 84100 108900 12900 16700 4550 29300 37400 24 32 1 41 MB 1300 890 2488 1583 84100 108900 12900 167...

Page 355: ...2438 1527 11810 1670 1050 3700 4 4 MB 600 820 2438 1526 11810 1670 560 3700 4 4 MB 600 820 2438 1525 11810 1670 240 3700 4 4 MB 600 820 2438 1524 11810 1670 120 3700 4 4 MB 600 820 2438 1523 11810 167...

Page 356: ...380 0 1070 1 2 MB 540 270 2432 1516 1490 185 30 465 1 n a 540 270 2431 1518 MCU Processor CIW Interactive CPW Processor CPW CPUs L2 cache per CPU Chip Speed MHz Model Table C 10 2 1 Model 2xx Servers...

Page 357: ...n V4R5 December 2000 C 10 4 1 CPW Values and Interactive Features for CUoD Models The following tables list only the processor CPW value for the Startup number of processors as well as a processor CPW...

Page 358: ...CPW CPU Range L2 cache per CPU Chip Speed MHz Model Table C 10 4 1 1 V5R1 Capacity Upgrade on demand Models 48000 62700 6750 8820 16500 13200 16500 18 24 8 MB 500 840 2419 1547 48000 62700 6750 8820 1...

Page 359: ...s of these new models C 11 1 AS 400e Model 8xx Servers 240 4200 4 4 MB 540 830 2402 1533 120 4200 4 4 MB 540 830 2402 1532 70 4200 4 4 MB 540 830 2402 1531 1050 1850 2 2 MB 400 830 2400 1535 560 1850...

Page 360: ...830 2402 1536 1050 4200 4 4 MB 540 830 2402 1535 560 4200 4 4 MB 540 830 2402 1534 Interactive CPW Processor CPW CPUs L2 cache per CPU Chip Speed MHz Model C 11 2 Model 2xx Servers 70 2000 2 4 MB 450...

Page 361: ...s that follows Except for systems which are nearing the need for an upgrade we do not expect this increase to significantly affect transaction response times It is recommended that other sections of t...

Page 362: ...B 262 730 2065 1508 140 120 560 1 4 MB 262 730 2065 1507 81 7 70 560 1 4 MB 262 730 2065 Base 1225 1050 1600 4 4 MB 255 720 2064 1505 653 3 560 1600 4 4 MB 255 720 2064 1504 Interactive CPW Max Intera...

Page 363: ...evious Model 170's, the knee of the curve is about 1/3 the maximum interactive CPW value. Note that a constrained (c) CPW rating means the maximum memory or DASD configuration is the constraining factor n...

Page 364: ...050 0 3660 8 2340 S40 27 7 32 5 496 8 579 6 1794 8 2322 18 5 21 5 331 2 386 4 1794 8 2321 18 5 21 5 184 4 215 1 998 6 4 2320 S30 25 0 29 2 189 8 221 4 759 4 2178 12 5 14 6 94 9 110 7 759 4 2177 S20 CP...

Page 365: ...7 104 2 1 n a 2121 9 3 27 8 7 7 21 4 77 7 1 n a 2120 50S 9 9 29 8 10 3 30 7 87 3 1 n a 2112 9 9 29 8 6 9 20 6 59 8 1 n a 2111 12 5 37 4 3 7 13 8 33 3 1 n a 2110 10 30 1 3 1 9 4 27 0 1 n a 2109 40S 20...

Page 366: ...224 1 2132 20 6 20 6 50 224 1 2131 13 8 13 8 50 160 1 2130 400 V4R1 CPW V3R7 CPW Disk GB Maximum Memory MB Maximum CPUs Feature Code Model Table C 17 1 AS 400 RISC Systems 2340 2095 9 32768 12 2243 17...

Page 367: ...W Disk GB Maximum Memory MB Maximum CPUs Feature Code Table C 18 3 AS 400 CISC Model 9402 Servers 13 7 20 6 80 1 F25 11 8 19 7 80 1 E25 11 6 20 6 80 1 F20 9 7 19 7 72 1 E20 9 6 20 6 72 1 F10 9 7 6 4 6...

Page 368: ...6 AS 400 CISC Model 9406 Systems 177 4 259 6 1536 4 2052 120 3 259 6 1536 2 2051 67 5 259 6 1536 1 2050 320 56 5 159 3 832 2 2044 33 8 159 3 832 1 2043 310 21 1 117 4 160 1 2042 16 8 117 4 80 1 2041...
