Chicago-Soft
ATTN: TSO Times
One Maple Street
Hanover, NH 03755
(603) 643-4002
information
@tsotimes.com

 

Setting and Reporting TSO Service Levels

by Cheryl Watson

This article is reprinted with permission from Cheryl Watson's Tuning Letter, (c) Watson & Walker, Inc. The second part of the article, "Response Time Using RMF," will appear in our next issue.

Setting and reporting TSO service levels is quite easy, because the data is readily available from measurement sources. For example, RMF provides average response times for all TSO performance group periods in the type 72 (workload) SMF records. TSO/MON from LEGENT Corporation provides response distributions, which are a better basis for reporting on TSO service levels. More detail on TSO/MON is provided a little later.

BASIC MEASUREMENTS
To understand the measurements for TSO, you should be familiar with TSO periods. Most installations divide all TSO transactions into multiple periods to give priority to the short transactions and let the longer transactions gradually age into a lower priority. The most common use of periods is to define three periods with between 80 and 90% of the transactions completing in first period. These are often called "trivial" or "short" transactions. The second period might have 5 to 10% of the transactions (usually called "medium"), and the third period or "long" transactions make up the rest. A common technique is to provide a fourth period for especially long transactions that you might want to let compete with an "express" batch class.

Depending on the applications, each installation will have a different breakout of these transactions. Service level objectives are then set for each period. One site with heavy CADAM usage could only get 40% of its transactions completed in first period, since over half the transactions were very long CADAM work. Periods are defined by a DUR (duration) parameter specified on the PGN statement in the IEAIPSxx PARMLIB member. The DUR parameter specifies the total number of service units a transaction accumulates in a period before it moves into the next period.

Figure 1 - TSO Period Definition

	PGN=2,(DMN=1,DP=F41,DUR=800)
	      (DMN=1,DP=F4,DUR=2000)
	      (DMN=2,DP=M4)

Figure 1 shows a sample TSO definition, where a transaction is considered to be a short transaction (and gets a dispatch priority of F41) until it has accumulated 800 service units. It then becomes a period 2 transaction, receiving a dispatch priority of F4 for the next 2000 service units, after which it moves to third period. It then remains in period 3 at a priority of M4 until it completes. If the duration is increased, transactions will stay in first period longer and therefore more transactions will complete in that period. The average response time for that period will also increase. Decreasing the DUR value results in transactions moving to the next period faster.
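The aging described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how SRM is implemented. The function name `period_for` is hypothetical, and the sketch assumes that a transaction moves to the next period at the moment its accumulated service reaches the DUR total (the exact boundary behavior is an assumption).

```python
from bisect import bisect_right
from itertools import accumulate

# DUR values for periods 1 and 2 from Figure 1; the last period has no DUR.
DURATIONS = [800, 2000]

def period_for(service_units, durations=DURATIONS):
    """Return the period a transaction is in after accumulating service_units.

    Each DUR applies to its own period, so the cumulative thresholds
    for Figure 1 are 800 and 2800 service units.
    """
    thresholds = list(accumulate(durations))
    # Assumption: reaching a threshold exactly moves the transaction on.
    return bisect_right(thresholds, service_units) + 1

for su in (500, 800, 2799, 2800, 10000):
    print(su, "service units -> period", period_for(su))
```

Raising the first DUR from 800 to, say, 1000 in `DURATIONS` shows the effect described in the text: transactions that use between 800 and 1000 service units would then complete in first period.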

Figure 2 (below) shows an RMF Workload extract with the statistics for a set of TSO transactions. You see that 1,335 transactions completed in first period, or almost 94% (1335 / 1421) of all transactions. The rightmost column shows the average response time of 1.824 seconds, with a standard deviation of 3.669 seconds. If this is considered to be normal processing (based on other time periods) with everyone satisfied, you could set a TSO service objective of 90% (or more) of all transactions completing in first period with an average response time of 1.9 seconds. Actually, I would find it hard to believe that the users on this system were happy. Notice that second period has an average of over 7 seconds. For this system, I would probably decrease the DUR parameter for the first period.
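As a rough check of the first-period percentage, the ended-transaction counts from Figure 2 can be plugged into a short Python sketch. The variable names and the 90% objective constant are illustrative only; the counts come from the ENDED TRANS column of the figure.

```python
# Ended transactions per period, taken from Figure 2.
ended = {1: 1335, 2: 73, 3: 13}

total = sum(ended.values())               # 1421, matching the ALL line
first_period_pct = ended[1] / total * 100

OBJECTIVE_PCT = 90.0                      # sample objective from the text
meets_objective = first_period_pct >= OBJECTIVE_PCT

print(f"{first_period_pct:.1f}% completed in first period")
print("objective met" if meets_objective else "objective missed")
```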

Figure 2 - TSO Workload Data

PERFORMANCE  GROUP    AVERAGE ABSORPTION,   AVERAGE   ENDED     AVG TRANS
GROUP        PERIOD   AVG TRX SERV RATE,    TRANS,    TRANS,    TIME / STD DEV
NUMBER                WORKLOAD LEVEL        MPL       #SWAPS    HHH.MM.SS.TTT

002          1        ABSRPTN =2,171        .41       1335      000.00.01.824
                      TRX SERV=1,765        .33       1841      000.00.03.669

002          2        ABSRPTN =1,223        .07       73        000.00.07.123
                      TRX SERV=1,222        .07       76        000.00.02.948

002          3        ABSRPTN =926          .07       13        000.00.12.283
                      TRX SERV=926          .07       13        000.00.09.291

002          ALL      ABSRPTN =1,833        .56       1421      000.00.02.192
                      TRX SERV=1,582        .48       1930      000.00.04.015

So how is this data really accumulated? The information for TSO is collected and reported each RMF interval and recorded in the type 72 record. At the end of each transaction, the internal response time for each transaction is added to an accumulator, the square of the response time is added to an accumulator to later estimate the standard deviation, and one is added to the transaction count. When the record is written, the total response times are divided by the total transactions to obtain the average response time. The "sum of the squares" is used to estimate the standard deviation. This is not a true standard deviation, but an indication of the variability of the response times. Large standard deviations indicate a wide variance from the average. Small standard deviations indicate fairly consistent response times.
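The accumulator scheme can be illustrated with a small Python sketch. The class and method names are hypothetical, and the article does not give the exact formula RMF uses to turn the sum of squares into its variability figure; the sketch assumes the usual population formula (mean of the squares minus the square of the mean).

```python
import math

class ResponseAccumulator:
    """Keeps only three running values per period, as described above."""

    def __init__(self):
        self.total = 0.0    # sum of response times
        self.sum_sq = 0.0   # sum of squared response times
        self.count = 0      # number of ended transactions

    def end_transaction(self, seconds):
        self.total += seconds
        self.sum_sq += seconds * seconds
        self.count += 1

    def average(self):
        return self.total / self.count

    def std_dev(self):
        # Assumed formula: sqrt(E[x^2] - (E[x])^2); guard against tiny
        # negative values caused by floating-point rounding.
        mean = self.average()
        variance = max(self.sum_sq / self.count - mean * mean, 0.0)
        return math.sqrt(variance)

acc = ResponseAccumulator()
for rt in (1.0, 2.0, 3.0):          # made-up response times in seconds
    acc.end_transaction(rt)
print(acc.average(), round(acc.std_dev(), 3))
```

The point of the design is that no individual response times need to be kept between intervals; three counters are enough to produce both figures when the record is written.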

VARIANCES
There are several external influences that can affect these measurements. These include parameters of CNTCLIST, DUR, MSO, RTO, and the effect of Output Wait Swaps.

CNTCLIST: SRM contains a parameter in SYS1.PARMLIB member IEAOPTxx called CNTCLIST. The default value of NO indicates that SRM should treat any CLIST as a single transaction. A value of YES indicates that each command in the CLIST should be treated as a single transaction. As an example, assume a CLIST has 10 commands and takes 10 seconds. If CNTCLIST=NO (default) is specified, RMF would report 1 transaction and an average 10 second response time. If CNTCLIST=YES, RMF would report 10 transactions and an average 1 second response time. Changing this parameter will change both the meaning of "a transaction" and the reported average time. I prefer the default of NO because it more closely represents what the user sees and incurs less overhead. Several installations, however, use YES because it improves CLIST response times.
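The effect on the reported numbers is simple arithmetic, sketched here in Python with the 10-command, 10-second CLIST from the example. The helper `reported` is hypothetical and only models what the report would show, not how SRM counts transactions internally.

```python
def reported(total_seconds, transactions_counted):
    """Return (transaction count, average response time) as reported."""
    return transactions_counted, total_seconds / transactions_counted

clist_seconds = 10.0   # elapsed time of the whole CLIST
commands = 10          # commands inside the CLIST

no_count, no_avg = reported(clist_seconds, 1)            # CNTCLIST=NO
yes_count, yes_avg = reported(clist_seconds, commands)   # CNTCLIST=YES

print("CNTCLIST=NO :", no_count, "transaction, average", no_avg, "seconds")
print("CNTCLIST=YES:", yes_count, "transactions, average", yes_avg, "seconds")
```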

DUR: As mentioned earlier, you can increase the number of transactions completing in a period by increasing the value of DUR. Decreasing it decreases the number of transactions completing in that period. You'll want to change this parameter in order to meet your service objective of a specific percent completing in first period. If DUR is changed for another reason, it will affect the percent of transactions completing in first period.

MSO: If the MSO service definition coefficient (SDC) is set to anything other than a minimal number (e.g. 0.1), then MSO (main storage occupancy) contributes too heavily to the total service units. Installations using an MSO SDC of 3.0 will accumulate over 50% of their total service units due to memory. If MSO plays a large role in total service units, then any change in the amount of storage can affect how rapidly transactions accumulate service units, and therefore how many transactions complete within first period. To eliminate this variability, you should use a much smaller MSO SDC, such as 0.1, 0.01, or 0.0 (only if at SP 4.2 or later).

RTO: Some sites use the TSO RTO (Response Time Option) to restrict the TSO load. Use of RTO carries some political implications and is very controversial, but can be extremely valuable after an upgrade. It serves to provide very consistent response times [...]

This delay can be due to the physical swap-in time and/or the RTO delay. A common calculation used to estimate swap-in delay is:

      (1 - (MPL / AVG TRANS)) * Response-time

From Figure 2 (which doesn't use RTO), we would calculate:

      (1 - (.33 / .41)) * 1.824 = .356 seconds

Thus, there was a .356-second delay for swap-in. Since there was no RTO specified, this implies that the physical swap took .356 seconds, and the actual response time was 1.468 seconds (1.824 - .356). If you only want to calculate the actual internal response time, you can use the formula:

      (MPL / AVG TRANS) * Response-time.
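Both formulas can be written as small Python functions and checked against the first-period values from Figure 2 (MPL .33, AVG TRANS .41, response time 1.824 seconds). The function names are illustrative only.

```python
def swap_in_delay(mpl, avg_trans, response_time):
    """Estimated swap-in (and/or RTO) delay: (1 - MPL / AVG TRANS) * response time."""
    return (1 - mpl / avg_trans) * response_time

def internal_response(mpl, avg_trans, response_time):
    """Estimated response time excluding the delay: (MPL / AVG TRANS) * response time."""
    return (mpl / avg_trans) * response_time

delay = swap_in_delay(0.33, 0.41, 1.824)
actual = internal_response(0.33, 0.41, 1.824)
print(f"delay  = {delay:.3f} seconds")    # .356
print(f"actual = {actual:.3f} seconds")   # 1.468
```

The two results always add up to the reported response time, since they simply split it in the ratio of MPL to AVG TRANS.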

OUTPUT WAIT SWAPS: If TSO must wait for an output buffer, the transaction is swapped out. When it swaps back in, it's counted as a new transaction. Thus a single transaction that the user sees taking two seconds will be reported as one transaction with a response time of two seconds if there is no output wait swap, but will be reported as two transactions with an average response of one second if there is an output wait swap.

Look at the RMF Swap Placement report to see if there are many output wait swaps (reported at the bottom of the page). Figure 3 (below) shows an extract from an RMF Monitor I Paging Activity report. First, calculate the percent of output wait swaps: divide the output wait swaps (at the bottom of the page) by the total terminal input/output wait swaps and multiply by 100. In Figure 3, this percent would be very high: 26.96% ((1088 / 4036) * 100)! If the percent of output wait swaps is less than 10% of the total terminal input/output wait swaps, then you shouldn't be too concerned, because fewer than 10% of the transactions are being reported incorrectly. If it's greater than that, then you should be aware that the reported response time is less than it should be and the number of transactions is greater than it should be.
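The percentage check can be sketched in Python using the two counts from Figure 3. The function name and the threshold constant are illustrative; the 10% guideline is the one given in the text.

```python
def output_wait_percent(output_wait_swaps, terminal_io_wait_swaps):
    """Percent of terminal input/output wait swaps that were output waits."""
    return output_wait_swaps / terminal_io_wait_swaps * 100

GUIDELINE_PCT = 10.0   # rule of thumb from the text

pct = output_wait_percent(1088, 4036)   # counts from Figure 3
print(f"{pct:.2f}% output wait swaps")
if pct >= GUIDELINE_PCT:
    print("reported response times are understated; transaction counts are overstated")
```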

Some output wait swaps can be eliminated by increasing two parameters in parmlib member TSOKEY00. You can set BUFRSIZE=2048 and HIBFREXT=96000. These may increase the TSO working set size, but can reduce output wait swaps if you have applications that send more than 132 characters to the screen at one time. The increased values are also useful for Model 3, 4 or 5 terminals. Some applications, such as FOCUS, ADABAS, graphic displays, and downloads, also cause output wait swaps (and there's no way to reduce them).

Figure 3 - TSO Output Wait Swaps

			TOTAL

TERMINAL	CT	4,036
INPUT/OUTPUT	RT	4.22
WAIT		%	66.9%

...

OCCURRENCES OF TERMINAL OUTPUT WAIT - 1,088

Here's a neat technique to estimate the actual response time when there are many output wait swaps. First, determine the percent of Input/Output swaps that are only input swaps with the following calculation:

          (INPUT/OUTPUT-OUTPUT)
          ---------------------
             (INPUT/OUTPUT)

From Figure 3, this value would be: (4036 - 1088) / 4036 = .73 (or 73%). Now, from the workload activity report (see Figure 2), you can estimate the actual internal response time by dividing the average response time by the ratio of the input to the total. Figure 3 was obtained from the same system and time period as the data in Figure 2. The total swaps are much greater than the number of transactions and swaps in this performance group because there were other TSO performance groups. In Figure 2, the first period TSO response time could be interpreted as 1.824 / .73 or 2.499 seconds. You can also estimate the actual number of transactions by multiplying the ended transactions by the ratio. The example in Figure 2 could then be interpreted as having 974.6 transactions (1335 * .73).
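The adjustment can be reproduced with a short Python sketch. The names are illustrative. Note that the article rounds the ratio to .73 before using it; the sketch does the same so that the results match the figures quoted above, and using the unrounded ratio gives slightly different values.

```python
# Counts from Figure 3 and first-period values from Figure 2.
terminal_io_swaps = 4036
output_wait_swaps = 1088
reported_response = 1.824    # seconds
reported_ended = 1335        # ended transactions

input_ratio = (terminal_io_swaps - output_wait_swaps) / terminal_io_swaps
ratio = round(input_ratio, 2)                    # .73, as in the text

estimated_response = reported_response / ratio   # about 2.499 seconds
estimated_transactions = reported_ended * ratio  # about 974.6

print(f"input ratio            = {ratio}")
print(f"estimated response     = {estimated_response:.3f} seconds")
print(f"estimated transactions = {estimated_transactions:.2f}")
```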

Cheryl Watson's Tuning Letter is a highly-respected, impartial and very practical journal of MVS management and advice published six times a year by Watson & Walker Inc. Call 1-800-553-4562 for subscription information.


