Wednesday, September 22, 2010


How to estimate the capacity and scalability of CPU and I/O (Capacity Planning)

In this note I will show you how to perform Capacity Planning (CP) for CPU and I/O in a simple manner, using simple math. To do an effective CP it is important to decide which subsystems to model. The typical subsystems are CPU, I/O (disks), memory and network. There are commercial products that produce a global CP without modeling each subsystem, but I prefer to analyze each subsystem separately because I think the results are more precise and clear. CPU and I/O are clearly the most important, and they are the subsystems we delve into in this note.

Surely your manager has at some point asked you whether the number of processors or the disk speed is adequate to ensure that the system remains stable. To this question one can simply answer with confidence that everything will work well, or one can be more professional, do a CP, and know exactly where the breaking point will be. That means not only answering whether the HW supports the current load but also, taking the forecast annual transaction growth into account, knowing how far our HW can take us and planning the purchase or upgrade of HW in advance.

Now let's go directly to how to model a CPU subsystem and an I/O subsystem. More sophisticated CP requires some knowledge of queuing theory (Erlang C function, Kendall notation, etc.). For this note a little common sense is enough. Figure A) models the CPU subsystem. It consists of a single queue and N servers (CPUs). This means that each request can be served by any CPU, or goes into the queue when all the CPUs are busy (the run queue, in OS terms). Figure B) models the I/O subsystem, where we have one queue per server or device. In this case, each request must be served by a particular device.

 



The two models mentioned show up many times in daily life: when we go to the bank, when we wait at a toll booth, when we wait to board a flight, when we place our order at a fast-food restaurant, etc.

Now let's define the variables we need to perform the CP:

Uc = User Calls
trx = transaction
λ = arrival rate (e.g. trx/ms or Uc/ms)
St = Service Time (e.g. s/trx or s/Uc)
Qt = Queue Time, or wait time (e.g. s/trx)
Rt = Response Time (e.g. s/trx)
Q = Queue length (e.g. number of trx)
U = utilization (percentage of utilization of the resource)
M = number of servers (number of CPUs or number of I/O devices)

and the formulas:

Rt = St + Qt
U = St * λ / M
Q = λ * Qt

Rt (cpu) = St / (1 - U^M)

Rt (io) = St / (1 - U)
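The formulas above can be written as a few small functions. This is only a sketch in Python (the note itself uses a spreadsheet); the function names and the sample inputs are assumptions made for illustration, not values from the examples.

```python
def utilization(st, lam, m):
    """U = St * λ / M, as a fraction between 0 and 1."""
    return st * lam / m


def response_time_cpu(st, u, m):
    """Rt(cpu) = St / (1 - U^M): one shared queue feeding M servers."""
    if not 0 <= u < 1:
        raise ValueError("U must be in [0, 1); at U >= 1 the model has no finite answer")
    return st / (1 - u ** m)


def response_time_io(st, u):
    """Rt(io) = St / (1 - U): one queue per device."""
    if not 0 <= u < 1:
        raise ValueError("U must be in [0, 1); at U >= 1 the model has no finite answer")
    return st / (1 - u)


def queue_time(rt, st):
    """Qt = Rt - St, rearranged from Rt = St + Qt."""
    return rt - st


def queue_length(lam, qt):
    """Q = λ * Qt."""
    return lam * qt


# Hypothetical inputs, chosen only to exercise the functions.
st, lam, m = 0.1, 40.0, 8            # s/trx, trx/s, number of servers
u = utilization(st, lam, m)
rt_io = response_time_io(st, u)
qt_io = queue_time(rt_io, st)
print(f"U={u:.2f}  Rt(cpu)={response_time_cpu(st, u, m):.4f}  "
      f"Rt(io)={rt_io:.4f}  Qt(io)={qt_io:.4f}  Q(io)={queue_length(lam, qt_io):.2f}")
```

With the same service time and utilization, the single-queue CPU model gives a much lower response time than the queue-per-device I/O model, because U^M is far smaller than U when M is large.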



The goal is to graph response time against arrival rate. The curve should rise roughly exponentially, with a knee (turning point) that shows at which arrival rate (X axis) the system breaks down.


 

I will now explain in more detail what each term means:

1) λ (Arrival Rate): This metric can be defined using Oracle metrics such as Uc/s or trx/s, or using functional metrics, e.g. purchase orders loaded per minute or per hour.

2) M (Number of CPUs or I/O Devices): The number of CPUs or I/O devices. To obtain this value we can use OS utilities and/or query Oracle views.

3) U (Utilization): The percentage of CPU or I/O load. It can be obtained using OS utilities such as sar or iostat.

4) St (Service Time): The time during which the request is being handled by the CPU or the I/O device.

5) Qt (Queue Time): The time that a request has to wait in the queue because the servers are busy.

6) Q (Queue Length): The number of requests queued (waiting to be served).

7) Rt (Response Time): The response time is the sum of St (service time) and Qt (queue or wait time). A request is either waiting or being served.

Now that everything is defined, I'll show you an example:

Example 1 (Capacity Planning for CPU)


Suppose we are asked to estimate the impact of a 20% increase in the number of users of an application X. What we want to determine is whether we need to add more processors or whether the current ones are enough to support the additional load.

The first thing we do is collect information over a representative period. The more data we collect, the more accurate our CP will be. Then we must characterize the load, which means deciding whether we will use average values or maximums, what we will use as the arrival rate, etc. In this example I'm going to use peak values from the period of maximum load, and an arrival rate in trx/s.

λ = 20 trx/s (20 transactions per second; available from Statspack or AWR)
U = 0.40 (40% CPU utilization; it can be collected with sar -u)
M = 8 (there are 8 processors; obtained from the database parameter cpu_count)

Now that I have the 3 essential parameters, I only have to apply the formulas and graph the result:



U = St * λ / M; solving for St:

St = U * M / λ = 0.40 * 8 / 20 = 0.16 s/trx

Now we can apply the formula for CPU:



Rt (cpu) = St / (1 - U^M) = 0.16 / (1 - 0.40^8) = 0.16 s/trx

As you can see, the response time equals the service time. This is because the system has plenty of spare CPU for the current load and there is no queuing, i.e. no waiting to be served by the CPUs. What I do now is build a table in Excel to project the growth and plot the curve in order to see where the break occurs.





We can see that the turning point is around 30-35%. Consider also that we are starting from a peak workload, so most of the time the load will be well below it. According to the CP, we can be sure that the system will absorb the additional 20% load, but it begins to deteriorate rapidly from 25%.
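For readers who prefer a script to a spreadsheet, the projection table can be sketched in Python as follows. It applies the formulas exactly as written in this note to the Example 1 values; the function name and the growth steps are assumptions, and since the original spreadsheet and graph are not reproduced here, the figures it prints may differ from those read off the original plot.

```python
def project_cpu(st, lam, m, growth_pcts):
    """Return (growth %, λ, U, Rt) rows using U = St*λ/M and Rt = St/(1 - U^M)."""
    rows = []
    for g in growth_pcts:
        new_lam = lam * (1 + g / 100)      # arrival rate after g% growth
        u = st * new_lam / m               # utilization at that arrival rate
        # At U >= 1 the CPUs are saturated and the formula has no finite answer.
        rt = st / (1 - u ** m) if u < 1 else float("inf")
        rows.append((g, new_lam, u, rt))
    return rows


# Values from Example 1: St = 0.16 s/trx, λ = 20 trx/s, M = 8.
cpu_rows = project_cpu(0.16, 20.0, 8, range(0, 160, 10))
for g, lam, u, rt in cpu_rows:
    print(f"{g:>4}%  λ={lam:6.1f} trx/s  U={u:5.2f}  Rt={rt:8.4f} s/trx")
```

Plotting the last column against the second one gives the response time versus arrival rate curve described earlier.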


Example 2 (Capacity Planning for I/O)

As discussed in Example 1, we would not have major problems with the CPU for a 30% increase in load. Now let's see what happens with the disks.

λ = 5 Mb/ms (5 Mb transferred per millisecond; can be obtained with sar -d or iostat)
U = 0.60 (60% I/O utilization)
M = 50 (there are 50 devices)

St = U * M / λ = 0.60 * 50 / 5 = 6 ms/Mb

Substituting into the I/O formula:

Rt (io) = St / (1 - U) = 6 / (1 - 0.60) = 15 ms/Mb



 


As shown in the graph, the system could support up to 50% growth (7.5 Mb/ms); once growth reaches 60%, it destabilizes.
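The same projection for I/O can be sketched in Python with the Example 2 values. The helper names are assumptions for illustration. The sketch also computes the arrival rate at which U reaches 1, which follows from U = St * λ / M as λ = M / St.

```python
def project_io(st, lam, m, growth_pcts):
    """Return (growth %, λ, U, Rt) rows using U = St*λ/M and Rt = St/(1 - U)."""
    rows = []
    for g in growth_pcts:
        new_lam = lam * (1 + g / 100)
        u = st * new_lam / m
        rt = st / (1 - u) if u < 1 else float("inf")   # saturated at U >= 1
        rows.append((g, new_lam, u, rt))
    return rows


def saturation_rate(st, m):
    """Arrival rate at which U = St*λ/M reaches 1, i.e. λ = M / St."""
    return m / st


# Values from Example 2: St = 6 ms/Mb, λ = 5 Mb/ms, M = 50.
ST, LAM, M = 6.0, 5.0, 50
io_rows = project_io(ST, LAM, M, range(0, 70, 10))
for g, lam, u, rt in io_rows:
    print(f"{g:>3}%  λ={lam:5.2f} Mb/ms  U={u:4.2f}  Rt={rt:7.2f} ms/Mb")

lam_max = saturation_rate(ST, M)
print(f"U reaches 1 at λ={lam_max:.2f} Mb/ms "
      f"({(lam_max / LAM - 1) * 100:.1f}% growth)")
```

With these inputs the response time goes from 15 ms/Mb today to about 60 ms/Mb at 50% growth and about 150 ms/Mb at 60% growth, which is the steep part of the curve.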

In summary, we can conclude that our hardware is fine for now in both CPU and I/O, but it will not scale in the long term.

With this note my idea was to show how to do CPU and I/O capacity planning easily. With this method we get a good estimate of how our HW will scale. Depending on the case, there are other approaches, such as ratio modeling, queuing theory and linear regression.

The most important thing for accurate estimates is to obtain a representative sample, together with a good characterization of the workload to be assessed. The more information we collect, the better the forecast.

Using this methodology we can run simulations and future projections. For example: what happens if we add or remove CPUs, what is the impact of adding faster I/O devices with higher throughput and lower latency, how many users could we add to the application without compromising system stability, etc.
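Two such what-if questions can be sketched in Python with the I/O model from this note. The function names, the candidate device counts and the 30 ms/Mb target are assumptions for illustration, and the sketch assumes, as the formula U = St * λ / M does, that load spreads evenly across devices.

```python
def rt_io(st, lam, m):
    """Rt(io) = St / (1 - U) with U = St*λ/M; infinite once U reaches 1."""
    u = st * lam / m
    return st / (1 - u) if u < 1 else float("inf")


def max_arrival_rate_io(st, m, rt_target):
    """Largest λ that keeps Rt(io) <= rt_target.

    From St / (1 - U) <= target we get U <= 1 - St/target, and λ = U * M / St.
    """
    if rt_target <= st:
        raise ValueError("target response time must exceed the service time")
    u_max = 1 - st / rt_target
    return u_max * m / st


# What if we add devices? St = 6 ms/Mb, at the 50% growth rate of 7.5 Mb/ms.
for m in (50, 60, 75):                       # hypothetical device counts
    print(f"M={m:3d}  Rt={rt_io(6.0, 7.5, m):6.2f} ms/Mb")

# How much load fits under a hypothetical 30 ms/Mb target with 50 devices?
print(f"max λ = {max_arrival_rate_io(6.0, 50, 30.0):.2f} Mb/ms")
```

The same pattern works for the CPU model by replacing (1 - U) with (1 - U^M).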

For those interested in Oracle Capacity Planning topics, I recommend "Forecasting Oracle Performance". This book by Craig Shallahamer, a real guru on the subject, is excellent, and I used it as a reference to write this note.

Monday, September 20, 2010


Optimizing statistics collection on partitioned tables


Oracle 10g uses a two-pass algorithm to collect statistics on partitioned tables:

1. One pass over the entire table to update the global statistics.
2. A second pass to collect statistics on each of the partitions.

This approach has a disadvantage: if changes in a few partitions make them eligible for collection in the automatic maintenance window, then in addition to updating the statistics of those partitions, Oracle must update the global statistics of the table. For the latter it scans the entire table, including the partitions that did not change. This can be very heavy, depending on the size of the table.

Starting with Oracle 11g a one-pass algorithm is adopted: instead of making a pass over the whole table to update the global information, it performs an incremental update, inferring the changes from the modified partitions. Some statistics can be derived easily from the partition statistics (e.g. the number of rows), but others, such as the number of distinct values in a column, cannot. To resolve this, Oracle uses a new structure called a synopsis, kept for each column at the partition level, so that the global number of distinct values (NDV) can be derived by merging the synopses of the analyzed partitions.
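The idea can be illustrated with a toy model in Python. This is not Oracle's implementation: the real synopsis is an internal, more compact structure, whereas here an exact set of distinct values stands in for it. The partition names are borrowed from the test below, and the column values are made up.

```python
def build_synopsis(column_values):
    """Stand-in for a partition-level synopsis: the set of distinct values."""
    return frozenset(column_values)


def merge_ndv(synopses):
    """Global NDV derived by merging the per-partition synopses."""
    merged = set()
    for synopsis in synopses:
        merged |= synopsis
    return len(merged)


# Made-up column values per partition.
partitions = {
    "P0710": [1, 2, 3, 3],
    "P0810": [3, 4, 5],
    "P0910": [5, 6, 7],
}
synopses = {name: build_synopsis(rows) for name, rows in partitions.items()}
num_rows = {name: len(rows) for name, rows in partitions.items()}

# Row counts simply add up, but per-partition NDVs do not:
# values 3 and 5 appear in two partitions each.
naive_sum = sum(len(s) for s in synopses.values())
ndv_before = merge_ndv(synopses.values())

# Only P0910 changes, so only its synopsis and row count are rebuilt.
partitions["P0910"] = [5]
synopses["P0910"] = build_synopsis(partitions["P0910"])
num_rows["P0910"] = len(partitions["P0910"])

ndv_after = merge_ndv(synopses.values())
total_rows = sum(num_rows.values())
print(naive_sum, ndv_before, ndv_after, total_rows)
```

The unchanged partitions are never rescanned: their stored synopses are reused when the global NDV is recomputed.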







While this is an 11g R1 feature, Oracle 10g R2 (10.2.0.4, to be precise) added an option to simulate incremental collection through a new value, 'APPROX_GLOBAL AND PARTITION', for the GRANULARITY parameter of the GATHER_TABLE_STATS procedure. Its behavior is the same as in 11g, except for the NDV of non-partitioning columns and the number of distinct keys of global indexes.

Incremental maintenance is disabled by default and can be enabled at the table level, the schema level, or even the database level.

Below are the results of my tests using Oracle 11g R1 (11.1.0.7):

I'll use a table partitioned by date range with 3 partitions. The table is small (about 5M rows), but it serves to illustrate:


select partition_name, num_rows from user_tab_partitions where table_name = 'T';

PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P0810                             2583379
P0710                             1332466
P0910                             1084155
PMAX                                    0


I will delete 100,000 records in a partition:


delete from t partition (p0910) where rownum <= 100000;

I update the statistics using the default, i.e. without incremental collection:


begin
dbms_stats.gather_table_stats (ownname => user, tabname => 'T');
end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.71




It took almost 12 seconds.

select dbms_stats.get_prefs ('INCREMENTAL', tabname => 'T') from dual;
FALSE

The above query verifies that the conventional (non-incremental) collection was used.

Now I will turn on incremental collection on table T, delete rows again, and collect the statistics again:
begin
dbms_stats.set_table_prefs (ownname => user, tabname => 'T',
pname => 'INCREMENTAL', pvalue => 'TRUE');
end;
/


I verify that incremental mode is actually enabled on table T:


select dbms_stats.get_prefs ('INCREMENTAL', tabname => 'T') from dual;
TRUE


begin
dbms_stats.gather_table_stats (ownname => user, tabname => 'T');
end;
/

Elapsed: 00:00:04.71


Now it took 4s. Instead of scanning the whole table, it looked only at the changed partition and then derived the global statistics from the changes, using the synopses of the partitions.

Keep in mind that global histograms are not preserved after running an incremental collection (see Bug 8686932 in Metalink).

Although the method Oracle uses to make collection more efficient was studied in academia and in research labs some time ago, Oracle is the first relational database engine to implement it.