Friday, November 22, 2013

AWR Wait Events: Free Buffer Waits vs. Buffer Busy Waits

To diagnose Oracle performance problems, you can use Automatic Workload Repository (AWR)[1] to collect, process, and maintain performance statistics for problem detection. This data is both in memory and stored in the database. The gathered data can be displayed in both reports and views.

In this article, we will discuss the following topics:
  • Top 5 Timed Foreground Events[2]
  • Free buffer Waits vs. Buffer Busy Waits[3]

Top 5 Timed Foreground Events


The Top 5 Timed Foreground Events section is where you can usually spot the problem, by showing you why the sessions are “waiting.” Make sure to analyze the total waits and average waits (ms) separately, in order to determine if the waits are significant. 

For a given workload, this list of events should be relatively stable; you should investigate any significant variation. You will generally see DB CPU at the top of the list. However, CPU usage is not necessarily an indication of a healthy system, as the application may be CPU-bound. The most common wait events are:
  • db file sequential read
  • db file scattered read
  • db file parallel read
  • log file sync

In a nicely performing database, you should see CPU and I/O as the top wait events. If any wait events from the Concurrency wait class or Configuration wait class,[7] investigate those waits further.  For example, in one of the AWR reports, we have found:

EventWaitsTime(s)Avg wait (ms)% DB timeWait Class
free buffer waits1,336,44420,8801655.33Configuration
db file sequential read11,443,82012,942134.30User I/O
DB CPU1,5284.05
undo segment extension5,0315471091.45Configuration
buffer busy waits106847950.22Concurrency


Free Buffer Waits


The Free Buffer Waits event indicates that your session is waiting for a buffer to become empty.  For instance, the database writer is copying the current contents to disc and your session has to wait for this to complete before you can empty the buffer and put something else in it.

Possible Causes and Solutions:
  • DB_BUFFER_CACHE may need to be tuned[4]
    • If the cache hit ratio is low and your application is tuned to avoid performing full table scans, consider increasing the size of the buffer cache.
      • If it is possible to allocate the extra memory required to the buffer cache without causing the system to page, then allocate extra memory.
      • To increase the amount of memory allocated to the buffer cache, increase the value of the DB_CACHE_SIZE initialization parameter.
  • If all your SQL is tuned, free buffer waits could also indicate that:
    • Unselective SQL is causing data to flood the buffer cache with index blocks, leaving none for this particular statement that is waiting for the system to process. 
    • This normally indicates that there is a substantial amount of DML (insert/update/delete) being done and that the Database Writer (DBWR) is not writing quickly enough; the buffer cache could be full of multiple versions of the same buffer, causing great inefficiency. 
    • The solutions include, but not limited to:
      • Avoid performing full table scans
      • Accelerating incremental checkpointing, using more DBWR processes, increasing the size of buffer cache, or increasing the number of physical disks.
  • Investigate if this is an I/O problem

Buffer Busy Waits


The Buffer Busy Waits event indicates that an Oracle session needs to access a block in the buffer cache,
but cannot because the buffer copy of the data block is locked. Buffer busy waits should not be greater than 1 percent.  This buffer busy wait condition can happen for either of the following reasons:
  • The block is being read into the buffer by another session, so the waiting session must wait for the block read to complete. 
  • Another session has the buffer block locked in a mode that is incompatible with the waiting session's request.

Possible Causes and Solutions:[2,3,5]
  • Check the Buffer Wait Statistics section (see next section) to find out if the wait is on:
    • segment header
      • Increase the freelist groups or increase the pctused to pctfree gap
    • file header block
      • If you are hitting file header block contention (Reason Code 13) on the temp files, increase the number of header blocks by adding more tempfiles.[5]
    • undo header
      • Add rollback segments
    • undo block
      • Reduce the data density on the table driving this consistent read or increase the DB_CACHE_SIZE. 
    • data block
      • Move data to another block to avoid this hot block, increase the freelists on the table, or use Locally Managed Tablespaces (LMTs). 
      • Use a smaller block size: fewer records fall within a single block in this case, so it's not as "hot." 
    • index block
      • Rebuild the index, partition the index, or use a reverse key index. 
  • When a DML (insert/update/ delete) occurs, Oracle Database writes information into the block, including all users who are "interested" in the state of the block (Interested Transaction List, ITL). 
    • To decrease waits in this area, you can:
      • Increase the initrans, which will create the space in the block to allow multiple ITL slots. 
      • Increase the pctfree on the table where this block exists (this writes the ITL information up to the number specified by maxtrans, when there are not enough slots built with the initrans that is specified).
  • Other tunings
    • Database writer (DBWR) contention tuning
    • Implementing Automatic Segment Storage Management (ASSM, a.k.a bitmap freelists)

Buffer Wait Statistics

Here is the Buffer Wait Statistics section from our AWR snapshot:
  • ordered by wait time desc, waits desc
ClassWaitsTotal Wait Time (s)Avg Time (ms)
undo header827752909
file header block4540896
data block1107

SQL ordered by Elapsed Time


Finally, from the SQL Ordered by Elapsed Time section, it seems to indicate that a DML (i.e., delete) operation poorly performed.  This has given us another clue to investigate our performance issue.  From the SQL Id, you should be able to find out more details of the interested SQL in your AWR snapshot.


Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
35,931 1,457 1,000 35.93 95.22 83uuf84dqpbaq   delete /*+ INDEX (a EQ$ATTR_VA...
35,925 1,479 0 95.20 6vyng4f5vktyb   DECLARE job BINARY_INTEGER := ...

References

  1. Overview of the Automatic Workload Repository
  2. AWR Top 5 Timed Events - Top 10
  3. What is the difference between buffer busy waits and free buffer waits
  4. Tuning the Database Buffer Cache
  5. Resolving file header contention on tempfiles
  6. Oracle® Database Performance Tuning Guide 10g Release 2 (10.2)
  7. Classes of Wait Events
    • Configuration: Waits caused by inadequate configuration of database or instance resources (for example, undersized log file sizes, shared pool size)
    • Concurrency: Waits for internal database resources (for example, latches)
  8. Oracle® Database Reference 10g Release 2 (10.2)
  9. More on Interested Transaction Lists
  10. Oracle automatic segment space management:  ASSM internal structures

No comments: