oracle AWR PERF TREND HISTORY

2026-01-15 Oracle Oracle AWR

最近常常接觸 AWR 報表分析,想說AWR已經有儲存資料庫運作歷史資訊,為何不依據裡頭的紀錄資訊進行效能分析呢?因此開始著手研究如何將AWR的資料呈現出趨勢資料~

觀察重點

  1. 當 % Non-Parse CPU 低於 80% 時,代表資料庫忙於解析而非執行,這時增加 CPU 核心也無法根本解決問題。
  2. 觀察 Per Transaction 的指標是否惡化。如果 Logical Reads/Tx 維持穩定,系統慢通常是併發或硬體問題;如果該值上升,則是 SQL 或索引問題。
  3. 監控 Shared Pool Statistics 趨勢時,應特別留意 % Mem for SQL w/exec > 1 這個指標。如果該值變低,且 Sharable Memory 排名的 TOP SQL 佔比很高,就代表 Shared Pool 正在被少數幾句低效的 SQL 吞噬,這極易引發 ORA-04031 錯誤。

趨勢資料分析

指標分析項目

  • summary

    • Elapsed
    • DBTime
  • 負載強度 (Load Profile)

    • Load Profile - DB Time(s)
    • Load Profile - DB CPU(s)
    • Load Profile - Redo size (bytes)
    • Load Profile - Logical read (blocks)
    • Load Profile - Block changes
    • Load Profile - Physical read (blocks)
    • Load Profile - Physical write (blocks)
    • Load Profile - Read IO requests
    • Load Profile - Write IO requests
    • Load Profile - Read IO (MB)
    • Load Profile - Write IO (MB)
    • Load Profile - Logons
    • Load Profile - User logons
    • Load Profile - Executes
    • Load Profile - Rollbacks
  • 執行效率 (Efficiency)

    • Efficiency - Buffer Nowait %
    • Efficiency - Buffer Hit %
    • Efficiency - Library Hit %
    • Efficiency - Execute to Parse %
    • Efficiency - Parse CPU to Parse Elapsd %
    • Efficiency - Flash Cache Hit %
    • Efficiency - Redo NoWait %
    • Efficiency - In-memory Sort %
    • Efficiency - Soft Parse %
    • Efficiency - Latch Hit %
    • Efficiency - % Non-Parse CPU
  • 記憶體配置 (Memory & Cache)

    • Memory Statistics - Host Mem (MB)
    • Memory Statistics - SGA use (MB)
    • Memory Statistics - PGA use (MB)
    • Cache Sizes - Buffer Cache
    • Cache Sizes - Shared Pool Size
    • Cache Sizes - In-Memory Area
    • Cache Sizes - Std Block Size
    • Cache Sizes - Log Buffer
  • 解析質量 (Shared Pool Stats)

    • Shared Pool Statistics - Memory Usage %
    • Shared Pool Statistics - % SQL with executions>1
    • Shared Pool Statistics - % Memory for SQL w/exec>1
  • 瓶頸拆解 (Wait Class % DB Time)

    • % DB time - User I/O
    • % DB time - System I/O
    • % DB time - Network
    • % DB time - Other
    • % DB time - Commit
    • % DB time - Concurrency

DBTime

WITH snap_range AS (
    -- 1. 定義時間範圍與基礎快照資訊
    SELECT 
        snap_id, dbid, instance_number, begin_interval_time, end_interval_time,
        -- 計算快照區間的總秒數,方便後續計算
        (CAST(end_interval_time AS DATE) - CAST(begin_interval_time AS DATE)) * 86400 as interval_seconds
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_time_model AS (
    -- 2. 取得 DB Time 的原始累計值
    SELECT snap_id, dbid, instance_number, value
    FROM dba_hist_sys_time_model
    WHERE stat_name = 'DB time'
),
LICENSE AS (
  select inst_id,CPU_CORE_COUNT_CURRENT from gV$LICENSE
)
SELECT * from(
SELECT 
    DB.DBID,
    sr.snap_id,
    sr.instance_number,
    sr.begin_interval_time,
    sr.end_interval_time,
    -- 3. 計算兩次快照間的增量 (Delta),轉換為秒數
    ROUND((rt.value - LAG(rt.value) OVER (ORDER BY sr.snap_id)) / 1000000, 2) AS db_time_seconds,
    -- 4. 使用 CTE 預算的 interval_seconds 轉換成分鐘
    ROUND(sr.interval_seconds / 60, 2) AS interval_minutes,
    -- 5. 額外提供 AAS (Average Active Sessions) 指標,這是衡量負載最直觀的方式
    ROUND(((rt.value - LAG(rt.value) OVER (ORDER BY sr.snap_id)) / 1000000) / NULLIF(sr.interval_seconds, 0), 2) AS aas
FROM snap_range sr
JOIN DB on 1=1
JOIN raw_time_model rt ON sr.snap_id = rt.snap_id AND sr.dbid = rt.dbid AND sr.instance_number = rt.instance_number
ORDER BY sr.snap_id
) T
WHERE t.db_time_seconds >=0

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |DB_TIME_SECONDS|INTERVAL_MINUTES|AAS |
----------+-------+---------------+-----------------------+-----------------------+---------------+----------------+----+
000000000|   1583|              1|2026-01-22 21:40:52.537|2026-01-22 21:50:52.851|              0|              10|   0|
000000000|   1584|              1|2026-01-22 21:50:52.851|2026-01-22 22:00:53.187|              0|           10.02|   0|
000000000|   1585|              1|2026-01-22 22:00:53.187|2026-01-22 22:10:53.517|              0|              10|   0|

Load Profile 趨勢分析 - 增量(Delta)

這段 SQL 會精準計算出每個 Snapshot 區間內的增量(Delta)

WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time,
           (CAST(end_interval_time AS DATE) - CAST(begin_interval_time AS DATE)) * 86400 as interval_seconds
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_stats AS (
    -- 取得 System Statistics (SYSSTAT) 的累計值
    SELECT snap_id,
           MAX(DECODE(stat_name, 'redo size', value)) as redo_size,
           MAX(DECODE(stat_name, 'session logical reads', value)) as logical_reads,
           MAX(DECODE(stat_name, 'db block changes', value)) as block_changes,
           MAX(DECODE(stat_name, 'physical reads', value)) as phys_reads,
           MAX(DECODE(stat_name, 'physical writes', value)) as phys_writes,
           MAX(DECODE(stat_name, 'physical read IO requests', value)) as read_io_req,
           MAX(DECODE(stat_name, 'physical write IO requests', value)) as write_io_req,
           MAX(DECODE(stat_name, 'physical read total bytes', value)) as read_io_bytes,
           MAX(DECODE(stat_name, 'physical write total bytes', value)) as write_io_bytes,
           MAX(DECODE(stat_name, 'logons cumulative', value)) as logons,
           MAX(DECODE(stat_name, 'user logons cumulative', value)) as user_logons,
           MAX(DECODE(stat_name, 'execute count', value)) as executes,
           MAX(DECODE(stat_name, 'user rollbacks', value)) as rollbacks
    FROM dba_hist_sysstat
    WHERE stat_name IN ('redo size', 'session logical reads', 'db block changes', 'physical reads', 
                        'physical writes', 'physical read IO requests', 'physical write IO requests',
                        'physical read total bytes', 'physical write total bytes', 
                        'logons cumulative', 'user logons cumulative', 'execute count', 'user rollbacks')
    GROUP BY snap_id
),
raw_time_model AS (
    SELECT snap_id,
           MAX(DECODE(stat_name, 'DB time', value)) as db_time,
           MAX(DECODE(stat_name, 'DB CPU', value)) as db_cpu
    FROM dba_hist_sys_time_model
    WHERE stat_name IN ('DB time', 'DB CPU')
    GROUP BY snap_id
)
SELECT * FROM (
SELECT 
    DB.DBID,
    s.snap_id,
    s.instance_number,
    s.begin_interval_time,
    s.end_interval_time,
    ROUND((tm.db_time - LAG(tm.db_time) OVER (ORDER BY s.snap_id)) / 1000000, 2) as "DB Time (s)",
    ROUND((tm.db_cpu - LAG(tm.db_cpu) OVER (ORDER BY s.snap_id)) / 1000000, 2) as "DB CPU (s)",
    (rs.redo_size - LAG(rs.redo_size) OVER (ORDER BY s.snap_id)) as "Redo Size (bytes)",
    (rs.logical_reads - LAG(rs.logical_reads) OVER (ORDER BY s.snap_id)) as "Logical Reads (blocks)",
    (rs.block_changes - LAG(rs.block_changes) OVER (ORDER BY s.snap_id)) as "Block Changes",
    (rs.phys_reads - LAG(rs.phys_reads) OVER (ORDER BY s.snap_id)) as "Phys Reads (blocks)",
    (rs.phys_writes - LAG(rs.phys_writes) OVER (ORDER BY s.snap_id)) as "Phys Writes (blocks)",
    (rs.read_io_req - LAG(rs.read_io_req) OVER (ORDER BY s.snap_id)) as "Read IO Req",
    (rs.write_io_req - LAG(rs.write_io_req) OVER (ORDER BY s.snap_id)) as "Write IO Req",
    ROUND((rs.read_io_bytes - LAG(rs.read_io_bytes) OVER (ORDER BY s.snap_id)) / 1024 / 1024, 2) as "Read IO (MB)",
    ROUND((rs.write_io_bytes - LAG(rs.write_io_bytes) OVER (ORDER BY s.snap_id)) / 1024 / 1024, 2) as "Write IO (MB)",
    (rs.logons - LAG(rs.logons) OVER (ORDER BY s.snap_id)) as "Logons",
    (rs.user_logons - LAG(rs.user_logons) OVER (ORDER BY s.snap_id)) as "User Logons",
    (rs.executes - LAG(rs.executes) OVER (ORDER BY s.snap_id)) as "Executes",
    (rs.rollbacks - LAG(rs.rollbacks) OVER (ORDER BY s.snap_id)) as "Rollbacks"
FROM snap_range s
JOIN DB on 1=1
JOIN raw_stats rs ON s.snap_id = rs.snap_id
JOIN raw_time_model tm ON s.snap_id = tm.snap_id
ORDER BY s.snap_id
) T
WHERE t."DB Time (s)" >=0

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |DB Time (s)|DB CPU (s)|Redo Size (bytes)|Logical Reads (blocks)|Block Changes|Phys Reads (blocks)|Phys Writes (blocks)|Read IO Req|Write IO Req|Read IO (MB)|Write IO (MB)|Logons|User Logons|Executes|Rollbacks|
----------+-------+---------------+-----------------------+-----------------------+-----------+----------+-----------------+----------------------+-------------+-------------------+--------------------+-----------+------------+------------+-------------+------+-----------+--------+---------+
000000000|   1583|              1|2026-01-22 21:40:52.537|2026-01-22 21:50:52.851|          0|         0|          2761472|                 15282|         8634|                  0|                 671|          0|         392|       21.24|        15.44|     4|          0|     799|        0|
000000000|   1584|              1|2026-01-22 21:50:52.851|2026-01-22 22:00:53.187|          0|         0|          2769036|                 15753|         8750|                  0|                 675|          0|         379|       30.97|        22.36|     6|          0|     841|        0|
000000000|   1585|              1|2026-01-22 22:00:53.187|2026-01-22 22:10:53.517|          0|         0|          2773864|                 19615|         8748|                  0|                 766|          0|         416|       21.48|        16.27|     3|          0|     718|        0|

指標分析說明

| DB Time vs DB CPU | 如果 DB Time 遠高於 DB CPU,代表資料庫大部分時間在「等待」(如 I/O, Lock),而非在「運算」。 | | Redo Size | 反映資料庫的 DML 密集度。如果數值激增,可能是有大型 Batch Job 或索引維護。 | | Logical vs Physical Reads | 物理讀取(Physical Reads)過高代表快取命中率(Buffer Cache Hit Ratio)不足,通常與全表掃描(Full Table Scan)有關。 | | Executes & Logons | Logons 很高但 User Logons 不高,可能是中間層連線池(Connection Pool)配置不當,導致頻繁重連。 | | Rollbacks | 正常的系統 Rollback 應該極低。若數值偏高,需檢查應用程式邏輯是否有大量異常導致的事務回滾。 |

Load Profile 趨勢分析 - Per Second (每秒平均) 或 Per Transaction (每個事務平均)

核心計算公式

  1. Per Second: Value / Interval Duration (seconds)
  2. Transactions (事務數): 在 Oracle AWR 中,事務數定義為 User Commits + User Rollbacks
  3. Per Transaction: Value / (Delta User Commits + Delta User Rollbacks)
WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time,
           -- 計算快照區間的總秒數
           (CAST(end_interval_time AS DATE) - CAST(begin_interval_time AS DATE)) * 86400 as interval_seconds
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_stats AS (
    SELECT snap_id,
           MAX(DECODE(stat_name, 'redo size', value)) as redo_size,
           MAX(DECODE(stat_name, 'session logical reads', value)) as logical_reads,
           MAX(DECODE(stat_name, 'physical reads', value)) as phys_reads,
           MAX(DECODE(stat_name, 'physical read total bytes', value)) as read_io_bytes,
           MAX(DECODE(stat_name, 'execute count', value)) as executes,
           MAX(DECODE(stat_name, 'user commits', value)) as commits,
           MAX(DECODE(stat_name, 'user rollbacks', value)) as rollbacks
    FROM dba_hist_sysstat
    WHERE stat_name IN ('redo size', 'session logical reads', 'physical reads', 
                        'physical read total bytes', 'execute count', 'user commits', 'user rollbacks')
    GROUP BY snap_id
),
deltas AS (
    -- 先計算出每個指標的增量 (Delta)
    SELECT 
        DB.DBID,
	    s.snap_id,
	    s.instance_number,
	    s.begin_interval_time,
	    s.end_interval_time,
		s.interval_seconds,
        (rs.redo_size - LAG(rs.redo_size) OVER (ORDER BY s.snap_id)) as d_redo,
        (rs.logical_reads - LAG(rs.logical_reads) OVER (ORDER BY s.snap_id)) as d_lread,
        (rs.phys_reads - LAG(rs.phys_reads) OVER (ORDER BY s.snap_id)) as d_pread,
        (rs.read_io_bytes - LAG(rs.read_io_bytes) OVER (ORDER BY s.snap_id)) as d_read_bytes,
        (rs.executes - LAG(rs.executes) OVER (ORDER BY s.snap_id)) as d_exec,
        -- 事務數計算:Commits + Rollbacks
        ( (rs.commits - LAG(rs.commits) OVER (ORDER BY s.snap_id)) + 
          (rs.rollbacks - LAG(rs.rollbacks) OVER (ORDER BY s.snap_id)) ) as d_tx
    FROM snap_range s
    JOIN DB on 1=1
    JOIN raw_stats rs ON s.snap_id = rs.snap_id
)
-- 最終輸出:標準化指標
SELECT 
    DBID,snap_id,instance_number,begin_interval_time,end_interval_time,
    -- Per Second (每秒)
    ROUND(d_redo / NULLIF(interval_seconds, 0), 2) as "Redo/s (bytes)",
    ROUND(d_lread / NULLIF(interval_seconds, 0), 2) as "Logical Reads/s",
    ROUND(d_pread / NULLIF(interval_seconds, 0), 2) as "Phys Reads/s",
    ROUND(d_exec / NULLIF(interval_seconds, 0), 2) as "Executes/s",
    ROUND(d_tx / NULLIF(interval_seconds, 0), 2) as "Transactions/s",
    -- Per Transaction (每個事務)
    ROUND(d_redo / NULLIF(d_tx, 0), 2) as "Redo/Tx (bytes)",
    ROUND(d_lread / NULLIF(d_tx, 0), 2) as "Logical Reads/Tx",
    ROUND(d_pread / NULLIF(d_tx, 0), 2) as "Phys Reads/Tx",
    ROUND((d_read_bytes/1024) / NULLIF(d_tx, 0), 2) as "Read KB/Tx"
FROM deltas
WHERE d_tx >= 0 -- 過濾掉重啟後的負值
ORDER BY snap_id

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |Redo/s (bytes)|Logical Reads/s|Phys Reads/s|Executes/s|Transactions/s|Redo/Tx (bytes)|Logical Reads/Tx|Phys Reads/Tx|Read KB/Tx|
----------+-------+---------------+-----------------------+-----------------------+--------------+---------------+------------+----------+--------------+---------------+----------------+-------------+----------+
000000000|   1583|              1|2026-01-22 21:40:52.537|2026-01-22 21:50:52.851|       4602.45|          25.47|           0|      1.33|          0.03|      184098.13|          1018.8|            0|   1450.13|
000000000|   1584|              1|2026-01-22 21:50:52.851|2026-01-22 22:00:53.187|       4607.38|          26.21|           0|       1.4|          0.02|       184602.4|          1050.2|            0|   2113.97|
000000000|   1585|              1|2026-01-22 22:00:53.187|2026-01-22 22:10:53.517|       4623.11|          32.69|           0|       1.2|          0.03|      184924.27|         1307.67|            0|   1466.13|

指標分析說明 - 如何解讀這些標準化數據?

當您將數據轉換為這兩種維度後,分析的角度會變得完全不同:

1. Per Second (每秒平均) — 觀察「負載強度」
  • 用途: 用來判斷系統的吞吐量上限
  • 案例: 如果 Transactions/s 在某個時段維持在高點,但 Executes/s 突然飆升,這通常代表有大量的短小 SQL 在重複執行,可能是程式迴圈寫得不好。
  • 硬體關聯: Redo/s 直接關聯到 Log Writer (LGWR) 的壓力與磁碟寫入頻寬。
2. Per Transaction (每個事務) — 觀察「工作效率」
  • 用途: 用來判斷應用程式行為是否改變
  • 大師心法: 這是我最喜歡的指標。在正常的系統中,每個事務消耗的資源應該是穩定的。
  • 異常檢測:
  • 如果 Logical Reads/Tx 從 1,000 變成 50,000,即便 Transactions/s 沒變,資料庫也會變慢。這代表索引失效執行計畫改變,導致每個交易都要讀取更多資料塊。
  • 如果 Redo/Tx 突然變大,代表單個交易處理的資料量變多了(例如從更新 1 筆變成更新 100 筆)。

Efficiency Percentages 趨勢分析

WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_stats AS (
    SELECT snap_id,
           -- Buffer Hit 相關
           MAX(DECODE(stat_name, 'session logical reads', value)) as log_rd,
           MAX(DECODE(stat_name, 'physical reads', value)) as phy_rd,
           MAX(DECODE(stat_name, 'physical reads direct', value)) as phy_rd_dir,
           MAX(DECODE(stat_name, 'physical reads cache', value)) as phy_rd_ch,
           -- Parse 相關
           MAX(DECODE(stat_name, 'parse count (total)', value)) as prs_tot,
           MAX(DECODE(stat_name, 'parse count (hard)', value)) as prs_hard,
           MAX(DECODE(stat_name, 'execute count', value)) as exec_cnt,
           -- Redo / Buffer / Latch Nowait
           MAX(DECODE(stat_name, 'redo log space requests', value)) as redo_spc_req,
           MAX(DECODE(stat_name, 'buffer is not pinned count', value)) as buf_not_pin, -- 簡化計算用
           MAX(DECODE(stat_name, 'latch free', value)) as latch_wait,
           -- Sort
           MAX(DECODE(stat_name, 'sorts (memory)', value)) as sort_mem,
           MAX(DECODE(stat_name, 'sorts (disk)', value)) as sort_disk
    FROM dba_hist_sysstat
    GROUP BY snap_id
),
raw_time AS (
    SELECT snap_id,
           MAX(DECODE(stat_name, 'DB CPU', value)) as db_cpu,
           MAX(DECODE(stat_name, 'parse time elapsed', value)) as prs_elap,
           MAX(DECODE(stat_name, 'parse time cpu', value)) as prs_cpu
    FROM dba_hist_sys_time_model
    GROUP BY snap_id
),
deltas AS (
    -- 計算各項指標增量 (Delta)
    SELECT 
		DB.DBID,
	    s.snap_id,
	    s.instance_number,
	    s.begin_interval_time,
	    s.end_interval_time,
        (rs.log_rd - LAG(rs.log_rd) OVER (ORDER BY s.snap_id)) as d_log_rd,
        (rs.phy_rd - LAG(rs.phy_rd) OVER (ORDER BY s.snap_id)) as d_phy_rd,
        (rs.phy_rd_dir - LAG(rs.phy_rd_dir) OVER (ORDER BY s.snap_id)) as d_phy_rd_dir,
        (rs.prs_tot - LAG(rs.prs_tot) OVER (ORDER BY s.snap_id)) as d_prs_tot,
        (rs.prs_hard - LAG(rs.prs_hard) OVER (ORDER BY s.snap_id)) as d_prs_hard,
        (rs.exec_cnt - LAG(rs.exec_cnt) OVER (ORDER BY s.snap_id)) as d_exec,
        (rs.sort_mem - LAG(rs.sort_mem) OVER (ORDER BY s.snap_id)) as d_sort_mem,
        (rs.sort_disk - LAG(rs.sort_disk) OVER (ORDER BY s.snap_id)) as d_sort_disk,
        (rt.db_cpu - LAG(rt.db_cpu) OVER (ORDER BY s.snap_id)) as d_db_cpu,
        (rt.prs_elap - LAG(rt.prs_elap) OVER (ORDER BY s.snap_id)) as d_prs_elap,
        (rt.prs_cpu - LAG(rt.prs_cpu) OVER (ORDER BY s.snap_id)) as d_prs_cpu
    FROM snap_range s
    JOIN DB on 1=1
    JOIN raw_stats rs ON s.snap_id = rs.snap_id
    JOIN raw_time rt ON s.snap_id = rt.snap_id
)
SELECT 
    DBID,snap_id,instance_number,begin_interval_time,end_interval_time,
    -- 1. Buffer Hit %: 1 - (Physical Reads Cache / Session Logical Reads)
    ROUND(100 * (1 - ( (d_phy_rd - d_phy_rd_dir) / NULLIF(d_log_rd, 0) )), 2) as "Buffer Hit %",
    -- 2. Soft Parse %: 1 - (Hard Parse / Total Parse)
    ROUND(100 * (1 - ( d_prs_hard / NULLIF(d_prs_tot, 0) )), 2) as "Soft Parse %",
    -- 3. Execute to Parse %: 1 - (Total Parse / Execute Count)
    ROUND(100 * (1 - ( d_prs_tot / NULLIF(d_exec, 0) )), 2) as "Execute to Parse %",
    -- 4. Parse CPU to Parse Elapsd %: (Parse CPU / Parse Elapsed)
    ROUND(100 * ( d_prs_cpu / NULLIF(d_prs_elap, 0) ), 2) as "Parse CPU to Elap %",
    -- 5. In-memory Sort %: Memory Sorts / (Memory + Disk Sorts)
    ROUND(100 * ( d_sort_mem / NULLIF(d_sort_mem + d_sort_disk, 0) ), 2) as "In-memory Sort %",
    -- 6. % Non-Parse CPU: (DB CPU - Parse CPU) / DB CPU
    ROUND(100 * ( (d_db_cpu - d_prs_cpu) / NULLIF(d_db_cpu, 0) ), 2) as "% Non-Parse CPU"
FROM deltas
WHERE d_log_rd > 0
ORDER BY snap_id

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |Buffer Hit %|Soft Parse %|Execute to Parse %|Parse CPU to Elap %|In-memory Sort %|% Non-Parse CPU|
----------+-------+---------------+-----------------------+-----------------------+------------+------------+------------------+-------------------+----------------+---------------+
000000000|   1583|              1|2026-01-22 21:40:52.537|2026-01-22 21:50:52.851|         100|       99.65|             28.29|                   |             100|               |
000000000|   1584|              1|2026-01-22 21:50:52.851|2026-01-22 22:00:53.187|         100|       99.67|             27.47|                   |             100|               |
000000000|   1585|              1|2026-01-22 22:00:53.187|2026-01-22 22:10:53.517|         100|       99.59|             31.48|                   |             100|               |

指標分析說明

不要只看數值高低,要看它們的穩定性

| Buffer Hit % | 通常 > 95% | 若突然下降,檢查是否有新的大表全表掃描(Full Table Scan)。 | | Soft Parse % | 應接近 100% | 若下降,代表硬解析(Hard Parse)過多,請檢查 SQL 是否未使用 Bind Variables。 | | Execute to Parse % | 越高越好 | 數值低代表 SQL 被解析後只執行了幾次就關閉了,這會浪費大量的 CPU。這通常與應用程式的游標(Cursor)處理邏輯有關。 | | In-memory Sort % | 應接近 100% | 若下降,代表 PGA_AGGREGATE_TARGET 可能不足,導致排序溢出到磁碟(Temp Tablespace)。 | | % Non-Parse CPU | 應接近 100% | 若此值過低(例如 < 80%),說明資料庫花了太多 CPU 在「解析 SQL」而不是「執行 SQL」。 |

Memory Statistics 趨勢分析

WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_osstat AS (
    -- 取得主機實體記憶體總量 (Host Mem)
    SELECT snap_id,
           MAX(DECODE(stat_name, 'PHYSICAL_MEMORY_BYTES', value)) / 1024 / 1024 as host_mem_mb
    FROM dba_hist_osstat
    WHERE stat_name = 'PHYSICAL_MEMORY_BYTES'
    GROUP BY snap_id
),
raw_sgastat AS (
    -- 取得該快照點 SGA 的總和 (SGA Use)
    -- 注意:SGA 在 AWR 中是分組件記錄的,需要加總
    SELECT snap_id,
           SUM(BYTES) / 1024 / 1024 as sga_use_mb
    FROM dba_hist_sgastat
    GROUP BY snap_id
),
raw_pgastat AS (
    -- 取得該快照點 PGA 的總分配量 (PGA Use)
    SELECT snap_id,
           MAX(DECODE(name, 'total PGA allocated', value)) / 1024 / 1024 as pga_use_mb
    FROM dba_hist_pgastat
    WHERE name = 'total PGA allocated'
    GROUP BY snap_id
)
SELECT 
    DB.DBID,
    s.snap_id,
    s.instance_number,
    s.begin_interval_time,
    s.end_interval_time,
    -- Memory Statistics 指標
    ROUND(os.host_mem_mb, 2) as "Host Mem (MB)",
    ROUND(sga.sga_use_mb, 2) as "SGA use (MB)",
    ROUND(pga.pga_use_mb, 2) as "PGA use (MB)",
    -- 衍生計算:總資料庫記憶體佔主機比例
    ROUND((sga.sga_use_mb + pga.pga_use_mb) / NULLIF(os.host_mem_mb, 0) * 100, 2) as "DB Mem/Host %"
FROM snap_range s
JOIN DB on 1=1
LEFT JOIN raw_osstat os ON s.snap_id = os.snap_id
LEFT JOIN raw_sgastat sga ON s.snap_id = sga.snap_id
LEFT JOIN raw_pgastat pga ON s.snap_id = pga.snap_id
ORDER BY s.snap_id

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |Host Mem (MB)|SGA use (MB)|PGA use (MB)|DB Mem/Host %|
----------+-------+---------------+-----------------------+-----------------------+-------------+------------+------------+-------------+
000000000|   1582|              1|2026-01-22 21:30:52.250|2026-01-22 21:40:52.537|      7935.07|     2245.29|      382.34|        33.11|
000000000|   1583|              1|2026-01-22 21:40:52.537|2026-01-22 21:50:52.851|      7935.07|     2245.31|       377.9|        33.06|
000000000|   1584|              1|2026-01-22 21:50:52.851|2026-01-22 22:00:53.187|      7935.07|     2245.21|      382.81|        33.12|

指標分析說明 - 記憶體指標的觀察重點

這些數據在容量規劃與穩定性分析中至關重要:

  1. Host Mem (MB):
  • 這是您作業系統偵測到的總實體記憶體。
  • 觀察重點:此值通常是固定的。如果您的趨勢圖中這個值變小了,可能代表主機發生了硬體故障或是虛擬機動態調整了記憶體(Hot-unplug)。
  1. SGA use (MB):
  • 包含 Buffer Cache, Shared Pool, Large Pool 等。
  • 觀察重點:如果您啟用了 AMM (Automatic Memory Management)ASMM (Automatic Shared Memory Management),您會看到 SGA 各組件之間的大小變動。若 SGA 總量持續接近 sga_max_size,代表資料庫已充分利用預配記憶體。
  1. PGA use (MB):
  • 這是資料庫分配給所有 Session 進行排序、Hash Join 等運算所需的記憶體總和。
  • 觀察重點:這是最容易波動的指標。如果 PGA 趨勢出現「尖峰(Spike)」,通常代表該時段有大型的 Batch Job(如大量的 Parallel Query 或大型排序)在運行。如果 PGA 持續攀升而不下降,則要警惕是否存在 Memory Leak(雖然在 Oracle 中較少見,但某些 OCI 驅動或 Bug 可能導致此現象)。

Cache Sizes 趨勢分析

啟用了 ASMM (Automatic Shared Memory Management),Buffer Cache 與 Shared Pool 的大小會隨負載動態調整,觀察這部分的趨勢能幫助判斷:「資料庫是否在特定的時段因為記憶體抖動(Memory Resizing)而導致效能波動?」

WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_sga_cache AS (
    -- 從 SGASTAT 彙總各關鍵組件的大小
    SELECT 
        snap_id,
        -- Buffer Cache 通常紀錄為 'buffer_cache'
        SUM(DECODE(name, 'buffer_cache', BYTES, 0)) / 1024 / 1024 as buffer_cache_mb,
        -- Shared Pool 需要加總 pool 欄位為 'shared pool' 的所有項目
        SUM(CASE WHEN pool = 'shared pool' THEN BYTES ELSE 0 END) / 1024 / 1024 as shared_pool_mb,
        -- In-Memory Area (若未啟用則為 0)
        SUM(CASE WHEN pool = 'inmemory pool' THEN BYTES ELSE 0 END) / 1024 / 1024 as inmemory_mb,
        -- Log Buffer 是固定大小的項
        MAX(DECODE(name, 'log_buffer', BYTES, 0)) / 1024 / 1024 as log_buffer_mb
    FROM dba_hist_sgastat
    GROUP BY snap_id
),
raw_params AS (
    -- 獲取標準塊大小 (db_block_size)
    SELECT 
        snap_id,
        value as std_block_size
    FROM dba_hist_parameter
    WHERE parameter_name = 'db_block_size'
)
SELECT 
    DB.DBID,
    s.snap_id,
    s.instance_number,
    s.begin_interval_time,
    s.end_interval_time,
    -- Cache Sizes 指標
    ROUND(c.buffer_cache_mb, 2) as "Buffer Cache (MB)",
    ROUND(c.shared_pool_mb, 2) as "Shared Pool (MB)",
    ROUND(c.inmemory_mb, 2) as "In-Memory Area (MB)",
    ROUND(c.log_buffer_mb, 2) as "Log Buffer (MB)",
    p.std_block_size as "Std Block Size"
FROM snap_range s
JOIN DB on 1=1
LEFT JOIN raw_sga_cache c ON s.snap_id = c.snap_id
LEFT JOIN raw_params p ON s.snap_id = p.snap_id
ORDER BY s.snap_id

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |Buffer Cache (MB)|Shared Pool (MB)|In-Memory Area (MB)|Log Buffer (MB)|Std Block Size|
----------+-------+---------------+-----------------------+-----------------------+-----------------+----------------+-------------------+---------------+--------------+
000000000|   1582|              1|2026-01-22 21:30:52.250|2026-01-22 21:40:52.537|             1728|          357.29|                  0|           7.51|8192          |
000000000|   1583|              1|2026-01-22 21:40:52.537|2026-01-22 21:50:52.851|             1728|          357.31|                  0|           7.51|8192          |
000000000|   1584|              1|2026-01-22 21:50:52.851|2026-01-22 22:00:53.187|             1728|          357.21|                  0|           7.51|8192          |

指標分析說明 - 快取大小背後的調校重點

當您觀察這些數據的趨勢時,應特別留意以下現象:

  1. Buffer Cache 與 Shared Pool 的「拉鋸戰」:
  • 如果您發現一個增加時,另一個就減少,這代表 ASMM (Automatic Shared Memory Management) 正在運作。
  • 診斷心法:如果這種切換頻率過高(例如每小時都在變動),可能會觸發 library cache load lock 等等待事件,甚至導致大量的 SQL Hard Parse(因為 Shared Pool 被縮小導致快取失效)。
  1. Log Buffer (MB):
  • Log Buffer 通常在啟動後就是固定的。
  • 診斷心法:如果 log_buffer 在 AWR 顯示過小(通常現代系統建議在 32MB~128MB 以上),且您在負載期間看到大量的 log buffer space 等待,就需要考慮手動加大。
  1. In-Memory Area:
  • 如果您的資料庫使用了 In-Memory Column Store,此處會顯示分配的大小。
  • 觀察重點:確保此數值維持穩定。若發現 IM 列存空間不足,Oracle 會自動將資料回退到 Buffer Cache 讀取,這會讓查詢速度大幅下降。
  1. Std Block Size:
  • 這是一個參考值(通常是 8192)。雖然它在單一實例中不會變動,但在計算 Load Profile(如 Physical Reads 的 Blocks 轉成 Bytes)時,它是不可或缺的基數。

Shared Pool Statistics 趨勢分析

在 AWR 報告中,Shared Pool Statistics 區塊是診斷 「硬解析 (Hard Parse) 壓力」「SQL 重用率」 的關鍵。

這些指標能幫助識别系統是否存在「過多的一次性 SQL (One-time SQL)」,這類 SQL 會迅速消耗 Shared Pool 空間,導致正常的 SQL 被擠出 (Age out),進而引發 Library Cache 競爭。

WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_shared_pool AS (
    -- 1. 計算 Shared Pool 使用率 (總量 vs 空閒)
    SELECT 
        snap_id,
        SUM(bytes) as total_sp_bytes,
        SUM(CASE WHEN name = 'free memory' THEN bytes ELSE 0 END) as free_sp_bytes
    FROM dba_hist_sgastat
    WHERE pool = 'shared pool'
    GROUP BY snap_id
),
raw_sql_reuse AS (
    -- 2. 計算 SQL 重用性 (基於 AWR 捕獲到的 SQL)
    -- 注意:AWR 僅記錄 Top SQL,但其比例趨勢仍具備高度參考價值
    SELECT 
        snap_id,
        COUNT(sql_id) as total_sql_count,
        COUNT(CASE WHEN executions_total > 1 THEN sql_id END) as reused_sql_count,
        SUM(sharable_mem) as total_sql_mem_bytes,
        SUM(CASE WHEN executions_total > 1 THEN sharable_mem ELSE 0 END) as reused_sql_mem_bytes
    FROM dba_hist_sqlstat
    GROUP BY snap_id
)
SELECT 
    DB.DBID,
    s.snap_id,
    s.instance_number,
    s.begin_interval_time,
    s.end_interval_time,
    -- Shared Pool Statistics 指標
    -- Memory Usage %: (Total - Free) / Total
    ROUND(100 * (sp.total_sp_bytes - sp.free_sp_bytes) / NULLIF(sp.total_sp_bytes, 0), 2) as "SP Memory Usage %",
    -- % SQL with executions > 1: (重用 SQL 數 / 總 SQL 數)
    ROUND(100 * sr.reused_sql_count / NULLIF(sr.total_sql_count, 0), 2) as "% SQL w/exec>1",
    -- % Memory for SQL w/exec > 1: (重用 SQL 佔用的記憶體 / 總 SQL 記憶體)
    ROUND(100 * sr.reused_sql_mem_bytes / NULLIF(sr.total_sql_mem_bytes, 0), 2) as "% Mem for SQL w/exec>1",
    ROUND(sr.total_sql_mem_bytes / 1024 / 1024, 2) as "Total_SQL_Mem_MB",
    ROUND(sr.avg_sql_mem_bytes / 1024 / 1024, 2) as "avg_SQL_Mem_MB"
FROM snap_range s
JOIN DB on 1=1
LEFT JOIN raw_shared_pool sp ON s.snap_id = sp.snap_id
LEFT JOIN raw_sql_reuse sr ON s.snap_id = sr.snap_id
ORDER BY s.snap_id

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |SP Memory Usage %|% SQL w/exec>1|% Mem for SQL w/exec>1|Total_SQL_Mem_MB|avg_SQL_Mem_MB|
----------+-------+---------------+-----------------------+-----------------------+-----------------+--------------+----------------------+----------------+--------------+
1979277899|   1590|              1|2026-01-22 22:50:54.714|2026-01-22 23:00:55.037|            63.48|           100|                   100|            3.18|          0.05|
1979277899|   1591|              1|2026-01-22 23:00:55.037|2026-01-22 23:10:55.378|            63.45|           100|                   100|            3.75|          0.06|
1979277899|   1592|              1|2026-01-22 23:10:55.378|2026-01-22 23:20:55.697|            63.47|         98.41|                 99.21|            3.33|          0.05|
1979277899|   1593|              1|2026-01-22 23:20:55.697|2026-01-22 23:30:55.994|            63.48|         98.44|                 99.27|            3.59|          0.06|
1979277899|   1594|              1|2026-01-22 23:30:55.994|2026-01-22 23:40:56.287|            63.46|           100|                   100|            3.03|          0.05|
1979277899|   1595|              1|2026-01-22 23:40:56.287|2026-01-22 23:50:56.589|            63.48|           100|                   100|            3.24|          0.05|

指標分析說明 - 如何根據這些數據進行調優?

當這三個指標出現異常時,通常代表著不同的效能危機:

1. % SQL with executions > 1 (SQL 重用率)
  • 理想狀態:應保持在 80% - 95% 以上。
  • 警訊:如果這個值很低(例如 < 50%),代表系統中充斥著大量未綁定變數 (Literal SQL)
  • 影響:這會導致極高的 CPU 消耗在硬解析上,並可能引發 latch: shared poollibrary cache pin 等待。
2. % Memory for SQL w/exec > 1 (重用 SQL 的記憶體佔比)
  • 理想狀態:此值應盡可能接近 100%
  • 診斷心法:如果此值顯著低於 90%,代表 Shared Pool 的大部分記憶體都被那些「只執行一次就再也不用」的 SQL 給佔據了。這就是所謂的 Shared Pool 碎片化
  • 對策:考慮將初始化參數 CURSOR_SHARING 暫時改為 FORCE (需謹慎評估執行計畫風險),或者推動開發團隊修改程式碼使用綁定變數。
3. Memory Usage % (Shared Pool 使用率)
  • 觀察重點:在生產環境中,這個值通常會穩定在 90% - 95%
  • 關鍵細節:如果使用率一直維持在 99% 以上且 Free Memory 極低,加上重用率低,這就是觸發 ORA-04031 錯誤的前兆。
Total_SQL_Mem_MB, avg_SQL_Mem_MB
  1. 階梯式上升 (The Staircase):
  • 現象:記憶體佔用在某個時間點上升後就不再下降,呈現階梯狀。
  • 含義:這代表有新的 SQL 進入了 Shared Pool 但沒有被釋放(可能是因為 Version Count 攀升或是不斷有新的 Literal SQL 產生)。
  1. 與「SQL 重用率」的反向關聯:
  • 聯動分析:請將這條線與你之前的 % SQL with executions > 1 趨勢圖放在一起看。
  • 大師心法:如果「總記憶體消耗」上升,而「重用率」下降,這 100% 是因為應用程式正在大量使用未綁定變數的 SQL,正在污染 Shared Pool。
  1. 峰值 (Max) 與平均值 (Avg) 的間距:
  • 含義:如果 Max 遠大於 Avg,說明該小時內有「瞬時巨獸」出現(例如一次性的大型 IN 清單查詢),這會導致 Shared Pool 瞬間碎片化。

% DB Time by Wait Class 趨勢分析

DB Time 拆解為各個 Wait Class (等待類別) 的佔比,是判斷資料庫「瓶頸性質」的最快路徑。

WITH snap_range AS (
    SELECT snap_id, dbid, instance_number, begin_interval_time, end_interval_time
    FROM dba_hist_snapshot
    WHERE instance_number = 1
    AND end_interval_time >= SYSDATE - 1
    AND begin_interval_time <= SYSDATE
),
DB AS (SELECT DBID FROM V$DATABASE),
raw_db_time AS (
    -- 取得總 DB Time 的累計值
    SELECT snap_id,
           MAX(value) as db_time_val
    FROM dba_hist_sys_time_model
    WHERE stat_name = 'DB time'
    GROUP BY snap_id
),
raw_wait_class AS (
    -- 取得各 Wait Class 的總等待時間累計值 (單位:微秒)
    -- 過濾掉 Idle 等待,因為它不計入 DB Time
    SELECT snap_id,
           SUM(DECODE(wait_class, 'User I/O', time_waited_micro, 0)) as user_io_time,
           SUM(DECODE(wait_class, 'System I/O', time_waited_micro, 0)) as system_io_time,
           SUM(DECODE(wait_class, 'Network', time_waited_micro, 0)) as network_time,
           SUM(DECODE(wait_class, 'Commit', time_waited_micro, 0)) as commit_time,
           SUM(DECODE(wait_class, 'Concurrency', time_waited_micro, 0)) as concurrency_time,
           SUM(DECODE(wait_class, 'Other', time_waited_micro, 0)) as other_time
    FROM dba_hist_system_event
    WHERE wait_class <> 'Idle'
    GROUP BY snap_id
),
deltas AS (
    -- 計算增量 (Delta)
    SELECT 
	    DB.DBID,
	    s.snap_id,
	    s.instance_number,
	    s.begin_interval_time,
	    s.end_interval_time,
        (t.db_time_val - LAG(t.db_time_val) OVER (ORDER BY s.snap_id)) as d_db_time,
        (wc.user_io_time - LAG(wc.user_io_time) OVER (ORDER BY s.snap_id)) as d_user_io,
        (wc.system_io_time - LAG(wc.system_io_time) OVER (ORDER BY s.snap_id)) as d_sys_io,
        (wc.network_time - LAG(wc.network_time) OVER (ORDER BY s.snap_id)) as d_net,
        (wc.commit_time - LAG(wc.commit_time) OVER (ORDER BY s.snap_id)) as d_cmt,
        (wc.concurrency_time - LAG(wc.concurrency_time) OVER (ORDER BY s.snap_id)) as d_concur,
        (wc.other_time - LAG(wc.other_time) OVER (ORDER BY s.snap_id)) as d_other
    FROM snap_range s
    JOIN DB on 1=1
    JOIN raw_db_time t ON s.snap_id = t.snap_id
    JOIN raw_wait_class wc ON s.snap_id = wc.snap_id
)
SELECT 
	DBID,snap_id,instance_number,begin_interval_time,end_interval_time,
    -- 計算百分比
    ROUND(100 * d_user_io / NULLIF(d_db_time, 0), 2) as "% User I/O",
    ROUND(100 * d_sys_io / NULLIF(d_db_time, 0), 2) as "% System I/O",
    ROUND(100 * d_net / NULLIF(d_db_time, 0), 2) as "% Network",
    ROUND(100 * d_cmt / NULLIF(d_db_time, 0), 2) as "% Commit",
    ROUND(100 * d_concur / NULLIF(d_db_time, 0), 2) as "% Concurrency",
    ROUND(100 * d_other / NULLIF(d_db_time, 0), 2) as "% Other",
    -- 補充:DB CPU 佔比 (DB Time = CPU + Wait Time)
    ROUND(100 * (d_db_time - (d_user_io + d_sys_io + d_net + d_cmt + d_concur + d_other)) / NULLIF(d_db_time, 0), 2) as "% DB CPU (Est.)"
FROM deltas
WHERE d_db_time > 0
ORDER BY snap_id

執行結果範例

DBID      |SNAP_ID|INSTANCE_NUMBER|BEGIN_INTERVAL_TIME    |END_INTERVAL_TIME      |% User I/O|% System I/O|% Network|% Commit|% Concurrency|% Other|% DB CPU (Est.)|
----------+-------+---------------+-----------------------+-----------------------+----------+------------+---------+--------+-------------+-------+---------------+
000000000|   1611|              1|2026-01-23 02:20:01.195|2026-01-23 02:30:01.515|      8.92|    162661.1|      0.2|  150.37|      3439.44|8234.61|     -174394.63|
000000000|   1651|              1|2026-01-23 09:00:13.253|2026-01-23 09:10:13.591|       2.3|      386.14|     0.01|    1.28|         0.01|  20.49|        -310.23|
000000000|   1653|              1|2026-01-23 09:20:13.886|2026-01-23 09:30:14.200|      24.4|      126.35|        0|    0.11|         0.01|   6.55|         -57.42|

指標分析說明

  1. % User I/O 高 (例如 > 40%):
  • 現象:資料庫花大量時間在等待從磁碟讀取資料。
  • 原因:通常是 SQL 缺少索引導致全表掃描(Full Table Scan),或是 Buffer Cache 太小。
  1. % Commit 高 (通常是 log file sync):
  • 現象:每個事務(Transaction)提交時太慢。
  • 原因:磁碟寫入 Redo Log 的速度跟不上,或者是應用程式採取了過於頻繁的提交(例如在迴圈內 Commit)。
  1. % Concurrency 高:
  • 現象:Session 之間在互相排隊。
  • 原因:常見於鎖競爭(TX - row lock contention)或熱點塊(Latch / Buffer Busy Waits)。這通常需要優化應用程式邏輯。
  1. % System I/O 高:
  • 原因:通常與後端進程(如 DBWR 或 LGWR)的寫入壓力有關,或是正在執行備份、資料庫檢查點(Checkpoint)。
  1. % DB CPU 高 (補充指標):
  • 現象:這是好事,說明資料庫主要在「做事」而非「等待」。
  • 例外:如果 CPU 100% 且系統很慢,則代表有 SQL 正在進行大量的邏輯讀(Logical Reads)或運算複雜度過高。