Sunday, March 10, 2024

Oracle B-Tree vs Bitmap vs Hash Index Guide

 In Oracle, bitmap indexes and hash indexes are two different types of indexes used for efficient data retrieval. Here are the key differences between them:

Bitmap Index:

- Bitmap indexes are designed to improve query performance for columns with a low cardinality, where the number of distinct values is relatively small.

- They use a bitmap representation to index data, where each bit in the bitmap corresponds to a possible value in the indexed column.

- Bitmap indexes work well for queries that involve multiple conditions combined using logical operators (AND, OR).

- They are especially useful for data warehousing and decision support systems where the queries involve complex analysis and aggregation.

- Bitmap indexes are more space-efficient compared to other index types for columns with low cardinality.

- They can be slower for data modification operations (such as inserts, updates, and deletes) because the bitmap indexes need to be updated accordingly.
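As an example, bitmap indexes on two low-cardinality columns can be combined in a single filter; a minimal sketch, with table and column names purely illustrative:

CREATE BITMAP INDEX idx_orders_status ON orders (status);

CREATE BITMAP INDEX idx_orders_region ON orders (region);

-- Oracle can AND/OR the two bitmaps together before touching the table rows:
SELECT COUNT(*) FROM orders WHERE status = 'SHIPPED' AND region = 'APAC';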

Hash Index:

- Hash indexes are designed to provide fast lookup access for equality-based queries.

- They use a hash function to compute a hash value for each indexed value and store the hash values in the index structure.

- Hash indexes are well-suited for columns with high cardinality, where the number of distinct values is large.

- They perform best for exact-match predicates (e.g., WHERE column = value).

- Hash indexes are not effective for range scans or pattern matching queries.

- They rely on a hash structure held in memory to manage the hash values and provide fast lookups.

- Hash indexes are generally faster for data modification operations since they require fewer updates compared to other index types.
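In practice, Oracle implements hash-based access through hash clusters rather than a separate CREATE INDEX statement; a minimal sketch, with names and sizing purely illustrative:

CREATE CLUSTER emp_cluster (empno NUMBER) SIZE 512 HASHKEYS 1000;

CREATE TABLE emp_hashed (
  empno NUMBER,
  ename VARCHAR2(30)
) CLUSTER emp_cluster (empno);

-- An equality lookup hashes empno straight to a data block, with no index traversal:
SELECT ename FROM emp_hashed WHERE empno = 7839;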

In summary, the main difference between bitmap indexes and hash indexes in Oracle is their intended use and performance characteristics. Bitmap indexes are suitable for columns with low cardinality and are efficient for complex queries involving logical operators. On the other hand, hash indexes are best for columns with high cardinality and excel at exact match queries. It's important to consider the nature of the data and the types of queries you'll be running when choosing the appropriate index type for your Oracle database.


A Bitmap index and a Hash index are two types of indexes used in Oracle to improve query performance.

  1. Bitmap index: A Bitmap index is a type of index that stores a bitmap for each key value, where each bit in the bitmap corresponds to a row in the table. If a bit is set to 1, it means that the corresponding row has the key value in the indexed column. Bitmap indexes are used for low-cardinality columns, where the same values appear frequently in the column. Bitmap indexes are best suited for data warehousing and business intelligence applications, where complex query conditions are common.
  2. Hash index: A Hash index is a type of index that uses a hash function to map the key values in the indexed column to a specific location in the index. Hash indexes are used for high-cardinality columns, where the values in the column are unique or nearly unique. Hash indexes are best suited for OLTP (Online Transaction Processing) systems, where the number of distinct values in the indexed column is large.
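To verify which kind of index was actually created, the data dictionary can be queried; the table name below continues the illustrative orders example above:

SELECT index_name, index_type FROM user_indexes WHERE table_name = 'ORDERS';

A bitmap index reports INDEX_TYPE = 'BITMAP', while an ordinary B-tree index reports 'NORMAL'.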

Saturday, September 4, 2021

[ORACLE-FLASHBACK] FLASHBACK TABLE TO TIMESTAMP

Flashback Technology is one of the most useful Oracle features in a production environment; it is a life saver. You can even roll a table back to a point in time without requiring a traditional database restore. Below I demonstrate how to flash back a table to a point in time, but first you have to enable flashback.

From 11gR2 onwards you don't have to shut down the database to enable Flashback; you can do it online. To enable flashback in 11gR2, do the following steps.

1. alter system set db_recovery_file_dest_size='400G' scope=both sid='*';

2. alter system set db_recovery_file_dest='+FRA' scope=both sid='*';

3. select flashback_on from v$database;

FLASHBACK_ON
------------
NO

4. alter database flashback on;
select flashback_on from v$database;

FLASHBACK_ON
------------
YES
_______________________________________________________

-- For a table to be flashed back, row movement must be enabled on it. We have a table called EMPLOYEES in schema HR
ALTER TABLE HR.EMPLOYEES ENABLE ROW MOVEMENT;

-- Check the current SCN and database time
SELECT current_scn, TO_CHAR(SYSTIMESTAMP, 'YYYY-MM-DD HH24:MI:SS') FROM v$database;
4784136875    2012-04-30 8:30:00

-- Check the current total number of records in the table
SELECT COUNT(*) FROM HR.EMPLOYEES;
COUNT(*)
126

-- Check the total number of records as they existed half an hour back, at 8:00 AM
SELECT COUNT(*) FROM HR.EMPLOYEES AS OF TIMESTAMP TO_TIMESTAMP('2012-04-30 8:00:00', 'YYYY-MM-DD HH24:MI:SS');
COUNT(*)
127
This means one record was deleted. We need to flash back the table to its state as of 8:00 AM.

FLASHBACK TABLE HR.EMPLOYEES TO TIMESTAMP TO_TIMESTAMP('2012-04-30 8:00:00', 'YYYY-MM-DD HH24:MI:SS');

SELECT COUNT(*) FROM HR.EMPLOYEES;
COUNT(*)
127

Now the count shows 127, which means the table is back to the state it was in before the record was deleted.
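If you noted the SCN rather than the wall-clock time, the same operation can be expressed with TO SCN; a sketch, with the SCN value purely illustrative:

FLASHBACK TABLE HR.EMPLOYEES TO SCN 4784136000;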

Thursday, September 2, 2021

[ORACLE GOLDENGATE] Fixed - Oracle GGS Extract PROCESS ABENDING

1. Check the extract's ABENDED status and last log checkpoint:

GGSCI (SOURCE) 85> info extract FEXTR1,detail


EXTRACT    FEXTR1    Last Started 2021-09-02 20:54   Status ABENDED

Checkpoint Lag       00:00:00 (updated 00:31:37 ago)

Log Read Checkpoint  Oracle Redo Logs

                     2021-08-30 15:11:05  Seqno 678, RBA 571035136

                     SCN 2183.433217452 (9376346824620)

2. Stop the extract/pump on the source and the replicat on the target:

Source:

stop extract FEXTR1

stop extract FPXTR1

info all

Target:

stop REPLICAT FRP1

info all

3. Fix GGS extract/pump on DB Source

Source:

-- Check the log to determine when the extract ABENDED,
-- or run: info extract FEXTR1, detail --> 2021-08-30 15:11:05

--Extract

ALTER EXTRACT FEXTR1 BEGIN 2021-08-30 15:11

ALTER EXTRACT FEXTR1 EXTRBA 0

ALTER EXTRACT FEXTR1 ETROLLOVER

start extract FEXTR1

INFO EXTRACT FEXTR1 SHOWCH ---> Current Checkpoint (current write position): Sequence #: 246 

--Pump

--Get EXTSEQNO from extract INFO EXTRACT FEXTR1 SHOWCH ---> Current Checkpoint (current write position): Sequence #: 246 

-- Or get EXTSEQNO from the newly generated trail file: ls -alt /u01/app/oracle/GGS19c/dirdat --> file et000000246 -> EXTSEQNO=246

ALTER EXTRACT FPXTR1 EXTSEQNO 246

ALTER EXTRACT FPXTR1 EXTRBA 0

ALTER EXTRACT FPXTR1 ETROLLOVER

start extract FPXTR1

INFO EXTRACT FPXTR1 SHOWCH --> Write Checkpoint #1 GGS Log Trail  Current Checkpoint (current write position): Sequence #: 2

4. Fix GGS replicat on DB Target:

--Get EXTSEQNO from pump on DB Source INFO EXTRACT FPXTR1 SHOWCH --> Write Checkpoint #1 GGS Log Trail  Current Checkpoint (current write position): Sequence #: 2

--Get EXTSEQNO from new file generating from # ls -alt /u01/app/ogg/dirdat | more --> rt000000002 --> EXTSEQNO 2


ALTER REPLICAT FRP1 EXTSEQNO 2

ALTER REPLICAT FRP1 EXTRBA 0

start REPLICAT FRP1

5. Check GGS status

info all

...
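Beyond info all, the replicat's state and lag can be inspected with standard GGSCI commands, for example:

info replicat FRP1, detail

lag replicat FRP1

stats replicat FRP1, latest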

Replication is successful; the problem is fixed. Thank you.

Monday, May 3, 2021

[MySQL] Replication database with GTID



Master::::::::::::::::::::::::

systemctl stop mysqld


vi /etc/my.cnf

The following parameters should be added under the [mysqld] section of my.cnf file


server-id = 1

log-bin = mysql-bin

binlog_format = row

relay-log = relay-log-server

gtid-mode=ON

enforce-gtid-consistency

log-slave-updates


systemctl start mysqld


mysql>create user 'repl_user'@'%' identified by '123456';

mysql>Grant replication slave on *.* to 'repl_user'@'%';


mysql>SET @@GLOBAL.read_only = ON;

mysql> FLUSH TABLES WITH READ LOCK;

mysql> SET GLOBAL read_only = ON;

mysql> show master status;

show global variables like 'gtid_executed';


mysqldump --all-databases --flush-privileges --single-transaction --flush-logs --triggers --routines --events --hex-blob --host=localhost --port=3306 --user=root --password=123456 > /tmp/mysqlbackup_dump.sql


SET @@GLOBAL.read_only = OFF;

SET GLOBAL read_only = OFF;

UNLOCK TABLES;

FLUSH PRIVILEGES;


slave::::::::::::::::


service mysqld stop


vi /etc/my.cnf


server-id = 2

log-bin = mysql-bin

binlog_format = row

relay-log = relay-log-server

read-only = ON

gtid-mode=ON

enforce-gtid-consistency

log-slave-updates



service mysqld start




mysql> show global variables like 'gtid_executed';
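Note: with GTIDs enabled, mysqldump writes a SET @@GLOBAL.GTID_PURGED statement into the dump, and the import will fail unless the slave's own gtid_executed set is empty. If gtid_executed is not empty at this point, clear it first:

mysql> RESET MASTER;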


mysql> source /tmp/mysqlbackup_dump.sql;

mysql> show global variables like 'gtid_executed';



CHANGE MASTER TO MASTER_HOST = '192.168.114.135',MASTER_PORT = 3306,MASTER_USER = 'repl_user',MASTER_PASSWORD = '123456',MASTER_AUTO_POSITION = 1;

SET @@GLOBAL.read_only = OFF;

SET GLOBAL read_only = OFF;

UNLOCK TABLES;

FLUSH PRIVILEGES;

start slave;

mysql> show slave status \G





SELECT RECEIVED_TRANSACTION_SET FROM performance_schema.replication_connection_status;



Switchover::::::::::::::


Master:


SET @@GLOBAL.read_only = ON;

FLUSH TABLES WITH READ LOCK;

SET GLOBAL read_only = ON;

FLUSH PRIVILEGES;


Slave:


Make sure that all slaves have processed any statements in their relay log with:


mysql> STOP SLAVE IO_THREAD;

mysql> SHOW PROCESSLIST;   -- wait until the `Has read all relay log` state appears. Then, on slave 1, promote it to become the master with:


mysql> STOP SLAVE;

mysql> RESET MASTER;


SET @@GLOBAL.read_only = OFF;

SET GLOBAL read_only = OFF;

UNLOCK TABLES;

FLUSH PRIVILEGES;
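To confirm the promotion, you can verify the new master's binary log position and executed GTID set with standard commands:

mysql> SHOW MASTER STATUS;

mysql> SELECT @@GLOBAL.gtid_executed;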



on old master (or slave 2), point to new master with:



mysql> CHANGE MASTER TO MASTER_HOST = '192.168.114.136',MASTER_PORT = 3306,MASTER_USER = 'repl_user',MASTER_PASSWORD = '123456',MASTER_AUTO_POSITION = 1;


SET @@GLOBAL.read_only = OFF;

SET GLOBAL read_only = OFF;

UNLOCK TABLES;

FLUSH PRIVILEGES;


mysql> START SLAVE;




Failover::::::::::::::



mysql> STOP SLAVE IO_THREAD;

mysql> SHOW PROCESSLIST;   -- wait until the `Has read all relay log` state appears. Then, on slave 1, promote it to become the master with:


mysql> STOP SLAVE;

mysql> RESET MASTER;


SET @@GLOBAL.read_only = OFF;

SET GLOBAL read_only = OFF;

UNLOCK TABLES;

FLUSH PRIVILEGES;


Sunday, October 11, 2020

[IBM-AIX] Easily Installing Packages with YUM on AIX


1/ Download yum_bundle from IBM:

https://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/ezinstall/ppc/

2/ Install YUM:

# ls -alt

total 106408

-rw-r--r--    1 root     system      4036762 Oct 09 00:11 gettext-0.19.7-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system      2897799 Oct 09 00:11 db-4.8.24-3.aix6.1.ppc.rpm

drwxr-xr-x    2 root     system         4096 Oct 09 00:11 .

-rw-r--r--    1 root     system        56991 Oct 09 00:11 gdbm-1.8.3-5.aix5.2.ppc.rpm

-rw-r--r--    1 root     system       533288 Oct 09 00:11 curl-7.52.1-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system        51749 Oct 09 00:11 pysqlite-1.1.7-2.aix6.1.ppc.rpm

-rw-r--r--    1 root     system      3570302 Oct 09 00:11 sqlite-3.15.2-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system      1686134 Oct 09 00:11 glib2-2.14.6-2.aix5.2.ppc.rpm

-rw-r--r--    1 root     system       489547 Oct 09 00:11 readline-6.1-2.aix6.1.ppc.rpm

-rw-r--r--    1 root     system     23333701 Oct 09 00:11 python-2.7.10-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system       158584 Oct 09 00:11 python-urlgrabber-3.10.1-1.aix6.1.noarch.rpm

-rw-r--r--    1 root     system       830446 Oct 09 00:11 python-tools-2.7.10-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system       162093 Oct 09 00:11 python-pycurl-7.19.3-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system        37912 Oct 09 00:11 python-iniparse-0.4-1.aix6.1.noarch.rpm

-rw-r--r--    1 root     system     15366474 Oct 09 00:11 python-devel-2.7.10-1.aix6.1.ppc.rpm

-rw-r--r--    1 root     system        62283 Oct 09 00:11 yum-metadata-parser-1.1.4-2.aix6.1.ppc.rpm

-rw-r--r--    1 root     system       923981 Oct 09 00:11 yum-3.4.3-7.aix6.1.noarch.rpm

-rw-r--r--    1 root     system       214726 Oct 09 00:11 ca-certificates-2016.10.7-2.aix6.1.ppc.rpm

drwxr-xr-x   20 root     system         4096 Oct 09 00:10 ..


#chfs -a size=2G /opt

# smitty tcpip -> Further Configuration -> Name Resolution

# rpm -Uvh *


Preparing...                          ################################# [100%]

Updating / installing...

   1:readline-6.1-2                   ################################# [  6%]

   2:sqlite-3.15.2-1                  ################################# [ 12%]


   3:gettext-0.19.7-1                 ################################# [ 18%]

add libintl.so.1 (32bits) shared member to /opt/freeware/lib/libintl.a

add libintl.so.1 (64bits) shared member to  /opt/freeware/lib/libintl.a


   4:glib2-2.14.6-2                   ################################# [ 24%]

   5:yum-metadata-parser-1.1.4-2      ################################# [ 29%]

   6:gdbm-1.8.3-5                     ################################# [ 35%]

   7:db-4.8.24-3                      ################################# [ 41%]

   8:python-2.7.10-1                  ################################# [ 47%]

   9:pysqlite-1.1.7-2                 ################################# [ 53%]

  10:python-iniparse-0.4-1            ################################# [ 59%]

  11:ca-certificates-2016.10.7-2      ################################# [ 65%]

  12:curl-7.52.1-1                    ################################# [ 71%]

  13:python-pycurl-7.19.3-1           ################################# [ 76%]

  14:python-urlgrabber-3.10.1-1       ################################# [ 82%]

  15:yum-3.4.3-7                      ################################# [ 88%]

  16:python-devel-2.7.10-1            ################################# [ 94%]

  17:python-tools-2.7.10-1            ################################# [100%]

# pwd

/opt/freeware/etc/yum



vi /opt/freeware/etc/yum/yum.conf



[main]

cachedir=/var/cache/yum

keepcache=1

debuglevel=2

logfile=/var/log/yum.log

exactarch=1

obsoletes=1

plugins=1


[AIX_Toolbox]

name=AIX generic repository

baseurl=http://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/RPMS/ppc/

enabled=1

gpgcheck=0





[AIX_Toolbox_noarch]

name=AIX noarch repository

baseurl=http://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/RPMS/noarch/

enabled=1

gpgcheck=0



[AIX_Toolbox_61]

name=AIX 6.1 specific repository

baseurl=http://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/RPMS/ppc-6.1/

enabled=1

gpgcheck=0





[AIX_Toolbox_72]

name=AIX 7.2 specific repository

baseurl=http://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/RPMS/ppc-7.2/

enabled=1

gpgcheck=0

3/ Open the firewall from the host to public.dhe.ibm.com

4/ Run YUM


yum install -y bash wget unzip bc screen rsync logrotate
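As a quick sanity check that the repositories are reachable and the packages were installed, the standard yum queries can be used:

# yum repolist

# yum list installed | grep -i bash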


Sunday, April 5, 2020

[ORACLE 11GR2] Flashback Query Example


Flashback Query Example

Expert Oracle Database Tips by Donald Burleson, March 25, 2015

RMAN - Oracle Flashback Query

Using Oracle Flashback Query, you can retrieve committed data as it was at a past point in time. Like most other flashback technologies, this feature retrieves its data from the undo tablespace. It is possible to get undo data based on a specific time or SCN, using the SELECT ... AS OF statement with either the TIMESTAMP or the SCN clause. It can even be performed by any application developer without any DBA intervention. Developers can use the dbms_flashback package to perform flashback query directly from their applications, provided the EXECUTE privilege has been granted on the dbms_flashback package. Now see it in action through the following scenario:
  • Scenario 1: At 12:00, Bob got a call from an application developer who told him that somebody had deleted all the rows from one of the tables by omitting the WHERE clause and had committed the transaction. He wanted to know which rows were deleted from the table. Bob decided to use flashback query to get the before image of the deleted rows.
To test a case similar to Bob's situation, perform the following steps:
  1. Create a table as it was created by the application developer and query it as follows:

SQL> create table tbl_seg_tbs as
  2  select segment_name, tablespace_name
  3  from dba_segments
  4  where rownum < 7
  5  order by bytes desc;

Table created.

SQL> select count(*) from tbl_seg_tbs;

  COUNT(*)
----------
         6

SQL>
  2. In order to be able to return to this state of the table at any time, get the current SCN and timestamp:

SQL> select to_char(sysdate,'dd-mm-yyyy hh24:mi:ss') ddate,
  2         dbms_flashback.get_system_change_number() scn
  3  from dual;

DDATE                       SCN
-------------------- ----------
07-02-2010  15:14:21     460141

This step was not performed by Bob or by the application developer. The developer knows the time when the table was in a correct state and told that time to Bob. Here you get the date and the current SCN just for testing purposes.
  3. Now run the delete command to clear the table and commit the transaction:

SQL> delete from tbl_seg_tbs;

6 rows deleted.

SQL> commit;

Commit complete.

SQL> select * from tbl_seg_tbs;

no rows selected
  4. As Bob knows the exact time of the correct state of the table, he uses flashback query to retrieve the before image of the deleted rows from the undo tablespace as follows:

SQL> select count(*) from tbl_seg_tbs
  2  as of timestamp to_timestamp('07-02-2010 15:14:21','dd-mm-yyyy hh24:mi:ss');

  COUNT(*)
----------
         6

SQL>
In case he knows the SCN, he uses the AS OF SCN clause to view the data of the table at the specified SCN as follows:

SQL> select count(*) from tbl_seg_tbs
  2  as of scn 460135;

  COUNT(*)
----------
         6

SQL>
To view the data of the table as it was 15 minutes ago, use the following query:

SQL> select count(*) from tbl_seg_tbs
  2  as of timestamp (systimestamp - interval '15' minute);

  COUNT(*)
----------
         6

SQL>
It is possible to convert an SCN to a timestamp and a timestamp to an SCN using the scn_to_timestamp and timestamp_to_scn functions:

SQL> select scn_to_timestamp(460141) ddate,
  2         timestamp_to_scn(to_timestamp('07-02-2010 15:14:21','dd-mm-yyyy hh24:mi:ss')) scn
  3  from dual;

DDATE                                    SCN
------------------------------------ -------
07-FEB-10 03.14.21.000000000 PM       460141

SQL>
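As mentioned at the start of this article, developers can also use the dbms_flashback package instead of adding AS OF to every query; a minimal sketch reusing the illustrative timestamp from this scenario:

SQL> exec dbms_flashback.enable_at_time(to_timestamp('07-02-2010 15:14:21','dd-mm-yyyy hh24:mi:ss'));

SQL> select count(*) from tbl_seg_tbs;   -- every query in this session now sees data as of that time

SQL> exec dbms_flashback.disable;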
Viewing/Retrieving Dropped PL/SQL Object Code Using Flashback Query
You can retrieve the code of any dropped PL/SQL object using flashback query. How? It is very easy. The source of these objects is stored in an internal table, sys.source$: when you create a trigger, the lines containing its source are added to that table, and dropping the object removes them. Please note that recreating the object relies on its dependencies being valid; thus, any tables/views referenced in the object must be present or recreated before recreating it. Here is how it works:
  1. Create a simple trigger and get the current SCN value of the database:

SQL> create or replace trigger trg_logon
  2  after logon on database
  3  begin
  4    insert into tbl_logons values (sysdate, user);
  5  end;
  6  /

Trigger created.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
     528857

SQL>
  2. Now drop the trigger:

SQL> drop trigger trg_logon;

Trigger dropped.

SQL>
  3. Using the SCN value noted before dropping the trigger, query the dba_source view as follows (note that the dictionary stores the object name in uppercase):

SQL> select text from dba_source
  2  as of scn 528857
  3  where name = 'TRG_LOGON';

TEXT
---------------------------------------------------
trigger trg_logon
after logon on database
begin
insert into tbl_logons values(sysdate, user);
end;

SQL>
Using the text returned above, you can create or replace the trigger again.

Thursday, March 12, 2020

RAC local parallel query internals


RAC tuning tips
January 10,  2015

One of the greatest benefits of running parallel SQL statements in an Oracle RAC environment is the ability to distribute the parallel slaves over multiple nodes. However, doing so incurs extra overhead compared to processing all of the parallel slaves on the local node. Sometimes a SQL statement will execute more quickly with multiple nodes at work, and sometimes it will execute more quickly on a single node, avoiding that extra overhead. Remember, each SQL statement is different, as is the system configuration it runs on, so parallel SQL statements should be tested to determine whether single-node or multi-node execution works best for the specific environment. You should also consider the impact of multiple parallel processes all running on the same node, where they could affect other processes competing for the same resources.

The parallel_force_local initialization parameter controls whether parallel SQL in Oracle RAC environments can be distributed over nodes other than the local, originating host. The default value for this parameter is FALSE, and it is a best practice to leave it at its default. If this parameter is set to TRUE, all parallel operations will run only on the local node. If the parameter does need to be set, it is best to do so at the session level with the following command.

alter session set parallel_force_local=true;

To illustrate if forcing the parallel SQL to a specific node is beneficial or not, consider the following query.

select
   /*+ parallel */
   o.order_id,
   max(d.order_date) as max_date,
   count(*) as num_details
from
   orders o
   join
   order_details d
      on o.order_id = d.order_id
group by
   o.order_id;

The degree of parallelism (DOP) for this statement is 8. The SQL statement execution was traced both when running with parallel SQL across the cluster and when forcing the parallel slaves to all be local. The SQL statement completed in 35 seconds when executed across the cluster and in 20 seconds when forced local. The wait events for the cross-cluster execution are shown below, as captured by tkprof from a session trace file.

Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
reliable message                                2        0.01          0.02
enq: KO - fast object checkpoint                2        0.00          0.00
KJC: Wait for msg sends to complete             4        0.00          0.00
PX Deq: reap credit                           620        0.01          0.02
PX Deq: Join ACK                               12        0.00          0.01
IPC send completion sync                        8        0.00          0.00
PX Deq: Parse Reply                             8       12.35         12.35
resmgr:cpu quantum                              2        0.01          0.01
SQL*Net message to client                    1315        0.00          0.00
PX Deq: Execute Reply                         250        1.13         19.77
SQL*Net message from client                  1314        0.02          2.40
PX Deq: Signal ACK EXT                          8        0.00          0.00
PX Deq: Slave Session Stats                     8        0.00          0.00
enq: PS - contention                            4        0.00          0.00

When executed all within the same node, the wait events look like the following.

Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
reliable message                                1        0.00          0.00
enq: KO - fast object checkpoint                1        0.00          0.00
KJC: Wait for msg sends to complete             4        0.00          0.00
PX Deq: Join ACK                                4        0.00          0.00
enq: PS - contention                            1        0.00          0.00
PX Deq: Parse Reply                             4        0.00          0.00
SQL*Net message to client                    1315        0.00          0.00
PX Deq: Execute Reply                         115        2.52         20.57
resmgr:cpu quantum                              1        0.00          0.00
SQL*Net message from client                  1314        0.01          2.93
PX Deq: Signal ACK EXT                          4        0.00          0.00
PX Deq: Slave Session Stats                     4        0.00          0.00

Notice that the PX Deq: Execute Reply wait events have very similar timings. This wait event is essentially the time that the QC waits for the slave processes to finish their work. The PX Deq: Parse Reply wait event has a total time of zero when executed locally and a total wait time of 12.35 seconds when executed across the nodes; this difference accounts for almost all of the 15-second difference in runtimes. The PX Deq: Parse Reply wait event occurs when the QC process is waiting for the parallel slaves to receive and parse their work from the QC. When the processes are all local, the wait time is hardly noticeable, and it makes sense that when some slave processes are on another node, it takes additional time to complete this operation.

If the local node has plenty of CPU resources and I/O bandwidth available to it, you may want the parallel SQL statement to run locally. Doing so would be no different than running a parallel SQL statement on a single-instance database. If you need the parallel SQL statement to enjoy the scale-out capabilities of multiple nodes, then the overhead of distributing the parallel slaves can be overcome by the quicker time to completion for each parallel slave. In the end, the only way to know whether running the parallel SQL statement locally or across the cluster is better is to run the statement both ways.

Oracle RAC and Inter-Instance parallelism


The foundation of Oracle Real Application Clusters revolves around parallelism, and long-term Oracle professionals remember that the original name for RAC was OPS, for Oracle Parallel Server. 

With RAC, it is possible for an intra-parallel operation to utilize the processors across the nodes, using the second argument in the Oracle PARALLEL hint. That gives an additional degree of parallelism while executing in parallel. For instance, a parallel query can be set up with a PARALLEL hint to utilize the CPUs of the many RAC instances.
Because each node requires a parallel query coordinator, many shops use n-1 parallelism to reserve a CPU for the query coordinator. If we have four instances, each with 4 CPUs, our query might use a degree of parallelism (DOP) of 3 (n-1, to reserve a CPU for the query coordinator process). MOSC note 280939.1 confirms that a separate parallel query coordinator is required on each remote node:
The slaves are controlled by the user background process (called the query coordinator, QC). In a RAC environment the slaves for one query may be spawned on different nodes. . . .
Parallel execution does not allocate slaves randomly across the available instances, but rather will start by allocating on the least loaded instance. The goal is to both minimize inter-node traffic and at the same time try to minimize any imbalance of work across the instances.
The INSTANCE_GROUPS/PARALLEL_INSTANCE_GROUP parameters can be used to restrict allocation of query slaves to specific instances in a RAC configuration and override the automatic allocation across instances. This can improve performance when there are problems with the interconnect.
Hence, the query might look like this, with a DOP of three:
SELECT /*+ FULL(sales) PARALLEL(sales, 3,4) */
   customer_name,
   sum(purchase_amount)
from
   sales;
In this example, the DOP is three and we use all four instances. The query is executed with a total of 16 processes, 4 on each instance, and one parallel query coordinator on each instance:
From the illustration it becomes clear that the RAC implementation of the query might run slower than an equivalent query on a monolithic server with the same hardware.  Note the differences between this RAC query and the vanilla Oracle parallel query on the monolithic server:
  • One-fourth fewer processes reading the table rows - We must reserve a process for the parallel query coordinator on each node.
  • Overhead on the cache fusion layer - As each node delivers the result set, the rows must be transferred to the master node that is controlling the query.
  • Slower sorting - Because the master node only has 8 gig of RAM, the result set is too large to sort in-memory and a time-consuming disk sort is required.
Parallel Query for distributed instances 
In a distributed environment, pieces of a table may reside on many remote servers. For example, assume that we have a distributed architecture where local customer tables are kept on each instance. You could access all of the remote rows in a single query, using inter-instance parallel execution. In this example, the query is executed from the north_carolina instance, accessing two remote instances in parallel:
   select customer_name, sum(purchase_amount) from sales
   group by customer_name
   union
   select customer_name, sum(purchase_amount) from sales@san_francisco
   group by customer_name
   union
   select customer_name, sum(purchase_amount) from sales@new_york
   group by customer_name;
In this case the north_carolina instance drives the distributed parallel query and it is the north_carolina instance that must gather and sort the result set.
As we see, Oracle offers a wealth of distributed parallel query options, each with its own unique characteristics.
The basic premise of parallel execution is to break down a large body of work into smaller units that will be performed simultaneously. After completion, the smaller units of work are merged into one final result.

Multiple CPU cores can be used to perform each unit of work. If a SQL statement takes X minutes to complete serially, then breaking the work into four equally sized pieces, each performed in parallel, would ideally complete the same SQL statement in a little more than one quarter of the serial execution time.

Why a little more? The extra time is due to the effort required to put the results back together.

The number of concurrent processes is called the degree of parallelism (DOP). The processes performing the smaller units of work are called parallel slaves. The process that puts the results back together is called the Query Coordinator (QC).

The following diagram illustrates performing a disk I/O on a database file serially, something most database administrators should be very familiar with.

Figure 7.1 Serial Disk Read

The user's server process reads the data from the file before it is passed on to the end user. The diagram above is simplistic in that it removes the SGA and other components from the picture, but this was intentionally done so as to focus on just the parallel aspects discussed in the next diagram.

Figure 7.2 Parallel Operations

In diagram 7.2 above, the same file is read, this time with a parallel degree of 4. Each slave will ideally be responsible for one-fourth of the I/O activity. The results are passed on to the Query Coordinator before the data is ready for the end user. Notice that this diagram shows an extra level of work, that of the QC process. This diagram should help illustrate why parallel processing is not always faster than serial processing. If the time saved by the slaves executing the disk I/O in parallel is more than the time taken by the QC to do its job, the overall execution will be faster than if run serially. At this point, it should be obvious that multiple slave processes reading from the same file can perform worse if the file system does not have enough resources to support the concurrent I/O operations from the slaves. A lack of sufficient resources will hamper parallel SQL statement execution.

The Query Coordinator (QC) has a few jobs to do. The QC acquires slave processes from a pool. The QC then doles out pieces of the work to each slave process. As the slave processes complete their work, the QC combines the results. Finally, the QC returns the slave processes to the pool and sends the final result to the end user. Don't feel bad that the slave processes appear to be doing all the work; the QC has enough work to do on its own as well. All this extra work can add up. The ultimate goal is that the extra work performed by the QC is significantly less than the time saved by the slaves working in parallel; otherwise the parallel execution will not perform well.

Parallel execution can improve statement execution for SQL statements that involve full table scans or joins of large tables, creation of large indexes, and large Insert, Update and Delete statements.

Parallel execution is not limited to just completing disk I/O faster. Consider a SQL statement that involves aggregate functions such as min or max. Data needs to be sorted before the minimum or maximum value can be found. In this case, the parallel slaves will operate in two steps, or two slave sets. One that reads the data from disk, then another set to sort and aggregate the data, all before returning the results to the Query Coordinator.

If each of the four parallel threads finds the maximum value of its portion of data, then the QC only needs to determine the maximum of four values. The following diagram shows how the parallel slaves can work together to provide multiple levels to the parallel processing.

Figure 7.3 Multiple Parallel Levels

In the above diagram, the parallel slaves that are reading from disk are producers for the slaves on the next level. The slaves performing the sort/aggregation are consumers of the work performed by the producers. The producer does not need to complete its work before the consumer can start to use the results: as soon as a producer has data ready, it passes it on to a consumer, promoting Inter-Operation Parallelism. This helps speed up the parallel processing. The parallel slaves on the same level participate in Intra-Operation Parallelism. The consumer processes are not limited to sort and aggregate operations; join operations can also be performed at the consumer level. Lastly, each producer can send its work on to any consumer, which leads to all of the arrows between the producers and consumers in the diagram above.

So if a parallel degree of 4 works well, then why not a parallel degree of 10, 20, or even 100? The more parallel slaves, the better, right? Unfortunately, the server's physical resources have limits. For slaves performing disk I/O, there needs to be sufficient bandwidth from the storage subsystem to the database server. Sorting operations are CPU-intensive, and too many slaves may start to push the CPU cores to their limits. If a very high parallel degree is used, there may be few resources left for other users of the database, including those that are not performing parallel operations. Each slave process also needs access to its own PGA memory. Parallel processing can struggle if the host does not have sufficient resources to support it.

The final sentence of the previous paragraph is where Oracle RAC can come to the rescue. You can scale the workload of parallel processing by spreading the load over multiple nodes in the cluster, thereby providing more resources. The following diagram shows how parallel processing might work in an Oracle RAC environment. If the single server does not have enough resources devoted to running the operation with parallel degree of 4, the processing can be moved to a two-node RAC cluster and run with parallel degree of 2 on each node.

Figure 7.4 Parallel Operations on RAC

In the diagram above, each node has producer and consumer parallel slave processes. It should be noted that a producer is capable of sending results to any consumer across the network, to any RAC instance. The database engine does its best to distribute the work in an intelligent fashion in Oracle RAC to minimize the cross-instance communication between producers and consumers. The diagram above illustrates an ideal situation where there is no cross-instance traffic between producers and consumers but it is also likely that the producer slave processes will send results across the instance boundaries. As such, it is vital to have a well performing Cluster Interconnect to ensure optimal parallel execution in Oracle RAC systems.

The user's shadow process for the connection to the instance serves as the Query Coordinator. This process is either a dedicated or shared server process and can be identified from the spid column of the v$process view. Since the query coordinator runs on only one of the instances, the results from each parallel slave need to be transferred across the Cluster Interconnect whenever the slave is running on a different instance. Unlike single-instance databases, parallel operations in Oracle RAC can perform poorly if too much data has to participate in global cache transfers to the QC's instance.
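While a parallel statement is running, the standard gv$px_session view can show which instance each slave runs on and which QC owns it, for example:

select inst_id, sid, qcsid, qcinst_id, degree
from gv$px_session;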

To reduce or eliminate global cache transfers for parallel operations in Oracle RAC, we can restrict the instances that will host parallel slave processes, a technique discussed in more detail later in this chapter. Keep in mind that segregating RAC instances for parallelism reduces the resources available to the parallel execution.

Parallel operations have one other feature when performed in Oracle RAC databases. By definition, all parallel slaves are on one instance in a single instance-database, accessing a singular Buffer Cache. In Oracle RAC, there are multiple Buffer Cache memory areas involved. When a parallel operation is performed and the parallel_degree_policy initialization parameter is set to AUTO, Oracle will access data from the different Buffer Cache memory areas. Without this feature, an instance that does not have the data block in the Buffer Cache would be forced to wait for a global cache transfer operation to complete, which would slow down parallel processing.

When a SQL statement is performed with parallel operations, the user's shadow process starts its role as the Query Coordinator. The QC then obtains the required number of parallel slave processes. The parallel slaves perform their work. Results from each slave are sent back to the QC. The QC performs any processing that could not be completed by parallel slaves. The QC puts together the final results and sends the results to the end user.

When an instance starts, a pool of parallel slave processes is automatically started. Initially, the number of slave processes in the pool is defined by the parallel_min_servers initialization parameter. If all of the slave processes from the initial pool allocation are being used and another parallel operation starts, more parallel slave processes are created until a maximum of parallel_max_servers processes have been reached. At this point, no more parallel processes can be spawned.

For most Oracle RAC deployments, the resources available in each node are similar. For example, each node would have the same number of CPU cores. In this case, the minimum and maximum number of slave processes should be configured identically for all instances.
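For example, on a symmetric cluster the slave pool limits might be set identically on every instance; the values here are illustrative and workload-dependent:

alter system set parallel_min_servers=8 sid='*' scope=both;

alter system set parallel_max_servers=64 sid='*' scope=both;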

If a node has twice as many CPU cores as another node, then you may consider allowing twice the number of parallel slave processes.

http://www.dba-oracle.com/t_rac_tuning_parallel_query.htm