Creating table using Select query is a very common. But what one should do when you have a very large volume of data and above all the select query has joins and sub queries.
Recently I experienced one such scenario. There was a requirement where I needed to create table and total number of rows were around 3-4 billion (close to 4 GB) and select query has joins from three
tables and sub queries too. Initially I didn’t looked for the tables stats and just created the select query and issued the create command. I checked next day and still query was running. And the fun part was, if I did a DESC on table, it showed “Table does not exist” and if I issue select on the table, query didn’t throw any error.
Any guess why it’s like that?
Actually when we hit the select it goes for parsing, since it’s not throwing error that means some of the data dictionary is updated but since the size is too large, things are messed up at temp segments.
Coming to the original problem. I did some research and added nologging and nocache to the create command. I also used parallel functionality and the results were mind-bogling. The whole table got created in just under 20 minutes.
A Little more information on Parallel functionality.
Oracle can parallelize operations that involve processing an entire table or an entire partition.
We can use Parallel in
• Data loading
Generally it is recommended to use PARALLEL when there is SAP database.
The degree of parallelism used for a SQL statement can be specified at three different levels:
Statement level: Using hints or the PARALLEL clause
Object level: Found in the definition of the table, index, or other object
Instance level: Using default values for the instance
You can use the PARALLEL and PARALLEL_INDEX hints to specify the degree of parallelism for a SELECT statement.
SELECT /*+ PARALLEL (EMP, 4) */ COUNT (*) FROM employees;
Data Manipulation Language (DML) operations such as INSERT, UPDATE, and DELETE can be parallelized by Oracle. Parallel execution can speed up large DML operations and is particularly advantageous in data warehousing environments where it’s necessary to maintain large summary or historical tables.
Rules for UPDATE and DELETE statements
• Oracle can parallelize UPDATE and DELETE statements on partitioned tables, but only when multiple partitions are involved.
• You cannot parallelize UPDATE or DELETE operations on a non-partitioned table or when such operations affect only a single partition.
Parallel DDL works for both tables and indexes, whether partitioned or nonpartitioned. For nonpartitioned tables and indexes, only the following types of DDL statements can be parallelized:
CREATE TABLE…AS SELECT
If you’re working with partitioned tables and indexes, the scope of Oracle’s parallel DDL support broadens. The following statements can be parallelized for partitioned tables and indexes:
CREATE TABLE…AS SELECT
ALTER TABLE…MOVE PARTITION
ALTER TABLE…SPLIT PARTITION
ALTER INDEX…REBUILD PARTITION
ALTER INDEX…SPLIT PARTITION