The following is a list of some tips which can be used as
guideline to write and review custom SQL queries. This is by no means an exhaustive
list to get the best tuning results but can serve as a ready reference to avoid
the common pitfalls while working with Oracle SQL:
1. Do not use the set operator UNION if the objective can be
achieved through an UNION ALL. UNION incurs an extra sort operation which can
be avoided.
2. Select ONLY those columns in a query which are required. Extra
columns which are not actually used, incur more I/O on the database and
increase network traffic.
3. Do not use the keyword DISTINCT if the objective can be
achieved otherwise. DISTINCT incurs an extra sort operation and therefore slows
your queries down.
4. If it is required to use a composite index, try to use the
“Leading” column in the “WHERE” clause. Though Index skip scan is possible, it
incurs extra cost in creating virtual indexes and may not be always possible
depending on the cardinality of the leading columns.
5. There should not be any Cartesian product in the query unless
there is a definite requirement to do so. I know this is a silly point but we
all have done this mistake at one point
6. Wherever multiple tables are used, always refer to a column by
either using an alias or using the fully qualified name. Do not leave the guess
work for Oracle.
7. SQL statements should be formatted consistently (e.g the
keywords should be in CAPS only) to aid readability. Now, this is not a
performance tip really. However, it’s important and part of the practices.
8. If possible use bind variables instead of constant/literal
values in the predicate filter conditions to reduce repeated parsing of the
same statement.
9. Use meaningful aliases for tables/views
10. When writing sub-queries make use of the EXISTS operator where
possible as Oracle knows that once a match has been found it can stop and avoid
a full table scan (it does a SEMI JOIN).
11. If the selective predicate is in the sub query, then use IN.
12. If the selective predicate is in the parent query, then use
EXISTS.
13. Do not modify indexed columns with functions such as RTRIM,
TO_CHAR, UPPER, TRUNC as this will prevent the optimizer from identifying the
index. If possible perform the modification on the constant side of the
condition. If the indexed column is usually accessed through a function (e.g
NVL), consider creating a function based index.
14. Try to use an index if less than 5% of the data needs to be
accessed from a data set. The exception is a small table (a few hundred rows)
which is usually best accessed through a FULL table scan irrespective of the
percentage of data required.
15. Use equi-joins whenever possible, they improve SQL efficiency
16. Avoid the following kinds of complex expressions:
- NVL
(col1,-999) = ….
- TO_DATE(),
TO_NUMBER(), and so on
These expressions prevent the optimizer from assigning valid
cardinality or selectivity estimates and can in turn affect the overall plan
and the join method
17. It is always better to write separate SQL statements for
different tasks, but if you must use one SQL statement, then you can make a
very complex statement slightly less complex by using the UNION ALL operator
18. Joins to complex views are not recommended, particularly joins
from one complex view to another. Often this results in the entire view being
instantiated, and then the query is run against the view data
19. Querying from a view requires all tables from the view to be
accessed for the data to be returned. If that is not required, then do not use
the view. Instead, use the base table(s), or if necessary, define a new view.
20. While querying on a partitioned table try to use the partition
key in the “WHERE” clause if possible. This will ensure partition pruning.
21. Consider using the PARALLEL hint (only when additional
resources can be allocated) while accessing large data sets.
22. Avoid doing an ORDER BY on a large data set especially if the
response time is important.
23. Consider changing the OPTIMIZER MODE to FIRST_ROWS(n) if the
response time is important. The default is ALL_ROWS which gives better
throughput.
24. Use CASE statements instead of DECODE (especially where nested
DECODEs are involved) because they increase the readability of the query
immensely.
25. Do not use HINTS unless the performance gains clear.
26. Check if the statistics for the objects used in the query are
up to date. If not, use the DBMS_STATS package to collect the same.
27. It is always good to understand the data both functionally and
it’s diversity and volume in order to tune the query. Selectivity (predicate)
and Cardinality (skew) factors have a big impact on query plan. Use of
Statistics and Histograms can drive the query towards a better plan.
28. Read explain plan and try to make largest restriction (filter)
as the driving site for the query, followed by the next largest, this will
minimize the time spent on I/O and execution in subsequent phases of the plan.
29. If Query requires quick response rather than good throughput
is the objective, try to avoid sorts (group by, order by, etc.). For good
throughput, optimizer mode should be set to ALL ROWS.
30. Queries tend to perform worse as they age due to volume
increase, structural changes in the database and application, upgrades etc. Use
Automatic Workload Repository (AWR) and Automatic Database Diagnostic Monitor
(ADDM) to better understand change in execution plan and throughput of top
queries over a period of time.
31. SQL Tuning Advisor and SQL Access Advisor can be used for
system advice on tuning specific SQL and their join and access paths, however,
advice generated by these tools may not be always applicable (point 28).
32. SQL Access paths for joins are an component determining query
execution time. Hash Joins are preferable when 2 large tables need to be
joined. Nested loops make work better when a large table is joined with a small
table.
Disclaimer: Points listed above are only pointers and may
not work under every circumstance. This check list can be used as a reference
while fixing performance problems in the Oracle Database.
Suggested further readings
- Materialized
Views
- Advanced
Replication
- Change
Data Capture (Asynchronous)
- Automatic
Workload Repository (AWR) and Automatic Database Diagnostic Monitor
(ADDM).
- Partitioning
strategies.
Now it’s turn if have any more tips which you have used then do
add them in comment section… Your feedback is very valuable and it would be
useful for other viewers too.
No comments:
Post a Comment