Thursday, September 15, 2016

SQL Server: How Optimizer creates a plan for your query

If you have not read previous two posts related this, please have a look on them first;

When a query is submitted, it goes through various phases and finally it comes to Query Optimizer as an Algebrizer Tree including details of physical objects related to the query. The Query Optimizer uses details such as Object Schema (indexes, constraints, data types) and Object Statistics for making plans for the query.

Optimization Phases
In order to reduce the time it takes for creating plans and evaluating them for finding the best, Query Optimizer goes through three phases. There three phases makes sure that it avoids complex plans as much as possible and goes to phases only if it is required. Three phases are:
  1. Simplification
    This tries to make the plan simplify for making the optimization quicker. This does simplifications such as Converting sub-queries to joins, Removing redundant joins.
  2. Trivial plan generation
    For some queries, there will only few possible plans exist. When Query Optimizer experiences it, it picks a single plan without evaluating other plans.
  3. Full optimization
    It comes to this phase if the query has many number of plans and they have to be evaluated for selecting the best. This is called as Full cost-based optimization as well. However, in certain scenario, Query Optimizer does not evaluate all plans because of the cost on resource usage related to evaluation.

    In order to select plans for evaluation, it uses Transformation Rules. The combination of Logical Query Tree (or Algebrizer Tree) and Transformation Rules results an internal structure called Memo. It will be used for calculating the cost.

    There can be an instance that many number of plans are created and even with Transformation Rules, it takes long time for evaluating. Then, SQL Server tries to handle the optimization with three stages; Search 0 - Transaction Processing Phase, Search 1 - Quick Plan Phase, and Search 2 - Full Optimization Phase. When it goes through these stages, if the stage does not return a plan, then it goes to the next one. If none of the stages return a plan, then it picks the plan with lowest cost.
Here are some code samples that explain some phases;

When this code is executed, it ignores Production.ProductCateory table as it is not required. The reason is the Foreign Key Constraint exit between Production.ProductSubCategory and Production.ProductCategory and as per the JOIN condition, it says that no records can exist in the Production.ProductSubCategory without a record in Production.ProductCategory. Therefore Optimizer ignores it. This is an example for Simplification.


Here is another example for simplification. This query has a condition on the Status column and Status column has a check constraint set saying that it should be between 0 and 8. Query Optimizer knows that there cannot be record exist with Status 10 because of the constraint, hence it simplify the query.


Here is one more query that shows the second phase which is Trivial Plan Generation. There are not much options in terms of multiple plans, hence one will be immediately selected.



No comments: