org.apache.derby.impl.sql.compile
Class GroupByNode

java.lang.Object
  extended by org.apache.derby.impl.sql.compile.QueryTreeNode
      extended by org.apache.derby.impl.sql.compile.ResultSetNode
          extended by org.apache.derby.impl.sql.compile.FromTable
              extended by org.apache.derby.impl.sql.compile.SingleChildResultSetNode
                  extended by org.apache.derby.impl.sql.compile.GroupByNode
All Implemented Interfaces:
Optimizable, Visitable

public class GroupByNode
extends SingleChildResultSetNode

A GroupByNode represents a result set for a grouping operation on a select. Note that this includes a SELECT with aggregates and no grouping columns (in which case the select list is null). It has the same description as its input result set.
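
As an illustration of the two cases named above (grouped aggregates, and aggregates with no grouping columns), the following sketch uses plain Java collections in place of SQL rows. It is not Derby code: the class GroupingSketch, the Row type, and both method names are hypothetical and exist only for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: plain Java collections standing in for SQL rows.
// GroupingSketch, Row, and the method names are hypothetical, not Derby classes.
final class GroupingSketch {
    static final class Row {
        final String c1;
        final int c2;

        Row(String c1, int c2) {
            this.c1 = c1;
            this.c2 = c2;
        }

        String c1() { return c1; }
        int c2() { return c2; }
    }

    // Roughly: SELECT c1, SUM(c2) FROM t GROUP BY c1
    // One output entry per distinct value of the grouping column.
    static Map<String, Integer> groupedSum(List<Row> rows) {
        return rows.stream().collect(
            Collectors.groupingBy(Row::c1, TreeMap::new, Collectors.summingInt(Row::c2)));
    }

    // Roughly: SELECT SUM(c2) FROM t
    // An aggregate with no grouping columns produces a single value.
    // Note: SQL SUM over zero rows yields NULL; this sketch returns 0 instead.
    static int scalarSum(List<Row> rows) {
        return rows.stream().mapToInt(Row::c2).sum();
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 1), new Row("b", 2), new Row("a", 3));
        System.out.println(groupedSum(rows));
        System.out.println(scalarSum(rows));
    }
}
```

Both shapes of query are handled by the same node type; the sketch only shows how their results differ in shape.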

For the most part, it simply delegates operations to its bottomPRSet, which is currently expected to be a ProjectRestrictResultSet generated for a SelectNode.

NOTE: A GroupByNode extends FromTable since it can exist in a FromList.

There is a lot of room for optimizations here.


Nested Class Summary
private static class GroupByNode.ExpressionSorter
          Comparator class for GROUP BY expression substitution.
 
Field Summary
private  boolean addDistinctAggregate
           
private  int addDistinctAggregateColumnNum
           
private  AggregatorInfoList aggInfo
          Information that is used at execution time to process aggregates.
(package private)  java.util.Vector aggregateVector
          The list of all aggregates in the query block that contains this group by.
(package private)  GroupByList groupingList
          The GROUP BY list
private  ValueNode havingClause
           
private  SubqueryList havingSubquerys
           
private  boolean isInSortedOrder
           
(package private)  FromTable parent
          The parent to the GroupByNode.
private  boolean singleInputRowOptimization
           
 
Fields inherited from class org.apache.derby.impl.sql.compile.SingleChildResultSetNode
childResult, hasTrulyTheBestAccessPath
 
Fields inherited from class org.apache.derby.impl.sql.compile.FromTable
ADD_PLAN, bestAccessPath, bestCostEstimate, bestSortAvoidancePath, correlationName, corrTableName, currentAccessPath, hashKeyColumns, initialCapacity, level, LOAD_PLAN, loadFactor, maxCapacity, origTableName, REMOVE_PLAN, tableNumber, tableProperties, trulyTheBestAccessPath, userSpecifiedJoinStrategy
 
Fields inherited from class org.apache.derby.impl.sql.compile.ResultSetNode
costEstimate, cursorTargetTable, finalCostEstimate, insertSource, optimizer, referencedTableMap, resultColumns, resultSetNumber, scratchCostEstimate, statementResultSet
 
Fields inherited from class org.apache.derby.impl.sql.compile.QueryTreeNode
AUTOINCREMENT_CREATE_MODIFY, AUTOINCREMENT_INC_INDEX, AUTOINCREMENT_IS_AUTOINCREMENT_INDEX, AUTOINCREMENT_START_INDEX, isPrivilegeCollectionRequired
 
Constructor Summary
GroupByNode()
           
 
Method Summary
private  void addAggregateColumns()
          In the query rewrite involving aggregates, add the columns for aggregation.
private  void addAggregates()
          Add the extra result columns required by the aggregates to the result list.
private  void addDistinctAggregatesToOrderBy()
          Add any distinct aggregates to the order by list.
private  void addNewColumnsForAggregation()
          Add a whole slew of columns needed for aggregation.
private  void addNewPRNode()
          Add a new PR node for aggregation.
private  java.util.ArrayList addUnAggColumns()
          In the query rewrite for group by, add the columns on which we are doing the group by.
(package private)  void considerPostOptimizeOptimizations(boolean selectHasPredicates)
          Consider any optimizations after the optimizer has chosen a plan.
 CostEstimate estimateCost(OptimizablePredicateList predList, ConglomerateDescriptor cd, CostEstimate outerCost, Optimizer optimizer, RowOrdering rowOrdering)
          Estimate the cost of scanning this Optimizable using the given predicate list with the given conglomerate.
 boolean flattenableInFromSubquery(FromList fromList)
          Evaluate whether or not the subquery in a FromSubquery is flattenable.
 void generate(ActivationClassBuilder acb, MethodBuilder mb)
          Generate the sort result set operating over the source result set.
private  void genGroupedAggregateResultSet(ActivationClassBuilder acb, MethodBuilder mb)
          Generate the code to evaluate grouped aggregates.
private  void genScalarAggregateResultSet(ActivationClassBuilder acb, MethodBuilder mb)
          Generate the code to evaluate scalar aggregates.
private  ResultColumn getColumnReference(ResultColumn targetRC, DataDictionary dd)
          Method for creating a new result column referencing the one passed in.
(package private)  boolean getIsInSortedOrder()
          Get whether or not the source is in sorted order.
 FromTable getParent()
          Return the parent node to this one, if there is one.
 void init(java.lang.Object bottomPR, java.lang.Object groupingList, java.lang.Object aggregateVector, java.lang.Object havingClause, java.lang.Object havingSubquerys, java.lang.Object tableProperties, java.lang.Object nestingLevel)
          Initializer for a GroupByNode.
 boolean isOneRowResultSet()
          Return whether or not the underlying ResultSet tree will return a single row, at most.
(package private)  ResultColumnDescriptor[] makeResultDescriptors()
           
 ResultSetNode optimize(DataDictionary dataDictionary, PredicateList predicates, double outerRows)
          Optimize this GroupByNode.
 CostEstimate optimizeIt(Optimizer optimizer, OptimizablePredicateList predList, CostEstimate outerCost, RowOrdering rowOrdering)
          Choose the best access path for this Optimizable.
 void printSubNodes(int depth)
          Prints the sub-nodes of this object.
 boolean pushOptPredicate(OptimizablePredicate optimizablePredicate)
          Push an OptimizablePredicate down, if this node accepts it.
 java.lang.String toString()
          Convert this object to a String.
 
Methods inherited from class org.apache.derby.impl.sql.compile.SingleChildResultSetNode
acceptChildren, addNewPredicate, adjustForSortElimination, adjustForSortElimination, changeAccessPath, decrementLevel, ensurePredicateList, forUpdate, getChildResult, getFinalCostEstimate, getFromTableByName, getTrulyTheBestAccessPath, init, initAccessPaths, isNotExists, isOrderedOn, modifyAccessPaths, preprocess, pullOptPredicates, pushExpressions, referencesSessionSchema, referencesTarget, reflectionNeededForProjection, setChildResult, setLevel, subqueryReferencesTarget, updateBestPlanMap, updateTargetLockMode
 
Methods inherited from class org.apache.derby.impl.sql.compile.FromTable
assignCostEstimate, canBeOrdered, considerSortAvoidancePath, convertAbsoluteToRelativeColumnPosition, cursorTargetTable, feasibleJoinStrategy, fillInReferencedTableMap, flatten, getBaseTableName, getBestAccessPath, getBestSortAvoidancePath, getCorrelationName, getCostEstimate, getCurrentAccessPath, getExposedName, getLevel, getName, getNumColumnsReturned, getOrigTableName, getProperties, getResultColumnsForList, getSchemaDescriptor, getSchemaDescriptor, getScratchCostEstimate, getTableDescriptor, getTableName, getTableNumber, getUserSpecifiedJoinStrategy, hashKeyColumns, hasTableNumber, initialCapacity, isBaseTable, isCoveringIndex, isFlattenableJoinNode, isMaterializable, isOneRowScan, isTargetTable, legalJoinOrder, loadFactor, LOJ_reorderable, markUpdatableByCursor, maxCapacity, memoryUsageOK, modifyAccessPath, needsSpecialRCLBinding, nextAccessPath, optimizeSubqueries, rememberAsBest, rememberJoinStrategyAsBest, rememberSortAvoidancePath, resetJoinStrategies, setCostEstimate, setHashKeyColumns, setOrigTableName, setProperties, setTableNumber, startOptimizing, supportsMultipleInstantiations, tellRowOrderingAboutConstantColumns, transformOuterJoins, uniqueJoin, verifyProperties
 
Methods inherited from class org.apache.derby.impl.sql.compile.ResultSetNode
assignResultSetNumber, bindExpressions, bindExpressionsWithTables, bindNonVTITables, bindResultColumns, bindResultColumns, bindTargetExpressions, bindUntypedNullsToResultColumns, bindVTITables, columnTypesAndLengthsMatch, considerMaterialization, disablePrivilegeCollection, enhanceRCLForInsert, generateNormalizationResultSet, generateResultSet, genProjectRestrict, genProjectRestrict, genProjectRestrictForReordering, getAllResultColumns, getCostEstimate, getCursorTargetTable, getFromList, getMatchingColumn, getNewCostEstimate, getOptimizer, getOptimizerImpl, getRCLForInsert, getReferencedTableMap, getResultColumns, getResultSetNumber, isPossibleDistinctScan, isUpdatableCursor, LOJgetReferencedTables, makeResultDescription, markAsCursorTargetTable, markForDistinctScan, markStatementResultSet, modifyAccessPaths, notCursorTargetTable, notFlattenableJoin, numDistinctAggregates, parseDefault, performMaterialization, projectResultColumns, pushOffsetFetchFirst, pushOrderByList, rejectParameters, rejectXMLValues, renameGeneratedResultNames, replaceOrForbidDefaults, returnsAtMostOneRow, setInsertSource, setReferencedTableMap, setResultColumns, setResultToBooleanTrueNode, setTableConstructorTypes, verifySelectStarSubquery
 
Methods inherited from class org.apache.derby.impl.sql.compile.QueryTreeNode
accept, bindOffsetFetch, bindRowMultiSet, bindUserType, checkReliability, checkReliability, convertDefaultNode, createTypeDependency, debugFlush, debugPrint, formatNodeString, foundString, generateAuthorizeCheck, getBeginOffset, getClassFactory, getCompilerContext, getContextManager, getCursorInfo, getDataDictionary, getDependencyManager, getEndOffset, getExecutionFactory, getGenericConstantActionFactory, getIntProperty, getLanguageConnectionContext, getNodeFactory, getNodeType, getNullNode, getParameterTypes, getRowEstimate, getSchemaDescriptor, getSchemaDescriptor, getStatementType, getTableDescriptor, getTypeCompiler, init, init, init, init, init, init, init, init, init, init, init, init, isAtomic, isInstanceOf, isPrivilegeCollectionRequired, isSessionSchema, isSessionSchema, makeConstantAction, makeTableName, makeTableName, nodeHeader, orReliability, parseStatement, printLabel, resolveTableToSynonym, setBeginOffset, setContextManager, setEndOffset, setNodeType, setRefActionInfo, stackPrint, treePrint, treePrint, verifyClassExist
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.derby.iapi.sql.compile.Optimizable
getDataDictionary, getReferencedTableMap, getResultSetNumber
 

Field Detail

groupingList

GroupByList groupingList
The GROUP BY list


aggregateVector

java.util.Vector aggregateVector
The list of all aggregates in the query block that contains this group by.


aggInfo

private AggregatorInfoList aggInfo
Information that is used at execution time to process aggregates.


parent

FromTable parent
The parent to the GroupByNode. If we need to generate a ProjectRestrict over the group by then this is set to that node. Otherwise it is null.


addDistinctAggregate

private boolean addDistinctAggregate

singleInputRowOptimization

private boolean singleInputRowOptimization

addDistinctAggregateColumnNum

private int addDistinctAggregateColumnNum

isInSortedOrder

private boolean isInSortedOrder

havingClause

private ValueNode havingClause

havingSubquerys

private SubqueryList havingSubquerys
Constructor Detail

GroupByNode

public GroupByNode()
Method Detail

init

public void init(java.lang.Object bottomPR,
                 java.lang.Object groupingList,
                 java.lang.Object aggregateVector,
                 java.lang.Object havingClause,
                 java.lang.Object havingSubquerys,
                 java.lang.Object tableProperties,
                 java.lang.Object nestingLevel)
          throws StandardException
Initializer for a GroupByNode.

Overrides:
init in class QueryTreeNode
Parameters:
bottomPR - The child FromTable
groupingList - The groupingList
aggregateVector - The vector of aggregates from the query block. Since aggregation is done at the same time as grouping, we need them here.
havingClause - The having clause.
havingSubquerys - subqueries in the having clause.
tableProperties - Properties list associated with the table
nestingLevel - nestingLevel of this group by node. This is used for error checking of group by queries with having clause.
Throws:
StandardException - Thrown on error

getIsInSortedOrder

boolean getIsInSortedOrder()
Get whether or not the source is in sorted order.

Returns:
Whether or not the source is in sorted order.

addAggregates

private void addAggregates()
                    throws StandardException
Add the extra result columns required by the aggregates to the result list.

Throws:
StandardException - Thrown on error

addDistinctAggregatesToOrderBy

private void addDistinctAggregatesToOrderBy()
Add any distinct aggregates to the order by list. Asserts that there are 0 or more distincts.


addNewPRNode

private void addNewPRNode()
                   throws StandardException
Add a new PR node for aggregation. Put the new PR under the sort.

Throws:
StandardException - Thrown on error

addUnAggColumns

private java.util.ArrayList addUnAggColumns()
                                     throws StandardException
In the query rewrite for group by, add the columns on which we are doing the group by.

Returns:
havingRefsToSubstitute visitors array. Return any havingRefsToSubstitute visitors since it is too early to apply them yet; we need the AggregateNodes unmodified until after we add the new columns for aggregation (DERBY-4071).
Throws:
StandardException
See Also:
addNewColumnsForAggregation()

addNewColumnsForAggregation

private void addNewColumnsForAggregation()
                                  throws StandardException
Add a whole slew of columns needed for aggregation. Basically, for each aggregate we add 3 columns: the aggregate input expression, the aggregator column, and a column where the aggregate result is stored. The input expression is taken directly from the aggregator node. The aggregator is the run time aggregator. We add it to the RC list as a new object coming into the sort node.

At the point this is invoked, we have the following tree:

For each ColumnReference in PR RCL

For each aggregate in aggregateVector.

For a query like,

          select c1, sum(c2), max(c3)
          from t1 
          group by c1;
          
the query tree ends up looking like this:
            ProjectRestrictNode RCL -> (ptr to GBN(column[0]), ptr to GBN(column[1]), ptr to GBN(column[4]))
                      |
            GroupByNode RCL->(C1, SUM(C2), <agg-input>, <aggregator>, MAX(C3), <agg-input>, <aggregator>)
                      |
            ProjectRestrict RCL->(C1, C2, C3)
                      |
            FromBaseTable
            
The RCL of the GroupByNode contains all the unagg (or grouping columns) followed by 3 RC's for each aggregate in this order: the final computed aggregate value, the aggregate input and the aggregator function.

The Aggregator function puts the results in the first of the 3 RC's and the PR resultset in turn picks up the value from there.

The notation (ptr to GBN(column[0])) basically means that it is a pointer to the 0th RC in the RCL of the GroupByNode.

The addition of these unagg and agg columns to the GroupByNode and to the PRN is performed in addUnAggColumns and addAggregateColumns.

Note that the addition of the GroupByNode is done after the query is optimized (in SelectNode#modifyAccessPaths), which means a fair amount of patching up is needed to account for generated group by columns.

Throws:
StandardException - Thrown on error
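
The result column layout described for addNewColumnsForAggregation (grouping columns first, then three RCs per aggregate: final value, aggregate input, aggregator) can be sketched with simple index arithmetic. This is a sketch under that stated layout only; GroupByRclLayoutSketch and its methods are hypothetical names, and the strings are labels rather than Derby ResultColumn objects.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models the GroupByNode RCL as a list of labels.
// The class and method names are hypothetical, not Derby APIs.
final class GroupByRclLayoutSketch {
    // 0-based position of the final computed value of the k-th aggregate,
    // given that each aggregate occupies 3 consecutive slots after the
    // grouping columns.
    static int aggregateResultIndex(int groupingColumnCount, int k) {
        return groupingColumnCount + 3 * k;
    }

    // Builds the label list: grouping columns, then for each aggregate
    // its result slot, its input slot, and its aggregator slot.
    static List<String> layout(List<String> groupingColumns, List<String> aggregates) {
        List<String> rcl = new ArrayList<>(groupingColumns);
        for (String agg : aggregates) {
            rcl.add(agg);                        // final computed aggregate value
            rcl.add("input(" + agg + ")");       // aggregate input expression
            rcl.add("aggregator(" + agg + ")");  // run time aggregator
        }
        return rcl;
    }

    public static void main(String[] args) {
        // select c1, sum(c2), max(c3) from t1 group by c1
        List<String> rcl = layout(List.of("C1"), List.of("SUM(C2)", "MAX(C3)"));
        System.out.println(rcl);
        System.out.println(aggregateResultIndex(1, 0));
        System.out.println(aggregateResultIndex(1, 1));
    }
}
```

For the example query, the two computed indexes are 1 and 4, which agrees with the column[1] and column[4] pointers shown in the ProjectRestrictNode RCL above.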

addAggregateColumns

private void addAggregateColumns()
                          throws StandardException
In the query rewrite involving aggregates, add the columns for aggregation.

Throws:
StandardException
See Also:
addNewColumnsForAggregation()

getParent

public FromTable getParent()
Return the parent node to this one, if there is one. It will return 'this' if there is no generated node above this one.

Returns:
the parent node

optimizeIt

public CostEstimate optimizeIt(Optimizer optimizer,
                               OptimizablePredicateList predList,
                               CostEstimate outerCost,
                               RowOrdering rowOrdering)
                        throws StandardException
Description copied from interface: Optimizable
Choose the best access path for this Optimizable.

Specified by:
optimizeIt in interface Optimizable
Overrides:
optimizeIt in class FromTable
Parameters:
optimizer - Optimizer to use.
predList - The predicate list to optimize against
outerCost - The CostEstimate for the outer tables in the join order, telling how many times this Optimizable will be scanned.
rowOrdering - The row ordering for all the tables in the join order, including this one.
Returns:
The optimizer's estimated cost of the best access path.
Throws:
StandardException - Thrown on error
See Also:
Optimizable.optimizeIt(org.apache.derby.iapi.sql.compile.Optimizer, org.apache.derby.iapi.sql.compile.OptimizablePredicateList, org.apache.derby.iapi.sql.compile.CostEstimate, org.apache.derby.iapi.sql.compile.RowOrdering)

estimateCost

public CostEstimate estimateCost(OptimizablePredicateList predList,
                                 ConglomerateDescriptor cd,
                                 CostEstimate outerCost,
                                 Optimizer optimizer,
                                 RowOrdering rowOrdering)
                          throws StandardException
Description copied from interface: Optimizable
Estimate the cost of scanning this Optimizable using the given predicate list with the given conglomerate. It is assumed that the predicate list has already been classified. This cost estimate is just for one scan, not for the life of the query.

Specified by:
estimateCost in interface Optimizable
Overrides:
estimateCost in class FromTable
Parameters:
predList - The predicate list to optimize against
cd - The conglomerate descriptor to get the cost of
outerCost - The estimated cost of the part of the plan outer to this optimizable.
optimizer - The optimizer to use to help estimate the cost
rowOrdering - The row ordering for all the tables in the join order, including this one.
Returns:
The estimated cost of doing the scan
Throws:
StandardException - Thrown on error
See Also:
Optimizable.estimateCost(org.apache.derby.iapi.sql.compile.OptimizablePredicateList, org.apache.derby.iapi.sql.dictionary.ConglomerateDescriptor, org.apache.derby.iapi.sql.compile.CostEstimate, org.apache.derby.iapi.sql.compile.Optimizer, org.apache.derby.iapi.sql.compile.RowOrdering)

pushOptPredicate

public boolean pushOptPredicate(OptimizablePredicate optimizablePredicate)
                         throws StandardException
Description copied from interface: Optimizable
Push an OptimizablePredicate down, if this node accepts it.

Specified by:
pushOptPredicate in interface Optimizable
Overrides:
pushOptPredicate in class FromTable
Parameters:
optimizablePredicate - OptimizablePredicate to push down.
Returns:
Whether or not the predicate was pushed down.
Throws:
StandardException - Thrown on error
See Also:
Optimizable.pushOptPredicate(org.apache.derby.iapi.sql.compile.OptimizablePredicate)

toString

public java.lang.String toString()
Convert this object to a String. See comments in QueryTreeNode.java for how this should be done for tree printing.

Overrides:
toString in class FromTable
Returns:
This object as a String

printSubNodes

public void printSubNodes(int depth)
Prints the sub-nodes of this object. See QueryTreeNode.java for how tree printing is supposed to work.

Overrides:
printSubNodes in class SingleChildResultSetNode
Parameters:
depth - The depth of this node in the tree

flattenableInFromSubquery

public boolean flattenableInFromSubquery(FromList fromList)
Evaluate whether or not the subquery in a FromSubquery is flattenable. Currently, a FromSubquery is flattenable if all of the following are true:
o Subquery is a SelectNode.
o It contains no top level subqueries. (RESOLVE - we can relax this)
o It does not contain a group by or having clause.
o It does not contain aggregates.

Overrides:
flattenableInFromSubquery in class SingleChildResultSetNode
Parameters:
fromList - The outer from list
Returns:
boolean Whether or not the FromSubquery is flattenable.
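
The flattenability conditions listed for flattenableInFromSubquery amount to a conjunction of four checks. The sketch below expresses that conjunction over a hypothetical SubqueryShape summary object; neither FlattenableSketch nor SubqueryShape exists in Derby, and the real method inspects the query tree rather than precomputed flags.

```java
// Illustrative only: a conjunction of the four documented conditions.
// FlattenableSketch and SubqueryShape are hypothetical, not Derby classes.
final class FlattenableSketch {
    static final class SubqueryShape {
        final boolean isSelectNode;
        final boolean hasTopLevelSubqueries;
        final boolean hasGroupByOrHaving;
        final boolean hasAggregates;

        SubqueryShape(boolean isSelectNode, boolean hasTopLevelSubqueries,
                      boolean hasGroupByOrHaving, boolean hasAggregates) {
            this.isSelectNode = isSelectNode;
            this.hasTopLevelSubqueries = hasTopLevelSubqueries;
            this.hasGroupByOrHaving = hasGroupByOrHaving;
            this.hasAggregates = hasAggregates;
        }
    }

    // All four conditions must hold for the subquery to be flattenable.
    static boolean flattenable(SubqueryShape s) {
        return s.isSelectNode
            && !s.hasTopLevelSubqueries
            && !s.hasGroupByOrHaving
            && !s.hasAggregates;
    }

    public static void main(String[] args) {
        System.out.println(flattenable(new SubqueryShape(true, false, false, false)));
        System.out.println(flattenable(new SubqueryShape(true, false, true, false)));
    }
}
```

Read this way, a subquery that carries a group by, having clause, or aggregates (the situations a GroupByNode represents) would fail the third or fourth condition.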

optimize

public ResultSetNode optimize(DataDictionary dataDictionary,
                              PredicateList predicates,
                              double outerRows)
                       throws StandardException
Optimize this GroupByNode.

Overrides:
optimize in class SingleChildResultSetNode
Parameters:
dataDictionary - The DataDictionary to use for optimization
predicates - The PredicateList to optimize. This should be a join predicate.
outerRows - The number of outer joining rows
Returns:
ResultSetNode The top of the optimized subtree
Throws:
StandardException - Thrown on error

makeResultDescriptors

ResultColumnDescriptor[] makeResultDescriptors()
Overrides:
makeResultDescriptors in class ResultSetNode

isOneRowResultSet

public boolean isOneRowResultSet()
                          throws StandardException
Return whether or not the underlying ResultSet tree will return a single row, at most. This is important for join nodes where we can save the extra next on the right side if we know that it will return at most 1 row.

Overrides:
isOneRowResultSet in class SingleChildResultSetNode
Returns:
Whether or not the underlying ResultSet tree will return a single row.
Throws:
StandardException - Thrown on error
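
To illustrate why the one-row property matters to a join, the following sketch counts how often a nested loop asks its right side for a row. It is a simplified model, not Derby's join implementation: OneRowJoinSketch and rightNextCalls are hypothetical names, and a java.util.Iterator stands in for the right-side result set.

```java
import java.util.Iterator;
import java.util.List;

// Illustrative only: a toy nested loop that counts "next" requests
// made on the right side. Names are hypothetical, not Derby APIs.
final class OneRowJoinSketch {
    static int rightNextCalls(int outerRows, List<String> rightRows, boolean rightIsOneRow) {
        int calls = 0;
        for (int i = 0; i < outerRows; i++) {
            Iterator<String> right = rightRows.iterator(); // reopen right side per outer row
            while (true) {
                calls++;                      // one request for the next right-side row
                if (!right.hasNext()) {
                    break;                    // right side is exhausted
                }
                right.next();                 // a joined row would be emitted here
                if (rightIsOneRow) {
                    break;                    // known single row: skip the extra probe
                }
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> right = List.of("r");
        System.out.println(rightNextCalls(3, right, false));
        System.out.println(rightNextCalls(3, right, true));
    }
}
```

With three outer rows and a single right-side row, the model makes 6 requests without the one-row knowledge and 3 with it.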

generate

public void generate(ActivationClassBuilder acb,
                     MethodBuilder mb)
              throws StandardException
Generate the sort result set operating over the source result set. Adds distinct aggregates to the sort if necessary.

Overrides:
generate in class QueryTreeNode
Parameters:
acb - The ActivationClassBuilder for the class being built
mb - The method for the generated code to go into
Throws:
StandardException - Thrown on error

genScalarAggregateResultSet

private void genScalarAggregateResultSet(ActivationClassBuilder acb,
                                         MethodBuilder mb)
Generate the code to evaluate scalar aggregates.


genGroupedAggregateResultSet

private void genGroupedAggregateResultSet(ActivationClassBuilder acb,
                                          MethodBuilder mb)
                                   throws StandardException
Generate the code to evaluate grouped aggregates.

Throws:
StandardException

getColumnReference

private ResultColumn getColumnReference(ResultColumn targetRC,
                                        DataDictionary dd)
                                 throws StandardException
Method for creating a new result column referencing the one passed in.

Parameters:
targetRC - the source
dd -
Returns:
the new result column
Throws:
StandardException - on error

considerPostOptimizeOptimizations

void considerPostOptimizeOptimizations(boolean selectHasPredicates)
                                 throws StandardException
Consider any optimizations after the optimizer has chosen a plan. Optimizations include:
o min optimization for scalar aggregates
o max optimization for scalar aggregates

Parameters:
selectHasPredicates - true if SELECT containing this vector/scalar aggregate has a restriction
Throws:
StandardException - on error
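
This page does not say how the min and max optimizations for scalar aggregates work. One common idea, assumed here purely for illustration, is that when the input is already ordered on the aggregated column, MIN can be read from the first row and MAX from the last row instead of examining every row. MinMaxSketch and its methods are hypothetical and are not Derby code.

```java
import java.util.List;
import java.util.Optional;

// Illustrative only: contrasts reading MIN/MAX from an ordered input
// with a full scan. Names are hypothetical, not Derby APIs.
final class MinMaxSketch {
    // Assumes the list is sorted ascending; reads only the first element.
    static Optional<Integer> minFromOrdered(List<Integer> ascending) {
        return ascending.isEmpty() ? Optional.empty() : Optional.of(ascending.get(0));
    }

    // Assumes the list is sorted ascending; reads only the last element.
    static Optional<Integer> maxFromOrdered(List<Integer> ascending) {
        return ascending.isEmpty()
            ? Optional.empty()
            : Optional.of(ascending.get(ascending.size() - 1));
    }

    // Baseline: examines every value, with no ordering assumption.
    static Optional<Integer> minByFullScan(List<Integer> values) {
        return values.stream().min(Integer::compare);
    }

    public static void main(String[] args) {
        List<Integer> ordered = List.of(1, 4, 9);
        System.out.println(minFromOrdered(ordered));
        System.out.println(maxFromOrdered(ordered));
        System.out.println(minByFullScan(List.of(4, 1, 9)));
    }
}
```

The empty Optional mirrors the fact that a scalar MIN or MAX over no rows has no value to report.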

Built on Thu 2010-12-23 20:49:13+0000, from revision ???

Apache Derby V10.6 Internals - Copyright © 2004,2007 The Apache Software Foundation. All Rights Reserved.