Method and apparatus for optimizing the processing of join queries based on join cardinality. Embodiments implement the methods in query optimizers in relational database management systems. A good join order for a multiple join query is found with a metric that compares the relative merits of candidate join orders as a whole. Embodiments estimate the join selectivity of foreign key--foreign key joins, where both participating tables are foreign keys with respect to a primary or unique key of one primary table. A graph representation of a query is processed to estimate the join cardinality of an arbitrarily large number of filters and joins, including any combination of primary key--foreign key joins and foreign key--foreign key joins.
Multiple initial orderings of tables are used a multiple starting point in a cost-base, cut-off search for a good ordering of tables in processing a database query that specifies multiple join operations. Multiple heuristics may be used to generate the multiple initial orderings of the tables. The database query is executed with the good ordering of tables.
A robust way is described for optimizing complex data base queries while retaining the optimization speed of heuristic methods. The heuristic join-sequencing algorithm is modified to permit any of, or a combination of: (1) multiple passes of the heuristic algorithm, each with a different metric, producing multiple plans; (2) complex combinations of the criteria by which such heuristics make their choices; and/or (3) backtracking to consider alternatives to any particular decision in the sequence.
A database query optimizer constructs a graph comprising nodes, relations, and expressions. The query optimizer then constructs execution plans for sub-parts of the graph. The combination of execution plans make up the overall execution plan for the query. The execution plan information is appended to the graph itself, allowing changing an execution plan in one portion of the graph without necessarily changing execution plans in other portions of the graph. By representing a query using the graph of the preferred embodiments that includes execution plan information, the query optimizer is able to evaluate the execution plans of different options quickly and efficiently, thereby enhancing the performance of the query optimizer.
Provided are a techniques for processing a query including semi-joins. At execution time, a next semi-join is selected from the semi-joins for execution in a current round of semi-join executions. A reporting threshold is determined that indicates a number of record-identifiers to be retrieved for the determined semi-join. The selected semi-join is executed until the determined number of record identifiers are retrieved.