Query optimization for parallel machines must consider the machine architecture, the available processor and memory resources, and the different types of parallelism, which makes the search space much larger than in the sequential case. Our aim in this paper is to find a plan that executes an individual query as fast as possible, so the appropriate objective is to minimize parallel execution time. This creates a circular dependence: a plan tree is needed for effective resource assignment, resource assignment is needed to estimate the parallel execution time, and that estimate is needed for the cost-based search for a good plan tree. We propose a new search heuristic that breaks the cycle by constructing the plan tree layer by layer in a bottom-up manner. To select the nodes at the next level, lower and upper bounds on the execution time of plans consistent with the decisions made so far are estimated and used to guide the search. We also propose a query plan representation that captures intra- and inter-operator parallelism, pipelining, and processor and memory assignment, together with a new approach to estimating the parallel execution time of a plan that takes the sum and the max over operators working sequentially and in parallel, respectively. Results obtained from a prototype optimizer are presented.
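
To illustrate the idea of estimating parallel execution time by summing over operators that run sequentially and taking the maximum over operators that run in parallel, a minimal Python sketch is given here. It is not the estimator of the paper: the classes `Op`, `Seq`, and `Par`, the function `estimate`, and the operator times are hypothetical, and the sketch assumes that parallel branches use disjoint resources and ignores pipelining and resource assignment.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Op:
    """Leaf node: one operator with an assumed stand-alone execution time."""
    name: str
    time: float


@dataclass
class Seq:
    """Children execute one after another."""
    children: List["Plan"]


@dataclass
class Par:
    """Children execute concurrently (assumed: on disjoint resources)."""
    children: List["Plan"]


Plan = Union[Op, Seq, Par]


def estimate(plan: Plan) -> float:
    """Estimate execution time: sum over sequential parts, max over parallel parts."""
    if isinstance(plan, Op):
        return plan.time
    if isinstance(plan, Seq):
        return sum(estimate(child) for child in plan.children)
    if isinstance(plan, Par):
        # An empty parallel group contributes no time.
        return max((estimate(child) for child in plan.children), default=0.0)
    raise TypeError(f"unknown plan node: {plan!r}")


# Hypothetical plan: two scans run in parallel, then a join consumes both.
plan = Seq([Par([Op("scan_R", 4.0), Op("scan_S", 6.0)]), Op("join", 5.0)])
total = estimate(plan)
print(total)  # 11.0, i.e. max(4.0, 6.0) + 5.0
```

In this toy plan the slower scan determines the time of the parallel phase, and the join is added on top because it cannot start until both scans finish.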