The fork-join structure is one of the basic modeling structures for parallel processing. Although some existing algorithms can find an optimal schedule under certain conditions, they neglect to economize on processors and to minimize the total completion time. This paper presents a Task Duplication based Balance Scheduling (TDBS) algorithm, which generates an optimal schedule for a fork-join task graph with a complexity of O(vq + v log v), where v and q are the number of tasks and the number of processors, respectively. By considering the workload and the idle time slots of the processors already in use, the TDBS algorithm tries to assign tasks to those processors and maximize their utilization. Simulation results show that the TDBS algorithm achieves better speedup and efficiency than the algorithms it is compared against. The TDBS algorithm is therefore a viable option for practical high-performance applications.
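To make the two evaluation metrics concrete, the following is a minimal sketch that assumes the conventional definitions: speedup is the sequential execution time divided by the schedule length (makespan), and efficiency is the speedup divided by the number of processors used. The paper's exact definitions may differ. The class `ForkJoinGraph`, its field names, and the sample costs and makespan are hypothetical and chosen only for illustration; communication costs are ignored, and this is not the TDBS algorithm itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ForkJoinGraph:
    """Hypothetical fork-join task graph: one fork task, n parallel tasks, one join task."""
    fork_cost: float
    task_costs: List[float]  # computation cost of each parallel task
    join_cost: float


def sequential_time(graph: ForkJoinGraph) -> float:
    """Time to run every task once on a single processor."""
    return graph.fork_cost + sum(graph.task_costs) + graph.join_cost


def speedup(graph: ForkJoinGraph, makespan: float) -> float:
    """Conventional speedup: sequential time divided by the schedule length."""
    return sequential_time(graph) / makespan


def efficiency(graph: ForkJoinGraph, makespan: float, processors_used: int) -> float:
    """Conventional efficiency: speedup per processor actually used."""
    return speedup(graph, makespan) / processors_used


# Illustrative numbers only (not taken from the paper).
graph = ForkJoinGraph(fork_cost=1.0, task_costs=[2.0, 3.0, 4.0, 3.0], join_cost=1.0)
makespan = 8.0        # assumed length of some schedule on two processors
processors_used = 2

print(sequential_time(graph))                       # 14.0
print(speedup(graph, makespan))                     # 1.75
print(efficiency(graph, makespan, processors_used)) # 0.875
```

Under these definitions, a schedule that reaches the same makespan with fewer processors has the same speedup but higher efficiency, which is why economizing on processors matters even when the schedule length is already optimal.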