
Computers & Industrial Engineering 46 (2004) 679–696

www.elsevier.com/locate/dsw

Application of neural networks to heuristic scheduling algorithms


Derya Eren Akyol*
Department of Industrial Engineering, University of Dokuz Eylul, 35100 Bornova-Izmir, Turkey
Available online 2 July 2004

Abstract

This paper considers the use of artificial neural networks (ANNs) to model six different heuristic algorithms applied to the n-job, m-machine real flowshop scheduling problem with the objective of minimizing makespan. The objective is to obtain six ANN models to be used for the prediction of the completion times for each job processed on each machine, and to introduce the fuzziness of scheduling information into flowshop scheduling. Fuzzy membership functions are generated for completion, job waiting and machine idle times, and different methods are proposed to obtain the fuzzy parameters. To model the functional relation between the input and output variables, multilayered feedforward networks (MFNs) trained with the error backpropagation learning rule are used. The trained network is able to apply the learnt relationship to new problems. In this paper, an implementation alternative to the existing heuristic algorithms is provided. Once the network is trained adequately, it can provide an outcome (solution) faster than conventional iterative methods through its generalizing property. The results obtained from the study can be extended to solve scheduling problems in the area of manufacturing.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Artificial neural networks; Multilayered perceptron; Heuristic scheduling; Flowshop scheduling problems; Fuzzy membership functions

1. Introduction

The flowshop scheduling problem is considered one of the general production scheduling problems, in which n different jobs must be processed by m machines in the same order. The problem can be viewed as finding a scheme for allocating tasks to a limited number of competing resources, with the objective of satisfying constraints and optimizing performance criteria. Much of the research literature addresses methods of minimizing performance measures such as makespan. Makespan minimization, within the general flowshop scheduling domain, provides a useful area for analysis because it is an important model in scheduling theory and it is usually very difficult to find its optimal solution (Jain & Meeran, 2002).
* Tel.: +90-232-3881047; fax: +90-232-3887864. E-mail address: derya.eren@deu.edu.tr (D.E. Akyol).
0360-8352/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.cie.2004.05.005


During the last 40 years, most of the work has been devoted to the permutation flowshop problem. In the permutation flowshop, n different jobs have to be processed on m machines. Each job has one operation on each machine and all jobs follow the same ordering sequence on each machine. At any time, each machine can process at most one job. Preemption is not allowed. The objective is to find a permutation of jobs that minimizes the maximum completion time, or makespan. This problem, denoted by n/m/P/Cmax, is an NP-complete combinatorial optimization problem. Complete enumeration, integer programming and branch-and-bound techniques can be used to find the optimal sequences for small-size problems, but they do not provide efficient solutions for large-size problems. In view of the combinatorial complexity and time constraints, most of the large problems can be solved only by heuristic methods (Lee & Shaw, 2000). Though efficient heuristics cannot guarantee optimal solutions, they provide approximate solutions almost as good as the optimal ones (Ho & Chang, 1991). In recent years, technological advancements in hardware and software have encouraged new application tools such as neural networks to be applied to combinatorially exploding NP-hard problems (Jain & Meeran, 1998). They have emerged as efficient approaches in a variety of engineering applications where problems are difficult to formulate or awkwardly defined. They are computational structures that implement simplified models of biological processes and are preferred for their robustness, massive parallelism and ability to learn. They have proven to be most useful for complicated problems that are difficult to solve with conventional methods. Their advantage lies in their resilience against distortions in the input data and their learning capabilities. With these learning capabilities, they avoid the need to develop a mathematical model or to acquire the appropriate knowledge to solve a task.
The ability to map and solve a number of problems motivated the proposal of neural networks as a highly parallel model for general-purpose computing. As a result, they have been applied to solving scheduling and various combinatorial optimization problems. Finding the relationship between the data (i.e. processing times, due dates, etc.) and schedules, determining the optimal sequence for the jobs to be processed, and identifying the best dispatching strategies (i.e. scheduling rules) are some of the application areas of neural networks in the scheduling literature (Sabuncuoglu & Gurgun, 1996). Sabuncuoglu (1998) presented a detailed review of the literature in the area of scheduling, and the study of Smith (1999) reviews the research on the use of NNs in combinatorial optimization. Neural networks, as learning tools, have demonstrated their ability to capture the general relationship between variables that are difficult or impossible to relate to each other analytically, by learning, recalling and generalizing from training patterns as data (Shiue & Su, 2002). In other words, they are universal function approximators and are therefore attractive for automatically learning the (nonlinear) functional relation between the input and output variables (Raaymakers & Weijters, 2003). In this study, a scheduling problem in a real permutation flowshop environment is considered. Using the information of the production orders for 1 month and the global operation recipe, the best sequences of five different products are found by six different heuristic algorithms. For each sequence found by the six heuristic algorithms, the completion time of each job on each machine, the job waiting times and the machine idle times are computed and read into the system. To model these six heuristic scheduling algorithms, one of the most popular neural network architectures, the multilayered perceptron (MLP), is used.
In order to develop a neural network, the Backpack Neural Network System Version 4.0 (by Z Solutions) is used and the necessary steps are followed. For each of the heuristic algorithms, the neural network model is used for estimating the makespan of five jobs processed on 43 machines. In this way, we present a neural-network-based implementation alternative


to the existing heuristic algorithms. The proposed method is simple and straightforward. An MLP neural network is trained on data from a real-world problem to learn the functional relationship between the input and output variables. After the training process is completed, the MLP can provide outputs of adequate accuracy over a limited range of input conditions, with the advantage of requiring far less computation than other modeling methods (Feng, Li, Cen, & Huang, 2003). In other words, the neural network's computational speed permits fast solutions to problems not seen previously by the network (El-Bouri, Balakrishnan, & Pooplewell, 2000).

This paper is organized as follows. Section 2 presents a mathematical formulation of the permutation flowshop scheduling problem with the makespan objective. In Section 3, we give information about the heuristic procedures considered in this study. The steps of developing a backpropagation network are explained in Section 4. Section 5 includes the experimental results. Finally, Section 6 provides conclusions.

2. The permutation flowshop scheduling problem with the makespan criterion

In a permutation flowshop scheduling problem, there are a set of jobs I = {1, 2, 3, …, n} and a set of machines J = {1, 2, 3, …, m}. Each of the n jobs has to be processed on the machines 1, 2, …, m in the order given by the indexing of the machines. Thus job i, i ∈ I, consists of a sequence of m operations, each of which must be processed on machine j for an uninterrupted processing time p_ij. Each machine j, j ∈ J, can process at most one job at a time, each job can be processed on at most one machine at a time, and once an operation is started, it must be completed without interruption (Baker, 1974). Let C_ij be the completion time of job i on machine j. The makespan C_max is the maximum completion time among all jobs.
In the permutation flowshop problem with the makespan objective, the goal is to find a permutation schedule that minimizes the makespan C_max, where a permutation schedule for a flowshop instance is a schedule in which each machine processes the jobs in the same order. In order to provide a formal mathematical model of the problem, we apply the notion of a job processing order represented by a permutation π = (π_1, π_2, …, π_n) on the set I, where π_i denotes the element of I which is in position i in π (Nowicki, 1999). Then we calculate the completion time C(π_i, j) of the partial schedule up to job π_i on machine j as follows:

C(π_1, 1) = p(π_1, 1)                                                        (1)
C(π_i, 1) = C(π_{i−1}, 1) + p(π_i, 1)      for i = 2, …, n                   (2)
C(π_1, j) = C(π_1, j−1) + p(π_1, j)        for j = 2, …, m                   (3)
C(π_i, j) = max{C(π_{i−1}, j), C(π_i, j−1)} + p(π_i, j)
                                           for i = 2, …, n; j = 2, …, m      (4)

Finally, we define the makespan as

C_max(π) = C(π_n, m)                                                         (5)

The permutation flowshop scheduling problem is then to find a permutation π* in the set of all permutations Π such that (Rajendran & Chaudri, 1991)

C_max(π*) ≤ C_max(π)      for all π ∈ Π                                      (6)
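As an aside, the completion-time recurrences above lend themselves to a direct implementation. The following minimal Python sketch assumes the processing times are stored as a 2-D list p[i][j] (job i on machine j) and a permutation is a list of job indices; the names are illustrative, not from the paper.

```python
def makespan(p, perm):
    """C_max of a permutation schedule: p[i][j] is the processing time
    of job i on machine j, perm is the job sequence."""
    n, m = len(perm), len(p[0])
    # C[i][j] = completion time of the i-th sequenced job on machine j
    C = [[0.0] * m for _ in range(n)]
    for i, job in enumerate(perm):
        for j in range(m):
            machine_free = C[i - 1][j] if i > 0 else 0.0  # machine j releases prior job
            job_ready = C[i][j - 1] if j > 0 else 0.0     # job leaves machine j - 1
            C[i][j] = max(machine_free, job_ready) + p[job][j]
    return C[n - 1][m - 1]
```

For the 4-job, 5-machine example used later in Section 3.1.2, this reproduces C_max = 57 for sequence 3-4-1-2 and C_max = 54 for sequence 4-3-1-2.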


In this study, we consider a flowshop consisting of m machines, each with unlimited buffer space. There is no additional restriction that the processing of each job has to be continuous, so there may be waiting times between the processing of any consecutive tasks of a job. The main assumptions for this problem are:

- a set of n multiple-operation jobs is available for processing at time zero (each job requires m operations and each operation requires a different machine);
- the set-up times for the operations are sequence-independent and are included in the processing times;
- the m different machines are continuously available;
- individual operations are not preemptable.

3. Heuristics

Six flowshop heuristics are considered in this study. Explanations of four of these methods can be found in Aksoy (1980), Campbell, Dudek, and Smith (1970), Koulamas (1998) and Nawaz, Enscore, and Ham (1983). The other two heuristic algorithms are new and are presented below.

3.1. Aslan's frequency algorithm

This algorithm was developed by Aslan (1999) with the objective of minimizing makespan and works as follows:

Step 1 Take the operation times of each job on each machine and generate an n × m dimensional problem.
Step 2 Considering all combinations of the jobs, produce n(n − 1) ordered pairs (two by two).
Step 3 Calculate the partial makespan of each pair by loading the jobs on the machines. Pair (i, j) and pair (j, i) are compared. In the pair with the smaller completion time, the first job takes the frequency value of 1 and the other job 0. By these comparisons, n(n − 1)/2 frequency values are obtained. If the completion times of the two pairs are equal, both jobs take the frequency value of 1.
Step 4 Sum the frequency values of all jobs and sort them in decreasing order. (This method sequences the jobs in decreasing frequency-value order.)
Step 5 If jobs have equal frequency values, consider the alternative sequences; the sequence which results in the smaller total completion time is the final sequence.
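Steps 1–4 above can be sketched in Python as follows (a minimal sketch: the improvement phase and the tie-breaking over alternative sequences of Step 5 are omitted, and the helper names are illustrative):

```python
from itertools import combinations

def pair_makespan(p, a, b):
    """Partial makespan of scheduling job a immediately before job b."""
    c_a = c_b = 0.0
    for j in range(len(p[0])):
        c_a += p[a][j]                  # the first job is never blocked
        c_b = max(c_a, c_b) + p[b][j]   # the second waits for machine or predecessor
    return c_b

def frequency_order(p):
    """Aslan's frequency values (Steps 1-4): jobs sorted by decreasing frequency."""
    n = len(p)
    freq = [0] * n
    for i, j in combinations(range(n), 2):
        c_ij, c_ji = pair_makespan(p, i, j), pair_makespan(p, j, i)
        if c_ij < c_ji:
            freq[i] += 1
        elif c_ji < c_ij:
            freq[j] += 1
        else:                           # equal completion times: both jobs score 1
            freq[i] += 1
            freq[j] += 1
    return sorted(range(n), key=lambda k: -freq[k]), freq
```

On the data of Table 1 below, this yields the frequencies 1, 1, 3 and 2 for jobs 1–4, matching the worked example.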


3.1.1. Improvement phase

The job pairs which have equal total completion times are evaluated and the dominant pairs are found. The pair which has less total machine idleness is considered dominant. A frequency value of 1 is added to the first job of the dominant pair, a frequency value of 1 is subtracted from the other job, and a new sequence is generated.

3.1.2. A numerical example of Aslan's frequency (dual sequencing) algorithm

Consider a flowshop with 5 machines. There are 4 jobs to be scheduled and their processing times are as shown in Table 1. We compare the calculated total completion times for pairs (i, j) and (j, i) for all combinations of the jobs:

Pairs    Completion times    Pairs    Completion times
(1,2)    41                  (2,1)    42
(1,3)    46                  (3,1)    42
(1,4)    41                  (4,1)    40
(2,3)    41                  (3,2)    40
(2,4)    39                  (4,2)    39
(3,4)    38                  (4,3)    41

First, pair (1,2) and pair (2,1) are compared. Since pair (1,2) results in the smaller partial makespan, job 1 takes the frequency value of 1 and job 2 takes the frequency value of 0. By executing these n(n − 1)/2 comparisons, we obtain the frequency values for all jobs. As seen in Table 2, we assign the frequency value of 1 to both jobs 2 and 4 because pair (2,4) and pair (4,2) both have total completion times of 39. Then the frequency values of all jobs are summed:

Frequency for job 1: 1
Frequency for job 2: 1
Frequency for job 3: 3
Frequency for job 4: 2

Table 1
Processing times for the 4-job, 5-machine problem

Job    M1    M2    M3    M4    M5
J1     5     9     8     10    1
J2     9     3     10    1     8
J3     9     4     5     8     6
J4     4     8     8     7     2


Table 2
Frequency values of each job at each comparison

Comparison    J1    J2    J3    J4
1             1     0     –     –
2             0     –     1     –
3             0     –     –     1
4             –     0     1     –
5             –     1     –     1
6             –     –     1     0

The frequency values of all jobs are sorted in decreasing order. The method yields the sequence 3-4-2-1 or 3-4-1-2. The makespans of these two sequences are compared and it is found that sequence 3-4-1-2 with Cmax = 57 is better than sequence 3-4-2-1 with Cmax = 58.

3.1.3. Improvement phase

The job pairs (2,4) and (4,2) have equal total completion times, so the dominant pair is investigated. Because pair (2,4) and pair (4,2) both have total completion times of 39, we compare these two pairs and decide which one is dominant. For each pair, except for the first machine, we calculate how long each machine waits for the preceding machine. In Fig. 1, the numbers on the upper side of each circle indicate the starting time of each job on each machine and the numbers below the circle indicate the execution time of each job on each machine. For the above example, for pair (2,4), in order to start operation, the fourth machine waits for the third machine 7 min and the fifth machine waits for the fourth machine 6 min, so for pair (2,4) the total delayed time is 6 + 7 = 13 min. For pair (4,2), the fourth machine waits for the third machine 3 min and the fifth machine waits for the fourth machine 2 min, so the total delayed time is 2 + 3 = 5 min. As seen from the results, pair (4,2)

Fig. 1. Comparison of pair (2,4) and pair (4,2).


is dominant. So we add 1 to the frequency of the fourth job and subtract 1 from the frequency of the second job. The frequency of the fourth job becomes 3 and the frequency of the second job becomes 0. Since the second job has the smallest frequency value, it takes the last place in the sequence. Reviewing the frequency values of all jobs, we see that the third and the fourth jobs have equal frequencies, so we should consider two alternative sequences, 3-4-1-2 and 4-3-1-2. Sequence 3-4-1-2 gives a total completion time of 57 and sequence 4-3-1-2 gives a total completion time of 54, so the final sequence is 4-3-1-2.

3.2. Aslan's point algorithm

This algorithm was also developed by Aslan (1999) with the objective of minimizing makespan and works as follows:

Step 1 Compare pair (i, j) and pair (j, i). If pair (i, j) results in a smaller completion time than pair (j, i), assign a positive point to job i equal to the difference between makespan (j, i) and makespan (i, j), and assign a negative point of the same magnitude to job j.
Step 2 For all jobs, sum the points and sequence the jobs in decreasing point order.

3.2.1. A numerical example of Aslan's point algorithm

Considering the same problem as in Table 1, first of all we compare pair (1,2) and pair (2,1). Pair (1,2) gives the smaller total completion time, so we assign 42 − 41 = 1 positive point to job 1 and −1 point to job 2. Then we compare pair (1,3) and pair (3,1). Pair (3,1) gives the smaller completion time, so we assign 46 − 42 = 4 points to job 3 and −4 points to job 1. Next, comparing pair (1,4) and pair (4,1), we give 41 − 40 = 1 point to job 4 and −1 point to job 1. We repeat this procedure for all the pairs; the point values obtained are −4 for the first job, −2 for the second job, 8 for the third job and −2 for the fourth job. The final sequence is 3-4-2-1 or 3-2-4-1 with a makespan of 58.
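The point algorithm can be sketched in the same style (a minimal Python sketch; pair_makespan repeats the two-job partial-makespan computation, and the names are illustrative):

```python
from itertools import combinations

def pair_makespan(p, a, b):
    """Partial makespan of scheduling job a immediately before job b."""
    c_a = c_b = 0.0
    for j in range(len(p[0])):
        c_a += p[a][j]                  # the first job is never blocked
        c_b = max(c_a, c_b) + p[b][j]   # the second waits for machine or predecessor
    return c_b

def point_order(p):
    """Aslan's point algorithm: jobs sorted by decreasing total points."""
    n = len(p)
    points = [0.0] * n
    for i, j in combinations(range(n), 2):
        diff = pair_makespan(p, j, i) - pair_makespan(p, i, j)
        points[i] += diff   # positive when scheduling i before j is better
        points[j] -= diff
    return sorted(range(n), key=lambda k: -points[k]), points
```

On the Table 1 data this reproduces the point values −4, −2, 8 and −2 and places job 3 first.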
The numerical examples presented for these two heuristic algorithms are given only for demonstration purposes; different experiments need to be performed to test the effectiveness of these new heuristics, and they should be compared with other heuristics in the literature. More information can be found in Aslan (1999).

3.2.2. The solution of the 5-job, 43-machine problem by the heuristic methods

In this study, a scheduling problem in a real permutation flowshop environment is considered. In this environment, five different items are produced in batches by passing through 43 serial machines in the Axle Housing Workshop of a manufacturing plant which is a supplier of automotive axles and axle components. According to the production orders for 1 month and the global operation recipe, the best sequences of the five different products (big housing, light housing, small housing, additional axle, trailer axle) are found by the six different heuristic algorithms and are shown in Table 3. For each sequence found by the six heuristic algorithms, the completion time of each job on each machine, the job waiting times and the machine idle times are computed.


Table 3
The best sequence and makespan for each heuristic algorithm

Algorithm            Sequence     Makespan
CDS                  3-2-1-4-5    32,759.617
NEH                  5-2-1-3-4    31,952.35
Koulamas's           2-1-4-3-5    33,327.468
Aslan's frequency    3-5-2-4-1    32,754.271
Aslan's point        2-3-5-4-1    32,242.017
Aksoy's              3-2-4-1-5    34,101.504

4. Developing a neural network by using the Backpack neural network system

To develop a neural network for this problem, a neural network software tool which uses the backpropagation training algorithm is employed. The backpropagation network is one of the most widely used network architectures because of its ability to learn complex mappings and its strong foundation. Backpropagation is a systematic way of training a multilayer artificial neural network in a supervised manner. It involves two phases of computation: a training (feed-forward) phase and a backward (recall) phase. In the training phase, the network learns the relationship between inputs and outputs by applying input vectors to the nodes of the network. In the recall phase, the network predicts outputs when exposed to unseen examples or new inputs (Sabuncuoglu, 1998).

As a first step, the data are read into the system. Job and machine numbers, the processing times of each job on each machine, the job waiting times and the machine idle times are created in Excel and saved as a dBase file, which is the native database format used in the Backpack Neural Network System. To preprocess the data, one-of-N transformations on jobs and machines are created, and a method is proposed to determine the fuzzy completion times, job waiting times and machine idle times. Triangular fuzzy numbers are used to represent the fuzzy completion times, while job waiting and machine idle times are considered as triangular or trapezoidal fuzzy numbers. The input variables are scaled to lie in the range 0–1 and the output variables in the range 0.2–0.8. This approach reduces the training time by eliminating the possibility of reaching the saturation regions of the sigmoid transfer function during training.

4.1. Determining the fuzzy membership function for completion times

In job sequencing for a flowshop, processing times are frequently not known exactly and only estimated intervals are given.
Fuzzy numbers are ideally suited to represent these intervals (McCahon & Lee, 1992). Processing times are formed of operation times. In most situations, operation times are calculated with a 5 or 10% tolerance via time studies. So, in this study, we assume that there can be deviations in the completion times, which are formed of the processing times. Instead of representing the job processing times by fuzzy numbers, the completion times are represented through the use of triangular fuzzy numbers. The graphical representation of a triangular fuzzy number is shown in Fig. 2. For each heuristic method, each job's completion time on the last machine is considered the most likely time, and by adding and subtracting a 5% tolerance value from this time, the pessimistic


Fig. 2. The graphical representation of a triangular fuzzy number.

time and the optimistic time for carrying out the last operation of each job are found. These three values become the parameter set of the fuzzy set defined for each job. According to these parameters, the membership functions are developed.

4.2. Determining the fuzzy membership functions for job waiting and machine idle times

In this study, trapezoidal fuzzy numbers are employed to represent the fuzzy job waiting and fuzzy machine idle times. A trapezoidal fuzzy number is represented by (a, b, c, d). The membership function is 1 from b to c and becomes zero at the two end points, a and d. The graphical representation of a trapezoidal fuzzy number is illustrated in Fig. 3, where m(x) is the membership function and x is either the job waiting or the machine idle time (McCahon & Lee, 1992). For example, a manager may say that the job waiting or machine idle time for job A is generally b to c minutes; but due to other factors which cannot be controlled or predicted, it may occasionally be as high as d minutes or as low as a minutes. We propose a new method to represent the job waiting times and machine idle times. For each job, we sum the job waiting times, and for each machine, we sum the machine idle times, in a cumulative manner. We assume that a is the minimum job waiting or machine idle time and d is the total job waiting or total machine idle time, which is the maximum value; b is the first value of the cumulative total and c is the cumulative value just before the final cumulative total. If only one machine waits or just one job has a waiting time, we use triangular membership functions with parameters a = 0, b = the machine idle or job waiting time, and c = 0. The output variable for the model is chosen as the completion time. The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and hidden units.
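The membership functions of Sections 4.1 and 4.2 can be sketched as follows. The ±5% tolerance around the crisp completion time follows the text; the function names are illustrative, not from the paper.

```python
def completion_time_fuzzy(c_last, tol=0.05):
    """Triangular parameters (optimistic, most likely, pessimistic) from the
    completion time of a job's last operation, using a +/-5% tolerance."""
    return c_last * (1 - tol), c_last, c_last * (1 + tol)

def triangular(x, a, b, c):
    """Membership of x in the triangular fuzzy number (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Membership of x in the trapezoidal fuzzy number (a, b, c, d):
    1 between b and c, falling to 0 at the end points a and d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)
```

The most likely completion time then has membership 1, and values outside the ±5% band have membership 0.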
A three-layer backpropagation network is developed for each heuristic algorithm. Three-layered networks can be taught to perform a particular task or to learn a particular mapping as follows: first, the network is presented with training examples, which consist of patterns of activities for the input units together with the desired activity for the output unit.

Fig. 3. Trapezoidal fuzzy number.


Then, a determination is made of how close the actual output of the neural network is to the desired output. The difference between the desired and the actual output is used as an error signal to adjust the connection weights. In this study, to perform the neural network analysis, the data are split into three datasets: the training set, the test set and the validation set. The training set contains the observations used to train the network. The test set contains the observations used to test the neural network and to determine when training should be stopped. The validation set contains observations that the neural network has not seen; this data set is applied to the trained neural network to assess its performance under real-world conditions. The Split Data function of the software package develops uniform distributions of observations for the train and test data sets. This allows the training (using the train and test sets) to ensure that all features of the distribution are learned. The validation set follows the same distribution as the base data set. To specify the data sets, the output variable, completion time, is selected to divide the base data set into groups. For each heuristic, the Develop Data Sets function pulls stratified samples from the base data sets, and the number of groups (the number of strata) used to stratify the data sets is chosen between 3 and 6. For each number of strata, the percentages of the controlling group (the group or bin with the least observations; this group sets the number of observations to be randomly selected from the other groups in order to achieve the desired distributions) assigned to the train, test and validation data sets are chosen as 60% for the train, 20% for the test and 20% for the validation data set. After specifying the data sets, the training process can begin.
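The stratified 60/20/20 split just described can be sketched as follows, assuming equal-width bins on the output variable and a controlling group that fixes how many observations are drawn from every bin; the binning rule, seed and names are illustrative stand-ins for Backpack's internal procedure:

```python
import random

def stratified_split(targets, n_groups=4, seed=0):
    """Split observation indices 60/20/20 into train/test/validate,
    stratified on the target (completion-time) values."""
    rng = random.Random(seed)
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_groups or 1.0          # guard against identical targets
    bins = {}
    for idx, y in enumerate(targets):
        g = min(int((y - lo) / width), n_groups - 1)
        bins.setdefault(g, []).append(idx)
    n_take = min(len(v) for v in bins.values())  # the controlling group's size
    train, test, validate = [], [], []
    for members in bins.values():
        chosen = rng.sample(members, n_take)     # equal random draw from every bin
        n_tr, n_te = int(0.6 * n_take), int(0.2 * n_take)
        train += chosen[:n_tr]
        test += chosen[n_tr:n_tr + n_te]
        validate += chosen[n_tr + n_te:]
    return train, test, validate
```

Because every bin contributes the same number of observations, each of the three sets preserves the target distribution of the base data.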
During the training of the neural network, the data in the training set are thoroughly examined and the network learns, or generalizes, the relationship between the dependent variable (the variable being predicted) and the independent variables (the input variables), so that the trained network can be used to give estimates or answers for new data cases. At intervals, under the control of the network development system (program), the partially trained network is presented with the independent variables in the test set and makes predictions of the dependent variable, which is known. The goodness of fit of the predictions is then measured. Based upon the fit, modifications to the network might be made and training continued using the training set. For each heuristic and each number of strata, different data sets generated by splitting the data are used. The numbers of train and test observations for each heuristic for different numbers of groups are given in Tables 4–9. These tables show the values belonging to the networks resulting in the smallest root mean square errors.
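This train/test control loop can be sketched as follows; train_one_epoch, test_rmse and check_every are placeholder assumptions, with max_failed echoing the "failed tries" reported in Tables 4–9:

```python
def train_with_early_stopping(train_one_epoch, test_rmse,
                              max_epochs=65000, check_every=200, max_failed=30):
    """Run training epochs, scoring on the test set at intervals; stop at
    max_epochs or after max_failed checks without test-RMSE improvement."""
    best_rmse, failed, epoch = float("inf"), 0, 0
    while epoch < max_epochs and failed < max_failed:
        train_one_epoch()                # one pass over the training observations
        epoch += 1
        if epoch % check_every == 0:
            rmse = test_rmse()           # fit of the partially trained network
            if rmse < best_rmse:
                best_rmse, failed = rmse, 0
            else:
                failed += 1              # one more try without improvement
    return epoch, best_rmse
```

The returned pair corresponds to the "No. of epochs" and "Low test RMSE" columns reported for each configuration.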
Table 4
Measurement of fit for the test set for the CDS algorithm

Heuristic    No. of groups    Train obs. read    Test obs. read    Test set RMSE    Low test RMSE    No. of epochs    No. of failed tries    Epoch size    No. of hid. layer nodes
CDS          6                57                 23                0.0966           0.0966           13,000           30                     12            6
CDS          5                103                27                0.0379           0.0377           14,200           30                     12            10
CDS          4                71                 32                0.0671           0.0652           6800             30                     71            6
CDS          3                103                40                0.0681           0.0674           8400             30                     103           6

Table 5
Measurement of fit for the test set for the NEH algorithm

Heuristic    No. of groups    Train obs. read    Test obs. read    Test set RMSE    Low test RMSE    No. of epochs    No. of failed tries    Epoch size    No. of hid. layer nodes
NEH          6                48                 11                0.0298           0.0282           8000             30                     48            6
NEH          5                68                 18                0.0714           0.0714           62,000           30                     68            6
NEH          4                73                 16                0.0642           0.063            7600             30                     12            6
NEH          3                86                 29                0.0611           0.0606           9000             30                     80            6

Table 6
Measurement of fit for the test set for the Koulamas algorithm

Heuristic    No. of groups    Train obs. read    Test obs. read    Test set RMSE    Low test RMSE    No. of epochs    No. of failed tries    Epoch size    No. of hid. layer nodes
Koulamas     6                80                 31                0.0507           0.0502           15,000           30                     12            6
Koulamas     5                89                 21                0.0488           0.0410           6600             30                     89            6
Koulamas     4                94                 32                0.0445           0.0444           25,000           30                     94            6
Koulamas     3                114                37                0.0285           0.0269           9000             30                     114           6

Table 7
Measurement of fit for the test set for Aslan's frequency algorithm

Heuristic    No. of groups    Train obs. read    Test obs. read    Test set RMSE    Low test RMSE    No. of epochs    No. of failed tries    Epoch size    No. of hid. layer nodes
Frequency    6                59                 18                0.0532           0.0532           15,000           30                     15            6
Frequency    5                65                 16                0.0965           0.0936           6800             30                     12            6
Frequency    4                58                 16                0.0996           0.0738           6200             30                     58            6
Frequency    3                78                 26                0.0761           0.0760           18,200           30                     12            6

Table 8
Measurement of fit for the test set for Aslan's point algorithm

Heuristic    No. of groups    Train obs. read    Test obs. read    Test set RMSE    Low test RMSE    No. of epochs    No. of failed tries    Epoch size    No. of hid. layer nodes
Point        6                57                 23                0.0835           0.0834           10,800           30                     57            6
Point        5                72                 21                0.0618           0.0615           15,800           30                     12            6
Point        4                93                 33                0.0651           0.0608           8200             30                     93            6
Point        3                95                 32                0.0637           0.0619           7000             30                     95            6


Table 9
Measurement of fit for the test set for Aksoy's algorithm

Heuristic    No. of groups    Train obs. read    Test obs. read    Test set RMSE    Low test RMSE    No. of epochs    No. of failed tries    Epoch size    No. of hid. layer nodes
Aksoy        6                63                 21                0.1034           0.1034           36,600           30                     63            6
Aksoy        5                104                31                0.0369           0.0366           9800             30                     104           6
Aksoy        4                69                 23                0.0846           0.0836           6400             30                     69            6
Aksoy        3                106                41                0.0254           0.0254           65,000           3                      12            10

4.3. Network architecture

A backpropagation neural network is adopted in this study, in which signals are passed from the input layer to the output layer through a hidden layer, and learning is done by adjusting the connection weights with a gradient descent algorithm that backpropagates the error to previous layers. A difficult task with ANNs is choosing the architecture parameters of the network. Although a neural network has the potential to solve various scheduling problems, how to specify the values of the many parameters and weights of these networks remains a critical issue (Raaymakers & Weijters, 2003). At present, there is no established theoretical method to determine the optimal configuration of a network; most of the design parameters are application dependent and must be determined empirically. A feedforward ANN with a single hidden layer is employed in this work. There are no constraints on the number of hidden layers; a network can have only one, or as many hidden layers as selected. However, there is no evidence that a network with more hidden layers performs better (Dagli, 1994). Patterson (1996) indicated that a single hidden layer is sufficient for most applications (Shiue & Su, 2002). The need for more than one hidden layer is highly unlikely, because networks with additional layers require significantly longer training times, even though networks with larger hidden layers distribute the weights effectively over the layers and can provide better performance. In the light of this information, and with the aim of not increasing the training times, a single-hidden-layer network is chosen for implementation in this study. The number of hidden layer nodes is also among the most important considerations when solving problems using multilayered feedforward neural networks.
An insufficient number of hidden layer neurons generally results in the network's inability to solve a particular problem, while too many hidden layer neurons may result in a network with poor generalization performance and can lead to over-fitting (Liu, Chang, & Zhang, 2002). The number of hidden layer nodes for our problem is determined by a trial-and-error procedure, in which various architectures are constructed by changing the number of nodes in the hidden layer. The neural networks applied to model the different heuristic algorithms are trained with 4, 6, 10, 15, 20, 25 and 30 hidden nodes, with epoch sizes of 12 (the default value of the software), 15 and the entire set of training observations, and epoch sizes of 50 and 80 (for the models having more than 50 training observations), for different numbers of groups. All the network architectures are compared and, for each network modeling each heuristic algorithm with a different number of groups, the one with the minimum root mean square error (RMSE) is selected as the best for each group. Tables 4–9 show the epoch sizes and hidden layer node counts which result in the smallest RMS errors. The number of output nodes corresponds to the number of outputs in the model; since our study is a prediction problem, this is 1. Because we are dealing with a prediction problem,

the RMS error is selected as the training criterion. It is a quantitative measure reflecting the degree of learning that takes place in a network: as a network learns, its RMS error decreases. Every network is trained for up to 65,000 epochs, where each epoch is defined by the training observations presented to the network. Although the maximum number of epochs to execute before terminating the training session is defined as 65,000, training may end before the maximum is reached if the performance on the test set does not improve. Another important point is the initialization of the network's weights, which is done with random values in order to break the symmetry among the hidden nodes. In this study, the initial weights of the backpropagation network are randomly set between -0.2 and 0.2, and 215 observations are collected from the axle housing workshop, belonging to the previous month, for generating the database. One of the problems of the gradient descent rule, which is used as the learning procedure for backpropagation networks, is setting an appropriate learning rate (Shiue & Su, 2002). The learning rate determines the size of the steps in the search space to find the minimal training error (Raaymakers & Weijters, 2003). A small learning rate results in longer learning times, while a large learning rate causes oscillations during the training of the network. One efficient and commonly used procedure that allows a greater learning rate without causing divergent oscillations is the addition of a momentum term to the gradient descent method. As with the learning rate, if the momentum term is too large, the network will display chaotic learning behavior. Because learning time is not the issue in this study, both learning parameters are chosen relatively small at the beginning: the starting values for the learning rate and the momentum coefficient are both set to 0.2.
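The gradient descent update with a momentum term, together with the RMSE training criterion, can be sketched as follows. This is a generic illustration on a toy one-dimensional problem, not the actual training software used in the study.

```python
import numpy as np

def rmse(targets, outputs):
    """Root mean square error, the training criterion used in the study."""
    return np.sqrt(np.mean((targets - outputs) ** 2))

def momentum_step(w, grad, prev_delta, lr=0.2, momentum=0.2):
    """One gradient descent update with a momentum term:
    delta_w = -lr * grad + momentum * (previous delta_w)."""
    delta = -lr * grad + momentum * prev_delta
    return w + delta, delta

# toy example: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, prev_delta = 0.0, 0.0
for _ in range(200):
    grad = 2.0 * (w - 3.0)
    w, prev_delta = momentum_step(w, grad, prev_delta)
```

With both parameters set to 0.2, as in the study, the iteration converges smoothly; a larger momentum value would let prior steps dominate and could produce the chaotic behavior mentioned above.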
The increase in the learning rate is defined as 0.095, and the decrease in the learning rate is defined as 0.1. The increase is the amount added to the learning rate if the current weight change is in the same direction (same sign) as the prior weight change; the decrease is the percentage by which the learning rate is reduced if the current weight change is in the opposite direction (different sign) from the prior weight change. The increase in the momentum value is defined as 0.05 and the decrease in the momentum term is defined as 0.1.
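The sign-based learning-rate adaptation just described can be sketched as follows. The constants (0.095 additive increase, 10% multiplicative decrease) come from the text; the function itself is an illustrative simplification applied to a single weight change.

```python
def adapt_learning_rate(lr, current_delta, prior_delta,
                        lr_increase=0.095, lr_decrease=0.1):
    """Raise the learning rate additively when successive weight changes
    agree in sign; cut it by a fixed percentage when they disagree."""
    if current_delta * prior_delta > 0:      # same direction (same sign)
        return lr + lr_increase
    elif current_delta * prior_delta < 0:    # opposite direction
        return lr * (1.0 - lr_decrease)
    return lr                                # one of the deltas is zero

lr = 0.2
lr = adapt_learning_rate(lr, 0.5, 0.3)    # same sign: 0.2 + 0.095 = 0.295
lr = adapt_learning_rate(lr, -0.5, 0.3)   # opposite sign: 0.295 * 0.9 = 0.2655
```

An analogous rule with constants 0.05 and 0.1 would govern the momentum term.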

5. Experimental results

In order to evaluate the performance of the ANN model, the validation dataset is applied to the trained model and the prediction accuracy of the model is investigated by comparing the fit statistics for each heuristic with different numbers of strata. According to the values of the correlation coefficient at different significance levels, the fit statistics showed the validity of our model. After obtaining
Table 10
The summary of the results for each heuristic algorithm with the highest correlation coefficient

Heuristic  | No. of groups | Corr. coef. R2 | RMSE   | MAPE  | MAE
CDS        | 5             | 0.9882         | 0.0250 | 10.74 | 733.9157
NEH        | 3             | 0.9622         | 0.0446 | 25.15 | 1290.624
Koulamas   | 4             | 0.9863         | 0.0280 | 7.70  | 762.4662
Frequency  | 6             | 0.9339         | 0.0563 | 30.74 | 1705.223
Point      | 3             | 0.9635         | 0.0431 | 15.60 | 1168.647
Aksoy      | 3             | 0.9909         | 0.0224 | 9.00  | 609.6883

Fig. 4. Actual versus predicted completion times for CDS algorithm.

satisfactory performance results, it is decided to use the network to predict the completion times of each job on each machine for the next production period. Since the correlation coefficient is a measure of the linear relationship between two variables, for each heuristic the number of strata that gives the highest correlation coefficient is chosen as the best result. The summary of the results for each heuristic algorithm with the highest correlation coefficient is given in Table 10. After the determination of the best number of strata for each heuristic, resulting in the smallest root mean squared error, the predicted and actual completion times are compared. The fit statistics are obtained from the model, and fit graphs are drawn comparing the actual and the predicted completion times. The fit graphs are given in Figs. 4-9. The last values (makespans) of the predicted and actual completion times (belonging to the last machine) for each heuristic are given in Table 11. Although there are differences between actual and predicted completion times, as shown in Figs. 4-9, the results indicate that the models of the different heuristic scheduling algorithms predicted the completion times with good accuracy, with correlation coefficients between 0.9339 and 0.9909. The fact that all values are very close to unity indicates that the mapping was performed at a satisfactory level even when fuzzy

Fig. 5. Actual versus predicted completion times for NEH algorithm.

Fig. 6. Actual versus predicted completion times for Koulamas algorithm.

Fig. 7. Actual versus predicted completion times for Aslan's frequency algorithm.

Fig. 8. Actual versus predicted completion times for Aslan's point algorithm.

Fig. 9. Actual versus predicted completion times for Aksoy's algorithm.

Table 11
The actual and predicted makespans for each heuristic algorithm

Heuristic          | Actual     | Predicted  | Predicted/actual
CDS                | 32,024.998 | 32,578.713 | 1.01729
NEH                | 31,952.000 | 31,825.555 | 0.99604
Koulamas           | 33,327.000 | 34,139.152 | 1.02436
Aslan's frequency  | 32,753.998 | 32,750     | 0.99987
Aslan's point      | 32,242.000 | 31,583.658 | 0.97958
Aksoy              | 32,435.000 | 31,999.766 | 0.98658

information was present. If we calculate the ratio between the predicted and the actual makespans, we can see that all of the algorithms are successful in predicting the makespans. The decision maker can choose one of these algorithms, but to specify the due dates in agreement with the customer, taking the average of these ratios, which is equal to 1.00062, is offered as an alternative way to find the makespans of the jobs at hand. In this way, a better production plan that meets the customers' due dates can be made. It should be noted that identifying an appropriate ANN model has a strong impact on the performance of the networks in predicting the makespans. For example, the effect of different epoch sizes is shown in Tables 12 and 13. As shown in Table 12, training the network modeling Aslan's frequency algorithm with an epoch size of 59 gives better results than training the network with an epoch size of 12,
Table 12
Effect of epoch size on the performance of the network modeling Aslan's frequency algorithm

Epoch size | No. of hidden nodes | Train obs. read | Test obs. read | Test set RMSE | Corr. coef. | Predicted makespan
12         | 6                   | 59              | 18             | 0.0596        | 0.8843      | 22,766.303
59         | 6                   | 59              | 18             | 0.0534        | 0.8902      | 29,919.959
15         | 6                   | 86              | 21             | 0.0532        | 0.9339      | 32,750
15         | 6                   | 59              | 18             | 0.0597        | 0.8934      | 29,194.941

Table 13
Effect of epoch size on the performance of the network modeling Aslan's point algorithm

Epoch size | No. of hidden nodes | Train obs. read | Test obs. read | Test set RMSE | Corr. coef. | Predicted makespan
12         | 6                   | 95              | 32             | 0.0752        | 0.9489      | 29,076.232
95         | 6                   | 95              | 32             | 0.0637        | 0.9635      | 31,583.658
15         | 6                   | 95              | 32             | 0.0713        | 0.9595      | 33,147.738

but setting the epoch size to 15 and increasing the number of training observations (by reallocating the observations in each group) gives the best result. Table 13 also shows the effect of different epoch sizes on the performance of the network modeling Aslan's point algorithm. Training the network with the entire training set (95 observations) gives the best result in predicting the makespan. Because the results showed good prediction ability, no more training observations are needed to train the network.
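The fit statistics (RMSE, MAPE, MAE) and the ratio-averaging idea can be reproduced with straightforward formulas. The sketch below uses the makespans from Table 11; the function name is illustrative, not from the study's software.

```python
import numpy as np

def fit_statistics(actual, predicted):
    """RMSE, MAPE (%), and MAE, as used to compare the trained networks."""
    err = actual - predicted
    return {
        "rmse": np.sqrt(np.mean(err ** 2)),
        "mape": 100.0 * np.mean(np.abs(err / actual)),
        "mae": np.mean(np.abs(err)),
    }

# makespans from Table 11 (one value per heuristic)
actual = np.array([32024.998, 31952.000, 33327.000,
                   32753.998, 32242.000, 32435.000])
predicted = np.array([32578.713, 31825.555, 34139.152,
                      32750.000, 31583.658, 31999.766])

ratios = predicted / actual   # the predicted/actual column of Table 11
avg_ratio = ratios.mean()     # close to the paper's reported average of 1.00062
```

The average ratio could then be used as a correction factor when quoting due dates, as suggested in the text.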

6. Conclusions

Backpropagation networks have been successfully utilized for various complex and difficult scheduling problems by many researchers. The main objective of this study is to introduce a new, alternative approach of using a neural network for the estimation of the makespans of flowshops. Rather than determining the optimal sequences directly with the network, backpropagation networks have been applied to predict the completion times of five jobs planned to be produced in the next production period. Once the neural network results modeling the six heuristic algorithms are obtained, they are compared with the actual results to show the feasibility of using neural networks as an implementation alternative to the existing heuristic scheduling algorithms. The trained models' predictions were in good agreement with the actual values, producing R2 values between 93.39 and 99.09%. These results show that approximately 93.39-99.09% of the variation in the dependent variables (output parameters) can be explained by the independent variables (input parameters) selected and the data set used. This study shows the applicability of artificial neural networks to real scheduling problems. Manufacturing plants having a permutation flowshop environment can profit from the model obtained in this study with their own data. A future direction of this study is to employ additional methods that will improve the accuracy of the results. In this respect, decreasing the mean percentage errors and finding a closer match between the predicted and actual values seem to be a promising direction. Different neural network architectures may be used to predict the makespan for planning the production schedules. Extensions of the proposed method, involving different neural network architectures, may be developed to solve complex scheduling problems having different performance measures.

References
Aksoy, M. (1980). Duzgun sirali is yerlerindeki statik siralama problemlerinde toplam is akis suresinin enazlanmasi icin yeni bir sezgisel yontem. Yoneylem Arastirmasi, Bildiriler'80, 255-268.


Aslan, D. (1999). Model development and application based on object oriented neural network for scheduling problem. DEU Research Fund Project No. 0908.97.07.01. University of Dokuz Eylul, Izmir.
Baker, K. R. (1974). Introduction to sequencing and scheduling. New York: Wiley.
Campbell, H. G., Dudek, R. A., & Smith, M. L. (1970). A heuristic algorithm for the n-job, m-machine sequencing problem. Management Science, 16, 630-637.
Dagli, C. H. (1994). Artificial neural networks for intelligent manufacturing. London: Chapman and Hall.
El-Bouri, A., Balakrishnan, S., & Pooplewell, N. (2000). Sequencing jobs on a single machine: A neural network approach. European Journal of Operational Research, 126, 474-490.
Feng, S., Li, L., Cen, L., & Huang, J. (2003). Using MLP networks to design a production scheduling system. Computers and Operations Research, 30, 821-832.
Ho, J. C., & Chang, Y. L. (1991). A new heuristic for the n-job, M-machine flow-shop problem. European Journal of Operational Research, 52, 194-202.
Jain, A. S., & Meeran, S. (1998). Job shop scheduling using neural networks. International Journal of Production Research, 36, 1249-1272.
Jain, A. S., & Meeran, S. (2002). A multi-level hybrid framework applied to the general flow-shop scheduling problem. Computers and Operations Research, 29, 1873-1901.
Koulamas, C. (1998). A new constructive heuristic for the flowshop scheduling problem. European Journal of Operational Research, 105, 66-71.
Lee, I., & Shaw, M. J. (2000). A neural-net approach to real time flow-shop sequencing. Computers and Industrial Engineering, 38, 125-147.
Liu, D., Chang, T. S., & Zhang, Y. (2002). A constructive algorithm for feedforward neural networks with incremental training. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 49, 1876-1879.
McCahon, C. S., & Lee, E. S. (1992). Fuzzy job sequencing for a flowshop. European Journal of Operational Research, 62, 294-301.
Nawaz, M., Enscore, E., & Ham, I. (1983). A heuristic algorithm for the n-job, m-machine flowshop sequencing problem. Omega, 11, 91-95.
Nowicki, E. (1999). The permutation flow shop with buffers: A tabu search approach. European Journal of Operational Research, 116, 205-219.
Patterson, D. W. (1996). Artificial neural networks: Theory and applications. Singapore: Prentice-Hall.
Raaymakers, W. H. M., & Weijters, A. J. M. M. (2003). Makespan estimation in batch process industries: A comparison between regression analysis and neural networks. European Journal of Operational Research, 145, 14-30.
Rajendran, C., & Chaudri, D. (1991). An efficient heuristic approach to the scheduling of jobs in a flowshop. European Journal of Operational Research, 61, 318-325.
Sabuncuoglu, I. (1998). Scheduling with neural networks: A review of the literature and new research directions. Production Planning and Control, 9, 2-12.
Sabuncuoglu, I., & Gurgun, B. (1996). A neural network model for scheduling problems. European Journal of Operational Research, 93, 288-299.
Shiue, Y. R., & Su, C. T. (2002). Attribute selection for neural network based adaptive scheduling systems in flexible manufacturing systems. International Journal of Advanced Manufacturing Technology, 20, 532-544.
Smith, K. (1999). Neural networks for combinatorial optimization: A review of more than a decade of research. INFORMS Journal on Computing, 11, 15-34.
