The Most Accurate Heuristic-Based Algorithms For Estimating The Oil Formation Volume Factor, Mohammad Reza, 20216

Petroleum 2 (2016) 40e48
Contents lists available at ScienceDirect
Petroleum
journal homepage: www.keaipublishing.com/en/journals/petlm
Original article
The most accurate heuristic-based algorithms for estimating

the oil formation volume factor
Mohammad Reza Mahdiani*, Ghazal Kooti
Faculty of Petroleum Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran, Iran
a r t i c l e i n f o a b s t r a c t
Article history: There are various types of oils in distinct situations, and it is essential to discover a model for
Received 6 November 2015 estimating their oil formation volume factors which are necessary for studying and simulating the
Received in revised form reservoirs. There are different correlations for estimating this, but most of them have large errors
30 December 2015
(at least in some points) and cannot be tuned for a specific oil. In this paper, using a wide range of
Accepted 30 December 2015
experimental data points, an artificial neural network model (ANN) has been created. In which its
internal parameters (number of hidden layers, number of neurons of each layer and forward or
Keywords:
backward propagation) are optimized by a genetic algorithm to improve the accuracy of the model.
Genetic programming
Neural network
In addition, four genetic programming (GP)-based models have been represented to predict the oil
Modeling formation volume factor In these models, the accuracy and the simplicity of each equation are
surveyed. As well as, the effect of modifying of the internal parameters of the genetic programming
(by using some other values for its nodes or changing the tree depth) on the created model. Finally,
the ANN and GP models are compared with fifteen other models of the most common previously
introduced ones. Results show that the optimized artificial neural network is the most accurate and
genetic programming is the most flexible model, which lets the user set its accuracy and simplicity.
Results also recommend not adding another operator to the basic operators of the genetic
programming.
Copyright © 2016, Southwest Petroleum University. Production and hosting by Elsevier B.V. on
behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND
license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction available. Thus, an accurate correlation for different ranges of

parameters is necessary. There are numerous correlations for
Oil formation volume factor (Bo) is one of the PVT properties estimating Bo in the literature, but most of them have poor
of oil, which is the ratio of a specific petroleum volume at the prediction for some points even in their applicable range. In
reservoir condition to its corresponding volume at standard addition to that they can't be tuned for a special oil, and their
condition. Bo's value is essential in calculating various parame- accuracy and speed can't be changed base on the need of the
ters such as the depletion rate, oil in place, predicting the future problem. They are not flexible to be used in different problems
of the reservoir, optimizing the rate of production and some and also there is not a comprehensive study to help users to
other simulation and optimization problems. Distinctive corre- select one for their studies. Here, one artificial neural network
lations are used whenever the experimental value for Bo of a (ANN), one hybrid of ANN and genetic algorithm, four genetic
specific oil at particular pressure and temperature is not programming (GP) methods, and fifteen previously introduced
correlations will be surveyed, and their strengths and weak-
nesses will be discussed. Considering the pressure, reservoirs can
* Corresponding author. be classified into three categories: below the bubble point
E-mail address: mahdiani@aut.ac.ir (M.R. Mahdiani). pressure, at bubble point pressure and above the bubble point
Peer review under responsibility of Southwest Petroleum University.
pressure. Usually, there is some free gas in the reservoir at
pressures below the bubble point. Above that, Bo changes
because of the oil and gas compressibility. Knowing the value of
Production and Hosting by Elsevier on behalf of KeAi Bo at bubble point condition, Bo at higher pressures can be easily
http://dx.doi.org/10.1016/j.petlm.2015.12.001
2405-6561/Copyright © 2016, Southwest Petroleum University. Production and hosting by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open
access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
M.R. Mahdiani, G. Kooti / Petroleum 2 (2016) 40e48 41
calculated using compressibility factor. Thus, most of the corre- Table 1

lations estimate Bo at bubble point pressure [1e3]. The models of Range of parameters, used in building the model.
this paper also estimate the Bo at bubble point pressure. Gas gravity Oil gravity Temperature ( F) Solution gas
Different correlations have been represented for predicting oil ratio (SCF/STB)
the oil formation volume factor. In 1942, Katz [4] using 117 Minimum 0.612 0.766 100 104
measurement data developed a correlation to predict the oil Maximum 1.315 0.895 280 1439
formation volume factor using gas gravity, oil gravity, gas oil
ratio, and reservoir pressure and reservoir temperature. In 1947,
Standing [5e8] created a correlation which was the first corre- number of all points was 866. From them 395 points that they
lation that needs only four parameters (gas gravity, oil gravity, were not used in building of any surveyed correlations of this
and solution gas oil ratio and reservoir temperature) to estimate paper put aside for comparing different correlations and 471
the oil formation volume factor. Vasquez and Begg [9](1980) points used for building new models for oil formation volume
developed a correlation which was highly dependent on gas factor using artificial neural network and genetic programming.
gravity. In 1980, Glaso [10] introduced a correlation for a Artificial neural network (ANN) is a modeling method that is
methane contained black oil based on gaseoil equilibrium. Al- based on biological neurons. This network is consisted of units
Marhoun [11,12] (1988) used the linear regression for Middle called artificial neurons. These neurons receive inputs and sum
East oil. After that a lot of researchers tried to introduce co- them to produce the output. Each neuron has a weight and thus
efficients for previous correlations. Among them are Abdul- the summation is weighted. The sum is passed through a func-
Majeed, 1988 [13], Dokla (1992) [14], Al-Marhoun (1992) [15], tion called transfer function. And finally the model is created
Petrosky (1993) [17], Omar (1993) [18], Kartoamodj 1994 [19], [30]. ANN has been used widely in the different courses of sci-
Farshad 1996 [20]. In more recent years some researchers tried to ence [31e33].
introduce completely new correlations. Maybe by adding pa- To create the ANN model, the same data that were used in
rameters or different techniques. Sutton [24] developed a cor- creating the genetic programming models is used. Their range is
relation for density and using that estimated Bo. Sebakhy [25] shown in Table 1 and the property of the neural network model
(2009) used support vector machine modeling Nassar (2013) is shown In Table 2. In this Table, ‘LevenbergeMarquart’ is the
[26]considered separator pressure, temperature and its config- name of a method; for more information see Ref. [34]. As shown
uration (two or three stage) and introduced a correlation for Bo, in this table, type of neural network is a forward propagation and
In 2014 Sulaimon [27] used a group method of data handling to all its details are listed in that. Here using artificial neural
create a model. network a model for estimating the oil formation volume factor
As it can be seen, there are different correlations for the oil created. In next section the created ANN model statistical
formation volume factor in the literature. Now, the question is properties will be discussed and compared with other Bo
which one of them is more accurate. There are a lot of studies correlations.
that compare these correlations. In 1994, using 195 data points, Artificial neural network is a good method but it is very
Ghetto [28]compared different correlations and suggested the random base and different runs of that, results to different
Vasquez and Begg model for estimating the oil formation volume models. As well as changing its internal parameters such as the
factor. In 1997, Almehaideb [22] used the data of 13 oil fields in number of hidden layers, the number of neurons of each layer
UAE to show the performance of common correlations. He sug- and having forward or backward propagation can change the
gested Al-Marhoun, Glaso, and Standing as the top three accurate final model and its efficiency. Thus, here ANN is coupled with GA.
correlations. In 2012, Godefroy [29] compared various correla- GA optimizes the number of hidden layers and the neurons of
tions and suggested Glaso and Al-Marhoun to estimate Bo. Un- each layer to improve the quality of the models in addition to
fortunately none of the above studies have a comprehensive that it selects the type of propagation (forward or backward). The
investigation on distinctive models, they are old and are not basic properties of the artificial neural network are mentioned in
surveyed the modern models and the new methods for modeling Table 2. The properties of GA are listed in Table 3. For knowing
such as neural network, genetic programming and simulated more about the parameters of this table the interested reader can
annealing programming [46,47] and the database of most of refer to [35]. As mentioned before. For that reason in each iter-
them is not wide enough. Here, using an extensive database, 15 ation for every set of inputs (number of hidden layers and
models of the most common literature correlations in addition to number of neurons of each layer and the type of propagation)
modern models such as neural network and different genetic five times the ANN ran and the best model selected. The output
programming models are compared and their strengths and of fitness function of genetic algorithm is average relative error
weaknesses are illustrated. that should be minimized. Five was selected because it is not so
small that random structure of algorithm affects that and is not
2. Methodology so large that causes the low speed of the algorithm. In this study,
As mentioned earlier, here using some artificial neural

network and genetic programming some new models are
Table 2
created and compared with the most common previously The parameters of the artificial neural network model used in this study.
introduced ones in literature. For this purpose there is need to
Parameter Value
have a dataset of the measured oil formation volume factors in
various conditions. This data set, gained from different literature Type of ANN MLPs, forward
Training LevenbergeMarquart
works and experiment, consists of 160 data points of the Middle
Performance Mean Square Error
East oil of Al-Marhoun [12] study, 195 points of Ghetto [28] from Stop tolerance 1.00E-05
Persian gulf, middle east and Africa oils, 41 data point of Glaso Number of nodes of each layer 8
[10] from north sea, 177 points of Katz [4] study 93 points of Number of input layers 4
Malaysian oil of Omar's study as well as 200 other experimental Number of hidden layers 10
Number of output layers 1
data points which were measured of Iranian south oil fields. The
42 M.R. Mahdiani, G. Kooti / Petroleum 2 (2016) 40e48
Table 3 developed for symbolic optimization [36,37]. Despite the fact

The genetic algorithm parameters of the model of this study. that the genetic algorithm uses an array of numbers for opti-
Property name Property value mization, genetic programming uses tree (Fig. 2) representation
Population type Integer vector
[38]. As Fig. 2 shows that a tree consists of nodes that each
Population size 20 contains an operator and terminals which each contains a
Fitness scaling Rank parameter and represents an equation. Using trees is a flexible
Selection function Stochastic uniform method that is used in different courses of science and engi-
Elite count 4
neering such as differential equations [39], chemical reactor
Crossover function Scattered
Crossover fraction 0.8 modeling [40e42], circuit designing, etc. [43]. In the genetic
Mutation function Uniform programming technique, at first an initial population of trees is
Mutation probability 0.05 created. Then in each iteration, the individuals are evaluated and
Migration Forward
some individuals are selected for reproduction. The population
Max generation 100
Max stall generation 50
of selected individuals is increased by cross over and mutation.
Function tolerance 1.00E-06 When an individual with acceptable fitness is found or the
number of iteration passes a specific number, the algorithm stops
[44,45].
GA is changed, not just to save the inputs but also to save the Table 1 Shows the range of parameters used for creating (or
created network. The flowchart of the optimized neural network training) the model. In this range, we are sure that the model is
can be seen in Fig. 1. After running the model it concluded that 10 working well but extrapolating the model can lead to unrealistic
hidden layers, 8 neurons for each layer and forward propagation results. These data are used for creating all the models of this
lead to best results. Finally the created ANN model was study and thus the range of applicability of all the models of this
compared with usual ANN, GP and other correlations. study are as Table 1. In this study, the type of nodes are basic
Now genetic programming models will be created. Genetic mathematical operations {þ, , , /} and the parameters of the
programming is an extension of the genetic algorithm, terminals are {T, gg, go, RS}. Afterward, GP changes the tree
structures to minimize the average relative error of its
Fig. 1. Flowchart of optimized artificial neural network.

Fig. 2. Three examples of tree structures.
corresponding equation over the data, the GP stops whenever purpose this time the value of maximum tree depth was set to 4
the values of the average relative error falls below a specific value and GP was run which resulted in the equation (3).
or its changes fall below a defined tolerance. GP has different
parameters and changing them my cause different results. Thus Bo ¼ 0:83161 þ 0:0012246 T þ 0:00052734 Rs
.
to be able to repeat this modeling these parameters should be
þ 115188:5563 Rs T ðRsþggþ1Þ (3)
known. They are listed in Table 4.
In this study GP was run multiple times. In first time, Again, similar to pervious part, the value of maximum tree
maximum tree depth was set to 4 for having a simpler equation depth was set to 8 and algorithm was run and the equation (4)
(in Fig. 1 the left tree depth is two and the others is three). In fact was the result:
if the number of levels exceeded from four, algorithm considered
.
a penalty for that. Running this led to the equation (1).
Bo ¼ 1:1292 0:67343 Rs ðTggÞ2 þ 8:3272e 06 T 2
Bo ¼ 0:60641 þ 22:1992=T þ 5:0918e 07 Rs T þ 0:00053679 Rs 0:0019306 T
þ 0:0018287 T þ 0:00040493 Rs (1) (4)

The accuracy of these genetic programming models and other
Its accuracy will be discussed later. Then for having a better
different correlations of this part will be discussed in next
accuracy the level of tree branches was set to 8 when the algo-
section.
rithm was run again it resulted in the equation (2).
Bo ¼ 0:9135 þ 0:052341 1=Rs þ 5:3578e 06 Rs go 3. Results and discussion
þ 8:9699e 07 Rs T=go þ 0:00088254 T In this part one usual and one optimized artificial neural
þ 0:00027237 Rs network, and four genetic programming models as well as 15
other literature correlations, for estimating oil formation volume
(2)
factor, will be compared and their different statistical properties
Considering the different correlations that were previously will be discussed to help the user to select a correlation for
introduced reveal that most of them are exponential. Thus, for evaluating Bo which is best suited for his studies. The correlations
third run the genetic programming is let to use exponential in are selected from the most common ones and their equations are
addition to its basic operators for its nodes' values. For this listed in Table 5. Each correlation is applicable just in a range of
data in which it has been created. The range of applicability of
each correlation is shown in Table 6. It is clear that for comparing
Table 4
the models, the test data should be in the range of applicability of
The parameters of the GP model used in this study.
all correlations. These data points are selected consciously from
Parameter Value data base and their ranges are showed in Table 7. All these points
Population type Double vector are in the range of all used correlations. As this table shows,
Population size 15 Almehaideb has the simplest form of equation, after that there
Initial population generation Random
are Macary and Batanoney and Standing, El-Banbi, Sulaimon and
Initial score Calculated by fitness function
Fitness scaling Rank Kartoamodjo and Schmidt have the simplest form. Of course,
Selection function Tournament now, using all these equations are easy, by computers, but still in
Mutation function Uniform some cases having a simple form of equation is desired.
Crossover function Scattered (Two parents) To compare the applicability of different algorithms the cross
Type of replacement elitist
Elite count 3
plot of the estimation oil formation volume factor of 395 sample
Crossover probability 0.7 data points for all correlations versus their measured one plotted
Mutation probability 0.3 (Fig. 3). In this figure exact Bois the Bo that is measured in lab-
Probability of changing 0.3 oratory with PVT tests. And estimated Bo is the Bo that is pre-
terminal - nonterminal
dicted by correlations. In Fig. 3, standing (a), Vazquez and Begg
nodes (vice versa) during
mutation (b), Al-Marhoun (d) and (g), Macary and Batanoney (h), Omar
Hybrid No and Todd (i), Petrosky (j), Kartoamodjo and Schmidt (k),
max generation 500 Almehdajeb (m) and sulaimin (o) all have similar prediction, they
Max stall generation 100 had a good estimation for most of the points but for a small range
Function tolerance 1.00E-15
of points they had a poor result. In these correlations, Standing
Table 5
The equation of different correlations for estimating oil formation volume factor.
Authors Equation
Standing (1947) [6] Bo ¼ 0.972 þ 0.0001472 (Rs (gg/go)0.5 þ 1.125 T)1.175

Vazquez and Bo ¼ 1 þ 0.0004677 Rs þ 1.751 10 5 ((go/gg) (T e 60))1.8106 108 (Rs (go/gg) (T e 60))
Begg (1980) [9]
Glaso (1980) [10] Bo ¼ 1 þ (6.58511 þ 2.91329 Log(Rs (gg/go)0.526 þ 0.968 T) e 0.27683 (Log(Rs (gg/go)0.526 þ 0.968 T))2)
Al-Marhoun (1988) [12] Bo ¼ 0.497069 þ 0.000862963 (T þ 460) þ 0.00182594 (Rs0.74239 gg0.323294 go1.20204) þ 0.00000318099
(Rs0.74239 gg0.323294 go 1.20204)2
Abdul-Majeed and Bo ¼ 0.9657876 þ 0.000773 (T þ 460) þ 0.000048141 (Rs1.2 gg0.147 go5.222) þ 0.00000000068987
Salmon (1988) [13] (Rs1.2 gg0.147 go5.222)2
Dokla and Bo ¼ 0.0431935 þ 0.00156667 (T þ 460) þ 0.00139775 (Rs0.77357 gg0.404 go0.882505) - 0.00000380525 (Rs0.773572
Osman (1992) [14] gg0.404020 go0.882605)2
Al-Marhoun (1992) [15] Bo ¼ 1 þ 0.000177342 Rs þ 0.000220163 Rs (gg/go) þ 0.000004292580 Rs (1 go) (T e 60) þ 0.000528707
(T e 60)
Macary and Batanoney Bo ¼ (1.0031 þ 0.0008 T) (EXP(0.0004 Rs þ 0.0006 (go/gg)))
(1992) [16]
Omar and Todd (1993) [18] Bo ¼ 0.972 þ 0.0001472 (Rs (gg/go)0.5 þ 1.125 T(1.166 þ0.000762 go/gg 0.0399 gg)
Petrosky and Bo ¼ 1.0113 þ 0.0000702046 (Rs0.3738 (gg0.2914/go0.6265)0.5 þ 0.24626 T0.5371)3.0936
Farshad (1993) [17]
Kartoamodjo and Bo ¼ 0.98496 þ 0.0001 (Rs0.755 (gg0.25/go1.5) þ 0.45 T)1.5
Schmidt (1994) [19]
Farshad et al. (1996) [20] Bo ¼ 1 e (>2.65 þ 0.5576 Log(Rs0.5956 (gg0.2369 go1.3282) þ 0.0976 T) - 0.333 (Log(Rs0.5956 (gg0.2369 go1.3282)
þ 0.0976 T))2
Almehaideb (1997) [22] Bo ¼ 1.122 þ 0.00000141 Rs T/go2
El-Banbi (2006) [23] Bo ¼ 47.23e8.833 (Rs ((gg/go)0.5) þ 1.325 (T))0.0092 36.6
Sulaimon (2014) [27] Bo ¼ 0.0000009 (Rs (go/gg) þ Rs (T þ 460)) þ 1.0367
Table 6
Range of applicability of different correlations for estimating oil formation volume factor.
Correlation Temperature (F) Solution gas oil ratio (SCF/STB) Oil gravity Gas gravity
min max min max min max min max
Standing (1947) [6] 100 258 20 1425 0.72 0.96 0.59 0.95
Vazquez and Begg (1980) [9] 75 294 0 2199 0.74 0.96 0.511 1.35
Glaso (1980) [10] 80 280 90 2637 0.79 0.92 0.65 1.28
Al-Marhoun (1988) [12] 74 240 26 1602 0.80 0.94 0.75 1.37
Abdul-Majeed and Salmon (1988) [13] 190 275 181 2266 0.82 0.89 0.8 1.29
Dokla and Osman (1992) [14] 190 275 181 2266 0.82 0.89 0.8 1.29
Al-Marhoun (1992) [15] 75 300 0 3265 0.75 1.00 0.575 2.252
Macary and Batanoney (1992) 130 290 200 1200 0.82 0.90 0.7 1
Omar and Todd (1993) [18] 125 280 142 1440 0.77 0.89 0.612 1.32
Petrosky and Farshad (1993) [17] 114 288 217 1406 0.80 0.96 0.58 0.85
Kartoamodjo and Schmidt (1994) [19] 75 320 0 2890 0.74 0.97 0.38 1.171
Farshad et al. (1996) [20] 95 260 6 1645 0.80 0.95 0.66 1.7
Almehaideb (1997) [22] 190 306 128 3871 0.79 0.87 0.75 1.12
El-Banbi (2006) [23] 186 312 0 8280 0.74 0.85 0.67 1.14
Sulaimon (2014) [27] 125 280 142 1440 0.77 0.89 0.038 1.315
has the simplest equation and thus is preferred. Dokla and more complicated. This is the advantage of GP that the level of
Osman (f) prediction is not good at all, its r-square is about 0.34 accuracy and complexity of its models can be tuned. In usual GP
which is not acceptable however in small Botheir prediction is just the basic operators are applied. But most oil formation vol-
acceptable. Glaso (c) estimated better than Dokla and Osman in ume factor correlations are exponential. In GP exponential model
general case; but despite the good prediction of Dokla and 1 (r) and 2 (s) it's allowed to use exponential as well as four basic
Osman in small Bo, Glaso did not have a good estimation in any operators. The level of complexity of GP (exponential) 1 is similar
range of Bo. Farshd (l) and E-Banbi (n) just have a good prediction to genetic programming 1 but as shown earlier for both of them
for a small range of data. These correlations predictions are not GP found the same equation. Which means adding exponential
accurate for very small or very large Bo. Abdul-Majeed (e) pre- to GP operators only complicates its search process and does not
diction for each point is more than measured one and the increase its accuracy. Again GP exponential 2 is similar to GP 1
amount of his overestimation in all cases is similar. Thus but in this case adding exponential component to operators have
assigning a coefficient to that could have increased its
performance.
Genetic programming model 1(p) and model 2 (q) are similar Table 7
Range of parameters, for testing different correlations.
and both of them are good but model 2 has a slightly better
estimation. The difference between them is the level of their Gas gravity Oil gravity Temperature ( F) Solution gas
equation complexity. Model 2 is more accurate but its equation is oil ratio (SCF/STB)
also more complicated. Using GP, an equation even more accu- Minimum 0.8 0.8 190 217
rate than GP model 2 can be gained but its equation will be much Maximum 0.85 0.85 240 1200
Fig. 3. Regression fitting of different models: (a) standing 1947 [6], (b) Vazequez and Begg 1980 [9], (c) Galso (1980) [10], (d) Al-Marhoun 1988 [12], (e) Abdul-Majeed and
Salmon (1988) [13], (f) Dokal and Osman 1992 [14], (g) Al-Marhoun 1992 [15], (h) Macary and Batanoney (1992) [16], (i) Omar and Todd (1993) [18], (j) Petrosky and Farshd
(1993) [17], (k) Karoamodjo and schmidt (1994) [19], (l) Farshad et al. (1996) [20], (m) Almehaideb (1997) [22], (n) El-Bani (2006) [23], (o) Sulaimon (2014) [27], (p) Genetic
Programming Model 1, (q) Genetic Programming Model 2, (r) Genetic Programming (Exponential) Model 1, (s) Genetic Programming Model 2, (t) Artificial Neural Network, (u)
optimized neural network.
Fig. 3. (continued).
decreased the accuracy of the model in fact adding exponential addition to r-square in Fig. 4(a) average relative errors of
component had made the model 2 too hard for GP to find even different correlations are compared. In this figure optimized ANN
the model that it found without exponential component. In has the lowest average relative error, usual ANN and GP models
artificial neural network (t) just two points out of 395 points have a small error. After them, Standing, Al-Marhoun, Kartoa-
have a poor estimation and the R-square of ANN is good. But modjeb, Almehaideb and Sulaimon have the smallest errors. In
some of genetic programming models were more accurate other side, Glasso, Abdulmajeed and Farshad et al. have the
(comparing r square). Optimized neural network (u) has the best highest average relative errors. Fig. 4(b) compares the median
accuracy. This model prediction is excellent and no point is relative errors of different correlations. As illustrated in this
highly overestimated or underestimated. Thus its prediction is figure, the correlations that had a higher average relative error
reliable. For further investigation of different correlations Fig. 4 have a higher average median error too, and the correlations that
compares average, median and maximum relative error of have a lower average relative error, have a lower median relative
different correlations. These parameters are of great importance error. The point of this figure is that in all correlations, median
in comparing the applicability of two different methods [48]. In relative error is smaller than average relative error. Fig. 4(c)
80 shows the maximum relative error of different correlations. As

Average RelaƟve Error (%)
70 this figure shows, the optimized ANN has the lowest maximum
60
relative error and again the good performance of the new model
50
40 is confirmed. In usual ANN, despite the good performance of its
30 maximum error is high and thus it is not recommended. Genetic
20 programming model 2 has the least maximum error and thus the
10
best accuracy. After that other GP models are good and after
0
Macary and Batanoney…
Petrosky and Farshad…

Kartoamodjo and…
Abdul-Majeed and…
GeneƟc Programming…
OpƟmized ArƟficial…
them Al-Marhoun, standing, Vazquez and Begg, Kartaoamodjo
Glaso (1980)
El-Banbi (2006)
Sulaimon (2014)
Al-Marhoun (1988)
Farshad et al (1996)
Almehaideb (1997)
Dokla and Osman (1992)
Al-Marhoun (1992)
Standing (1947)
Vazquez and Begg (1980)
Omar and Todd (1993)
ArƟficial Neural Network

and sulaimon are the most accurate ones. Finally Fig. 4(d) com-
pares the r-squares of different correlations. As this figure shows
again the optimized artificial neural network has the best per-
formance, after that genetic programming have the best r and
Dokla and Osman and Glaso have the worst ones.
Now after investigating the different correlations it's the time
to recommend the best correlations for different usages. First of
(a)
all the optimized artificial neural network, has the best accuracy.
Usually neural network models just have simple production and
70
summation and thus they are fast. So this model have a good
Median RelaƟve Error (%)
60
efficiency and using that or in a better situation creating this kind
50
of models for each specific oil is highly recommended. Usual
40
30
artificial neural network has a good estimation for most points
20 but because of bad estimation for a small fraction of points its use
10 is not safe. After optimized the ANN, GP model 2 has the least
0 errors and best r. It's recommended for all cases (in its applicable
Kartoamodjo and…
Abdul-Majeed and…
Glaso (1980)
El-Banbi (2006)
Sulaimon (2014)
Standing (1947)
Al-Marhoun (1988)
Almehaideb (1997)
Al-Marhoun (1992)
range). But it is a little more complicated. If a simpler equation is

desired GP model 1 is recommended. It should be mentioned
that using exponential in GP models is not recommended at all.
After that Al-Marhoun (1988, 1992) [12,15] models have low
average and median relative errors and they don't have points
with a huge over estimation or underestimation. Thus their
predictions are good. In next positions there are, Kartoamodjo
and Schmidt (1994) [19], Petrosky and Farshad (1993) [17],
(b)
Standing (1947) [6] which have an average relative error about
180 2% and median relative error less than 2% as well as their
Maximum RelaƟve Error (%)
160
maximum relative error is less than 30% (Fig. 4) thus they are
140
120 recommended. Among them, Standing (1947) [6] has the
100 simplest form of equation. Other equations are similar except
80
60
Glaso (1980) [10], Abdul-Majeed and Salmon (1988) [13], Dokla
40 and Osman (1992) [14], Farshad et al. (1996) [20] and El-Banbi
20 (2006) [23] which had a poor performance.
0

Kartoamodjo and…
Abdul-Majeed and…
Glaso (1980)
El-Banbi (2006)
Sulaimon (2014)
Al-Marhoun (1988)

Al-Marhoun (1992)
Almehaideb (1997)
Standing (1947)
4. Conclusion
1. Usual neural network is more accurate than most of the other

previously introduced correlations for most data points, but
because of bad estimation for some points it is not safe to use.
2. Optimizing the parameters of artificial neural network with
(c)
optimization algorithms such as genetic algorithm can highly
1.2 improve the accuracy of the model. By using this method
1 there is no need to run the artificial neural network over and
0.8
over to find a good model. This method has the highest
R Square
accuracy.
0.6
3. Genetic programming is the most flexible model for pre-
0.4
dicting different properties such as oil formation volume
0.2
factor. It is very accurate and its level of accuracy and its
0
complexity can be tuned based on the problem. Of course
Kartoamodjo and…
Abdul-Majeed and…
Glaso (1980)
El-Banbi (2006)
Sulaimon (2014)
Standing (1947)
Al-Marhoun (1992)
Almehaideb (1997)
Al-Marhoun (1988)
having more accurate model needs a more complicated one.

4. Adding an exponential operator to the genetic programming
node's values, in some cases does not have any effect on the
accuracy of the model and in some cases decreases its
Fig. 4. Statistical properties of different models: (a) Average relative error (%), (b)
(d)
Median relative error (%), (c) Maximum relative error, (d) R square.
efficiency. In fact adding exponential operator just compli- [18] Development of new modified black oil correlation for Malaysian crudes,
in: SPE Asia Pacific Oil and Gas Conference, Singapore, 1993.
cates the searching process of the genetic programming and
[19] Trijana Kartoatmodjo, Zelimir Schmidt, Large data bank improves crude
is not recommended at all. physical property correlations, Oil Gas J. 92 (27) (1994) (United States).
5. As a summary, in investigated equations, optimized artificial [20] Empirical PVT correlations for colombian crude oils, in: SPE Latin American
neural network and after that GP models (whiteout expo- and Caribbean Petroleum Enginerring Conference, Port of Spain, Trinidad
and Tobago, 1996.
nential operator) have the best performance. After them Al- [22] R.A. Almehaideb, Improved PVT correlations for UAE crude oils, in: SPE
Marhoun (1988, 1992) [12,15], Kartoamodjo and Schmidt Middle East Oil Show, Manama, Bahrain, 1997.
(1994) [19], Petrosky and Farshad (1993) [17], Standing (1947) [23] A.H. El-Banbi, K.A. Fattah, M.H. Sayyouh, New modified black-oil PVT
correlations for Gas condensate and volatile oil fluids, in: SPE Annual
[6] are recommended. Among them Standing (1947) [6] has Technical Conference and Exhibition, Texas, USA, 2006.
the simplest form of equation. Other equations are similar [24] R.P. Sutton, An accurate method for determining oil PVT properties using
except Glaso (1980) [10], Abdul-Majeed and Salmon (1988) the Standing-Katz Gas Z-factor chart, SPE Reserv. Eval. Eng. 11 (2) (2008)
246e266.
[13], Dokla and Osman (1992) [14], Farshad et al. (1996) [20], [25] E.A. El-Sebakhy, Forecasting PVT properties of crude oil systems based on
El-Banbi (2006) [23] which have a poor performance. support vector machines modeling scheme, J. Pet. Sci. Eng. 34 (1e4) (2009)
25e34.
[26] I.S. Nassar, A.H. El-Banbi, M.H. Sayyouh, Modified black oil PVT properties
Nomenclature
correlations for volatile oil and gas condensate reservoirs, in: North Africa
Technical Conference & Exhibition, Cairo, Egypt, 2013.
gO Oil Gravity (fraction) [27] A.A. Sulaimon, N. Ramli, B.J. Adeyemi, Modified black oil PVT properties
gg Gas Gravity (fraction) correlations for volatile oil and gas condensate reservoirs, in: SPE Nigerian
Annual International Conferance and Exhibition, Lagos, Nigeria, 2014.
RS Solution Gas Oil Ratio (SCF/STB) [28] D. Ghetto, G. Marco, V. Marco, Reliability analysis on PVT correlations, in:
BO Formation Volume Factor (Reservoir barrel/STB) SPE European Petroleum Conference, London, 1994.
T Temperature (oF) [29] S. Godefroy, S.H. Khor, D. Emms, Comparison and validation of theoretical
and empirical correlations for black oil reservoir fluid properties, in:
P Pressure (psi) Offshore Technology Conference, Texas, USA, 2012.
[30] J.A. Anderson, An Introduction to Neural Networks, A Bradford Book, 1995.
SI metric conversion factor [31] R. Gosukonda, A.K. Mahapatra, X. Liu, G. Kannan, Application of artificial
neural network to predict Escherichia coli O157:H7 inactivation on beef
SCF * 0.02832s sm surfaces, Food Control (2015) 606e614.
STB * 0.15899 sm [32] V. Chandwani, V. Agrawal, R. Nagar, Modeling slump of ready mix concrete
psi * 6870.75 pa using genetic algorithms assisted training of artificial neural networks,
Expert Syst. Appl. (2015) 885e893.
0.56F þ 255.4 K [33] H. Cao, F. Cao, D. Wang, Quantum artificial neural networks with appli-
cations, Inf. Sci. (2015) 1e6.
References [34] A. Adly, S.K. Abd-El-Hafiz, Utilizing neural networks in magnetic media
modeling and field computation: a review, J. Adv. Res. 5 (2014) 615e627.
[35] M. Mitchell, An Introduction to Genetic Algorithms (Complex Adaptive
[1] James P. Brill, H. Dale Beggs, Two Phase Flow in Pipes, Gulf Publishing,
Systems), A Bradford Book, 1998.
1991.
[36] E. Shokir, M. El, Dew point pressure model for gas condensate reservoirs
[2] William D. McCain Jr., The Properties of Petroleum Fluid, PennWell, United
based on genetic programming, Energy Fuels 22 (2008).
States of America, 1990.
[37] J.R. Koza, Genetic Programming: on the Programming of Computers by
[3] W.D. McCain Jr., The Properties of Petroleum Fluids, Penn Wel Booksl,
Means of Natural Evolution, MIT Press, Cambridge, MA, 1992.
Tulsa, 1990.
[38] C.R. Reeves, Genetic algorithm for the operations research, J. Comput.
[4] D.L. Katz, Prediction of Shrinkage of crude oils, in: Drill. Prod. & Prac, API,
Chem. Eng. 9 (1997) 231e250.
Dallas, 1942, p. 137.
[39] E. Sakamoto, H. Iba, Inferring a system of differential equations for a gene
[5] A.A. Al-Shammasi, A review of bubblepoint pressure and oil formation
regulatory network by using genetic programming, in: 14th European
volume factor correlations, SPE Reserv. Eval. Eng. 4 (2) (2001) 146e160.
Conference, EuroGP, Seoul, Korea, 2001.
[6] M.B. Standing, A pressure-volume-temperature correlation for mixtures of
[40] H. Cao, J. Yu, L. Kang, Y. Chen, The Kinetic evolutionary modelling of
California oils and gases, in: Drilling and Production Practice, vol. 275,
complex systems of chemical reactions, Comput. Chem. Eng 23 (2) (1999)
1947. Dallas.
143e151.
[7] M.B. Standing, Oil-System Correlation, McGraw-Hill Book Co, New York
[41] B. McKay, M. Willis, G. Barton, Steady-state modelling of chemical process
City, 1962.
systems using genetic programming, Comput. Chem. Eng. 21 (4) (1997).
[8] M.B. Standing, Volumetric and Phase Behavior of Oil Field Hydrocarbon,
[42] B.D. Al-Anazi, A.A. AlQuraishi, New correlation for Z-factor using genetic
Reinhold Publishing Corp., New York City, 1952.
programming technique, in: SPE Oil and Gas India Conference and Exhi-
[9] Milton Vazquez, H. Dale Beggs, Correlations for fluid physical property
bition, Mumbai, 2010, p. 9.
prediction, J. Petroleum Technol. 32 (06) (1980) 968e970.
[43] E.M. Shokir, M.K. Emera, S.M. Eid, A.A. Abd Wally, A new optimiztion
[10] O. Glaso, Generalized pressureevolumeetemperature correlations, JPT 32
model for 3D well design, Oil Gas Sci. Technol. 59 (2004) 255e256.
(5) (1980) 785e795.
[44] J. Madar, J. Abonyi, F. Szeifert, Genetic programming for system identifi-
[11] M.A. Al-Marhoun, PVT correlations for Saudi crude oils, in: Middle East Oil
cation, in: Intelligent Systems Design and Applications (ISDA 2004) Con-
Technical Conference & Exhibition, Manama, Bahrein, 1985.
ference, Budapest, Hungary, 2004.
[12] M.A. Al-Marhoun, PVT correlations for middle east crude oils, JPT 40 (5)
[45] R. Pearson, Selecting nonlinear model structures for computer control,
(1988) 650e666.
J. Process Control 13 (2003).
[13] G.H.A. Abdul-Majeed, N.H. Salman, s sdasda, An empirical correlation for
[46] M.R. Mahdiani, E. Khamehchi, A new method for building proxy models
FVF prediction, J. Cdn. Pet. Tech. 27 (6) (1988) 118.
using simulated annealing, Middle-East J. Scientific Res. (2014) 324e328.
[14] M. Dokla, M. Osman, Correlation of PVT properties for UAE crudes, in:
[47] M.R. Mahdiani, E. Khamehchi, R. Soltanmohammadi, B. Azkayi, A new
SPEFE, 1992, p. 41.
proxy model, based on meta heuristic algorithms for estimating gas
[15] M.A. Al-Marhoun, New correlation for formation volume factor of oil and
compressor torque, 11th international industrial conference (2015).
gas mixtures, J. Cdn. Pet. Tech. (1992) 3e22.
[48] M.R. Mahdiani, E. Khamehchi, Preventing instability phenomenon in gas-
[16] S.M. Macary, M.H. El-Batanoney, Derivation of PVT correlations for the Gulf
lift optimization, Iran. J. Oil Gas Sci. Technol. (2015) 49e65.
of Suez crude oils, in: EGPC 11th Pet. Exp. & Prod. Conf., 1992.
[17] G. Petrosky, F. Farshad, Pressure-volume-temperature correlation for the
Gulf of Mexico, in: SPE Annual Technical Conference and Exhibition,
Houston, 1993.

The Most Accurate Heuristic-Based Algorithms For Estimating The Oil Formation Volume Factor, Mohammad Reza, 20216

Caricato da

Informazioni sul documento

Titolo originale

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

The Most Accurate Heuristic-Based Algorithms For Estimating The Oil Formation Volume Factor, Mohammad Reza, 20216

Caricato da

Copyright:

Formati disponibili

Petroleum 2 (2016) 40e48

Contents lists available at ScienceDirect

The most accurate heuristic-based algorithms for estimating

1. Introduction available. Thus, an accurate correlation for different ranges of

calculated using compressibility factor. Thus, most of the corre- Table 1

As mentioned earlier, here using some artiﬁcial neural

Table 3 developed for symbolic optimization [36,37]. Despite the fact

Fig. 1. Flowchart of optimized artiﬁcial neural network.

Fig. 2. Three examples of tree structures.

Bo ¼ 0:60641 þ 22:1992=T þ 5:0918e 07 Rs T þ 0:00053679 Rs 0:0019306 T

þ 0:0018287 T þ 0:00040493 Rs (1) (4)

Bo ¼ 0:9135 þ 0:052341 1=Rs þ 5:3578e 06 Rs go 3. Results and discussion

Standing (1947) [6] Bo ¼ 0.972 þ 0.0001472 (Rs (gg/go)0.5 þ 1.125 T)1.175

min max min max min max min max

80 shows the maximum relative error of different correlations. As

Macary and Batanoney…

Petrosky and Farshad…

Omar and Todd (1993)

ArƟficial Neural Network

Omar and Todd (1993)

ArƟficial Neural Network

range). But it is a little more complicated. If a simpler equation is

Petrosky and Farshad…

Dokla and Osman (1992)

Omar and Todd (1993)

ArƟficial Neural Network

1. Usual neural network is more accurate than most of the other

Dokla and Osman (1992)

Omar and Todd (1993)

ArƟficial Neural Network

having more accurate model needs a more complicated one.

Potrebbero piacerti anche