Spe 198646 Pa PDF
Summary
This paper presents a data-driven methodology to predict calcium carbonate (CaCO3)-scale formation and design its inhibition program
in petroleum wells. The proposed methodology integrates and adds to the existing principles of production surveillance, chemistry,
machine learning (ML), and probability theory in a comprehensive decision workflow to achieve its purpose. The proposed model was
applied on a large and representative field sample to verify its results.
The method starts by collecting data such as ionic composition, pH, sample-collection/inspection dates, and scale-formation event.
Then, collected data are classified or grouped according to production conditions. Calculation of chemical-scale indices is then made
using techniques such as water-saturation level, Langelier saturation index (LSI), Ryznar saturation index (RSI), and Puckorius scaling
index (PSI). The ML part of the method starts by dividing the data into training and test sets (80 and 20%, respectively). Classification
models such as support-vector machine (SVM), K-nearest neighbors (KNN), gradient boosting, gradient-boosting classifier, and
decision-tree classifier are all applied on collected data. Prediction results are then classified into a confusion matrix to be used as
inputs for the probabilistic inhibition-design model. Finally, a functional-network (FN) tool is used to predict the formation of scale.
The scale-inhibition program design uses a probabilistic model that quantifies the uncertainty associated with each ML method. The
scale-prediction capability, compared with actual inspection results, is expressed as probability equations that are used in the cost model. The
expected financial impact associated with applying any of the ML methods is obtained from defining costs for scale removal and scale
inhibition. These costs are factored into the probability equations in a manner that presents incurred costs and saved or avoided
expenses expected from field application of any given ML model. The forecasted cost model is built on a base-case method (i.e., current
situation) to be used as a benchmark and foundation for the new scale-inhibition program.
As will be presented in the paper, applying the preceding techniques yielded a scale-prediction accuracy of 95%
and threefold cost savings compared with existing programs.
Introduction
Scale precipitation causes several issues in oil and gas fields. The causes and sources of scale formation differ from well to well.
Seawater injection can be considered one of the primary causes of sulfate-scale precipitation because it mixes two
incompatible waters. Scales can form in well tubulars as well as in the formation. Scale formation in well tubulars can lead
to serious operational issues in which conventional workover rigs might not be sufficient to do the job because of the weight of the
tubing with the scale (Bayona 1993).
Scales also can be formed because of the change in thermodynamic conditions in the reservoir or in the well tubulars. Changes in
pressure, temperature, partial pressure of carbon dioxide (CO2)/hydrogen sulfide, and pH can also cause scale precipitation (Moghadasi
et al. 2003a, 2003b; Mackay et al. 2003).
Fig. 1 shows the solubility-product constants for magnesium carbonate, calcium carbonate, barium carbonate, and strontium carbonate
scales. At low temperatures (less than 150°F), strontium carbonate has a very low solubility constant, while magnesium carbonate
has the highest one. CaCO3 scales are very common compared with other types of carbonate scales because CaCO3 scale can form anywhere
in the well. It can form inside the reservoir, inside the production tubing, and at the downhole-pump intakes (where it plugs the
electrical-submersible pump and degrades its efficiency). In addition, CaCO3 scale can form at the wellhead and in the surface flow lines
(Lakshmi et al. 2013). Calcite has a very low solubility constant at high temperature (Fig. 1), which makes it the most-stable scale type at
the reservoir pressure and temperature (Kamal et al. 2018). CaCO3 scale has different forms, such as calcite, vaterite, and aragonite;
calcite is the common type that exists in the reservoir. The formation of CaCO3 scale depends on several conditions, such as temperature;
pH of the medium; concentration of ions such as calcium and bicarbonate; and ionic strength. The abundance of CO2 in water and the pH of
the medium control the formation of CaCO3 scales. CO2 reacts with water and produces carbonic acid, which is a weak acid that
dissociates to bicarbonate. The abundance of bicarbonate and calcium ions promotes the formation of CaCO3 scales. At higher pH values
and higher temperature, the formation of CaCO3 is accelerated (Ramstad et al. 2005; Hamid et al. 2016). Iron carbonate (siderite)
can form in the reservoirs and downhole equipment (Amiri et al. 2013). The conditions that promote the formation of CaCO3 scales
promote the formation of iron carbonate scales as well. These conditions include temperature change, pressure change, and CO2-solubility
change with pressure and temperature. The liberation of CO2 from the solution increases the pH of the medium, which promotes
carbonate-scale formation by reducing the solubility of these mineral scales at high pH values (Jordan et al. 2014).
CaCO3 is a type of oilfield scale that is classified as inorganic, and it occurs commonly in many fields around the globe. This
type of inorganic scale forms under different conditions (thermodynamic, kinetic, and hydrodynamic) because of the mixing of
petroleum fluids. CaCO3-scale deposition is caused by a shift toward carbonate in the carbonate/bicarbonate/CO2 equilibrium. When the
equilibrium shifts in the other direction, the precipitate goes back into solution. In a chemical reaction, chemical equilibrium is the state in
which both reactants and products are present in concentrations that have no further tendency to change with time. Chemical
equilibrium is achieved when the rate of the forward reaction is the same as the rate of the reverse reaction.
Copyright © 2020 Society of Petroleum Engineers
This paper (SPE 198646) was accepted for presentation at the SPE Gas & Oil Technology Showcase and Conference, Dubai, UAE, 21–23 October 2019, and revised for publication. Original
manuscript received for review 16 March 2020. Revised manuscript received for review 28 April 2020. Paper peer approved 5 May 2020.
[Plot: solubility-product constant (1.00×10–10 to 1.00×10–6) vs. temperature (50 to 300°F) for MgCO3, CaCO3, BaCO3, and SrCO3.]
Fig. 1—Solubility of carbonate scale (after Li et al. 1995). MgCO3 = magnesium carbonate; BaCO3 = barium carbonate;
SrCO3 = strontium carbonate.
The ability to predict scale formation is a major challenge in the oil industry. According to Vetter et al. (1987), “the main variables
dictating the location and amount of CaCO3-scale deposition in an oil field are as follows:
• Pressures and temperatures at any location within the entire production system.
• The brine and oil compositions prior, during and after the reservoir fluids have been exposed to temperature and/or
pressure changes.
• The bubble point and flash behavior of the three-phase oil/brine/gas system as a function of pressure and temperature.
• The distribution of CO2 between oil and brine phases and the dramatic variations of this CO2 partitioning prior and during any
production operation.
• The constant variation of the water oil ratio, the gas oil ratio and the gas water ratio during any production operation.”
Langelier developed an equation setting forth the conditions of carbonate equilibrium. By using this equation, the pH value of water
at equilibrium can be calculated. If the actual pH is higher than the calculated pH, the water has a tendency to form scale. If it is lower,
the water has a tendency to be corrosive. Langelier formulated an equation relating the stability index to pH, calcium concentration,
and total alkalinity. Alkalinity is the ability of a solution to neutralize an acid to the equivalence point of CaCO3. Alkalinity can be
calculated as the sum of ion concentrations $[\mathrm{HCO_3^-}] + 2[\mathrm{CO_3^{2-}}] + [\mathrm{OH^-}] - [\mathrm{H^+}]$.
A positive stability index indicates scale formation; a negative index indicates corrosion. This equation has been shown to apply to
waters with total solid concentration as high as 4,000 ppm. Larson and Buswell (1942) improved the reliability of the Langelier index
by adjusting the index for the temperature and salinity effects during CaCO3 precipitation at atmospheric pressure.
Stiff and Davis (1952) proposed an empirical method to extend the application of the Langelier equation to waters of high salt
concentration. This was done by experimentally deriving values of the K term in the Langelier equation that apply to waters of high
salt content. Using this equation, the tendency of oilfield waters to deposit CaCO3 can be predicted.
$$\mathrm{SI} = \mathrm{pH} - K - \mathrm{pCa} - \mathrm{pAlk}, \tag{1}$$
where K is obtained as a function of ionic strength ($\mu$), and the ionic strength can be determined as
$$\mu = \frac{1}{2}\sum_i C_i V_i^2, \tag{2}$$
where pCa is the negative logarithm of the calcium concentration; pAlk is the negative logarithm of the total alkalinity; $\mu$ is ionic
strength; C is the concentration of each ion (in g ion/1000 g solvent); and V is the valence of the ion.
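As a minimal sketch of how the Stiff and Davis index could be evaluated, the following Python snippet computes the ionic strength as half the sum of concentration times valence squared and then applies SI = pH - K - pCa - pAlk. The function names, the brine values, and the value of K are hypothetical; in practice K would be read from the Stiff and Davis correlation for the computed ionic strength and temperature, and the concentration units must match those of that correlation.

```python
import math


def ionic_strength(ions):
    """Ionic strength: half the sum of C_i * V_i**2 over all ions.

    `ions` is an iterable of (concentration, valence) pairs.
    """
    return 0.5 * sum(c * v ** 2 for c, v in ions)


def stiff_davis_index(ph, k, calcium, alkalinity):
    """Stability index SI = pH - K - pCa - pAlk, with pX = -log10(X)."""
    p_ca = -math.log10(calcium)
    p_alk = -math.log10(alkalinity)
    return ph - k - p_ca - p_alk


# Hypothetical brine for illustration only (not field data).
brine = [(0.50, 1), (0.01, 2), (0.52, -1)]  # (concentration, valence)
mu = ionic_strength(brine)

# K = 3.0 is an assumed placeholder, not a value taken from the correlation.
si = stiff_davis_index(ph=7.0, k=3.0, calcium=0.01, alkalinity=0.001)
print(f"ionic strength = {mu:.3f}, SI = {si:.2f}")
```

A positive result would indicate a scaling tendency and a negative result a corrosive tendency, following the sign convention described above.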
Yeboah et al. (1993) developed the oilfield scale-prediction model, which predicts the potential and deposition profile of scale using
extensive thermodynamic and kinetic data. The model uses experimental solubility data to determine the saturation index. Critical
saturation indices beyond which scaling occurs have been established. The model uses the flow characteristics and experimental kinetic data
to predict the scale-deposition profile from the bottomhole to the surface once the critical saturation index is exceeded.
$$\mathrm{SI} = \log\left(\frac{[\mathrm{Ca^{2+}}]\,[\mathrm{HCO_3^-}]^2}{K_{sp}\,P_{\mathrm{CO_2}}}\right), \tag{3}$$
where SI is the CaCO3 saturation index, Ksp is the equilibrium constant, and PCO2 is the partial pressure of CO2.
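The saturation index of Eq. 3 can be sketched in a few lines of Python. This is an illustration only: the logarithm is assumed to be base 10, the function names are hypothetical, and the input values and critical index are placeholders rather than data from the cited work.

```python
import math


def caco3_saturation_index(ca, hco3, ksp, p_co2):
    """SI = log10([Ca2+] * [HCO3-]**2 / (Ksp * P_CO2)), following Eq. 3."""
    if min(ca, hco3, ksp, p_co2) <= 0:
        raise ValueError("all inputs must be positive")
    return math.log10(ca * hco3 ** 2 / (ksp * p_co2))


def exceeds_critical(si, critical_si):
    """True when the index is beyond the (experimentally established) critical value."""
    return si > critical_si


# Placeholder inputs chosen so that the ratio inside the logarithm is about 10.
si = caco3_saturation_index(ca=0.01, hco3=0.01, ksp=1.0e-7, p_co2=1.0)
print(f"SI = {si:.2f}, scaling expected: {exceeds_critical(si, critical_si=0.5)}")
```

Lowering the CO2 partial pressure in this expression raises the index, which is consistent with the earlier observation that liberation of CO2 promotes carbonate scaling.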
Vetter et al. (1987) highlighted the importance of flashed gases from both the oil and the brine phases in CaCO3-scale formation,
which is ignored in many predictive models in the industry. In addition, the authors discussed the various effects of the gas distribution
(especially CO2) between oil and brine phases under reservoir and various production conditions on CaCO3 formation and provided
some algorithms for prediction. They also presented a methodology that can be used to predict CaCO3 formation under field conditions
as a function of water composition, pressure, temperature, water/oil ratio (WOR), gas/oil ratio (GOR), and total CO2 and its
partitioning between the various liquid phases.
Hamid et al. (2016) developed an empirical model using weight-gain data from coupon tests and from a tube-plugging test. This
model was able to predict scale-growth rate at a given point on a solid surface with pressure, pressure gradient, temperature, fluid
velocity, and brine concentration as independent variables. Artificial-intelligence (AI) methodology was used to develop the model. The
scale-growth field y at a given time is obtained by integrating the scaling-rate function,
$$\dot{y} = F_1 F_2 + F_3 F_4, \tag{4}$$
where $\dot{y}$ is the rate of scale growth at a point on the solid surface (in in./D); F1 depends on an empirical model F0l F0, which is derived from
the artificial neural network of the scale deposition as a function of five variables (pressure, temperature, brine concentration, velocity, and
time); F2 accounts for the locations at which scale would be deposited because of the surface position with respect to gravity and the surface being
concave or convex; and F3 and F4 account for bonded scaling, which is independent of gravity and a strong function of the pressure gradient.
The model output was compared with experimental data and gave fair results. This model was used to predict scale formation at
inflow-control valves, allowing for a better design completion and fluid-handling system for presalt wells.
Bahadori (2011) created a simple predictive tool to predict CaCO3-scale precipitation as a function of pH, temperature, ionic
strength of the solution, calcium-cation concentration, bicarbonate-anion concentration, and CO2 mole fraction when the water mixture
is saturated with gas containing CO2, to evaluate the effects of solution conditions on the tendency and extent of precipitation. This
method covered concentrations of CaCO3 from 10 to 10,000 mg/L, with temperature ranging from 5 to 90°C, total ionic strength
ranging between 0.1 and 3.6, and pH values ranging from 5.5 to 8. This model was created for CO2 sequestration in saline aquifers to
reduce CO2 emissions into the atmosphere. The steps to predict CaCO3-scale precipitation are
$$\ln(K) = a + bI + cI^2 + dI^3, \tag{5}$$
$$\mathrm{SI} = \mathrm{pH} - \mathrm{pH_s}, \tag{12}$$
where K is the correction factor for total ionic strength and temperature; I is total ionic strength; pH is the actual pH value in the
system; pHs is the pH value when CaCO3 achieves saturation in the system; Sf is the solubility factor; T is temperature (in K); and XCO2
is the CO2 mole fraction in the water mixture saturated with a gas containing CO2.
The objective of this paper is to develop an ML workflow to predict scale precipitation in oil and gas wells. Several data sets are
used as inputs to the model, and the model output is either a matrix prediction or an empirical equation developed by
the FN ML technique. In addition, a probabilistic framework is developed to design the inhibition program according to cost.
ML Techniques
The terms ML and AI are often used interchangeably, but there is a difference between the two: ML is more related to
prediction, while AI is more concerned with creating machines that emulate human intelligence. Both are captivating
fields that integrate computational power with human intelligence to produce smart and reliable solutions for extremely nonlinear and
highly complicated problems. In the last two decades, petroleum-engineering journals have published a large number of articles using AI and
ML for regression, function approximation, and classification problems. More details about AI and ML can be found in our previous
publications (Tariq et al. 2018a, 2018b, 2019). Our work focuses on ML techniques such as the KNN
algorithm, SVM, gradient boosting, gradient-boosting classifier, and decision tree.
• KNN: In pattern recognition, the KNN algorithm is a method used for classification and regression. In both cases, the input
consists of the k closest training examples in the feature space. The output depends on whether KNN is used for classification
or regression.
• SVM: In ML, SVMs are supervised learning models with associated learning algorithms that analyze data used for classification
and regression analysis. Given a set of training examples, each marked as belonging to one or the other of two categories, an
SVM training algorithm builds a model that assigns new examples to one category or the other, making it a nonprobabilistic
binary linear classifier.
• Gradient boosting: An ML technique for regression and classification problems that produces a prediction model in the form of an
ensemble of weak prediction models, typically decision trees. It builds the model in a stagewise fashion, like other boosting
methods do, and it generalizes them by allowing optimization of an arbitrary function.
• Gradient-boosting classifier: A view of boosting algorithms that optimize a cost function over function space by iteratively
choosing a weak hypothesis that points in the negative-gradient direction. This functional-gradient view of boosting has led to the
development of boosting algorithms in many areas of ML and statistics beyond regression and classification.
• Decision-tree classifier: Decision-tree learning uses a decision tree (as a predictive model) to go from observations about an item
(represented in the branches) to conclusions about the item’s target value (represented in the leaves). It is one of the predictive-
modeling approaches used in statistics, data mining, and ML.
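To make the KNN description above concrete, the following is a minimal sketch of a majority-vote KNN classifier written with the Python standard library only. It is not the implementation used in this study; the function name, the two normalized features, and the sample values are hypothetical and chosen only to illustrate how a new sample is assigned the class of its k closest training examples.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized features; label 1 = scale formed, 0 = no scale.
train_x = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
train_y = [1, 1, 1, 0, 0, 0]

print(knn_predict(train_x, train_y, (0.75, 0.85)))  # nearest neighbors are all labeled 1
print(knn_predict(train_x, train_y, (0.15, 0.20)))  # nearest neighbors are all labeled 0
```

Because the vote depends on distances, features with very different ranges (for example, pH and total dissolved solids) would normally be scaled before use.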
The other technique used in this study, the FN tool, is used to predict the existence of scale formation. The FN technique is a relatively
recent AI technique, first discussed by Castillo et al. (1999) for a function-approximation problem. The basis of FN is the combination
of artificial neural networks with functional theory. It is a supervised data-learning technique mostly used for prediction and
regression purposes.
Methodology
The proposed workflow describes the CaCO3-scale-prediction methodology, from data collection to model building. Furthermore, the
process is extended to include a probabilistic framework by which fieldwide scale-inhibition programs are designed. The steps are
as follows:
1. Data collection. Water samples were collected from wet wells to extract the following inputs:
• Ionic composition of water; example ions are Ca2+, Na+, HCO3−, and so forth.
• pH, which is the negative base-10 logarithm of the H+ concentration.
• Sample-collection date.
• Scale-inspection date. Each case study must have a valid scale-inspection date to verify the scaling-tendency-prediction
results. The inspection location is the manifold-choke valve after dropping the valve spool. The manifold-choke valve is
typically located a few feet away from the production wellhead.
• Scale-formation event: This is to record whether the scale has precipitated, according to the inspection results.
2. Classify the collected data by field. This is performed to achieve as much similarity as possible in the production conditions,
such as flow rate and pressure drop. In the presence of abundant data, it is recommended to group the wells according to their
production performance and water cut.
3. Define the output events. For each case study, denote the scaling event as follows:
• 1 if the scale has occurred in the well within X number of years.
• 0 if the scale has not occurred in the well within X number of years.
X is selected according to the scale-inhibition-program-design criteria and economics, such as the type of the inhibition
chemical, the chemical-residue life, field size, and so forth. For the data set used to demonstrate the concept of this
method, X = 10 years was selected. This number represents the scale-protection time (when the inhibition is
performed correctly) for existing oilfield chemicals used at the fields under study. However, periods other than
10 years can be used by supplying data sets within the desired period to the ML prediction model. For example, if the required
scale-prediction duration is 2 years, then the ML models would only be fed with geochemical-analysis data collected within 2 years
of the physical-inspection dates to train and test the models.
4. Calculate the water-saturation level. The equation to calculate the saturation level is
$$\mathrm{SL} = \frac{[\mathrm{Ca^{2+}}]\,[\mathrm{CO_3^{2-}}]}{K_{sp}}, \tag{13}$$
where $[\mathrm{Ca^{2+}}]$ is the molar concentration of the calcium cation (in mol/L), $[\mathrm{CO_3^{2-}}]$ is the molar concentration of the carbonate
anion (in mol/L), and $K_{sp}$ is the equilibrium constant.
5. Calculate the LSI. The equations to calculate the LSI are
8. Divide the data. After calculating all input parameters in Steps 4 through 7, divide the data randomly into 80% training and 20%
testing sets.
9. Train and test the ML models.
10. Generate the prediction results. For each method discussed previously, generate a confusion matrix that describes the method's
prediction performance. An example matrix is listed in Table 1.
Define the probabilistic model. The following probabilistic equations are used to quantify the uncertainty associated with each
method, if implemented to select scale-inhibition candidates:
$$P_{Csp} = \text{Probability of correct scale-formation prediction} = \frac{\text{Number of correctly predicted scale-formation cases}}{\text{Total number of cases in a given field}} = \frac{X_1}{X_t}. \tag{26}$$
In this case, scale has actually formed, and the model correctly predicted the event.
$$P_{Isp} = \text{Probability of incorrect scale-formation prediction} = \frac{\text{Number of incorrectly predicted scale-formation cases}}{\text{Total number of cases in a given field}} = \frac{X_2}{X_t}. \tag{27}$$
In this case, scale has actually formed, and the model did not correctly predict the event.
$$P_{CNsp} = \text{Probability of correct no-scale-formation prediction} = \frac{\text{Number of correctly predicted no-scale-formation cases}}{\text{Total number of cases in a given field}} = \frac{X_4}{X_t}. \tag{28}$$
In this case, scale has not actually formed, and the model correctly predicted the event.
$$P_{INsp} = \text{Probability of incorrect no-scale-formation prediction} = \frac{\text{Number of incorrectly predicted no-scale-formation cases}}{\text{Total number of cases in a given field}} = \frac{X_3}{X_t}. \tag{29}$$
In this case, scale has not actually formed, and the model did not correctly predict the event.
11. Define the cost model. Eqs. 30 through 34 are used to quantify the expected financial effect associated with each ML method, if
implemented to select scale-inhibition candidates. The positive sign indicates incurred cost and the negative sign indicates
saved/avoided cost.
where Cs is the cost of scale removal per well and CT is the cost of scale-inhibition treatment per well.
The logic of Eq. 30 is that the correct prediction of a scale-formation event (i.e., model predicted that scale will form, and
scale has actually formed) leads to avoiding the cost of a scale-removal operation and to incurring the cost of scale-
inhibition treatment.
The logic of Eq. 31 is that the correct prediction of no scale-formation event within time period X (i.e., model predicted
that no scale will form, and scale actually has not formed) leads to saving the cost of the scale-inhibition-treatment operation.
The logic of Eq. 32 is that incorrect prediction of a scale-formation event (i.e., model predicted no scale, although scale has
actually formed) leads to incurring the cost of a scale-removal operation.
The logic of Eq. 33 is that incorrect prediction of no scale-formation event (i.e., model predicted that scale will form, while
scale has actually not formed) leads to incurring the cost of scale-inhibition treatment.
Eq. 34 presents the overall expected net cost (Cp) of the scale-inhibition program.
12. Define the base case. In this step, the base-case expected cost of the scale-inhibition program is calculated. This cost is defined
using the current practice scenario before introducing any prediction workflow/model. A typical base case to benchmark against
is the conservative scenario, where all wet wells are treated to mitigate scale formation. In such a scenario, the base-case cost is
calculated as
Eq. 35 presents the overall base-case expected cost of the scale-inhibition program.
13. Apply the probabilistic model and Steps 11 and 12 to every method in Step 9. After producing a Cp value for each method in Step 9, select the method
that satisfies the following two conditions:
• Cp of the selected method is the minimum among all methods.
• Cp of the selected method is less than Cbc.
14. If the Step 13 conditions are not met, repeat Steps 9 through 13 until the two conditions are satisfied. If no convergence is
achieved, proceed with the base-case scenario.
15. Repeat the complete workflow with every new data point collected from the field to update and enhance the prediction models.
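The probabilistic and cost steps of the workflow above can be sketched as follows. This is an illustration under stated assumptions, not the paper's cost equations: the per-well cost terms are written directly from the stated logic of Eqs. 30 through 33 (positive = incurred, negative = saved or avoided), the name PCsp for the probability of a correct scale-formation prediction is assumed by analogy with the other three probabilities, the base-case cost Cbc is passed in as an input rather than computed, and all counts are hypothetical.

```python
def outcome_probabilities(x1, x2, x3, x4):
    """Outcome probabilities from confusion-matrix counts.

    x1: scale formed, predicted scale        (PCsp, assumed name)
    x2: scale formed, predicted no scale     (PIsp)
    x3: no scale formed, predicted scale     (PINsp)
    x4: no scale formed, predicted no scale  (PCNsp)
    """
    total = x1 + x2 + x3 + x4
    if total == 0:
        raise ValueError("at least one case is required")
    return {
        "PCsp": x1 / total,
        "PIsp": x2 / total,
        "PINsp": x3 / total,
        "PCNsp": x4 / total,
    }


def expected_net_cost(p, c_s, c_t):
    """Assumed per-well expected net cost Cp (positive = incurred, negative = avoided)."""
    correct_scale = p["PCsp"] * (c_t - c_s)   # treatment incurred, removal avoided
    correct_no_scale = -p["PCNsp"] * c_t      # treatment saved
    missed_scale = p["PIsp"] * c_s            # removal incurred
    false_alarm = p["PINsp"] * c_t            # unnecessary treatment incurred
    return correct_scale + correct_no_scale + missed_scale + false_alarm


def select_method(costs, c_bc):
    """Return the method with the minimum Cp if it beats the base case, else None."""
    best = min(costs, key=costs.get)
    return best if costs[best] < c_bc else None


# Hypothetical counts for two methods; costs are in arbitrary units.
probs_a = outcome_probabilities(x1=80, x2=5, x3=5, x4=10)
probs_b = outcome_probabilities(x1=70, x2=15, x3=10, x4=5)
costs = {
    "method_a": expected_net_cost(probs_a, c_s=50.0, c_t=15.0),
    "method_b": expected_net_cost(probs_b, c_s=50.0, c_t=15.0),
}
print(costs, select_method(costs, c_bc=15.0))
```

Returning None from the selection step corresponds to the fallback in the workflow, where the base-case scenario is retained if no method is cheaper than Cbc.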
Table 2—Geochemical-analysis and physical-scale-inspection data. Na = sodium; Ca = calcium; Mg = magnesium; SO4 = sulfate;
HCO3 = bicarbonate; TDS = total dissolved solids. Inspection result of 1 means scale has formed and a result of 0 means no scale has formed.
Table 3—Predicted cost analysis using ML. Cost of scale removal per well and cost of scale-
inhibition treatment per well were assumed to be USD 50 and USD 15 million, respectively, for cost-
comparison purposes.
Confusion Matrix
A neural-network model was trained using a pattern-recognition module inside MATLAB® software (The MathWorks, Inc., Natick,
Massachusetts, USA). A total of 486 data points were fed into the AI models. Of the 486 data points, 143 were instances when scale was not
formed and the remaining 343 were instances when scale was actually deposited. A confusion matrix is plotted between the actual and
predicted values for the scale-formation problem. The confusion matrix is used to evaluate the performance of a classification
model. The matrix is applicable when the output has at least two classes. For two classes, the matrix is formed from the four combinations of actual and
predicted values. The diagonal values of the matrix represent the values that are correctly classified, while the off-diagonal values of the matrix
indicate values that are incorrectly classified. The rows represent the actual values, while the columns represent the predicted values. The
column on the extreme right and the row at the bottom of the plot show the accuracy values. The cell in the bottom right of the plot shows the overall
accuracy. The three main parameters that are derived from the confusion matrix are recall, precision, and F1 score. These
parameters define which classification algorithm is superior in terms of performance compared with others. The definitions of these parameters are
$$\text{Recall} = \frac{\text{True positive}}{\text{True positive} + \text{false negative}}, \tag{36}$$
$$\text{Precision} = \frac{\text{True positive}}{\text{True positive} + \text{false positive}}, \tag{37}$$
$$F_1\ \text{score} = 2 \times \frac{\text{Recall} \times \text{precision}}{\text{Recall} + \text{precision}}, \tag{38}$$
where a “true positive” is an outcome where the model correctly predicts the positive value. Similarly, a “true negative” is an outcome where
the model correctly predicts the negative value. A “false positive” is an outcome where the model incorrectly predicts the positive value,
and a “false negative” is an outcome where the model incorrectly predicts the negative value. Fig. 2 shows the confusion matrices for our
problem using five ML techniques: KNN, SVM, gradient boosting, gradient-boosting classifier, and decision tree. For all five
techniques, recall, precision, and F1 were calculated and are given in Fig. 3. All techniques resulted in very good accuracy along with recall,
precision, and F1 score. KNN was chosen for further investigation because of its accuracy and ability to develop further classification plots.
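The three confusion-matrix parameters defined in Eqs. 36 through 38 can be computed as in the short sketch below. The function name and the counts are hypothetical and serve only to illustrate the arithmetic; they are not the values reported in Figs. 2 and 3.

```python
def classification_metrics(tp, fp, fn):
    """Return (recall, precision, F1) from confusion-matrix counts."""
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("recall or precision is undefined for these counts")
    recall = tp / (tp + fn)        # Eq. 36
    precision = tp / (tp + fp)     # Eq. 37
    if recall + precision == 0:
        return recall, precision, 0.0
    f1 = 2 * recall * precision / (recall + precision)  # Eq. 38
    return recall, precision, f1


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
recall, precision, f1 = classification_metrics(tp=90, fp=10, fn=30)
print(f"recall={recall:.3f}, precision={precision:.3f}, F1={f1:.3f}")
```

Which class is treated as "positive" changes all three values, so the choice should be stated whenever these parameters are compared across methods.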
[Figure: five confusion-matrix panels, (a) through (e), each plotting predicted values against actual values for Classes 1 and 2; panel titles include "Scale Formation Using KNN Confusion Matrix" and "Scale Formation Using SVM Confusion Matrix".]
Fig. 2—Scale-formation confusion matrix using five ML algorithms: (a) KNN, (b) SVM, (c) gradient boosting, (d) gradient-boosting
classifier, and (e) decision tree.
[Fig. 3: bar chart of the performance indicators (recall, precision, and F1; vertical axis from 0 to 1.00) for KNN, SVM, gradient boosting, gradient-boosting classifier, and decision tree.]
Fig. 5 shows the input-weight-plane plots. Each plot shows the weights of one input feature for
each of the 100 neurons in the 10×10 hexagonal-grid structure. Weight planes that look similar indicate highly correlated features,
while dissimilar planes indicate independent features. Darker color is an indication of larger weights.
Fig. 6 shows the plot of neighbor-weight distances. The blue hexagons represent neurons. The red lines connect
neighboring neurons. The distances between the neurons are represented by the colors of the regions that contain the red lines. A
light color shows smaller distances between the neurons, while a dark color represents larger distances. The distances given are
Euclidean distances.
Fig. 7 shows the self-organizing-map sample-hits plot for the 10 × 10 output layer of neurons. The plot shows how many training samples fall in each cluster; adjacent clusters have learned similar features. The gap with no values indicates a separation between the clusters, meaning that the data fall into two classes (scale formed or scale not formed) with no intermediate case. The hit counts give the number of cases assigned to each class, and regions with higher hit counts correspond to densely populated regions of the input-feature space.
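A sample-hits count of this kind can be sketched by assigning every sample to its best-matching neuron (the neuron whose weight vector is closest in Euclidean distance) and tallying the assignments. The snippet below is a minimal illustration under stated assumptions: the function name `som_hits`, the random weights, and the 500 random samples are hypothetical and stand in for a trained map and the field data.

```python
import numpy as np


def som_hits(samples, weights, rows, cols):
    """Count how many samples select each neuron as their best-matching unit."""
    # Pairwise Euclidean distances, shape (n_samples, n_neurons).
    dist = np.linalg.norm(samples[:, None, :] - weights[None, :, :], axis=2)
    best = dist.argmin(axis=1)
    hits = np.bincount(best, minlength=rows * cols)
    return hits.reshape(rows, cols)


rng = np.random.default_rng(0)
weights = rng.random((100, 9))   # hypothetical weights of a 10 x 10 map
samples = rng.random((500, 9))   # hypothetical normalized input samples
hits = som_hits(samples, weights, rows=10, cols=10)
print(hits)
```

Neurons with a hit count of zero form the empty gap that separates clusters on the plot.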
higher accuracy and less computational time. The minimum description length is adopted as the fitness criterion for model selection. This concept from information theory selects the optimal model using both the network size and the accuracy of predictions. Mathematically, the minimum-description-length measure (d) is given by

$$ d = \frac{p \log N_p}{2} + \frac{N_p}{2} \log p^2, \tag{39} $$

where N_p is the number of functions in the optimal set (number of parameters), and p is the root-mean-square error given by

$$ p = \sqrt{\frac{1}{N} \sum_{i=1}^{N_p} e_i^2}. \tag{40} $$
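The trade-off that a minimum-description-length criterion encodes can be illustrated with a short sketch. It assumes the form commonly used for functional networks, (m/2) log n + (n/2) log(mean squared error), with m parameters and n samples; how m and n map onto the symbols printed in Eq. 39 is an assumption, and the residuals are hypothetical.

```python
import math


def description_length(errors, n_params):
    """MDL-style score: a complexity penalty plus a goodness-of-fit term."""
    n = len(errors)
    mse = sum(e * e for e in errors) / n      # square of the root-mean-square error
    complexity = 0.5 * n_params * math.log(n)
    fit = 0.5 * n * math.log(mse)
    return complexity + fit


# Hypothetical residuals from a candidate model.
errors = [0.1, -0.2, 0.1, 0.2]
d_small = description_length(errors, n_params=3)
d_large = description_length(errors, n_params=5)
print(d_small, d_large)
```

With identical residuals, the candidate with more parameters receives the larger score, so an extra term is kept only when it lowers the error enough to offset its complexity penalty.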
Fig. 5: Input-weight-plane plots. Input 1 is Na, the molar concentration of sodium cations (in mol/L); Input 2 is Ca, the molar concentration of calcium cations (in mol/L); Input 3 is Mg, the molar concentration of magnesium cations (in mol/L); Input 4 is SO4, the molar concentration of sulfate anions (in mol/L); Input 5 is HCO3, the molar concentration of bicarbonate ions (in mol/L); Input 6 is total dissolved solids; Input 7 is pH; Input 8 is T, the normalized inspection time; and Input 9 is scale formed within 10 years.
[Plot titled "Hits": number of training samples assigned to each neuron of the 10 × 10 self-organizing-map grid.]
[Flowchart; box labels recovered: Start; Data collection; Input-parameter selection; Training data set; Selection of parametric learning method; Exact learning; Approximate learning; Parametric learning; data set; Error analysis; Error indices.]
where Na is the molar concentration of sodium cations (in mol/L); Ca is the molar concentration of calcium cations (in mol/L); Mg is the molar concentration of magnesium cations (in mol/L); SO4 is the molar concentration of sulfate anions (in mol/L); HCO3 is the molar concentration of bicarbonate ions (in mol/L); TDS is the total dissolved solids; pH represents the acidity or alkalinity of the solution; and T is the normalized inspection time. The output of the model is either 1 or 2, where 1 means precipitation and 2 means no precipitation (previously, 0 was used for no scaling). The coefficients of Eq. 41 are listed in Table 4.
Order Np   σ1  σ2  σ3  σ4  σ5  σ6  σ7  σ8   Coefficient ω
 1         0   0   0   1   1   0   0   0     1.32×10^–5
 2         0   0   1   0   1   0   0   0     3.54×10^–5
 3         0   1   0   0   1   0   0   0     2.76×10^–5
 4         1   0   0   0   1   0   0   0     2.51×10^–5
 5         0   0   0   0   1   1   0   0    –9.87×10^–6
 6         0   0   0   0   1   0   1   0     2.73×10^–4
 7         1   0   0   0   0   0   1   0     1.10×10^–5
 8         0   0   0   0   0   1   0   1     7.27×10^–6
 9         0   1   0   0   0   0   0   1    –6.69×10^–5
10         0   0   0   1   2   0   0   0     7.57×10^–9
11         0   0   2   0   1   0   0   0     2.09×10^–9
12         1   0   1   0   1   0   0   0     7.48×10^–10
13         2   0   0   0   1   0   0   0     5.36×10^–11
14         0   0   0   0   0   3   0   0    –4.86×10^–17
15         0   0   0   0   1   2   0   0     6.50×10^–12
16         0   0   0   1   1   1   0   0    –1.87×10^–11
17         0   0   1   0   1   1   0   0    –2.55×10^–10
18         1   0   0   0   1   1   0   0    –3.73×10^–11
19         0   0   0   0   0   1   2   0     7.31×10^–5
20         0   0   0   1   0   0   2   0    –3.39×10^–6
21         0   0   0   1   1   0   1   0    –2.00×10^–6
22         0   0   1   0   0   0   2   0    –2.69×10^–4
23         0   1   0   0   0   0   2   0    –2.02×10^–4
24         1   0   0   0   0   0   2   0    –1.88×10^–4
25         0   0   0   0   0   1   0   2     8.17×10^–6
26         0   0   1   0   0   1   0   1    –3.92×10^–8
27         0   1   0   0   0   0   0   2    –2.02×10^–5
28         0   1   1   0   0   0   0   1     1.30×10^–7
29         1   0   0   0   0   0   0   2    –2.30×10^–5
30         1   0   1   0   0   0   0   1     9.65×10^–8
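A table of exponents and coefficients such as this one is typically evaluated as a sum of monomials, where each row contributes ω multiplied by the product of the inputs raised to the exponents σ1 through σ8. The sketch below assumes that form and assumes that σ1 to σ8 follow the input order Na, Ca, Mg, SO4, HCO3, TDS, pH, T; neither assumption is confirmed by the text shown here. Only three rows (orders 1, 6, and 10) are included, and the input values are hypothetical, so the result is a partial sum and not a model prediction.

```python
from math import prod

# Assumed input order for the exponents sigma_1 ... sigma_8.
INPUT_NAMES = ("Na", "Ca", "Mg", "SO4", "HCO3", "TDS", "pH", "T")

# (exponents, coefficient) pairs copied from three rows of the coefficient table.
TERMS = [
    ((0, 0, 0, 1, 1, 0, 0, 0), 1.32e-5),   # order 1
    ((0, 0, 0, 0, 1, 0, 1, 0), 2.73e-4),   # order 6
    ((0, 0, 0, 1, 2, 0, 0, 0), 7.57e-9),   # order 10
]


def evaluate_terms(inputs, terms):
    """Sum of coefficient * product(input ** exponent) over the given terms."""
    x = [inputs[name] for name in INPUT_NAMES]
    return sum(w * prod(xi ** s for xi, s in zip(x, sigma)) for sigma, w in terms)


# Hypothetical input values, chosen only to make the arithmetic easy to follow.
sample = {"Na": 1.0, "Ca": 1.0, "Mg": 1.0, "SO4": 2.0,
          "HCO3": 3.0, "TDS": 1.0, "pH": 1.0, "T": 1.0}
partial_sum = evaluate_terms(sample, TERMS)
print(partial_sum)
```

Extending `TERMS` to all 30 rows would give the full polynomial under the same assumptions; mapping the continuous result to the output classes 1 and 2 would still require the rule defined with Eq. 41.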
Conclusions
In this paper, methods for predicting and inhibiting CaCO3 scale in hydrocarbon wells using ML techniques are presented. Further, a
conceptualized workflow to quantify the overall cost associated with implementing a scale-inhibition/treatment program is presented.
The results of applying such methodology on different data sets showed high prediction accuracy on two data groups. In addition, cost
savings were realized compared with a base-case scenario.
Future work on this subject can expand the ML techniques to include other methods available in the industry and compare their accuracy with that of the methods used in this paper. In addition, the applicability of this method to types of scale other than CaCO3 can be studied and analyzed.
Nomenclature
C = concentration of each ion, g/1000 g
Cbc = base-case cost
Ccsp = expected net saving of correctly predicted scale formation cases per well = Pcsp (CT − Cs)
CCNsp = expected saving of correctly predicted no scale formation cases per well = PCNsp * CT
CINsp = cost of incorrectly predicted no scale formation cases per well = PINsp * Cs
CIsp = cost of incorrectly predicted scale formation cases per well = PIsp * CT
Cp = overall expected net cost of the scale inhibition program = Xt (Ccsp + CIsp + CCNsp + CINsp)
Cs = cost of scale removal per well
CT = cost of scale-inhibition treatment per well
F0 = derived from artificial neural network of the scale deposition as function of five variables (pressure, temperature, brine concentration, velocity, and time)
F1 = depends on an empirical model, F0
F2 = locations at which scale would be deposited because of the surface position with respect to gravity and the surface being concave or convex
F3 = accounts for boned scaling, which is independent of gravity and a strong function of the pressure gradient
F4 = accounts for boned scaling, which is independent of gravity and a strong function of the pressure gradient
I = total ionic strength
K = constant, depends on the total salt concentration and temperature
Ksp = equilibrium constant
pAlk = negative logarithm of total alkalinity
pCa = negative logarithm of calcium concentration
pH = pH of water samples as actually determined
pHs = pH value when the CaCO3 achieves saturation in the system
PCNsp = probability of correct no-scale-formation prediction
Pcsp = probability of correct scale-formation prediction
PINsp = probability of incorrect no-scale-formation prediction
PIsp = probability of incorrect scale-formation prediction
PNs = probability of no-scale formation
Ps = probability of scale formation
PCO2 = partial pressure of CO2
Sf = solubility factor
T = temperature
V = valence of the ion
XCO2 = CO2 mole fraction in water mixture saturated with a gas containing CO2
Xt = total number of wells or cases in a given field
X1 = number of cases in which the scale prediction was correct
X2 = number of cases in which the scale prediction was incorrect
X3 = number of cases in which the no-scale prediction was correct
X4 = number of cases in which the no-scale prediction was incorrect
ẏ = scaling-rate function
μ = ionic strength
References
Amiri, M., Moghadasi, J., and Jamialahmadi, M. 2013. Prediction of Iron Carbonate Scale Formation in Iranian Oilfields at Different Mixing Ratio of
Injection Water with Formation Water. Energ Source Part A 35 (13): 1256–1265. https://doi.org/10.1080/15567036.2010.514596.
Bahadori, A. 2011. Estimation of Potential Precipitation from an Equilibrated Calcium Carbonate Aqueous Phase Using Simple Predictive Tool. SPE
Proj Fac & Const 6 (4): 158–165. SPE-132403-PA. https://doi.org/10.2118/132403-PA.
Bayona, H. J. 1993. A Review of Well Injectivity Performance in Saudi Arabia’s Ghawar Field Sea Water Injection Program. Paper presented at the
Middle East Show, Bahrain, 3–6 April. SPE-25531-MS. https://doi.org/10.2118/25531-MS.
Castillo, E., Cobo, A., Gutiérrez, J. M. et al. 1999. Functional Networks with Applications: A Neural-Based Paradigm, Vol. 473. Boston, Massachusetts,
USA: Springer International Series in Engineering and Computer Science, Springer.
Hamid, S., De Jesús, O., Jacinto, C. et al. 2016. A Practical Method of Predicting Calcium Carbonate Scale Formation in Well Completions. SPE Prod
& Oper 31 (1): 1–11. SPE-168087-PA. https://doi.org/10.2118/168087-PA.
Jordan, M. M., Williams, H., Linares-Samaniego, S. et al. 2014. New Insights on the Impact of High Temperature Conditions (176 C) on Carbonate and
Sulphate Scale Dissolver Performance. Paper presented at the SPE International Oilfield Scale Conference and Exhibition, Aberdeen, Scotland,
14–15 May. SPE-169785-MS. https://doi.org/10.2118/169785-MS.
Kamal, M. S., Hussein, I., Mahmoud, M. et al. 2018. Oilfield Scale Formation and Chemical Removal: A Review. J Pet Sci Eng 171 (December):
127–139. https://doi.org/10.1016/j.petrol.2018.07.037.
Lakshmi, D. S., Senthilmurugan, B., Drioli, E. et al. 2013. Application of Ionic Liquid Polymeric Microsphere in Oil Field Scale Control Process. J Pet
Sci Eng 112 (December): 69–77. https://doi.org/10.1016/j.petrol.2013.09.011.
Larson, T. E. and Buswell, A. M. 1942. Calcium Carbonate Saturation and Alkalinity Interpretation. J Am Water Works Ass 34 (11): 1667–1684. https://
www.jstor.org/stable/41233315.
Mackay, J. E., Collins, R. I., Jordan, M. M. et al. 2003. PWRI: Scale Formation Risk Assessment and Management. Paper presented at the International
Symposium on Oilfield Scale, Aberdeen, Scotland, UK, 29–30 January. SPE-80385-MS. https://doi.org/10.2118/80385-MS.
Moghadasi, J., Jamialahmadi, M., Müller-Steinhagen, H. et al. 2003a. Scale Formation in Iranian Oil Reservoir and Production Equipment during Water Injec-
tion. Paper presented at the International Symposium on Oilfield Scale, Aberdeen, Scotland, UK, 29–30 January. SPE-80406-MS. https://doi.org/10.2118/
80406-MS.
Moghadasi, J. M., Jamialahmadi, M., Müller-Steinhagen, H. et al. 2003b. Scale Formation in Oil Reservoir and Production Equipment during Water
Injection (Kinetics of CaCO4 and CaCO3 Crystal Growth and Effect on Formation Damage). Paper presented at the SPE European Formation
Damage Conference, The Hague, The Netherlands, 13–14 May. SPE-82233-MS. https://doi.org/10.2118/82233-MS.
Ramstad, K., Tydal, T., Askvik, K. M. et al. 2005. Predicting Carbonate Scale in Oil Producers from High Temperature Reservoirs. SPE J. 10 (4):
363–373. SPE-87430-PA. https://doi.org/10.2118/87430-PA.
Stiff, H. A. Jr. and Davis, L. E. 1952. A Method for Predicting the Tendency of Oil Field Waters To Deposit Calcium Carbonate. J Pet Technol 4 (9):
213–216. SPE-952213-G. https://doi.org/10.2118/952213-G.
Tariq, Z., Abdulraheem, A., Mahmoud, M. et al. 2018a. A Rigorous Data-Driven Approach To Predict Poisson’s Ratio of Carbonate Rocks Using a
Functional Network. Petrophysics 59 (6): 761–777. SPWLA-2018-v59n6a2.
Tariq, Z., Mahmoud, M., Abdulraheem, A. et al. 2018b. An Intelligent Solution To Forecast Pressure Drop in a Vertical Well Having Multiphase Flow
Using Functional Network Technique. Paper presented at the PAPG/SPE Pakistan Section Annual Technical Conference and Exhibition, Islamabad,
Pakistan, 10–12 December. SPE-195656-MS. https://doi.org/10.2118/195656-MS.
Tariq, Z., Mahmoud, M., and Abdulraheem, A. 2019. An Intelligent Data-Driven Model for Dean–Stark Water Saturation Prediction in Carbonate
Rocks. Neural Comput & Applic 2019: https://doi.org/10.1007/s00521-019-04674-z.
Vetter, O. J. and Farone, W. A. 1987. Calcium Carbonate Scale in Oilfield Operations. Paper presented at the SPE Annual Technical Conference and
Exhibition, Dallas, Texas, USA, 27–30 September. SPE-16908-MS. https://doi.org/10.2118/16908-MS.
Yeboah, Y. D., Somuah, S. K., and Saeed, M. R. 1993. A New and Reliable Model for Predicting Oilfield Scale Formation. Paper presented at the SPE
International Symposium on Oilfield Chemistry, New Orleans, Louisiana, USA, 2–5 March. SPE-25166-MS. https://doi.org/10.2118/25166-MS.