KMod Modeling
Permeability prediction using K.mod in Techlog
Techlog Tutorial - Ref: TT K.mod - Date: 15/03/2007
K.mod® - modelling/repair of core and/or log data

Table of contents

Objectives and Workflow
Data Checks
Data Selection
The K.mod "Model Main Control Window"
Population of the Model with Data
Initialization of the Model
Training the Model
  Assessing the quality of the model
Adapting the Model
  Change model structure and selected data
  Evaluating data outliers and back-propagation of reconstructed input data
  Adjusting weights
  Using a subset of the training data
Saving Your Work
  Saving the output curves
  Saving the model
Recalling and Applying the Model to New Wells
Special Uses of the MLP
  Reconstruction of poor quality log data
  Classification

Objectives and Workflow

The main objective when using K.mod® is to create a network model to predict the values of a desired formation property (e.g. permeability), or the values of a desired logging curve, from a combination of input log data. In the first case, the objective would be to predict the formation property outside cored intervals, whilst in the second case the objective would be to create modelled curve data to infill a target curve over just those intervals in which the target data were affected by, for example, bad hole conditions.

K.mod is a "Multilayer Perceptron" (MLP) type of neural network. It operates on the principle of back-propagation of errors to modify the choice of nodal weight factors for each node in a chosen layered design.
The network comprises:

- an input layer: the multivariate set of input curve vectors;
- one or more hidden layers of designated "nodes". Each node combines inputs from the layer "above" into a weighted sum which transfers information from the input layer onto the model of the target data, either directly (single hidden layer) or indirectly (more than one hidden layer). Errors between modelled and target data are recorded and used to strengthen or weaken contributions from each node in the model (the concept of "back-propagation"). Each node represents a partition of the problem set and comprises input data from one or more depth levels as dictated by the model solution;
- and an output layer (the target data to be modelled).

The workflow follows these broad steps:

- Verify the quality and consistency of the input data (edit, depth match, normalize, etc.)
- Select data and initialize the model
- Adjust the model as required
- Train the model, assessing quality of fit by use of diagnostic curves/plots
- Adapt weighting factors on input data as required
- Modify the input data selection
- Save the final model
- Apply the model to other wells

Data Checks

The user should perform a standard quality check of the requisite input data, verifying that the data have good assignment of families and units and that the curves are depth-matched and free of defects such as aberrant spikes. An illustration of the use of the project filter with the Variables Inventory and a Multi-well histogram is given as a suggested means to investigate data quality and integrity.

There are many tools in Techlog® for performing this kind of work; this illustration is included here as a reminder of the importance of this data preparation step.
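The tutorial describes K.mod's MLP only qualitatively. As a conceptual sketch (not Techlog's actual implementation), a single-hidden-layer perceptron trained by back-propagation of errors can be written in a few lines of NumPy; all data and dimensions here are synthetic stand-ins for log curves and a target property.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "log data": 4 input curves, 1 target (e.g. log-permeability proxy)
X = rng.normal(size=(200, 4))
y = (0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.3 * np.tanh(X[:, 2])).reshape(-1, 1)

# A 4-8-1 network: 4 inputs, 8 hidden nodes, 1 target node
W1 = rng.normal(scale=0.5, size=(4, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.05
losses = []

for _ in range(500):
    h = np.tanh(X @ W1 + b1)            # hidden nodes: weighted sums, squashed
    pred = h @ W2 + b2                  # output layer: the modelled target
    err = pred - y                      # modelled minus observed target
    losses.append(float((err ** 2).mean()))
    # Back-propagation: push the error back to adjust each node's weights
    dW2 = h.T @ err / len(X); db2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)    # chain rule through tanh
    dW1 = X.T @ dh / len(X); db1 = dh.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```

The falling `losses` list plays the role of the error curve K.mod draws during training: as the weights evolve, the mismatch between modelled and target data shrinks.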
There are also several exploratory statistical tools located under the Techstat menu, especially the Factorial Analysis menu, that are available to investigate relations amongst data and assist in the choice of pertinent data to use for modelling purposes. These tools are the subject of a separate tutorial session.

Data Selection

Invoke K.mod using its toolbar icon, the menu item under the TechANN drop-down menu, or the Ctrl-Shift-K hotkey combination. The data selection screen appears.

Note that we are using the families representation for multi-well data access; aliases could have been used as well as, or instead of, families alone. The duplication of a family name in the multi-selection Input box on the right will allow us to impose a weight on that input curve by representing it twice in the model. Presently, you will see that it is possible to select individually which curves will participate in the model. The "Output" family will be assigned to the modelled curve produced by K.mod, and this field is also used to signify the candidate curves for the target data of the process.

The K.mod "Model Main Control Window"

The model main control screen opens on the schematic structure of the network defined in the data selection screen. From right to left we see the input layer, a single hidden layer and the output layer (or target curve family). Note that each node of each layer is connected to all the nodes in the adjacent layer, but connections do not jump layers. The Property Tables associated with this main window are displayed alongside it.

To adjust the structure of the model, first unlock the screen by clicking in the white background of the window, then click the red key which appears in the top-right corner, turning it green, which is the symbol for being unlocked.
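The effect of representing a curve twice to weight it can be illustrated outside Techlog. In a penalized (ridge) regression, a duplicated input column shares the penalty between its two copies, so the curve's combined influence shrinks less than when it appears once; this is a hypothetical analogy, not K.mod's internal mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.3, size=200)

def ridge(X, y, lam):
    # Closed-form ridge regression: solve (X'X + lam*I) b = X'y
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

lam = 50.0
b_single = ridge(x.reshape(-1, 1), y, lam)          # curve represented once
b_double = ridge(np.column_stack([x, x]), y, lam)   # curve represented twice

eff_single = b_single[0]                 # influence when used once
eff_double = b_double[0] + b_double[1]   # combined influence of the pair
```

Under the penalty, `eff_double` ends up closer to the true slope than `eff_single`: duplication strengthens the curve's contribution.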
The number of nodes in the hidden layer (pale tan-brown nodes) can be adjusted by use of the mouse scroll wheel and/or the keyboard up/down (increase/decrease) arrows. A new hidden layer may be introduced by pressing the Insert key, whereupon a single node appears (connected to all other nodes). Place the cursor on this node and use the scroll wheel or arrows to adjust the number of nodes in the layer to suit.

Note that introduction of a second layer renders the model very complex. It therefore introduces the danger of "over-training" the system (i.e. having a model that is too specifically orientated to the training data), although there can be circumstances where it is the best strategy to have two hidden layers. We do not advise using more than two hidden layers.

In this model we will not initially use the duplications of curve data for weighting purposes, but just use a single representation of each curve. To "turn off" curves, click on the relevant brown ovals and they will turn grey to signify their status as being "off".

Notice that the title banner of the window updates with the model structure; in this case 4-8-1 means 4 inputs, 8 hidden nodes in a single layer, and 1 target node (cf. the previous picture). We generally advocate that the hidden layer has more nodes than the number of input data. This allows the program to partition the input data space into more parts than there are input data dimensions, which encourages the combinatorial processing of the input data (and reduces the risk of one node being completely dominated by a single input curve). It is also possible to change the colour of the output node, which controls the colour of the modelled curve displayed on the diagnostic plot.
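The over-training danger mentioned above is generic to flexible models, not specific to K.mod. A minimal sketch with polynomials (standing in for network complexity) shows the signature: a very complex model fits the noisy training points almost perfectly yet predicts held-out data worse than a simpler one. All data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
x_train = np.linspace(-1, 1, 12)
y_train = np.sin(2 * x_train) + rng.normal(scale=0.15, size=12)
x_val = np.linspace(-0.95, 0.95, 50)
y_val = np.sin(2 * x_val)                # noise-free "blind test" truth

def errors(degree):
    # More coefficients play the role of extra hidden layers/nodes
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    val_mse = float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
    return train_mse, val_mse

train_simple, val_simple = errors(3)     # modest model
train_complex, val_complex = errors(11)  # near-interpolating model
```

The complex model's training error is far lower, but its validation error is higher: it has learned the noise, which is exactly what the tutorial means by a model "too specifically orientated to the training data".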
Place the cursor in line with the target node "column" and click to access the colour selection screen. When adjustments are complete, click the green key symbol to turn it red again to signify that the screen is locked.

Population of the Model with Data

Click on the "Datasets" tab of the main control screen. The tab shows the wells of the project with the candidate curves found for each family.

Drag and drop a selected well or wells into the column marked "Learning" to act as the training data source; similarly, drag and drop a selected well or wells into the "Validation" column to act as blind-test validation data. When developing the model it is normal not to include wells in the "Model" column; this column is for application of the model to other wells once it is parameterized satisfactorily (see a later section of this tutorial).

In this example the family names shown list the candidate curves found for each family, and it is necessary to click on the radio button adjacent to the curve name SONI, then right-click and choose Default variable, in order to set the variable SONI as the variable of choice for the family "Compressional Slowness" wherever it appears in that column (this operation is performed once for each column, irrespective of how many wells are in the column).
Initialization of the Model

If it is necessary to limit the model training data to a particular zone or group of zones, make the model main control window the active window by clicking somewhere in it, then invoke the desired zonation and zones using the Zonation dock window. Click on the upper part of the Zonation dock window to select the zonation of choice, then click the zone or zones to involve from the list that appears in the lower part of the dock window.

Click on the "Initialise" button at the base of the main control screen. After resetting the network, one plot is produced for each training well and for each validation well. In our case two plots appear beside the model main control screen, and the work area is tiled to accommodate the windows. In this example it is necessary to adjust the layout plots using the scroll bars to bring the data into view. (The datasets are defined to exist from 0 ft in depth, so the plots open at the start of the datasets.)

Training the Model

For the first-pass training we will not adjust any system properties but go directly to the "Learn" button at the base of the model main control screen. The screen changes to the "Error curve" tab and two lines begin to snake across the crossplot that is exposed. The red line refers to the training data and the green line refers to the validation ("blind-test") data. The lines trace the change in overall error on the minimization surface being used by the program to evolve the best set of weights for the nodes in the network to build the requisite transformation function. As the lines descend across the crossplot, the overall differences between the observed and predicted target data become smaller: the fit of the network is improving.
The magnitude of the values of the lines is not of much interest in itself, since it depends on the data and the network structure. What is important is the trend: normally, the validation error can be expected to be higher than the training error (green line above red). If either or both lines on this plot start to rise as the network evolves, this is a sign that the fit is deteriorating. Sometimes this happens but further minimization runs of the same network then cause the fit to improve again and the lines to subside. This kind of behaviour can be seen in the example model from time to time, with certain combinations of parameters, and is merely the system at work exploring the solution space.

Once the default number of iterations (50) is complete, the model pauses. The plot layout displays the evolving modelled curve (in the same colour as was chosen on the model "Structure" screen) superimposed on the training data in the track with the yellow background, second from the right. In the right-hand track the goodness-of-fit and data consistency information are plotted as shading and a bounding curve respectively. Blue represents a better fit; red represents a lower-quality fit. The other tracks on the plot (tracks to the left, with black curves) show the input curve data.

If desired, a double-click in the Error Curve crossplot will unlock the plot and permit the user to drag the cursor to and fro across the plot to see the course of evolution of the modelled curve. Having explored the error curve in this way on the crossplot, the user may lock it again by a double-click; a restart with the "Learn" button will then cause the minimization to begin from the point or iteration represented by the last cursor position in the crossplot.
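Reading the red and green error curves amounts to watching two error histories and noting where the validation curve turns upward while the training curve keeps falling. A tiny sketch of that logic, with hypothetical error histories, is:

```python
def best_iteration(val_err):
    """Return the iteration at which validation error is lowest.

    Mimics watching the green (validation) error curve: training error
    may keep falling while validation turns up, signalling over-training.
    """
    return min(range(len(val_err)), key=val_err.__getitem__)

# Hypothetical histories: training keeps improving, validation turns up
train = [1.00, 0.60, 0.40, 0.30, 0.25, 0.22, 0.20]
val   = [1.10, 0.70, 0.50, 0.45, 0.47, 0.55, 0.60]

stop_at = best_iteration(val)
```

Here the fit on unseen data is best at iteration 3; continuing past that point only tailors the model more tightly to the training wells.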
This is a very useful tool to assess the repeatability of the solution: whether the process achieves the same error curve if restarted from an earlier point. It is not so obvious that it will do so if the error curve is very volatile.

In the above example the upper layout plot is the training data and the lower plot is the validation data. In this first-pass result it is evident that the training data are well matched after 50 iterations, but the modelled curve systematically fails to reach the highest values in the validation data.

Assessing the quality of the model

Below we see these two results in detail. The blue shaded curve on the right of the plots is a representation of the goodness of fit of the model. The shading colour is the variance between the observed and predicted model data. The bounding curve is an estimate of the quality of the model at each depth.

We can provide an estimate of these parameters because in the background of the K.mod process is the creation of a "Self-Organizing Map" (SOM). A SOM (see the Tutorial on Ipsom® for an outline explanation) provides a means to classify log (or core) data on the basis of patterns of readings, depth by depth. In K.mod, each depth in the training data set with target data is grouped into bins with other depths having similar input data patterns. Because we have target data at each of those depths, we also have an estimate of the ability of different patterns of log data to match (or not) their associated target data points. K.mod groups all the input log data into one of a few bins based on depth-by-depth pattern shape; the number of bins chosen is small enough that each bin will also contain some depths where target data exist. Hence each bin or pattern group in the SOM can be given an estimate of its ability to predict the target data by looking at the "observed minus predicted" statistic for all points in each bin where this calculation is possible.
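The per-bin quality estimate described above can be sketched without a full SOM: assign each depth's (normalised) input pattern to its nearest bin centroid, then compute the mean squared "observed minus predicted" residual within each bin. All patterns, residuals and centroids below are hypothetical.

```python
import math

# Hypothetical depth-by-depth input patterns (already normalised) and the
# observed-minus-predicted residuals at depths where target data exist
patterns  = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.95, 0.85), (0.50, 0.50)]
residuals = [0.02, -0.03, 0.40, -0.50, 0.05]

# Two fixed centroids standing in for SOM cells
centroids = [(0.1, 0.2), (0.9, 0.8)]

def nearest_bin(p):
    # Bin = the centroid whose pattern is closest in pattern space
    return min(range(len(centroids)),
               key=lambda i: math.dist(p, centroids[i]))

bins = [nearest_bin(p) for p in patterns]

def bin_variance(b):
    # Mean squared residual over all depths falling in bin b
    r = [res for res, k in zip(residuals, bins) if k == b]
    return sum(v * v for v in r) / len(r)

quality = [bin_variance(b) for b in range(len(centroids))]
# Low variance = good fit (blue shading); high variance = poor fit (red)
```

The distance from a pattern to its bin centroid (the value `math.dist` computes) is exactly what the bounding curve in the quality track represents: how typical a given depth is of its pattern class.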
This goodness of fit is calculated as a variance and shown as a range of colour (red = high variance, poor fit; blue = low variance, good fit).

[Figure: variance map using non-default settings (5x5 specified in the Properties Table: Map dimensions). Blue: low variance/good fit. Red: high variance/poor fit. Input data for both training and validation datasets are sorted into the bins of the SOM. Panels: Training Data | Validation Data (first-pass results).]

The SOM can be viewed by clicking on the "Variance Map" tab of the model main control window. Normally the map has a minimal number of bins. In the display above, each depth is located in the SOM behind K.mod on the basis of its input data pattern. This is equally true for the validation data as for the training data. In the quality track on the display, each depth is assigned the colour carried by the bin of the SOM in which that depth's data pattern is found, and the colour represents the quality of fit for data of that particular pattern type.

The bounding curve in the coloured track is a measure of the distance in pattern space between the individual data point and the bin centroid pattern. When the curve is of a high value (plotting towards the left of the track) the data point is a poor representation of the "average" pattern type for the bin. Conversely, when the curve is of a low value (plotting towards the right of the track) the data point is close to the "average" pattern for the bin and is a good representation of its class of data.

At any stage, crossplots of observed vs. predicted are easily created from the layout display(s) by selecting the two curves required (using Ctrl-click to make the multiple curve selection), then right-clicking and choosing Create plots > Cross-plot. Here is an example of such a plot from the training data, showing ~86% correspondence.
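The "% correspondence" read off an observed-vs-predicted crossplot is a coefficient of determination. As a small self-contained sketch (with hypothetical observed and predicted values):

```python
def r_squared(observed, predicted):
    """Coefficient of determination for an observed-vs-predicted crossplot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)          # total variance
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # misfit
    return 1 - ss_res / ss_tot

obs  = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.1]
fit = r_squared(obs, pred)
```

A value of `fit` near 1.0 corresponds to the tight training-data crossplot described in the text; the poorer validation crossplot would yield a distinctly lower value.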
A similar plot from the validation data shows a much poorer fit.

It is possible to display a confidence interval around the modelled curve. If this is tried in the current model, which is a logarithmic model, the low-side confidence interval includes the value zero and as such cannot be displayed on the logarithmically scaled plot; the high-side interval displays correctly. To see the intervals, change the Property "Confidence Interval\Visible" to yes.

In order to improve the applicability of our model, we must adapt the model in one of several ways.

Adapting the Model

To alter the model and improve its results we can do any or all of the following:

- Change the structure of the model
- Change the input data choices after examination of outliers/back-propagation reconstructions
- Introduce weight terms to ensure that the model "reaches" the full dynamic range of the input data
- Select a subset (selection) of the training data

Change model structure and selected data

To investigate adaptation strategies and retain the ability to compare results, first duplicate the model: right-click in the model main control window and select "Duplicate". Change the colour of the model output prediction curve by clicking on the red key in the "Structure" tab to unlock the view, and then invoke the colour selection tool by clicking in the "column" above the target curve node. At the same time, adjust the number of nodes in the hidden layer(s) and/or select/unselect input curve data nodes. In our case we will double the weight on the density and neutron data by activating both instances of each curve in this tab, and increase the number of nodes in the hidden layer. When this is completed, lock the tab by clicking on the key icon again to turn it red.
Then click on the "Initialize" button to reset the model and click "Learn" to re-run it. The output of this new model will be a green curve, and the quality of fit of the new model will be in the track at the extreme right of the output plot.

The degree of improvement in the model output after this new learning pass is slight. Something in the validation data is restraining the ability of the training data to act as a general representation of the model.

Evaluating data outliers and back-propagation of reconstructed input data

To investigate this we can activate the outlier detection facility. This is done by displaying the Property Table for the model itself: click anywhere in the model main control window to make it the active window (if it is not already). Click on the "Outliers\Visible" property to set it to yes and, if desired, change the number of standard deviations beyond which data are considered outliers (Tolerance).

Click on "Learn" again on the model main control window, or re-initialise and re-run the training. The result is that a series of red bands appears in particular input data tracks on the layout displays when data for that curve lie beyond the specified number of standard deviations of the combined training-plus-validation input curve data distribution. In the example we see a lot of red against the GAMM curve for the validation data set from Well9.
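The Tolerance test described above is a simple standard-deviation screen. As a conceptual sketch (the curve values below are hypothetical, not from the tutorial wells):

```python
def flag_outliers(values, tolerance=2.0):
    """Flag samples lying beyond `tolerance` standard deviations of the mean,
    analogous to the Outliers/Tolerance property (conceptual sketch only)."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [abs(v - mean) > tolerance * std for v in values]

# Hypothetical gamma-ray readings with one aberrant spike
gamma = [45.0, 50.0, 48.0, 52.0, 47.0, 120.0, 49.0, 51.0]
flags = flag_outliers(gamma)
```

Depths where `flags` is True correspond to the red bands K.mod paints in the input tracks.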
Another way of investigating potential inconsistency between training and validation data is to invoke the "back propagation" modelling and view the input curves plotted against back-propagation reconstructions of the input curves derived from the modelled result (the curve reconstructions are displayed in purple).

Here we can see that the GAMM curve is systematically poorly reconstructed in the validation data set and is acting as a restraint on the model's ability to achieve a good fit. Consequently, we remove the gamma ray from the training model entirely and re-run the case to see the effect. By itself this makes little difference, hence we try a new strategy: adjusting the weights applied to parts of the data.

Adjusting weights

Copy the model as before with the right-click "Duplicate" function in the model main control window. Display the Property Table for the model itself by clicking anywhere in the model main control window to make it the active window (if it is not already).

In this case we have chosen to add weights only to the high side of the data distribution; it is common also to add weights to the low side, not necessarily symmetrically, in order to avoid introducing a bias towards one side or the other of the data distribution. The need for the weights, their level of introduction and their severity are only determinable by trial and error. In this case we have introduced weights as follows:

  Weight low values: no     Values lower than: -       Number of iterations: 0
  Weight high values: yes   Values greater than: 1000  Number of iterations: 10

This means that any data point with a measured permeability greater than 1000 mD will be weighted to act as if it were present in the data set 10 times.
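The weighting table above amounts to replicating the high-permeability samples in the training set. A minimal sketch of that behaviour (the sample values and the `perm` field name are hypothetical):

```python
def apply_high_weights(samples, threshold=1000.0, repeat=10):
    """Replicate training samples whose target exceeds `threshold` so that
    each acts as if present `repeat` times (sketch of the weighting table)."""
    out = []
    for s in samples:
        out.extend([s] * (repeat if s["perm"] > threshold else 1))
    return out

data = [{"perm": 12.0}, {"perm": 350.0}, {"perm": 2500.0}]
weighted = apply_high_weights(data)
```

Only the 2500 mD sample is replicated, so a small fraction of the data gains a large influence on the fit at the high end, which is exactly the intent described in the text.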
In this example, using the Well1 training data, this will affect about 2% of the data. Although only a small amount of data is involved, the instruction gives the model the ability to extend its output when the log data patterns in the validation data set are related to the highest values amongst the training data, and we have done this without upsetting the good fit in the training data.

Here is an example of the observed vs. predicted plot from the training data after the recent model changes, showing 89% correspondence. A similar plot of the validation data now shows that 139 out of 201 data points can be characterized by an r² goodness-of-fit statistic of ~70%, whilst there are two groups of data that remain to be studied and accommodated by the model even after adjustment by weights and by curve selection.

Using a subset of the training data

This is performed by invoking the normal selection/interaction icon and identifying, on the crossplot(s) or layout(s) of one of the training wells, particular data to be included in the training set by use of the brush tool or polygon tool. Selected points are highlighted on all relevant plots. To train the model on only the selection, make the model main control window the active window and change the "Selection" property on the Property Table from no to yes. Re-initialise the model and re-run the training step(s). The user is encouraged to investigate how to improve the model further by his or her own study.

Saving Your Work

Once the model is parameterized to a satisfactory level, and has been seen to model a set of validation data to a sufficiently accurate degree, the model and its results can be saved.
Saving the output curves

Either click on the modelled curve in the layout and right-click "Variable\Save" to save the modelled results, or invoke the "Apply mode" property and change it from "Display" to "Save and Display" or "Save".

Saving the model

To save the model, right-click in the model main control window and click Save... Enter a name when prompted and the model will appear in the Project Browser under the heading "Networks".

Recalling and Applying the Model to New Wells

Double-click on the desired network in the Project Browser under the heading "Networks". The model main control window will open; the learning (training) and validation data specifications will be visible even if the datasets/variables are not available in the project. The model has saved references to only those curves that were actually used in the parameterization. To apply the model to new data, drag and drop the required datasets into the "Model" column of the "Datasets" tab of the new model main control window. Verify the variable choice as necessary, select zones as necessary using the Zonation dock window, and then click the "Apply" button. A pre-formatted layout appears for each of the training, validation and "application" wells. Thus it can be seen that the model saves the training and validation data within it, and these data do not need to be part of the active project since they are recalled as part of the model specification.
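The save-and-recall workflow above stores the network's structure, curve references and weights so it can be reapplied later. As an analogy only (Techlog's "Networks" storage format is internal; the field names and file here are hypothetical), a model can be round-tripped through a simple serialized record:

```python
import json
import os
import tempfile

# Hypothetical stand-in for a saved network: structure plus weights.
# Curve names other than GAMM and SONI (which appear in the tutorial)
# are illustrative placeholders.
model = {
    "structure": "4-8-1",
    "inputs": ["GAMM", "RHOB", "NPHI", "SONI"],
    "weights_hidden": [[0.1] * 8 for _ in range(4)],
    "weights_output": [0.2] * 8,
}

path = os.path.join(tempfile.gettempdir(), "kmod_demo_model.json")
with open(path, "w") as f:
    json.dump(model, f)          # "Save" the parameterized model

with open(path) as f:
    restored = json.load(f)      # "Recall" it in another session
```

The restored record carries everything needed to apply the transformation to a new well's curves, which is why the original training datasets need not be present in the new project.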