This paper describes a procedure for building a probabilistic decision tree that integrates data from multiple sources by means of conditional probabilities, and applies it to map fracture corridors (FCs) in a mature oil field with abundant production data. A fracture corridor is a tabular, subvertical, fault-related fracture swarm that intersects the entire reservoir and extends laterally for several tens or hundreds of meters. Direct indicators of fracture corridors, such as image logs, flow profiles, well tests, and seismic fault maps, are sometimes insufficient to map all fracture corridors in a field. In such cases, it is also necessary to use indirect fracture-corridor indicators from well data, such as productivity index (PI), gross rate, water cut, and openhole logs. Fracture corridors can be inferred from indirect indicators by a probabilistic decision tree, which makes predictions by integrating data from multiple sources and giving preference to the indicators with the highest relevance. Decision trees are constructed from a training set that includes measurements of both direct and indirect fracture-corridor indicators. In this study, wells with borehole images, production logs (flow profiles), and injector/producer short cuts are selected as the training set. The resulting decision trees reveal that total losses, gross production rates, and water cuts are the three most effective indirect indicators of fracture corridors in the test field.
A particular reservoir attribute, such as porosity, often has only sparse direct measurements. It is possible, however, to predict values of such a target variable with the help of a set of other variables that are correlated to some degree with the target variable and have abundant measurements. A common example is estimating porosity from seismic attributes. In this paper, the variables that have one-to-one correspondence to the target variable are called direct indicators, and the variables that have only some degree of correlation are called indirect indicators. For example, density and neutron logs are direct indicators of porosity, whereas seismic impedance is an indirect indicator.
Several statistical techniques can predict a target variable from a set of indirect indicators. They fall into two main groups: supervised prediction techniques and unsupervised prediction techniques.
In supervised prediction techniques, indirect indicators are correlated to a target variable by use of a training set that includes measurements of both direct and indirect indicators of the target variable. The resulting predictive system can then estimate values of the target variable solely on the basis of indirect indicators in wells that have no measurements of direct indicators. Multiple regression, back-propagation neural networks, and Bayesian decision trees belong to this category.
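The supervised idea can be illustrated with a minimal sketch. It assumes a single categorical indirect indicator and estimates the conditional probability of an FC as a relative frequency in the training set. The well records, class names, and the function `conditional_fc_probability` are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical training wells: (indirect-indicator class, FC confirmed by a
# direct indicator such as an image log?). Values are made up for illustration.
training = [
    ("high_water_cut", True),
    ("high_water_cut", True),
    ("high_water_cut", False),
    ("low_water_cut", False),
    ("low_water_cut", False),
    ("low_water_cut", True),
    ("low_water_cut", False),
]


def conditional_fc_probability(samples):
    """Estimate P(FC | indicator class) as a relative frequency."""
    totals = Counter(cls for cls, _ in samples)
    hits = Counter(cls for cls, has_fc in samples if has_fc)
    return {cls: hits[cls] / totals[cls] for cls in totals}


p_fc = conditional_fc_probability(training)

# Apply the estimate to a well that has only the indirect indicator.
unlabelled_well = "high_water_cut"
print(f"P(FC | {unlabelled_well}) = {p_fc[unlabelled_well]:.2f}")
```

With real data, each indicator would first have to be discretized into classes, and the training set would need enough wells per class for the frequencies to be meaningful.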
Where the training set is small or no direct indicators are available, it is possible to adopt statistical techniques that do not require extrapolation from a training set. These are termed unsupervised prediction techniques. Several such techniques exist, including cluster analysis, unsupervised neural networks, and factor analysis (Wasserman 1989; Chester 1993; Van De Geer 1971). The basic idea is to discover hidden factors that control the indicator variables and to interpret these factors in terms of the target variable. For example, the density (spacing/relative abundance) of conductive fractures may be responsible for rapid water-cut rise, high initial PI, and high gross rate. These three indirect indicators will then be highly correlated with each other. From this high correlation, an unsupervised prediction technique may uncover the hidden factor (fracture density) that controls all three variables.
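The premise behind this example can be sketched on synthetic data: if one hidden factor drives three indicators, their pairwise correlations are high. The sketch below only demonstrates that premise with Pearson correlations and does not perform factor analysis itself. The noise level, the number of wells, and all variable names are assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_wells = 200

# Hidden factor (standardized fracture density); never observed directly.
fracture_density = [rng.gauss(0.0, 1.0) for _ in range(n_wells)]


def noisy(factor, noise_sd=0.3):
    """Synthetic indicator: hidden factor plus independent Gaussian noise."""
    return [f + rng.gauss(0.0, noise_sd) for f in factor]


indicators = {
    "water_cut_rise": noisy(fracture_density),
    "initial_pi": noisy(fracture_density),
    "gross_rate": noisy(fracture_density),
}

names = list(indicators)
correlations = {
    (a, b): pearson(indicators[a], indicators[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
for pair, r in correlations.items():
    print(pair, round(r, 2))
```

In practice the direction of inference is reversed: the correlations are observed, and the common factor is inferred from them.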
Both supervised and unsupervised inference are methods for making predictions with incomplete information (Tamhane et al. 2000; Fletcher and Davis 2002). Most applications in the oil industry use fuzzy logic or fuzzy neural networks. These applications also use soft computing for decision making with incomplete evidence, and fuzzy-expert systems for risk reduction (Weiss et al. 2001; Chen et al. 2002; Saggaf and Nebrija 2003). The approach has been applied in particular to mapping fracture density from seismic attributes (Ouenes et al. 1995; Zellou et al. 2003; Bloch et al. 2003).
Both supervised and unsupervised statistical techniques aim at determining some global attribute of dispersed fractures, such as density. Often, however, it is fracture corridors rather than dispersed fractures that are characterized as the main reservoir heterogeneity (Ozkaya and Richard 2006). An FC is a tabular, subvertical, fault-related fracture swarm that intersects the entire reservoir and extends laterally for several tens or hundreds of meters (Fig. 1). FCs can be fluid-conductive or cemented. In this paper, an FC denotes a fluid-conductive FC unless otherwise specified. FCs may have significant conductivity and may play a major role in reservoir dynamics by providing pressure support and, therefore, causing early water breakthroughs and increased gross rates.
The four main attributes needed to map an FC are location, strike, length, and conductivity. Here, we focus primarily on locating FCs and discuss only briefly how the other attributes can be estimated. Our objective is not the actual mapping of FCs but to examine Bayesian decision trees as a viable technique for FC identification. The basis and procedures for calculating conditional probabilities, entropy, and information gain (IG), and for constructing decision trees, are explained in the Appendix.
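As a rough illustration of how entropy and IG can rank indirect indicators, the sketch below uses the standard Shannon-entropy definitions, which may differ in detail from the formulation in the Appendix. The eight well records and the attribute names `total_losses` and `water_cut` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target="fc"):
    """Entropy of the target minus its weighted entropy after splitting."""
    base = entropy([r[target] for r in rows])
    n = len(rows)
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return base - remainder


# Hypothetical training wells; "fc" is known from a direct indicator.
wells = [
    {"total_losses": "yes", "water_cut": "high", "fc": True},
    {"total_losses": "yes", "water_cut": "high", "fc": True},
    {"total_losses": "yes", "water_cut": "low", "fc": True},
    {"total_losses": "no", "water_cut": "high", "fc": True},
    {"total_losses": "no", "water_cut": "low", "fc": False},
    {"total_losses": "no", "water_cut": "low", "fc": False},
    {"total_losses": "no", "water_cut": "high", "fc": False},
    {"total_losses": "no", "water_cut": "low", "fc": False},
]

gains = {a: information_gain(wells, a) for a in ("total_losses", "water_cut")}

# The attribute with the largest IG would be placed at the root of the tree.
for attribute, ig in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{attribute}: IG = {ig:.3f}")
```

In this made-up data set the split on `total_losses` yields the larger gain, so it would be tested first. Repeating the same ranking within each branch grows the rest of the tree.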