Although machine learning models can provide tremendous value to the unconventional oil and gas industry, interpreting their inner workings and outputs can be a laborious, time-consuming, and difficult process. Here we present a novel method for extracting an overall rock quality index (RQI) from a machine learning model trained on well logs. This RQI, which we term geoSHAP, can be used for performance benchmarking, completions tailoring, and acreage-evaluation workflows. We trained a decision-tree-based model on a regional Williston Basin dataset. The model predicts oil, gas, and water production at 30-day increments out to IP 720 from training features describing completions design, petrophysical grids, and spacing/stacking parameters. We started with over 400 petrophysical grids and reduced them to 5 principal components using a Gaussian-kernel principal component analysis. We then employ SHAP values (SHapley Additive exPlanations), which reflect how much each individual feature contributed to the model prediction. To extract our RQI, we sum the SHAP values of the geologic principal components for each well at each IP day. These summed geoSHAP values reflect the overall rock quality around the basin, identifying sweet spots and low-performing areas. The model identifies high-performing areas on the Nesson Anticline, the Antelope Anticline, the Fort Berthold area, and Parshall/Sanish. We also show how geoSHAP trends with overall operator performance and can be used to benchmark performance relative to expectation. The method is repeatable across tree-based machine learning algorithms. It removes the need to construct partial dependence plots or to take the time-consuming step of running synthetic pads across the entire basin. Additionally, it simplifies the selection of petrophysical grids and removes the multicollinearity issues that can debilitate machine learning models.
GeoSHAP provides a purely empirical perspective on rock quality that can be compared with more prescriptive, assumption-laden traditional methods, such as combining Archie's equation with recovery factors. The method also generalizes to models built with simpler, easier-to-obtain data such as formation tops and isopachs.
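The geoSHAP aggregation step described above can be sketched as follows. The array shapes, the SHAP values, and the column indices of the geologic principal components are illustrative stand-ins; in practice the SHAP values would come from a tree explainer applied to the trained production model.

```python
import numpy as np

# Hypothetical setup: SHAP values for 4 wells and 8 features, where
# columns 0-4 stand in for the five geologic principal components and the
# remaining columns stand in for completions/spacing features.
rng = np.random.default_rng(0)
shap_values = rng.normal(size=(4, 8))
geo_pc_cols = [0, 1, 2, 3, 4]  # indices of the geologic principal components

# geoSHAP: per-well sum of the SHAP contributions of the geologic PCs,
# giving one rock-quality index value per well at a given IP day.
geoshap = shap_values[:, geo_pc_cols].sum(axis=1)
print(geoshap.shape)
```

Repeating this sum at each IP day (30, 60, ..., 720) yields the geoSHAP time series used for benchmarking.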
Over the past several years, machine learning methods have come into increasingly common use for well-performance prediction and design optimization in unconventional reservoirs. These algorithms offer several advantages that make them attractive to engineers and geoscientists, including increased accuracy, the ability to handle complex problems, and reduced bias. However, the difficulty of assembling datasets and a lack of interpretability have limited widespread adoption. Because machine learning models thrive on large datasets, operators are often forced to incorporate publicly available data from non-operated wells (whether directly from a state database or through a vendor). Even if an operator has hundreds or even thousands of horizontal wells within a given basin, the implemented completions and spacing/stacking configurations may poorly sample the distribution of design parameters, limiting the effectiveness of a single-operator model.
Chen, Cunliang (Tianjin Branch of CNOOC Co., Ltd) | Han, Xiaodong (Tianjin Branch of CNOOC Co., Ltd) | Yang, Ming (Tianjin Branch of CNOOC Co., Ltd) | Zhang, Wei (Tianjin Branch of CNOOC Co., Ltd) | Wang, Xiang (Changzhou University) | Dong, Peng (China University of Petroleum)
Long-term waterflooding leads to the formation of dominant channels in sandstone oil reservoirs, which aggravates reservoir heterogeneity and degrades the displacement performance of the injected water. Ineffective water circulation through the dominant channels significantly increases the cost of water injection and reduces oilfield exploitation efficiency. Valid identification and control of dominant channels are therefore essential for enhancing the oil-production efficiency of waterflooded reservoirs. Although many methods have been proposed to identify dominant channels, their accuracy is often unsatisfactory, and the results obtained by different methods are inconsistent with one another.
A new method that comprehensively utilizes multiple data types is proposed here to improve the identification accuracy of dominant channels. The formation of a dominant channel is affected by both geological and development factors, and all of these parameters change during the waterflooding process and can reflect the formation of dominant channels. In our method, an evaluation-index system consisting of both geological and development factors is first formed and analyzed. Principal component analysis (PCA) is then applied to aggregate the multiple independent indexes into a comprehensive index, and the calculated relative value of the comprehensive index is taken as the assessment criterion to identify the dominant channel. PCA is widely used for dimensionality reduction in statistics; it is well-suited to dominant-channel identification because it can take various data into consideration and reduces subjectivity during the identification process.
The proposed method has been applied in several oilfields for dominant channel identification, and the results are entirely satisfactory. Accurate identification of dominant channels is helpful for the design of an effective adjustment plan, which could provide technical support for achieving higher production efficiency and better economic benefits.
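The index-aggregation step of this kind of method can be sketched as follows. The evaluation matrix, its dimensions, and the number of retained components are hypothetical, and the variance-weighted aggregation shown is one common way to form a comprehensive PCA index, not necessarily the paper's exact formula.

```python
import numpy as np

# Hypothetical evaluation matrix: rows are well intervals, columns are
# geological/development indexes (e.g., permeability contrast, water cut).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 6))
Xs = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize each index

# PCA via eigendecomposition of the correlation matrix.
C = np.cov(Xs, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]                # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 3                                            # retained components
scores = Xs @ eigvecs[:, :k]                     # principal-component scores
weights = eigvals[:k] / eigvals[:k].sum()        # variance-explained weights

# Comprehensive index: variance-weighted aggregate of the PC scores,
# rescaled to a relative value in [0, 1] for use as the assessment criterion.
comprehensive = scores @ weights
relative = (comprehensive - comprehensive.min()) / np.ptp(comprehensive)
print(relative.shape)
```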
Methods and discussion
Aero-geophysical surveys acquire data that can be used for many applications, including urban planning, agriculture, forestry, or, in the case of CPRM - Serviço Geológico do Brasil (Geological Survey of Brazil), geological mapping of a region. In this latter case, the geologic interpretation is based on visual pattern recognition. We use a popular numerical image-processing software package to test and evaluate procedures with the objective of delineating the geological expression of the studied areas using the primary products (TC, K, eTh and eU) of airborne gamma-ray spectrometry surveys.
While a number of techniques have been described in the literature for wavefield separation (Freire et al., 1988; Hinds et al., 1996; Maraschini et al., 2016), the median filter (Hardage, 1985) is still used by most contractors the vast majority of the time. This paper demonstrates the benefits of a cascaded principal component analysis (PCA) approach. More accurate and robust wavefield separation allows more information about earth properties and multiples to be extracted.
Presentation Date: Tuesday, September 26, 2017
Start Time: 9:45 AM
Presentation Type: ORAL
Correlations between isofrequency amplitude traces from spectral decomposition provide a means of finding the frequency notches induced by thin layers. Isofrequency traces tend to be strongly correlated among the frequencies at spectral nulls, and separately among the frequencies away from those notches. Spectral principal-component (PC) amplitude attributes take advantage of this property and are indicative of layer thickness. With proper trace scaling and spectral balancing, spectral PC amplitudes are independent of layer reflection coefficients. Layers with purely odd-pair and purely even-pair reflection coefficients have distinctive spectral PC-thickness relationships in synthetic wedge models. Three spectral PC attributes individually delineate amplitudes from 1) an isolated reflection unaffected by tuning, 2) tuning of an even reflection pair, and 3) tuning of an odd reflection pair in a 3-D synthetic turbidite model.
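The mechanics of extracting a spectral PC amplitude attribute can be sketched as follows. The synthetic amplitude matrix is an illustrative stand-in for real spectral-decomposition output, not the wedge or turbidite models of the paper; only the balancing and PC-extraction steps are shown.

```python
import numpy as np

# Illustrative spectral-decomposition amplitudes: one column per
# isofrequency trace, with a common thickness-driven component plus
# independent noise standing in for real data.
rng = np.random.default_rng(2)
n_samples, n_freqs = 200, 12
thickness_signal = rng.normal(size=(n_samples, 1))
A = thickness_signal @ rng.normal(size=(1, n_freqs)) \
    + 0.1 * rng.normal(size=(n_samples, n_freqs))

# Balance each isofrequency trace (zero mean, unit variance) so the PC
# amplitudes do not depend on overall reflection strength.
Ab = (A - A.mean(axis=0)) / A.std(axis=0)

# Spectral principal components from the frequency-by-frequency covariance;
# the leading PC amplitude is the attribute sensitive to correlated
# isofrequency behavior.
eigvals, eigvecs = np.linalg.eigh(np.cov(Ab, rowvar=False))
pc1 = Ab @ eigvecs[:, -1]
explained = eigvals[-1] / eigvals.sum()
print(round(float(explained), 3))
```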
Presentation Date: Tuesday, September 26, 2017
Start Time: 11:25 AM
Presentation Type: ORAL
Liu, Bo (King Fahd University of Petroleum and Minerals) | Nuha, Hilal (King Fahd University of Petroleum and Minerals) | Deriche, Mohamed (King Fahd University of Petroleum and Minerals) | Mohandes, Mohamed (King Fahd University of Petroleum and Minerals) | Fekri, Faramarz (Georgia Institute of Technology)
This work considers data compression for sequential seismic sensor arrays. First, the statistics of the seismic traces collected by all the sensors are modeled with a mixture model. A distributed Principal Component Analysis (PCA) compression scheme for sequential sensor arrays is then designed. The proposed scheme does not require transmitting the traces, leading to more efficient computation and compression than conventional local PCA compression. Furthermore, an efficient communication scheme is developed for the sequential sensor array to deliver the local statistics to the fusion center: the sensors update and pass along a data package consisting of cumulative variables. The size of the package does not grow throughout the process, which is more efficient than the direct communication scheme. Finally, the performance of the proposed scheme is evaluated on both real and synthetic seismic data.
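The fixed-size-package idea can be sketched as follows. The package contents chosen here (sample count, running sum, running scatter matrix) are one plausible choice of cumulative variables sufficient to compute a PCA basis at the fusion center; the paper's exact package is not reproduced here.

```python
import numpy as np

# Hypothetical array: four sensors, each holding a few traces of length 16.
rng = np.random.default_rng(3)
trace_len = 16
sensors = [rng.normal(size=(rng.integers(5, 10), trace_len)) for _ in range(4)]

# Fixed-size package: (n, sum vector, sum of outer products). Each sensor
# adds its statistics and passes the package on; its size never grows.
n, s, S = 0, np.zeros(trace_len), np.zeros((trace_len, trace_len))
for traces in sensors:                 # sequential pass along the array
    n += traces.shape[0]
    s += traces.sum(axis=0)
    S += traces.T @ traces             # cumulative scatter matrix

# Fusion center: covariance and principal directions recovered from the
# cumulative statistics alone, without ever receiving the raw traces.
mean = s / n
cov = S / n - np.outer(mean, mean)
eigvals, eigvecs = np.linalg.eigh(cov)
basis = eigvecs[:, ::-1][:, :4]        # top-4 principal directions
print(basis.shape)
```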
Presentation Date: Tuesday, October 18, 2016
Start Time: 1:25 PM
Location: Lobby D/C
Presentation Type: POSTER
Huang, Weilin (China University of Petroleum–Beijing) | Wang, Runqiu (China University of Petroleum–Beijing) | Zhou, Yanxin (China University of Petroleum–Beijing) | Chen, Yangkang (University of Texas–Austin) | Yang, Runfei (University of British Columbia)
Principal component analysis (PCA) is an effective proper-orthogonal-decomposition (POD) method for data analysis. The goal of PCA is to reduce the dimensionality of a data set while retaining as much of the variance present in the data as possible. We assume that the random noise and the irregularly missing data are additive and uncorrelated with the signal, and we use PCA to simultaneously reconstruct and denoise seismic data. In essence, PCA finds a lower-dimensional optimal approximation of the initial data in the least-squares sense. However, the signal deviates from this optimal approximation in the lower-dimensional space. For this reason, we derive a fine-tuning operator acting on the extracted principal components to bring the reconstructed data closer to the signal. Application of the proposed improved method to synthetic and field seismic data demonstrates superior performance compared with traditional PCA.
Presentation Date: Tuesday, October 18, 2016
Start Time: 1:00 PM
Presentation Type: ORAL
Siena, Martina (Politecnico di Milano) | Guadagnini, Alberto (Politecnico di Milano) | Della Rossa, Ernesto (eni S.p.A.) | Lamberti, Andrea (eni S.p.A.) | Masserano, Franco (eni S.p.A.) | Rotondi, Marco (eni S.p.A.)
We present and test a new screening methodology to discriminate among alternative and competing enhanced-oil-recovery (EOR) techniques to be considered for a given reservoir. Our work is motivated by the observation that, even though a considerable variety of EOR techniques has been successfully applied to extend oilfield production and lifetime, an EOR project requires extensive laboratory and pilot tests before fieldwide implementation, and preliminary assessment of EOR potential in a reservoir is critical in the decision-making process. Because similar EOR techniques may be successful in fields sharing some global features, we consider fluid (density and viscosity) and reservoir-formation (porosity, permeability, depth, and temperature) properties as basic discrimination criteria. Our approach is observation-driven and grounded on an exhaustive database that we compiled from worldwide EOR field experiences. A preliminary reduction of the dimensionality of the parameter space over which EOR projects are classified is accomplished through principal-component analysis (PCA). A screening of target analogs is then obtained by classifying documented EOR projects with a Bayesian-clustering algorithm. Considering the cluster that includes the EOR field under evaluation, an intercluster refinement is then accomplished by ordering cluster components on the basis of a weighted Euclidean distance from the target field in the (multidimensional) parameter space. Distinctive features of our methodology are that (a) all screening analyses are performed on the database projected onto the space of principal components (PCs) and (b) the fraction of variance associated with each PC is taken as the weight of the Euclidean distance that we determine. As a test bed, we apply our approach to three fields operated by Eni. These include light-, medium-, and heavy-oil reservoirs, where gas, chemical, and thermal EOR projects were, respectively, proposed.
Our results are (a) conducive to the compilation of a broad and extensively usable database of EOR settings and (b) consistent with the field observations related to the three tested and already planned/implemented EOR methodologies, thus demonstrating the effectiveness of our approach.
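The variance-weighted distance ranking described above can be sketched as follows. The project data, their dimensions, and the choice of the first project as the target field are hypothetical; only the projection and weighting mechanics are shown.

```python
import numpy as np

# Hypothetical database: 30 EOR projects described by 6 fluid/rock
# properties (e.g., density, viscosity, porosity, permeability, depth,
# temperature), standardized before projection.
rng = np.random.default_rng(4)
X = rng.normal(size=(30, 6))
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# Project onto principal components.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Xs @ eigvecs                       # coordinates in PC space
w = eigvals / eigvals.sum()                 # variance fraction per PC

# Euclidean distance in PC space, weighted by each PC's variance fraction,
# from a target field to every documented project; closest analogs first.
target = scores[0]                          # treat project 0 as the target
d = np.sqrt((((scores - target) ** 2) * w).sum(axis=1))
ranking = np.argsort(d)
print(ranking[0])                           # the target itself, at distance 0
```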
Although principal-component analysis (PCA) has been widely applied to effectively reduce the number of parameters characterizing a reservoir, its disadvantages are well-recognized by researchers. First, PCA may distort the probability-distribution function (PDF) of the original model, especially for non-Gaussian properties such as the facies indicator or the permeability field of a fluvial reservoir. Second, it smears the boundaries between different facies. Therefore, the models reconstructed by traditional PCA are generally unacceptable. In this paper, a work flow is proposed to integrate cumulative-distribution-function (CDF) mapping with PCA (CDF/PCA) for assisted history matching on a two-facies channelized reservoir. The CDF/PCA is developed to reconstruct reservoir models by use of only a few hundred principal components. It inherits the advantage of PCA to capture the main features or trends of spatial correlations among properties, and, more importantly, it can properly correct the smoothing effect of PCA. Integer variables such as facies indicators are first regenerated by truncating their corresponding PCA results with thresholds that honor the fraction of each facies; real variables such as permeability and porosity are then regenerated by mapping their corresponding PCA results to new values according to the CDF curves of the different properties in the different facies. Therefore, the models reconstructed by CDF/PCA preserve both the geological (facies fraction) and the geostatistical (non-Gaussian, multimodal distribution) characteristics of their original or prior models. The CDF/PCA method is first applied to a real-field case with three facies to quantify the quality of the reconstructed models. Compared with traditional PCA results, the integration of CDF-based mapping with PCA can significantly improve the quality of the reconstructed reservoir models.
Results for the real-field case also reveal some limitations of the proposed CDF/PCA, especially when it is applied to reservoirs with three or more facies. The CDF/PCA, together with an effectively parallelized derivative-free optimization method, is then applied to history matching of a synthetic case with two facies. The geological facies, reservoir properties, and uncertainty characteristics of the production forecasts of models reconstructed with CDF/PCA are consistent with those of the original models. Our results also demonstrate that the CDF/PCA can condition to both hard data and production data with minimal compromise of geological realism.
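The two CDF/PCA corrections, truncation of the facies indicator and CDF mapping of real properties, can be sketched as follows. The smooth PCA outputs, the prior distribution, and the target facies fraction are all illustrative stand-ins for the paper's reservoir models.

```python
import numpy as np

rng = np.random.default_rng(5)

# (1) Truncation: a smooth PCA output for the facies indicator is cut at a
# threshold chosen so the binary facies honor a target facies fraction.
pca_facies = rng.normal(size=1000)          # smooth PCA indicator field
target_fraction = 0.3                       # desired channel-facies fraction
threshold = np.quantile(pca_facies, 1 - target_fraction)
facies = (pca_facies > threshold).astype(int)

# (2) CDF mapping: a smoothed PCA property field is mapped back to the CDF
# of the prior model by replacing each value with the prior value of equal
# rank, restoring the prior (here lognormal) distribution.
prior_perm = rng.lognormal(mean=3.0, sigma=1.0, size=1000)
pca_perm = rng.normal(size=1000)            # smoothed PCA permeability field
ranks = np.argsort(np.argsort(pca_perm))    # rank of each cell
mapped_perm = np.sort(prior_perm)[ranks]
print(round(float(facies.mean()), 3))       # honors the target fraction
```

The mapping preserves the spatial ordering captured by PCA while exactly restoring the prior histogram.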
Seismic data are always contaminated with noise. Therefore, signal-to-noise ratio enhancement plays an important role in seismic data processing. This paper illustrates a robust principal component analysis (RPCA) method to suppress erratic noise that contaminates seismic data. The method operates in the frequency-space domain and relies on a robust low-rank approximation of the seismic data volume. We adopt a nuclear-norm constraint that yields the low-rank approximation of the desired data while using an ℓ1-norm constraint to properly estimate the erratic (sparse) noise. The problem is then tackled via a first-order gradient iteration with two soft-thresholding steps. We illustrate the effectiveness of this method via synthetic examples.
Principal component analysis (PCA) is an important tool for multivariate analysis in statistics. The idea is to reduce the dimensionality of a data set while preserving as much of the variability of the data as possible (Jolliffe, 2010). Consider recovering a low-rank matrix L from the observed data
D = L+E, (1)
where E is a matrix representing the additive error. If we assume E is composed by small random perturbations, an optimal estimate of L can be acquired via the following optimization problem
min ‖E‖_F²
s.t. rank(L) = k, D = L + E. (2)
The problem can be efficiently solved via the singular value decomposition (SVD) (Golub and van Loan, 1996). The observed data D can be decomposed into a group of eigen-images via the SVD. The low-rank component L can be described by the few eigen-images associated with the largest singular values. The error E, however, has its energy spread over all the eigen-images (Trickett, 2003).
A variety of methods based on PCA have been developed in seismic data processing. For instance, Ulrych et al. (1999) introduced a time-domain matrix rank-reduction method to eliminate incoherent noise from seismic records. A related family of methods, based on the Karhunen-Loève transform, has also been introduced for the enhancement of the signal-to-noise ratio of prestack gathers (Al-Yahya, 1991).