U-NET, RESNET OR NONE: COMPARING AND CONTRASTING CONVOLUTIONAL NEURAL NETWORK ARCHITECTURES IN INTERPRETATION OF A SALT SEISMIC SETTING

By James Benco

ABSTRACT

One of the most crucial tasks in seismic reflection imaging is the identification of salt bodies and the oil and gas reservoirs associated with them. For the most part, salt bodies have been identified by visually picking the boundary between the salt and the sedimentary sequence, which invites error and requires a considerable amount of work. Deep learning techniques have reached impressive levels of object identification, potentially enough to supplant the visual analysis currently performed by the industry. Convolutional Neural Networks (CNNs) have been invaluable in computer vision applications and are the primary tool applied in this setting. The architecture of a CNN is important, as certain architectures lend themselves to certain tasks. We explore the differences between the U-Net and ResNet architectures in the application of salt boundary identification, along with CNN adjustments including the choice of activation function (ReLU, Sigmoid, Softmax). The results of this study show that, of the architectures presented, the ResNet model performs best at separating salt structures from sedimentary facies in seismic reflection survey images.

INTRODUCTION
Purpose and Scope

The main objective of this study is to determine which CNN architecture and activation function perform best on a training set of noisy salt dome images. Specifically, varying the architecture (U-Net, ResNet) and the activation function (ReLU, Sigmoid, Softmax) should provide useful insight into the most efficient methodology for this work.
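For reference, the three activation functions named above can be written in a few lines. This is a minimal NumPy sketch rather than the code used in this study; the `logit` value is a made-up example. It also illustrates why Sigmoid and Softmax are closely related for a two-class (salt versus sediment) problem: a two-class softmax over the scores (z, 0) gives the same probability as a sigmoid applied to z.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: keeps positive values, zeroes out the rest."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Squashes each value independently into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Turns a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

logit = 1.5  # hypothetical raw network output for one pixel
p_sigmoid = sigmoid(logit)
p_softmax = softmax(np.array([logit, 0.0]))[0]  # two-class softmax, first class
```

Here `p_sigmoid` and `p_softmax` agree, so for a binary mask the choice between the two is mostly a matter of how the output layer is laid out.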

Importance of This Study

This is an important question: salt body interfaces host some of the primary traps for oil and natural gas. In a basin, after the oil is generated and starts migrating out of the source rock and through the reservoir rock, certain salt body geometries act as impermeable boundaries and as cap rock (Posey and Kyle, 1988), trapping the fluid against the salt. On the Gulf Coast of the United States, a thick salt sequence and subsequent halokinesis have given rise to one of the world’s best developed salt dome provinces (Posey and Kyle, 1988). In fact, the most famous gushing oil well in history, at Spindletop Dome (Halbouty, 1979), is the result of these salt dome geometries.

LITERATURE REVIEW
Overview

Given the importance of this geologic setting, strategies to precisely determine salt diapir boundaries are paramount. This review covers hydrocarbon migration within this setting, the history of convolutional neural networks and their usefulness here, and the architectures of the specific CNNs studied (U-Net and ResNet).

Halokinesis and Hydrocarbon Migration

Salt tectonics and salt movement (halokinesis) play a substantial role in oil and natural gas production, particularly in the United States, where salt structures are the primary cap structures for traditional oil and natural gas reserves. Gulf offshore oil production accounts for about 15% of total US crude oil production (U.S. Energy Information Administration, 2022), with salt structures being the cap rock for these systems. In a traditional petroleum trap, hydrocarbons migrate from the source rock (shale) through a reservoir rock formation (a porous or fractured formation) until the fluid either degrades as its positive buoyancy carries it toward the surface, or is trapped beneath or alongside an impermeable formation. In the Gulf of Mexico, this impermeable formation is typically a salt dome or diapir. As seen in Fig 1, these structures move in the subsurface under overburden pressure and create pillar/diapir structures.

Fig 1. Cross-section of the northern Gulf Coast region. Modified after Salvador and Buffler (1982).

These structures provide migration pathways from source rocks to reservoir rocks and, because of the irregular shape of the salt dome itself, the salt acts as a cap rock for hydrocarbon fluids. Figure 2 displays the form of hydrocarbon trap created by salt diapirs.

Fig 2. This figure shows how a salt diapir (formation denoted by ‘+’ texture) moves isostatically through other formations and creates a piercement trap. The hydrocarbons in this system (denoted by the shaded section) are located within the porous reservoir rock, and are surrounded by cap rock (the formation denoted by ‘-’ texture and the salt diapir). Modified from Biddle and Wielchowsky.

The industry determines the size, orientation and location of these structures by interpreting seismic reflection surveys performed over the area. These surveys record seismic waves (P and S waves) reflecting off density boundaries within the subsurface. The data undergo a series of processing steps to stack the recordings from the seismometers and improve the signal-to-noise ratio. Figure 3 presents an example seismic reflection image from the dataset used in this study.

Figure 3. This figure shows the seismic reflection image of a salt diapir and surrounding formations. The diapir presents as the large structure with the red marked boundary.

Images such as Figure 3 are typically interpreted visually by hand. Automation has increased in gathering much of this data and screening out structures that are not salt diapirs; ultimately, however, the reports and locations are still interpreted by humans, which is a production bottleneck for the oil and gas industry. Deep learning thus has far-reaching potential in this field for interpreting these structures quickly and accurately.

Primary Technical Approaches

Neural networks show promise as a tool for handling seismic reflection data. This study interprets seismic reflection images in the same way a petroleum geologist or geophysicist would after receiving them from a subsurface survey consulting firm. For that reason, the neural network input consists of image files rather than the underlying raw seismic data. To explain the approach, the architecture of a generalized CNN is presented first, followed by the two case study architectures (U-Net and ResNet).

Generalized CNN Architecture

Convolutional Neural Networks (CNNs) are a powerful tool for handling images as network inputs. Because this study works with seismic reflection images such as the one shown in Figure 3, CNNs are a natural choice. To illustrate the general process of a CNN, it is useful to go back to one of the earliest CNN architectures. LeNet was used to recognize handwritten numerals (MNIST dataset). Originally developed by LeCun et al., the network known as LeNet-5 has seven layers: four convolutional and pooling layers followed by three fully connected layers of an artificial neural network (Sakib et al., 2019). Figure 5 shows the generalized schematic for a LeNet CNN.

Figure 5. A simplified schematic of a CNN architecture. Modified from Sakib et al., 2019.
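To make the contracting behaviour of such a network concrete, the sketch below traces the spatial size of the feature maps through the four convolutional and pooling layers. The specific settings (32×32 input, 5×5 convolutions without padding, 2×2 pooling) are the ones commonly quoted for the classic LeNet-5 and are an assumption here, since they are not stated in the text above; the helper names are illustrative.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size along one dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, window=2):
    """Spatial size along one dimension after non-overlapping pooling."""
    return size // window

# Assumed classic LeNet-5 settings (not taken from this study):
# 32x32 input, 5x5 convolutions, 2x2 pooling.
size = 32
trace = [size]
for layer in ("conv", "pool", "conv", "pool"):
    if layer == "conv":
        size = conv_output_size(size, kernel=5)
    else:
        size = pool_output_size(size)
    trace.append(size)

print(trace)  # [32, 28, 14, 10, 5]
```

Under these assumptions the image shrinks from 32×32 to 5×5 before the fully connected layers take over, which is why a LeNet produces a class label rather than a mask of the same size as its input.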

U-Net Architecture

The U-Net structure (Ronneberger et al., 2015) combines a down-sampling path that extracts context information with an up-sampling path that recovers location information. The U-Net is a CNN, but it does not stop at a contracted output; instead, it up-convolves that contracted output back into an array of similar size to the input array, giving the architecture its characteristic “U” shape. A schematic view of a U-Net is provided in Figure 4.

Figure 4. A schematic view of a U-Net architecture. The blue boxes denote multi-channel feature maps. In the left half of the schematic the number of channels increases stage by stage; in the right half the number of channels decreases stage by stage. The horizontal arrows denote the overall movement of the information through each stage of the system, starting in a contracting phase and ending in an expanding phase. Modified from Zeng et al., 2019.
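The contracting and expanding phases can be illustrated by following array shapes only. The sketch below is a toy under stated assumptions, not the model trained in this study: it uses a hypothetical 128×128 single-channel input, replaces the learned convolutions with plain 2×2 max pooling and nearest-neighbour up-sampling, and concatenates each up-sampled map with the matching map from the contracting path, as in the skip connections of Ronneberger et al. (2015).

```python
import numpy as np

def max_pool_2x2(x):
    """Halve the height and width of an (H, W, C) array by 2x2 max pooling."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample_2x(x):
    """Double the height and width by repeating each pixel."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.default_rng(0).random((128, 128, 1))  # hypothetical input image

skips = []
for _ in range(4):             # contracting path
    skips.append(x)            # keep the feature map for the skip connection
    x = max_pool_2x2(x)
bottleneck_shape = x.shape     # (8, 8, 1)

for skip in reversed(skips):   # expanding path
    x = np.concatenate([upsample_2x(x), skip], axis=-1)
output_shape = x.shape         # (128, 128, 5)
```

The output has the same height and width as the input, which is the property that lets a U-Net emit a per-pixel mask. In a real U-Net the channel counts are set by the convolution filters rather than by simple concatenation as here.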

ResNet Architecture

ResNet is a CNN architecture built from residual blocks, introduced to overcome the degradation problem in deep networks (He et al., 2015). A ResNet is constructed by adding an identity mapping shortcut across every few stacked layers. This results in the model having greater precision because each block learns the perturbation (residual) relative to the identity mapping, instead of the full mapping (Zeng et al., 2019).
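The identity shortcut can be shown with a toy block. The sketch below is illustrative only: it uses small dense transforms in place of convolutions, and the weights `w1` and `w2` are hypothetical. The point is that the block outputs x + F(x), so when the learned part F contributes nothing, the block simply passes its input through unchanged.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def residual_block(x, w1, w2):
    """Return x + F(x), where F is a small two-layer transform."""
    residual = w2 @ relu(w1 @ x)  # F(x): the part the block has to learn
    return x + residual           # identity shortcut adds the input back

x = np.array([0.5, -1.0, 2.0])
zero = np.zeros((3, 3))
identity_out = residual_block(x, zero, zero)  # equals x when F(x) is zero
```

Because an untrained or unhelpful block can fall back to the identity, stacking more blocks should not make the network worse, which is the usual argument for why residual connections ease the degradation problem.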

DATA AND PREPROCESSING
Data Source

Data for this study come from the Kaggle competition “TGS Salt Identification Challenge”, which provides a large dataset of noisy images that reflect the reality of the images this network would work with. The data consist of .png images within a zipped file, along with corresponding depths and training files (CSV format). Images are 101×101 pixels, and each pixel is classified as either salt or sediment. Figure 6 shows some of the images with their corresponding masks.

Figure 6. Example images, resized to 128×128 pixels, with their corresponding masks. The mask shows the classification of each pixel as either salt (purple) or sediment (yellow).

Data Preprocessing

Data preprocessing consisted of loading each image into a NumPy array and resizing it to 128×128 pixels. The values were then normalized by dividing by 255, giving a range of 0 to 1 that the model could handle.
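The two preprocessing steps can be sketched as follows. This is not the code used in the study: the interpolation method is not stated above, so nearest-neighbour sampling is assumed here for simplicity, and random arrays stand in for the loaded .png images. The function names are illustrative.

```python
import numpy as np

def resize_nearest(image, size=128):
    """Resize a 2-D array to size x size by nearest-neighbour sampling."""
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    return image[rows[:, None], cols[None, :]]

def preprocess(images, size=128):
    """Resize every image, stack them, and scale pixel values to [0, 1]."""
    resized = np.stack([resize_nearest(img, size) for img in images])
    return resized.astype(np.float32) / 255.0

# Stand-in data: four random 101x101 greyscale images with values 0..255.
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(101, 101), dtype=np.uint8) for _ in range(4)]

batch = preprocess(fake_images)  # shape (4, 128, 128), values in [0, 1]
```

If masks are resized the same way, nearest-neighbour sampling has the advantage of keeping them strictly binary, whereas smoother interpolation would introduce intermediate values along the salt boundary.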

METHODOLOGY
U-Net Methodology

This methodology is the original U-Net with 9 layers, ReLU activations, and a sigmoid output activation function. The loss function is binary_crossentropy, and the model metrics are meanIoU (mean Intersection over Union) and val_loss (validation loss). Sometimes referred to as the Jaccard index, the intersection over union metric is a standard metric for CNNs working with images (Rezatofighi et al., 2019). This evaluation metric requires a predicted boundary of an object in an image and the actual boundary of that object. The metric is then the area of overlap divided by the total area of union of the two boundaries (Rezatofighi et al., 2019).
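For a single pair of masks the metric is IoU = |A ∩ B| / |A ∪ B|, where A is the set of predicted salt pixels and B the set of true salt pixels. The sketch below computes it with NumPy on a small made-up example. The 0.5 threshold applied to the sigmoid outputs and the handling of two empty masks are assumptions of this sketch, not choices documented in the study.

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over union (Jaccard index) of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 1.0  # convention chosen here: two empty masks match perfectly
    return float(np.logical_and(pred, true).sum() / union)

true_mask = np.zeros((4, 4), dtype=bool)
true_mask[:2, :] = True          # top two rows are salt (8 pixels)

probabilities = np.zeros((4, 4))
probabilities[:, :2] = 0.9       # hypothetical sigmoid outputs
pred_mask = probabilities > 0.5  # assumed threshold of 0.5

score = iou(pred_mask, true_mask)  # 4 overlapping pixels / 12 in the union
```

In this example each mask covers 8 pixels, 4 of which overlap, so the union is 12 pixels and the IoU is one third.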

ResNet Methodology

This methodology is similar to the U-Net but uses a ResNet block architecture. It also has 9 layers: 4 contracting layers, 1 middle hidden layer and 4 expanding layers. The model uses wandb as the dashboard for hyperparameters, which also assists in model implementation. The metrics for this model are the training IoU, training loss, validation IoU and validation loss. Accuracy is also recorded, but the primary metrics for judging the efficacy of the model are the training and validation losses and IoU. Table 1 lists the layers and their input block sizes.

Layer	Input Block Size	Activation Function	Loss Function
1	101	ReLU	Lovasz Hinge
2	50	ReLU	Lovasz Hinge
3	25	ReLU	Lovasz Hinge
4	12	ReLU	Lovasz Hinge
5	6	ReLU	Lovasz Hinge
6	12	ReLU	Lovasz Hinge
7	25	ReLU	Lovasz Hinge
8	50	ReLU	Lovasz Hinge
9	101	ReLU	Lovasz Hinge

Table 1. This table shows the input block size, activation function and loss function for each layer within the ResNet CNN.
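The block sizes in Table 1 follow from repeatedly halving the 101-pixel input and rounding down, then mirroring the sequence for the expanding layers. A short sketch (the function name and defaults are illustrative) reproduces the column:

```python
def block_sizes(input_size=101, depth=4):
    """Input block size per layer for a symmetric encoder-decoder."""
    down = [input_size]
    for _ in range(depth):
        down.append(down[-1] // 2)  # halve and round down at each stage
    return down + down[-2::-1]      # mirror the sizes for the expanding path

sizes = block_sizes()
print(sizes)  # [101, 50, 25, 12, 6, 12, 25, 50, 101]
```

Note that doubling 12 gives 24 rather than 25, and doubling 50 gives 100 rather than 101, so the expanding layers presumably rely on padding or a suitable transposed-convolution setting to recover the odd sizes. The text does not say which approach was used.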

RESULTS

To assess the accuracy of the proposed CNN approaches, we first compare the models against each other in terms of Train_Loss, Train_IoU, Valid_Loss and Valid_IoU, as presented in Table 2 below.

Model	Train_Loss	Train_IoU	Valid_Loss	Valid_IoU
U-Net	0.2596	0.3788	0.2893	0.3934
ResNet	0.0362	0.8203	0.4821	0.7054

Table 2. This table shows the model summary results for each model within the study.

Model specific results will be presented in each subsection below.

U-Net Results

The loss and validation loss results for the U-Net model are presented in Figure 7 below. The full model summary is located in Appendix B U-Net.

Figure 7. Displays the validation loss and loss of the U-Net model over the course of 61 epochs. An earlier run stopped at 14 epochs because the early stopping function halts training when validation_loss is not improving. Both the validation_loss and loss appear to be improving until early stopping ended the model run.

ResNet Results

The ResNet model results are presented in Figure 8 below. Overall, the ResNet model produced a high training IoU of 0.82 and a high validation IoU of 0.705. The training loss was low (0.0362), while the validation loss (0.4821) was noticeably higher (Table 2).

Figure 8. Displays the results of the ResNet model by epoch. Overall, the training IoU increased and the training loss decreased over time. It is interesting to note that the validation_loss, although it initially decreased, started to increase near the end of the model run. The validation_IoU increased and then cycled between two sets of values in the latter portion of the run, which coincides with the increase in validation_loss.

DISCUSSION

U-Net Discussion

The U-Net model provided a somewhat low accuracy, partially because of the early stopping function of the model. The early stopping function had a patience of 15, so training ended at epoch 61 because epochs 46-61 did not significantly improve Valid_loss. This stopped the U-Net model short of the 74 epochs the ResNet model completed. The lack of improvement is likely due to noise within the data, as the processing applied to both datasets was not sufficient for the standard U-Net to be as effective a model. Mean IoU and validation IoU did not improve significantly as the loss and validation_loss metrics decreased. This is interpreted as a miscalculation of the IoU for this model, so IoU was not treated as the important accuracy metric for it.
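The early stopping behaviour described above can be sketched in plain Python. This is a simplified illustration of the usual patience rule, not the callback used in the study, and the validation losses below are synthetic: they improve up to epoch 46 and then stay flat, which with a patience of 15 ends the run at epoch 61.

```python
def early_stopping_epoch(val_losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss          # new best validation loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return epoch         # no improvement for `patience` epochs
    return len(val_losses)       # ran to completion without triggering

# Synthetic losses: steady improvement for 46 epochs, then a plateau.
synthetic_losses = [1.0 - 0.01 * e for e in range(1, 47)] + [0.6] * 54

stop_epoch = early_stopping_epoch(synthetic_losses, patience=15)  # 61
```

A smaller patience stops the same run sooner after the plateau begins, which is consistent with the shorter run reported for the initial U-Net attempt.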

ResNet Discussion

This model performed best in the comparison. The increase in the Train IoU and the Valid IoU is encouraging, as those metrics were not strong in the other model. It is interesting to note that the Valid IoU cycled between two value states near the end of the model run. This is likely caused by two local maxima of the hidden weights, and as such marks the number of epochs needed for this model to be adequately trained. The loss metrics were promising, but note that Valid_loss increases near the end of the model run. Multiple models were run, and all showed a similar increase in Valid_loss near the end of the run. Ultimately, this problem was not resolved; it was interpreted as the point at which the architecture could no longer improve the Valid_IoU and instead bounced between multiple local maxima of the hidden weights, resulting in increasing and variable Valid_loss values.

Original Implementation and Problems Discussion

Originally this study was to include a LeNet model in the comparison. A LeNet implementation was attempted; however, without an expanding (deconvolution) path, the LeNet was fundamentally different from the U-Net and ResNet models and was therefore not compared. If the task were changed to predicting the percentage of the image covered by salt diapir, a LeNet might be useful, but in the context of this study a LeNet architecture does not make sense.

U-Net implementation problems concerned the number of epochs completed before the model run ended. This was due to the early stopping function; once the patience hyperparameter was increased, the model was able to iterate to 61 epochs instead of 14.

Data loading caused problems for both models. Multiple attempts were made to run both U-Net and ResNet in a local Jupyter notebook instead of the built-in Kaggle notebook. However, in order to work with the data, the data needed a filepath from the input directory of the Kaggle notebook. This forced the data loading to use file paths instead of the typical pd.read functions I was familiar with. This was rectified for both models, but it was a significant time sink at the start of this study.

CONCLUSIONS

This study compared the U-Net and ResNet CNN architectures in interpreting seismic reflection images of a halokinetic geologic environment. This is useful because salt diapir structures are important hydrocarbon traps in some of the most productive basins in the world. The U-Net and ResNet models were trained on data from the Kaggle competition TGS Salt Identification Challenge, in the form of 101×101 seismic reflection images, corresponding training masks, and depth values.

Both models used Intersection over Union (IoU) and loss metrics on both training and validation data. These accuracy metrics were then compared to determine the best model architecture for interpreting salt structures in this environment. Figure 7 displays the validation_loss against the training_loss for the U-Net architecture, showing continued improvement of the training_loss with variability and cyclicity of the validation_loss near the end of the model run. Figure 8 displays the accuracy metrics (Train IoU, Train loss, Valid IoU, and Valid loss) for the ResNet model run.

Conclusions from this study include:

ResNet architecture performs better than U-Net at salt identification from seismic reflection images.
Salt can be accurately and efficiently identified using CNN architecture.

FUTURE WORK

This is an important problem to solve, as decreasing production time for hydrocarbons in this environment is important to global fuel prices. Future steps to improve upon this study include:

Expanding comparison to include other CNN architectures (AlexNet, DenseNet etc.)
Running a parallel study of salt identification that uses a salt coverage metric instead of a mask metric for accuracy. This would allow other CNN architectures that do not output an array of similar size to the input (e.g. LeNet) to be compared. This type of model would not be deployed in the industry, where a corresponding mask showing the salt structure is important; rather, salt coverage would serve as an additional accuracy metric. A model reporting what percentage of the image’s pixels are covered by a salt structure, in conjunction with IoU values, would give a better understanding of CNN accuracy.

Resolving the “bouncing” loss values near the end of the model run.
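As a starting point for the salt coverage idea listed above, coverage can be computed directly from a binary mask as the fraction of pixels labelled salt. The sketch below is illustrative only; the mask is made up, and the function name is hypothetical.

```python
import numpy as np

def salt_coverage(mask):
    """Fraction of pixels in a binary mask that are labelled salt."""
    return float(np.asarray(mask, dtype=bool).mean())

# Made-up 101x101 mask in which the left 50 columns are salt.
mask = np.zeros((101, 101), dtype=np.uint8)
mask[:, :50] = 1

coverage = salt_coverage(mask)  # 50 / 101, roughly 0.495
```

A classification-style network would then be trained to predict this single number per image, and its error could be reported alongside IoU for the segmentation models.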

References:

Di, Haibin, Zhen Wang, and Ghassan AlRegib. “Deep convolutional neural networks for seismic salt-body delineation.” (2018).

Dramsch, Jesper S., and Mikael Lüthje. “Deep-learning seismic facies on state-of-the-art CNN architectures.” In 2018 SEG International Exposition and Annual Meeting. OnePetro, 2018.

Halbouty, M.F., 1979. Salt Domes, Gulf Region, United States and Mexico. Gulf Publishing Co., Houston, Texas, 2nd ed., 561 pp.

Magoon, Leslie B., and Wallace G. Dow. “The petroleum system: chapter 13: Hydrocarbon Traps.” (1994): 219-235.

Posey, Harry H., and J. Richard Kyle. “Fluid-rock interactions in the salt dome environment: an introduction and review.” Chemical Geology 74, no. 1-2 (1988): 1-24.

Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. “U-net: Convolutional networks for biomedical image segmentation.” In International Conference on Medical image computing and computer-assisted intervention, pp. 234-241. Springer, Cham, 2015.

Sakib, Shadman, Nazib Ahmed, Ahmed Jawad Kabir, and Hridon Ahmed. “An overview of convolutional neural network: its architecture and applications.” (2019).

Salvador, Amos, and R. T. Buffler. “The Gulf of Mexico Basin.” (1982).

Zeng, Yu, Kebei Jiang, and Jie Chen. “Automatic seismic salt interpretation with deep convolutional neural networks.” In Proceedings of the 2019 3rd international conference on information system and data mining, pp. 16-20. 2019.