Easymatics Ltd., 86-90 Paul Street, London, EC2A 4NE, U.K.
Transforming Businesses through Digital & Technical Services Excellence

Machine Learning for Upstream Oil & Gas

Machine Learning for Upstream Oil & Gas: Setting the Scene

Machine learning for upstream oil & gas is a hyped topic surrounded by vague promises.
There are, however, concrete use cases in the industry where machine learning achieves substantial benefits.

At the same time, many providers portray data management principles as an application of machine learning. Providing a solution that enables a user to organise structured and unstructured data, including tagging for multidimensional elements, is not an application of machine learning; neither is normalisation of data.

Bounding Machine Learning

For the purposes of this article, we need to set the parameters for what qualifies as an application of machine learning.

Machine Learning Definition: Machine learning is an application of artificial intelligence (AI) that provides systems with the ability to learn and improve from experience automatically, without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.
Data Management Definition: The official definition provided by DAMA International, the professional organisation for those in the data management profession, is: "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise."

From the definitions supplied, providing a taxonomy for structured and unstructured data does not meet the defined criteria of machine learning. At the other extreme of the solution scale, machine learning is not the whole of AI either: it is a specific function, narrower than what users typically imagine when they hear about machine learning implementations.

It is therefore feasible that an implementation of machine learning could learn from data and experience aided by user confirmation. Digesting information from multiple sources and creating a tag list, though, is not much different from the digitisation of paper records, a prominent activity during 1990s implementations of document management systems. For those who recall it, the process involved a workflow in which users classified and tagged paper records, which led to the automation of the digitisation process. I remember the scripts and testing of these systems very well, including the many quirks of OCR technologies.

The OCR limitations at the time were largely due to software, and to the instability of operating systems and interfaces. The infamous 'blue screen of death' comes to mind from this period.

Judging from the many presentations I have witnessed over the past two to three years, this piece of IT history has been erased. Digitisation of records was not called machine learning then, so it appears that companies want to expand the credentials of machine learning. I digress for a moment, because the same has happened with IoT, which is a progression of the distributed I/O theme from the same period in history.

Managing data quality can be an efficient application of machine learning, but it is essential to focus on the data that contributes to the desired output. The issue is that for machine learning to be productive, the desired outcome has to be defined. Flagging an anomaly based on previous behaviour may not be sufficient to be of value, and monitoring only the physical aspects of equipment may not capture the impact of fluid behaviour on overall performance.
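To make the point concrete, "flagging an anomaly based on previous behaviour" often amounts to something as simple as the sketch below: a rolling z-score over recent readings. The function name, window size and threshold are illustrative assumptions, not an industry standard; the sketch shows why such a flag alone, with no defined outcome behind it, is limited in value.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=12, threshold=3.0):
    """Flag readings that deviate strongly from recent behaviour.

    A reading is flagged when it sits more than `threshold` standard
    deviations away from the mean of the preceding `window` readings.
    Illustrative sketch only: it knows nothing about the desired outcome.
    """
    flags = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        flags.append(sigma > 0 and abs(readings[i] - mu) > threshold * sigma)
    return flags

# Stable signal with one injected spike: only the spike is flagged.
signal = [50.0 + 0.1 * (i % 3) for i in range(24)]
signal[20] = 75.0  # anomalous jump
flags = flag_anomalies(signal)
```

The flag tells us something changed, but not whether it matters, which is precisely the gap between anomaly detection and a defined outcome.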

A good example is measuring pump performance: one could say that if the pump is running at the current defined on the performance curve, then no cavitation exists, so the pump must be running well. In this case we only see current represented in our performance view; unless we have an input for the fluid, our understanding of performance carries uncertainty. In process industries, access to physical equipment data is typically available, albeit of variable quality, and it tends to be the leading dimension of machine learning across the Upstream Oil & Gas industry.
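The pump example can be sketched numerically. The performance curve below is a hypothetical affine relation with made-up coefficients, purely to show how the fluid input changes the picture: the same measured current that matches the water-based curve implies off-curve operation once fluid density is taken into account.

```python
def expected_current(flow_rate, fluid_density, curve_slope=0.04,
                     base_amps=12.0, reference_density=1000.0):
    """Expected motor current for a pump at a given flow rate.

    Hypothetical affine performance curve (illustrative numbers only):
    current grows with flow rate and scales with fluid density relative
    to a reference fluid (water at 1000 kg/m3).
    """
    return (base_amps + curve_slope * flow_rate) * (fluid_density / reference_density)

# Current-only view: 20 A at 200 m3/d sits exactly on the water curve ...
water_view = expected_current(200.0, 1000.0)   # ~20.0 A
# ... but with a lighter fluid the curve predicts less current, so a
# measured 20 A would actually indicate off-curve operation.
light_fluid = expected_current(200.0, 850.0)   # ~17.0 A
```

Without the density input, the 20 A reading looks healthy in both cases; with it, the same reading carries very different meaning.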

Wherever you look for examples of machine learning, rotating equipment is the primary use case, i.e., turbines, pumps and compressors.

Machine Learning in Upstream Oil & Gas

In the oil and gas industry, machine learning may be used to model the response of a physical system to outside influence such as that of the human operators of a facility. Then we can infer what actions would lead to better performance of the plant and recommend activities in real-time.

Current implementations of machine learning generally do not consider workflow as a function, and this was also the case when companies referred to these solutions as analytics. Understanding that there is an issue does not allow the user to provide a solution to fix the problem, or to monitor it as it progresses through the organisation to resolution.

Once you start to address these gaps in upstream oil & gas, the solution becomes fragmented, as we are crossing between business process management, analysis, analytics and machine learning. Where we are performing analysis or analytics, the implementation can occur in many different systems, depending on the complexity of the rules. Defining a strategy that sets out where and how those rules are executed is key to sustainability for machine learning based solutions in upstream oil & gas.

The current trend towards robotic automation exemplifies the challenges of a sustainable solution, as we are focussing on the automation of business processes through the execution of machine learning principles.

Responding to activities in real time involves the use of transient models or the routine operation of steady-state models. Fluid modelling in Upstream Oil & Gas tends to be primarily steady-state, and this represents a barrier, given that the behaviour you are trying to model was typically captured for a different purpose, i.e., rate estimation.

A typical use case would be load balancing between wells on a single field. Drawing maximum load from one well typically affects the available capacity of the neighbouring wells. Achieving optimal extraction conditions requires careful load balancing, which can be performed using machine learning of the reservoir response. Executing this use case will be very different for each field, and there are other challenges to consider: a) what am I trying to optimise? b) a complicated reservoir simulation could take up to two days to process, and this assumes the calculation is successful; c) the wells and networks must be modelled and calibrated. The use case, though, is standard for the industry, so solutions exist which could be enhanced with machine learning, but they are typically not maintained to the standard of fidelity required for the outcomes expected from advances in machine learning. So either we have to rethink the execution of the case, or the operator invests heavily in upgrading current solutions.
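The shape of the load-balancing problem can be sketched as follows. The reservoir surrogate here is an invented quadratic toy, standing in for the model that would, in practice, be learned from simulator runs or historical data over two days of computation; well count, interference coefficient and capacity are all illustrative assumptions.

```python
from itertools import product

def field_production(rates, k=0.002):
    """Hypothetical surrogate for reservoir response (toy model).

    Each well's output declines with its own rate (drawdown) and with
    total field offtake (interference between neighbouring wells). In
    practice this surrogate would be learned from simulator runs or
    historical production data.
    """
    total = sum(rates)
    return sum(r * (1 - k * (r + 0.5 * total)) for r in rates)

def balance_load(n_wells=3, capacity=300, step=10):
    """Grid-search the rate allocation that maximises field production
    subject to a shared capacity limit."""
    best_rates, best_output = None, float("-inf")
    choices = range(0, capacity + 1, step)
    for rates in product(choices, repeat=n_wells):
        if sum(rates) > capacity:
            continue
        output = field_production(rates)
        if output > best_output:
            best_rates, best_output = rates, output
    return best_rates, best_output

rates, output = balance_load()
```

With this toy surrogate the optimum is an even split across the wells, illustrating the point above: drawing maximum load from one well is penalised through the interference term, so balance beats concentration.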

The confidence to predict whether the operator will miss a planned event, or can perform an action to prevent an unplanned event, covers the most significant saving areas. In these cases we have to consider the process to manage the plan, or to provide measures to resolve the outage.

So for these cases, the first question has to be: how much prior notice do I need to affect the outcome? Followed by: what confidence do I have in the information provided? And finally: are there any other factors that could influence this issue?
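The first two questions can be encoded as a minimal decision rule. The function and its thresholds are illustrative assumptions for the sketch, not industry standards: a prediction is only actionable if it arrives with enough notice and enough confidence.

```python
def should_intervene(hours_to_event, required_notice_hours, confidence,
                     confidence_floor=0.7):
    """Decide whether to act on a predicted event (illustrative sketch).

    Encodes the questions above: is there enough prior notice to affect
    the outcome, and is the prediction confident enough to act on?
    The 0.7 confidence floor is an assumed, not standard, threshold.
    """
    enough_notice = hours_to_event >= required_notice_hours
    enough_confidence = confidence >= confidence_floor
    return enough_notice and enough_confidence

# A confident prediction 48 hours out, needing 24 hours of notice: act.
act = should_intervene(48, 24, 0.9)
# The same prediction with only 12 hours of notice arrives too late.
late = should_intervene(12, 24, 0.9)
```

The third question, other influencing factors, is what makes real deployments far harder than this two-term rule suggests.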

I would suggest that unless we can successfully address these points, our machine learning adventure will not lead to sustainability, expected savings or greater adoption across the business.


We believe that adoption of machine learning is not just about technology talk. It's about the strategy surrounding the execution of the intent, delivering value and managing the expectations of the outcome.

As technology providers, we are as happy as the next company to talk about technical details and configure software. If we can't define the outcome, though, understand the data that impacts the outcome, and provide confidence, then the solution is not scalable. Scalability is a challenge across all of upstream oil & gas, and in this industry scaling down is just as essential as scaling up.

We have spent the past few years defining strategy models for execution of analytics and processes in upstream oil & gas. The fundamentals of sustainability, scalability and affordability are core to the growth of machine learning in this industry.

The team at Easymatics is in a unique position to support clients on their upstream oil & gas transformation journey. We have developed a proven framework of methods and accelerators based on six pillars drawn from multiple global assignments. Machine learning for upstream oil & gas is one of those six pillars.
