Predictive Tank Maintenance using Machine Learning

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve from experience without being explicitly programmed. Learning begins with looking for patterns in observed data in order to make better predictions on new, incoming data.

Machine Learning comes in handy when you need to figure out whether a battle tank is healthy and battle ready. Battle tanks are key components of modern armies. Since they are used on rough terrain and in hostile environments, it becomes necessary to predict whether a tank needs maintenance, to prevent a breakdown during use.

Various gauges and meters collect vital data from each of these monstrous machines. For the purpose of illustrating Machine Learning, we will take into account measurements of Current Life, Oil Pressure, Engine Vibrations, and Coolant Temperature. Multiple such readings were taken from various MBTs (main battle tanks). These were used to create a TensorFlow model that predicts the remaining useful life of an MBT.

In Machine Learning such problems are categorised under supervised learning. Supervised learning involves finding a relationship between a set of features in order to predict an output variable. The output varies in nature; in our case it is a quantitative measurement of the tank's life expectancy, which makes this a regression problem. The other kind of output is a category or label, as in the classification of Doppler signals into four categories. These are, as the name suggests, classification problems.

Data description

A snippet of data generated from gauges and meters fitted into the tank is shown below:

Ser No. | Eng Hrs Run (in Hrs) | Vibration | Coolant Temp (°C) | Oil Pressure (in Kg/cm²)
--------|----------------------|-----------|-------------------|-------------------------
1       | 260                  | 270       | 60                | 9.0
2       | 961                  | 428       | 78                | 8.2
3       | 517                  | 287       | 80                | 8.4

As the tank gets older the engine vibration increases, and the coolant temperature and oil pressure also change with age. However, there is no direct relationship between vibration and oil pressure or coolant temperature. The vibration readings have highs and lows, so we work with their averages; a toy example of that averaging follows.
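As a small illustration of that averaging step (the sample values below are made up, not readings from the dataset), a raw burst of vibration samples can be reduced to the single figure recorded in the table:

import numpy as np

# Hypothetical raw vibration samples from one gauge reading (made-up values)
raw_vibration = np.array([255, 310, 248, 295, 262])

# The single 'Vibration' value recorded against a tank is the average of the burst
avg_vibration = raw_vibration.mean()
print(avg_vibration)   # 274.0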

TensorFlow Implementation

We have the engine run data with various features measured over time and across different tanks. This will be used to train a regression model that can predict the engine life, in hours, for the tanks we are interested in.

This example uses the tf.estimator and tf.data APIs; see this guide for details.

 

import urllib
import io
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import logging
import traceback

logging.getLogger('tensorflow').setLevel(logging.DEBUG)
print(tf.__version__)

1.13.0-rc1

1. Define the path and the format of the data

Declare the columns you want to use for training and their types, as shown below. A float column is declared with a default value of [0.0].

COLUMNS=["Ser","EngHrs","Vibration","CoolantTemp","OilPressure"]
RECORDS_ALL=[[0.0], [0.0], [0.0], [0.0],[0.0]]
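
The training, evaluation, and test calls further below pass df_train and df_eval into input_fn. Their definitions are not shown in this article; they are simply the paths to the training and evaluation CSV files, for example (the file names here are hypothetical):

# Hypothetical file names; substitute the actual paths to your CSV files
df_train = "tank_engine_train.csv"
df_eval = "tank_engine_eval.csv"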

In general, we normalize the data to prevent the so-called vanishing/exploding gradients. In our case we will skip this, as we know our data is well behaved; a sketch of how it could be done anyway is shown below.
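If the data were not well behaved, one way to normalize it would be to standardise each feature with precomputed statistics via the normalizer_fn hook of tf.feature_column.numeric_column. The means and standard deviations below are placeholder values, not statistics from the actual dataset:

# Placeholder per-feature statistics (mean, standard deviation); compute these from your data
FEATURE_STATS = {
    'Vibration':   (320.0, 80.0),
    'CoolantTemp': (72.0, 10.0),
    'OilPressure': (8.5, 0.5),
}

def normalized_column(name):
    mean, std = FEATURE_STATS[name]
    # normalizer_fn is applied to the raw tensor before the estimator sees it
    return tf.feature_column.numeric_column(
        name, normalizer_fn=lambda x, m=mean, s=std: (x - m) / s)

The feature columns in step 4 could then be built with normalized_column instead of the plain numeric_column calls.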

 

2. Define the input_fn function

We need an input function that supplies batches of features and labels on request. Below is the overall code that defines the function.

def input_fn(data_file, batch_size, num_epoch=None):
    # Parse one CSV line into a features dictionary and a label
    def parse_csv(value):
        columns = tf.decode_csv(value, record_defaults=RECORDS_ALL)
        features = dict(zip(COLUMNS, columns))
        features.pop('Ser')               # the serial number is not a predictor
        labels = features.pop('EngHrs')   # engine hours run is the target
        return features, labels

    # Extract lines from the input file using the Dataset API
    dataset = (tf.data.TextLineDataset(data_file)  # read the text file
               .skip(1)                            # skip the header row
               .map(parse_csv))                    # parse each record
    dataset = dataset.repeat(num_epoch)
    dataset = dataset.batch(batch_size)

    # Create an iterator and return the next batch of features and labels
    iterator = dataset.make_one_shot_iterator()
    features, labels = iterator.get_next()
    print("Features %s, Labels %s" % (features, labels))
    return features, labels

Read the data: Extract lines from the input file using the Dataset API.

  • tf.data.TextLineDataset(data_file): this line reads the CSV file.
  • .skip(1): skips the header row.
  • .map(parse_csv): parses each record into tensors. You need to define a function that tells map how to transform each line; here that function is parse_csv.

Import the data: Parse CSV File

This function parses each line of the CSV file with the method tf.decode_csv and returns the features and the label. The code is explained below:

  • tf.decode_csv(value, record_defaults=RECORDS_ALL): the method decode_csv parses the output of the TextLineDataset; record_defaults tells TensorFlow the type of each column.
  • dict(zip(COLUMNS, columns)): populates a dictionary with all the columns extracted during this data processing.
  • features.pop('EngHrs'): removes the target variable from the features dictionary and uses it as the label; the serial number 'Ser' is dropped the same way, since it is not a predictor.

Create the iterator

Now you are ready to create an iterator that returns the elements of the dataset, for which we use make_one_shot_iterator.

3. Consume the data

The code below shows how input_fn can be used to generate data for the estimator. The batch size and number of epochs define how much data it generates.

  • num_epoch: an epoch is one pass over all of the data; this sets how many passes are made.
  • batch_size: the number of inputs taken at a time.
  • output: the output is printed as the features in a dictionary and the label as an array.

The following shows the first data row of the CSV file (the header row is skipped). You can run this code several times with different batch sizes.

next_batch = input_fn(df_train, batch_size=1, num_epoch=None)
with tf.Session() as sess:
   first_batch = sess.run(next_batch)
   print(first_batch)

Features {'Vibration': <tf.Tensor 'IteratorGetNext:2' shape=(?,) dtype=float32>, 'CoolantTemp': <tf.Tensor 'IteratorGetNext:0' shape=(?,) dtype=float32>, 'OilPressure': <tf.Tensor 'IteratorGetNext:1' shape=(?,) dtype=float32>}, Labels Tensor("IteratorGetNext:3", shape=(?,), dtype=float32, device=/device:CPU:0)

 

4. Define the feature column

You need to define the numeric columns as follows, and then collect all of them into a list.

X1=tf.feature_column.numeric_column('Vibration')
X2=tf.feature_column.numeric_column('CoolantTemp')
X3=tf.feature_column.numeric_column('OilPressure')
base_columns = [X1, X2, X3]

 

5. Build the model

You can train the model with the LinearRegressor estimator. Instead of building a new estimator, we are using a 'canned estimator' provided by TensorFlow. The trained model will be saved in the 'train' directory.

model=tf.estimator.LinearRegressor(feature_columns=base_columns, model_dir='train')

We have skipped customising the optimiser and will use the default. An optimiser helps the model quickly converge on the optimum vote (weight, in TensorFlow terminology) that each input feature gets in deciding the outcome. A sketch of how the optimiser could be customised is shown below.
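If you did want to customise it, LinearRegressor accepts an optimizer argument. As a sketch (the learning rate here is an arbitrary illustration, not a tuned value):

model = tf.estimator.LinearRegressor(
    feature_columns=base_columns,
    # FtrlOptimizer is the family LinearRegressor uses by default;
    # the learning rate below is arbitrary and only for illustration
    optimizer=tf.train.FtrlOptimizer(learning_rate=0.1),
    model_dir='train')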

 

6. Train the model

It is now time to train the model, passing a lambda function as the input_fn argument to the train method; this provides a way of iteratively going through the data.

model.train(steps=500, input_fn=lambda: input_fn(df_train, batch_size=128, num_epoch=None))

  • num_epoch: an epoch is one pass over all of the data; this sets how many passes are made.
  • batch_size: the number of inputs taken at a time.
  • steps: the number of batches used for training.

If you run through the data too many times, the model starts to remember it; this is called overfitting, and it is not good. With an overfitted model you will get accurate results for inputs that match the training data, but the results will swing wildly for different sets of values. Contrast this with an underfitted model, where the results are simply not very accurate. The sweet spot is somewhere in between, and you can reach it by tuning the various estimator parameters. A rough way to check for overfitting is sketched below.
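One rough check, using only the pieces already defined, is to evaluate the model on both the training data and the held-out data and compare the losses (df_train and df_eval are the file paths assumed earlier):

# Evaluate on the training data and on the held-out data
train_results = model.evaluate(
    input_fn=lambda: input_fn(df_train, batch_size=128, num_epoch=1))
eval_results = model.evaluate(
    input_fn=lambda: input_fn(df_eval, batch_size=27, num_epoch=1))

# A held-out loss much larger than the training loss suggests overfitting;
# if both losses stay high, the model is more likely underfitting
print("train loss:", train_results['average_loss'])
print("eval  loss:", eval_results['average_loss'])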

 

7. Evaluate the model

Now evaluate the fit of your model by using the data set aside for evaluation:

results = model.evaluate(steps=None,input_fn=lambda: input_fn(df_eval, batch_size=27, num_epoch=1))

for key in results:
    print(" {}, was: {}".format(key, results[key]))

average_loss, was: 57064.83984375
label/mean, was: 453.77777099609375
loss, was: 3081501.25
prediction/mean, was: 614.8172607421875
global_step, was: 2110

Evaluation lets you check whether tuning certain parameters gives a better model. There is no absolute value of the above metrics that makes a model good; while tuning, your target should be to minimise the loss. One way to make the loss more interpretable is shown below.
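Since LinearRegressor uses a squared-error loss, average_loss is a mean squared error; taking its square root gives a root-mean-square error in engine hours, which is easier to reason about:

# Convert the mean squared error reported above into an RMSE in engine hours
rmse = np.sqrt(results['average_loss'])
print("RMSE: %.1f engine hours" % rmse)   # roughly 239 hours for the run shown above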

 

8. Prediction

The last step is predicting the target value from the values of the features. You can write a dictionary with the values you want to predict for. The model has 3 features, so you need to provide a value for each; it will return a prediction for each set of values.

In the code below, you write the values of each feature into the prediction_input dictionary (in the full code these values come from the df_predict CSV file).

You need to write a new input function because there is no label in this data. You can use the from_tensors method of the Dataset API.

prediction_input = {
    'Vibration': [325, 272],
    'CoolantTemp': [65,75],
    'OilPressure': [8.2,9.2]
}
def test_input_fn():       
    dataset = tf.data.Dataset.from_tensors(prediction_input)       
    return dataset     

pred_results = model.predict(input_fn=test_input_fn)

for pred in pred_results:
    print("Engine remaining life is %d" % pred['predictions'])

Engine remaining life is 611

 

Summary

Find the full code with data here.
Machine learning is largely about understanding the data and the ability to tune various parameters so as to get a model with the right fit.