This is a retrieved version of the original article from the series ‘A Spotlight on Undergraduate Research’. Edited by Niharika Gunturu.
The author calls dibs on DeepLanding!
Varun Vankineni, a 5th-year Dual Degree student from the Department of Aerospace Engineering, gives us a glimpse into his Dual Degree Project, which involves developing AI tools to help helicopters gauge their landing strategy on unstable surfaces, a setting with great ramifications for our Air Force and Navy.
When I was in my first year, Artificial Intelligence and Machine Learning were restricted to a narrow set of use cases; most applications dealt with Computer Vision and associated fields. Four years on, it is safe to say that this is no longer the case. Their areas of application have increased exponentially in number, from Product Recommendation Engines to Stock Price Prediction Networks to even Auto Overclocking your processors!
Today, I’ll detail how I’m using deep learning as a core solution to the problem of ‘Landing Helicopters on Ships.’
The basic landing manoeuvre of a helicopter is considered safe when the helicopter lands flat with respect to the ground, that is, with all the landing gear touching down simultaneously. Otherwise, the helicopter topples quite easily. With this in mind, let us go through the levels of complexity we face when a ship is involved:
- On a stable deck, it is quite straightforward to land a helicopter. Accidents, which are quite rare, occur primarily due to ground resonance.
- In the presence of moderate winds in every direction possible, matters get more frustrating, but it is still possible to land safely.
- In the presence of rotational disturbances – that too about three different axes – we’re in serious trouble.
Let us now force the deck to have these rotational disturbances around all three axes, popularly called the Pitch, Roll and Yaw. This is where the problem gets out of hand. Unless there is a systematic trend to these disturbances, and provided we know this trend, it is very difficult to analyze the motion of the ship. On an actual ship, there is the added complexity of the translation of the deck, again along all three axes. From here on, we shall also be talking about the ‘quiescent period’.
Quiescent period: the time period during which the disturbances of the ship’s deck are small enough to land safely.
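As a rough illustration of what this means computationally, here is a minimal sketch that scans a recorded deck-motion signal for windows calm enough to land in. The threshold values, the 10-second window and the 1 Hz sampling rate are purely illustrative assumptions on my part, not values from the actual project:

```python
import numpy as np

def find_quiescent_windows(roll_deg, pitch_deg, window_s=10, rate_hz=1.0,
                           roll_limit=2.0, pitch_limit=1.5):
    """Return start times (s) of windows where |roll| and |pitch| stay within limits.

    All limits and the window length are illustrative assumptions.
    """
    n = int(window_s * rate_hz)          # samples per candidate window
    starts = []
    for i in range(len(roll_deg) - n + 1):
        r = np.abs(roll_deg[i:i + n]).max()
        p = np.abs(pitch_deg[i:i + n]).max()
        if r < roll_limit and p < pitch_limit:
            starts.append(i / rate_hz)   # convert sample index to seconds
    return starts

# Example with synthetic deck motion: a slow roll swell plus noise
t = np.arange(0, 300, 1.0)                       # 5 minutes at 1 Hz
roll = 4.0 * np.sin(2 * np.pi * t / 60) + 0.3 * np.random.randn(t.size)
pitch = 1.0 * np.sin(2 * np.pi * t / 45) + 0.2 * np.random.randn(t.size)
print(find_quiescent_windows(roll, pitch)[:5])   # first few quiescent start times
```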
The way we tackle this problem is by breaking it down into two major parts:
- Creating a model to predict future ship motion and subsequently the quiescent period
- Making a controller to guide the helicopter pilot in landing, and eventually to automate the landing
Let us deviate from this a bit and go through what exactly deep learning is, in the context of this problem, with a basic overview.
The Universal Approximation Theorem:
Let us take the equation

y = f(x)

Understand that here both x and y are matrices of arbitrary dimensions, that is, any number of both input variables and output variables. y is also called the matrix of true values.
Let us say we know at least a few examples of the true value (y1) given their respective inputs (x1), but have no information about the function itself. Analytically solving for this function is near impossible! We shall hence try to create a closely matching function that behaves similarly. Deep Learning (DL), or Machine Learning (ML) in general, deals with the modelling of this unknown function ‘f’ such that, given an x1 as the input, we obtain a predicted value ŷ1 (the hat here represents that it is a predicted value) that is as close as possible to the true value y1.
Generally, we design this function ‘f’ as a series of matrix multiplications (in practice, with a simple non-linear ‘activation’ function applied between them). For example, take a three-layer model:

ŷ = W3 ( W2 ( W1 x ) )
The total model is the ‘neural network’, and each of these matrices represents what we call a ‘hidden layer’ in DL terminology. A higher number of matrix multiplications lets us model more complex functions, and we term such models “Deep Neural Networks”. The word ‘deep’ simply refers to the large number of hidden layers.
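To make this concrete, here is a tiny sketch in plain NumPy of such a forward pass. The layer sizes and the use of a tanh activation between the multiplications are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three "hidden layer" weight matrices of arbitrary (but compatible) sizes
W1 = rng.standard_normal((16, 4))    # maps 4 inputs  -> 16 features
W2 = rng.standard_normal((16, 16))   # maps 16        -> 16
W3 = rng.standard_normal((3, 16))    # maps 16        -> 3 outputs

def predict(x):
    """A 'deep' approximation of the unknown function f: a chain of
    matrix multiplications with a non-linearity (tanh) in between."""
    h1 = np.tanh(W1 @ x)
    h2 = np.tanh(W2 @ h1)
    return W3 @ h2                   # y_hat, the predicted value

x1 = rng.standard_normal(4)          # one example input
y_hat1 = predict(x1)
print(y_hat1.shape)                  # (3,) -- one predicted output vector
```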
We assume that we can find the values of the elements of this series of hidden-layer matrices such that we can predict values (ŷ) close to the true values (y).
This would imply that we have found an approximation to the original function, and this approximation lets us link our input quantities to our output quantities with as little error as possible.
This is done during the ‘training phase’ using a technique called ‘backpropagation’, which aims to minimize the ‘loss function’, a function of the difference between the predicted value and the true value (ideally, the difference should be zero!). These are standard concepts associated with Deep Learning, and you can learn more about them through countless MOOCs.
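As a rough sketch of what the training phase looks like in practice, here is a small example using PyTorch purely as an illustrative toolkit; the layer sizes, the mean-squared-error loss and the toy target function are all my own assumptions:

```python
import torch
import torch.nn as nn

# Toy "true" function we pretend not to know: y = sin(x) summed over the inputs
def true_f(x):
    return torch.sin(x).sum(dim=1, keepdim=True)

x = torch.rand(1024, 4) * 6 - 3          # examples of inputs
y = true_f(x)                            # their true values

model = nn.Sequential(                   # a small deep neural network
    nn.Linear(4, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)
loss_fn = nn.MSELoss()                   # the 'loss function'
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(500):                  # the 'training phase'
    y_hat = model(x)                     # predicted values
    loss = loss_fn(y_hat, y)             # gap between prediction and truth
    opt.zero_grad()
    loss.backward()                      # backpropagation computes gradients
    opt.step()                           # update the hidden-layer matrices

print(f"final loss: {loss.item():.4f}")
```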
Now all this is not perfect, but it is close enough to ignore its imperfections. So now, we have established that no matter what this function might be, we can obtain an accurate enough approximation.
What you have just read so far is a layman version of the so-called “Universal Approximation Theorem”. DL currently has numerous algorithms built over this simple theory and their applications are forever exploding. We will use this theorem as a piece of the bigger puzzle of our problem statement.
Our first and foremost problem statement is predicting the quiescent period of the ship so that we can land safely.
What’s in an LSTM?
What we primarily use for our solution is the Long Short-Term Memory network (‘LSTM’), a powerful DL algorithm that aims to mimic simple properties of the brain by implementing functions like memory retention, memory loss and attention, to name a few.
Essentially, LSTMs try to ‘remember’ events, both recent and historical, in a condensed form factor, called the cell ‘state’, essentially a matrix. Whenever new data comes into the LSTM network, it updates this ‘cell state’ as and when required and even deletes a few values if they are deemed unnecessary.
Let us take an example to illustrate how this works. Suppose I defy storm warnings on the radio and go fishing in the Bay of Bengal, and suppose I got hit by a huge sea wave 30 seconds ago. Big ‘surprise’ sea waves are still ‘waves’! They behave quite similarly to radio waves or any waves in general. So, from the height and speed of the wave, I can analytically calculate when the next peak of that wave will occur. Suppose this number turns out to be 20 seconds; I can then be prepared not to attempt any landing around that time!
The LSTM network does this without any effort from me. I needn’t explain the laws that govern wave propagation to my LSTM network! As long as I keep feeding inputs regarding the properties of the waves hitting the ship, it will store this information as elements in the cell state and will automatically correlate parameters, to tell me that there will be another disturbance – say 20 seconds into the future.
This is just one example; in reality, LSTMs have a slightly more complicated ‘under the hood’ working, which enables them to give us a lot more insight and various different outputs, as per the user’s needs.
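The following is a minimal sketch of this idea using PyTorch’s nn.LSTM: a sequence of (synthetic) wave-height readings goes in, the carried hidden and cell state summarise the history, and a single next value comes out. The input being wave height and the one-value output are illustrative assumptions, not the project’s actual inputs and outputs:

```python
import torch
import torch.nn as nn

class NextValuePredictor(nn.Module):
    """Feeds a sequence through an LSTM and predicts the next value
    from the final hidden state (which condenses the cell 'memory')."""
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, seq):                      # seq: (batch, time, 1)
        out, (h_n, c_n) = self.lstm(seq)         # h_n, c_n: the carried state
        return self.head(h_n[-1])                # one predicted next value

# 60 seconds of (synthetic) wave-height readings for one ship
wave = torch.sin(torch.linspace(0, 12, 60)).reshape(1, 60, 1)
model = NextValuePredictor()
print(model(wave).shape)                         # torch.Size([1, 1])
```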
Putting the Pieces together:
I modelled my LSTM such that it predicts future ship motion, and from that the quiescent period, by taking the past sea state and ship motion as inputs. A lot of research is being done in designing the network innards using knowledge of dynamic systems and control theory.
Once you understand this fundamental concept, controlling the landing of the helicopter is a similar exercise. A second LSTM network now takes the quiescent period and the predicted motion as inputs and outputs the control values, such as the rotor throttle to be set, the pitch of the blades, etc., so that the helicopter can land safely.
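Below is a purely structural sketch of that two-stage idea. The layer sizes, the input and output dimensions (six motion axes, four control channels) and the way the two networks are chained are my own simplifying assumptions and not the actual project architecture; for brevity, I feed only the predicted motion into the controller here and omit the quiescent-period input:

```python
import torch
import torch.nn as nn

class MotionPredictor(nn.Module):
    """Stage 1: past sea state + ship motion -> predicted future motion."""
    def __init__(self, in_dim=8, hidden=64, horizon=20, axes=6):
        super().__init__()
        self.horizon, self.axes = horizon, axes
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon * axes)

    def forward(self, history):                    # (batch, time, in_dim)
        _, (h_n, _) = self.lstm(history)
        return self.head(h_n[-1]).view(-1, self.horizon, self.axes)

class LandingController(nn.Module):
    """Stage 2: predicted motion -> control values (throttle, blade pitch, ...)."""
    def __init__(self, hidden=64, axes=6, controls=4):
        super().__init__()
        self.lstm = nn.LSTM(axes, hidden, batch_first=True)
        self.head = nn.Linear(hidden, controls)

    def forward(self, predicted_motion):            # (batch, horizon, axes)
        _, (h_n, _) = self.lstm(predicted_motion)
        return self.head(h_n[-1])                   # one set of control values

history = torch.randn(1, 120, 8)                    # 120 past time steps
controls = LandingController()(MotionPredictor()(history))
print(controls.shape)                               # torch.Size([1, 4])
```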
LSTMs, LSTMs everywhere – not a drop of training data?
Finding an adequate amount of data to train our LSTM (and most neural networks) to identify and make correlations is a key difficulty in Deep Learning. Evidently, experimental data in this case is hard to come by. Therefore, we ‘synthesize’ data through simulations using theoretical models. We aim to build a first version of the solution using the simulated data, which can then be fine-tuned as and when experimental data becomes available, using a method called “Transfer Learning”.
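In code, one common way of doing this kind of fine-tuning (this is a sketch of the general idea, not the project’s exact recipe) is to reuse the weights learned on simulated data, freeze the earliest layers, and continue training the rest at a lower learning rate on the scarce experimental data:

```python
import torch
import torch.nn as nn

model = nn.Sequential(               # same architecture that was trained on simulated data
    nn.Linear(8, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 6),
)
# model.load_state_dict(torch.load("pretrained_on_simulated.pt"))  # hypothetical checkpoint

for p in model[0].parameters():      # freeze the first layer's learned features
    p.requires_grad = False

opt = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-4,                         # lower learning rate for gentle fine-tuning
)
loss_fn = nn.MSELoss()

# Placeholder tensors standing in for the scarce experimental data
x_real, y_real = torch.randn(64, 8), torch.randn(64, 6)
for step in range(100):
    loss = loss_fn(model(x_real), y_real)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"fine-tuned loss: {loss.item():.4f}")
```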
You may wonder what exactly I mean by ‘synthesizing’ or ‘simulating’ data. Let me take you through an example here.
Ship dynamics is a highly researched field, with well-defined theories dating back to the World Wars. The general dynamic system is a tailored, multi-axis version of the mass-spring-damper system:
Mass-Spring-Damper System:

m (d²x/dt²) + c (dx/dt) + k x = F(t)

Ship Dynamic Equation:

Σ_k [ (M_jk + A_jk) (d²η_k/dt²) + B_jk (dη_k/dt) + C_jk η_k ] = F_j,   j = 1, …, 6

Here η_k represents the disturbance along the k-th axis of the ship.
I won’t go into further specifics of this equation, but you should understand that its coefficients are very difficult to obtain analytically, and as an added bonus they all form 6 x 6 matrices! (both j and k run from 1 to 6).
Experiments have to be done on a ship-by-ship basis, by restricting the ship’s motion to particular axes and thereby isolating individual terms of the matrices, which is tedious. There are also theories available for calculating them, but they are unfortunately not as accurate as you would expect them to be.
Much like the weight matrices of an LSTM determine the function it represents, these coefficient matrices determine the overall linear equation of the system, and that equation is a great approximation of the real motion when the coefficients are accurately calculated.
We try to simulate different ship models. Some of these terms can be set to zero by design (for instance, by using symmetric ship designs), and we can use these primitive models to build up to a complex, full ship model. We use analytical theories to do this, and a few experimental results are referenced to make sure we aren’t deviating too much from the real world.
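As a minimal, single-axis illustration of what ‘synthesizing’ data can look like, the sketch below simply integrates a mass-spring-damper-style equation under a sinusoidal wave force and records the resulting motion as training data. All the coefficient values and the forcing are made-up stand-ins for the tabulated and experimental ones the real models use:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative single-axis 'ship' coefficients (made up, not from a real hull)
m_plus_a, b, c = 1.0e6, 2.0e5, 8.0e5          # inertia + added mass, damping, restoring
wave_amp, wave_freq = 3.0e5, 0.5              # sinusoidal wave-force stand-in

def deck_dynamics(t, state):
    eta, eta_dot = state                      # displacement and velocity of one axis
    force = wave_amp * np.sin(wave_freq * t)  # simplified wave excitation
    eta_ddot = (force - b * eta_dot - c * eta) / m_plus_a
    return [eta_dot, eta_ddot]

t_eval = np.arange(0, 600, 0.5)               # 10 minutes sampled at 2 Hz
sol = solve_ivp(deck_dynamics, (0, 600), [0.0, 0.0], t_eval=t_eval)

# 'Synthetic' training series: time, deck displacement, deck velocity
synthetic_data = np.stack([sol.t, sol.y[0], sol.y[1]], axis=1)
print(synthetic_data.shape)                   # (1200, 3)
```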
Deep Learning is a beautiful concept and can be applied across many different fields. Though it may not be apparent at first, careful analysis and modelling of the required outputs and the available inputs allow us to build accurate models without the need for complex theory or tedious experiments.
Varun Vankineni is a 5th-year Dual Degree Student from the Department of Aerospace Engineering. This article sheds light on the work he is pursuing as his Dual Degree Project in the Rotorcraft Lab, under the mentorship of Dr. Ranjith Mohan, alongside a team of researchers: Harshit Singh, Mohammed Ajmal and Manoj Velmurugan.
References:
Images Source: Beck, R. F., Cummins, W. E., Dalzell, J. F., Mandel, P., & Webster, W. C. (1989). Principles of naval architecture, motions in waves. The Society of Naval Architects and Marine Engineers (Vol. 3).
Video Source: http://www.prismdefence.com/