

End-to-end learning for autonomous driving


TORCS, The Open Racing Car Simulator, is an interesting tool for AI racing. (If you want to know how to install it, see 2017/07/18 - [TORCS] - How to install TORCS on ubuntu 16.04LTS.) You can build a racing robot that runs on TORCS tracks according to your own driving policy. Very specific information such as speed, position, and track geometry can be obtained in real time, so you can build a solid driving algorithm on top of it.

 

In this post, I will show you a self-driving car that runs on TORCS tracks. I collected a large amount of driving experience data (front-view camera image / steering wheel angle pairs) and trained a convolutional neural network (CNN) on it, so that the CNN can predict the desired steering wheel angle from an input driving image. The self-driving car then steers according to the angle produced by the CNN.

 

 

Collecting driving experience data


I installed TORCS under Ubuntu 16.04 LTS. In order to obtain driving experience data from TORCS, I modified the main source code of TORCS.

 

(Actually, the modification is based on the work of Chen et al., "DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving". You can download the original modified version from http://deepdriving.cs.princeton.edu/)

 

The following explains how I collected driving experience data and drove a racing car using the trained CNN.

 

 

I made a racing robot (which drives according to a very simple algorithm) in order to collect the driving experience data. The data (front-view camera image / steering wheel angle pairs) was collected while the racing robot ran on TORCS tracks. I obtained about 40,000 pairs from six 2-lane tracks.
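The post does not show the robot's driving algorithm, but a minimal data-collection policy could look something like the sketch below. It is an assumption on my part: a proportional controller that steers toward the track axis using the car's heading angle and lateral offset, two quantities TORCS exposes to robots. The gain `k_pos` is a made-up tuning value.

```python
def simple_steering(angle_to_track, track_pos, k_pos=0.3):
    """Toy data-collection policy (not the author's actual code).
    angle_to_track: angle between car heading and track axis (radians)
    track_pos: normalized lateral offset from track center (-1..1)
    Returns a steering command clipped to [-1, 1]."""
    steer = angle_to_track - k_pos * track_pos
    return max(-1.0, min(1.0, steer))
```

Logging the camera frame together with this command at each simulation step yields the (image, angle) pairs used for training.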

 

The important point I would like to note here is that training data manipulation is essential for successful driving.

 

For example, the distribution of the collected steering wheel angles usually shows a strong peak at zero. If you train on this raw data, your car may fail to drive on curvy roads, because the CNN learns to predict near-zero angles almost everywhere.
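One common way to flatten that peak (a sketch of the general idea, not the exact manipulation used in this post) is to randomly drop most of the near-zero-angle samples before training. The `keep_prob` and `threshold` values below are illustrative assumptions:

```python
import numpy as np

def balance_straight_driving(angles, keep_prob=0.1, threshold=0.02, seed=0):
    """Randomly drop most near-zero-angle samples so the peak at 0
    no longer dominates training. Returns indices of kept samples."""
    rng = np.random.default_rng(seed)
    angles = np.asarray(angles)
    near_zero = np.abs(angles) < threshold      # the over-represented bin
    keep = ~near_zero | (rng.random(len(angles)) < keep_prob)
    return np.flatnonzero(keep)
```

Other common tricks in the same spirit include flipping images horizontally (negating the angle) to balance left and right turns.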

 

 

 

There are some blogs and GitHub pages that provide tips about manipulating the collected data for training (for example, https://github.com/jeremy-shannon/CarND-Behavioral-Cloning-Project). Still, I recommend finding solutions for successful driving (or training) by yourself.

 

 

 

 

Training CNN


The detailed specification of the CNN structure is as follows.

 

 

The base structure follows the NVIDIA paper "End to end learning for self-driving cars". Some modifications have been added to improve training performance. For example, I use a 'tanh' activation function at the output layer because the steering wheel angle value is in the range [-1, 1].

 

The input to the CNN is a normalized RGB image of size (120x300). The original size of the collected image data is (480x640x3). Each image is downsampled to (240x300) using bicubic interpolation, and the bottom area of the resized image is used as the CNN input.

 

The cropped area is normalized by the average pixel level and the standard deviation of the pixel levels before being fed to the CNN.
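The resize/crop/normalize steps described above can be sketched as follows. This is my reconstruction using Pillow and NumPy (the post's actual pipeline runs in C++); the per-image standardization is as described, and the bottom 120 rows of the 240x300 resize give the (120x300) input.

```python
import numpy as np
from PIL import Image

def preprocess(frame):
    """frame: HxWx3 uint8 array (480x640x3 in this setup).
    Resize to 240x300 with bicubic interpolation, keep the bottom
    120 rows, then standardize to zero mean / unit variance."""
    img = Image.fromarray(frame).resize((300, 240), Image.BICUBIC)
    crop = np.asarray(img, dtype=np.float32)[120:, :, :]   # bottom 120x300
    return (crop - crop.mean()) / (crop.std() + 1e-8)
```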

 

The CNN was trained by minimizing the MSE cost (MSE between the ground-truth steering wheel angles and the angles from the CNN) with the Adam optimizer. The coefficients in each layer are initialized by the Xavier method. The mini-batch size is set to 50, and the maximum number of epochs is set to 100. The CNN coefficients that result in the minimum average MSE cost on the test set (10% of the training set) are finally used for CNN driving. The following is the training result.
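To make the recipe concrete, here is the same training loop in miniature with NumPy, using a linear model as a stand-in for the CNN: Xavier initialization, mini-batches of 50, 100 epochs, MSE cost, and hand-rolled Adam updates. The toy data and learning rate are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data standing in for (image, angle) pairs.
X = rng.normal(size=(500, 8)).astype(np.float32)
true_w = rng.normal(size=(8, 1)).astype(np.float32)
y = X @ true_w

# Xavier (Glorot) uniform initialization for the single weight layer.
limit = np.sqrt(6.0 / (8 + 1))
w = rng.uniform(-limit, limit, size=(8, 1)).astype(np.float32)

# Adam optimizer state.
m = np.zeros_like(w); v = np.zeros_like(w)
lr, b1, b2, eps, t = 1e-2, 0.9, 0.999, 1e-8, 0

for epoch in range(100):                 # 100 epochs, as in the post
    for i in range(0, len(X), 50):       # mini-batch size 50
        xb, yb = X[i:i+50], y[i:i+50]
        grad = 2.0 * xb.T @ (xb @ w - yb) / len(xb)   # d(MSE)/dw
        t += 1
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        w -= lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)

final_mse = float(np.mean((X @ w - y) ** 2))
```

In the real pipeline, a checkpoint is kept whenever the held-out MSE hits a new minimum, and that best checkpoint is the one used for driving.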

 

 

CNN driving


The following screenshot was obtained when I ran CNN driving on my PC.

 

 

The terminal at the upper-left corner is for running TORCS. The upper-right terminal is for reading driving images from TORCS (see the window named "Image from TORCS"). The program running on that terminal reads, resizes, crops, and normalizes images from TORCS and delivers them to the CNN. The bottom-right terminal is for running the CNN computation. (I trained the CNN using TensorFlow. The trained CNN is saved and then loaded by a C++ program running on that terminal. For more details, see my previous posts.)

 


This video shows how the CNN drives on the 'dirt-3' track of TORCS. Note that the speed of the car is controlled by a simple algorithm, independent of the CNN outputs.
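The post does not spell out that speed-control algorithm, but one plausible minimal scheme (my assumption, not the author's code) is a bang-style proportional controller around a target speed:

```python
def throttle_brake(speed, target_speed=60.0, gain=0.05):
    """Very simple longitudinal control, independent of the steering CNN:
    accelerate below the target speed, brake above it.
    Returns (accel, brake), each clipped to [0, 1]."""
    err = target_speed - speed
    if err >= 0:
        return min(1.0, gain * err), 0.0
    return 0.0, min(1.0, -gain * err)
```

Decoupling speed from steering like this keeps the CNN's job to lateral control only.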

 


 

The following video shows how the CNN understands real driving images. The yellow box in the right video is a visualization of the first conv. layer output. Note that I re-trained the CNN using training images both with and without video compression. The reason is that real driving images are compressed severely, so a CNN trained only on images without video compression cannot produce reliable steering wheel angles.
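A visualization like that yellow box typically amounts to min-max scaling one feature map of the first conv layer into an 8-bit image for display. The exact rendering used in the video is not shown, so the helper below is a generic sketch of the idea:

```python
import numpy as np

def activation_to_image(act):
    """Min-max normalize one conv-layer feature map to an 8-bit
    grayscale image for display."""
    act = np.asarray(act, dtype=np.float32)
    lo, hi = act.min(), act.max()
    scaled = (act - lo) / (hi - lo + 1e-8)   # map to [0, 1]
    return (scaled * 255.0).astype(np.uint8)
```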