

SMART2 – Advanced integrated obstacle and track intrusion detection system for smart automation of rail transport

The aim is to develop image-processing-based software for obstacle detection (OD) and track intrusion detection (TID). A holistic approach to autonomous obstacle detection for railways would enlarge the detection area: in addition to long-range OD on straight rail tracks, it would cover areas behind curves, slopes, tunnels and other elements that block the train’s view of the rail tracks. The recorded data will be processed to inform a cloud-based Decision Support System (DSS) about possible obstacles and track intrusions in the sub-systems' fields of view. The DSS will integrate the information coming from the three OD&TID sub-systems, make the final OD&TID decision and suggest possible actions for train control. The following image shows the three sub-systems:

1. On-board system where sensors are mounted at the front of the train.

2. Airborne system which examines the rail tracks that cannot yet be seen by the on-board system.

3. Trackside system that continuously monitors track sections with an increased risk of accidents, such as level crossings.
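The fusion step performed by the DSS can be illustrated with a minimal sketch. The project's actual decision logic is not described here, so everything below is an assumption made for illustration: the `Report` record, the `fuse` and `suggest_action` functions, the confidence threshold and the braking distance are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    """Hypothetical message sent by one OD&TID sub-system to the DSS."""
    source: str        # "on-board", "airborne" or "trackside"
    obstacle: bool     # True if the sub-system flagged an obstacle or intrusion
    confidence: float  # detector confidence in [0, 1]
    distance_m: float  # estimated distance from the train along the track


def fuse(reports: List[Report], min_confidence: float = 0.5) -> Optional[Report]:
    """Return the nearest sufficiently confident obstacle report, or None."""
    hits = [r for r in reports if r.obstacle and r.confidence >= min_confidence]
    return min(hits, key=lambda r: r.distance_m) if hits else None


def suggest_action(nearest: Optional[Report],
                   braking_distance_m: float = 800.0) -> str:
    """Map the fused result to a suggested action.

    The 800 m default is an illustrative assumption, not a project value.
    """
    if nearest is None:
        return "proceed"
    if nearest.distance_m <= braking_distance_m:
        return "brake"
    return "warn"


reports = [
    Report("on-board", False, 0.9, 0.0),
    Report("airborne", True, 0.8, 1500.0),
    Report("trackside", True, 0.3, 200.0),  # below the confidence threshold
]
nearest = fuse(reports)
print(nearest.source, suggest_action(nearest))
```

In this example the low-confidence trackside report is ignored and the airborne report, which covers track the on-board sensors cannot yet see, determines the suggestion.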


Several sensors are used in the SMART2 project to analyze the environment. In addition to the images from an RGB camera, images from a thermal imaging camera, a night vision camera and a SWIR (short-wave infrared) camera are currently used, depending on the weather and lighting conditions, to ensure reliable detection of the environment at any time of day or night.

To recognize distant objects as reliably as possible, various convolutional neural networks (CNNs) such as YOLOv3, RetinaNet and CenterNet were investigated to find an appropriate speed-accuracy trade-off: detection must be highly reliable and, at the same time, run in real time. With the help of transfer learning, the dataset created in the SMART and SMART2 projects will be used to improve specificity and sensitivity. Artificial intelligence can only ever be as smart as the data it is fed. For this reason, a large training dataset is essential, and research is currently being conducted into how synthetic training images can be created with the help of a Generative Adversarial Network (GAN). In addition, a feedforward neural network called DisNet is trained to estimate the distance between the sensor and a detected object based on the object’s class and location in the image. To extend the dataset for training DisNet, a projective transformation was applied to real-world data.
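A projective transformation maps a pixel (x, y) to a new position through a 3x3 matrix H followed by a division by the third homogeneous coordinate. The sketch below shows how an annotated bounding box could be moved through such a transformation. How the project actually generated its additional DisNet samples, and how it derived the matching distance labels, is not stated here; the function names and the example matrix are assumptions for illustration only.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def apply_homography(H: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map pixel (x, y) through the 3x3 projective transformation H."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point is mapped to infinity")
    return xh / w, yh / w


def warp_box(H: Matrix, box: Box) -> Box:
    """Transform the four corners and return their axis-aligned bounding box."""
    x0, y0, x1, y1 = box
    corners = [apply_homography(H, x, y)
               for x, y in ((x0, y0), (x1, y0), (x1, y1), (x0, y1))]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Example matrix (assumed): scale by 0.5 and shift by (10, 20) pixels.
H = [[0.5, 0.0, 10.0],
     [0.0, 0.5, 20.0],
     [0.0, 0.0, 1.0]]
print(warp_box(H, (100.0, 200.0, 140.0, 300.0)))
```

The same matrix would be applied to the image itself so that the warped box and the warped pixels stay aligned.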

To interpret the entire scene in the vicinity of the investigated railway track section, object detection alone is not sufficient. In addition, a trajectory is predicted for each detected object on the basis of the previous frames. This should enable a better risk evaluation.
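The prediction method used in the project is not specified here. As a minimal sketch of the idea, the example below extrapolates the centre of a detected object with a constant-velocity model, which is an assumption chosen for simplicity. The function name `predict_trajectory` and the input format are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) centre of a detection in pixels


def predict_trajectory(history: List[Point], steps: int) -> List[Point]:
    """Extrapolate future positions from positions in consecutive frames.

    Assumes constant velocity: the average displacement per frame over the
    observed history is added repeatedly to the last known position.
    """
    if len(history) < 2:
        raise ValueError("at least two previous positions are required")
    frames = len(history) - 1
    vx = (history[-1][0] - history[0][0]) / frames
    vy = (history[-1][1] - history[0][1]) / frames
    x, y = history[-1]
    return [(x + vx * k, y + vy * k) for k in range(1, steps + 1)]


# An object moving 2 px right and 1 px down per frame.
print(predict_trajectory([(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)], steps=2))
```

A learned or filter-based predictor could replace the constant-velocity assumption without changing the interface.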

However, a good risk assessment is only possible if the region of interest (ROI) is clearly defined. In the SMART2 project, the ROI is the area in which the train may cause damage to living beings or objects, or vice versa. For this reason, it is important to detect the rail tracks. Two different approaches were developed for this:

1. Detection of the rail tracks by instance segmentation

2. Detection of the clearance of the rolling stock with the help of an object detection network

Both approaches have advantages and disadvantages and can help to assess the risk of a real-world scene. The first approach introduced instance segmentation into the project. YOLACT++, a fully convolutional model for real-time instance segmentation, was trained to detect potentially hazardous objects (such as people, cars and trucks) as well as the rail tracks. The rail track detection can be used to determine exactly which pixels of the image belong to the ROI and which do not. Furthermore, it was investigated whether the masks of the detected objects enable more accurate distance estimation with DisNet than bounding box detection does.
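Once the segmentation model has produced a rail track mask and an object mask, ROI membership can be checked pixel by pixel. The sketch below assumes both masks are binary grids of the same size, represented as nested lists. The function name and the mask format are illustrative assumptions, not the output format of YOLACT++.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel belongs to the instance, 0 = background


def overlap_fraction(object_mask: Mask, roi_mask: Mask) -> float:
    """Return the fraction of object pixels that lie inside the ROI mask."""
    object_pixels = 0
    inside = 0
    for object_row, roi_row in zip(object_mask, roi_mask):
        for o, r in zip(object_row, roi_row):
            if o:
                object_pixels += 1
                if r:
                    inside += 1
    return inside / object_pixels if object_pixels else 0.0


roi = [[0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0]]
person = [[0, 0, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0]]
print(overlap_fraction(person, roi))  # one of two object pixels is in the ROI
```

A threshold on this fraction is one possible way to decide whether an object counts as being on the track.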

The second approach is based on the idea of detecting the rail tracks with the help of several bounding boxes. For this purpose, an object detection network was trained with the two classes Primary Railtrack and Secondary Railtrack, so that the network learned to recognize which rail tracks are relevant. Subsequently, the clearance of the rolling stock could be determined and visualized on the basis of these bounding boxes, so that information about the true ROI (the area through which the train will run) can be extracted. This information can easily be merged with the predicted trajectories of detected objects to evaluate the risk potential.
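Merging the box-based ROI with predicted trajectories can be sketched as a simple containment test. The example assumes the ROI is given as a list of axis-aligned boxes and the trajectory as a list of predicted image positions. The function names, the data layout and the idea of reporting the first step that enters the ROI are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (borders included)."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def first_roi_entry(trajectory: List[Point], roi_boxes: List[Box]) -> Optional[int]:
    """Return the 1-based prediction step at which the object enters the ROI.

    Returns None if no predicted position falls inside any ROI box.
    """
    for step, point in enumerate(trajectory, start=1):
        if any(point_in_box(point, box) for box in roi_boxes):
            return step
    return None


roi_boxes = [(100.0, 300.0, 200.0, 400.0), (120.0, 200.0, 180.0, 300.0)]
trajectory = [(50.0, 350.0), (90.0, 350.0), (130.0, 350.0)]
print(first_roi_entry(trajectory, roi_boxes))
```

The earlier the entry step, the less time remains before a possible conflict, so the returned value could serve as one input to a risk score.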