2 RACING CAR ENVIRONMENT PERCEPTION AND TARGET DETECTION
Before applying artificial intelligence to fleet
planning and route planning, accurately perceiving
the environment and collecting information are
undoubtedly the primary prerequisites for making
precise judgments. Take Formula One as an example: for the convenience of collecting information, a standard ECU system is usually installed to gather vehicle data and transmit it to the telemetry center, so that team staff can understand vehicle performance in real time and intuitively check engine health, tire wear, and fuel consumption. Many recent research reports have mentioned perception systems, such as Lupin, designed to achieve end-to-end autonomous driving and suited to rapid judgment and planning under extreme working conditions. As mentioned above, if artificial intelligence is to be applied to motorsports, it must identify track conditions and the positions of competing vehicles on the same track accurately and in real time, and it must judge and exploit special terrain such as shoulders and buffer zones. Therefore, in order to obtain this information, the car may need to be equipped with a series of commonly used sensors, such as cameras, LiDAR, and millimeter-wave radar, combined with advanced algorithms that interpret the sensor data (Williams et al., 2017).
2.1 Racing Car Target Detection Based on Computer Vision
Unlike traditional racing car sensors, object detection algorithms in computer vision, especially those based on deep learning, play a significant role in processing the captured image data. Currently, there are two mainstream families of algorithms: "One-Stage Detectors" and "Two-Stage Detectors" (Kapania et al., 2016; Kendall et al., 2019).
Firstly, regarding "One-Stage Detectors", this type of algorithm is characterized by extremely high recognition speed and efficiency. Much like the human eye, it can form a bounding box around the edge of an object and infer its category from a single pass over the complete image. Such a fast-response algorithm is highly suitable for real-time scenarios that require reactions at the millisecond level. Moreover, in the F1TENTH Challenge, researchers exploited precisely this advantage of You Only Look Once (YOLO) among One-Stage Detectors (Redmon et al., 2016), achieving ultra-high-frequency detection of track drop-offs.
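As a concrete illustration, the sketch below shows how a one-stage detector could be queried for each incoming camera frame. It assumes the open-source ultralytics YOLO package; the weights file, class handling, and confidence threshold are illustrative choices rather than details taken from the cited F1TENTH work.

# Minimal sketch: one-stage detection on a camera frame (assumes the ultralytics package).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # small pretrained model; the weights file name is illustrative

def detect_one_stage(frame, conf_threshold=0.5):
    """Run a single forward pass and return bounding boxes, class ids, and scores."""
    results = model(frame, verbose=False)[0]      # one image in -> one Results object out
    detections = []
    for box in results.boxes:
        score = float(box.conf)
        if score < conf_threshold:
            continue
        x1, y1, x2, y2 = box.xyxy[0].tolist()     # pixel coordinates of the bounding box
        detections.append({"bbox": (x1, y1, x2, y2),
                           "class_id": int(box.cls),
                           "score": score})
    return detections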
Secondly, there are "Two-Stage Detectors", among which the most representative one is Faster R-CNN. This type of algorithm is characterized by more accurate analysis results and very high detection precision. It first scans the image and proposes Region Proposals that may contain objects, and then conducts a detailed analysis and classification of each of these regions one by one. Although it is slower than "One-Stage Detectors", its higher accuracy can greatly enhance the feasibility and safety of the resulting strategy.
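For comparison, the sketch below runs the pretrained Faster R-CNN model shipped with torchvision on a single image; the score threshold and input handling are illustrative assumptions rather than settings from the cited works.

# Minimal sketch: two-stage detection with torchvision's pretrained Faster R-CNN.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_two_stage(image, score_threshold=0.7):
    """Stage one proposes candidate regions; stage two classifies and refines each of them."""
    with torch.no_grad():
        outputs = model([to_tensor(image)])[0]    # dict with 'boxes', 'labels', 'scores'
    keep = outputs["scores"] >= score_threshold
    return {"boxes": outputs["boxes"][keep],      # (N, 4) boxes in pixel coordinates
            "labels": outputs["labels"][keep],    # COCO class indices
            "scores": outputs["scores"][keep]}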
The reason these algorithms can recognize objects on the track is that they "learn" to do so. Researchers prepare in advance a dataset containing many track images and then manually process these images to a standard, for example by drawing racing lines on the track, marking reference points, and completing track details. They label each opponent's car and convert the images into structured, semantic information that machines can understand (Geiger et al., 2012). This also gives the machine learning stage more direct input (Ren et al., 2015).
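To make the idea of structured annotations concrete, the sketch below converts a manually drawn pixel bounding box into the normalized label format commonly used to train one-stage detectors; the class map and image size are hypothetical.

# Sketch: turning a manual bounding-box annotation into a structured, machine-readable label.
# One line per object: class_id x_center y_center width height, all normalized to [0, 1].

CLASS_IDS = {"opponent_car": 0, "cone": 1, "reference_point": 2}  # hypothetical label map

def to_normalized_label(class_name, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert pixel corner coordinates into a normalized detector training label."""
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{CLASS_IDS[class_name]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: an opponent car annotated at pixels (410, 220)-(655, 380) in a 1280x720 frame.
print(to_normalized_label("opponent_car", 410, 220, 655, 380, 1280, 720))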
2.2 Racing Perception Technology and Multi-Sensor Fusion
Considering the uncontrollability of track conditions, such as weather, humidity, and light intensity, it is crucial to introduce sensors based on other physical principles so that their information complements one another and perception remains stable. First of all, there is LiDAR. It accurately measures the distance to nearby objects by emitting laser beams that are invisible to the human eye and calculating the time it takes for each beam to reflect and return. It can instantly construct an accurate "three-dimensional point cloud map" composed of a vast number of data points. This map is like a virtual 3D model, clearly depicting the geometric outline of the track, the undulations of the road surface, and even the precise shapes and positions of obstacles.
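As an illustration of the underlying geometry, the sketch below converts round-trip times and beam angles into 3D points; the scan layout and variable names are assumptions for demonstration rather than any specific LiDAR driver's interface.

# Sketch: time-of-flight ranging and polar-to-Cartesian conversion for a LiDAR point cloud.
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def range_from_time_of_flight(round_trip_time_s):
    """Distance = (speed of light * round-trip time) / 2."""
    return SPEED_OF_LIGHT * round_trip_time_s / 2.0

def to_point_cloud(ranges_m, azimuths_rad, elevations_rad):
    """Convert per-beam range/azimuth/elevation arrays into an (N, 3) XYZ point cloud."""
    r = np.asarray(ranges_m)
    az = np.asarray(azimuths_rad)
    el = np.asarray(elevations_rad)
    x = r * np.cos(el) * np.cos(az)
    y = r * np.cos(el) * np.sin(az)
    z = r * np.sin(el)
    return np.stack([x, y, z], axis=1)

# Example: a return after 200 ns corresponds to a target roughly 30 m away.
print(range_from_time_of_flight(200e-9))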
Another feasible option is millimeter-wave radar. Compared with LiDAR, it detects targets with radio waves and can directly measure their relative speed, which greatly helps to avoid collisions.
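A minimal sketch of the underlying Doppler relation is given below; the carrier frequency and measured shift are illustrative values rather than figures from the cited literature.

# Sketch: estimating relative (radial) speed from the Doppler shift measured by a radar.
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def relative_speed_from_doppler(doppler_shift_hz, carrier_frequency_hz):
    """Radial speed v = (Doppler shift * c) / (2 * carrier frequency)."""
    return doppler_shift_hz * SPEED_OF_LIGHT / (2.0 * carrier_frequency_hz)

# Example: a 77 GHz radar seeing a 10 kHz Doppler shift -> about 19.5 m/s closing speed.
print(relative_speed_from_doppler(10_000.0, 77e9))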
However, when fusing these heterogeneous sensors, Early Fusion and Late Fusion strategies are usually adopted. Firstly, the Early Fusion strategy typically blends the data at the most primitive (raw) stage. A typical