network structure, the model is made robust to noise by optimizing the network architecture, for example by training two identical neural networks that guide each other, each learning from the other's loss values to avoid overfitting to noisy labels and thereby increase robustness (Han B, Yao Q, Yu X, 2018). The third is optimization based on the loss function, which constructs a loss function that is robust to label noise and reduces the influence of label noise through the robustness of the loss function itself (Zhang Z, Sabuncu M, 2018; Wang Y, Ma X, Chen Z, et al., 2019). The network-structure and loss-function optimizations both aim to increase the robustness of the model itself. However, since it is impossible to judge during modeling whether the data contain label noise, the performance of such models cannot be guaranteed. Optimization based on the training data is therefore more common (Zhang ZZ, Jiang GX & Wang JW, 2020).
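As an illustration of the loss-function approach, the generalized cross entropy of Zhang & Sabuncu (2018) interpolates between cross-entropy and the bounded, noise-tolerant mean absolute error. The sketch below is a minimal NumPy rendering of that published loss, not code from the present paper:

```python
import numpy as np

def generalized_cross_entropy(probs, labels, q=0.7):
    """Noise-robust loss L_q = (1 - p_y^q) / q (Zhang & Sabuncu, 2018).

    As q -> 0 the loss approaches cross-entropy; at q = 1 it reduces to
    1 - p_y, which is bounded and therefore more tolerant of mislabeled
    samples than cross-entropy, whose penalty grows without bound.
    probs:  (N, C) array of predicted class probabilities
    labels: (N,)   array of integer class labels
    """
    p_y = probs[np.arange(len(labels)), labels]  # probability assigned to the given label
    return (1.0 - p_y ** q) / q

# A confidently wrong prediction (possibly a noisy label) incurs only a
# bounded penalty of at most 1/q.
probs = np.array([[0.9, 0.1], [0.1, 0.9]])
labels = np.array([0, 0])  # the second label may be noise
losses = generalized_cross_entropy(probs, labels)
```

The bounded penalty is what limits how much a single mislabeled sample can dominate the gradient during training.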
Training-data optimization can be divided into two categories by processing method: noise sample removal (Sluban B, Gamberger D & Lavrač N, 2014) and noise sample relabeling (Wei Y, Gong C, Chen S, Liu T, Yang J & Tao D, 2020). For reasons of operational efficiency, sample removal is more common than relabeling (Frénay B, Verleysen M, 2014). However, removal can suffer from over-removal, in which the number of samples removed greatly exceeds the number of truly noisy samples, so that many clean samples are discarded as well. When evaluating a noise-removal method, it is therefore necessary to consider not only the proportion of clean samples remaining after removal but also the recall rate of clean samples.
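The two evaluation quantities just mentioned can be computed directly when ground-truth noise flags are available (as in controlled experiments). This is a small sketch under that assumption; the function name is ours, not the paper's:

```python
def removal_quality(is_noise, removed):
    """Evaluate a noise-removal result on data with known noise flags.

    is_noise: list of bools, True where the label is actually noisy
    removed:  list of bools, True where the filter removed the sample
    Returns (clean_ratio, clean_recall):
      - clean_ratio:  fraction of the kept samples that are clean
      - clean_recall: fraction of all clean samples that survive the filter
    Over-removal shows up as a high clean ratio paired with a low
    clean recall: the kept set is pure, but many clean samples were lost.
    """
    kept = [n for n, r in zip(is_noise, removed) if not r]
    clean_total = sum(1 for n in is_noise if not n)
    clean_kept = sum(1 for n in kept if not n)
    clean_ratio = clean_kept / len(kept) if kept else 0.0
    clean_recall = clean_kept / clean_total if clean_total else 0.0
    return clean_ratio, clean_recall

# Over-removal example: 1 of 5 samples is noisy, but the filter removes 3.
ratio, recall = removal_quality(
    [False, False, False, False, True],   # only the last sample is noisy
    [True, True, False, False, True],     # filter removed two clean samples too
)
```

Here the kept set is 100% clean, yet only half of the clean samples survive, which is exactly the failure mode a single purity metric would hide.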
In sample processing, confidence-based classification is most commonly used (Chen QQ, Wang WJ, Jiang GX, 2019), but confidence-based methods can only obtain their results after model training is complete, so their time cost is relatively high. At the same time, confidence-based methods strongly couple the classification result to the reliability of the training samples. The traditional approach classifies samples against a single fixed threshold (Chen QQ et al., 2019), but it is prone to prediction errors for samples whose values lie near the threshold. To address this, Zhang Zenghui et al. (2020) proposed a confidence-based local probability sampling method; however, it samples thresholds from a single interval, which makes it overly dependent on the manually set interval, and its performance varies considerably under different noise rates.
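The contrast between a fixed threshold and threshold sampling can be made concrete. The following sketch is purely illustrative (the function names and the single interval `[low, high]` are our assumptions, not the cited algorithms): with a fixed threshold, a sample just above the cutoff is always flagged, whereas a sampled threshold flags samples inside the interval only probabilistically:

```python
import random

def fixed_threshold_filter(losses, t):
    """Traditional single fixed threshold: loss above t => suspected noise.
    Samples with loss just above or just below t are classified rigidly,
    which is where prediction deviation tends to occur."""
    return [loss > t for loss in losses]

def sampled_threshold_filter(losses, low, high, seed=0):
    """Illustrative single-interval threshold sampling: each sample is
    compared against a threshold drawn uniformly from [low, high], so
    samples whose loss falls inside the interval are flagged only with
    some probability.  The result now depends on the manually chosen
    interval, which is the weakness noted in the text."""
    rng = random.Random(seed)
    return [loss > rng.uniform(low, high) for loss in losses]
```

Samples whose loss lies outside the interval are still classified deterministically; only the near-threshold region is softened.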
Taking model training as the dividing point, the whole model construction process can be divided into three stages: data processing before model training, network construction during model training, and other optimization operations after model training. Data processing mostly occurs in the first stage, before model training, after which the processed data are fed into different models for training. Performing data processing only in the first stage means that the second and third stages of model building cannot access the original data set. In particular, when suspected noise samples are handled by removal, the size of the data set shrinks, and data set size itself affects model training (Lei SH, Zhang H, Wang K, et al., 2019). Consequently, training a model on data preprocessed by a noise-filtering algorithm does not guarantee an improvement in classification accuracy.
This paper proposes a training method that applies weighted correction after filtering for data containing label noise. The main contributions of the proposed method are:
1.  A random-threshold label noise filtering algorithm over a double interval, based on the loss value, which improves sample filtering precision and recall while reducing time cost.
2.  A weighted correction training method applied to the filtered data: through secondary training, the weights of correctly classified samples and of weak sample categories are increased, thereby improving model accuracy.
3.  An experimental analysis of the influence of the noise ratio and model depth on the proposed method, providing reference data for subsequent applications.
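The idea behind the second contribution, up-weighting weak categories, can be sketched as follows. This is a hypothetical per-class weighting scheme consistent with the description above (the function name, the inverse-accuracy rule, and the `floor` parameter are our assumptions; the paper's exact scheme may differ):

```python
import numpy as np

def class_correction_weights(labels, correct, floor=0.1):
    """Hypothetical per-class weights for the secondary training pass:
    classes the first-pass model classifies poorly ("weak" classes)
    receive larger weights, so the correction training focuses on them.

    labels:  (N,) int class labels of the filtered samples
    correct: (N,) bool, whether the first-pass model got each sample right
    floor:   lower bound on per-class accuracy to keep weights finite
    """
    weights = {}
    for c in np.unique(labels):
        mask = labels == c
        acc = correct[mask].mean()
        weights[int(c)] = 1.0 / max(acc, floor)  # weak class => larger weight
    return weights

# Class 1 is only 50% correct in the first pass, so it is weighted 2x.
w = class_correction_weights(
    np.array([0, 0, 1, 1]),
    np.array([True, True, True, False]),
)
```

Any monotone decreasing function of per-class accuracy would serve the same purpose; inverse accuracy is simply the most direct choice.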
2  FILTERED WEIGHTED 
CORRECTION TRAINING 
METHOD 
The processing flow of weighted correction with filtered data (WCF) is divided into two parts: a double-interval noise-label filtering algorithm, whose purpose is to process the original data set, and a weighted correction training method, whose purpose is to apply the filtered data to the correction training of the model.
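The two-part flow just described can be expressed as a skeleton pipeline. All function names here are placeholders standing in for the paper's components, not its actual API:

```python
def wcf_pipeline(data, labels, train_fn, filter_fn, weight_fn):
    """Skeleton of the two-part WCF flow (placeholder functions):
      part 1: filter suspected noise labels from the original data set
      part 2: weighted correction training on the filtered data

    train_fn(data, labels, sample_weight=None) -> model
    filter_fn(model, data, labels) -> list of bools (True = keep)
    weight_fn(model, data, labels) -> per-sample weights
    """
    model = train_fn(data, labels)                 # initial training pass
    keep = filter_fn(model, data, labels)          # part 1: noise-label filtering
    fdata = [d for d, k in zip(data, keep) if k]
    flabels = [y for y, k in zip(labels, keep) if k]
    weights = weight_fn(model, fdata, flabels)     # emphasize correct / weak classes
    # part 2: secondary (correction) training on the filtered, weighted data
    return train_fn(fdata, flabels, sample_weight=weights)

# Minimal smoke run with stub components.
train_fn = lambda d, y, sample_weight=None: ("model", len(d))
filter_fn = lambda m, d, y: [v < 10 for v in d]      # toy rule: large loss => noise
weight_fn = lambda m, d, y: [1.0] * len(d)
result = wcf_pipeline([1, 2, 30], [0, 1, 0], train_fn, filter_fn, weight_fn)
```

The point of the skeleton is the data flow: the secondary training never sees the removed samples, only the filtered set and its correction weights.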