Improved Pest Detection in Insect Larvae Rearing with Pseudo-Labelling

and Spatio-Temporal Masking

Paweł Majewski

1 a

, Piotr Lampa

2 b

, Robert Burduk

1 c

and Jacek Reiner

2 d

Faculty of Information and Communication Technology, Wrocław University of Science and Technology, Poland

Faculty of Mechanical Engineering, Wrocław University of Science and Technology, Poland

Keywords:

Pseudo-Labelling, Spatio-Temporal, Optical Flow, Object Detection, Insect, Monitoring, Tenebrio Molitor.

Abstract:

Pest detection is an important application problem as it enables early reaction by the farmer in situations of

unacceptable pest infestation. Developing an effective pest detection model is challenging due to the problem

of creating a representative dataset, as episodes of pest occurrence under real rearing conditions are rare.

Detecting the pest Alphitobius diaperinus Panzer in mealworm (Tenebrio molitor) rearing, addressed in this

work, is particularly difﬁcult due to the relatively small size of detection objects, the high similarity between

detection objects and background elements, and the dense scenes. Considering the problems described, an

original method for developing pest detection models was proposed. The ﬁrst step was to develop a basic

model by training it on a small subset of manually labelled samples. In the next step, the basic model identiﬁed

low/moderate pest-infected rearing boxes from many boxes inspected daily. Pseudo-labelling was carried

out for these boxes, signiﬁcantly reducing labelling time, and re-training was performed. A spatio-temporal

masking method based on activity maps calculated using the Gunnar-Farneback optical ﬂow technique was

also proposed to reduce the numerous false-positive errors. The quantitative results conﬁrmed the positive

effect of pseudo-labelling and spatio-temporal masking on the accuracy of pest detection and the ability to

recognise episodes of unacceptable pest infestation.

1 INTRODUCTION

Insect pests cause signiﬁcant losses in the agricul-

tural sector every year (Oerke, 2006). Recently, an

increasing consumer demand for food greenness can

also be observed that favours smart solutions to con-

trol pest numbers and use chemicals, known as smart

pest management (Rustia et al., 2022).

Signiﬁcant advances in machine learning make re-

searchers eager to pursue the topic of pest detection,

mainly for crop pests (Li et al., 2021) and storage

pests (Zhu et al., 2022). Due to the difﬁculty of reg-

istering pests under real-world conditions, solutions

typically involved trapping pests through (1) sticky

paper traps (Rustia et al., 2021), (2) pheromone-based

traps (Sun et al., 2018), and (3) light traps (Bjerge

et al., 2021). The machine vision system, placed at

the appropriate location, enabled easy detection of

https://orcid.org/0000-0001-5076-9107

https://orcid.org/0000-0001-8009-6628

https://orcid.org/0000-0002-3506-6611

https://orcid.org/0000-0003-1662-9762

trapped pests. At the level of models/algorithms, re-

searchers proposed different solutions, where mainly

to be noted are: (1) models based on deep convo-

lutional networks (Jiao et al., 2020; Turkoglu et al.,

2022), (2) models based on transformers (Zhang et al.,

2021; Wang et al., 2023) and (3) classical image pro-

cessing methods (Nagar and Sharma, 2020). Among

the major current challenges identiﬁed by researchers

in pest detection are: (1) the difﬁculty of developing

large datasets with issues of data augmentation and

semi-supervised methods, (2) early detection of low

pest infestation and indirect symptoms, (3) detection

of pests when occlusion occurs, and (4) development

of speciﬁc solutions, model architectures for pest de-

tection problem as opposed to using off-the-shelf so-

lutions (Li et al., 2021; Ngugi et al., 2021).

Despite the considerable amount of work in the

area of detection of crop and storage pests, we do not

ﬁnd much research in the area of detection of pests in

insect farming, e.g. honeybee or mealworm (Tenebrio

molitor) (Siemianowska et al., 2013). Research has

already been undertaken on detecting the mite Varroa

destructor (Rosenkranz et al., 2010) on the bee us-

Majewski, P., Lampa, P., Burduk, R. and Reiner, J.

Improved Pest Detection in Insect Larvae Rearing with Pseudo-Labelling and Spatio-Temporal Masking.

DOI: 10.5220/0012311300003660

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2024) - Volume 2: VISAPP, pages

349-356

ISBN: 978-989-758-679-8; ISSN: 2184-4321

349

ing computer vision. (Bjerge et al., 2019) proposed

an Infestation Level Estimator (ILE) to determine the

level of infestation by the mite Varroa destructor. De-

spite obtaining a relatively high F1-score=0.91 for the

detection of varroa mites and conﬁrming the ability

to recognise the presence of this mite on bees, the

following problems of the proposed solution can be

noted: (1) the signiﬁcant modiﬁcation of the hive to

install the machine vision system, which may affect

the daily functioning of the bees, (2) performing the

dataset development and validation process for bee

populations with relatively high infestation levels (5-

10%), assuming an infestation level of 2% as an ac-

ceptable (Sajid et al., 2020). An effective pest detec-

tion solution should: (1) be designed to operate under

the real conditions of farming with as little interfer-

ence with insect functioning as possible, (2) be devel-

oped and evaluated for samples associated with differ-

ent degrees of pest infestation in the population - the

most difﬁcult is to detect pests at low levels of infes-

tation with an adequate level of precision (this is the

situation most often found under professional farm-

ing conditions.). To the best of our knowledge, there

is no work on pest detection in mealworm (Tenebrio

molitor) rearing.

Considering the indicated research gaps at the

methodological and application levels, we addressed

the detection of the Alphitobius diaperinus Panzer

pest in mealworm (Tenebrio monitor) rearing. To re-

ﬂect the real rearing conditions fairly, the model de-

velopment process used low/moderate pest-infested

boxes with mealworms occurring under large-scale

rearing conditions. As the main highlights of the re-

search carried out, we identify (1) an efﬁcient method

for developing pest detection models under the as-

sumption of low pest infestation of the population and

no specially prepared samples with a high infestation,

(2) a pseudo-labelling method for iteratively develop-

ing pest detection models and increasing model accu-

racy with relatively small manually labelled datasets,

(3) a spatio-temporal masking method for increasing

model precision under low pest infestation conditions,

and (4) fair model evaluation under different degrees

of pest infestation.

2 MATERIAL AND METHODS

2.1 Problem Deﬁnition

The problem addressed in this paper is the detection

of the pest (Alphitobius diaperinus Panzer) in images

of rearing boxes with mealworm (Tenebrio Molitor)

larvae. The solution should include the detection of

the pest in both larva and beetle forms. The problem

is challenging for the following reasons: (1) the rel-

atively small size of the objects to be detected (the

length of the mature larva is about 7 - 11 mm, and

the size of the beetle is about 6 mm) (Dunford and

Kaufman, 2006), (2) the high similarity between the

objects to be detected and the background elements

(possible false-positive errors in the case of small

mealworm larvae, dead larvae), (3) dense scenes caus-

ing the objects to be detected to be often partially

occluded, (4) the difﬁculty of developing a represen-

tative dataset containing examples of the pest under

real-world conditions of mealworm rearing (breeders

want to keep the pest infestation low, so the pest oc-

curs infrequently and sparsely in rearing boxes), and

(5) the labour-intensive manual labelling of images,

which is directly related to the difﬁculties described

in (1), (2) and (3). Examples of detection objects in

the form of larvae (L1-L4) and beetles (B1-B3) in se-

lected image tiles are shown in Figure 1.

Figure 1: Examples of detection objects from the classes

pest larvae and pest beetle.

2.2 Dataset

The basis of the developed dataset was the raw

4096 x 3000 pixels images, from which were ex-

tracted smaller square tiles with size 512. The

livestock-adapted machine vision system acquired

raw images. The imaging conditions allowed the

registration of images with a resolution of 0.143

mm/pixel. Each such image also had a correspond-

ing image taken 1 s later, allowing further calculation

of activity maps. From the raw images, 512 x 512

pixels tiles were extracted (presented in Figure 1) us-

ing the sliding window method with a shift unit of

128 pixels. For labelling, 200 rearing boxes charac-

terised by low/moderate pest infestation levels were

selected, which represented approximately 5% of all

boxes being automatically inspected in a given period.

A weak model (trained on a few manually labelled

samples) for pest detection was used to identify boxes

VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications

350

with a noticeable pest infestation to avoid manual in-

spection. All 200 raw images were labelled manually

to enable the determination of an upper baseline for

the accuracy of the pest detection model, yielding the

number of labelled objects: 1626 for the pest larvae

class and 1004 for the pest beetle class. The average

number of pests in the selected boxes, characterised

by low/moderate pest infestation levels, was approx-

imately 13. At the given level of infestation, there

are more than 100 mealworm larvae per pest, which

does not yet require intervention from the farmer. The

dataset included 107941 tiles: 16995 tiles with at least

one pest and 90946 tiles without a pest.

2.3 Proposed Method

Considering the difﬁculties described in section 2.1,

an original method for developing a pest detection

model is proposed. The idea scheme of the proposed

solution is presented in Figure 2.

Three main elements of the proposed method are

identiﬁed: (1) basic training (Figure 2a), (2) pseudo-

labelling and re-training (Figure 2b), and (3) spatio-

temporal masking in prediction time (Figure 2c),

which will be described in the following subsec-

tions. Pseudo-labelling addressed the need to speed

up (enable) the labelling of the many unlabelled im-

ages acquired during the daily inspection of the rear-

ing boxes. Spatio-temporal masking was proposed

to reduce false-positive errors, the amount of which

was signiﬁcant in relation to correct predictions for

low/moderate pest infestations.

2.3.1 Basic Training

The basic training consisted of training the model on

a small subset of manually labelled samples. The size

of the subset was deﬁned by the parameter train size,

which determined approximately the proportion of all

labelled objects in the training set (for example, train

size equals 0.16 means that about 16% of all man-

ually labelled objects representing pests were in the

training set). Stratiﬁed sampling was used to maintain

the proportion of objects from the pest larvae and pest

beetle classes in the determined subsets of samples.

The resulting model was evaluated after basic train-

ing, and the results for this type of approach were re-

ferred under the name without pool (lower baseline).

The name of the approach is due to the fact that un-

labelled samples from the pool were not used during

training. The YOLOv5x (Jocher et al., 2020) model

was trained with the following training parameters:

epochs=30, batch_size=8. The basic training was pre-

sented in Figure 2a.

Figure 2: Idea scheme for the proposed solution: (1) ba-

sic training, (2) pseudo-labelling and re-training, and (3)

spatio-temporal masking in prediction time.

2.3.2 Pseudo-Labelling with Re-Training

The second stage of the proposed method involved us-

ing a pseudo-labelling method to label samples from

the pool automatically. The pool did not include sam-

ples selected for the test set. The inference was per-

formed for each sample in the pool, and a prediction

was considered relevant if its conﬁdence level was

higher than the parameter conﬁdence score thresh-

old. The parameter conﬁdence score threshold was

ﬁne-tuned under the constant parameter train size.

After automatic labelling according to the described

Improved Pest Detection in Insect Larvae Rearing with Pseudo-Labelling and Spatio-Temporal Masking

351

method, the model training was repeated, using the

automatically labelled samples and the manually la-

belled samples used in the basic training. The result-

ing model after re-training was evaluated and the re-

sults for this approach were refereed under the name

pool used with pseudo labels. The training settings re-

mained unchanged. Pseudo-labelling with re-training

was presented in Figure 2b.

2.3.3 Spatio-Temporal Masking in Prediction

Time

At the prediction stage, spatio-temporal masking was

introduced to remove some false-positive predictions

characterised by no movement. Each image (tile)

for which a prediction was performed was related

to an image taken 1 s later, resulting in small shifts

in the areas where the larvae were located. The

normalised activity map was calculated using the

Gunnar-Farneback optical ﬂow technique (Farnebäck,

2003). Then, a binary mask was determined using

the deﬁned Farneback activity threshold, where white

pixels represent areas with activity above the thresh-

old. The Farneback activity threshold parameter was

ﬁne-tuned under the constant parameter train size.

A masked RGB image was used for prediction, where

only areas with the minimum deﬁned activity are visi-

ble. When reporting the results from the model evalu-

ations, the use of the described method was indicated

by an appendix in the name + spatio-temporal mask-

ing. The spatio-temporal masking method was pre-

sented in Figure 2c.

2.4 Evaluation

Four sets of samples were distinguished for evalua-

tion purposes: a training set, a validation set, a test

set and a set deﬁned as an image pool. Independence

between the sets was provided at the level of the raw

images from which the tiles were extracted. The size

of the training set was deﬁned by the parameter train

size, which speciﬁed approximately the proportion of

the number of objects in this set relative to the number

of objects in the entire dataset. The training set was

used to train the pest detection model. The analysis

was conducted for four training set sizes: 0.02, 0.04,

0.08 and 0.16. The size of the validation set was ﬁxed

and was 1/2 train size (for example, when the training

set contained about 16% of all labelled objects, the

validation set then contained about 8% of all labelled

objects). The validation set was used to evaluate the

model during training and select the model from the

best epoch. The size of the test set was ﬁxed and equal

to 0.3 (about 30% of all manually labelled objects rep-

resenting pests were in the test set). The test set was

used for the ﬁnal evaluation of the models, and the

referenced results are from the evaluation on this set.

The remaining samples not included in the training,

validation and test sets belonged to the image pool.

Including images from the pool in model training de-

pended on the approach used.

Two types of evaluation were conducted for (1)

low/moderate pest infestation and (2) high pest in-

festation. In the case of (1), the evaluation consid-

ered tiles with and without pests. For low/moderate

infestation, which was present in the analysed im-

ages, there were approximately ﬁve pest-free tiles per

tile with at least one pest, as described in more de-

tail in section 2.2. In case (2), the evaluation con-

sidered only tiles with pests. It was decided to carry

out these two types of evaluation because of the sig-

niﬁcant number of false-positive errors that resulted

from the similarity between the analysed objects and

the background elements. The possibility of numer-

ous false-positive errors implies that the accuracy of

the models will strictly depend on the level of pest

infestation.

Besides evaluating the approaches indicated in

section 2.3: without pool (lower baseline), pool used

with pseudo labels, an upper baseline of model accu-

racy was also determined by using true labels instead

of pseudo labels for the pool samples. This approach

was named pool used with true labels.

The following parameter values were checked for

parameter ﬁne-tuning procedures: (a) for the conﬁ-

dence score threshold parameter - [0.1, 0.3, 0.5, 0.7,

0.9], and for the Farneback activity threshold param-

eter - [0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2]. For the parameter

Farneback activity threshold, the range of values was

determined based on a preliminary qualitative assess-

ment of the calculated activity maps.

For one experiment related to the selected type

of evaluation, the type of approach and the size of

the training set (parameter train size), three repeats

of pest detection model training were performed re-

lated to the different division of the samples into sets:

training, validation, test and image pool. The results

obtained were averaged over these repeats. Repetition

of training was also used in parameter ﬁne-tuning.

Standard metrics for object detection were cho-

sen as quantitative indicators for evaluation: AP50

(average precision with IoU=50%), F1-score, preci-

sion and recall. The values of F1-score, precision,

and recall were related to the optimal working point

at which the value of the F1-score metric was max-

imised. The values of the indicated metrics were de-

termined separately for the two deﬁned object classes:

pest larvae and pest beetle, and averaged over these

classes.

VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications

352

3 RESULTS AND DISCUSSION

A comparison of the proposed approaches for the two

types of evaluation is summarised in Table 1 and in

Figures 4a and 4b. In addition, Figures 3a and 3b

show the results of the ﬁne-tuning of two parame-

ters: conﬁdence score threshold and Farneback ac-

tivity threshold. For the discussion of the results,

the AP50 metric (independent of the conﬁdence score

threshold) was chosen for parameter ﬁne-tuning and

the F1-score metric (associated with a speciﬁc work-

ing point) for comparing approaches. Fine-tuning was

conducted with a training set size of 0.04 and for eval-

uation type: low/moderate pest infestation. As lower

baseline in Figure 3a the metric values for the with-

out pool (lower baseline) approach were speciﬁed. In

Figure 3b the lower baseline was associated with the

pool used with pseudo labels approach. In Table 1, in

addition to the value of the deﬁned train size param-

eter, the averaged number of manually labelled sam-

ples in the training and validation set is also provided.

Figure 3: Fine-tuning results for: (a) conﬁdence score

threshold and (b) Farneback activity threshold.

Figure 3a and 3b conﬁrm the rationale for ﬁne-

tuning the two selected parameters: conﬁdence score

threshold and Farneback activity threshold. For con-

ﬁdence score threshold ﬁne-tuning, the difference be-

tween the lower baseline and the working point was

∆AP50 = 8.9 (increase from 44.3 to 53.2), while for

Farneback activity threshold ∆AP50 = 4.4 (increase

from 53.2 to 57.6). For further approaches, the param-

eter values indicated in Figures 3a and 3b as working

points were used, i.e. 0.3 for conﬁdence score thresh-

old and 0.8 for Farneback activity threshold.

Figure 4: Comparison of the proposed methods according

to the F1-score metric for pest detection for cases of: (a)

low/moderate pest infestation, and (b) high pest infestation.

The impact of pseudo-labelling on pest detec-

tion accuracy can be assessed by comparing the re-

sults for approaches without pool (lower baseline)

(blue line) and pool used with pseudo labels (orange

line) in Figures 4a and 4b. For both low/moderate

and high pest infestation, we can see a positive

and signiﬁcant effect of using pseudo-labelling for

the image pool on pest detection accuracy. In the

case of the low/moderate pest infestation evaluation,

pseudo-labelling contributed to an increase in the av-

erage F1-score (averaged over different train size) of

∆F1 = 4.0 and in the case of the high infestation F1-

score increased by ∆F1 = 5.4.

The inﬂuence of spatio-temporal masking on de-

tection accuracy was assessed by pairwise compar-

ison of the results for the approaches pool used

with pseudo labels (orange line) and pool used with

pseudo labels + spatio-temporal masking (green line)

and for the approaches pool used with true labels

(red line) and pool used with true labels + spatio-

temporal masking (purple line) in Figures 4a and 4b.

Improved Pest Detection in Insect Larvae Rearing with Pseudo-Labelling and Spatio-Temporal Masking

353

Table 1: Comparison of the proposed methods for two cases: (1) low/moderate pest infestation, (2) high pest infestation.

evaluation type

(degree of pest infestation)

approach type

train size

(# samples)

AP50 F1-score [%] precision [%] recall [%]

low/moderate

pest infestation

without pool

(lower baseline)

0.02 (79) 40.6 45.2 45.4 45.9

0.04 (158) 44.3 48.4 50.7 47.5

0.08 (316) 52.6 56.6 56.8 56.9

0.16 (631) 57.1 59.9 58.9 61.2

pool used with

pseudo labels

0.02 (79) 48.8 51.4 51.5 52.1

0.04 (158) 53.2 53.5 53.6 54.9

0.08 (316) 57.3 59.4 57.9 61.3

0.16 (631) 60.9 61.7 60.4 63.8

pool used with pseudo

labels + spatio-temporal

ﬁltering

0.02 (79) 51.9 55.6 57.0 55.8

0.04 (158) 57.6 58.1 61.8 57.5

0.08 (316) 60.9 62.2 64.7 60.9

0.16 (631) 65.7 66.6 69.5 64.8

pool used with

true labels

all (1841) 65.3 64.8 60.7 72.0

pool used with true

labels + spatio-temporal

ﬁltering

all (1841) 68.6 68.6 68.7 69.0

high

pest infestation

without pool

(lower baseline)

0.02 (79) 61.5 63.3 69.6 58.9

0.04 (158) 64.8 67.3 74.7 62.2

0.08 (316) 72.6 73.0 79.3 68.5

0.16 (631) 76.4 76.9 83.9 71.5

pool used with

pseudo labels

0.02 (79) 70.7 70.2 76.1 66.5

0.04 (158) 76.2 74.0 77.3 71.7

0.08 (316) 79.7 77.4 81.4 74.3

0.16 (631) 83.9 80.5 82.4 79.1

pool used with pseudo

labels + spatio-temporal

ﬁltering

0.02 (79) 68.0 68.5 77.4 63.3

0.04 (158) 74.1 73.5 81.1 68.5

0.08 (316) 76.3 75.7 81.5 71.6

0.16 (631) 80.5 78.9 83.6 75.4

pool used with

true labels

all (1841) 86.9 84.1 85.0 83.5

pool used with true

labels + spatio-temporal

ﬁltering

all (1841) 84.1 82.6 85.7 80.0

Considering the pool used with pseudo labels ap-

proach, an improvement in detection accuracy using

the spatio-temporal masking technique was noted for

the low/moderate pest infestation case. For this case,

F1-score increased by ∆F1 = 4.1. For the high pest

infestation case, a small reduction in detection accu-

racy was noted - F1-score decreased by ∆F1 = −1.4.

The small reduction in detection accuracy was due to

masking areas with pests characterised by low mo-

bility. As expected, applying the spatio-temporal

masking technique in general increased precision

with decreasing recall. However, for the case of

low/moderate pest infestation, in addition to the ex-

pected increase in precision (∆ precision = 7.4), an

increase in recall was even observed (∆recall = 1.7),

which was due to the possibility of moving the work-

ing point to a lower conﬁdence score threshold value,

resulting in an increased recall. Despite the small re-

duction in model accuracy in the case of high pest

infestation, it should be stated that this is acceptable,

considering that most boxes during the daily inspec-

tion are characterised by low/moderate pest infesta-

tion. The positive effect of spatio-temporal mask-

ing on detection accuracy is expected to be higher

the smaller the pest infestation. Analogous results

were obtained for the pool used with true labels ap-

proach, where an increase in F1-score was obtained

(∆F1 = 3.8) for the low/moderate infestation and a

small decrease in F1-score (∆F1 = −1.5) for the high

pest infestation case.

Analysing the effect of training set size on detec-

tion accuracy, a signiﬁcant inﬂuence of this param-

eter was observed in the considered range of 0.02 –

0.16. Comparing the results between train size 0.02

and 0.16 for pool used with pseudo labels + spatio-

temporal masking approach (green line), an increase

in F1-score was observed by ∆F1 = 11.0 for the

low/moderate pest infestation case and by ∆F1 = 10.4

for the high pest infestation case. Further manual

labelling of the pool samples (representing approxi-

VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications

354

mately 0.46 of the dataset and 1210 additional sam-

ples for manual annotation), as expected, had a posi-

tive effect on the accuracy of the models, but it was

not such a spectacular improvement as in the con-

sidered range from 0.02 to 0.16. The difference be-

tween the upper baseline (the pool used with true

labels + spatio-temporal ﬁltering approach) and the

pool used with pseudo labels + spatio-temporal mask-

ing approach at train size=0.16 was ∆F1 = 2.0 for

the low/moderate infestation case and ∆F1 = 3.7 for

the high pest infestation case, respectively. For the

speciﬁc pest detection problem addressed in this ar-

ticle, the required minimum training set size should

be at least 0.16 (associated with the validation set

size 0.08), resulting in approximately 630 manually

labelled objects. Assuming a low/moderate pest in-

festation under large-scale rearing conditions, obtain-

ing this number of samples in a reasonable time is

only possible with the support of a weak model (e.g.

a model from the pool used with pseudo labels ap-

proach with a small train size) for identifying the

boxes with the highest number of pests.

Lower metric values for the low/moderate pest in-

festation evaluation were obtained due to an increase

in the number of false-positive predictions. Some of

these predictions actually represented objects falsely

detected as pests, e.g., fragments of dead larvae sim-

ilar to pest beetles. A part of these false-positive pre-

dictions was ﬁltered out by spatio-temporal masking

(selected examples are shown in Figure 5).

Figure 5: Examples of false-positive predictions ﬁltered out

by spatio-temporal masking.

After analysing the mistakes made by the pest de-

tection model among the false-positive errors, we can

also ﬁnd many predictions that can represent not la-

belled pests. Some objects were difﬁcult for the anno-

tator to recognise, inﬂuenced by dense scenes, overlap

and small size. Selected objects missed during an-

notation but correctly detected by the pest detection

model are shown in Figure 6.

Figure 6: Selected objects missed during annotation but cor-

rectly detected by the pest detection model.

The observed problem with noisy (or lack of) la-

bels, on the one hand, suggests that the model’s accu-

racy can be even better than referred, and on the other

hand, shows the direction of further work in label re-

ﬁnement.

4 CONCLUSIONS

The results presented here conﬁrmed the potential

of the proposed methods (pseudo-labelling, spatio-

temporal masking) for developing pest detection

models. Pseudo-labelling is particularly important for

developing the ﬁrst models (so-called weak models)

when we have a small labelled dataset and access to

a pool of unlabelled images. The role of the spatio-

temporal masking technique is highest in the case of

a low pest infestation when the main problem is po-

tential false alarms, which is the most common situ-

ation found in professional farming. In future work,

we plan to develop additional methods, e.g., based on

expert knowledge and using new imaging domains to

increase the precision of pest detection in the case of

Improved Pest Detection in Insect Larvae Rearing with Pseudo-Labelling and Spatio-Temporal Masking

355

low pest infestation. Future work should also analyse

the real characteristics of the change in the number of

pests over time when changing the infestation from

low/moderate to high, which requires a fast reaction

from the farmer. This analysis will enable us to im-

prove our solution for a particular use case.

ACKNOWLEDGEMENTS

We wish to thank Mariusz Mrzygłód for developing

applications for the designed data acquisition work-

station. We wish to thank Paweł Górzy

nski and

Dawid Biedrzycki from Tenebria (Lubawa, Poland)

for providing a data source of boxes with Tenebrio

Molitor. The work presented in this publication was

carried out within the project “Automatic mealworm

breeding system with the development of feeding

technology” under Sub-measure 1.1.1 of the Smart

Growth Operational Program 2014-2020 co-ﬁnanced

from the European Regional Development Fund on

the basis of a co-ﬁnancing agreement concluded with

the National Center for Research and Development

(NCBiR, Poland); grant POIR.01.01.01-00-0903/20.

REFERENCES

Bjerge, K., Frigaard, C. E., Mikkelsen, P. H., Nielsen, T. H.,

Misbih, M., and Kryger, P. (2019). A computer vision

system to monitor the infestation level of varroa de-

structor in a honeybee colony. Computers and Elec-

tronics in Agriculture, 164:104898.

Bjerge, K., Nielsen, J. B., Sepstrup, M. V., Helsing-Nielsen,

F., and Høye, T. T. (2021). An automated light trap to

monitor moths (lepidoptera) using computer vision-

based tracking and deep learning. Sensors, 21(2):343.

Dunford, J. C. and Kaufman, P. E. (2006). Lesser

mealworm, litter beetle, alphitobius diaperinus

(panzer)(insecta: Coleoptera: Tenebrionidae): Eeny-

367/in662, rev. 6/2006. EDIS, 2006.

Farnebäck, G. (2003). Two-frame motion estimation based

on polynomial expansion. In Image Analysis: 13th

Scandinavian Conference, SCIA 2003 Halmstad, Swe-

den, June 29–July 2, 2003 Proceedings 13, pages 363–

370. Springer.

Jiao, L., Dong, S., Zhang, S., Xie, C., and Wang, H.

(2020). Af-rcnn: An anchor-free convolutional neu-

ral network for multi-categories agricultural pest de-

tection. Computers and Electronics in Agriculture,

174:105522.

Jocher, G., Nishimura, K., Mineeva, T., and Vilariño,

R. (2020). yolov5. Code repository https://github.

com/ultralytics/yolov5, page 9.

Li, W., Zheng, T., Yang, Z., Li, M., Sun, C., and Yang, X.

(2021). Classiﬁcation and detection of insects from

ﬁeld images using deep learning for smart pest man-

agement: A systematic review. Ecological Informat-

ics, 66:101460.

Nagar, H. and Sharma, R. (2020). A comprehensive sur-

vey on pest detection techniques using image process-

ing. In 2020 4th International Conference on Intelli-

gent Computing and Control Systems (ICICCS), pages

43–48. IEEE.

Ngugi, L. C., Abelwahab, M., and Abo-Zahhad, M. (2021).

Recent advances in image processing techniques for

automated leaf pest and disease recognition–a review.

Information processing in agriculture, 8(1):27–51.

Oerke, E.-C. (2006). Crop losses to pests. The Journal of

Agricultural Science, 144(1):31–43.

Rosenkranz, P., Aumeier, P., and Ziegelmann, B. (2010).

Biology and control of varroa destructor. Journal of

invertebrate pathology, 103:S96–S119.

Rustia, D. J. A., Chiu, L.-Y., Lu, C.-Y., Wu, Y.-F., Chen,

S.-K., Chung, J.-Y., Hsu, J.-C., and Lin, T.-T. (2022).

Towards intelligent and integrated pest management

through an aiot-based monitoring system. Pest Man-

agement Science, 78(10):4288–4302.

Rustia, D. J. A., Lu, C.-Y., Chao, J.-J., Wu, Y.-F., Chung,

J.-Y., Hsu, J.-C., and Lin, T.-T. (2021). Online semi-

supervised learning applied to an automated insect

pest monitoring system. Biosystems Engineering,

208:28–44.

Sajid, Z. N., Aziz, M. A., Bodlah, I., Rana, R. M., Ghramh,

H. A., and Khan, K. A. (2020). Efﬁcacy assessment of

soft and hard acaricides against varroa destructor mite

infesting honey bee (apis mellifera) colonies, through

sugar roll method. Saudi Journal of Biological Sci-

ences, 27(1):53–59.

Siemianowska, E., Kosewska, A., Aljewicz, M., Skib-

niewska, K. A., Polak-Juszczak, L., Jarocki, A., and

Jedras, M. (2013). Larvae of mealworm (tenebrio

molitor l.) as european novel food. Agricultural Sci-

ences.

Sun, Y., Liu, X., Yuan, M., Ren, L., Wang, J., and Chen, Z.

(2018). Automatic in-trap pest detection using deep

learning for pheromone-based dendroctonus valens

monitoring. Biosystems engineering, 176:140–150.

Turkoglu, M., Yaniko

glu, B., and Hanbay, D. (2022). Plant-

diseasenet: Convolutional neural network ensemble

for plant disease and pest detection. Signal, Image

and Video Processing, 16(2):301–309.

Wang, S., Zeng, Q., Ni, W., Cheng, C., and Wang, Y.

(2023). Odp-transformer: Interpretation of pest clas-

siﬁcation results using image caption generation tech-

niques. Computers and Electronics in Agriculture,

209:107863.

Zhang, Z., Gong, Z., Hong, Q., and Jiang, L. (2021).

Swin-transformer based classiﬁcation for rice diseases

recognition. In 2021 International Conference on

Computer Information Science and Artiﬁcial Intelli-

gence (CISAI), pages 153–156. IEEE.

Zhu, L., Ma, Q., Chen, J., and Zhao, G. (2022). Current

progress on innovative pest detection techniques for

stored cereal grains and thereof powders. Food Chem-

istry, page 133706.

VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications

356