Highlighting Vulnerabilities in a Genomics Biocybersecurity Lab Through Threat Modeling and Security Testing

Jared Sheldon, Isabelle Brown-Cantrell, Patrick Pape and Thomas Morris

Center for Cybersecurity Research and Education, University of Alabama in Huntsville, Huntsville, Alabama, U.S.A.

Keywords: Biocybersecurity, Cybersecurity, Genomics, Threat Modeling, ATT&CK, TTPs, STRIDE.

Abstract: Biocybersecurity, a specialty field applying modern cybersecurity developments to the bioeconomy, is garnering progressively more attention as concerns increase over the protection of bioeconomic data generated each year. Genomic data is a key data type that falls under the bioeconomy umbrella and can be protected health information, intellectual property, or research data, depending on the use case. To increase understanding of cybersecurity for genomic lab environments, a biocybersecurity laboratory was set up and threat modeling was conducted on it using the STRIDE threat modeling methodology. Potential attack techniques were then mapped using the MITRE ATT&CK enterprise matrix and attack trees were generated to sequentially show the steps of these attacks. Going a step further, the initial steps of an attack tree were attempted against a DNA sequencer in the biocybersecurity lab. While the results of this testing did not yield an exploitable vulnerability that could be used to further test the attack tree techniques, lessons learned along the way can be taken into account by future research projects pursuing similar goals.

1 INTRODUCTION

Genomic data is highly important and the environments that generate this data have unique characteristics that must be accounted for when seeking to protect it. Whether the genomic data is Protected Health Information (PHI), Intellectual Property (IP), or research project data, its loss would represent a significant loss of time and money and, if the genomic data is from a human, potentially a loss of privacy for individuals.

Each genome sequenced is the result of laboratory technicians spending time moving a DNA sample through a series of laboratory machines that each prepare the DNA sample in a different way before it arrives at the DNA sequencer. After these preparatory steps have been completed and the quality of the sample has been assured, the laboratory technicians use the DNA sequencer to generate digital data from the physical sample. Throughout this process, consumables and time have been used to prepare the sample and generate the resulting digital data. All of this investment can be made moot if the resulting data is lost or corrupted. This loss of economic investment is only deepened if the genomic data lost is IP, as would be the case for genetically modified crops or biopharmaceuticals, since such products also represent an investment in research and development time.

https://orcid.org/0009-0009-7909-4217

https://orcid.org/0009-0004-8820-6448

https://orcid.org/0009-0005-4922-4026

https://orcid.org/0000-0002-4854-5419

Aside from economic investment being lost, individual privacy can be impacted if the genomic data lost was a person’s PHI. The impact that exposure of this kind of PHI in a data breach scenario can have is only worsened by the fact that the relevant individual’s family is also affected. This trait of genomic data introduces unique privacy concerns, on top of the concerns already at play, since genomic data is PHI that never or rarely changes for affected parties. Unlike a credit card number exposed in a data breach, exposed genomic data cannot simply be changed by the person it belongs to. For cybersecurity and privacy, this permanence, coupled with the data’s ability to affect entire family trees, means that genomic data should be protected as PHI for as long as it is stored.

The next section of this paper covers details regarding a biocybersecurity lab (BCL) created to facilitate biocybersecurity research. The following section then discusses the STRIDE threat modeling effort conducted on the BCL and the attack mappings generated with the gathered threat modeling insights. Network scans demonstrating the initial reconnaissance phase of a potential attack targeting an Illumina NovaSeq 6000 are then reviewed. Results and considerations are then covered, followed by future work ideas and conclusions that were drawn from this research endeavor.

Sheldon, J., Brown-Cantrell, I., Pape, P. and Morris, T.
Highlighting Vulnerabilities in a Genomics Biocybersecurity Lab Through Threat Modeling and Security Testing.
DOI: 10.5220/0013523800003979
In Proceedings of the 22nd International Conference on Security and Cryptography (SECRYPT 2025), pages 626-631
ISBN: 978-989-758-760-3; ISSN: 2184-7711

2 BIOCYBERSECURITY LABORATORY

For the past four years, we have partnered with the HudsonAlpha Institute for Biotechnology, a local genomic sequencing laboratory campus, to conduct research into the genomic threat landscape. Over the past year, the sequencing laboratory has created a hands-on, modular biocybersecurity laboratory (BCL) to spearhead crucial research into the area. Through our partnership, we were given access to the BCL to conduct the threat modeling exercises and network scans discussed in the succeeding sections.

2.1 Laboratory Setup

The BCL currently consists of a 1,224-square-foot lab space containing devices that cover the first two stages of the genomic data life cycle: creation and storage. This includes a Laboratory Information Management System (LIMS) to document sample intake, assign a pseudoidentifier, and keep track of the sample as it is sequenced in the lab. Next, the BCL contains genomic devices that handle DNA extraction, DNA fragmentation, library preparation, and quality control before the sample is fed to a genomic sequencer. To provide a well-rounded laboratory environment for testing, the BCL has devices from multiple major genomic device manufacturers such as PacBio, Illumina, Tecan, and Agilent Technologies. These fully installed and operational devices in the BCL can be seen in Figure 1.

Figure 1: BCL genomic sequencing devices installed and operational.

2.2 Laboratory Purpose

The BCL was created to provide a unique opportunity for students, researchers, and organizations to carry out both technical research projects and educational learning experiences. Several camps and trainings aimed at teaching high school and undergraduate college students about genomic sequencing devices, their purpose in the life cycle, how to secure them, and what happens to the sequence data created by the devices have been conducted in the BCL. Additionally, the BCL is set up to allow organizations to test the application of cybersecurity and privacy standards and frameworks currently being developed.

3 THREAT MODELING

To better understand the organization and capabilities of the BCL, a thorough cybersecurity threat model was created and iteratively refined. The first step in threat modeling the BCL environment was to discuss the laboratory configuration and data flows with BCL staff and create a series of diagrams documenting the lab. Next, a STRIDE analysis was conducted against the components and data flows documented in the diagrams. These steps and their outcomes are detailed below.

3.1 Diagramming

The diagramming process began with a series of in-person tours at the local genomic sequencing laboratory campus. These tours included a detailed walkthrough of how each device the laboratory owns fits into the genomic data life cycle, how lab technicians interact with each device, and how the overall process of going from a physical sample to detailed analytical results of the genome occurs. Physical and network segmentations were discussed to determine the presence of inherent security boundaries for the diagrams. After the in-person tours, weekly meetings were held with BCL staff members to discuss the lab setup and devices further and to address any questions that came up as the diagrams were developed.

Once the basic components, flows, and security boundaries that needed to be present in the diagram had been determined, a common notation was needed to keep the diagrams easy to read. The notation developed and documented in MITRE’s Playbook for Threat Modeling Medical Devices was utilized (Bochniewicz et al., n.d.). This notation has six unique components: processes, trust boundaries, external entities, data stores, users, and data flows.
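As an illustration of how the six notation components could be captured in a machine-readable form, the following minimal Python sketch models node-like components and data flows. It is not the tooling used in this work. The class names, the choice of which BCL component is which kind of element, and the two example flows are assumptions made for illustration; only the component names and IDs are taken from Table 1.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class NodeKind(Enum):
    """The five node-like notation components; data flows are modeled as edges."""
    PROCESS = "process"
    TRUST_BOUNDARY = "trust boundary"
    EXTERNAL_ENTITY = "external entity"
    DATA_STORE = "data store"
    USER = "user"


@dataclass(frozen=True)
class Node:
    ident: str                      # component ID, e.g. "2.2"
    name: str
    kind: NodeKind
    boundary: Optional[str] = None  # ID of the enclosing trust boundary, if any


@dataclass(frozen=True)
class DataFlow:
    source: str  # ID of the sending node
    target: str  # ID of the receiving node
    label: str


@dataclass
class Diagram:
    nodes: Dict[str, Node] = field(default_factory=dict)
    flows: List[DataFlow] = field(default_factory=list)

    def add(self, node: Node) -> None:
        self.nodes[node.ident] = node

    def connect(self, source: str, target: str, label: str) -> DataFlow:
        # Refuse flows whose endpoints are not part of the diagram.
        for ident in (source, target):
            if ident not in self.nodes:
                raise KeyError(f"unknown node id: {ident}")
        flow = DataFlow(source, target, label)
        self.flows.append(flow)
        return flow

    def crosses_boundary(self, flow: DataFlow) -> bool:
        # A flow crosses a trust boundary when its endpoints sit in different ones.
        return self.nodes[flow.source].boundary != self.nodes[flow.target].boundary


def build_example() -> Diagram:
    """Hypothetical fragment of the wet lab diagram (element kinds are assumed)."""
    d = Diagram()
    d.add(Node("1", "Wet Lab", NodeKind.TRUST_BOUNDARY))
    d.add(Node("2", "Sequencer", NodeKind.PROCESS, boundary="1"))
    d.add(Node("2.2", "Local Temporary Datastore", NodeKind.DATA_STORE, boundary="1"))
    d.add(Node("3", "Research Computing Environment", NodeKind.EXTERNAL_ENTITY))
    d.connect("2", "2.2", "raw sequence data")
    d.connect("2.2", "3", "sequence data transfer")
    return d
```

Flagging flows that cross a trust boundary, as `crosses_boundary` does, is one simple way to shortlist candidates when deciding which data flows deserve the most attention.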


To ensure reader understanding, additional images and icons were used in the diagrams, such as a genomic sequencer icon within the component box for the genomic sequencer. Color was also utilized to show a separation between the elements of the BCL and their underlying trust boundaries. A detailed breakdown of the wet laboratory within the BCL can be seen in Figure 2.

3.2 STRIDE Analysis

Given the complexity of the data flow diagrams (DFDs), it became important to prioritize the data flows for which threats would be modeled. This prioritization allowed the threat modeling effort to cover all of the components while limiting data flow analysis to the highest value data flows. To accomplish this prioritization, we identified the data flows that were of high value either due to the value of the data sent over the data flow or due to the general criticality of the data flow to laboratory operations. To ensure accuracy, these high value data flows were presented to specialists from our sequencing lab partner to confirm that the chosen data flows were where time spent threat modeling would provide the most benefit to a genomics lab.

Once this list of high value data flows was confirmed, a STRIDE analysis was conducted. This analysis used the STRIDE threat modeling methodology to elicit the spoofing, tampering, repudiation, information disclosure, denial of service, and elevation of privilege threats applicable to all components within the threat model as well as those applicable to the high value data flows (Shostack, 2014). To maintain consistency throughout the process of identifying threats, Table 3 from the Playbook for Threat Modeling Medical Devices (Bochniewicz et al., n.d.) was used to provide a basis for which STRIDE elements were applicable to which types of components and data flows. This increased the STRIDE analysis speed and resulted in the identification of over two hundred threats across the genomic lab threat model.
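The enumeration step described above can be sketched in a few lines of Python. The applicability chart used here is the commonly cited STRIDE-per-element chart and is an assumption; it is not a reproduction of Table 3 from the Playbook. The example components are taken from Table 1, but their element types, the two example data flows, and the high-value flags are hypothetical.

```python
from typing import List, Tuple

STRIDE_NAMES = {
    "S": "Spoofing",
    "T": "Tampering",
    "R": "Repudiation",
    "I": "Information disclosure",
    "D": "Denial of service",
    "E": "Elevation of privilege",
}

# Assumed applicability chart (classic STRIDE-per-element), keyed by element type.
APPLICABLE = {
    "external entity": "SR",
    "process": "STRIDE",
    "data store": "TRID",
    "data flow": "TID",
}


def enumerate_threats(
    components: List[Tuple[str, str]],
    data_flows: List[Tuple[str, bool]],
) -> List[Tuple[str, str]]:
    """Return (element name, threat category) stubs to be filled in by analysts.

    components: (name, element type) pairs; every component is analyzed.
    data_flows: (name, is_high_value) pairs; only high value flows are analyzed.
    """
    threats = []
    for name, kind in components:
        for letter in APPLICABLE[kind]:
            threats.append((name, STRIDE_NAMES[letter]))
    for name, is_high_value in data_flows:
        if not is_high_value:
            continue  # low value flows are skipped to keep the effort focused
        for letter in APPLICABLE["data flow"]:
            threats.append((name, STRIDE_NAMES[letter]))
    return threats


# Hypothetical inputs: element types and flow priorities are assumptions.
EXAMPLE_COMPONENTS = [
    ("Sequencer", "process"),
    ("LIMS", "process"),
    ("Local Temporary Datastore", "data store"),
    ("Lab Technician", "external entity"),
]
EXAMPLE_FLOWS = [
    ("Sequencer -> Local Temporary Datastore", True),
    ("LIMS -> Lab Technician", False),
]
```

Each returned stub still needs a written threat description; the chart only narrows down which categories are worth considering for a given element type.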

3.3 Attack Mapping

After the possible threats and mitigations for each lab component were enumerated, the identified threats were mapped to a well-known, standard framework. For this purpose, the MITRE Adversarial Tactics, Techniques, and Common Knowledge (MITRE ATT&CK) framework was chosen (MITRE, n.d.). The MITRE ATT&CK framework consists of 14 tactic categories with over 200 individual techniques. These techniques range from open-source intelligence gathering to utilizing a command and control channel to exfiltrate data. The abundance of highly specific techniques makes detailed mappings between STRIDE threats and ATT&CK techniques possible. The result of this mapping can be seen in Table 1.

To create the mappings seen in Table 1, the threat descriptions created during the STRIDE analysis were utilized. The team evaluated each description as a whole, as well as its individual components, to determine the tactic category and individual technique for the mapping, effectively creating an attack chain of ATT&CK techniques.
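To show how a row of Table 1 can be read programmatically, the following Python sketch parses a row into its component name, ID, and technique IDs. The function names are hypothetical. Because the printed rows do not mark empty cells, the sketch assigns STRIDE categories only when a row lists all six techniques. The technique names in the small lookup are the ATT&CK names for the IDs that appear in the example row.

```python
import re
from typing import Dict, List, Optional, Tuple

STRIDE = "STRIDE"

# Component name, numeric ID (optionally dotted), then one or more technique IDs.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<id>\d+(?:\.\d+)?)\s+(?P<codes>T\d{4}(?:\s+T\d{4})*)$"
)

TECHNIQUE_NAMES = {
    "T1078": "Valid Accounts",
    "T1222": "File and Directory Permissions Modification",
    "T1485": "Data Destruction",
    "T1005": "Data from Local System",
    "T1499": "Endpoint Denial of Service",
    "T1548": "Abuse Elevation Control Mechanism",
}


def parse_row(line: str) -> Tuple[str, str, List[str]]:
    """Split one table row into (component name, component ID, technique IDs)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not a mapping row: {line!r}")
    return match.group("name"), match.group("id"), match.group("codes").split()


def by_category(codes: List[str]) -> Optional[Dict[str, str]]:
    """Map S, T, R, I, D, E to technique IDs, but only for fully populated rows.

    Rows with fewer than six techniques have empty cells whose position is
    not recoverable from the row text alone, so None is returned for them.
    """
    if len(codes) != len(STRIDE):
        return None
    return dict(zip(STRIDE, codes))


name, ident, codes = parse_row(
    "Research Computing Environment 3 T1078 T1222 T1485 T1005 T1499 T1548"
)
chain = by_category(codes)
```

For the example row, `chain` pairs each STRIDE category with one technique ID, which can then be looked up in `TECHNIQUE_NAMES` when building a readable attack chain.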

4 NETWORK SCANS

Building on the attack mapping performed, the biocybersecurity lab was leveraged as a target environment for network scans and tests. The device selected for these scans and tests was an Illumina NovaSeq 6000 no longer used in production environments. This device was deployed in the biocybersecurity lab by the sequencing lab partner, the device owner, and access to a virtual machine on the BCL network was used to conduct the following tests.

TCP and UDP Nmap (Lyon, n.d.) scans of the sequencer were then performed with the goal of determining open port numbers. Once the open port numbers were identified, a series of scans were performed to have Nmap guess the operating system and identify the services running on those open ports. The information from these scans informed the types of scans and tests performed next. The SYN scan results can be seen in Figure 3.
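The two-stage scanning approach described above (port discovery followed by operating system and service detection on the open ports) can be sketched as follows. The exact Nmap options used in this work are not reported, so the options below are an assumption: `-sS` (TCP SYN scan), `-sU` (UDP scan), `-p` (port selection), `-sV` (service version detection), and `-O` (operating system detection) are standard Nmap options chosen for illustration. The sketch only builds the command lines and parses port-table lines; running the commands is left to the operator.

```python
import re
from typing import List, Tuple

# Matches port-table lines such as "80/tcp open  http" in Nmap's normal output.
OPEN_PORT = re.compile(r"^(\d+)/(tcp|udp)\s+open(?:\s|$)", re.MULTILINE)


def discovery_commands(target: str) -> List[List[str]]:
    """Stage 1: find open TCP ports (SYN scan, all ports) and open UDP ports."""
    return [
        ["nmap", "-sS", "-p-", target],
        ["nmap", "-sU", target],
    ]


def parse_open_ports(nmap_output: str) -> List[Tuple[int, str]]:
    """Extract (port, protocol) pairs whose state is exactly 'open'."""
    return [(int(port), proto) for port, proto in OPEN_PORT.findall(nmap_output)]


def fingerprint_command(target: str, open_tcp_ports: List[int]) -> List[str]:
    """Stage 2: service version and OS detection limited to the open TCP ports."""
    ports = ",".join(str(p) for p in sorted(open_tcp_ports))
    return ["nmap", "-sV", "-O", "-p", ports, target]
```

A caller would pass each argument list to a process runner such as `subprocess.run`, feed the captured text to `parse_open_ports`, and then build the second-stage command from the result. SYN scans and OS detection generally require elevated privileges on the scanning host.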

The most interesting service identified from these scans was an HTTP server. This HTTP server was targeted in a series of tests. These tests included attempts to leverage HTTP verbs using cURL (curl, n.d.) to determine if any would yield interesting results. The next tests also used cURL and were attempts at directory traversal attacks through manipulating the URL targeted. Another round of tests included banner information gathering through a variety of tools in an attempt to determine more information about the running HTTP server. No interesting results were found in these tests.
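The HTTP verb and directory traversal tests above were performed with cURL; the sketch below expresses the same idea using Python's standard library instead. The list of verbs, the traversal depth, the placeholder file `etc/passwd`, and the `bcl-probe/0.1` User-Agent string are all assumptions for illustration, since the specific requests sent to the sequencer are not listed in this paper.

```python
import http.client
from typing import Dict, List, Optional, Tuple, Union

METHODS = ("GET", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE")


def traversal_paths(target_file: str = "etc/passwd", max_depth: int = 3) -> List[str]:
    """Build plain and percent-encoded '../' paths up to max_depth levels."""
    paths = []
    for depth in range(1, max_depth + 1):
        paths.append("/" + "../" * depth + target_file)
        paths.append("/" + "..%2f" * depth + target_file)
    return paths


def build_probe_plan(max_depth: int = 3) -> List[Tuple[str, str]]:
    """Every verb against '/', then GET against each traversal path."""
    plan = [(method, "/") for method in METHODS]
    plan.extend(("GET", path) for path in traversal_paths(max_depth=max_depth))
    return plan


def probe(host: str, port: int, method: str, path: str,
          timeout: float = 5.0) -> Dict[str, Union[str, int, Optional[str]]]:
    """Send one request and record the status, Server banner, and body length."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # http.client sends the path as given, without collapsing '../' segments.
        conn.request(method, path, headers={"User-Agent": "bcl-probe/0.1"})
        response = conn.getresponse()
        body = response.read()
        return {
            "method": method,
            "path": path,
            "status": response.status,
            "server": response.getheader("Server"),
            "length": len(body),
        }
    finally:
        conn.close()
```

Calling `probe` for each entry returned by `build_probe_plan` yields one record per request, and the `server` field captures the same banner information that dedicated banner-grabbing tools report.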

Figure 2: Detailed BCL DFD.

Figure 3: SYN Scan of DNA Sequencer.

Table 1: STRIDE-to-TTP Mapping Results.

Component Name ID S T R I D E
Wet Lab 1 T1078 T1565 T1070 T1590 T1499
Sequencer 2 T1091 T1040 T1499 T1068
Sample Input Interface 2.1 T1059 T1556 T1529 T1068
Local Temporary Datastore 2.2 T1565 T1005 T1529
Sample Output Interface 2.3 T1041
Sequencer Control Workstation 2.4 T1565 T1005 T1529 T1068
Physical Maintenance Interface 2.5 T1542 T1485 T1529 T1547
Lab Network Administration Interface 2.6 T1071 T1485 T1003 T1529 T1569
Remote Maintenance Interface 2.7 T1565 T1485 T1529 T1563
Research Computing Environment 3 T1078 T1222 T1485 T1005 T1499 T1548
Management and Tooling 4 T1078 T1070 T1485 T1005 T1529 T1078
Data Delivery DMZ 5 T1199 T1222 T1485 T1005 T1498
Partners 6 T1199 T1021 T1537
Manufacturers 7 T1199 T1070
LIMS 8 T1600 T1485 T1005 T1499 T1134
DNA Extraction 9 T1565 T1005 T1499
DNA Fragmentation 10 T1565 T1005 T1499
Library Preparation 11 T1059 T1040 T1499
Quality Control 12 T1600 T1040 T1499
Compute Nodes 13 T1078 T1195 T1485 T1005 T1499 T1053
Cluster Filesystem 14 T1222 T1005 T1499
Applications and Services 15 T1078 T1195 T1485 T1005 T1489 T1611
IT Tooling 15.1 T1078 T1195 T1485 T1005 T1489 T1611
Cyber Tooling 15.2 T1078 T1195 T1485 T1005 T1489 T1611
Sequencer Management 15.3 T1078 T1195 T1485 T1005 T1489 T1611
Monitoring and Security Logs 16 T1070 T1005 T1489
Cloud Service Providers 17 T1078 T1485 T1489
Hypervisor 18 T1564 T1489
Lab Technician 19 T1078 T1485 T1650
Manufacturer Technician 20 T1078 T1542 T1485 T1056
Administrator 21 T1078 T1485 T1650
User 22 T1078 T1485 T1650

Nikto, a web application vulnerability scanner (Sullo and Lodge, n.d.), was used to scan the web server, but still no useful information was returned.

Gobuster (OJ, n.d.) was used to try enumerating the directories on the HTTP server using the SecLists combined_directories.txt wordlist (Miessler et al., n.d.). No results were returned from this enumeration attempt. Additional cURL tests were performed with the goal of getting more informative responses from the HTTP server, such as user agent spoofing and specifying the allow unsafe option, but responses to these requests were no more elucidating. Network fuzzing using Wfuzz (xmendez, n.d.) was then conducted using the SecLists combined wordlist for HTTP server testing to find network requests that could be sent to the HTTP server to get more interesting information in responses. Analysis was performed on the number of characters in the responses returned by the server during this testing and found nothing of note.
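The response-length analysis mentioned above can be illustrated with a short Python sketch. It assumes each fuzzing result has been reduced to a (payload, status code, character count) tuple; that record layout, the function name, and the tolerance parameter are hypothetical and do not reflect the output format of any particular tool. The idea is that a server answering every request with the same error page produces one dominant length, so any response that deviates from it deserves a closer look.

```python
from collections import Counter
from typing import List, Optional, Tuple

Response = Tuple[str, int, int]  # (payload, HTTP status, characters in response)


def length_outliers(
    responses: List[Response], tolerance: int = 0
) -> Tuple[Optional[int], List[Response]]:
    """Return the most common response length and the responses that deviate.

    tolerance is the number of characters a response may differ from the
    baseline before it is reported, which helps when the server echoes the
    requested path back in its error page.
    """
    if not responses:
        return None, []
    counts = Counter(length for _, _, length in responses)
    baseline = counts.most_common(1)[0][0]
    outliers = [r for r in responses if abs(r[2] - baseline) > tolerance]
    return baseline, outliers
```

An empty outlier list corresponds to the outcome reported here: every response looked like every other, so the fuzzing run surfaced nothing worth following up.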

4.1 Results and Considerations

Ultimately, the scans and tests targeting the DNA sequencer did not find an exploitable vulnerability. However, there are still lessons to be learned from the effort regarding technical information acquisition and self-reliant device deployment. These were issues that presented themselves throughout the research project and were difficult to overcome.

During the research project, it was difficult to locate full technical details about the target sequencing device from the manufacturer. The most useful technical information is held behind a paywall and is not readily available to the public, making research efforts such as this more difficult. Accessing this information was not easier for the local genomic sequencing laboratory that contributed to the project. Having access to device documentation that is more technical than what is publicly available would have been beneficial to this research project and would benefit future projects.

Having reviewed the data collected from the tests and scans performed on the DNA sequencer, the research team believes that some of the sequencer’s behavior, such as the error responses from the HTTP servers, may be attributable to the device not being fully deployed in a production environment by a manufacturer technician. Technical documentation defining expected behavior is necessary to confirm our suspicions. Future projects would benefit from engagement from the device manufacturer to fully deploy target sequencer devices.

5 FUTURE WORK

Further research into the overall security posture of genomic-specific devices, such as DNA sequencers and other wet lab devices, would contribute greatly to the nascent, yet maturing, field of biocybersecurity. Vulnerability assessments of the vast number of device models used in wet labs, starting with those devices that represent the largest market share, would improve sequencing lab trust in the devices that they connect to their networks and rely on for the production of DNA sequence data. While unachievable by academia alone, the creation of Manufacturer Usage Descriptions (MUD) stands to benefit genomics labs as network access for specialty devices such as the DNA sequencer could be reduced to only those connections that the device strictly needs.
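The effect of a MUD-style policy can be illustrated with a simplified allowlist check in Python. Real MUD files are JSON documents defined in RFC 8520 and are enforced by network equipment; the sketch below does not follow that schema. The host names, ports, and function names are hypothetical and do not describe the connections any actual sequencer requires.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class AllowedConnection:
    host: str
    port: int
    protocol: str  # "tcp" or "udp", lower case


# Hypothetical policy: the only destinations the sequencer is permitted to reach.
SEQUENCER_POLICY: FrozenSet[AllowedConnection] = frozenset({
    AllowedConnection("updates.vendor.example", 443, "tcp"),
    AllowedConnection("lims.lab.example", 443, "tcp"),
})


def violations(
    observed: Iterable[Tuple[str, int, str]],
    policy: FrozenSet[AllowedConnection],
) -> List[Tuple[str, int, str]]:
    """Return observed (host, port, protocol) connections the policy does not allow."""
    return [
        conn
        for conn in observed
        if AllowedConnection(conn[0], conn[1], conn[2].lower()) not in policy
    ]
```

Under such a policy, anything the device attempts outside the manufacturer-declared set is either blocked or surfaced for review, which is the reduction in network access described above.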

6 CONCLUSIONS

Working together with a local genomic sequencing lab has provided valuable insight into the inner workings of genomic labs. Consistent communication with the lab allowed the threat model to be developed iteratively. As new characteristics of the network were discovered through tours or interviews, previous threat modeling steps were revisited and adapted as needed to fit the new information. As a result, the model’s fidelity increased over the course of the project.

The most interesting finding from the threat model is that device manufacturers or vendors may require direct access to their deployed devices in their customers’ networks. For example, a DNA sequencer may require that it be reachable from the manufacturer for the purposes of updates and maintenance. This presents a cybersecurity concern as network administrators must take this into account when designing firewall rules or monitoring network traffic. In the case of a DNA sequencer, it is also worth noting that some manufacturers or vendors may use remote maintenance software to access the PC workstation attached to the sequencer when performing maintenance. Manufacturers or vendors may also send a maintenance technician to the sequencing lab’s campus in person to perform maintenance such as updates, depending on the situation.

Access to the BCL made devices available for research that would otherwise have been cost prohibitive due to device prices. This allowed us to conduct network scans and tests in an environment that can be expanded over time. Leveraging a non-production lab also allowed us to conduct these scans and tests without concern for harming an ongoing sequencing workflow.

Although a vulnerability was not discovered from the tests and scans, the effort has shown that more detailed technical documentation from manufacturers would assist research efforts and that future efforts may benefit from manufacturer-guided deployments in biocybersecurity testing labs. These guided deployments would ensure that the device is properly configured and that its network services are fully operational.

ACKNOWLEDGEMENTS

We acknowledge the HudsonAlpha Institute for Biotechnology for collaborating with this research effort and providing insight into a real-world genomics lab. Their involvement gave this research effort access to real-world genomics equipment and a functional environment in which to conduct threat modeling and perform scans and tests.

REFERENCES

Bochniewicz, E., Chase, M., Coley, S. C., Wallace, K., Weir, M., and Zuk, M. Playbook for threat modeling medical devices.

curl. Official curl website.

Lyon, G. F. Nmap official site.

Miessler, D., Haddix, J., Portal, I., and g0tmi1k. SecLists GitHub repository.

MITRE. MITRE ATT&CK enterprise matrix.

OJ. Gobuster GitHub repository.

Shostack, A. (2014). Threat modeling: Designing for security. Wiley.

Sullo, C. and Lodge, D. Nikto download page.

xmendez. Wfuzz GitHub repository.
