Prioritizing Feasible and Impactful Actions to Enable Secure AI Development and Use in Biology

Josh Dettman; Emily Lathrop; Aurelia Attal‐Juncqua; Matthew Nicotra; Allison Berke

doi:10.1002/bit.70132

Journal article · Open Access · Published Jan 17, 2026

Biotechnology and Bioengineering · Wiley

Abstract

As artificial intelligence continues to enhance biological innovation, the potential for misuse must be addressed to fully unlock the potential societal benefits. While significant work has been done to evaluate general‐purpose AI and specialized biological design tools (BDTs) for biothreat creation risks, actionable steps to mitigate the risk of AI‐enabled biothreat creation are underdeveloped. This paper provides policy and technology strategies collected from a diverse range of sources placed in the context of an organizing framework aligned with steps in the AI‐enabled creation of a biothreat. After collating previous reports (typically on one or a small set of mitigation options) and evaluating the proposed mitigation options by projected feasibility and impact, we prioritize development of seven mitigation strategies (with a total of twelve individual mitigations): model unlearning and information removal techniques (a combination of five mitigations), classifier‐based input and output filtering for BDTs, AI agents for biosecurity, safety bug bounty programs, ensuring enforcement of existing material/equipment protections, enhancing biosurveillance and bioattribution, and screening metadata/audit logs before DNA synthesis. We invite collaboration among policymakers, researchers, and technologists to refine and implement these strategies into a strong layered defense, ensuring that AI can be used safely and securely to the benefit of all.
