An Empirical Evaluation of LLMs for Solving Offensive Security Challenges

Feb. 20, 2024, 5:11 a.m. | Minghao Shao, Boyuan Chen, Sofija Jancheska, Brendan Dolan-Gavitt, Siddharth Garg, Ramesh Karri, Muhammad Shafique

cs.CR updates on arXiv.org

arXiv:2402.11814v1 Announce Type: new
Abstract: Capture The Flag (CTF) challenges are puzzles related to computer security scenarios. With the advent of large language models (LLMs), more and more CTF participants are using LLMs to understand and solve the challenges. However, so far no work has evaluated the effectiveness of LLMs in solving CTF challenges with a fully automated workflow. We develop two CTF-solving workflows, human-in-the-loop (HITL) and fully-automated, to examine the LLMs' ability to solve a selected set of CTF …
