An advanced persistent threat (APT) refers to a covert and long-term cyberattack, typically conducted by state-sponsored actors, targeting critical sectors and often remaining undetected for long periods. In response, collective intelligence from around the globe collaborates to identify and trace surreptitious activities, generating substantial documentation on APT campaigns publicly available on the web. While a multitude of prior works predominantly focus on specific aspects of APT cases, such as detection, evaluation, cyber threat intelligence, and dataset creation, limited attention has been devoted to revisiting and investigating these scattered dossiers in a longitudinal manner. Our study fills this gap by offering a macro perspective that connects key insights and global trends across past APT attacks. We systematically analyze six reliable sources (three focused on technical reports and three on threat actors), examining 1,509 APT dossiers (totaling 24,215 pages) spanning a decade from 2014 to 2023, and identifying 603 unique APT groups worldwide. To efficiently unearth relevant information, we employ a hybrid methodology that combines rule-based information retrieval with large-language-model-based search techniques. Our longitudinal analysis reveals shifts in threat actor activities, global attack vectors, changes in targeted sectors, and the relationships between cyberattacks and significant events such as elections or wars, providing insights into historical patterns in APT evolution. Over the past decade, 154 countries have been affected; malicious documents and spear phishing have been the dominant initial infiltration vectors, and zero-day exploitation has noticeably declined since 2016. Furthermore, we present our findings through interactive visualization tools, such as an APT map and a flow diagram, to facilitate an intuitive understanding of the global patterns and trends in APT activities.
@inproceedings{aptstudy-ccs25,author={Yuldoshkhujaev, Shakhzod and Jeon, Mijin and Kim, Doowon and Nikiforakis, Nick and Koo, Hyungjoon},title={A Decade-long Landscape of Advanced Persistent Threats: Longitudinal Analysis and Global Trends (To appear)},year={2025},month=oct,publisher={ACM},url={},doi={},booktitle={Proceedings of the 32nd ACM Conference on Computer and Communications Security (CCS’25)},pages={XXX-XXX},keywords={APT, landscape, longitudinal analysis, global trends},location={Taiwan},series={CCS '25},}
Bootkits and rootkits are among the most elusive and persistent forms of malware, subverting system defenses by operating at the lowest levels of system architecture. Bootkits compromise the firmware or bootloader, allowing them to manipulate the boot sequence and gain control before security mechanisms initialize. Meanwhile, rootkits embed themselves within the OS kernel, stealthily conceal malicious activities, and maintain long-term persistence. Despite their critical implications for security, these threats remain underexplored due to the technical complexity involved in their study, the scarcity of real-world samples, and the challenges posed by defense-in-depth security in modern OSes. In this paper, we introduce BOOTKITTY, a hybrid bootkit-rootkit capable of circumventing modern security features across multiple OS platforms, including Windows, Linux, and Android. We explore critical firmware and bootloader vulnerabilities that can lead to a low-level compromise, demonstrating techniques that bypass advanced security protections by breaking the chain of trust. Our study addresses technical challenges such as exploiting UEFI drivers, manipulating kernel memory, and evading advanced mitigations in the boot process, and provides actionable insights. Our systematic evaluations show that BOOTKITTY reveals critical weaknesses in contemporary security mechanisms, highlighting the need for better security design that offers holistic (low-level) protection.
@inproceedings{bootkitty-woot25,author={Lee, Junho and Kwon, Jihoon and Seo, HyunA and Lee, Myeongyeol and Seo, Hyungyu and Jung, Jinho and Koo, Hyungjoon},title={BOOTKITTY: A Stealthy Bootkit-Rootkit Against Modern Operating Systems},year={2025},month=aug,publisher={USENIX Association},url={},doi={},booktitle={Proceedings of the 19th USENIX WOOT Conference on Offensive Technologies (WOOT'25)},pages={303-320},keywords={bootkit, rootkit},location={Seattle, USA},series={WOOT '25},}
Phishing attacks pose a significant threat to Internet users, with cybercriminals elaborately replicating the visual appearance of legitimate websites to deceive victims. Visual similarity-based detection systems have emerged as an effective countermeasure, but their effectiveness and robustness in real-world scenarios have been underexplored. In this paper, we comprehensively scrutinize and evaluate the effectiveness and robustness of popular visual similarity-based anti-phishing models using a large-scale dataset of 451k real-world phishing websites. Our analyses of the effectiveness reveal that while certain visual similarity-based models achieve high accuracy on curated datasets in the experimental settings, they exhibit notably low performance on real-world datasets, highlighting the importance of real-world evaluation. Furthermore, we find that the attackers evade the detectors mainly in three ways: (1) directly attacking the model pipelines, (2) mimicking benign logos, and (3) employing relatively simple strategies such as eliminating logos from screenshots. To statistically assess the resilience and robustness of existing models against adversarial attacks, we categorize the strategies attackers employ into visible and perturbation-based manipulations and apply them to website logos. We then evaluate the models’ robustness using these adversarial samples. Our findings reveal potential vulnerabilities in several models, emphasizing the need for more robust visual similarity techniques capable of withstanding sophisticated evasion attempts. We provide actionable insights for enhancing the security of phishing defense systems, encouraging proactive actions.
@inproceedings{phishing-models-usenix25,author={Ji, Fujiao and Lee, Kiho and Koo, Hyungjoon and You, Wenhao and Choo, Euijin and Kim, Hyoungshick and Kim, Doowon},title={Evaluating the Effectiveness and Robustness of Visual Similarity-based Phishing Detection Models},year={2025},month=aug,publisher={USENIX Association},url={},doi={},booktitle={Proceedings of the 34th USENIX Security Symposium (USENIX Security)},pages={???-???},keywords={phishing detection model, evaluation, visual similarity},location={Seattle, USA},series={USENIX '25},}
The recent advancements in artificial intelligence drive the
widespread adoption of Machine-Learning-as-a-Service platforms,
which offer valuable services. However, these pervasive utilities in the cloud
environment unavoidably encounter security and privacy issues.
In particular, a membership inference attack (MIA) poses a threat by recognizing
the presence of a data sample in a training set for the target model.
Prior MIA approaches repeatedly underline privacy risks
by demonstrating experimental results with standard benchmark
datasets such as MNIST and CIFAR;
however, the effectiveness of such techniques on a real-world
dataset remains questionable.
We are the first to perform an in-depth empirical study on black-box
MIAs under realistic assumptions, including six metric-based
and three classifier-based MIAs, on a high-dimensional image dataset
that consists of identification (ID) cards and driving licenses.
Additionally, we introduce a Siamese-based MIA that shows similar or
better performance than the state-of-the-art approaches and suggest
training a shadow model with autoencoder-based reconstructed images.
Our major findings show that the performance of MIA techniques
may degrade when data has too many features, and that the MIA configuration
or a sample’s properties can impact the accuracy of membership
inference on members and non-members.
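To make the metric-based family concrete, the following sketch illustrates a simple confidence-thresholding MIA in Python; the metric choice, the calibration strategy, and the toy inputs are illustrative assumptions rather than the exact attacks evaluated in the paper.

```python
import numpy as np

def confidence_scores(probs: np.ndarray) -> np.ndarray:
    """Maximum softmax probability per sample (shape: [n, n_classes] -> [n])."""
    return probs.max(axis=1)

def calibrate_threshold(nonmember_probs: np.ndarray, fpr: float = 0.05) -> float:
    """Pick a threshold so that only `fpr` of known non-members are (wrongly) called members."""
    scores = confidence_scores(nonmember_probs)
    return float(np.quantile(scores, 1.0 - fpr))

def infer_membership(target_probs: np.ndarray, threshold: float) -> np.ndarray:
    """Predict membership (True = member) for samples queried against the target model."""
    return confidence_scores(target_probs) >= threshold

# Toy usage with random softmax outputs (illustrative only).
rng = np.random.default_rng(0)
nonmember = rng.dirichlet(np.ones(10), size=1000)   # stand-in for known non-member outputs
queried = rng.dirichlet(np.ones(10) * 5, size=100)  # stand-in for outputs on queried samples
thr = calibrate_threshold(nonmember)
print(infer_membership(queried, thr).mean())
```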
@inproceedings{blackbox-mia,author={Kwon, Yujeong and Woo, Simon S. and Koo, Hyungjoon},title={An Empirical Study of Black-box based Membership Inference Attacks on a Real-World Dataset},year={2024},month=dec,publisher={Association for Computing Machinery},url={},doi={},booktitle={Proceedings of the 17th International Symposium on Foundations and Practice of Security (FPS)},pages={XXX-XXX},keywords={Membership Inference Attack, Machine Learning},location={Montreal, Canada},series={FPS '24},}
Decompilation is a process of converting a low-level machine code snippet back into a high-level programming
language such as C. It serves as a basis to aid reverse engineers in comprehending the contextual semantics of
the code. In this respect, commercial decompilers like Hex-Rays have made significant strides in improving
the readability of decompiled code over time. While previous work has proposed the metrics for assessing the
readability of source code, including identifiers, variable names, function names, and comments, those metrics
are unsuitable for measuring the readability of decompiled code primarily due to i) the lack of rich semantic
information in the source and ii) the presence of erroneous syntax or inappropriate expressions. In response,
to the best of our knowledge, this work first introduces R2I, the Relative Readability Index, a specialized metric
tailored to evaluate decompiled code in a relative context quantitatively. In essence, R2I can be computed by
i) taking code snippets across different decompilers as input and ii) extracting pre-defined features from an
abstract syntax tree. For the robustness of R2I, we thoroughly investigate the enhancement efforts made by
existing decompilers and academic research to promote code readability, identifying 31 features to yield a
reliable index collectively. Besides, we conduct a user survey to capture subjective factors such as one’s
coding styles and preferences. Our empirical experiments demonstrate that R2I is a versatile metric capable of
representing the relative quality of decompiled code (e.g., obfuscation, decompiler updates) and being well
aligned with human perception in our survey.
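As a rough illustration of how such a relative index can be computed, the sketch below scores each decompiler's output with a weighted sum of normalized AST-derived features and rescales the scores relative to one another; the feature names, weights, and aggregation are hypothetical placeholders, not the 31 features or the exact formula behind R2I.

```python
from dataclasses import dataclass

# Hypothetical feature names and weights purely for illustration; the actual R2I
# uses 31 AST-derived features and its own aggregation, which are not reproduced here.
FEATURE_WEIGHTS = {
    "meaningful_var_names": 2.0,   # fraction of variables with non-placeholder names
    "goto_free": 1.5,              # 1.0 if no goto statements survive decompilation
    "cast_density": -1.0,          # penalize excessive explicit casts
    "dead_assignments": -1.0,      # penalize obviously unused assignments
}

@dataclass
class DecompiledSnippet:
    decompiler: str
    features: dict  # feature name -> value normalized into [0, 1]

def raw_score(snippet: DecompiledSnippet) -> float:
    """Weighted sum of normalized AST features for one decompiler's output."""
    return sum(FEATURE_WEIGHTS[name] * snippet.features.get(name, 0.0)
               for name in FEATURE_WEIGHTS)

def relative_index(snippets: list) -> dict:
    """Scale scores into [0, 1] relative to the best/worst output of the same function."""
    scores = {s.decompiler: raw_score(s) for s in snippets}
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {name: (score - lo) / span for name, score in scores.items()}
```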
@inproceedings{r2i,author={Eom, Haeun and Kim, Dohee and Lim, Sori and Koo, Hyungjoon and Hwang, Sungjae},title={R2I: A Relative Readability Metric for Decompiled Code},year={2024},month=jul,publisher={Association for Computing Machinery},url={https://dl.acm.org/doi/10.1145/3643744},doi={10.1145/3643744},booktitle={Proceedings of the ACM International Conference on the Foundations of Software Engineering (FSE)},pages={383-405},keywords={Code Readability, Code Metric, Decompiled Code, Decompiler},location={Porto de Galinhas, Brazil},series={FSE '24},}
Binary reverse engineering is crucial to gain insights into the inner
workings of a stripped binary. Yet, it is challenging to read the original semantics from a binary code snippet
because of the unavailability of high-level information in the source, such as function
names, variable names, and types. Recent advancements in deep learning show the possibility
of recovering such vanished information with a well-trained model from a pre-defined dataset.
Despite a static model’s notable performance, it can hardly cope with an ever-increasing
data stream (e.g., compiled binaries) by nature. The two viable approaches for ceaseless learning are
retraining the whole dataset from scratch and fine-tuning a pre-trained model;
however, retraining suffers from large computational overheads and fine-tuning
from performance degradation (i.e., catastrophic forgetting). Lately, continual learning (CL) tackles
the problem of handling incremental data in security domains (e.g., network intrusion
detection, malware detection) using reasonable resources while maintaining performance in practice.
In this paper, we focus on how CL assists the improvement of a generative model that
predicts a function symbol name from a series of machine instructions. To this end, we introduce
BinAdapter, a system that can infer function names from an incremental dataset
without performance degradation on the original dataset by leveraging CL techniques.
Our major finding shows that incremental tokens in the source (i.e., machine instructions) or the target
(i.e., function names) largely affect the overall performance of a CL-enabled model.
Accordingly, BinAdapter adopts three built-in approaches: i) inserting adapters in case of
no incremental tokens in both the source and target, ii) harnessing multilingual neural
machine translation (M-NMT) and fine-tuning the source embeddings
with i) in case of incremental tokens in the source, and iii) fine-tuning target embeddings with
ii) in case of incremental tokens in both. To demonstrate the effectiveness of BinAdapter, we
evaluate the above three scenarios using incremental datasets with or without a set of new tokens
(e.g., unseen machine instructions or function names), spanning across different architectures and
optimization levels. Our empirical results show that BinAdapter outperforms the state-of-the-art
CL techniques by up to 24.3% in F1 or 21.5% in ROUGE-L.
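For intuition on the first built-in approach, the snippet below shows a generic residual bottleneck adapter in PyTorch, the kind of small trainable module that can be inserted into a frozen Transformer layer; the dimensions and placement are illustrative assumptions and do not reproduce BinAdapter's exact architecture.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """A generic residual bottleneck adapter: down-project, non-linearity, up-project.

    Only these small adapter weights would be trained on the incremental dataset,
    while the original Transformer weights stay frozen.
    """
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Toy usage on a (batch, sequence, hidden) activation tensor.
hidden = torch.randn(2, 128, 512)
adapter = BottleneckAdapter(hidden_dim=512)
out = adapter(hidden)   # same shape as the input
print(out.shape)
```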
@inproceedings{binadapter,author={Murodova, Nozima and Koo, Hyungjoon},title={BinAdapter: Leveraging Continual Learning for Inferring Function Symbol Names in a Binary},year={2024},month=jul,publisher={Association for Computing Machinery},url={https://dl.acm.org/doi/10.1145/3634737.3645006},doi={10.1145/3634737.3645006},booktitle={Proceedings of the 19th ACM ASIA Conference on Computer and Communications Security (ASIACCS)},pages={1200-1213},keywords={Binary analysis, Software security, Reverse engineering, Continual learning},location={Singapore},series={ASIACCS '24},}
Identifying compiler toolchain provenance serves as a basis for both benign and malicious
binary analyses. A wealth of prior studies mostly focuses on the inference of a popular compiler toolchain
for C and C++ from stripped binaries built with GCC or Clang. Lately, emerging compilers
for languages such as Rust, Go, and Nim, which complement the downsides of C and C++
(e.g., security), have been rising in popularity, yet little has been explored about them. The main challenge
in applying previous inference techniques for toolchain provenance is that some emerging
compilation toolchains adopt the same backend as traditional compilers. In this paper, we propose ToolPhet,
an effective end-to-end BERT-based system for deducing the provenance of both traditional and emerging
compiler toolchains. To this end, we thoroughly study the characteristics of both an emerging toolchain
and an executable binary that is generated by that toolchain. We introduce two separate downstream
tasks for the compiler toolchain inference with a (BERT-based) fine-tuning process, which produces i) a
toolchain classification model, and ii) a binary code similarity detection model. Our findings show that
the classification model (i) may not suffice when a binary is produced with an existing backend (e.g., Nim),
in which case we adopt the detection model (ii), which can infer underlying code semantics. We evaluate ToolPhet
against previous work, including one signature-based tool and four machine-learning-based approaches,
demonstrating its effectiveness through higher F1 scores on binaries compiled with emerging
compilation toolchains.
@article{toolphet,title={ToolPhet: Inference of Compiler Provenance from Stripped Binaries with Emerging Compilation Toolchains},author={Jang, Hohyeon and Murodova, Nozima and Koo, Hyungjoon},journal={IEEE Access},volume={12},pages={12667--12682},year={2024},month=jan,publisher={IEEE},doi={10.1109/ACCESS.2024.3355098},}
Fuzzing has demonstrated great success in bug discovery, and plays a crucial role in software testing today.
Despite the increasing popularity of fuzzing, automated root cause analysis (RCA) has drawn less attention.
One of the recent advances in RCA is crash-based statistical debugging, which leverages the behavioral differences
in program execution between crash-triggered and non-crashing inputs. Hence, obtaining non-crashing behaviors
close to the original crash is crucial but challenging with previous approaches (e.g., fuzzing).
In this paper, we present BENZENE, a practical end-to-end RCA system that facilitates
an automated crash diagnosis. To this end, we introduce a novel technique, called under-constrained state mutation,
that generates both crashing and non-crashing behaviors for effective and efficient RCA.
We design and implement the BENZENE prototype, and evaluate it with 60 vulnerabilities in the wild.
Our empirical results demonstrate that BENZENE not only surpasses prior approaches in performance (i.e., root cause ranking),
but also achieves superior results in both speed (4.6 times faster) and memory footprint (31.4 times smaller)
on average.
@inproceedings{benzene,author={Park, Younggi and Lee, Hwiwon and Jung, Jinho and Koo, Hyungjoon and Kim, Huykang},booktitle={Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P)},title={BENZENE: A Practical Root Cause Analysis System with an Under-Constrained State Mutation (Distinguished Paper Award)},year={2024},issn={2375-1207},pages={74-74},keywords={root cause analysis;vulnerability analysis},doi={10.1109/SP54263.2024.00074},url={https://doi.ieeecomputersociety.org/10.1109/SP54263.2024.00074},publisher={IEEE Computer Society},address={Los Alamitos, CA, USA},month=may,}
The ever-increasing phishing campaigns around the globe have been one of the main threats to
cyber security. In response, the global anti-phishing entity (e.g., APWG) collectively maintains the up-to-date
blacklist database (e.g., eCrimeX) against phishing campaigns, and so do modern browsers (e.g., Google
Safe Browsing). However, our finding reveals that such a mutual assistance system has remained a blind
spot when detecting geolocation-based phishing campaigns. In this paper, we focus on phishing campaigns
against the web portal service with the largest number of users (42 million) in South Korea. We harvest
1,558 phishing URLs from varying resources in the span of a full year, of which only a small fraction (3.8%)
have been detected by eCrimeX despite a wide spectrum of active fraudulence cases. We demystify three
pervasive types of phishing campaigns in South Korea: i) sophisticated phishing campaigns with varying
adversarial tactics such as a proxy configuration, ii) phishing campaigns against a second-hand online market,
and iii) phishing campaigns against a non-specific target. Aligned with previous findings, a phishing kit that
supports automating the whole phishing campaign is prevalent. Besides, we frequently observe a hit-and-run scam
where a phishing campaign becomes inaccessible right after victimization is complete, each
of which is tailored to a single potential victim over a new channel like a messenger. As part of mitigation
efforts, we promptly provide regional phishing information to APWG, and immediately lock down a victim’s
account to prevent further damage.
@article{phishhunter,title={Demystifying the Regional Phishing Landscape in South Korea},author={Park, Hyunjun and Lim, Kyungchan and Kim, Doowon and Yu, Donghyun and Koo, Hyungjoon},journal={IEEE Access},pages={130131--130143},year={2023},month=nov,publisher={IEEE},doi={10.1109/ACCESS.2023.3333883},}
The recovery of contextual meanings of machine code is required by a wide range of
binary analysis applications, such as bug discovery, malware analysis, and code clone detection.
To accomplish this, advances in binary code analysis borrow techniques from natural language
processing to automatically infer the underlying semantics of a binary, rather than relying
on manual analysis. One of the crucial steps in this process is instruction normalization,
which helps to reduce the number of tokens and to avoid an out-of-vocabulary (OOV) problem.
However, existing approaches often substitute the operand(s) of an instruction with
a common token (e.g., callee target → FOO), inevitably resulting in the loss of important information.
In this paper, we introduce well-balanced instruction normalization (WIN), a novel approach
that retains rich code information while minimizing the downsides of code normalization.
With large swaths of binary code, our finding shows that the instruction distribution follows
Zipf’s Law like a natural language, a function conveys contextually meaningful information,
and the same instruction at different positions may require diverse code representations.
To show the effectiveness of WIN, we present DeepSemantic, which harnesses the BERT architecture
with two training phases: pre-training for generic assembly code representation, and
fine-tuning for building a model tailored to a specialized task. We define a downstream task of
binary code similarity detection, which requires underlying code semantics. Our experimental results
show that our binary similarity model with WIN outperforms two state-of-the-art binary similarity tools,
DeepBinDiff and SAFE, with an average improvement of 49.8% and 15.8%, respectively.
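The contrast between coarse normalization and a finer-grained scheme can be sketched as follows; the specific rules here (immediate bucketing, distinguishing library from internal call targets) are simplified assumptions for illustration, not the exact WIN rules.

```python
import re

def normalize_coarse(ins: str) -> str:
    """Coarse normalization: every operand collapses to a single placeholder."""
    parts = ins.split()
    return f"{parts[0]} OP" if len(parts) > 1 else parts[0]

def normalize_win_like(ins: str, known_libcalls=("printf", "memcpy")) -> str:
    """Illustrative finer-grained normalization that keeps more context.

    Assumed rules (not the paper's exact ones): registers stay as-is,
    immediates become size buckets, and call targets distinguish library
    from internal functions instead of collapsing to a single token.
    """
    parts = ins.replace(",", " ").split()
    mnemonic, operands = parts[0], parts[1:]
    out = []
    for op in operands:
        if mnemonic == "call":
            out.append("libcall_" + op if op in known_libcalls else "innerfunc")
        elif re.fullmatch(r"0x[0-9a-fA-F]+|\d+", op):
            out.append("imm8" if int(op, 0) < 256 else "imm32")
        else:
            out.append(op)  # registers and memory expressions are kept
    return " ".join([mnemonic] + out)

print(normalize_coarse("call printf"))        # call OP
print(normalize_win_like("call printf"))      # call libcall_printf
print(normalize_win_like("mov rax, 0x10"))    # mov rax imm8
```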
@article{win-normalization,author={Koo, Hyungjoon and Park, Soyeon and Choi, Daejin and Kim, Taesoo},journal={IEEE Access},title={Binary Code Representation With Well-Balanced Instruction Normalization},year={2023},month=mar,number={},pages={29183-29198},doi={10.1109/ACCESS.2023.3259481},}
A smart contract is a self-executing program on a blockchain to ensure an immutable
and transparent agreement without the involvement of intermediaries. Despite its growing popularity
for many blockchain platforms like Ethereum, no technical means is available even when
a smart contract needs to be protected from being copied. One promising direction to claim
software ownership is software watermarking. However, applying existing software watermarking
techniques is challenging because of the unique properties of a smart contract, such as
a code size constraint, non-free execution cost, and no support for dynamic allocation under
a virtual machine environment. This paper introduces a novel software watermarking scheme,
dubbed Smartmark, aiming to protect the ownership of a smart contract against piracy.
Smartmark builds the control flow graph of a target contract’s runtime bytecode, and locates
a collection of bytes that are randomly selected to represent a watermark. We implement
a full-fledged prototype for Ethereum, applying Smartmark to 27,824 unique smart contract bytecodes.
Our empirical results demonstrate that Smartmark can effectively embed a watermark into
a smart contract and verify its presence, meeting the requirements of credibility and imperceptibility
while incurring an acceptable performance degradation. Besides, our security analysis shows that
Smartmark is resilient against viable watermarking corruption attacks; e.g., a large number of
dummy opcodes would be needed to disable a watermark effectively, rendering
an illegitimate smart contract clone uneconomical.
@inproceedings{smartmark,author={Kim, Taeyoung and Jang, Yunhee and Lee, Chanjong and Koo, Hyungjoon and Kim, Hyoungshick},booktitle={Proceedings of the IEEE/ACM 45th International Conference on Software Engineering (ICSE)},title={Smartmark: Software Watermarking Scheme for Smart Contracts},year={2023},month=may,volume={},number={},pages={283-294},doi={10.1109/ICSE48619.2023.00035},}
Reverse engineering of a stripped binary has a wide range of applications,
yet it is challenging mainly due to the lack of contextually useful information within.
Once debugging symbols (e.g., variable names, types, function names) are discarded, recovering
such information is not technically viable with traditional approaches like static or dynamic binary analysis.
We focus on function symbol name recovery, which allows a reverse engineer to gain a quick overview of
an unseen binary. The key insight is that a well-developed program labels a meaningful function name
that describes its underlying semantics well. In this paper, we present AsmDepictor,
a Transformer-based framework that generates a function symbol name from a set of assembly codes
(i.e., machine instructions), which consists of three major components: binary code refinement,
model training, and inference. To this end, we conduct systematic experiments on the effectiveness of
code refinement that can enhance the overall performance. We introduce the per-layer positional
embedding and Unique-softmax for AsmDepictor so that both can aid in capturing a better relationship between tokens.
Lastly, we devise a novel evaluation metric tailored for a short description length, the Jaccard* score.
Our empirical evaluation shows that the performance of AsmDepictor by far surpasses that of
the state-of-the-art models by up to around 400%. The best AsmDepictor model achieves an F1 of 71.5
and Jaccard* of 75.4.
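For intuition, the snippet below computes a plain token-level Jaccard similarity between a predicted and a ground-truth function name; Jaccard* is the paper's tailored variant for short descriptions, whose exact adjustment is not reproduced here.

```python
import re

def name_tokens(symbol: str) -> set:
    """Split a function symbol into lower-cased word tokens (snake_case / CamelCase)."""
    parts = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", symbol).replace("_", " ").split()
    return {p.lower() for p in parts}

def jaccard(pred: str, truth: str) -> float:
    """Plain token-level Jaccard similarity between predicted and true names."""
    p, t = name_tokens(pred), name_tokens(truth)
    return len(p & t) / len(p | t) if (p or t) else 1.0

print(jaccard("parseHttpHeader", "http_header_parse"))  # 1.0 (same token set)
print(jaccard("readFile", "write_file"))                # 0.33...
```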
@inproceedings{asmdepictor,author={Kim, Hyunjin and Bak, Jinyeong and Cho, Kyunghyun and Koo, Hyungjoon},title={A Transformer-Based Function Symbol Name Inference Model from an Assembly Language for Binary Reversing},year={2023},month=jul,isbn={9798400700989},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3579856.3582823},doi={10.1145/3579856.3582823},booktitle={Proceedings of the 18th ACM ASIA Conference on Computer and Communications Security (ASIACCS)},pages={951–965},numpages={15},keywords={reversing, neural networks, assembly, function name, Transformer},location={Melbourne, VIC, Australia},series={ASIA CCS '23},}
Password-based authentication is one of the most commonly adopted mechanisms for online security.
Choosing strong passwords is crucial for protecting one’s digital identities and assets, as weak passwords
are readily guessable, which can result in compromises such as unauthorized access. To promote the use of
strong passwords on the Web, the National Institute of Standards and Technology (NIST) provides
website administrators with password composition policy (PCP) guidelines. We manually inspect whether the
password policies of 100 popular websites conform to NIST’s PCP guidelines by generating passwords that meet
each criterion and testing them against each site. Our findings reveal that a considerable number of
websites (on average, 53.5%) do not comply with the guidelines, which could result in password breaches.
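A minimal sketch of the testing idea is shown below: probe passwords that should be accepted or rejected under NIST-style rules are checked against a policy function; the criteria and probes are illustrative, not the full NIST SP 800-63B rule set or the study's exact test cases.

```python
# Illustrative probe passwords and a NIST-style check (minimum length, not in a
# breached/common list, long passphrases accepted); the study's actual criteria
# and the 100 tested sites are not reproduced here.
COMMON_PASSWORDS = {"password", "12345678", "qwertyui"}

def meets_nist_style_policy(pw: str) -> bool:
    if len(pw) < 8:
        return False
    if pw.lower() in COMMON_PASSWORDS:
        return False
    return True

PROBES = {
    "short7!":            False,  # too short: should be rejected
    "12345678":           False,  # breached/common: should be rejected
    "correct horse batt": True,   # long passphrase: should be accepted
}

for pw, expected in PROBES.items():
    assert meets_nist_style_policy(pw) == expected, pw
print("all probe passwords behave as expected")
```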
@inproceedings{passwd-policy,author={Lim, Kyungchan and Kang, Joshua Hankyul and Dixson, Matthew and Koo, Hyungjoon and Kim, Doowon},title={Evaluating Password Composition Policy and Password Meters of Popular Websites},year={2023},month=may,isbn={979-8-3503-1237-9},publisher={IEEE},address={New York, NY, USA},url={https://ieeexplore.ieee.org/document/10188654},doi={10.1109/SPW59333.2023.00006},booktitle={Proceedings of the 2023 IEEE Security and Privacy Workshops (SPW)},pages={12–20},numpages={9},location={San Francisco, CA, USA},series={SecWeb '23},}
A progressive web application (PWA) has become an attractive option
for building universal applications based on feature-rich web Application Programming Interfaces (APIs).
While flexible, such vast APIs inevitably bring a significant increase in the API attack surface,
much of which corresponds to functionality that is neither needed nor wanted by the application.
A promising approach to reduce the API attack surface is software debloating, a technique wherein
an unused functionality is programmatically removed from an application. Unfortunately, debloating PWAs
is challenging, given the monolithic design and non-deterministic execution of a modern web browser.
In this paper, we present DeView, a practical approach that reduces the attack surface of a PWA
by blocking unnecessary but accessible web APIs. DeView tackles the challenges of PWA debloating by
i) record-and-replay web API profiling that identifies needed web APIs on an app-by-app basis by replaying
(recorded) browser interactions and ii) compiler-assisted browser debloating that eliminates
the entry functions of corresponding web APIs from the mapping between web API and its entry point in a binary.
Our evaluation shows the effectiveness and practicality of DeView. DeView successfully eliminates 91.8% of
accessible web APIs while i) maintaining original functionalities and ii) preventing 76.3%
of known exploits on average.
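The elimination step can be pictured with the small sketch below, which assumes a profiled set of web APIs used by a PWA and a hypothetical map from web APIs to their entry-point symbols in the browser binary; every unused entry becomes a removal candidate.

```python
# Illustrative inputs: a record-and-replay profile yields the web APIs an app
# actually calls, and a (hypothetical) map ties each web API to its entry symbol
# in the browser binary. Anything the app never touches becomes a removal candidate.
API_TO_ENTRY = {
    "Bluetooth.requestDevice": "blink::Bluetooth::requestDevice",
    "Geolocation.getCurrentPosition": "blink::Geolocation::getCurrentPosition",
    "CacheStorage.open": "blink::CacheStorage::open",
}

def entries_to_eliminate(profiled_apis: set) -> set:
    """Entry functions for web APIs that the profiled PWA never used."""
    return {entry for api, entry in API_TO_ENTRY.items() if api not in profiled_apis}

profile = {"CacheStorage.open"}          # APIs observed during record-and-replay
print(entries_to_eliminate(profile))     # Bluetooth and Geolocation entries removed
```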
@inproceedings{deview,author={Oh, ChangSeok and Lee, Sangho and Qian, Chenxiong and Koo, Hyungjoon and Lee, Wenke},title={DeView: Confining Progressive Web Applications by Debloating Web APIs},year={2022},month=dec,isbn={9781450397599},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3564625.3567987},doi={10.1145/3564625.3567987},booktitle={Proceedings of the 38th Annual Computer Security Applications Conference (ACSAC)},pages={881–895},numpages={15},keywords={Record-and-Replay, PWA, Debloating, Progressive Web Application, Program Analysis, Browser, Web APIs},location={Austin, TX, USA},series={ACSAC '22},}
Binary code similarity detection (BCSD) serves as a basis for a wide spectrum of applications,
including software plagiarism, malware classification, and known vulnerability discovery.
However, inferring the contextual meanings of a binary is challenging due to the absence of
the semantic information available in source code. Recent advances leverage the benefits of
deep learning architectures for a better understanding of underlying code semantics and
the advantages of the Siamese architecture for better BCSD. In this paper, we propose BinShot,
a BERT-based similarity learning architecture that is highly transferable for effective BCSD.
We tackle the problem of detecting code similarity with one-shot learning (a special case of few-shot learning).
To this end, we adopt a weighted distance vector with a binary cross entropy as a loss function
on top of BERT. Our experimental results with a BinShot prototype demonstrate the effectiveness,
transferability, and practicality of BinShot, which is robust in detecting the similarity of
previously unseen functions. We show that BinShot outperforms the previous state-of-the-art approaches for BCSD.
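A minimal PyTorch rendition of such a similarity head is sketched below, assuming per-function embeddings produced by a BERT encoder; the dimensions and toy inputs are illustrative rather than the released BinShot implementation.

```python
import torch
import torch.nn as nn

class SiameseSimilarityHead(nn.Module):
    """Weighted absolute-distance vector between two function embeddings,
    followed by a linear layer and binary cross-entropy (similar vs. dissimilar)."""
    def __init__(self, embed_dim: int = 768):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(embed_dim))  # learnable per-dimension weight
        self.classifier = nn.Linear(embed_dim, 1)
        self.loss_fn = nn.BCEWithLogitsLoss()

    def forward(self, emb_a, emb_b, labels=None):
        dist = self.weight * torch.abs(emb_a - emb_b)   # weighted distance vector
        logits = self.classifier(dist).squeeze(-1)
        if labels is None:
            return torch.sigmoid(logits)                 # similarity score in [0, 1]
        return self.loss_fn(logits, labels.float())

# Toy usage with random stand-ins for BERT function embeddings.
head = SiameseSimilarityHead()
a, b = torch.randn(4, 768), torch.randn(4, 768)
labels = torch.tensor([1, 0, 1, 0])
print(head(a, b, labels))   # training loss
print(head(a, b))           # similarity scores at inference
```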
@inproceedings{binshot,author={Ahn, Sunwoo and Ahn, Seonggwan and Koo, Hyungjoon and Paek, Yunheung},title={Practical Binary Code Similarity Detection with BERT-Based Transferable Similarity Learning},year={2022},month=dec,isbn={9781450397599},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3564625.3567975},doi={10.1145/3564625.3567975},booktitle={Proceedings of the 38th Annual Computer Security Applications Conference (ACSAC)},pages={361–374},numpages={14},keywords={Binary Analysis, Similarity Detection, Deep Neural Network},location={Austin, TX, USA},series={ACSAC '22},}
The Internet of Things (IoT) market has been ever-growing because both the demand for smart living
and the number of mobile users keep increasing. On the other hand, IoT device manufacturers tend to employ
proprietary operating systems and network protocols, which may lead to device interoperability issues.
The Open Connectivity Foundation (OCF) has established a standard protocol for seamless IoT communication.
IoTivity is one of the reference implementations that conform to the OCF specification.
IoTivity utilizes both Datagram Transport Layer Security (DTLS) and Constrained Application Protocol (CoAP)
to support a lightweight and secure communication. Although a packet analysis tool like Wireshark offers
a feature to decrypt messages over TLS or DTLS by feeding a session key that a web browser records,
it cannot be directly applied to IoTivity because IoTivity lacks such a key-tracing functionality. In this paper,
we present an IoTivity Packet Parser (IPP) for encrypted CoAP messages tailored to IoTivity.
To this end, we modify IoTivity source code to extract required keys, and leverage them to parse each field
automatically for further protocol analysis in a handy manner.
@inproceedings{iotivity,author={Jung, Hyeonah and Koo, Hyungjoon and Jeong, Jaehoon Paul},booktitle={Proceedings of the 24th International Conference on Advanced Communication Technology (ICACT)},title={IoTivity Packet Parser for Encrypted Messages in Internet of Things},year={2022},month=jan,volume={},number={},series={ICACT '22},pages={53-57},doi={10.23919/ICACT53585.2022.9728913},}
The ease of reproducing digital artifacts raises growing concerns about copyright infringement,
in particular for software products. Software watermarking is one of the promising techniques
to verify the owner of licensed software by embedding a digital fingerprint. Developing an ideal
software watermark scheme is challenging because i) unlike digital media watermarking,
software watermarking must preserve the original code semantics after inserting a watermark,
and ii) it requires well-balanced properties of credibility, resiliency, capacity, imperceptibility, and efficiency.
We present SoftMark, a software watermarking system that leverages a function relocation where
the order of functions implicitly encodes a hidden identifier. By design, SoftMark does not introduce
additional structures (i.e., codes, blocks, or subroutines), being robust in unauthorized detection,
while maintaining a negligible performance overhead and reasonable capacity. With various strategies
against viable attacks (i.e., static binary re-instrumentation), we tackle the limitations of previous
reordering-based approaches. Our empirical results demonstrate its practicality and effectiveness
through the successful embedding and extraction of various watermark values.
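The core idea that a function order can carry an identifier can be sketched with the factorial number system (a permutation of n functions encodes up to log2(n!) bits); the snippet below shows only this basic encoding, not SoftMark's actual scheme or its defenses against re-instrumentation attacks.

```python
from math import factorial

def encode_watermark(functions: list, watermark: int) -> list:
    """Reorder functions so their order encodes `watermark` (0 <= watermark < n!)."""
    pool, order = list(functions), []
    n = len(pool)
    assert 0 <= watermark < factorial(n)
    for i in range(n, 0, -1):
        idx, watermark = divmod(watermark, factorial(i - 1))
        order.append(pool.pop(idx))
    return order

def decode_watermark(original: list, reordered: list) -> int:
    """Recover the identifier from the observed function order."""
    pool, value = list(original), 0
    for f in reordered:
        idx = pool.index(f)
        value += idx * factorial(len(pool) - 1)
        pool.pop(idx)
    return value

funcs = ["init", "parse", "verify", "dispatch", "cleanup"]  # 5! = 120 identifiers
layout = encode_watermark(funcs, 42)
assert decode_watermark(funcs, layout) == 42
print(layout)
```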
@inproceedings{softmark,author={Kang, Honggoo and Kwon, Yonghwi and Lee, Sangjin and Koo, Hyungjoon},title={SoftMark: Software Watermarking via a Binary Function Relocation},year={2021},month=dec,isbn={9781450385794},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3485832.3488027},doi={10.1145/3485832.3488027},booktitle={Proceedings of the 37th Annual Computer Security Applications Conference (ACSAC)},pages={169–181},numpages={13},keywords={Function Relocation, Watermark, Software Watermarking, Binary Instrumentation, Function Reordering},location={Virtual Event, USA},series={ACSAC '21},}
A function recognition problem serves as a basis for further binary analysis and many applications.
Although common challenges for function detection are well known, prior works have repeatedly claimed
noticeable results with high precision and recall. In this paper, we aim to fill the void of what has been
overlooked or misinterpreted by closely looking into the previous datasets, metrics, and evaluations
with varying case studies. Our major findings are that i) a common corpus like GNU utilities is
insufficient to represent the effectiveness of function identification, ii) it is difficult to claim,
at least in the current form, that an ML-oriented approach is scientifically superior to deterministic ones
like IDA or Ghidra, iii) the current metrics may not be reasonable enough to measure
varying function detection cases, and iv) the capability of recognizing functions depends on
each tool’s strategic or peculiar choices. We re-evaluate existing approaches on our own dataset,
demonstrating that no single state-of-the-art tool dominates all the others.
In conclusion, a function detection problem has not yet been fully addressed, and we need a better methodology
and metric to make advances in the field of function identification.
@inproceedings{lookback,author={Koo, Hyungjoon and Park, Soyeon and Kim, Taesoo},title={A Look Back on a Function Identification Problem},year={2021},month=dec,isbn={9781450385794},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3485832.3488018},doi={10.1145/3485832.3488018},booktitle={Proceedings of the 37th Annual Computer Security Applications Conference (ACSAC)},pages={158–168},numpages={11},keywords={Binary, ML-oriented, Function Recognition, Lookback, Function Identification},location={Virtual Event, USA},series={ACSAC '21},}
Today, a web browser plays a crucial role in offering a broad spectrum of web experiences.
The most popular browser, Chromium, has become an extremely complex application to meet ever-increasing user demands,
unavoidably exposing large attack vectors due to its large code base. Code debloating has attracted attention
as a means of reducing such a potential attack surface by eliminating unused code. However, it is very challenging
to perform sophisticated code removal without breaking needed functionalities because Chromium operates on
a large number of closely connected and complex components, such as a renderer and JavaScript engine.
In this paper, we present Slimium, a debloating framework for a browser (i.e., Chromium) that harnesses
a hybrid approach for a fast and reliable binary instrumentation. The main idea behind Slimium is to determine
a set of features as a debloating unit on top of a hybrid (i.e., static, dynamic, heuristic) analysis,
and then leverage feature subsetting for code debloating. It aids in i) focusing on security-oriented features,
ii) discarding unneeded code simply without complications, and iii) reasonably addressing a non-deterministic
path problem raised from code complexity. To this end, we generate a feature-code map with a relation vector
technique and prompt webpage profiling results. Our experimental results demonstrate the practicality
and feasibility of Slimium for 40 popular websites, as on average it removes 94 CVEs (61.4%) by cutting down
23.85 MB code (53.1%) from defined features (21.7% of the whole) in Chromium.
@inproceedings{slimium,author={Qian, Chenxiong and Koo, Hyungjoon and Oh, ChangSeok and Kim, Taesoo and Lee, Wenke},title={Slimium: Debloating the Chromium Browser with Feature Subsetting},year={2020},month=nov,isbn={9781450370899},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3372297.3417866},doi={10.1145/3372297.3417866},booktitle={Proceedings of the 27th ACM SIGSAC Conference on Computer and Communications Security (CCS)},pages={461–476},numpages={16},keywords={binary rewriting, program analysis, browser, debloating},location={Virtual Event, USA},series={CCS '20},}
With legitimate code becoming an attack surface due to the proliferation of code reuse attacks,
software debloating is an effective mitigation that reduces the amount of instruction sequences
that may be useful for an attacker, in addition to eliminating potentially exploitable bugs
in the removed code. Existing debloating approaches either statically remove code that is guaranteed
to not run (e.g., non-imported functions from shared libraries), or rely on profiling with
realistic workloads to pinpoint and keep only the subset of code that was executed.
In this work, we explore an alternative configuration-driven software debloating approach
that removes feature-specific code needed exclusively when certain configuration
directives, which are often disabled by default, are specified. Using a semi-automated approach,
our technique identifies libraries solely needed for the implementation of a particular functionality
and maps them to certain configuration directives. Based on this mapping, feature-specific libraries
are not loaded at all if their corresponding directives are disabled. The results of our experimental
evaluation with Nginx, VSFTPD, and OpenSSH show that using the default configuration in each case,
configuration-driven debloating can remove 77% of the code for Nginx, 53% for VSFTPD, and 20% for OpenSSH,
which represent a significant attack surface reduction.
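The mapping idea can be sketched as follows: once configuration directives are tied to feature-specific shared libraries, any library whose directives are all disabled need not be loaded; the directive and library names below are illustrative (Nginx-like) rather than the study's derived mapping.

```python
# Illustrative directive-to-library mapping (names are made up for this sketch);
# the study derives such mappings semi-automatically for Nginx, VSFTPD, and OpenSSH.
FEATURE_LIBS = {
    "gzip":  ["libz.so.1"],
    "ssl":   ["libssl.so.3", "libcrypto.so.3"],
    "geoip": ["libGeoIP.so.1"],
}

def enabled_directives(config_text: str) -> set:
    """Very rough parse: a directive counts as enabled if a line reads '<name> on'."""
    enabled = set()
    for line in config_text.splitlines():
        tokens = line.strip().rstrip(";").split()
        if len(tokens) >= 2 and tokens[0] in FEATURE_LIBS and tokens[1] == "on":
            enabled.add(tokens[0])
    return enabled

def libraries_to_skip(config_text: str) -> set:
    """Feature-specific libraries whose directives are all disabled need not be loaded."""
    on = enabled_directives(config_text)
    return {lib for directive, libs in FEATURE_LIBS.items() if directive not in on
            for lib in libs}

config = """
gzip off;
ssl on;
"""
print(libraries_to_skip(config))  # gzip- and geoip-related libraries can be skipped
```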
@inproceedings{conf-debloating,author={Koo, Hyungjoon and Ghavamnia, Seyedhamed and Polychronakis, Michalis},title={Configuration-Driven Software Debloating},year={2019},month=may,isbn={9781450362740},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3301417.3312501},doi={10.1145/3301417.3312501},booktitle={Proceedings of the 12th European Workshop on Systems Security (EuroSec)},numpages={6},location={Dresden, Germany},series={EuroSec '19},}
Despite decades of research on software diversification, only address space layout randomization
has seen widespread adoption. Code randomization, an effective defense against return-oriented programming exploits,
has remained an academic exercise mainly due to i) the lack of a transparent and streamlined deployment model
that does not disrupt existing software distribution norms, and ii) the inherent incompatibility of program variants
with error reporting, whitelisting, patching, and other operations that rely on code uniformity.
In this work, we present compiler-assisted code randomization (CCR), a hybrid approach that relies on
compiler–rewriter cooperation to enable fast and robust fine-grained code randomization on end-user systems,
while maintaining compatibility with existing software distribution models.
The main concept behind CCR is to augment binaries with a minimal set of transformation-assisting metadata,
which i) facilitate rapid fine-grained code transformation at installation or load time, and
ii) form the basis for reversing any applied code transformation when needed, to maintain compatibility with
existing mechanisms that rely on referencing the original code. We have implemented a prototype of this approach
by extending the LLVM compiler toolchain, and developing a simple binary rewriter that leverages the embedded metadata
to generate randomized variants using basic block reordering. The results of our experimental evaluation demonstrate
the feasibility and practicality of CCR, as on average it incurs a modest file size increase of 11.46% and
a negligible runtime overhead of 0.28%, while it is compatible with link-time optimization and control flow integrity.
@inproceedings{ccr,author={Koo, Hyungjoon and Chen, Yaohui and Lu, Long and Kemerlis, Vasileios P. and Polychronakis, Michalis},booktitle={Proceedings of the 39th IEEE Symposium on Security and Privacy (S&P)},title={Compiler-Assisted Code Randomization},year={2018},month=may,volume={},number={},pages={461-477},doi={10.1109/SP.2018.00029},}
Over the past few years, return-oriented programming (ROP) attacks have emerged as
a prominent strategy for hijacking control of software. The full power and flexibility of ROP attacks
were recently demonstrated using just-in-time ROP tactics (JIT-ROP), whereby an adversary repeatedly
leverages a memory disclosure vulnerability to identify useful instruction sequences and compile them
into a functional ROP payload at runtime. Since the advent of just-in-time code reuse attacks,
numerous proposals have surfaced for mitigating them, the most practical of which involve the
re-randomization of code at runtime or the destruction of gadgets upon their disclosure.
Even so, several avenues exist for performing code inference, which allows JIT-ROP attacks
to infer values at specific code locations without directly reading the memory contents of those bytes.
This is done by reloading code of interest or implicitly determining the state of randomized code.
These so-called “zombie gadgets” completely undermine defenses that rely on destroying code bytes
once they are read. To mitigate these attacks, we present a low-overhead, binary-compatible defense
which ensures an attacker is unable to execute gadgets that were identified through code reloading or
code inference. We have implemented a prototype of the proposed defense for closed-source Windows binaries,
and demonstrate that our approach effectively prevents zombie gadget attacks with negligible runtime overhead.
@inproceedings{rerand,author={Morton, Micah and Koo, Hyungjoon and Li, Forrest and Snow, Kevin Z. and Polychronakis, Michalis and Monrose, Fabian},editor={Bodden, Eric and Payer, Mathias and Athanasopoulos, Elias},title={Defeating Zombie Gadgets by Re-randomizing Code upon Disclosure},booktitle={Proceedings of the 9th International Symposium on Engineering Secure Software and Systems (ESSoS)},year={2017},month=jul,publisher={Springer International Publishing},address={Cham},pages={143--160},isbn={978-3-319-62105-0},}
The concept of destructive code reads is a new defensive strategy that prevents
code reuse attacks by coupling fine-grained address space layout randomization with a mitigation
for online knowledge gathering that destroys potentially useful gadgets as they are disclosed by an adversary.
The intuition is that by destroying code as it is read, an adversary is left with no usable gadgets
to reuse in a control-flow hijacking attack. In this paper, we examine the security of this new mitigation.
We show that while the concept initially appeared promising, there are several unforeseen attack tactics
that render destructive code reads ineffective in practice. Specifically, we introduce techniques
for leveraging constructive reloads, wherein multiple copies of native code are loaded into a process’
address space (either side-by-side or one-after-another). Constructive reloads allow the adversary
to disclose one code copy, destroying it in the process, then use another code copy
for their code reuse payload. For situations where constructive reloads are not viable,
we show that an alternative, and equally powerful, strategy exists: leveraging code association
via implicit reads, which allows an adversary to undo in-place code randomization by inferring
the layout of code that follows already disclosed bytes. As a result, the implicitly learned code
is not destroyed, and can be used in the adversary’s code reuse attack. We demonstrate the effectiveness
of our techniques with concrete instantiations of these attacks against popular applications.
In light of our successes, we argue that the code inference strategies presented herein paint
a cautionary tale for defensive approaches whose security blindly rests on the perceived inability
to undo the application of in-place randomization.
@inproceedings{zombie-gadgets,author={Snow, Kevin Z. and Rogowski, Roman and Werner, Jan and Koo, Hyungjoon and Monrose, Fabian and Polychronakis, Michalis},booktitle={Proceedings of the 37th IEEE Symposium on Security & Privacy (S&P)},title={Return to the Zombie Gadgets: Undermining Destructive Code Reads via Code Inference Attacks},year={2016},month=may,volume={},number={},pages={954-968},doi={10.1109/SP.2016.61},}
The Internet’s importance in promoting free and open communication has led to
widespread crackdowns on its use in countries around the world. In this study, we investigate
the relationship between national policies around freedom of speech and Internet topology
in various countries. We combine techniques from network measurement and machine learning
to identify features of Internet structure at the national level that are the best indicators
of a country’s level of freedom. We find that IP density and path lengths to other countries
are the best indicators of a country’s freedom. We also find that our methods predict
the freedom category (Free/Partly Free/Not Free) of a country with 95% accuracy.
@inproceedings{politics-of-routing,author={Singh, Rachee and Koo, Hyungjoon and Miramirkhani, Najmeh and Mirhaj, Fahimeh and Akoglu, Leman and Gill, Phillipa},title={The Politics of Routing: Investigating the Relationship between AS Connectivity and Internet Freedom},year={2016},month=aug,publisher={USENIX Association},url={},doi={},booktitle={Proceedings of the 6th USENIX Workshop on Free and Open Communications on the Internet (FOCI)},numpages={7},location={Austin, TX, USA},series={FOCI '16},}
Code diversification is an effective mitigation against return-oriented programming attacks,
which breaks the assumptions of attackers about the location and structure of useful instruction sequences,
known as "gadgets". Although a wide range of code diversification techniques of varying levels of
granularity exist, most of them rely on the availability of source code, debug symbols, or the assumption
of fully precise code disassembly, limiting their practical applicability for the protection of closed-source
third-party applications. In-place code randomization has been proposed as an alternative binary-compatible
diversification technique that is tolerant of partial disassembly coverage, at the expense, though, of leaving
some gadgets intact at the disposal of attackers. Consequently, the possibility of constructing robust
ROP payloads using only the remaining non-randomized gadgets is still open. In this paper, we present
instruction displacement, a code diversification technique based on static binary instrumentation that
does not rely on complete code disassembly coverage. Instruction displacement aims to improve
the randomization coverage and entropy of existing binary-level code diversification techniques
by displacing any remaining non-randomized gadgets to random locations. The results of
our experimental evaluation demonstrate that instruction displacement reduces the number of
non-randomized gadgets in the extracted code regions from 15.04% for standalone in-place code randomization,
to 2.77% for the combination of both techniques. At the same time, the additional indirection introduced
due to displacement incurs a negligible runtime overhead of 0.36% on average for the SPEC CPU2006 benchmarks.
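The core patching step can be sketched as follows, under the simplifying assumptions that a gadget spans at least five bytes and that its displaced (relocated) copy already exists; a real implementation must also relocate the moved instructions and handle shorter gadgets.

```python
import struct

JMP_OPCODE = 0xE9  # x86 near jump with a 32-bit relative displacement

def displace_gadget(code: bytearray, gadget_off: int, new_location: int) -> bytearray:
    """Overwrite a gadget with 'jmp rel32' pointing at its displaced copy.

    Simplification: assumes the gadget is at least 5 bytes long and that the
    displaced copy (with relocated instructions) already exists at new_location.
    """
    rel32 = new_location - (gadget_off + 5)          # relative to the next instruction
    code[gadget_off:gadget_off + 5] = bytes([JMP_OPCODE]) + struct.pack("<i", rel32)
    return code

# Toy 32-byte code region: displace the "gadget" at offset 8 to offset 24.
region = bytearray(32)
displace_gadget(region, gadget_off=8, new_location=24)
print(region[8:13].hex())  # e90b000000 -> jumps 11 bytes forward to offset 24
```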
@inproceedings{juggling,author={Koo, Hyungjoon and Polychronakis, Michalis},title={Juggling the Gadgets: Binary-Level Code Randomization Using Instruction Displacement},year={2016},month=may,isbn={9781450342339},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/2897845.2897863},doi={10.1145/2897845.2897863},booktitle={Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security (ASIACCS)},pages={23–34},numpages={12},keywords={return-oriented programming, code diversification},location={Xi'an, China},series={ASIA CCS '16},}
Traffic differentiation—giving better (or worse) performance to certain classes of
Internet traffic—is a well-known but poorly understood traffic management policy. There is
active discussion on whether and how ISPs should be allowed to differentiate Internet traffic,
but little data about current practices to inform this discussion. Previous work attempted
to address this problem for fixed line networks; however, there is currently no solution
that works in the more challenging mobile environment. In this paper, we present the design,
implementation, and evaluation of the first system and mobile app for identifying
traffic differentiation for arbitrary applications in the mobile environment
(i.e., wireless networks such as cellular and WiFi, used by smartphones and tablets).
The key idea is to use a VPN proxy to record and replay the network traffic generated
by arbitrary applications, and compare it with the network behavior when replaying
this traffic outside of an encrypted tunnel. We perform the first known testbed experiments
with actual commercial shaping devices to validate our system design and demonstrate
how it outperforms previous work for detecting differentiation. We released our app and
collected differentiation results from 12 ISPs in 5 countries. We find that differentiation
tends to affect TCP traffic (reducing rates by up to 60%) and that interference from middleboxes
(including video-transcoding devices) is pervasive. By exposing such behavior,
we hope to improve transparency for users and help inform future policies.
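The comparison step can be sketched with a two-sample Kolmogorov-Smirnov test over throughput samples from the exposed and VPN-tunneled replays; the threshold and toy data below are illustrative assumptions, and the record-and-replay machinery itself is omitted.

```python
import numpy as np
from scipy.stats import ks_2samp

def differentiation_suspected(exposed_kbps, tunneled_kbps, alpha=0.01):
    """Flag differentiation when throughput distributions differ significantly
    and the exposed replay is noticeably slower than the encrypted one."""
    stat, p_value = ks_2samp(exposed_kbps, tunneled_kbps)
    slower = np.median(exposed_kbps) < 0.9 * np.median(tunneled_kbps)
    return p_value < alpha and slower

# Toy throughput samples (kbps); in practice these would come from replaying the
# recorded app traffic inside and outside the VPN tunnel.
rng = np.random.default_rng(1)
tunneled = rng.normal(5000, 300, size=100)
exposed = rng.normal(2000, 300, size=100)   # shaped down to roughly 2 Mbps
print(differentiation_suspected(exposed, tunneled))  # True
```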
@inproceedings{traffic-differentiation,author={Molavi Kakhki, Arash and Razaghpanah, Abbas and Li, Anke and Koo, Hyungjoon and Golani, Rajesh and Choffnes, David and Gill, Phillipa and Mislove, Alan},title={Identifying Traffic Differentiation in Mobile Networks},year={2015},month=oct,isbn={9781450338486},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/2815675.2815691},doi={10.1145/2815675.2815691},booktitle={Proceedings of the 15th Internet Measurement Conference (IMC)},pages={239–251},numpages={13},keywords={network neutrality, traffic differentiation, mobile networks},location={Tokyo, Japan},series={IMC '15},}