Program Repair

Motivation: Automated Program Repair (APR) is an emerging technology that aims to relieve developers of the onerous burden of manually fixing bugs. A substantial number of APR techniques have been proposed over the years, with several breakthroughs that have inspired hope for practical adoption of APR. Unfortunately, developers' limited trust in APR-generated patches remains a challenge for achieving broader adoption of APR in industry.

Approach: This theme aims not only to enhance developers' trust in APR-generated patches by providing supporting artifacts, information, and evidence about those patches, but also to investigate open issues regarding the trustworthiness of APR systems.

  • Test Overfitting: One key factor compromising the trustworthiness of APR is the absence of comprehensive specifications for validating the correctness of APR-generated patches. A common approach relies on developer-written test cases as correctness specifications. However, the incompleteness of test suites often leads to the overfitting problem, in which an APR-generated patch satisfies the test-based specification but is still incorrect (a toy illustration of this phenomenon appears after the publication list below). Because unreliable overfitting patches cause developers to lose trust in APR tools, the overfitting problem is an important challenge in enhancing trust in APR systems. To address this problem, my colleagues and I proposed Invalidator (TSE'23), an automated method that reasons about the correctness of APR-generated patches via program invariants and code representations. We further developed PatchZero (submitted to TSE), which utilizes large pre-trained code models along with an Instance-wise Tailored Demonstration and In-context Learning Inference for a zero-shot setting, in which the patches to be assessed are generated by a new/unseen APR tool.
  • Robustness: Another factor contributing to trust issues is the limited size of evaluation datasets: existing Neural Program Repair (NPR) evaluation datasets typically contain fewer than a thousand bugs. Such small datasets struggle to adequately represent real-world bugs, raising concerns about the robustness of APR tools on unseen bugs. To address this issue, we proposed the automated tools MiDas (TSE'23) and VulCurator (FSE'22), which identify vulnerability-fixing commits based on their source code and related artifacts, such as issues and commit messages. These tools allow us to mine vulnerability-fixing commits and build more comprehensive benchmarks for program repair (a minimal sketch of such a commit filter appears after the publication list below).
  • Explainability: Lastly, APR tools, especially Neural Program Repair techniques that rely on deep learning models, usually operate in a black-box manner. The opacity of these tools often leaves developers without a clear understanding of, and uncertain about, APR-generated patches. To address this issue, I am also interested in self-explainable APR systems. In particular, I want to create APR systems that can automatically provide explanations for their generated patches. I am still looking for good ideas in this direction.
  • Related Publications

    [Arxiv] Evaluating Program Repair with Semantic-Preserving Transformations: A Naturalness Assessment

    [TSE'24] Leveraging Large Language Model for Automatic Patch Correctness Assessment

    [TSE-ICSE'24] Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning

    [TSE'23] MiDas: Multi-Granularity Detector for Vulnerability Fixes

    [ICSME'22] FFL: Fine-grained Fault Localization for Student Programs via Syntactic and Semantic Reasoning

    [ESEC/FSE'22] VulCurator: A Vulnerability-Fixing Commit Detector

    [ISSRE'21] Usability and Aesthetics: Better Together for Automated Repair of Web Pages
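
The sketch below is a toy illustration of the test-overfitting problem discussed above. The function, test suite, and patches are hypothetical and invented purely for exposition; they are not taken from Invalidator, PatchZero, or any APR benchmark.

```python
# Toy illustration of test overfitting (hypothetical example, not from any benchmark).

def buggy_abs(x):
    # Bug: forgets to negate negative inputs.
    return x

# Incomplete developer-written test suite used as the correctness "specification".
TESTS = [(-3, 3), (-7, 7)]  # only negative inputs are exercised

def overfitting_patch(x):
    # Plausible patch: passes every test above, yet is incorrect in general,
    # because the suite never checks a non-negative input (abs(4) should be 4).
    return -x

def correct_patch(x):
    # Genuinely correct patch.
    return -x if x < 0 else x

def passes_suite(candidate):
    """Return True if the candidate satisfies the (incomplete) test suite."""
    return all(candidate(inp) == expected for inp, expected in TESTS)

if __name__ == "__main__":
    assert not passes_suite(buggy_abs)        # the bug is caught by the tests
    assert passes_suite(overfitting_patch)    # accepted by the tests...
    assert overfitting_patch(4) != 4          # ...but wrong on an unseen input
    assert passes_suite(correct_patch) and correct_patch(4) == 4
```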
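
As a complement, here is a deliberately crude, hypothetical keyword baseline for flagging vulnerability-fixing commits, included only to make the task concrete. It does not reflect how MiDas or VulCurator actually work, since those tools learn from source code and related artifacts rather than relying on simple keyword matching.

```python
import re

# Keyword heuristics, assumed purely for illustration; MiDas and VulCurator instead
# learn from code changes, commit messages, and linked issues.
SECURITY_PATTERN = re.compile(
    r"cve-\d{4}-\d+|vulnerab|exploit|overflow|injection|use[- ]after[- ]free",
    re.IGNORECASE,
)

def looks_like_vulnerability_fix(commit_message: str) -> bool:
    """Crude baseline: flag a commit as vulnerability-fixing if its message
    mentions a CVE identifier or a common security keyword."""
    return bool(SECURITY_PATTERN.search(commit_message))

if __name__ == "__main__":
    print(looks_like_vulnerability_fix("Fix buffer overflow in parser (CVE-2021-12345)"))  # True
    print(looks_like_vulnerability_fix("Refactor logging configuration"))                  # False
```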