Building Implicit Vector Representations of Individual Coding Style

Vladimir Kovalenko, Egor Bogomolov, Timofey Bryksin, and Alberto Bacchelli

July 2020. Published in the proceedings of CHASE'20 (Workshop).

Abstract. With the goal of facilitating team collaboration, we propose a new approach to building vector representations of individual developers by capturing their individual contribution style, or coding style. Such representations can find use in the next generation of software development team collaboration tools, for example by enabling the tools to track knowledge transfer in teams. The key idea of our approach is to avoid using explicitly defined metrics of coding style and instead build the representations through training a model for authorship recognition and extracting the representations of individual developers from the trained model. By empirically evaluating the output of our approach, we find that implicitly built individual representations reflect some properties of team structure: developers who report learning from each other are represented closer to each other.
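
The key idea in the abstract, training an authorship recognition model and then reading developer representations out of it, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the model or data from the paper: the features are synthetic three-dimensional vectors standing in for code changes, the classifier is a plain softmax regression, and the row of the weight matrix belonging to each developer is treated as that developer's vector. The names `train_authorship_model` and `cosine_similarity` are hypothetical.

```python
import numpy as np


def train_authorship_model(X, y, n_authors, lr=0.5, epochs=300):
    """Fit a softmax regression that predicts the author of each sample.

    Returns a weight matrix with one row per developer. In this sketch,
    those rows play the role of implicit developer representations.
    """
    n_samples, n_features = X.shape
    W = np.zeros((n_authors, n_features))
    onehot = np.eye(n_authors)[y]
    for _ in range(epochs):
        logits = X @ W.T
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy loss with respect to W.
        grad = (probs - onehot).T @ X / n_samples
        W -= lr * grad
    return W


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in for code-change features (an assumption, not real data):
# developers 0 and 1 produce similar changes, developer 2 differs.
rng = np.random.default_rng(0)
centers = np.array([
    [2.0, 0.0, 0.0],
    [1.8, 0.3, 0.0],
    [0.0, 0.0, 2.0],
])
y = np.repeat(np.arange(3), 200)
X = centers[y] + rng.normal(scale=0.5, size=(600, 3))

# No style metric is defined by hand: the vectors come from the trained model.
developer_vectors = train_authorship_model(X, y, n_authors=3)

sim_01 = cosine_similarity(developer_vectors[0], developer_vectors[1])
sim_02 = cosine_similarity(developer_vectors[0], developer_vectors[2])
print(f"similarity(dev0, dev1) = {sim_01:.3f}")
print(f"similarity(dev0, dev2) = {sim_02:.3f}")
```

In this toy setting, the two developers with similar changes end up with more similar vectors than the dissimilar pair, which mirrors the kind of closeness comparison the abstract describes for developers who report learning from each other.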
