April 2010
Journal of Beijing University of Aeronautics and Astronautics, Vol. 36, No. 4
Li Hu (李虎), Liu Chao (刘超), Liu Nan (刘楠), Li Xiaoli (李晓丽)
(School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100191)
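Abstract: A plagiarism detection method applicable to both Java source code and bytecode is proposed, and a supporting system is implemented.
The method takes the Java file or class file of a class as the unit of comparison, extracts from it five kinds of feature vectors that represent the syntactic and semantic features of the program,
and combines them to compute a similarity between two class files, which helps determine whether one class file plagiarizes the other in whole or in part.
Results of a comparative experiment on manually modified programs and of a plagiarism detection experiment show that
the method effectively detects both exact and near copies of program code, achieves high detection performance, and recognizes
most types of code transformation applied to Java source files in program plagiarism.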
Key words: plagiarism detection; Java source code; Java bytecode; similarity measurement
CLC number: TP 311.5
Document code: A    Article ID: (2010)04-0424-05
Method and its system of Java source and byte code plagiarism detection
Li Hu, Liu Chao, Liu Nan, Li Xiaoli
(School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100191, China)
Abstract: A plagiarism detection approach to detect both Java source code and byte code was proposed.
The proposed method compares Java source files or class files by multiple similarity measures developed to represent
the syntax structures and semantic features of the programs. An efficient plagiarism detection tool using
the proposed technique was developed to analyze plagiarism behavior of Java source code or class code. Statistical
analysis and several graphical visualizations aid in the interpretation of analysis results. An experimental
comparison with a typical commercial source code plagiarism detection tool as well as a case study by applying
the tool to plagiarism detection with a set of manually modified programs were conducted. Experiment results
show that the tool is more efficient and the proposed technique can recognize both exact copy and appro