
Lomonosov Moscow State University
Faculty of Computational Mathematics and Cybernetics
Department of Algorithmic Languages

Report.
Code clones detection and its using.

Asiryan Alexander Kamoevich
Group 524
Moscow, 2017

Contents

INTRODUCTION
BACKGROUND
  Clone types
  Code clone detection approaches
  Program Dependence Graph
PDG GENERATION
CLONE DETECTION
  PDGs' splitting
  Fast checks
  Metrics based clone detection
  Slice based clone detection
  Tree based clone detection
  Differences of clone detection methods
  Filtration
AUTOMATIC CLONE GENERATION FOR TESTING
CONCLUSION
BIBLIOGRAPHY

1. INTRODUCTION

Software developers often reuse the same fragments of code many times, making small modifications. Tight deadlines usually increase copy-paste activity, which increases the number of code clones. Code cloning can lead to many semantic errors. For example, a developer can forget to rename a variable after a copy-paste. Software that contains many clones is likely to contain many mistakes and to be of low quality.

According to different studies, up to 20 percent of the source code in a software project can consist of clones. Clone detection tools are widely used:
- during software development, to avoid mistakes and improve quality;
- for automatic refactoring;
- for code size optimization;
- for detection of semantic mistakes.

The goal of this research is to introduce an LLVM-based code clone detection framework.

In the first stage of the tool's work, PDGs are generated for each function in each source file of the project. They are constructed from the LLVM bitcode intermediate representation during compilation of the project. This approach allows the PDGs of the project to be generated very quickly, without analyzing the source code a second time. The second stage is responsible for splitting the PDGs into small subgraphs. The third stage analyzes the PDGs to detect code clones.

The third stage contains a number of new algorithms for detecting similar subgraphs. Thanks to the combination of these algorithms, the tool scales to the analysis of millions of lines of source code. The last stage is filtering of the results.

2. BACKGROUND

2.1. Clone types

There are three basic types of clones. The first type (T1) is code fragments that are identical except for variations in whitespace (and possibly layout) and comments. The second type (T2) is structurally and syntactically identical code fragments that differ only in identifiers, literals, types, layout and comments. The third type (T3) is copied code fragments with further modifications: statements can be changed, added or removed, in addition to variations in identifiers, literals, types, layout and comments.

2.2. Code clone detection approaches

There are five basic approaches to code clone detection.

1. Methods based on the textual approach treat the source code as text and try to find equal substrings. These substrings are clones. When all clones are found, clones located near each other can be combined into one. Mostly (T1) clones are detected this way.
2. In the lexical approach, the source code is parsed into a sequence of tokens. Then the longest common subsequence is determined. There are a few efficient clone detection algorithms based on the parameterized suffix tree. Another interesting method transforms Java code into an intermediate representation and compares the results instead of the original source. Algorithms of this kind mostly find (T1) and (T2) clones.
3. The next is the syntactic approach. The algorithm works on the Abstract Syntax Tree (AST). In this case clones are matching subtrees of the AST. Some algorithms directly compare two ASTs to find common subtrees. Another algorithm constructs vectors for AST subtrees and compares them. Algorithms based on this approach find all three types of clones.
4. Metrics-based algorithms are widely used for clone detection. Algorithms based on this method compute a number of metrics for each code fragment and compare them. Usually these metrics are computed for the AST and the Program Dependence Graph (PDG). Another method clusters the computed metrics using neural networks. Metrics-based algorithms have better performance than AST or PDG comparison algorithms, but low accuracy.
5. The last is the semantic approach. The source code is parsed into a PDG. Nodes of the PDG are program instructions; edges of the PDG are dependences between instructions. PDG-based algorithms try to find maximal isomorphic subgraphs for a pair of PDGs. All such algorithms are approximate, because the maximal isomorphic subgraph detection problem is NP-hard. PDG-based methods have high accuracy but low performance.

2.3. Program Dependence Graph

A program can be represented as a program dependence graph (PDG). Like the Abstract Syntax Tree (AST) and the Control Flow Graph (CFG), it is one of the most common code representations; it shows the dependencies between statements and predicates as a directed graph. The advantage of the PDG over other standard representations such as the CFG and the AST is that the PDG shows data relationships explicitly, in contrast to the CFG, whereas in the AST control flow is presented only implicitly.

3. PDG GENERATION

PDGs for the project are generated from the LLVM intermediate representation, called bitcode. A separate LLVM pass is added to generate these graphs.
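To make the graph itself concrete before the construction details, here is a minimal sketch of a PDG as a data structure: instructions are vertices, and each directed edge is labeled as a data or a control dependence. The class and method names, the pseudo-instructions, and the choice to point data edges from the defining instruction to its user are assumptions made for illustration; this is not the tool's actual representation.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeKind(Enum):
    DATA = "data"
    CONTROL = "control"


@dataclass
class PDG:
    """Directed graph: vertices are instructions, edges are dependences."""
    instructions: dict[int, str] = field(default_factory=dict)
    edges: set[tuple[int, int, EdgeKind]] = field(default_factory=set)

    def add_instruction(self, vid: int, text: str) -> None:
        self.instructions[vid] = text

    def add_edge(self, src: int, dst: int, kind: EdgeKind) -> None:
        if src not in self.instructions or dst not in self.instructions:
            raise KeyError("both endpoints must be existing vertices")
        self.edges.add((src, dst, kind))

    def successors(self, vid: int, kind: EdgeKind) -> list[int]:
        """Vertices that depend on `vid` through an edge of the given kind."""
        return sorted(d for s, d, k in self.edges if s == vid and k is kind)


def build_example() -> PDG:
    # Hypothetical pseudo-instructions for: if (a > 0) b = a + 1;
    g = PDG()
    g.add_instruction(0, "%a = load a")
    g.add_instruction(1, "%c = icmp sgt %a, 0")
    g.add_instruction(2, "br %c, then, end")
    g.add_instruction(3, "%s = add %a, 1")
    g.add_edge(0, 1, EdgeKind.DATA)     # the comparison uses %a
    g.add_edge(1, 2, EdgeKind.DATA)     # the branch uses %c
    g.add_edge(0, 3, EdgeKind.DATA)     # the addition uses %a
    g.add_edge(2, 3, EdgeKind.CONTROL)  # the addition runs only if the branch is taken
    return g
```

In this sketch the data edges out of vertex 0 make the reuse of `%a` explicit, while the single control edge records that the addition depends on the branch.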

This approach has several advantages. Graphs are generated at compile time of the project, which makes it possible to construct graphs efficiently for large-scale projects (millions of lines of source code). Vertices of the PDG are LLVM bitcode instructions. Edges are obtained from the LLVM use-def, alias and control flow analyses. Vertices that have no edges are removed, after which the optimized PDGs are stored to files.

Edges indicate dependencies between instructions and are of two kinds: data and control. Edges responsible for control flow are control-dependence edges; they are drawn from the instruction that transfers control to the instruction to which control is transferred. They can be built in three ways:

1. Only edges that represent transitions between basic blocks are constructed.
2. Edges are constructed not only between basic blocks but also within them, between successive instructions.
3. Edges are constructed between basic blocks, but are drawn to all the instructions of the basic block, not just the first.

Edges showing data dependence are constructed in two ways:

1. Use-def analysis: the edges represent the relation between an LLVM instruction and its operands.
2. Alias analysis: there are three types of relationships:
   - True dependence: the first instruction writes to memory from which the second instruction then reads.
   - Anti-dependence: the first instruction reads memory to which the second instruction then writes.
   - Output dependence: the first instruction writes to memory to which the second instruction also writes.

In addition, there must be no statement that overwrites the memory used by the first instruction and may lie on an execution path between the two instructions.
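The three memory dependence kinds and the "no intervening overwrite" condition can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's alias analysis: memory locations are plain names and two accesses alias only when the names are equal, the code is a straight-line sequence (so "on an execution path between them" means "at an index strictly between them"), and `MemAccess` and `memory_dependences` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemAccess:
    """Hypothetical summary of the memory one instruction reads and writes."""
    reads: frozenset[str] = frozenset()
    writes: frozenset[str] = frozenset()


def memory_dependences(seq: list[MemAccess], i: int, j: int) -> set[str]:
    """Dependence kinds from seq[i] to seq[j] (i < j) in straight-line code."""
    if not 0 <= i < j < len(seq):
        raise ValueError("need 0 <= i < j < len(seq)")
    first, second = seq[i], seq[j]

    # Locations overwritten by some instruction strictly between the two.
    killed: set[str] = set()
    for k in range(i + 1, j):
        killed |= seq[k].writes

    kinds: set[str] = set()
    if (first.writes & second.reads) - killed:
        kinds.add("true")    # write, then read
    if (first.reads & second.writes) - killed:
        kinds.add("anti")    # read, then write
    if (first.writes & second.writes) - killed:
        kinds.add("output")  # write, then write
    return kinds


# store x; load x; store x; load x
program = [
    MemAccess(writes=frozenset({"x"})),
    MemAccess(reads=frozenset({"x"})),
    MemAccess(writes=frozenset({"x"})),
    MemAccess(reads=frozenset({"x"})),
]
```

For `program`, the pair (0, 1) is a true dependence, (1, 2) an anti-dependence and (0, 2) an output dependence, while (0, 3) yields no edge because the store at index 2 overwrites `x` in between.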

All dependence edges are directed from the first instruction to the second.

LLVM provides a compiler API and a large set of optimization libraries. Because of this, many programming languages provide translation of source code to LLVM bitcode, so the developed tool can be applied to all of these languages. The PDG is uniform across all supported languages, which allows code clones to be detected across different languages.

4. CLONE DETECTION

Clone detection is a multistage process. First, the generated PDGs are loaded into memory; then four basic steps are performed. The first step is splitting the PDGs into subgraphs.

These subgraphs are considered potential clones of each other. The second step is the application of fast check algorithms. These algorithms have linear complexity and try to prove that the considered pair of PDGs cannot have sufficiently large isomorphic subgraphs. The third step is the detection of maximal isomorphic subgraphs. New algorithms, based on slices, metrics and trees, are proposed for this purpose. The fourth step is filtration of the obtained pairs of maximal isomorphic subgraphs. The last step is printing the source code that corresponds to the isomorphic subgraphs as clones.

4.1. PDGs' splitting

Three methods are implemented for splitting.
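Returning to the second step of the overview above, one possible linear-time fast check can be sketched as follows. The report does not say which checks the tool uses, so this is only an illustration of the idea: assuming that matched vertices must carry the same opcode, the number of vertices in any common isomorphic subgraph cannot exceed the size of the multiset intersection of the two opcode lists. The function names and the `min_size` threshold are hypothetical.

```python
from collections import Counter


def common_vertex_upper_bound(ops_a: list[str], ops_b: list[str]) -> int:
    """Upper bound on the vertex count of an opcode-preserving common subgraph.

    Each opcode can be matched at most min(count in A, count in B) times,
    so the bound is the size of the multiset intersection. Runs in time
    linear in the total number of vertices.
    """
    return sum((Counter(ops_a) & Counter(ops_b)).values())


def may_contain_clone(ops_a: list[str], ops_b: list[str], min_size: int) -> bool:
    """Return False only when a clone of at least `min_size` vertices is impossible."""
    return common_vertex_upper_bound(ops_a, ops_b) >= min_size


graph_a = ["load", "add", "store", "br"]
graph_b = ["load", "mul", "store", "ret"]
```

Here `graph_a` and `graph_b` share only `load` and `store`, so a pair like this could be discarded for any `min_size` above 2 without running the expensive subgraph search. The check is one-sided: passing it does not prove that a clone exists.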
