2020, 2020 International Joint Conference on Neural Networks (IJCNN)
During the development cycle of a project, it is common for software requirements and functionality to change and for code errors to occur. To deal with these unforeseen changes, the artifact known as a change request, which is a formal proposal to alter a system, is used. Its assignment is an important step in the development process. Projects can receive a very high number of requests daily, which makes the automation of this process compelling. This work proposes a method for assigning unresolved requests based on developers' profiles. The proposed method consists of three steps. The first step is to extract code quality metrics, commit data and previously resolved requests, in order to model developers through the mining of repositories. The second step concerns the selection of the profiles of potential developers through the application of natural language processing and information retrieval techniques. Finally, in the third step the appropriate developers are selected...
Journal of Software: Evolution and Process, 2011
The paper presents an approach to recommend a ranked list of expert developers to assist in the implementation of software change requests (e.g., bug reports and feature requests). An Information Retrieval (IR)-based concept location technique is first used to locate source code entities, e.g., files and classes, relevant to a given textual description of a change request. The previous commits from version control repositories of these entities are then mined for expert developers. The role of the IR method in selectively reducing the mining space is different from previous approaches that textually index past change requests and/or commits. The approach is evaluated on change requests from three open-source systems: ArgoUML, Eclipse, and KOffice, across a range of accuracy criteria. The results show that the overall accuracies of the correctly recommended developers are between 47 and 96% for bug reports, and between 43 and 60% for feature requests. Moreover, comparison results with two other recommendation alternatives show that the presented approach outperforms them by a substantial margin. Project leads or developers can use this approach in maintenance tasks immediately after the receipt of a change request in a free-form text.
[...] a bug, or a feature. Clearly, this activity is reactive and may not necessarily yield an effective or efficient answer. An active developer of ArgoUML, where this activity is manual, stated that they would welcome any tool that would lead to a more enjoyable and efficient job experience and is not perceived as a hindrance. In open-source software development, where much relies on volunteers, a tool that automatically mapped change requests to appropriate developers could serve as a catalyst. That is, developers would not have to wade through the numerous change requests to find what they can contribute to; they would instead be presented with a 'filtered' set of change requests that suits their palates.
Both help seekers and sustained software evolution in such a situation would greatly benefit from a proactive approach that automatically recommends the appropriate developers based solely on information available in textual change requests. Change requests are typically specified in a free-form textual description using natural language (e.g., a bug reported to the Bugzilla system of a software project).
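The two-stage idea in the abstract above (first locate the source files relevant to a change request, then mine the commits on those files for developers) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: plain term-frequency cosine similarity stands in for the paper's IR-based concept location technique, and the file names, file texts and commit authors are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()

def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend_developers(request, file_texts, commit_authors, top_files=2):
    # Stage 1: rank files by textual similarity to the change request.
    query = Counter(tokenize(request))
    ranked = sorted(
        file_texts,
        key=lambda f: cosine(query, Counter(tokenize(file_texts[f]))),
        reverse=True,
    )[:top_files]
    # Stage 2: count past commit authors on the top-ranked files only.
    votes = Counter()
    for name in ranked:
        votes.update(commit_authors.get(name, []))
    return [dev for dev, _ in votes.most_common()]

# Hypothetical data, not taken from any of the evaluated systems.
file_texts = {
    "Diagram.java": "diagram canvas render zoom",
    "Parser.java": "parse xmi model import",
    "Logger.java": "log file writer",
}
commit_authors = {
    "Diagram.java": ["ann", "ann", "raj"],
    "Parser.java": ["raj"],
    "Logger.java": ["lee"],
}
print(recommend_developers("Zoom on the diagram canvas is broken",
                           file_texts, commit_authors, top_files=1))
```

Restricting the commit mining to the top-ranked files is what the abstract calls reducing the mining space; the `top_files` cutoff is a hypothetical parameter for this sketch.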
International Journal of Advanced Trends in Computer Science and Engineering, 2021
GitHub is a software development platform that promotes collaboration and joint effort in project development. Developers typically explore related projects to reuse functionality, and while exploring they come across features that might help them in their own work. Recommending projects that are similar to a developer's work can save time. Yet it is difficult to find relevant projects among the large pool of projects on GitHub. Besides, every user may have different requirements and preferences for a project. A recommendation system saves developers from spending time searching for projects that can help them in their development. In this paper, we propose an interactive and customized recommendation approach that recognizes software project features and developer behavior. The proposed approach automatically suggests the top-N most similar software projects. The results, using data crawled from GitHub, show that our proposed approach suggests relevant projects with high recall at cutoffs 5 and 10.
Journal of Intelligent Information Systems, 2017
Code reviews consist of proofreading proposed code changes in order to find their shortcomings, such as bugs, insufficient test coverage or misused design patterns. Code reviews are conducted before merging submitted changes into the main development branch. The selection of suitable reviewers is crucial to obtaining high-quality reviews. In this article we present a new method of recommending reviewers for code changes. This method is based on profiles of individual programmers. For each developer we maintain his/her profile. It is the multiset of all file path segments from commits reviewed by him/her. It is updated when he/she presents a new review. We employ a similarity function between such profiles and change proposals to be reviewed. The programmer whose profile matches the change most closely is recommended to become the reviewer. We performed an experimental comparison of our method against state-of-the-art techniques using four large open-source projects. We obtained improved results in terms of classification metrics (precision, recall and F-measure) and performance (we have lower time and space complexity).
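The profile idea in the abstract above can be sketched as follows. The abstract does not say which similarity function the authors use, so the multiset Jaccard index below is an assumption chosen for illustration, and the developer names and file paths are hypothetical.

```python
from collections import Counter

def path_segments(paths):
    # Build a multiset (Counter) of path segments from a list of file paths.
    segments = Counter()
    for path in paths:
        segments.update(part for part in path.split("/") if part)
    return segments

def multiset_jaccard(a, b):
    # |A ∩ B| / |A ∪ B| for multisets: Counter '&' takes the minimum
    # count per element and '|' takes the maximum.
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0

def recommend_reviewer(profiles, changed_paths):
    # Pick the developer whose profile is most similar to the change.
    change = path_segments(changed_paths)
    return max(profiles, key=lambda dev: multiset_jaccard(profiles[dev], change))

# Hypothetical review history: each profile is built from reviewed paths.
profiles = {
    "alice": path_segments(["src/ui/button.py", "src/ui/dialog.py"]),
    "bob": path_segments(["src/db/schema.py", "src/db/query.py"]),
}
print(recommend_reviewer(profiles, ["src/ui/menu.py"]))
```

Updating a profile after a new review amounts to calling `profiles[dev].update(path_segments(new_paths))`, which matches the incremental maintenance the abstract describes.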
DÜMF Mühendislik Dergisi
The main goal of this research is to develop a hybrid recommendation methodology that improves the accuracy of existing systems. Such a recommendation approach is normally used to assign or propose a small number of programmers or developers who are capable of resolving bug reports generated automatically from an open-source bug repository; that is, the system decides which developers should be put in charge of fixing the bugs described in a report. Bug triage in bug repositories concerns the activities that developers carry out during program maintenance to fix specific bugs. Because many bugs are reported daily and the number of developers required is large, it is difficult to identify the right developers to resolve a specific bug in the code. The article also aims to improve on the accuracy obtained by existing traditional approaches. In addition, we consider the prioritization of developers: a ranking of developers' past achievements serves as prior knowledge that helps the system assign bug reports. The results show that taking developer importance into account can better support the bug triager and help bugs get fixed quickly and within the required time during the development and support cycles of the software.
Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Practice
Code reviewing is a commonly used practice in software development. It refers to the process of reviewing new code changes before they are merged with the code base. However, to perform the review, developers are mostly assigned manually to code changes. This may lead to problems such as a time-consuming selection process, a limited pool of known candidates and a risk of over-allocation of a few reviewers. To address these problems, we developed Carrot, a machine learning-based tool to recommend code reviewers. We conducted an improvement case study at Ericsson. We evaluated Carrot using a mixed approach. We evaluated the prediction accuracy using historical data and the Mean Reciprocal Rank (MRR) metric. Furthermore, we deployed the tool in one Ericsson project and evaluated how adequate the recommendations were from the point of view of the tool users and the recommended reviewers. We also asked senior developers for their opinion on the usefulness of the tool. The results show that Carrot can help identify relevant non-obvious reviewers and be of great assistance to new developers. However, there were mixed opinions on Carrot's ability to assist with workload balancing and to decrease code review lead time.
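For readers unfamiliar with the Mean Reciprocal Rank mentioned above, a minimal sketch of the standard definition follows. The function name and the sample data are hypothetical and are not taken from the Carrot study.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    # For each recommendation list, take 1/rank of the first relevant
    # candidate (0 if none is relevant), then average over all lists.
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

# Two hypothetical code changes: the actual reviewer is ranked first
# for one change and second for the other.
recommendations = [["ann", "raj"], ["lee", "kim"]]
actual_reviewers = [{"ann"}, {"kim"}]
print(mean_reciprocal_rank(recommendations, actual_reviewers))
```

A value of 1.0 would mean that an actual reviewer was always ranked first; lower values mean the first correct reviewer tends to appear further down the list.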
arXiv (Cornell University), 2023
Developers often struggle to navigate an Open Source Software (OSS) project's issue-tracking system and find a suitable task. Proper issue labeling can aid task selection, but current tools are limited to classifying the issues according to their type (e.g., bug, question, good first issue, feature, etc.). In contrast, this paper presents a tool (GiveMeLabeledIssues) that mines project repositories and labels issues based on the skills required to solve them. We leverage the domain of the APIs involved in the solution (e.g., User Interface (UI), Test, Databases (DB), etc.) as a proxy for the required skills. GiveMeLabeledIssues facilitates matching developers' skills to tasks, reducing the burden on project maintainers. The tool obtained a precision of 83.9% when predicting the API domains involved in the issues. The replication package contains instructions on executing the tool and including new projects.
Advances in Intelligent Systems and Computing, 2021
The growing complexity of software and associated code makes it difficult for software developers to produce high-quality code in a timely fashion. However, the process of assessing code quality can be automated with the help of software code metrics, which are quantitative measures of code properties. The software metrics consist of several attributes that describe the source code, including lines of code, program length, the effort required, the difficulty involved, cyclomatic complexity, volume, vocabulary, intelligence count, and so on. With the help of these features, code can be classified as well written or badly written. This study focuses on evaluating the performance of the main classification algorithms: Naïve Bayes, K-nearest neighbors (KNN), logistic regression, stochastic gradient descent (SGD) classifier, support vector machine (SVM), and decision tree (D-Tree), on thirteen of the NASA Metrics Data Program (MDP) datasets. The research work also focuses on understanding the math and working of each of the classifiers and the quality of each dataset. The comparison measures for evaluating the classifiers include the confusion matrix and other derived measures, namely F-measure, recall, precision, accuracy, and Matthews correlation coefficient (MCC). The best model is chosen along with the appropriate dataset. To allow developers to use the trained model, we created Code Buddy, a SharePoint web portal that allows developers either to assess code quality by sending a review request to a colleague or to assess the code automatically using a trained model, which predicts whether the code is well written or badly written. Moreover, if the developer is not satisfied with the results, he/she can send a review request to a colleague, who can review the code and provide a review comment on it.
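The derived measures named above all follow from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the sample counts are hypothetical and do not come from the study.

```python
import math

def derived_measures(tp, fp, fn, tn):
    # tp/fp/fn/tn: true positives, false positives, false negatives,
    # true negatives. Undefined ratios fall back to 0.0.
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts for a "badly written" vs "well written" classifier.
print(derived_measures(tp=8, fp=2, fn=2, tn=8))
```

MCC is often preferred on imbalanced defect datasets because it uses all four cells, whereas accuracy can look high when one class dominates.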
Text mining has many applications and can work in coordination with other computing paradigms. A great deal of software is available for various domains, such as web browsers, utility applications and special-purpose applications. Installing and checking all of this software is cumbersome, so there is a need for a robust approach that can classify the available software based on its description. The study will propose a model for better classification of software by domain using descriptions and semantic similarity, which results in a recommendation system for software repositories. Moreover, the proposed model will be extended to provide an automated model for assigning bugs to a pool of developers for any application. This will be done by considering the description and specialty of each developer. The model will definitely enhance the user experience of software selection and bug removal.
Bug resolution refers to the activity that developers perform to diagnose, fix, test, and document bugs during software development and maintenance. It is a collaborative activity among developers who contribute their knowledge, ideas, and expertise to resolve bugs. Given a bug report, we would like to recommend the set of bug resolvers that could potentially contribute their knowledge to fix it. We refer to this problem as developer recommendation for bug resolution.
Proceedings of the 29th ACM/IEEE international conference on Automated software engineering - ASE '14, 2014
Change Requests (CRs) are key elements of software maintenance and evolution. Finding the appropriate developer for a CR is crucial for obtaining the lowest economically feasible fixing time. Nevertheless, assigning CRs is a labor-intensive and time-consuming task. In this paper, we present a semi-automated CR assignment approach which combines rule-based and information retrieval techniques. The approach emphasizes the use of contextual information, essential to effective assignments, and puts the development team in control of the assignment rules, toward making its adoption easier. Results of an empirical evaluation showed that the approach is up to 46.5% more accurate than approaches which rely solely on machine learning techniques.
arXiv (Cornell University), 2023
Context: Recent research has used data mining to develop techniques that can guide developers through source code changes. To the best of our knowledge, very few studies have investigated data mining techniques and/or compared their results with other algorithms or a baseline. Objectives: This paper proposes an automatic method for recommending source code changes using four data mining algorithms. We not only use these algorithms to recommend source code changes, but we also conduct an empirical evaluation. Methods: Our investigation includes seven open-source projects from which we extracted source change history at the file level. We used four widely used data mining algorithms, i.e., Apriori, FP-Growth, Eclat, and Relim, to compare the algorithms in terms of performance (Precision, Recall and F-measure) and execution time. Results: Our findings provide empirical evidence that while some Frequent Pattern Mining algorithms, such as Apriori, may outperform other algorithms in some cases, the results are not consistent throughout all the software projects, which is more likely due to the nature and characteristics of the studied projects, in particular their change history. Conclusion: Apriori seems appropriate for large-scale projects, whereas
2012 28th IEEE International Conference on Software Maintenance (ICSM), 2012
There is a tremendous wealth of code authorship information available in source code. Motivated by the presence of this information in a number of open source projects, an approach to recommend expert developers to assist with a software change request (e.g., a bug fix or feature) is presented. It employs a combination of an information retrieval technique and processing of the source code authorship information. The source code files relevant to the textual description of a change request are first located. The authors listed in the header comments in these files are then analyzed to arrive at a ranked list of the most suitable developers. The approach fundamentally differs from its previously reported counterparts, as it does not require software repository mining. It requires neither training from past bugs/issues, which is often done with sophisticated techniques such as machine learning, nor mining of source code repositories, i.e., commits. An empirical study to evaluate the effectiveness of the approach on three open source systems, ArgoUML, JEdit, and MuCommander, is reported. Our approach is compared with two representative approaches: 1) using machine learning on past bug reports, and 2) based on commit logs. The presented approach is found to provide recommendation accuracies that are equivalent to or better than those of the two compared approaches. These findings are encouraging, as they open up a promising and orthogonal possibility of recommending developers without the need for any historical change information.
Proceedings of the 22nd International Conference on Program Comprehension, 2014
The paper presents an approach, namely iMacPro, to recommend developers who are most likely to implement incoming change requests. iMacPro amalgamates the textual similarity between the given change request and source code, change proneness information, authors, and maintainers of a software system. Latent Semantic Indexing (LSI) and a lightweight analysis of source code, and its commits from the software repository, are used. The basic premise of iMacPro is that the authors and maintainers of the relevant source code, which is change prone, to a given change request are most likely to best assist with its resolution. iMacPro unifies these sources in a unique way to perform its task, which was not investigated and reported in the literature previously. An empirical study on three open source systems, ArgoUML, JabRef, and jEdit, was conducted to assess the effectiveness of iMacPro. A number of change requests from these systems were used in the evaluated benchmark. Recall values for top one, five, and ten recommended developers are reported. Furthermore, a comparative study with a previous approach that uses the source-code authorship information for developer recommendation was performed. Results show that iMacPro could provide recall gains from 30% to 180% over its subjected competitor with statistical significance.
International Journal of Advanced Computer Science and Applications
Improvements and acceleration in software development have contributed to high-quality services in all domains and fields of industry, increasing the demand for high-quality software development. To meet this demand, the software development industry is adopting highly skilled human resources, advanced methodologies and technologies to accelerate the development life cycle. In the software development life cycle, one of the biggest challenges is change management between versions of the source code. Versioning of the source code can have various causes, such as changes in requirements, functional updates or technological upgrades. Change management affects not only the correctness of a software release but also the number of test cases. It is often observed that the development life cycle is delayed by a lack of proper version control and by the repetitive testing iterations that improper version control causes. Hence the demand for better version-control-driven test case reduction methods cannot be ignored. A number of version control mechanisms have been proposed in parallel research efforts. Nevertheless, most of them are criticized for not contributing to test case generation or reduction. This work therefore proposes a novel probabilistic refactoring detection and rule-based test case reduction method to simplify testing and version control in software development. Refactoring is widely adopted by software developers for making efficient changes to code structure or functionality, or for applying changes in requirements. This work demonstrates very high accuracy for change detection and management, which results in higher accuracy for test case reduction. The final outcome of this work is to reduce software development time, making software development more efficient.
International Conferences on Software Engineering and Knowledge Engineering, 2019
The software development workflow typically involves developers executing tasks and manipulating artifacts. When developers receive a new task, they typically envision a task context with the artifacts they intend to manipulate based on their past experiences. Given that software projects may last several months, accumulating a vast number of tasks, artifacts and developers, envisioning this initial task context may be difficult and error-prone. Developers have to walk through months of past experiences or examine the experience of other developers, select similar tasks and then define the initial context. This paper introduces a method that helps developers define the initial task context by combining interaction information over artifacts with text information of tasks. First, the method uses clustering to organize project tasks into similar groups by interaction with artifacts. Then, the method uses natural language processing to associate a new task with groups of similar tasks by interaction. The evaluation shows that the clustering of similar tasks by interaction produces similar tasks assigned with artifacts that will be edited by new tasks. The association of new tasks with similar groups by interaction indicates a correlation between textual similarity and interaction similarity.
Proceedings of the 19th International Conference on Enterprise Information Systems, 2017
Software development has become an essential activity for organizations that increasingly rely on software to manage their business. However, poor software quality reduces customer satisfaction, while high-quality software can reduce repairs and rework by more than 50 percent. Software development is now seen as a collaborative and technology-dependent activity performed by a group of people. For all these reasons, correctly choosing the members of software development teams can be decisive. Given this motivation, classifying participants into different profiles can be useful during project team formation and task distribution. This paper presents a developer modeling approach based on software quality metrics. Quality metrics are dynamically collected. Those metrics compose the developer model. A machine learning-based method is presented. Results show that it is possible to use quality metrics to model developers.
IEEE Transactions on Software Engineering, 2004
Software developers are often faced with modification tasks that involve source which is spread across a code base. Some dependencies between source code, such as those between source code written in different languages, are difficult to determine using existing static and dynamic analyses. To augment existing analyses and to help developers identify relevant source code during a modification task, we have developed an approach that applies data mining techniques to determine change patterns (sets of files that were changed together frequently in the past) from the change history of the code base. Our hypothesis is that the change patterns can be used to recommend potentially relevant source code to a developer performing a modification task. We show that this approach can reveal valuable dependencies by applying the approach to the Eclipse and Mozilla open source projects and by evaluating the predictability and interestingness of the recommendations produced for actual modification tasks on these systems.
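The notion of change patterns described above can be illustrated with a deliberately simplified sketch. It counts only pairs of files that changed together, whereas the paper applies data mining techniques to find frequent sets of any size; the `min_support` threshold, the function names and the commit data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_change_pairs(commits, min_support=2):
    # Count how often each pair of files appears in the same commit and
    # keep the pairs that occur at least min_support times.
    counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

def recommend_files(patterns, changed_file):
    # Suggest files that frequently changed together with changed_file,
    # most frequent first.
    related = Counter()
    for (a, b), n in patterns.items():
        if a == changed_file:
            related[b] = n
        elif b == changed_file:
            related[a] = n
    return [name for name, _ in related.most_common()]

# Hypothetical change history: each inner list is one commit.
commits = [
    ["Editor.java", "plugin.xml"],
    ["Editor.java", "plugin.xml", "Parser.java"],
    ["Editor.java", "Parser.java"],
    ["Build.java"],
]
patterns = co_change_pairs(commits)
print(recommend_files(patterns, "plugin.xml"))
```

The Java and XML pairing in the sample data mirrors the abstract's point that co-change history can expose dependencies across languages that static analysis misses.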
Software development processes need an integrated environment that fulfills specific developer needs. In this context, this paper describes the modeling approach SAGM (Similarity for Adaptive Guidance Model), which provides adaptive recursive guidance for software processes, specifically tailored to the profiles of developers. A profile is defined from a model of developers, through their roles, their qualifications, and the relationships between the context of the current activity and the model of the activities. This approach presents a similarity measure that evaluates the similarities between the profiles created from the model of developers and those of the development team involved in the execution of a software process. The aim is to classify the profiles and to deduce the appropriate type of assistance (which can be corrective, constructive or specific) for developers.
Recently, frequent and sequential pattern mining algorithms have been widely used in software engineering to mine various source code or specification patterns. In practice, software evolves from one version to another to provide additional facilities to users. Mining in this domain is challenging because the database is usually updated in many ways, including insertions, modifications and removals of sequences. If the database is optimized, the optimized information will help developers in their development process and save valuable time and development expense. Some existing algorithms are used to optimize the database, but they do not work fast when the database is updated incrementally. To overcome this challenge, an efficient structure called the Canonical Order Tree was recently introduced, which captures the content of the transactions of the database and orders. In this paper, we propose a technique based on the Canonical Order Tree that can find frequent patterns in an incremental database quickly and efficiently. Thus the database is optimized and also yields useful information for making recommendations to software developers.