Indonesian Journal of Electrical Engineering and Computer Science
The current paper proposes a novel type of decision tree that has never been used for software development cost prediction (SDCP): the cluster-based fuzzy regression tree (CFRT). This model uses fuzzy k-means (FKM), which deals with data uncertainty and imprecision. The tree is expanded according to a variability measure, by choosing the node with the highest value of granulation diversity. The paper outlines an experimental study comparing CFRT with four SDCP methods, namely linear regression, multi-layer perceptron, K-nearest-neighbors, and classification and regression trees (CART), employing eight datasets and leave-one-out cross-validation (LOOCV). The results show that CFRT is among the best, ranked first in three datasets according to four accuracy measures. Also, according to the Pred(25%) values, the proposed CFRT model outperformed all the twelve compared techniques in four datasets: Albrecht, constructive cost model (COCOMO), Desharnais, and The International Sof...
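As a rough illustration of the leave-one-out cross-validation (LOOCV) protocol mentioned in this abstract, the sketch below holds out each project in turn, fits a model on the remaining projects, and predicts the held-out effort. The one-feature least-squares model and the function names `fit_predict_linear` and `loocv_predictions` are assumptions made for illustration; they are not the CFRT model or the paper's code.

```python
from typing import Callable, List, Sequence


def fit_predict_linear(train_x: Sequence[float], train_y: Sequence[float],
                       query_x: float) -> float:
    """Fit y = a + b*x by ordinary least squares, then predict for query_x.

    Hypothetical stand-in model: one size-like feature, one effort target.
    """
    n = len(train_x)
    mean_x = sum(train_x) / n
    mean_y = sum(train_y) / n
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    if sxx == 0:
        # All training sizes identical: fall back to the mean effort.
        return mean_y
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(train_x, train_y)) / sxx
    return mean_y + slope * (query_x - mean_x)


def loocv_predictions(xs: Sequence[float], ys: Sequence[float],
                      fit_predict: Callable[[List[float], List[float], float], float]
                      ) -> List[float]:
    """Return one prediction per project, each made without seeing that project."""
    xs, ys = list(xs), list(ys)
    predictions = []
    for i in range(len(xs)):
        # Training set = every project except the i-th one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predictions.append(fit_predict(train_x, train_y, xs[i]))
    return predictions
```

Any estimator with the same `(train_x, train_y, query_x)` signature could be passed in place of `fit_predict_linear`; the resulting predictions can then be scored with accuracy measures such as Pred(25%).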
2009
This work addresses the issue of software effort prediction via fuzzy decision trees generated using historical project data samples. Moreover, the effect that various numerical and nominal project characteristics used as predictors have on software development effort is investigated using the classification rules extracted. The approach attempts to classify past project data successfully into homogeneous clusters to provide accurate and reliable cost estimates within each cluster. The CHAID and CART algorithms are applied to approximately 1000 project cost data records, which were analyzed, preprocessed and used for generating fuzzy decision tree instances, followed by an evaluation method assessing the prediction accuracy achieved by the classification rules produced. Even though the experimentation follows a heuristic approach, the trees built were found to fit the data properly, while the predicted effort values approximate the actual effort well.
This paper addresses the issue of software cost estimation through fuzzy decision trees, aiming at acquiring accurate and reliable effort estimates for project resource allocation and control. Two algorithms, namely CHAID and CART, are applied to empirical software cost data recorded in the ISBSG repository. Approximately 1000 project data records are selected for analysis and experimentation, with fuzzy decision tree instances being generated and evaluated based on prediction accuracy. The set of association rules extracted is used to provide mean effort value ranges. The experimental results suggest that the proposed approach may provide accurate cost predictions in terms of effort. In addition, there is strong evidence that the fuzzy transformation of cost drivers contributes to enhancing the estimation process.
The financial health of many organizations nowadays is affected by investment in software and its cost estimation. Providing effective cost estimation models is therefore one of the most complex activities in software engineering. This paper presents a fuzzy clustering and optimization model for software cost estimation. The proposed model uses the Pearson product-moment correlation coefficient and one-way ANOVA to select several effort adjustment factors. It then applies the fuzzy C-means clustering algorithm for project clustering. The parameters of the COCOMO model are then optimized using a Multi-objective Genetic Algorithm (MOGA). Two objectives are considered: minimizing the Mean Magnitude of Relative Error (MMRE) and maximizing the Prediction (PRED). The model has been tested on the COCOMO dataset. The optimization result has also been compared with the Multi-objective Particle Swarm Optimization (MOPSO) algorithm. The result shows the superiority of MOGA in parameter optimization for improving the accuracy of software cost estimation.
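The two accuracy measures named in this abstract are commonly defined as follows: the magnitude of relative error of one project is |actual - predicted| / actual, MMRE is its mean over all projects, and PRED(l) is the fraction of projects whose relative error is at most l (often l = 0.25). A minimal sketch under those usual definitions is given below; the function names are illustrative and not taken from any of the listed papers.

```python
from typing import Sequence


def mre(actual: float, predicted: float) -> float:
    """Magnitude of relative error for one project (actual effort must be > 0)."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - predicted) / actual


def mmre(actuals: Sequence[float], predictions: Sequence[float]) -> float:
    """Mean magnitude of relative error over all projects (lower is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def pred(actuals: Sequence[float], predictions: Sequence[float],
         level: float = 0.25) -> float:
    """Fraction of projects whose MRE is at most `level` (higher is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(1 for e in errors if e <= level) / len(errors)
```

Because MMRE is minimized while PRED is maximized, the two measures pull in compatible but not identical directions, which is one reason a multi-objective optimizer can be applied to them.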
2013 Third International Conference on Communications and Information Technology (ICCIT), 2013
Accurate software effort estimation has been a challenge for many software practitioners and project managers. Underestimation leads to disruption in the project's estimated cost and delivery. On the other hand, overestimation causes outbidding and financial losses in business. Many software estimation models exist; however, none have been proven to be the best in all situations. In this paper, a decision tree forest (DTF) model is compared to a traditional decision tree (DT) model, as well as a multiple linear regression model (MLR). The evaluation was conducted using ISBSG and Desharnais industrial datasets. Results show that the DTF model is competitive and can be used as an alternative in software effort prediction.
International Journal of Advanced Computer Science and Applications
The role of decision trees in software development effort estimation (SDEE) has received increased attention across several disciplines in recent years thanks to their predictive power, ease of use, and ease of understanding. Furthermore, a large number of published studies have investigated the use of decision tree (DT) techniques in SDEE. Nevertheless, a systematic literature review (SLR) that assesses the evidence reported on DT techniques is still lacking. The main issues addressed in this paper are divided into five parts: prediction accuracy, performance comparison, suitable conditions of prediction, the effect of the methods employed in association with DT techniques, and DT tools. To carry out this SLR, we performed an automatic search over five digital libraries for studies published between 1985 and 2019. In general, the results of this SLR revealed that most DT methods outperform many techniques and show an improvement in accuracy when combined with association rules (AR), fuzzy logic (FL), and bagging. Additionally, a limited use of DT tools has been observed; researchers are therefore encouraged to develop more DT tools to promote the industrial use of DT among professionals.
IFIP International Federation for Information Processing, 2006
This paper suggests several estimation guidelines for the choice of a suitable machine learning technique for software development effort estimation. Initially, the paper presents a review of relevant published studies, pointing out pros and cons of specific machine learning methods. The techniques considered are Association Rules, Classification and Regression Trees, Bayesian Belief Networks, Neural Networks and Clustering, and they are compared in terms of accuracy, comprehensibility, applicability, causality and sensitivity. Finally the study proposes guidelines for choosing the appropriate technique, based on the size of the training data and the desirable features of the extracted estimation model.
Journal of Systems and Software, 2008
Parametric software cost estimation models are based on mathematical relations, obtained from the study of historical software projects databases, that intend to be useful to estimate the effort and time required to develop a software product. Those databases often integrate data coming from projects of a heterogeneous nature. This entails that it is difficult to obtain a reasonably reliable single parametric model for the range of diverging project sizes and characteristics. A solution proposed elsewhere for that problem was the use of segmented ...
World Applied Sciences …, 2011
Project planning plays a significant role in software projects, and imprecise estimations often lead to project faults or dramatic outcomes for the project team. In recent years, various methods have been proposed to estimate software development effort accurately. Among the proposed methods, the non-algorithmic methods that use soft computing techniques have presented considerable results. The complexity and uncertain behavior of software projects are the main reasons for moving toward soft computing techniques. In this paper, a hybrid system combining C-Means clustering, a neural network and the analogy method is proposed. Since there are complicated and non-linear relations among software project features, the proposed method can be useful for interpreting such relations and presenting more accurate estimations. The obtained results showed that fuzzy clustering could decrease the negative effect of irrelevant projects on the accuracy of estimations. In addition, evaluation of the proposed hybrid method showed a significant improvement in accuracy compared to the neural network, the analogy method and statistical methods.
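To make the fuzzy clustering step concrete, the sketch below computes the standard fuzzy C-means membership of one project in each cluster, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), where d_i is the distance to center i and m > 1 is the fuzzifier. This is the textbook membership formula, shown as an assumption about what such a hybrid system would use; the function name and the Euclidean distance choice are illustrative.

```python
import math
from typing import List, Sequence


def fcm_memberships(point: Sequence[float],
                    centers: Sequence[Sequence[float]],
                    m: float = 2.0) -> List[float]:
    """Fuzzy C-means membership degrees of `point` in each cluster center.

    The returned degrees are non-negative and sum to 1.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    on_center = [i for i, d in enumerate(dists) if d == 0.0]
    if on_center:
        share = 1.0 / len(on_center)
        return [share if i in on_center else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
```

A project far from every center receives low, spread-out memberships, which is one way a fuzzy clustering step can reduce the influence of irrelevant projects on an estimate.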
2020
Software cost prediction is the technique of accurately evaluating the cost of developing software. Estimation involves the total time required to complete the software, the effort required, which is measured in person-months (PM), and the total cost to complete the activity. Accuracy and duration are the two desirable criteria in the software estimation process. In the software estimation process, several inputs are fed to the system, and these inputs are used to generate or calculate the set of outputs. An important task of software project managers at present is the computation of cost or effort before the actual development of any particular software. Several methods are applied for software cost estimation, but we will focus on fuzzy logic, which is a soft-computing method. We feel that model which is based on fuzzy logic for the software cost estimation should be able to give the uncertain values ...
2015
Software cost estimation is the process of predicting the effort required to develop a software system. Many estimation models have been proposed over the last 30 years. With the development of software engineering, the estimation of project cost and duration has become very important work. It plays an important role in project bidding and project planning. Many papers have been published on this topic, which aim at predicting the costs of projects to a tolerable degree of accuracy at an early stage. In this paper, several existing fuzzy logic methods for software cost estimation are illustrated and compared with the intermediate COCOMO model. The comparison of the methods' features could be applied to clustering them by ability and is also useful for selecting a specific method for each project.
Correct cost estimation is a very important task for software companies and their executives. Predicting the required effort, the estimated time of delivery and the cost of a project with high accuracy is still a vast challenge for project managers. For many years, algorithmic models like the Constructive Cost Model (COCOMO) family were used for required effort prediction. Nowadays, intelligent methods have many applications in software cost estimation (SCE). In this study, we first give an introduction to SCE and its requirements. Then, we propose a new intelligent approach that uses an additive regression model to classify the training and testing instances on NASA projects. The results show that our proposed system is more effective. Performance is evaluated by comparing the COCOMO and additive-regression-based classifier results.
International Journal of Advanced Computer Science and Applications, 2011
Web effort estimation is the process of predicting the effort and cost, in terms of money, schedule and staff, of any software project. Many estimation models have been proposed over the last three decades, and estimation is believed to be a must for budgeting, risk analysis, project planning and control, and project improvement investment analysis. In this paper, we investigate the use of the Fuzzy ID3 decision tree for software cost estimation. It is designed by integrating the principles of the ID3 decision tree and fuzzy set-theoretic concepts, enabling the model to handle uncertain and imprecise data when describing software projects, which can greatly improve the accuracy of the obtained estimates. MMRE and Pred are used as measures of prediction accuracy for this study. A series of experiments is reported using the Tukutuku software projects dataset. The results are compared with those produced by three crisp versions of decision trees: ID3, C4.5 and CART.
A tool named 2CEE was developed for building new software cost estimation models using data mining techniques. The accuracy of these models has been validated internally through N-Fold Cross Validation (also called Leave One Out Cross Validation). However, these newly developed models have also been validated in the real world through various studies. This paper investigates how well machine learning algorithms such as Naive Bayes classification and neural networks can be applied to cost estimation, using the COCOMO II cost model for comparison. A new project is estimated by comparing it with past project data, or with the cost drivers in past project data, with the help of the data mining tool Weka.
2010
A new software cost estimation approach is proposed in this paper, which attempts to cluster empirical, non-homogenous project data samples via an entropy-based fuzzy k-modes clustering algorithm. The target is to identify groups of projects sharing similar characteristics in terms of cost attributes or descriptors, and utilise this grouping information to provide estimations of the effort needed for a new project that is classified in a certain group. The effort estimates produced address the uncertainty and fuzziness of the clustering process by yielding interval predictions based on the mean and standard deviation of the samples having strong membership within a cluster. Empirical validation of the proposed methodology was conducted using a filtered version of the ISBSG dataset and yielded encouraging results both in terms of practical usage of the clustered groups and of approximating effectively project costs.
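The interval predictions described in this abstract can be sketched roughly as follows: keep only the projects whose membership in a cluster is strong, then report mean minus and plus one standard deviation of their efforts. The membership threshold of 0.7, the use of the population standard deviation, and the function name are all assumptions made for illustration; the abstract does not state these details.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def cluster_effort_interval(efforts: Sequence[float],
                            memberships: Sequence[float],
                            threshold: float = 0.7) -> Tuple[float, float]:
    """Effort interval for one cluster from its strongly associated projects.

    `memberships[i]` is the fuzzy membership of project i in this cluster.
    The threshold is a hypothetical cut-off for "strong" membership.
    """
    strong = [e for e, u in zip(efforts, memberships) if u >= threshold]
    if not strong:
        raise ValueError("no project reaches the membership threshold")
    centre = mean(strong)
    spread = pstdev(strong)  # population standard deviation (assumed choice)
    return centre - spread, centre + spread
```

A new project classified into the cluster would then receive this interval, rather than a single point value, as its effort estimate.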
One of the challenges faced by managers in the software industry today is accurately defining the requirements of software projects early in the development phase. Cost-benefit analysis forms the basis of planning and decision making throughout the software development lifecycle. As such, efficient software cost estimation techniques are needed to make any endeavor viable. Software cost estimation is the process of predicting the amount of effort required to build a software project. In this paper we propose a Particle Swarm Optimization (PSO) technique that operates on data sets clustered using the K-means clustering algorithm. PSO is employed to generate the parameters of the COCOMO model for each cluster of data values. The clusters and effort parameters are then used to train a neural network with the back-propagation technique for classification of the data. We tested the model on the COCOMO 81 dataset and compared the obtained values with the standard COCOMO model. By making use of the experience from neural networks and the efficient tuning of parameters by PSO operating on clusters, the proposed model is able to generate better results and can be applied efficiently to larger data sets.
2011
Web effort estimation is the process of predicting the effort and cost, in terms of money, schedule and staff, of any software project. Many estimation models have been proposed over the last three decades, and estimation is believed to be a must for budgeting, risk analysis, project planning and control, and project improvement investment analysis. In this paper, we investigate the use of the Fuzzy ID3 decision tree for software cost estimation; it is designed by integrating the principles of the ID3 decision tree and fuzzy set-theoretic concepts, enabling the model to handle uncertain and imprecise data when describing software projects, which can greatly improve the accuracy of the obtained estimates. MMRE and Pred are used as measures of prediction accuracy for this study. A series of experiments is reported using two different software project datasets, namely the Tukutuku and COCOMO’81 datasets. The results are compared with those produced by the crisp version of the...
JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer), 2021
Software development involves several interrelated factors that influence development effort and productivity. Improving the estimation techniques available to project managers will facilitate more effective time and budget control in software development. Software effort estimation, or software cost/effort estimation, can help a software development company overcome the difficulties experienced in estimating software development effort. This study aims to compare the machine learning methods Linear Regression (LR), Multilayer Perceptron (MLP), Radial Basis Function (RBF), and Decision Tree Random Forest (DTRF) for estimating software cost/effort. These five approaches will then be tested on 10 software development project datasets, so as to produce new knowledge about which machine learning and non-machine learning methods are the most accurate for estimating software effort. As well as knowing between the selection between using Par...
2013
Accurate software development effort estimation is critical to the success of software projects. Although many techniques and algorithmic models have been developed and implemented by practitioners, accurate software development effort prediction is still a challenging endeavor in the field of software engineering, especially in handling uncertain and imprecise inputs and collinear characteristics. In this paper, a hybrid intelligent model combining a neural network model with a fuzzy model (a neuro-fuzzy model) has been used to improve the accuracy of software cost estimation. The performance of the proposed model is assessed by designing and conducting an evaluation with published project and industrial data. Results have shown that the proposed model improves the estimation accuracy by 18% based on the Mean Magnitude of Relative Error (MMRE) criterion.
Procedia Technology, 2012
Software Cost Estimation (SCE) has been one of the important topics in software production in recent decades. Realistic estimation requires cost and effort factors in producing software, using algorithmic or Artificial Intelligence (AI) techniques. Boehm developed the Constructive Cost Model (COCOMO), which is one of the algorithmic SCE models. The model comes in three increasingly detailed forms: basic, intermediate and detailed. Basic COCOMO is suitable for quick, early, rough order-of-magnitude estimates of the effort required to produce software, but its accuracy is limited because it lacks factors to account for differences between cost drivers. Intermediate COCOMO takes these project attributes into account. In addition, detailed COCOMO accounts for the individual project phases. The COCOMO family of algorithmic techniques has been used since 1981. In recent years, some techniques have emerged that use intelligent methods to estimate the effort required to produce software. In this paper, different data mining techniques for estimating software costs are presented, and the results of each technique are evaluated and compared. NASA project data are used to train and test each of these techniques. We compare the estimation accuracy of the COCOMO model with that of the data mining techniques. The results indicate that data mining techniques improve the estimation accuracy of the models in many cases, so the estimated effort is improved in these models.
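For reference, the basic form of COCOMO estimates effort in person-months as a · KLOC^b, where KLOC is the size in thousands of lines of code and (a, b) depend on the project mode. The sketch below uses the coefficients published for the basic COCOMO 81 model (organic 2.4 and 1.05, semi-detached 3.0 and 1.12, embedded 3.6 and 1.20); these values come from that published model, not from this abstract, and the function name is illustrative.

```python
# Basic COCOMO 81 coefficients (a, b) per project mode.
BASIC_COCOMO_COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kloc: float, mode: str = "organic") -> float:
    """Estimated effort in person-months for a project of `kloc` thousand lines.

    Basic COCOMO uses size only; it applies no cost-driver adjustment,
    which is the limitation the intermediate form addresses.
    """
    if kloc <= 0:
        raise ValueError("size in KLOC must be positive")
    try:
        a, b = BASIC_COCOMO_COEFFICIENTS[mode]
    except KeyError:
        raise ValueError(f"unknown project mode: {mode!r}") from None
    return a * kloc ** b
```

Intermediate COCOMO multiplies an estimate of this form by an effort adjustment factor derived from the cost drivers, which is where the project attributes mentioned above enter.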
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2011
This empirical study investigates two commonly used decision tree classification algorithms in the context of cost-sensitive learning. A review of the literature shows that the cost-based performance of a software quality prediction model is usually determined after the model-training process has been completed. In contrast, we incorporate cost-sensitive learning during the model-training process. The C4.5 and Random Forest decision tree algorithms are used to build defect predictors either with, or without, any cost-sensitive learning technique. The paper investigates six different cost-sensitive learning techniques: AdaCost, Adc2, Csb2, MetaCost, Weighting, and Random Undersampling (RUS). The case study data include 15 software measurement datasets obtained from several high-assurance systems. In addition to providing a unique insight into the cost-based performance of defect prediction models, this study is one of the first to use misclassification cost as a parameter during the model-training process. The practical appeal of this research is that it provides a software quality practitioner with a clear process for how to consider (during model training) and analyze (during model evaluation) the cost-based performance of a defect prediction model. RUS is ranked as the best cost-sensitive technique among those considered in this study.