DuBois, P J (1996) An efficient method to accurately predict general condition costs in the building construction industry. Unpublished PhD thesis, Colorado School of Mines, USA.
Abstract
Building construction estimators use simple (one predictor variable) regression analysis to predict general condition costs for construction projects. This dissertation develops a method that combines historical data (a database) with multivariate (multiple predictor variables) regression analysis to predict the future general condition costs of projects. The method incorporates a simple four-step model-building procedure that is found in many regression analysis textbooks. The steps are: 1. Data collection and preparation of predictor variables. 2. Data collection and preparation of response variables. 3. Model refinement and selection. 4. Model validation. One apparently obvious predictor variable is building type. Others, such as project duration and percent of total revenue, are predictor variables often used in the industry to predict general condition cost in the traditional simple regression estimating method. These predictor variables have stood the test of time. By adding an easily obtainable parameter like gross square footage, combined with the functional transposition and interaction of these basic predictor variables, we have a set of predictor variables that may improve general condition estimating. They are: … [omitted for legibility] … The response variables are given. They are the historic general condition costs provided by the sponsor. The data provided has 82 material cost categories and 40 labor cost categories and includes five years of projects. Due to the similarities of some cost categories and the non-routine nature of others, the data was modified, resulting in 25 material and 15 labor cost categories, all in constant dollars. Model refinement and selection is a two-step process that incorporates commercially available statistical software (Minitab). 
The command "Best Regress" conducts a regression analysis on every combination of independent variables (8191 combinations for 13 predictor variables) and selects the best variables for the model based on the Rsquare. Acceptable adjusted Rsquare values are in the 70-100 range. Once the best variables are selected (typically 6 to 7), the command "Regress" is used to determine the coefficient that corresponds to each variable and the model is formed. The criterion used for model validation is the mean absolute deviation (MAD). We define success as MADestimator approximately equal to MADmodel. The estimator has an advantage of zero error for cost-plus jobs, and engineers and managers usually stop charging cost categories once the actual cost equals the estimated cost. Hence, obtaining comparable MADs is a desirable goal. The first results obtained are from hospital building types. Unfortunately, the adjusted Rsquares were all below 70 except for one cost category examined. Due to these low adjusted Rsquares, we were not optimistic about the results obtained when calculating the MADs. Surprisingly, 12 out of the 18 cost categories examined had MADestimator > MADmodel, indicating the models performed better than the traditional estimator. However, a pairwise t test indicated that the MADs were about the same. Therefore, the criterion was satisfied by performing as well as the traditional estimators. Due to the surprising success of having models with low adjusted Rsquares but fairly good MADs, a search for a method to improve the adjusted Rsquares was made. The search resulted in re-sorting the data into two building types ('new builds' and 'remodel') rather than the previous nine building types. The results are impressive. In the 'new build' building types all cost categories had adjusted Rsquares greater than 70 (with the majority in the 80-100 range) except for four. 
The 'remodel' building types were equally good, if not better, with all but two cost categories having adjusted Rsquares greater than 70 and the majority in the 80-100 range. With the improvement in the adjusted Rsquares, one would expect the MADs to improve significantly. In both the 'new build' and 'remodel' building types, the MADestimator is significantly greater than the MADmodel, indicating the model outperformed the traditional estimator. For the 'new build' cost categories, the MADmodel was better than the MADestimator in 30 out of the 35 cost categories. For remodel projects, the model outperformed the traditional estimator in 17 out of the 20 cost categories. In dollar terms, the model was $2,145,156.93 more accurate than the traditional estimators. Multivariate regression analysis estimates the general condition cost better than traditional estimating methods. The results indicate multivariate regression analysis is a viable tool for estimating general condition costs. Companies employing this methodology can use it confidently in either of two ways. First, the methodology can be used as a check of traditional estimating methods. We realize that some estimators must become familiar with the methodology before implementing it on its own. The other implementation option is to use the methodology as the primary predictor, with spot-checks by the estimator. In either case, the results indicate the methodology should be used by the construction industry.
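
The selection and validation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's Minitab procedure: the function names (`fit_ols`, `best_subset`, `mad`), the choice of adjusted Rsquare as the ranking score, and the synthetic data are all assumptions made for illustration. The sketch fits an ordinary least-squares model to every non-empty subset of predictor columns, keeps the subset with the highest adjusted Rsquare (reported on a 0-100 scale), and computes the mean absolute deviation between predicted and actual costs.

```python
import random
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * p for a, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    XtX = [[sum(a[i] * a[j] for a in A) for j in range(p)] for i in range(p)]
    Xty = [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(beta, X):
    """Predicted response for each row of X given coefficients beta."""
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]

def adjusted_r2(X, y, beta):
    """Adjusted R-square on a 0-100 scale."""
    n, p = len(y), len(beta) - 1
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, predict(beta, X)))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 100.0 * (1.0 - (1.0 - r2) * (n - 1) / (n - p - 1))

def best_subset(X, y):
    """Fit every non-empty subset of columns; return the best fit and the count tried."""
    k = len(X[0])
    best, tried = None, 0
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            tried += 1
            Xs = [[row[c] for c in cols] for row in X]
            beta = fit_ols(Xs, y)
            score = adjusted_r2(Xs, y, beta)
            if best is None or score > best[0]:
                best = (score, cols, beta)
    return best, tried

def mad(estimates, actuals):
    """Mean absolute deviation between estimated and actual costs."""
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(actuals)

if __name__ == "__main__":
    # Synthetic data (assumption): 4 candidate predictors, 2 of which matter.
    rng = random.Random(0)
    X = [[rng.uniform(0, 10) for _ in range(4)] for _ in range(40)]
    y = [3.0 + 2.0 * r[0] - 1.5 * r[2] + rng.gauss(0, 0.1) for r in X]
    (score, cols, beta), tried = best_subset(X, y)
    Xs = [[r[c] for c in cols] for r in X]
    print("subsets tried:", tried)  # 2**k - 1 subsets for k predictors
    print("selected columns:", cols, "adjusted Rsquare:", round(score, 2))
    print("MAD of model:", mad(predict(beta, Xs), y))
```

With k candidate predictors the search fits 2^k - 1 models, which for 13 predictors gives the 8191 combinations quoted in the abstract. The exhaustive loop is practical at that size but grows exponentially, and the normal-equations solver used here is only suitable for small, well-conditioned data.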
Item Type: | Thesis (Doctoral) |
---|---|
Uncontrolled Keywords: | market; uncertainty; construction project; markets; productivity; project cost; client; United States; net present value; case study |
Date Deposited: | 16 Apr 2025 19:22 |
Last Modified: | 16 Apr 2025 19:22 |