Dots of Tech Perception

March 5, 2008

Data Collection – Mining for Defects

Filed under: bugzilla,defect prediction,mozilla,mozilla firefox — GG @ 1:50 am

Firefox is a non-critical software system, and it is widely recognized that it contains post-release defects. Open source software (OSS) facilitates the collection of data for defect prediction models. An important requirement for OSS code is that it be rigorously modular, self-contained, and self-explanatory, so that it can be developed at remote sites. Therefore, the data used for prediction models in OSS can be retrieved from source code version repositories (CVS) and bug tracking systems (Bugzilla). On the other hand, OSS development is characterized by the lack of a formal process, poor design and architecture, and development tools that are not comparable to those used in commercial development. Few of the defect prediction approaches used for commercial software can be applied directly to OSS development; however, results obtained from OSS prediction models can be used in an industry environment.
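To make the link between the version repository and the bug tracker concrete, here is a minimal sketch of how defect counts per file could be derived. It assumes that commit messages mention Bugzilla reports in the form "bug 123456", that the commit log has already been parsed into (message, files) pairs, and that the set of defect report ids has been exported from Bugzilla separately. The function name and the data layout are hypothetical, not part of any Mozilla tool.

```python
import re
from collections import Counter

# Assumption: commit messages reference Bugzilla reports as "bug 123456"
# (case-insensitive, optional "#"). Other conventions would need another pattern.
BUG_ID = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def defects_per_file(commits, defect_ids):
    """Count, for each file, the distinct defect reports fixed in it.

    commits    -- iterable of (message, files) pairs taken from the commit log
    defect_ids -- set of Bugzilla ids that are known to be defects
    """
    fixed = {}
    for message, files in commits:
        # Keep only the referenced ids that are known defects.
        ids = {int(m) for m in BUG_ID.findall(message)} & defect_ids
        for path in files:
            fixed.setdefault(path, set()).update(ids)
    # Files without any linked defect are left out of the result.
    return Counter({path: len(ids) for path, ids in fixed.items() if ids})


if __name__ == "__main__":
    log = [
        ("Bug 1001 - fix crash on startup", ["a.cpp", "b.cpp"]),
        ("bug 1002, bug 1001: follow-up", ["a.cpp"]),
        ("whitespace cleanup", ["c.cpp"]),
    ]
    print(defects_per_file(log, {1001, 1002}))
```

Counting distinct ids rather than commits keeps a defect that needed several fixes to the same file from being counted more than once.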

1. Versions: Firefox is based on independent Mozilla Core components layered together. Because of this architecture, Mozilla’s applications share many components even though they are fundamentally different in functionality.

The Mozilla source code is organized in several branches. The trunk is the main branch, the central source code that is used for continuous and ongoing development. Trunk builds contain the very latest changes and updates, but the trunk can also be very unstable at times. When development starts for a specific Mozilla version, a new branch is created. At creation, a derived branch contains everything that the principal branch contains. The Firefox 1.0 branch was derived from Mozilla Branch 1.7, and the Firefox 1.5 branch from Mozilla Branch 1.8. Firefox branches that are forked from the existing Mozilla branch will be used for all future releases of Firefox. In OSS development, the term release covers several types of releases: major and minor, alpha and beta.

Firefox Branch 1.5 resynchronized the code base with the trunk, which contained additional features not available in Firefox 1.0. In release 1.5.0.3, on the other hand, the focus was not on adding features but on improving security-related aspects that had been bypassed in version 1.5.0. This peculiarity of the three selected releases allowed us to test whether the performance of a defect prediction model increases when it is trained on data collected from major releases instead of minor ones.

2. Module Selection: Branching exists because components that need to be prepared for a future release continue to be developed on the trunk at the same time. When selecting modules, a distinction needs to be made between Firefox-specific source code, i.e. code that does not support any other Mozilla application, and the Mozilla components that support Firefox.
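As an illustration of this distinction, the following sketch splits a list of source paths into Firefox-specific files and shared Mozilla components. It assumes that Firefox-only code lives under a top-level `browser` directory of the source tree; that directory set, and the function names, are assumptions made for the example and should be adjusted to the branch being studied.

```python
from pathlib import PurePosixPath

# Assumption: top-level directories that hold Firefox-only code.
FIREFOX_ONLY_DIRS = {"browser"}


def is_firefox_specific(path, firefox_dirs=FIREFOX_ONLY_DIRS):
    """Return True if the path's top-level directory is Firefox-only."""
    parts = PurePosixPath(path).parts
    return bool(parts) and parts[0] in firefox_dirs


def split_modules(paths):
    """Split paths into (firefox_specific, shared_components)."""
    specific = [p for p in paths if is_firefox_specific(p)]
    shared = [p for p in paths if not is_firefox_specific(p)]
    return specific, shared


if __name__ == "__main__":
    files = ["browser/app/nsBrowserApp.cpp", "layout/base/nsPresShell.cpp"]
    print(split_modules(files))
```

Keeping the directory set as a parameter makes it possible to rerun the same selection with a different definition of "Firefox-specific".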

3. Metrics: To derive the product metrics for each source file, Understand C++ can be used. The tool computes source code metrics for C and C++ programs and generates metrics reports. The reports contain three categories of metrics: project level, file level, and function level. They also contain object-oriented metrics for the .cpp files.
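To give a feel for what a file-level metric is, here is a rough sketch that counts total, blank, comment, and code lines in a C or C++ source string. It is not how Understand C++ computes its metrics; it is a simplified stand-in that treats a line as a comment only when it starts with `//` or `/*` or lies inside a block comment, so a trailing comment on a code line is counted as code.

```python
def file_level_metrics(source):
    """Return simple line counts for a C/C++ source string."""
    total = blank = comment = 0
    in_block = False  # True while inside an unclosed /* ... */ comment
    for line in source.splitlines():
        total += 1
        text = line.strip()
        if in_block:
            # Every line inside a block comment counts as a comment line.
            comment += 1
            if "*/" in text:
                in_block = False
        elif not text:
            blank += 1
        elif text.startswith("//"):
            comment += 1
        elif text.startswith("/*"):
            comment += 1
            # The block stays open unless it is closed on the same line.
            in_block = "*/" not in text[2:]
    code = total - blank - comment
    return {"total": total, "blank": blank, "comment": comment, "code": code}


if __name__ == "__main__":
    sample = "// header\n\n/* block\n   still */\nint main() {\n  return 0; // done\n}\n"
    print(file_level_metrics(sample))
```

Counts like these would be collected per file and per release, then joined with the defect counts to form the training data for a prediction model.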

