
M.Tech Thesis Guided

Anomaly detection in bug fixing process of open source projects
Veena Saini (2017)

An empirical investigation into the association between architecture and performance attributes of music production software
Apexit Dhandav (2017)

Energy consumption of integrated development environments in educational setup: a simulation study
Sanehadeep Kaur Bagga (2017)

Mining IoT projects for quality characteristics based recommendation of IoT components
Rajat Sharma (2017)

An Empirical Investigation into the Impact of Code Refactoring Activities on Software Developers' Sentiments
Navdeep Singh (2017)

Analyzing Code Smell Removal Sequences for Enhanced Software Maintainability
Yukti Mehta (2017)

Bug Assignment using Association Rule Mining
Kavita (2017)

Execution Trace Streaming based Scalable and Real Time Collection of Dynamic Metrics using PaaS
Harkomal Singh (2016)

An Empirical Investigation into Scenario Level Software Evolution using Calling Context Trees
Sarishty Gupta (2016)

An Empirical Investigation into Code Smells Elimination Sequences for Energy Efficient Software
Garima Dhaka (2016)

Enhancing PDG based Clone Detection using Approximate Subgraph Matching
C. M. Kamalpriya (2016)

Effective Prioritization of Classes for Reduced Refactoring Effort
Aabha Choudhary (2016)

An Empirical Study into the Test Case Execution Based Collection of Dynamic Metrics
Himanshu Garg (2016)

A Competent Algorithm to Detect Attractors in Gene Regulatory Networks (GRNs) based on Divide and Conquer
Shubham Jain (2016)

Improving Advanced Encryption Standard (AES) using a Novel Block Key Generation Algorithm and Dynamic S-Boxes
Harpreet Singh (2016)

Offline Signature Verification using Grid and Global Centroid Based Features
Sunil Kumar (2016)

Particle Swarm Optimization based Hierarchical Agglomerative Clustering for Software Modularization
Monika (2015)

Efficient and Scalable Collection of Dynamic Metrics using MapReduce
Shallu Sarvari (2015)

Software Clone Detection Using Cosine Distance Similarity
Chavi Ralhan (2015)

Product and Process Metrics Based Software Fault Prediction Using Statistical Regression Models
Rahul (2015)

Adaptive Genetic Approach to Test Case Prioritization
Vandana Chahal (2015)

Model Based Test Case Generation Using Multi-Objective Genetic Algorithm
Gurpreet Kaur (2015)

Permission Based Android Malware Detection Using Supervised Learning Techniques
Govind Saini (2015)

Validating software evolution of agile projects using Lehman laws
Gurpreet Kour (2015)

Metric based empirical study into the impact of software category on various design attributes
Annupriya (2015)

Optimization of AODV protocol using enhanced DCF-MAC for IEEE 802.11g MANET
Bal Krishan (2015)

ω-Automata driven specification and verification of UML 2.0 sequence diagram
Azad (2014)

Comparative performance analysis of calling context profiling data structures
Prita (2014)

Performance analysis of Java bytecode instrumentation libraries
Rohit Anand (2014)

Design and validation of aspect-oriented coupling metrics
Prerna Pandey (2014)

Optimization of search engine results based on user implicit feedback comprising user clicks and time spent data
Hawasingh Soni (2014)

Performance analysis of DCT based digital image watermarking using color spaces
Shelza Kapur (2014)

Packet size optimization with LDPC codes in wireless sensor networks
Daljit Singh (2014)

Dynamic coupling based behavioral analysis of object oriented systems
Rani Geetika (2013)

Diversity based performance analysis of wireless communication under Rayleigh and Rician fading for different modulation schemes
Arvind Mahindru (2013)




Initial research in the field of affective software engineering has shown that the sentiments developers experience while performing various software development tasks have a substantial impact on factors such as individual productivity, task quality, and creativity. It is hence important to continue exploring developers' sentiments associated with specific software development tasks, with the goal of uncovering various task-to-sentiment relationships. This study empirically investigates the impact of software code refactoring, an important modern-day development and maintenance task, on the sentiments of developers in open source projects. A comprehensive analysis of the sentiments attached to 15 different refactoring activities across the evolution of 60 open source Java projects has been performed by mining relevant commit messages. The study examines 3,171 refactoring-related commit messages (out of 615,625 total commit messages) representing 4,891 refactoring instances. The study outcome shows that, in general, software developers express more negative than positive sentiments while performing refactoring tasks. It was also found that 5 of the 15 refactoring activities were mainly responsible for this observed outcome. The refactoring activities also demonstrate different sentiment polarity trends over the evolution of software systems. The findings of this study help software developers and project managers by providing them with details about refactoring-to-sentiment relationships.
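The mining step described above can be sketched as follows. This is an illustrative toy pipeline, not the thesis's actual tooling: the refactoring keywords and the tiny polarity lexicon below are invented for the example.

```python
# Hypothetical keyword list for flagging refactoring-related commits.
REFACTORING_KEYWORDS = {"refactor", "extract method", "rename", "move class"}
# Toy sentiment lexicon; real studies use tools such as SentiStrength.
POSITIVE = {"clean", "improve", "simplify", "better"}
NEGATIVE = {"ugly", "mess", "hack", "broken", "awful"}

def is_refactoring_commit(message: str) -> bool:
    """Flag a commit message that mentions a refactoring activity."""
    text = message.lower()
    return any(keyword in text for keyword in REFACTORING_KEYWORDS)

def sentiment(message: str) -> int:
    """Return polarity: >0 positive, <0 negative, 0 neutral."""
    words = set(message.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

commits = [
    "Refactor parser: extract method to remove this ugly mess",
    "Fix typo in README",
    "Rename variables to improve readability",
]
refactoring = [m for m in commits if is_refactoring_commit(m)]
polarities = [sentiment(m) for m in refactoring]  # [-2, 1]
```

Tallying such polarities per refactoring activity and per project version yields the kind of sentiment trends the study reports.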


Maintenance and evolution are key to developing a successful software product. Software maintenance, in the area of software engineering, is defined as modifying a software product after its delivery to the customer in order to remove faults and improve software quality and performance. During software evolution, the design of the software becomes more complex, which reduces its maintainability. Companies incur huge losses when projects are scrapped for failing to meet completion deadlines. Code smells are surface indications of deeper design problems that affect the maintainability of the software. They disturb the maintainability of the code by starting a chain reaction of breakages in dependent modules, which makes the code difficult to read and modify. Since removing all possible code smells is a tedious process, it becomes important to propose a solution that can produce maintainable software by efficiently removing the code anomalies present in it in less time.

This work empirically investigates the impact of removing three prominent code smells, long method, god class and feature envy, in the six possible sequences, from 16 open source Java applications. The work applies appropriate refactoring techniques to the object-oriented classes of the sample Java projects with the help of automated tools. The classes are first prioritized in ascending order on the basis of a proposed metric called the Maintainability Complexity Index (MCI), which is based on the maintainability index (MI) and relative logical complexity (RLC) metrics. The code smells are then removed in the six possible permutations as per the rank of the classes. The results comprise an analysis of the maintainability-predicting metric values (MCI, MI, RLC) for the software versions obtained by removing the code smells in the different permutation sequences. Further, the work identifies those code smell removal sequences which yield the most maintainable software versions, in order to help developers and researchers save valuable effort in producing high quality software.
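The two mechanical steps above, ranking classes by MCI and enumerating the removal sequences, can be sketched as follows. The class names and MCI values are hypothetical; the actual MCI formula is defined in the thesis.

```python
from itertools import permutations

# Hypothetical MCI values per class (lower = refactor first).
mci = {"OrderService": 0.82, "Invoice": 0.35, "ReportBuilder": 0.61}

# Classes are refactored in ascending MCI order.
priority = sorted(mci, key=mci.get)

# The six removal sequences of god class (G), feature envy (F), long method (L).
sequences = ["".join(p) for p in permutations("GFL")]
```

Each of the six sequences is then applied to the prioritized classes, and the resulting software versions are compared on the maintainability metrics.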


The modern era of technological advancements has witnessed innumerable transmission media being used for the transfer of colossal amounts of classified data. As security is one of the vital aspects of data communication, various encryption algorithms such as the Data Encryption Standard (DES), Triple Data Encryption Standard (3DES), Advanced Encryption Standard (AES), International Data Encryption Algorithm (IDEA), Rivest Cipher 6 (RC6), Blowfish, Serpent and CAST-128 are employed to safeguard classified data, transmitted or stored on web servers, from falling into the wrong hands. Among these algorithms, the wide acceptance of AES shows its superiority in providing confidentiality to secret information. The basic characteristics of AES are its simplicity of implementation, cost and security. Several modifications have been proposed in recent times by cryptographers and researchers to enhance these characteristics. The majority of these researchers have modified the static S-boxes used in AES, while others have proposed modified key schedules to generate round keys. However, all of them use the same secret key for encrypting each plaintext block, as in the original AES. This dissertation presents a novel block key generation algorithm, working in synchrony with a dynamic S-box generation algorithm, to produce considerably less predictable ciphertexts than AES. Each plaintext block is encrypted using an entirely different block key derived from the secret key, which in turn generates non-identical S-boxes for each plaintext block. The proposed algorithm thus strengthens the cryptographic strength of the original AES by making cryptanalysis substantially harder, and is both reliable and easy to implement. Three metrics were chosen for the performance analysis of the proposed algorithm: operational complexity, avalanche effect and the strict avalanche criterion. Further analysis revealed that the operational complexity of the modified algorithm is higher than that of basic AES, which was expected due to the addition of the block key generation algorithm and key-dependent S-boxes. Moreover, the proposed algorithm satisfies the desired avalanche effect property and offers greater resistance to linear and differential cryptanalysis than the original AES.
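The avalanche effect metric used in the evaluation above has a simple definition: the fraction of ciphertext bits that change when one plaintext bit is flipped (a value near 0.5 is ideal). The sketch below computes the metric itself; the two byte strings are arbitrary placeholders, not real AES output.

```python
def avalanche(c1: bytes, c2: bytes) -> float:
    """Fraction of differing bits between two equal-length ciphertexts."""
    assert len(c1) == len(c2)
    differing = sum(bin(a ^ b).count("1") for a, b in zip(c1, c2))
    return differing / (8 * len(c1))

# Placeholder ciphertexts of two plaintexts differing in a single bit.
cipher_a = bytes([0b10101010, 0b11110000])
cipher_b = bytes([0b01010101, 0b11111111])
effect = avalanche(cipher_a, cipher_b)  # 12 of 16 bits differ -> 0.75
```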


Green IT is a modern paradigm that refers to the development of energy efficient products (software and hardware). The software that runs on hardware is responsible for the amount of energy the hardware consumes. Incorporating sustainability concerns into software development practices is one of the fastest emerging trends in the software industry. Recent sustainability studies have shown that changing the internal code structure of software affects its energy consumption behavior as well as its architectural maintainability. Code smells refer to bad design structures that make software difficult to evolve, maintain, read and reuse. Various refactoring techniques are defined to eliminate code smells. Refactoring alters the internal code structure of software systems while preserving their external behavior. Since the internal code structure is modified in the process of eliminating code smells, refactoring changes the software's energy consumption behavior as well as its metrics.

This work empirically investigates the impact of eliminating three notorious code smells, god class (G), feature envy (F) and long method (L), individually as well as in all possible sequences (GFL, GLF, FGL, FLG, LGF, LFG), on the architectural maintainability and energy consumption behavior of software systems. The experiments were performed using three open source Java applications. The study outcomes show that these permutations yield varying energy consumption values for the resulting refactored software versions. A particular permutation (GFL) was found to yield the most energy-efficient refactored software version for all three applications, in comparison to all other permutations, thus revealing consistency. The results also revealed no definite relationship between software architecture metrics and software energy consumption. These findings can help software developers understand and adopt those code smell elimination sequences which may result in more sustainable refactored software versions.
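The selection step, picking the most energy-efficient removal sequence from measured values, reduces to a minimum over the per-sequence measurements. The joule figures below are invented for illustration, not the study's measured data.

```python
# Hypothetical measured energy per refactored version (joules).
energy_joules = {
    "GFL": 41.2, "GLF": 44.8, "FGL": 43.1,
    "FLG": 45.0, "LGF": 46.3, "LFG": 44.1,
}

# The recommended sequence is the one whose version consumed least energy.
best_sequence = min(energy_joules, key=energy_joules.get)
```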


Software maintenance is one of the paramount processes in software engineering. It accounts for over sixty percent of the total effort and cost expended in the overall software engineering process. The main goal of software maintenance is to preserve the value of the developed software over time. During software development, programmers often reuse existing code to build new code. A replicated code fragment that is an exact copy or a modified version of an existing code fragment is called a code clone. Replicating existing code helps in quick software development and enhancement for changed user requirements, so the occurrence of code clones is inevitable in the development of large software systems. However, cloning leads to higher cost and effort for software maintenance, an increased probability of bug propagation, sloppier system design and an increase in system size.

Clone detection and removal are therefore fundamental to an efficient and effective software development and maintenance process. There are different types of code clones based on the amount and kind of replication performed, and various tools and techniques have been developed to detect clones in software systems. Each clone detection tool or technique specializes in the detection of one or more types of clones. Program Dependency Graph (PDG) based clone detection techniques have a key advantage over other techniques in that they are capable of detecting non-contiguous code clones. This work proposes a further enhancement to PDG-based detection that identifies all possible clone relations from the obtained clone results by applying Approximate Subgraph Matching (ASM).

The results of the proposed technique were obtained on three subject software systems. They comprise many new subsumed clone relations, as well as exact and approximate clone relations derived from the clone pair results of the PDG-based technique. These results indicate that, using the proposed approach, a large number of new clone relations can be identified from the clone pairs obtained by PDG-based detection. The results have been manually validated for each subject system. This work also presents a novel ASM-based approach to identify different node-to-node mappings between the code fragments of each detected clone relation, and proposes a new ASM-based distance measure to quantify their similarity.
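The node-to-node mapping idea can be illustrated with a deliberately simplified similarity score, not the thesis's actual ASM distance: given a mapping between the nodes of two PDG fragments, score how many edges the mapping preserves (Jaccard over mapped and target edges).

```python
def mapping_similarity(edges_a, edges_b, mapping):
    """Fraction of edges preserved by a node-to-node mapping (toy measure)."""
    # Translate fragment A's edges through the mapping.
    mapped = {(mapping[u], mapping[v])
              for (u, v) in edges_a
              if u in mapping and v in mapping}
    preserved = mapped & set(edges_b)
    union = mapped | set(edges_b)
    return len(preserved) / len(union) if union else 1.0

# Two toy PDG fragments and a candidate node mapping between them.
edges_a = {(1, 2), (2, 3), (1, 3)}
edges_b = {("x", "y"), ("y", "z")}
mapping = {1: "x", 2: "y", 3: "z"}
score = mapping_similarity(edges_a, edges_b, mapping)  # 2 of 3 edges match
```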


Improving a software system's internal structure through regular refactoring is considered vital for its long and healthy life. Refactoring is an endless process of quality enhancement: it improves the readability, understandability, reusability, extensibility and maintainability of the software system. However, despite its benefits, refactoring is not readily adopted by software development teams in industry, mainly due to strict project deadlines and limited resources. Hence, teams look for optimal refactoring recommendations that incur minimal effort overhead while delivering decent benefits in terms of enhanced software quality. To this end, this study proposes an approach for identifying and prioritizing object oriented software classes in need of refactoring. The approach's novelty lies in identifying the most change-prone as well as architecturally relevant classes, and then generating a class-wise rank based on the code smell information extracted from those classes. The approach is evaluated on a sample of 1,621 classes and 2,358 code smell instances distributed over 28 versions of four open source Java systems. The results reveal that just 16% of the classes, the high priority ones, contain up to 65% of the code smells, making such classes the prime refactoring targets. The proposed prioritization scheme assists developers in locating the classes with the most significant incremental refactoring opportunities.
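The ranking idea can be sketched as follows. The field names, values, and the product-of-factors score are all hypothetical simplifications; the thesis defines its own combination of change-proneness, architectural relevance and smell information.

```python
# Hypothetical per-class data: revision count and code smell count.
classes = [
    {"name": "Billing", "changes": 42, "smells": 7},
    {"name": "Logger", "changes": 3, "smells": 1},
    {"name": "Parser", "changes": 30, "smells": 9},
]

# Classes that are both change-prone and smell-laden surface first.
ranked = sorted(classes, key=lambda c: c["changes"] * c["smells"], reverse=True)
top = [c["name"] for c in ranked]
```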


Software evolution and maintenance play a crucial role in the software development life cycle. Software evolution is the process of undergoing change from the conception to the decommissioning of a software system, and software comprehension is needed for a better understanding of software functionality. Software metrics are one of the preferred ways to understand and control software systems. Software evolution can be better understood if analyzed at the user scenario level: understanding the behavior of the participating classes across a set of software versions helps in scenario comprehension, and scenario level analysis describes the behavior of a software system from a user centric perspective. Dynamic analysis techniques help in analyzing the runtime behavior of programs. The Calling Context Tree (CCT) provides complete information about the dynamic behavior of programs. CCTs are used in a wide range of software development processes and large applications, such as testing, debugging and error reporting, performance analysis, program analysis, security enforcement, and event logging; however, they have never before been used to study scenario level software evolution. This work empirically investigates whether CCT based metrics, such as number of nodes and height, provide new insights into comprehending the evolution of scenarios. A set of four static, three dynamic, and four CCT metrics is analyzed to comprehend the evolution of eight scenarios across four open source Java applications. Correlation analysis and principal component analysis are used to analyze the relationships among the selected metrics. The results reveal that two of the four CCT metrics have high correlation with the selected static and dynamic metrics, and that the height of the CCT remains constant across multiple versions of the sample applications per scenario. Therefore, CCT metrics provide useful information for scenario level evolution.
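Two of the CCT metrics mentioned above, node count and height, are straightforward to compute on a calling context tree. The toy CCT below (nested dicts mapping a method to its callees) is illustrative only.

```python
# Toy calling context tree: main calls parse and report; parse calls read.
cct = {"main": {"parse": {"read": {}}, "report": {}}}

def node_count(tree) -> int:
    """Total number of calling contexts in the tree."""
    return sum(1 + node_count(children) for children in tree.values())

def height(tree) -> int:
    """Length (in nodes) of the longest root-to-leaf context chain."""
    return 1 + max(map(height, tree.values())) if tree else 0
```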


Software modularization is the process of automatically restructuring software units into modules to improve the software's structure. Software systems need to evolve in order to extend or change their functionality, improve their performance, incorporate environment changes and so on. As software is revised to accommodate the required changes, its structure deteriorates. In recent years, various clustering techniques have been explored to improve the architecture of such systems. These techniques can be divided into the following categories: graph theoretic techniques, Information Retrieval (IR) based techniques, data mining based techniques, pattern matching based techniques, hierarchical clustering techniques, and meta-heuristic approaches; graph theoretic, hierarchical clustering, and meta-heuristic techniques are the most prominently used. Clustering in software modularization groups entities on the basis of similarities among them. Hierarchical agglomerative clustering (HAC) algorithms have been widely used to restructure software systems, providing a multiple level architectural view. The Weighted Combined Algorithm (WCA) is a hierarchical agglomerative clustering algorithm used for software modularization. Particle Swarm Optimization (PSO) is a partition based meta-heuristic search technique which has been successfully applied to solve the clustering problem in the past. In this work we propose an approach for optimizing WCA using PSO for software modularization. To evaluate the performance of the proposed algorithm, a set of experiments is performed on five open source Java software systems. The results of this empirical study show that the proposed approach outperforms both the WCA and PSO clustering techniques when applied to software modularization.
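The PSO update rules at the heart of the approach can be shown on a minimal one-dimensional problem; the thesis applies the same velocity/position updates to candidate module arrangements rather than to a scalar. All parameter values below are conventional defaults, not the thesis's settings.

```python
import random

random.seed(0)  # reproducible illustration

def pso_minimize(objective, n_particles=10, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Minimal PSO: inertia w, cognitive pull c1, social pull c2."""
    pos = [random.uniform(-10, 10) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                      # each particle's personal best
    global_best = min(pos, key=objective)  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (global_best - pos[i]))
            pos[i] += vel[i]
            if objective(pos[i]) < objective(best_pos[i]):
                best_pos[i] = pos[i]
            if objective(pos[i]) < objective(global_best):
                global_best = pos[i]
    return global_best

minimum = pso_minimize(lambda x: (x - 3) ** 2)  # converges near x = 3
```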


The Google Android mobile platform is one of the most popular smartphone operating systems in the market. The open source Android platform permits developers to exploit the versatile operations of the mobile framework. These permissions, however, raise significant issues related to malicious applications, and the detection of malware in Android applications is considered a serious problem. Many researchers have been applying new and creative techniques to detect malware with low computational cost, and such research is expected to grow rapidly because of the continuous rise in the number of new mobile applications in the market.

This work proposes a new method that combines requested permissions, used permissions and other features to build a permission-based framework for the detection of malicious Android applications. In this model, the permissions are extracted from each application's profile data. Various data mining techniques, such as the J48 decision tree algorithm, support vector machines and Naïve Bayes, are used to classify an application as possibly malicious or benign, utilizing the permissions of each application as features. An inherent benefit of this model is that it does not depend on dynamic tracing of system calls, which is a much costlier affair, but only uses static analysis to discover system functions in each application. In addition, the method generalizes to all mobile applications, since requested and used permissions are always present for every type of mobile application. To validate the performance of the algorithm, a malware dataset of 181 Android applications from the Android Malware Genome Project was used. The proposed method achieves a high detection rate and accuracy compared to previous methods, confirming that extracting permissions and classifying applications as malicious or benign is both simple and effective.
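The feature extraction step described above, turning each app's permission set into a binary vector, can be sketched as follows. The app names and (shortened) permission names are hypothetical; the resulting vectors are what a classifier such as J48 or Naïve Bayes would consume.

```python
# Hypothetical apps and their requested permissions.
apps = {
    "game":   {"INTERNET", "SEND_SMS", "READ_CONTACTS"},
    "notes":  {"INTERNET"},
    "camera": {"INTERNET", "CAMERA"},
}

# The feature vocabulary is the union of all observed permissions, sorted
# so every app's vector has the same column order.
vocabulary = sorted(set().union(*apps.values()))

def to_vector(permissions):
    """Binary feature vector: 1 if the permission is requested, else 0."""
    return [1 if p in permissions else 0 for p in vocabulary]

vectors = {name: to_vector(perms) for name, perms in apps.items()}
```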


Among many software testing activities, test case generation is the most fundamental and important task. Test case generation is the process of identifying test cases, which basically forecast the expected output of the system under development. A number of techniques exist for test case generation, such as model based testing and evolutionary testing. Model based testing has gained a lot of popularity in the past years; the reasons for choosing it for test case generation are the early detection of faults, leading to a reduction in overall development time, cost and effort. UML has been adopted as a universal standard for modeling the structural and behavioral aspects of a system, and different UML diagrams are used to model its structure and behavior. Activity diagrams are UML behavioral diagrams that describe the system as a workflow of activities; concurrent and decision activities are also addressed by the activity diagram drawn for a given application. Existing activity diagram based test case generation techniques using genetic algorithms optimize only a single objective at a time. In practice, however, there can be situations which require the optimization of multiple objectives.

The proposed approach generates test cases from an activity diagram using a multi-objective genetic algorithm. An XMI file is generated from the activity diagram, and the information fetched from the XMI is used to create a tree structure that forms the initial population for the multi-objective algorithm. Selection and crossover are performed on the initial population to obtain a new generation of trees, which are then converted into binary trees. Depth-first search is used to traverse the binary trees and obtain the final, valid set of test cases. The approach is coupled with mutation analysis to check its effectiveness. Results showed that the multi-objective genetic algorithm gives better results than a single objective genetic algorithm.
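The depth-first traversal step can be illustrated directly: enumerate the start-to-end paths of an activity graph, each path being one candidate test case. The approval workflow below is a made-up example, not a diagram from the thesis.

```python
# Toy activity graph (adjacency lists) with one decision node.
graph = {
    "start": ["check"],
    "check": ["approve", "reject"],
    "approve": ["end"],
    "reject": ["end"],
    "end": [],
}

def all_paths(node, path=None):
    """Depth-first enumeration of all paths from `node` to terminal nodes."""
    path = (path or []) + [node]
    if not graph[node]:
        return [path]
    paths = []
    for succ in graph[node]:
        paths.extend(all_paths(succ, path))
    return paths

test_paths = all_paths("start")  # one path per decision outcome
```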


Modern software evolution requires an effective regression testing process in order to keep advancing through a series of high quality updated software versions. Each such updated version brings substantial growth in the number of test cases. In such a situation, the key idea for improving the effectiveness of software testing is test case prioritization. Existing test case prioritization approaches determine the execution order of all test cases by considering only the structural coverage information of the original version, without considering the execution information of the test cases on the modified version of the program. Recently, an adaptive test case prioritization approach has been proposed which not only considers the structural coverage information of the original version but also utilizes the test case execution information of the current updated version.

This work presents a genetic optimization of the existing adaptive approach for test case prioritization, with the aim of improving its effectiveness. An empirical study is conducted using the test cases of an open source Java project, Apache Ant 1.9.4, to evaluate the effectiveness of the proposed and existing approaches. The original Apache Ant 1.9.4 project is given as input to the MuClipse plugin in Eclipse to generate the mutants needed for the experiment, and the statement coverage information of the test cases of the original version of Ant is captured using the EclEmma plugin in Eclipse. Experimental results show that the proposed approach is significantly more effective in terms of fault detection, and competitive in terms of statement coverage, when compared to the existing adaptive approach. In addition, an empirical evaluation is performed on the experimental results in order to study the impact of more precise values of the probabilistic parameters (used in the adaptive approach) on the effectiveness of the existing and proposed approaches.


The quality and reliability of subsequent versions of a software product can be improved by using effective fault prediction techniques. Correct prediction of faulty modules in software projects also helps in reducing the cost of development and evaluation. One of the most effective fault prediction methods is the use of software metrics. While software product metrics have already proven to be good defect predictors, process metrics have only recently been realized to be just as effective. However, these process metrics need a large scale empirical evaluation before they can be applied to predict faults in industrial software projects. Validating the effectiveness of process metrics as defect predictors requires a thorough empirical analysis involving both product and process metrics based fault prediction models, to assess the improvement in prediction accuracy introduced by the process metrics.

This study presents an empirical evaluation in which several process metrics were used to identify those that can significantly improve fault prediction models based only on product metrics. In order to assess the worth of process metrics, prediction models that use only product metrics were compared with those using product metrics together with either one of the investigated process metrics or a combination of all of them. Metric data collected from a wide range of software projects available in the PROMISE repository was used for the experimental study. A correlation study was conducted between the process metrics and the number of defects, and statistical tests such as the Shapiro-Wilk test and the t-test, along with effect size analysis, were performed on the metric data. Fault prediction models were trained using multiple linear regression and logistic regression, and the two techniques were compared on the basis of accuracy. Fault prediction models trained on product metrics together with all process metrics taken collectively perform better than models trained only on product metrics. Moreover, with logistic regression, the accuracy of process metrics based fault prediction models comes out higher, and the effect size larger, when all process metrics are taken together than with linear regression. Similar results were obtained when any one of the process metrics was applied with the product metrics to train the fault prediction models. Hence, this work concludes that the use of process metrics in constructing fault prediction models is reasonable.
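One step of the analysis above, correlating a process metric with defect counts, can be sketched with a plain Pearson correlation. The revision and defect numbers below are invented for illustration, not data from the PROMISE repository.

```python
from math import sqrt

# Hypothetical per-module data: a process metric (revision count) vs defects.
revisions = [2, 5, 9, 4, 12]
defects = [1, 2, 4, 2, 6]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(revisions, defects)  # strongly positive for this toy data
```

A high positive correlation is what motivates including the process metric as a predictor alongside the product metrics.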


A critical problem in software development is the existence of duplicate code, which can affect the maintainability of the software. The main goal of a clone detection technique is to identify the parts of the software code which are identical. In the past, text based and token based techniques detected identical code fragments; however, they were not considered reliable because of their inability to find syntactic differences between programs. Syntactic differences can be efficiently evaluated using an abstract syntax tree, and syntax based clone detection has been found useful in detecting duplicate code. There are many ways to find similarity between two programs, including characteristic vector clustering, metric based vector comparison, independent component analysis of method vectors, and hybrid approaches applied to variable size vectors. In these techniques, the similarity measure is the distance between two vectors.

This work aims at providing a robust and accurate similarity index for detecting software clones. The performance parameters considered are the lines of code detected as clones and the range of values obtained from the cosine distance. The operations used for the required comparison are construction of trees from the program code, extraction of characteristic vectors from the trees, and evaluation of the characteristic vectors. The results imply that cosine distance similarity is more efficient than a random distance measuring function.
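The final comparison step, cosine similarity between characteristic vectors, is shown below. The vectors are hypothetical node-type counts for three code fragments; identical fragments score 1, near-clones slightly less.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two characteristic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

original = [3, 1, 0, 2]  # hypothetical AST node-type counts of fragment A
clone = [3, 1, 0, 2]     # exact copy of fragment A
variant = [3, 0, 4, 2]   # modified copy

exact = cosine_similarity(original, clone)    # == 1.0
near = cosine_similarity(original, variant)   # < 1.0
```

Fragments whose similarity exceeds a chosen threshold are reported as clone pairs.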


Dynamic metrics assess the actual behavior of software systems, as they are extracted from the runtime data obtained during program execution. These metrics have the potential to be good indicators of software quality attributes such as maintainability, understandability, reusability and error-proneness. However, recent literature indicates that dealing with dynamic information remains a formidable challenge due to the huge size of the execution data at hand, resulting in long processing delays. The navigation and exploration of such data to collect a particular dynamic metric turns out to be quite a challenging task, which hinders the widespread adoption of dynamic metrics. In this respect, metric collection performance in terms of computational time could be enhanced through parallelization. Hadoop MapReduce is known to be an ideal technique for developing highly scalable applications that swiftly process massive amounts of data in parallel over clusters of computing nodes. This choice of technique was also motivated by the fact that it has become the de-facto standard for parallel computation in industry and is well supported on the cloud.

We present an efficient and scalable technique to extract design level dynamic metrics from Calling Context Trees (CCTs) using a cloud based MapReduce paradigm. The calling context profiler JP2 is used to profile applications, for which we developed a custom dumper that writes the CCT profile across multiple files to enable parallel analysis. CCT profiles with node counts of up to 40 million are used to extract a number of dynamic coupling metrics. On average, a 73% increase in performance is observed compared to sequential analysis. Other performance characteristics, such as speed-up and scale-up, are also analyzed to strengthen the applicability of our parallel computation approach.
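As a small clarifying sketch, a "73% increase in performance" can be read as throughput (the reciprocal of wall-clock time) rising by 73%, i.e. a speed-up of 1.73x. The timings below are invented purely to show the arithmetic, not measurements from the study.

```python
# Invented wall-clock times consistent with a 73% throughput increase.
sequential_seconds = 600.0
parallel_seconds = 600.0 / 1.73

speedup = sequential_seconds / parallel_seconds  # 1.73x
improvement = (speedup - 1) * 100                # 73 percent
```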


Software modularization is the process of automatically restructuring software units into modules to improve the software’s structure. Software systems need to evolve in order to extend or change their functionality, improve their performance, incorporate environmental changes, and so on. As software is revised to accommodate the required changes, its structure deteriorates. In recent years, various clustering techniques have been explored to improve the architecture of such systems. These techniques can be divided into the following categories: graph theoretic techniques, Information Retrieval (IR) based techniques, data mining based techniques, pattern matching based techniques, hierarchical clustering techniques, and meta-heuristic approaches. Graph theoretic, hierarchical clustering, and meta-heuristic techniques are the most prominently used. Clustering in software modularization groups entities on the basis of the similarities among them. Hierarchical agglomerative clustering (HAC) algorithms have been widely used to restructure software systems, providing a multi-level architectural view. The Weighted Combined Algorithm (WCA) is a hierarchical agglomerative clustering algorithm used for software modularization. Particle Swarm Optimization (PSO) is a partition based meta-heuristic search technique which has been successfully applied to clustering problems in the past.

In this work we propose an approach for optimizing WCA using PSO for software modularization. To evaluate the performance of the proposed algorithm, a set of experiments is performed on five open source Java software systems. The results of this empirical study show that the proposed approach outperforms both the WCA and PSO clustering techniques when applied to software modularization.
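As a hedged illustration of the agglomerative step underlying HAC-style modularization, the sketch below merges the most similar pair of entities; it uses plain Jaccard similarity over feature sets rather than WCA's own weighted measure, and the entity names and features are hypothetical:

```python
def similarity(a, b):
    # Jaccard similarity over feature sets (e.g. globals/types an entity
    # uses). WCA defines its own weighted measure; Jaccard stands in here.
    return len(a & b) / len(a | b) if a | b else 0.0

def merge_most_similar(clusters):
    """One agglomerative step: merge the pair of clusters with the highest
    similarity; the merged cluster's features are the union of the pair's."""
    names = list(clusters)
    best = max(((x, y) for i, x in enumerate(names) for y in names[i + 1:]),
               key=lambda p: similarity(clusters[p[0]], clusters[p[1]]))
    merged = dict(clusters)
    merged[best[0] + "+" + best[1]] = merged.pop(best[0]) | merged.pop(best[1])
    return merged

# Hypothetical entities with their feature sets.
entities = {"Parser": {"Token", "Ast"}, "Lexer": {"Token"}, "Gui": {"Widget"}}
print(sorted(merge_most_similar(entities)))  # ['Gui', 'Parser+Lexer']
```

Repeating this step until a stopping criterion is met yields the multi-level view mentioned above; a PSO layer would instead search over candidate partitions, guided by a modularization quality fitness function.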


Software quality measurement has become a major focus in the field of software engineering, and object oriented software engineering is no exception. Software metrics have proved to be good indicators of a product’s quality as they provide a quantification base for software systems. The subjective nature of external quality attributes makes it difficult to measure software quality in early phases. However, this limitation is mitigated by the use of internal quality attributes, which can measure software quality in early phases and also act as good indicators of external quality attributes. Coupling is one internal attribute considered important due to its profound repercussions for external software quality attributes. In the past decade, a vast number of object oriented coupling metrics have been proposed and validated, but most of them are static in nature and do not take into account the runtime behavior of object oriented software systems. Less emphasis has been placed on designing dynamic metrics, and of those proposed so far, only a few have been validated using real world applications, inhibiting their use in the software industry.

Since this work studies dynamic behavior, DCBO (Dynamic Coupling Between Objects), the runtime counterpart of CBO (Coupling Between Objects), is selected. DCBO is a dynamic coupling metric which can be used to measure coupling between objects at run time. This work focuses on empirically investigating DCBO at the class, method and message levels using a set of real world applications. A scenario based approach, in combination with appropriate statistical techniques, has been used in an attempt to capture the correlation between static and dynamic coupling metrics and, in turn, the structural and behavioral aspects of object oriented software systems.
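A simplified sketch of class-level dynamic coupling collection from a runtime message trace is shown below; the trace format and class names are assumptions for illustration, not the actual profiling setup used in the thesis:

```python
from collections import defaultdict

def dynamic_coupling(trace):
    """Class-level dynamic coupling: for each class, count the distinct
    other classes it exchanges messages with at runtime. `trace` is a list
    of (sender_class, receiver_class) message events, as a profiler might
    record them (a deliberate simplification of DCBO)."""
    partners = defaultdict(set)
    for sender, receiver in trace:
        if sender != receiver:          # ignore self-messages
            partners[sender].add(receiver)
            partners[receiver].add(sender)
    return {cls: len(p) for cls, p in partners.items()}

# Hypothetical execution trace of an order-processing scenario.
trace = [("Order", "Inventory"), ("Order", "Payment"),
         ("Payment", "Bank"), ("Order", "Inventory")]
print(dynamic_coupling(trace))
```

Note how the repeated Order-to-Inventory message does not inflate the count: only distinct runtime partners matter here, which is what separates this view from a raw message-count metric.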


Digital watermarking is the field of digital image processing which deals with hiding information in digital content. It is a process of adding or concealing important data in the digital representation of media formats such as images, audio and video in order to prevent copyright infringement. This area is attracting a lot of research interest nowadays due to the rise of internet usage, which threatens the copyrights of digital content. A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. This research work aims at analyzing the effects of various color spaces on DCT based digital watermarking. For this purpose, eight different color spaces are considered, viz. RGB, YCbCr, JPEG-YCbCr, YIQ, YUV, HSI, HSV and CIELab.

The initial part of this research work deals with the implementation of the DCT based watermarking technique in general, i.e. watermark embedding, followed by the watermark extraction process. Next, DCT based watermarking is implemented on the eight color spaces and the effects of these color spaces are analyzed. The performance of DCT based watermarking is evaluated using two performance metrics, Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR). The higher the PSNR value, the better the performance of the DCT based watermarking technique. Lastly, the PSNR values for DCT based and DWT based digital watermarking are compared. The performance of DCT based digital image watermarking is best for the YIQ color space, whereas JPEG-YCbCr gives the highest PSNR value for DWT based digital watermarking.
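The two evaluation metrics can be sketched directly; the pixel values below are illustrative, and real images would of course be full 2-D arrays rather than flat lists:

```python
import math

def mse(original, distorted):
    """Mean squared error between two equally sized grayscale images,
    given here as flat lists of pixel intensities."""
    return sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)

def psnr(original, distorted, max_val=255):
    """Peak signal-to-noise ratio in dB; a higher value means the
    watermarked image is closer to the original."""
    e = mse(original, distorted)
    return float("inf") if e == 0 else 10 * math.log10(max_val ** 2 / e)

cover = [52, 55, 61, 66, 70, 61, 64, 73]
marked = [52, 56, 61, 65, 70, 62, 64, 73]   # after embedding a watermark bit
print(round(psnr(cover, marked), 2))
```

Comparing PSNR across color spaces, as done here, amounts to embedding the same watermark in each space and asking which space perturbs the cover image least.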


Existing search engines such as Google and Yahoo index web pages based on words and meta tags. Although a search engine provides the most relevant list of results for the keyword given as input, the results may still be of little relevance to the user. In this work we use implicit user feedback to improve the ranking process of a search engine. Our idea is to divide web pages into “good pages” and “bad pages” by incorporating user feedback comprising click data and time spent on a webpage. If a user spends time “t” on a webpage and “t” is greater than or equal to “t0” (where “t0” is the minimum time needed to read/browse the webpage), then the page is likely useful and is therefore considered a “good page”. For an irrelevant webpage, the user closes it immediately or moves on to other results, so “t” is less than “t0”, making it a “bad page”. Using the user’s perspective, we can rank web pages better, as a webpage with more “good page” ratings will be ranked higher than one with more “bad page” ratings.

We performed a series of experiments taking Google as our default search engine. In our experiments, random users were invited to search some query keywords and visit the results as per their normal behavior. We used browser plug-ins to acquire the click data and the time spent on each webpage by different users. After collecting the data, we applied our proposed system to re-rank the web pages. The ideal ranks of the webpages were formulated manually by a team of human judges. The results from our experiment were then compared to the results of the Google search engine with respect to the ideal ranking provided by the human judges. We verified that our results are closer to the ideal ranking than those of the Google search engine.
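A minimal sketch of the dwell-time classification and re-ranking idea, assuming a hypothetical threshold t0 of 10 seconds and made-up visit data:

```python
def classify_visit(dwell_time, t0):
    """Label a visit 'good' if the user stayed at least t0 seconds (the
    minimum time needed to actually read the page), else 'bad'."""
    return "good" if dwell_time >= t0 else "bad"

def rerank(visits, t0=10):
    """visits: {url: [dwell_time, ...]} across users. Pages are reordered
    by their count of 'good' visits -- a simplified stand-in for the
    feedback-based re-ranking used in the experiments."""
    score = {url: sum(1 for t in times if classify_visit(t, t0) == "good")
             for url, times in visits.items()}
    return sorted(score, key=score.get, reverse=True)

visits = {"pageA": [2, 3, 1], "pageB": [40, 25, 12], "pageC": [15, 2, 30]}
print(rerank(visits))  # ['pageB', 'pageC', 'pageA']
```

A production version would combine this vote count with the engine's original rank rather than replace it, but the sketch shows how dwell time alone already reorders the result list.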



Aspect-oriented programming (AOP) is a programming paradigm which introduces the concept of separation of concerns. The major emphasis relates to problem analysis, software design, and implementation techniques of AOP. Software metrics have proved to be good indicators of a product’s quality as they provide a quantification base for software systems. Coupling is one internal attribute considered important due to its profound repercussions for external software quality attributes like maintainability and fault-proneness. Relatively little work has been done in the field of AOP metrics. In the past decade, a number of aspect-oriented coupling metrics have been proposed, most of which are derived from object oriented metric suites. Less emphasis has been placed on validating these metrics, and of those proposed so far, only a few have been validated using real world applications, inhibiting their use in the software industry.

Since this work is based on validating aspect oriented coupling metrics, the Base Aspect Coupling (BAC) and Coupling between Components (CBC) metrics are selected. BAC is an indication of the amount of reuse and the dependence of a class on other classes for its implementation, while CBC is defined for a component (class or aspect) as the total number of other components to which it is coupled. This work focuses on the theoretical and empirical validation of these metrics. Additionally, two new coupling metrics, Aspect Coupling Factor (ACF) and Aspect Data Abstraction Coupling (ADAC), are proposed, derived from the object-oriented coupling metric suite, along with their theoretical validation.
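A toy sketch of how a CBC-style count could be derived from a dependency map follows; the component names and dependencies are hypothetical, and a real measurement would be extracted from actual class/aspect references:

```python
def cbc(dependencies):
    """Coupling between Components: for each component (class or aspect),
    the number of distinct other components it is coupled to.
    `dependencies` maps a component to the components it references
    (a hypothetical static-analysis result)."""
    return {comp: len(set(refs) - {comp}) for comp, refs in dependencies.items()}

# Illustrative dependencies from a small AspectJ-style system.
deps = {
    "LoggingAspect": ["Account", "TransferService", "Account"],
    "Account": ["Money"],
    "TransferService": ["Account", "Money"],
}
print(cbc(deps))  # {'LoggingAspect': 2, 'Account': 1, 'TransferService': 2}
```

Duplicated references and self-references are discarded, matching the metric's "total number of other components" phrasing.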


Bytecode instrumentation libraries are well suited for transparently extending virtual execution environments for purposes such as program monitoring and profiling. This helps in making software systems more effective and efficient, and instrumentation facilitates understanding of a software system. The analysis depends on evaluating metrics against the application to obtain information about it. Several open source bytecode instrumentation libraries are available, along with many frameworks and tools built on top of them, for a variety of programming languages. In this dissertation work, we conduct a performance analysis of three well known Java bytecode instrumentation libraries: Javassist (Java Programming Assistant), BCEL (Byte Code Engineering Library) and ASM. This performance analysis is done by evaluating a chosen static metric set on a sample set using each library independently. The experiment helps in comparing these libraries for the chosen static metric set.


A critical problem in the development of computer software is identifying badly designed portions of code that can affect execution performance. The main goal of a performance profiler is to identify which parts of the program should be optimized in order to improve the global execution speed. Static analysis had a lot of success in the procedural and object-oriented paradigms in past years. But nowadays, because modern programs written in object-oriented languages such as Java and C# depend on runtime features like interprocedural control flow, dynamic libraries, polymorphism and late binding, the importance of dynamic analysis has increased considerably. Changes in development methodologies in recent years have created the need for dynamic analysis of a program’s behaviour. Program profiling is used to facilitate dynamic program analysis in order to investigate a program’s behaviour using information gathered during program execution.

Calling context profiling is one of the key types of program profiling. Calling context profiles have been found to be useful in program understanding and optimization. Calling contexts can be organized in several data structures, such as the Dynamic Call Tree (DCT), Calling Context Tree (CCT), Hot Calling Context Tree (HCCT), Approximate CCT (ACCT) and Calling Context Uptree (CCU). This work conducts a performance analysis of all major calling context profiling data structures using a set of cohesion and coupling metrics at the package, class and method levels, applied to a set of open source Java applications. The performance parameters considered are time, space and accuracy. The operations used for the required comparison are tree construction and metric evaluation on the data extracted from the trees. The results imply that the DCT, CCT and CCU are similar in metric accuracy but vary in time and space requirements. The ACCT, while incurring the least time and space overhead, generates highly inaccurate and misleading results. The performance of the HCCT is highly dependent on the chosen hot-context threshold.
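As a sketch of the key difference between a DCT and a CCT, the following builds a CCT from an enter/exit event stream, merging repeated calls from the same context into one counted node; the trace is illustrative:

```python
def build_cct(trace):
    """Build a Calling Context Tree from a stream of ('enter'|'exit',
    method) events. Unlike a Dynamic Call Tree, repeated calls of the same
    method from the same calling context share one node with a counter."""
    root = {"name": "root", "calls": 0, "children": {}}
    stack = [root]
    for event, method in trace:
        if event == "enter":
            node = stack[-1]["children"].setdefault(
                method, {"name": method, "calls": 0, "children": {}})
            node["calls"] += 1
            stack.append(node)
        else:  # "exit"
            stack.pop()
    return root

trace = [("enter", "main"), ("enter", "a"), ("exit", "a"),
         ("enter", "a"), ("exit", "a"), ("exit", "main")]
cct = build_cct(trace)
print(cct["children"]["main"]["children"]["a"]["calls"])  # 2
```

A DCT built from the same trace would hold two separate "a" nodes under "main"; the CCT's merging is exactly what keeps its size bounded by the number of distinct contexts rather than the number of calls.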


Unified Modeling Language (UML) sequence diagrams capture the interaction of objects and help in minimizing the gap between abstract design and coding. Although UML is now accepted as a de facto standard for the design and specification of object oriented systems, its structures have various disadvantages. For example, it lacks features for defining the semantics of the systems to be developed. UML 2.0 introduced many new features, such as combined fragments, to make sequence diagrams more expressive than in UML 1.0. However, the lack of formal semantic descriptions of these features makes it difficult for practitioners and tool builders to construct and analyze sequence diagrams that specify high assurance systems. In order to assure software reliability, verification of sequence diagrams using formal methods is expected to be quite effective. Hence, modern software engineering is looking to formal methods for the development of reliable software systems.

We propose a method to verify UML 2.0 sequence diagrams using Büchi automata and Event Deterministic Finite Automata (ETDFA). A Büchi automaton is a type of ω-automaton which can be used to verify desired properties specified in Linear Temporal Logic (LTL). The ETDFA is a formal model for sequence diagrams. This work uses two case studies to illustrate our approach and shows how verification can help designers discover design faults in UML 2.0 sequence diagrams.
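Büchi automata operate on infinite words, so as a rough finite-trace analogue, the sketch below runs a sequence-diagram event trace through an ordinary DFA and rejects traces that violate the property; the states, events, and safety property are invented for illustration and are not from the thesis's case studies:

```python
def accepts(dfa, trace):
    """Run a finite event trace through a DFA given as (start_state,
    accepting_states, transitions), where transitions maps
    (state, event) -> state. A missing transition rejects the trace."""
    start, accepting, delta = dfa
    state = start
    for event in trace:
        if (state, event) not in delta:
            return False
        state = delta[(state, event)]
    return state in accepting

# Toy safety property for an ATM sequence diagram: 'dispense' must be
# preceded by 'authenticate' (states and events are illustrative).
dfa = ("locked", {"locked", "open"},
       {("locked", "authenticate"): "open",
        ("open", "dispense"): "open",
        ("open", "logout"): "locked"})
print(accepts(dfa, ["authenticate", "dispense", "logout"]))  # True
print(accepts(dfa, ["dispense"]))                            # False
```

In the actual approach, the property would be written in LTL and translated to a Büchi automaton, and the diagram's formal model would be checked against it rather than a single concrete trace.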


A software metric is a measure of some property of a piece of software or its specification. Metrics are used to track the development process and to assess code quality. As object oriented analysis and design techniques become widely used, the demand for assessing the quality of object-oriented designs substantially increases. Recently, there has been much research effort to develop and empirically validate metrics for OO design quality. Complexity, coupling, and cohesion have received considerable interest in the field. Since metrics are a crucial source of information for decision making, a large number of OO metrics have been proposed over the last decade to capture the structural quality of OO code and design; these are related to external quality measures. Open source software is software that can be freely used, changed, and shared by anyone. It is made by many people and distributed under licenses that comply with the Open Source Definition. When a software program is open source, its source code is freely available to the public. Unlike commercial software, open source programs can be modified and distributed by anyone and are often developed by a community rather than a single organization.

In this work, we empirically analyze nine metrics measuring various object oriented design attributes at the system level in order to study the impact of a software’s type on its design attributes. The ten software categories in this study are games, chat, UML, encryption, database, code analyzer, media player, web server, editor and management. Each category contains nine different software systems, so a total of ninety Java based open source software systems are analyzed, assuming that the number of downloads indicates the success of the software. Finally, statistical and correlation analyses of the collected metric data are performed.
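The correlation step can be sketched with a self-contained Spearman rank correlation; the metric and download values below are made up purely for illustration:

```python
def rank(values):
    """Ranks (1-based), with average ranks assigned to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: a coupling metric vs. download counts per system.
coupling = [4, 9, 7, 2, 5]
downloads = [900, 150, 300, 1200, 700]
print(round(spearman(coupling, downloads), 2))  # -1.0
```

A rank correlation is the usual choice here because download counts are heavily skewed, so a monotone association matters more than a linear one.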


The primary focus in software engineering involves issues related to the upgrading, migration and evolution of existing software systems. Software evolution is a continuous process which includes activities like enhancement, adaptation and fixing that occur after the operational release of the software to the users. Due to the increased use of computers in all aspects of human activity, collectively and individually, and the important role that computers play in many applications, software evolution is becoming an important challenge. Software evolution is the study of the processes of system change over a system’s lifetime, encompassing development and maintenance. Software maintenance refers to modifying a program after it has been put into use; changes are implemented by modifying existing components and adding new components to the system.

While maintenance refers to activity that takes place at any time after a new development project is implemented, software evolution is defined as examining the dynamic behaviour of systems, i.e., how they change or evolve over time. Lehman and Belady first applied the term evolution to software in the 1970s, and since then most investigators have used the term to refer to the long and broad view of change in software systems. However, empirical research on software evolution is scarce, and most of it has been applied to software developed using traditional development methods.

We perform an empirical study to better understand the evolution of software systems developed using the agile development methodology. We evaluated a selected set of software metrics for three agile projects and used this metric data to compare the projects within the context of Lehman’s laws of software evolution.
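One of Lehman's laws, continuing growth, can be checked mechanically against release-level size data; the class counts below are hypothetical, not measurements from the three studied projects:

```python
def continuing_growth(sizes):
    """Check Lehman's law of continuing growth: a size metric (e.g.
    number of classes) should not shrink across consecutive releases."""
    return all(a <= b for a, b in zip(sizes, sizes[1:]))

def growth_rates(sizes):
    """Relative growth between consecutive releases."""
    return [round((b - a) / a, 3) for a, b in zip(sizes, sizes[1:])]

# Hypothetical class counts over five releases of an agile project.
classes_per_release = [120, 131, 140, 158, 163]
print(continuing_growth(classes_per_release))  # True
print(growth_rates(classes_per_release))
```

Plotting such per-release rates for each project is one simple way to contrast agile evolution against the sub-linear growth patterns Lehman reported for traditionally developed systems.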