Big Data has drawn huge attention from researchers in information sciences and from policy and decision makers in governments and enterprises. As the speed of information growth exceeded Moore's Law at the beginning of this new century, excessive data is causing great trouble for human beings. However, a great deal of potential and highly useful value is hidden in this huge volume of data. A new scientific paradigm has been born as data-intensive scientific discovery (DISD), also known as Big Data problems. A large number of fields and sectors, ranging from economic and business activities to public administration, and from national security to scientific research in many areas, are involved with Big Data problems. On the one hand, Big Data is extremely valuable for raising productivity in businesses and enabling evolutionary breakthroughs in scientific disciplines, which gives us many opportunities to make great progress in many fields. There is no doubt that future competition in business productivity and technologies will converge on Big Data exploration. On the other hand, Big Data also brings many challenges, such as difficulties in data capture, data storage, data analysis and data visualization. This paper aims to provide a close-up view of Big Data, including Big Data applications, Big Data opportunities and challenges, as well as the state-of-the-art techniques and technologies currently adopted to deal with Big Data problems. We also discuss several underlying methodologies to handle the data deluge, for example, granular computing, cloud computing, bio-inspired computing, and quantum computing. (C) 2014 Elsevier Inc. All rights reserved.
AdaBoost is a popular method for vehicle detection, but the training process is quite time-consuming. In this paper, a rapid learning algorithm is proposed to tackle this weakness of AdaBoost for vehicle classification. Firstly, an algorithm for computing the Haar-like feature pool on a 32 x 32 grayscale image patch by using all simple and rotated Haar-like prototypes is introduced to represent a vehicle's appearance. Then, a fast training approach for the weak classifier is presented by combining a sample's feature value with its class label. Finally, a rapid incremental learning algorithm of AdaBoost is designed to significantly improve the performance of AdaBoost. Experimental results demonstrate that the proposed approaches not only speed up the training and incremental learning processes of AdaBoost, but also yield better or competitive vehicle classification accuracies compared with several state-of-the-art methods, showing their potential for real-time applications. (C) 2014 Elsevier Inc. All rights reserved.
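As an illustration of how Haar-like feature values on a grayscale patch can be computed cheaply, the following is a minimal sketch using an integral image (summed-area table). The integral image is a standard technique and is an assumption here, since the abstract does not say how the feature pool is computed. The function names and the single two-rectangle prototype are hypothetical and are not the authors' algorithm.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Hypothetical edge prototype: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 4 x 4 patch: bright left half, dark right half.
patch = [[1, 1, 0, 0] for _ in range(4)]
table = integral_image(patch)
edge_response = haar_two_rect_horizontal(table, 0, 0, 4, 4)
```

Once the table is built, every rectangle sum costs four lookups regardless of its size, which is why such features suit large feature pools.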
Metaheuristics are widely recognized as efficient approaches for many hard optimization problems. This paper provides a survey of some of the main metaheuristics. It outlines the components and concepts that are used in various metaheuristics in order to analyze their similarities and differences. The classification adopted in this paper differentiates between single solution based metaheuristics and population based metaheuristics. The literature survey is accompanied by the presentation of references for further details, including applications. Recent trends are also briefly discussed. (c) 2013 Elsevier Inc. All rights reserved.
Distributed networked control systems have attracted intense attention from both academia and industry due to the multidisciplinary nature among the areas of communication networks, computer science and control. With ever-increasing research trends in these areas, it is desirable to review recent advances and to identify methodologies for distributed networked control systems. This paper presents a brief overview of such systems regarding system configurations, challenging issues and methodologies. First, networked control systems are introduced and their prevalent configurations including centralized, decentralized and distributed structures are outlined. Second, an emphasis is laid on a number of challenging issues from the analysis and synthesis of distributed networked control systems. More specifically, these challenging issues are identified through three integrated aspects: communication, computation and control. Third, different methodologies in the literature for distributed networked control systems are reviewed and categorized based on three pairs: undirected and directed graphs, fixed and time-varying topologies, and time-triggered and event-triggered mechanisms. Finally, concluding remarks are drawn and some potential research directions are suggested. (C) 2015 Elsevier Inc. All rights reserved.
Swarm intelligence is a research field that models the collective intelligence in swarms of insects or animals. Many algorithms that simulate these models have been proposed in order to solve a wide range of problems. The Artificial Bee Colony algorithm, which simulates the foraging behaviour of honey bee colonies, is one of the most recent swarm intelligence based algorithms. In this work, modified versions of the Artificial Bee Colony algorithm are introduced and applied for efficiently solving real-parameter optimization problems. (C) 2010 Elsevier Inc. All rights reserved.
Training classifiers on datasets that suffer from imbalanced class distributions is an important problem in data mining. This issue occurs when the number of examples representing the class of interest is much lower than that of the other classes. Its presence in many real-world applications has attracted growing attention from researchers. We briefly review the many issues this problem raises in machine learning and its applications, by introducing the characteristics of the imbalanced dataset scenario in classification, presenting the specific metrics for evaluating performance in class imbalanced learning and enumerating the proposed solutions. In particular, we describe preprocessing, cost-sensitive learning and ensemble techniques, carrying out an experimental study to contrast these approaches in an intra- and inter-family comparison. We then carry out a thorough discussion of the main issues related to using data intrinsic characteristics in this classification problem. This will help to improve the current models with respect to: the presence of small disjuncts, the lack of density in the training data, the overlapping between classes, the identification of noisy data, the significance of the borderline instances, and the dataset shift between the training and the test distributions. Finally, we introduce several approaches and recommendations to address these problems in conjunction with imbalanced data, and we show some experimental examples of the behavior of the learning algorithms on data with such intrinsic characteristics. (C) 2013 Elsevier Inc. All rights reserved.
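To make concrete why class imbalanced learning needs its own evaluation metrics, the sketch below contrasts plain accuracy with the geometric mean of the per-class recalls and the F1 score. The choice of these metrics is an assumption, as the abstract does not name the ones it uses. The function name and the toy data are hypothetical.

```python
from math import sqrt


def imbalance_metrics(y_true, y_pred, positive=1):
    """Accuracy, per-class recalls, G-mean and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0  # recall on the minority class
    tnr = tn / (tn + fp) if tn + fp else 0.0  # recall on the majority class
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tpr,
        "tnr": tnr,
        "g_mean": sqrt(tpr * tnr),
        "f1": f1,
    }


# 95 majority examples, 5 minority examples, and a classifier that always
# predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
scores = imbalance_metrics(y_true, y_pred)
```

On this toy data the accuracy is high although no minority example is recognised, whereas the G-mean and F1 both drop to zero.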
Cloud computing exhibits remarkable potential to provide cost-effective, easy-to-manage, elastic, and powerful resources on the fly over the Internet. It boosts the capabilities of hardware resources through optimal and shared utilization. These features encourage organizations and individual users to shift their applications and services to the cloud. Even critical infrastructure, for example, power generation and distribution plants, is being migrated to the cloud computing paradigm. However, the services provided by third-party cloud service providers entail additional security threats. The migration of users' assets (data, applications, etc.) outside their administrative control into a shared environment where numerous users are collocated escalates the security concerns. This survey details the security issues that arise due to the very nature of cloud computing. Moreover, the survey presents recent solutions from the literature to counter these security issues. Furthermore, a brief view of security vulnerabilities in mobile cloud computing is also given. Finally, a discussion of open issues and future research directions is presented. (C) 2015 Elsevier Inc. All rights reserved.
An efficient optimization method called 'Teaching-Learning-Based Optimization (TLBO)' is proposed in this paper for large scale non-linear optimization problems for finding the global solutions. The proposed method is based on the effect of the influence of a teacher on the output of learners in a class. The basic philosophy of the method is explained in detail. The effectiveness of the method is tested on many benchmark problems with different characteristics and the results are compared with other population based methods. (C) 2011 Elsevier Inc. All rights reserved.
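The teacher-and-learner idea can be sketched as follows. This is a minimal illustration of TLBO as it is commonly described, with a teacher phase followed by a learner phase, and is not taken from the abstract itself. The update rules, the teaching factor drawn from {1, 2}, the greedy acceptance, the bound clipping and all parameter names are assumptions.

```python
import random


def tlbo(objective, bounds, pop_size=20, iterations=100, seed=0):
    """Minimise `objective` over box `bounds`; returns (best learner, best cost)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    for _ in range(iterations):
        # Teacher phase: the best learner pulls the class mean towards itself.
        teacher = pop[min(range(pop_size), key=cost.__getitem__)]
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            tf = rng.randint(1, 2)  # teaching factor, assumed to be 1 or 2
            cand = clip([pop[i][d] + rng.random() * (teacher[d] - tf * mean[d])
                         for d in range(dim)])
            c = objective(cand)
            if c < cost[i]:  # keep the move only if it improves the learner
                pop[i], cost[i] = cand, c

        # Learner phase: each learner interacts with one random classmate.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            sign = 1.0 if cost[i] < cost[j] else -1.0  # move away from worse, towards better
            cand = clip([pop[i][d] + sign * rng.random() * (pop[i][d] - pop[j][d])
                         for d in range(dim)])
            c = objective(cand)
            if c < cost[i]:
                pop[i], cost[i] = cand, c

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    return sum(v * v for v in x)


best_x, best_cost = tlbo(sphere, [(-5.0, 5.0)] * 3)
```

Apart from population size and iteration count, this sketch has no algorithm-specific tuning parameters.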
Memetic computing is a subject in computer science which considers complex structures as the combination of simple agents, memes, whose evolutionary interactions lead to intelligent structures capable of problem-solving. This paper focuses on memetic computing optimization algorithms and proposes a counter-tendency approach for algorithmic design. Research in the field tends to go in the direction of improving existing algorithms by combining different methods or through the formulation of more complicated structures. Contrary to this trend, we instead focus on simplicity, proposing a structurally simple algorithm with emphasis on processing only one solution at a time. The proposed algorithm, namely three stage optimal memetic exploration, is composed of three memes: the first stochastic and with a long search radius, the second stochastic and with a moderate search radius, and the third deterministic and with a short search radius. The bottom-up combination of the three operators, by means of a natural trial and error logic, generates a robust and efficient optimizer, capable of competing with modern complex and computationally expensive algorithms. This suggests that complexity in algorithmic structures can be unnecessary, if not detrimental, and that simple bottom-up approaches are likely to be competitive; this view is invoked here as an extension to memetic computing based on the philosophical concept of Ockham's Razor. An extensive experimental setup on various test problems and one digital signal processing application is presented. Numerical results show that the proposed approach, despite its simplicity and low computational cost, displays a very good performance on several problems, and is competitive with sophisticated algorithms representing the state of the art in computational intelligence optimization. (C) 2011 Elsevier Inc. All rights reserved.
Cloud computing has emerged as a new computing paradigm that aims to provide reliable, customized and quality of service guaranteed computation environments for cloud users. Applications and databases are moved to large centralized data centers, called the cloud. Due to resource virtualization, global replication and migration, and the physical absence of data and machines in the cloud, the data stored in the cloud and the computation results may not be well managed and fully trusted by the cloud users. Most of the previous work on cloud security focuses on storage security rather than also taking computation security into consideration. In this paper, we propose a privacy cheating discouragement and secure computation auditing protocol, or SecCloud, which is the first protocol bridging secure storage and secure computation auditing in the cloud and achieving privacy cheating discouragement by designated verifier signature, batch verification and probabilistic sampling techniques. A detailed analysis is given to obtain an optimal sampling size that minimizes the cost. Another major contribution of this paper is that we build a practical security-aware cloud computing experimental environment, or SecHDFS, as a test bed to implement SecCloud. Further experimental results demonstrate the effectiveness and efficiency of the proposed SecCloud. (C) 2013 Elsevier Inc. All rights reserved.
The hesitant fuzzy linguistic term sets (HFLTSs), which can be used to represent an expert's hesitant preferences when assessing a linguistic variable, increase the flexibility of eliciting and representing linguistic information. The HFLTSs have attracted a lot of attention recently due to their distinguished power and efficiency in representing uncertainty and vagueness within the process of decision making. To enhance and extend the applicability of HFLTSs, this paper investigates and develops different types of distance and similarity measures for HFLTSs. The paper first proposes a family of distance and similarity measures between two HFLTSs. Then a variety of weighted or ordered weighted distance and similarity measures between two collections of HFLTSs are proposed and analyzed for discrete and continuous cases respectively. After that, the application of these measures to multi-criteria decision making problems is given. Based on the proposed distance and similarity measures, the satisfaction degrees for different alternatives are established and are then used to rank alternatives in multi-criteria decision making. Finally a practical example concerning the evaluation of the quality of movies is given to illustrate the applicability and advantage of the proposed approach and the differences between the proposed distance and similarity measures. (C) 2014 Elsevier Inc. All rights reserved.
In recent years, various heuristic optimization methods have been developed. Many of these methods are inspired by swarm behaviors in nature. In this paper, a new optimization algorithm based on the law of gravity and mass interactions is introduced. In the proposed algorithm, the searcher agents are a collection of masses which interact with each other based on Newtonian gravity and the laws of motion. The proposed method has been compared with some well-known heuristic search methods. The obtained results confirm the high performance of the proposed method in solving various nonlinear functions. (C) 2009 Elsevier Inc. All rights reserved.
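The interaction of masses described above can be sketched as below. This is a minimal illustration of a gravity-based search as it is commonly presented, not a reproduction of the authors' implementation. The exponentially decaying gravitational constant, the fitness-to-mass mapping, the shrinking set of attracting agents, the bound clipping, the best-so-far bookkeeping and all parameter values are assumptions.

```python
import math
import random


def gsa(objective, bounds, agents=20, iterations=200, g0=100.0, alpha=20.0, seed=0):
    """Minimise `objective` over box `bounds`; returns (best position, best value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(agents)]
    vel = [[0.0] * dim for _ in range(agents)]
    best_x, best_f = None, float("inf")

    for t in range(iterations + 1):
        fit = [objective(x) for x in pos]
        for x, f in zip(pos, fit):
            if f < best_f:  # remember the best position seen so far
                best_x, best_f = list(x), f
        if t == iterations:
            break  # final positions have been evaluated, no further move

        # Masses: a lower objective value gives a heavier agent.
        best, worst = min(fit), max(fit)
        if worst == best:
            mass = [1.0 / agents] * agents
        else:
            raw = [(worst - f) / (worst - best) for f in fit]
            total = sum(raw)
            mass = [m / total for m in raw]

        g = g0 * math.exp(-alpha * t / iterations)  # gravity weakens over time
        k = max(1, round(agents - (agents - 1) * t / iterations))
        heavy = sorted(range(agents), key=fit.__getitem__)[:k]  # attracting agents

        # Accelerations are computed for all agents before any agent moves.
        acc = []
        for i in range(agents):
            a = [0.0] * dim
            for j in heavy:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                pull = rng.random() * g * mass[j] / (dist + 1e-12)
                for d in range(dim):
                    a[d] += pull * (pos[j][d] - pos[i][d])
            acc.append(a)

        # Laws of motion: new velocity is a random fraction of the old one plus acceleration.
        for i in range(agents):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)

    return best_x, best_f


def sphere(x):
    return sum(v * v for v in x)


best_x, best_f = gsa(sphere, [(-5.0, 5.0)] * 2)
```

Because the agent's own mass cancels when force is divided by mass, only the mass of the attracting agent appears in the acceleration.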
In this paper, we propose a variety of distance measures for hesitant fuzzy sets, based on which the corresponding similarity measures can be obtained. We investigate the connections of the aforementioned distance measures and further develop a number of hesitant ordered weighted distance measures and hesitant ordered weighted similarity measures. They can alleviate the influence of unduly large (or small) deviations on the aggregation results by assigning them low (or high) weights. Several numerical examples are provided to illustrate these distance and similarity measures. (C) 2011 Elsevier Inc. All rights reserved.
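As a small worked illustration of how a similarity measure can be obtained from a distance measure, the sketch below computes a normalized Hamming-type distance between two hesitant fuzzy elements and takes one minus that distance as the similarity. The specific distance, the rule for padding the shorter element with its smallest value, and the function names are assumptions chosen for illustration and are not necessarily the definitions used in the paper.

```python
def extend(h, length, optimistic=False):
    """Pad a hesitant fuzzy element to `length` values.

    The shorter element is padded with its minimum (pessimistic) or its
    maximum (optimistic) membership value; this convention is an assumption.
    """
    values = sorted(h)
    pad = values[-1] if optimistic else values[0]
    return sorted(values + [pad] * (length - len(values)))


def hesitant_hamming(h1, h2):
    """Mean absolute difference between the sorted, equal-length elements."""
    n = max(len(h1), len(h2))
    a, b = extend(h1, n), extend(h2, n)
    return sum(abs(x - y) for x, y in zip(a, b)) / n


def similarity(h1, h2):
    """Similarity derived from the distance as 1 - d."""
    return 1.0 - hesitant_hamming(h1, h2)


h1 = [0.2, 0.4, 0.6]
h2 = [0.3, 0.5]  # padded to [0.3, 0.3, 0.5] before comparison
d = hesitant_hamming(h1, h2)
s = similarity(h1, h2)
```

Here each of the three sorted pairs differs by 0.1, so the distance is 0.1 and the similarity is 0.9, up to floating-point rounding.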
Microarray data classification is a difficult challenge for machine learning researchers due to the high number of features and the small sample sizes. Feature selection soon became a de facto standard in this field after its introduction, and a huge number of feature selection methods have been used to reduce the input dimensionality while improving the classification performance. This paper is devoted to reviewing the most up-to-date feature selection methods developed in this field and the microarray databases most frequently used in the literature. We also make the interested reader aware of the problems posed by data characteristics in this domain, such as the imbalance of the data, their complexity, or the so-called dataset shift. Finally, an experimental evaluation on the most representative datasets using well-known feature selection methods is presented, bearing in mind that the aim is not to provide the best feature selection method, but to facilitate their comparative study by the research community. (C) 2014 Elsevier Inc. All rights reserved.
The evaluation of clustering algorithms is intrinsically difficult because of the lack of objective measures. Since the evaluation of clustering algorithms normally involves multiple criteria, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper presents an MCDM-based approach to rank a selection of popular clustering algorithms in the domain of financial risk analysis. An experimental study is designed to validate the proposed approach using three MCDM methods, six clustering algorithms, and eleven cluster validity indices over three real-life credit risk and bankruptcy risk data sets. The results demonstrate the effectiveness of MCDM methods in evaluating clustering algorithms and indicate that the repeated-bisection method leads to good 2-way clustering solutions on the selected financial risk data sets. (C) 2014 Elsevier Inc. All rights reserved.
The complexity and impact of many real world decision making problems lead to the necessity of considering multiple points of view, building group decision making problems in which a group of experts provide their preferences to achieve a solution. In such complex problems uncertainty is often present, and although the use of linguistic information has provided successful results in managing it, these are sometimes limited because the linguistic models use single-valued and predefined terms that restrict the richness of freely eliciting the preferences of the experts. Often, experts hesitate between different linguistic terms and require richer expressions to express their knowledge more accurately. However, linguistic group decision making approaches do not provide any model to make the elicitation of linguistic preferences more flexible in such hesitant situations. In this paper, a new linguistic group decision model is proposed that facilitates the elicitation of flexible and rich linguistic expressions, in particular through the use of comparative linguistic expressions, close to human beings' cognitive models for expressing linguistic preferences, based on hesitant fuzzy linguistic term sets and context-free grammars. This model defines the group decision process and the necessary operators and tools to manage such linguistic expressions. (C) 2013 Elsevier Inc. All rights reserved.
Experimental analysis of the performance of a proposed method is a crucial and necessary task in an investigation. In this paper, we focus on the use of nonparametric statistical inference for analyzing the results obtained in an experiment design in the field of computational intelligence. We present a case study which involves a set of techniques in classification tasks and we study a set of nonparametric procedures useful to analyze the behavior of a method with respect to a set of algorithms, such as the framework in which a new proposal is developed. Particularly, we discuss some basic and advanced nonparametric approaches which improve the results offered by the Friedman test in some circumstances. A set of post hoc procedures for multiple comparisons is presented together with the computation of adjusted p-values. We also perform an experimental analysis for comparing their power, with the objective of detecting the advantages and disadvantages of the statistical tests described. We found that some aspects such as the number of algorithms, number of data sets and differences in performance offered by the control method are very influential in the statistical tests studied. Our final goal is to offer a complete guideline for the use of nonparametric statistical procedures for performing multiple comparisons in experimental studies. (C) 2009 Elsevier Inc. All rights reserved.
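A compact sketch of two of the ingredients mentioned above is given below: the Friedman chi-square statistic computed from per-data-set ranks, and Holm-adjusted p-values for a set of post hoc comparisons. Both follow their standard textbook definitions. The function names and the toy error table are hypothetical, and converting the statistic into a p-value is left out.

```python
def average_ranks(row):
    """Rank one row of results (1 = lowest value), averaging ranks over ties."""
    order = sorted(range(len(row)), key=row.__getitem__)
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman(errors):
    """Mean rank per algorithm and Friedman chi-square (rows = data sets)."""
    n, k = len(errors), len(errors[0])
    ranked = [average_ranks(row) for row in errors]
    mean_ranks = [sum(r[j] for r in ranked) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=p_values.__getitem__)
    adjusted = [0.0] * m
    running = 0.0
    for rank, idx in enumerate(order):
        running = max(running, min(1.0, (m - rank) * p_values[idx]))
        adjusted[idx] = running
    return adjusted


# Hypothetical error rates of three algorithms on four data sets.
errors = [
    [0.10, 0.20, 0.30],
    [0.05, 0.15, 0.25],
    [0.12, 0.22, 0.40],
    [0.08, 0.09, 0.10],
]
mean_ranks, chi2 = friedman(errors)
```

With this table the first algorithm is always ranked first and the third always last, so the mean ranks are 1, 2 and 3 and the statistic evaluates to 8.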
This paper is a brief survey of the existing problems and challenges inherent in model-based control (MBC) theory, and it reviews and addresses some important issues in the analysis and design of data-driven control (DDC) methods. The necessity of data-driven control is discussed from the aspects of the history, the present, and the future of control theories and applications. The state of the art of the existing DDC methods and applications is presented with appropriate classifications and insights. The relationship between the MBC method and the DDC method, the differences among different DDC methods, and relevant topics in data-driven optimization and modeling are also highlighted. Finally, the perspective of DDC and associated research topics are briefly explored and discussed. (c) 2012 Elsevier Inc. All rights reserved.
Service-oriented computing (SOC) represents a paradigm for building distributed computing applications over the Internet. In the past decade, Web services composition has been an active area of research and development endeavors for application integration and interoperation. Although Web services composition has been heavily investigated, several issues related to dependability, ubiquity, and personalization, among others, still need to be addressed, especially given the recent rise of several new computing paradigms such as Cloud computing, social computing, and Web of Things. This article overviews the life cycle of Web services composition and surveys the main standards, research prototypes, and platforms. These standards, research prototypes, and platforms are assessed using a set of assessment criteria identified in the article. The paper also outlines several research opportunities and challenges for Web services composition. (C) 2014 Elsevier Inc. All rights reserved.
The Bonferroni mean (BM) can capture the interrelationships among arguments, which plays a crucial role in multi-criteria decision making problems. In this paper, we explore the geometric Bonferroni mean (GBM) considering both the BM and the geometric mean (GM) under hesitant fuzzy environment. We further define the hesitant fuzzy geometric Bonferroni mean (HFGBM) and the hesitant fuzzy Choquet geometric Bonferroni mean (HFCGBM). Then we give the definition of hesitant fuzzy geometric Bonferroni element (HFGBE), which is considered as the basic calculational unit in the HFGBM and reflects the conjunction between two aggregated arguments. The properties and special cases of the HFGBM are studied in detail based on the discussion of the HFGBE. In addition, the weighted hesitant fuzzy geometric Bonferroni mean (WHFGBM) and the weighted hesitant fuzzy Choquet geometric Bonferroni mean (WHFCGBM) are proposed considering the importance of each argument and the correlations among them. In the end, we apply the proposed aggregation operators to multi-criteria decision making, and give some examples to illustrate our results. (C) 2012 Published by Elsevier Inc.
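For readers unfamiliar with the underlying operators, the sketch below evaluates the Bonferroni mean and the geometric Bonferroni mean for ordinary (crisp) numbers in [0, 1], using their standard definitions with parameters p and q. It does not implement the hesitant fuzzy or Choquet extensions proposed in the paper, and the function names are hypothetical.

```python
from itertools import permutations
from math import prod


def bonferroni_mean(values, p=1.0, q=1.0):
    """BM^{p,q}: ((1 / (n(n-1))) * sum_{i != j} a_i^p * a_j^q)^(1 / (p+q))."""
    n = len(values)
    total = sum(a ** p * b ** q for a, b in permutations(values, 2))
    return (total / (n * (n - 1))) ** (1.0 / (p + q))


def geometric_bonferroni_mean(values, p=1.0, q=1.0):
    """GBM^{p,q}: (1 / (p+q)) * prod_{i != j} (p*a_i + q*a_j)^(1 / (n(n-1)))."""
    n = len(values)
    exponent = 1.0 / (n * (n - 1))
    product = prod((p * a + q * b) ** exponent for a, b in permutations(values, 2))
    return product / (p + q)


# permutations(values, 2) enumerates every ordered pair of distinct positions,
# which is exactly the i != j index set in both definitions.
ratings = [0.2, 0.8]
bm = bonferroni_mean(ratings)             # sqrt((0.16 + 0.16) / 2)
gbm = geometric_bonferroni_mean(ratings)  # (1.0 ** 0.5 * 1.0 ** 0.5) / 2
```

Both operators return the common value when all arguments are equal, while on unequal arguments they generally differ, as the two-element example shows (about 0.4 against 0.5).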