Recommender systems have developed in parallel with the web. They were initially based on demographic, content-based and collaborative filtering. Currently, these systems are incorporating social information. In the future, they will use implicit, local and personal information from the Internet of things. This article provides an overview of recommender systems as well as collaborative filtering methods and algorithms; it also explains their evolution, provides an original classification for these systems, identifies areas of future implementation and develops certain areas selected for past, present or future importance.
This paper proposes a novel population-based optimization algorithm called the Sine Cosine Algorithm (SCA) for solving optimization problems. The SCA creates multiple initial random candidate solutions and requires them to fluctuate outwards or towards the best solution using a mathematical model based on sine and cosine functions. Several random and adaptive variables are also integrated into this algorithm to emphasize exploration and exploitation of the search space at different milestones of optimization. The performance of SCA is benchmarked in three test phases. Firstly, a set of well-known test cases including unimodal, multi-modal, and composite functions is employed to test exploration, exploitation, local optima avoidance, and convergence of SCA. Secondly, several performance metrics (search history, trajectory, average fitness of solutions, and the best solution during optimization) are used to qualitatively observe and confirm the performance of SCA on shifted two-dimensional test functions. Finally, the cross-section of an aircraft's wing is optimized by SCA as a challenging real-world case study to verify and demonstrate the performance of this algorithm in practice. The results on the test functions and performance metrics show that the proposed algorithm is able to explore different regions of a search space, avoid local optima, converge towards the global optimum, and exploit promising regions of a search space effectively during optimization. The SCA obtains a smooth shape for the airfoil with very low drag, which demonstrates that this algorithm can be highly effective in solving real problems with constrained and unknown search spaces. Note that the source codes of the SCA algorithm are publicly available.
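The idea of candidates fluctuating towards or away from the best solution can be sketched in a few lines. The following is a minimal illustration, assuming the commonly cited SCA position update with a linearly shrinking amplitude; the function name, the default `a=2.0`, the sampling ranges of the random variables and the clamping to a box are assumptions made for this sketch and are not stated in the abstract.

```python
import math
import random


def sine_cosine_search(objective, dim, lower, upper,
                       agents=20, iters=100, a=2.0, seed=0):
    """Minimize `objective` over a box; returns (best_x, best_f, history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(agents)]
    best_x = min(pop, key=objective)[:]
    best_f = objective(best_x)
    history = [best_f]  # best fitness found so far, one entry per iteration
    for t in range(iters):
        # Amplitude shrinks linearly, shifting exploration to exploitation.
        r1 = a - t * (a / iters)
        for x in pop:
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                # Pick the sine or the cosine branch with equal probability.
                wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
                x[j] += r1 * wave * abs(r3 * best_x[j] - x[j])
                x[j] = min(max(x[j], lower), upper)  # stay inside the box
            f = objective(x)
            if f < best_f:
                best_x, best_f = x[:], f
        history.append(best_f)
    return best_x, best_f, history


def sphere(x):
    """Simple unimodal test function with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f, history = sine_cosine_search(sphere, dim=3,
                                             lower=-10.0, upper=10.0)
```

Because the best solution is only replaced when a strictly better one is found, `history` never increases; how quickly it decreases depends on the assumed parameters.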
With the advent of Web 2.0, people became more eager to express and share their opinions on the web regarding day-to-day activities as well as global issues. The evolution of social media has also contributed immensely to these activities, providing a transparent platform for sharing views across the world. These electronic Word of Mouth (eWOM) statements expressed on the web are prevalent in the business and service industries, where they enable customers to share their points of view. Over the last decade and a half, research communities, academia, the public and service industries have been working rigorously on sentiment analysis, also known as opinion mining, to extract and analyze public mood and views. In this regard, this paper presents a rigorous survey on sentiment analysis, which portrays views presented by over one hundred articles published in the last decade regarding necessary tasks, approaches, and applications of sentiment analysis. Several sub-tasks need to be performed for sentiment analysis, which in turn can be accomplished using various approaches and techniques. This survey, covering literature published during 2002–2015, is organized on the basis of the sub-tasks to be performed, the machine learning and natural language processing techniques used, and the applications of sentiment analysis. The paper also presents open issues along with a summary table of one hundred and sixty-one articles.
In this paper, a novel nature-inspired optimization paradigm called the Moth-Flame Optimization (MFO) algorithm is proposed. The main inspiration of this optimizer is the navigation method of moths in nature called transverse orientation. Moths fly at night by maintaining a fixed angle with respect to the moon, a very effective mechanism for travelling in a straight line over long distances. However, these insects become trapped in a useless and often deadly spiral path around artificial lights. This paper mathematically models this behaviour to perform optimization. The MFO algorithm is compared with other well-known nature-inspired algorithms on 29 benchmark and 7 real engineering problems. The statistical results on the benchmark functions show that this algorithm is able to provide very promising and competitive results. Additionally, the results on the real problems demonstrate the merits of this algorithm in solving challenging problems with constrained and unknown search spaces. The paper also considers the application of the proposed algorithm in the field of marine propeller design to further investigate its effectiveness in practice. Note that the source codes of the MFO algorithm are publicly available.
In this paper, we present the first deep learning approach to aspect extraction in opinion mining. Aspect extraction is a subtask of sentiment analysis that consists in identifying opinion targets in opinionated text, i.e., in detecting the specific aspects of a product or service the opinion holder is either praising or complaining about. We used a 7-layer deep convolutional neural network to tag each word in opinionated sentences as either aspect or non-aspect word. We also developed a set of linguistic patterns for the same purpose and combined them with the neural network. The resulting ensemble classifier, coupled with a word-embedding model for sentiment analysis, allowed our approach to obtain significantly better accuracy than state-of-the-art methods.
We propose a decentralized belief propagation-based method, PD-LBP, for multi-agent task allocation in open and dynamic grid and cloud environments where both the sets of agents and tasks constantly change. PD-LBP aims at accelerating the online response to, and improving the resilience against, unpredicted changes in the environment, and at reducing the message passing required for task allocation. To do this, PD-LBP uses two phases, pruning and decomposition. The pruning phase reduces the search space by pruning the resource providers, and the decomposition phase decomposes the network into multiple independent parts in which belief propagation can be run in parallel. PD-LBP is compared with two other state-of-the-art methods, the Loopy Belief Propagation-based method and the Reduced Binary Loopy Belief Propagation-based method. The evaluation results demonstrate the efficiency of PD-LBP in terms of both shorter problem-solving time and smaller communication requirements for task allocation in dynamic environments.
This article proposes a new Fruit Fly Optimization Algorithm (FOA). The FOA has a simple computational process, is easy to translate into program code, and is easy to understand. Optimization is a problem that is commonly researched and discussed by scholars from all kinds of fields. If a problem is not handled optimally, a great deal of manpower and capital is usually wasted, and in the worst case this can lead to failure and wasted effort. Therefore, this article proposes the Fruit Fly Optimization Algorithm, a much simpler and more robust optimization algorithm than the complicated optimization methods proposed by past scholars. The algorithm is tested repeatedly on finding the maximal and minimal values of a function, and the population size and characteristics are also investigated. Moreover, financial distress data of Taiwanese enterprises are collected, and a General Regression Neural Network optimized by the fruit fly algorithm, a General Regression Neural Network, and Multiple Regression are adopted to construct financial distress models. It is found that the RMSE value of the General Regression Neural Network model optimized by the Fruit Fly Optimization Algorithm converges very well, and that the model also has very good classification and prediction capability.
The hesitant fuzzy linguistic term set (HFLTS) is a new and flexible tool in representing hesitant qualitative information in decision making. Correlation measures and correlation coefficients have been applied widely in many research domains and practical fields. This paper focuses on the correlation measures and correlation coefficients of HFLTSs. To start the investigation, the definition of HFLTS is improved and the concept of hesitant fuzzy linguistic element (HFLE) is introduced. Motivated by the idea of traditional correlation coefficients of fuzzy sets, intuitionistic fuzzy sets and hesitant fuzzy sets, several different types of correlation coefficients for HFLTSs are proposed. The prominent properties of these correlation coefficients are then investigated. In addition, considering that different HFLEs may have different weights, the weighted correlation coefficients and ordered weighted correlation coefficients are further investigated. Finally, an application example concerning the traditional Chinese medical diagnosis is given to illustrate the applicability and validation of the proposed correlation coefficients of HFLTSs in the process of qualitative decision making.
In this paper, we investigate hesitant fuzzy multiple attribute decision making (MADM) problems in which the attributes have different priority levels. Motivated by the idea of prioritized aggregation operators [R.R. Yager, Prioritized aggregation operators, International Journal of Approximate Reasoning 48 (2008) 263–274], we develop some prioritized aggregation operators for aggregating hesitant fuzzy information, and then apply them to develop some models for such problems. Finally, a practical example about talent introduction is given to verify the developed approaches and to demonstrate their practicality and effectiveness.
Transfer learning aims to provide a framework to utilize previously-acquired knowledge to solve new but similar problems much more quickly and effectively. In contrast to classical machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling consisting of different data patterns in the current domain. To improve the performance of existing transfer learning methods and handle the knowledge transfer process in real-world systems, computational intelligence has recently been applied in transfer learning. This paper systematically examines computational intelligence-based transfer learning techniques and clusters related technique developments into four main categories: (a) neural network-based transfer learning; (b) Bayes-based transfer learning; (c) fuzzy transfer learning, and (d) applications of computational intelligence-based transfer learning. By providing state-of-the-art knowledge, this survey will directly support researchers and practice-based professionals to understand the developments in computational intelligence-based transfer learning research and applications.
The hesitant fuzzy set (HFS), which allows the membership degree of an element to a set to be represented by several possible values, is considered a powerful tool for expressing uncertain information in multi-attribute decision making (MADM) problems. In this paper, we develop a novel approach based on TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) and the maximizing deviation method for solving MADM problems in which the evaluation information provided by the decision maker is expressed in hesitant fuzzy elements and the information about attribute weights is incomplete. Two key issues are addressed in this approach. The first is to establish an optimization model based on the maximizing deviation method, which can be used to determine the attribute weights. The second, following the idea of the TOPSIS of Hwang and Yoon, is to calculate the relative closeness coefficient of each alternative to the hesitant positive-ideal solution, based on which the considered alternatives are ranked and the most desirable one is selected. An energy policy selection problem is used to illustrate the detailed implementation process of the proposed approach and to demonstrate its validity and applicability. Finally, the extended results in interval-valued hesitant fuzzy situations are also pointed out.
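The relative closeness coefficient is easiest to see in the classical crisp setting. The sketch below implements ordinary TOPSIS on plain numeric scores, not the hesitant fuzzy extension of the paper. It assumes that all attributes are benefit-type, that the weights are already known, and that columns are vector-normalized; the function name is hypothetical.

```python
import math


def topsis_closeness(scores, weights):
    """Closeness of each alternative to the ideal solution, in [0, 1].

    scores[i][j] is the (benefit-type) score of alternative i on attribute j.
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_attr = len(weights)
    # Vector-normalize each attribute column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores))
             for j in range(n_attr)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_attr)]
         for row in scores]
    best = [max(row[j] for row in v) for j in range(n_attr)]   # positive ideal
    worst = [min(row[j] for row in v) for j in range(n_attr)]  # negative ideal
    closeness = []
    for row in v:
        d_pos = math.dist(row, best)
        d_neg = math.dist(row, worst)
        closeness.append(d_neg / (d_pos + d_neg))
    return closeness


c = topsis_closeness([[0.9, 0.8], [0.5, 0.4], [0.1, 0.2]], [0.5, 0.5])
ranking = sorted(range(len(c)), key=lambda i: c[i], reverse=True)
```

An alternative that coincides with the positive ideal on every attribute gets closeness 1, and one that coincides with the negative ideal gets 0; the alternatives are then ranked by descending closeness.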
Collaborative filtering has become one of the most widely used approaches for providing personalized services to users. The key to this approach is to find similar users or items using the user-item rating matrix so that the system can make recommendations to users. However, most methods of this kind are based on similarity measures such as cosine similarity, the Pearson correlation coefficient, and mean squared difference. These measures are not very effective, especially for cold-start users. This paper presents a new user similarity model to improve recommendation performance when only a few ratings are available to calculate the similarities for each user. The model considers not only the local context information of user ratings, but also the global preference of user behavior. Experiments are conducted on three real data sets, and the model is compared with many state-of-the-art similarity measures. The results show the superiority of the new similarity model in recommendation performance.
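To make the weakness of the traditional measures concrete, the sketch below computes cosine and Pearson similarity over co-rated items only. It illustrates the baseline measures named above and not the new model proposed in the paper; the dictionary-based rating format and the function names are assumptions.

```python
import math


def co_rated(u, v):
    """Items rated by both users (ratings are dicts: item -> rating)."""
    return sorted(set(u) & set(v))


def cosine_sim(u, v):
    items = co_rated(u, v)
    if not items:
        return 0.0
    dot = sum(u[i] * v[i] for i in items)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in items))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in items))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson_sim(u, v):
    items = co_rated(u, v)
    if len(items) < 2:
        return 0.0  # correlation is undefined for fewer than two co-ratings
    mean_u = sum(u[i] for i in items) / len(items)
    mean_v = sum(v[i] for i in items) / len(items)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in items)
    den = (math.sqrt(sum((u[i] - mean_u) ** 2 for i in items))
           * math.sqrt(sum((v[i] - mean_v) ** 2 for i in items)))
    return num / den if den else 0.0


# Two users who share a single item but disagree strongly on it.
fan = {"film_a": 5}
critic = {"film_a": 1}
misleading = cosine_sim(fan, critic)
```

With a single co-rated item, cosine similarity is 1.0 no matter how far apart the two ratings are, and Pearson correlation cannot be computed at all, which is the kind of few-ratings failure that motivates a different similarity model.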
We introduce a new type of fuzzy preference structure, called interval-valued hesitant preference relations, to describe uncertain evaluation information in group decision making (GDM) processes. Moreover, it allows decision makers to offer all possible interval values that are not accounted for in current preference structure types when one compares two alternatives. We generalize the concept of hesitant fuzzy set (HFS) to that of interval-valued hesitant fuzzy set (IVHFS) in which the membership degrees of an element to a given set are not exactly defined, but denoted by several possible interval values. We give systematic aggregation operators to aggregate interval-valued hesitant fuzzy information. In addition, we develop an approach to GDM based on interval-valued hesitant preference relations in order to consider the differences of opinions between individual decision makers. Numerical examples are provided to illustrate the proposed approach.
Epilepsy is a neurological disorder of the brain that is difficult to diagnose visually using Electroencephalogram (EEG) signals. Hence, automated detection of epilepsy using EEG signals would be a useful tool in the medical field. Automating epilepsy detection using signal processing techniques such as the wavelet transform and entropies may optimise the performance of the system. Many algorithms have been developed to detect the presence of seizures in EEG signals. Entropy is a nonlinear parameter that reflects the complexity of the EEG signal. Many entropies have been used to differentiate normal, interictal and ictal EEG signals. This paper discusses various entropies used for the automated diagnosis of epilepsy using EEG signals. We present unique ranges for the various entropies used to differentiate normal, interictal, and ictal EEG signals, and also rank them according to their ability to discriminate between the three classes. These entropies can be used to classify the different stages of epilepsy and can also be used for other biomedical applications.
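As a rough illustration of how a single entropy value can summarize signal complexity, the sketch below computes the Shannon entropy of a signal's amplitude histogram. This is only one simple entropy, chosen here for illustration; it is not claimed to be one of the measures ranked in the paper, and the function name and default bin count are assumptions.

```python
import math


def shannon_entropy(signal, bins=8):
    """Shannon entropy (in bits) of the amplitude histogram of `signal`."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal has no amplitude uncertainty
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in signal:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


flat = shannon_entropy([1.0] * 10)
spread = shannon_entropy([0, 1, 2, 3], bins=4)
```

A constant signal yields zero entropy, while samples spread evenly over all bins yield the maximum of log2(bins) bits.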
In this paper, we proposed a novel spam detection method that focused on reducing the false positive error of mislabeling nonspam as spam. First, we used a wrapper-based feature selection method to extract crucial features. Second, the decision tree was chosen as the classifier model, with C4.5 as the training algorithm. Third, a cost matrix was introduced to give different weights to the two error types, i.e., the false positive and the false negative errors; we defined a weight parameter to adjust the relative importance of the two error types. Fourth, K-fold cross validation was employed to reduce out-of-sample error. Finally, the binary PSO with mutation operator (MBPSO) was used as the subset search strategy. Our experimental dataset contains 6000 emails, which were collected during 2012. We conducted a Kolmogorov–Smirnov hypothesis test on the capital-run-length related features and found that all the p-values were less than 0.001. Afterwards, we found that a parameter value of 7 was the most appropriate in our model. Among seven meta-heuristic algorithms, we demonstrated that the MBPSO is superior to GA, RSA, PSO, and BPSO in terms of classification performance. The sensitivity, specificity, and accuracy of the decision tree with feature selection by MBPSO were 91.02%, 97.51%, and 94.27%, respectively. We also compared the MBPSO with conventional feature selection methods such as SFS and SBS. The results showed that the MBPSO performs better than SFS and SBS. We also demonstrated that wrappers are more effective than filters with regard to classification performance indexes. The results show that the proposed method is effective and that it can reduce the false positive error without compromising the sensitivity and accuracy values.
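The reported metrics and the role of a cost weight can be illustrated from a confusion matrix. The sketch below assumes spam is the positive class and that the weight multiplies the cost of a false positive while a false negative costs one unit; this form of the cost, the function names and the example counts are assumptions made for illustration and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # spam correctly flagged
    specificity = tn / (tn + fp)   # nonspam correctly passed
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def weighted_cost(fn, fp, fp_weight):
    """Total misclassification cost when a false positive costs `fp_weight`
    times as much as a false negative (assumed cost-matrix form)."""
    return fp_weight * fp + fn


# Made-up counts, for illustration only.
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=95, fp=5)
cost = weighted_cost(fn=10, fp=5, fp_weight=7)
```

Raising the weight makes each nonspam message wrongly flagged as spam more expensive, so a classifier selected to minimize this cost trades some sensitivity for higher specificity.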
The aim of an intrusion detection system (IDS) is to detect various types of malicious network traffic and computer usage that cannot be detected by a conventional firewall. Many IDSs have been developed based on machine learning techniques. Specifically, advanced detection approaches created by combining or integrating multiple learning techniques have shown better detection performance than general single learning techniques. The feature representation method is important for pattern classifiers because it facilitates correct classification; however, very few related studies have focused on how to extract more representative features for normal connections and for effective detection of attacks. This paper proposes a novel feature representation approach, namely the cluster center and nearest neighbor (CANN) approach. In this approach, two distances are measured and summed: the first is the distance between each data sample and its cluster center, and the second is the distance between the data sample and its nearest neighbor in the same cluster. This new one-dimensional distance-based feature is then used to represent each data sample for intrusion detection by a k-Nearest Neighbor (k-NN) classifier. The experimental results based on the KDD-Cup 99 dataset show that the CANN classifier not only performs better than or similarly to k-NN and support vector machines trained and tested with the original feature representation in terms of classification accuracy, detection rates, and false alarms, but also provides high computational efficiency in classifier training and testing (i.e., detection).
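The one-dimensional feature described above can be sketched directly from its two distances. The code below follows the wording of the abstract; it assumes that the cluster centers have already been computed (for example by k-means), that each sample belongs to its nearest center, and that Euclidean distance is used. The function name and the toy data are hypothetical.

```python
import math


def cann_features(samples, centers):
    """One distance-based feature per sample:
    distance to its own cluster center + distance to its nearest
    neighbor within the same cluster."""
    # Assign every sample to the closest of the given centers.
    labels = [min(range(len(centers)), key=lambda c: math.dist(x, centers[c]))
              for x in samples]
    features = []
    for i, x in enumerate(samples):
        d_center = math.dist(x, centers[labels[i]])
        same_cluster = [math.dist(x, y) for k, y in enumerate(samples)
                        if k != i and labels[k] == labels[i]]
        # A sample alone in its cluster has no neighbor; use 0.0 in that case.
        d_neighbor = min(same_cluster) if same_cluster else 0.0
        features.append(d_center + d_neighbor)
    return features


samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 1.0)]
centers = [(0.0, 1.0), (10.0, 0.5)]
features = cann_features(samples, centers)
```

Each multi-dimensional sample is reduced to a single number, which is what makes the subsequent k-NN classification cheap; the nearest-neighbor search here is a quadratic scan and would need an index structure for data of realistic size.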
Accurate annual power load forecasting can provide reliable guidance for power grid operation and power construction planning, and is also important for the sustainable development of the electric power industry. Annual power load forecasting is a non-linear problem because the load curve shows a non-linear characteristic. The generalized regression neural network (GRNN) has been proven to be effective in dealing with non-linear problems, but regrettably it has rarely been applied to annual power load forecasting. Therefore, the GRNN is used for annual power load forecasting in this paper. However, determining the appropriate spread parameter is a key issue when using the GRNN for power load forecasting. To solve this problem, this paper proposes a hybrid annual power load forecasting model combining the fruit fly optimization algorithm (FOA) and the generalized regression neural network, in which the FOA is used to automatically select the appropriate spread parameter value for the GRNN power load forecasting model. The effectiveness of the proposed hybrid model is demonstrated by two simulation experiments, both of which show that it outperforms the GRNN model with default parameter, the GRNN model with particle swarm optimization (PSOGRNN), the least squares support vector machine with simulated annealing algorithm (SALSSVM), and the ordinary least squares linear regression (OLS_LR) forecasting models in annual power load forecasting.
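The role of the spread parameter is easiest to see in code. The sketch below implements the standard GRNN prediction as a Gaussian-kernel weighted average of the training targets and, as a plainly simpler stand-in for the FOA search used in the paper, picks the spread from a candidate list by leave-one-out RMSE. The function names, the candidate values and the toy data are assumptions.

```python
import math


def grnn_predict(x, train_x, train_y, spread):
    """Kernel-weighted average of training targets (standard GRNN form)."""
    weights = [math.exp(-math.dist(x, xi) ** 2 / (2.0 * spread ** 2))
               for xi in train_x]
    total = sum(weights)
    if total == 0.0:  # every kernel underflowed: fall back to the mean
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_rmse(train_x, train_y, spread):
    """Leave-one-out RMSE of the GRNN for a given spread."""
    squared = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        err = grnn_predict(train_x[i], rest_x, rest_y, spread) - train_y[i]
        squared.append(err ** 2)
    return math.sqrt(sum(squared) / len(squared))


def pick_spread(train_x, train_y, candidates):
    """Grid search over candidate spreads (stand-in for the FOA search)."""
    return min(candidates, key=lambda s: loo_rmse(train_x, train_y, s))


train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 2.0, 3.0]
candidates = [0.1, 0.5, 1.0, 5.0]
spread = pick_spread(train_x, train_y, candidates)
```

A very small spread makes the prediction follow the nearest training point, while a very large spread flattens it towards the mean of the targets, which is why the value has to be tuned rather than left at a default.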
A theoretical visual interaction framework to model consensus in social network group decision making (SN-GDM) is put forward with the following three main components: (1) construction of the trust relationship; (2) a trust-based recommendation mechanism; and (3) a visual adoption mechanism. To this end, dual trust propagation is investigated to complete incomplete trust relationships via trusted third partners, in a way that fits our intuition in these cases: trust values decrease while distrust values increase. The trust relationship is proposed for determining the trust degree of experts and for aggregating individual opinions into a collective one. Three levels of consensus degree are defined and used to identify the inconsistent experts. A trust-based recommendation mechanism is developed to generate advice according to individual trust relationships, making recommendations more likely to be implemented by the inconsistent experts in order to achieve higher levels of consensus. It therefore has an advantage over existing interaction models because it does not force the inconsistent experts to accept advice irrespective of their trust in it. Finally, a visual adoption mechanism, which provides visual representations of experts' individual consensus positions before and after adopting the recommended advice, is presented and analysed theoretically. Experts can select their appropriate feedback parameters to achieve a balance between group consensus and individual independence. Consequently, the proposed visual interaction model adds real and needed flexibility in guiding the consensus reaching process in SN-GDM.
Epilepsy is an electrophysiological disorder of the brain, characterized by recurrent seizures. The electroencephalogram (EEG) is a test that measures and records the electrical activity of the brain, and is widely used in the detection and analysis of epileptic seizures. However, it is often difficult to identify subtle but critical changes in the EEG waveform by visual inspection, which opens up a vast research area for biomedical engineers to develop and implement intelligent algorithms for the identification of such subtle changes. Moreover, EEG signals are nonlinear and non-stationary in nature, which further complicates their manual interpretation and the detection of normal and abnormal (interictal and ictal) activities. Hence, it is necessary to develop a Computer Aided Diagnostic (CAD) system to automatically identify the normal and abnormal activities using a minimum number of highly discriminating features in classifiers. It has been found that nonlinear features are able to capture complex physiological phenomena such as abrupt transitions and chaotic behavior in EEG signals. In this review, we discuss various feature extraction methods and the results of different automated epilepsy stage detection techniques in detail. We also briefly present the various open-ended challenges that need to be addressed before a CAD-based epilepsy detection system can be set up in a clinical setting.
As a clean and renewable energy source, wind energy has been gaining increasing global attention. Wind speed forecasting is of great significance for the wind energy domain: planning and design of wind farms, wind farm operation control, wind power prediction, power grid operation scheduling, and more. Many wind speed forecasting algorithms have been proposed to improve prediction accuracy. Few of them, however, have studied how to select input parameters carefully to achieve the desired results. After introducing a Back Propagation neural network based on Particle Swarm Optimization (PSO-BP), this paper details a method called IS-PSO-BP, where IS stands for Input parameter Selection, that combines PSO-BP with comprehensive parameter selection. To evaluate the forecast performance of the proposed approach, this paper uses daily average wind speed data of Jiuquan and 6-hourly wind speed data of Yumen, Gansu, China, from 2001 to 2006 as a case study. The experimental results clearly show that for these two particular datasets, the proposed method achieves much better forecast performance than the basic back propagation neural network and the ARIMA model.