In this paper we propose a robust fuzzy linear regression model based on the Least Median of Squares–Weighted Least Squares (LMS–WLS) estimation procedure. The proposed model is general enough to deal with data contaminated by outliers due to measurement errors or drawn from highly skewed or heavy-tailed distributions. We also define suitable goodness-of-fit indices to evaluate the performance of the proposed model. The effectiveness of our model in reducing the influence of outliers is shown through application examples, based on both simulated and real data, and through a simulation study.
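As a rough illustration of the two-stage LMS–WLS idea, the sketch below fits a line to crisp data only; it is not the fuzzy model proposed in the paper. It assumes the classical procedure: an approximate Least Median of Squares fit found by searching over lines through pairs of points, a preliminary scale estimate with the usual finite-sample correction, hard 0/1 weights at a cutoff of 2.5, and a final weighted least-squares fit. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def wls_line(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return ybar - b * xbar, b


def lms_line(xs, ys):
    """Approximate LMS fit: try the line through every pair of points and
    keep the one with the smallest median squared residual."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid candidate
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        crit = median((y - a - b * x) ** 2 for x, y in zip(xs, ys))
        if best is None or crit < best[0]:
            best = (crit, a, b)
    return best  # (median squared residual, intercept, slope)


def lms_wls(xs, ys, cutoff=2.5):
    """LMS fit, then 0/1 weights from scaled residuals, then WLS."""
    n, p = len(xs), 2  # p = number of regression coefficients
    crit, a0, b0 = lms_line(xs, ys)
    # Preliminary robust scale with the usual finite-sample correction
    s0 = 1.4826 * (1 + 5 / (n - p)) * math.sqrt(crit)
    weights = [1.0 if abs(y - a0 - b0 * x) <= cutoff * s0 else 0.0
               for x, y in zip(xs, ys)]
    a, b = wls_line(xs, ys, weights)
    return a, b, weights


# Hypothetical toy data: y = 1 + 2x plus small noise, with one gross outlier
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1]
xs = [float(x) for x in range(10)]
ys = [1 + 2 * x + e for x, e in zip(xs, noise)] + [60.0]
intercept, slope, weights = lms_wls(xs, ys)
```

The exhaustive pair search costs O(n^3) with this median computation, so it only suits small samples; larger problems would normally use random subsampling instead.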
Data science is a research field concerned with processes and systems that extract knowledge from massive amounts of data. In some situations, however, data shortage renders existing data-driven methods difficult or even impossible to apply. Transfer learning has recently emerged as a way of exploiting previously acquired knowledge to solve new yet similar problems much more quickly and effectively. In contrast to classical data-driven machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling in the current domain. A significant number of transfer learning methods that address classification tasks have been proposed, but studies on transfer learning in the case of regression problems are still scarce. This study focuses on using transfer learning techniques to handle regression problems in a domain that has insufficient training data. We propose an original fuzzy regression transfer learning method, based on fuzzy rules, to address the problem of estimating the value of the target for regression. A Takagi-Sugeno fuzzy regression model is developed to transfer knowledge from a source domain to a target domain. Experimental results using synthetic data and real-world datasets demonstrate that the proposed fuzzy regression transfer learning method significantly improves the performance of existing models when tackling regression problems in the target domain.
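To make the Takagi-Sugeno model concrete, the sketch below shows first-order Takagi-Sugeno inference for a single input: each rule has a Gaussian membership function and a linear consequent, and the prediction is the membership-weighted average of the rule outputs. The abstract does not say how rules are adapted from the source to the target domain, so this covers inference only; the rule parameters and function names are hypothetical.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership degree of x for a rule centred at `center`."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))


def ts_predict(x, rules):
    """First-order Takagi-Sugeno output for a scalar input.

    rules: list of (center, sigma, slope, intercept) tuples (hypothetical
    layout). Each rule contributes slope * x + intercept, weighted by its
    firing strength.
    """
    num = 0.0
    den = 0.0
    for center, sigma, slope, intercept in rules:
        w = gaussian(x, center, sigma)
        num += w * (slope * x + intercept)
        den += w
    return num / den


# Two hypothetical rules: "x is near 0 -> y = x" and "x is near 10 -> y = 2x"
rules = [(0.0, 3.0, 1.0, 0.0), (10.0, 3.0, 2.0, 0.0)]
midpoint_prediction = ts_predict(5.0, rules)  # both rules fire equally here
```

In a transfer setting one would presumably keep the source rules and adjust their centres or consequents using the few labeled target samples, but that step is an assumption and is not shown.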
Statistical regression analysis is a powerful and reliable method to determine the impact of one or several independent variables on a dependent variable. It is the most widely used of all statistical methods and has broad applicability to numerous practical problems. However, various problems can arise, for instance when the sample size is too small, distributional assumptions are not fulfilled, the relationship between independent and dependent variables is vague, or events are ambiguous. Moreover, the complexity of real-life problems often makes the underlying models inadequate, since information is frequently imprecise in many ways. To relax these rigidities, numerous researchers have modified and extended concepts of statistical regression analysis by means of concepts of fuzzy set theory. By now, there is a large number of papers on the topic of fuzzy regression analysis, especially concerning possibilistic, fuzzy least squares or machine learning approaches. Additionally, the variety of approaches includes probabilistic, logistic, type-2 and clusterwise fuzzy regression methods, among many others. Besides papers mainly devoted to advances in methodology, there are also several papers presenting case studies in various research fields. To structure this diversity of papers, proposals and applications, this paper gives a comprehensive systematic review and provides a bibliography on the topic of fuzzy regression analysis. Thus, the paper intends to consolidate the topic in order to aid new researchers in this area, focus the field’s attention on key open questions, and highlight possible directions for future research.
In classical data-driven machine learning methods, massive amounts of labeled data are required to build a high-performance prediction model. However, the amount of labeled data in many real-world applications is insufficient, so establishing a prediction model is impossible. Transfer learning has recently emerged as a solution to this problem. It exploits the knowledge accumulated in auxiliary domains to help construct prediction models in a target domain with inadequate training data. Most existing transfer learning methods solve classification tasks; only a few are devoted to regression problems. In addition, the current methods ignore the inherent phenomenon of information granularity in transfer learning. In this study, granular computing techniques are applied to transfer learning. Three granular fuzzy regression domain adaptation methods to determine the estimated values for a regression target are proposed to address three challenging cases in domain adaptation. The proposed granular fuzzy regression domain adaptation methods change the input and/or output space of the source domain's model using space transformation, so that the fuzzy rules are more compatible with the target data. Experiments on synthetic and real-world datasets validate the effectiveness of the proposed methods.
Regression analysis is a powerful statistical tool that has many applications in different areas. The problem of regression analysis under a fuzzy environment has been treated in the literature from different points of view and considering a variety of input/output data (crisp or fuzzy). However, we observe that, in general, most research papers exhibit a conflict between solving the fuzzy regression problem using crisp distances (minimizing a real error function) and interpreting fuzzy data as possibility distributions. The main aim of this paper is to develop a methodology to solve this problem by introducing a fuzzy partial order and a family of fuzzy distance measures on the whole set of fuzzy numbers. The new approach allows us to obtain linear and nonlinear models that reach the lowest fuzzy error; the estimation process, in general, can be considered easier to apply in practice, and it is not limited to triangular fuzzy numbers. Numerical examples are provided to illustrate the usefulness and applicability of these results, and comparisons with existing methodologies show that the performance of the proposed solution is very satisfactory.
Two types of uncertainty, namely, randomness and fuzziness, exist in preference modeling. Fuzziness is mainly caused by human subjective judgment and incomplete knowledge, and randomness often originates from the variability of influences on the inputs and outputs of a preference model. Various techniques have been utilized to develop preference models. However, only a few previous studies have addressed both fuzziness and randomness in preference modeling. Among these limited studies, none has considered the randomness caused by particular independent variables. To fill this research gap, this study proposes probabilistic fuzzy regression (PFR), a new approach for preference modeling. PFR considers both the fuzziness of data sets and the randomness caused by independent variables. In the proposed approach, probability density functions (PDFs) are adopted to model randomness. The parameter settings of the PDFs are determined using a chaos optimization algorithm. The probabilistic terms of the PFR models are generated according to the expected value functions of the random variables. Fuzzy regression analysis is employed to determine the fuzzy coefficients for all the terms of the PFR models. An industrial case study of a tea maker design is used to illustrate the applicability of PFR and evaluate its effectiveness. Modeling results obtained from PFR are compared with those obtained from statistical regression, fuzzy regression, and fuzzy least-squares regression. Results of the training and validation tests show that PFR outperforms the other approaches in terms of training and validation errors.
Fuzzy regression analysis was extensively used in previous studies to model the relationships between dependent and independent variables in a fuzzy environment. Various approaches have been proposed to perform fuzzy regression analysis, with most of them adopting a single objective function in the generation of fuzzy regression models. Some previous studies attempted to generate fuzzy regression models using a multi-objective optimization approach in order to improve the prediction accuracy of the generated models. However, in those studies, subjective judgments of parameter settings are required for solving the multi-objective optimization problems, and a complete representation of Pareto optimal solutions cannot be generated in a single run. To address these limitations, a multi-objective evolutionary approach to fuzzy regression analysis is proposed in this paper. In the proposed approach, a multi-objective optimization problem is formulated which involves three objectives: minimizing the fuzziness of fuzzy outputs, minimizing the effect of outliers, and minimizing the mean absolute percentage error of modeling. A non-dominated sorting genetic algorithm-II is introduced to solve the problem and generate a set of Pareto optimal solutions. Finally, a technique for order of preference by similarity to ideal solution is applied to determine a final optimal solution by which a fuzzy regression model can be generated. A case study is conducted to illustrate the proposed approach. Sixteen validation tests are conducted to evaluate the effectiveness of the proposed approach. The results of the validation tests show that the proposed approach outperforms Tanaka's fuzzy regression, Peters’ fuzzy regression, compromise programming based multi-objective fuzzy regression, fuzzy least-squares regression and probabilistic fuzzy regression approaches in terms of training errors and prediction accuracy.
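The last two steps described above, extracting non-dominated solutions and picking one with the technique for order of preference by similarity to ideal solution (TOPSIS), can be sketched as follows. This is a minimal illustration under stated assumptions: a brute-force dominance filter stands in for the genetic search, all three objectives are minimized, vector normalization and equal weights are used, and the candidate objective values are made up for the example.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for i, p in enumerate(points)
            if not any(dominates(q, p)
                       for j, q in enumerate(points) if j != i)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def topsis_min(points, weights):
    """Return the point closest to the ideal solution (minimization)."""
    m = len(weights)
    # Vector normalization per objective; guard against an all-zero column
    norms = [math.sqrt(sum(p[k] ** 2 for p in points)) or 1.0
             for k in range(m)]
    scaled = [[weights[k] * p[k] / norms[k] for k in range(m)]
              for p in points]
    ideal = [min(row[k] for row in scaled) for k in range(m)]
    worst = [max(row[k] for row in scaled) for k in range(m)]
    scores = []
    for row in scaled:
        d_pos, d_neg = euclidean(row, ideal), euclidean(row, worst)
        total = d_pos + d_neg
        scores.append(d_neg / total if total > 0 else 1.0)
    return points[scores.index(max(scores))]


# Hypothetical (fuzziness, outlier effect, MAPE) values for four candidates
candidates = [(1.0, 5.0, 3.0), (2.0, 2.0, 2.0), (5.0, 1.0, 3.0), (4.0, 4.0, 4.0)]
front = pareto_front(candidates)
chosen = topsis_min(front, [1 / 3, 1 / 3, 1 / 3])
```

Here the fourth candidate is dominated and dropped, and the balanced candidate wins under equal weights; different weights would shift the choice toward one of the extreme solutions.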
Weighted regression is one of the popular approaches in robust regression analysis. Recently, robust fuzzy regression models have proven to be alternatives to ordinary fuzzy regression models, attempting to identify, down-weight and/or ignore unusual points (outliers). This paper proposes a new robust fuzzy regression modeling technique, known as weighted least squares (LS) fuzzy regression, to construct a model for crisp input-fuzzy output data. We introduce a new weighted objective function to overcome the disadvantages of the ordinary LS approach in the presence of outliers. We derive and describe an iteratively reweighted algorithm for minimizing the objective function. The algorithm approximates the weighted estimators of the fuzzy regression by solving the weighted optimization problem. The proposed algorithm decreases the effect of outliers on the model fit by attempting to identify and down-weight them. To this end, experiments on datasets with different numbers of outliers are performed. The accuracy of our approach in a real setting is also tested by establishing a predictive model for the evaluation of suspended load based on a real-world dataset in hydrology engineering. The numerical results show that, in the presence of unusual points, the proposed weighted fit tracks the main body of the data considerably better than the ordinary LS fuzzy regression fit, both in terms of the selected performance criteria and in terms of identifying and down-weighting unusual data (outliers). The results of the numerical examples also show that this approach can be used to examine how the goodness-of-fit criteria of the fuzzy regression models change when the down-weighted observations are omitted.
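The iterative reweighting idea can be illustrated on crisp data with a short sketch. The paper's own weighted objective function and its treatment of fuzzy outputs are not given in the abstract, so the example below assumes Huber-type weights with the conventional tuning constant 1.345 and a scale estimate based on the median absolute residual; the function names and data are hypothetical.

```python
from statistics import median


def wls_line(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return ybar - b * xbar, b


def irls_huber(xs, ys, c=1.345, iterations=50):
    """Iteratively reweighted least squares with Huber weights (assumed)."""
    ws = [1.0] * len(xs)  # the first pass is ordinary least squares
    for _ in range(iterations):
        a, b = wls_line(xs, ys, ws)
        res = [y - a - b * x for x, y in zip(xs, ys)]
        # Robust scale from the median absolute residual, floored so the
        # weights stay defined when the clean points are fitted exactly
        scale = max(1.4826 * median(abs(r) for r in res), 1e-12)
        # Full weight inside the threshold, shrinking weight outside it
        ws = [1.0 if abs(r) <= c * scale else c * scale / abs(r)
              for r in res]
    return a, b, ws


# Hypothetical data: y = 1 + 2x with one vertical outlier at x = 5
xs = [float(x) for x in range(10)]
ys = [1 + 2 * x for x in xs]
ys[5] = 50.0
ols_intercept, ols_slope = wls_line(xs, ys, [1.0] * len(xs))
rob_intercept, rob_slope, rob_weights = irls_huber(xs, ys)
```

Huber weights protect against outliers in the response but not against high-leverage points in the inputs, which is one reason the outlier is placed near the middle of the x range in this example.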
Linear regression models with fuzzy independent variables and crisp parameters are considered in this paper. The constant term may be fuzzy. It is determined using calculus of variations. Sometimes linear regression models do not satisfy the conditions under which the least squares estimator should be consistent. It is known from econometrics that in some cases instrumental variables can be used to find a consistent estimator for such models. In this paper, an instrumental variables estimator of a linear fuzzy regression model is constructed and consistency of the estimator is established.
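For readers unfamiliar with instrumental variables, the sketch below shows the simple crisp estimator for one regressor and one instrument, where the slope is the ratio cov(z, y) / cov(z, x). It does not handle fuzzy regressors or the fuzzy constant term treated in the paper. The data are constructed so that the regressor x is correlated with the error u while the instrument z is not; all names and values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance (the 1/n factor cancels in the ratios below)."""
    ubar, vbar = mean(u), mean(v)
    return sum((a - ubar) * (b - vbar) for a, b in zip(u, v)) / len(u)


def ols_line(xs, ys):
    """Ordinary least squares; inconsistent when x correlates with the error."""
    b = cov(xs, ys) / cov(xs, xs)
    return mean(ys) - b * mean(xs), b


def iv_line(xs, ys, zs):
    """Simple instrumental variables estimator with instrument zs."""
    b = cov(zs, ys) / cov(zs, xs)
    return mean(ys) - b * mean(xs), b


# Hypothetical data: the instrument z is orthogonal to the error u,
# while the regressor x = z + u is correlated with it.
zs = [-2.0, -1.0, 0.0, 1.0, 2.0]
us = [1.0, -2.0, 2.0, -2.0, 1.0]
xs = [z + u for z, u in zip(zs, us)]
ys = [1 + 2 * x + 3 * u for x, u in zip(xs, us)]  # true slope is 2

ols_intercept, ols_slope = ols_line(xs, ys)
iv_intercept, iv_slope = iv_line(xs, ys, zs)
```

On these data the least squares slope is pulled away from the true value by the correlation between x and u, while the instrumental variables slope recovers it because cov(z, u) is exactly zero by construction.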