The proposal that coupled folding to binding is not an obligatory mechanism for intrinsically disordered (ID) proteins was put forward 10 years ago. The notion of fuzziness implies that conformational heterogeneity can be maintained upon interactions of ID proteins, which has a functional impact on either the regulated assembly or the activity of the corresponding complexes. Here I review how the concept has evolved in the past decade, as a growing body of experimental data has provided insights into the mechanisms, pathways and regulatory modes. The effects of structural diversity and transient contacts on protein assemblies have been collected and systematically analyzed (Fuzzy Complexes Database). Fuzziness has also been exploited as a framework to decipher the molecular organization of higher-order protein structures. Quantification of conformational heterogeneity opens exciting future perspectives for drug discovery, from small molecule–ID protein interactions to supramolecular assemblies.
This paper investigates the relationship between the fuzziness of a classifier and its misclassification rate on a group of samples. For a given trained classifier that outputs a membership vector, we demonstrate experimentally that samples for which the classifier's output has higher fuzziness carry a greater risk of misclassification. We then propose a divide-and-conquer strategy based on fuzziness categories, which separates the high-fuzziness samples from the low-fuzziness samples. A particular technique is used to handle the high-fuzziness samples in order to improve classifier performance. The rationale of the approach is explained theoretically, and its effectiveness is demonstrated experimentally.
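As a minimal sketch of the separation step described in this abstract, the snippet below scores each membership vector and splits the samples at a threshold. The abstract does not state which fuzziness measure is used, so the entropy-style measure here (in the spirit of De Luca and Termini) is an assumption, as are the names `fuzziness` and `split_by_fuzziness`, the threshold value, and the toy outputs.

```python
import math


def fuzziness(membership):
    """Entropy-style fuzziness of one membership vector (grades in [0, 1]).

    Assumed measure, not taken from the abstract: the mean over classes of
    -(mu * ln(mu) + (1 - mu) * ln(1 - mu)).
    """
    def term(mu):
        # Crisp grades (0 or 1) contribute nothing, since 0 * ln(0) is taken as 0.
        if mu <= 0.0 or mu >= 1.0:
            return 0.0
        return -(mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu))

    return sum(term(mu) for mu in membership) / len(membership)


def split_by_fuzziness(outputs, threshold):
    """Return (low, high): sample indices at or below, and above, the threshold."""
    low, high = [], []
    for index, membership in enumerate(outputs):
        if fuzziness(membership) > threshold:
            high.append(index)
        else:
            low.append(index)
    return low, high


# Hypothetical classifier outputs for three samples and three classes.
outputs = [
    [0.98, 0.01, 0.01],  # confident output, low fuzziness
    [0.40, 0.35, 0.25],  # ambiguous output, high fuzziness
    [0.05, 0.90, 0.05],  # fairly confident output
]
low, high = split_by_fuzziness(outputs, threshold=0.4)
print("low-fuzziness samples:", low)
print("high-fuzziness samples:", high)
```

The high-fuzziness group would then be passed to whatever special handling the paper applies. The threshold of 0.4 is illustrative only and would normally be chosen from the data.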
Countering cyber threats, especially through attack detection, is a challenging area of research in the field of information assurance. Intruders use polymorphic mechanisms to disguise the attack payload and evade detection techniques. Many supervised and unsupervised learning approaches from the fields of machine learning and pattern recognition have been used to increase the efficacy of intrusion detection systems (IDSs). Supervised learning approaches use only labeled samples to train a classifier, but obtaining sufficient labeled samples is cumbersome and requires the efforts of domain experts. Unlabeled samples, however, can easily be obtained in many real-world problems. In contrast to supervised learning approaches, semi-supervised learning (SSL) addresses this issue by considering a large number of unlabeled samples together with the labeled samples to build a better classifier. This paper proposes a novel fuzziness-based semi-supervised learning approach that uses unlabeled samples, assisted by a supervised learning algorithm, to improve the classifier's performance for IDSs. A single hidden layer feed-forward neural network (SLFN) is trained to output a fuzzy membership vector, and the unlabeled samples are categorized into low, mid, and high fuzziness categories using the fuzzy quantity. The classifier is retrained after incorporating each category separately into the original training set. The experimental results of applying this intrusion detection technique to the NSL-KDD dataset show that unlabeled samples belonging to the low and high fuzziness groups make major contributions to improving the classifier's performance compared to existing classifiers, e.g., naive Bayes, support vector machines, and random forests.
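The categorization and augmentation steps in this abstract can be sketched roughly as follows. Several details are assumptions not confirmed by the abstract: the entropy-style fuzziness measure, cutting the ranked samples into three equal-sized groups, and labeling the added samples with the classifier's most probable class (pseudo-labels). The helper names `categorize` and `augment` and the toy data are hypothetical, and training of the SLFN itself is omitted.

```python
import math


def fuzziness(membership):
    """Assumed entropy-style fuzziness of a membership vector (grades in [0, 1])."""
    total = 0.0
    for mu in membership:
        if 0.0 < mu < 1.0:  # crisp grades contribute zero
            total -= mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu)
    return total / len(membership)


def categorize(memberships):
    """Rank samples by fuzziness and cut the ranking into three groups.

    Equal-sized groups are an assumption; the abstract does not give the rule.
    """
    order = sorted(range(len(memberships)), key=lambda i: fuzziness(memberships[i]))
    third = len(order) // 3
    return {
        "low": order[:third],
        "mid": order[third:len(order) - third],
        "high": order[len(order) - third:],
    }


def augment(train, unlabeled, memberships, indices):
    """Return the training set plus the selected samples with argmax pseudo-labels."""
    extra = []
    for i in indices:
        grades = memberships[i]
        pseudo_label = max(range(len(grades)), key=grades.__getitem__)
        extra.append((unlabeled[i], pseudo_label))
    return train + extra


# Hypothetical data: one labeled sample and six unlabeled samples with
# two-class membership vectors produced by an already trained classifier.
train = [("a", 0)]
unlabeled = ["u0", "u1", "u2", "u3", "u4", "u5"]
memberships = [
    [0.99, 0.01], [0.55, 0.45], [0.80, 0.20],
    [0.05, 0.95], [0.30, 0.70], [0.50, 0.50],
]

groups = categorize(memberships)
# One augmented training set per category; each would be used for a separate retraining run.
augmented = {name: augment(train, unlabeled, memberships, idx) for name, idx in groups.items()}
for name, rows in augmented.items():
    print(name, rows)
```

Retraining once per augmented set and comparing test accuracy would mirror the comparison the abstract reports between the low, mid, and high fuzziness groups.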
, , and have become ubiquitous in today's mass media and are universally known terms used in everyday speech. If we look behind these often misused buzzwords, we find at least one common element, namely data. Although we hardly use these terms in the “classic discipline” of mineral economics, we find various similarities. The case of phosphate data bears numerous challenges in multiple forms such as uncertainties, fuzziness, or misunderstandings. Often simulation models are used to support decision-making processes. For all these models, reliable and accurate sets of data are an essential premise. A significant number of data series relating to the phosphorus supply chain, including resource inventory or production, consumption, and trade data ranging from phosphate rock to intermediates like marketable concentrate to final phosphate fertilizers, is available. Data analysts and modelers must often choose from various sources, and they also depend on data access. Based on a transdisciplinary orientation, we aim to help colleagues in all fields by illustrating quantitative differences among the reported data, taking a somewhat engineering approach. We use common descriptive statistics to measure and causally explain discrepancies in global phosphate-rock production data issued by the US Geological Survey, the British Geological Survey, Austrian World Mining Data, the International Fertilizer Association, and CRU International over time, with a focus on the most recent years. Furthermore, we provide two snapshots of global-trade flows for phosphate-rock concentrate, in 2015 and 1985, and compare these to an approach using total-nutrient data. We find discrepancies of up to 30% in reported global production volume, whereby the major share could be assigned directly to China and Peru. 
Consequently, we call for a global, independent agency to collect and monitor phosphate data in order to reduce uncertainties or fuzziness and, thereby, ultimately support policy-making processes.
We investigate essential relationships between the generalization capabilities and the fuzziness of fuzzy classifiers (viz., classifiers whose outputs are vectors of membership grades of a pattern to the individual classes). The study makes a claim, and offers sound evidence for it, that higher fuzziness of a fuzzy classifier may imply better generalization of the classifier, especially for classification data exhibiting complex boundaries. This observation runs counter to a commonly accepted position in "traditional" pattern recognition. The relationship, which obeys the conditional maximum entropy principle, is experimentally confirmed. Furthermore, the relationship can be explained by the fact that samples located close to classification boundaries are more difficult to classify correctly than samples positioned far from the boundaries. This relationship is expected to provide some guidelines for improving the generalization of fuzzy classifiers.