Data mining functionalities are used to specify the kind of patterns to be found in data mining tasks. data mining tasks can be classified into two categories: descriptive and predictive. Data mining with unique Techniques and Technologies functionalities are used to specify the kind of patterns to be found in data mining tasks. data mining tasks can be classified into two categories: descriptive and predictive Techniques and Technologies analysis describes and models regularities or trends for objects whose behavior changes over time.
Concept And Class :
Data can be associated with classes or concepts. For example, in the all electronics store, classes of items for sale include computers and printers, and concepts of customers include big Spenders and budget Spenders. It can be useful to describe individual classes and concepts in summarized, concise, and yet precise terms. Classifying a class or a concept involves describing it through class/concept descriptions. You can derive these descriptions by summarizing the data of the target class in general terms, comparing the target class with one or a set of contrasting classes, or both data characterization and discrimination.
Data characterization is a summarization of the general characteristics or features of a target class of data. The data corresponding to the user-specified class are typically collected by a database query the output of data characterization can be presented in Various forms. Examples include pie charts, bar charts, curves, multidimensional data cubes, and multidimensional tables, including crosstabs.
Data discrimination is a comparison of the general features of target class data objects with the general features of objects from one or a set of contrasting classes, The target and contrasting classes can be specified by the user, and the corresponding data objects Discrimination descriptions expressed in rule form are referred to as discriminate Mining Frequent Patterns, Associations, and Correlations Frequent patterns, as the name suggests, are patterns that occur frequently in data There are many kinds of frequent patterns, including itemsets, subsequences, and substructures.
A frequent itemset typically refers to a set of items that frequently appear together in a transactional data set, such as Computer and Software. A frequently occurring subsequence, such as the pattern that customers tend to purchase first a PC, followed by a digital camera, and then a memory card, is a (frequent) sequential pattern.
Classification and Prediction :
Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts, for the purpose of being able to use the model to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training data (i.e., data objects whose class label is known). The derived model may be represented in various forms, such as classification (IF- THEN) rules, decision trees, mathematical formulae, or neural networks.
A decision tree is a flow-chart-like tree structure, where such node denotes a test on an attribute value, each branch represents an outcome of the test, and tree leaves represent classes or class distributions. Decision trees can easily be converted to classification rules.
A neural network, when used for classification, is typically a collection of neuron- like processing units with weighted connections between the units. There are many other methods for constructing classification models, such as naïve. Bayesian classification, support vector machines, and k-nearest neighbor classification. Whereas classification predicts categorical (discrete, unordered) labels, prediction models Continuous-valued functions. That is, it is used to predict missing or unavailable numerical data values rather than class labels. Although the term prediction may refer to both numeric prediction and class label prediction.
Cluster Analysis:
Classification and prediction analyze class-labeled data objects.
Outlier Analysis:
A database may contain data objects that do not comply with the general behavior or model of the data. These data objects are outliers. Most data mining methods discard outliers as noise or exceptions, However, in some applications such as fraud detection, the rare events can be more interesting than the more regularly occurring ones. The analysis of outlier data is referred to as outlier mining.
Evolution Analysis:
Data evolution analysis describes and models regularities or trends for objects whose behavior changes over time. Although this may include characterization, discrimination, association and correlation analysis, classification, prediction, or clustering of time related data, distinct features of such an analysis include time-series data analysis, Sequence or periodicity pattern matching, and similarity-based data analysis.
Technologies can be use :
Classification:
This analysis is used to retrieve important and relevant information about data, and metadata. This data mining method helps to classify data in different classes. This technique finds its origins in machine learning. It classifies items or variables in a data set into predefined groups or classes. It uses linear programming, statistics, decision trees, and artificial neural network in data mining, amongst other techniques. Classification is used to develop software that can be modelled in a way that it becomes capable of classifying items in a data set into different classes.
For instance, we can use it to classify all the candidates who attended an interview into the first group is the list of those candidates who were selected and the second is the list that features candidates that were rejected. Data mining software can be used to perform this classification job.
Clustering:
Clustering analysis is a data mining technique to identify data that are like each other. This process helps to understand the differences and i similarities between the data, This technique creates meaningful object clusters that share the same characteristics, People often confuse it with classification, but if they properly understand how both these techniques work, they won’t have any issue.
Unlike classification that puts objects into predefined classes, clustering puts objects in classes that are defined by it. Let us take an example. A library is full of books on different topics. Now the challen0e is to organize those books in a way that readers don’t have any problem in finding of books on a particular topic. We can use clustering to keep books with similarities in one shelf and then give those shelves a meaningful name. Readers looking for books on a particular topic can go straight to that shelf. They won’t be required to roam the entire library to find their book.
Regression:
Regression analysis is the data mining method of identifying and analyzing the relationship between variables. It is used to identify the likelihood of a specific variable. given the presence of other variables.
Association Rules:
This data mining technique helps to find the association between two or more Items. It discovers a hidden pattern in the data set. It is one of the most used data mining techniques out of all the others. In this technique, a transaction and the relationship between its items are used to identify a pattern.
This is the reason this technique is also referred to as a relation technique. It is used to conduct market basket analysis, which is done to find out all those products that customers buy together on a regular basis. This technique is very helpful for retailers who can use it to study the buying habits of different customers. Retailers can study sales data of the past and then lookout for products that customers buy together. Then they can put those products in close proximity of each other in their retail stores to help customers save their time and to increase the sales.
Outer detection:
This type of data mining technique refers to observation of data items in the dataset which do not match an expected pattern or expected behavior. This technique can be used in a variety of domains, such as intrusion. detection, fraud or fault detection, etc. Outer detection is also called Outlier Analysis or Outlier mining.
Sequential Patterns:
This data mining technique helps to discover or identify similar patterns or trends transaction data for certain period. This technique aims to use transaction data, and then identify similar trends, patterns, and events in it over a period of time. The historical sales data can be used to discover item that buyers bought together at different times of the year. Business can make sense of this information by recommending customers to buy those products at times when the historical data doesn’t suggest they would. Businesses can use lucrative deals and discounts to push through this recommendation.
Prediction:
Prediction has use combination of the other techniques of data mining like trends, sequential patterns, clustering, classification, etc. It analyzes past events or instances in a right sequence for predicting a future event. This technique predicts the relationship that exists between independent and dependent variables as well as independent variables alone. It can be used to predict future pre depending on the sale. Let us assume that profit and sale are dependent and independent variables, respectively. Now, based on what the past sales data says, we can make a prof prediction of the future using a regression curve.
Overfitting: Due to small size training database, a model may not fit future states. Data mining needs large databases which sometimes are difficult to manage Business practices may need to be modified to determine to use the information uncovered. If the data set is not diverse, data mining results may not be accurate. Integration information needed from heterogeneous databases and global information systems could be complex.
You can also read php topic