SOLUTION: Cumberland University Week 5 Intro to Data Mining Discussion

Manasa Jasti POST:
Types of classifiers
A classifier maps the input data to a specific category, and so the output labels generated
by classifiers are fundamentally helpful in identifying the types of classifiers (Garg, 2021).
Classifiers are categorized based on their outputs as follows:
1. Binary vs. Multiclass – Binary classifiers, as their name indicates, have only two
possible output labels, and each data point is assigned to one of the two, whereas
multiclass classifiers choose among more than two labels (Tan et al., 2019).
2. Deterministic vs. Probabilistic – Deterministic classifiers output a single discrete
label for each data point. In contrast, probabilistic classifiers give a real number
between 0 and 1 indicating the degree to which a data point could belong to a label
(Tan et al., 2019).
3. Linear vs. Nonlinear – Linear classifiers rely on a simple linear separating hyperplane
to differentiate the inputs, whereas nonlinear classifiers can accommodate more complex
decision surfaces (Tan et al., 2019).
4. Global vs. Local – Global classifiers fit a single model to the entire data set, whereas
local classifiers partition the data set and fit each partition with its own distinct model.
This flexibility of local classifiers, however, also brings the pitfall of model overfitting
(Tan et al., 2019).
5. Generative vs. Discriminative – Discriminative classifiers are more straightforward
in their functionality in that they only try to assign a label to the input. Generative
classifiers, by contrast, also help identify the characteristics of the data points that
fall under a specific label. This yields insight into every label that can be used for
further processing (Tan et al., 2019).
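The deterministic versus probabilistic distinction in the list above can be illustrated with a minimal sketch. It assumes a linear score w·x + b and a sigmoid to turn that score into a number between 0 and 1; the function names, weights, and the sigmoid choice are illustrative assumptions, not taken from the cited sources.

```python
import math


def linear_score(x, weights, bias):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def deterministic_label(x, weights, bias):
    """Deterministic output: one discrete label (1 or 0)."""
    return 1 if linear_score(x, weights, bias) >= 0 else 0


def probabilistic_score(x, weights, bias):
    """Probabilistic output: a number in (0, 1) for membership in class 1.

    The sigmoid used here is one common choice, assumed for illustration.
    """
    return 1.0 / (1.0 + math.exp(-linear_score(x, weights, bias)))
```

Both functions share the same linear decision surface; they differ only in whether they report a hard label or a graded score.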
Rule-Based Classifier
Rule-based classifiers rely on a collection of “if…then” rules to classify their data
instances; the condition in the if-clause is referred to as the “antecedent,” and the output
class that is selected if that condition is met is referred to as the “consequent.” However, the issue
with these classifiers is that the rules are not always mutually exclusive, meaning a data instance
can satisfy more than one rule. Moreover, there could be instances that do not satisfy any of the
conditions (Christopher, 2019).
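A minimal sketch of how these two issues are often handled: rules are tried in order, so the first matching rule wins when several apply, and a default class catches instances that no rule covers. The ordering strategy, the attribute names, and the class labels are hypothetical and chosen only for illustration.

```python
def classify(record, rules, default_class):
    """Return the consequent of the first rule whose antecedent holds.

    rules is an ordered list of (antecedent, consequent) pairs, where
    antecedent is a function that takes a record and returns True or False.
    """
    for antecedent, consequent in rules:
        if antecedent(record):
            return consequent
    # No rule fired: fall back to the default class.
    return default_class


# Hypothetical rules; the second overlaps with the first on purpose.
rules = [
    (lambda r: r["gives_birth"] and r["warm_blooded"], "mammal"),
    (lambda r: r["warm_blooded"], "bird"),
]
```

A record with both attributes set satisfies both rules, and the ordering resolves the conflict in favor of "mammal".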
Difference between nearest neighbor and naïve Bayes classifiers
1. Naïve Bayes is a simple classifier (linear in many common settings) that is often highly
accurate and faster than nearest neighbor, especially when dealing with large data sets.
However, according to Glen (2019), Bayes only works if the decision boundary is linear,
elliptic, or parabolic.
2. Bayes requires knowledge of the underlying probability distributions, whereas nearest
neighbor does not need any information about them.
3. Bayes requires training, whereas nearest neighbor does not (Glen, 2019).
4. Bayes is far less affected by the curse of dimensionality and by large data sets than
nearest neighbor, which suffers from both (Glen, 2019).
5. Bayes specializes in tasks such as computer vision, whereas nearest neighbor excels
when it comes to rare occurrences (Glen, 2019).
Logistic regression
Logistic regression classifies incoming data into labels based on the historical data
available about similar observations. When it is used within a machine learning application,
the classifier can keep improving as more relevant data flows through the application. This
method of predicting a data value based on prior observations is therefore an important tool
in the machine learning discipline (Rosencrance & Burns, 2019).
References:
Christopher, J. (2019). The science of Rule-based classifiers. 2019 9th International Conference
on Cloud Computing, Data Science & Engineering
(Confluence). https://doi.org/10.1109/confluence.2019.8776954
Garg, R. (2021, May 7). 7 types of classification algorithms. Analytics India
Magazine. https://analyticsindiamag.com/7-types-classification-algorithms/.
Glen, S. (2019). Comparing classifiers: Decision Trees, K_NN & Naive Bayes. Data Science
Central. https://www.datasciencecentral.com/profiles/blogs/comparing-classifiersdecision-trees-knn-naive-bayes.
Rosencrance, L., & Burns, E. (2019, May 10). What is logistic regression? – definition from
whatis.com. SearchBusinessAnalytics.
https://searchbusinessanalytics.techtarget.com/definition/logistic-regression.
Tan, P.-N., Steinbach, M., Karpatne, A., & Kumar, V. (2019). Introduction to data mining.
Pearson.
Avinash Sama POST:
Answer 1
A classifier is an algorithm that maps the input data to a specific category. Let us look
at some of the different types of classifiers:
Naive Bayes Classifier: It is a classification technique based on Bayes’ Theorem with the
assumption of independence among predictors. In other words, a Naive Bayes classifier assumes
that the presence of a particular feature in a class is unrelated to the presence of any other
feature, so that each feature contributes independently to the probability. This family of
classifiers is relatively easy to build and particularly useful for very large data sets, as it is
highly scalable. Despite its simplicity, Naive Bayes can sometimes outperform more sophisticated
classification methods.
Nearest Neighbor: The k-nearest-neighbors algorithm is a supervised classification technique that
uses proximity as a proxy for ‘sameness’. The algorithm takes a set of labelled points and uses
them to decide how to label other points. To label a new point, it looks at the labelled points closest
to that new point (those are its nearest neighbors). Closeness is typically expressed in terms of a
dissimilarity function. Once it has found the ‘k’ nearest neighbors, it assigns the label that
most of those neighbors have.
Decision Trees: A decision tree builds classification or regression models in the form of a tree
structure. It breaks down a data set into smaller and smaller subsets while, at the same time, an
associated decision tree is incrementally developed. The final result is a tree with decision nodes
and leaf nodes. A decision node has two or more branches, and a leaf node represents a
classification or decision. The topmost decision node in a tree, which corresponds to the best
predictor, is called the root node. Decision trees can handle both categorical and numerical data.
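The nearest-neighbor procedure described above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance as the dissimilarity function and a simple majority vote; the function names and the default k = 3 are assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Dissimilarity between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3, distance=euclidean):
    """Label query by majority vote among its k nearest labelled points.

    train is a list of (point, label) pairs. If several labels tie for
    the most votes, the one belonging to the closest neighbor among the
    tied labels is returned.
    """
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

There is no separate training step here: all the work happens at prediction time, when the distances to the stored points are computed.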
Answer 2
Rule-based classifiers are another type of classifier that makes the class decision
by using various “if..else” rules. These rules are easily interpretable, and thus these
classifiers are generally used to generate descriptive models. The condition used with “if” is called
the antecedent, and the predicted class of each rule is called the consequent (Kadous, 2018).
Properties of rule-based classifiers:

Coverage: The percentage of records which satisfy the antecedent conditions of a rule.

The rules generated by the rule-based classifiers are generally not mutually exclusive, i.e.
many rules can cover the same record.

The rules generated by the rule-based classifiers may not be exhaustive, i.e. there may be
some records which are not covered by any of the rules. (Peterson, 2019)

The decision boundaries they create are linear, but these can be much more complex
than those of a decision tree because many rules can be triggered for the same record.
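As a small illustration of the coverage property listed above, the sketch below computes the percentage of records that satisfy a rule's antecedent. The record layout and the example antecedent are hypothetical.

```python
def coverage(records, antecedent):
    """Percentage of records that satisfy the rule's antecedent."""
    if not records:
        return 0.0
    covered = sum(1 for record in records if antecedent(record))
    return 100.0 * covered / len(records)


# Hypothetical data set and antecedent: "age > 30".
records = [{"age": 25}, {"age": 40}, {"age": 35}, {"age": 60}]
print(coverage(records, lambda r: r["age"] > 30))
```

Three of the four example records satisfy the antecedent, so the rule covers 75% of this data set.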
Answer 3
If the conditional independence assumption would strongly hurt classification, you will want
to choose K-NN over Naive Bayes. Naive Bayes can also suffer from the zero-probability problem:
when the conditional probability of a particular attribute value equals zero, the whole product
of probabilities becomes zero and Naive Bayes fails to produce a valid prediction. This can be
fixed using a Laplacian estimator, but K-NN could end up being the easier choice. Naive Bayes
will only work if the decision boundary is linear, elliptic, or parabolic; otherwise, choose K-NN.
Naive Bayes requires that you know the underlying probability distributions for the categories.
The ideal Bayes classifier built from those distributions is the benchmark against which all other
classifiers are compared. Therefore, unless you know the probabilities and pdfs, use of the ideal
Bayes is unrealistic. In comparison, K-NN does not require that you know anything about the
underlying probability distributions. K-NN also does not require any training: you just load the
data set and it runs. Naive Bayes, on the other hand, does require training. K-NN (and Naive
Bayes) outperform decision trees when it comes to rare occurrences. For example, if you are
classifying types of cancer in the general population, many cancers are quite rare. A decision tree
will almost certainly prune those important classes out of your model. If you have any rare
occurrences, avoid using decision trees (Glen, 2019).
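The Laplacian estimator mentioned above can be sketched as additive smoothing of the conditional probability estimates. This is a minimal illustration under the usual formulation (add alpha to every count); the function and parameter names are assumptions made for the example.

```python
def conditional_probability(count, class_total, n_values, alpha=1.0):
    """Estimate P(attribute = value | class) with additive smoothing.

    count:       training records in the class that have this value
    class_total: training records in the class
    n_values:    number of distinct values the attribute can take
    alpha:       smoothing strength; alpha = 1 is the Laplace estimate,
                 alpha = 0 gives the unsmoothed relative frequency
    """
    return (count + alpha) / (class_total + alpha * n_values)
```

With alpha = 1, a value that never appears with a class in the training data still receives a small nonzero probability, so a single unseen value no longer forces the whole product to zero.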
Answer 4
Logistic regression is used to find the probability of event=Success and event=Failure. We
should use logistic regression when the dependent variable is binary (0/1, True/False, Yes/No)
in nature. Here the predicted probability p ranges from 0 to 1, and the model can be represented
by the following equations.
odds = p / (1 - p) = probability of event occurrence / probability of event non-occurrence
ln(odds) = ln(p / (1 - p))
logit(p) = ln(p / (1 - p)) = b0 + b1X1 + b2X2 + b3X3 + ... + bkXk
Above, p is the probability of presence of the characteristic of interest. A question that we
should ask here is “why have we used log in the equation?”
Since the dependent variable here follows a binomial distribution, we need to
choose the link function that is best suited to this distribution, and that is the logit function.
In the equation above, the parameters are chosen to maximize the likelihood of observing the
sample values rather than to minimize the sum of squared errors (as in ordinary regression).
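The relationship between the logit and the predicted probability can be checked with a short sketch. It assumes the coefficients b0..bk have already been estimated; the function names and the numeric values in the usage example are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def predict_probability(x, intercept, coefficients):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + b1*X1 + ... + bk*Xk)))."""
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and inputs: z = -1.0 + 0.5*2.0 + 0.25*1.0 = 0.25
p = predict_probability([2.0, 1.0], intercept=-1.0, coefficients=[0.5, 0.25])
print(p, logit(p))
```

Applying logit to the predicted probability recovers the linear combination of the inputs, which is what the equation for logit(p) states.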
References
Glen, S. (2019, June 19). TechTarget. Retrieved from
https://www.datasciencecentral.com/profiles/blogs/comparing-classifiers-decision-trees-knnnaive-bayes
Kadous, W. (2018, Feb 17). What is the difference between rule-based classifiers and decision tree
classifiers? Retrieved from https://www.quora.com/What-is-the-difference-between-rule-basedclassifiers-and-decision-tree-classifiers
Peterson, M. (2019, Jul 21). 6 Classifying Documents in Oracle Text. Retrieved from
https://docs.oracle.com/cd/B28359_01/text.111/b28303/classify.htm#g1011013
