Which of the following are typical algorithms for generating decision trees?
A、A horribly Intractable Algorithm
B、C4.5
C、CART
D、A Better Algorithm
A、A horribly Intractable Algorithm
B、C4.5
C、CART
D、A Better Algorithm
A、Underfitting means poor accuracy on both training data and unseen samples
B、Overfitting means high accuracy on training data but poor accuracy on unseen samples
C、Underfitting implies the model is too simple, so we need to increase the model complexity
D、Overfitting occurs when the tree has too many branches, so we need to decrease the model complexity
A、Both are methods for dealing with the overfitting problem
B、Pre-pruning does not split a node if this would result in the goodness measure falling below a threshold
C、Post-pruning removes branches from a “fully grown” tree
D、It is easy to choose an appropriate threshold when pre-pruning
A、First, consider the cost complexity of a tree
B、Then, for each internal node, N, compute the cost complexity of the subtree at N
C、And also compute the cost complexity of the subtree at N if it were to be pruned
D、At last, compare the two values. If pruning the subtree at node N would result in a smaller cost complexity, the subtree is pruned. Otherwise, the subtree is kept
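The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, its `errors_as_leaf` field, and the cost formula (misclassified training samples plus `alpha` times the number of leaves) are hypothetical choices made for this example, not a definition taken from the question.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical field: training samples misclassified if this node were a leaf.
    errors_as_leaf: int
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[Node]:
    """Return all leaves of the subtree rooted at node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def subtree_cost(node: Node, alpha: float) -> float:
    """Assumed cost complexity of the subtree at node: errors + alpha * leaves."""
    subtree_leaves = leaves(node)
    return sum(l.errors_as_leaf for l in subtree_leaves) + alpha * len(subtree_leaves)


def pruned_cost(node: Node, alpha: float) -> float:
    """Assumed cost complexity if the subtree at node were replaced by one leaf."""
    return node.errors_as_leaf + alpha


def prune(node: Node, alpha: float) -> Node:
    """Visit internal nodes bottom-up and prune where that lowers the cost."""
    for child in node.children:
        prune(child, alpha)
    if node.children and pruned_cost(node, alpha) < subtree_cost(node, alpha):
        node.children = []  # the subtree is pruned; node becomes a leaf
    return node


# Toy tree: the root has an internal child (two leaves) and a leaf child.
tree = Node(5, [Node(2, [Node(1), Node(1)]), Node(1)])
prune(tree, alpha=0.5)
print(len(leaves(tree)))
```

In this toy tree the internal child costs 2 + 2 × 0.5 = 3 as a subtree and 2 + 0.5 = 2.5 as a leaf, so it is pruned. The root costs 3 + 2 × 0.5 = 4 as a subtree and 5.5 as a leaf, so it is kept.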
Calculation of Information Gain of a Traffic Conflict Problem

The students at Southeast University conducted a series of field observations of the occurrence of traffic conflicts on December 14, 2019, at the intersection of Si-Pai-Lou Street and North Taiping Road in Nanjing, China. They also recorded the associated traffic volume, the drivers' demographic information, and visibility, as listed in Table 1.

Table 1. The Rule Table for a Traffic Conflict Problem

Traffic Volume | Driver's age | License less than 5 years | Visibility | Conflict
high           | Young        | no                        | fair       | no
high           | Young        | no                        | excellent  | no
high           | Medium       | no                        | fair       | yes
high           | Medium       | yes                       | fair       | yes
medium         | Young        | no                        | fair       | no
medium         | Old          | yes                       | fair       | yes
medium         | Young        | yes                       | excellent  | yes
medium         | Medium       | no                        | excellent  | yes
medium         | Old          | no                        | excellent  | no
medium         | Old          | no                        | fair       | yes
low            | Old          | yes                       | fair       | yes
low            | Old          | yes                       | excellent  | no
low            | Medium       | yes                       | excellent  | yes
low            | Young        | yes                       | fair       | yes

(1) Plot the decision tree that illustrates the traffic conflict problem in Table 1. (3 points)

(2) Fill in Table 2 with the positive counts of conflict (p_i), the negative counts of conflict (n_i), and their totals. (3 points)

Table 2. Positive and Negative Counts of Conflicts

Traffic Volume | p_i | n_i | Total
Low            |     |     |
Medium         |     |     |
High           |     |     |
Total          |     |     |

(3) What are the probabilities of the class "conflict -> yes" and the class "conflict -> no"? (2 points)

(4) Calculate the information gained by branching on the attribute "Traffic Volume" using the ID3/C4.5 method. (4 points)
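As a way to check hand calculations for parts (2) to (4), the counts and the information gain can be computed with a short Python sketch. It assumes the standard ID3 definition, gain = Info(D) minus the weighted entropy of the partitions. The `rows` list is transcribed from the "Traffic Volume" and "Conflict" columns of Table 1, and the function names are illustrative only.

```python
from collections import Counter
from math import log2

# (traffic volume, conflict) pairs transcribed from Table 1, in row order.
rows = [
    ("high", "no"), ("high", "no"), ("high", "yes"), ("high", "yes"),
    ("medium", "no"), ("medium", "yes"), ("medium", "yes"), ("medium", "yes"),
    ("medium", "no"), ("medium", "yes"),
    ("low", "yes"), ("low", "no"), ("low", "yes"), ("low", "yes"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(pairs):
    """Info(D) minus the expected entropy after splitting on the attribute."""
    labels = [label for _, label in pairs]
    groups = {}
    for value, label in pairs:
        groups.setdefault(value, []).append(label)
    expected = sum(len(g) / len(pairs) * entropy(g) for g in groups.values())
    return entropy(labels) - expected


# Part (2): positive (p_i) and negative (n_i) counts per traffic volume.
counts = Counter(rows)
for volume in ("low", "medium", "high"):
    p_i, n_i = counts[(volume, "yes")], counts[(volume, "no")]
    print(volume, p_i, n_i, p_i + n_i)

# Part (3): class probabilities.
labels = [label for _, label in rows]
p_yes = labels.count("yes") / len(labels)
print(f"P(yes) = {p_yes:.3f}, P(no) = {1 - p_yes:.3f}")

# Part (4): information gain for "Traffic Volume".
info_d = entropy(labels)
gain = information_gain(rows)
print(f"Info(D) = {info_d:.3f}, Gain = {gain:.3f}")  # Info(D) = 0.940, Gain = 0.029
```

With 9 "yes" and 5 "no" outcomes among 14 observations, Info(D) is about 0.940 bits. The gain of roughly 0.029 bits suggests that traffic volume alone separates the two classes only weakly.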