Machine Learning Project

Threshold 1 requires a user to have an average of 1 useful vote per review to be deemed “useful.”

Threshold 2 requires a user to have an average of 2 useful votes per review to be deemed “useful.”
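
The two thresholds can be read as a single labeling rule with a different cutoff. The sketch below is only an illustration of that rule: the function name and its parameters are hypothetical, the comparison is assumed to be "greater than or equal to", and users with no reviews are assumed to be labeled not useful.

```python
def is_useful(total_useful_votes: int, review_count: int, threshold: float) -> bool:
    """Label a user "useful" when the average useful votes per review meets the threshold.

    Assumptions (not stated in the report): the comparison is >=, and a user
    with zero reviews is labeled not useful.
    """
    if review_count == 0:
        return False
    return total_useful_votes / review_count >= threshold


# A user with 15 useful votes over 10 reviews averages 1.5 votes per review.
label_threshold_1 = is_useful(15, 10, threshold=1.0)  # meets Threshold 1
label_threshold_2 = is_useful(15, 10, threshold=2.0)  # does not meet Threshold 2
```

The same user can therefore be labeled differently under the two thresholds, which is why each threshold yields its own dataset and its own set of results.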

“Inclusive” means that the algorithm includes attributes that depend on other users, such as “cool” votes.

“Exclusive” means that the algorithm only uses attributes that do not depend on other users.
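
The difference between the two attribute sets can be sketched as a simple filter. This is an assumption-laden illustration: the attribute names come from the J48 node orders reported in this document, except `votes.cool`, which is a hypothetical name for the “cool” votes, and the choice of which attributes count as dependent on other users is a guess rather than the project's actual list.

```python
# Attributes assumed to depend on other users' actions (votes cast by others).
# "votes.cool" is a hypothetical name; the grouping itself is an assumption.
OTHER_USER_ATTRIBUTES = {"votes.funny", "votes.cool", "votes_total"}


def select_attributes(all_attributes, mode):
    """Return the attributes used in "inclusive" or "exclusive" mode."""
    if mode == "inclusive":
        return list(all_attributes)
    if mode == "exclusive":
        return [a for a in all_attributes if a not in OTHER_USER_ATTRIBUTES]
    raise ValueError(f"unknown mode: {mode!r}")


attributes = ["votes.funny", "review_count", "yelping_for_period", "biz_stars_avg"]
exclusive_attributes = select_attributes(attributes, "exclusive")
```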

Classifier	10-fold cross-validation accuracy (Inclusion)	10-fold cross-validation accuracy (Exclusion)
rules.ZeroR	50.0077%	50.0077%
trees.J48	98.6751%	75.0732%
bayes.BayesNet	68.8338%	67.6013%
bayes.NaiveBayes	60.5454%	57.5258%
lazy.IBk	57.8801%	56.4474%
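
ZeroR always predicts the majority class of its training data, so its accuracy of about 50% suggests that the two classes are nearly balanced and that 50% is the baseline the other classifiers must beat. The sketch below shows how a majority-class baseline behaves under k-fold cross-validation. It is a simplified illustration, not Weka's implementation: the function name is hypothetical, and the folds are taken by interleaving items rather than by the shuffled, stratified split Weka uses.

```python
from collections import Counter


def zero_r_cv_accuracy(labels, k=10):
    """k-fold cross-validated accuracy of a majority-class (ZeroR-style) predictor."""
    correct = 0
    for fold in range(k):
        # Every k-th item, starting at index `fold`, forms the held-out fold.
        test = labels[fold::k]
        train = [y for i, y in enumerate(labels) if i % k != fold]
        # The "model" is just the most common label in the training folds.
        majority = Counter(train).most_common(1)[0][0]
        correct += sum(1 for y in test if y == majority)
    return correct / len(labels)


# With a 60/40 class split, the baseline accuracy equals the majority share.
baseline = zero_r_cv_accuracy([True] * 60 + [False] * 40)  # 0.6
```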

J48 node order:

Inclusion: votes.funny > review_count > yelping_for_period

Exclusion: yelping_for_period > review_count > review_stars_avg > biz_stars_avg

[Validation Set Results] (application of the trained model to the validation set)

: Inclusion Results

trees.J48	94.839%
bayes.BayesNet	67.8169%

: Exclusion Results

trees.J48	71.0032%
bayes.BayesNet	64.3891%

Classifier	10-fold cross-validation accuracy (Inclusion)	10-fold cross-validation accuracy (Exclusion)
rules.ZeroR	50.0026%	50.0026%
trees.J48	99.258%	67.0904%
bayes.BayesNet	67.9051%	62.2594%
bayes.NaiveBayes	59.8879%	59.5143%
lazy.IBk	57.6981%	57.2051%

J48 node order:

Inclusion: votes_total > review_count > biz_stars_avg

Exclusion: yelping_for_period > review_count > biz_review_count_avg

[Validation Set Results] (application of the trained model to the validation set)

: Inclusion Results

trees.J48	96.8813%
bayes.BayesNet	67.9248%

: Exclusion Results

trees.J48	63.5411%
bayes.BayesNet	66.2302%
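
One way to compare the cross-validation and validation-set figures is to compute the gap between them for each classifier. The sketch below does this with the accuracies copied from the two result blocks in this document; the "first" and "second" labels simply follow the order in which the blocks appear, and the function names are hypothetical. A positive gap means the classifier scored higher under cross-validation than on the validation set.

```python
# Each entry: (block, classifier, attribute set, CV accuracy %, validation accuracy %).
# All numbers are copied from the result blocks in this document.
RESULTS = [
    ("first", "J48", "inclusion", 98.6751, 94.839),
    ("first", "BayesNet", "inclusion", 68.8338, 67.8169),
    ("first", "J48", "exclusion", 75.0732, 71.0032),
    ("first", "BayesNet", "exclusion", 67.6013, 64.3891),
    ("second", "J48", "inclusion", 99.258, 96.8813),
    ("second", "BayesNet", "inclusion", 67.9051, 67.9248),
    ("second", "J48", "exclusion", 67.0904, 63.5411),
    ("second", "BayesNet", "exclusion", 62.2594, 66.2302),
]


def accuracy_gap(cv_accuracy, validation_accuracy):
    """Percentage-point drop from cross-validation to the validation set."""
    return round(cv_accuracy - validation_accuracy, 4)


def accuracy_gaps(results):
    """Map (block, classifier, attribute set) to its accuracy gap."""
    return {
        (block, clf, attrs): accuracy_gap(cv, val)
        for block, clf, attrs, cv, val in results
    }


gaps = accuracy_gaps(RESULTS)
```

Printing `gaps` gives a quick view of which models hold up on unseen data and which lose accuracy.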