Chapter 8. Application of Random Walks to Bayesian Classification and Business Decision Making
C. H. C. Leung (1), PhD, Y. X. Li (2), PhD, and L. J. Hao (3)
(1) School of Science and Engineering and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, Chinese University of Hong Kong, Shenzhen, China
(2) Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
(3) School of Data Science, Chinese University of Hong Kong, Shenzhen, China
Part of the book: Advances in Business and Management. Volume 20
Many decision-making scenarios can be viewed as classification problems. Classification decisions are pervasive and occur in many business situations. In many applications, classification problems do not occur individually but in groups, where several classification problems need to be solved together. Examples include student admissions at colleges, whether or not to extend job offers to applicants, the effectiveness of advertising channels, and determining whether COVID patients should be hospitalized. With any form of classification, however, inaccuracies are unavoidable and arise in different forms, especially when multiple classification tasks need to be performed. Since typical classifiers are not free from errors, classification errors tend to accumulate, and frequent misclassifications are often unacceptable. Moreover, in unsupervised learning situations, there are typically no pre-determined ground truth classes; in such situations the ground truth class is determined by the view of the majority of classifiers. In this chapter, we examine the situation of multiple classifications within the Naïve Bayes framework, where the ground truth is determined by the decision of most classifiers, and where resources are finite, so that decisions must be made within a limited budget. We represent the classification tasks as a one-dimensional random walk process and perform a probabilistic analysis of the situation. We find that raising the budget allows the probability of classification error to be controlled, and that the extent of the reduction can be quantified. These results can be beneficially deployed in a variety of business decision-making situations to measure and enhance the quality of decisions.
Keywords: binary classification, naïve Bayes classifier, multiple classification, random walk, unsupervised learning
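The relationship between budget and error described in the abstract can be illustrated with a minimal sketch. It is not the chapter's own analysis. It assumes a simplified setting in which each classifier votes independently and is correct with the same probability p, the budget is the (odd) number of votes collected, and the final decision is the majority vote. The function name majority_error and the parameters p and budget are hypothetical names introduced only for this illustration. Each vote is treated as one step of a one-dimensional random walk (+1 for a correct vote, -1 for an incorrect one), and the decision is wrong when the walk ends below zero.

```python
from math import comb


def majority_error(p: float, budget: int) -> float:
    """Probability that a majority vote over `budget` independent votes is wrong.

    Assumption (not from the chapter): every vote is correct with the same
    probability p, independently of the others. Each vote moves a 1-D random
    walk by +1 (correct) or -1 (incorrect); the majority decision is wrong
    exactly when the walk finishes below zero.
    """
    if budget < 1 or budget % 2 == 0:
        raise ValueError("budget must be a positive odd integer (avoids ties)")
    q = 1.0 - p
    # The walk ends below zero when at most (budget - 1) / 2 votes are correct.
    max_correct = budget // 2
    return sum(
        comb(budget, k) * p**k * q ** (budget - k)
        for k in range(max_correct + 1)
    )


if __name__ == "__main__":
    # Illustrative accuracy value; any p > 0.5 shows the same trend.
    for n in (1, 3, 5, 11, 21):
        print(n, majority_error(0.7, n))
```

Under these assumptions, with p = 0.7 a budget of 1 gives an error probability of 0.3 and a budget of 3 gives 0.216, and the error continues to fall as the odd budget grows. When p is below 0.5 the trend reverses, so the benefit of a larger budget in this sketch depends on the individual classifiers being better than chance.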