Machine learning models for mathematical symbol recognition: A stem to stern literature analysis

Multimedia Tools and Applications

Abstract

Given the ubiquity of handwriting and mathematical content in human transactions, machine recognition of handwritten mathematical text and symbols has become a domain of great practical scope and significance. Recognition of mathematical expressions (MEs) remains a challenging and emerging research domain, with mathematical symbol recognition (MSR) as a requisite step in the overall recognition process. Wide variation in writing styles, together with the dissimilarities among the large range of symbols and recurring characters, makes the recognition task strenuous even for Optical Character Recognition. The past decade has witnessed the emergence of new recognition techniques and growing interest among researchers in this evolving domain. In light of the current state of research on recognizing handwritten math symbols, a systematic review of the literature is timely. This article provides a systematic analysis of recognition techniques, models, datasets, sub-stages, accuracy metrics, and accuracy details as extracted from the literature. The systematic literature review covers pragmatic studies up to the year 2021. The analysis reveals the Support Vector Machine (SVM) to be the dominant recognition technique and symbol recognition rate to be the most frequently used accuracy measure; further results on segmentation, feature extraction, and the datasets involved are also presented. Statistics on papers related to mathematical symbols are shown, and open problems are identified for further research. The study focuses on the key points of earlier research, present work, and the future direction of MSR.

Author information

Corresponding author

Correspondence to Sakshi.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Table 13 Table of Acronyms

About this article

Cite this article

Kukreja, V., Sakshi. Machine learning models for mathematical symbol recognition: A stem to stern literature analysis. Multimed Tools Appl 81, 28651–28687 (2022). https://doi.org/10.1007/s11042-022-12644-2

  • DOI: https://doi.org/10.1007/s11042-022-12644-2