Paper Title : Transformer Model for Math Word Problem Solving via Dependency-based Word Embedding
ISSN : 2394-2231
Year of Publication : 2021
DOI : 10.29126/23942231/IJCT-v8i6p2
MLA Style: Li, Xinxin, and Jiawen Wang. "Transformer Model for Math Word Problem Solving via Dependency-based Word Embedding." International Journal of Computer Techniques (IJCT), vol. 8, no. 6, Nov.-Dec. 2021, ISSN: 2394-2231, www.ijctjournal.org.
APA Style: Li, X., & Wang, J. (2021). Transformer model for math word problem solving via dependency-based word embedding. International Journal of Computer Techniques (IJCT), 8(6). ISSN: 2394-2231. www.ijctjournal.org
Abstract
Most existing approaches to math word problem solving take the problem sequence directly as input, ignoring syntactic and semantic information. In this paper, we propose a transformer neural model for math word problems via dependency-based word embedding, which exploits the relations between objects, predicates, and quantities in the problem. Our model uses an encoder-decoder framework: the encoder takes the dependency-based word embeddings as input, and the decoder predicts the math equation for each problem. Experiments show that incorporating dependency information into the model improves the representational ability of the word embeddings and achieves better performance.
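The abstract does not detail how the dependency-based embeddings are built, but such embeddings conventionally follow Levy and Goldberg's scheme of training on (word, syntactic-context) pairs derived from a dependency parse rather than on linear word windows. The sketch below illustrates that context-extraction step on a toy parse; the function name, token indices, and relation labels are illustrative, not taken from the paper.

```python
# Minimal sketch (not the authors' code): extracting dependency-based
# training contexts in the style of Levy & Goldberg (2014).
# Each dependent contributes a context "rel_head" for itself and an
# inverse context "relI_dependent" for its head.

def dependency_contexts(tokens, heads, rels):
    """Return (word, context) pairs from a dependency parse.

    tokens : list of word forms
    heads  : index of each token's head, or None for the root
    rels   : dependency relation of each token to its head
    """
    pairs = []
    for i, tok in enumerate(tokens):
        head = heads[i]
        if head is None:  # the root has no head context
            continue
        pairs.append((tok, f"{rels[i]}_{tokens[head]}"))   # word -> head context
        pairs.append((tokens[head], f"{rels[i]}I_{tok}"))  # head -> inverse context
    return pairs

# Toy parse of "John has 3 apples": "has" is the root,
# "John" and "apples" depend on it, "3" modifies "apples".
tokens = ["John", "has", "3", "apples"]
heads = [1, None, 3, 1]
rels = ["nsubj", None, "nummod", "dobj"]

print(dependency_contexts(tokens, heads, rels))
```

These (word, context) pairs would then replace linear-window pairs when training the embedding model, so that, e.g., "has" is associated with its subject and object regardless of how far away they occur in the sentence, which is the property the paper's encoder relies on.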
Keywords
Math word problem, Transformer network, Dependency-based word embedding