Evolution of AI in solving accounting problems: A comparison between GPT4 and GPT4o in solving the Accounting Proficiency Exam
DOI:
https://doi.org/10.18800/contabilidad.2025ESP.005
Keywords:
Use of technologies, ChatGPT4o, Large language models (LLMs)
Abstract
This research evaluated the performance of the GPT-4o model against the GPT-4 model in answering questions from the Accounting Proficiency Exam. The study is grounded in the concept of natural language processing (NLP), as discussed by Brown et al. (2020). It followed the design science methodology, which aims to build and/or evaluate technological artifacts; here, the proficiency exam questions were submitted to GPT-4o through OpenAI's ChatGPT. While actual statistics for the Proficiency Exam show that only a portion of accountants pass, the artificial intelligence (AI) passed all four exam editions evaluated, with a success rate of at least 64% on each. Across the full sample, GPT-4o achieved 77% accuracy compared with 71% for GPT-4, and the more recent model reached 84% accuracy on the last two exams. However, the more recent model answered some questions incorrectly, or differently than the question developers expected, even though GPT-4 had previously answered them correctly. These results contribute to the literature on the use of AI in accounting, particularly the application of natural language processing models and large language models (LLMs).
License
Copyright (c) 2025 Contabilidad y Negocios

This work is licensed under a Creative Commons Attribution 4.0 International License.





