Enhancing the government accounting information systems using social media information: An application of text mining and machine learning
Introduction
Future accounting systems will utilize large amounts of exogenous data (Brown-Liburd et al., 2019) in conjunction with traditional accounting data. Government accounting systems will evolve into a conglomerate of three main components: 1) traditional financial, 2) infrastructure maintenance, and 3) quality of services (Bora et al., 2021). This study illustrates how exogenous variables, once integrated into service processes, can be used within modern accounting and assurance operational services. It explores an alternative performance measure by analyzing social media information to enhance government managerial decision-making and bring innovation to governmental operations.
The progressive development of information and communication technologies (ICTs) and the digital transformation of operations have fundamentally changed every aspect of people’s lives, their social needs, and the ways they communicate with the government. Modern government reporting demands reform toward a “data-driven, analytics-based, real-time, and proactive reporting paradigm” (Bora et al., 2021). A dynamic and interconnected communication channel with citizens would generate an exogenous data source for improving the performance and delivery of public services. It would also form part of the three-dimensional reporting system by measuring and reporting the quality of services.
Outdated measurements and old-fashioned ways of operating cannot provide public services efficient enough to meet citizens’ current needs and expectations. For example, the New York City (NYC) Mayor’s Office of Operations runs a Scorecard inspection program that assesses the cleanliness of streets and sidewalks by relying on inspectors’ subjective judgment during drive-by visual inspections of sampled locations. This method was established in 1973 and has not changed for nearly fifty years (Office of the New York State Comptroller, 2020).
The ratings are adjusted for street miles but not for population, housing density, or the nature of activity in the inspected area (for example, residential versus commercial). Under the current ratings, the majority of streets are rated as acceptably clean (see Appendix A). However, in 2020 the Office of the New York State Comptroller issued an audit report identifying several weaknesses in the methodology used by the Mayor’s office, specifically in the inspection process and the rating calculation, which raise concerns about the reliability of the ratings.
The auditors also pointed out that “without analyzing and acting on all available data, including complaints, to identify and mitigate the underlying problem, there is material risk that the same sanitation problems will continue to surface and negatively impact the quality of life for residents and visitors in those areas” (Office of the New York State Comptroller, 2020). The state auditors encouraged the Department of Sanitation to consider all available data sources in order to develop and implement additional performance measures for street cleanliness (Office of the New York State Comptroller, 2020). The current service reporting system reflects what the technology of the last century could provide. Because accounting information systems are rigid and backward-looking, the public would be much better served by close-to-real-time service reporting integrated with a system of public accountability.
Additionally, NYC residents increasingly contact the Department of Sanitation via NYC311 about missed trash pickups, overflowing litter baskets, and other insalubrious conditions. An examination of the NYC311 service request data from May 22, 2014, to May 22, 2019, reveals an increasing trend in complaints and service requests made by NYC residents to the Department of Sanitation and the Department of Health and Mental Hygiene (as shown in Fig. 1).
To better embrace innovation in government, many plans and proposals are being considered and implemented, including big data analytics, smart cities, machine learning, and drone usage. Governments are increasingly adopting innovative data sources and data analytics to better support the decision-making process, such as mobile device sensor-based app data, crowdsourcing data, and Twitter sentiment and postings (Kitchin, 2014; O’Leary, 2013; OECD, 2017; Zeemering, 2021). Several cities have been exploring this area, using different management information systems to gather exogenous data and monitor public services and functions. Examples include traffic monitoring based on transportation network data, the data analytic center of the Centro De Operacoes Prefeitura Do Rio in Brazil, London’s Dashboard and LoveCleanStreets App, and Boston’s infrastructure monitoring system (Kitchin, 2014; Li et al., 2018; O’Leary, 2019a; O’Leary, 2013). Incorporating big data into government information systems as part of service evaluation and assessment improves the effectiveness of public services, allowing government officials to make data-driven decisions, address issues promptly, and deploy resources better.
To demonstrate the possibility of using exogenous data to support government managerial decision-making, this study proposes an alternative performance measure. The measure uses social media information to assess street cleanliness in NYC, in response to the New York State auditors’ recommendations in the 2020 audit report. It utilizes text mining techniques and machine learning algorithms to examine social media information, applies an analytical approach to identify temporal trends and patterns in street cleanliness, provides a perspective on street cleanliness that differs from the official cleanliness ratings, and assesses the tweets’ sentiment to measure the performance of municipal services. The study finds that the overall sentiment trend over the examined period is negative, which is inconsistent with the official Scorecard ratings. This study proposes that the government incorporate social media information into municipal performance evaluation and assessment factors. A continuous monitoring dashboard for street cleanliness that integrates various data sources, including social media information, can be built to support decision-making about public services.
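As a rough illustration of how a temporal sentiment trend could be derived, the sketch below averages per-tweet sentiment scores by calendar month. It is a minimal example under stated assumptions, not the study's implementation: the function name `monthly_mean_sentiment`, the sample dates, and the scores are all hypothetical, and scores are simply assumed to lie in [-1, 1].

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_mean_sentiment(scored_tweets):
    """Return {(year, month): mean score}, ordered chronologically."""
    buckets = defaultdict(list)
    for day, score in scored_tweets:
        buckets[(day.year, day.month)].append(score)
    # Sorting the (year, month) keys yields chronological order.
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Hypothetical (posting date, sentiment score) pairs; not data from the study.
scored = [
    (date(2019, 1, 5), -0.5),
    (date(2019, 1, 20), -0.25),
    (date(2019, 2, 3), 0.25),
    (date(2019, 2, 14), -0.75),
    (date(2019, 3, 9), -0.5),
]

trend = monthly_mean_sentiment(scored)
for (year, month), avg in trend.items():
    print(f"{year}-{month:02d}: {avg:+.3f}")
```

A series like this could feed a monitoring dashboard alongside official ratings; the aggregation window (monthly here) is an illustrative choice.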
Public accountability is an essential factor for a sustainable and stable government. Many government institutions demonstrate their accountability by disclosing tax revenue amounts and illustrating how they spend taxpayers’ money efficiently and effectively, as well as how that expenditure benefits citizens’ lives (Callahan and Holzer, 1999). Involving citizens in government fiscal budgeting and decision-making, particularly in resource allocation and performance measurement, is critical to meeting citizens’ expectations and increasing the government’s accountability (Berner and Smith, 2004; Ebdon and Franklin, 2004; Justice et al., 2006; Robbins et al., 2008; Woolum, 2011). The majority of governments’ performance measures concentrate on information used to make internal management decisions, such as inputs, outputs, staffing patterns, and resource allocations (Ho and Ni, 2005; Woolum, 2011). Incorporating exogenous data, such as social media information, into government accounting information systems is a way of taking citizens’ preferences and their views on public issues into account. This helps government decision-makers provide public services that matter to citizens and determine how those services should be managed, measured, and reported.
The contributions of this study fall into three areas. First, this study demonstrates the possibility of incorporating social media information into government information systems to support decision-making. Collecting and analyzing social media information is a direct and efficient way to obtain timely feedback from citizens and proactively interact with the public. Government accounting information systems can incorporate these measures and link them to cost figures, making it possible to understand the efficiency and effectiveness of operations. Second, this study presents a data analytical approach that enhances decision-making by using closer-to-real-time data rather than only the historical data provided by accounting systems. By utilizing text mining techniques and machine learning algorithms, users can retrieve valuable information from tweets and handle a dataset with an imbalanced class distribution. Among the tweets collected, only a small portion is relevant to the subject; thus, the class distribution of the dataset is skewed. The sampling methods used in the study can resolve the imbalanced class distribution issue, and the methodology can be generalized to other areas, such as predicting financial fraud and assessing the likelihood of bankruptcy. Third, this study provides an example of using social media information as an alternative performance measure. It applies emerging technologies and an analytical approach to examine social media information and provides a different perspective, that of the general public, for tackling a public problem.
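To make the imbalanced class distribution issue concrete, the sketch below shows random oversampling, one common way to rebalance a skewed training set. This excerpt does not say which sampling methods the study used, so the technique is an assumption chosen for illustration; the function name `random_oversample` and the toy tweets are hypothetical.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement to fill the gap up to the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: 1 = relevant to street cleanliness, 0 = not relevant.
tweets = [
    "trash piled up on my street again",
    "great pizza in brooklyn tonight",
    "subway delays again this morning",
    "watching the game with friends",
    "new coffee shop opened downtown",
    "long line at the bakery today",
]
labels = [1, 0, 0, 0, 0, 0]

balanced_texts, balanced_labels = random_oversample(tweets, labels)
print(Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that evaluation still reflects the skewed distribution seen in practice.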
The remainder of this study is organized as follows: the second section reviews existing literature on the study of social media information. The third section provides the methodology of this study. The fourth section shows the results, and the fifth section focuses on extending the analysis to another social media platform. Finally, the last section discusses the conclusions and limitations of the study and provides future avenues for research.
Literature review
Research on social media has grown exponentially in recent years. As part of the exogenous data, the added value and the impact of social media are significant considering the volume, velocity, variety, and veracity of the information that is available (Buhl et al., 2013; Vasarhelyi et al., 2015; Yoon et al., 2015; Zhang et al., 2015). The Twitter platform facilitates network interconnections and clearly illustrates social network theory. The interconnected network among users generates a
Methodology
The general workflow for this study is illustrated in Appendix B. The following subsections describe each step in detail.
Results
The approach for obtaining results can be divided into two steps. The first step is relevancy determination, which uses a supervised machine learning method to retrieve relevant tweets related to this study. The second step is sentiment analysis, which applies VADER to the relevant tweets identified during the first step.
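The two steps can be sketched as a small pipeline: a supervised classifier filters for relevant tweets, and a sentiment scorer is then applied only to what passes the filter. This is a minimal, standard-library-only sketch under loud assumptions: this excerpt does not name the classifier, so the hand-written Naive Bayes here is a stand-in, and `TOY_LEXICON` with `toy_sentiment` is a hypothetical placeholder for VADER, not VADER itself. All tweets and labels are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for c, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[c][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Hypothetical toy lexicon standing in for VADER's much richer rule set.
TOY_LEXICON = {"dirty": -2, "filthy": -3, "trash": -1, "garbage": -1,
               "overflowing": -2, "clean": 2, "spotless": 3, "thanks": 1}


def toy_sentiment(text):
    """Average lexicon valence per token; 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(TOY_LEXICON.get(t, 0) for t in tokens) / len(tokens)


# Hypothetical labeled tweets: 1 = relevant to street cleanliness, 0 = not.
train_texts = [
    "trash piled up on my street again",
    "overflowing litter basket on the corner",
    "the sidewalk is filthy with garbage",
    "street sweeper left our block clean",
    "great pizza in brooklyn tonight",
    "subway delays again this morning",
    "watching the game with friends",
    "new coffee shop opened downtown",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

new_tweets = [
    "garbage and trash all over the sidewalk",
    "pizza with friends this morning",
    "the street is clean thanks to the sweeper",
]

# Step 1: relevancy determination. Step 2: sentiment on relevant tweets only.
model = TinyNaiveBayes().fit(train_texts, train_labels)
relevant = [t for t in new_tweets if model.predict(t) == 1]
scores = {t: toy_sentiment(t) for t in relevant}
for tweet, score in scores.items():
    print(f"{score:+.2f}  {tweet}")
```

In practice the second step would call an actual VADER implementation, for example `SentimentIntensityAnalyzer().polarity_scores(text)` from the `vaderSentiment` package, and read its `compound` value in place of `toy_sentiment`.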
Framework extension
The tweets were collected based on NYC’s longitude and latitude rather than at a more granular level (e.g., street level), due to a limitation of the Twitter API used. To compensate for this limitation and to evaluate the approach to analyzing social media information, another social media platform (Facebook) is selected for testing. Another purpose of this extension is to explore the potential use of Facebook data in evaluating NYC street cleanliness. Due to Facebook’s privacy restriction on personal
Summary
This study demonstrates how to bring an innovative data source into the government information system and utilize social media information to support government managerial decision-making. Text mining techniques and machine learning algorithms are used to analyze social media information, and these social media data sources can be used to develop an alternative performance measure for NYC street cleanliness. Specifically, this paper applies text mining techniques and supervised machine learning algorithms to analyze
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We are thankful for the helpful comments received from Daniel O’Leary, Helen Brown-Liburd, Aleksandr Kogan, Deniz Appelbaum, Lawrence Gordon, and everyone from Rutgers, The State University of New Jersey – Continuous Auditing & Reporting Lab (CAR Lab). Special thanks to the editors and two anonymous reviewers from the journal; thank you for your valuable comments on the publication of this paper.
This paper was presented at the 2019 American Accounting Association (AAA) Annual Research Workshop
References
Twitter mood predicts the stock market
J. Comput. Sci. (2011)
Using VADER sentiment and SVM for predicting customer response sentiment
Expert Syst. Appl. (2020)
An experimental investigation of accounting information’s influence on the individual giving process
J. Account. Public Policy (2006)
Making words work-using financial text as a predictor of financial events
Decis. Support Syst. (2010)
Social network analysis: an approach and technique for the study of information exchange
Lib. Inform. Sci. Res. (1996)
Identifying disgruntled employee systems fraud risk through text mining: a simple solution for a multi-billion dollar problem
Decis. Support Syst. (2009)
Annual report readability, current earnings, and earnings persistence
J. Account. Econ. (2008)
From E-government to we-government: defining a typology for citizen coproduction in the age of social media
Govern. Inform. Quart. (2012)
Connecting citizens and local governments? Social media and interactivity in major U.S cities
Govern. Inform. Quart. (2013)
Blog mining-review and extensions: “from each according to his opinion”
Decis. Support Syst. (2011)
On the relationship between number of votes and sentiment in crowdsourcing ideas and comments for innovation: a case study of Canada’s digital compass
Decis. Support Syst.
The usefulness of financial and nonfinancial performance information in resource allocation decisions
J. Account. Public Policy
The impact of nonmonetary performance measures upon budgetary decision making in the public sector
J. Account. Public Policy
Evaluating sentiment in financial news articles
Decis. Support Syst.
Functional Fragmentation in City Hall and Twitter Communication During the COVID-19 Pandemic: evidence from Atlanta, San Francisco, and Washington, DC
Government Information Quarterly
Detecting Spam Accounts on Twitter
Introduction to Machine Learning
Hybrid N-gram model using Naïve Bayes for classification of political sentiments on Twitter
Neural Comput. Appl.
A hybrid classification method for twitter spam detection based on differential evolution and random forest
Concurrency Comput.: Pract. Experience
The state of the states: a review of state requirements for citizen participation in the local government budget process
State Local Govern. Rev.
Latent Dirichlet Allocation
J. Mach. Learn. Res.
A set of metrics to assess stakeholder engagement and social legitimacy on a corporate facebook page
Online Inform. Rev.
Mastering Social Media Mining with Python
The transformation of government accountability and reporting
J. Emerg. Technol. Account.
Measuring with Exogenous Data (MED), and Government Economic Monitoring (GEM)
J. Emerg. Technol. Account.
Big data
Bus. Inform. Syst. Eng.
Which spoken language markers identify deception in high-stakes settings? Evidence from earnings conference calls
J. Lang. Social Psychol.
Interactive or reactive? Marketing with Twitter
J. Consumer Market.
Results-Oriented Government: Citizen Involvement in Performance Measurement. Performance & Quality Measurement in Government
Can social media predict election results? Evidence from New Zealand
J. Polit. Market.
A hybrid method for taxonomy creation
Int. J. Digital Account. Res.
Performance measurement and adoption of balanced scorecards: a survey of municipal governments in the USA and Canada
Int. J. Public Sector Manage.
“Like It Or Not”: consumer responses to word-of-mouth communication in on-line social networks
Manage. Res. Rev.
Lightweight methods to estimate influenza rates and alcohol sales volume from twitter messages
Lang. Resour. Eval.
Social media sentiment analysis: lexicon versus machine learning
J. Consum. Market.
Crowdsourcing as a new instrument in the Government’s Arsenal: explorations and considerations
Canad. Public Admin.
The current state and future direction of IT audit: challenges and opportunities
J. Inform. Syst.
Searching for a role for citizens in the budget process
Public Budget. Finance
Implementation of balanced scorecard in Indonesian government institutions: a systematic literature review
J. Public Admin. Stud.
Balanced scorecard implementation in an Italian Local Government Organization
Public Money Manage.
Balanced scorecard use in New Zealand Government Departments and Crown Entities
Aust. J. Public Admin.
Mining Twitter to explore the emergence of COVID-19 symptoms
Public Health Nurs.
Have cities shifted to outcome-oriented performance reporting?—A content analysis of city budgets
Public Budget. Finance
The rise and use of balanced scorecard measures in Australian government departments
Finan. Account. Manage.
Twitter adoption and use in mass convergence and emergency events
Int. J. Emergency Manage.