Institute for Software Research
School of Computer Science, Carnegie Mellon University
Machine Detection of Persisting Pragmatic Linguistic Relations
Eric A. Daimler
Humans speak to each other through a variety of media about their expectations for the future. These conversations vary in preparation, formality, and impact. A central banker's speech may be listened to more carefully than an intra-company email, but both are efforts to set expectations of the future, and both contain biases. Combining recent developments in Network Science, Computational Linguistics, and Machine Learning enables new efforts to measure the impact of human-generated text. Measuring the bias may help to reduce it.
This work considers a new multi-step framework for the analysis of text. Its efficacy is explored in the domains of public policy (the monetary policy of central banking) and corporate communications (the equity price of a publicly traded firm) using machine-enhanced semantic network analysis. The implications of this study can be viewed through the lenses of different information users: central banks, corporate treasuries, and investors. By supplying reliable quantitative measurements of previously qualitative information, this study may help both to improve communication and to reduce bias in its interpretation. In studying these issues across different communication modes and contexts, I hope to contribute to a broad analysis of communication concerns.
Classical approaches most often measure the sentiment of these texts as bimodal (good/bad, increasing/decreasing, etc.). Decision making, however, requires more information, and it is there that reliable, nuanced analysis becomes useful. In this study, I present approaches in computer science that address these challenges by explicitly testing the circumstances under which relationships between quantitative and qualitative data occur in the domain of finance and economics. The approach takes as input qualitative data from various sources in addition to quantitative data in the form of financial data. I develop a meta-algorithm for measuring and testing this relationship, which helps to identify causal relationships among different data sets under different circumstances. The approach leads to insights on price movement (asset valuation), not only for public policy but also for corporate management, specifically the treasury function.
The approach I develop may support the assessment and estimation of financial decision processes in many circumstances. This range of circumstances suggests that the approach may generalize beyond financial decision making. First, I start with qualitative data (text data) from various contexts, which are cleaned of extraneous markings such as date, location, and original distribution channel (email, speech, etc.). Second, I use a sequence of steps in Dynamic Network Analysis to extract a semantic network that serves as the quantitative structure best suited for comparison with other quantitative data. Third, I collect appropriate quantitative data external to the text against which to compare the semantic network results. Fourth, I use learning algorithms to identify the degree to which a relationship can be found between the extracted semantic network analysis and the external quantitative data. This trimmed structure should allow a predictive framework for financial decisions to be developed in future work.
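The four steps can be illustrated with a minimal sketch. It is not the implementation used in this work: the metadata pattern, the sliding-window co-occurrence network, the density measure, and the Pearson correlation (standing in for the learning algorithms) are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed metadata headers; real texts would need a richer cleaning step.
METADATA = re.compile(r"^(date|location|source)\s*:", re.IGNORECASE)


def clean(text):
    """Step 1: drop assumed metadata lines and return lowercase word tokens."""
    kept = [ln for ln in text.splitlines() if not METADATA.match(ln.strip())]
    return re.findall(r"[a-z]+", " ".join(kept).lower())


def semantic_network(tokens, window=3):
    """Step 2 (stand-in): weighted co-occurrence edges within a sliding window."""
    edges = Counter()
    for i in range(len(tokens)):
        distinct = sorted(set(tokens[i:i + window]))
        for a, b in combinations(distinct, 2):
            edges[(a, b)] += 1
    return edges


def density(edges):
    """One simple network measure: observed edges over possible edges."""
    nodes = {word for pair in edges for word in pair}
    n = len(nodes)
    return 0.0 if n < 2 else len(edges) / (n * (n - 1) / 2)


def pearson(xs, ys):
    """Step 4 (stand-in): linear association between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    # A constant series has no defined correlation; report 0.0 in that case.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def relate(documents, external_series):
    """Steps 1-4: one network measure per document, compared with the
    external quantitative series collected in step 3 (supplied by the caller)."""
    measures = [density(semantic_network(clean(doc))) for doc in documents]
    return pearson(measures, external_series)
```

Calling `relate(documents, external_series)` with one text per period and one external observation per period (for example, a price change) returns a single association score. A fuller treatment would replace the density measure and the correlation with the network measures and learning algorithms described above.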
Text analysis of even the most basic kind has been shown to be beneficial, but new approaches are needed. More adaptive systems, in which an intelligent system assesses the text as it occurs and provides feedback when necessary, are a promising area of research that can help provide scaffolding for the interpretation of communication. Little is known about how to build these systems and what effects they might have on our collaboration and learning. In this dissertation, I augment existing semantic analysis systems with a more sophisticated analysis and then design, build, and evaluate a more powerful framework.