Estimating Ideal Points from Votes and Text
We introduce a method for combining vote data and text data within a single formal and statistical framework. Formally, we model vote choice and word choice in terms of a common set of underlying preference parameters. Statistically, we implement a method for recovering these preference and location parameters. The method estimates the number of underlying ideological dimensions, models zero inflation, and is robust to extreme outliers. We apply the method to roll-call votes and floor speeches from recent US Senates. We find two stable dimensions, one ideological and the other capturing leadership. We then show how the method can leverage common speech to impute missing data, to estimate the ideal points of rank-and-file members using only their words and the vote history of party leaders, and even to scale newspaper editorials.