Institute for Software Research
School of Computer Science, Carnegie Mellon University


300 Cities - An Exploration in Characterizing US Cities

Michael K. Martin, Kathleen M. Carley, Neal Altman

June 2008

Center for the Computational Analysis of
Social and Organizational Systems (CASOS) Technical Report


Keywords: Census 2000, social distance construction, multi-dimensional scaling, ORA, ORA group analyses

The goal of the 300-Cities Project is to support IRS policy decisions by finding a small number of city clusters, where the cities within each cluster will respond similarly to IRS interventions. This report describes two types of analyses based on U.S. Census 2000 data. The first is an agent-class analysis. In this analysis, city clustering operations are based on the correspondence of population profiles for pairs of cities. Extensive effort using this analysis framework in conjunction with the SAS statistical package demonstrates that although the framework is conceptually straightforward, it is computationally impractical and conceptually impoverished. The second analysis framework, the city-matching analysis, combines city summary and population heterogeneity metrics with information access constraints and taxpayer categories to create a city-matching index for each pair of cities. The city-matching analysis thus shifts the basis of analysis from a city's population profile to its information diffusion characteristics, and provides "hooks" to IRS classification schemes to make the findings more actionable. City clustering operations in this framework are based on city-matching indices, which were analyzed with traditional social network analysis techniques using the Organizational Risk Analyzer (ORA). Although the issue of how best to integrate the various components of the city-matching index remains unresolved, exploratory results show promise by yielding actionable city clusters. The city clusters, however, account for only 95 of the 297 cities in the Census 2000 data. Together, the two analysis frameworks raise questions as to whether canonical city types exist. At this point, it does seem reasonable to believe that iterative development of the nascent city-matching analysis, coupled with virtual experiments to validate results provided by the framework, will yield actionable information for IRS interventions. Whether that actionable information will employ canonical city clusters, however, remains unclear.

95 pages
