Preterm birth can lead to many health problems in infants, including brain damage, neurologic disorders, asthma, intestinal problems and vision problems, but its exact cause is unclear. In this study, we investigated whether geographic location or the environment contributes to preterm birth by building a customized data model based on multiple controlled terminologies. We then performed a large-scale quantitative analysis to understand the relationships between the prevalence of preterm birth, the biological mothers’ demographic information and the Metropolitan Statistical Areas (MSAs) of their primary residence from 2010 to 2014. Specifically, we considered education, income, race and marital status data for 388 MSAs from the US Census Bureau. The results showed that the overall preterm birth rate in the United States decreased from 2010 to 2014, with the Chicago-Naperville-Elgin (Illinois) Metro Area, the Houston-Sugar Land (Texas) Metro Area and the Billings (Montana) Metro Area showing the most visible improvement. Preterm birth was significantly correlated with race distribution and education level, whereas median income, marital status and insurance coverage ratio were found to be unrelated to preterm birth. This study demonstrated the power of controlled terminologies in integrating medical claims data and geographic data to study preterm birth for the first time. The customized common data model and the interactive online tool for visualizing a large preterm birth dataset from both temporal and spatial perspectives can be used in future public health studies of many other diseases and conditions.